Text Extraction from a file containing email addresses etc

How would I extract the email addresses from a file like the one below, where the addresses appear in different columns, and then write them to a file?
aaa jim@gmail.com bbb ccc
aaa bbb john@gmail.com ccc ddd
aaa joe@gmail.com' bbb ccc
etc

This can easily be done with grep -o: the -o option prints only the part of each line that matches the regular expression. The + quantifier needs extended regular expressions (-E), and the dot before the top-level domain has to be escaped. As a regular expression for an e-mail address, you might try this (emails.txt is just an example output name):
grep -oE "[a-zA-Z0-9._-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]+" filename.txt > emails.txt
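A minimal sketch of the approach on the sample data, assuming the addresses are separated from the local part by @ and that a simple pattern is good enough (it is not a full validator of e-mail syntax). The temporary file names and the emails variable are made up for the illustration:

```shell
# Hypothetical sample input modelled on the question.
input=$(mktemp)
output=$(mktemp)
cat > "$input" <<'EOF'
aaa jim@gmail.com bbb ccc
aaa bbb john@gmail.com ccc ddd
aaa joe@gmail.com' bbb ccc
EOF

# -o prints only the matched text; -E enables + and {2,}.
# The trailing apostrophe on the third line is not part of the character
# classes, so it is left out of the match.
grep -oE '[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}' "$input" > "$output"

emails=$(cat "$output")
printf '%s\n' "$emails"
rm -f "$input" "$output"
```

Each address ends up on its own line in the output file, whichever column it came from.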


how to insert text at specific lines in vim?

I want to put the cursor on specific lines, that is:
AAA
BBB
CCC
I know how to edit all three lines using CTRL_V and then SHIFT_I, but my question is: how do I edit the first and the last line and leave out the second, so that the result is:
test_AAA
BBB
test_CCC
Thanks in advance.
Vim doesn't have a built-in "multiple cursors" feature. If you want to do it the vanilla way, you can use :help .:
" cursor is on first A
> AAA
BBB
CCC
" do itest_<Esc>
> test_AAA
BBB
CCC
" move the cursor to CCC
test_AAA
BBB
> CCC
" press .
test_AAA
BBB
> test_CCC
or the almighty :help :global, which is the de-facto way to operate on non-contiguous lines (note that alternation is written \| in Vim's default regex mode):
:g/^AA\|^CC/s/^/test_
And there are many other ways.
Or you can ask your favourite engine to help you find the couple of plugins that replicate that feature: "vim multiple cursors".
You could use a reverse global command:
:v/^BBB/norm Itest_
Explanation:
The usual global command is :g; :v is its inverse and acts on the lines that do not match. So, for each line that does not start with BBB, run the normal-mode command I (insert at the start of the line) followed by the text test_.

How to find one of every values in excel

Basically, I want to list each distinct value once, like this.
data I have:
aaa
aaa
bbb
aaa
ccc
bbb
ddd
ccc
ddd
aaa
the output that I want:
aaa
bbb
ccc
ddd
How can I do this?
I can't find anything on Google because I don't know the right keyword to search for.

Natural sorting in reverse?

I'm trying to sort a text file in this manner:
6 aaa
4 bbb
2 ccc
2 ddd
That is, each line sorted first in numeric descending order (the number indicates the number of occurrences of the word on the right), and if multiple words are repeated the same number of times, I'd like to have those words sorted alphabetically.
What I have:
6 aaa
4 bbb
2 ddd
2 ccc
When I try sort -nr | sort -V it kind of does what I want but in ascending order.
2 ccc
2 ddd
4 bbb
6 aaa
What's a clean way to accomplish this?
I think you just need to specify that the numeric reverse sort only applies to the first field:
$ sort -k1,1nr file
6 aaa
4 bbb
2 ccc
2 ddd
-k1,1[OPTS] means that OPTS apply only to the key that starts and ends at field 1. Lines that tie on that key are then ordered by the global ordering options. Since none were passed here, that is the default lexicographic sort.
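As a sketch of the same idea with the tie-break spelled out, a second key can be added for the word column. The sample lines are taken from the question; LC_ALL=C is an assumption made here so that the alphabetical order does not depend on the locale, and the sorted variable is only for illustration:

```shell
# Key 1: numeric, reversed. Key 2: plain lexicographic, ascending.
sorted=$(printf '%s\n' '2 ddd' '6 aaa' '2 ccc' '4 bbb' |
  LC_ALL=C sort -k1,1nr -k2,2)
printf '%s\n' "$sorted"
```

Writing -k2,2 explicitly makes the intent visible instead of relying on how sort breaks ties.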
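Maybe using tac? (Not a shell expert here, just remembering uni days...) Note that tac reverses the whole output, so lines with equal counts come out in reverse alphabetical order as well.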
sort -nr | sort -V | tac

How to modify a line after every 6 lines in vim

I have a file with these lines:
aaa
aaa
aaa
aaa
aaa
aaa
bbb
bbb
bbb
bbb
bbb
bbb
ccc
ccc
ccc
ccc
ccc
ccc
ddd
ddd
ddd
ddd
ddd
ddd
I want to convert this into:
aaa;100;
aaa
aaa
aaa
aaa
aaa
bbb;100;
bbb
bbb
bbb
bbb
bbb
ccc;100;
ccc
ccc
ccc
ccc
ccc
ddd;100;
ddd
ddd
ddd
ddd
ddd
Is this possible in Vim in one command?
Yet another one
:g/^/ if !((line('.')-1)%6)|s/$/;100;
Breakdown
g/^/ Global command to apply next expression on each line
if !((line('.')-1)%6) True when (line number - 1) modulo 6 equals 0, i.e. on lines 1, 7, 13, ...
s/$/;100; Replace the line ending with ;100;
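The same modulus test can be sketched outside Vim with awk, which is a swapped-in tool here rather than part of the answer above. NR is awk's current line number, and the two sample groups are shortened from the question:

```shell
# Append ;100; to lines 1, 7, 13, ... (every line where (NR - 1) % 6 == 0),
# then print every line, changed or not.
result=$(printf '%s\n' aaa aaa aaa aaa aaa aaa bbb bbb bbb bbb bbb bbb |
  awk '(NR - 1) % 6 == 0 { $0 = $0 ";100;" } { print }')
printf '%s\n' "$result"
```

From inside Vim, a filter such as this could be applied to the buffer with :%! followed by the awk command.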
That depends on what you mean by "one command", but you can do without manually repeating it for each item by using a macro:
Position your cursor on the first line
Start recording a macro named z: qz
Enter insert mode at the end of the line: <shift-A>
Enter the text you want: ;100;
Exit insert mode: <esc>
Jump down six lines: 6j
Stop recording the macro: q
Repeat the macro as many more times as needed (three here): 3@z
Because the jumping down 6 lines is part of the macro, it will line up properly and loop through the file.
The relevant commands here are q followed by a register name to start recording a macro, q to end the recording, and @ followed by the register name to play a recording back.
More information can be found in various places, such as the vim wiki: http://vim.wikia.com/wiki/Macros
If the lines are all the same until the change, this is a pretty nasty Vim solution:
/\v(.*)\n\zs\1@!
:%g//norm A;100;
Traditionally in Vim you craft a search first, then use :%s//replace to replace your last search with "replace". In one line it would be:
:%g/\v(.*)\n\zs\1@!/norm A;100;
I'm so sorry. This is what happens after years of Vim use. It's not pretty.
Explanation:
Essentially we're finding lines that AREN'T duplicated on the next line, and performing an action on them, in this case, adding some text.
:%g/ Perform an action on a pattern (same syntax as %s)
\v Very magic. Makes all special characters in Vim regexes special.
(.*)\n any text followed by a line break. Capture the text.
\zs Start highlighting the match here. This will put the cursor on the next line after the match, where we will perform the action.
\1 The capture group from above (.*), so a new line with the same text...
@! But negate it! So the cursor will go to any line that is NOT duplicated by the previous line.
/norm A;100; Perform the normal mode command A;100; which will execute keystrokes on each line as if you were in normal mode. Regular Vim here, just append (A) text.
:%s/\v(.*)(\n.*)(\n.*)(\n.*)(\n.*)(\n.*\n)/\1;100;\2\3\4\5\6/

For every string in file1.txt check if it exists in file2.txt then do something

I have two text files, file1.txt and file2.txt.
Both of them have one single string on each line. Strings in file1.txt are unique (no duplication), as are the strings in file2.txt.
The files have different numbers of strings.
file1.txt    file2.txt
FFF          AAA
GGG          BBB
ZZZ          CCC
             ZZZ
I'd like to compare those files, so that for every string in file1.txt, if it exists in file2.txt then it's OK. If not, then write that string to another file (file3.txt).
In this example, file3.txt would be:
file3.txt
FFF
GGG
I'd like to use the command shell, doing something like:
cat file1.txt | while read a; do something on file2.txt ...
but that is not compulsory.
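See the man page for grep, specifically the -f option. Adding -x (match whole lines only) and -F (treat the patterns as fixed strings) avoids false matches on substrings or on regex metacharacters:
grep -vxFf file2.txt file1.txt > file3.txt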
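A small sketch of the grep approach on the sample data from the question. The temporary directory and the missing variable are only there for the illustration:

```shell
dir=$(mktemp -d)
printf '%s\n' FFF GGG ZZZ     > "$dir/file1.txt"
printf '%s\n' AAA BBB CCC ZZZ > "$dir/file2.txt"

# -f: read patterns from file2.txt, -F: as fixed strings,
# -x: whole-line matches only, -v: keep the lines that do NOT match.
grep -vxFf "$dir/file2.txt" "$dir/file1.txt" > "$dir/file3.txt"

missing=$(cat "$dir/file3.txt")
printf '%s\n' "$missing"
rm -rf "$dir"
```

ZZZ appears in both files, so only FFF and GGG are written to file3.txt.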
Your best bet would be to read in the input from file2, put it in a sorted list (or even better, a balanced search tree) and then, as you read in each line from file1, go through the tree or do a binary search of the list to find out whether the string exists.
The idea is that you want to do the processing once to make the list of allowed values as easy to check as possible. Putting them in a binary search tree means that you first compare the string against the word in the middle (alphabetically) of file2. If it comes before that word, you take the left branch (which contains the words that come before the word you just compared to); if it comes after, you only have to look at the right branch.
Similarly, if using a list, you look at the word in the middle of the list and can then remove half of the remaining list from consideration on each iteration. This means you only need about log n steps to check each of the words in file1 against the n words in file2.
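The "process file2 once, then look each word up" idea can be sketched in awk. Note that this swaps in awk's associative array (a hash-style lookup) for the balanced tree or binary search described above; the preprocessing step is the same, only the lookup structure differs. The file contents come from the question, and the directory and variable names are made up:

```shell
dir=$(mktemp -d)
printf '%s\n' FFF GGG ZZZ     > "$dir/file1.txt"
printf '%s\n' AAA BBB CCC ZZZ > "$dir/file2.txt"

# First file argument (file2.txt): remember every line in the array.
# Second file argument (file1.txt): print lines that were never seen.
# NR == FNR is true only while the first file is being read; this idiom
# assumes file2.txt is not empty.
awk 'NR == FNR { seen[$0] = 1; next } !($0 in seen)' \
  "$dir/file2.txt" "$dir/file1.txt" > "$dir/file3.txt"

not_found=$(cat "$dir/file3.txt")
printf '%s\n' "$not_found"
rm -rf "$dir"
```

Each file is read only once, so the cost grows with the combined size of the two files rather than with their product.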
