grep after match - linux

I have the following lines in text file
myname aaa age 22
age 23 myname bbb
How can I find the word after myname using the Linux grep command?
I want the output to be the word after myname ( aaa and bbb )

$ grep -Po '(?<=myname\s)\w+' inputFile

$ grep -o "myname [[:alnum:]]\+" /tmp/sample | cut -f2 -d' '
aaa
bbb

solution with sed:
sed -n '/myname/{s/.*myname \([^ ]*\).*/\1/;p}'
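Where grep -P is not available, the same lookup can be sketched with awk. This is only an illustration: it assumes whitespace-separated fields, takes the sample lines from the question, and uses the hypothetical variable names input and words.

```shell
# Sample lines from the question (stored in a variable for illustration).
input='myname aaa age 22
age 23 myname bbb'

# Scan every field; when a field equals "myname", print the field after it.
words=$(printf '%s\n' "$input" |
  awk '{ for (i = 1; i < NF; i++) if ($i == "myname") print $(i + 1) }')

printf '%s\n' "$words"
```

Unlike the grep -o pipeline, this matches myname only as a whole field, so a word such as notmyname would not trigger a match.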

Related

How to obtain the query order output when we use grep?

I have 2 files
file1.txt
1
3
5
2
File2.txt
1 aaa
2 bbb
3 ccc
4 aaa
5 bbb
Desired output:
1 aaa
3 ccc
5 bbb
2 bbb
Command used : cat File1.txt |grep -wf- File2.txt but the output was:
1 aaa
2 bbb
3 ccc
5 bbb
Is there a way to return the output in the query order?
Thanks in advance!!!
Important Edit
On second thought, do not run grep once per pattern in a redirected loop, as it's incredibly slow. Use awk to read the original patterns and restore their order.
Use this instead
grep -f patterns searchdata | awk 'NR==FNR { line[$1] = $0; next } $1 in line { print line[$1] }' - patterns > matched
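As a small check of the idea, here is a sketch that uses the sample data from the question and lets awk do the whole job without the grep pre-filter. The temporary directory and the variable names workdir and ordered are assumptions for illustration only.

```shell
# Recreate the two sample files from the question in a scratch directory.
workdir=$(mktemp -d)
printf '%s\n' 1 3 5 2 > "$workdir/file1.txt"
printf '%s\n' '1 aaa' '2 bbb' '3 ccc' '4 aaa' '5 bbb' > "$workdir/File2.txt"

# First file read (File2.txt): remember each line under its first field.
# Second file read (file1.txt): print the remembered line in query order.
ordered=$(awk 'NR == FNR { line[$1] = $0; next }
               $1 in line { print line[$1] }' \
          "$workdir/File2.txt" "$workdir/file1.txt")

printf '%s\n' "$ordered"
rm -r "$workdir"
```

Because the array keeps one line per key, a data file with repeated first fields would keep only the last line for each key.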
Benchmark
#!/bin/bash
paste <(shuf -i 1-10000) <(crunch 4 4 2>/dev/null | shuf -n 10000) > searchdata
shuf -i 1-10000 > patterns
printf 'Testing awk:'
time grep -f patterns searchdata | awk 'NR==FNR { line[$1] = $0; next } $1 in line { print line[$1] }' - patterns > matched
wc -l matched
cat /dev/null > matched
printf '\nTesting grep with redirection:'
time {
while read -r pat; do
grep -w "$pat" searchdata >> matched
done < patterns
}
wc -l matched
Output
Testing awk:
real 0m0.022s
user 0m0.017s
sys 0m0.010s
10000 matched
Testing grep with redirection:
real 0m36.370s
user 0m28.761s
sys 0m7.909s
10000 matched
Original
To preserve the query order, read the file line-by-line:
while read -r pat; do grep -w "$pat" file2.txt; done < file1.txt
I don't think grep has an option to support this. Note that this loop will be slow if you have large files to read from.

How to select from the output of grep?

I want only to extract the date from the name of this file:
So I did this:
echo HNR.L04.C07.ldd.T2018050.BG.nc.nc |grep -o '[0-9]\+'
I got this:
04
07
2018050
Now I want to select the third line. Any ideas?
You can get the third line of output with:
sed -n 3p
You can get the last line of output with:
tail -n 1
Ex:
echo HNR.L04.C07.ldd.T2018050.BG.nc.nc | grep -o '[0-9]\+' | sed -n 3p
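Two other ways to reach the same value, as a hedged sketch: pick the third line with awk, or match the date directly. The second form assumes the date is always the only run of exactly seven digits in the name, which the question does not state. The variable names are hypothetical.

```shell
name='HNR.L04.C07.ldd.T2018050.BG.nc.nc'

# Pick the third line of the grep output by record number.
third=$(printf '%s\n' "$name" | grep -o '[0-9][0-9]*' | awk 'NR == 3')

# Assumption: the date is the only run of seven digits in the name.
seven=$(printf '%s\n' "$name" | grep -o '[0-9]\{7\}')

printf '%s\n' "$third" "$seven"
```

Matching the date by its shape avoids depending on how many other numbers appear earlier in the file name.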

In Linux command line console, how to get the sub-string from a file?

The content of the file is fixed.
Example:
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
....
Which command can I use to get all sub-strings, AAA, BBB, CCC, etc.
You can use cut, awk, or perl for this.
cat >> file.data << EOF
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
EOF
AWK
awk '{ print $2 }' file.data
AAA
BBB
CCC
CUT
cut -d " " -f2 file.data
AAA
BBB
CCC
PERL
perl -alne 'print $F[1] ' file.data
AAA
BBB
CCC
You can use cut:
cut -d' ' -f 2 file
You can use AWK for this:
jayforsythe$ cat > file
2016-03-28T00:02 AAA 2016-03-28T00:03 ADASDASD
2016-03-28T00:03 BBB 2016-03-28T00:04 FAFAFDAS
2016-03-28T00:05 CCC 2016-03-28T00:06 SDAFAFAS
jayforsythe$ awk '{ print $2 }' file
AAA
BBB
CCC
To save the result to another file, simply add the redirection operator:
jayforsythe$ awk '{ print $2 }' file > file2

awk how to print the rest

my file contains lines like this
any1 aaa bbb ccc
The delimiter is a space. The number of words in the line is unknown.
I want to put the first word into a var1. It's simple with
awk '{print $1}'
Now I want to put the rest of the line into a var2 with awk.
How can I print the rest of the line with awk?
Better to use read here:
s="any1 aaa bbb ccc"
read var1 var2 <<< "$s"
echo "$var1"
any1
echo "$var2"
aaa bbb ccc
For awk only solution use:
echo "$s" | awk '{print $1; print substr($0, index($0, " ")+1)}'
any1
aaa bbb ccc
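The read example relies on a here-string, which is a bash feature. A minimal sketch of the same split with POSIX parameter expansion follows; it assumes the line has no leading whitespace and contains at least one space.

```shell
s="any1 aaa bbb ccc"

# Drop the longest suffix that starts at a space: leaves the first word.
var1=${s%% *}

# Drop the shortest prefix that ends at a space: leaves the rest.
var2=${s#* }

printf '%s\n' "$var1" "$var2"
```

If the string contains no space at all, both expansions return the whole string, so that case would need a separate check.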
$ var=$(awk '{sub(/^[^[:space:]]+[[:space:]]+/,"")}1' file)
$ echo "$var"
aaa bbb ccc
or in general to skip some number of fields use a RE interval:
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){1}/,"")}1' file
aaa bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){2}/,"")}1' file
bbb ccc
$ awk '{sub(/^[[:space:]]*([^[:space:]]+[[:space:]]+){3}/,"")}1' file
ccc
Note that doing this gets much more complicated if you have a FS that's more than a single char, and the above is just for the default FS since it additionally skips any leading blanks if present (remove the first [[:space:]]* if you have a non-default but still single-char FS).
awk solution:
awk '{$1 = ""; print $0;}'
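Assigning an empty string to $1 makes awk rebuild the line with the output field separator, so the result keeps a leading space. The sketch below shows that effect and one way to strip it; the variable names raw and rest are hypothetical.

```shell
line='any1 aaa bbb ccc'

# Emptying $1 leaves the separator that followed it at the start of $0.
raw=$(printf '%s\n' "$line" | awk '{ $1 = ""; print }')

# Remove that single leading space before printing.
rest=$(printf '%s\n' "$line" | awk '{ $1 = ""; sub(/^ /, ""); print }')

printf '[%s]\n' "$raw" "$rest"
```

Rebuilding the line this way also collapses any runs of spaces or tabs between the remaining fields into single spaces.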

Find lines containing ' \N abcd '

How can I find lines that contain a double tab and then \N?
It should match, for example, \N abcd
I've tried
grep $'\t'$'\t''\N' file1.txt
grep $'\t\t''\N' file1.txt
grep $'\t\t\N' file1.txt
The following works for me:
RHEL:
$ grep $'\t\t''\\N' file1.txt
OSX:
$ grep '\t\t\\N' file1.txt
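A portable sketch that avoids both ANSI-C quoting and regex escaping: build the tab with printf and search with grep -F, which treats the pattern as a fixed string. The sample lines and the variable names tab, sample, and matches are made up for illustration.

```shell
# A literal tab character, produced without relying on $'...' quoting.
tab=$(printf '\t')

# Three test lines: two tabs then \N, one tab then \N, two tabs then N.
sample=$(mktemp)
printf 'a\t\t\\N abcd\n' >  "$sample"
printf 'b\t\\N abcd\n'   >> "$sample"
printf 'c\t\tN abcd\n'   >> "$sample"

# -F disables regex handling, so the backslash needs no extra escaping
# beyond what the double quotes require.
matches=$(grep -F "${tab}${tab}\\N" "$sample")

printf '%s\n' "$matches"
rm "$sample"
```

Only the first test line matches, because it is the only one with two tabs followed by a literal backslash and N.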
