grepping using the result of previous grep - linux

Is there a way to perform a grep based on the results of a previous grep, rather than just piping multiple greps into each other? For example, say I have the log file output below:
ID 1000 xyz occured
ID 1001 misc content
ID 1000 misc content
ID 1000 status code: 26348931276572174
ID 1000 misc content
ID 1001 misc content
To begin with, I'd like to grep the whole log file to see if "xyz occured" is present. If it is, I'd like to get the ID number of that event and grep through all the lines in the file with that ID number, looking for the status code.
I'd imagined that I could use xargs or something like that, but I can't seem to get it to work.
grep "xyz occured" file.log | awk '{ print $2 }' | xargs grep "status code" | awk '{print $NF}'
Any ideas on how to actually do this?

A general answer for grep-ing the grep-ed output:
grep 'pattern1' *.txt | grep 'pattern2'
notice that the second grep is not pointing at a file.
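Applied to the log in the question, the first grep narrows to the ID and the second to the line of interest:
grep 'ID 1000' file.log | grep 'status code'
ID 1000 status code: 26348931276572174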

You're almost there. But while xargs can sometimes be used to do what you want (depending on how the next command takes its arguments), you aren't actually using it to grep for the ID you just extracted. What you need to do is take the output of the first grep (containing the ID code) and use that in the next grep's expression. Something like:
grep "^ID `grep 'xyz occured' file.log | awk '{print $2}'` status code" file.log
Obviously, another option would be to write a script to do this in one pass, à la Ed's suggestion.

Yet another way
for x in `grep "xyz occured" file.log | cut -d' ' -f2`
do
    grep "$x" file.log
done
The thing I like about this method is that, if you wanted to, you could write the output to a separate file for each ID:
grep "$x" file.log >> /var/tmp/"$x".out
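If the same ID can trigger "xyz occured" more than once, deduplicating with sort -u avoids searching the file repeatedly. A minimal sketch of that variation, writing one output file per ID:
for x in `grep "xyz occured" file.log | cut -d' ' -f2 | sort -u`
do
    grep "ID $x" file.log >> /var/tmp/"$x".out
done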

This is all about retrieving files within a narrowed search scope. In your case the search scope is determined by the content of a file.
I have run into this problem often while reducing the search scope through successive searches (applying filters to the previous grep results).
Trying to give a general answer:
Generate a list of files from the result of the first grep:
grep 'pattern1' *.txt | awk -F':' '{print $1}'
Then grep into that list of files:
xargs grep -i 'pattern2'
Apply this cascading filter as many times as you need, just adding awk to keep only the filenames and xargs to pass the filenames to grep -i.
For example:
grep 'pattern1' *.txt | awk -F':' '{print $1}' | xargs grep -i 'pattern2'
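Note that grep -l prints just the names of matching files (each one once, no matter how many lines match), so the awk step can be dropped. A sketch of the same cascade using -l, with the same placeholder patterns:
grep -l 'pattern1' *.txt | xargs grep -li 'pattern2' | xargs grep -i 'pattern3'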

Just use awk:
awk '{info[$2] = info[$2] $0 ORS} /xyz occured/{ids[$2]} END{ for (id in ids) printf "%s",info[id]}' file.log
or:
awk '/status code/{code[$2]=$NF} /xyz occured/{ids[$2]} END{ for (id in ids) print code[id]}' file.log
depending what you really want to output. Some expected output in your question would help.
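For what it's worth, run against the sample log in the question, the first one-liner prints every ID 1000 line, and the second prints just the status code:
awk '/status code/{code[$2]=$NF} /xyz occured/{ids[$2]} END{ for (id in ids) print code[id]}' file.log
26348931276572174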

Grep the result of a previous Grep:
Given this file contents:
ID 1000 xyz occured
ID 1001 misc content
ID 1000 misc content
ID 1000 status code: 26348931276572174
ID 1000 misc content
ID 1001 misc content
This command:
grep "xyz" file.log | awk '{ print $2 }' > f.log; grep `cat f.log` file.log;
returns this:
ID 1000 xyz occured
ID 1000 misc content
ID 1000 status code: 26348931276572174
ID 1000 misc content
It looks for "xyz" in file.log places the result in f.log. Then greps for that ID in file.log. If the outer grep returns multiple ID numbers, then the inner grep will only search the first ID number and error out on the others.

Related

bash: awk print within print

I need to grep for a pattern and then print part of the matching output. Currently I am using the command below, which works fine, but I would like to eliminate the multiple pipes and use a single awk command to achieve the same output. Is there a way to do it using awk?
root@Server1 # cat file
Jenny:Mon,Tue,Wed:Morning
David:Thu,Fri,Sat:Evening
root@Server1 # awk '/Jenny/ {print $0}' file | awk -F ":" '{ print $2 }' | awk -F "," '{ print $1 }'
Mon
I want to get this output using single awk command. Any help?
You can try something like:
awk -F: '/Jenny/ {split($2,a,","); print a[1]}' file
Try this
awk -F'[:,]+' '/Jenny/{print $2}' file.txt
It uses multiple delimiters inside the [ ], and the + means "one or more", since the bracket expression is treated as a regex.
For this particular job, I find grep to be slightly more robust.
Unless your company has a policy not to hire people named Eve.
(Try it out if you don't understand.)
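To see it, add a hypothetical record for Eve and run the awk version: /Eve/ also matches the substring in "Evening", so David's line is selected too. The grep commands below don't have this problem, because [^:]* cannot cross the first colon.
printf 'Eve:Mon,Tue,Wed:Morning\nDavid:Thu,Fri,Sat:Evening\n' | awk -F'[:,]+' '/Eve/{print $2}'
Mon
Thu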
grep -oP '^[^:]*Jenny[^:]*:\K[^,:]+' file
Or to do a whole-word match:
grep -oP '^[^:]*\bJenny\b[^:]*:\K[^,:]+' file
Or when you are confident that "Jenny" is the full name:
grep -oP '^Jenny:\K[^,:]+' file
Output:
Mon
Explanation:
The stuff up until \K speaks for itself: it selects the line(s) with the desired name.
[^,:]+ captures the day of week (in this case Mon).
\K cuts off everything preceding Mon.
-o cuts off anything following Mon.

Ubuntu bash script grepping and cutting on first column

I am trying to implement a bash script that will take the piped input and cut the first column and return a total. The difference is that the cut is not from a file, but from the piped input. For example:
#!/bin/bash
total=$(cut -f1 $1 | numsum)
echo $total
Basically, in my scenario the first column will always be from the passed input.
Example usage is:
./script.sh "cat data.txt | grep -i status | grep something"
data.txt contains:
1 status
2 something
This will produce something like:
2
How can this be achieved? I have noticed that cut seems to work on files only, and I cannot find any examples on Google.
I have managed to fix the issue myself. The code is:
#!/bin/bash
total=$(eval $1 | awk '{sum+=$1} END {print sum}')
echo $total
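Bear in mind that eval executes whatever text it is handed, so only pass in trusted commands. A quick sanity check with the data.txt above (dropping the grep for "something" so the match is non-empty):
./script.sh "grep -i status data.txt"
1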

Validating file records shell script

I have a file with the content below and want to validate it as follows:
1. The file has entries of the form rec$NUM, and each rec$NUM field should be repeated exactly 7 times. For example, rec1.any_attribute means rec1 should appear only 7 times in the whole file.
2. I need a validating script for this: if there are fewer or more than 7 records for a rec$NUM, the script should report that record.
The file is as follows:
rec1:sourcefile.name=
rec1:mapfile.name=
rec1:outputfile.name=
rec1:logfile.name=
rec1:sourcefile.nodename_col=
rec1:sourcefle.snmpnode_col=
rec1:mapfile.enc=
rec2:sourcefile.name=abc
rec2:mapfile.name=
rec2:outputfile.name=
rec2:logfile.name=
rec2:sourcefile.nodename_col=
rec2:sourcefle.snmpnode_col=
rec2:mapfile.enc=
rec3:sourcefile.name=abc
rec3:mapfile.name=
rec3:outputfile.name=
rec3:logfile.name=
rec3:sourcefile.nodename_col=
rec3:sourcefle.snmpnode_col=
rec3:mapfile.enc=
Please Help
Thanks in Advance... :)
Simple awk:
awk -F: '/^rec/{a[$1]++}END{for(t in a){if(a[t]!=7){print "Some error for record: " t}}}' test.rc
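With the sample file above, every record appears exactly 7 times, so this prints nothing. Deleting one rec2 line, for example, would produce:
Some error for record: rec2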
grep '^rec1' file.txt | wc -l
grep '^rec2' file.txt | wc -l
grep '^rec3' file.txt | wc -l
All above should return 7.
The commands:
grep rec file2.txt | cut -d':' -f1 | uniq -c | egrep -v '^ *7'
will succeed if the file follows your rules, and fail (printing the failing record) if it doesn't.
(Insert a sort before uniq -c if record numbers can be interleaved; uniq only groups adjacent lines.)

How to run grep inside awk?

Suppose I have a file input.txt with a few columns and a few rows, where the first column is the key, and a directory dir containing files which contain some of these keys. I want to find all lines in the files in dir which contain these key words. At first I tried to run the command
cat input.txt | awk '{print $1}' | xargs grep dir
This doesn't work because it thinks the keys are paths on my file system. Next I tried something like
cat input.txt | awk '{system("grep -rn dir $1")}'
But this didn't work either; eventually I had to admit that even this doesn't work:
cat input.txt | awk '{system("echo $1")}'
After trying to use \ to escape the whitespace and the $ sign, I came here to ask for your advice. Any ideas?
Of course I can do something like
for x in `cat input.txt` ; do grep -rn $x dir ; done
This is not good enough, because it takes two commands while I want only one. It also shows why xargs doesn't work here: the parameter is not the last argument.
You don't need grep with awk, and you don't need cat to open files:
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' input.txt dir/*
Nor do you need xargs, or shell loops or anything else - just one simple awk command does it all.
If input.txt is not a file, then tweak the above to:
real_input_generating_command |
awk 'NR==FNR{keys[$1]; next} {for (key in keys) if ($0 ~ key) {print FILENAME, $0; next} }' - dir/*
All it's doing is creating an array of keys from the first file (or input stream) and then looking for each key from that array in every file in the dir directory.
Try the following:
awk '{print $1}' input.txt | xargs -n 1 -I pattern grep -rn pattern dir
The first thing you should do is research this.
Next ... you don't need to grep inside awk. That's completely redundant. It's like ... stuffing your turkey with ... a turkey.
Awk can process input and do "grep"-like things itself, without needing to launch the grep command. But you don't even need to do this. Adapting your first example:
awk '{print $1}' input.txt | xargs -n 1 -I % grep % dir
This uses xargs' -I option to put xargs' input into a different place on the command line it runs. In FreeBSD or OSX, you would use a -J option instead.
But I prefer your for loop idea, converted into a while loop:
while read key junk; do grep -rn "$key" dir ; done < input.txt
Use process substitution to create a keyword "file" that you can pass to grep via the -f option:
grep -f <(awk '{print $1}' input.txt) dir/*
This will search each file in dir for lines containing keywords printed by the awk command. It's equivalent to
awk '{print $1}' input.txt > tmp.txt
grep -f tmp.txt dir/*
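If the keys should only match as whole words, grep's -w flag can be combined with -f:
grep -wf <(awk '{print $1}' input.txt) dir/*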
grep requires parameters in order: [what to search] [where to search]. You need to merge keys received from awk and pass them to grep using the \| regexp operator.
For example:
arturcz@szczaw:/tmp/s$ cat words.txt
foo
bar
fubar
foobaz
arturcz@szczaw:/tmp/s$ grep 'foo\|baz' words.txt
foo
foobaz
Finally, you will finish with:
grep `commands|to|prepare|a|keywords|list` directory
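As a sketch of that final step, the keys can be joined with paste; using grep -E the alternation operator is a plain | (this assumes the keys contain no regex metacharacters):
grep -rnE "$(awk '{print $1}' input.txt | paste -sd'|' -)" dir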
In case you still want to use grep inside awk, make sure $1, $2, etc. are outside the quotes.
E.g. this works perfectly:
cat file_having_query | awk '{system("grep " $1 " file_to_be_greped")}'
(Notice the space after grep and before the file name.)

Grep - returning both the line number and the name of the file

I have a number of log files in a directory. I am trying to write a script to search all the log files for a string and echo the names of the files and the line numbers where the string is found.
I figure I will probably have to use two greps, piping the output of one into the other, since the -l option only returns the name of the file and nothing about the line numbers. Any insight into how I can achieve this would be much appreciated.
Many thanks,
Alex
$ grep -Hn root /etc/passwd
/etc/passwd:1:root:x:0:0:root:/root:/bin/bash
combining -H and -n does what you expect.
If you want to echo the required information without the matched string:
$ grep -Hn root /etc/passwd | cut -d: -f1,2
/etc/passwd:1
or with awk :
$ awk -F: '/root/{print "file=" ARGV[1] "\nline=" NR}' /etc/passwd
file=/etc/passwd
line=1
If you want to create shell variables:
$ awk -F: '/root/{print "file=" ARGV[1] "\nline=" NR}' /etc/passwd | bash
$ echo $line
1
$ echo $file
/etc/passwd
Use -H. If you are using a grep that does not have -H, specify two filenames. For example:
grep -n pattern file /dev/null
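For example, with a single file plus /dev/null, the filename prefix is forced to appear:
grep -n root /etc/passwd /dev/null
/etc/passwd:1:root:x:0:0:root:/root:/bin/bash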
My version of grep also returned the text of the matching line, which I wasn't sure you were after. You can pipe the output to an awk command to print ONLY the file name and line number:
grep -rHn "text" . | awk -F: '{print $1 ":" $2}'
