Pass command-line arguments to grep as search patterns and print lines which match them all - linux

I'm learning about grep commands.
I want to write a program that, when the user enters one or more words, prints the lines of a data file that contain those words.
So I joined the words the user typed with '|' and passed the result to grep.
But that is an OR operation, and I want an AND operation.
I learned that an AND operation can be done with grep as follows:
cat <file> | grep 'pattern1' | grep 'pattern2' | grep 'pattern3'
But I don't know how to put the user input in the 'pattern1', 'pattern2', 'pattern3' positions, because the number of words the user enters is not known in advance.
As the number of words grows, grep must be run through more and more pipes, and I don't know how to build that part.
The user input is as follows:
$ [the name of my program] 'pattern1' 'pattern2' 'pattern3' ...
I'd really appreciate your help.

With grep -f you can grep multiple items, when each of them is on a line in a file.
With <(command) you can let Bash think that the result of command is a file.
With printf "%s\n" and a list of arguments, each argument is printed on a new line.
Together (note that grep -f prints lines matching any one of the patterns, so this is OR rather than AND):
grep -f <(printf "%s\n" "$@") datafile

I suggest using awk pattern logic:
awk '/RegExp-pattern-1/ && /RegExp-pattern-2/ && /RegExp-pattern-3/' input.txt
The advantages: you can combine RegExp patterns with the logical operators && and ||, and you scan the whole file only once.
The disadvantages: you must provide the list of files (awk can't traverse subdirectories), and the RegExp syntax is limited compared to grep -E or grep -P.

In principle, what you are asking could be done with a loop with output to a temporary file.
file=inputfile
temp=$(mktemp -d -t multigrep.XXXXXXXXX) || exit
trap 'rm -rf "$temp"' ERR EXIT
for regex in "$@"; do
grep "$regex" "$file" >"$temp"/output
mv "$temp"/output "$temp"/input
file="$temp"/input
done
cat "$temp"/input
However, a better solution is probably to arrange for Awk to check for all the patterns in one go, and avoid reading the same lines over and over again.
Passing the arguments to Awk with quoting intact is not entirely trivial. Here, we simply pass them as command-line arguments and process those into an array within the Awk script itself.
awk 'BEGIN { for(i=1; i<ARGC; ++i) a[i]=ARGV[i];
ARGV[1]="-"; ARGC=2 }
{ for(n=1; n<i; ++n) if ($0 !~ a[n]) next; }1' "$@" <file
In brief, in the BEGIN block, we copy the command-line arguments from ARGV to a, then replace ARGV and ARGC to pass Awk a new array of (apparent) command-line arguments which consists of just - which means to read standard input. Then, we simply iterate over a and skip to the next line if the current input line from standard input does not match. Any remaining lines have matched all the patterns we passed in, and are thus printed.
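The pipeline from the question can also be built to any length with a small recursive function, which is perhaps the most direct answer to "how do I add one pipe per argument". This is only a minimal sketch and is not taken from the answers above: grep_all is a hypothetical name, and it assumes the patterns arrive as arguments and the data on standard input.

```shell
# grep_all: hypothetical helper (the name is an assumption, not from the answers).
# Prints the lines of standard input that match every pattern given as an argument.
grep_all() {
  if [ "$#" -eq 0 ]; then
    # No patterns left: pass the surviving lines through unchanged.
    cat
  else
    pattern=$1
    shift
    # Filter on the first pattern, then recurse on the remaining ones.
    grep -e "$pattern" | grep_all "$@"
  fi
}
```

Inside the script it could be called as grep_all "$@" <datafile. It still starts one grep per pattern, so the single-pass Awk version is likely to be cheaper on large files.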

Related

How to move files using the result as condition after grep command

I have 2 files that I needed to grep in a separate file.
The two files are in this directory /var/list
TB.1234.txt
TB.135325.txt
I have to grep them in another file in another directory which is in /var/sup/. I used the command below:
for i in TB.*; do grep "$i" /var/sup/logs.txt; done
What I want to do is this: if the result of the grep command contains the word "ERROR", the file found in /var/list should be moved to another directory, /var/last.
For example, if I grep for TB.1234.txt in /var/sup/logs.txt and the result looks like this:
ERROR: TB.1234.txt
then TB.1234.txt should be moved to /var/last.
Please help. I don't know how to construct the logic for moving the files, and I'm stuck at the command I provided. I also tried using two greps in a for loop, but I got an error.
I am new to coding and would really appreciate any help and suggestions. Thank you so much.
If you are asking how to move files which contain "ERROR", this should be extremely straightforward.
for file in TB.*; do
grep -q 'ERROR' "$file" &&
mv "$file" /var/last/
done
The notation this && that is a convenient shorthand for
if this; then
that
fi
The -q option to grep says to not print the matches, and quit as soon as you find one. Like all well-defined commands, grep sets its exit code to reflect whether it succeeded (the status is visible in $?, but usually you would not examine it directly; perhaps see also Why is testing "$?" to see if a command succeeded or not, an anti-pattern?).
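To illustrate how the exit status of grep -q drives the decision, here is a small sketch that wraps the loop in a function. move_files_containing is a hypothetical name, and the argument order (pattern, destination directory, then files) is an assumption made for the example.

```shell
# move_files_containing: hypothetical helper, not part of the answer above.
# Usage: move_files_containing PATTERN DEST FILE...
move_files_containing() {
  pattern=$1
  dest=$2
  shift 2
  for file in "$@"; do
    # Skip anything that is not a regular file (e.g. an unmatched glob).
    [ -f "$file" ] || continue
    # grep -q prints nothing; only its exit status is used here.
    if grep -q -e "$pattern" -- "$file"; then
      mv -- "$file" "$dest"/
    fi
  done
}
```

For the case in the question this might be called as move_files_containing ERROR /var/last TB.* from within /var/list.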
Your question is rather unclear, but if you want to find either of the matching files in a third file, perhaps something like
awk 'FNR==1 && (++n < ARGC-1) { a[n] = FILENAME; nextfile }
/ERROR/ { for(j=1; j<=n; ++j) if ($0 ~ a[j]) b[a[j]]++ }
END { for(f in b) print f }' TB*.txt /var/sup/logs.txt |
xargs -r mv -t /var/last/
This is somewhat inefficient in that it will read all the lines in the log file, and brittle in that it will only handle file names which do not contain newlines. (The latter restriction is probably unimportant here, as you are looking for file names which occur on the same line as the string "ERROR" in the first place.)
In some more detail, the Awk script collects the wildcard matches into the array a, then processes all lines in the last file, looking for ones with "ERROR" in them. On these lines, it checks if any of the file names in a are also found, and if so, also adds them to b. When all lines have been processed, print the entries in b, which are then piped to a simple shell command to move them.
xargs is a neat command to read some arguments from standard input, and run another command with those arguments added to its command line. The -r option says to not run the other command if there are no arguments.
(mv -t is a GNU extension; it's convenient, but not crucial to have here. If you need portable code, you could replace xargs with a simple while read -r loop.)
The FNR==1 condition requires that the input files are non-empty.
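The portable alternative mentioned above, a while read -r loop in place of xargs and mv -t, might look like the following sketch. move_listed is a hypothetical name; it assumes one file name per line on standard input and the destination directory as its only argument.

```shell
# move_listed: hypothetical helper that replaces "xargs -r mv -t DEST".
# Reads file names from standard input, one per line, and moves each to DEST.
move_listed() {
  dest=$1
  while IFS= read -r name; do
    # Skip empty lines so mv is never called without a source file.
    [ -n "$name" ] || continue
    mv -- "$name" "$dest"/
  done
}
```

The Awk output would then be piped into move_listed /var/last instead of xargs. Like the xargs version, it cannot handle file names that contain newlines.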
If the text file is small, or you expect a match near its beginning most of the time, perhaps just live with grepping it multiple times:
for file in TB.*; do
grep -Eq "ERROR.*$file|$file.*ERROR" /var/sup/logs.txt &&
mv "$file" /var/last/
done
Notice how we now need double quotes, not single, around the regular expression so that the variable $file gets substituted in the string.
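One caveat with substituting $file into a regular expression is that the dots in a name such as TB.1234.txt match any character. If that matters, a literal comparison can be sketched with Awk's index() function. log_has_error_for is a hypothetical helper name, and the sketch assumes file names without backslashes.

```shell
# log_has_error_for: hypothetical helper, not from the answer above.
# Succeeds if some line of LOG contains both "ERROR" and NAME as literal text.
# Usage: log_has_error_for NAME LOG
log_has_error_for() {
  # awk -v interprets backslash escapes, so names containing
  # backslashes are assumed not to occur.
  awk -v name="$1" '
    index($0, "ERROR") && index($0, name) { found = 1; exit }
    END { exit !found }
  ' "$2"
}
```

In the loop it could be used as log_has_error_for "$file" /var/sup/logs.txt && mv "$file" /var/last/.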
grep has an -l switch, which prints only the names of the files that contain a pattern. It should not be too difficult to write something like this (a rough sketch to give you the idea; <destination> is a placeholder):
if [ "$(grep -l "ERROR" TB.* | wc -l)" -gt 0 ]
then for f in $(grep -l "ERROR" TB.*)
do cp "$f" <destination>
done
fi
The wc -l is to check if there are any files which contain the word "ERROR". If not, nothing needs to be done.
Edit after Tripleee's comment:
My proposal can be simplified as:
if grep -q "ERROR" TB.*
then for f in $(grep -l "ERROR" TB.*)
do cp "$f" <destination>
done
fi
Edit after Tripleee's second comment:
This is even shorter:
for f in $(grep -l "ERROR" TB.*);
do cp "$f" destination;
done

parsing complex string using shell script

I've been trying all day to find a good way to parse some strings with a shell script. The strings are used as call parameters for some applications.
They look like:
parsingParams -c "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'" start
I'm only allowed to use shell script. I tried some sed and cut commands, but nothing works well.
My attempts look like:
prog=$(echo $@ | cut -d= -f3 | sed 's|\s.*$||')
That returns the correct value of prog, but I couldn't find a good way to get the value of arg.
The info parameter is optional, so it may be left out.
Does anyone have a good idea for solving this problem?
Many thanks in advance.
Looks like you could use eval to let the shell parse your input string, but if you don't control the input (if it comes from an unreliable source), that will introduce a major vulnerability (imagine an attacker somehow passes -c "rm -rf /" to your program).
A safer way would be to explicitly specify allowed forms of user input.
The problem you have with splitting on space (with cut) if the space is quoted, can be avoided if you specify valid fields (content, not separator), for example in GNU awk, you can use FPAT:
$ params="id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ awk -v FPAT="[^=]+=(\"[^\"]*\"|'[^']*'|[^ ]*) *" '{for (i=1; i<=NF; i++) print $i}' <<<"$params"
id=uid5
prog=/opt/bin/example
arg="-D -t5 >/dev/null 1>&2"
info='fdhff fd'
Valid fields will be in one of the following forms:
var="val with spaces"
var='val with spaces'
var=val_no_spaces
Now with assignments split (one per line, assuming newline is not allowed in params), you can process them further, even with cut:
$ awk ... | cut -d $'\n' -f3
arg="-D -t5 >/dev/null 1>&2"
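FPAT is specific to GNU awk. Where only a POSIX awk is available, the same three field forms can be recognized with match(), substr() and index(). The following is a sketch under that assumption: parse_params is a hypothetical name, and unlike the FPAT version it strips the surrounding quotes from the values.

```shell
# parse_params: hypothetical helper, not from the answer above.
# Prints one key=value line per assignment in the string given as $1,
# accepting var="..." , var=(single-quoted) and var=bare_word forms.
parse_params() {
  printf '%s\n' "$1" | awk '
  {
    rest = $0
    # Each iteration consumes one "key=" prefix and the value after it.
    while (match(rest, /^[ \t]*[A-Za-z_][A-Za-z0-9_]*=/)) {
      key = substr(rest, RSTART, RLENGTH - 1)
      sub(/^[ \t]*/, "", key)
      rest = substr(rest, RSTART + RLENGTH)
      q = substr(rest, 1, 1)
      if (q == "\"" || q == "\047") {
        # Quoted value: take everything up to the matching closing quote.
        close_at = index(substr(rest, 2), q)
        if (close_at == 0) break   # unterminated quote: stop parsing
        val = substr(rest, 2, close_at - 1)
        rest = substr(rest, close_at + 2)
      } else {
        # Bare value: take everything up to the next blank.
        match(rest, /^[^ \t]*/)
        val = substr(rest, 1, RLENGTH)
        rest = substr(rest, RLENGTH + 1)
      }
      print key "=" val
    }
  }'
}
```

Parsing stops at the first piece of text that does not look like an assignment, which keeps unexpected input from being interpreted at all.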
eval
$ eval "id=uid5 prog=/opt/bin/example arg=\"-D -t5 >/dev/null 1>&2\" info='fdhff fd'"
$ echo $id
uid5
$ echo $prog
/opt/bin/example
$ echo $arg
-D -t5 >/dev/null 1>&2
$ echo $info
fdhff fd

How to escape square brackets in a ls output

I'm having some problems escaping square brackets in file names.
I need to compare two lists. The ls output is the first list, and the second is the file ARQ02.
#!/bin/bash
exec 3< <(ls /home/lint)
while read arq <&3; do
var=`grep -e "$arq" ARQ02`
if [ "$?" -ne 0 ] ; then
echo "$arq" >> result
fi
done
exec 3<&-
Sorry for my bad English.
Your immediate problem is that you must instruct grep to interpret the search term as a literal rather than a regular expression, using the -F option:
var=$(grep -Fe "$arq" ARQ02)
That way, any regex metacharacters that happen to be in the output from ls /home/lint - such as [ and ] - will still be treated as literals and won't break the grep invocation.
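As an illustration, the loop from the question can be written with a glob instead of parsing ls output, keeping the -F fix. This is a sketch only: missing_from_list is a hypothetical name, and it assumes, as in the question, that the goal is to print the names that do not appear in the list file.

```shell
# missing_from_list: hypothetical helper, not from the answer above.
# Prints the names of the entries in DIR that do not occur in LIST.
# Usage: missing_from_list DIR LIST
missing_from_list() {
  dir=$1
  list=$2
  for path in "$dir"/*; do
    # An unmatched glob stays literal; skip it.
    [ -e "$path" ] || continue
    name=${path##*/}
    # -F: treat the name as a fixed string, so [ and ] are not special.
    grep -qF -e "$name" -- "$list" || printf '%s\n' "$name"
  done
}
```

For the question this might be run as missing_from_list /home/lint ARQ02 > result. Note that -F still matches substrings; adding -x would require the whole line to equal the name.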
That said, it looks like your command could be streamlined, such as by using the output from ls /home/lint directly as the set of search strings to pass to grep at once, using the -f option:
grep -Ff <(ls /home/lint) ARQ02 > result
<(...) is a so-called process substitution, which, simply put, presents the output from a command as if it were a (temporary) file, which is what -f expects: a file containing the search terms for grep.
Alternatively, if:
the lines of ARQ02 contain only filenames that fully match (some of) the filenames in the output from ls /home/lint, and
you don't mind sorting or want to sort the matches stored in result,
consider HuStmpHrrr's helpful answer.
I have to assume my interpretation is correct. Based on that, a one-liner can easily solve your problem. I need to make two assumptions here: your file names don't contain newlines, and you are using a modern bash:
comm -23 <(printf "%s\n" * | sort) <(sort ARQ02)
In bash, <() runs the command in a subshell and presents its stdout as a file. comm is the command that computes the difference between two sorted input streams.
To explain in detail:
comm
-23 # suppress lines unique to ARQ02 and lines common to both
<(printf "%s\n" * | # print all the files in the local folder, one per line
sort) # sort them
<(sort ARQ02)
Sorting is necessary because comm compares its inputs line by line in order.

Linux shell scripting: How to store output from terminal in integers (but only numbers)?

I'm new to shell scripting and here is my problem:
I want to store the PIDs from the output of airmon-ng check in some variables (for example $1, $2, $3) so that I can execute kill $1 $2 $3.
Here is sample output of airmon-ng check:
Found 3 processes that could cause trouble.
If airodump-ng, aireplay-ng or airtun-ng stops working after
a short period of time, you may want to kill (some of) them!
PID Name
707 NetworkManager
786 wpa_supplicant
820 dhclient
I want to grab numbers 707, 786, 820.
I tried using set 'airmon-ng check' and then using for loop:
set `airmon-ng check`
n=$#
for (( i=0; i<=n; i++ ))
do
echo $i
done
It outputs 1, 2, 3, ... 36,
not the words or numbers, so I couldn't figure out how I should do it.
airmon-ng check | egrep -o '\b[0-9]+\b' | xargs kill
egrep is grep with extended regular expressions (like grep -E); -o says to extract only the matching parts; \b matches word boundaries, so you don't get any numbers accidentally occurring in process names or the like; [0-9]+ matches one or more decimal digits; and xargs kill passes all the matches as arguments to the kill command.
Note that parsing output intended to be read by humans might not always be a good idea. In particular, the pattern also matches the 3 in the "Found 3 processes" line of the sample output, so the header lines should be filtered out first. Also, just killing all those processes doesn't sound too smart either, but proper usage of aircrack-ng is beyond the scope of this question.
You can get list of the PIDs separated by spaces e.g. like this (everything from the 1st column after "PID"):
l=`airmon-ng check | awk 'BEGIN { p=0 } { if (p) { print $1" "; } if ($1=="PID") { p=1 } }' | tr '\n' ' '`
Why not use grep?
myvar=$(airmon-ng check | grep -o '[0-9]\{3,6\}')
This assumes a PID of 3 to 6 digits, and will grab anything from the airmon-ng output of a similar length. So this may not work as well if the output includes other strings with digits of a similar length.
I would use awk for this and store the output in an array
pids=( $(airmon-ng check | awk '/^[[:blank:]]+[[:digit:]]+[[:blank:]]+/{print $1}') )
#'pids' is an array
kill "${pids[@]}" #killing all the processes thus found.
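A variant that avoids picking up stray numbers, such as the count in the "Found 3 processes" line, is to print the first column only for lines that come after the PID header. This is a sketch based on the sample output shown in the question; extract_pids is a hypothetical name, and the real output format of airmon-ng check may differ.

```shell
# extract_pids: hypothetical helper, written against the sample output only.
# Reads the report on standard input and prints one PID per line.
extract_pids() {
  awk '
    # After the header has been seen, print purely numeric first fields.
    seen && $1 ~ /^[0-9]+$/ { print $1 }
    # The header line is the one whose first field is "PID".
    $1 == "PID" { seen = 1 }
  '
}
```

It could be used as pids=$(airmon-ng check | extract_pids), followed by kill $pids once the list has been checked to be non-empty.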

Delete lines from a file matching first 2 fields from a second file in shell script

Suppose I have setA.txt:
a|b|0.1
c|d|0.2
b|a|0.3
and I also have setB.txt:
c|d|200
a|b|100
Now I want to delete from setA.txt lines that have the same first 2 fields with setB.txt, so the output should be:
b|a|0.3
I tried:
comm -23 <(sort setA.txt) <(sort setB.txt)
But the equality is defined for whole line, so it won't work. How can I do this?
$ awk -F\| 'FNR==NR{seen[$1,$2]=1;next;} !seen[$1,$2]' setB.txt setA.txt
b|a|0.3
This reads through setB.txt just once, extracts the needed information from it, and then reads through setA.txt while deciding which lines to print.
How it works
-F\|
This sets the field separator to a vertical bar, |.
FNR==NR{seen[$1,$2]=1;next;}
FNR is the number of lines read so far from the current file and NR is the total number of lines read. Thus, when FNR==NR, we are reading the first file, setB.txt. If so, set the value of associative array seen to true, 1, for the key consisting of fields one and two. Lastly, skip the rest of the commands and start over on the next line.
!seen[$1,$2]
If we get to this command, we are working on the second file, setA.txt. Since ! means negation, the condition is true if seen[$1,$2] is false which means that this combination of fields one and two was not in setB.txt. If so, then the default action is performed which is to print the line.
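The command above prints the filtered lines rather than changing setA.txt. To actually delete the lines from the file, the output can be written to a temporary file and copied back. The following is a sketch under stated assumptions: delete_matching is a hypothetical name, mktemp is assumed to be available, and the early return covers the case of an empty key file, where FNR==NR would otherwise also be true for the second file.

```shell
# delete_matching: hypothetical helper, not from the answer above.
# Removes from TARGET every line whose first two |-separated fields
# also occur as the first two fields of some line in KEYS.
# Usage: delete_matching KEYS TARGET
delete_matching() {
  keys=$1
  target=$2
  # With an empty key file, FNR==NR would also hold for the target lines,
  # and there is nothing to delete anyway, so stop early.
  [ -s "$keys" ] || return 0
  tmp=$(mktemp) || return
  if awk -F'|' 'FNR==NR { seen[$1,$2] = 1; next } !(($1,$2) in seen)' \
       "$keys" "$target" >"$tmp"
  then
    # Copy back instead of mv so the target keeps its permissions.
    cat "$tmp" >"$target"
  fi
  rm -f "$tmp"
}
```

With the files from the question, delete_matching setB.txt setA.txt would leave only the b|a|0.3 line in setA.txt.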
This should work:
sed -n 's#\(^[^|]*|[^|]*\)|.*#/^\1/d#p' setB.txt |sed -f- setA.txt
How this works:
sed -n 's#\(^[^|]*|[^|]*\)|.*#/^\1/d#p'
generates an output:
/^c|d/d
/^a|b/d
which is then used as a sed script for the next sed after the pipe and outputs:
b|a|0.3
(IFS=$'|'; cat setA.txt | while read x y z; do grep -q -P "\Q$x|$y|\E" setB.txt || echo "$x|$y|$z"; done; )
Explanation: grep -q only tests whether grep can find the regexp, without printing anything; -P means use Perl syntax, so that the | is matched literally thanks to the \Q..\E construct.
IFS=$'|' makes bash use | instead of whitespace (space, tab, etc.) as the token separator.

Resources