Function to search for multiple patterns using grep - linux

I want to make a bash script to use grep to search for lines which have multiple patterns (case-insensitive). I want to create a bash script which I can use as follows:
myscript file.txt pattern1 pattern2 pattern3
and it should get translated to:
grep -i --color=always pattern1 file.txt | grep -i pattern2 | grep -i pattern3
I tried the following bash script, but it is not working:
#!/bin/bash
grep -i --color=always $2 $1 | grep -i $3 | grep -i $4 | grep -i $5 | grep -i $6 | grep -i $7
The error is:
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.
Usage: grep [OPTION]... PATTERN [FILE]...
Try 'grep --help' for more information.

I think you can do a recursive function:
search() {
if [ $# -gt 0 ]; then
local pat=$1
shift
grep -i "$pat" | search "$@"
else
cat
fi
}
In your script you would call this function and pass the search patterns as arguments. Say that $1 is the file and the rest of the arguments are patterns; then you would do:
file=$1
shift
cat "$file" | search "$@"
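Putting the two pieces together, a minimal sketch of the whole script could look like the following. The function names multigrep and _filter are made up for illustration, and -i is added because the question asks for case-insensitive matching.

```shell
#!/bin/bash
# Hypothetical helper: pass stdin through one "grep -i" per remaining argument.
_filter() {
    if [ "$#" -gt 0 ]; then
        local pat=$1
        shift
        # "--" stops a pattern that starts with "-" from being read as an option
        grep -i -- "$pat" | _filter "$@"
    else
        cat
    fi
}

# Hypothetical wrapper: multigrep FILE PATTERN...
# Prints the lines of FILE that match every pattern, ignoring case.
multigrep() {
    if [ "$#" -lt 2 ]; then
        echo "usage: multigrep FILE PATTERN..." >&2
        return 2
    fi
    local file=$1
    shift
    _filter "$@" < "$file"
}
```

Called as multigrep file.txt pattern1 pattern2 pattern3, it runs exactly as many grep processes as there are patterns, so no grep is ever started without a pattern.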

When you have GNU awk, you can use
awk 'BEGIN {IGNORECASE=1} /pattern1/ && /pattern2/ && /pattern3/' file.txt
EDIT:
You can use this in a script like this:
inputfile="$1"
shift
awk -f <(echo "BEGIN {IGNORECASE=1}"; printf " /%s/ &&" "$@" | sed 's/&&$//') "${inputfile}"
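If GNU awk is not available, or a pattern may contain a slash, a variant that hands the patterns to awk as arguments instead of splicing them into the program text may be easier to live with. This is only a sketch: the name awk_all is hypothetical, and lowercasing both the line and the pattern with tolower() is a swapped-in technique for case-insensitivity that would also lowercase any escape sequences inside a pattern.

```shell
#!/bin/bash
# Hypothetical helper: awk_all FILE PATTERN...
# Prints lines of FILE that match every pattern, ignoring case, in one pass.
awk_all() {
    local file=$1
    shift
    awk -v n="$#" '
        BEGIN {
            # The first n command-line arguments are patterns, not input files.
            for (i = 1; i <= n; i++) {
                pat[i] = tolower(ARGV[i])
                ARGV[i] = ""      # an empty ARGV element is skipped as an input file
            }
        }
        {
            line = tolower($0)
            for (i = 1; i <= n; i++)
                if (line !~ pat[i]) next   # one pattern missing: drop the line
            print
        }
    ' "$@" "$file"
}
```

Usage would be awk_all file.txt pattern1 pattern2 pattern3.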

If you pass fewer patterns than the script expects, $3 and the later parameters expand to nothing, so the corresponding grep commands receive no pattern and print their usage message.

Related

how to use && with grep in bash

I want to check in bash whether multiple lines exist in a file.
For that I use grep -q, which works with only one line:
if grep -q string1 "/path/to/file";then
echo 'exists'
else
echo 'does not exist'
fi
I tried many things in various combinations, for example:
if grep -q [ string1 ] && grep -q [ string2 ] "path/to/file";then
I also tried it with -E:
grep -E 'pattern1' filename | grep -E 'pattern2'
but nothing seems to work. Any ideas?
Rather than running multiple grep commands you can use this gnu-awk command to assert presence of multiple strings in a file:
awk -v RS='\\Z' '/string1/ && /string2/ && /string3/{e=1} END{exit !e}' file &&
echo 'exists' || echo 'does not exist'
RS=\Z will make awk read all the input as a single record
Using && between multiple search terms will make sure all the search words exist in the input file
This will print exists only if all 3 search terms exist in the input file.
since @iruvar hasn't posted his comment as an answer, i'll put it here:
grep -q string_1 file && grep -q string_2 file
now, here is my contribution. is @anubhava's more computationally complex awk answer, which reads the file only once, any faster than @iruvar's simpler answer, which reads the file three times?
awk 11.730 s
grep && grep 0.258 s
no.
this surely will depend on the speed of the filesystem vs the cpu, and on how much caching goes on, but on my system, which is probably a typical B+/A- workstation, grep kw1 file && grep kw2 file && grep kw3 file is ~50x as fast as @anubhava's awk solution. this held true both on ssd and spindle raid. (details: test file was 5,000,000 lines, 160M, and had kw1 on the first line, kw2 on the 2.5 millionth, and kw3 on the 5 millionth.)
some easy optimization is possible, for example, if you can solve your problem by matching whole lines, do so (with grep -x); it's twice as fast in this case.
for many (e.g., >1,000) files, it is faster to use grep -l and xargs:
grep -l kw1 *.txt | xargs grep -l kw2 | xargs grep -q kw3
as opposed to a loop:
for f in *.txt; do
grep -q kw1 "$f" && grep -q kw2 "$f" && grep -q kw3 "$f"
done
with the same test file, grep -l | xargs grep took 0.258 s, just like grep && grep. with two test files, it was still no faster than grep && grep. with 2000 test files of 5,000 lines each, none of which contained any matches, grep -l | xargs grep was ~10x as fast as grep && grep.
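One caveat with the grep -l | xargs pipeline is that file names containing spaces get split by xargs. A sketch that avoids this, assuming GNU grep and GNU xargs (-Z, -0 and -r are not POSIX), could look like this; the function name files_with_all3 is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: files_with_all3 KW1 KW2 KW3 FILE...
# Prints the names of the files that contain all three keywords.
files_with_all3() {
    local kw1=$1 kw2=$2 kw3=$3
    shift 3
    # -Z ends each file name with a NUL byte; xargs -0 splits on NUL only.
    # -r keeps xargs from running grep at all when the list is empty.
    grep -lZ -- "$kw1" "$@" |
        xargs -0 -r grep -lZ -- "$kw2" |
        xargs -0 -r grep -l -- "$kw3"
}
```

It would be called as files_with_all3 kw1 kw2 kw3 *.txt. Each stage only looks at files that survived the previous one, which matches the behaviour described in the timings above when most files contain no match.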
There are a couple of ambiguities in your question, but assuming you want pattern_1 and pattern_2 to exist in a file (not necessarily on the same line) then you can do this.
for file in *; do
egrep -q pattern_1 "$file" && egrep -q pattern_2 "$file" && echo "$file"
done
With grep -P you can match multiple patterns on the same line:
grep -P '(?=.*string1)(?=.*string2)' file
The above will print lines that match both string1 and string2.
(?=...) is a positive lookahead, which matches a pattern without making it part of the match.
And -z makes grep treat the input as NUL-separated, so a file without NUL bytes is read as a single record:
% seq 1 100 | grep -qzP '(?=.*1)(?=.*5)'; echo $?
0
% seq 1 100 | grep -qzP '(?=.*1)(?=.*a)'; echo $?
1
You can do it like this:
if grep -q 'string1' /path/to/file; then
if grep -q 'string2' /path/to/file; then
echo exists
else
echo 'does not exist'
fi
else
echo 'does not exist'
fi
Or:
grep -q 'string1' /path/to/file &&
grep -q 'string2' /path/to/file &&
echo exists ||
echo 'does not exist'
You can use grep -q for each string and chain the two commands with &&:
if grep -q string1 "/path/to/file" && grep -q string2 "/path/to/file";then
echo 'exists'
else
echo 'does not exist'
fi
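The chained grep -q idea from the answers above extends to any number of strings with a small loop. This is a sketch only; the function name file_has_all is made up for illustration.

```shell
#!/bin/bash
# Hypothetical helper: file_has_all FILE STRING...
# Succeeds only if every string occurs somewhere in FILE.
file_has_all() {
    local file=$1
    shift
    local pat
    for pat in "$@"; do
        # Stop at the first string that is missing.
        grep -q -- "$pat" "$file" || return 1
    done
    return 0
}
```

It fits into the same if statement as before: if file_has_all /path/to/file string1 string2; then echo 'exists'; else echo 'does not exist'; fi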

Concatenating xargs with the use of if-else in bash

I've got two test files, namely ttt.txt and ttt2.txt, whose contents are shown below:
#ttt.txt
(132) 123-2131
543-732-3123
238-3102-312
#ttt2.txt
1
2
3
I've already tried the following commands in bash and it works fine:
if grep -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" ttt1.txt ; then echo "found"; fi
# with output 'found'
if grep -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" ttt2.txt ; then echo "found"; fi
But when I combine the above command with xargs, it fails with the error '-bash: syntax error near unexpected token `then''. Could anyone give me some explanation? Thanks in advance!
ll | awk '{print $9}' | grep ttt | xargs -I $ if grep --quiet -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" $; then echo "found"; fi
$ is a special character in bash (it marks variables) so don't use it as your xargs marker, you'll only get confused.
The real problem here though is that you are passing if grep --quiet -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" $ as the argument to xargs, and then the remainder of the line is being treated as a new command, because it breaks at the ;.
You can wrap the whole thing in a sub-invocation of bash, so that xargs sees the whole command:
$ ll | awk '{print $9}' | grep ttt | xargs -I xx bash -c 'if grep --quiet -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" xx; then echo "found"; fi'
found
Finally, ll | awk '{print $9}' | grep ttt is a needlessly complicated way of listing the files that you're looking for. You actually don't need any of the code above, just do this:
$ if grep --quiet -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" ttt*; then echo "found"; fi
found
Alternatively, if you want to process each file in turn (which you don't need here, but you might want when this gets more complicated):
for file in ttt*
do
if grep --quiet -oE "(\(\d{3}\)[ ]?\d{3}-\d{4})|(\d{3}-\d{3}-\d{4})" "$file"
then
echo "found"
fi
done
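A side note on the pattern itself: \d is a Perl-style shorthand and is not part of POSIX extended regular expressions, so its behaviour under grep -E is not something to rely on. A sketch of the same per-file loop with [0-9] spelled out could look like this; the names phone_re and report_phone_files are hypothetical.

```shell
#!/bin/bash
# Same two phone-number shapes as in the question, written with [0-9]
# so that the expression is plain POSIX ERE.
phone_re='(\([0-9]{3}\) ?[0-9]{3}-[0-9]{4})|([0-9]{3}-[0-9]{3}-[0-9]{4})'

# Hypothetical helper: report_phone_files FILE...
# Prints "found: FILE" for each file with at least one matching line.
report_phone_files() {
    local file
    for file in "$@"; do
        if grep -qE -- "$phone_re" "$file"; then
            echo "found: $file"
        fi
    done
}
```

It would be called as report_phone_files ttt*. Keeping the original \d and switching from -E to -P should also work where grep is built with PCRE support.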

Pass variable to grep command

I want to pass a variable to my grep command in a Linux bash script. The variable holds the text of a page downloaded from the Internet, and I want to find some words in it.
I have tried the following command in my bash:
cat "$var" | grep -Po '(?<=\d[a-zA-Z]).*\..*(?=[a-zA-Z]\d)'
echo "$var" | grep -Po '(?<=\d[a-zA-Z]).*\..*(?=[a-zA-Z]\d)'
grep -Po '(?<=\d[a-zA-Z]).*\..*(?=[a-zA-Z]\d)' <<< "$var"
but I don't get the right result.
How can I do it?
Here is my script:
#!/bin/sh
urltext=$(curl -s https://example.com)
string=$(grep -Po '(?<=\d[a-zA-Z]).*\..*(?=[a-zA-Z]\d)' "$urltext" | tr '.' '\n' )
cat $string
What is the variable supposed to contain?
Is it a file name?
grep <options> <expression> "$var"
Is it a string?
echo "$var"|grep <options> <expression>
grep <options> <expression> <(echo "$var")
NB: try the -e option if there are several lines in $var
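In the script from the question, urltext holds the downloaded text itself, so passing "$urltext" as a file argument makes grep look for a file with that name. A minimal sketch of the string case follows; the function name extract_matches is hypothetical, and -oE with a simple pattern stands in for the -Po expression from the question.

```shell
#!/bin/bash
# Hypothetical helper: extract_matches TEXT REGEX
# TEXT is the content to search (a string, not a file name).
extract_matches() {
    # printf is used instead of echo so that text starting with "-" or
    # containing backslashes is passed through unchanged.
    printf '%s\n' "$1" | grep -oE -- "$2"
}
```

With that in place, the result can be captured with string=$(extract_matches "$urltext" 'REGEX') and printed with printf '%s\n' "$string" rather than cat, since cat also expects file names.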

Find and highlight text in linux command line

I am looking for a linux command that searches for a string in a text file,
and highlights (colors) it on every occurrence in the file, WITHOUT omitting text lines (like grep does).
I wrote this handy little script. It could probably be expanded to handle args better
#!/bin/bash
if [ "$1" == "" ]; then
echo "Usage: hl PATTERN [FILE]..."
elif [ "$2" == "" ]; then
grep -E --color "$1|$" /dev/stdin
else
grep -E --color "$1|$" "$2"
fi
it's useful for stuff like highlighting users running processes:
ps -ef | hl "alice|bob"
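One way the argument handling could be expanded, sketched here as a shell function rather than a standalone script, is to shift the pattern off and pass every remaining argument to grep. With no file arguments grep reads standard input on its own, so the /dev/stdin branch is no longer needed.

```shell
#!/bin/bash
# Sketch of hl as a function: hl PATTERN [FILE]...
hl() {
    if [ "$#" -eq 0 ]; then
        echo "Usage: hl PATTERN [FILE]..." >&2
        return 2
    fi
    local pattern=$1
    shift
    # The "|$" alternative matches the empty end of every line, so each
    # line is printed and only the real pattern gets colored.
    grep -E --color=auto -- "$pattern|\$" "$@"
}
```

Note that --color=auto only colors when the output is a terminal; use --color=always when piping into a pager such as less -R. With more than one file, grep prefixes each line with the file name.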
Try
tail -f yourfile.log | egrep --color 'DEBUG|'
where DEBUG is the text you want to highlight.
command | grep -iz -e "keyword1" -e "keyword2" (the -e switch can be omitted when searching for a single word; -i ignores case; -z treats the input as a single record)
Alternatively, while reading files:
grep -iz -e "keyword1" -e "keyword2" 'filename'
OR
command | grep -A 99999 -B 99999 -i -e "keyword1" -e "keyword2" (the -e switch can be omitted when searching for a single word; -i ignores case; -A and -B set the number of lines displayed after and before each match)
Alternatively, while reading files:
grep -A 99999 -B 99999 -i -e "keyword1" -e "keyword2" 'filename'
Use the ack command with the --passthru switch:
ack --passthru pattern path/to/file
I take it you meant "without omitting text lines" (instead of emitting)...
I know of no such command, but you can use a script such as this (this one is a simple solution that takes the filename (without spaces) as the first argument and the search string (also without spaces) as the second):
#!/usr/bin/env bash
ifs_store=$IFS;
IFS=$'\n';
for line in $(cat $1);
do if [ $(echo $line | grep -c $2) -eq 0 ]; then
echo $line;
else
echo $line | grep --color=always $2;
fi
done
IFS=$ifs_store
Save it as, for instance, colorcat.sh, make it executable, and call it as
colorcat.sh filename searchstring
I had a requirement like this recently and hacked up a small program to do exactly this. Link
Usage: ./highlight test.txt '^foo' 'bar$'
Note that this is very rough, but could be made into a general tool with some polishing.
Using dwdiff, output differences with colors and line numbers.
echo "Hello world # $(date)" > file1.txt
echo "Hello world # $(date)" > file2.txt
dwdiff -c -C 0 -L file1.txt file2.txt

Shell file size in Linux

How can I get the size of a file into a variable?
ls -l | grep testing.txt | cut -f6 -d' '
gave the size, but how can I store it in a shell variable?
filesize=$(stat -c '%s' testing.txt)
You can do it this way with ls (check the man page for the meaning of -s; it reports the allocated size in blocks, not the byte count)
var=$(ls -s1 testing.txt | awk '{print $1}')
Or you can use stat with -c '%s'.
Or you can use find (GNU):
var=$(find testing.txt -printf "%s")
size() {
file="$1"
if [ -b "$file" ]; then
/sbin/blockdev --getsize64 "$file"
else
wc -c < "$file" # Handles pseudo files like /proc/cpuinfo
# stat --format %s "$file"
# find "$file" -printf '%s\n'
# du -b "$file" | cut -f1
fi
}
fs=$(size testing.txt)
size=`ls -l | grep testing.txt | cut -f6 -d' '`
You can get the file size in bytes with the command wc, which is fairly common on Linux systems since it's part of GNU coreutils:
wc -c < file
In a Bash script you can read it into a variable like this:
FILESIZE=$(wc -c < file)
From man wc:
-c, --bytes
print the byte counts
a=`stat -c '%s' testing.txt`;
echo $a
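For a version that does not depend on GNU stat, the wc -c approach from the answers above can be wrapped in a small function. This is a sketch; the name file_size is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: file_size FILE
# Prints the size of FILE in bytes.
file_size() {
    local n
    # Reading via redirection keeps the file name out of wc's output.
    n=$(wc -c < "$1") || return 1
    # Arithmetic expansion strips the padding some wc implementations add.
    echo "$((n))"
}
```

The result is stored the same way as in the other answers: filesize=$(file_size testing.txt). Unlike stat, wc -c reads the whole file, which is slower for large files but also works for pseudo files.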

Resources