Grep multiple bash parameters - linux

I'm writing a bash script which shall search in multiple files.
The problem I'm encountering is that I can't egrep an undetermined number of variables passed as parameters to the bash script.
I want it to do the following:
Given an arbitrary number of parameters, e.g.:
./searchline.sh A B C
Do a grep on the first one, and egrep the result with the rest:
grep "A" * | egrep B | egrep C
What I've tried to do is to build a string with the egreps:
for j in "${@:2}";
do
ADDITIONALSEARCH="$ADDITIONALSEARCH | egrep $j";
done
grep "$1" * "$ADDITIONALSEARCH"
But somehow that won't work, it seems like bash is not treating the "egrep" string as an egrep.
Do you guys have any advice?
By the way, as a side note, I'm not able to create any auxiliary files, so grep -f is out of the question, I guess. Also note that the number of parameters passed to the bash script is variable, so I can't do egrep "$2" | egrep "$3".
Thanks in advance.
Fernando

You can use recursion here to get required number of pipes:
#!/bin/bash
rec_egrep() {
if [ $# -eq 0 ]; then
exec cat
elif [ $# -eq 1 ]; then
exec egrep "$1"
else
local pat=$1
shift
egrep "$pat" | rec_egrep "$@"
fi
}
first_arg="$1"
shift
grep "$first_arg" * | rec_egrep "$@"

A safe eval can be a good solution:
#!/bin/bash
if [[ $# -gt 0 ]]; then
temp=("grep" "-e" "\"\$1\"" "*")
for (( i = 2; i <= $#; ++i )); do
temp=("${temp[@]}" "|" "egrep" "-e" "\"\$$i\"")
done
eval "${temp[@]}"
fi
To run it:
bash script.sh A B C
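For comparison, the same chain can also be built without recursion or eval by filtering the text once per pattern in a loop. This is only a sketch: the function name filter_all is made up for illustration, and it holds the intermediate result in a shell variable, which assumes the grep output is small enough to fit comfortably in memory.

```shell
#!/bin/bash
# filter_all (hypothetical name): print the lines of stdin that match
# every pattern given as an argument.
filter_all() {
    local result pat
    result=$(cat)                 # read all of stdin once
    for pat in "$@"; do
        # Narrow the result with each pattern; "|| true" keeps a
        # non-matching grep from being treated as a failure.
        result=$(printf '%s\n' "$result" | grep -E -- "$pat" || true)
    done
    if [ -n "$result" ]; then
        printf '%s\n' "$result"
    fi
}
```

In the script from the question it could be called as grep -- "$1" * | filter_all "${@:2}".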

Related

bash/ksh grep script take more than one argument

#!/bin/ksh
if [ -n "$1" ]
then
if grep -w -- "$1" codelist.lst
then
true
else
echo "Value not Found"
fi
else
echo "Please enter a valid input"
fi
This is my script, and at the moment it works exactly how I want. I want to extend it so that if I pass more arguments, it gives me one output per argument. How can I do that?
For example, if I run ./test.sh apple, it will grep apple in codelist.lst and give me the output: Apple
I want to run ./test.sh apple orange and get:
Apple
Orange
You can do that with shift and a loop, something like (works in both bash and ksh):
for ((i = $#; i > 0 ; i--)) ; do
echo "Processing '$1'"
shift
done
You'll notice I've also opted not to use the [[ -n "$1" ]] method as that would terminate the loop early with an empty string (such as with ./script.sh a b "" c stopping without doing c).
To iterate over the positional parameters:
for pattern in "$@"; do
grep -w -- "$pattern" codelist.lst || echo "'$pattern' not Found"
done
For a more advanced usage, which only invokes grep once, use the -f option with a shell process substitution:
grep -w -f <(printf '%s\n' "$@") codelist.lst
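One difference between the two forms is worth keeping in mind: the loop reports matches in argument order, while a single grep -f call reports them in the order they appear in the file. A small sketch, assuming bash (process substitution is not available in plain sh); the function name lookup_any is hypothetical, and the file is passed as a parameter instead of being hard-coded:

```shell
#!/bin/bash
# lookup_any (hypothetical name): print the lines of a file that contain
# any of the given words, using a single grep invocation.
lookup_any() {
    local file=$1
    shift
    # Without patterns there is nothing to search for.
    [ "$#" -gt 0 ] || return 1
    # Each argument becomes one line of the pattern list read by -f.
    grep -w -f <(printf '%s\n' "$@") -- "$file"
}
```

For the script in the question, the call would be lookup_any codelist.lst "$@".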

bash: set variable inside loop when piping find and grep [duplicate]

I want to run a computation on all *.bin files inside a given directory. Initially I was working with a for loop:
var=0
for i in $(ls *bin)
do
perform computations on $i ....
var+=1
done
echo $var
However, in some directories there are too many files resulting in an error: Argument list too long
Therefore, I was trying it with a piped while-loop:
var=0
ls *.bin | while read i;
do
perform computations on $i
var+=1
done
echo $var
The problem now is that using the pipe creates subshells. Thus, echo $var returns 0.
How can I deal with this problem?
The original Code:
#!/bin/bash
function entropyImpl {
if [[ -n "$1" ]]
then
if [[ -e "$1" ]]
then
echo "scale = 4; $(gzip -c ${1} | wc -c) / $(cat ${1} | wc -c)" | bc
else
echo "file ($1) not found"
fi
else
datafile="$(mktemp entropy.XXXXX)"
cat - > "$datafile"
entropy "$datafile"
rm "$datafile"
fi
return 1
}
declare acc_entropy=0
declare count=0
ls *.bin | while read i ;
do
echo "Computing $i" | tee -a entropy.txt
curr_entropy=`entropyImpl $i`
curr_entropy=`echo $curr_entropy | bc`
echo -e "\tEntropy: $curr_entropy" | tee -a entropy.txt
acc_entropy=`echo $acc_entropy + $curr_entropy | bc`
let count+=1
done
echo "Out of function: $count | $acc_entropy"
acc_entropy=`echo "scale=4; $acc_entropy / $count" | bc`
echo -e "===================================================\n" | tee -a entropy.txt
echo -e "Accumulated Entropy:\t$acc_entropy ($count files processed)\n" | tee -a entropy.txt
The problem is that the while loop is part of a pipeline. In a bash pipeline, every element of the pipeline is executed in its own subshell [ref]. So after the while loop terminates, the while loop subshell's copy of var is discarded, and the original var of the parent (whose value is unchanged) is echoed.
One way to fix this is by using Process Substitution as shown below:
var=0
while read i;
do
# perform computations on $i
((var++))
done < <(find . -type f -name "*.bin" -maxdepth 1)
Take a look at BashFAQ/024 for other workarounds.
Notice that I have also replaced ls with find because it is not good practice to parse ls.
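The difference between the two loop forms can be checked with a self-contained sketch. It assumes bash with default settings (the lastpipe option is not enabled), and the variable names are made up for the demonstration:

```shell
#!/bin/bash
# Loop fed by a pipe: the loop body runs in a subshell,
# so the increment is lost when the loop ends.
count_piped=0
printf '%s\n' a b c | while read -r line; do
    count_piped=$((count_piped + 1))
done

# Loop fed by process substitution: the loop body runs in the
# current shell, so the increment survives.
count_procsub=0
while read -r line; do
    count_procsub=$((count_procsub + 1))
done < <(printf '%s\n' a b c)

echo "piped: $count_piped, process substitution: $count_procsub"
```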
A POSIX-compliant solution would be to use a named pipe (FIFO). This solution is very nice and portable, but it creates an entry in the filesystem.
mkfifo mypipe
find . -type f -name "*.bin" -maxdepth 1 > mypipe &
while read line
do
# action
done < mypipe
rm mypipe
Your pipe is a file on your hard disk. If you want to avoid having useless files, do not forget to remove it.
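A variant of the same idea that keeps the FIFO out of the working directory is sketched below. It assumes mktemp -d is available (it is common but not specified by POSIX), and the sample input stands in for the find command:

```shell
#!/bin/sh
# Create the FIFO inside a private temporary directory.
tmpdir=$(mktemp -d)
fifo="$tmpdir/mypipe"
mkfifo "$fifo"

# The writer runs in the background; it blocks until the loop opens the FIFO.
printf '%s\n' one two three > "$fifo" &

count=0
while read -r line; do
    count=$((count + 1))      # runs in the current shell, so the value is kept
done < "$fifo"

wait                          # let the background writer finish
rm -r "$tmpdir"               # remove the FIFO together with its directory
echo "$count"
```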
While researching the general issue of passing variables from a sub-shelled while loop back to the parent, I found one solution that is missing here: a here-string. Since that is a bashism and I preferred a POSIX solution, it helps to know that a here-string is really just a shortcut for a here-document. With that knowledge, I came up with the following, which avoids the subshell and so allows variables to be set in the loop.
#!/bin/sh
set -eu
passwd="username,password,uid,gid
root,admin,0,0
john,appleseed,1,1
jane,doe,2,2"
main()
{
while IFS="," read -r _user _pass _uid _gid; do
if [ "${_user}" = "${1:-}" ]; then
password="${_pass}"
fi
done <<-EOT
${passwd}
EOT
if [ -z "${password:-}" ]; then
echo "No password found."
exit 1
fi
echo "The password is '${password}'."
}
main "${@}"
exit 0
One important note to all copy-pasters: the here-document is set up with the hyphen (<<-), which means leading tabs are stripped. This is needed to keep the layout somewhat nice. It is important to note because Stack Overflow doesn't render tabs in 'code' and replaces them with spaces. Grmbl. SO, don't mangle my code just because you guys favor spaces over tabs; it's irrelevant in this case!
This probably breaks on different editor(settings) and what not. So the alternative would be to have it as:
done <<-EOT
${passwd}
EOT
This could be done with a for loop, too:
var=0;
for file in `find . -type f -name "*.bin" -maxdepth 1`; do
# perform computations on "$file"
((var++))
done
echo $var
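Note that splitting the find output into words, as in the loop above, breaks on file names that contain whitespace. Since "Argument list too long" is a limit on the arguments handed to an external command such as ls, a plain glob in a for loop is not subject to it and needs neither a pipe nor a subshell. A minimal sketch; count_bin_files is a hypothetical name, and the directory is passed as a parameter:

```shell
#!/bin/bash
# count_bin_files (hypothetical name): loop over the *.bin files of a
# directory in the current shell and print how many were seen.
count_bin_files() {
    local dir=$1 file count=0
    for file in "$dir"/*.bin; do
        # If nothing matches, the pattern is left as-is; skip it.
        [ -e "$file" ] || continue
        # perform computations on "$file" here
        count=$((count + 1))
    done
    printf '%s\n' "$count"
}
```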

Put grep output inside variable from a loop

I have CentOS and this bash script:
#!/bin/sh
files=$( ls /vps_backups/site )
counter=0
for i in $files ; do
echo $i | grep -o -P '(?<=-).*(?=.tar)'
let counter=$counter+1
done
In the site folder I have compressed backups with the following names :
site-081916.tar.gz
site-082016.tar.gz
site-082116.tar.gz
...
The code above prints :
081916
082016
082116
I want to put each extracted date to a variable so I replaced this line
echo $i | grep -o -P '(?<=-).*(?=.tar)'
with this :
dt=$($i | grep -o -P '(?<=-).*(?=.tar)')
echo $dt
however I get this error :
./test.sh: line 6: site-090316.tar.gz: command not found
Any help please?
Thanks
You still need the echo inside the $(...):
dt=$(echo $i | grep -o -P '(?<=-).*(?=.tar)')
Don't use ls in a script. Use a shell pattern instead. Also, you don't need to use grep; bash has a built-in regular expression operator.
#!/bin/bash
files=( /vps_backups/site/* )
counter=0
for i in "${files[@]#/vps_backups/site/}" ; do
[[ $i =~ -(.*).tar.gz ]] && dt=${BASH_REMATCH[1]}
counter=$((counter + 1))
done
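Plain parameter expansion is another option, and it also works in POSIX sh. A sketch, assuming every file name has the form site-DATE.tar.gz as in the question; extract_date is a made-up helper name:

```shell
#!/bin/sh
# extract_date (hypothetical name): print the part of a backup file
# name between the first "-" and ".tar".
extract_date() {
    name=${1##*/}                        # drop any leading directory
    date_part=${name#*-}                 # remove everything up to the first "-"
    printf '%s\n' "${date_part%%.tar*}"  # remove ".tar" and what follows
}
```

Inside the loop this would be used as dt=$(extract_date "$i").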

Bash function with grep, using if and then statements

I have been trying to build a function that will allow me to perform a dynamic grep search - dynamic in that if more than one variable is specified upon calling the function, then it will pass the additional variable through grep. I should never really need more than 2 variables, but would be interested in learning how to do that as well.
example of desired functionality:
> function word1 word2
result:
(
cd /path/folder;
less staticfile | grep -i word1 | grep -i --color word2
)
The above works great if there will always be two words. I'm trying to learn how to use if and then statements so that I can be flexible in the number of variables.
function ()
{
if [ ! "$#" -gt "1" ];
then
(
cd ~/path/folder;
less staticfile | grep -i $1 | grep --color -i $2
)
else
(
cd ~/path/folder;
less staticfile | grep --color $1
)
fi
}
When I run this function, it seems to do the opposite -
> function word1
returns an error because it is using the "then" statement for some reason but has nothing to insert into the second grep call.
> function word1 word2
only greps "word1" - therefore is using the "else" statement.
What am I doing wrong? Is there an easier way to do this?
I think this will do what you want. You can functionize it. (Note the indirection use after grep.) Here is the code followed by a test file & output. (Takes any # of args.)
#!/bin/bash
if [[ $# -lt 2 ]]; then
echo -e "\nusage: ./prog.sh file grep-string1 grep-string2 grep-string3...\n"
echo -e "At least two arguments are required. exiting.\n"
exit 5
fi
fname=$1 # always make your file name the first argument
i=2 # initialize the argument counter
cmd="cat $fname"
while [[ $i -le $# ]]; do
cmd=${cmd}" | grep ${!i}"
(( i++ ))
done
eval $cmd
So the input file for this might be called tfile with contents:
a circuit
pavement
bye
Running the code works this way:
./prog.sh tfile e ye
bye
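As for the if test in the question: [ ! "$#" -gt "1" ] is true when there is at most one argument, so the two branches end up swapped. Dropping the negation is enough for the two-word case. A sketch without eval follows; search_words is a hypothetical name (the question does not give the real one), and the file is passed as the first argument instead of being hard-coded:

```shell
#!/bin/bash
# search_words (hypothetical name): case-insensitive search for one
# word, or for lines containing both of two words.
search_words() {
    local file=$1
    shift
    if [ "$#" -gt 1 ]; then
        # Two words: the second grep filters the output of the first.
        grep -i -- "$1" "$file" | grep -i -- "$2"
    else
        # One word: a single grep is enough.
        grep -i -- "$1" "$file"
    fi
}
```

With the tfile shown above, search_words tfile e ye would take the first branch and search_words tfile e the second.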

Creating bash script of a complex linux command

I have a few long commands that I will be using on a day-to-day basis, so I thought it would be better to have a bash script that I could pass arguments to, thus saving typing. I guess this is the norm in Linux, but I am kind of new to it. Could someone show me how to do it? An example is the following command:
cut -f <column_number> <filename> | sort | uniq -c |
sort -r -k1 -n | awk '{printf "%-15s %-10d\n", $2,$1}'
So I want this in a script where I can pass the filename and column number (preferably in any order) and get the desired output instead of having to type the whole thing every time.
Create a file say myscript.sh -
#!/bin/bash
if [ $# -ne 2 ]; then
echo Usage: myscript.sh column_number file_path
exit
fi
if ! [ -f "$2" ]; then
echo File doesnt exist
exit
fi
if [ `echo "$1" | grep -E '^[0-9]+$' | wc -l` -ne 1 ]; then
echo First argument must be a number
exit
fi
cut -f "$1" "$2" | sort | uniq -c |
sort -r -k1 -n | awk '{printf "%-15s %-10d\n", $2,$1}'
Make sure this file is executable by running chmod +x myscript.sh
You can invoke it like sh myscript.sh 30 myfile.sh or ./myscript.sh 30 myfile.sh
The first line of the above script specifies the shell you would like your script to be executed in. $1 and $2 refer to the first and second command line arguments.
About argument validity checks:
The first check ensures that exactly two arguments are passed to the script.
The second check ensures that the file named by the second argument exists.
The third check ensures that the first argument really is a number. It uses a regular expression for that purpose. Maybe someone can provide a better replacement for this check, but this is what came to my mind instantly.
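One possible replacement for the third check is a case pattern, which needs no external commands. This is a sketch, and the function name is_uint is made up for illustration:

```shell
#!/bin/sh
# is_uint (hypothetical name): succeed only if the argument consists
# of one or more decimal digits.
is_uint() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit
        *)           return 0 ;;
    esac
}
```

In the script it could be used as: if ! is_uint "$1"; then echo First argument must be a number; exit; fi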
To accept the filename and column number in any order, you'll need to use option switches. Bash's getopts allows you to specify and process options so you can call your script using scriptname -f filename -c 12 or scriptname -c 12 -f filename for example.
#!/bin/bash
options=":f:c:"
while getopts $options option
do
case $option in
f)
filename=$OPTARG
;;
c)
col_num=$OPTARG
;;
\?)
usage_function # not shown
exit 1
;;
*)
echo "Invalid option"
usage_function
exit 1
;;
esac
done
shift $((OPTIND - 1))
if [[ -z $filename || -z $col_num ]]
then
echo "Missing option"
usage_function
exit 1
fi
if [[ $col_num == *[^0-9]* ]]
then
echo "Invalid integer"
usage_function
exit 1
fi
# other checks
cut -f "$col_num" "$filename" | ...
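The usage_function that the script calls is not shown in the answer. One way it might look is sketched here; the wording of the message is an assumption, not the author's:

```shell
#!/bin/bash
# usage_function: hypothetical implementation of the helper that the
# getopts script above calls but does not show.
usage_function() {
    # Print to stderr so the message does not mix with the data output.
    printf 'Usage: %s -f filename -c column_number\n' "${0##*/}" >&2
}
```

It has to be defined before the getopts loop so that it exists by the time it is called.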

Resources