Bash function with grep, using if and then statements - linux

I have been trying to build a function that will allow me to perform a dynamic grep search - dynamic in that if more than one variable is specified upon calling the function, then it will pass the additional variable through grep. I should never really need more than 2 variables, but would be interested in learning how to do that as well.
example of desired functionality:
> function word1 word2
result:
(
cd /path/folder;
less staticfile | grep -i word1 | grep -i --color word2
)
The above works great if there will always be two words. I'm trying to learn how to use if and then statements so the function can handle a variable number of arguments.
function ()
{
if [ ! "$#" -gt "1" ];
then
(
cd ~/path/folder;
less staticfile | grep -i $1 | grep --color -i $2
)
else
(
cd ~/path/folder;
less staticfile | grep --color $1
)
fi
}
When I run this function, it seems to do the opposite -
> function word1
returns an error because it is using the "then" statement for some reason but has nothing to insert into the second grep call.
> function word1 word2
only greps "word1" - therefore is using the "else" statement.
What am I doing wrong? Is there an easier way to do this?

I think this will do what you want; you can wrap it in a function. (Note the use of indirection, ${!i}, after grep.) Here is the code, followed by a test file and output. It takes any number of arguments.
#!/bin/bash
if [[ $# -lt 2 ]]; then
    echo -e "\nusage: ./prog.sh file grep-string1 grep-string2 grep-string3...\n"
    echo -e "At least two arguments are required. Exiting.\n"
    exit 5
fi
fname=$1 # always make your file name the first argument
i=2 # initialize the argument counter
cmd="cat $(printf '%q' "$fname")" # %q quotes the value so eval sees one word
while [[ $i -le $# ]]; do
    cmd+=" | grep -- $(printf '%q' "${!i}")" # ${!i} is the i-th argument
    (( i++ ))
done
eval "$cmd"
So the input file for this might be called tfile with contents:
a circuit
pavement
bye
Running the code works this way:
./prog.sh tfile e ye
bye
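Regarding the function in the question: the test [ ! "$#" -gt "1" ] is true when there is at most one argument, so the two branches run the opposite way round from what was intended. A minimal sketch of the corrected logic follows. The name search_static, and passing the file as the first argument instead of using a fixed path, are assumptions made to keep the example self-contained.

```shell
#!/bin/bash
# search_static is a hypothetical name; the question hard-codes the file,
# here it is the first argument so the sketch can be run anywhere.
search_static() {
    if [ "$#" -lt 2 ]; then
        echo "usage: search_static file word1 [word2]" >&2
        return 2
    fi
    local file=$1
    shift
    if [ "$#" -gt 1 ]; then
        # Two words: keep lines matching the first, then filter by the second.
        grep -i -- "$1" "$file" | grep -i --color -- "$2"
    else
        # One word: a single grep is enough.
        grep -i --color -- "$1" "$file"
    fi
}
```

Called as search_static staticfile word1 it uses the single grep; with search_static staticfile word1 word2 it uses the two-stage pipeline.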

Related

bash: set variable inside loop when piping find and grep [duplicate]

I want to run a computation on every *.bin file inside a given directory. Initially I was working with a for loop:
var=0
for i in *ls *bin
do
perform computations on $i ....
var+=1
done
echo $var
However, in some directories there are too many files resulting in an error: Argument list too long
Therefore, I was trying it with a piped while-loop:
var=0
ls *.bin | while read i;
do
perform computations on $i
var+=1
done
echo $var
The problem now is that the pipe creates subshells, so echo $var prints 0.
How can I deal with this problem?
The original code:
#!/bin/bash
function entropyImpl {
if [[ -n "$1" ]]
then
if [[ -e "$1" ]]
then
echo "scale = 4; $(gzip -c ${1} | wc -c) / $(cat ${1} | wc -c)" | bc
else
echo "file ($1) not found"
fi
else
datafile="$(mktemp entropy.XXXXX)"
cat - > "$datafile"
entropy "$datafile"
rm "$datafile"
fi
return 1
}
declare acc_entropy=0
declare count=0
ls *.bin | while read i ;
do
echo "Computing $i" | tee -a entropy.txt
curr_entropy=`entropyImpl $i`
curr_entropy=`echo $curr_entropy | bc`
echo -e "\tEntropy: $curr_entropy" | tee -a entropy.txt
acc_entropy=`echo $acc_entropy + $curr_entropy | bc`
let count+=1
done
echo "Out of function: $count | $acc_entropy"
acc_entropy=`echo "scale=4; $acc_entropy / $count" | bc`
echo -e "===================================================\n" | tee -a entropy.txt
echo -e "Accumulated Entropy:\t$acc_entropy ($count files processed)\n" | tee -a entropy.txt
The problem is that the while loop is part of a pipeline. In a bash pipeline, every element of the pipeline is executed in its own subshell [ref]. So after the while loop terminates, the while loop subshell's copy of var is discarded, and the original var of the parent (whose value is unchanged) is echoed.
One way to fix this is by using Process Substitution as shown below:
var=0
while IFS= read -r i; do
    # perform computations on "$i"
    ((var++))
done < <(find . -maxdepth 1 -type f -name "*.bin")
Take a look at BashFAQ/024 for other workarounds.
Notice that I have also replaced ls with find because it is not good practice to parse ls.
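To make the process substitution approach concrete, here is a small sketch that wraps it in a function and prints the final count. The function name count_bin_files is hypothetical, and the use of -print0 with read -d '' (so that unusual file names survive) is an addition that is not part of the answer above.

```shell
#!/bin/bash
# count_bin_files is a hypothetical helper name for this sketch.
count_bin_files() {
    local dir=$1 count=0 file
    # The loop reads from a process substitution, not a pipe, so it runs
    # in the current shell and the updated count is still set afterwards.
    while IFS= read -r -d '' file; do
        count=$((count + 1))
    done < <(find "$dir" -maxdepth 1 -type f -name '*.bin' -print0)
    echo "$count"
}
```

Running count_bin_files . prints the number of *.bin files directly inside the current directory.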
A POSIX-compliant solution is to use a named pipe (FIFO). This solution is portable and needs no bash features, but it creates an entry in the filesystem.
mkfifo mypipe
find . -maxdepth 1 -type f -name "*.bin" > mypipe &
while IFS= read -r line
do
    : # action on "$line"
done < mypipe
rm mypipe
The pipe exists as a special file in the filesystem (the data passing through it is not written to disk). To avoid leaving useless files behind, do not forget to remove it.
While researching the general issue of passing variables from a while loop in a subshell back to the parent, I found one solution that is missing here: a here-string. Since that is bash-specific and I preferred a POSIX solution, I noted that a here-string is really just a shortcut for a here-document. With that knowledge, I came up with the following, which avoids the subshell and so allows variables to be set in the loop.
#!/bin/sh
set -eu
passwd="username,password,uid,gid
root,admin,0,0
john,appleseed,1,1
jane,doe,2,2"
main()
{
while IFS="," read -r _user _pass _uid _gid; do
if [ "${_user}" = "${1:-}" ]; then
password="${_pass}"
fi
done <<-EOT
${passwd}
EOT
if [ -z "${password:-}" ]; then
echo "No password found."
exit 1
fi
echo "The password is '${password}'."
}
main "${@}"
exit 0
One important note to all copy-pasters: the here-document is set up with the hyphen (<<-), which makes the shell strip leading tabs. This is needed to keep the layout somewhat nice. It is important to note, because Stack Overflow doesn't render tabs in 'code' and replaces them with spaces. Grmbl. SO, don't mangle my code just because you guys favor spaces over tabs; it's irrelevant in this case!
This probably breaks on different editor(settings) and what not. So the alternative would be to have it as:
done <<-EOT
${passwd}
EOT
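The same here-document idea can be applied to the file-counting problem from the question. This is only a sketch: the function name count_bin_files_heredoc is hypothetical, -maxdepth is a widely supported find extension that POSIX does not require, and file names containing newlines are assumed not to occur.

```shell
#!/bin/sh
# count_bin_files_heredoc is a hypothetical helper name for this sketch.
# Plain sh has no "local", so dir, count and file are global variables.
count_bin_files_heredoc() {
    dir=$1
    count=0
    # The loop reads from a here-document, so it runs in the current shell
    # and count keeps its value after the loop ends.
    while IFS= read -r file; do
        # An empty find result still yields one empty line; skip it.
        if [ -n "$file" ]; then
            count=$((count + 1))
        fi
    done <<EOT
$(find "$dir" -maxdepth 1 -type f -name '*.bin')
EOT
    echo "$count"
}
```

The here-document body and its terminator start in the first column, so no tabs are involved and plain << is enough.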
This could be done with a for loop, too:
var=0
# note: the unquoted command substitution splits file names on whitespace
for file in `find . -maxdepth 1 -type f -name "*.bin"`; do
    # perform computations on "$file"
    ((var++))
done
echo $var

shell script to find a word in a list of files, all of them given as parameters

I need a simple shell program which has to do something like this:
script.sh word_to_find file1 file2 file3 .... fileN
which will display
word_to_find 3 - if word_to_find appears in 3 files
or
word_to_find 5 - if word_to_find appears in 5 files
This is what I've tried
#!/bin/bash
count=0
for i in $@; do
if [ grep '$1' $i ];then
((count++))
fi
done
echo "$1 $count"
But this message appears:
syntax error: "then" unexpected (expecting "done").
Before this the error was
[: grep: unexpected operator.
Try this:
#!/bin/sh
printf '%s %d\n' "$1" $(grep -hm1 "$@" | wc -l)
Notice how all the script's arguments are passed verbatim to grep -- the first is the search expression, the rest are filenames.
The output from grep -hm1 is a list of matches, one per file with a match, and wc -l counts them.
I originally posted this answer with grep -l but that would require filenames to never contain a newline, which is a rather pesky limitation.
Maybe add an -F option if regular expression search is not desired (i.e. only search literal text).
The code you showed is:
#!/bin/bash
count=0
for i in $@; do
if [ grep '$1' $i ];then
((count++))
fi
done
echo "$1 $count"
When I run it, I get the error:
script.sh: line 5: [: $1: binary operator expected
This is reasonable, but it is not the same as either of the errors reported in the question. There are multiple problems in the code.
The for i in $@; do should be for i in "$@"; do. Always use "$@" so that any spaces in the arguments are preserved. If none of your file names contain spaces or tabs, it is not critical, but it is a good habit to get into. (See How to iterate over arguments in bash script for more information.)
The if statement runs the [ (aka test) command, which is actually a shell built-in as well as a binary in /bin or /usr/bin. The use of single quotes around '$1' means that the value is not expanded, and the command sees its arguments as:
[
grep
$1
current-file-name
]
where the first is the command name, or argv[0] in C, or $0 in shell. The error I got is because the test command expects an operator such as = or -lt at the point where $1 appears (that is, it expects a binary operator, not $1, hence the message).
You actually want to test whether grep found the word in $1 in each file (the names listed after $1). You probably want to code it like this, then:
#!/bin/bash
word="$1"
shift
count=0
for file in "$@"
do
    if grep -l "$word" "$file" >/dev/null 2>&1
    then ((count++))
    fi
done
echo "$word $count"
We can negotiate on the options and I/O redirections used with grep. The POSIX grep options -q and -s provide varying degrees of silence, and -q could be used in place of -l. The -l option simply lists the file name if the word is found, and stops scanning on the first occurrence. The I/O redirection ensures that both the output and any error messages are thrown away, while the if test on grep's exit status ensures that successful matches are counted.
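For comparison, here is a sketch of the same loop using -q and -s instead of -l with redirections. The function name count_files_with_word is hypothetical, and wrapping the logic in a function is only for illustration.

```shell
#!/bin/bash
# count_files_with_word is a hypothetical name for this sketch.
count_files_with_word() {
    local word=$1 count=0 file
    shift
    for file in "$@"; do
        # -q: print nothing and exit 0 at the first match.
        # -s: stay silent about missing or unreadable files.
        if grep -qs -- "$word" "$file"; then
            count=$((count + 1))
        fi
    done
    printf '%s %d\n' "$word" "$count"
}
```

Usage: count_files_with_word word_to_find file1 file2 file3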
Incorrect output claimed
It has been claimed that the code above does not produce the correct answer. Here's the test I performed:
$ echo "This country is young" > young.iii
$ echo "This country is little" > little.iii
$ echo "This fruit is fresh" > fresh.txt
$ bash findit.sh country young.iii fresh.txt little.iii
country 2
$ bash -x findit.sh country young.iii fresh.txt little.iii
+ '[' -f /etc/bashrc ']'
+ . /etc/bashrc
++ '[' -z '' ']'
++ return
+ alias 'r=fc -e -'
+ word=country
+ shift
+ count=0
+ for file in '"$@"'
+ grep -l country young.iii
+ (( count++ ))
+ for file in '"$@"'
+ grep -l country fresh.txt
+ for file in '"$@"'
+ grep -l country little.iii
+ (( count++ ))
+ echo 'country 2'
country 2
$
This shows that for the given files, the output is correct on my machine (Mac OS X 10.10.2; GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin14)). If the equivalent test works differently on your machine, then (a) please identify the machine and the version of Bash (bash --version), and (b) please update the question with the output you see from bash -x findit.sh country young.iii fresh.txt little.iii. You may want to create a sub-directory (such as junk), and copy findit.sh into that directory before creating the files as shown, etc.
You could also bolster your case by showing the output of:
$ grep country young.iii fresh.txt little.iii
young.iii:This country is young
little.iii:This country is little
$
#!/usr/bin/perl
use strict;
use warnings;
my $wordtofind = shift(@ARGV);
my $regex = qr/\Q$wordtofind/s;
my @file = ();
my $count = 0;
my $filescount = scalar(@ARGV);
for my $file (@ARGV)
{
    if (-e $file)
    {
        eval { open(FH, '<' . $file) or die "can't open file $file "; };
        unless ($@)
        {
            for (<FH>)
            {
                if (/$regex/)
                {
                    $count++;
                    last;
                }
            }
            close(FH);
        }
    }
}
print "$wordtofind $count\n";
You could use an Awk script:
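#!/usr/bin/awk -f
# Count the files that contain a match for w (at most one count per file).
BEGIN {
    n = 0
}
$0 ~ w {
    n++
    nextfile    # skip the rest of this file (gawk, mawk and BSD awk support it)
}
END {
    print w, n
}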
and run it like this:
./script.awk w=word_to_find file1 file2 file3 ... fileN
or if you don't want to worry about assigning a variable (w) on the command line:
BEGIN {
    n = 0
    w = ARGV[1]
    delete ARGV[1]
}
$0 ~ w {
    n++
    nextfile    # count each file only once
}
END {
    print w, n
}

Grep multiple bash parameters

I'm writing a bash script which shall search in multiple files.
The problem I'm encountering is that I can't egrep an undetermined number of variables passed as parameters to the bash script.
I want it to do the following:
Given a random number of parameters. i.e:
./searchline.sh A B C
Do a grep on the first one, and egrep the result with the rest:
grep "A" * | egrep B | egrep C
What I've tried to do is to build a string with the egreps:
for j in "${@:2}";
do
ADDITIONALSEARCH="$ADDITIONALSEARCH | egrep $j";
done
grep "$1" * "$ADDITIONALSEARCH"
But somehow that won't work, it seems like bash is not treating the "egrep" string as an egrep.
Do you guys have any advice?
By the way, as a side note, I'm not able to create any auxiliary file, so grep -f is out of the question, I guess. Also note that the number of parameters passed to the bash script is variable, so I can't do egrep "$2" | egrep "$3".
Thanks in advance.
Fernando
You can use recursion here to get required number of pipes:
#!/bin/bash
rec_egrep() {
    if [ $# -eq 0 ]; then
        exec cat
    elif [ $# -eq 1 ]; then
        exec egrep "$1"
    else
        local pat=$1
        shift
        egrep "$pat" | rec_egrep "$@"
    fi
}
first_arg="$1"
shift
grep "$first_arg" * | rec_egrep "$@"
A safe eval can be a good solution:
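#!/bin/bash
if [[ $# -gt 0 ]]; then
    # Store references such as "$1" and "${2}" (not their values), so that
    # eval expands each pattern exactly once, safely quoted.
    temp=("grep" "-e" "\"\$1\"" "*")
    for (( i = 2; i <= $#; ++i )); do
        # ${N} with braces is needed for parameters 10 and above
        temp+=("|" "egrep" "-e" "\"\${$i}\"")
    done
    eval "${temp[@]}"
fi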
To run it:
bash script.sh A B C
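As for why the attempt in the question fails: the shell recognizes operators such as | while parsing the command line, before variables are expanded. A | that arrives through a variable is therefore just an ordinary argument. The tiny demonstration below shows this; the function name demo_pipe_in_variable is hypothetical.

```shell
#!/bin/bash
# demo_pipe_in_variable is a hypothetical name for this demonstration.
demo_pipe_in_variable() {
    local extra='| wc -l'
    # $extra is expanded after parsing, so echo receives the words
    # "|", "wc" and "-l" as plain arguments; no pipeline is created.
    echo hello $extra
}
demo_pipe_in_variable   # prints: hello | wc -l
```

This is why the answers above either build real pipelines (recursion) or hand the assembled string to eval for a second round of parsing.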

Multiple variables in loop input?

When using the following:
for what in $@; do
    read -p "Where?" where
    grep -H "$what" $where -R | cut -d: -f1
done
How can I, instead of using read to define a user-variable, have a second variable input along with the first variable when calling the script.
For example, the ideal usage I believe I can get is something like:
sh scriptname var1 var2
But my understanding is that the for ... line loops over the subsequent entries, assigning each one to the single variable; what would I need to change to pass in multiple variables?
As an aside: using | cut -d: -f1 is not safe, because grep does not escape colons in filenames. To see what I mean, you can try this:
ghoti@pc:~$ echo bar:baz > foo
ghoti@pc:~$ echo baz > foo:bar
ghoti@pc:~$ grep -Hr ba .
./foo:bar:baz
./foo:bar:baz
Clarity .. there is not.
So ... let's clarify what you're looking for.
Do you want to search for one string in multiple files? Or,
Do you want to search for multiple strings in one file?
If the former, then the following might work:
#!/bin/bash
if [[ "$#" -lt 2 ]]; then
    echo "Usage: `basename $0` string file [file ...]"
    exit 1
fi
what="$1"
shift # discard $1, move $2 to $1, $3 to $2, etc.
for where in "$@"; do
    grep -HlR "$what" "$where"
done
And if the latter, then this would be the way:
#!/bin/bash
if [[ "$#" -lt 2 ]]; then
    echo "Usage: `basename $0` file string [string ...]"
    exit 1
fi
where="$1"
shift
for what in "$@"; do
    grep -lR "$what" "$where"
done
Of course, this one might be streamlined if you concatenated your strings with an alternation bar (|) and then used egrep. It depends on what you're actually looking for.
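A rough sketch of that streamlined variant: join the search strings with | and run a single extended-regex grep. The function name files_matching_any is hypothetical, and the strings are assumed to contain no regex metacharacters that you want treated literally.

```shell
#!/bin/bash
# files_matching_any is a hypothetical name for this sketch.
files_matching_any() {
    local where=$1
    shift
    # "$*" joins the remaining arguments with the first character of IFS,
    # turning "foo" "bar" into the single pattern "foo|bar".
    local IFS='|'
    grep -lRE -- "$*" "$where"
}
```

For example, files_matching_any /some/dir foo bar lists every file under /some/dir that contains foo or bar, each file once.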
You can get parameters passed on the command line with $1 $2 etc.
Read up on positional parameters: http://www.linuxcommand.org/wss0130.php. You don't need a for loop to parse them.
sh scriptname var1 var2
v1=$1 # contains var1
v2=$2 # contains var2
$@ is basically just a list of all the positional parameters: $1 $2 $3 etc.

looking for a command to tentatively execute a command based on criteria

I am looking for a command (or way of doing) the following:
echo -n 6 | doif -criteria "isgreaterthan 4" -command 'do some stuff'
The echo part would obviously come from a more complicated string of bash commands. Essentially I am taking a piece of text from each line of a file, and if it appears in another set of files more than x times (say 100), it will be appended to another file.
Is there a way to perform such trickery with awk somehow? Or is there another command.. I'm hoping that there is some sort of xargs style command to do this in the sense that the -I% portion would be the value with which to check the criteria and whatever follows would be the command to execute.
Thanks for thy insight.
It's possible, though I don't see the reason why you would do that...
function doif
{
    read -r val1
    op=$1
    val2="$2"
    shift 2
    if [ "$val1" "$op" "$val2" ]; then
        "$@"
    fi
}
echo -n 6 | doif -gt 3 ls /
if test 6 -gt 4; then
    : # do some stuff
fi
or
if test $( echo 6 ) -gt 4; then : ;fi
or
output=$( some cmds that generate text)
# this will be an error if $output is ill-formed
if test "$output" -gt 4; then : ; fi
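Applied to the use case described in the question, the same test-based approach might look like the sketch below. The function name append_if_frequent is hypothetical, and "appears" is assumed to mean the number of matching lines across all the given files, with the word treated as a fixed string.

```shell
#!/bin/bash
# append_if_frequent is a hypothetical helper for this sketch.
# Usage: append_if_frequent word threshold outfile file [file ...]
append_if_frequent() {
    local word=$1 threshold=$2 outfile=$3
    shift 3
    local count
    # Assumption: count matching lines over all files combined;
    # -F treats the word as a literal string, -c prints the line count.
    count=$(cat -- "$@" | grep -cF -- "$word")
    if [ "$count" -gt "$threshold" ]; then
        printf '%s\n' "$word" >> "$outfile"
    fi
}
```

A loop such as while IFS= read -r w; do append_if_frequent "$w" 100 out.txt data/*; done < words.txt would then apply it to every line of a word list (the file names here are placeholders).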

Resources