Creating a shell script that does multiple grep operations (AND operation) [duplicate] - linux

I have a datafile as below.
datafile
Sterling|Man City|England
Kane|Tottenham|England
Ronaldo|Man Utd|Portugal
Ederson|Man City|Brazil
If I want to find a player that has a "Man City" and "England" trait I would write a command as such in the interpreter.
grep "Man City" datafile | grep "England"
output: Sterling|Man City|England
In the same way, I want to make a shell script that receives multiple (more than one) arguments and finds the lines that contain all of them. It would receive input as arg1 arg2 arg3... and return all the data lines that contain every argument. The program would run like this.
$ ./find "Man City" England
output: Sterling|Man City|England
But I have absolutely no idea how to implement a command that runs one grep per argument. Can you use a for loop to build something like this?
grep arg1 datafile | grep arg2 | grep arg3 ...
Also, if there is a better way to find the data lines that contain all the arguments (the AND of the arguments), can you please show me how?

You need this:
#!/bin/bash
myArray=( "$@" ) # store all parameters as an array
grep="grep '${myArray[0]}' somefile" # use the first param as the first grep; quoted so "Man City" stays one pattern
unset 'myArray[0]' # then unset the first param
for arg in "${myArray[@]}"; do
grep="$grep | grep '$arg'" # cycle through the rest of the params to build the AND grep logic
done
eval "$grep" # and finally execute the built line
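The same AND logic can also be sketched without eval, which avoids trouble when an argument contains quotes or other shell metacharacters. This is only a minimal sketch: the function name find_all is hypothetical, and it assumes the arguments are fixed strings rather than regular expressions (hence grep -F).

```shell
#!/bin/bash
# find_all (hypothetical name): print the lines of FILE that contain every STRING.
# Usage: find_all FILE STRING [STRING...]
find_all() {
    local file=$1 pattern result
    shift
    result=$(cat -- "$file") || return
    for pattern in "$@"; do
        # -F: treat the argument as a fixed string; -e: allow it to start with "-".
        # Stop early once no line is left to filter.
        result=$(printf '%s\n' "$result" | grep -F -e "$pattern") || break
    done
    if [ -n "$result" ]; then
        printf '%s\n' "$result"
    fi
}
```

With the datafile from the question, find_all datafile "Man City" England would narrow the lines down one argument at a time, just like the chained grep pipeline.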

You can use the awk command to do so. Awk is a very useful command and also supports loops.
Below is the awk command to find the AND of multiple patterns:
> awk '/Man City/ && /England/' test.txt
Sterling|Man City|England
In case you want to do OR:
> awk '/Man City/ || /England/' test.txt
Sterling|Man City|England
Kane|Tottenham|England
Ederson|Man City|Brazil
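Since awk supports loops, the AND test can be extended to any number of arguments. The following is a rough sketch, not a definitive implementation: the wrapper name awk_all is hypothetical, and it assumes plain substring matching via index() rather than regular expressions.

```shell
#!/bin/bash
# awk_all (hypothetical name): print the lines of FILE that contain every STRING.
# Usage: awk_all FILE STRING [STRING...]
awk_all() {
    local file=$1
    shift
    awk '
        BEGIN {
            # ARGV[1] is the file; the rest are search strings.
            # Blank them out so awk does not try to open them as files.
            for (i = 2; i < ARGC; i++) { pat[i] = ARGV[i]; ARGV[i] = "" }
        }
        {
            # Skip the line as soon as one string is missing.
            for (i in pat) if (index($0, pat[i]) == 0) next
            print
        }
    ' "$file" "$@"
}
```

For example, awk_all test.txt "Man City" England reads the file once, whereas a grep pipeline starts one process per argument.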

Related

How to store output of a shell script inside a variable in conky?

I'm trying to get the CPU core temperature using the sensors command.
Inside conky, I wrote
$Core 0 Temp:$alignr${execi 1 sensors | grep 'Core 0' | awk {'print $3'}} $alignr${execibar 1 sensors | grep 'Core 0' | awk {'print $3'}}
Each second I'm running the exact same command sensors | grep 'Core 0' | awk {'print $3'} in two places for the exact same output. Is there a way to hold the output in a variable and use that variable in place of the commands?
conky does not have user variables. What you can do instead is call lua from conky to do this for you. The lua language is usually built-in to conky by default, so you need only put some code in a file, include the file in the conky setup file, and call the function. For example, these shell commands will create a test:
cat >/tmp/.conkyrc <<\!
conky.config = {
lua_load = '/tmp/myfunction.lua',
minimum_height = 400,
minimum_width = 600,
use_xft = true,
font = 'Times:size=20',
};
conky.text = [[
set ${lua myfunction t ${execi 1 sensors | awk '/^Core 0/{print 0+$3}'}}°C
get ${lua myfunction t}°C ${lua_bar myfunction t}
]]
!
cat >/tmp/myfunction.lua <<\!
vars = {}
function conky_myfunction(varname, arg)
if(arg~=nil)then vars[varname] = conky_parse(arg) end
return vars[varname]
end
!
conky -c /tmp/.conkyrc -o
In the myfunction.lua file, we declare a function myfunction() (which needs to be prefixed conky_ so we can call it from conky). It takes 2 parameters, the name of a variable, and a conky expression. It calls conky_parse() to evaluate the expression, and saves the value in a table vars, under the name provided by the caller. It then returns the resulting value to the caller. If no expression was given, it will return the previous value.
In the conky.text the line beginning set calls myfunction in lua with the arbitrary name of a variable, t, and the execi sensors expression to evaluate, save, and return. The line beginning get calls myfunction to just get the value.
lua_bar is similar to execbar, but calls a lua function; see man conky. However, it expects a number without the leading + that execbar accepts, so I've changed the awk to return this, and have added the °C to the conky text instead.

How to add a line result "non_assigned" when no match using awk? [duplicate]

When I run a command (COMMAND) on one line of my input file (input.txt), I get an associated result where only one line is interesting: the one that always starts with the word phylum.
For instance:
superkingdom 2759 Eukaryota
clade 554915 Amoebozoa
phylum 555280 Discosea
order 313555 Himatismenida
family 313556 Cochliopodiidae
So I run:
for p in $(cat input.txt)
do COMMAND $p | grep "\bphylum\b" >> results.txt
done
in order to collect in my file results.txt all the lines like:
phylum 555280 Discosea
However, grep sometimes returns no result (there is no line starting with phylum), and then nothing is added to results.txt. For these cases I would like to add a line with "0" or "non assigned", for instance, so that each line of input.txt has a matching line in results.txt. For example, this output has no phylum line:
clade 2696291 Ochrophyta
class 5747 Eustigmatophyceae
order 425074 Eustigmatales
family 425072 Monodopsidaceae
I have tried adding | awk print '{print $0"non_assigned"}', unsuccessfully.
Do you have any ideas to help me? A member advised me to use awk '/phylum/{print $0}!/phylum/{print "non_assigned";exit}', but I get "non_assigned" as output even when the phylum result is present.
Like this?:
$ grep phylum file || echo "non assigned"
Output when phylum found in file:
phylum 555280 Discosea
and when not found:
non assigned
Same in awk:
$ awk '/phylum/ && (found=1); END{if(!found) print "non assigned"}' file
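Applied to the loop from the question, the grep fallback yields exactly one output line per input line. The sketch below is illustrative only: lookup is a hypothetical stand-in for COMMAND with made-up IDs A and B, and reading the IDs with while read (instead of for p in $(cat ...)) is a substitution on my part.

```shell
#!/bin/bash
# lookup is a hypothetical stand-in for COMMAND: it prints a taxonomy listing
# that may or may not contain a phylum line.
lookup() {
    case $1 in
        A) printf '%s\n' 'clade 554915 Amoebozoa' 'phylum 555280 Discosea' ;;
        *) printf '%s\n' 'clade 2696291 Ochrophyta' 'class 5747 Eustigmatophyceae' ;;
    esac
}

# Read one ID per line from stdin; print its phylum line, or "non_assigned"
# when grep finds nothing (grep then exits non-zero, which triggers the echo).
phylum_or_default() {
    local id
    while IFS= read -r id; do
        lookup "$id" | grep -w 'phylum' || echo "non_assigned"
    done
}

printf '%s\n' A B | phylum_or_default
```

In the real script the last line would become something like phylum_or_default < input.txt > results.txt, with lookup replaced by the actual command.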

How to make version-sort command work in a sh file?

I'm trying to use "sort -V" command (aka version-sort) in a sh file.
Specifically, I have the following line of code in a sh file:
SOME_PATH="$(ls dir_1/dir_2/v*/filename.txt | sort -V | tail -n1)"
What I'm trying to accomplish through the above command is that given a list of file paths with different version numbers, I want to get the file path with the greatest version number.
For example, let's assume that I have the following list of file paths:
dir_1/dir_2/v1/filename.txt,
dir_1/dir_2/v2/filename.txt,
dir_1/dir_2/v11/filename.txt
Then, I want the command to return dir_1/dir_2/v11/filename.txt instead of dir_1/dir_2/v2/filename.txt since the former has the greatest version value, "11".
From my understanding the above linux command precisely accomplishes this.
I confirmed it working on the Linux bash terminal.
However, when I run a sh file with the above command in it, I'm getting a
"ERROR: Unknown command line flag 'V'" error message.
Is there a way to make version-sort work in a sh file?
If not, is there a way to implement it not using -V flag?
Thank you.
Using shell's printf and awk (GNU awk is required here, because the three-argument match() is a gawk extension):
SOME_PATH=$(printf %s\\0 dir_1/dir_2/v*/filename.txt |
awk 'BEGIN{FS="/";RS="\0";v=0}{match($3,/v([[:digit:]]+)/,m);if(m[1]>v){v=m[1];l=$0}}END{print l}')
Using awk only:
SOME_PATH=$(awk 'BEGIN{delete ARGV[0];v=0;for(i in ARGV){split(ARGV[i],s,"/");match(s[3],/v([[:digit:]]+)/,m);if(m[1]>v){v=m[1];l=ARGV[i]}}}END{print l}' dir_1/dir_2/v*/filename.txt)
Formatted awk script:
#!/usr/bin/env -S awk -f
BEGIN {
delete ARGV[0]
v=0
for (i in ARGV) {
split(ARGV[i], s, "/")
match(s[3], /v([[:digit:]]+)/, m)
if (m[1]>v) {
v=m[1]
l=ARGV[i]
}
}
}
END {
print l
}
Using a null-delimited list stream, and not parsing the output of ls (see the StackExchange link below):
SOME_PATH=$(
printf '%s\0' dir_1/dir_2/v*/filename.txt |
sort -z -t'/' -k3V |
tail -zn1 |
tr -d '\0'
)
How it works:
printf '%s\0' dir_1/dir_2/v*/filename.txt: Expands the paths into a null delimited stream output.
sort -z -t'/' -k3V: Sorts the null delimited input stream on -k3V version number from the 3rd column, -t'/' using / as a delimiter.
tail -zn1: Outputs the last null-delimited entry from the input stream.
tr -d '\0': Trim-out any remaining null to prevent the shell from complaining with error: warning: command substitution: ignored null byte in input.
StackExchange: Why not parse ls (and what to do instead)?
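If the sort available to the script does not accept -V, a numeric sort key can stand in for it in this particular layout. This is a sketch under stated assumptions: the helper name latest_version_path is hypothetical, the version is always the third /-separated field in the form v<number>, and no path contains a newline.

```shell
#!/bin/sh
# latest_version_path (hypothetical name): print the path whose third
# "/"-separated field (v<number>) carries the largest number.
latest_version_path() {
    # -t/      : use "/" as the field separator
    # -k3.2,3n : sort numerically on field 3, starting at its 2nd character
    #            (this skips the leading "v")
    printf '%s\n' "$@" | sort -t/ -k3.2,3n | tail -n 1
}
```

It would be called as SOME_PATH=$(latest_version_path dir_1/dir_2/v*/filename.txt), letting the shell expand the glob instead of parsing ls.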

How can I fix my bash script to find a random word from a dictionary?

I'm studying bash scripting and I'm stuck fixing an exercise of this site: https://ryanstutorials.net/bash-scripting-tutorial/bash-variables.php#activities
The task is to write a bash script to output a random word from a dictionary whose length is equal to the number supplied as the first command line argument.
My idea was to create a sub-dictionary, assign each word a number line, select a random number from those lines and filter the output, which worked for a similar simpler script, but not for this.
This is the code I used:
6 DIC='/usr/share/dict/words'
7 SUBDIC=$( egrep '^.{'$1'}$' $DIC )
8
9 MAX=$( $SUBDIC | wc -l )
10 RANDRANGE=$((1 + RANDOM % $MAX))
11
12 RWORD=$(nl "$SUBDIC" | grep "\b$RANDRANGE\b" | awk '{print $2}')
13
14 echo "Random generated word from $DIC which is $1 characters long:"
15 echo $RWORD
and this is the error I get using as input "21":
bash script.sh 21
script.sh: line 9: counterintelligence's: command not found
script.sh: line 10: 1 + RANDOM % 0: division by 0 (error token is "0")
nl: 'counterintelligence'\''s'$'\n''electroencephalograms'$'\n''electroencephalograph': No such file or directory
Random generated word from /usr/share/dict/words which is 21 characters long:
I tried in bash to split the code in smaller pieces obtaining no error (input=21):
egrep '^.{'21'}$' /usr/share/dict/words | wc -l
3
but once in the script, lines 9 and 10 give errors.
Where do you think the error is?
problems
SUBDIC=$( egrep '^.{'$1'}$' $DIC ) will store all words of the given length in the SUBDIC variable, so its content is now something like foo bar baz.
MAX=$( $SUBDIC | ... ) will try to run the command foo bar baz which is obviously bogus; it should be more like MAX=$(echo $SUBDIC | ... )
MAX=$( ... | wc -l ) will count the lines; when using the above mentioned echo $SUBDIC you will have multiple words, but all in one line...
RWORD=$(nl "$SUBDIC" | ...) same problem as above: there's only one line (also note @armali's answer that nl requires a file or stdin)
RWORD=$(... | grep "\b$RANDRANGE\b" | ...) might match the dictionary entry catch 22
likely RWORD=$(... | awk '{print $2}') won't handle lines containing spaces
a simple solution
doing a "random sort" over all the possible words and taking the first line should be sufficient:
egrep "^.{$1}$" "${DIC}" | sort -R | head -1
MAX=$( $SUBDIC | wc -l ) - A pipe connects the output of one command to another command, but $SUBDIC isn't a command; an appropriate syntax is MAX=$( <<<"$SUBDIC" wc -l ).
nl "$SUBDIC" - The argument to nl has to be a filename, which "$SUBDIC" isn't; an appropriate syntax is nl <<<"$SUBDIC".
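Putting these fixes together, one way to restructure the script is to keep the matching words in an array, so that counting and indexing never depend on word splitting. This is a minimal sketch, assuming bash 4 or later for mapfile; the function name random_word and its optional second argument are hypothetical.

```shell
#!/bin/bash
# random_word (hypothetical name): print a random word of exactly LENGTH
# characters from DICTIONARY (default: /usr/share/dict/words).
# Usage: random_word LENGTH [DICTIONARY]
random_word() {
    local len=$1 dic=${2:-/usr/share/dict/words}
    local -a words
    # mapfile stores each matching line as one array element,
    # so entries containing spaces or apostrophes stay intact.
    mapfile -t words < <(grep -E "^.{$len}\$" "$dic")
    # Return an error instead of dividing by zero when nothing matches.
    (( ${#words[@]} > 0 )) || return 1
    local idx=$(( RANDOM % ${#words[@]} ))
    printf '%s\n' "${words[idx]}"
}
```

Note that $RANDOM only ranges from 0 to 32767, so with more matching words than that, the later entries would never be picked.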
This code will do it. My test dictionary of words is in a file named file. It's a good idea to get all words of a given length first, but put them in an array rather than a plain variable. Then pick a random index and echo that element.
dic=( $(sed -n "/^.\{$1\}$/p" file) )
ind=$((0 + RANDOM % ${#dic[@]}))
echo "${dic[$ind]}"
I am also doing this activity and came up with a simple solution.
I created this script:
#!/bin/bash
awk "NR==$1 {print}" /usr/share/dict/words
If you want a random word, run the script from the terminal like this:
./script.sh $RANDOM
If you want to print the word at a specific line number, run it like this:
./script.sh 465
cat /usr/share/dict/american-english | head -n $RANDOM | tail -n 1
$RANDOM - Returns a different random number each time it is referenced.
This simple line outputs a random word from the dictionary mentioned above.
Otherwise, as umläute mentioned, you can do:
cat /usr/share/dict/american-english | sort -R | head -1

Can't input date variable in bash

I have a directory /user/reports that contains many files. One of them is:
report.active_user.30092018.77325.csv
I need the number after the date as output, i.e. 77325 from the above file name.
I created the command below to extract that value from the file name:
ls /user/reports | awk -F. '/report.active_user.30092018/ {print $(NF-1)}'
Now I want to pass the current date into the above command as a variable and get the result:
ls /user/reports | awk -F. '/report.active_user.$(date +'%d%m%Y')/ {print $(NF-1)}'
But I am not getting the required output.
Tried bash script:
#!/usr/bin/env bash
_date=`date +%d%m%Y`
active=$(ls /user/reports | awk -F. '/report.active_user.${_date}/ {print $(NF-1)}')
echo $active
But the output is still blank.
Please help with the proper syntax.
As @cyrus said, you must use double quotes in your variable assignment, because single quotes produce a literal string in which variables are not expanded.
Bad use case
number=10
string='I m sentence with or without var $number'
echo $string
Correct use case
number=10
string_with_number="I m sentence with var $number"
echo $string_with_number
You can use single quotes, as long as they do not enclose the variable itself
number=10
string_with_number='I m sentence with var '$number
echo $string_with_number
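For the awk command in the question, another option is to leave the awk program in single quotes and hand the date over with -v, so the shell expands it outside the program text. The sketch below is one possible arrangement, not the only one: the function name extract_number is hypothetical, and it assumes the file names arrive one per line on stdin.

```shell
#!/bin/bash
# extract_number (hypothetical name): read file names from stdin and print the
# field before the extension for names starting with report.active_user.<DATE>.
# Usage: extract_number [DATE]   (DATE defaults to today as ddmmYYYY)
extract_number() {
    local d=${1:-$(date +%d%m%Y)}
    # -v passes the shell value into awk as the variable "prefix";
    # index(...) == 1 means the name starts with that literal prefix.
    awk -F. -v prefix="report.active_user.$d." \
        'index($0, prefix) == 1 { print $(NF-1) }'
}

printf '%s\n' report.active_user.30092018.77325.csv | extract_number 30092018
```

Using index() instead of a /regex/ match also means the dots in the file name are compared literally rather than as "any character".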
Don't parse ls
You don't need awk for this: you can manage with the shell's capabilities
for file in report.active_user."$(date "+%d%m%Y")"*; do
tmp=${file%.*} # remove the extension
number=${tmp##*.} # remove the prefix up to and including the last dot
echo "$number"
done
See https://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion
