Ubuntu bash script grepping and cutting on first column

I am trying to implement a bash script that will take the piped input and cut the first column and return a total. The difference is that the cut is not from a file, but from the piped input. For example:
#!/bin/bash
total=$(cut -f1 $1 | numsum)
echo $total
Basically, in my scenario the first column will always be from the passed input.
Example usage is:
./script.sh "cat data.txt | grep -i status | grep something"
data.txt contains:
1 status
2 something
This will produce something like:
2
How can this be achieved? I have noticed that cut seems to work on files only, and I cannot find any examples on Google.

I have managed to fix the issue myself. The code is:
#!/bin/bash
total=$(eval "$1" | awk '{sum+=$1} END {print sum}')
echo $total
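An alternative that avoids eval entirely (a minimal sketch, assuming you control how the script is invoked) is to let the script read from stdin and keep the pipeline outside it:
#!/bin/bash
# Sum the first whitespace-separated column of whatever is piped in.
total=$(awk '{sum+=$1} END {print sum}')
echo "$total"
It is then called as the last stage of the pipeline:
cat data.txt | grep -i status | grep something | ./script.sh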

Related

Make this code read from file

Hello guys, I wrote a shell script on Linux, but the code only reads from the keyboard. I want to change it to read from a file: for example, if I run ./car.sh lamborghini.txt, it should give me the most expensive model for that manufacturer.
code is like this:
#!/bin/sh
echo "Choose one of them"
read manu
sort -t';' -nrk3 auto.dat > auto1.dat
grep $manu auto1.dat | head -n1 | cut -d';' -f2
and auto.dat file contains these:
Lamborghini;Aventador;700000
Lamborghini;Urus;200000
Tesla;ModelS;180000
Tesla;ModelX;140000
Ford;Mustang;300000
Ford;Focus;20000
The read command always reads from stdin. You can use redirection < to read the content of a file.
Reading $manu from a file's content
#!/bin/sh
read manu < "$1"
sort -t';' -nrk3 auto.dat | grep "$manu" | head -n1 | cut -d';' -f2
This version of your script expects a file name as a command line parameter. The first line of said file will be stored in $manu. Example:
./car.sh fileWithSelection.txt
The file should contain the text you would have entered in your old script.
Reading $manu from a command line parameter
In my opinion, it would make more sense to interpret the command line parameters directly, instead of using files and passing them to the script.
#!/bin/sh
manu="$1"
sort -t';' -nrk3 auto.dat | grep "$manu" | head -n1 | cut -d';' -f2
Example:
./car.sh "text you would have entered in your old script."
You can try it this way, but the file (Tesla.txt here) must contain the manufacturer name Tesla:
#!/bin/sh
read manu < "$1"
awk -F';' -v mod="$manu" '
  $1 == mod { if ($3 > a) { a = $3; b = $2 } }
  END { if (b) print "The most expensive " mod " is " b " at " a }' auto.dat
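For example, with the auto.dat above and a file Tesla.txt containing the single word Tesla, this should print something like:
$ ./car.sh Tesla.txt
The most expensive Tesla is ModelS at 180000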

Bash - Piping output of command into while loop

I'm writing a Bash script where I need to look through the output of a command and do certain actions based on that output. For clarity, this command will output a few million lines of text and it may take roughly an hour or so to do so.
Currently, I'm executing the command and piping it into a while loop that reads a line at a time then looks for certain criteria. If that criterion exists, then update a .dat file and reprint the screen. Below is a snippet of the script.
eval "$command"| while read line ; do
if grep -Fq "Specific :: Criterion"; then
#pull the sixth word from the line which will have the data I need
temp=$(echo "$line" | awk '{ printf $6 }')
#sanity check the data
echo "\$line = $line"
echo "\$temp = $temp"
#then push $temp through a case statement that does what I need it to do.
fi
done
So here's the problem: the sanity check on the data is showing weird results; it is printing lines that don't contain the grep criterion.
To make sure that my grep statement is working properly, I grep the log file that contains a record of the text that is output by the command and it outputs only the lines that contain the specified criteria.
I'm still fairly new to Bash so I'm not sure what's going on. Could it be that the command is force feeding the while loop a new $line before it can process the $line that met the grep criteria?
Any ideas would be much appreciated!
How does grep know what line looks like? It doesn't: as written, grep has no input of its own, so it reads from the loop's stdin and silently consumes the very lines your read was supposed to get, which is why seemingly unrelated lines are printed. Feed it the current line explicitly:
if ( printf '%s\n' "$line" | grep -Fq "Specific :: Criterion"); then
But I can't help feeling that you are overcomplicating this a lot.
process() {
    echo "I can do anything I want"
    echo "  per element $1"
    echo "  that I want here"
}
export -f process
$command | grep -F "Specific :: Criterion" | awk '{print $6}' | xargs -I % -n 1 bash -c "process %";
Run the command, filter only the matching lines, and pull out the sixth element. Then, if you need to run arbitrary code on it, send it to a function (exported so that it is visible in subprocesses) via xargs.
What are you applying grep to?
Modify
if grep -Fq "Specific :: Criterion"; then
as below
if echo "$line" | grep -Fq "Specific :: Criterion"; then
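Putting it together, a minimal sketch of the corrected loop (the case statement from the question is elided); this variant uses Bash's own substring test, so no grep process is spawned per line:
eval "$command" | while read -r line; do
    # Bash pattern match instead of a per-line grep
    if [[ $line == *"Specific :: Criterion"* ]]; then
        temp=$(echo "$line" | awk '{ print $6 }')
        echo "\$line = $line"
        echo "\$temp = $temp"
    fi
done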

Issues passing AWK output to BASH Variable

I'm trying to parse lines from an error log in BASH and then send a certain part out to a BASH variable to be used later in the script, but I'm having issues once I try to pass it to the variable.
What the log file looks like:
1446851818|1446851808.1795|12|NONE|DID|8001234
I need the number in the third field of the line (in this case, 12).
Here's an example of the command I'm running:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '[|]' '{print $3}'
The line of code is trying to accomplish this:
Grab the last lines of the log file
Search for a phrase (in this case connect, I'm using the same command to trigger different items)
Separate the number in the third set of the line out so it can be used elsewhere
If I run the above full command, it runs successfully like so:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '[|]' '{print $3}'
12
Now if I try and assign it to a variable in the same line/command, I'm unable to have it echo back the variable.
My command when assigning to a variable looks like:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | brand=$(awk -F '[|]' '{print $3}')
It is being run in the same script as the echo command, so the variable should be fine. The test script looks like:
#!/bin/bash
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | brand=$(awk -F '[|]' '{print $3}')
echo "$brand";
I'm aware this is most likely not the most efficient/elegant way to do this, so if there are other ideas/ways to accomplish it I'm open to them as well (my BASH skills are basic but improving).
You need to capture the output of the entire pipeline, not just the final section of it:
brand=$(tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '|' '{print $3}')
You may also want to consider what will happen if there is more than one line containing CONNECT in the final five lines of the file (or indeed, if there are none). That's going to cause brand to have multiple (or no) values.
If your intent is to get the third field from the latest line in the file containing CONNECT, awk can pretty much handle the entire thing without needing tail or grep:
brand=$(awk -F '|' '/CONNECT/ {latest = $3} END {print latest}' /var/log/asterisk/queue_log)
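For instance, the field extraction can be checked against the sample line from the question:
$ echo '1446851818|1446851808.1795|12|NONE|DID|8001234' | awk -F '|' '{print $3}'
12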

grepping using the result of previous grep

Is there a way to perform a grep based on the results of a previous grep, rather than just piping multiple greps into each other. For example, say I have the log file output below:
ID 1000 xyz occured
ID 1001 misc content
ID 1000 misc content
ID 1000 status code: 26348931276572174
ID 1000 misc content
ID 1001 misc content
To begin with, I'd like to grep the whole log file to see if "xyz occured" is present. If it is, I'd like to get the ID number of that event and grep through all the lines in the file with that ID number, looking for the status code.
I'd imagined that I could use xargs or something like that, but I can't seem to get it to work.
grep "xyz occured" file.log | awk '{ print $2 }' | xargs grep "status code" | awk '{print $NF}'
Any ideas on how to actually do this?
A general answer for grep-ing the grep-ed output:
grep 'pattern1' *.txt | grep 'pattern2'
notice that the second grep is not pointing at a file.
You're almost there. But while xargs can sometimes be used to do what you want (depending on how the next command takes its arguments), you aren't actually using it to grep for the ID you just extracted. What you need to do is take the output of the first grep (containing the ID code) and use that in the next grep's expression. Something like:
grep "^ID `grep 'xyz occured' file.log | awk '{print $2}'` status code" file.log
Obviously another option would be to write a script to do this in one pass, à la Ed's suggestion.
Yet another way
for x in `grep "xyz occured" file.log | cut -d' ' -f2`
do
    grep "$x" file.log
done
The thing I like about this method is that, if you wanted to, you could write the output to a separate file for each ID:
grep "$x" file.log >> "/var/tmp/$x.out"
This is all about retrieving files within a narrowed search scope; in your case, the search scope is determined by a file's content.
I have run into this problem often when reducing the search scope through successive searches (applying filters to the previous grep's results).
Trying to find a general answer:
Generate a list of matching files with the first grep:
grep 'pattern1' *.txt | awk -F':' '{print $1}'
Then grep within that list of files:
xargs grep -i 'pattern2'
Apply this cascading filter as many times as you need, using awk to keep only the filenames and xargs to pass those filenames to the next grep -i.
For example:
grep 'pattern1' *.txt | awk -F':' '{print $1}' | xargs grep -i 'pattern2'
Just use awk:
awk '{info[$2] = info[$2] $0 ORS} /xyz occured/{ids[$2]} END{ for (id in ids) printf "%s",info[id]}' file.log
or:
awk '/status code/{code[$2]=$NF} /xyz occured/{ids[$2]} END{ for (id in ids) print code[id]}' file.log
depending on what you really want to output. Some expected output in your question would help.
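For instance, run against the sample log from the question, the second command prints just the status code:
$ awk '/status code/{code[$2]=$NF} /xyz occured/{ids[$2]} END{ for (id in ids) print code[id]}' file.log
26348931276572174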
Grep the result of a previous Grep:
Given this file contents:
ID 1000 xyz occured
ID 1001 misc content
ID 1000 misc content
ID 1000 status code: 26348931276572174
ID 1000 misc content
ID 1001 misc content
This command:
grep "xyz" file.log | awk '{ print $2 }' > f.log; grep `cat f.log` file.log;
returns this:
ID 1000 xyz occured
ID 1000 misc content
ID 1000 status code: 26348931276572174
ID 1000 misc content
It looks for "xyz" in file.log and places the resulting ID in f.log, then greps for that ID in file.log. If the first grep returns multiple ID numbers, the second grep will only search for the first ID number and will treat the others as file names, erroring out on them.

Split output of command by columns using Bash?

I want to do this:
run a command
capture the output
select a line
select a column of that line
Just as an example, let's say I want to get the command name from a $PID (please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).
If I run ps I get:
PID TTY TIME CMD
11383 pts/1 00:00:00 bash
11771 pts/1 00:00:00 ps
Now I do ps | egrep 11383 and get
11383 pts/1 00:00:00 bash
Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:
<absolutely nothing/>
The problem is that cut splits the output on single spaces, and since ps adds some spaces between the 2nd and 3rd columns to keep some resemblance of a table, cut picks up an empty string. Of course, I could use cut to select the 7th field rather than the 4th, but how can I know which one, especially when the output is variable and not known beforehand?
One easy way is to add a pass of tr to squeeze any repeated field separators out:
$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4
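which, for the sample line above, prints:
bash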
I think the simplest way is to use awk. Example:
$ echo "11383 pts/1 00:00:00 bash" | awk '{ print $4; }'
bash
Please note that tr -s ' ' squeezes repeated spaces but will not remove a single leading space. If your column is right-aligned (as with the ps PID column)...
$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root
Then cutting will yield an empty first field for any line that kept a leading space:
$ <previous command> | cut -d ' ' -f1

19645
19731
Unless you precede it with a space, obviously
$ <command> | sed -e "s/.*/ &/" | tr -s " "
Now, for this particular case of PID numbers (not names), there is a dedicated tool called pgrep:
$ pgrep ssh
Shell functions
However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:
$ <command> | while read a b; do echo $a; done
The first parameter to read, a, receives the first column, and if there are more columns, everything else is put into b. As a result, you never need more variables than the index of the column you want, plus one.
So,
while read a b c d; do echo $c; done
will then output the 3rd column. As indicated in my comment...
A piped read will be executed in an environment that does not pass variables to the calling script.
out=$(ps whatever | { read a b c d; echo $c; })
arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]} # will output the value of 'b'
The Array Solution
So we then end up with the answer by @frayser, which is to use the shell variable IFS (which defaults to a space) to split the string into an array. It only works in Bash, though; Dash and Ash do not support it. I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need, but then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting with ${name%% *} and so on. It makes you yearn for some Python skills, because shell scripting is not a lot of fun anymore when half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
try (note: this is ksh syntax; |& starts ps as a co-process and read -p reads from that co-process):
ps |&
while read -p first second third fourth etc ; do
    if [[ $first == '11383' ]]
    then
        echo "got: $fourth"
    fi
done
Your command
ps | egrep 11383 | cut -d" " -f 4
misses a tr -s to squeeze spaces, as unwind explains in his answer.
However, you maybe want to use awk, since it handles all of these actions in a single command:
ps | awk '/11383/ {print $4}'
This prints the 4th column of those lines containing 11383. If you want to match 11383 only when it appears at the beginning of the line, you can say ps | awk '/^11383/ {print $4}'.
Using array variables
set $(ps | egrep "^11383 "); echo $4
or
A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
Similar to brianegge's awk solution, here is the Perl equivalent:
ps | egrep 11383 | perl -lane 'print $F[3]'
-a enables autosplit mode, which populates the @F array with the column data.
Use -F, if your data is comma-delimited, rather than space-delimited.
Index 3 gives the fourth field, since Perl starts counting from 0 rather than 1.
Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:
command|head -n 6|tail -n 1|awk '{print $4}'
Instead of doing all these greps and stuff, I'd advise you to use ps's ability to change its output format.
ps -o cmd= -p 12345
You get the command line of the process with the specified pid, and nothing else.
This is POSIX-conformant and may thus be considered portable.
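For example, asking for the shell's own command line (the exact output depends on how the shell was started):
$ ps -o cmd= -p $$
bash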
Bash's set will parse all output into positional parameters.
For instance, with set $(free -h), echo $7 will show "Mem:".
