Parse info from bash - linux

I'm trying to parse certain information out of a command's output in a bash script on Ubuntu.
I have a bash script that runs every x seconds and executes:
forever list
The response from that command looks like this:
info: Forever processes running
data: uid command script forever pid logfile uptime
data: [0] _1b2 /usr/bin/nodejs /home/ubuntu/node/server.js 28968 28970 /root/.forever/_1b2.log 0:0:17:17.233
I want to parse out the location of the logfile /root/.forever/_1b2.log
Any ideas how to accomplish this with bash?

Two of the many awk variations to solve this issue:
# most basic
command | awk 'NR==3{ print $8 }'
# a little bit more robust:
command | awk '$1=="data:" && $2=="[0]" { print $8 }'
#                             ^^^^^^^^^
# here I filter on the "[0]" text, but depending on your needs
# you might want to use $3=="_1b2" or $4=="/usr/bin/nodejs"
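To use the result later in the script, the awk filter can be wrapped in a command substitution. A minimal sketch, assuming the output format shown in the question; forever_list is a hypothetical stand-in that just prints that sample, so the snippet runs without forever installed:

```shell
# Hypothetical stand-in for `forever list`, printing the sample from the question.
forever_list() {
  cat <<'EOF'
info: Forever processes running
data: uid command script forever pid logfile uptime
data: [0] _1b2 /usr/bin/nodejs /home/ubuntu/node/server.js 28968 28970 /root/.forever/_1b2.log 0:0:17:17.233
EOF
}

# Keep only the "data:" row for process [0] and print its 8th field.
logfile=$(forever_list | awk '$1 == "data:" && $2 == "[0]" { print $8 }')
echo "$logfile"
```

In a real script, forever_list would be replaced by the actual forever list call.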

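You could try the sed commands below. The first uses extended regular expressions (-r) and the second uses basic ones; both rely on GNU sed for \S and \+.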
$ command | sed -nr 's/^.* (\S+) [0-9]+:[0-9]+\S+$/\1/p'
/root/.forever/_1b2.log
$ command | sed -n 's/^.* \(\S\+\) [0-9]\+:[0-9]\+\S\+$/\1/p'
/root/.forever/_1b2.log
Each command prints only the run of non-space characters that is followed by a space, one or more digits, a colon and one or more digits (the start of the uptime field) at the end of the line.

You could grep the line(s) you're looking for and pipe the output to awk:
forever list | grep "\.log" | awk '{print $8}'
This will find all processes with a .log file, and print the log file's location (the 8th column in the sample output above).

Assuming the logfile is in the /root/.forever directory, you can use grep with this regex:
forever list | grep -o "/root/\.forever/.*\.log"
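One caveat, offered as a sketch rather than a correction: .* is greedy, so if anything else ending in .log followed the path on the same line, it would be included in the match. Restricting the match to non-space characters avoids that. The sample line is copied from the question, with a made-up trailing token added for illustration:

```shell
# Sample row from the question, plus a hypothetical trailing "extra.log" token.
line='data: [0] _1b2 /usr/bin/nodejs /home/ubuntu/node/server.js 28968 28970 /root/.forever/_1b2.log extra.log'

# [^ ]* cannot cross a space, so the match stops at the end of the path.
logfile=$(printf '%s\n' "$line" | grep -o '/root/\.forever/[^ ]*\.log')
echo "$logfile"
```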

Related

Set part of grep to variable

mysqladmin proc status | grep "Threads"
Output:
Uptime: 2304 Threads: 14 Questions: 2652099 Slow queries: 0 Opens: 48791 Flush tables: 3 Open tables: 4000 Queries per second avg: 1151.08
I would like to set a variable so that echo $mysqlthread outputs 14.
Probably the easiest way is with Perl instead of grep.
mysqladmin proc status | perl -nle'/Threads: (\d+)/ && print $1'
perl -n means "go through each line of input".
perl -l means "print a \n at the end of every print"
perl -e means "here is my program"
/Threads: (\d+)/ means "match Threads: followed by one or more digits", and print $1 means "print the digits I found, as captured by the parentheses around \d+".
Using grep
$ mysqlthread=$(mysqladmin proc status | grep -Po 'Threads: \K\d+')
$ echo "$mysqlthread"
14
There are many ways to solve this. This is one:
mysqladmin proc status | grep "Threads" | tr -s ' ' | cut -d' ' -f4
The tr command with the -s flag squeezes each run of consecutive spaces into a single space. The cut command then returns the 4th field, using a single space as the delimiter.
The advantage of piping commands is that you can build the pipeline up interactively, one stage at a time. And whenever you aren't sure which flag to use, the manual pages are there to help: man grep, man tr, man cut, etc.
Add awk to split the output:
mysqlthread=$(mysqladmin proc status | grep "Threads" | awk '{print $4}')
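Counting fields works as long as the status line keeps its layout. A slightly more defensive sketch looks for the Threads: label and prints the field after it. The status line here is a shortened copy of the one from the question, fed through printf so the snippet runs without a MySQL server:

```shell
# Sample status line, shortened from the question.
status='Uptime: 2304  Threads: 14  Questions: 2652099  Slow queries: 0'

# Scan the fields for the "Threads:" label and print the value that follows it.
mysqlthread=$(printf '%s\n' "$status" |
  awk '{ for (i = 1; i < NF; i++) if ($i == "Threads:") { print $(i + 1); exit } }')
echo "$mysqlthread"
```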

how to capture the 1st command output in terminal and to print that variable using the last command in the same line

actual output comes as
$ grep -Hcw count copy_hb_script
copy_hb_script:5
I'm using the command below to get the expected output, but it fails:
grep -Hcw count copy_hb_script | awk '{print $1}' |xargs ls -ld | awk '{print $8 " " $9 }'
The output I get is:
03:49 copy_hb_script
The count is missing. Is there an alternative that gives the timestamp together with the count, like below?
03:49 copy_hb_script:5
You can avoid parsing ls output and use the stat command in bash for detailed file information.
# 'stat -c %y' produces output such as 2016-09-15 16:03:40.655456000 +0530
# Stripping off extra information after the '.' using string-manipulation
# Running the grep with the count together with the previous command
modDate=$(stat -c %y copy_hb_script); echo "${modDate%.*}" "$(grep -Hcw count copy_hb_script)"
This produces output such as:
2016-09-15 16:03:40 copy_hb_script:5
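A self-contained sketch of the same idea, assuming GNU stat (the -c option is not available in BSD or macOS stat). The temporary file and its contents are made up for the demonstration:

```shell
# Create a throwaway file containing the word "count" on two lines.
file=$(mktemp)
printf 'count one\nno match here\ncount two\n' > "$file"

modDate=$(stat -c %y "$file")        # e.g. 2016-09-15 16:03:40.655456000 +0530
matches=$(grep -Hcw count "$file")   # prints <file>:<number of matching lines>

# ${modDate%.*} drops the fractional seconds and the time zone.
result="${modDate%.*} $matches"
echo "$result"
rm -f "$file"
```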

Issues passing AWK output to BASH Variable

I'm trying to parse lines from an error log in BASH and then send a certain part out to a BASH variable to be used later in the script and having issues once I try and pass it to a BASH variable.
What the log file looks like:
1446851818|1446851808.1795|12|NONE|DID|8001234
I need the number in the third set (in this case, the number is 12) of the line
Here's an example of the command I'm running:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '[|]' '{print $3}'
The line of code is trying to accomplish this:
Grab the last lines of the log file
Search for a phrase (in this case connect, I'm using the same command to trigger different items)
Separate the number in the third set of the line out so it can be used elsewhere
If I run the above full command, it runs successfully like so:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '[|]' '{print $3}'
12
Now if I try and assign it to a variable in the same line/command, I'm unable to have it echo back the variable.
My command when assigning to a variable looks like:
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | brand=$(awk -F '[|]' '{print $3}')
It is being run in the same script as the echo command, so the variable should be in scope. The test script looks like:
#!/bin/bash
tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | brand=$(awk -F '[|]' '{print $3}')
echo "$brand";
I'm aware this is most likely not the most efficient or elegant way to do this, so if there are other ideas or ways to accomplish it, I'm open to them as well (my BASH skills are basic but improving).
You need to capture the output of the entire pipeline, not just the final section of it:
brand=$(tail -n5 /var/log/asterisk/queue_log | grep 'CONNECT' | awk -F '|' '{print $3}')
You may also want to consider what will happen if there is more than one line containing CONNECT in the final five lines of the file (or indeed, if there are none). That's going to cause brand to have multiple (or no) values.
If your intent is to get the third field from the latest line in the file containing CONNECT, awk can pretty much handle the entire thing without needing tail or grep (note that this scans the whole file rather than only the last five lines):
brand=$(awk -F '|' '/CONNECT/ {latest = $3} END {print latest}' /var/log/asterisk/queue_log)
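The caveat about zero or multiple matches can be handled explicitly. A sketch, assuming a log in the pipe-delimited layout from the question; the sample lines and the temporary file are invented for the demonstration:

```shell
# Hypothetical log excerpt; only the field layout follows the question.
log=$(mktemp)
cat > "$log" <<'EOF'
1446851810|1446851808.1795|11|NONE|CONNECT|8001234
1446851815|1446851808.1795|11|NONE|DID|8001234
1446851818|1446851808.1795|12|NONE|CONNECT|8001234
EOF

# Look only at the last five lines and keep the third field of the latest CONNECT line.
brand=$(tail -n5 "$log" | awk -F '|' '/CONNECT/ { latest = $3 } END { print latest }')
rm -f "$log"

if [ -z "$brand" ]; then
  echo "no CONNECT line found" >&2
else
  echo "$brand"
fi
```

Because awk keeps overwriting latest, only the most recent match survives, and the empty-string check covers the case where nothing matched.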

How to return substring from a linux command

I'm connecting to an Exadata machine and want to get the value of the "ORACLE_HOME" variable on it. So I'm using this command:
ls -l /proc/<pid>/cwd
this is the output:
2 oracle oinstall 0 Jan 23 21:20 /proc/<pid>/cwd -> /u01/app/database/11.2.0/dbs/
I need to get the last part:
/u01/app/database/11.2.0 (I don't want the "/dbs/" there)
I will be using this command several times on different machines. So how can I get this substring from the whole output?
Awk and grep are good for these types of issues.
New:
ls -l /proc/<pid>/cwd | awk '{print ($NF) }' | sed 's#/dbs/##'
Old:
ls -l /proc/<pid>/cwd | awk '{print ($NF) }' | egrep -o '^.+[.0-9]'
Awk prints the last column of the input, which is the output of your ls command, and then grep grabs the beginning of that string up to the last occurrence of numbers and dots. This is a situational solution and perhaps not the best.
Parsing the output of ls is generally considered sub-optimal. I would use something more like this instead:
dirname "$(readlink -f /proc/<pid>/cwd)"
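The readlink approach can be tried without a real Oracle process. A sketch, assuming GNU readlink (for -f); the temporary directory and the symlink named cwd are stand-ins for /proc/<pid>/cwd:

```shell
# Build a throwaway directory tree and a symlink that plays the role of /proc/<pid>/cwd.
tmp=$(cd "$(mktemp -d)" && pwd -P)
mkdir -p "$tmp/u01/app/database/11.2.0/dbs"
ln -s "$tmp/u01/app/database/11.2.0/dbs" "$tmp/cwd"

# Resolve the link, then strip the last path component.
oracle_home=$(dirname "$(readlink -f "$tmp/cwd")")
echo "$oracle_home"
rm -rf "$tmp"
```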

Split output of command by columns using Bash?

I want to do this:
run a command
capture the output
select a line
select a column of that line
Just as an example, let's say I want to get the command name from a $PID (please note this is just an example, I'm not suggesting this is the easiest way to get a command name from a process id - my real problem is with another command whose output format I can't control).
If I run ps I get:
PID TTY TIME CMD
11383 pts/1 00:00:00 bash
11771 pts/1 00:00:00 ps
Now I do ps | egrep 11383 and get
11383 pts/1 00:00:00 bash
Next step: ps | egrep 11383 | cut -d" " -f 4. Output is:
<absolutely nothing/>
The problem is that cut splits the output on single spaces, and since ps pads the 2nd and 3rd columns with extra spaces to keep some semblance of a table, cut picks an empty string. Of course, I could use cut to select the 7th and not the 4th field, but how can I know which one, especially when the output is variable and not known beforehand?
One easy way is to add a pass of tr to squeeze any repeated field separators out:
$ ps | egrep 11383 | tr -s ' ' | cut -d ' ' -f 4
I think the simplest way is to use awk. Example:
$ echo "11383 pts/1 00:00:00 bash" | awk '{ print $4; }'
bash
Please note that tr -s ' ' will not remove a single leading space. If your column is right-aligned (as with ps pid)...
$ ps h -o pid,user -C ssh,sshd | tr -s " "
 1543 root
19645 root
19731 root
Then cutting will result in a blank line for some of those fields if it is the first column:
$ <previous command> | cut -d ' ' -f1

19645
19731
Unless you precede every line with a space, obviously:
$ <command> | sed -e "s/.*/ &/" | tr -s " "
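Another option, sketched here on canned input rather than live ps output, is to strip the leading blanks before squeezing, so that field 1 is always the PID:

```shell
# Right-aligned sample rows, as ps would print them (values from the example above).
rows=$(printf ' 1543 root\n19645 root\n19731 root\n')

# Remove leading spaces, squeeze the remaining runs, then cut the first field.
pids=$(printf '%s\n' "$rows" | sed 's/^ *//' | tr -s ' ' | cut -d ' ' -f 1)
echo "$pids"
```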
Now, for this particular case of pid numbers (not names), there is a command called pgrep:
$ pgrep ssh
Shell functions
However, in general it is actually still possible to use shell functions in a concise manner, because there is a neat thing about the read command:
$ <command> | while read a b; do echo $a; done
The first parameter to read, a, receives the first column, and if there is more, everything else is put in b. As a result, you never need more variables than your column number plus one.
So,
while read a b c d; do echo $c; done
will then output the 3rd column. As indicated in my comment...
A piped read is executed in a subshell, so the variables it sets are not passed back to the calling script. Capture the output with command substitution instead:
out=$(ps whatever | { read a b c d; echo $c; })
arr=($(ps whatever | { read a b c d; echo $c $b; }))
echo ${arr[1]} # will output the value that was read into b
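In plain POSIX sh, a here-document is another way to keep the variables in the calling shell, because read then runs in the current shell rather than at the end of a pipe. A sketch, with sample_ps as a hypothetical stand-in for the real command:

```shell
# Hypothetical stand-in that prints the ps output from the question.
sample_ps() {
  printf '%s\n' '  PID TTY          TIME CMD' \
                '11383 pts/1    00:00:00 bash' \
                '11771 pts/1    00:00:00 ps'
}

# The command substitution is expanded inside the here-document,
# and read splits its first line on whitespace in the current shell.
read -r pid tty cputime cmd <<EOF
$(sample_ps | grep '^ *11383 ')
EOF

echo "$cmd"
```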
The Array Solution
So we then end up with the answer by @frayser, which is to use the shell variable IFS, which defaults to a space, to split the string into an array. It only works in Bash though; Dash and Ash do not support it. I have had a really hard time splitting a string into components on a Busybox system. It is easy enough to get a single component (e.g. using awk) and then to repeat that for every parameter you need. But then you end up repeatedly calling awk on the same line, or repeatedly using a read block with echo on the same line, which is neither efficient nor pretty. So you end up splitting using ${name%% *} and so on. It makes you yearn for some Python skills, because shell scripting is not a lot of fun anymore if half or more of the features you are accustomed to are gone. But you can assume that even Python would not be installed on such a system, and it wasn't ;-).
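For shells without arrays, the ${name%% *} style of splitting mentioned above looks roughly like this. It is a sketch that assumes fields are separated by exactly one space; the sample line is made up:

```shell
line='11383 pts/1 00:00:00 bash'

first=${line%% *}    # drop everything from the first space on
rest=${line#* }      # drop the first field and its separator
second=${rest%% *}   # first field of what remains
last=${line##* }     # drop everything up to the last space

echo "$first $second $last"
```

These expansions are plain POSIX parameter expansion, so no external process is started per field.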
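Try reading the columns into named variables (the ps |& and read -p coprocess form is ksh syntax; in bash a plain pipe does the job):
ps |
while read -r first second third fourth etc ; do
  if [[ $first == '11383' ]]
  then
    echo "got: $fourth"
  fi
done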
Your command
ps | egrep 11383 | cut -d" " -f 4
misses a tr -s to squeeze spaces, as unwind explains in his answer.
However, you may want to use awk, since it handles all of these actions in a single command:
ps | awk '/11383/ {print $4}'
This prints the 4th column of those lines containing 11383. If you want this to match 11383 only when it appears at the beginning of the line, then you can say ps | awk '/^11383/ {print $4}'.
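Matching on the first field rather than on the whole line avoids false hits when 11383 shows up elsewhere, for example inside a longer PID. A sketch with canned ps output (sample_ps is a hypothetical stand-in):

```shell
# Hypothetical stand-in for ps; the second row has a PID that merely contains 11383.
sample_ps() {
  printf '%s\n' '   PID TTY          TIME CMD' \
                ' 11383 pts/1    00:00:00 bash' \
                '113830 pts/1    00:00:00 ps'
}

# Compare field 1 numerically; leading blanks from right alignment do not matter.
cmd=$(sample_ps | awk '$1 == 11383 { print $4 }')
echo "$cmd"
```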
Using array variables
set $(ps | egrep "^11383 "); echo $4
or
A=( $(ps | egrep "^11383 ") ) ; echo ${A[3]}
Similar to brianegge's awk solution, here is the Perl equivalent:
ps | egrep 11383 | perl -lane 'print $F[3]'
-a enables autosplit mode, which populates the @F array with the column data.
Use -F, (that is, -F followed by a comma) if your data is comma-delimited rather than space-delimited.
$F[3] is the fourth field, since Perl starts counting from 0 rather than 1.
Getting the correct line (example for line no. 6) is done with head and tail and the correct word (word no. 4) can be captured with awk:
command|head -n 6|tail -n 1|awk '{print $4}'
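awk can also select the line itself through its NR record counter, which replaces the head and tail steps. A sketch on generated input; the row contents are invented:

```shell
# Generate eight numbered rows of four columns each.
gen_rows() {
  i=1
  while [ "$i" -le 8 ]; do
    echo "row$i a b col4_$i"
    i=$((i + 1))
  done
}

# Print the 4th word of the 6th line.
word=$(gen_rows | awk 'NR == 6 { print $4 }')
echo "$word"
```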
Instead of doing all these greps and stuff, I'd advise you to use ps capabilities of changing output format.
ps -o cmd= -p 12345
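You get the command line of the process with the specified pid and nothing else.
Note that cmd is a procps (Linux) keyword; the format names specified by POSIX are comm and args, so ps -o args= -p 12345 is the form that may be considered portable.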
Bash's set will parse all output into positional parameters.
For instance, with set $(free -h) command, echo $7 will show "Mem:"

Resources