Parsing each line and column in bash - linux

I'm writing a bash script to get all process data. I'm using the following command
ps -eaf -o %cpu,%mem,acflag,acflg,args,blocked,comm,command,cpu,cputime,etime,f,flags,gid,group,inblk,inblock,jobc,ktrace,ktracep,lim,login,logname,lstart,majflt,minflt,msgrcv,msgsnd,ni,nice,nivcsw,nsignals,nsigs,nswap,nvcsw,nwchan,oublk,oublock,p_ru,paddr,pagein,pcpu,pending,pgid,pid,pmem,ppid,pri,pstime,putime,re,rgid,rgroup,rss,ruid,ruser,sess,sig,sigmask,sl,start,stat,state,stime,svgid,svuid,tdev,time,tpgid,tsess,tsiz,tt,tty,ucomm,uid,upr,user,usrpri,utime,vsize,vsz,wchan,wq,wqb,wql,wqr,xstat
What I'm trying to do is parse each line and column from this output, and I'm kind of lost on where to begin. Here is pseudo-code for what I want to do:
processes = ps -aef -o ...
for i in processes
processes[i].ppid # do some stuff with this column
processes[i].pid # do some stuff with this column
processes[i].stime # do some stuff with this column
What is the best way to easily work with the output of this ps command?

Pipe the output to a loop that reads each column.
ps -aef -o ... | while read cpu mem acflag acflg args ...
do
echo "$cpu"
echo "$pid"
...
done
However, this is not going to work well with fields like args, since they have embedded whitespace and read uses whitespace as its delimiter.
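One possible way around the whitespace problem, assuming you only need a few columns, is to ask ps for the free-form field last: read assigns everything left on the line to its final variable. This is only a sketch; the function name parse_ps_lines and the choice of columns are made up for illustration.

```bash
#!/bin/bash
# Read "pid ppid args" records from stdin. Because args is the last
# variable, read assigns it the whole remainder of the line, spaces included.
parse_ps_lines() {
  local pid ppid args
  while read -r pid ppid args; do
    printf 'pid=%s ppid=%s args=%s\n' "$pid" "$ppid" "$args"
  done
}

# Typical use (a trailing "=" suppresses that column's header):
#   ps -e -o pid= -o ppid= -o args= | parse_ps_lines
```

This still handles only one field with embedded whitespace per line, so it does not scale to the full column list in the question.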

Related

Bash script- number of processes sorted

I'm trying to improve my bash on my road to becoming a DevOps engineer.
One of my exercises states that I should be able to:
1-Write a bash script using Vim editor that checks all the processes running for the current user
2-Extend the previous script to ask for a user input for sorting the processes output either by memory or CPU consumption, and print the sorted list.
3-Extend the previous script to ask additionally for user input about how many processes to print. Hint: use head program to limit the number of outputs.
The error message is a syntax error and the script doesn't seem to work. Any ideas or tips?
#!/bin/bash
read -p "Press 1- to sort by memory OR 2 to sort by CPU consumption" sorting
read -p "how much output should be displayed, choose a number between 1-9 ?" output
if[$sorting = 1];
ps auck-%mem | head -n $output | grep kami
else
ps auck-%cpu | head -n $output | grep kami
fi
Cheers
Kami
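For what it's worth, the syntax error most likely comes from the if line: [ is a command, so it needs spaces around it and around ], and the condition must be followed by then. Below is a hedged sketch of how the script could be restructured. The function names sort_key_for and list_top are invented here, and the --sort option assumes the procps-ng ps found on Linux.

```bash
#!/bin/bash
# Map the menu choice to a ps sort key; anything other than 1 sorts by CPU.
sort_key_for() {
  if [ "$1" = 1 ]; then   # spaces around [ and ], then "then"
    echo '-%mem'
  else
    echo '-%cpu'
  fi
}

# List the current user's processes sorted by the chosen key, limited to
# $2 rows plus the header line. Assumes procps-ng ps (supports --sort).
list_top() {
  ps -u "$(id -un)" -o pid,%cpu,%mem,comm --sort="$(sort_key_for "$1")" |
    head -n "$(( $2 + 1 ))"
}

# Typical use:
#   read -r -p "Press 1 to sort by memory or 2 to sort by CPU: " sorting
#   read -r -p "How many processes should be displayed? " output
#   list_top "$sorting" "$output"
```

Selecting the user with ps -u also avoids the original problem of running grep after head, which would filter out rows after the limit had already been applied.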

Store result of "ps -ax" for later iterating through it

When I do
ps -ax|grep myApp
I get the one line with PID and stuff of my app.
Now, I'd like to process the whole result of ps -ax (without grep, so, the full output):
Either store it in a variable and grep from it later
Or go through the results in a for loop, e.g. like that:
for a in $(ps -ax)
do
echo $a
done
Unfortunately, this splits on every space, not on newlines as |grep does.
Any ideas, how I can accomplish one or the other (grep from variable or for loop)?
Important: No bash please, only POSIX, so #!/bin/sh
Thanks in advance
As stated above, a while loop can be helpful here.
One more useful thing is the --no-headers argument, which makes ps skip the header.
Or, even better, specify the exact columns you need to process, like ps -o pid,command --no-headers ax
The overall code would look like:
processes=`ps --no-headers -o pid,command ax`
echo "$processes" | while read pid command; do
echo "we have process with pid $pid and command line $command"
done
The only downside to this approach is that the commands inside the while loop are executed in a subshell, so if you need to pass a variable back to the parent shell you'll have to use some form of inter-process communication.
I usually dump the results into a temp file created before the while loop and read them after the loop has finished.
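As an alternative to the temp file, here is a sketch that keeps the loop in the current shell by feeding it from a here-document instead of a pipe. It sticks to POSIX sh features; the function name count_records and the variable count are hypothetical.

```bash
#!/bin/sh
# Count the non-empty "pid command" records held in $1. The loop is fed by
# a here-document, so it runs in the current shell and $count survives it.
count_records() {
  count=0
  while read -r pid command; do
    if [ -n "$pid" ]; then
      count=$((count + 1))
    fi
  done <<EOF
$1
EOF
  echo "$count"
}

# Typical use:
#   processes=$(ps -e -o pid= -o args=)
#   count_records "$processes"
```

Because no pipe is involved, count is still set after the loop ends, which is exactly what the piped version cannot do.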
I found a solution by removing the spaces while executing the command:
result=$(ps -aux|sed 's/ /_/g')
You can also make it more filter friendly by removing duplicated spaces:
result=$(ps -aux| tr -s ' '|sed 's/ /_/g')

How to pipe all the output of "ps" into a shell script for further processing?

When I run this command:
ps aux|awk {'print $1,$2,$3,$11'}
I get a listing of the user, PID, CPU% and the actual command.
I want to pipe all those listings into a shell script to calculate the CPU% and if greater than, say 5, then to kill the process via the PID.
I tried piping it to a simple shell script, i.e.
ps aux|awk {'print $1,$2,$3,$11'} | ./myscript
where the content of my script is:
#!/bin/bash
# testing using positional parameters
echo "$1 $2 $3 $4"
But I get a blank output. Any idea how to do this?
Many thanks!
If you use awk, you don't need an additional bash script. Also, it is a good idea to reduce the output of the ps command so you don't have to deal with extra information:
ps acxho user,pid,%cpu,cmd | awk '$3 > 5 {system("echo kill " $2)}'
Explanation
The extra ps flags I use:
c: command only, no extra arguments
h: no header, good for scripting
o: output format. In this case, only output the user, PID, %CPU, and command
The awk command compares the %CPU, which is the third column, with a threshold (5). If it is over the threshold, it issues the system command to kill that process.
Note the echo in the command. Once you are certain the script works the way you like, remove the word echo from the command to execute it for real.
Your script needs to read its input:
#!/bin/bash
while read -r a b c d; do
echo "$a $b"
done
I think you can get it using xargs command to pass the AWK output to your script as arguments:
ps aux|awk {'print $1,$2,$3,$11'} | xargs ./myscript
Some extra info about xargs: http://en.wikipedia.org/wiki/Xargs
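One caveat: plain xargs packs as many words as it can into a single invocation, so the script would see every field of every process in one argument list. A possible refinement, sketched below, is -n 4, which starts one invocation per four words. The inline sh -c command is only a stand-in for ./myscript, and print_fields_per_record is a made-up name.

```bash
#!/bin/bash
# Run one command per record of four whitespace-separated words.
# The inline sh -c stands in for ./myscript, which echoes "$1 $2 $3 $4".
print_fields_per_record() {
  xargs -n 4 sh -c 'echo "$1 $2 $3 $4"' _
}

# Typical use:
#   ps aux | awk '{print $1,$2,$3,$11}' | print_fields_per_record
```

This relies on every record having exactly four words, which holds here because awk prints exactly four fields per line.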
When piping input from one process to another in Linux (or POSIX-compliant systems) the output is not given as arguments to the receiving process. Instead, the standard output of the first process is piped into the standard input of the other process.
Because of this, your script cannot work. $1...$n accesses variables that have been passed as arguments to it. As there are none it won't display anything. Instead, you have to read the standard input into variables with the read command (as pointed out by William).
The pipe '|' redirects the standard output of the left to the standard input of the right. In this case, the output of the ps goes to the input of awk, then the output of awk goes to the stdin of the script.
Therefore your script needs to read its STDIN.
#!/bin/bash
read var1 var2 var3 ...
Then you can do whatever you want with those variables.
More info, type in bash: help read
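Putting that together, a stdin-reading script for the CPU check might look roughly like the sketch below. It is a dry run that prints the kill command rather than executing it; flag_busy and the default threshold of 5 are assumptions made for illustration.

```bash
#!/bin/bash
# Read "user pid cpu command" records from stdin and report those whose
# CPU percentage exceeds the threshold given as $1 (default 5).
# Dry run: the kill command is printed, not executed.
flag_busy() {
  local threshold=${1:-5} user pid cpu cmd
  while read -r user pid cpu cmd; do
    # bash arithmetic is integer-only, so awk does the decimal comparison
    if awk -v c="$cpu" -v t="$threshold" \
         'BEGIN { exit !(c + 0 > t + 0) }' </dev/null; then
      printf 'kill %s  # %s at %s%% CPU, owner %s\n' "$pid" "$cmd" "$cpu" "$user"
    fi
  done
}

# Typical use:
#   ps -e -o user= -o pid= -o pcpu= -o comm= | flag_busy 5
```

Starting awk once per line is slow for long process lists; doing the whole comparison in a single awk program, as in the awk-only answer, scales better.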
If I understood your problem correctly, you want to kill every process that exceeds X% of the CPU (using ps aux).
Here is a solution using AWK:
ps aux | grep -v "%CPU" | awk '{if ($3 > XXX) { print "Killing process with PID "$2", called "$11", consuming "$3"% and launched by "$1; system( "kill -9 " $2 );}}' -
Where XXX is your threshold (% of CPU).
It also prints information about each killed process; if that is not desired, just remove the print statement.
You can add filters, for example to skip processes owned by root.
Try putting myscript in front like this:
./myscript `ps aux|awk {'print $1,$2,$3,$11'}`

Grep filtering output from a process after it has already started?

Normally when one wants to look at specific output lines from running something, one can do something like:
./a.out | grep IHaveThisString
but what if IHaveThisString is something which changes every time so you need to first run it, watch the output to catch what IHaveThisString is on that particular run, and then grep it out? I can just dump to file and later grep but is it possible to do something like background it and then bring it to foreground and bringing it back but now piped to some grep? Something akin to:
./a.out
Ctrl-Z
fg | grep NowIKnowThisString
just wondering..
No, it is only in your screen buffer if you didn't save it in some other way.
Short form: You can do this, but you need to know that you need to do it ahead-of-time; it's not something that can be put into place interactively after-the-fact.
Write your script to determine what the string is. We'd need a more detailed example of the output format to give a better example of usage, but here's one for the trivial case where the entire first line is the filter target:
run_my_command | { read -r string_to_filter_for; grep -F -e "$string_to_filter_for"; }
Replace the read string_to_filter_for with as many commands as necessary to read enough input to determine what the target string is; this could be a loop if necessary.
For instance, let's say that the output contains the following:
Session id: foobar
and thereafter, you want to grep for lines containing foobar.
...then you can pipe through the following script:
re='Session id: (.*)'
while read -r; do
if [[ $REPLY =~ $re ]] ; then
target=${BASH_REMATCH[1]}
break
else
# if you want to print the preamble; leave this out otherwise
printf '%s\n' "$REPLY"
fi
done
[[ $target ]] && grep -F -e "$target"
If you want to manually specify the filter target, this can be done by having the loop check for a file being created with filter contents, and using that when starting up grep afterwards.
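A rough sketch of that file-based variant follows. The function name filter_once_target_known and the idea of passing the file path as the first argument are assumptions; the loop passes lines through untouched until the file has content, then hands the rest of the stream to grep.

```bash
#!/bin/bash
# Pass stdin through unchanged until the file named in $1 is non-empty,
# then filter everything from that point on by the file's contents.
filter_once_target_known() {
  local target_file=$1 line
  while IFS= read -r line; do
    if [[ -s $target_file ]]; then
      # Filter the line already read, then let grep consume the rest of stdin.
      { printf '%s\n' "$line"; cat; } | grep -F -e "$(<"$target_file")"
      return
    fi
    printf '%s\n' "$line"   # target not chosen yet: pass everything through
  done
}

# Typical use (from another terminal: echo NowIKnowThisString > /tmp/target):
#   ./a.out | filter_once_target_known /tmp/target
```

The file is only checked when a new line arrives, so the switch to filtering happens on the first line printed after the file is written.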
What you need is a little unusual, but you can do it this way:
first start a script session;
then use the shell as usual;
then start and interrupt your program;
then run grep over the typescript file.
Example:
$ script
$ ./a.out
Ctrl-Z
$ fg
$ grep NowIKnowThisString typescript
You could use a stream editor such as sed instead of grep. Here's an example of what I mean:
$ cat list
Name to look for: Mike
Dora 1
John 2
Mike 3
Helen 4
Here we find the name to look for in the first line and want to grep for it. Now piping the command to sed:
$ cat list | sed -ne '1{s/Name to look for: //;h}' \
> -e ':r;n;G;/^.*\(.\+\).*\n\1$/P;s/\n.*//;br'
Mike 3
Note: sed itself can take a file as a parameter, but you're not working with text files, so that's how you'd use it.
Of course, you'd need to modify the command for your case.

cannot understand grep in cron job solution

I was looking at the solution of "Run cron job only if it isn't already running" in order to apply it to a similar problem that I have, but I cannot understand this part:
ps -u $USER -f | grep "[ ]$(cat ${PIDFILE})[ ]"
It appears to be saying check the end of each line from ps for ' PIDnumber ', but when I look at my ps output the PIDnumber is in column two. I am interpreting the first $ as the regular expression end-of-line anchor.
$(stuff) will execute "stuff" (in this case cat ${PIDFILE}) and substitute its output, so the $ here is not a regular expression anchor.
PIDFILE is assumed to be a path to a file, so the whole line is basically looking for any line in the ps output that contains the contents of the "pid file" ([ ] adds a space on each side of the pid, so that if the pid file contains '888' it won't match '8888' in the ps output)
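To see the effect of the surrounding [ ] in isolation, here is a small sketch that runs the same pattern against sample lines instead of live ps output. The function name pid_listed is hypothetical.

```bash
#!/bin/bash
# Succeed if some line on stdin contains the PID given as $1 as a whole,
# space-delimited field. "[ ]" is a bracket expression matching one space.
pid_listed() {
  grep -q "[ ]$1[ ]"
}

# Typical use:
#   ps -u "$USER" -f | pid_listed "$(cat "$PIDFILE")"
```

The pattern is not anchored to either end of the line, which is why it matches the PID in column two of the ps output.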

Resources