Inconsistency between perl grep and cli grep - linux

I am doing the following:
#!/usr/bin/perl
use strict;
use warnings;
my $proc = `ps -ef|grep -c myscriptname`;
print $proc;
This prints 2 when I run it inside the script.
ps -ef|grep -c myscriptname on the command line just shows: 1
Why?
The same happens with my $proc = qx/ps -ef|grep -c myscriptname/
UPDATE
To be clear I run this snippet from somerandomscript.pl
Update 2
Following the advice of fedorqui, I removed -c and got:
12013 15777 15776 0 14:11 pts/6 00:00:00 sh -c ps -ef | grep myscriptname
12013 15779 15777 0 14:11 pts/6 00:00:00 grep myscriptname
Argument "12013 15777 15776 0 14:11 pts/6 00:00:00 sh -c ps..." isn't numeric in numeric gt (>) at somerandomscript.pl line 8
from inside the script

The ps -ef command is showing the grep itself.
To skip this behaviour, try grepping for a regex condition that does not match the grep itself:
ps -ef | grep myscript[n]ame
or whatever similar can make it:
ps -ef | grep myscriptnam[e]
Explanation
If you run a sleep command in the background:
$ sleep 100 &
[1] 9768
and then look for it with ps -ef:
$ ps -ef | grep sleep
me 9768 3673 0 14:00 pts/6 00:00:00 sleep 100
me 9771 3673 0 14:00 pts/6 00:00:00 grep --color=auto sleep
You get two lines: the process itself and the grep command looking for it. To avoid this and show just the process itself, we can either filter out grep:
$ ps -ef | grep -v grep | grep sleep
or use a regex in the code so that the grep process is not matched:
$ ps -ef | grep slee[p]
me 9768 3673 0 14:00 pts/6 00:00:00 sleep 100
because line
me 9771 3673 0 14:00 pts/6 00:00:00 grep --color=auto sleep
does not match in grep slee[p].
See explanation in a related topic.
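As a rough illustration of why the bracket trick works, the sketch below feeds grep two canned lines instead of real ps output. The PIDs and the helper name count_matches are made up for the example; the only point is that a "grep slee[p]" entry does not contain the literal text "sleep".

```shell
# Canned stand-in for one line of `ps -ef` output (values are illustrative).
proc_line='me 9768 3673 0 14:00 pts/6 00:00:00 sleep 100'

# count_matches PATTERN (hypothetical helper): count matching lines when the
# process table also contains a "grep PATTERN" entry, as in a real pipeline.
count_matches() {
    grep_line="me 9771 3673 0 14:00 pts/6 00:00:00 grep $1"
    printf '%s\n%s\n' "$proc_line" "$grep_line" | grep -c -- "$1"
}

plain=$(count_matches 'sleep')      # the grep entry contains "sleep" too
bracket=$(count_matches 'slee[p]')  # the grep entry contains "slee[p]", not "sleep"
echo "plain=$plain bracket=$bracket"
```

With these two canned lines the plain pattern counts 2 and the bracketed pattern counts 1.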

I suppose your perl script is named "myscriptname".
When you run this script, you have a new process (perl myscriptname.pl), and it is shown by the ps -ef command. The other one is the grep command itself (its command line contains the text you are looking for).

@fedorqui's answer is right on: the grep is matching its own invocation in the process table, and perhaps that of its parent shell, too, though timing issues mean this does not always happen from the CLI.
However, another approach, avoiding grep in favor of perl, would be:
my $count = () = qx(ps -e -o cmd) =~ /myscriptname/mg;
# Now $count tells you the number of times myscriptname appears in the process table
See this answer for why the empty parens are used above. Note, too, that you don't need the full ps output (-f), you just want to match on the command name (-o cmd).

Take a look at the pgrep and the pkill commands. These are standard Linux commands and are way easier to use than trying to do a ps and then a grep.
Also, if you do use ps, take a look at the -o options. These let you display the columns you want, and give you a way to strip out the heading.
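A minimal sketch of the pgrep approach, assuming the procps version of pgrep is installed. The background sleep is only a throwaway target for the example, and the variable names are illustrative.

```shell
# Start a throwaway background process to look for (illustrative only).
sleep 30 &
started_pid=$!

# Give the child a moment to exec, so its process name is already "sleep".
sleep 1

# -P "$$" limits the search to children of this shell; -x requires the
# process name to be exactly "sleep". pgrep never reports itself.
found_pid=$(pgrep -P "$$" -x sleep)
echo "started=$started_pid found=$found_pid"

# Clean up the background process.
kill "$started_pid"
wait "$started_pid" 2>/dev/null || true
```

No grep -v grep or bracket trick is needed here, because pgrep matches against the process table directly rather than against text that includes its own command line.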

Related

grep for process id to get the correct value

I am using the grep command to get a specific process id, but sometimes I get two process ids and my output is not correct.
ps -ef |grep AS_Cluster.js
root 2711 2624 0 07:15 pts/0 00:00:00 grep AS_Cluster.js
root 14630 14625 0 Sep13 ? 00:32:36 node xx/x/xx/x/xx/AS_Cluster.js
I want to get the pid of only the node xx/xxx/xx/AS_Cluster.js process. Any help on this?
Preferably use pgrep(1) (probably as pgrep -f AS_Cluster.js) or pipe the output of ps to some awk command (see gawk(1)) or script.
Try following
ps -ef | grep AS_Cluster.js | grep -v grep
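The awk route mentioned above might look like the sketch below. It runs on a canned copy of the ps -ef output from the question rather than on a live process table, and it assumes the command name sits in column 8, as in that output.

```shell
# Canned stand-in for `ps -ef` output, copied from the question.
ps_output='root 2711 2624 0 07:15 pts/0 00:00:00 grep AS_Cluster.js
root 14630 14625 0 Sep13 ? 00:32:36 node xx/x/xx/x/xx/AS_Cluster.js'

# Keep only rows whose command (column 8) is node and whose line mentions
# the script, then print the second column (the PID).
pid=$(printf '%s\n' "$ps_output" | awk '$8 == "node" && /AS_Cluster\.js/ { print $2 }')
echo "$pid"
```

Selecting on the command column means neither the grep row nor awk's own row can match, so no extra grep -v grep step is required.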

ps command in linux

I'm a newbie to Unix-like systems, and I ran into a weird issue that I cannot find an answer to by searching.
#!/bin/bash
me=`basename "$0"`
echo $(ps -e | grep "$me" | wc -l)
ps -e | grep "$me" | wc -l
After executing that bash script, the echo shows me 2, and the plain ps pipeline shows me 1, which is what I want. How can this happen? Why does echo show an extra process?
As Charles Duffy pointed out, $() creates a subshell. That answers my question. Apparently I still have a lot to learn. Thanks for all the help.
As noted by a comment by Cyrus; this script:
me=$(basename $0)
ps -ef |grep $me
when launched with "./ps.sh", prints:
auser@pc:/tmp$ ./ps.sh
auser 4425 4422 0 08:42 pts/3 00:00:00 grep ps.sh
auser@pc:/tmp$
No subshells are involved here, it is the grep(1) itself that is listed by ps(1). The same script, launched with "bash ps.sh" outputs:
auser 4426 3946 0 08:44 pts/3 00:00:00 bash ps.sh
auser 4429 4426 0 08:44 pts/3 00:00:00 grep ps.sh
This is the result the OP gets, even without subshells. Even more explicit:
auser@pc:/tmp$ ps -ef |grep grep
auser 4467 3946 0 08:49 pts/3 00:00:00 grep grep
Although you are creating a subshell by using $(), you can filter the grep out by using grep -v grep.
So:
$(ps -e | grep "$me" | grep -v grep | wc -l)
which will return 1 instead of 2
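The subshell that $() creates can be observed directly. This is a minimal sketch assuming bash, since BASHPID is bash-specific and not available in plain sh; the variable names are illustrative.

```shell
#!/bin/bash
# $$ keeps the PID of the main shell even inside $(...), while BASHPID
# (bash-specific) reports the PID of the process actually running the code.
main_pid=$$
sub_pid=$(echo "$BASHPID")

echo "main shell:     $main_pid"
echo "inside \$(...): $sub_pid"
```

The two PIDs differ because the command substitution runs in a forked copy of the script's shell. That copy carries the same process name as the script, which is why ps can list the script one extra time when it is called from inside $().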

Get a process ID

I have read a lot of questions on this topic, but I can't solve my issue.
I need to get a specific process ID and I wrote the following test.sh script:
#!/bin/sh
PID=$(ps -ef | grep abc | grep -v grep | awk '{print $2}')
echo $PID
Running this script I get two different PIDs if the abc process is not running and three different PIDs if the abc process is running.
If I run the
ps -ef | grep abc | grep -v grep | awk '{print $2}'
command from shell I get the right result.
Modifying the test.sh script to remove the last awk, I noticed that the script prints the following output:
user1 22153 129551 0 15:56 pts/3 00:00:00 /bin/sh ./test.sh
user1 22155 22153 0 15:56 pts/3 00:00:00 /bin/sh ./test.sh
How is it possible and how can I ignore them?
If you know exactly what the process is called, use pidof; otherwise, you can just use pgrep, which saves you the grep|grep|awk chain.
Note that, when you ps|grep regex or pgrep regex there could be multiple entries in your result.
Do not use these tools; use the tool meant for this, pidof, with the -s flag, which the man page describes as:
-s Single shot - this instructs the program to only return one
pid.
Using the above,
processID=$(pidof -s "abc")
I am not a big fan of parsing the process table. It could be inaccurate, for the same reason as "why not parse ls". You may want to look at the command pgrep.
My suggestion is doing
pgrep -u user1 abc

bash process substitution can't work fine with tee

The real thing I want to do is like ps -ef|head -n1 && ps -ef|grep httpd. The output should be something like this.
UID PID PPID C STIME TTY TIME CMD
xxxxx 6888 6886 0 16:49 pts/1 00:00:00 grep --color=auto httpd
root 10992 1 0 13:56 ? 00:00:00 sudo ./myhttpd
root 10993 10992 0 13:56 ? 00:00:00 ./myhttpd
root 11107 10993 0 13:56 ? 00:00:00 ./myhttpd
root 12142 10993 0 14:00 ? 00:00:00 ./myhttpd
root 31871 10993 0 15:03 ? 00:00:00 ./myhttpd
But I hate duplicates. So, I want ps -ef to appear only once.
Considering bash process substitution, I tried ps -ef | tee > >(head -n1) >(grep httpd), but the only output is
UID PID PPID C STIME TTY TIME CMD
However, ps -ef | tee > >(head -n1) >(head -n2) can work fine in the following way
UID PID PPID C STIME TTY TIME CMD
root 1 0 0 13:36 ? 00:00:00 /sbin/init
UID PID PPID C STIME TTY TIME CMD
Can anyone help me ?
You can do head and grep on the same stream.
ps -ef | (head -n 1; grep '[h]ttpd')
It might be marginally more efficient to refactor to use sed:
ps -ef | sed -n -e '1p' -e '/[h]ttpd/p'
... but not all sed dialects deal amicably with multiple -e options. Perhaps this is more portable:
ps -ef | sed '1b;/[h]ttpd/b;d'
Also note the old trick to refactor the regex so as not to match itself by using a character class.
This can be achieved simply with pgrep and ps.
ps -fp $(pgrep -d, -o -f httpd)
use AWK
ps -ef | awk 'NR==1 || /httpd/'
This prints the first line, or any line that contains "httpd".
or use sed
ps -ef | sed -n '1p;/httpd/p'
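The awk one-liner can be combined with the character-class trick from the earlier answer so that the filter does not list itself. The sketch below runs on canned text adapted from the question, not on a live process table; the last row imitates how the awk command itself would appear in ps -ef.

```shell
# Canned stand-in for `ps -ef` output (rows adapted from the question).
ps_output='UID PID PPID C STIME TTY TIME CMD
root 1 0 0 13:36 ? 00:00:00 /sbin/init
root 10993 10992 0 13:56 ? 00:00:00 ./myhttpd
xxxxx 6888 6886 0 16:49 pts/1 00:00:00 awk NR==1 || /[h]ttpd/'

# NR==1 keeps the header; /[h]ttpd/ keeps httpd rows but not the filter's
# own command line, which contains "[h]ttpd" rather than "httpd".
filtered=$(printf '%s\n' "$ps_output" | awk 'NR==1 || /[h]ttpd/')
printf '%s\n' "$filtered"
```

On this sample only the header and the ./myhttpd row survive, and ps -ef would need to run just once.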
ps -f -C httpd --noheaders | head -n1

More elegant "ps aux | grep -v grep"

When I check list of processes and 'grep' out those that are interesting for me, the grep itself is also included in the results. For example, to list terminals:
$ ps aux | grep terminal
user 2064 0.0 0.6 181452 26460 ? Sl Feb13 5:41 gnome-terminal --working-directory=..
user 2979 0.0 0.0 4192 796 pts/3 S+ 11:07 0:00 grep --color=auto terminal
Normally I use ps aux | grep something | grep -v grep to get rid of the last entry... but it is not elegant :)
Do you have a more elegant hack to solve this issue (apart from wrapping the whole command in a separate script, which is also not bad)?
The usual technique is this:
ps aux | egrep '[t]erminal'
This will match lines containing terminal, which egrep '[t]erminal' does not! It also works on many flavours of Unix.
Use pgrep. It's more reliable.
This answer builds upon a prior pgrep answer. It also builds upon another answer combining the use of ps with pgrep. Here are some pertinent training examples:
$ pgrep -lf sshd
1902 sshd
$ pgrep -f sshd
1902
$ ps up $(pgrep -f sshd)
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1902 0.0 0.1 82560 3580 ? Ss Oct20 0:00 /usr/sbin/sshd -D
$ ps up $(pgrep -f sshddd)
error: list of process IDs must follow p
[stderr output truncated]
$ ps up $(pgrep -f sshddd) 2>&-
[no output]
The above can be used as a function:
$ psgrep() { ps up $(pgrep -f "$@") 2>&-; }
$ psgrep sshd
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1902 0.0 0.1 82560 3580 ? Ss Oct20 0:00 /usr/sbin/sshd -D
Compare with using ps with grep. The useful header row is not printed:
$ ps aux | grep [s]shd
root 1902 0.0 0.1 82560 3580 ? Ss Oct20 0:00 /usr/sbin/sshd -D
You can filter in the ps command, e.g.
ps u -C gnome-terminal
(or search through /proc with find etc.)
One more alternative:
ps -fC terminal
Here the options:
-f does full-format listing. This option can be combined
with many other UNIX-style options to add additional
columns. It also causes the command arguments to be
printed. When used with -L, the NLWP (number of
threads) and LWP (thread ID) columns will be added. See
the c option, the format keyword args, and the format
keyword comm.
-C cmdlist Select by command name.
This selects the processes whose executable name is
given in cmdlist.
Disclaimer: I'm the author of this tool, but...
I'd use px:
~ $ px atom
PID COMMAND USERNAME CPU RAM COMMANDLINE
14321 crashpad_handler walles 0.01s 0% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Electron Framework.framework/Resources/crashpad_handler --database=
16575 crashpad_handler walles 0.01s 0% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Electron Framework.framework/Resources/crashpad_handler --database=
16573 Atom Helper walles 0.5s 0% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Atom Helper.app/Contents/MacOS/Atom Helper --type=gpu-process --cha
16569 Atom walles 2.84s 1% /Users/walles/Downloads/Atom.app/Contents/MacOS/Atom --executed-from=/Users/walles/src/goworkspace/src/github.com/github
16591 Atom Helper walles 7.96s 2% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Atom Helper.app/Contents/MacOS/Atom Helper --type=renderer --no-san
Besides finding processes through a sensible command line interface, it also does a lot of other useful things; more details are on the project page.
Works on Linux and OS X, easily installed:
curl -Ls https://github.com/walles/px/raw/python/install.sh | bash
Using brackets to surround a character in the search pattern excludes the grep process since it doesn't contain the matching regex.
$ ps ax | grep 'syslogd'
16 ?? Ss 0:09.43 /usr/sbin/syslogd
18108 s001 S+ 0:00.00 grep syslogd
$ ps ax | grep '[s]yslogd'
16 ?? Ss 0:09.43 /usr/sbin/syslogd
$ ps ax | grep '[s]yslogd|grep'
16 ?? Ss 0:09.43 /usr/sbin/syslogd
18144 s001 S+ 0:00.00 grep [s]yslogd|grep
Depending on the ultimate use case, you often want to prefer Awk instead.
ps aux | awk '/[t]erminal/'
This is particularly true when you have something like
ps aux | grep '[t]erminal' | awk '{print $1}' # useless use of grep!
where obviously the regex can be factored into the Awk script trivially:
ps aux | awk '/[t]erminal/ { print $1 }'
But really, don't reinvent this yourself. pgrep and friends have been around for a long time and handle this entire problem space much better than most ad hoc reimplementations.
Another option is to edit your .bash_profile (or other file that you keep bash aliases in) to create a function that greps 'grep' out of the results.
function mygrep {
grep -v grep | grep --color=auto $1
}
alias grep='mygrep'
The grep -v grep has to be first otherwise your --color=auto won't work for some reason.
This works if you're using bash; if you're using a different shell YMMV.
