Bash: Inline Execution returns Duplicate "Process". Why? - linux

bash: 4.3.42(1)-release (x86_64-pc-linux-gnu)
Executing the following script:
# This is myscript.sh
line=$(ps aux | grep [m]yscript) # A => returns two duplicates processes (why?)
echo "'$line'"
ps aux | grep [m]yscript # B => returns only one
Output:
'tom 31836 0.0 0.0 17656 3132 pts/25 S+ 10:33 0:00 bash myscript.sh
tom 31837 0.0 0.0 17660 1736 pts/25 S+ 10:33 0:00 bash myscript.sh'
tom 31836 0.0 0.0 17660 3428 pts/25 S+ 10:33 0:00 bash myscript.sh
Why does the inline executed ps-snippet (A) return two lines?

Summary
This creates a subshell and hence two processes are running:
line=$(ps aux | grep [m]yscript)
This does not create a subshell. So, myscript.sh has only one process running:
ps aux | grep [m]yscript
Demonstration
Let's modify the script slightly so that the process and subprocess PIDs are saved in the variable line:
$ cat myscript.sh
# This is myscript.sh
line=$(ps aux | grep [m]yscript; echo $$ $BASHPID)
echo "'$line'"
ps aux | grep [m]yscript
In a bash script, $$ is the PID of the script and is unchanged in subshells. By contrast, when a subshell is entered, bash updates $BASHPID with the PID of the subshell.
Here is the output:
$ bash myscript.sh
'john1024 30226 0.0 0.0 13280 2884 pts/22 S+ 18:50 0:00 bash myscript.sh
john1024 30227 0.0 0.0 13284 1824 pts/22 S+ 18:50 0:00 bash myscript.sh
30226 30227'
john1024 30226 0.0 0.0 13284 3196 pts/22 S+ 18:50 0:00 bash myscript.sh
In this case, 30226 is the PID on the main script and 30227 is the PID of the subshell running ps aux | grep [m]yscript.

a command substitution ($(...))
each segment of a pipeline[1]
cause Bash to create a subshell (a child process created by forking the current shell process), but then Bash optimizes away subshells if they result in a single call to an external utility.
(What I think is happening in the optimization scenario is that a subshell is actually created but then instantly replaced by the external utility's process, via something like exec. Do let me know if you know for sure.)
Applied to your example:
line=$(ps aux | grep [m]yscript) creates 3 child processes:
1 subshell - the fork of your script you see as an additional match returned by grep.
2 child processes (1 for each pipeline segment) - ps and grep; they take the place of the optimized-away subshells; their parent process is the 1 remaining subshell created by the command substitution.
ps aux | grep [m]yscript creates 2 child processes (1 for each pipeline segment):
ps and grep; they take the place of the optimized-away subshells; their parent process is the current shell.
For an overview of the scenarios in which a subshell is created in Bash, see this answer of mine, which, however, doesn't cover the optimizing-away scenarios.
[1] In Bash v4.2+ you can set option lastpipe (off by default) in order to make the last pipeline segment run in the current shell instead of a subshell; aside from a slight efficiency gain, this allows you to declare variables in the last segment that the current shell can see after the pipeline exits.

Related

Kill 'ps aux' output in one line

I have to do the same thing many times a day:
ps aux
look for process running ssh to one of my servers ...
kill -9 <pid>
I'm looking to see if I can alias process into one line. The output of ps aux is usually something like this:
user 6871 0.0 0.0 4351260 8 ?? Ss 3:28PM 0:05.95 ssh -Nf -L 18881:my-server:18881
user 3018 0.0 0.0 4334292 52 ?? S 12:08PM 0:00.15 /usr/bin/ssh-agent -l
user 9687 0.0 0.0 4294392 928 s002 S+ 10:48AM 0:00.00 grep ssh
I always want to kill the process with the my-server in it.
Does anyone know how I could accomplish this?
for pid in $(ps aux | grep "[m]y-server" | awk '{print $2}'); do kill -9 $pid; done
ps aux | grep "[m]y-server" | awk '{print $2}' - this part gives you list of pids processes that include "my-server". And this script will go through this list and kill all this processes.
I use pgrep -- if you're sure you want to kill all the processes that match:
$ kill -9 `pgrep my-server`
A simple solution is to use pkill:
sudo pkill my-server

command in terminal and in script produce different results [duplicate]

I have a bash script (ScreamDaemon.sh) inside which a check that example of it isn't running already is added.
numscr=`ps aux | grep ScreamDaemon.sh | wc -l`;
if [ "${numscr}" -gt "2" ]; then
echo "an instance of ScreamDaemon still running";
exit 0;
fi
Normally, if there are no another copy of script running, ps aux | grep ScreamDaemon.sh | wc -l should return 2 (it should find itself and grep ScreamDaemon.sh), but it returns 3.
So, I try to analyse what happens and after adding some echoes see this:
there are lines I have added into the script
ps aux | grep ScreamDaemon.sh
ps aux | grep ScreamDaemon.sh | wc -l
str=`ps aux | grep ScreamDaemon.sh`
echo $str
numscr=`ps aux | grep ScreamDaemon.sh | wc -l`;
echo $numscr
there is an output:
pamela 27894 0.0 0.0 106100 1216 pts/1 S+ 13:41 0:00 /bin/bash ./ScreamDaemon.sh
pamela 27899 0.0 0.0 103252 844 pts/1 S+ 13:41 0:00 grep ScreamDaemon.sh
2
pamela 27894 0.0 0.0 106100 1216 pts/1 S+ 13:41 0:00 /bin/bash ./ScreamDaemon.sh pamela 27903 0.0 0.0 106100 524 pts/1 S+ 13:41 0:00 /bin/bash ./ScreamDaemon.sh pamela 27905 0.0 0.0 103252 848 pts/1 S+ 13:41 0:00 grep ScreamDaemon.sh
3
I also tried to add the sleep command right inside `ps aux | grep ScreamDaemon.sh; sleep 1m` and see from the parallel terminal how many instances ps aux|grep ScreamDaemon.sh shows:
[pamela#pm03 ~]$ ps aux | grep ScreamDaemon.sh
pamela 28394 0.0 0.0 106100 1216 pts/1 S+ 14:23 0:00 /bin/bash ./ScreamDaemon.sh
pamela 28403 0.0 0.0 106100 592 pts/1 S+ 14:23 0:00 /bin/bash ./ScreamDaemon.sh
pamela 28408 0.0 0.0 103252 848 pts/9 S+ 14:23 0:00 grep ScreamDaemon.sh
So, it seems that
str=`ps aux | grep ScreamDaemon.sh`
contrary to
ps aux | grep ScreamDaemon.sh
found two instances of ScreamDaemon.sh, but why? Where this additional copy of ScreamDaemon.sh come from?
This is an output of pstree -ap command
│ ├─sshd,27806
│ │ └─sshd,27808
│ │ └─bash,27809
│ │ └─ScreamDaemon.sh,28731 ./ScreamDaemon.sh
│ │ └─ScreamDaemon.sh,28740 ./ScreamDaemon.sh
│ │ └─sleep,28743 2m
Why can a single bash script show up multiple times in ps?
This is typical when any constructs which implicitly create a subshell are in play. For instance, in bash:
echo foo | bar
...creates a new forked copy of the shell to run the echo, with its own ps instance. Similarly:
( bar; echo done )
...creates a new subshell, has that subshell run the external command bar, and then has the subshell perform the echo.
Similarly:
foo=$(bar)
...creates a subshell for the command substitution, runs bar in there (potentially exec'ing the command and consuming the subshell, but this is not guaranteed), and reads its output into the parent.
Now, how does this answer your question? Because
result=$(ps aux | grep | wc)
...runs that ps command in a subshell, which itself creates an extra bash instance.
How can I properly ensure that only one copy of my script is running?
Use a lockfile.
See for instance:
How to prevent a script from running simultaneously?
What is the best way to ensure only one instance of a Bash script is running?
Note that I strongly suggest use of a flock-based variant.
Of course, the reason you find an additional process is because:
One process is running the sub-shell (of the command execution `..`)
included in your line: numscr=`ps aux | grep ScreamDaemon.sh | wc -l`;
that's the simplest answer.
However I would like to make some additional suggestions about your code:
First, quote your expansions, it should be: echo "$str".
Not doing so is making several lines collapse into a long one.
Second, you may use: grep [S]creamDaemon.sh to avoid matching the grep command itself.
Third, capture the command just once in a variable, then count lines from the variable. In this case it presents no problem, but for dynamic processes, one capture and the following capture to count could give different results.
Fourth, make an habit of using $(...) command substitutions instead of the more error prone (especially when Nesting) `...`.
### Using a file as the simplest way to capture the output of a command
### that is running in this shell (not a subshell).
ps aux | grep "[S]creamDaemon.sh" > "/tmp/tmpfile$$.txt"
str="$(< "/tmp/tmpfile$$.txt")" ### get the value of var "str"
rm "/tmp/tmpfile$$.txt" ### erase the file used ($$ is pid).
numscr="$(echo "$str" | wc -l)" ### count the number of lines.
echo "$numscr" ### present results.
echo "$str"
str="$( ps aux | grep "[S]creamDaemon.sh" )" ### capture var "str".
numscr="$(echo "$str" | wc -l)" ### count the number of lines.
echo "$numscr" ### present results.
echo "$str"
### The only bashim is the `$(<...)`, change to `$(cat ...)` if needed.
#CharlesDuffy covered the point of a flock quite well, please read it.

When launching script in background I get two processes running

I came across a weird scenario that is stumping me.
I have a script that I launch in the background with an &
example:
root## some_script.sh &
After running it, I do a ps -ef | grep some_script and I see TWO processes running where the 2nd process keeps getting an different PID but it's Parent is the process that I started (like the parent process is spawning children that die off - but this was never written in the code).
example:
root## ps -ef | grep some_script.sh
root 4696 17882 0 13:30 pts/2 00:00:00 /bin/bash ./some_script.sh
root 4778 4696 0 13:30 pts/2 00:00:00 /bin/bash ./some_script.sh
root## ps -ef | grep some_script.sh
root 4696 17882 0 13:30 pts/2 00:00:00 /bin/bash ./some_script.sh
root 4989 4696 0 13:30 pts/2 00:00:00 /bin/bash ./some_script.sh
What gives here? It seems to be messing up the output and functionality of the script too and basically makes it a never ending process (when I have a defined start and stop in the script).
the script:
`
#! /bin/bash
# Set Global Variables
LOGDIR="/srv/script_logs"
OUTDIR="/srv/audits"
BUCKET_LS=$OUTDIR"/LSOUT_"$i"_"$(date +%d%b%Y)".TXT"
MYCMD1="aws s3api list-objects --bucket viddler-flvs"
MYCMD2="--starting-token"
MAX_ITEMS="--max-items 10000"
MYSTARTING_TOKEN='""'
rm tokenlog.txt flv_out.txt
while [[ $MYSTARTING_TOKEN != "null" ]]
do
# First - Get the token for the next batch
CMD_PRE="$MYCMD1 $MAX_ITEMS $MYCMD2 $MYSTARTING_TOKEN"
MYSTARTING_TOKEN=($($CMD_PRE | jq -r .NextToken))
echo $MYSTARTING_TOKEN >> tokenlog.txt
# Now - get the values of the files for the existing batch
# First - re-run the batch and get the file values we want
MYOUT2=$($CMD_PRE | (jq ".Contents[] | {Key, Size, LastModified,StorageClass }"))
echo $MYOUT2 | sed 's/[{},"]//g;s/ /\n/g;s/StorageClass://g;s/LastModified://g;s/Size://g;s/Key://g;s/^ *//g;s/ *$//g' >> flv_out.txt
#echo $STARTING_TOKEN
done
`
I guess you have
(
some shell instructions
)
inside of your .sh
This syntax executes commands in the new process (but command line would be the same).

About shell and subshell

I'm new to shell,I just learned that use (command) will create a new subshell and exec the command, so I try to print the pid of father shell and subshell:
#!/bin/bash
echo $$
echo "`echo $$`"
sleep 4
var=$(echo $$;sleep 4)
echo $var
But the answer is:
$./test.sh
9098
9098
9098
My questions are:
Why just three echo prints? There are 4 echos in my code.
Why three pids are the same? subshell's pid is obviously not same with his father's.
Thanks a lot for answers :)
First, the assignment captures standard output of the child and puts it into var, rather than printing it:
var=$(echo $$;sleep 4)
This can be seen with:
$ xyzzy=$(echo hello)
$ echo $xyzzy
hello
Secondly, all those $$ variables are evaluated in the current shell which means they're turned into the current PID before any children start. The children see the PID that has already been generated. In other words, the children are executing echo 9098 rather than echo $$.
If you want the PID of the child, you have to prevent translation in the parent, such as by using single quotes:
bash -c 'echo $$'
echo "one.sh $$"
echo `eval echo '$$'`
I am expecting the above to print different pids, but it doesn't. It's creating a child process. Verified by adding sleep in ``.
echo "one.sh $$"
echo `eval "echo '$$'";sleep 10`
On executing the above from a script and running ps shows two processs one.sh(name of the script) and sleep.
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
test 12685 0.0 0.0 8720 1012 pts/15 S+ 13:50 0:00 \_ bash one.sh
test 12686 0.0 0.0 8720 604 pts/15 S+ 13:50 0:00 \_ bash one.sh
test 12687 0.0 0.0 3804 452 pts/15 S+ 13:50 0:00 \_ sleep 10
This is the output produced
one.sh 12685
12685
Not sure what i am missing.
The solution is $!. As in:
#!/bin/bash
echo "parent" $$
yes > /dev/null &
echo "child" $!
Output:
$ ./prueba.sh
parent 30207
child 30209

More elegant "ps aux | grep -v grep"

When I check list of processes and 'grep' out those that are interesting for me, the grep itself is also included in the results. For example, to list terminals:
$ ps aux | grep terminal
user 2064 0.0 0.6 181452 26460 ? Sl Feb13 5:41 gnome-terminal --working-directory=..
user 2979 0.0 0.0 4192 796 pts/3 S+ 11:07 0:00 grep --color=auto terminal
Normally I use ps aux | grep something | grep -v grep to get rid of the last entry... but it is not elegant :)
Do you have a more elegant hack to solve this issue (apart of wrapping all the command into a separate script, which is also not bad)
The usual technique is this:
ps aux | egrep '[t]erminal'
This will match lines containing terminal, which egrep '[t]erminal' does not! It also works on many flavours of Unix.
Use pgrep. It's more reliable.
This answer builds upon a prior pgrep answer. It also builds upon another answer combining the use of ps with pgrep. Here are some pertinent training examples:
$ pgrep -lf sshd
1902 sshd
$ pgrep -f sshd
1902
$ ps up $(pgrep -f sshd)
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1902 0.0 0.1 82560 3580 ? Ss Oct20 0:00 /usr/sbin/sshd -D
$ ps up $(pgrep -f sshddd)
error: list of process IDs must follow p
[stderr output truncated]
$ ps up $(pgrep -f sshddd) 2>&-
[no output]
The above can be used as a function:
$ psgrep() { ps up $(pgrep -f $#) 2>&-; }
$ psgrep sshd
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1902 0.0 0.1 82560 3580 ? Ss Oct20 0:00 /usr/sbin/sshd -D
Compare with using ps with grep. The useful header row is not printed:
$ ps aux | grep [s]shd
root 1902 0.0 0.1 82560 3580 ? Ss Oct20 0:00 /usr/sbin/sshd -D
You can filter in the ps command, e.g.
ps u -C gnome-terminal
(or search through /proc with find etc.)
One more alternative:
ps -fC terminal
Here the options:
-f does full-format listing. This option can be combined
with many other UNIX-style options to add additional
columns. It also causes the command arguments to be
printed. When used with -L, the NLWP (number of
threads) and LWP (thread ID) columns will be added. See
the c option, the format keyword args, and the format
keyword comm.
-C cmdlist Select by command name.
This selects the processes whose executable name is
given in cmdlist.
Disclaimer: I'm the author of this tool, but...
I'd use px:
~ $ px atom
PID COMMAND USERNAME CPU RAM COMMANDLINE
14321 crashpad_handler walles 0.01s 0% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Electron Framework.framework/Resources/crashpad_handler --database=
16575 crashpad_handler walles 0.01s 0% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Electron Framework.framework/Resources/crashpad_handler --database=
16573 Atom Helper walles 0.5s 0% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Atom Helper.app/Contents/MacOS/Atom Helper --type=gpu-process --cha
16569 Atom walles 2.84s 1% /Users/walles/Downloads/Atom.app/Contents/MacOS/Atom --executed-from=/Users/walles/src/goworkspace/src/github.com/github
16591 Atom Helper walles 7.96s 2% /Users/walles/Downloads/Atom.app/Contents/Frameworks/Atom Helper.app/Contents/MacOS/Atom Helper --type=renderer --no-san
Except for finding processes with a sensible command line interface it also does a lot of other useful things, more details on the project page.
Works on Linux and OS X, easily installed:
curl -Ls https://github.com/walles/px/raw/python/install.sh | bash
Using brackets to surround a character in the search pattern excludes the grep process since it doesn't contain the matching regex.
$ ps ax | grep 'syslogd'
16 ?? Ss 0:09.43 /usr/sbin/syslogd
18108 s001 S+ 0:00.00 grep syslogd
$ ps ax | grep '[s]yslogd'
16 ?? Ss 0:09.43 /usr/sbin/syslogd
$ ps ax | grep '[s]yslogd|grep'
16 ?? Ss 0:09.43 /usr/sbin/syslogd
18144 s001 S+ 0:00.00 grep [s]yslogd|grep
Depending on the ultimate use case, you often want to prefer Awk instead.
ps aux | awk '/[t]erminal/'
This is particularly true when you have something like
ps aux | grep '[t]erminal' | awk '{print $1}' # useless use of grep!
where obviously the regex can be factored into the Awk script trivially:
ps aux | awk '/[t]erminal/ { print $1 }'
But really, don't reinvent this yourself. pgrep and friends have been around for a long time and handle this entire problem space much better than most ad hoc reimplementations.
Another option is to edit your .bash_profile (or other file that you keep bash aliases in) to create a function that greps 'grep' out of the results.
function mygrep {
grep -v grep | grep --color=auto $1
}
alias grep='mygrep'
The grep -v grep has to be first otherwise your --color=auto won't work for some reason.
This works if you're using bash; if you're using a different shell YMMV.

Resources