Bash 'swallowing' the sub-shell child process when executing a single command - linux

Bumped into an unexpected bash/sh behavior and I wonder if someone can explain the rationale behind it and provide a solution to the question below.
In an interactive bash shell session, I execute:
$ bash -c 'sleep 10 && echo'
With ps on Linux it looks like this:
\_ -bash
\_ bash -c sleep 10 && echo
\_ sleep 10
The process tree is what I would expect:
My interactive bash shell process ($)
A child shell process (bash -c ...)
A sleep child process
However, if the command portion of my bash -c is a single command, e.g.:
$ bash -c 'sleep 10'
Then the middle sub-shell is swallowed, and my interactive terminal session executes sleep "directly" as a child process.
The process tree looks like this:
\_ -bash
\_ sleep 10
So from process tree perspective, these two produce the same result:
$ bash -c 'sleep 10'
$ sleep 10
What is going on here?
Now to my question: is there a way to force the intermediate shell, regardless of the complexity of the expression passed to bash -c ...?
(I could append something like ; echo; to my actual command and that "works", but I'd rather not. Is there a more proper way to force the intermediate process into existence?)

There's actually a comment in the bash source that describes much of the rationale for this feature:
/* If this is a simple command, tell execute_disk_command that it
   might be able to get away without forking and simply exec.
   This means things like ( sleep 10 ) will only cause one fork.
   If we're timing the command or inverting its return value, however,
   we cannot do this optimization. */
if ((user_subshell || user_coproc) && (tcom->type == cm_simple || tcom->type == cm_subshell) &&
    ((tcom->flags & CMD_TIME_PIPELINE) == 0) &&
    ((tcom->flags & CMD_INVERT_RETURN) == 0))
  {
    tcom->flags |= CMD_NO_FORK;
    if (tcom->type == cm_simple)
      tcom->value.Simple->flags |= CMD_NO_FORK;
  }
In the bash -c '...' case, the CMD_NO_FORK flag is set when the should_suppress_fork function in builtins/evalstring.c determines the optimization is safe.
It is always to your benefit to let the shell do this. It only happens when:
Input is from a hardcoded string, and the shell is at the last command in that string.
There are no further commands, traps, hooks, etc. to be run after the command is complete.
The exit status does not need to be inverted or otherwise modified.
No redirections need to be backed out.
This saves memory, makes process startup slightly faster (one fork is skipped), and ensures that signals delivered to your PID go directly to the process you're running, making it possible for the parent of sh -c 'sleep 10' to determine exactly which signal killed sleep, should it in fact be killed by a signal.
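For instance, a minimal sketch of that last point (assuming Linux conventions, where an exit status of 128 + signal number reports death by signal):
bash -c 'sleep 100' &
pid=$!
kill -TERM "$pid"   # goes straight to sleep, since bash exec'd it without forking
wait "$pid"
echo "status: $?"   # expect 143, i.e. 128 + SIGTERM(15)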
However, if for some reason you want to inhibit it, you need only set a trap -- any trap will do:
# run the noop command (:) at exit
bash -c 'trap : EXIT; sleep 10'
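With the trap set, bash can no longer optimize the fork away, so ps should show the three-level tree again, something like:
\_ -bash
    \_ bash -c trap : EXIT; sleep 10
        \_ sleep 10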

Related

How to terminate 2 background processes running in bash script?

I want these 2 background processes to end, for example, after pressing CTRL + C. The only thing I found that might help does not work correctly:
python3 script1.py &
python3 script2.py &
trap "trap - SIGTERM && kill -- -$$" SIGINT SIGTERM EXIT
Or is there another solution to run 2 scripts in parallel? Not necessarily in the background.
I would use something in the line of
function killSubproc(){
    kill $(jobs -p -r)
}
./test.py one 10 &
./test.py two 5 &
trap killSubproc INT
wait
It is not limited to 2 subprocesses, as you can see.
The idea is simply to kill all still-running subprocesses when you hit Ctrl+C.
jobs -p -r gives the process IDs of all running subprocesses (-r limits the list to running subprocesses, since some of your scripts may have terminated naturally; -p gives process IDs, not job numbers).
And wait, without any argument, waits for all subprocesses to end.
That way, your main script runs in the foreground while the subtasks run, so it still receives the Ctrl+C.
Yet it terminates if all your subprocesses terminate naturally, and also if you hit Ctrl+C.
Note: test.py is just a test script I've used; it runs for $2 seconds (10 and 5 here) and displays the string $1 each second (one and two here).
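For reference, a hypothetical bash stand-in with the same behavior (the name test.sh and its interface are assumptions, not from the original post):
#!/bin/bash
# Usage: ./test.sh <label> <seconds> -- hypothetical stand-in for test.py
for ((i = 0; i < $2; i++)); do
    echo "$1"   # print the label once per second
    sleep 1
done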
kill -- -$$ uses $$, the process ID of the shell script, as a group ID. These aren't the same thing.
You need to get the group ID of your shell script and use that. This should work.
ps -o pgid= -p $$
Here's a complete example:
#!/bin/bash -e
trap "kill -s INT -- -$(ps -o pgid= -p $$); wait" EXIT
python3 -c 'import time, signal, sys
def handler(a, b):
    print("sigint!")
    sys.exit(0)
signal.signal(signal.SIGINT, handler)
for i in range(3):
    time.sleep(1)
    print("hi from python")
' &
sleep 1
With the trap in place, you should see sigint!. With the trap commented out, your python will continue to run, as you've experienced yourself.
In my example I also added a wait to the trap code to make sure that when the shell script ends, all of the process group also ends.
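If you want the group ID in a variable first, here's a small hedged sketch (the tr strips any column padding ps may emit):
pgid=$(ps -o pgid= -p $$ | tr -d ' ')
echo "pid=$$ pgid=$pgid"
# kill -s INT -- "-$pgid"   # signal the whole group when needed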

Parallel run and wait for processes from subshell

Hi all! I'm trying to make something like a parallel tool for the shell, simply because the functionality of parallel is not enough for my task. The reason is that I need to run different versions of a compiler.
Imagine that I need to compile 12 programs with different compilers, but I can run only 4 of them simultaneously (otherwise the PC runs out of memory and crashes :). I also want to be able to observe what's going on with each compile, therefore I execute every compile in a new window.
Just to make it easier, here I'll replace the compiler with a small script, sleep.sh, that waits and then prints its process ID:
#!/bin/bash
sleep 30
echo $$
So the main script, parallel_run.sh, should look like this:
#!/bin/bash
for i in {0..11}; do
    xfce4-terminal -H -e "./sleep.sh" &
    pids[$i]=$!
    pstree -p $pids
    if (( $i % 4 == 0 ))
    then
        for pid in ${pids[*]}; do
            wait $pid
        done
    fi
done
The problem is that with $! I get the PID of xfce4-terminal, not of the program it executes. So if I look at the pstree of the 1st iteration, I can see this output from the main script:
xfce4-terminal(31666)-+-{xfce4-terminal}(31668)
                      `-{xfce4-terminal}(31669)
and sleep.sh says that it had pid = 30876 at that time. Thus wait doesn't work at all in this case.
Q: How to get right PID of compiler that runs in subshell?
Maybe there is another way to solve a task like this?
It seems there is no way to trace the PID from parent to child if you invoke the process in a new xfce4-terminal, as the terminal process dies right after it has executed the given command. So I came to a solution which is not perfect, but acceptable in my situation: I run the compiler processes in the background and redirect their output to .log files. Then I run tail on those log files, kill all tails belonging to the current $USER when the compilers from the current batch are done, and then run the next batch.
#!/bin/bash
for i in {1..8}; do
    ./sleep.sh > ./process_$i.log &
    prcid=$!
    xfce4-terminal -e "tail -f ./process_$i.log" &
    pids[$i]=$prcid
    if (( $i % 4 == 0 ))
    then
        for pid in ${pids[*]}; do
            wait $pid
        done
        killall -u $USER tail
    fi
done
Hopefully there will be no other tails running at that time :)

Subprocesses started by script using eval are not interrupted on ctrl+c

I'm aware that similar questions have been asked before on SO (for example here) but I can't get it to work for my case.
I have a bash script called kubetail which evaluates a set of background commands started from the script like this:
CMD="cat <( eval "${command_to_tail}" )"
eval "$CMD"
where command_to_tail calls several subprocesses (kubectl) and aggregates their output into one stream. The problem is that when ctrl+c is pressed during the eval, the subprocesses are not interrupted when the main script stops. For example, this is what I see when I run ps -Af | grep kubectl after interrupting the script (the kubectl processes are spawned by my script):
$ ps -Af | grep kubectl
501 85748 85742 0 9:48AM ttys014 0:00.16 kubectl --context= logs pod-4074277481-3tlx6 core -f --since=10s --namespace=
501 85750 85742 0 9:48AM ttys014 0:00.17 kubectl --context= logs pod-4074277481-9r224 core -f --since=10s --namespace=
501 85752 85742 0 9:48AM ttys014 0:00.16 kubectl --context= logs pod-4074277481-hh9bz core -f --since=10s --namespace=
I've tried various forms of trap - INT but I fail to find a solution that kills all subprocesses on ctrl+c. Any suggestions?
The problem you are having is due to all of the subprocesses running independently under their own process IDs. When you press ctrl + c you are attempting to cancel your original script running under its original PID; the subprocesses are unaffected. A simple analogous example is:
#!/bin/bash
declare -i cnt=1
while [ "$cnt" -lt 100 ]; do
    printf "iteration: %2d\n" "$cnt"
    sleep 5
    ((cnt++))
done
When you run the script and then try to cancel it, control is very likely inside the sleep command, and ctrl + c has no effect. If you want control over canceling the script at any point, then each subprocess needs some way to learn that the parent received a SIGINT. In some cases (as with sleep above), that is not directly possible from within the script, since pressing ctrl + c will not affect the current sleep process (which changes on each iteration).
There is no magic-bullet for solving this problem in all cases because what is required will largely depend on what type of control you have over what can be done in the subprocesses and whether your parent script iterates at all that would make a per-iteration check (like for the existence of a temp file) a workable solution.
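That said, one commonly used pattern (a sketch, not a universal fix; it assumes the children actually respond to SIGINT) is to forward the signal to every remaining child from a trap:
#!/bin/bash
trap 'kill -INT $(jobs -p) 2>/dev/null; exit 130' INT   # forward Ctrl+C to all children
sleep 100 &   # stand-ins for the kubectl subprocesses
sleep 100 &
wait          # keep the parent alive so the trap can fire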

Why does shell command "{ command1; command2; } &" open a subshell?

As we all know, placing a list of commands between curly braces causes the list to be executed in the current shell context; no subshell is created. But when using "&" after "{}", why are two subshells created (PIDs 1002 and 1003)?
{
./a.out
} &
sleep 19
When using "./a.out &", only one subshell is created (PID 17358).
./a.out &
sleep 19
Why?
Background execution of a list uses a subshell because something needs to wait for each member of that list and run the next one. After a list is backgrounded, the parent shell needs to be available for new commands; it can't manage the backgrounded list too. bash can't do more than one thing at a time. So, to make the backgrounded list work, it runs a subshell.
Note that you can disown a backgrounded list and it will keep running, showing that the subshell is doing its work:
$ {
> sleep 1; sleep 2; sleep 3; sleep 4; sleep 5
> } &
$ disown
$ ps -f | grep sleep
dave 31845 31842 0 03:50 pts/1 00:00:00 sleep 3
dave 31849 31771 0 03:50 pts/1 00:00:00 grep sleep
You could even log out and the subshell would continue running processes in the list.
When you background a single command, there is no need for a subshell because there is no more work for the shell to do after it has run the command.
In your example, the second additional bash subprocess, PID 1002, appears to be a script which you're executing. That's unrelated (conceptually, at least) to the list-backgrounding mechanism; any script in a separate file has its own bash process.
If a command is terminated by the control operator &, the shell executes the command (or list of commands that are enclosed in {...}) in the background (or asynchronously) in a subshell.
The shell does not wait for the command to finish, and the return status is 0.
In a C program, this is done with a fork() followed by an execvp() system call.
Update: Based on the comments below and the updated question, here is what is happening.
When you run:
./a.out &
BASH just runs a.out directly in the background, since running the binary a.out doesn't need a separate shell process.
When you run:
{ ./a.out; } &
BASH must first fork and create a subshell, since there can be a series of commands inside {...}; the newly forked subshell then runs a.out as a separate process. So it is not that BASH creates 2 subshells here: only one subshell gets created, and the 2nd PID you're seeing is a.out's.
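A quick hedged way to see this yourself (the exact ps output format varies by system; ps f is the Linux procps forest view):
{ sleep 30; sleep 30; } &
ps f -o pid,ppid,cmd   # expect: bash -> bash (the subshell) -> sleep
sleep 30 &
ps f -o pid,ppid,cmd   # expect: bash -> sleep, no intermediate shell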

Don't show the output of kill command in a Linux bash script [duplicate]

How can you suppress the Terminated message that comes up after you kill a process in a bash script?
I tried set +bm, but that doesn't work.
I know another solution involves calling exec 2> /dev/null, but is that reliable? How do I reset it back so that I can continue to see stderr?
In order to silence the message, you must be redirecting stderr at the time the message is generated. Because the kill command sends a signal and doesn't wait for the target process to respond, redirecting stderr of the kill command does you no good. The bash builtin wait was made specifically for this purpose.
Here is a very simple example that kills the most recent background command. (Learn more about $! here.)
kill $!
wait $! 2>/dev/null
Because both kill and wait accept multiple PIDs, you can also do batch kills. Here is an example that kills all background processes (of the current process/script, of course):
kill $(jobs -rp)
wait $(jobs -rp) 2>/dev/null
I was led here from bash: silently kill background function process.
The short answer is that you can't. Bash always prints the status of foreground jobs. The monitoring flag only applies to background jobs, and only in interactive shells, not scripts.
see notify_of_job_status() in jobs.c.
As you say, you can redirect so that standard error points to /dev/null, but then you miss any other error messages. You can make it temporary by doing the redirection in a subshell which runs the script. This leaves the original environment alone.
(script 2> /dev/null)
which will lose all error messages, but just from that script, not from anything else run in that shell.
You can save and restore standard error, by redirecting a new filedescriptor to point there:
exec 3>&2 # 3 is now a copy of 2
exec 2> /dev/null # 2 now points to /dev/null
script # run script with redirected stderr
exec 2>&3 # restore stderr to saved
exec 3>&- # close saved version
But I wouldn't recommend this -- the only upside over the first approach is that it saves a subshell invocation, while being more complicated and possibly even altering the behavior of the script, if the script alters file descriptors.
EDIT:
For a more appropriate answer, check the answer given by Mark Edgar:
Solution: use SIGINT (works only in non-interactive shells)
Demo:
cat > silent.sh <<"EOF"
sleep 100 &
kill -INT $!
sleep 1
EOF
sh silent.sh
http://thread.gmane.org/gmane.comp.shells.bash.bugs/15798
Maybe detach the process from the current shell process by calling disown?
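A minimal sketch of that idea (assuming $! still holds the PID after the disown):
sleep 100 &
disown $!   # remove the job from the shell's job table
kill $!     # the shell no longer tracks the job, so no "Terminated" message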
The Terminated message is logged by the default signal handler of bash 3.x and 4.x. Just trap the TERM signal at the very start of the child process:
#!/bin/sh
## assume script name is test.sh
foo() {
    trap 'exit 0' TERM ## here is the key
    while true; do sleep 1; done
}
echo before child
ps aux | grep 'test\.s[h]\|slee[p]'
foo &
pid=$!
sleep 1 # wait trap is done
echo before kill
ps aux | grep 'test\.s[h]\|slee[p]'
kill $pid ## no need to redirect stdin/stderr
sleep 1 # wait kill is done
echo after kill
ps aux | grep 'test\.s[h]\|slee[p]'
Is this what we are all looking for?
Not wanted:
$ sleep 3 &
[1] 234
<pressing enter a few times....>
$
$
[1]+ Done sleep 3
$
Wanted:
$ (set +m; sleep 3 &)
<again, pressing enter several times....>
$
$
$
$
$
As you can see, no job-end message. It works for me in bash scripts as well, and also for killed background processes.
'set +m' disables job control (see 'help set') for the current shell. So if you enter your command in a subshell (as done here in parentheses) you will not influence the job control settings of the current shell. The only disadvantage is that you need to get the PID of your background process back to the current shell if you want to check whether it has terminated, or to evaluate its return code.
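A hedged sketch of getting that PID back (/tmp/bgpid is a made-up path, used only for illustration):
(set +m; sleep 30 & echo "$!" > /tmp/bgpid)   # subshell with job-control messages off
pid=$(</tmp/bgpid)
kill "$pid" 2>/dev/null   # no message: the job never belonged to this shell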
This also works for killall (for those who prefer it):
killall -s SIGINT (yourprogram)
suppresses the message. I was running mpg123 in background mode; it could only be killed silently by sending ctrl-c (SIGINT) instead of SIGTERM (the default).
disown did exactly the right thing for me -- the exec 3>&2 approach is risky for a lot of reasons -- and set +bm didn't seem to work inside a script, only at the command prompt.
I had success with adding 'jobs 2>&1 >/dev/null' to the script. I'm not certain it will help anyone else's script, but here is a sample:
while true; do echo $RANDOM; done | while read line
do
    echo Random is $line the last jobid is $(jobs -lp)
    jobs 2>&1 >/dev/null
    sleep 3
done
Another way to disable job notifications is to place your command to be backgrounded in a sh -c 'cmd &' construct.
#!/bin/bash
# ...
pid="`sh -c 'sleep 30 & echo ${!}' | head -1`"
kill "$pid"
# ...
# or put several cmds in sh -c '...' construct
sh -c '
sleep 30 &
pid="${!}"
sleep 5
kill "${pid}"
'
I found that putting the kill command in a function and then backgrounding the function call suppresses the termination output:
function killCmd() {
    kill $1
}
killCmd $somePID &
Simple:
{ kill $!; } 2>/dev/null
Advantage? You can use any signal.
Example:
{ kill -9 $PID; } 2>/dev/null
