Context:
I have a bash script that contains a subshell and a trap for the EXIT pseudosignal, and it's not properly trapping interrupts during an rsync. Here's an example:
#!/bin/bash
logfile=/path/to/file;
directory1=/path/to/dir
directory2=/path/to/dir
cleanup () {
echo "Cleaning up!"
#do stuff
trap - EXIT
}
trap '{
(cleanup;) | 2>&1 tee -a $logfile
}' EXIT
(
#main script logic, including the following lines:
(exec sleep 10;);
(exec rsync --progress -av --delete $directory1 /var/tmp/$directory2;);
) | 2>&1 tee -a $logfile
trap - EXIT #just in case cleanup isn't called for some reason
The idea of the script is this: most of the important logic runs in a subshell which is piped through tee and to a logfile, so I don't have to tee every single line of the main logic to get it all logged. Whenever the subshell ends, or the script is stopped for any reason (the EXIT pseudosignal should capture all of these cases), the trap will intercept it and run the cleanup() function, and then remove the trap. The rsync and sleep commands (the sleep is just an example) are run through exec to prevent the creation of zombie processes if I kill the parent script while they're running, and each potentially-long-running command is wrapped in its own subshell so that when exec finishes, it won't terminate the whole script.
The problem:
If I interrupt the script (via kill or CTRL+C) during the exec/subshell wrapped sleep command, the trap works properly, and I see "Cleaning up!" echoed and logged. If I interrupt the script during the rsync command, I see rsync end, and write rsync error: received SIGINT, SIGTERM, or SIGHUP (code 20) at rsync.c(544) [sender=3.0.6] to the screen, and then the script just dies; no cleanup, no trapping. Why doesn't an interrupting/killing of rsync trigger the trap?
I've tried using the --no-detach switch with rsync, but it didn't change anything.
I have bash 4.1.2, rsync 3.0.6, centOS 6.2.
How about just having all the output from point X be redirected to tee without having to repeat it everywhere and mess with all the sub-shells and execs ... (hope I didn't miss something)
#!/bin/bash
logfile=/path/to/file;
directory1=/path/to/dir
directory2=/path/to/dir
exec > >(exec tee -a $logfile) 2>&1
cleanup () {
echo "Cleaning up!"
#do stuff
trap - EXIT
}
trap cleanup EXIT
sleep 10
rsync --progress -av --delete $directory1 /var/tmp/$directory2
In addition to set -e, I think you want set -E:
If set, any trap on ERR is inherited by shell functions, command substitutions, and commands executed in a sub‐shell environment. The ERR trap is normally not inherited in such cases.
Alternatively, instead of wrapping your commands in subshells use curly braces which will still give you the ability to redirect command outputs but will execute them in the current shell.
The interupt will be properly caught if you add INT to the trap
trap '{
(cleanup;) | 2>&1 tee -a $logfile
}' EXIT INT
Bash is trapping interrupts correctly. However, this does not anwer the question, why the script traps on exit if sleep is interupted, nor why it does not trigger on rsync, but makes the script work as it is supposed to. Hope this helps.
Your shell might be configured to exit on error:
bash # enter subshell
set -e
trap "echo woah" EXIT
sleep 4
If you interrupt sleep (^C) then the subshell will exit due to set -e and print woah in the process.
Also, slightly unrelated: your trap - EXIT is in a subshell (explicitly), so it won't have an effect after the cleanup function returns
It's pretty clear from experimentation that rsync behaves like other tools such as ping and do not inherit signals from the calling Bash parent.
So you have to get a little creative with this and do something like the following:
$ cat rsync.bash
#!/bin/sh
set -m
trap '' SIGINT SIGTERM EXIT
rsync -avz LargeTestFile.500M root#host.mydom.com:/tmp/. &
wait
echo FIN
Now when I run it:
$ ./rsync.bash
X11 forwarding request failed
building file list ... done
LargeTestFile.500M
^C^C^C^C^C^C^C^C^C^C
sent 509984 bytes received 42 bytes 92732.00 bytes/sec
total size is 524288000 speedup is 1027.96
FIN
And we can see the file did fully transfer:
$ ll -h | grep Large
-rw-------. 1 501 games 500M Jul 9 21:44 LargeTestFile.500M
How it works
The trick here is we're telling Bash via set -m to disable job controls on any background jobs within it. We're then backgrounding the rsync and then running a wait command which will wait on the last run command, rsync, until it's complete.
We then guard the entire script with the trap '' SIGINT SIGTERM EXIT.
References
https://access.redhat.com/solutions/360713
https://access.redhat.com/solutions/1539283
Related
I found this article, that explains how to redirect output of a bash script to syslog. This is exactly what I needed, but there is a problem.
#!/bin/bash
# Don't ignore any error and return when first error occurs.
exec 1> >(logger -s -t $(basename $0)) 2>&1
set -e
# a list of command(s) that can fail:
chown -R user1:user1 /tmp/myappData/*
chown -R user1:user1 /tmp/myappTmp/*
chown -R user1:user1 /tmp/myappLog/*
#...
exit 0
When I execute above script, and an error occurs, I see that sometimes, the prompt doesn't return after the script is executed. I can't figure out why this is happening. The prompt doesn't return unless I hit enter.
I am concerned that if an app uses this script, it may not get proper exit code back.
If I comment out "set -e", then the prompt always returns properly after the script has executed.
So my question is, what is the proper way to setup a script so that it exits on error, and logs the corresponding message to syslog?
Thank you for your help and suggestions!
The problem here is that the logger pipeline is still running after your script exits, so some of the last content to be logged print after the parent shell has emitted its prompt. If you scroll up, you'll find the prompt hidden somewhere in that prior output.
If you have a very, very new bash, you can collect the PID of the process substitution, and wait for it later.
exec {orig_out}>&1 {orig_err}>&2 1> >(logger -s -t "${0##*/}") 2>&1; logger_pid=$!
[[ $logger_pid ]] || { echo "ERROR: Needs a newer bash" >&2; exit 1; }
cleanup() {
exec >&$orig_out 2>&$orig_err
wait "$logger_pid"
}
trap cleanup EXIT
With an older bash, you can consider other tricks. For example, on Linux, you can use the flock command to try to grab exclusive access to a lockfile before exiting, after ensuring that that lock is held for as long as the logger is running:
log_lock=$(mktemp "${TMPDIR:-/tmp}/logger.XXXXXX")
exec >(flock -x "$log_lock" logger -s -t "${0##*/}") 2>&1
cleanup() {
exec >/dev/tty 2>&1 || exec >/dev/null 2>&1
flock -x "$log_lock" true
}
trap cleanup EXIT
How can you suppress the Terminated message that comes up after you kill a
process in a bash script?
I tried set +bm, but that doesn't work.
I know another solution involves calling exec 2> /dev/null, but is that
reliable? How do I reset it back so that I can continue to see stderr?
In order to silence the message, you must be redirecting stderr at the time the message is generated. Because the kill command sends a signal and doesn't wait for the target process to respond, redirecting stderr of the kill command does you no good. The bash builtin wait was made specifically for this purpose.
Here is very simple example that kills the most recent background command. (Learn more about $! here.)
kill $!
wait $! 2>/dev/null
Because both kill and wait accept multiple pids, you can also do batch kills. Here is an example that kills all background processes (of the current process/script of course).
kill $(jobs -rp)
wait $(jobs -rp) 2>/dev/null
I was led here from bash: silently kill background function process.
The short answer is that you can't. Bash always prints the status of foreground jobs. The monitoring flag only applies for background jobs, and only for interactive shells, not scripts.
see notify_of_job_status() in jobs.c.
As you say, you can redirect so standard error is pointing to /dev/null but then you miss any other error messages. You can make it temporary by doing the redirection in a subshell which runs the script. This leaves the original environment alone.
(script 2> /dev/null)
which will lose all error messages, but just from that script, not from anything else run in that shell.
You can save and restore standard error, by redirecting a new filedescriptor to point there:
exec 3>&2 # 3 is now a copy of 2
exec 2> /dev/null # 2 now points to /dev/null
script # run script with redirected stderr
exec 2>&3 # restore stderr to saved
exec 3>&- # close saved version
But I wouldn't recommend this -- the only upside from the first one is that it saves a sub-shell invocation, while being more complicated and, possibly even altering the behavior of the script, if the script alters file descriptors.
EDIT:
For more appropriate answer check answer given by Mark Edgar
Solution: use SIGINT (works only in non-interactive shells)
Demo:
cat > silent.sh <<"EOF"
sleep 100 &
kill -INT $!
sleep 1
EOF
sh silent.sh
http://thread.gmane.org/gmane.comp.shells.bash.bugs/15798
Maybe detach the process from the current shell process by calling disown?
The Terminated is logged by the default signal handler of bash 3.x and 4.x. Just trap the TERM signal at the very first of child process:
#!/bin/sh
## assume script name is test.sh
foo() {
trap 'exit 0' TERM ## here is the key
while true; do sleep 1; done
}
echo before child
ps aux | grep 'test\.s[h]\|slee[p]'
foo &
pid=$!
sleep 1 # wait trap is done
echo before kill
ps aux | grep 'test\.s[h]\|slee[p]'
kill $pid ## no need to redirect stdin/stderr
sleep 1 # wait kill is done
echo after kill
ps aux | grep 'test\.s[h]\|slee[p]'
Is this what we are all looking for?
Not wanted:
$ sleep 3 &
[1] 234
<pressing enter a few times....>
$
$
[1]+ Done sleep 3
$
Wanted:
$ (set +m; sleep 3 &)
<again, pressing enter several times....>
$
$
$
$
$
As you can see, no job end message. Works for me in bash scripts as well, also for killed background processes.
'set +m' disables job control (see 'help set') for the current shell. So if you enter your command in a subshell (as done here in brackets) you will not influence the job control settings of the current shell. Only disadvantage is that you need to get the pid of your background process back to the current shell if you want to check whether it has terminated, or evaluate the return code.
This also works for killall (for those who prefer it):
killall -s SIGINT (yourprogram)
suppresses the message... I was running mpg123 in background mode.
It could only silently be killed by sending a ctrl-c (SIGINT) instead of a SIGTERM (default).
disown did exactly the right thing for me -- the exec 3>&2 is risky for a lot of reasons -- set +bm didn't seem to work inside a script, only at the command prompt
Had success with adding 'jobs 2>&1 >/dev/null' to the script, not certain if it will help anyone else's script, but here is a sample.
while true; do echo $RANDOM; done | while read line
do
echo Random is $line the last jobid is $(jobs -lp)
jobs 2>&1 >/dev/null
sleep 3
done
Another way to disable job notifications is to place your command to be backgrounded in a sh -c 'cmd &' construct.
#!/bin/bash
# ...
pid="`sh -c 'sleep 30 & echo ${!}' | head -1`"
kill "$pid"
# ...
# or put several cmds in sh -c '...' construct
sh -c '
sleep 30 &
pid="${!}"
sleep 5
kill "${pid}"
'
I found that putting the kill command in a function and then backgrounding the function suppresses the termination output
function killCmd() {
kill $1
}
killCmd $somePID &
Simple:
{ kill $! } 2>/dev/null
Advantage? can use any signal
ex:
{ kill -9 $PID } 2>/dev/null
edit
For future readers. The root of this problem really came down to running the function in an interactive shell vs. putting it in a separate script.
Also, there are many things that could be improved in the code I originally posted. Please see comments for things that could/should have been done better.
/edit
I have a bash function intended to rerun a process in the background when files in a directory change (think like Grunt, but for general purposes). The script functions as desired while running:
The subprocess is correctly started (including any children)
On file change, the sub is killed (including children) and started again
However, on exit (ctrl-c) none of the processes are killed. Additionally, pressing ctrl-c a second time will kill the current terminal session. I'm assuming this is a problem with my trap, but have been unable to identify a reason for the issue.
Here is the code of rerun.sh
#!/bin/bash
# rerun.sh
_kill_children() {
isTop=$1
curPid=$2
# Get pids of children
children=`ps -o pid --no-headers --ppid ${curPid}`
for child in $children
do
# Call this function to get grandchildren as well
_kill_children 0 $child
done
# Parent calls this with 1, all other with 0 so only children are killed
if [[ $isTop -eq 0 ]]; then
kill -9 $curPid 2> /dev/null
fi
}
rerun() {
trap " _kill_children 1 $$; exit 0" SIGINT SIGTERM
FORMAT=$(echo -e "\033[1;33m%w%f\033[0m written")
#Command that should be repeatedly run is passed as args
args=$#
$args &
#When a file changes in the directory, rerun the process
while inotifywait -qre close_write --format "$FORMAT" .
do
#Kill current bg proc and it's children
_kill_children 1 $$
$args & #Rerun the proc
done
}
#This is sourced in my bash profile so I can run it any time
To test this, create a pair of executable files parent.sh and child.sh as follows:
#!/bin/bash
#parent.sh
./child.sh
#!/bin/bash
#child.sh
sleep 86400
Then source the rerun.sh file and run rerun ./parent.sh. In another terminal window I watch "ps -ef | grep pts/4" to see all processes for the rerun (in this example on pts/4). Touching a file in the directory triggers a restart of parent.sh and children. [ctrl-c] exits, but leaves the pids running. [ctrl-c] again kills bash and all other processes on pts/4.
Desired behavior: on [ctrl-c], kill children and exit to shell normally. Help?
--
Code sources:
Inotify idea from: https://exyr.org/2011/inotify-run/
Kill children from: http://riccomini.name/posts/linux/2012-09-25-kill-subprocesses-linux-bash/
This isn't a good practice to follow in the first place. Track your children explicitly:
children=( )
foo & children+=( "$!" )
...then, you can kill or wait for them explicitly, referring to "${children[#]}" for the list. If you want to get grandchildren as well, this is a good user for fuser -k and a lockfile:
lockfile_name="$(mktemp /tmp/lockfile.XXXXXX)" # change appropriately
trap 'rm -f "$lockfile_name"' 0
exec 3>"$lockfile_name" # open lockfile on FD 3
kill_children() {
# close our own handle on the lockfile
exec 3>&-
# kill everything that still has it open (our children and their children)
fuser -k "$lockfile_name" >/dev/null
# ...then open it again.
exec 3>"$lockfile_name"
}
rerun() {
trap 'kill_children; exit 0' SIGINT SIGTERM
printf -v format '%b' "\033[1;33m%w%f\033[0m written"
"$#" &
#When a file changes in the directory, rerun the process
while inotifywait -qre close_write --format "$format" .; do
kill_children
"$#" &
done
}
I am trying to get the SIGSTOP CTRL+Z signal in my script's trap.
When my script is executing, if I temporarily suspend from execution, send a SIGSTOP signalCTRL+Z, it needs to remove the files I create in it and to kill the execution.
I don't understand why the following script doesn't work. But, more important, what is the correct way to do it?
#!/bin/bash
DIR="temp_folder"
trap "rm -r $DIR; kill -SIGINT $$" SIGSTP
if [ -d $DIR ]
then
rm -r $DIR
else
mkdir $DIR
fi
sleep 5
EDIT:
SIGSTOP cannot be trapped, however SIGTSTP can be trapped, and from what I understood after searching on the internet and the answer below it's the correct to trap when sending signal with CTRL+Z. However, when I press CTRL+Z while running the script it will get stuck, meaning that the script will be endlessly execute no matter what signals I send afterwards.
The problem here is you are trying to suspend a process that is already sleeping.
It is also good practice to use DIR=$(mktemp -d) in shell scripts to create temp directories.
CTRL-C is signal (2) / CTRL-Z (20):
catch_exits() {
printf "\n$(basename $0): exiting\n" 1>&2
rm -rf $DIR
exit 1
}
trap catch_exits 1 2 3 15 20
DIR="$(mktemp -d)"
read -p "not sleeping" test
if you send a function to the background (such as for a cursor spinner) - then you need to disable CTRL-Z while the long process is running with:
trap "" SIGTSTP
There are two signals you can't trap*, SIGKILL and SIGSTOP. Use another signal.
*: without modifying the kernel
IEEE standard:
Setting a trap for SIGKILL or SIGSTOP produces undefined results.
I am writing a bash wrapper for an application. This wrapper is responsible for changing the user, running the software and logging its output.
I also want it to propagate the SIGINT signal.
Here is my code so far :
#!/bin/bash
set -e; set -u
function child_of {
ps --ppid $1 -o "pid" --no-headers | head -n1
}
function handle_int {
echo "Received SIGINT"
kill -int $(child_of $SU_PID)
}
su myuser -p -c "bash /opt/loop.sh 2>&1 | tee -i >(logger -t mytag)" &
SU_PID=$!
trap "handle_int" SIGINT
wait $SU_PID
echo "This is the end."
My problem is that when I send a SIGINT to this wrapper, handle_int gets called but then the script is over, while I want it to continue to wait for $SU_PID.
Is there a way to catch the int signal, do something and then prevent the script from terminating ?
You have a gotcha: after Ctrl-C, "This is the end." is expected but it never comes because the script has exited prematurely. The reason is wait has (unexpectedly) returned non-zero while running under set -e.
According to "man bash":
If bash is waiting for a command to complete and receives a signal for which a trap has been set, the trap
will not be executed until the command completes. When bash is waiting for an asynchronous command via the
wait builtin, the reception of a signal for which a trap has been set will cause the wait builtin to return
immediately with an exit status greater than 128, immediately after which the trap is executed.
You should wrap your wait call in set +e so that your program can continue running after handling a trapped signal while waiting for an asynchronous command.
Like this:
# wait function that handles trapped signal on asynchronous commands.
function safe_async_wait {
set +e
wait $1 # returns >128 on asynchronous commands
set -e
}
#...
safe_async_wait $SU_PID