How to kill shell script without killing currently executed line - linux

I am running a shell script, something like sh script.sh, in bash. The script contains many lines, some of which take seconds and others days to execute. How can I kill the sh command without killing the command it is currently running (the current line of the script)?

You haven't specified exactly what should happen when you 'kill' your script, but I'm assuming that you'd like the currently executing line to complete and the script to exit before doing any more work.
This is probably best achieved by coding your script so that it can receive such a 'kill' request and respond appropriately - I don't think there is any magic in Linux that will do it for you.
For example:
You could trap a signal and then set a variable
Check for the existence of a file (e.g. touch /var/tmp/trigger)
Then after each line in your script, you'd need to check whether the trap had been called (or your trigger file created) - and if so, exit. If the trigger has not been set, you continue on and do the next piece of work.
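A minimal sketch of the trigger-file variant (the path /var/tmp/trigger comes from the example above; the sleep commands are only stand-ins for the real lines of your script):
#!/bin/bash
check_trigger() {
    if [ -e /var/tmp/trigger ]; then
        echo "trigger found - stopping after the step that just finished"
        rm -f /var/tmp/trigger
        exit 0
    fi
}
sleep 30; check_trigger   # stand-ins for the real lines of your script
sleep 30; check_trigger   # run "touch /var/tmp/trigger" from another terminal:
sleep 30; check_trigger   # the current step finishes, then the script stops
Running touch /var/tmp/trigger from another terminal lets the current line finish and stops the script before the next one starts.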
To the best of my knowledge, you can't trap a SIGKILL (-9) - if someone sends that to your process, then it will die.
HTH, Ace

The only way I can think of achieving this is for the parent process to trap the kill signal, set a flag, and then repeatedly check for this flag before executing another command in your script.
The subprocesses also need to be immune to the kill signal. Bash seems to behave differently from ksh in this respect, and the script below works fine.
#!/bin/bash

QUIT=0
trap "QUIT=1; echo 'term'" TERM   # on SIGTERM, just set a flag; don't exit yet

function terminated {
    if ((QUIT == 1))
    then
        echo "Terminated"
        exit
    fi
}

function subprocess {
    # stand-in for a long-running command
    typeset -i N
    while ((N++ < 3))
    do
        echo $N
        sleep 1
    done
}

while true
do
    subprocess
    terminated   # check the flag between commands, not during them
    sleep 3
done

I assume your script has been running for days and you don't want to just kill it without knowing whether its current child has finished.
Find the pid of your script's process using ps. Then:
child=$(pgrep -P $pid)   # PID of the child the script is currently running
while kill -s 0 $child   # signal 0 only checks whether the process still exists
do
    sleep 1
done
kill $pid

Related

How do I terminate a command that runs infinitely in shell script?

I have a command in my shell script that runs forever - it won't finish unless I press Ctrl-C. I have been trying to look up how to send a Ctrl-C signal from a script, and all the answers are some variation of kill $! or kill $$ or such. My problem is that the command never finishes, so the script never moves on to the next command, like my "kill" commands or anything else. I have to manually hit Ctrl-C in my terminal for it even to execute kill $!. I'm sure there is a way to work around this, but I am not sure what. Thanks in advance!
There are several approaches to this problem. The simplest (though not the most robust) is perhaps to run your long-running command in the background:
#!/bin/sh
long-running-command & # run in the background
sleep 5 # sleep for a bit
kill %1 # send SIGTERM to the command if it's still running

Trap process.kill in bash?

I am trying to run a bash script through the Process class and do something when I use process.Kill().
But it seems that the exit signal isn't triggered, so:
A. Is it possible?
B. If it isn't possible is there a way to send a signal to the process?
Script.sh
#!/bin/bash
Exit() {
    echo "terminated" >> log.txt
    kill $child
}
trap Exit EXIT
daemon &
child=$!
echo $child > log.txt
wait $child
C# code:
Process script = new Process();
script.StartInfo.FileName = "Script.sh";
script.Start();
// Other stuff
script.Kill();
C#'s script.Kill() doesn't seem to trigger Exit() in Script.sh.
Exiting through Ubuntu's System Monitor does trigger it, though.
Edit:
Changing the signal to SIGTERM didn't change the results.
There is no such signal EXIT in UNIX/Linux. SIGTERM is the closest to what you want and the default signal used by kill.
However, bash has a pseudo-signal called EXIT. It's described in Bash Manual - Bourne Shell Builtins - trap which is where you might have gotten confused.
If a sigspec is 0 or EXIT, arg is executed when the shell exits. If a sigspec is DEBUG, the command arg is executed before every simple command, for command, case command, select command, every arithmetic for command, and before the first command executes in a shell function. Refer to the description of the extdebug option to the shopt builtin (see The Shopt Builtin) for details of its effect on the DEBUG trap. If a sigspec is RETURN, the command arg is executed each time a shell function or a script executed with the . or source builtins finishes executing
If you meant to use this, then you should start your script with the line:
#!/bin/bash
to ensure it is run under the bash shell.
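To illustrate the difference, here is a minimal sketch (the handler names on_exit and on_term are arbitrary):
#!/bin/bash
on_exit() { echo "EXIT trap: runs whenever the shell exits, however it exits"; }
on_term() { echo "TERM trap: runs only if the process receives SIGTERM"; }
trap on_exit EXIT
trap on_term TERM
sleep 60 &
wait $!
Left alone, only the EXIT line is printed when the script finishes; kill <pid> from another terminal prints both lines; kill -9 prints neither, since SIGKILL cannot be trapped.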
See http://man7.org/linux/man-pages/man7/signal.7.html for a list of all the signals in Linux.

crash-stopping bash pipeline [duplicate]

This question already has answers here:
How do you catch error codes in a shell pipe?
I have a pipeline, say a|b where if a runs into a problem, I want to stop the whole pipeline.
a exiting with status 1 doesn't do this, since b often doesn't care about return codes.
e.g.
echo 1|grep 0|echo $? <-- this shows that grep did exit=1
but
echo 1|grep 0 | wc <--- wc is unfazed by grep's exit here
If I ran the pipeline as a subprocess of an owning process, any of the pipeline processes could kill the owning process. But this seems a bit clumsy -- but it would zap the whole pipeline.
Not possible with basic shell constructs, probably not possible in shell at all.
Your first example doesn't do what you think. echo doesn't use standard input, so putting it on the right side of a pipe is never a good idea. The $? that you're echoing is not the exit value of the grep 0. All commands in a pipeline run simultaneously. echo has already been started, with the existing value of $?, before the other commands in the pipeline have finished. It echoes the exit value of whatever you did before the pipeline.
# The first command is to set things up so that $? is 2 when the
# second command is parsed.
$ sh -c 'exit 2'
$ echo 1|grep 0|echo $?
2
Your second example is a little more interesting. It's correct to say that wc is unfazed by grep's exit status. All commands in the pipeline are children of the shell, so their exit statuses are reported to the shell. The wc process doesn't know anything about the grep process. The only communication between them is the data stream written to the pipe by grep and read from the pipe by wc.
There are ways to find all the exit statuses after the fact (the linked question in the comment by shx2 has examples) but a basic rule that you can't avoid is that the shell will always wait for all the commands to finish.
Early exits in a pipeline sometimes do have a cascade effect. If a command on the right side of a pipe exits without reading all the data from the pipe, the command on the left of that pipe will get a SIGPIPE signal the next time it tries to write, which by default terminates the process. (The two phrases to pay close attention to there are "the next time it tries to write" and "by default". If the writing process spends a long time doing other things between writes to the pipe, it won't die immediately. If it handles the SIGPIPE, it won't die at all.)
In the other direction, when a command on the left side of a pipe exits, the command on the right side of that pipe gets EOF, which does cause the exit to happen fairly soon when it's a simple command like wc that doesn't do much processing after reading its input.
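Both effects are easy to observe from an interactive bash prompt. The bash array PIPESTATUS holds the exit status of every command in the most recent pipeline, which is one way to inspect statuses "after the fact" as mentioned above; here yes is killed by SIGPIPE (128+13 = 141) as soon as head exits:
$ yes | head -n 1 > /dev/null
$ echo "${PIPESTATUS[@]}"
141 0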
With direct use of pipe(), fork(), and wait3(), it would be possible to construct a pipeline, notice when one child exits badly, and kill the rest of them immediately. This requires a language more sophisticated than the shell.
I tried to come up with a way to do it in shell with a series of named pipes, but I don't see it. You can run all the processes as separate jobs and get their PIDs with $!, but the wait builtin isn't flexible enough to say "wait for any child in this set to exit, and tell me which one it was and what the exit status was".
If you're willing to mess with ps and/or /proc you can find out which processes have exited (they'll be zombies), but you can't distinguish successful exit from any other kind.
Write
set -e
set -o pipefail
at the beginning of your file.
-e makes the script exit on an error, and -o pipefail makes a pipeline report failure if any stage fails (its status becomes that of the rightmost failing command) instead of only reporting the last command's status.
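A minimal sketch of the effect (missing-file is just an example name). As the earlier answer notes, all stages of the pipeline still run to completion; the failure only shows up in the pipeline's exit status, at which point set -e stops the script:
#!/bin/bash
set -e
set -o pipefail
grep 0 missing-file | wc -l   # grep fails; with pipefail the whole pipeline fails,
                              # and set -e stops the script right here
echo "never reached"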

Bash: Why does parent script not terminate on SIGINT when child script traps SIGINT?

script1.sh:
#!/bin/bash
./script2.sh
echo after-script
script2.sh:
#!/bin/bash
function handler {
    exit 130
}
trap handler SIGINT
while true; do true; done
When I start script1.sh from a terminal, and then use Ctrl+C to send SIGINT to its process group, the signal is trapped by script2.sh and when script2.sh terminates, script1.sh prints "after-script". However, I would have expected script1.sh to immediately terminate after the line that invokes script2.sh. Why is this not the case in this example?
Additional remarks (edit):
As script1.sh and script2.sh are in the same process group, SIGINT gets sent to both scripts when Ctrl+C is pressed on the command line. That's why I wouldn't expect script1.sh to continue when script2.sh exits.
When the line "trap handler SIGINT" in script2.sh is commented out, script1.sh does exit immediately after script2.sh exits. I want to know why it behaves differently in that case, since script2.sh produces the same exit code (130) either way.
New answer:
This question is far more interesting than I originally suspected. The answer is essentially given here:
What happens to a SIGINT (^C) when sent to a perl script containing children?
Here's the relevant tidbit. I realize you're not using Perl, but I assume Bash is using C's convention.
Perl’s builtin system function works just like the C system(3) function from the standard C library as far as signals are concerned. If you are using Perl’s version of system() or pipe open or backticks, then the parent — the one calling system rather than the one called by it — will IGNORE any SIGINT and SIGQUIT while the children are running.
This explanation is the best I've seen about the various choices that can be made. It also says that Bash uses the WCE approach. That is, when a parent process receives SIGINT, it waits until its child process returns. If that process exited because of a SIGINT, the parent also exits on SIGINT. If the child exited any other way, the parent ignores the SIGINT.
There is also a way that the calling shell can tell whether the called program exited on SIGINT and if it ignored SIGINT (or used it for other purposes). As in the WUE way, the shell waits for the child to complete. It figures whether the program was ended on SIGINT and if so, it discontinue the script. If the program did any other exit, the script will be continued. I will call the way of doing things the "WCE" (for "wait and cooperative exit") for the rest of this document.
I can't find a reference to this in the Bash man page, but I'll keep looking in the info docs. But I'm 99% confident this is the correct answer.
Old answer:
A nonzero exit status from a command in a Bash script does not terminate the program. If you do an echo $? after ./script2.sh it will show 130. You can terminate the script by using set -e as phs suggests.
$ help set
...
-e Exit immediately if a command exits with a non-zero status.
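Applied to script1.sh, that suggestion would look something like this (a sketch only, not a change to the asker's files):
#!/bin/bash
set -e              # abort as soon as any command returns a non-zero status
./script2.sh        # returns 130 after its SIGINT handler runs
echo after-script   # no longer reached when script2.sh fails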
The second part of @seanmcl's updated answer is correct and the link to http://www.cons.org/cracauer/sigint.html is a really good one to read through carefully.
From that link, "You cannot 'fake' the proper exit status by an exit(3) with a special numeric value, even if you look up the numeric value for your system". In fact, that's what is being attempted in @Hermann Speiche's script2.sh.
One answer is to modify function handler in script2.sh as follows:
function handler {
    # ... do stuff ...
    trap INT     # reset SIGINT to its default disposition (remove this handler)
    kill -2 $$   # re-send SIGINT to this shell so it exits "on" SIGINT
}
This effectively removes the signal handler and "rethrows" the SIGINT, causing the bash process to exit with the appropriate flags such that its parent bash process then correctly handles the SIGINT that was originally sent to it. This way, using set -e or any other hack is not actually required.
It's also worth noting that if you have an executable that behaves incorrectly when sent a SIGINT (it doesn't conform to "How to be a proper program" in the above link, e.g. it exits with a normal return-code), one way of working around this is to wrap the call to that process with a script like the following:
#!/bin/bash
function handler {
    trap INT     # reset SIGINT to its default disposition
    kill -2 $$   # re-send SIGINT to this wrapper so it exits "on" SIGINT
}
trap handler INT
badprocess "$@"  # forward all arguments to the wrapped program
The reason your script1.sh doesn't terminate is that script2.sh is running in a subshell. To make the former script exit, you can either use set -e as suggested by phs and seanmcl or force script2.sh to run in the same shell by saying:
. ./script2.sh
in your first script. What you're observing would be apparent if you were to run set -x before executing your script. help set says:
-x Print commands and their arguments as they are executed.
You can also have your second script send a terminating signal to its parent script, using SIGHUP or another safe and usable signal like SIGQUIT, which the parent script may trap as well (sending SIGINT doesn't work).
script1.sh:
#!/bin/bash
trap 'exit 0' SIGQUIT ## We could also just accept SIGHUP without a trap, but that prints a message to the screen.
./script2.sh ## or "bash script2.sh" or "( . ./script2.sh; )", which would run it in another process
echo after-script
script2.sh:
#!/bin/bash
SLEEPPID=''
PID=$BASHPID
read PPID_ < <(exec ps -p "$PID" -o ppid=)   # find the parent script's PID
function handler {
    [[ -n $SLEEPPID ]] && kill -s SIGTERM "$SLEEPPID" &>/dev/null
    kill -s SIGQUIT "$PPID_"   # tell the parent script to exit
    exit 130
}
trap handler SIGINT
# better do some sleeping:
for (( ;; )); do
    [[ -n $SLEEPPID ]] && kill -s 0 "$SLEEPPID" &>/dev/null || {
        sleep 20 &
        SLEEPPID=$!
    }
    wait
done
The last lines of your script1.sh could also just be written like this, depending on your script's intended implementation:
./script2.sh || exit
...
Or
./script2.sh
[[ $? -eq 130 ]] && exit
...
The correct way this should work is through setpgrp(). All children of the shell should be placed in the same pgrp. When SIGINT is signaled by the tty driver, it is delivered to all processes in the group. A shell at any level should note the receipt of the signal, wait for its children to exit, and then kill itself with SIGINT (with no signal handler installed) so that its exit code is correct.
Additionally, when SIGINT is set to ignore at startup by their parent process, they should ignore SIGINT.
A shell should not "check if a child exited with sigint" as any part of the logic. The shell should always just honor the signal it receives directly as the reason to act and then exit.
Back in the day of real UNIX, SIGINT stopped the shell and all sub processes with a single key stroke. There was never any problem with the exit of a shell and child processes continuing to run, unless they themselves had set SIGINT to ignore.
For any shell pipeline, there should be a child-process relationship created from the pipeline going right to left. The rightmost command is the immediate child of the shell, since that's the last process to exit normally. Each command before that is a child of the process immediately to the right of the next pipe symbol or && or || symbol. There are obvious groups of children around && and || which fall out naturally.
In the end, process groups keep things clean, so that nohup works and all children receive SIGINT, SIGQUIT, SIGHUP, or other tty driver signals.

Making linux "Wait" command wait for ALL child processes

Wait is not waiting for all child processes to stop. This is my script:
#!/bin/bash
titlename=`echo "$#"|sed 's/\..\{3\}$//'`
screen -X title "$titlename"
/usr/lib/process.bash -verbose $#
wait
bash -c "mail.bash $#"
screen -X title "$titlename.Done"
I don't have access to /usr/lib/process.bash, but it is a script that changes frequently, so I would like to reference it... but in that script:
#!/bin/ksh
#lots of random stuff
/usr/lib/runall $path $auto $params > /dev/null 2>&1&
My problem is that runall creates a log file... and mail.bash is supposed to mail me that log file, but the wait isn't waiting for runall to finish; it seems to only be waiting for process.bash to finish. Is there any way, without access to process.bash, or trying to keep my own up-to-date version of process.bash, to make the wait properly wait for runall to finish? (The log file overwrites the previous run, so I can't just check for the presence of the log file, since there is always one there.)
Thanks,
Dan
(
. /usr/lib/process.bash -verbose $#
wait
)
Instead of letting the OS start process.bash, this creates a subshell, runs all the commands in process.bash as if they were entered into our shell script, and waits within that subshell.
There are some caveats to this, but it should work if you're not doing anything unusual.
wait only waits for direct children; if any children spawn their own children, it won't wait for them.
The main problem is that because process.bash has exited the runall process will be orphaned and owned by init (PID 1). If you look at the process list runall won't have any visible connection to your process any more since the intermediate process.bash script exited. There's no way to use ps --ppid or anything similar to search for this "grandchild" process once it's orphaned.
You can wait on a specific PID. Do you know the PID of the runall process? If there's only one such process you could try this, which will wait for all running runalls:
wait `pidof runall`
You could retrieve the PID of the process you want to wait for, and then pass this PID as an argument to the wait command.
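A rough sketch of that idea, using the runall process from the question (pgrep -f and the assumption that only one matching process exists are mine). Keep in mind the caveat from the earlier answer: the wait builtin only works for children of the current shell, so for an orphaned grandchild you have to poll instead:
pid=$(pgrep -f runall)            # assumption: exactly one runall process is running
while kill -0 "$pid" 2>/dev/null  # signal 0 only tests whether the process still exists
do
    sleep 1
done
# runall has finished; the log file is now complete, so mail it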
