How would you stop an application that is ignoring all signals? - linux

I know that in most cases it is impossible to ignore all signals. If it were possible for an application to ignore all signals, how would you stop it, since you wouldn't be able to use SIGKILL or SIGSTOP? Not even sure if this question makes sense...

SIGKILL will end the process without it having any say in the matter.
When SIGKILL is sent to a process, the kernel does not deliver the signal to the process or call any signal handler the process specified. Instead, the kernel simply stops and destroys the process itself.
So, SIGKILL will always work. There is nothing the process can do to prevent it. It won't get any time to execute any code or do any cleanup. This is why you would usually try to send a SIGTERM first to ask the process to come to an end on its own and follow with a SIGKILL after a while only if the process didn't honor the SIGTERM request.
For SIGSTOP the matter is similar: it cannot be caught, blocked, or ignored, and it always stops the process.
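To see that a process really has no say here, the following minimal sketch tries to set a signal's disposition to "ignore" and reports what sigaction() answers. The helper name try_ignore is hypothetical; the only assumption is a POSIX system.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>

/* Try to set the disposition of `signo` to "ignore".
 * Returns 0 on success, or the errno value reported by sigaction(). */
int try_ignore(int signo)
{
    struct sigaction sa;

    assert(signo > 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);

    if (sigaction(signo, &sa, NULL) == -1)
        return errno;   /* EINVAL for SIGKILL and SIGSTOP */
    return 0;
}
```

Calling try_ignore(SIGKILL) or try_ignore(SIGSTOP) should return EINVAL, while an ordinary signal such as SIGUSR1 is accepted.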

Related

Long Running Python Script in VSCode Exits with 'Polite quit request'

I have a long running Python script which is running in Visual Studio Code.
After a while the script stops running, there are no errors just this statement:
"fish: “/usr/bin/python3 /home/ubuntu/.…” terminated by signal SIGTERM (Polite quit request)"
What is happening here?
If a process receives SIGTERM, some other process sent that signal. That is what happened in your case.
The SIGTERM signal is sent to a process to request its termination. Unlike the SIGKILL signal, it can be caught and interpreted or ignored by the process. This allows the process to perform nice termination releasing resources and saving state if appropriate. SIGINT is nearly identical to SIGTERM.
SIGTERM is not sent automatically by the system. There are a few signals that are sent automatically like SIGHUP when a terminal goes away, SIGSEGV/SIGBUS/SIGILL when a process does things it shouldn't be doing, SIGPIPE when it writes to a broken pipe/socket, etc.
SIGTERM is the signal that is typically used to administratively terminate a process.
That's not a signal that the kernel would send, but that's the signal a process would typically send to terminate (gracefully) another process. It is sent by default by the kill, pkill, killall, fuser -k commands.
Possible reasons why your process received such a signal include:
execution of the process takes too long
insufficient memory or system resources to continue the execution of the process
These are only some possibilities; in your case, the root of the issue might be something different. You can make the process survive a SIGTERM by telling it to ignore the signal, but that is not recommended.
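Rather than ignoring SIGTERM, a process can catch it and shut down cleanly. Here is a minimal sketch of that pattern in C; the names stop_requested, install_sigterm_handler and run_until_stopped are hypothetical, and the work loop is only a placeholder.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Set by the handler, checked by the work loop. */
static volatile sig_atomic_t stop_requested = 0;

static void on_sigterm(int signo)
{
    (void)signo;
    stop_requested = 1;   /* only async-signal-safe work belongs here */
}

/* Install the SIGTERM handler. Returns 0 on success, -1 on error. */
int install_sigterm_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGTERM, &sa, NULL);
}

/* Hypothetical work loop: runs until SIGTERM has been seen or
 * max_steps is reached. Returns the number of steps completed. */
int run_until_stopped(int max_steps)
{
    int steps = 0;

    assert(max_steps >= 0);
    while (!stop_requested && steps < max_steps) {
        /* ... one unit of work would go here ... */
        steps++;
    }
    /* cleanup (flush files, save state) would go here */
    return steps;
}
```

The handler only sets a flag; the actual cleanup happens in normal program flow, which avoids calling non-reentrant functions from signal context.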

How is a signal uncatchable (e.g. sigkill)? What goes on under the hood of linux?

I understand there are catchable and uncatchable signals. However, it seems that both types of signals get sent to processes by the OS. What makes uncatchable signals uncatchable? Is it the signal handler that catches signals, and because no signal handler is written to handle a particular signal (e.g. sigkill), that it becomes uncatchable? If that's true, can I conclude that it is possible to catch an uncatchable signal by writing a signal handler?
No, you can't catch the uncatchable signals: they are always handled by the default action implemented in the kernel.
SIGKILL always terminates the process. If the process attempts to register a handler for SIGKILL, the request is rejected (sigaction() fails with EINVAL), so control never reaches user code and the kernel's default SIGKILL action terminates the program.
This is what happens when you try to shut down your system.
First, the system sends a SIGTERM signal to all processes and waits for a while, giving them a grace period. Any process that still has not stopped after the grace period is forcibly terminated with SIGKILL.
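The same "ask politely, then force" sequence can be applied to a single child process. The sketch below is one possible way to write it; terminate_child and the demo helper spawn_sleeper are hypothetical names, the 10 ms polling interval is arbitrary, and the approach only works for direct children because it relies on waitpid().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Send SIGTERM to child `pid`, wait up to `grace_ms` milliseconds, then
 * fall back to SIGKILL. Returns SIGTERM if the child went away during
 * the grace period, SIGKILL if it had to be forced, -1 on error. */
int terminate_child(pid_t pid, int grace_ms)
{
    const struct timespec tick = { 0, 10 * 1000 * 1000 };  /* 10 ms */
    int status;

    assert(pid > 0);
    if (kill(pid, SIGTERM) == -1)
        return -1;
    for (int waited = 0; waited < grace_ms; waited += 10) {
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return SIGTERM;          /* gone within the grace period */
        if (r == -1)
            return -1;
        nanosleep(&tick, NULL);
    }
    if (kill(pid, SIGKILL) == -1 || waitpid(pid, &status, 0) == -1)
        return -1;
    return SIGKILL;
}

/* Demo helper: fork a child that only sleeps. If `stubborn` is non-zero
 * the child ignores SIGTERM (the disposition is set before fork() so the
 * child inherits it without a race). */
pid_t spawn_sleeper(int stubborn)
{
    struct sigaction ign, old;
    pid_t pid;

    ign.sa_handler = SIG_IGN;
    ign.sa_flags = 0;
    sigemptyset(&ign.sa_mask);
    if (stubborn)
        sigaction(SIGTERM, &ign, &old);
    pid = fork();
    if (pid == 0)
        for (;;)
            pause();
    if (stubborn)
        sigaction(SIGTERM, &old, NULL);   /* restore the parent's setting */
    return pid;
}
```

A well-behaved child ends during the grace period, while a child that ignores SIGTERM is only removed by the final SIGKILL.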

System V msg_send interrupted by SIGKILL

I have a multi-process application that works like so...
There is a parent process. The parent process queries a database to find work, then forks children to process that work. The children communicate back to the parent via System V message queues to indicate they're done with their work. When the parent process picks up that message, it updates the database to indicate that the work is complete.
This works okay but I'm struggling with handling the parent process being killed.
What happens is that the parent receives a SIGINT (from CTRL-C) and then sends SIGKILL to each of the children. If a child is blocked on a Sys V message queue write when it receives that signal, the write is "interrupted" by the signal and the blocking is canceled, so the parent never learns that the child's work was done, and the database never gets updated.
That means that the next time I run the script, it will re-run any work that was blocking on the System V queue write.
I don't have a good idea for a solution for this yet. Ideally I would like to be able to force the queue write to remain blocking even when it receives that SIGKILL but I don't think such a thing is possible.
Well SIGKILL is, by definition, immediately fatal to the process which receives it and cannot be trapped or handled.
That is why you should only use it as a last resort, when the process does not respond to more polite requests to shut down. Your parent process should start off by sending something like SIGINT or SIGTERM to the children, and only resort to SIGKILL if they don't exit within a reasonable period of time.
Signals like SIGINT and SIGTERM may still cause the system call in the child to return, with EINTR, but you can handle that and retry the call and let it complete before exiting.
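A retry loop around msgsnd() might look like the sketch below. The message layout done_msg and the function send_done are hypothetical, since the original code is not shown. Note that on Linux, signal(7) lists msgsnd() among the calls that are never restarted automatically after a handler runs, even with SA_RESTART, so the explicit loop is needed.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical layout of the "work is done" notification. */
struct done_msg {
    long mtype;      /* message type, must be > 0 */
    int  work_id;    /* payload: which unit of work finished */
};

/* Send a "done" message, retrying whenever a caught signal interrupts
 * the call. Returns 0 on success, -1 on any other error. */
int send_done(int qid, int work_id)
{
    struct done_msg m = { 1, work_id };

    assert(qid >= 0);
    for (;;) {
        if (msgsnd(qid, &m, sizeof m.work_id, 0) == 0)
            return 0;
        if (errno != EINTR)
            return -1;   /* a real failure, give up */
        /* EINTR: a handler ran (e.g. for SIGTERM); try again */
    }
}
```

The child would call send_done() before exiting from its SIGTERM path, so the parent still learns that the work was completed.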

How does SIGINT relate to the other termination signals such as SIGTERM, SIGQUIT and SIGKILL?

On POSIX systems, termination signals usually have the following order (according to many MAN pages and the POSIX Spec):
SIGTERM - politely ask a process to terminate. It shall terminate gracefully, cleaning up all resources (files, sockets, child processes, etc.), deleting temporary files and so on.
SIGQUIT - more forceful request. It shall terminate ungracefully, still cleaning up resources that absolutely need cleanup, but maybe not delete temporary files, maybe write debug information somewhere; on some systems a core dump will also be written (regardless of whether the signal is caught by the app or not).
SIGKILL - most forceful request. The process is not even asked to do anything, but the system will clean up the process, whether it likes that or not. Most likely a core dump is written.
How does SIGINT fit into that picture? A CLI process is usually terminated by SIGINT when the user hits Ctrl+C; however, a background process can also be terminated by SIGINT using the kill utility. What I cannot see in the specs or the header files is whether SIGINT is more or less forceful than SIGTERM, or whether there is any difference between SIGINT and SIGTERM at all.
UPDATE:
The best description of termination signals I found so far is in the GNU LibC Documentation. It explains very well that there is an intended difference between SIGTERM and SIGQUIT.
It says about SIGTERM:
It is the normal way to politely ask a program to terminate.
And it says about SIGQUIT:
[...] and produces a core dump when it terminates the process, just like a program error signal.
You can think of this as a program error condition “detected” by the user. [...]
Certain kinds of cleanups are best omitted in handling SIGQUIT. For example, if the program
creates temporary files, it should handle the other termination requests by deleting the temporary
files. But it is better for SIGQUIT not to delete them, so that the user can examine them in
conjunction with the core dump.
And SIGHUP is also explained well enough. SIGHUP is not really a termination signal; it just means the "connection" to the user has been lost, so the app cannot expect the user to read any further output (e.g. stdout/stderr output) and there is no input to expect from the user any longer. For most apps that means they had better quit. In theory an app could also decide to go into daemon mode when a SIGHUP is received and continue as a background process, writing output to a configured log file. For most daemons already running in the background, SIGHUP usually means that they shall reexamine their configuration files, so you send it to background processes after editing config files.
However, there is no useful explanation of SIGINT on this page, other than that it is sent by Ctrl+C. Is there any reason why one would handle SIGINT in a different way than SIGTERM? If so, what reason would this be and how would the handling be different?
SIGTERM and SIGKILL are intended for general purpose "terminate this process" requests. SIGTERM (by default) and SIGKILL (always) will cause process termination. SIGTERM may be caught by the process (e.g. so that it can do its own cleanup if it wants to), or even ignored completely; but SIGKILL cannot be caught or ignored.
SIGINT and SIGQUIT are intended specifically for requests from the terminal: particular input characters can be assigned to generate these signals (depending on the terminal control settings). The default action for SIGINT is the same sort of process termination as the default action for SIGTERM and the unchangeable action for SIGKILL; the default action for SIGQUIT is also process termination, but additional implementation-defined actions may occur, such as the generation of a core dump. Either can be caught or ignored by the process if required.
SIGHUP, as you say, is intended to indicate that the terminal connection has been lost, rather than to be a termination signal as such. But, again, the default action for SIGHUP (if the process does not catch or ignore it) is to terminate the process in the same way as SIGTERM etc.
There is a table in the POSIX definitions for signal.h which lists the various signals and their default actions and purposes, and the General Terminal Interface chapter includes a lot more detail on the terminal-related signals.
man 7 signal
This is the convenient non-normative manpage of the Linux man-pages project that you often want to look at for Linux signal information.
Version 3.22 mentions interesting things such as:
The signals SIGKILL and SIGSTOP cannot be caught, blocked, or ignored.
and contains the table:
Signal      Value     Action   Comment
----------------------------------------------------------------------
SIGHUP        1       Term     Hangup detected on controlling terminal
                               or death of controlling process
SIGINT        2       Term     Interrupt from keyboard
SIGQUIT       3       Core     Quit from keyboard
SIGILL        4       Core     Illegal Instruction
SIGABRT       6       Core     Abort signal from abort(3)
SIGFPE        8       Core     Floating point exception
SIGKILL       9       Term     Kill signal
SIGSEGV      11       Core     Invalid memory reference
SIGPIPE      13       Term     Broken pipe: write to pipe with no
                               readers
SIGALRM      14       Term     Timer signal from alarm(2)
SIGTERM      15       Term     Termination signal
SIGUSR1   30,10,16    Term     User-defined signal 1
SIGUSR2   31,12,17    Term     User-defined signal 2
SIGCHLD   20,17,18    Ign      Child stopped or terminated
SIGCONT   19,18,25    Cont     Continue if stopped
SIGSTOP   17,19,23    Stop     Stop process
SIGTSTP   18,20,24    Stop     Stop typed at tty
SIGTTIN   21,21,26    Stop     tty input for background process
SIGTTOU   22,22,27    Stop     tty output for background process
which summarizes the signal Action that distinguishes e.g. SIGQUIT from SIGINT, since SIGQUIT has action Core and SIGINT has action Term.
The actions are documented in the same document:
The entries in the "Action" column of the tables below specify the default disposition for each signal, as follows:
Term   Default action is to terminate the process.
Ign    Default action is to ignore the signal.
Core   Default action is to terminate the process and dump core (see core(5)).
Stop   Default action is to stop the process.
Cont   Default action is to continue the process if it is currently stopped.
I cannot see any difference between SIGTERM and SIGINT from the point of view of the kernel since both have action Term and both can be caught. It seems that is just a "common usage convention distinction":
SIGINT is what happens when you do CTRL-C from the terminal
SIGTERM is the default signal sent by kill
Some signals are ANSI C and others not
A considerable difference is that:
SIGINT and SIGTERM are ANSI C, thus more portable
SIGQUIT and SIGKILL are not
They are described in section "7.14 Signal handling" of the C99 draft N1256:
SIGINT receipt of an interactive attention signal
SIGTERM a termination request sent to the program
which makes SIGINT a good candidate for an interactive Ctrl + C.
POSIX 7
POSIX 7 documents the signals with the signal.h header: https://pubs.opengroup.org/onlinepubs/9699919799/basedefs/signal.h.html
This page also has the following table of interest which mentions some of the things we had already seen in man 7 signal:
Signal    Default Action   Description
SIGABRT   A                Process abort signal.
SIGALRM   T                Alarm clock.
SIGBUS    A                Access to an undefined portion of a memory object.
SIGCHLD   I                Child process terminated, stopped, or continued.
SIGCONT   C                Continue executing, if stopped.
SIGFPE    A                Erroneous arithmetic operation.
SIGHUP    T                Hangup.
SIGILL    A                Illegal instruction.
SIGINT    T                Terminal interrupt signal.
SIGKILL   T                Kill (cannot be caught or ignored).
SIGPIPE   T                Write on a pipe with no one to read it.
SIGQUIT   A                Terminal quit signal.
SIGSEGV   A                Invalid memory reference.
SIGSTOP   S                Stop executing (cannot be caught or ignored).
SIGTERM   T                Termination signal.
SIGTSTP   S                Terminal stop signal.
SIGTTIN   S                Background process attempting read.
SIGTTOU   S                Background process attempting write.
SIGUSR1   T                User-defined signal 1.
SIGUSR2   T                User-defined signal 2.
SIGTRAP   A                Trace/breakpoint trap.
SIGURG    I                High bandwidth data is available at a socket.
SIGXCPU   A                CPU time limit exceeded.
SIGXFSZ   A                File size limit exceeded.
BusyBox init
BusyBox 1.29.2's default reboot command sends a SIGTERM to processes, sleeps for a second, and then sends SIGKILL. This seems to be a common convention across different distros.
When you shut down a BusyBox system with:
reboot
it sends a signal to the init process.
Then, the init signal handler ends up calling:
static void run_shutdown_and_kill_processes(void)
{
    /* Run everything to be run at "shutdown". This is done _prior_
     * to killing everything, in case people wish to use scripts to
     * shut things down gracefully... */
    run_actions(SHUTDOWN);
    message(L_CONSOLE | L_LOG, "The system is going down NOW!");
    /* Send signals to every process _except_ pid 1 */
    kill(-1, SIGTERM);
    message(L_CONSOLE, "Sent SIG%s to all processes", "TERM");
    sync();
    sleep(1);
    kill(-1, SIGKILL);
    message(L_CONSOLE, "Sent SIG%s to all processes", "KILL");
    sync();
    /*sleep(1); - callers take care about making a pause */
}
which prints to the terminal:
The system is going down NOW!
Sent SIGTERM to all processes
Sent SIGKILL to all processes
Here is a minimal concrete example of that.
Signals sent by the kernel
SIGKILL:
OOM killer: What is RSS and VSZ in Linux memory management
As DarkDust noted many signals have the same results, but processes can attach different actions to them by distinguishing how each signal is generated. Looking at the FreeBSD kernel source code (kern_sig.c) I see that the two signals are handled in the same way, they terminate the process and are delivered to any thread.
SA_KILL|SA_PROC, /* SIGINT */
SA_KILL|SA_PROC, /* SIGTERM */
After a quick Google search for sigint vs sigterm, it looks like the only intended difference between the two is whether it was initiated by a keyboard shortcut or by an explicit call to kill.
As a result, you could, for example, intercept sigint and do something special with it, knowing that it was likely sent by a keyboard shortcut. Perhaps refresh the screen or something, instead of dying (not recommended, as people expect ^C to kill the program, just an example).
I also learned that ^\ should send sigquit, which I may start using myself. Looks very useful.
Using kill (both the system call and the utility) you can send almost any signal to any process, provided you have permission. With a plain signal handler, a process cannot tell how a signal came to life or who sent it (a handler installed via sigaction() with SA_SIGINFO does receive some of this in siginfo_t, such as si_code and si_pid).
That being said, SIGINT really is meant to signal the Ctrl-C interruption, while SIGTERM is the general termination signal. There is no concept of a signal being "more forceful", with the only exception that there are signals that cannot be blocked or handled (SIGKILL and SIGSTOP, according to the man page).
A signal can only be "more forceful" than another signal with respect to how a receiving process handles the signal (and what the default action for that signal is). For example, by default, both SIGTERM and SIGINT lead to termination. But if you ignore SIGTERM then it will not terminate your process, while SIGINT still will.
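The last point can be checked with a small experiment: a child that ignores SIGTERM receives first SIGTERM and then SIGINT, and the parent inspects which signal ended it. This is only a sketch under the assumption of a POSIX system; the function name signal_that_ends_child is hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that ignores SIGTERM, send it SIGTERM and then SIGINT,
 * and return the number of the signal that terminated it (-1 on error). */
int signal_that_ends_child(void)
{
    struct sigaction ign, dfl, old_term, old_int;
    int status;
    pid_t pid;

    ign.sa_handler = SIG_IGN;
    ign.sa_flags = 0;
    sigemptyset(&ign.sa_mask);
    dfl = ign;
    dfl.sa_handler = SIG_DFL;

    /* Set both dispositions before fork() so the child inherits them
     * without a race: SIGTERM ignored, SIGINT at its default action. */
    sigaction(SIGTERM, &ign, &old_term);
    sigaction(SIGINT, &dfl, &old_int);
    pid = fork();
    if (pid == 0)
        for (;;)
            pause();                      /* child: wait for signals */
    sigaction(SIGTERM, &old_term, NULL);  /* parent: restore settings */
    sigaction(SIGINT, &old_int, NULL);
    if (pid == -1)
        return -1;

    kill(pid, SIGTERM);   /* ignored by the child */
    kill(pid, SIGINT);    /* default action: terminate */

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    assert(WIFSIGNALED(status));   /* the child never exits on its own */
    return WTERMSIG(status);
}
```

The function is expected to return SIGINT: the ignored SIGTERM is discarded, and the child dies from the signal whose default action is still in place.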
With the exception of a few signals, signal handlers can catch the various signals, or the default behavior upon receipt of a signal can be modified. See the signal(7) man page for details.

Should I be worried about the order in which processes in a process group receive signals?

I want to terminate a process group by sending SIGTERM to processes within it. This can be accomplished via the kill command, but the manuals I found provide few details about how exactly it works:
int kill(pid_t pid, int sig);
...
If pid is less than -1, then sig is sent to every process in
the process group whose ID is -pid.
However, in which order will the signal be sent to the processes that form the group? Imagine the following situation: a pipe is set between master and slave processes in the group. If slave is killed during processing kill(-pid), while the master is still not, the master might report this as an internal failure (upon receiving notification that the child is dead). However, I want all processes to understand that such termination was caused by something external to their process group.
How can I avoid this confusion? Should I be doing something more than mere kill(-pid,SIGTERM)? Or it is resolved by underlying properties of the OS, about which I'm not aware?
Note that I can't modify the code of the processes in the group!
Try doing it as a three-step process:
kill(-pid, SIGSTOP);
kill(-pid, SIGTERM);
kill(-pid, SIGCONT);
The first SIGSTOP should put all the processes into a stopped state. They cannot catch this signal, so this should stop the entire process group.
The SIGTERM will be queued for the process but I don't believe it will be delivered, since the processes are stopped (this is from memory, and I can't currently find a reference but I believe it is true).
The SIGCONT will start the processes again, allowing the SIGTERM to be delivered. If the slave gets the SIGCONT first, the master may still be stopped so it will not notice the slave going away. When the master gets the SIGCONT, it will be followed by the SIGTERM, terminating it.
I don't know if this will actually work, and it may be implementation dependent on when all the signals are actually delivered (including the SIGCHLD to the master process), but it may be worth a try.
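For experimenting with this idea, the three calls can be wrapped in a function with error checking. This is a sketch only: stop_term_cont and the demo helper spawn_group are hypothetical names, and the caveats above about delivery order still apply.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send SIGSTOP, SIGTERM and SIGCONT, in that order, to every process in
 * process group `pgid`. Returns 0 if all kill() calls succeeded, else -1. */
int stop_term_cont(pid_t pgid)
{
    static const int sequence[] = { SIGSTOP, SIGTERM, SIGCONT };

    assert(pgid > 1);   /* never signal "every process" by accident */
    for (size_t i = 0; i < sizeof sequence / sizeof sequence[0]; i++) {
        if (kill(-pgid, sequence[i]) == -1)
            return -1;
    }
    return 0;
}

/* Demo helper: fork a child that becomes leader of a new process group
 * and then only waits. Returns the child's pid, which is also its pgid. */
pid_t spawn_group(void)
{
    pid_t pid = fork();

    if (pid == 0) {
        setpgid(0, 0);
        for (;;)
            pause();
    }
    if (pid > 0)
        setpgid(pid, pid);   /* set it from the parent too, to avoid a race */
    return pid;
}
```

After stop_term_cont() returns, a waitpid() on a group member that keeps the default SIGTERM disposition should report termination by SIGTERM.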
My understanding is that you cannot rely on any specific order of signal delivery.
You could avoid the issue if you send the TERM signal to the master process only, and then have the master kill its children.
Even if all the various varieties of UNIX would promise to deliver the signals in a particular order, the scheduler might still decide to run the critical child process code before the parent code.
Even your STOP/TERM/CONT sequence will be vulnerable to this.
I'm afraid you may need something more complicated. Perhaps the child process could catch the SIGTERM and then loop until its parent exits before it exits itself? Be sure and add a timeout if you do this.
Untested: Use shared memory and put in some kind of "we're dying" semaphore, which may be checked before I/O errors are treated as real errors. mmap() with MAP_ANONYMOUS|MAP_SHARED and make sure it survives your way of fork()ing processes.
Oh, and be sure to use the volatile keyword or your semaphore is optimized away.
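A rough sketch of the shared-memory flag follows. It deliberately swaps the volatile keyword for a C11 atomic_int, on the assumption that lock-free atomics in a shared mapping behave as expected across processes on the target platform; all function names are hypothetical.

```c
#define _DEFAULT_SOURCE 1   /* exposes MAP_ANONYMOUS on glibc */
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <sys/mman.h>

/* Create a flag in anonymous shared memory. Call this before fork() so
 * that the parent and all children see the same page.
 * Returns NULL on error. */
atomic_int *shutdown_flag_create(void)
{
    void *p = mmap(NULL, sizeof(atomic_int), PROT_READ | PROT_WRITE,
                   MAP_ANONYMOUS | MAP_SHARED, -1, 0);

    if (p == MAP_FAILED)
        return NULL;
    atomic_int *flag = p;
    atomic_init(flag, 0);
    return flag;
}

/* Announce that the whole group is being shut down on purpose. */
void shutdown_flag_set(atomic_int *flag)
{
    assert(flag != NULL);
    atomic_store(flag, 1);
}

/* Non-zero once any process has called shutdown_flag_set(). */
int shutdown_flag_is_set(atomic_int *flag)
{
    assert(flag != NULL);
    return atomic_load(flag);
}
```

A process that sees an I/O error on the pipe would call shutdown_flag_is_set() first and treat the error as an expected shutdown if the flag is set.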
