When a segmentation fault occurs in a multithreaded application on Linux and a signal handler is called, are all other threads stopped immediately before the handler runs?
So, is it safe to rely on the fact that no parallel code will execute while the segmentation fault is being handled?
Thank you.
From the signal(7) manual page:
A signal may be generated (and thus pending) for a process as a whole (e.g., when sent using kill(2)) or for a specific thread (e.g., certain signals, such as SIGSEGV and SIGFPE, generated as a consequence of executing a specific machine-language instruction are thread directed, as are signals targeted at a specific thread using pthread_kill(3)). A process-directed signal may be delivered to any one of the threads that does not currently have the signal blocked. If more than one of the threads has the signal unblocked, then the kernel chooses an arbitrary thread to which to deliver the signal.
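This paragraph says that certain signals, such as SIGSEGV, are thread-directed: the handler runs in the thread that caused the fault. The other threads are not stopped and keep running while the handler executes, so you cannot rely on no parallel code running.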
Related
According to POSIX definitions,
3.28 Asynchronously-Generated Signal
A signal that is not attributable to a specific thread. Examples are signals sent via kill(), signals sent from the keyboard, and signals delivered to process groups. Being asynchronous is a property of how the signal was generated and not a property of the signal number. All signals may be generated asynchronously.
Then,
3.379 Synchronously-Generated Signal
A signal that is attributable to a specific thread.
For example, a thread executing an illegal instruction or touching invalid memory causes a synchronously-generated signal. Being synchronous is a property of how the signal was generated and not a property of the signal number.
If an illegal instruction causes a synchronously-generated signal, how may it be generated asynchronously?
For example, say I have a program that runs two threads, A and B. Now, suppose an illegal instruction takes place in A and causes a signal SIGILL to be raised. Is a POSIX-compliant system required to invoke the signal handler defined for SIGILL in thread A? Or is it allowed to interrupt thread B and invoke that signal handler in B?
Related:
List of Synchronous and Asynchronous Linux/Posix Signals
The two scenarios are different, and 'synchronous' vs. 'asynchronous' is determined at the time of generation.
To rephrase your second question a bit, you ask, "can a synchronously-generated SIGILL attributed to a particular thread be delivered to a different thread in the same process?"
No. A synchronously-generated signal can only be delivered to the thread that caused it. (2.4.1 Signal Generation and Delivery)
Now, as a caveat, and as mentioned in another answer, normal signal masking semantics do not apply to some synchronously-generated signals. In particular, as the specifications of pthread_sigmask and sigprocmask say, "[i]f any of the SIGFPE, SIGILL, SIGSEGV, or SIGBUS signals are generated while they are blocked, the result is undefined, unless the signal was generated [asynchronously]". (emphasis added)
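Yes, you can send SIGILL, SIGSEGV, SIGBUS, SIGFPE, ... to a process via kill(2), or to a specific thread via pthread_kill(3). Masking is only well defined for signals generated that way: if SIGILL is blocked, a SIGILL sent with kill(2) simply stays pending. What happens when a real illegal instruction trap occurs while SIGILL is blocked is undefined by POSIX, so do not rely on your handler being invoked in that case; Linux, for instance, typically terminates the process instead.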
If a send() causes a SIGPIPE signal, which thread handles it: the thread that called send(), or a random thread? In other words, does Linux send the signal the way kill does or the way pthread_kill does?
Asynchronous signals like SIGPIPE can go to any thread. You can use signal masks to limit which of the threads is eligible.
Synchronous signals like SIGSEGV will be delivered on the thread that caused them.
Summary
The answer to this question has two facets: How the system in question should behave and how it actually behaves.
Since most programmers expect Linux to be mostly POSIX-compatible, we can look into that standard, which actually unambiguously specifies the behavior – the signal is sent directly to the thread which did the write. But whether Linux adheres to it is unclear and Linux documentation is not helpful here. An examination of Linux behavior suggests it conforms to POSIX, but doesn't prove it, and a reading of the source gives us the necessary proof about the current version of Linux.
tl;dr: It is always handled by the thread that did the write.
POSIX Standard
The POSIX standard mandates (since IEEE Std. 1003.1-2001/Cor 2-2004) that SIGPIPE generated as a result of write to a pipe with no readers be delivered to the thread doing the write. See EPIPE in the ERRORS section of the description of write() (emphasis mine):
[EPIPE] An attempt is made to write to a pipe or FIFO that is not open for reading by any process, or that only has one end open. A SIGPIPE signal shall also be sent to the thread.
Linux documentation
That said, it is not clear whether Linux handles this correctly. The page man 7 signal doesn't give concrete lists of thread- and process-directed signals, just examples, and its definition of thread-directed signals doesn't include SIGPIPE:
A signal may be thread-directed because it was generated as a consequence of executing a specific machine-language instruction that triggered a hardware exception […]
SIGPIPE is not a result of a specific instruction, nor is it triggered by a hardware exception.
Glibc documentation doesn't discuss kernel-generated synchronous thread-directed signals at all (i.e. not even SIGSEGV or SIGBUS are discussed as being thread-directed), and there are years-old reports of bugs in NPTL, although these may have been fixed in the meantime.
Observable Linux behavior
I wrote a program which spawns a thread, which blocks SIGPIPE using pthread_sigmask, creates a pipe pair, closes the read end and writes a byte into the write end. If the signal is thread-directed, nothing should happen until the signal is unblocked again. If the signal is process-directed, the main thread should handle the signal and the process should die. The reason for this again comes from POSIX: If there is a thread which has the (process-directed) signal unblocked, it should be delivered there instead of queueing:
Signals generated for the process shall be delivered to exactly one of those threads within the process which […] has not blocked delivery of the signal. If […] all threads within the process block delivery of the signal, the signal shall remain pending on the process until […] a thread unblocks delivery of the signal, or the action associated with the signal is set to ignore the signal.
My experimentation suggests that on modern (2020) Linux with recent Glibc the signal is indeed directed to the thread which did the write, because blocking it with pthread_sigmask in the writing thread prevents SIGPIPE from being delivered until it's unblocked.
Linux 5.4.28 source
The behavior observed above doesn't prove anything, because it is entirely possible that Linux simply violates POSIX in several places and the signal delivery depends on some factors I didn't take into account. To get the proof we seek, we can read the source. Of course, this only tells us about the current behavior, not about the intended one – but if we find the current behavior to be POSIX-conforming, it is probably here to stay.
Disclaimer: I'm not a kernel hacker and the following is a result of a cursory reading of the sources. I might have missed something important.
In kernel/signal.c, there is a SYNCHRONOUS_MASK listing the synchronous signals which are handled specially. These are SIGSEGV, SIGBUS, SIGILL, SIGTRAP, SIGFPE and SIGSYS – SIGPIPE is not in the list. However, that doesn't answer the question – it can be thread-directed without being synchronous.
So how is SIGPIPE sent? It originates from pipe_write() in fs/pipe.c, which calls send_sig() on task_struct current. The use of current already hints that the signal is thread-directed, but let's press on. The send_sig() function is defined in kernel/signal.c and through some indirection ultimately calls __send_signal() with pid_type type = PIDTYPE_PID.
In Linux terminology, PID refers to a single thread. And sure enough, with those parameters, the pending signal list is the thread-specific one, not the shared one; and complete_signal() (called at the end of the function) doesn't even try to find a thread to wake up, it just returns because the thread has already been chosen. I don't fully understand how the signal queues work, but it seems that the queue is per-thread and so the current thread is the one that gets the signal.
I wrote a simple multithreaded server application in C++11 on Linux, and I would like to terminate the server and its running threads by sending it a SIGINT signal.
My server application uses the thread support from C++11 (std::thread etc.). Although I found some support for signal handling in C++11 (std::signal), I couldn't find any support for handling signals in a multithreaded environment.
So my question is: is there any way to handle signals in a multithreaded application in C++11, or do I have to fall back on pthreads just because my application needs to deal with signals?
2.4 Signal Concepts:
At the time of generation, a determination shall be made whether the signal has been generated for the process or for a specific thread within the process. Signals which are generated by some action attributable to a particular thread, such as a hardware fault, shall be generated for the thread that caused the signal to be generated. Signals that are generated in association with a process ID or process group ID or an asynchronous event, such as terminal activity, shall be generated for the process.
...
Signals generated for the process shall be delivered to exactly one of those threads within the process which is in a call to a sigwait() function selecting that signal or has not blocked delivery of the signal.
In light of the above, a common solution in a multithreaded process is to block all the signals one intends to handle in every thread but one. That one thread, often the main thread, handles all process-directed signals and tells the other threads what to do (e.g. terminate). It is easy to block the signals in the main thread before creating the other threads, which inherit the signal mask of the thread that creates them. Once the main thread has finished creating child threads and is ready to handle signals, it must unblock them.
Unfortunately, C++11 does not provide any means for that, so you have to use POSIX functions. Scroll down to Signalling in a Multi-Threaded Process in pthread_sigmask for an example that creates a special signal handling thread. The latter is not necessary if you are using an event loop in the main thread that can handle signals; just unblock the signals before entering the event loop.
I am not new to the use of signals in programming. I mostly work in C/C++ and Python.
But I am interested in knowing how signals are actually implemented in Linux (or Windows).
Does the OS check after each CPU instruction in a signal descriptor table if there are any registered signals left to process? Or is the process manager/scheduler responsible for this?
As signals are asynchronous, is it true that a CPU instruction can be interrupted before it completes?
The OS definitely does not process each and every instruction. No way. Too slow.
When the CPU encounters a problem (like division by 0, access to a restricted resource or a memory location that's not backed by physical memory), it generates a special kind of interrupt, called an exception (not to be confused with the exception abstraction of high-level languages such as C++ or Java).
The OS handles these exceptions. If it's so desired and if it's possible, it can reflect an exception back into the process from which it originated. The so-called Structured Exception Handling (SEH) in Windows is this kind of reflection. C signals should be implemented using the same mechanism.
On the systems I'm familiar with (although I can't see why it should be much different elsewhere), signal delivery is done when the process returns from the kernel to user mode.
Let's consider the one cpu case first. There are three sources of signals:
the process sends a signal to itself
another process sends the signal
an interrupt handler (network, disk, usb, etc) causes a signal to be sent
In all those cases the target process is not running in userland but in kernel mode, having entered it through a system call, through a context switch (another process can only send the signal while it is running, so on a single CPU our target process is not), or through an interrupt handler. So signal delivery is a simple matter of checking whether there are any signals to deliver just before returning to userland from kernel mode.
In the multi-CPU case, if the target process is running on another CPU, it's just a matter of sending an interrupt to the CPU it's running on. The interrupt does nothing other than force the other CPU to go into kernel mode and back, so that signal processing can be done on the way back.
A process can send a signal to another process, and a process can register its own signal handler to handle a signal. SIGKILL and SIGSTOP are the two signals that cannot be caught.
While a process is executing a signal handler, the same signal is blocked. That means that if another instance of the same signal arrives while the handler is running, the handler is not invoked again (this is called blocking the signal), but the kernel notes that the signal has arrived (a pending signal). Once the running handler has finished, the pending signal is handled. If you do not want the pending signal to be handled, you can ignore the signal.
The problem with the above is the following. Assume process A has registered a signal handler for SIGUSR1:
1) process A gets SIGUSR1 and starts executing signalhandler()
2) process A gets SIGUSR1 again
3) process A gets SIGUSR1 again
4) process A gets SIGUSR1 again
When step (2) occurs, the signal is marked as pending, i.e. it still needs to be served.
When steps (3) and (4) occur, the signals are simply lost, because there is only one bit per signal number to indicate a pending signal.
To avoid this problem, i.e. if we do not want to lose signals, we can use real-time signals, which are queued.
2) A handler can itself be interrupted by a different signal. E.g.:
1) the process is in the middle of the signal handler for SIGUSR1,
2) it now gets another signal, SIGUSR2,
3) it suspends the SIGUSR1 handler and runs the SIGUSR2 handler,
and once it is done with SIGUSR2, it resumes the SIGUSR1 handler.
3) IMHO, what I remember about when the kernel checks whether a signal has arrived for the process is:
1) When a context switch happens.
Hope this helps to some extent.
If you have a multithreaded program (Linux 2.26 kernel), and one thread does something that causes a segfault, will the other threads still be scheduled to run? How are the other threads terminated? Can someone explain the process shutdown procedure with regard to multithreaded programs?
When a fatal signal is delivered to a thread, either the do_coredump() or the do_group_exit() function is called. do_group_exit() sets the thread group exit code and then signals all the other threads in the thread group to exit with zap_other_threads(), before exiting the current thread. (do_coredump() calls coredump_wait() which similarly calls zap_threads()).
zap_other_threads() posts a SIGKILL for every other thread in the thread group and wakes it up with signal_wake_up(). signal_wake_up() calls kick_process(), which will boot the thread into kernel mode so that it can receive the signal, using an IPI (inter-processor interrupt) if necessary (e.g. if it's executing on another CPU).
Will the other thread still be scheduled to run?
No. The SEGV is a process-level issue. Unless you've handled the SEGV (which is almost always a bad idea) your whole process will exit, and all threads with it.
I suspect that the other threads aren't handled very nicely. If the handler calls exit() or _exit(), thread cleanup handlers won't get called. This may be a good thing: if your program is severely corrupted, it's going to be hard to trust much of anything after a seg fault.
One note from the signal man page:
According to POSIX, the behaviour of a process is undefined after it ignores a SIGFPE, SIGILL, or SIGSEGV signal that was not generated by the kill(2) or the raise(3) functions.
After a segfault you really don't want to be doing anything other than getting the heck out of that program.