Where are Linux signals sent or processed inside the kernel?

How is the signalling (interrupt) mechanism handled in the kernel? The reason I ask: my application somehow receives a SIGABRT signal, and I want to find out where it comes from.

You should be looking in your application for the cause, not in the kernel.
Usually a process receives SIGABRT when it directly calls abort or when an assert fails. Finding exactly the piece of the kernel that delivers the signal will gain you nothing.
In conclusion, your code or a library your code is using is causing this. See abort(3) and assert.

cnicutar's answer is the best guess IMHO.
It is possible that the signal was sent by another process, although in the case of SIGABRT it is most likely sent by the same process that receives it, via the abort(3) libc function.
If in doubt, you can run your application with strace -e trace=kill,tkill,tgkill yourapp your args ... to quickly check whether a signal-sending system call is indeed invoked from within your program or its dependent libraries (glibc's abort(3) raises the signal with tgkill rather than kill, so tracing kill alone can miss it). Or use gdb's catch syscall.
Note that in some cases the kernel itself can emit signals, such as a SIGKILL when the infamous "OOM killer" goes into action.
BTW, signals are delivered asynchronously; they disrupt the normal flow of your program, which is why they're painful to trace. Apart from machinery such as SystemTap, I don't know how to trace or log signal emission and delivery within the kernel.

Related

Can a Linux process/thread terminate without passing through do_exit()?

To verify the behavior of a third-party software package distributed as a binary that I'd like to use, I'm implementing a kernel module whose objective is to keep track of each child this software creates and terminates.
The target binary is produced by Golang, and it is heavily multithreaded.
The kernel module I wrote installs hooks on the kernel functions _do_fork() and do_exit() to keep track of each process/thread this binary produces and terminates.
The LKM works, more or less.
Under some conditions, however, I hit a scenario I'm not able to explain.
It seems like a process/thread could terminate without passing through do_exit().
The evidence I collected by putting printk() shows the process creation but does not indicate the process termination.
I'm aware that printk() can be slow, and I'm also aware that messages can be lost in such situations.
Trying to prevent message loss due to slow console (for this particular application, serial tty 115200 is used), I tried to implement a quicker console, and messages have been collected using netconsole.
This setup seems to confirm that a process can terminate without passing through the do_exit() function.
But because I wasn't sure my messages couldn't be lost in the printk() infrastructure, I decided to repeat the same test, replacing printk() with ftrace_printk(), which should be a leaner alternative to printk().
Still the same result: occasionally I see processes not passing through do_exit(), and when I check whether the PID is still running, I find that it is not.
Also note that I put my hook in the do_exit() kernel function as the first instruction to ensure the function flow does not terminate inside a called function.
My question is then the following:
Can a Linux process terminate without its flow passing through the do_exit() function?
If so, can someone give me a hint of what this scenario can be?
After a long debug session, I'm finally able to answer my own question.
That's not all; I'm also able to explain why I saw the strange behavior I described in my scenario.
Let's start from the beginning: while monitoring a heavily multithreaded application, I observed rare cases where a PID suddenly stopped existing without its flow being observed passing through the Linux kernel's do_exit() function.
Hence my original question:
Can a Linux process terminate without passing through the do_exit() function?
To the best of my current knowledge, which I would by now consider reasonably extensive, a Linux process cannot end its execution without passing through the do_exit() function.
But this answer is in contrast with my observations, and the problem leading me to this question is still there.
Someone here suggested that the strange behavior I saw was due to my observations being somehow wrong, implying that my method was inaccurate, and so were my conclusions.
My observations were correct, and the process I watched didn't pass through the do_exit() but terminated.
To explain this phenomenon, I want to put on the table another question that I think internet searchers may find somehow useful:
Can two processes share the same PID?
If you'd asked me this a month ago, I would surely have answered: "definitely not, two processes cannot share the same PID."
Linux is more complex, though.
There's a situation in which, in a Linux system, two different processes can share the same PID!
https://elixir.bootlin.com/linux/v4.19.20/source/fs/exec.c#L1141
Surprisingly, this behavior does not harm anyone; when this happens, one of these two processes is a zombie.
updated to correct an error
The circumstances of this duplicate PID are more intricate than those described previously. They arise when a thread of a multithreaded process calls execve(). To execute the new program text, the kernel must first call the flush_old_exec() function, which in turn calls de_thread(). de_thread() eliminates every thread in the process other than the one calling execve(). If the calling thread is not the thread-group leader, its PID is changed to that of the leader, and until the old leader has been fully released (it lingers as a zombie in the meantime), both entities are associated with that PID. The surviving thread keeps using the leader's PID from then on.
end of the update
That was what I was watching: the PID I was monitoring did not pass through do_exit() because, by the time the corresponding thread terminated, it no longer had the PID it started with but its leader's.
For people who know the Linux kernel's mechanics very well, this is nothing surprising; this behavior is intended and hasn't changed since 2.6.17.
The current version, 5.10.3, still behaves this way.
I hope this is useful to internet searchers; I'd also like to add that it answers the following questions:
Question: Can a Linux process/thread terminate without passing through do_exit()? Answer: No, do_exit() is the only way a process can end its execution, whether intentionally or unintentionally.
Question: Can two processes share the same PID? Answer: Normally not. There are some rare cases in which two schedulable entities have the same PID.
Question: Does the Linux kernel have scenarios where a process changes its PID? Answer: Yes, there's at least one scenario where a process changes its PID.
Can a Linux process terminate without its flow passing through the do_exit() function?
Probably not, but you should study the source code of the Linux kernel to be sure. Ask on KernelNewbies. Kernel threads and udev or systemd related things (or perhaps modprobe or the older hotplug) are probable exceptions. When your /sbin/init of pid 1 terminates (that should not happen) strange things would happen.
The LKM works, more or less.
What does that mean? How could a kernel module half-work?
And in real life, it does sometimes happen that your Linux kernel panics or crashes (and it could happen with your LKM, if it has not been peer-reviewed by the Linux kernel community). In such a case, there is no longer any notion of processes, since they are an abstraction provided by a living Linux kernel.
See also dmesg(1), strace(1), proc(5), syscalls(2), ptrace(2), clone(2), fork(2), execve(2), waitpid(2), elf(5), credentials(7), pthreads(7)
Look also inside the source code of your libc, e.g. GNU libc or musl-libc
Of course, see Linux From Scratch and Advanced Linux Programming
And verifying if the PID is currently running,
This can be done in user land with /proc/, or by using kill(2) with signal 0 (and maybe also pidfd_send_signal(2)...)
PS. I still don't understand why you need to write a kernel module or change the kernel code. My intuition would be to avoid doing that when possible.

How to log signals sent to an application with a log handler?

There are ways to do some work with Linux signal handlers.
We can either register handlers for every signal (if we have the source code), or
run the process under strace to view them.
Strategy 1:
But if we don't have the source code, how can we catch signals sent to an application, do something with them, and return? (Not one-time debugging but a permanent feature.) [Maybe hack a system call?]
Strategy 2:
And in case we do have the source code, is writing to a file safe in the case of multiple signals? Or is it safer to execute the signal handler in a fork()ed process and discard SIGCHLD? What happens if another signal comes in while a previous signal is being handled?
For your Strategy 2, it depends on how your log files are written and how the signals are triggered (asynchronously or not). Normally, stdio library functions are not async-signal-safe.
See details in http://man7.org/linux/man-pages/man7/signal-safety.7.html
To avoid problems with unsafe functions, there are two possible
choices:
1. Ensure that (a) the signal handler calls only async-signal-safe
functions, and (b) the signal handler itself is reentrant with
respect to global variables in the main program.
2. Block signal delivery in the main program when calling functions
that are unsafe or operating on global data that is also accessed
by the signal handler.
Strategy 1: But if we don't have the source code, how can we catch signals sent to an application, do something with them, and return? (Not one-time debugging but a permanent feature.) [Maybe hack a system call?]
To intercept a signal delivered to a process there are at least 2 ways:
ptrace(2) (which is what strace uses) see this answer for an example.
LD_PRELOAD: (I'd not advise this approach) you can use it to set handlers for every signal and replace signal and sigaction with two wrapper functions to prevent the program from overriding your signal handlers (please note the recommendations in this other answer).

When a multithreaded program receives a SIGPIPE signal because of a send, which thread handles the signal in Linux?

If a send causes a SIGPIPE signal, which thread handles it? The thread that called send, or a random thread? In other words, does the Linux system send the signal with kill or with pthread_kill?
Asynchronous signals like SIGPIPE can go to any thread. You can use signal masks to limit which of the threads is eligible.
Synchronous signals like SIGSEGV will be delivered on the thread that caused them.
Summary
The answer to this question has two facets: How the system in question should behave and how it actually behaves.
Since most programmers expect Linux to be mostly POSIX-compatible, we can look into that standard, which actually unambiguously specifies the behavior – the signal is sent directly to the thread which did the write. But whether Linux adheres to it is unclear and Linux documentation is not helpful here. An examination of Linux behavior suggests it conforms to POSIX, but doesn't prove it, and a reading of the source gives us the necessary proof about the current version of Linux.
tl;dr: It is always handled by the thread that did the write.
POSIX Standard
The POSIX standard mandates (since IEEE Std. 1003.1-2001/Cor 2-2004) that SIGPIPE generated as a result of write to a pipe with no readers be delivered to the thread doing the write. See EPIPE in the ERRORS section of the description of write() (emphasis mine):
[EPIPE] An attempt is made to write to a pipe or FIFO that is not open for reading by any process, or that only has one end open. A SIGPIPE signal shall also be sent to the thread.
Linux documentation
That said, it is not clear whether Linux handles this correctly. The page man 7 signal doesn't give concrete lists of thread- and process-directed signals, just examples, and its definition of thread-directed signals doesn't include SIGPIPE:
A signal may be thread-directed because it was generated as a consequence of executing a specific machine-language instruction that triggered a hardware exception […]
SIGPIPE is not a result of a specific instruction, nor is it triggered by a hardware exception.
Glibc documentation doesn't discuss kernel-generated synchronous thread-directed signals at all (i.e. not even SIGSEGV or SIGBUS are discussed as being thread-directed), and there are years-old reports of bugs in NPTL, although these may have been fixed in the meantime.
Observable Linux behavior
I wrote a program which spawns a thread, which blocks SIGPIPE using pthread_sigmask, creates a pipe pair, closes the read end and writes a byte into the write end. If the signal is thread-directed, nothing should happen until the signal is unblocked again. If the signal is process-directed, the main thread should handle the signal and the process should die. The reason for this again comes from POSIX: If there is a thread which has the (process-directed) signal unblocked, it should be delivered there instead of queueing:
Signals generated for the process shall be delivered to exactly one of those threads within the process which […] has not blocked delivery of the signal. If […] all threads within the process block delivery of the signal, the signal shall remain pending on the process until […] a thread unblocks delivery of the signal, or the action associated with the signal is set to ignore the signal.
My experimentation suggests that on modern (2020) Linux with recent Glibc the signal is indeed directed to the thread which did the write, because blocking it with pthread_sigmask in the writing thread prevents SIGPIPE from being delivered until it's unblocked.
Linux 5.4.28 source
The behavior observed above doesn't prove anything, because it is entirely possible that Linux simply violates POSIX in several places and the signal delivery depends on some factors I didn't take into account. To get the proof we seek, we can read the source. Of course, this only tells us about the current behavior, not about the intended one – but if we find the current behavior to be POSIX-conforming, it is probably here to stay.
Disclaimer: I'm not a kernel hacker and the following is a result of a cursory reading of the sources. I might have missed something important.
In kernel/signal.c, there is a SYNCHRONOUS_MASK listing the synchronous signals which are handled specially. These are SIGSEGV, SIGBUS, SIGILL, SIGTRAP, SIGFPE and SIGSYS – SIGPIPE is not in the list. However, that doesn't answer the question – it can be thread-directed without being synchronous.
So how is SIGPIPE sent? It originates from pipe_write() in fs/pipe.c, which calls send_sig() on task_struct current. The use of current already hints that the signal is thread-directed, but let's press on. The send_sig() function is defined in kernel/signal.c and through some indirection ultimately calls __send_signal() with pid_type type = PIDTYPE_PID.
In Linux terminology, PID refers to a single thread. And sure enough, with those parameters, the pending signal list is the thread-specific one, not the shared one; and complete_signal() (called at the end of the function) doesn't even try to find a thread to wake up, it just returns because the thread has already been chosen. I don't fully understand how the signal queues work, but it seems that the queue is per-thread and so the current thread is the one that gets the signal.

linux: finding cause of realtime signal

I have a linux test prog with some custom drivers. When running, my prog will suddenly exit with "realtime signal 5". The core dump shows the signal being handled by a thread that was in a nanosleep call, so I guess it's an asynchronous signal coming from somewhere.
Can anyone recommend a strategy for tracking down the origin of the signal? eg are there specific functions in the kernel I can add some logging to (send_sig maybe?). Thanks.

When does a process handle a signal

I want to know when a Linux process handles a signal.
Assuming that the process has installed a signal handler for a signal, I want to know when the process's normal execution flow is interrupted and the signal handler is called.
According to http://www.tldp.org/LDP/tlk/ipc/ipc.html, the process would handle the signal when it exits from a system call. This would mean that a normal instruction like a = b+c (or its equivalent machine code) would not be interrupted because of signal.
Also, there are system calls which get interrupted (and fail with EINTR or get restarted) upon receiving a signal. This means that the signal is processed even before the system call completes. This behaviour seems to conflict with what I mentioned in the previous paragraph.
So, I am not clear as to when the signal is processed and in which process states it would be handled by the process. Can the process be interrupted:
Anytime it enters from kernel space to user space, or
Anytime it is in user space, or
Anytime the process is scheduled for execution by the scheduler
Thanks!
According to http://www.tldp.org/LDP/tlk/ipc/ipc.html, the process would handle the signal when it exits from a system call. This would mean that a normal instruction like a = b+c (or its equivalent machine code) would not be interrupted because of signal.
Well, if that were the case, a CPU-intensive process would not obey the process scheduler. The scheduler, in fact, can interrupt a process at any point of time when its time quantum has elapsed. Unless it is a FIFO real-time process.
A more correct definition: One point when a signal is delivered to the process is when the control flow leaves the kernel mode to resume executing user-mode code. That doesn't necessarily involve a system call.
A lot of the semantics of signal handling are documented (for Linux, anyway - other OSes probably have similar, but not necessarily in the same spot) in the section 7 signal manual page, which, if installed on your system, can be accessed like this:
man 7 signal
If manual pages are not installed, online copies are pretty easy to find.
