Accessing shared data from a signal handler

Accessing shared data from a signal handler - multithreading

I want to know whether it is a good idea to access shared data from a signal handler. Consider two scenarios: a multi-process system, and a multithreaded system with a single process. In a multi-process system, let's say the processes handle a particular signal and update a shared variable or shared memory. Can I do that from the signal handler itself?
However, in the case of threads using pthreads, I don't think it is doable: http://maxim.int.ru/bookshelf/PthreadsProgram/htm/r_40.html. The article says that this is not async-signal-safe and suggests using sigwait instead. I am not sure why it is not async-signal-safe. Let's say a thread handles a signal and is in the signal handler routine. It acquires a lock on the shared memory to update it. In the meantime, another signal of the same type arrives, and another thread responsible for handling it executes the signal handler again. The signal handler is the same for the whole process, but it is called multiple times. The second time around, it cannot see the lock and updates/overwrites the data. Is this the issue with multithreaded signal handlers using shared data?
I am a bit confused. In a multi-process system, each process has its own copy of the signal handler. But in a multithreaded system, there is a single copy of the signal handler used by all the threads, isn't there? So when multiple signals of the same type arrive and two threads responsible for handling them try to do so, will both of them execute the same piece of handler code? How does it all fit together?

I read through the article that you reference and found some interesting information in the "Threads in Signal Handlers" section. In that section, you'll see a list of POSIX function calls that can be made from within signal handlers. Soon after that list, they mention the following:
But where are the Pthreads calls? They're not in either of these
lists! In fact, the Pthreads standard specifies that the behavior of
all Pthreads functions is undefined when the function is called from a
signal handler. If your handler needs to manipulate data that is
shared with other threads--buffers, flags, or state variables--it's out
of luck. The Pthreads mutex and condition variable synchronization
calls are off limits.
Notice the last sentence: "The Pthreads mutex and condition variable synchronization calls are off limits."
The aforementioned functions that can be called from a signal handler are described as follows:
These functions have a special property known as reentrancy that
allows a process to have multiple calls to these functions in progress
at the same time.
The pthread synchronization functions don't have this special property known as reentrancy, so I imagine that if these functions (pthread_mutex_lock() for instance) are interrupted by an arriving signal, then the behavior is not "safe".
Imagine that your application calls pthread_mutex_lock(&theMutex) and at exactly that moment (that is, while inside the pthread_mutex_lock() function) a signal arrives. If the signal handler also calls pthread_mutex_lock(&theMutex), the interrupted call may not have finished, so the mutex's internal state can be inconsistent and there is no guarantee which call gets the lock. Worse, if the interrupted thread already holds theMutex, the handler waits for a lock that cannot be released until the handler returns, which is typically a deadlock. So the resulting behavior is undefined/nondeterministic.
I would imagine that calling sigwait() from a dedicated thread (with the signal blocked in the other threads) guarantees that no important, non-reentrant function call gets interrupted: the signal is received synchronously in ordinary thread code, so calls to the pthread synchronization functions are "safe" there.
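As a rough sketch of the sigwait() approach (the names counter_lock, signal_count, signal_thread and run_sigwait_demo are made up for illustration and are not from the article): the signal is blocked before any thread is created, so every thread inherits the mask, and one dedicated thread picks it up with sigwait() and updates the shared data under a normal mutex.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <unistd.h>

/* Shared state protected by an ordinary mutex (illustrative names). */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int signal_count = 0;

/* Dedicated thread: receives SIGUSR1 synchronously through sigwait().
 * This is ordinary thread code, not a signal handler, so locking a
 * mutex here is allowed. */
static void *signal_thread(void *arg)
{
    const sigset_t *set = arg;
    int signo = 0;

    if (sigwait(set, &signo) == 0 && signo == SIGUSR1) {
        pthread_mutex_lock(&counter_lock);
        signal_count++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Blocks SIGUSR1, starts the waiting thread, sends one signal to the
 * process, and returns how many signals the thread recorded. */
int run_sigwait_demo(void)
{
    sigset_t set;
    pthread_t tid;
    int rc, count;

    sigemptyset(&set);
    sigaddset(&set, SIGUSR1);
    /* Block before creating threads so that every thread inherits the mask. */
    rc = pthread_sigmask(SIG_BLOCK, &set, NULL);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, signal_thread, &set);
    assert(rc == 0);

    /* The signal stays pending on the process until sigwait() accepts it. */
    kill(getpid(), SIGUSR1);
    pthread_join(tid, NULL);

    pthread_mutex_lock(&counter_lock);
    count = signal_count;
    pthread_mutex_unlock(&counter_lock);
    return count;
}
```

Calling run_sigwait_demo() once from main() should report a single recorded signal. No asynchronous handler is installed at all here, which is why the mutex calls are not a problem.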

Related

sigwaitinfo() - Is it typically used in thread only?

I am reading a Linux book about signals and wondering what the typical usage model of sigwaitinfo() is.
Since sigwaitinfo() suspends the caller, if I use it in the main flow of the process, the process will stop until a signal arrives. This essentially makes the process function as a signal handler. In many cases, I'd say a process needs to deliver some functionality and at the same time handle some particular signals. In such cases, if I do not want to use a signal handler to handle signals asynchronously, I launch a thread and use sigwaitinfo() in that thread. Is this understanding right?

SIGIO vs epoll for Linux sockets

The socket documentation for linux (man 7 socket) says that you can set your socket to be O_ASYNC and then receive a signal when the socket is ready for read/write.
However, it seems most people use epoll instead. What is the reason for using epoll rather than this asynchronous signaling system?

Having a central loop where you catch all kinds of events makes it very easy to write a single-threaded application, and you don't have to take care of all the synchronization problems that may occur if you are running with different execution contexts.
If you use a signal handler you must take care that you never call a non-reentrant function from the signal handler context. There is a list of Async-signal-safe functions you are allowed to call. And as you can see, it is a short list! As a result your signal handler can not do much, maybe only set a flag or send a message and the real work must be done "somewhere". In fact, signal handlers are very limited.
And using signal handlers in multithreaded applications is also not as easy as it looks at first, as the handler is per process and not per thread. Read more: signal handler function in multithreaded environment
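A minimal sketch of the "only set a flag" pattern mentioned above, assuming the main loop polls the flag between events. The names got_signal, install_flag_handler and consume_signal_flag are hypothetical, not from any library.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* The only object the handler touches. volatile sig_atomic_t is the type
 * the C standard allows a handler to write safely. */
static volatile sig_atomic_t got_signal = 0;

/* Handler: records that the signal happened and nothing else. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Installs the flag-setting handler for signo; returns 0 on success. */
int install_flag_handler(int signo)
{
    struct sigaction sa;

    assert(signo > 0);              /* precondition: a valid signal number */
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;       /* restart interrupted system calls */
    return sigaction(signo, &sa, NULL);
}

/* Called from the main loop, outside signal context. Returns 1 if a signal
 * was seen since the last call. Several signals arriving between two calls
 * are reported as one. */
int consume_signal_flag(void)
{
    if (!got_signal)
        return 0;
    got_signal = 0;
    /* The real work (logging, locking, allocation) would go here. */
    return 1;
}
```

The central loop would call consume_signal_flag() on each iteration and do the heavy lifting there. A self-pipe or signalfd is the usual alternative when the loop sleeps in epoll and needs to be woken up.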

How to log signals sent to an application with a log handler?

There are ways to do some work with Linux signal handlers.
We can either register handlers for every signal (if we have the source code), or
run the process under strace to view them.
Strategy 1:
But if we don't have the source code, how can we catch signals sent to an application, do something with them, and return? (Not one-time debugging but a permanent feature.) [Maybe hack a system call?]
Strategy 2:
And in case we do have the source code, is writing to a file safe in case of multiple signals? Or is it safer to execute the signal handler in a fork()ed process and discard SIGCHLD? What happens if another signal comes in while handling the previous one?

For your Strategy 2, it depends on how your log files are written and how the signals are triggered (asynchronously or not). Normally, stdio library functions are not async-signal-safe.
See details in http://man7.org/linux/man-pages/man7/signal-safety.7.html
To avoid problems with unsafe functions, there are two possible
choices:
1. Ensure that (a) the signal handler calls only async-signal-safe
functions, and (b) the signal handler itself is reentrant with
respect to global variables in the main program.
2. Block signal delivery in the main program when calling functions
that are unsafe or operating on global data that is also accessed
by the signal handler.
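Both choices from the man page can be sketched roughly as follows, assuming a single-threaded program (with threads, pthread_sigmask() would replace sigprocmask()). The names log_signal, install_log_handler, log_handler_runs and run_unsafe_section_blocked are invented for this example, and the raise() call only stands in for a signal arriving at an awkward moment.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t handler_runs = 0;

/* Choice 1: the handler calls only async-signal-safe functions.
 * write(2) is on the safe list; printf() and fprintf() are not. */
static void log_signal(int signo)
{
    static const char msg[] = "signal received\n";
    int saved_errno = errno;        /* write() may change errno */
    ssize_t n;

    (void)signo;
    handler_runs++;
    n = write(STDERR_FILENO, msg, sizeof msg - 1);
    (void)n;
    errno = saved_errno;
}

/* Installs log_signal for signo; returns 0 on success. */
int install_log_handler(int signo)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = log_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(signo, &sa, NULL);
}

/* Number of times the handler has run so far. */
int log_handler_runs(void)
{
    return handler_runs;
}

/* Choice 2: block the signal while the main program runs unsafe code.
 * Returns how many times the handler ran inside the protected section. */
int run_unsafe_section_blocked(int signo)
{
    sigset_t block, old;
    int before, ran_inside, rc;

    sigemptyset(&block);
    sigaddset(&block, signo);
    rc = sigprocmask(SIG_BLOCK, &block, &old);
    assert(rc == 0);

    before = handler_runs;
    raise(signo);                   /* generated now, but stays pending */
    fprintf(stderr, "inside the protected section\n");  /* unsafe stdio call */
    ran_inside = handler_runs - before;

    /* Restoring the old mask lets the pending signal be delivered. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return ran_inside;
}
```

With the handler installed for SIGUSR2, run_unsafe_section_blocked(SIGUSR2) should return 0, and the handler runs once as soon as the old mask is restored.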

Strategy 1: But if we don't have the source code, how can we catch signals sent to an application, do something with them, and return? (Not one-time debugging but a permanent feature.) [Maybe hack a system call?]
To intercept a signal delivered to a process there are at least 2 ways:
ptrace(2) (which is what strace uses); see this answer for an example.
LD_PRELOAD: (I'd not advise this approach) you can use it to set handlers for every signal and replace signal and sigaction with two wrapper functions to prevent the program from overriding your signal handlers (please note the recommendations in this other answer).

Linux/vxworks signals

I came across the following in a vxworks manual and was wondering why this is the case.
What types of things do signals do that make them undesirable?
In applications, signals are most
appropriate for error and exception
handling, and not for a
general-purpose inter-task
communication.

The main issue with signals is that signal handlers are registered on a per process/memory space basis (in vxWorks, the kernel represents one memory space, and each RTP is a different memory space).
This means that regardless of the thread/task context, the same signal handler will get executed (for a given process). This can cause some problems with side-effects if your signal handler is not well behaved.
For example, if your signal handler uses a mutex to protect a shared resource, this could cause nasty problems, or at least unexpected behavior:
Task A             Task B               Signal Handler
Take Mutex
...
Gets preempted
                   does something
                   ....
                   <SIGNAL ARRIVES>---->Take Mutex (blocks)
resumes
....
Give Mutex
                                   ---->Resumes Handler
I'm not sure the example above really conveys what I'm trying to.
Here are some other characteristics of signals:
Handler not executed until the task/process is scheduled. Just because you sent the signal, doesn't mean the handler will execute right away
No guarantee on which Task/Thread will execute the handler. Any thread/task in the process could run it (whichever thread/task executes first). VxWorks has ways around this.
Note that the above only applies to asynchronous signals sent via a kill call.
An exception will generate a synchronous signal which WILL get executed right away in the current context.

How are asynchronous signals handled in Linux?

This seems like a silly question, but I can't find the answer to it anywhere I look. I know that in UNIX, signals are handled asynchronously. If I write a function that handles a signal, where is that function run? Is a new thread spawned? Is an existing thread interrupted somehow? Or is this handled in a system thread like asynchronous I/O is?

A signal function is executed as if a thread in the process has been interrupted. That is, the signal handler is called using the signaled thread and the stack is rearranged so that when the signal handler returns the thread continues execution. No new threads are introduced.
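One way to see this for yourself is the sketch below: a worker thread sends itself a signal with raise(), and the handler records which thread it is running on. The names record_thread, worker and handler_runs_on_signaled_thread are illustrative only. Storing a pthread_t from a handler is tolerable here only because the signal is raised synchronously by the same thread that reads the value afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <signal.h>
#include <string.h>

static pthread_t handler_thread;            /* thread the handler ran on */
static volatile sig_atomic_t handler_ran = 0;

/* Handler: remembers the identity of the thread it interrupted. */
static void record_thread(int signo)
{
    (void)signo;
    handler_thread = pthread_self();
    handler_ran = 1;
}

/* Worker: signals itself, then checks where the handler executed. */
static void *worker(void *arg)
{
    int *same_thread = arg;

    raise(SIGUSR1);     /* the handler returns before raise() returns */
    *same_thread = handler_ran &&
                   pthread_equal(handler_thread, pthread_self());
    return NULL;
}

/* Returns 1 if the handler ran on the worker thread itself, i.e. no new
 * thread was created to run it. */
int handler_runs_on_signaled_thread(void)
{
    struct sigaction sa;
    pthread_t tid;
    int rc, same_thread = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = record_thread;
    sigemptyset(&sa.sa_mask);
    rc = sigaction(SIGUSR1, &sa, NULL);
    assert(rc == 0);

    rc = pthread_create(&tid, NULL, worker, &same_thread);
    assert(rc == 0);
    pthread_join(tid, NULL);
    return same_thread;
}
```

Here the signal is directed at a specific thread. For a signal sent to the whole process with kill(), any thread that does not block it may be the one interrupted.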

An existing thread of the process is interrupted until the function returns. There are serious restrictions on what the handler can safely do, so that it doesn't corrupt the state of function calls the thread was in the middle of: specifically, any function it calls that the thread may already have been executing must be async-signal-safe. See the man pages (e.g. signal, sigaction) for further details, or ask more specific questions as you like.

It's not a separate thread, but your code is hastily suspended. That's why only a limited subset of the POSIX calls is available.
From the signal man page:
The routine handler must be very careful, since processing elsewhere was interrupted at some arbitrary point. POSIX has the concept of "safe function". If a signal interrupts an unsafe function, and handler calls an unsafe function, then the behavior is undefined. Safe functions are listed explicitly in the various standards.
