Why does calling kill(getpid(), SIGUSR1) inside handler for SIGUSR1 loop? - linux

I'm trying to understand what is happening behind the scenes with this code. This was asked at a final exam of an Intro to OS course I'm taking. As I understand it, when returning from kernel mode to user mode, the system checks if there are any pending signals (by examining the signal vector) and tends to them. So as the program returns from the kill syscall the OS sees that SIGUSR1 is pending and invokes the handler. By that logic, the handler should print "stack" in an infinite loop, but when running this code it actually prints "stackoverflow" in an infinite loop. Why is this happening?
Thanks in advance.
void handler(int signo) {
    printf("stack");
    kill(getpid(), SIGUSR1);
    printf("overflow\n");
}

int main() {
    struct sigaction act;
    act.sa_handler = &handler;
    sigaction(SIGUSR1, &act, NULL);
    kill(getpid(), SIGUSR1);
    return 0;
}

You actually have undefined behavior here, as you're calling sigaction with an incompletely initialized struct sigaction object. So depending on what values happen to be in the sa_flags and sa_mask fields, a variety of different things might happen.
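Some of these combinations would not block SIGUSR1 while the signal handler is running, which means a new handler invocation starts immediately when the first one calls kill (before the first handler returns and pops its stack frame). You then end up with more and more recursive handler frames on the stack (and repeated output of "stack") until the stack overflows.
Other combinations would block the signal, so it does not immediately trigger a second handler invocation. Instead the signal stays "pending" until the first handler returns. As soon as it returns, the pending signal is delivered and the handler runs again. Each invocation therefore prints "stack", calls kill, prints "overflow\n" and returns before the next one starts, which gives the endless "stackoverflow" output you observe.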
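To make the two cases concrete, here is a minimal sketch that fully initializes the struct sigaction and measures how deeply the handler nests. The helper run_demo() and the three counters are hypothetical names introduced for illustration, and the handler re-sends the signal only twice so that the demo terminates, unlike the exam code. It assumes a single-threaded process.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t depth;      /* handler frames currently on the stack */
static volatile sig_atomic_t max_depth;  /* deepest nesting observed */
static volatile sig_atomic_t runs;       /* total handler invocations */

static void handler(int signo)
{
    depth++;
    if (depth > max_depth)
        max_depth = depth;
    runs++;
    if (runs < 3)                  /* bounded on purpose, unlike the exam code */
        kill(getpid(), signo);     /* nests at once, or stays pending if blocked */
    depth--;
}

/* Hypothetical helper: installs the handler with the given sa_flags, sends
 * SIGUSR1 once, and returns the maximum nesting depth seen (-1 on error). */
int run_demo(int flags)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);   /* no indeterminate fields */
    sigemptyset(&act.sa_mask);
    act.sa_flags = flags;
    act.sa_handler = handler;

    depth = 0;
    max_depth = 0;
    runs = 0;
    if (sigaction(SIGUSR1, &act, NULL) == -1)
        return -1;
    kill(getpid(), SIGUSR1);
    return max_depth;
}
```

With flags set to 0 the signal is blocked inside the handler, so the three invocations run one after another and the depth stays at 1. With SA_NODEFER each kill() starts a nested invocation before the previous one has returned, so the depth grows with every send.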

Related

When and why should you use WNOHANG with waitpid()?

I'm currently in a systems programming class and we went over the wait system call functions today. I was reading over the section on waitpid() system call and in the options section it lists one called WNOHANG.
pid_t waitpid(pid_t pid, int *status, int options);
WNOHANG: If no child specified by pid (from the parameters) has yet changed state, then return immediately, instead of blocking. In this case, the return value of waitpid() is 0. If the calling process has no children that match the specification in pid, waitpid() fails with the error ECHILD.
I understand waitpid() was implemented to solve the limitations in wait(); however, I'm not really sure about why you would use the WNOHANG option flag.
If I were to hazard a guess, it would be so that the parent process can perform other tasks and periodically check on its children to see if any of them have terminated. Sort of how a daemon process sits in the background and waits for requests.
Any situational examples or regular examples would help as well.
Thanks in advance!
You don't need to keep checking on children. That is the job of the SIGCHLD signal handler. Every time the handler fires, you reap the terminated children:
pid_t pid;
int status;
while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
{
    // process terminated child
}
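For context, the loop above might sit inside a handler like the following sketch. The names on_sigchld, reaped and spawn_and_reap are hypothetical, and the polling loop at the end only stands in for whatever real work the parent would be doing. WNOHANG matters here because several children can exit before the handler runs once, so the handler has to drain all of them without ever blocking.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;   /* children collected so far */

static void on_sigchld(int signo)
{
    int saved_errno = errno;           /* waitpid() may overwrite errno */

    (void)signo;
    /* Several exits can collapse into one SIGCHLD, so reap until none is left. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: forks n children that exit immediately and returns
 * how many were reaped by the handler (-1 on error). */
int spawn_and_reap(int n)
{
    struct sigaction sa;
    struct timespec tick = { 0, 1000000 };   /* 1 ms */

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    sa.sa_handler = on_sigchld;
    if (sigaction(SIGCHLD, &sa, NULL) == -1)
        return -1;

    reaped = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1)
            return -1;
        if (pid == 0)
            _exit(0);                  /* child: terminate right away */
    }
    while (reaped < n)
        nanosleep(&tick, NULL);        /* stand-in for the parent's real work */
    return reaped;
}
```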

Why does it terminate even though I used signal(SIGINT, sig_int)?

As you can see, this is a sample from APUE.
#include "apue.h"
static void sig_int(int sig);
int main(int argc, char **argv)
{
char buf[MAXLINE];
pid_t pid;
int status;
if (signal(SIGINT, sig_int) == SIG_ERR) //sig_int is a simple handler function
err_sys("signal error");
printf("%% ");
while (fgets(buf, MAXLINE, stdin) != NULL) {
//This is a loop to implement a simple shell
}
return 0;
}
This is the signal handler
void sig_int(int sig)
/*When I enter Ctrl+C, It'll say a got SIGSTOP, but it would terminate.*/
{
if (sig == SIGINT)
printf("got SIGSTOP\n");
}
When I press Ctrl+C, it prints "got SIGSTOP", but then the program terminates immediately.
The short version is that the signal interrupts the current system call. You're doing fgets(), which likely now blocks in a read() system-call. The read() call is interrupted, it returns -1 and sets errno to EINTR.
This causes fgets to return NULL, your loop ends, and the program is finished.
Some background
glibc on Linux implements two different behaviors for signal(): one where system calls are automatically restarted across signals, and one where they are not.
When a signal arrives while the process is blocked in a system call, the system call is interrupted ("cancelled"). Execution resumes in the user space application, and the signal handler runs. The interrupted system call will return an error, and set errno to EINTR.
What happens next depends on whether system calls are restarted or not across signals.
If system calls are restartable, the system call is simply retried and your code never sees the EINTR (on Linux the kernel does the restart transparently). For the read() system call, the effect is similar to read() being implemented as:
ssize_t read(int fd, void *buf, size_t len)
{
    ssize_t sz;
    while ((sz = syscall_read(fd, buf, len)) == -1
           && errno == EINTR);
    return sz;
}
If system calls are not automatically restarted, read() would behave similar to:
ssize_t read(int fd, void *buf, size_t len)
{
    ssize_t sz;
    sz = syscall_read(fd, buf, len);
    return sz;
}
In the latter case it is up to your application to check whether read() failed because it was interrupted by a signal. It is up to you to determine whether read() just failed temporarily because a signal was handled, and it is up to you to retry the read() call.
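A minimal sketch of such an application-level retry, assuming you call read() directly. The wrapper name read_retry is hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: behaves like read(), but repeats the call whenever
 * it fails only because a signal handler interrupted it. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}
```

Any other failure is passed through unchanged, so the caller still has to handle real errors and end-of-file (a return value of 0).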
signal vs sigaction
By using sigaction() instead of signal(), you get control over whether system calls are restarted or not. The relevant flag you specify with sigaction() is
SA_RESTART
Provide behavior compatible with BSD signal semantics by making certain system calls restartable across signals. This flag is meaningful only when establishing a signal handler. See signal(7) for a discussion of system call restarting.
BSD vs SVR4 semantics
If you use signal(), it depends on what semantics you want. As seen in the description of SA_RESTART, if it is BSD signal semantics, system calls are restarted. This is the default behavior in glibc.
Another difference is that BSD semantics leave the signal handler installed by signal() installed after a signal is handled. SVR4 semantics uninstalls the signal handler, and your signal handler will have to re-install the handler if you want to catch more signals.
apue.h
However, "apue.h" defines the macro _XOPEN_SOURCE as 600 before including <signal.h>. This gives signal() SVR4 semantics, where system calls are not restarted, which is what makes your fgets() call "fail".
Don't use signal(), use sigaction()
Due to all these differences in behavior, use sigaction() instead of signal. sigaction() lets you control what happens instead of having the semantics change based on a (possibly) hidden #define as is the case with signal()
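As a sketch of what that could look like, the helper below makes the restart choice explicit at the call site. The name install_handler and its restart parameter are hypothetical; the sigaction() fields and the SA_RESTART flag are the standard ones.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

/* Hypothetical helper: installs handler for signo. A nonzero restart asks for
 * interrupted system calls to be restarted (SA_RESTART); zero makes them fail
 * with EINTR. Returns 0 on success, -1 on error, like sigaction() itself. */
int install_handler(int signo, void (*handler)(int), int restart)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);          /* block nothing extra while handling */
    sa.sa_handler = handler;
    sa.sa_flags = restart ? SA_RESTART : 0;
    return sigaction(signo, &sa, NULL);
}
```

In the APUE sample, replacing the signal() call with install_handler(SIGINT, sig_int, 1) should keep fgets() waiting for input after Ctrl+C, regardless of which feature-test macros "apue.h" defines.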

How do I "disengage" from `accept` on a blocking socket when signalled from another thread?

I am in the same situation as this guy, but I don't quite understand the answer.
The problem:
Thread 1 calls accept on a socket, which is blocking.
Thread 2 calls close on this socket.
Thread 1 continues blocking. I want it to return from accept.
The solution:
what you should do is send a signal to the thread which is blocked in
accept. This will give it EINTR and it can cleanly disengage - and
then close the socket. Don't close it from a thread other than the one
using it.
I don't get what to do here -- when the signal is received in Thread 1, accept is already blocking, and will continue to block after the signal handler has finished.
What does the answer really mean I should do?
If the Thread 1 signal handler can do something which will cause accept to return immediately, why can't Thread 2 do the same without signals?
Is there another way to do this without signals? I don't want to increase the caveats on the library.
Instead of blocking in accept(), block in select(), poll(), or one of the similar calls that allows you to wait for activity on multiple file descriptors and use the "self-pipe trick". All of the file descriptors passed to select() should be in non-blocking mode. One of the file descriptors should be the server socket that you use with accept(); if that one becomes readable then you should go ahead and call accept() and it will not block. In addition to that one, create a pipe(), set it to non-blocking, and check for the read side becoming readable. Instead of calling close() on the server socket in the other thread, send a byte of data to the first thread on the write end of the pipe. The actual byte value doesn't matter; the purpose is simply to wake up the first thread. When select() indicates that the pipe is readable, read() and ignore the data from the pipe, close() the server socket, and stop waiting for new connections.
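A rough sketch of the accepting side of this self-pipe approach, using poll(). The function name wait_for_client and the STOP_REQUESTED sentinel are hypothetical, and both descriptors are assumed to already be in non-blocking mode:

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

#define STOP_REQUESTED (-2)   /* hypothetical sentinel: the other thread woke us */

/* Waits until a connection can be accepted on listen_fd or a byte arrives on
 * wake_fd (the read end of the pipe). Returns the accepted descriptor,
 * STOP_REQUESTED, or -1 on error. */
int wait_for_client(int listen_fd, int wake_fd)
{
    struct pollfd fds[2] = {
        { .fd = listen_fd, .events = POLLIN },
        { .fd = wake_fd,   .events = POLLIN },
    };

    for (;;) {
        if (poll(fds, 2, -1) == -1) {
            if (errno == EINTR)
                continue;              /* a signal arrived; wait again */
            return -1;
        }
        if (fds[1].revents & (POLLIN | POLLHUP)) {
            char byte;
            ssize_t n = read(wake_fd, &byte, 1);   /* drain; the value is irrelevant */
            (void)n;
            return STOP_REQUESTED;
        }
        if (fds[0].revents & POLLIN) {
            int client = accept(listen_fd, NULL, NULL);
            if (client >= 0)
                return client;
            if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
                return -1;
        } else if (fds[0].revents & (POLLERR | POLLHUP | POLLNVAL)) {
            return -1;
        }
    }
}
```

The other thread then requests shutdown by writing a single byte to the write end of the pipe. When wait_for_client() returns STOP_REQUESTED, the accepting thread closes the server socket itself.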
The accept() call will return with error code EINTR if a signal is caught before a connection is accepted. So check the return value and error code then close the socket accordingly.
If you wish to avoid the signal mechanism altogether, use select() to determine if there are any incoming connections ready to be accepted before calling accept(). The select() call can be made with a timeout so that you can recover and respond to abort conditions.
I usually call select() with a timeout of 1000 to 3000 milliseconds from a while loop that checks for an exit/abort condition. If select() returns with a ready descriptor I call accept() otherwise I either loop around and block again on select() or exit if requested.
Call shutdown() from Thread 2. accept will return with "invalid argument".
This seems to work but the documentation doesn't really explain its operation across threads -- it just seems to work -- so if someone can clarify this, I'll accept that as an answer.
Just close the listening socket, and handle the resulting error or exception from accept().
I believe signals can be used without increasing "the caveats on the library". Consider the following:
#include <pthread.h>
#include <signal.h>
#include <stddef.h>
static pthread_t thread;
static volatile sig_atomic_t sigCount;
/**
* Executes a concurrent task. Called by `pthread_create()`.
*/
static void* startTask(void* arg)
{
for (;;) {
// calls to `select()`, `accept()`, `read()`, etc.
}
return NULL;
}
/**
* Starts concurrent task. Doesn't return until the task completes.
*/
void start()
{
(void)pthread_create(&thread, NULL, startTask, NULL);
(void)pthread_join(thread, NULL);
}
static void noop(const int sig)
{
sigCount++;
}
/**
* Stops concurrent task. Causes `start()` to return.
*/
void stop()
{
struct sigaction oldAction;
struct sigaction newAction;
(void)sigemptyset(&newAction.sa_mask);
newAction.sa_flags = 0;
newAction.sa_handler = noop;
(void)sigaction(SIGTERM, &newAction, &oldAction);
(void)pthread_kill(thread, SIGTERM); // system calls return with EINTR
(void)sigaction(SIGTERM, &oldAction, NULL); // restores previous handling
if (sigCount > 1) // externally-generated SIGTERM was received
oldAction.sa_handler(SIGTERM); // call previous handler
sigCount = 0;
}
This has the following advantages:
It doesn't require anything special in the task code other than normal EINTR handling; consequently, it makes reasoning about resource leakage easier than using pthread_cancel(), pthread_cleanup_push(), pthread_cleanup_pop(), and pthread_setcancelstate().
It doesn't require any additional resources (e.g. a pipe).
It can be enhanced to support multiple concurrent tasks.
It's fairly boilerplate.
It might even compile. :-)

Segmentation fault within segmentation fault handler

Is there some defined behaviour for segmentation faults which happen within a segmentation fault handler under Linux?
Will there be another call to the same handler? If so, is that defined on all platforms, and so on?
Thank you.
The answer depends on how you installed your signal handler. If you installed it using the deprecated signal() call, then the system will either reset the signal's disposition to the default or block the signal being handled before calling your handler. If it blocked the signal, it will unblock it after your handler returns.
If you use sigaction(), you have control over which signals get blocked while your signal handler is being called. If you so specify, it is possible to cause infinite recursion.
It is possible to implement a safe wrapper around sigaction() that has an API similar to signal():
sighandler_t safe_signal (int sig, sighandler_t h) {
    struct sigaction sa;
    struct sigaction osa;
    sa.sa_handler = h;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(sig, &sa, &osa) < 0) {
        return SIG_ERR;
    }
    return osa.sa_handler;
}
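With an empty sa_mask and no SA_NODEFER flag, this blocks only the signal being handled for the duration of the signal handler call; the previous signal mask is restored after the handler returns. To block all signals while the handler runs, fill the mask with sigfillset() instead of sigemptyset().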
From the C11 standard, 7.14.1.1,
When a signal occurs and func points to a function, it is
implementation-defined whether the equivalent of signal(sig, SIG_DFL);
is executed or the implementation prevents some implementation-defined
set of signals (at least including sig) from occurring until the
current signal handling has completed;
So the Standard says it is implementation-defined whether recursive calls of the same signal handler are allowed. I would therefore conclude that the behaviour is defined, but implementation-defined!
But it's a total mess if a segfault handler is itself segfaulting :)

Thread, ansi c signal and Qt

I'm writing a multithreaded, plugin-based application. I will not be the author of the plugins, so I would like to prevent the main application from crashing because of a segmentation fault in a plugin. Is that possible? Or does a crash in a plugin inevitably compromise the state of the main application as well?
I wrote a sketch program using Qt because my "real" application is heavily based on the Qt library. As you can see, I forced the thread to crash by calling the trimmed function on an unallocated QString. The signal handler is called correctly, but after the thread is forced to quit, the main application crashes too. Did I do something wrong? Or, as I said before, is what I'm trying to do not achievable?
Please note that in this simplified version of the program I avoided plugins and used only a thread. Introducing plugins will add another level of difficulty, I suppose. I want to go step by step and, above all, I want to understand whether my goal is feasible. Thanks a lot for any help or suggestions.
#include <QString>
#include <QThread>
#include<csignal>
#include <QtGlobal>
#include <QtCore/QCoreApplication>
class MyThread : public QThread
{
public:
static void sigHand(int sig)
{
qDebug("Thread crashed");
QThread* th = QThread::currentThread();
th->exit(1);
}
MyThread(QObject * parent = 0)
:QThread(parent)
{
signal(SIGSEGV,sigHand);
}
~MyThread()
{
signal(SIGSEGV,SIG_DFL);
qDebug("Deleted thread, restored default signal handler");
}
void run()
{
QString* s;
s->trimmed();
qDebug("Should not reach this point");
}
};
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
MyThread th(&a);
th.run();
while (th.isRunning());
qDebug("Thread died but main application still on");
return a.exec();
}
I'm currently working on the same issue and found this question via google.
There are several reasons your source is not working:
There is no new thread. The thread is only created, if you call QThread::start. Instead you call MyThread::run, which executes the run method in the main thread.
You call QThread::exit to stop the thread, which is not supposed to directly stop a thread, but sends a (qt) signal to the thread event loop, requesting it to stop. Since there is neither a thread nor an event loop, the function has no effect. Even if you had called QThread::start, it would not work, since writing a run method does not create a qt event loop. To be able to use exit with any QThread, you would need to call QThread::exec first.
However, QThread::exit is the wrong method anyway. To keep the SIGSEGV from recurring, the thread must be stopped immediately, not after it receives the (qt) signal in its event loop. So although it is generally frowned upon, QThread::terminate would have to be called in this case.
But it is generally considered unsafe to call complex functions like QThread::currentThread, QThread::exit or QThread::terminate from signal handlers, so you should never call them there.
Since the thread is still running after the signal handler (and I'm not sure even QThread::terminate would kill it fast enough), the signal handler exits to where it was called from, so it reexecutes the instruction causing the SIGSEGV, and the next SIGSEGV occurs.
Therefore I have used a different approach: the signal handler changes the register containing the instruction address to point to another function, which will then be run after the signal handler exits, instead of the crashing instruction. Like:
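// Platform-specific: shown for 32-bit x86 with glibc (REG_EIP needs _GNU_SOURCE;
// x86-64 uses REG_RIP). Needs <signal.h>, <ucontext.h> and <cstring>.
void signalHandler(int type, siginfo_t* si, void* ccontext) {
    // Overwrite the saved instruction pointer so that execution resumes in
    // recoverFromCrash instead of at the faulting instruction.
    ucontext_t* uc = static_cast<ucontext_t*>(ccontext);
    uc->uc_mcontext.gregs[REG_EIP] = reinterpret_cast<greg_t>(&recoverFromCrash);
}

struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = &signalHandler;
sigaction(SIGSEGV, &sa, 0);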
The recoverFromCrash function is then normally called in the thread causing the SIGSEGV. Since the signal handler is called for all SIGSEGV, from all threads, the function has to check which thread it is running in.
However, I did not consider it safe to simply kill the thread, since other things might depend on that thread still running. So instead of killing it, I let it run in an endless loop (calling sleep to avoid wasting CPU time). Then, when the program is closed, it sets a global variable, and the thread is terminated. (Note that the recover function must never return, since otherwise execution would return to the function which caused the SIGSEGV.)
Called from the main thread, on the other hand, it starts a new event loop to keep the program running.
if (QThread::currentThread() != QCoreApplication::instance()->thread()) {
//sub thread
QThread* t = QThread::currentThread();
while (programIsRunning) ThreadBreaker::sleep(1);
ThreadBreaker::forceTerminate();
} else {
//main thread
while (programIsRunning) {
QApplication::processEvents(QEventLoop::AllEvents);
ThreadBreaker::msleep(1);
}
exit(0);
}
ThreadBreaker is a trivial wrapper class around QThread, since msleep, sleep and setTerminationEnabled (which has to be called before terminate) of QThread are protected and could not be called from the recover function.
But this is only the basic picture. There are a lot of other things to worry about: Catching SIGFPE, Catching stack overflows (check the address of the SIGSEGV, run the signal handler in an alternate stack), have a bunch of defines for platform independence (64 bit, arm, mac), show debug messages (try to get a stack trace, wonder why calling gdb for it crashes the X server, wonder why calling glibc backtrace for it crashes the program)...
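For the alternate-stack point, a minimal sketch in C using sigaltstack() and SA_ONSTACK is shown below. The names install_alt_stack_handler, on_fault and ran_on_alt_stack are hypothetical, the 64 KiB stack size is an assumption, and the handler only records where it ran instead of doing any recovery. It can be exercised with an ordinary signal such as SIGUSR1, so no real crash is needed to try it.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define ALT_STACK_SIZE (64 * 1024)   /* assumption: plenty for a small handler */

static char *alt_stack;                          /* memory for the alternate stack */
static volatile sig_atomic_t ran_on_alt_stack;   /* set to 1 by the handler */

static void on_fault(int signo)
{
    char marker;                                 /* lives on the handler's stack */
    uintptr_t here = (uintptr_t)&marker;
    uintptr_t base = (uintptr_t)alt_stack;

    (void)signo;
    ran_on_alt_stack = (here >= base && here < base + ALT_STACK_SIZE);
}

/* Hypothetical helper: registers an alternate stack for the calling thread and
 * installs on_fault for signo so that it runs there. Returns 0 or -1. */
int install_alt_stack_handler(int signo)
{
    stack_t ss;
    struct sigaction sa;

    alt_stack = malloc(ALT_STACK_SIZE);
    if (alt_stack == NULL)
        return -1;

    memset(&ss, 0, sizeof ss);
    ss.ss_sp = alt_stack;
    ss.ss_size = ALT_STACK_SIZE;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1)
        return -1;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK;                    /* run the handler on the alternate stack */
    sa.sa_handler = on_fault;
    return sigaction(signo, &sa, NULL);
}
```

Without SA_ONSTACK, a SIGSEGV caused by stack overflow cannot be handled at all, because the handler's own frame would need the stack that has just run out. The alternate stack is per thread, so each thread that may overflow needs its own sigaltstack() call.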
