Saving gmon.out before killing a process - linux

I would like to use gprof to profile a daemon. My daemon uses a 3rd party library, with which it registers some callbacks, then calls a main function, that never returns. I need to call kill (either SIGTERM or SIGKILL) to terminate the daemon. Unfortunately, gprof's manual page says the following:
The profiled program must call "exit"(2) or return normally for the
profiling information to be saved in the gmon.out file.
Is there a way to save profiling information for processes which are killed with SIGTERM or SIGKILL?

First, I would like to thank @wallyk for giving me good initial pointers. I solved my issue as follows. Apparently, libc's gprof exit handler is called _mcleanup. So I registered a signal handler for SIGUSR1 (unused by the 3rd party library) that calls _mcleanup and then _exit. Works perfectly! The code looks as follows:
#define _GNU_SOURCE /* for RTLD_DEFAULT in <dlfcn.h> */
#include <dlfcn.h>
#include <signal.h>
#include <stdio.h>
#include <unistd.h>
void sigUsr1Handler(int sig)
{
    (void)sig;
    fprintf(stderr, "Exiting on SIGUSR1\n");
    /* Look up libc's gprof exit hook at run time. */
    void (*_mcleanup)(void);
    _mcleanup = (void (*)(void))dlsym(RTLD_DEFAULT, "_mcleanup");
    if (_mcleanup == NULL)
        fprintf(stderr, "Unable to find gprof exit hook\n");
    else
        _mcleanup(); /* writes gmon.out */
    _exit(0);
}
int main(int argc, char* argv[])
{
    signal(SIGUSR1, sigUsr1Handler);
    neverReturningLibraryFunction(); /* provided by the 3rd party library */
}

You could add a signal handler for a signal the third party library doesn't catch or ignore. SIGUSR1 is probably good enough, but you will have to either experiment or read the library's documentation, if it is thorough enough.
Your signal handler can simply call exit().
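A minimal sketch of that suggestion, assuming SIGUSR1 is free to use. The names exit_on_signal and install_exit_on_sigusr1 are made up for illustration. Keep in mind that exit() is not async-signal-safe, so this is a shortcut for profiling runs rather than something to ship; the point is only that the process leaves through exit(), which the gprof manual quoted above requires for gmon.out to be saved.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stdlib.h>
#include <string.h>

/* Handler: leave through exit() so the normal exit-time hooks run.
   (exit() is not async-signal-safe; tolerated here only for profiling.) */
static void exit_on_signal(int sig)
{
    (void)sig;
    exit(EXIT_SUCCESS);
}

/* Hypothetical helper: call once in main() before the never-returning
   library function. Returns 0 on success, -1 on failure. */
int install_exit_on_sigusr1(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = exit_on_signal;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGUSR1, &sa, NULL);
}
```

With this in place the daemon would be stopped with kill -USR1 <pid> instead of SIGTERM or SIGKILL.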

Related

Both registering signal handler for SIGSEGV and still being able to create full crash dump from OS

We are using the standard pattern of registering a custom signal handler for SIGSEGV with sigaction and then, when a segmentation fault occurs, using the backtrace function to walk the stack and print it to a file. It is a nice feature to have the backtrace in the logs, but it prevents the OS from writing the full dump of the crashed program, which is more than useful. How is it possible to catch the SIGSEGV, do custom handling, and still have the OS create the full dump as it would with the default action?
Can I, for example, remember the oldact pointer to the default handler (as described in the man page) and then call it directly from our custom handler? Needless to say, we need the crash dump to indicate the exact place where the fault happened. So, for example, re-registering the old handler and then deliberately crashing the program somewhere else would not work.
You can reset the sigaction after having handled the signal. Then the faulting instruction will re-run after returning from the handler, and fault again, leading to core dump.
Here's an example:
#include <signal.h>
#include <unistd.h>
struct sigaction oldSA;
void handler(int signal)
{
const char msg[] = "Caught, should dump core now\n";
write(STDERR_FILENO, msg, sizeof msg - 1);
sigaction(SIGSEGV, &oldSA, NULL);
}
int main()
{
struct sigaction sa={0};
sa.sa_handler=handler;
sigaction(SIGSEGV, &sa, &oldSA);
int* volatile p=NULL;
*p=5; // cause segfault
}
Example run:
$ gcc test.c -o test && ./test
Caught, should dump core now
Segmentation fault (core dumped)

Linux: handling a segmentation fault and getting a core dump

When my application crashes with a segmentation fault I'd like to get a core dump from the system. I do that by configuring beforehand
ulimit -c unlimited
I would also like to have an indication in my application logs that a segmentation fault has occurred. I do that by using sigaction(). If I do that, however, the signal does not reach its default handling and a core dump is not saved.
How can I have both the system core dump and a log line from my own signal handler at the same time?
Overwrite the default signal handler for SIGSEGV to call your custom logging function.
After it is logged, restore and trigger the default handler that will create the core dump.
Here is a sample program using signal:
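#include <signal.h>
#include <unistd.h>
void myLoggingFunction(void); // your own logging routine
void sighandler(int signum)
{
    myLoggingFunction();
    // this is the trick: restore the default action and re-send the
    // signal, which will trigger the core dump
    signal(signum, SIG_DFL);
    kill(getpid(), signum);
}
int main()
{
    signal(SIGSEGV, sighandler);
    // ...
}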
The same idea should also work with sigaction.
Source: How to handle SIGSEGV, but also generate a core dump
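For reference, the sigaction variant of the same idea could look roughly like this. It is a sketch, not the original poster's code: log_then_dump and install_logging_segv_handler are hypothetical names, and the log line is written with write() because that call is async-signal-safe.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Log one line, put the default action back, then re-raise the signal. */
static void log_then_dump(int sig)
{
    static const char msg[] = "segmentation fault, dumping core\n";
    struct sigaction dfl;

    write(STDERR_FILENO, msg, sizeof msg - 1);

    memset(&dfl, 0, sizeof dfl);
    dfl.sa_handler = SIG_DFL;
    sigemptyset(&dfl.sa_mask);
    sigaction(sig, &dfl, NULL);

    /* The signal is blocked while this handler runs, so the re-raised
       instance stays pending and is delivered with the default action
       (terminate, core dump if enabled) once the handler returns. */
    raise(sig);
}

/* Hypothetical helper: returns 0 on success, -1 on failure. */
int install_logging_segv_handler(void)
{
    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = log_then_dump;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGSEGV, &sa, NULL);
}
```

Whether a core file actually appears still depends on ulimit -c and the system's core dump configuration.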
The answer: set the sigaction with flag SA_RESETHAND and just return from the handler. The same instruction occurs again, causing a segmentation fault again and invoking the default handler.
There's no need to do anything special in your signal handler
As explained at: Where does signal handler return back to? by default the program returns to the very instruction that caused the SIGSEGV after a signal gets handled.
Furthermore, tested as of Ubuntu 22.04, the default behavior for signal is that it automatically de-registers the handler. man signal does suggest that this is not very portable, however, so using the more explicit sigaction syscall instead may be better.
Therefore, what happens by default on that system is:
the signal gets handled
the handler is automatically disabled
after return, you go back to the instruction that causes the signal
signal happens again
there is no handler, so crash in basically the exact same way as if we hadn't handled the signal
The most important thing to check is if you can generate core dumps at all regardless of the signal handler. Notably, many newer systems such as Ubuntu 22.04 have a complex core dump handler which prevents creation of core files: https://askubuntu.com/questions/1349047/where-do-i-find-core-dump-files-and-how-do-i-view-and-analyze-the-backtrace-st/1442665#1442665 and which you can deactivate as a one off with:
echo 'core' | sudo tee /proc/sys/kernel/core_pattern
Minimal runnable example:
main.c
#include <signal.h> /* signal, SIGSEGV */
#include <unistd.h> /* write, STDOUT_FILENO */
void signal_handler(int sig) {
(void)sig;
const char msg[] = "signal received\n";
write(STDOUT_FILENO, msg, sizeof(msg) - 1); /* - 1: do not write the trailing NUL */
}
int myfunc(int i) {
*(int *)0 = 1;
return i + 1;
}
int main(int argc, char **argv) {
(void)argv;
signal(SIGSEGV, signal_handler);
int ret = myfunc(argc);
return ret;
}
compile and run:
gcc -ggdb3 -O0 -std=c99 -Wall -Wextra -pedantic -o main.out main.c
./main.out
Terminal output contains:
signal received
Segmentation fault (core dumped)
so we see that the signal was both handled, and we got a core file.
And inspecting the core file with:
gdb main.out core.243260
does put us at the correct line:
#0 myfunc (i=1) at main.c:12
12 *(int *)0 = 1;
so we did return to it as expected.
Making it more portable with sigaction
man signal portability section has a Bible of a text about how signal() varied across different OSes and versions:
The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly
permits this variation); do not use it for this purpose.
POSIX.1 solved the portability mess by specifying sigaction(2), which provides explicit control of the semantics when a signal handler is invoked; use that interface instead of signal().
In the original UNIX systems, when a handler that was established using signal() was invoked by the delivery of a signal, the disposition of the signal would be reset to SIG_DFL, and the system did
not block delivery of further instances of the signal. This is equivalent to calling sigaction(2) with the following flags:
sa.sa_flags = SA_RESETHAND | SA_NODEFER;
System V also provides these semantics for signal(). This was bad because the signal might be delivered again before the handler had a chance to reestablish itself. Furthermore, rapid deliveries
of the same signal could result in recursive invocations of the handler.
BSD improved on this situation, but unfortunately also changed the semantics of the existing signal() interface while doing so. On BSD, when a signal handler is invoked, the signal disposition is
not reset, and further instances of the signal are blocked from being delivered while the handler is executing. Furthermore, certain blocking system calls are automatically restarted if interrupted
by a signal handler (see signal(7)). The BSD semantics are equivalent to calling sigaction(2) with the following flags:
sa.sa_flags = SA_RESTART;
The situation on Linux is as follows:
The kernel's signal() system call provides System V semantics.
By default, in glibc 2 and later, the signal() wrapper function does not invoke the kernel system call. Instead, it calls sigaction(2) using flags that supply BSD semantics. This default behavior is provided as long as a suitable feature test macro is defined: _BSD_SOURCE on glibc 2.19 and earlier or _DEFAULT_SOURCE in glibc 2.19 and later. (By default, these macros are defined; see
feature_test_macros(7) for details.) If such a feature test macro is not defined, then signal() provides System V semantics.
That seems to suggest that I should get BSD semantics by default, but I seem to get System V semantics for some reason because:
sudo strace -f -s999 -v ./main.out
contains:
rt_sigaction(SIGSEGV, {sa_handler=0x55b428604189, sa_mask=[], sa_flags=SA_RESTORER|SA_INTERRUPT|SA_NODEFER|SA_RESETHAND|0xffffffff00000000, sa_restorer=0x7fb173d0a520}, {sa_handler=SIG_DFL, sa_mask=[], sa_flags=0}, 8) = 0
which has the SA_NODEFER|SA_RESETHAND. Notably, the flag we care the most about is SA_RESETHAND, which resets the handler to the default behavior.
But maybe I just misinterpreted one of the verses of the Holy Text.
So, just to be more portable, we could do the same as above with sigaction instead:
sigaction.c
#define _XOPEN_SOURCE 700
#include <signal.h> /* sigaction, SIGSEGV */
#include <unistd.h> /* write, STDOUT_FILENO */
void signal_handler(int sig) {
(void)sig;
const char msg[] = "signal received\n";
write(STDOUT_FILENO, msg, sizeof(msg) - 1); /* - 1: do not write the trailing NUL */
}
int myfunc(int i) {
*(int *)0 = 1;
return i + 1;
}
int main(int argc, char **argv) {
(void)argv;
/* Adapted from: https://www.gnu.org/software/libc/manual/html_node/Sigaction-Function-Example.html */
struct sigaction new_action;
new_action.sa_handler = signal_handler;
sigemptyset(&new_action.sa_mask);
new_action.sa_flags = SA_NODEFER|SA_RESETHAND;
sigaction(SIGSEGV, &new_action, NULL);
int ret = myfunc(argc);
return ret;
}
which behaves just like main.c in Ubuntu 22.04.

In linux, calling system() from a forked process with pipe()

I have a standard program using fork() and pipe() with the intention of making a system() call for a third party program in the child process and redirecting the output to the parent process. I discovered that if I do this, somehow the parent process is never able to detect that the child process has closed the pipe, thus it is never able to exit from the while loop calling read().
The issue disappears when I replace the system() call to the third party program with some other generic system call like system("ls -l"). What could be potential issues with the call to the third party program using system() that is affecting this program?
#include <iostream>
#include <fstream>
#include <stdexcept> //std::logic_error
#include <stdlib.h>//system
#include <sys/wait.h>
#include <unistd.h> //pipe, fork, dup2, read, close
int main(int argc, char **argv){
//setup pipe
int pipeid_L1[2];
pipe(pipeid_L1);
pid_t pid_L1;
pid_L1 = fork();
if( pid_L1==-1 ){
throw std::logic_error("Fork L1 failed");
}
else if(pid_L1 ==0){//L1 child process
dup2(pipeid_L1[1],STDOUT_FILENO);//redirect standard out to pipe
close(pipeid_L1[0]); //child doesn't read
system( ... some program ... ); //making the system call to a third party program
close(pipeid_L1[1]);
exit(0);
}
else{
//setup pipe
close(pipeid_L1[1]);
int buf_size=64;
char L1_buf[buf_size];
while( read(pipeid_L1[0],L1_buf,buf_size) > 0 ){ //this while loop never exits if I make the system call to the third party program
... do stuff here ...
}
}
exit(EXIT_SUCCESS);
}
The problem is that the parent will only see the EOF when ALL other processes close the write end of the pipe. There are three relevant processes -- the child you forked, the shell that system forks and execs, and the actual program you run. The first two won't close their end of the pipe until after the program actually exits, so the parent won't see the EOF until that happens and all the processes exit.
If you want the parent to see the EOF as soon as the program closes its stdout, rather than waiting until it exits, you'll need to get rid of those extra processes by using exec rather than system.
Alternately, you can use popen which does all of the needed fork/pipe/exec for you.
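The exec suggestion could be sketched as follows. This is only an illustration under assumptions: run_and_capture is a made-up helper, and it assumes the third party program can be started directly from an argv array rather than through a shell command line. Because the child replaces itself with the program, no intermediate process is left holding the write end of the pipe.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: runs argv[0] with stdout connected to a pipe and
   copies up to cap - 1 bytes of its output into out (NUL-terminated).
   Returns the number of bytes stored, or -1 on error. */
ssize_t run_and_capture(char *const argv[], char *out, size_t cap)
{
    int fds[2];
    if (cap == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);               /* child does not read */
        dup2(fds[1], STDOUT_FILENO); /* stdout now feeds the pipe */
        close(fds[1]);               /* stdout is the only write end left */
        execvp(argv[0], argv);       /* replaces the child, no extra shell */
        _exit(127);                  /* reached only if exec failed */
    }

    close(fds[1]); /* parent must drop its write end or it never sees EOF */
    size_t used = 0;
    ssize_t n;
    char chunk[64];
    while ((n = read(fds[0], chunk, sizeof chunk)) > 0) {
        size_t room = cap - 1 - used;
        size_t take = (size_t)n < room ? (size_t)n : room;
        memcpy(out + used, chunk, take); /* output beyond cap is dropped */
        used += take;
    }
    out[used] = '\0';
    close(fds[0]);
    waitpid(pid, NULL, 0); /* reap the child */
    return n < 0 ? -1 : (ssize_t)used;
}
```

If the program itself starts background children that inherit its stdout, the parent will still wait for those to exit or close the descriptor; exec only removes the two extra processes introduced by system().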

Thread, ansi c signal and Qt

I'm writing a multithreaded, plugin-based application. I will not be the author of the plugins, so I would like to prevent the main application from crashing because of a segmentation fault in a plugin. Is that possible? Or does a crash in a plugin inevitably compromise the state of the main application as well?
I wrote a sketch program using Qt because my "real" application is heavily based on the Qt library. As you can see, I forced the thread to crash by calling the trimmed function on an unallocated QString. The signal handler is called correctly, but after the thread is forced to quit, the main application crashes too. Did I do something wrong? Or, as I said before, is what I'm trying to do not achievable?
Please note that this simplified version of the program uses only a thread, not plugins. Introducing plugins will add a new level of difficulty, I suppose, and I want to proceed step by step. Above all, I want to understand whether my goal is feasible. Thanks a lot for any help or suggestions.
#include <QString>
#include <QThread>
#include <csignal>
#include <QtGlobal>
#include <QtCore/QCoreApplication>
class MyThread : public QThread
{
public:
static void sigHand(int sig)
{
qDebug("Thread crashed");
QThread* th = QThread::currentThread();
th->exit(1);
}
MyThread(QObject * parent = 0)
:QThread(parent)
{
signal(SIGSEGV,sigHand);
}
~MyThread()
{
signal(SIGSEGV,SIG_DFL);
qDebug("Deleted thread, restored default signal handler");
}
void run()
{
QString* s;
s->trimmed();
qDebug("Should not reach this point");
}
};
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
MyThread th(&a);
th.run();
while (th.isRunning());
qDebug("Thread died but main application still on");
return a.exec();
}
I'm currently working on the same issue and found this question via google.
There are several reasons your source is not working:
There is no new thread. The thread is only created, if you call QThread::start. Instead you call MyThread::run, which executes the run method in the main thread.
You call QThread::exit to stop the thread, which is not supposed to directly stop a thread, but sends a (qt) signal to the thread event loop, requesting it to stop. Since there is neither a thread nor an event loop, the function has no effect. Even if you had called QThread::start, it would not work, since writing a run method does not create a qt event loop. To be able to use exit with any QThread, you would need to call QThread::exec first.
However, QThread::exit is the wrong method anyway. To prevent the SIGSEGV from recurring, the thread must be stopped immediately, not after it receives the (qt) signal in its event loop. So although generally frowned upon, in this case QThread::terminate has to be called.
But it is generally considered unsafe to call complex functions like QThread::currentThread, QThread::exit or QThread::terminate from signal handlers, so you should never call them there.
Since the thread is still running after the signal handler (and I'm not sure even QThread::terminate would kill it fast enough), the signal handler exits to where it was called from, so it reexecutes the instruction causing the SIGSEGV, and the next SIGSEGV occurs.
Therefore I have used a different approach: the signal handler changes the register containing the instruction address to point at another function, which will then run after the signal handler exits, instead of the crashing instruction. Like:
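#include <signal.h>   // sigaction, siginfo_t
#include <string.h>   // memset
#include <ucontext.h> // ucontext_t, REG_EIP (glibc, needs _GNU_SOURCE, which g++ defines by default)
void recoverFromCrash(); // must never return
void signalHandler(int type, siginfo_t * si, void* ccontext){
    // 32-bit x86 Linux with glibc; on x86-64 the register is REG_RIP
    (static_cast<ucontext_t*>(ccontext))->uc_mcontext.gregs[REG_EIP] = reinterpret_cast<greg_t>(&recoverFromCrash);
}
// during start-up:
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_flags = SA_SIGINFO;
sa.sa_sigaction = &signalHandler;
sigaction(SIGSEGV, &sa, 0);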
The recoverFromCrash function is then normally called in the thread causing the SIGSEGV. Since the signal handler is called for all SIGSEGV, from all threads, the function has to check which thread it is running in.
However, I did not consider it safe to simply kill the thread, since other things might depend on the thread still running. So instead of killing it, I let it run in an endless loop (calling sleep to avoid wasting CPU time). Then, when the program is closed, it sets a global variable, and the thread is terminated. (Note that the recover function must never return, since otherwise execution would return to the function that caused the SIGSEGV.)
Called from the main thread, on the other hand, it starts a new event loop to keep the program running.
if (QThread::currentThread() != QCoreApplication::instance()->thread()) {
//sub thread
QThread* t = QThread::currentThread();
while (programIsRunning) ThreadBreaker::sleep(1);
ThreadBreaker::forceTerminate();
} else {
//main thread
while (programIsRunning) {
QApplication::processEvents(QEventLoop::AllEvents);
ThreadBreaker::msleep(1);
}
exit(0);
}
ThreadBreaker is a trivial wrapper class around QThread, since msleep, sleep and setTerminationEnabled (which has to be called before terminate) of QThread are protected and could not be called from the recover function.
But this is only the basic picture. There are a lot of other things to worry about: Catching SIGFPE, Catching stack overflows (check the address of the SIGSEGV, run the signal handler in an alternate stack), have a bunch of defines for platform independence (64 bit, arm, mac), show debug messages (try to get a stack trace, wonder why calling gdb for it crashes the X server, wonder why calling glibc backtrace for it crashes the program)...
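As an illustration of the "alternate stack" item, here is a rough plain C sketch, independent of Qt. The names alt_stack, on_segv and install_segv_handler_on_alt_stack are hypothetical, and the 64 KiB stack size is an arbitrary assumption. Without an alternate stack, a SIGSEGV caused by stack overflow cannot run its handler, because there is no stack space left to run it on.

```c
#define _XOPEN_SOURCE 700
#include <signal.h>
#include <string.h>
#include <unistd.h>

/* Memory reserved for running the handler; the size is an assumption. */
static char alt_stack[64 * 1024];

static void on_segv(int sig)
{
    static const char msg[] = "SIGSEGV caught on alternate stack\n";
    write(STDERR_FILENO, msg, sizeof msg - 1);
    signal(sig, SIG_DFL); /* fall back to the default action */
    raise(sig);           /* delivered once this handler returns */
}

/* Hypothetical helper: returns 0 on success, -1 on failure.
   The alternate stack applies to the calling thread only. */
int install_segv_handler_on_alt_stack(void)
{
    stack_t ss;
    ss.ss_sp = alt_stack;
    ss.ss_size = sizeof alt_stack;
    ss.ss_flags = 0;
    if (sigaltstack(&ss, NULL) == -1)
        return -1;

    struct sigaction sa;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_segv;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_ONSTACK; /* run the handler on the alternate stack */
    return sigaction(SIGSEGV, &sa, NULL);
}
```

In a multithreaded program each thread that should survive a stack overflow would need its own sigaltstack call with its own buffer.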

What happens in Eclipse CDT on Linux if you press the Terminate button?

I guess some signals will be sent to the process. Several, or just one? If more than one, in which order do they occur?
And what happens if the Terminate button is pressed and if the process has forked?
And what happens if the process has started other processes by system(...)?
I can't be sure without checking, but I would be surprised if the signal sent was anything other than SIGTERM (or possibly SIGKILL, but that would be a bit unfriendly of CDT).
As for sub-processes, depends what they are actually doing. If they are communicating with their parent processes over a pipe (in any way whatsoever, including reading their stdout), they'll likely find that those file descriptors close or enter the exception state; if they try to use the fds anyway they'll be sent a SIGPIPE. There may also be a SIGHUP in there.
If a sub-process was really completely disjoint (close all open FDs, no SIGTERM handler in the parent which might tell it to exit) then it could theoretically keep running. This is how daemon processes are spawned.
I checked SIGTERM, SIGHUP, SIGPIPE with terminate button. Doesn't work...
I guess it is SIGKILL and this makes me very sad! Also, I didn't find a good solution to run program from external(or built-in plugin) console.
It seems to be SIGKILL. SIGSTOP is used by GDB to stop/resume. From signal man page:
The signals SIGKILL and SIGSTOP cannot be caught or ignored.
I tried to debug the following program with Eclipse. Pressing terminate in a Run session or pause in a Debug session does not print anything. Thus it must be either SIGKILL or SIGSTOP.
#include <signal.h>
#include <stdio.h>  /* printf */
#include <stdlib.h> /* atoi */
#include <string.h>
#include <unistd.h>
void handler(int sig) {
printf("\nsig:%2d %s\n", sig, strsignal(sig));
}
int main(int argc, char **argv) {
int signum;
int delay;
if (argc < 2) {
printf("usage: continue <sleep>\n");
return 1;
}
delay = atoi(argv[1]);
for (signum = 1; signum < 64; signum++) {
signal(signum, handler);
}
printf("sleeping %d s\n", delay);
for(;;) {
sleep(delay);
}
return 0;
}
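The program above ignores the return value of signal(), so it silently fails to install handlers for SIGKILL and SIGSTOP. A small sketch that makes the man page statement visible by checking the result; can_catch and noop_handler are names invented for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>

static void noop_handler(int sig) { (void)sig; }

/* Returns 1 if a handler can be installed for signum, 0 otherwise.
   The previous disposition is restored before returning. */
int can_catch(int signum)
{
    struct sigaction sa, old;
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = noop_handler;
    sigemptyset(&sa.sa_mask);
    if (sigaction(signum, &sa, &old) == -1)
        return 0; /* rejected by the kernel, as for SIGKILL and SIGSTOP */
    sigaction(signum, &old, NULL);
    return 1;
}
```

So a handler that never fires when Terminate is pressed is consistent with one of those two signals being used.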
