What does signal(SIGCHLD, SIG_DFL); mean? - linux

I am not handling SIGCHLD in my code, yet my child process is removed immediately after it terminates. I want it to become a zombie process.
If I set SIGCHLD to SIG_DFL, will that work? How do I set SIGCHLD to SIG_DFL?
I want the process to become a zombie so that I can read the child's status in the parent with waitpid.

From your question history you seem to be tying yourself in knots over this. Here is an outline of how this works:
1. The default disposition of SIGCHLD is ignore. In other words, if you do nothing, the signal is ignored but the zombie exists in the process table. This is why you can wait on it at any time after the child dies.
2. If you set up a signal handler then the signal is delivered and you can reap the child as appropriate, but the (former) child is still a zombie between the time it dies and the time you reap it.
3. If you manually set SIGCHLD's disposition to SIG_IGN via signal then the semantics are a little different than they are in item 1. When you manually set this disposition the OS immediately removes the child from the process table when it dies and does not create a zombie. Consequently there is no longer any status information to reap and wait will fail with ECHILD. (Linux kernels after 2.6.9 adhere to this behavior.)

So your final goal is to read the return code in the parent process after the child process exits? I don't see what this has to do with signals. Some example code:
pid_t pid;
if ((pid = fork()) == 0) {
    // Child process: do something here.
    exit(n);
} else {
    int returnCode;
    while (pid != wait(&returnCode));
    // The child has terminated; returnCode holds its wait status
    // (use WIFEXITED/WEXITSTATUS to extract the exit code).
    // wait is a blocking system call, so you don't need to worry about busy waiting.
}

Related

Handle and propagate SIGHUP signal to child process without parent termination

I've got a classic problem, but can't figure out how to deal with it. There is a bash process which executes children, and I want to send a signal to it (SIGHUP), handle it there, and propagate the signal to one of the children (another_long_running_process, for example). Here is the snippet:
#!/bin/bash
long_running_process &
another_long_running_process &
pid=$!
trap 'kill -1 $pid' HUP
wait $pid
OK, now I have set up a trap whose handler sends the signal to a particular pid, but I then found that my script simply exits after receiving and handling the first SIGHUP. The problem is that bash returns immediately from the wait built-in:
When Bash receives a signal for which a trap has been set while waiting for a command to complete, the trap will not be executed until the command completes. When Bash is waiting for an asynchronous command via the wait built-in, the reception of a signal for which a trap has been set will cause the wait built-in to return immediately with an exit status greater than 128, immediately after which the trap is executed.
And yes, my script exits the first time I send SIGHUP, by design. But I need to keep it running.
I can't figure out how to wait for the child processes and propagate SIGHUP to one of them (or even all of them) multiple times while they are running. Is this achievable as the problem is stated? I think that, given the parent pid, I could iterate over the children, find the required process, and send the signal to it specifically, but that looks like overengineering, doesn't it?
OK, I finally fixed my problem with the following approach: I set the bash script to ignore the signal and make it the leader of a process group. Then I redefine the SIGHUP handler in another_long_running_process and send the signal to the process group. The bash script and long_running_process ignore the signal, while another_long_running_process catches and handles it.

waitpid - difference between first parameter pid=-1 and pid=0

I am reading http://www.tutorialspoint.com/unix_system_calls/waitpid.htm regarding the waitpid function. It says this about the first parameter, pid,
-1 meaning wait for any child process.
0 meaning wait for any child process whose process group ID is equal to that of the calling process.
May I know what does "any child process" mean, any child process of whom? What sort of situation would one need to use a value of -1?
Ignoring the case where your process has pid 1 (in some process namespace - in which case orphaned processes will be reparented), there is only one difference between 0 and -1.
With -1, any child will be waited for. With 0, only children in the caller's own process group are waited for, so children that have moved to another group (for example by calling setpgid) are skipped.
"child" is defined as the process created by fork from your process (but not from any child - you cannot wait for grandchildren, though on Linux I think you can do something similar by polling /proc/<pid>). Note that execve does not affect anything.
By “any child process”, it means any process that is a child of the process that called waitpid.
You would use a pid argument of -1 if you want to wait for any of your children. The most common use is probably when you have multiple children and you know at least one has exited because you received SIGCHLD. You want to call waitpid for each child that has exited, but you don't know exactly which ones have exited. So you loop like this:
while (1) {
    int status;
    pid_t childPid = waitpid(-1, &status, WNOHANG);
    if (childPid <= 0) {
        break;
    }
    // Do whatever you want knowing that the child with pid childPid
    // exited. Use status to figure out why it exited.
}

When and why should you use WNOHANG with waitpid()?

I'm currently in a systems programming class and we went over the wait system call functions today. I was reading over the section on waitpid() system call and in the options section it lists one called WNOHANG.
pid_t waitpid(pid_t pid, int *status, int options);
WNOHANG: If no child specified by pid (from the parameters) has yet changed state, then return immediately, instead of blocking. In this case, the return value of waitpid() is 0. If the calling process has no children that match the specification in pid, waitpid() fails with the error ECHILD.
I understand waitpid() was implemented to solve the limitations in wait(); however, I'm not really sure about why you would use the WNOHANG option flag.
If I were to hazard a guess, it would be so that the parent process can perform other tasks and perhaps keep checking on its children to see if any of them have terminated. Sort of how a daemon process sits in the background and waits for requests.
Any situational examples or regular examples would help as well.
Thanks in advance!
You don't need to keep checking on children. That is the job of the SIGCHLD signal handler. Every time the handler fires, you reap the terminated children; WNOHANG keeps the loop from blocking once no more children have exited:
pid_t pid;
int status;
while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
{
    // process terminated child
}

How do you close a Qt child process and get the child process to execute cleanup code?

I am starting a process in Linux/Qt and then starting some child processes using QProcess. Then eventually I want to close the child processes gracefully (aka execute some clean up code).
The child processes are using QSharedMemory and right now when I call QProcess::close() the child processes are closing without calling QSharedMemory::detach() and the result is that all the processes are closed... but there is left over shared memory that is not cleaned up.
I have the code for the child process and in the code there is the function cleanup(). How does the parent process close the QProcess in such a manner so that the child process will execute cleanup()?
I got the child to execute Qt cleanup code using unix signal handlers.
Here's a high level explanation:
the parent opens the child process using QProcess
processing occurs
the parent closes the child process using QProcess::terminate() which raises the SIGTERM signal on the child
(don't use QProcess::close() because it doesn't raise the SIGTERM signal)
the child implements a unix signal handler for SIGTERM
from the unix signal handler the qApp->exit(0); occurs
qApp emits a Qt signal "aboutToQuit()"
connect the child process cleanup() slot to the qApp aboutToQuit() signal
Child process code to handle unix SIGTERM signal:
static void unixSignalHandler(int signum) {
qDebug("DBG: main.cpp::unixSignalHandler(). signal = %s\n", strsignal(signum));
/*
* Make sure your Qt application gracefully quits.
* NOTE - purpose for calling qApp->exit(0):
* 1. Forces the Qt framework's "main event loop `qApp->exec()`" to quit looping.
* 2. Also emits the QCoreApplication::aboutToQuit() signal. This signal is used for cleanup code.
*/
qApp->exit(0);
}
int main(int argc, char *argv[])
{
QCoreApplication a(argc, argv);
MAINOBJECT mainobject;
/*
* Setup UNIX signal handlers for some of the common signals.
* NOTE common signals:
* SIGINT: The user started the process on the command line and user ctrl-C.
* SIGTERM: The user kills the process using the `kill` command.
* OR
* The process is started using QProcess and SIGTERM is
* issued when QProcess::terminate() is used to stop the process.
*/
if (signal(SIGINT, unixSignalHandler) == SIG_ERR) {
qFatal("ERR - %s(%d): An error occurred while setting a signal handler.\n", __FILE__,__LINE__);
}
if (signal(SIGTERM, unixSignalHandler) == SIG_ERR) {
qFatal("ERR - %s(%d): An error occurred while setting a signal handler.\n", __FILE__,__LINE__);
}
// executes mainobject.cleanupSlot() when the Qt framework emits aboutToQuit() signal.
QObject::connect(qApp, SIGNAL(aboutToQuit()),
&mainobject, SLOT(cleanupSlot()));
return a.exec();
}
Conclusion:
I confirmed that this solution works.
I think this is a good solution because:
it lets the parent close the child process in such a way that the child process executes cleanup
if the parent closes mistakenly and leaves the child process running, the user/sysadmin can kill the leftover child process using kill command and the child process will still cleanup after itself before closing
p.s. "why not just do the cleanup code directly in the signal handler entry point?"
The short answer is because you can't. Here's an explanation as to why you can't execute your Qt cleanup code in the unix signal handler function. From Qt documentation "Calling Qt Functions From Unix Signal Handlers":
You can't call Qt functions from Unix signal handlers. The standard
POSIX rule applies: You can only call async-signal-safe functions from
signal handlers. See Signal Actions for the complete list of functions
you can call from Unix signal handlers.
You could try connecting the processExited() signal to a slot to handle detaching from the shared memory yourself, as far as I'm aware there's nothing in QProcess that will directly trigger the detach method.
I believe you need to implement a small protocol here. You need a way to tell to the child process to exit gracefully. If you have the source for both, you may try to implement such a signal using QtDBus library.

How can a process kill itself?

#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
int main() {
    pid_t pid = fork();
    if (pid == 0) {
        system("watch ls");
    }
    else {
        sleep(5);
        killpg(getpid(), SIGTERM); // to kill the complete process tree.
    }
    return 0;
}
Terminal:
anirudh#anirudh-Aspire-5920:~/Desktop/testing$ gcc test.c
anirudh#anirudh-Aspire-5920:~/Desktop/testing$ ./a.out
Terminated
For the first 5 seconds the output of "watch ls" is shown, and then it terminates because I send a SIGTERM.
Question: How can a process kill itself? I have done kill(getpid(),SIGTERM);
My hypothesis:
so during the kill() call the process switches to kernel mode. The kill call sends the SIGTERM to the process and copies it in the process's process table. when the process comes back to user mode it sees the signal in its table and it terminates itself (HOW ? I REALLY DO NOT KNOW )
(I think I am going wrong (may be a blunder) somewhere in my hypothesis ... so Please enlighten me)
This code is actually a stub which I am using to test my other modules of the Project.
It's doing the job for me and I am happy with it, but a question remains in my mind: how does a process actually kill itself? I want to know the step-by-step explanation.
Thanks in advance
Anirudh Tomer
Your process dies because you are using killpg(), which sends a signal to a process group, not to a single process.
When you fork(), the child inherits from the parent, among other things, the process group. From man fork:
* The child's parent process ID is the same as the parent's process ID.
So you kill the parent along with the child.
If you instead do a simple kill(pid, SIGTERM), where pid is the child's PID returned by fork(), then the parent signals only the child and afterwards exits peacefully.
so during the kill() call the process switches to kernel mode. The kill call sends the SIGTERM to the process and copies it in the process's process table. when the process comes back to user mode it sees the signal in its table and it terminates itself (HOW ? I REALLY DO NOT KNOW )
In Linux, when returning from the kernel mode to the user-space mode the kernel checks if there are any pending signals that can be delivered. If there are some it delivers the signals just before returning to the user-space mode. It can also deliver signals at other times, for example, if a process was blocked on select() and then killed, or when a thread accesses an unmapped memory location.
I think that when it sees the SIGTERM signal in its process table it first kills its child processes (the complete tree, since I have called killpg()) and then it calls exit().
I am still looking for a better answer to this question.
kill(getpid(), SIGKILL); // kills the process itself, I think
I tested it after a fork, in the case 0: branch, and the child quit normally, separately from the parent process.
I don't know if this is a standard method ....
(I can see from my psensor tool that CPU usage returns to 34%, as for a normal program
whose counter has stopped.)
This is super-easy in Perl:
{
local $SIG{TERM} = "IGNORE";
kill TERM => -$$;
}
Conversion into C is left as an exercise for the reader.
