When and why should you use WNOHANG with waitpid()? - linux

I'm currently in a systems programming class and we went over the wait system call functions today. I was reading over the section on waitpid() system call and in the options section it lists one called WNOHANG.
pid_t waitpid(pid_t pid, int *status, int options);
WNOHANG: If no child specified by pid (from the parameters) has yet changed state, then return immediately, instead of blocking. In this case, the return value of waitpid() is 0. If the calling process has no children that match the specification in pid, waitpid() fails with the error ECHILD.
I understand waitpid() was implemented to solve the limitations in wait(); however, I'm not really sure about why you would use the WNOHANG option flag.
If I were to hazard a guess, it would be so that the parent process can perform other tasks and perhaps keep checking on its children to see if any of them have terminated. Sort of how a daemon process sits in the background and waits for requests.
Any situational examples or regular examples would help as well.
Thanks in advance!

You don't need to keep checking on the children; that is the job of a SIGCHLD signal handler. Every time the handler fires, you reap the terminated children:
pid_t pid;
int status;
while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
{
    // process terminated child
}
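To make the handler approach concrete, here is a minimal sketch of installing a SIGCHLD handler with sigaction() and reaping inside it with WNOHANG. The names reaped_children, on_sigchld and spawn_and_reap are made up for illustration, and the 1 ms nap merely stands in for whatever other work the parent would be doing.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical counter: how many children the handler has reaped. */
static volatile sig_atomic_t reaped_children = 0;

/* SIGCHLD handler: reap every child that has already changed state. */
static void on_sigchld(int signo)
{
    int saved_errno = errno;          /* waitpid() may overwrite errno */
    (void)signo;
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_children++;
    errno = saved_errno;
}

/* Fork n short-lived children and return how many the handler reaped. */
int spawn_and_reap(int n)
{
    struct sigaction sa, old;
    struct timespec nap = {0, 1000000};   /* 1 ms */
    int i;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    if (sigaction(SIGCHLD, &sa, &old) == -1)
        return -1;

    reaped_children = 0;
    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == -1) {
            n = i;                    /* only wait for children that exist */
            break;
        }
        if (pid == 0)
            _exit(0);                 /* child: exit immediately */
    }

    /* The parent never blocks in waitpid(); it is free to do other work. */
    while (reaped_children < n)
        nanosleep(&nap, NULL);

    sigaction(SIGCHLD, &old, NULL);   /* restore the previous disposition */
    return reaped_children;
}
```

Calling spawn_and_reap(3) from main() should return 3 once all three children have been reaped. The loop inside the handler matters because several children can exit while only one SIGCHLD is pending.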

Related

waitpid - difference between first parameter pid=-1 and pid=0

I am reading http://www.tutorialspoint.com/unix_system_calls/waitpid.htm regarding the waitpid function. It says this about the first parameter, pid,
-1 meaning wait for any child process.
0 meaning wait for any child process whose process group ID is equal to that of the calling process.
May I know what does "any child process" mean, any child process of whom? What sort of situation would one need to use a value of -1?
Ignoring the case where your process has pid 1 (in some process namespace - in which case orphaned processes will be reparented), there is only one difference between 0 and -1.
With -1, any child will be waited for. With 0, children that have called setpgid will not be waited for.
"child" is defined as the process created by fork from your process (but not from any child - you cannot wait for grandchildren, though on Linux I think you can do something similar by polling /proc/<pid>). Note that execve does not affect anything.
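The difference between 0 and -1 can be checked with a small experiment. The sketch below (group_filter_demo is a hypothetical name) moves a child into its own process group and then asks waitpid() about it both ways; the pipe is only there to keep the child alive until the parent has looked.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if waitpid(0, ...) could not see a child that moved to its own
 * process group while waitpid(-1, ...) could, 0 otherwise, -1 on setup error.
 */
int group_filter_demo(void)
{
    int gate[2], status, saved_errno;
    char byte;
    pid_t child, seen_by_zero, seen_by_any;

    if (pipe(gate) == -1)
        return -1;
    child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        close(gate[1]);
        setpgid(0, 0);                    /* leave the parent's process group */
        if (read(gate[0], &byte, 1) < 0)  /* block until the parent closes the pipe */
            _exit(1);
        _exit(0);
    }
    close(gate[0]);
    setpgid(child, child);                /* same call from the parent side, so
                                             the group change is done by now */

    errno = 0;
    seen_by_zero = waitpid(0, &status, WNOHANG);   /* only the caller's group */
    saved_errno = errno;

    close(gate[1]);                       /* let the child exit */
    seen_by_any = waitpid(-1, &status, 0);         /* any child at all */

    return seen_by_zero == -1 && saved_errno == ECHILD && seen_by_any == child;
}
```

With pid 0 the call fails with ECHILD because no child is left in the caller's process group, while pid -1 still finds and reaps the child.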
By “any child process”, it means any process that is a child of the process that called waitpid.
You would use a pid argument of -1 if you want to wait for any of your children. The most common use is probably when you have multiple children and you know at least one has exited because you received SIGCHLD. You want to call waitpid for each child that has exited, but you don't know exactly which ones have exited. So you loop like this:
while (1) {
    int status;
    pid_t childPid = waitpid(-1, &status, WNOHANG);
    if (childPid <= 0) {
        break;
    }
    // Do whatever you want knowing that the child with pid childPid
    // exited. Use status to figure out why it exited.
}
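As a rough illustration of "use status to figure out why it exited", the sketch below forks a few children and decodes each status with the standard WIFEXITED/WEXITSTATUS/WIFSIGNALED/WTERMSIG macros. The helper names describe_status and reap_all are invented for this example, and it uses the blocking form of waitpid(-1, ...) because it runs outside a signal handler.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Turn a wait status into a short human-readable description. */
void describe_status(int status, char *buf, size_t len)
{
    if (WIFEXITED(status))
        snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
    else
        snprintf(buf, len, "changed state (raw status %d)", status);
}

/* Fork n children exiting with codes 1..n; return the sum of the codes seen. */
int reap_all(int n)
{
    int i, status, sum = 0;
    pid_t pid;

    for (i = 1; i <= n; i++) {
        pid = fork();
        if (pid == -1)
            break;
        if (pid == 0)
            _exit(i);                 /* child i reports exit code i */
    }

    /* -1 means "any child"; the loop ends when no children remain (ECHILD). */
    while ((pid = waitpid(-1, &status, 0)) > 0) {
        char why[64];
        describe_status(status, why, sizeof why);
        printf("child %ld %s\n", (long)pid, why);
        if (WIFEXITED(status))
            sum += WEXITSTATUS(status);
    }
    return sum;
}
```

The children may be reported in any order, since waitpid(-1, ...) returns whichever child changed state first.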

How to make sure my child executes first and then parent?

Here below i have a simple code snippet of application which takes request from several clients and invokes mathematical operations through exec and waits for result from invoked processes to return those results to the respective clients through pipes
case '+':
    fret = fork();
    if (fret == -1)
        perror(" error in forking at add\n");
    else if (fret == 0)
    {   // child
        sprintf(s_rp, "%d", p_op[0]);
        sprintf(s_wp, "%d", p_op[1]);
        argm[0] = "add";
        if ((ret = execve(argm[0], argm, argp)) == -1)
        {
            printf("execve of add failed \n");
            exit(1);
        }
    }
    else
    {   // parent
        write(p_op[1], &request, sizeof(request));
        if ((ret = waitpid(fret, &x, 0)) == -1)
            perror("Error with wait at +\n");
        read(p_op[0], &result, sizeof(result));
        write(fd_res, &result, sizeof(result));
    }
    break;
Here I am facing a simple issue: the parent runs first and so quickly that waitpid() fails. Normally waitpid() waits for the child to exit, but in my case it seems the child has not even been created when the parent reaches waitpid(), so the call fails.
My question: instead of using sleep() (which solved my problem but makes the program slow) or any IPC, how can I make sure that after fork() my child executes before my parent?
When I thought about this, some approaches came to mind, such as using signals to block the parent, or semaphores to achieve atomicity. Is there any simple approach that makes sure my child executes first and only then the parent starts executing?
The waitpid function will BLOCK until the specified process ends. Although some other reasons such as EINTR can cause waitpid to return -1, it is very easy to solve that by placing the waitpid call into a loop. For example, if waitpid call returned -1 and errno == EINTR, then just continue the loop. See http://linux.die.net/man/3/waitpid about the return value.
Your waitpid call will NEVER be executed before the child process is created! Obviously, the waitpid function is called after the fork syscall, and the return value of the fork syscall is not -1. So, at that time, the fork call has already returned which means that the child process would already exist.
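The EINTR loop described above could look roughly like this. waitpid_retry and run_child_and_get_code are hypothetical helper names; the second function only exists to show that waiting right after fork() works no matter whether the child has already finished.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Like waitpid(), but restarts the call when a signal interrupts it. */
pid_t waitpid_retry(pid_t pid, int *status, int options)
{
    pid_t r;
    do {
        r = waitpid(pid, status, options);
    } while (r == -1 && errno == EINTR);
    return r;
}

/* Fork a child that exits with `code` and return the code the parent sees. */
int run_child_and_get_code(int code)
{
    int status;
    pid_t pid = fork();

    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(code);                  /* child: may finish before the parent waits */

    /* The child exists as soon as fork() has returned, so this cannot be
       "too early"; if the child is already dead it is simply a zombie. */
    if (waitpid_retry(pid, &status, 0) != pid)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, run_child_and_get_code(42) returns 42 even though no sleep() or other synchronization is involved.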

What does signal(SIGCHLD, SIG_DFL); mean?

I am not handling SIGCHLD in my code. Still my process is removed immediately after termination. I want it to become zombie process.
If I set SIGCHLD to SIG_DFL then, will it work? How do I set SIGCHLD to SIG_DFL?
I want process to become zombie, so I can read the child status in parent after waitpid.
From your question history you seem to be tying yourself in knots over this. Here is the outline on how this works:
1. The default disposition of SIGCHLD is ignore. In other words, if you do nothing, the signal is ignored but the zombie exists in the process table. This is why you can wait on it at any time after the child dies.
2. If you set up a signal handler then the signal is delivered and you can reap it as appropriate, but the (former) child is still a zombie between the time it dies and the time you reap it.
3. If you manually set SIGCHLD's disposition to SIG_IGN via signal then the semantics are a little different than they are in item 1. When you manually set this disposition the OS immediately removes the child from the process table when it dies and does not create a zombie. Consequently there is no longer any status information to reap and wait will fail with ECHILD. (Linux kernels after 2.6.9 adhere to this behavior.)
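The third case can be observed directly. The following is a small sketch, assuming a Linux system that behaves as described above; the function name wait_fails_when_sigchld_ignored is invented for the example.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if waitpid() failed with ECHILD for a child that exited while
 * SIGCHLD was explicitly set to SIG_IGN, 0 otherwise, -1 on setup error.
 */
int wait_fails_when_sigchld_ignored(void)
{
    struct sigaction ign, old;
    int status, saved_errno;
    pid_t pid, r;

    memset(&ign, 0, sizeof ign);
    ign.sa_handler = SIG_IGN;         /* explicit ignore, not the default */
    sigemptyset(&ign.sa_mask);
    if (sigaction(SIGCHLD, &ign, &old) == -1)
        return -1;

    pid = fork();
    if (pid == -1) {
        sigaction(SIGCHLD, &old, NULL);
        return -1;
    }
    if (pid == 0)
        _exit(5);                     /* this status is never collected */

    errno = 0;
    r = waitpid(pid, &status, 0);     /* waits for the child to go away, then fails */
    saved_errno = errno;

    sigaction(SIGCHLD, &old, NULL);   /* back to the previous disposition */
    return r == -1 && saved_errno == ECHILD;
}
```

If the sigaction() call is left out, the same waitpid() succeeds and returns the child's pid, which is the zombie behavior the question asks for.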
So your final goal is to read the return code in the parent process after your child process exits? I don't see that this has anything to do with signals. Some example code:
pid_t pid;
if ((pid = fork()) == 0) {
    // Child process: do some work here.
    exit(n);    // n is the exit code to report
} else {
    int returnCode;
    while (pid != wait(&returnCode))
        ;
    // The child has terminated; returnCode holds its wait status
    // (use WIFEXITED/WEXITSTATUS to extract the exit code).
    // wait() is a blocking system call, so this is not busy waiting.
}

unix fork() understanding

int main(){
fork();
}
I know this is a newbie question, but my understanding is that the parent process will now fork a new child process that is an exact copy of the parent, which means that the child should also fork a child process, and so on... In reality, this only generates one child process. I can't understand what code the child will be executing.
The child process begins executing at the exact point where the last one left off - after the fork statement. If you wanted to fork forever, you'd have to put it in a while loop.
As everybody mentioned, the child also starts executing after fork() has finished. Thus, it doesn't call fork again.
You could see it clearly in the very common usage like this:
int main()
{
    if (fork())
    {
        // you are in parent. The return value of fork was the pid of the child
        // here you can do stuff and perhaps eventually `wait` on the child
    }
    else
    {
        // you are in the child. The return value of fork was 0
        // you may often see here an `exec*` command
    }
}
You missed a semi-colon.
But the child (and also the parent) continues just after the fork happened. From the point of view of application programming, fork (like all system calls) is "atomic".
The only difference between the two processes (which after the fork have conceptually separate memory spaces) is the result of the fork.
If the child went on to call fork, the child would have two forks (the one that created it and the one that it then made) while the parent would only have one (the one that gave it a child). The nature of fork is that one process calls it and two processes return from it.
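A small sketch can show both points: the child resumes right after fork() without forking again, and the two processes have separate memory from then on. fork_memory_demo is a hypothetical name used only for this illustration.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Returns 1 if the child's write to `value` stayed invisible to the parent
 * and the child reported its own copy through its exit code; -1 on error.
 */
int fork_memory_demo(void)
{
    int value = 1, status;
    pid_t pid = fork();               /* one call, two returns */

    if (pid == -1)
        return -1;                    /* fork failed: there is no child */
    if (pid == 0) {
        /* Child: execution continues here, after fork(), so it does not
           fork again. This assignment changes only the child's copy. */
        value = 42;
        _exit(value);
    }

    /* Parent: pid is the child's process ID. */
    if (waitpid(pid, &status, 0) != pid)
        return -1;
    return value == 1 && WIFEXITED(status) && WEXITSTATUS(status) == 42;
}
```

Unlike the if (fork()) idiom, this version also checks for -1, since a failed fork() would otherwise be mistaken for the parent branch.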

How can a process kill itself?

#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
int main() {
    pid_t pid = fork();
    if (pid == 0) {
        system("watch ls");
    }
    else {
        sleep(5);
        killpg(getpid(), SIGTERM); // to kill the complete process tree.
    }
    return 0;
}
Terminal:
anirudh#anirudh-Aspire-5920:~/Desktop/testing$ gcc test.c
anirudh#anirudh-Aspire-5920:~/Desktop/testing$ ./a.out
Terminated
For the first 5 seconds the output of "watch ls" is shown, and then it terminates because I send a SIGTERM.
Question: How can a process kill itself? I have done kill(getpid(),SIGTERM);
My hypothesis:
so during the kill() call the process switches to kernel mode. The kill call sends the SIGTERM to the process and copies it in the process's process table. when the process comes back to user mode it sees the signal in its table and it terminates itself (HOW ? I REALLY DO NOT KNOW )
(I think I am going wrong (may be a blunder) somewhere in my hypothesis ... so Please enlighten me)
This code is actually a stub which I am using to test my other modules of the Project.
It's doing the job for me and I am happy with it, but there is still a question in my mind: how does a process actually kill itself? I want to know the step-by-step explanation.
Thanks in advance
Anirudh Tomer
Your process dies because you are using killpg(), that sends a signal to a process group, not to a process.
When you fork(), the child inherits from the parent, among other things, the process group. From man fork:
* The child's parent process ID is the same as the parent's process ID.
So you kill the parent along with the child.
If you do a simple kill(getpid(), SIGTERM) then the father will kill the child (that is watching ls) and then will peacefully exit.
so during the kill() call the process switches to kernel mode. The kill call sends the SIGTERM to the process and copies it in the process's process table. when the process comes back to user mode it sees the signal in its table and it terminates itself (HOW ? I REALLY DO NOT KNOW )
In Linux, when returning from the kernel mode to the user-space mode the kernel checks if there are any pending signals that can be delivered. If there are some it delivers the signals just before returning to the user-space mode. It can also deliver signals at other times, for example, if a process was blocked on select() and then killed, or when a thread accesses an unmapped memory location.
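One way to watch this happen is to install a handler instead of letting the default action terminate the process. The sketch below assumes a single-threaded process with SIGTERM unblocked, in which case the signal sent to itself is delivered before kill() returns; got_sigterm, on_sigterm and signal_self are names invented here.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 declarations */
#endif
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical flag set by the handler when SIGTERM is delivered. */
static volatile sig_atomic_t got_sigterm = 0;

static void on_sigterm(int signo)
{
    (void)signo;
    got_sigterm = 1;
}

/* Returns 1 if the handler had already run when kill() returned. */
int signal_self(void)
{
    struct sigaction sa, old;
    int delivered;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigterm;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGTERM, &sa, &old) == -1)
        return -1;

    got_sigterm = 0;
    if (kill(getpid(), SIGTERM) == -1) {  /* enters the kernel, marks SIGTERM pending */
        sigaction(SIGTERM, &old, NULL);
        return -1;
    }
    /* On the way back to user mode the kernel noticed the pending signal
       and ran on_sigterm() before this line was reached. */
    delivered = got_sigterm;

    sigaction(SIGTERM, &old, NULL);       /* restore the previous disposition */
    return delivered;
}
```

With the default disposition there is no handler to run, so at that same point the kernel terminates the process instead of returning to it.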
I think that when it sees the SIGTERM signal in its process table, it first kills its child processes (the complete tree, since I have called killpg()) and then it calls exit().
I am still looking for a better answer to this question.
kill(getpid(), SIGKILL); // kills the calling process itself, I think
I tested it after a fork, in the case 0: branch, and the child quit normally, separately from the parent process.
I don't know if this is a standard method.
(I can see from my psensor tool that CPU usage returns to 34%, as with a normal program whose counter has stopped.)
This is super-easy in Perl:
{
local $SIG{TERM} = "IGNORE";
kill TERM => -$$;
}
Conversion into C is left as an exercise for the reader.
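For what it's worth, a possible C rendering of the Perl snippet is sketched below. It is not a literal translation: it signals the caller's current process group via getpgrp() rather than assuming the process is the group leader, and the function names are made up. The demo runs everything inside a fresh process group so that nothing outside it gets signalled.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* request POSIX.1-2008 + XSI (killpg) declarations */
#endif
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Send SIGTERM to the caller's whole process group while ignoring it ourselves. */
int terminate_group_except_self(void)
{
    struct sigaction ign, old;
    int rc;

    memset(&ign, 0, sizeof ign);
    ign.sa_handler = SIG_IGN;             /* like: local $SIG{TERM} = "IGNORE" */
    sigemptyset(&ign.sa_mask);
    if (sigaction(SIGTERM, &ign, &old) == -1)
        return -1;
    rc = killpg(getpgrp(), SIGTERM);      /* like: kill TERM => -$$ */
    sigaction(SIGTERM, &old, NULL);       /* end of the "local" scope */
    return rc;
}

/* Returns 1 if a worker in the group died from SIGTERM while the sender survived. */
int group_kill_demo(void)
{
    int status;
    pid_t leader = fork();

    if (leader == -1)
        return -1;
    if (leader == 0) {
        pid_t worker;
        if (setpgid(0, 0) == -1)          /* own group: never signal the caller's */
            _exit(2);
        signal(SIGTERM, SIG_DFL);         /* make sure the worker inherits the default */
        worker = fork();
        if (worker == -1)
            _exit(3);
        if (worker == 0)
            for (;;)
                pause();                  /* worker: wait to be terminated */
        if (terminate_group_except_self() == -1) {
            kill(worker, SIGKILL);
            _exit(4);
        }
        if (waitpid(worker, &status, 0) != worker)
            _exit(5);
        _exit(WIFSIGNALED(status) && WTERMSIG(status) == SIGTERM ? 0 : 6);
    }
    if (waitpid(leader, &status, 0) != leader)
        return -1;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Because an ignored signal is discarded when it is generated, nothing is left pending for the sender once the old disposition is restored.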
