waitpid - difference between first parameter pid=-1 and pid=0 - linux

I am reading http://www.tutorialspoint.com/unix_system_calls/waitpid.htm regarding the waitpid function. It says this about the first parameter, pid,
-1 meaning wait for any child process.
0 meaning wait for any child process whose process group ID is equal to that of the calling process.
What does "any child process" mean: any child process of whom? In what sort of situation would one need to use a value of -1?

Ignoring the case where your process has pid 1 (in some process namespace - in which case orphaned processes will be reparented), there is only one difference between 0 and -1.
With -1, any child will be waited for. With 0, only children in the caller's process group are waited for, so children that have moved to a different process group (for example by calling setpgid) are skipped.
"Child" means a process created by fork from your process (but not by one of your children: you cannot wait for grandchildren, though on Linux I think you can do something similar by polling /proc/<pid>). Note that execve does not affect anything.

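The difference between 0 and -1 can be observed with a small experiment. The following is only a sketch under stated assumptions: pgid_demo is a hypothetical helper name, and it assumes the calling process has no other children. The child moves into its own process group, so waitpid(0, ...) should find no eligible child and fail with ECHILD, while waitpid(-1, ...) should still reap it.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (not from the answer above). Assumes the caller has
 * no other children. Returns 1 if waitpid(0, ...) failed with ECHILD while
 * waitpid(-1, ...) reaped the child, 0 otherwise, and -1 if fork failed. */
int pgid_demo(void)
{
    pid_t child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        setpgid(0, 0);   /* child: leave the parent's process group */
        _exit(7);
    }

    /* Parent: make the same change too, so the result does not depend on
     * whether the child has run yet. A failure here is harmless because
     * the child performs the same call itself. */
    (void)setpgid(child, child);

    int status;
    pid_t r0 = waitpid(0, &status, WNOHANG);   /* same process group only */
    int err0 = errno;
    pid_t r1 = waitpid(-1, &status, 0);        /* any child */

    return r0 == -1 && err0 == ECHILD && r1 == child;
}
```

Calling pgid_demo() from main and checking for a return value of 1 is enough to try it out. Calling setpgid in both parent and child is the usual way to avoid depending on which of the two runs first.
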
By “any child process”, it means any process that is a child of the process that called waitpid.
You would use a pid argument of -1 if you want to wait for any of your children. The most common use is probably when you have multiple children and you know at least one has exited because you received SIGCHLD. You want to call waitpid for each child that has exited, but you don't know exactly which ones have exited. So you loop like this:
while (1) {
    int status;
    pid_t childPid = waitpid(-1, &status, WNOHANG);
    if (childPid <= 0) {
        break;
    }
    // Do whatever you want knowing that the child with pid childPid
    // exited. Use status to figure out why it exited.
}
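As a possible way to "use status to figure out why it exited", the status value can be decoded with the standard macros from sys/wait.h. This is a minimal sketch: summarize_status and spawn_child are hypothetical helper names introduced here for illustration, not part of the answer.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: classifies a status filled in by waitpid.
 * Returns 0 for a normal exit (*detail = exit code), 1 for death by
 * signal (*detail = signal number), and -1 for anything else. */
int summarize_status(int status, int *detail)
{
    if (WIFEXITED(status)) {
        *detail = WEXITSTATUS(status);
        return 0;
    }
    if (WIFSIGNALED(status)) {
        *detail = WTERMSIG(status);
        return 1;
    }
    return -1;
}

/* Hypothetical helper for trying it out: the child kills itself with sig
 * when sig is nonzero, otherwise it exits with exit_code. Returns the
 * child's pid in the parent, or -1 if fork failed. */
pid_t spawn_child(int exit_code, int sig)
{
    pid_t pid = fork();
    if (pid == 0) {
        if (sig != 0)
            raise(sig);
        _exit(exit_code);
    }
    return pid;
}
```

Inside the loop, summarize_status(status, &detail) would go where the comment is, right after waitpid returns a positive pid.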

Related

fork/vfork, exec and waitpid in atomic way

Suppose that process A calls fork:
pid = fork();
...
waitpid( pid, ...);
Is it possible that, between these calls (fork and waitpid), process B, which was created by fork(), finishes? Then some new process C starts and gets a pid equal to the old pid of process B, and after that waitpid waits for the end of process C, not B.
The exec-family calls don't return if they succeed. exec replaces the current process image with a new program but keeps the process pid. Is there any guaranteed way to do fork/vfork + exec + waitpid as a truly "atomic" operation and to get the result of the process started by exec?
Does a shell such as bash run commands, wait for them, and return their results in an "atomic" way?

When and why should you use WNOHANG with waitpid()?

I'm currently in a systems programming class and we went over the wait system call functions today. I was reading over the section on waitpid() system call and in the options section it lists one called WNOHANG.
pid_t waitpid(pid_t pid, int *status, int options);
WNOHANG: If no child specified by pid (from the parameters) has yet changed state, then return immediately, instead of blocking. In this case, the return value of waitpid() is 0. If the calling process has no children that match the specification in pid, waitpid() fails with the error ECHILD.
I understand waitpid() was implemented to solve the limitations in wait(); however, I'm not really sure about why you would use the WNOHANG option flag.
If I were to venture a guess, it would be so that the parent process can perform other tasks and perhaps keep checking on its children to see if any of them have terminated, sort of how a daemon process sits in the background and waits for requests.
Any situational examples or regular examples would help as well.
Thanks in advance!
You don't need to keep checking on children. That is the job of the SIGCHLD signal handler. Every time this handler fires, you reap the terminated children:
pid_t pid;
int status;
while ((pid = waitpid(-1, &status, WNOHANG)) > 0)
{
    // process terminated child
}
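To show where that loop would live, here is a sketch of a complete SIGCHLD setup. It is one possible arrangement, not the only one: on_sigchld, install_sigchld_handler, run_children and reaped_count are hypothetical names, and the handler only counts children instead of processing them. The loop is needed because several children can exit while a single SIGCHLD is pending.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Number of children reaped so far by the handler (hypothetical counter). */
static volatile sig_atomic_t reaped_count = 0;

static void on_sigchld(int signo)
{
    (void)signo;
    int saved_errno = errno;              /* waitpid may overwrite errno */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped_count++;                   /* one iteration per exited child */
    errno = saved_errno;
}

/* Installs the handler; returns 0 on success, -1 on error. */
int install_sigchld_handler(void)
{
    struct sigaction sa;
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART | SA_NOCLDSTOP;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Forks n children that exit at once, then sleeps until the handler has
 * reaped all of them. Returns how many children were reaped. */
int run_children(int n)
{
    sigset_t block, old, wait_mask;
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);  /* no handler runs between check and sleep */
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    int before = reaped_count;
    int started = 0;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);
        if (pid > 0)
            started++;
    }
    while (reaped_count - before < started)
        sigsuspend(&wait_mask);            /* unblock SIGCHLD and wait atomically */

    int reaped = reaped_count - before;
    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

After install_sigchld_handler() succeeds, run_children(3) should return once all three children have been reaped. A real program would do useful work instead of sleeping in sigsuspend.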

Why does fork() return 0 in the child process?

As we know, fork() returns twice: the PID of the child process is returned in the parent, and 0 is returned in the child.
Why is 0 returned in the child process? Is there any special reason for that?
UPDATE: I was told that a linked list is used between the parent and child processes, and that the parent process knows the PID of the child process, but if there are no grandchildren, the child process gets 0. I do not know whether that is right.
As to the question you ask in the title, you need a value that will be considered success and cannot be a real PID. The 0 return value is a standard return value for a system call to indicate success. So it is provided to the child process so that it knows that it has successfully forked from the parent. The parent process receives either the PID of the child, or -1 if the child did not fork successfully.
Any process can discover its own PID by calling getpid().
As to your update question, it seems a little backward. Any process can discover its parent process by using the getppid() system call. If a process did not track the return value of fork(), there is no straightforward way to discover all the PIDs of its children.
You need to return something that cannot be a real PID (otherwise the child may think it is the parent).
0 fits the bill.
From the docs:
RETURN VALUES
Upon successful completion, fork() returns a value of 0 to the child
process and returns the process ID of the child process to the parent
process. Otherwise, a value of -1 is returned to the parent process,
no child process is created, and the global variable errno is set to
indicate the error.
From the book (Advanced Programming in the UNIX Environment):
The reason fork returns 0 to the child is that a process can have only
a single parent, and the child can always call getppid to obtain the
process ID of its parent. (Process ID 0 is reserved for use by the
kernel, so it’s not possible for 0 to be the process ID of a child.)
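The three values involved can be checked against each other in a short experiment. This is a sketch under assumptions: check_fork_return_values is a hypothetical helper, and it uses a pipe only so the child can report what getpid() and getppid() returned on its side.

```c
#define _POSIX_C_SOURCE 200809L
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Returns 1 if the value fork() gave the parent equals
 * the child's own getpid() and the child's getppid() equals the parent's
 * pid, 0 if they differ, and -1 on error. */
int check_fork_return_values(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t parent = getpid();
    pid_t ret = fork();
    if (ret == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (ret == 0) {
        /* Child: fork() returned 0, so ask the kernel for the real ids. */
        pid_t report[2] = { getpid(), getppid() };
        if (write(fds[1], report, sizeof report) != (ssize_t)sizeof report)
            _exit(1);
        _exit(0);
    }

    /* Parent: fork() returned the child's pid. */
    close(fds[1]);
    pid_t report[2];
    ssize_t n = read(fds[0], report, sizeof report);
    close(fds[0]);
    waitpid(ret, NULL, 0);
    if (n != (ssize_t)sizeof report)
        return -1;
    return report[0] == ret && report[1] == parent;
}
```

A return value of 1 from check_fork_return_values() matches the explanation above: the child never needs fork() to tell it a PID, because getpid() and getppid() already cover both cases.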

How to make sure my child executes first and then parent?

Below is a simple code snippet from an application that takes requests from several clients, invokes mathematical operations through exec, and waits for the invoked processes so it can return their results to the respective clients through pipes:
case '+':
    fret = fork();
    if (fret == -1)
        perror(" error in forking at add\n");
    else if (fret == 0)
    { // child
        sprintf(s_rp, "%d", p_op[0]);
        sprintf(s_wp, "%d", p_op[1]);
        argm[0] = "add";
        if ((ret = execve(argm[0], argm, argp)) == -1)
        {
            printf("execve of add failed \n");
            exit(1);
        }
    }
    else
    { // parent
        write(p_op[1], &request, sizeof(request));
        if ((ret = waitpid(fret, &x, 0)) == -1)
            perror("Error with wait at +\n");
        read(p_op[0], &result, sizeof(result));
        write(fd_res, &result, sizeof(result));
    }
    break;
Here I am facing a simple issue: the parent executes first and so quickly that waitpid() fails. Normally waitpid() waits for the child to exit, but in my case the child has not even been created when the parent reaches waitpid(), so it fails.
My question is: instead of using sleep() (which solved my problem but makes the program slow) or any IPC, how can I make sure that after fork my child executes before my parent?
When I thought about this, some approaches came to mind, like using signals to block the parent or semaphores to achieve atomicity. Is there any simple approach that makes sure my child executes first and only then my parent starts execution?
The waitpid function will BLOCK until the specified process ends. Although some other reasons such as EINTR can cause waitpid to return -1, it is very easy to solve that by placing the waitpid call into a loop. For example, if waitpid call returned -1 and errno == EINTR, then just continue the loop. See http://linux.die.net/man/3/waitpid about the return value.
Your waitpid call will NEVER be executed before the child process is created! Obviously, the waitpid function is called after the fork syscall, and the return value of the fork syscall is not -1. So, at that time, the fork call has already returned which means that the child process would already exist.
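The retry loop described above could look like the following sketch. waitpid_retry and run_child_and_wait are hypothetical helper names, not functions from any library. The second helper also illustrates that the order in which parent and child run does not matter: if the child exits before the parent reaches waitpid, its status is kept until it is collected.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical wrapper: repeats waitpid while it is interrupted by a signal. */
pid_t waitpid_retry(pid_t pid, int *status, int options)
{
    pid_t r;
    do {
        r = waitpid(pid, status, options);
    } while (r == -1 && errno == EINTR);
    return r;
}

/* Hypothetical helper: forks a child that exits with exit_code and returns
 * the code the parent collected, or -1 on failure. */
int run_child_and_wait(int exit_code)
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0)
        _exit(exit_code);   /* child may finish before the parent waits */

    int status;
    if (waitpid_retry(pid, &status, 0) != pid || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In the question's code, the call waitpid(fret, &x, 0) could be replaced by waitpid_retry(fret, &x, 0) without any sleep() in between.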

What does signal(SIGCHLD, SIG_DFL); mean?

I am not handling SIGCHLD in my code. Still, my process is removed immediately after termination. I want it to become a zombie process.
If I set SIGCHLD to SIG_DFL, will it work? How do I set SIGCHLD to SIG_DFL?
I want the process to become a zombie so I can read the child's status in the parent after waitpid.
From your question history you seem to be tying yourself in knots over this. Here is the outline on how this works:
1. The default disposition of SIGCHLD is ignore. In other words, if you do nothing, the signal is ignored but the zombie exists in the process table. This is why you can wait on it at any time after the child dies.
2. If you set up a signal handler then the signal is delivered and you can reap it as appropriate, but the (former) child is still a zombie between the time it dies and the time you reap it.
3. If you manually set SIGCHLD's disposition to SIG_IGN via signal then the semantics are a little different than they are in item 1. When you manually set this disposition the OS immediately removes the child from the process table when it dies and does not create a zombie. Consequently there is no longer any status information to reap and wait will fail with ECHILD. (Linux kernels after 2.6.9 adhere to this behavior.)
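The difference between the default disposition and an explicit SIG_IGN can be observed directly. The sketch below assumes Linux behavior as described above; reap_with_disposition is a hypothetical helper name, and the exit code 9 is an arbitrary value chosen for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: sets the SIGCHLD disposition (SIG_DFL or SIG_IGN),
 * forks a child that exits with code 9, and waits for it.
 * Returns the child's exit code if it could be reaped, -1 if waitpid
 * failed with ECHILD, and -2 for any other failure. */
int reap_with_disposition(void (*disposition)(int))
{
    if (signal(SIGCHLD, disposition) == SIG_ERR)
        return -2;

    pid_t pid = fork();
    if (pid == 0)
        _exit(9);

    int status = 0;
    pid_t r = (pid == -1) ? -1 : waitpid(pid, &status, 0);
    int err = (pid == -1) ? 0 : errno;
    signal(SIGCHLD, SIG_DFL);   /* restore the default for later code */

    if (pid == -1)
        return -2;
    if (r == pid && WIFEXITED(status))
        return WEXITSTATUS(status);
    if (r == -1 && err == ECHILD)
        return -1;
    return -2;
}
```

On Linux, reap_with_disposition(SIG_DFL) should return 9 because the zombie keeps its status until it is waited for, while reap_with_disposition(SIG_IGN) should return -1 because the child is removed as soon as it dies. Setting SIG_DFL is done exactly as in the helper: signal(SIGCHLD, SIG_DFL).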
So your final goal is to read the child's return code in the parent process after the child exits? I don't see that this has anything to do with signals. Some example code:
pid_t pid;
if ((pid = fork()) == 0) {
    // Child process: do something here.
    exit(n); // n is the child's return code
} else {
    int status;
    while (pid != wait(&status))
        ;
    // The child has terminated. If WIFEXITED(status) is true,
    // WEXITSTATUS(status) is its return code.
    // wait is a blocking system call, so you don't need to worry about busy waiting.
}

Resources