The ps man page describes defunct as "Defunct ("zombie") process, terminated but not reaped by its parent".
The process uses pipes to communicate with its parent. In the logs I verified that it closed its pipes and exited normally.
Why does it become defunct instead of being properly destroyed?
Is there something the parent process can do to avoid such situations?
In top this process has no allocated memory, which confirms it has exited normally.
osysops 42884 42820 0 06:55 ? 00:00:03 [lecw]
We have a main node process that spawns a number of child processes with child_process.fork(), which themselves each spawn a helper child process. We are using node-forever to manage the lifetime of the main node processes, and often use forever restartall to restart this.
One problem we see occasionally is that the grandchild processes fail to terminate, and we end up with duplicated grandchild processes running. I.e., what should be this:
Main Process
  Child Process 1
    Grandchild Process 1
  Child Process 2
    Grandchild Process 2
Ends up like this after restartall:
Main Process
  Child Process 1
    Grandchild Process 1
  Child Process 2
    Grandchild Process 2
Grandchild Process 1
Grandchild Process 2
Unsurprisingly this causes lots of weird problems and we usually have to restart the whole server (or kill processes manually, if we can establish which are the old ones).
As I understand it, forever sends a SIGTERM signal to the process when it does a restartall. I believe this signal should cascade down to the child and grandchild processes (but please correct me if I've made a false assumption there). Since this problem only occurs maybe once in 100 restarts, perhaps it's something related to timing?
What circumstances could cause the grandchild processes to fail to terminate? How to mitigate this?
OS is Debian Squeeze.
EDIT: My initial description was a bit oversimplified. I've updated it to include all the details.
EDIT2: We don't use forever anymore. I recommend PM2.
I have more than 30 '[avconv] <defunct>' processes (I have a bug in my script). I can find them with this command:
ps aux | grep '\[avconv\] <defunct>'
but I don't know how to kill them. Does anyone have an idea how to kill these processes?
Thanks
A <defunct> process is a process that has already terminated, and hence cannot be killed, but for which the parent has not yet invoked one of the wait system calls (wait, wait3, wait4, waitpid, etc...) to read its exit status. As a result, the process information is retained by the system in case the parent eventually does try to obtain its status. Such processes disappear when the parent reads their exit status.
These <defunct> processes also disappear when the parent is killed, as the init process will take ownership of the process and obtain (and discard) its status.
You can avoid <defunct> processes by ensuring you issue as many wait system calls as you issue fork calls.
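As a rough illustration of the "one wait per fork" rule, here is a minimal Python sketch. It assumes Linux (it reads /proc to look at the process state), and the names process_state and spawn_and_reap are made up for this example. The child exits immediately, sits in state Z until the parent calls waitpid, and vanishes afterwards.

```python
import os
import time


def process_state(pid):
    """Return the one-letter state from /proc/<pid>/stat (Linux-only assumption)."""
    with open(f"/proc/{pid}/stat") as f:
        # The state field is the first token after the ")" closing the command name.
        return f.read().rsplit(")", 1)[1].split()[0]


def spawn_and_reap(exit_code=7):
    """Fork a child that exits at once; return (state before reaping, exit code, pid)."""
    pid = os.fork()
    if pid == 0:
        os._exit(exit_code)              # child: terminate immediately

    # Parent: poll until the exited child shows up as a zombie ("Z").
    deadline = time.monotonic() + 5
    state = process_state(pid)
    while state != "Z" and time.monotonic() < deadline:
        time.sleep(0.01)
        state = process_state(pid)

    _, status = os.waitpid(pid, 0)       # reaping removes the <defunct> entry
    return state, os.WEXITSTATUS(status), pid


if __name__ == "__main__":
    state, code, pid = spawn_and_reap()
    print(state, code, os.path.exists(f"/proc/{pid}"))
```

If the waitpid call were left out, the entry would stay <defunct> for as long as the parent keeps running.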
Alternatively, as J.F. Sebastian points out, you can also avoid <defunct> processes by either setting the SIGCHLD signal disposition to SIG_IGN (ignore the signal) or by using the SA_NOCLDWAIT flag when registering a SIGCHLD signal handler (or when resetting the default disposition with SIG_DFL) using sigaction. In this case, however, the child's exit status will not be made available to the parent - it is simply discarded.
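The SIG_IGN variant can be sketched in Python as well. Again this assumes Linux and /proc, and fork_without_zombie is a hypothetical name. With SIGCHLD ignored, the kernel discards the child's entry as soon as it exits, and a later waitpid fails because there is no status left to collect.

```python
import os
import signal
import time


def fork_without_zombie():
    """Fork with SIGCHLD ignored; return (child entry gone, exit status available)."""
    previous = signal.signal(signal.SIGCHLD, signal.SIG_IGN)
    try:
        pid = os.fork()
        if pid == 0:
            os._exit(3)                  # child: terminate immediately

        # Wait for the kernel to drop the child's process table entry.
        deadline = time.monotonic() + 5
        while os.path.exists(f"/proc/{pid}") and time.monotonic() < deadline:
            time.sleep(0.01)
        gone = not os.path.exists(f"/proc/{pid}")

        try:
            os.waitpid(pid, 0)
            status_available = True
        except ChildProcessError:        # ECHILD: the status was discarded
            status_available = False
        return gone, status_available
    finally:
        # Restore the earlier disposition (fall back to the default if unknown).
        signal.signal(signal.SIGCHLD,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(fork_without_zombie())
```

The trade-off is the one described above: no zombies, but also no exit status for the parent.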
kill -9 sends the SIGKILL signal to the parent process, but SIGKILL cannot be caught. So how can the parent process terminate its child processes?
The parent, once killed via SIGKILL, simply stops existing and thus cannot send signals to its child processes any more. The child process has to monitor its parent process by itself. When the parent gets killed, the child's PPID changes to 1, which the child can use as a cue that its parent has been killed. But to "ensure that child process will always be closed with parent process" is not possible from the parent process; that's the nature of SIGKILL. However, if you feel brave you can always hack the source and redefine SIGKILL to something else, but I wouldn't recommend it :)
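Here is one way the "child monitors its parent" idea could look, as a Python sketch. The function names and the polling interval are assumptions for illustration. The child remembers its parent's PID and polls getppid(); comparing against the remembered PID rather than against 1 also covers systems where orphans are adopted by a process other than PID 1.

```python
import os
import signal
import time


def wait_for_parent_death(original_ppid, poll_interval=0.05):
    """Block until getppid() changes, i.e. the original parent has gone away."""
    while os.getppid() == original_ppid:
        time.sleep(poll_interval)


def demo_orphan_notices():
    """Kill a parent with SIGKILL and return what its child reports afterwards."""
    read_fd, write_fd = os.pipe()
    parent_pid = os.fork()
    if parent_pid == 0:
        # This process plays the parent that will be killed.
        os.close(read_fd)
        my_pid = os.getpid()
        if os.fork() == 0:
            # The child: announce it is running, then watch the parent.
            os.write(write_fd, b"R")
            wait_for_parent_death(my_pid)
            os.write(write_fd, b"parent gone")
            os._exit(0)                  # the child shuts itself down
        signal.pause()                   # the parent idles until it is killed
        os._exit(0)

    os.close(write_fd)
    os.read(read_fd, 1)                  # wait until the child is running
    os.kill(parent_pid, signal.SIGKILL)  # the parent gets no chance to react
    os.waitpid(parent_pid, 0)
    with os.fdopen(read_fd, "rb") as pipe:
        return pipe.read().decode()      # EOF arrives once the child has exited


if __name__ == "__main__":
    print(demo_orphan_notices())
```

Polling is the portable approach; the cost is a delay of up to one poll interval before the child notices.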
You can try this:
pkill -TERM -P 27888
which sends SIGTERM to all direct child processes of the given parent. Here 27888 is the process ID of the parent.
Someone told me that when you kill a parent process in Linux, the child dies too.
But I doubted it, so I wrote two bash scripts, where father.sh invokes child.sh.
Here is my script:
Now I run bash father.sh, and you can check it with ps -alf.
Then I killed father.sh with kill -9 24588. I expected the child process to be terminated as well, but I was wrong.
Could anyone explain why?
thx
No, killing a process on its own does not kill its children.
You have to send the signal to the process group if you want all processes in that group to receive it.
For example, if the parent is the leader of its process group and its process ID is 1234, prefix the ID with a minus sign so that kill treats it as a process group ID:
kill -9 -1234
Otherwise, the orphans are reparented to init, as shown by your third screenshot (the PPID of the child has become 1).
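To see the difference between signalling one PID and signalling the whole group, here is a small Python sketch. survivors_after_kill is a hypothetical helper, and the pipe is only a trick for detecting whether anything in the tree is still alive. It builds a group leader plus one descendant, then either calls os.kill (like kill -9 1234) or os.killpg (like kill -9 -1234).

```python
import os
import select
import signal
import time


def survivors_after_kill(signal_group, timeout=1.0):
    """Return True if part of the process tree outlives the kill."""
    read_fd, write_fd = os.pipe()
    leader = os.fork()
    if leader == 0:
        os.setpgid(0, 0)                 # new process group, PGID == this PID
        os.close(read_fd)
        if os.fork() == 0:
            time.sleep(60)               # descendant: same group, holds write_fd
            os._exit(0)
        os.write(write_fd, b"R")         # tell the caller the tree is set up
        time.sleep(60)
        os._exit(0)

    os.close(write_fd)
    os.read(read_fd, 1)                  # wait for the "R"
    if signal_group:
        os.killpg(leader, signal.SIGKILL)    # like: kill -9 -<pgid>
    else:
        os.kill(leader, signal.SIGKILL)      # like: kill -9 <pid>
    os.waitpid(leader, 0)

    # The pipe reaches EOF only when every process holding write_fd is dead.
    readable, _, _ = select.select([read_fd], [], [], timeout)
    alive = not readable
    if alive:
        os.killpg(leader, signal.SIGKILL)    # clean up the orphan
    os.close(read_fd)
    return alive


if __name__ == "__main__":
    print("kill -9 <pid>   leaves survivors:", survivors_after_kill(False))
    print("kill -9 -<pgid> leaves survivors:", survivors_after_kill(True))
```

Note that this only works because the leader created its own group with setpgid; a process that shares its group with your shell would take the shell down too.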
-bash: kill: (-123) - No such process
In an interactive Terminal.app session the foreground process group id number and background process group id number are different by design when job control/monitor mode is enabled. In other words, if you background a command in a job-control enabled Terminal.app session, the $! pid of the backgrounded process is in fact a new process group id number (pgid).
In a script with no job control enabled, however, this may not be the case! The pid of the backgrounded process may not be a new pgid but just a normal pid. This is what causes the error message -bash: kill: (-123) - No such process: the kill command is asked to signal a process group but is given a normal pid instead of a pgid.
# the following code works in Terminal.app because $! == $pgid
{
sleep 100 &
IFS=" " read -r pgid <<EOF
$(ps -p $! -o pgid=)
EOF
echo $$ $! $pgid
sleep 10
kill -HUP -- -$!
#kill -HUP -- -${pgid} # use in script
}
pkill -TERM -P <ProcessID>
This sends SIGTERM to the child processes of <ProcessID> only; the parent itself is not signalled.
Killing the parent does not, by itself, kill the child.
The reason you see the child still alive after killing the father is that kill -9 24588 delivers SIGKILL only to that one PID. The child never receives the signal, so it keeps running its sleep until that finishes or until the child is signalled directly.
Why is the PPID 1? The parent has died and is no longer in the process table, so the kernel has reparented the child to init. Saying it is linked to init can create the impression that init has control over shutting down the process. It can also create the impression that killing a parent makes the grandparent the owner of the child. Neither is true. The child still exists in the process table and is running normally; init's only role is to call wait when the child eventually exits, so that it does not linger as <defunct>.
Killing the process group is different: the signal goes to every member of the group, which usually means the parent together with its descendants. It's probably also important to note that "killing a process" is not "killing" per se, in the human way, where you expect the process to be destroyed and all memory returned as though it never was. kill just sends a particular signal, one among many, to the process. Most signals, such as SIGTERM, can be caught, so the process may clean up first or even ignore them. SIGKILL cannot be caught or ignored; the kernel terminates the process itself.
Even then, termination may not happen right away, because the process could be blocked in uninterruptible I/O (state D in ps), for example while writing to disk, and is only terminated once that operation completes.
What is the difference between SIGTERM and SIGKILL when it comes to the process tree?
When a root thread receives SIGKILL, does it get killed cleanly or does it leave its child threads as zombies?
Is there any signal that can be sent to a root thread to make it exit cleanly without leaving any zombie threads?
Thanks.
If you kill the root process (parent process), this should produce orphan children, not zombie children. Orphans are created when a process's parent is killed; the kernel then makes init the parent of the orphans. init waits until an orphan dies and then uses wait to clean it up.
Zombie children are created when a process (not its parent) ends and its parent does not take up its exit status from the process table.
It sounds to me like you are worried about leaving orphans, because when you kill a zombie's parent process, the zombie is inherited by init, which reaps it, so it disappears.
To kill your orphans, use kill -9 <pid>, which sends SIGKILL.
Here is a more in depth tutorial for killing stuff on linux:
http://riccomini.name/posts/linux/2012-09-25-kill-subprocesses-linux-bash/
You can't control that by signal; only its parent process can control that, by calling waitpid() or setting signal handlers for SIGCHLD. See SIGCHLD and SA_NOCLDWAIT in the sigaction(2) manpage for details.
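For the SIGCHLD handler route, a common pattern is to loop over waitpid(-1, WNOHANG) inside the handler, since several children may exit before the handler runs once. A minimal Python sketch follows; run_children is a made-up name, and the exit codes 0, 1, 2 are only there to make the result checkable.

```python
import os
import signal
import time


def run_children(count=3):
    """Fork `count` children and reap them all from a SIGCHLD handler."""
    reaped = {}

    def reap_children(signum, frame):
        # One SIGCHLD may stand for several exits, so drain everything available.
        while True:
            try:
                pid, status = os.waitpid(-1, os.WNOHANG)
            except ChildProcessError:
                return                   # no children left at all
            if pid == 0:
                return                   # children exist, but none has exited yet
            reaped[pid] = os.WEXITSTATUS(status)

    previous = signal.signal(signal.SIGCHLD, reap_children)
    try:
        pids = []
        for code in range(count):
            pid = os.fork()
            if pid == 0:
                os._exit(code)           # child: exit with a distinct code
            pids.append(pid)

        # Do nothing but sleep; the handler does the reaping.
        deadline = time.monotonic() + 5
        while not all(p in reaped for p in pids) and time.monotonic() < deadline:
            time.sleep(0.01)
        return [reaped.get(p) for p in pids]
    finally:
        signal.signal(signal.SIGCHLD,
                      previous if previous is not None else signal.SIG_DFL)


if __name__ == "__main__":
    print(run_children())
```

Unlike SIG_IGN or SA_NOCLDWAIT, this keeps the exit statuses available to the parent while still preventing zombies from piling up.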
Also, what happens to child threads depends on the Linux kernel version. With 2.6's POSIX threads, killing the main thread should cause the other threads to exit cleanly. With 2.4 LinuxThreads, each thread is actually a separate process and SIGKILL doesn't give the root thread a chance to tell the others to shut down, whereas SIGTERM does.