Kill process '[avconv] <defunct>' - linux

I have more than 30 process '[avconv] ' (i have a bug in script), With this command i find these process :
Ps aux | grep '\[avconv\] <defunct>'
but i don't know how to kill these process, anyone have an idea to kill these process ?
Thanks

A <defunct> process is a process that has already terminated, and hence cannot be killed, but for which the parent has not yet invoked one of the wait system calls (wait, wait3, wait4, waitpid, etc...) to read its exit status. As a result, the process information is retained by the system in case the parent eventually does try to obtain its status. Such processes disappear when the parent reads their exit status.
These <defunct> processes also disappear when the parent is killed, as the init process will take ownership of the process and obtain (and discard) its status.
You can avoid <defunct> processes by ensuring you issue as many wait system calls as you issue fork calls.
Alternatively, as J.F. Sebastian points out, you can also avoid <defunct> processes by either setting the SIGCHLD signal disposition to SIG_IGN (ignore the signal) or by using the SA_NOCLDWAIT flag when registering a SIGCHLD signal handler (or when resetting the default disposition with SIG_DFL) using sigaction. In this case, however, the child's exit status will not be made available to the parent - it is simply discarded.

Related

Does a process whose parent has died normally continues execution?

I know that a child process whose parent has died becomes a zombie process but when that happens, does it continue execution normally ?
What I have read so far seems to suggest that yes but I have not found confirmation and my programming ventures seems to suggest otherwise.
Whether a child's parent has exited has no effect on whether it continues running. Assuming the child has access to the resources that it needs, it will continue to run normally.
This is important when writing a daemon, since typically the started process forks twice, and it is the grandchild that ultimately runs as a service.
Note that there are some reasons a child may end up exiting abnormally due to a parent exiting. For example, if the parent is an interactive shell and it exits, the terminal may disappear, and as a result the child may receive a SIGHUP. However, in that case, the reason the child will have exited is because it received a signal it didn't handle, and if it had set up a suitable handler, it would have continued running.

Why do we need a wait() system call?

Hello I am new to learning about system calls. I am currently learning about fork() and wait() system calls. I know that fork() creates a new child process. What confuses me is the wait() call.
This is what I understand so far:
(1) When a process dies, it goes into a 'Zombie State' i.e. it does not release its PID but waits for its parent to acknowledge that the child process has died and then the PID is released
(2) So we need a way to figure out when the child process has ended so that we don't leave any processes in the zombie state
I am confused with the following things:
(1) When running a C program where I fork a new child process, if I don't call wait() explicitly, is it done internally when the child process ends? Because you could still write a block of code in C where you run fork() without wait() and it seems to work fine?
(2) What does wait() do? I know it returns the PID of the child process that was terminated, but how is this helpful/related to releasing the PID of the terminated process?
I am sorry for such naive questions but this is something I was really curious about and I couldn't find any good resources online! Your help is much appreciated!
wait isn't about preventing zombie states. Zombie states are your friend.
POSIX more or less lets you do two things with pids: signal them with kill or reap them (and synchronize with them) with wait/waitpid/waittid.
The wait syscalls are primarily for waiting on a process to exit or die from a signal (though they can also be used to wait on other process status changes such as the child becoming stopped or the child waking up from being stopped).
Secondarily, they're about reaping exit/died statuses, thereby releasing (zombified) pids.
Until you release a pid with wait/waitpid/waittid, you can continue flogging the pid with requests for it to die (kill(pid,SIGTERM);) or with some other signal (other then SIGKILL) and you can rest assured the pid represents the process you've forked off and that you're not accidentally killing someone else's process.
But once you reap a zombified pid by waiting on it, then the pid is no longer yours and another process might take it (which typically happens after some time, as pids in the system typically increment and then wrap arround).
That's why auto-wait would be a bad idea (in some cases it isn't and then you can achieve it with globally with signal(SIGCHLD,SIG_IGN);) and why (short-lived) zombies states are your friend. They keep the child pid stable for you until you're ready to release it.
If you exit without releasing any of your children's pids, then you don't have to worry about zombie children anymore--your child processes will be reparented to the init process, which will wait on them for you when they die.
When you call fork(), a new process is created with you being its parent. When the child process finishes its running with a call to exit(), its process descriptor is still kept in the kernel's memory. It is your responsibility as its parent to collect its exit code, which is done with a call to wait() syscall. wait() blocks the parent process until one of its childrens is finished.
Zombie process is the name given to a process whose exit code was never collected by its parent.
Regarding to your first question - wait() is not called automatically as zombie processes wouldn't exist if it did. It is your responsibility as a programmer. Omitting the call to wait() will still work as you mentioned - but it is considered a bad practice.
Both this link and this link explains it good.

Process stuck in exit, shows as zombie but cannot be reaped

I have a process that's monitored by its parent. The child encountered an error that caused it to call abort. The process does not tamper with the abort process, so it should proceed as expected (dump core, terminate). The parent is supposed to detect the child's termination and trigger a series of events to respond to the failure. The child is multi-threaded and complex.
Here's what I see from ps:
F UID PID PPID PRI NI VSZ RSS WCHAN STAT TTY TIME COMMAND
0 1000 4929 1272 20 0 85440 6792 wait S+ pts/2 0:00 rxd
1 1000 4930 4929 20 0 0 0 exit Zl+ pts/2 38:21 [rxd] <defunct>
So the child (4930) has terminated. It is a zombie. I cannot attach to it, as expected. However, the parent (4929) stays blocked in:
int i;
// ...
waitpid (-1, &i, 0);
So it seems like the child is a zombie but somehow has not completed everything necessary for its parent to reap it. The WCHAN field of exit is, I think, a valuable clue.
The platform is 64-bit Linux, Ubuntu 13.04, kernel 3.8.0-30. The child doesn't appear to be dumping core or doing anything. I've left the system for several minutes and nothing changed.
Does anyone have any ideas what might be causing this or what I can do about it?
Update: Another interesting bit of information -- if I kill -9 the parent process, the child goes away. This is kind of baffling, since the parent process is trivial, just blocking in waitpid. Also, I don't get any core dump (from the child) when this problem happens.
Update: It seems the child is stuck in schedule, called from exit_mm, called from do_exit. I wonder why exit_mm would call schedule. And I wonder why killing the parent would unstick it.
I finally figured it out! The process was actually doing useful work all this time. The process held the last reference to a large file on a slow filesystem. When the process terminates, the last reference to the file is release, forcing the OS to reclaim the space. The file was so large that this required tens of thousands of I/O operations, taking 10 minutes or more.

reading the exit value of a defunct process

I have a dead process that is now in the defunct state which means that its parent process has not read its exit value. (and it is not going to read it)
I know that the exit value is stored somewhere in the kernel for the parent process to read but, is there a way for me to read that value if I am not the parent process ?
Ideally, I would be able to do this from a shell or an abritrary C/python/your-favorite-language program.
[edit]: This is not a question on how to reap the child or kill it. I do not care if it uses up a slot in the process table. I just want to know what its exit value is. i.e., I would like to read task_struct->exit_code in the kernel.
Mathieu
One thing you might be able to do is to send the parent SIGCHLD, which tells it that a child has died. If the program is of any quality, it will reap the process.
kill -s SIGCHLD parentpid
No. Attempting to call waitpid() for a process that is not one of the calling process's children will result in ECHILD. You will need to kill the parent process, causing the child to reparent to init, which will subsequently reap it.

Why child process still alive after parent process was killed in Linux?

Someone told me that when you killed a parent process in linux, the child would die.
But I doubt it. So I wrote two bash scripts, where father.shwould invoke child.sh
Here is my script:
Now I run bash father.sh, you could check it ps -alf
Then I killed the father.sh by kill -9 24588, and I guessed the child process should be terminated but unfortunately I was wrong.
Could anyone explain why?
thx
No, when you kill a process alone, it will not kill the children.
You have to send the signal to the process group if you want all processes for a given group to receive the signal
For example, if your parent process id has the code 1234, you will have to specify the parentpid adding the symbol minus followed by your parent process id:
kill -9 -1234
Otherwise, orphans will be linked to init, as shown by your third screenshot (PPID of the child has become 1).
-bash: kill: (-123) - No such process
In an interactive Terminal.app session the foreground process group id number and background process group id number are different by design when job control/monitor mode is enabled. In other words, if you background a command in a job-control enabled Terminal.app session, the $! pid of the backgrounded process is in fact a new process group id number (pgid).
In a script having no job control enabled, however, this may not be the case! The pid of the backgrounded process may not be a new pgid but a normal pid! And this is, what causes the error message -bash: kill: (-123) - No such process, trying to kill a process group but only specifying a normal pid (instead of a pgid) to the kill command.
# the following code works in Terminal.app because $! == $pgid
{
sleep 100 &
IFS=" " read -r pgid <<EOF
$(ps -p $! -o pgid=)
EOF
echo $$ $! $pgid
sleep 10
kill -HUP -- -$!
#kill -HUP -- -${pgid} # use in script
}
pkill -TERM -P <ProcessID>
This will kill both Parent as well as child
Generally killing the parent also kills the child.
The reason that you are seeing the child still alive after killing the father is because the child only will die after it "chooses" (the kernel chooses) to handle the SIGKILL event. It doesn't have to handle it right away. Your script is running a sleep() command (i.e. in the kernel), which will not wake up to handle any events whatsoever until the sleep is completed.
Why is PPID #1? The parent has died and is no longer in the process table. child.sh isn't linked inexplicably to init now. It simply has no running parent. Saying it is linked to init creates the impression that if we somehow leave init, that init has control over shutting down the process. It also creates the impression that killing a parent will make the grandparent the owner of a child. Both are not true. That child process still exists in the process table and is running, but no new events based upon it's process ID will be handled until it handles SIGKILL. Which means that the child is a pre-zombie, walking dead, in danger of being labeled .
Killing in the process group is different, and is used to kill the siblings, and the parent by the process group #. It's probably also important to note that "killing a process" is not "killing" per se, in the human way, where you expect the process to be destroyed and all memory returned as though it never was. It just sends a particular event, among many, to the process for it to handle. If the process does not handle it properly, then after a while the OS will come along and "clean it up" forcibly.
It (killing) doesn't happen right away because the child (or even the parent) could have written something to disk and be waiting for I/O to complete or doing some other critical task that could compromise system stability or file integrity.

Resources