Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
On my server machines, some processes go into the defunct state every day. This affects my CPU usage. I need to write a shell script that kills each defunct process ID and its parent ID.
For example, when I run the command:
ps -ef | grep defunct
it finds many entries. Of those, I need to kill only the "[chrome] <defunct>" processes.
Sample entry:
bitnami 12217 12111 0 Feb09 pts/3 00:00:00 [chrome] <defunct>
I need to kill chrome entries of this type. Can anyone suggest some sample code to kill them?
Defunct processes do not go away until the parent process collects the corpse or the parent dies. When the parent process dies, the defunct processes are inherited by PID 1 (classically PID 1; more generally, whichever system process is designated for the job), and PID 1 is designed to wait for dead children and remove them from the process table. So, strictly, defunct processes only go away when their parent collects the corpse; when the original parent dies, the new parent collects the corpse, so the defunct process goes away at last.
So, either write the parent code so that it waits on its dead children, or kill the parent process.
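As a rough sketch of the second option, the shell function below lists the parent PIDs of zombie processes with a given command name. Those parents are what you would signal, since the zombies themselves are already dead. The name zombie_parents is made up for illustration, and the sketch assumes a procps-style ps that accepts -eo pid=,ppid=,stat=,comm= and reports zombies with a state that starts with Z.

```shell
#!/bin/sh
# Print the unique parent PIDs of zombie processes whose command name is $1.
# Expects lines of "PID PPID STAT COMM" on stdin (assumed ps output format).
zombie_parents() {
    awk -v name="$1" '$3 ~ /^Z/ && $4 == name { print $2 }' | sort -un
}

# Possible usage (left as a comment because it sends signals):
# ps -eo pid=,ppid=,stat=,comm= | zombie_parents chrome |
# while read -r ppid; do
#     # Never signal PID 1; it adopts and reaps orphans on its own.
#     [ "$ppid" -gt 1 ] && kill -s TERM "$ppid"
# done
```

Printing the list first and reviewing it before sending any signal is the safer way to use this, because killing a parent also ends whatever useful work that parent was doing.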
Note that defunct processes consume very few resources: basically, a slot in the process table and the resource (timing) information that the parent can ask for.
Having said that, last year I was working on a machine where there were 3 new defunct processes per minute, owned by a system process other than PID 1, that were not being harvested. Things like ps took a long, long, long time when the number of defunct processes climbed into the hundreds of thousands. (The solution was to install the correct fix pack for the o/s.) They are not completely harmless, but a few are not a major problem.
It's already dead. The parent needs to reap it and then it will go away.
Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question does not appear to be about programming within the scope defined in the help center.
Closed 1 year ago.
I'm developing a program (the grandparent process) that automatically relaunches a process (the parent process), which in turn calls two other processes (the child processes), in case of errors.
If one of the child processes misbehaves, the parent process tries to close the application gracefully and the grandparent process restarts everything. However, in case of a bug or unexpected behavior, the grandparent process:
kills the parent process (which kills all the children)
restarts the parent process
Probably due to a problem in my code, the parent processes survive as zombies, and sometimes I find my embedded Linux system with 12 or 20 zombies. I know that zombies use very few resources (if I'm not mistaken, only their entry in the process table).
My question: is there a theoretical limit to the number of zombies?
My question: is there a theoretical limit to the number of zombies?
Yes. It is whatever the maximum size of the kernel's process table is. This will vary from kernel to kernel and according to adjustable kernel parameters, but it is likely to be at least in the thousands.
But as long as we're here, let's address a couple of other things:
the grand parent process [...] kills the parent process (which kills all the children)
Killing a process does not automatically kill its children. They will be inherited by process ID 1, which you can observe in the process list if the processes live long enough. Cleaning those up after they terminate is one of the responsibilities of process 1, which may be why you have the impression that you are killing the grandchildren -- you probably don't see them left behind as zombies.
If you want to forcibly kill the children along with their parent, then you should be able to do so by putting the parent process in its own process group, and killing the whole group. (You need a separate process group so that the grandparent does not kill itself.)
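The process-group approach could be sketched in shell roughly as follows. This assumes the setsid utility (from util-linux or busybox) is installed; the function names and the group_leader variable are invented for this illustration.

```shell
#!/bin/sh
# Start a command in a new session, and therefore in a new process group.
# In a non-interactive script the background job is not a process-group
# leader, so setsid is assumed not to fork, and $! is both the PID and the
# PGID of the new group's leader. The result is stored in $group_leader.
start_in_own_group() {
    setsid "$@" &
    group_leader=$!
}

# Send SIGTERM to every process in the group led by PID $1.
# A negative PID argument addresses the whole process group.
kill_group() {
    kill -s TERM -- "-$1"
}
```

A caller would run something like start_in_own_group ./parent, and later kill_group "$group_leader" followed by wait "$group_leader" so that the dead parent does not linger as a zombie. There is a short window right after launch in which the new group does not exist yet, so signalling it immediately can fail.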
Probably due to a problem in my code, the parent processes survive as zombies
This happens when a process's parent continues to run but does not wait(), waitpid(), or waitid() for the process after it terminates. In fact, that's closely related to why zombie processes are so light: all they carry is the data that could be reported via one of those functions. Thus, "survive" is not a particularly apt description: a zombie process is no longer running; all that remains is some data about how it terminated.
I believe the only negative effect of keeping zombie processes around is that they take up space in the kernel process table. The number of zombies you can keep around is bounded by the number of PIDs your kernel will hand out, which you can query with cat /proc/sys/kernel/pid_max (other limits, such as per-user process limits, may be reached first).
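To see how close a system is to that bound, a small sketch like the one below counts current zombies and prints the value alongside pid_max. The function names are hypothetical, and it assumes a procps-style ps -eo stat= plus the Linux /proc filesystem.

```shell
#!/bin/sh
# Count lines on stdin whose first field (a ps STAT value) starts with Z.
count_zombies() {
    awk '$1 ~ /^Z/ { n++ } END { print n + 0 }'
}

# Print the current zombie count next to the kernel's PID ceiling (Linux only).
zombie_report() {
    zombies=$(ps -eo stat= | count_zombies)
    pid_max=$(cat /proc/sys/kernel/pid_max)
    echo "zombies: $zombies, pid_max: $pid_max"
}
```

Calling zombie_report from cron or a monitoring script would show whether the zombie count is stable or climbing over time.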
I'm a beginner in Operating Systems and Linux, just a question on zombie processes.
I don't understand why parent processes need to reap child processes. Couldn't Linux be designed so that whenever a child process terminates, it is reaped automatically and immediately, without waiting for its parent process, which would save programmers' time? Another question: why does a zombie process still consume system memory resources? Isn't it already terminated, so nothing needs to be maintained?
Is it possible to stop a PID from being reused?
For example, if I run a job myjob in the background with myjob &, and get the PID using PID=$!, is it possible to prevent the Linux system from re-using that PID until I have checked that the PID no longer exists (the process has finished)?
In other words I want to do something like:
myjob &
PID=$!
do_not_use_this_pid $PID
wait $PID
allow_use_of_this_pid $PID
The reasons for wanting to do this do not make much sense in the example given above, but consider launching multiple background jobs in series and then waiting for them all to finish.
Some programmer dude rightly points out that no 2 processes may share the same PID. That is correct, but not what I am asking here. I am asking for a method of preventing a PID from being re-used after a process has been launched with a particular PID. And then also a method of re-enabling its use later after I have finished using it to check whether my original process finished.
Since it has been asked for, here is a use case:
launch multiple background jobs
get PIDs of background jobs
prevent PIDs from being re-used by another process after a background job terminates
check for PIDs of "background jobs", i.e. to ensure the background jobs finish
[note: if PID re-use were disabled for the PIDs of the background jobs, those PIDs could not be taken by a new process launched after a background job terminated]*
re-enable PIDs of background jobs
repeat
*Further explanation:
Assume 10 jobs launched
Job 5 exits
New process started by another user, for example, they login to a tty
New process has same PID as Job 5!
Now our script checks for Job 5 termination, but sees PID in use by tty!
You can't "block" a PID from being reused by the kernel. However, I am inclined to think this isn't really a problem for you.
but consider launching multiple background jobs in series and then waiting for them all to finish.
A simple wait (without arguments) would wait for all the child processes to complete. So, you don't need to worry about the PIDs being reused.
When you launch several background processes, it's indeed possible that PIDs may be reused by other processes.
But it's not a problem because you can't wait on a process unless it's your child process.
Otherwise, checking whether one of the background jobs you started has completed by any means other than wait is always going to be unreliable.
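A minimal sketch of that approach: launch the jobs, remember each PID from $!, then wait on each PID to collect its exit status. The function name run_and_wait_all is made up for this example, and each argument is assumed to be a command string suitable for sh -c.

```shell
#!/bin/sh
# Run each argument as a background job, then wait for every one of them.
# Prints "PID STATUS" per job and returns the number of jobs that failed.
run_and_wait_all() {
    pids=""
    for cmd in "$@"; do
        sh -c "$cmd" &
        pids="$pids $!"
    done
    failed=0
    for pid in $pids; do
        status=0
        # wait only works on our own children, so a recycled PID that now
        # belongs to an unrelated process can never be mistaken for our job.
        wait "$pid" || status=$?
        echo "$pid $status"
        [ "$status" -eq 0 ] || failed=$((failed + 1))
    done
    return "$failed"
}
```

For example, run_and_wait_all 'make docs' 'make tests' would print one line per job and return non-zero if either failed. No ps polling is involved at any point.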
Until you've retrieved the return value of the child process, it will continue to exist in the kernel. That also means that its PID is bound to it and can't be re-used during that time.
Further suggestion to work around this - if you suspect that a PID assigned to one of your background jobs is reassigned, check it in ps to see if it still is your process with your executable and has PPID (parent PID) 1.
If you are afraid of reusing PID's, which won't happen if you wait as other answers explain, you can use
echo 4194303 > /proc/sys/kernel/pid_max
to decrease your fear ;-)
From what I gather, programs that run as pid 1 may need to take special precautions such as capturing certain signals.
It's not altogether clear how to correctly write a pid 1. I'd rather not use runit or supervisor in my case. For example, supervisor is written in python and if you install that, it'll result in a much larger container. I'm not a fan of runit.
Looking at the source code for runit is interesting but, as usual, comments are virtually non-existent and don't explain what's being done for what reason.
There is a good discussion here:
When the process with pid 1 die for any reason, all other processes are killed with KILL signal
When any process having children dies for any reason, its children are reparented to process with PID 1
Many signals which have default action of Term do not have one for PID 1.
The relevant part for your question:
you can’t stop process by sending SIGTERM or SIGINT, if process have not installed a signal handler
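To illustrate the signal-handler point, here is a minimal sketch of a shell wrapper that could run as PID 1: it installs handlers for TERM and INT, forwards them to the real program, and returns that program's exit status. The name run_as_init is hypothetical. This only covers signal handling and is not a complete init; in particular, it makes no attempt to manage orphans that get reparented to it.

```shell
#!/bin/sh
# Run "$@" as a child, forward TERM/INT to it, and return its exit status.
run_as_init() {
    "$@" &
    child=$!
    # Per the discussion quoted above, PID 1 needs explicit handlers for
    # these signals; here both are forwarded to the child as TERM.
    trap 'kill -s TERM "$child" 2>/dev/null' TERM INT
    status=0
    wait "$child" || status=$?
    # A trapped signal makes wait return early, so keep waiting for as long
    # as the child is still around. If the child vanishes in between, the
    # status may reflect the signal rather than the child's own exit code.
    while kill -0 "$child" 2>/dev/null; do
        status=0
        wait "$child" || status=$?
    done
    trap - TERM INT
    return "$status"
}
```

A container entrypoint script could define this function and end with run_as_init "$@" followed by exit $?, so that stopping the container delivers TERM to the real program instead of being silently dropped.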
Closed. This question is off-topic. It is not currently accepting answers.
Closed 10 years ago.
EDIT: More detailed answers here: https://serverfault.com/questions/454192/my-linux-server-number-of-processes-created-and-context-switches-are-growing
I'm seeing strange behaviour on my server (a VPS) :-/. When I do cat /proc/stat, I can see that about 50-100 processes are created each second and that about 800k-1200k context switches happen! All of this is with the server completely idle, with no traffic and no programs running.
Top shows 0 load average and 100% idle CPU.
I've stopped all unneeded services (httpd, mysqld, sendmail, nagios, named...) and the problem still happens. I also run ps -ALf every second and don't see any changes: only a new ps process is created each time, and its PID is just the previous one + 1, so no new processes are being created. That made me think the processes counter in /proc/stat must be counting threads (yes, it seems that processes in /proc/stat counts thread creation too, as this states: http://webcache.googleusercontent.com/search?q=cache:8NLgzKEzHQQJ:www.linuxhowtos.org/System/procstat.htm&hl=es&tbo=d&gl=es&strip=1).
I've changed to the /proc directory and run cat [PID]/status for every PID listed by ls (including kernel ones), and in no process are voluntary_ctxt_switches or nonvoluntary_ctxt_switches growing at the rate the counters in /proc/stat do (just a few tens per second).
I've also run strace -p PID against every process to see whether any of them is creating threads or something similar, but the only process with any activity is ssh, and that activity is just the read/write operations for the data it is sending to my terminal.
After that, I ran vmstat -s and saw that forks grows at the same rate as processes in /proc/stat. As http://linux.die.net/man/2/fork says, each fork() creates a new PID, but the PIDs on my server are not growing!
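For measuring these rates more precisely than by eyeballing repeated cat output, a small sketch like this samples a /proc/stat counter twice and prints the increase per second. The function names are invented for illustration; the processes and ctxt field names are the ones /proc/stat uses on Linux.

```shell
#!/bin/sh
# Print the value of one single-valued counter (e.g. "processes" or "ctxt")
# from /proc/stat-formatted text on stdin.
stat_counter() {
    awk -v key="$1" '$1 == key { print $2; exit }'
}

# Sample counter $1 from /proc/stat twice, $2 seconds apart (default 1),
# and print the average increase per second (integer division).
stat_rate() {
    interval=${2:-1}
    before=$(stat_counter "$1" < /proc/stat)
    sleep "$interval"
    after=$(stat_counter "$1" < /proc/stat)
    echo $(( (after - before) / interval ))
}
```

Running stat_rate processes 5 and stat_rate ctxt 5 on the idle machine would give the per-second fork and context-switch rates to compare against the per-process numbers in /proc/[PID]/status.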
The last thing I can think of is that all the process data that /proc/stat and vmstat -s show is shared with all the other VPSes hosted on the same machine, but I don't know if that is correct... If someone can shed some light on this I would be really grateful.