Killing a pid tree in a script - linux

I have a script that loads a few programs but refuses to exit when sent kill -15, and if I use kill -9, all of the child processes stay running.
I have tried:
parent=$(pgrep -f ./start_server.sh)
children=$(pgrep -P $parent)
sudo kill -9 $children $parent &
and this kills MOST of the pids (2964 2976 2977 2978 2979). The problem is that sometimes one of the children has its own children, and those keep running (2976 has 10, etc.).
What I mean when I say they keep running is, for example: kill -9 2964 will cause 2976 2977 2978 2979 to become children of 2738.
Also, the above code causes the script it's in to exit as soon as the kill command returns (and prints Killed to the console). I tried running it with a trailing & or with exec, but the script still exits immediately.
At the end of the day, I have this ./start_server.sh started via initd, and when it crashes it cannot restart properly because of these orphaned processes. I want to put a line in start that will murder all the processes that it spawned.
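One way to reach grandchildren as well is to walk the tree with pgrep -P before sending any signal. The following is only a sketch: list_tree and kill_tree are hypothetical helper names, pgrep from procps is assumed to be installed, and the walk is inherently racy if the processes keep forking while it runs.

```shell
#!/bin/bash
# list_tree PID: print PID and every descendant PID, one per line.
list_tree() {
    local pid=$1 child
    echo "$pid"
    for child in $(pgrep -P "$pid"); do
        list_tree "$child"
    done
}

# kill_tree PID [SIGNAL]: signal the whole tree in one go (default: TERM).
kill_tree() {
    local sig=${2:-TERM}
    # Collect every PID first; once the parent dies its children are
    # reparented and can no longer be found through it.
    kill -s "$sig" $(list_tree "$1") 2>/dev/null
}
```

With the variables from the question, this would be called as kill_tree "$parent" KILL (prefixed with sudo only if the processes belong to another user).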

Related

kill -s SIGTERM kills parent process and one level child process only

I've been experimenting with writing a command to kill a parent and all its children recursively. I have the scripts below.
parent.sh:
#!/bin/bash
/home/oracle/child.sh &
sleep infinity
child.sh:
#!/bin/bash
sleep infinity
I started the command using
su oracle -c parent.sh &
I see a process tree like below
[root@source ~]# ps -ef | grep "/home/oracle"
root 14129 1171 0 12:39 pts/1 00:00:00 su oracle -c /home/oracle/parent.sh
oracle 14130 14129 0 12:39 ? 00:00:00 /bin/bash /home/oracle/parent.sh
oracle 14131 14130 0 12:39 ? 00:00:00 /bin/bash /home/oracle/child.sh
When I send SIGTERM to 14129 using kill -s SIGTERM 14129, it appears to kill 14129, and then 14130 goes down immediately as well; but 14131 stays up for a very long time. The last-level child appears to have been reparented and has become a zombie.
oracle 14131 1 0 12:39 ? 00:00:00 /bin/bash /home/oracle/child.sh
If kill doesn't terminate any child processes, why did 14130 get killed when I sent SIGTERM to 14129?
If kill can kill child processes, why does it go only one level down? Is the behavior here guaranteed?
The relevant part of what pilcrow provided is this:
SIGNALS
Upon receiving either SIGINT, SIGQUIT or SIGTERM, su terminates
its child and afterwards terminates itself with the received
signal.
>> The child is terminated by SIGTERM,
>> (then) after unsuccessful attempt (to kill with SIGTERM) and
>> (after) 2 seconds of delay (,) the child is (then) killed
>> by SIGKILL [a second, harsher method].
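That harsher method, SIGKILL, cannot be caught, so a child killed that way gets no chance to terminate its own children. They are left orphaned and reparented to PID 1, which is the state the question describes as a zombie.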
I haven't used it myself, but it seems that something like
killall --process-group parent.sh
would kill all processes tied to the process group associated with the "parent.sh" script. BUT ... not sure if "--wait" will serve you well, if the method used in the attempt to terminate is not being accepted.
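The process-group idea can also be tried with plain kill, which signals a whole group when given a negative ID. This is a rough sketch rather than a tested recipe for the setup above: kill_group is a hypothetical helper name, ps from procps is assumed, and it only makes sense if the tree was started in its own process group (for example via setsid); otherwise the group may include the caller itself.

```shell
#!/bin/bash
# kill_group PID [SIGNAL]: signal every process in PID's process group.
kill_group() {
    local pgid
    # Look up the process group ID of the given PID.
    pgid=$(ps -o pgid= -p "$1" | tr -d ' ')
    [ -n "$pgid" ] || return 1
    # A negative ID tells kill to signal the whole group; "--" ends option parsing.
    kill -s "${2:-TERM}" -- "-$pgid"
}
```

Because every descendant normally inherits the group, this reaches all levels of the tree without walking it, unless a descendant has moved itself into a new group or session.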

Suspended child process exit if parent process exit

I have the following process tree
test1.sh
\- test2.sh
\- sleep 600
Normally, if I kill the test1.sh process, the child processes test2.sh and sleep 600 continue running. But if I suspend the sleep 600 process by sending it a signal (SIGSTOP or SIGTSTP) and then kill the test1.sh process, the children test2.sh and sleep 600 exit. Why?
Here is my test program:
test1.sh
#!/bin/sh
./test2.sh
test2.sh
#!/bin/sh
sleep 600
Test steps:
1. Run test1.sh:
$ ./test1.sh
2. Open a new console and suspend the child process:
$ kill -19 < sleep pid > or kill -20 < sleep pid >
3. Kill the parent process test1.sh:
$ kill < test1.sh pid >
You will find that after step 3, test2.sh and sleep 600 have exited.
But if I only run step 1 and step 3, skipping step 2, the test2.sh and sleep 600 processes do not exit.
Can anyone explain this? Many thanks.
When you kill the test1.sh process, you leave test2.sh an orphan, so you need to know what happens to orphaned processes in your operating system.
When test2.sh is running and its parent dies, the OS reparents it to the init process and lets it keep executing. So both test2.sh and sleep are still up even though you have killed test1.sh.
When sleep is stopped (signal 19 or 20) and test1.sh dies, the process group containing test2.sh and sleep becomes orphaned while one of its members is stopped (assuming test1.sh was started from an interactive shell). Since there will no longer be any shell or tty capable of resuming the stopped process, the OS does not leave it as it is. POSIX systems such as Linux send SIGHUP followed by SIGCONT to every process in such a group, which avoids the problem of many stopped, orphaned processes lying around the system. The default action for SIGHUP is to terminate, so both sleep and test2.sh exit.
From the GNU man page:
While a process is stopped, no more signals can be delivered to it
until it is continued, except SIGKILL signals and (obviously) SIGCONT
signals. The signals are marked as pending, but not delivered until
the process is continued. The SIGKILL signal always causes termination
of the process and can’t be blocked, handled or ignored. You can
ignore SIGCONT, but it always causes the process to be continued
anyway if it is stopped. Sending a SIGCONT signal to a process causes
any pending stop signals for that process to be discarded. Likewise,
any pending SIGCONT signals for a process are discarded when it
receives a stop signal.
When a process in an orphaned process group (see Orphaned Process
Groups) receives a SIGTSTP, SIGTTIN, or SIGTTOU signal and does not
handle it, the process does not stop. Stopping the process would
probably not be very useful, since there is no shell program that will
notice it stop and allow the user to continue it. What happens instead
depends on the operating system you are using. Some systems may do
nothing; others may deliver another signal instead, such as SIGKILL or
SIGHUP. On GNU/Hurd systems, the process dies with SIGKILL; this
avoids the problem of many stopped, orphaned processes lying around
the system.
By the way, if you always want to kill them, you can add a trap to the main process to catch signals and shut down the children properly.
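A minimal sketch of such a trap follows. supervise and child_pid are hypothetical names, and the sketch assumes bash. Note that the child is started in the background and the script sits in wait: bash defers traps while a foreground command is running, so the ./test2.sh call in test1.sh would have to become something like supervise ./test2.sh.

```shell
#!/bin/bash
# supervise CMD [ARGS...]: run CMD as a background child and forward TERM/INT to it.
supervise() {
    child_pid=
    # Install the handler before starting the child so no signal slips through.
    trap 'kill -TERM "$child_pid" 2>/dev/null' TERM INT
    "$@" &
    child_pid=$!
    wait "$child_pid"              # returns early with status > 128 when a trapped signal arrives
    local status=$?
    trap - TERM INT
    wait "$child_pid" 2>/dev/null  # reap the child after the signal was forwarded
    return "$status"
}
```

This only forwards the signal one level down; each level of the tree needs the same treatment, or the child has to be signalled as a process group.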

How to kill(if process exist) & restart a process in a single command line

I am in a situation where I want to kill a process if it exists and then restart it. How can I do that?
Currently I am doing this:
killall -9 inetd && /bin/inetd
If inetd is not running, I get this:
killall: /bin/inetd: no process killed
Even if inetd is not running, I want the above command to succeed.
Use ; instead of &&:
killall -9 inetd; /bin/inetd
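The difference between the two separators can be seen with a small stand-in. In this sketch, stop_service is a hypothetical function that simply fails, the way killall does when no process matches, and the echo stands in for starting /bin/inetd.

```shell
#!/bin/bash
# Stand-in for "killall -9 inetd" when inetd is not running: it just fails.
stop_service() { return 1; }

# && : the second command runs only if the first one succeeded.
with_and()  { stop_service && echo "restarted"; }

# ;  : the second command runs regardless of the first one's exit status.
with_semi() { stop_service; echo "restarted"; }
```

Calling with_and prints nothing and returns a non-zero status, while with_semi prints restarted. With ;, the status of the whole line is that of the last command, so the restart decides success.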

How to kill a process by its pid in linux

I'm new to Linux and I'm building a program that receives the name of a process, gets its PID (I have no problem with that part) and then passes the PID to the kill command, but it's not working. It goes something like this:
read -p "Process to kill: " proceso
proid= pidof $proceso
echo "$proid"
kill $proid
Can someone tell me why it isn't killing it? I know that there are other ways to do it, even with the PID, but none of them seem to work for me. I believe it's some kind of problem with the Bash language (which I just started learning).
Instead of this:
proid= pidof $proceso
You probably meant this:
proid=$(pidof $proceso)
Even so,
the program might not get killed.
By default, kill PID sends the TERM signal to the specified process,
giving it a chance to shut down in an orderly manner,
for example clean up resources it's using.
The strongest signal, which kills a process without any graceful cleanup, is KILL: use kill -KILL PID or kill -9 PID.
I believe it's some kind of problem with the bash language (which I just started learning).
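The original line you posted, proid= pidof $proceso, does not store the output of pidof in proid. Because of the space after =, Bash runs pidof $proceso with an empty proid in its environment,
so proid stays empty in the script and kill $proid then fails with a usage message.
Debugging starts with reading and understanding the error messages the software is giving you.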
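The "TERM first, KILL only if needed" approach described above could be sketched like this. stop_pid is a hypothetical helper name, the 5-second default timeout is an arbitrary assumption, and the loop relies on bash arithmetic and a sleep that accepts fractional seconds.

```shell
#!/bin/bash
# stop_pid PID [TIMEOUT]: ask politely with TERM, escalate to KILL after TIMEOUT seconds.
stop_pid() {
    local pid=$1 timeout=${2:-5} i
    kill -TERM "$pid" 2>/dev/null || return 0   # nothing to signal: already gone
    # Poll ten times per second; kill -0 only tests whether the PID can be signalled.
    for ((i = 0; i < timeout * 10; i++)); do
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 0.1
    done
    kill -KILL "$pid" 2>/dev/null               # still alive: no more cleanup chances
}
```

In the script from the question this would replace the bare kill, for example stop_pid "$proid".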
kill expects you to tell it how to kill, so there must be 64 different ways to kill your process :) They have names and numbers. The most lethal is -9. Some interesting ones include:
SIGKILL - The SIGKILL (also -9) signal forces the process to stop executing immediately. The program cannot ignore this signal. This process does not get to clean-up either.
SIGHUP - The SIGHUP signal tells a process that its controlling terminal has hung up. Many daemons treat it as a request to reload or restart. For example, "killall -SIGHUP compiz" will restart Compiz. This is useful for daemons with memory leaks.
SIGINT - This signal is the same as pressing ctrl-c. On some systems, "delete" + "break" sends the same signal to the process. The process is interrupted and stopped. However, the process can ignore this signal.
SIGQUIT - This is like SIGINT with the ability to make the process produce a core dump.
Use the following command to display the port and PID of the process:
sudo netstat -plten
and then
kill -9 PID
For example, to kill a process that is listening on port 8283 and has PID 25334, run kill -9 25334.
You have to send the SIGKILL signal with the kill command.
kill -9 [pid]
If you don't, kill sends SIGTERM, which the process may catch, act on later, or ignore. SIGKILL (-9) cannot be caught or ignored, so the OS terminates the process right away.
Try this
kill -9 <PID>
It will kill the process whose PID is given in the angle brackets.
Try the "kill -9 $proid" or "kill -SIGKILL $proid" commands.
Based on what you have there, it looks like you aren't getting the actual PID in your proid variable. If you want to capture the output of pidof, you will need to enclose that command in backticks for the old form of command substitution ...
proid=`pidof $proceso`
... or like so for the new form of command substitution.
proid=$(pidof $proceso)
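The difference between the two spellings can be seen without killing anything. In this sketch, echo 4242 is a stand-in for pidof $proceso, and before and after are illustrative variable names.

```shell
#!/bin/bash
unset proid

# Space after '=': proid is set (to empty) only in the environment of this one command.
proid= echo 4242
before=$proid            # still empty in the script itself

# Command substitution: the command's standard output becomes the value.
proid=$(echo 4242)
after=$proid

echo "before='$before' after='$after'"
```

Checking that the variable is non-empty before calling kill, for example with [ -n "$proid" ], avoids running kill with no arguments.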
I had a similar problem, only wanting to run monitor (Video surveillance) for several hours a day.
Wrote two sh scripts;
cat startmotion.sh
#!/bin/sh
motion -c /home/username/.config/motion/motion.conf
And the second;
cat killmotion.sh
#!/bin/sh
OA=$(cat /var/run/motion/motion.pid)
kill -9 $OA
These were called from crontab at the scheduled time
crontab -e
0 15 * * * /home/username/startmotion.sh
0 17 * * * /home/username/killmotion.sh
Very simple, but that's all I needed.

In Unix, I used the kill command with a PPID and it closed the terminal. Why? kill -9 ppid

sleep 5000
I run this in one terminal, and in a second terminal I run:
ps -ef | grep sleep
Then I kill this process from the second terminal using the PPID. This closes the first terminal, where I ran the sleep command, instead of leaving sleep running as an orphan.
$ ps -ef | grep sleep
trainee 4887 4864 0 17:05 pts/0 00:00:00 sleep 5000
trainee 4889 4264 0 17:05 pts/1 00:00:00 grep --color=auto sleep
kill -9 4864
Why?
Presumably the parent of the sleep is your shell. When you kill that your login is terminated and your terminal closes.
The Wikipedia article on Orphan process reads (in part),
An orphan process is a computer process whose parent process has finished or terminated, though it remains running itself.
and
A process can be orphaned unintentionally, such as when the parent process terminates or crashes. The process group mechanism in most Unix-like operating systems can be used to help protect against accidental orphaning, where in coordination with the user's shell will try to terminate all the child processes with the SIGHUP process signal, rather than letting them continue to run as orphans.

Resources