Bash script on background: how to kill child processes

Well, I'm basically trying to make a bash script that runs a node script forever. I made the following bash script:
#!/bin/bash
while true ; do
cd /myscope/
unlink nohup.out
node myscript.js
sleep 6
done & echo $! > pid
I expect that when it runs, it starts node with the given script, waits for node to exit, sleeps for 6 seconds, and then starts node again. I also expect it to run in the background and write its PID (the bash PID) to a file called "pid".
Everything explained above apparently works as expected, but I also expected that when the bash script's PID is killed, the node script would stop running. I don't know why that made sense in my mind, but in practice it doesn't work: the bash script is indeed killed, but the node script keeps running, and that is freaking me out.
I've tested it in the terminal: when I don't send the bash script to the background and press Ctrl+C, both scripts get killed.
I'm obviously misunderstanding something about the way background processes work. For God's sake, can anybody help me?

There are lots of tools that let you do what you're trying, just two off the top of my head:
https://github.com/nodejitsu/forever - A simple CLI tool for ensuring that a given script runs continuously (i.e. forever)
https://github.com/remy/nodemon - Monitor for any changes in your node.js application and automatically restart the server - perfect for development
Maybe the second one is not what you're looking for, but it's still worth a look.
If you can't or don't want to use those, the problem is that killing the parent process leaves the child process running, so you should kill the child too:
pkill -TERM -P $PID
where $PID is the parent PID.
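A minimal sketch of that cleanup, assuming pgrep and ps from procps are available. The background loop is a stand-in for the script in the question, with sleep 300 playing the role of node myscript.js, and all variable names are illustrative. The child PIDs are recorded before the parent is killed, because once the parent is gone its children are reparented and can no longer be found through the old parent PID.

```shell
#!/bin/bash
# Stand-in for the question's loop: "sleep 300" plays the role of node myscript.js.
( while true; do sleep 300; sleep 6; done ) &
loop_pid=$!
sleep 1    # give the loop time to start its child

# Record the children while the parent is still alive.
children=$(pgrep -P "$loop_pid")

# Stop the loop first so it cannot respawn anything, then stop the children.
kill -TERM "$loop_pid"
wait "$loop_pid" 2>/dev/null || true
if [ -n "$children" ]; then
    kill -TERM $children
fi
sleep 1    # give the signal time to be delivered

# Count recorded children that are still running (zombie entries do not count).
remaining=0
for pid in $children; do
    state=$(ps -o stat= -p "$pid" 2>/dev/null | tr -d ' ')
    case "$state" in
        ""|Z*) ;;                              # gone, or only a zombie entry
        *) remaining=$((remaining + 1)) ;;
    esac
done
echo "children still running: $remaining"
```

Stopping the loop before its children is a design choice of this sketch: it avoids the window in which the loop could start a fresh child between the two kill commands.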

Related

How do I stop a script running in the background in linux?

Let's say I have a silly script:
while true;do
touch ~/test_file
sleep 3
done
And I start the script into the background and leave the terminal:
chmod u+x silly_script.sh
./silly_script.sh &
exit
Is there a way for me to identify and stop that script now? The way I see it, every command is started in its own process, and I might be able to catch and kill one command like the 'sleep 3' but not the execution of the entire script. Am I mistaken? I expected a process to appear with the script's name, but it does not. If I start the script with 'source silly_script.sh', I can't find a process by the name of 'source'. Do I need to identify the instance of bash that is executing the script? How would I do that?
EDIT: There have been a few creative solutions, but so far they require the PID of the script execution to be stored right away, or the bash session not to be left with ^D or exit. I understand that this way of running scripts should maybe be avoided, but I find it hard to believe that any low-privilege user could, even by accident, start an annoying script in the background (one that is, for instance, filling the drive with garbage files or repeatedly starting new instances of some software) and that even the admin has no other option than to restart the server, because a simple script can hide its identifier without even trying.

With the help of the fine people here I was able to derive the answer I needed:
It is true that the script runs every command in its own process, so, for instance, killing the sleep 3 command won't do anything to the script being run. But through a command like sleep 3 you can find the bash instance running the script by looking for the parent process.
So after doing the above, you can run ps axf to show all processes in tree form. You will then find this section:
18660 ? S 0:00 /bin/bash
18696 ? S 0:00 \_ sleep 3
Now you have found the bash instance that is running the script, and you can stop it: kill 18660
(Of course your PID will be different from mine)
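A runnable sketch of that lookup, assuming pgrep and ps from procps are available. The script here is a stand-in started with bash -c, and the odd sleep duration 3.14159 is only there so that the example can pick out its own sleep process unambiguously. ps -o ppid= prints the parent PID of a process without a header line.

```shell
#!/bin/bash
# Stand-in for silly_script.sh running in the background.
bash -c 'while true; do sleep 3.14159; done' &
script_pid=$!
sleep 1

# Start from the only thing that is easy to spot: the sleep process.
sleep_pid=$(pgrep -f '^sleep 3\.14159$' | head -n 1)

# Ask ps for the parent of that sleep: this is the bash running the script.
parent_pid=$(ps -o ppid= -p "$sleep_pid" | tr -d ' ')
echo "sleep $sleep_pid is a child of bash $parent_pid"

kill "$parent_pid"    # stops the loop; the current sleep ends on its own
```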

The jobs command will show you all running background jobs.
You can kill background jobs by id using kill, e.g.:
$ sleep 9999 &
[1] 58730
$ jobs
[1]+ Running sleep 9999 &
$ kill %1
[1]+ Terminated sleep 9999
$ jobs
$
58730 is the PID of the backgrounded task, and 1 is the task id of it. In this case kill 58730 and kill %1 would have the same effect.
See the JOB CONTROL section of man bash for more info.
When you exit the shell, the backgrounded job may be sent a hangup signal (SIGHUP) and die (assuming that's how it handles the signal - in your simple example it is), unless you disown it first.
That signal will propagate to the sleep process, which may well ignore it and continue sleeping. If this is the case you'll still see it in ps -e output, but with a parent PID of 1, indicating that its original parent no longer exists.
You can use ps -o ppid= <pid> to find the parent of a process, or pstree -ap to visualise the job hierarchy and find the parent visually.

nohup node service using cron job on CentOS 7 [duplicate]

I have a python script that'll be checking a queue and performing an action on each item:
# checkqueue.py
while True:
check_queue()
do_something()
How do I write a bash script that will check if it's running, and if not, start it. Roughly the following pseudo code (or maybe it should do something like ps | grep?):
# keepalivescript.sh
if processidfile exists:
if processid is running:
exit, all ok
run checkqueue.py
write processid to processidfile
I'll call that from a crontab:
# crontab
*/5 * * * * /path/to/keepalivescript.sh

Avoid PID-files, crons, or anything else that tries to evaluate processes that aren't their children.
There is a very good reason why in UNIX, you can ONLY wait on your children. Any method (ps parsing, pgrep, storing a PID, ...) that tries to work around that is flawed and has gaping holes in it. Just say no.
Instead you need the process that monitors your process to be the process' parent. What does this mean? It means only the process that starts your process can reliably wait for it to end. In bash, this is absolutely trivial.
until myserver; do
echo "Server 'myserver' crashed with exit code $?. Respawning.." >&2
sleep 1
done
The above piece of bash code runs myserver in an until loop. The first line starts myserver and waits for it to end. When it ends, until checks its exit status. If the exit status is 0, it means it ended gracefully (which means you asked it to shut down somehow, and it did so successfully). In that case we don't want to restart it (we just asked it to shut down!). If the exit status is not 0, until will run the loop body, which emits an error message on STDERR and restarts the loop (back to line 1) after 1 second.
Why do we wait a second? Because if something's wrong with the startup sequence of myserver and it crashes immediately, you'll have a very intensive loop of constant restarting and crashing on your hands. The sleep 1 takes away the strain from that.
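To see the loop in action without a real server, here is a small sketch in which myserver is a hypothetical shell function that "crashes" twice and then exits cleanly. The function and the attempts counter are made up for illustration; in practice myserver would be your actual program.

```shell
#!/bin/bash
# Hypothetical stand-in for myserver: fails twice, then exits with status 0.
attempts=0
myserver() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]    # non-zero exit status until the third run
}

until myserver; do
    echo "Server 'myserver' crashed with exit code $?. Respawning.." >&2
    sleep 1
done
echo "myserver exited cleanly after $attempts attempts"
```

Running it prints two crash messages on stderr and then reports a clean exit after 3 attempts, at which point until stops looping.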
Now all you need to do is start this bash script (asynchronously, probably), and it will monitor myserver and restart it as necessary. If you want to start the monitor on boot (making the server "survive" reboots), you can schedule it in your user's cron(1) with an @reboot rule. Open your cron rules with crontab:
crontab -e
Then add a rule to start your monitor script:
@reboot /usr/local/bin/myservermonitor
Alternatively, look at inittab(5) and /etc/inittab. You can add a line in there to have myserver start at a certain init level and be respawned automatically.
Edit.
Let me add some information on why not to use PID files. While they are very popular, they are also very flawed, and there's no reason why you wouldn't just do it the correct way.
Consider this:
1. PID recycling (killing the wrong process):
   /etc/init.d/foo start: start foo, write foo's PID to /var/run/foo.pid
   A while later: foo dies somehow.
   A while later: any random process that starts (call it bar) takes a random PID; imagine it taking foo's old PID.
   You notice foo's gone: /etc/init.d/foo restart reads /var/run/foo.pid, checks to see if it's still alive, finds bar, thinks it's foo, kills it, starts a new foo.
2. PID files go stale. You need over-complicated (or should I say, non-trivial) logic to check whether the PID file is stale, and any such logic is again vulnerable to 1.
3. What if you don't even have write access or are in a read-only environment?
4. It's pointless overcomplication; see how simple my example above is. No need to complicate that, at all.
See also: Are PID-files still flawed when doing it 'right'?
By the way, even worse than PID files is parsing ps! Don't ever do this.
ps is very unportable. While you find it on almost every UNIX system, its arguments vary greatly if you want non-standard output. And standard output is ONLY for human consumption, not for scripted parsing!
Parsing ps leads to a LOT of false positives. Take the ps aux | grep PID example, and now imagine someone starting a process with a number somewhere as argument that happens to be the same as the PID you started your daemon with! Imagine two people starting an X session and you grepping for X to kill yours. It's just all kinds of bad.
If you don't want to manage the process yourself; there are some perfectly good systems out there that will act as monitor for your processes. Look into runit, for example.

Have a look at monit (http://mmonit.com/monit/). It handles start, stop and restart of your script and can do health checks plus restarts if necessary.
Or do a simple script:
while true
do
/your/script
sleep 1
done

In-line:
while true; do <your-bash-snippet> && break; done
This will restart <your-bash-snippet> continuously if it fails: && break stops the loop if <your-bash-snippet> stops gracefully (return code 0).
To restart <your-bash-snippet> in all cases:
while true; do <your-bash-snippet>; done
e.g. #1
while true; do openconnect x.x.x.x:xxxx && break; done
e.g. #2
while true; do docker logs -f container-name; sleep 2; done

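The easiest way to do it is to use flock on a file. In a Python script you'd do:
import fcntl, os, sys
lf = open('/tmp/script.lock', 'a')  # 'a' does not wipe the PID written by a running instance
try:
    fcntl.flock(lf, fcntl.LOCK_EX | fcntl.LOCK_NB)  # raises OSError if the lock is already held
except OSError:
    sys.exit('other instance already running')
lf.truncate(0)  # we hold the lock now, so replace any stale PID
lf.write('%d\n' % os.getpid())
lf.flush()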
In shell you can actually test whether it's running:
if flock -xn /tmp/script.lock -c true; then
    echo "it's not running"
    # restart it here
else
    echo -n "it's already running with PID "
    cat /tmp/script.lock
fi
But of course you don't have to test, because if it's already running and you restart it, it'll exit with 'other instance already running'.
When a process dies, all its file descriptors are closed and all its locks are automatically removed.
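The locking behaviour can be checked with a short sketch, assuming the flock utility from util-linux is installed. The temporary lock file and the sleep 3 "first instance" are illustrative stand-ins, not part of the answer above.

```shell
#!/bin/bash
lockfile=$(mktemp)

# First instance: hold the lock for a few seconds in the background.
flock -xn "$lockfile" -c 'sleep 3' &
sleep 1

# Second instance: -n makes flock fail immediately instead of waiting.
if flock -xn "$lockfile" -c 'true'; then
    second="started"
else
    second="refused"
fi
echo "second instance while first is running: $second"

# Once the first instance exits, its lock is released automatically.
wait
if flock -xn "$lockfile" -c 'true'; then
    third="started"
else
    third="refused"
fi
echo "third instance after first exited: $third"
rm -f "$lockfile"
```

The second attempt is refused while the first holder is alive, and the third succeeds without any explicit unlock step.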

You should use monit, a standard unix tool that can monitor different things on the system and react accordingly.
From the docs: http://mmonit.com/monit/documentation/monit.html#pid_testing
check process checkqueue.py with pidfile /var/run/checkqueue.pid
if changed pid then exec "checkqueue_restart.sh"
You can also configure monit to email you when it does do a restart.

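# kill -0 sends no signal; it only checks that the PID exists and can be signaled
if ! test -f "$PIDFILE" || ! kill -0 "$(cat "$PIDFILE")" 2>/dev/null; then
restart_process
# Write PIDFILE
echo $! >"$PIDFILE"
fi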

watch "yourcommand"
It will restart the process if/when it stops (after a 2s delay).
watch -n 0.1 "yourcommand"
To restart it after 0.1s instead of the default 2 seconds
watch -e "yourcommand"
To stop restarts if the program exits with an error.
Advantages:
preinstalled on most Linux systems (part of the procps package)
one line
easy to use and remember.
Drawbacks:
Only displays the result of the command on the screen once it's finished

I'm not sure how portable it is across operating systems, but you might check if your system contains the 'run-one' command, i.e. "man run-one".
Specifically, this set of commands includes 'run-one-constantly', which seems to be exactly what is needed.
From man page:
run-one-constantly COMMAND [ARGS]
Note: obviously this could be called from within your script, but also it removes the need for having a script at all.

I've used the following script with great success on numerous servers:
pid=`jps -v | grep $INSTALLATION | awk '{print $1}'`
echo $INSTALLATION found at PID $pid
while [ -e /proc/$pid ]; do sleep 0.1; done
notes:
It's looking for a java process, so I can use jps, which is much more consistent across distributions than ps
$INSTALLATION contains enough of the process path that it's totally unambiguous
Use sleep while waiting for the process to die, to avoid hogging resources :)
This script is actually used to shut down a running instance of tomcat, which I want to shut down (and wait for) at the command line, so launching it as a child process simply isn't an option for me.

I use this for my npm Process
#!/bin/bash
for (( ; ; ))
do
date +"%T"
echo Start Process
cd /toFolder
sudo process
date +"%T"
echo Crash
sleep 1
done

How to debug a running bash script

I have a bash script running on Ubuntu.
Is it possible to see the line/command being executed right now without restarting the script?
The issue is that the script sometimes never exits. This is really hard to reproduce (I have now caught it), so I can't just stop the script and start debugging.
Any help would be really appreciated.
P.S. The script logic is hard to understand, so I can't figure out why it's frozen just by thinking about it.

First find the process ID (PID) of the shell running the script; you can use ps -ef | grep <script_name>
Let's store this PID in the shell variable $PID.
Find all the child processes of $PID with:
ps --ppid $PID
You might find one or more (for example, if it's stuck in a pipeline of commands). Repeat this command a couple of times. If the output doesn't change, the script is stuck in a particular command. In that case, you can attach strace to the running child process:
sudo strace -p <child_pid>
This will show you what is being executed: either an endless loop (like reading from a pipe) or a wait for some event that never happens.
If the output of ps --ppid $PID does change, the script is advancing but is stuck somewhere, e.g. in a loop within the script. The changing commands can give you a hint about where in the script it's looping.
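The "sample the children twice and compare" step can be sketched as follows, assuming ps and pgrep from procps. The stuck script is simulated by a hypothetical bash -c command that blocks in sleep 30; the variable names are illustrative.

```shell
#!/bin/bash
# Hypothetical stuck script: it blocks in one long-running child.
bash -c 'sleep 30; echo finished' &
script_pid=$!
sleep 1

# Sample the children twice, a couple of seconds apart.
first=$(ps -o pid=,args= --ppid "$script_pid")
sleep 2
second=$(ps -o pid=,args= --ppid "$script_pid")

if [ -n "$first" ] && [ "$first" = "$second" ]; then
    verdict="stuck in: $first"
else
    verdict="still advancing"
fi
echo "$verdict"

# Clean up the stand-in: stop the parent, then the child it was waiting for.
child_pid=$(pgrep -P "$script_pid")
kill "$script_pid"
if [ -n "$child_pid" ]; then
    kill $child_pid
fi
```

An identical child list in both samples points at the command the script is blocked in, which is then the process to hand to strace.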

Kill a "background process" in Linux using a C Program

I have started my process in background and I would like to kill that process using a C program using popen().
I have tried in many ways but in vain. The reason is when I run a C code, it is executed in a sub-shell because of which I can't get the processes running in main shell.
I used $! to get the latest pid running in the background, but because of the above reason it didn't work.

my_process & pids="${pids-} $!" # start my process
sleep 10 # run for 10 seconds
kill -2 $pids # send SIGINT to the process
You can also store the PID in a file and kill it later, like this:
./process1.sh &
echo $! > /tmp/process1.pid
kill -9 `cat /tmp/process*.pid`
rm /tmp/process*.pid
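A self-contained sketch of the PID-file variant, with sleep 300 standing in for ./process1.sh and a temporary file standing in for /tmp/process1.pid. It sends SIGTERM rather than kill -9, which is a choice made for this example so that the process gets a chance to clean up, and then uses wait to confirm how the process ended.

```shell
#!/bin/bash
pidfile=$(mktemp)

sleep 300 &                 # stand-in for ./process1.sh
echo $! > "$pidfile"

pid=$(cat "$pidfile")
kill -TERM "$pid"           # ask the process to terminate

# wait reports 128 + signal number when the child was killed by a signal.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
rm -f "$pidfile"
```

An exit status of 143 (128 + 15) confirms that the process was terminated by SIGTERM.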

You should make your process into a daemon, that way you can start, end and restart it without complications.
You can start here: Best way to make a shell script daemon?

+1 on Raydel's answer
Another alternative (since there are so many ways to do things) If you have root you can also create it as a service and then start it and stop it manually using the "service" commands.
(Sorry wanted to add as a comment to Raydel's but my rep is not high enough apparently so adding as a separate answer)

Can upstart expect/respawn be used on processes that fork more than twice?

I am using upstart to start/stop/automatically restart daemons. One of the daemons forks 4 times. The upstart cookbook states that it only supports forking twice. Is there a workaround?
How it fails
If I try to use expect daemon or expect fork, upstart uses the PID of the second fork. When I try to stop the job, nobody responds to upstart's SIGKILL signal, and it hangs until you exhaust the PID space and loop back around. It gets worse if you add respawn: upstart thinks the job died and immediately starts another one.
Bug acknowledged by upstream
A bug has been entered for upstart. The solutions presented are: stick with the old sysvinit, rewrite your daemon, or wait for a rewrite. RHEL is close to 2 years behind the latest upstart package, so by the time the rewrite is released and we get updated, the wait will probably be 4 years. The daemon is written by a subcontractor of a subcontractor of a contractor, so it will not be fixed any time soon either.

I came up with an ugly hack to make this work. It works for my application on my system. YMMV.
start the application in the pre-start section
in the script section run a script that runs as long as the application runs. The pid of this script is what upstart will track.
in the post-stop section kill the application
example
env DAEMON=/usr/bin/forky-application
pre-start script
su -s /bin/sh -c "$DAEMON" joeuseraccount
end script
script
sleepWhileAppIsUp(){
while pidof $1 >/dev/null; do
sleep 1
done
}
sleepWhileAppIsUp $DAEMON
end script
post-stop script
if pidof $DAEMON;
then
kill `pidof $DAEMON`
#pkill $DAEMON # post-stop process (19300) terminated with status 1
fi
end script
A similar approach could be taken with PID files.
