I would like to know if there is any simple script to automatically restart a screened background process.
The process keeps getting killed, and I haven't managed to write a working script myself :(.
Thanks in advance! <3
I believe the safest (though not the easiest) way to do this is a cron job that checks whether the process is running and restarts it if it is not. This method is "safer" because if you use a loop like the one ivanivan suggested and that script crashes, the program will never be restarted; with cron, the check script is called every minute regardless.
For example, your cron could be:
* * * * * env DISPLAY=:0 /folder/testscript >/dev/null 2>&1
The env DISPLAY=:0 may or may not be needed, depending on your script. If it is needed, run echo $DISPLAY in your session to find the value to use in your case.
For example, your testscript could be:
#!/bin/bash
# Restart mainscript if it is not in the process list
testvar="$(ps aux | grep -s "mainscript" | grep -sv "grep -s mainscript")"
if [ -z "$testvar" ]; then nohup /folder/mainscript & fi
# Sleep and run a second test
sleep 30
testvar="$(ps aux | grep -s "mainscript" | grep -sv "grep -s mainscript")"
if [ -z "$testvar" ]; then nohup /folder/mainscript & fi
exit 0
In the example above, the testscript checks whether the mainscript is running (and restarts it if necessary) twice every minute.
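The check-sleep-check pattern in testscript could also be factored into a small helper, so that the number of checks per cron run becomes a parameter. This is only a sketch: run_checks and /folder/checkonce are hypothetical names that do not appear in the original answer, and /folder/checkonce is assumed to be a script that performs a single check-and-restart.

```shell
#!/bin/sh
# run_checks COUNT INTERVAL CMD [ARGS...]
# Hypothetical helper: run CMD COUNT times, sleeping INTERVAL seconds
# between runs (but not after the last one).
run_checks() {
    rc_count=$1
    rc_interval=$2
    shift 2
    rc_i=1
    while [ "$rc_i" -le "$rc_count" ]; do
        "$@"
        if [ "$rc_i" -lt "$rc_count" ]; then
            sleep "$rc_interval"
        fi
        rc_i=$((rc_i + 1))
    done
}

# From a script called by cron every minute, two checks 30 seconds apart
# would be (the path is a placeholder):
# run_checks 2 30 /folder/checkonce
```

With a count of 2 and an interval of 30 this behaves like the testscript above; a count of 4 and an interval of 15 would check four times per minute.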
Debian 8.6. No root.
I can use cron.
I need to check whether an application ( php ./somescript & ) running in the background has stopped, and restart it if so. How can I check this using bash?
Of course, there is ps aux | grep ....., but how do I automate it?
I suggest taking a look at the @reboot keyword in man 5 crontab to start a job once at server startup.
One way to go about it would be:
Cron:
* * * * * env DISPLAY=:0 /folder/myscript >/dev/null 2>&1
The env DISPLAY=:0 may or may not be needed, depending on your script. If it is needed, run echo $DISPLAY in your session to find the value to use in your case.
Script:
#!/bin/bash
# Restart somescript if it is not in the process list
testvar="$(ps aux | grep -s "somescript" | grep -sv "grep")"
if [ -z "$testvar" ]; then nohup /folder/somescript & fi
exit 0
All of this can and should be fine-tuned to your needs, but I believe this example could serve you well.
Edit: I fixed a small oversight in the code (I added | grep -sv "grep" to remove grep's own process from the testvar results).
I'm trying to have a lightweight memory profiler for the matlab jobs that are run on my machine. There is either one or zero matlab job instance, but its process id changes frequently (since it is actually called by another script).
So here is the bash script that I put together to log memory usage:
#!/bin/bash
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
if [[ -n $pid ]]
then
\grep VmSize /proc/$pid/status
else
echo "no pid"
fi
when I run this script in bash like this:
./script.sh
it works fine, giving me the following result:
VmSize: 1289004 kB
which is exactly what I want.
Now, I want to run this periodically. So I run it with watch, like this:
watch ./script.sh
But in this case I only receive:
no pid
Please note that I know the matlab job is still running, because I can see it with the same pid in top, and besides, I know each matlab job takes several hours to finish.
I'm pretty sure that something is wrong with the quotes I have when setting pid. I just can't figure out how to fix it. Does anyone know what I'm doing wrong?
PS.
In the man page of watch, it says that commands are executed by sh -c. I did run my script like sh -c ./script and it works just fine, but watch doesn't.
Why don't you use a loop with sleep command instead?
For example:
#!/bin/bash
while true
do
# Look up the pid on every pass, since it changes from job to job
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
if [[ -n $pid ]]
then
\grep VmSize /proc/$pid/status
else
echo "no pid"
fi
sleep 10
done
Here the script sleeps (waits) for 10 seconds between checks. You can set the interval you need by changing the sleep command. For example, to make the script sleep for an hour use sleep 1h.
To exit the script press Ctrl-C.
This
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
could be changed to:
pid=$(pidof MATLAB)
I have no idea why it's not working in watch but you could use a cron job and make the script log to a file like so:
#!/bin/bash
pid=$(pidof MATLAB) # Just to follow previously given advice :)
if [[ -n $pid ]]
then
echo "$(date): $(\grep VmSize /proc/$pid/status)" >> logfile
else
echo "$(date): no pid" >> logfile
fi
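The >> redirection creates logfile if it does not exist yet, so there is no need to touch it first. Do use an absolute path for logfile, though, since cron does not run the job from the script's directory.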
You might try just running ps command in watch. I have had issues in the past with watch chopping lines and such when they get too long.
It can be fixed by making the terminal you are running the command from wider or changing the column like this (may need to adjust the 160 to your liking):
export COLUMNS=160;
My cron is like below:
$ crontab -l
0,15,30,45 * * * * /vas/app/check_cron/cronjob.sh 2>&1 > /vas/app/check_cron/cronjob.log; echo "Exit code: $?" >> /vas/app/check_cron/cronjob.log
$ more /vas/app/check_cron/cronjob.sh
#!/bin/sh
echo "starting script";
/usr/local/bin/rsync -r /vas/app/check_cron/cron1/ /vas/app/check_cron/cron2/
echo "completed running the script";
$ ls -l /usr/local/bin/rsync
-rwxr-xr-x 1 bin bin 411494 Oct 5 2011 /usr/local/bin/rsync
$ ls -l /vas/app/check_cron/cronjob.sh
-rwxr-xr-x 1 vas vas 153 May 14 12:28 /vas/app/check_cron/cronjob.sh
If I run it manually, the script runs fine.
$ /vas/app/check_cron/cronjob.sh 2>&1 > /vas/app/check_cron/cronjob.log; echo "Exit code: $?" >> /vas/app/check_cron/cronjob.log
When it is run by crontab, duplicate processes pile up, more than 30 in 24 hours, until I kill them manually.
$ ps -ef | grep cron | grep -v root | grep -v grep
vas 24157 24149 0 14:30:00 ? 0:00 /bin/sh /vas/app/check_cron/cronjob.sh
vas 24149 8579 0 14:30:00 ? 0:00 sh -c /vas/app/check_cron/cronjob.sh 2>&1 > /vas/app/check_cron/cronjob.log; ec
vas 24178 24166 0 14:30:00 ? 0:00 /usr/local/bin/rsync -r /vas/app/check_cron/cron1/ /vas/app/check_cron/cron2/
vas 24166 24157 0 14:30:00 ? 0:01 /usr/local/bin/rsync -r /vas/app/check_cron/cron1/ /vas/app/check_cron/cron2/
Please advise how to make this run properly, so that the processes stop when they finish and none are left running on the system.
BR,
Noel
The output you provide seems normal: the first two processes are just /bin/sh running your cron script, and the latter two are the rsync processes.
It might be a permission issue if the crontab is not the same user as the one you use for testing, causing the script to take longer when run from cron. You can add -v, -vv, or even -vvv to the rsync command for increased output and then observe the cron email after each run.
One method to prevent multiple running instances of a script is to use a lock file of some sort. I find it easy to use mkdir for this purpose.
#!/bin/sh
# Use only the script's base name: $0 is a full path when run from cron,
# and /tmp/<full path>.lock would point into directories that do not exist
LOCK="/tmp/`basename "$0"`.lock"
# If mkdir fails then the lock already exists
mkdir "$LOCK" > /dev/null 2>&1
[ $? -ne 0 ] && exit 0
# We clean up the lock when the script exits for any reason
trap "{ rmdir $LOCK ; exit 0 ; }" EXIT
echo "starting script";
/usr/local/bin/rsync -r /vas/app/check_cron/cron1/ /vas/app/check_cron/cron2/
echo "completed running the script";
Just make sure you have some kind of cleanup when the OS starts in case it doesn't clean up /tmp by itself. The lock might be left there if the script crashes, is killed or is running when the OS is rebooted.
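A leftover lock can also be detected at run time rather than only at boot. The sketch below is one possible approach, not part of the original answer: acquire_lock and release_lock are hypothetical helper names, the lock directory holds a pid file, and a lock whose recorded PID no longer answers kill -0 is treated as stale. It assumes a POSIX shell and that all instances run as the same user (kill -0 also fails for a live process owned by someone else). There is a small race between mkdir and writing the pid file, so treat it as an illustration only.

```shell
#!/bin/sh
# acquire_lock DIR: take the lock by creating DIR and recording our PID in it.
# If DIR already exists but the recorded PID is no longer alive, treat the
# lock as stale, remove it, and try once more.
acquire_lock() {
    if mkdir "$1" 2>/dev/null; then
        echo "$$" > "$1/pid"
        return 0
    fi
    al_owner=$(cat "$1/pid" 2>/dev/null)
    if [ -n "$al_owner" ] && kill -0 "$al_owner" 2>/dev/null; then
        return 1    # lock is held by a live process
    fi
    # Stale lock: clean up and retry once.
    rm -f "$1/pid"
    rmdir "$1" 2>/dev/null
    mkdir "$1" 2>/dev/null && echo "$$" > "$1/pid"
}

# release_lock DIR: remove the pid file and the lock directory.
release_lock() {
    rm -f "$1/pid"
    rmdir "$1" 2>/dev/null
}

# Usage inside the cron script (the lock path is a placeholder):
# acquire_lock /tmp/cronjob.sh.lock || exit 0
# trap 'release_lock /tmp/cronjob.sh.lock' EXIT
```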
Why do you worry? Is something not working? From the parent process IDs I can deduce that the shell (PID=24157) forks an rsync (24166), and the rsync forks another rsync (24178). Looks like that's just how rsync operates...
It's certainly not cron starting two rsync processes.
Instead of CRON, you might want to have a look at the Fat Controller
It works similarly to CRON but has various built-in strategies for managing cases where instances of the script you want to run would overlap.
For example, you could specify that the currently running instance is killed and a new one started, or you could specify a grace period in which the currently running instance has to finish before then terminating it and starting a new one. Alternatively, you can specify to wait indefinitely.
There are more examples and full documentation on the website:
http://fat-controller.sourceforge.net/
I have a script that I mean to be run from cron that ensures that a daemon that I wrote is working. The contents of the script file are similar to the following:
daemon_pid=`ps -A | grep -c fsdaemon`
echo "daemon_pid: " $daemon_pid
if [ $daemon_pid -eq 0 ]; then
echo "restarting fsdaemon"
/etc/init.d/fsdaemon start
fi
When I execute this script from the command prompt, the line that echoes the value of $daemon_pid is reporting a value of 2. This value is two regardless of whether my daemon is running or not. If, however, I execute the command with back quotes and then examine the $daemon_pid variable, the value of $daemon_pid is now one. I have also tried single stepping through the script using bashdb and, when I examine the variables using that tool, they are what they should be.
My question therefore is: why is there a difference in the behaviour between when the script is executed by the shell versus when the commands in the script are executed manually? I'm sure that there is something very fundamental that I am missing.
You're very likely encountering the grep as part of the 'answer' from ps.
To help fully understand what is happening, turn off the -c option, to see what data is being returned from just ps -A | grep fsdaemon.
To solve the issue, some systems have a p(rocess)grep (pgrep). That will work, OR
ps -A | grep -v grep | grep -c fsdaemon
is a common idiom you will see, but at the expense of another process.
The cleanest solution is,
ps -A | grep -c '[f]sdaemon'
The regular expression syntax should work with all greps, on all systems.
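To see why the bracket form helps, here is a small sketch that uses a fake listing instead of real ps output, so the result does not depend on what is running. The helper names fake_listing and count_with and the sample lines are made up for illustration; the effect applies to listings that show full command lines, where grep's own line contains the pattern exactly as typed.

```shell
#!/bin/sh
# fake_listing PATTERN: a stand-in for a process listing with full command
# lines. The second line plays the grep process itself, whose command line
# contains the pattern text exactly as it was typed.
fake_listing() {
    printf '%s\n' \
        "root   101 /usr/bin/fsdaemon" \
        "erik   202 grep -c $1"
}

# count_with PATTERN: count the listing lines that match PATTERN.
count_with() {
    fake_listing "$1" | grep -c "$1"
}

count_with 'fsdaemon'      # prints 2: the daemon line and the grep line
count_with '[f]sdaemon'    # prints 1: only the daemon line
```

The regular expression [f]sdaemon matches the text fsdaemon, but grep's own command line contains the literal characters [f]sdaemon, in which f is followed by ] rather than s, so grep no longer matches itself.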
I hope this helps.
The problem is that grep itself shows up... Try running this command with anything after grep -c:
eple:~ erik$ ps -a | grep -c asdfladsf
1
eple:~ erik$ ps -a | grep -c gooblygoolbygookeydookey
1
eple:~ erik$
What does ps -a | grep fsdaemon return? Just look at the processes actually listed... :)
Since this is Linux, why not try the pgrep? This saves you a pipe, and you don't end up with grep reporting back the daemon script itself running.
Any process with arguments including that name will add to the count: grep, and your script.
Using ps to look for a process isn't really reliable; you should use a lock file.
As several people have pointed out already, your process count is inflated because ps | grep detects (1) the script itself and (2) the subprocess created by the backquotes, which inherits the name of the main script. So an easy solution is to change the name of the script to something that doesn't include the name you're looking for. But you can do better.
The "best-practice" solution that I would suggest is to use the facilities provided by your operating system. It's not uncommon for an init script to create a PID file as part of the process of starting your daemon; in other words, instead of just running the daemon itself, you use a wrapper script that starts the daemon and then writes the process ID to a file somewhere. If start-stop-daemon exists on your system (and I think it's fairly common these days), you can use that like so:
start-stop-daemon --start --quiet --background \
--make-pidfile --pidfile /var/run/fsdaemon.pid -- /usr/bin/fsdaemon
(obviously replace the path /usr/bin/fsdaemon as appropriate) to start it, and then
start-stop-daemon --stop --quiet --pidfile /var/run/fsdaemon.pid
to stop it. start-stop-daemon has other options that might be useful to you, which you can investigate by reading the man page.
If you don't have access to start-stop-daemon, you can write a wrapper script to do basically the same thing, something like this to start:
echo "$$" > /var/run/fsdaemon.pid
exec /usr/bin/fsdaemon
and this to stop:
kill $(< /var/run/fsdaemon.pid)
rm /var/run/fsdaemon.pid
(this is pretty crude, of course, but it should normally work).
Anyway, once you have the setup to generate a PID file, whether by using start-stop-daemon or not, you can update your check script to this:
daemon_pid=`ps --no-headers --pid $(< /var/run/fsdaemon.pid) | wc -l`
if [ $daemon_pid -eq 0 ]; then
echo "restarting fsdaemon"
/etc/init.d/fsdaemon restart
fi
(one would think there would be a concise command to check whether a given PID is running, but I don't know it).
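One concise candidate for that check is kill -0, which sends no signal and only reports whether the PID could be signalled. A minimal sketch follows; pid_alive and pidfile_alive are hypothetical helper names. Note that kill -0 also fails when the process exists but belongs to another user, so this assumes the check runs as the daemon's owner or as root.

```shell
#!/bin/sh
# pid_alive PID: succeed if a signal could be delivered to PID.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# pidfile_alive FILE: succeed if FILE is readable and names a live PID.
pidfile_alive() {
    [ -r "$1" ] || return 1
    pa_pid=$(cat "$1")
    [ -n "$pa_pid" ] && pid_alive "$pa_pid"
}

# Usage in the check script:
# if ! pidfile_alive /var/run/fsdaemon.pid; then
#     echo "restarting fsdaemon"
#     /etc/init.d/fsdaemon restart
# fi
```

Unlike the ps | wc -l version, this also handles a missing PID file, which is treated as "not running".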
If you don't want to (or can't) create a PID file, I would at least suggest pgrep instead of ps | grep, since pgrep will search directly for a process by name and won't find anything that just happens to include the same string.
daemon_pid=`pgrep -x -c fsdaemon`
if [ $daemon_pid -eq 0 ]; then
echo "restarting fsdaemon"
/etc/init.d/fsdaemon restart
fi
The -x means "match exactly", and -c works as with grep.
By the way, it seems a bit misleading to name your variable daemon_pid when it is actually a count.
This
#!/bin/bash
if [ `ps -ef | grep "91.34.124.35" | grep -v grep | wc -l` -eq 0 ]; then sh home/asfd.sh; fi
or this?
ps -ef | grep "91\.34\.124\.35" | grep -v grep > /dev/null
if [ "$?" -ne "0" ]
then
sh home/asfd.sh
else
echo "Process is running fine"
fi
Hello, how can I write a shell script that looks through the running processes and, if there isn't a process name CONTAINING 91.34.124.35, executes a file in a certain place? I want this to run every 30 seconds in a continuous loop; I think there was a sleep command.
You can't use cron alone, since in the implementations I know the smallest unit is one minute. You can use sleep, but then your checking process will always be running (with cron it is started fresh each time).
To use sleep:
while true ; do
if ! pgrep -f '91\.34\.124\.35' > /dev/null ; then
sh /home/asfd.sh
fi
sleep 30
done
If your pgrep has the option -q to suppress output (as on BSD) you can also use pgrep -q without redirecting the output to /dev/null
First of all, you should be able to reduce your script to simply
if ! pgrep "91\.34\.124\.35" > /dev/null; then ./your_script.sh; fi
To run this every 30 seconds via cron (because cron only runs every minute) you need 2 entries - one to run the command, another to delay for 30 seconds before running the same command again. For example:
* * * * * root if ! pgrep "91\.34\.124\.35" > /dev/null; then ./your_script.sh; fi
* * * * * root sleep 30; if ! pgrep "91\.34\.124\.35" > /dev/null; then ./your_script.sh; fi
To make this cleaner, you might be able to first store the command in a variable and use it for both entries. (I haven't tested this).
CHECK_COMMAND="if ! pgrep '91\.34\.124\.35' > /dev/null; then ./your_script.sh; fi"
* * * * * root eval "$CHECK_COMMAND"
* * * * * root sleep 30; eval "$CHECK_COMMAND"
p.s. The above assumes you're adding that to /etc/crontab. To use it in a user's crontab (crontab -e) simply leave out the username (root) before the command.
I would suggest using watch:
watch -n 30 launch_my_script_if_process_is_dead.sh
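Either way is fine. You can save it in a .sh file and run it from crontab, but note that cron fires at most once a minute, so a 30-second interval needs either a second entry prefixed with sleep 30 or a loop inside the script. Let me know if you want to know how to use crontab.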
Try this:
if ! ps -ef | grep "91\.34\.124\.35" | grep -v grep > /dev/null
then
sh home/asfd.sh
else
echo "Process is running fine"
fi
No need to use test. if itself will examine the exit code.
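You can save your script in a file named myscript.sh
and then run it through cron. Note that cron's smallest interval is one minute, and */30 in the first field means every 30 minutes, not every 30 seconds:
*/30 * * * * /full/path/for/myscript.sh
For a 30-second interval you can use a while loop instead: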
# cat script1.sh
#!/bin/bash
while true; do /bin/sh /full/path/for/myscript.sh ; sleep 30; done &
# ./script1.sh
Thanks.
I have found daemonizing critical scripts very effective.
http://cr.yp.to/daemontools.html
You can use monit for this task. See the documentation. It is available on most Linux distributions and has a straightforward config. Find some examples in this post
For your app it will look something like
check process myprocessname
matching "91\.34\.124\.35"
start program = "/home/asfd.sh"
stop program = "/home/dfsa.sh"
If monit is not available on your platform you can use supervisord.
I also found a very similar question, "Repeat command automatically in Linux". It suggests using watch.
Use cron for the "loop every 30 seconds" part.