Linux: process of a bash shell launched by crontab still running after the shell is terminated - linux

There is an issue I would like to solve: I'm going to deploy a PHP web application under the Apache server on a Red Hat Linux 5 production environment (called PENV); I am developing this application on a development environment (called DENV) running Linux Mint 20.3.
On DENV, I created a crontab for the user www-data containing the following scheduled command:
0 4,12 * * * sh /bdir/s_etlShell.sh >/dev/null 2>&1;
The script /bdir/s_etlShell.sh starts every day at 4:00 AM and at noon, and its execution lasts between 2 and 10 minutes. It also writes to a log file, /bdir/logshell.txt.
The last two instructions of the script are:
echo "SHELL TERMINATED" >> /bdir/logshell.txt
exit
After the 4:00 AM and noon runs, I find SHELL TERMINATED as the final line of /bdir/logshell.txt, but when I run the following command in a terminal
ps fax | grep "s_etlShell.sh" | grep -v grep
I get the following output (the PIDs obviously vary):
1596 ? Ss 0:00 \_ /bin/sh -c sh /bdir/s_etlShell.sh >/dev/null 2>&1
1605 ? S 0:00 \_ sh /bdir/s_etlShell.sh
The script's processes look as if they were still active even though the script has terminated. I would expect no output instead.
I need to check the status of the script's execution in the web application via a PHP script (check_etl_shell_status.php), which is called every 2 seconds by the following JavaScript function:
function loadCall() {
setInterval(function () {$("#id_content").load("check_etl_shell_status.php",'q='); }, 2000);
}
The function loadCall() is called when the home page loads.
The content of check_etl_shell_status.php is the following:
<?php
$output = shell_exec('ps fax | grep "s_etlShell.sh" | grep -v grep');
if ($output) {
echo "shell is still running...";
} else {
echo "shell terminated";
}
?>
The output message is displayed inside a div on the home page:
...
<div id="id_content"></div>
...
Is there a way to make sure that, once the script has terminated, whether it was launched by crontab or on demand by the web application, I get the right information about its status?
Thanks to whoever can help me.
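One way to avoid parsing ps output altogether is to have the script record its own PID and let the checker test that PID directly. The following is only a minimal sketch: the PID-file path /bdir/s_etlShell.pid and the function name etl_status are assumptions made for illustration, and the ETL script is assumed to write its PID on start and remove the file on exit.

```shell
#!/bin/sh
# Hypothetical helper: report whether the process recorded in a PID file is alive.
# Assumption: s_etlShell.sh runs `echo $$ > /bdir/s_etlShell.pid` at startup
# and removes that file (for example from a trap on EXIT) when it finishes.
etl_status() {
    pidfile=$1
    # No readable PID file: the script is not running (or never started).
    if [ ! -r "$pidfile" ]; then
        echo "terminated"
        return 0
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only checks that the PID exists
    # and that the caller is allowed to signal it.
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        echo "running"
    else
        echo "terminated"
    fi
}

# Example call (path is the assumed one from the comment above):
# etl_status /bdir/s_etlShell.pid
```

check_etl_shell_status.php could call such a helper through shell_exec instead of the ps pipeline. Since both cron and Apache run as www-data here, kill -0 should be permitted; a reused PID after a crash can still give a wrong answer, so this is a starting point rather than a guarantee.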

Related

How to find the commands executed by a /bin/bash process? (Linux)

TL;DR:
I want to get the command running (if running) in the /bin/bash processes.
I want a script that can identify, for a /bin/bash process, the command that /bin/bash is running. I tried to find it in /proc/[pid]/cmdline, but it only shows /bin/bash.
Is there a way to do this, or is what I'm asking for impossible? :o
I'm asking because when I run ps -ef, some processes (like ssh) show how they're running.
user 30410 30409 0 10:58 pts/0 00:00:00 ssh name@127.0.0.1 <-- here
There is the ssh command fully printed.
We can see the same if I run ps -ef | grep "/bin/bash"; it returns:
user 20080 4999 0 13:40 pts/9 00:00:00 grep /bin/bash <-- here
There is the command grep /bin/bash printed.
But if I run a bash loop like while true; do echo "hello"; done
and then run ps -ef | grep "while", it returns nothing!
That depends on what type of command you are looking for.
For external commands run from a shell, "ps -efH" shows a hierarchical list of running processes, in which you can find the info you need.
Bash built-in commands don't show up in the ps list; you will have to enable script debugging using "set -x" and then monitor stderr to see what the script is doing.
To answer the edits you made:
while is a shell keyword handled by bash itself, so it doesn't show up. Note that echo is also a bash builtin, so it appears in the "ps -efH" output I mentioned above only when the external /bin/echo is invoked.
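To check how the shell classifies a word before searching for it in ps, the POSIX command -V utility can be used. A small sketch; the function name describe_word is made up for illustration:

```shell
#!/bin/sh
# describe_word is a hypothetical helper: it prints how the current shell
# resolves a word (keyword, builtin, function, or the path of an external file).
describe_word() {
    command -V "$1" 2>/dev/null || echo "$1: not found"
}

describe_word while   # reported as a shell keyword: never a separate process
describe_word echo    # reported as a shell builtin: runs inside the shell itself
describe_word sh      # an external program: gets its own PID and a ps entry
```

Only the last kind starts a new process, which is why only external commands can be found in a ps listing.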

Cron script to restart memcached not working

I have a script in cron to check memcached and restart it if it's not working. For some reason it's not functioning.
Script, with permissions:
-rwxr-xr-x 1 root root 151 Aug 28 22:43 check_memcached.sh
Crontab entry:
*/5 * * * * /home/mysite/www/check_memcached.sh 1> /dev/null 2> /dev/null
Script contents:
#!/bin/sh
ps -eaf | grep 11211 | grep memcached
if [ $? -ne 0 ]; then
service memcached restart
else
echo "eq 0 - memcache running - do nothing"
fi
It works fine if I run it from the command line but last night memcached crashed and it was not restarted from cron. I can see cron is running it every 5 minutes.
What am I doing wrong?
Do I need to use the following instead of service memcached restart?
/etc/init.d/memcached restart
I have another script that checks to make sure my lighttpd instance is running and it works fine. It works a little differently to verify it's running but is using the init.d call to restart things.
Edit - Resolution: Using /etc/init.d/memcached restart solved this problem.
What usually causes crontab problems is command paths. In the command line, the paths to commands are already there, but in cron they're often not. If this is your issue, you can solve it by adding the following line into the top of your crontab:
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
This will give cron explicit paths to look through to find the commands your script runs.
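A quick way to see whether a command would be found under cron is to look it up with a stripped-down environment. This is a sketch under the assumption that cron starts jobs with PATH=/usr/bin:/bin, which is a common default but not guaranteed on every system; the function name in_cron_path is made up for this example.

```shell
#!/bin/sh
# Hypothetical helper: resolve a command name using a minimal, cron-like PATH.
# Assumption: cron's default PATH is /usr/bin:/bin (check your cron's man page).
in_cron_path() {
    # env -i clears the environment; /bin/sh is given by absolute path so the
    # lookup of the shell itself does not depend on PATH.
    env -i PATH=/usr/bin:/bin /bin/sh -c 'command -v "$1"' sh "$1"
}

# "service" often lives in /sbin or /usr/sbin, which this PATH omits; in that
# case the lookup fails and the fallback message is printed.
in_cron_path service || echo "service: not found with cron-like PATH"
```

If the lookup fails here but works in an interactive shell, a missing PATH entry is a likely cause of the cron failure.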
Also, your shebang in your script is wrong. It needs to be:
#!/bin/bash
I suspect the problem is with the grep 11211: the meaning of the number isn't clear, and that grep may not be matching the desired process.
I think you need to log the actions of this script - then you see what's actually happening.
#!/bin/bash
exec >> /tmp/cronjob.log 2>&1
set -xv
cat2 () { tee -a /dev/stderr; }
ps -ef | cat2 | grep 11211 | grep memcached
if [ $? -ne 0 ]; then
service memcached restart
else
echo "eq 0 - memcache running - do nothing"
fi
exit 0
The set -xv output is captured to a log file in /tmp. The cat2 will copy the stdin to the log file, so you can see what grep is acting upon.
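Save the code below as check_memcached.sh:
#!/bin/bash
# Ask systemd for the unit state; it prints "active" when memcached is up.
MEMCACHED_STATUS=$(systemctl is-active memcached.service)
if [[ ${MEMCACHED_STATUS} == 'active' ]]; then
echo "Service running, so exiting"
# Nothing to do is not an error, so report success to cron.
exit 0
else
service memcached restart
fi
Then you can schedule it with cron.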

Linux bash script that kills a process (not started by me) after x amount of time

I'm pretty inexperienced with Linux bash. That being said, I have a CentOS7 machine that runs a COTS application server. This application server runs other processes that sometimes hang. Since I have no control over the start of these processes, I'm looking for a script that runs every 2 minutes that kills processes of the name "spicer" that have been running for longer than 10 minutes. I've looked around and have only been able to find answers for processes that are run and owned by me.
I use the command ps -eo pid,command,etime | grep spicer to get all the spicer processes. The output of this command looks like:
18216 spicer -l/opt/otmm-10.5/Spi 14:20
18415 spicer -l/opt/otmm-10.5/Spi 11:49
etc...
18588 grep --color=auto spicer
I don't know if there's a way to parse this directly in bash. I'm also not well-versed at all in other Linux tools. I know that awk (or gawk) could possibly help.
EDIT
I have no control over the data that the process is working on.
What about wrapping the spicer executable and starting it with the timeout command? Let's say it is installed in /usr/bin/spicer. Then issue:
cp /usr/bin/spicer{,.orig}
echo '#!/bin/bash' > /usr/bin/spicer
echo 'timeout 10m spicer.orig "$@"' >> /usr/bin/spicer
Another approach would be to create a cron job definition in /etc/cron.d/kill_spicer, like this:
* * * * * root kill $(ps --no-headers -C spicer -o pid,etimes | awk '$2>=600{print $1}')
The cron job runs every minute; it uses ps to obtain the spicer processes that have been running for 10 minutes or longer and passes their PIDs to kill.
You probably even want kill -9 if the process is hanging.
You can use the -C option of ps to select processes by name.
ps --no-headers -C spicer -o pid,etime
Then you can use cut to filter the results, if the spacing is consistent. On my system the pid field takes up 8 characters, so I'd use
kill $(ps --no-headers -C spicer -o pid,etime | cut -c-8)
If the spacing is inconsistent (but if so, what kind of messed up ps are you using? :-P), you can use awk '{ print $1 }' instead of cut.
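The elapsed-time filter can be kept separate from the ps call, which makes it easy to try on sample input before anything is killed. The sketch below assumes a ps that supports the etimes column (elapsed seconds), as in the cron line above; the function name pids_older_than and the sample rows are illustrative, not real output.

```shell
#!/bin/sh
# Hypothetical helper: read "PID ELAPSED_SECONDS" rows on stdin and print the
# PIDs whose elapsed time is at least the given limit (in seconds).
pids_older_than() {
    awk -v limit="$1" '$2 + 0 >= limit + 0 { print $1 }'
}

# Dry run on made-up rows: only the first two have run for 600 s or more.
printf '%s\n' '18216 860' '18415 709' '18500 42' | pids_older_than 600

# Real use (left commented out on purpose): kill only if something matched.
# pids=$(ps --no-headers -C spicer -o pid,etimes | pids_older_than 600)
# [ -n "$pids" ] && kill $pids
```

Checking that the PID list is non-empty avoids calling kill without arguments when no spicer process is old enough.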

My crond job doesn't work as expected, why?

I created a shell script to check a tomcat instance status. If the instance is not started, then start it:
if [ `ps -ef | grep 'travelco' | grep -v grep | wc -l` -eq 0 ];then
sudo /home/q/tools/bin/restart_tomcat.sh /home/www/travelco/
else
echo 'travelco started'
fi
Then I tested the script and it worked well. But after I added it as a crond job, this script didn't work as expected.
I used crontab -e, and added
*/1 * * * * /home/yuliang.jin/travelcoCheck.sh
After that, even though I can see the script being executed in the cron log (sudo tail -f /var/log/cron), the tomcat instance was not started. Why?
There's a sudo in your script but are you sure that your current user has the permission to execute /home/q/tools/bin/restart_tomcat.sh without password authentication?
You should add the script to /etc/sudoers to allow your current user to execute the script without password, or you can just sudo crontab -e to run the script as root (and don't forget to delete sudo in your script if you do so).
If you have any other option, don't use sudo in a cron job.
travelcoCheck.sh will be matched by grep travelco and is not filtered out by grep -v grep, so wc -l will always be at least 1, and restart_tomcat.sh will never run.
(As a side note: whether or not your ps-parsing stack gets caught by ps is something of a dark art and is full of corner cases and race conditions and generally difficult to get to work right. Stuff like this is why dbus was invented.)
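The self-match is easy to reproduce without touching real processes. In this sketch the process listing is a made-up sample (not real ps output) and count_matches is a hypothetical helper that applies the same pipeline as the script; the second pattern assumes that Tomcat's command line contains the /home/www/travelco/ path.

```shell
#!/bin/sh
# Hypothetical helper: count lines of a process listing (stdin) that match a
# pattern, using the same grep | grep -v grep | wc -l pipeline as the script.
count_matches() {
    grep -- "$1" | grep -v grep | wc -l
}

# Made-up listing: tomcat is NOT running, only the check script and its grep.
listing='user 4001 1 0 10:00 ? 00:00:00 /bin/sh /home/yuliang.jin/travelcoCheck.sh
user 4002 4001 0 10:00 ? 00:00:00 grep travelco'

# The check script's own name contains "travelco", so it is counted.
printf '%s\n' "$listing" | count_matches 'travelco'
# A pattern that cannot match the script name finds nothing in this listing.
printf '%s\n' "$listing" | count_matches '/home/www/travelco/'
```

Renaming the check script or matching on a more specific string are two ways to keep the count at zero when Tomcat is down.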

"Ambiguous output redirect" trying to send both stdout and stderr to mailx from a command sent to at

I have a bash script called test.sh which, for the sake of simplicity, prints one line to stdout and one line to stderr.
test.sh:
#!/bin/bash
echo "this is to stdout"
echo "this is to stderr" 1>&2
I want to run the script test.sh at 7:00 PM, but only if certain conditions are met. To this end, I have another bash script called schedule.sh, which checks some stuff and then submits the command to at to be run later.
I want the output of test.sh (both stdout and stderr) to be sent to me in an email. I use mailx to do this so I can get a nice subject name.
Furthermore, I want at to shut up. No output from at because it always sends me ugly emails (no subject line) if at produces any output.
schedule.sh:
#!/bin/bash
my_email="me@example.com" # Email is a variable
# Check some stuff, exit if certain conditions not met
echo "~/test.sh 2>&1 | mailx -s\"Cool title\" $my_email" | at 7:00 PM &> /dev/null
What's interesting is that when I run schedule.sh from cron (which runs the script with sh), it works perfectly. However, when I manually run schedule.sh from the terminal (NB: I'm using tcsh), at (not mailx) sends me an email saying
Ambiguous output redirect.
I'm not sure why the shell I run schedule.sh from makes a difference, when schedule.sh is a bash script.
Here is my thinking in looking at schedule.sh. Everything within the quotation marks "~/test.sh 2>&1 | mailx -s\"Cool title\" me@email.com" should be an argument to at, and at runs that argument as a command using sh. The redirection 2>&1 | is in the style of sh for this reason.
When I remove 2>&1 and only pipe the stdout of test.sh to mailx, it does work; however, I receive 2 emails: one with stdout from mailx and another from stderr from at.
What gives? How can I make this work regardless of the shell I'm calling it from?
Thanks.
edit:
uname -o says my OS is GNU/Linux
Here is uname -a if it helps:
Linux [hostname censored] 2.6.9-89.ELlargesmp #1 SMP Mon Jun 22 12:46:58 EDT 2009 x86_64 x86_64 x86_64 GNU/Linux
When I check the at contents using at -c, here's what I see:
#!/bin/sh
# atrun uid=xxxxx gid=xxxxx
# mail username 0
# ...
SHELL=/bin/tcsh; export SHELL
# ...
${SHELL:-/bin/sh} << `(dd if=/dev/urandom count=200 bs=1 2>/dev/null|LC_ALL=C tr -d -c '[:alnum:]')`
~/test.sh 2>&1 | mailx -s"Cool title" me@example.com
I'm having a hard time understanding the second to last line... is this going to execute using $SHELL or /bin/sh?
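The second-to-last line relies on the shell's default-value expansion: ${SHELL:-/bin/sh} expands to the value of SHELL when that variable is set and non-empty, and to /bin/sh otherwise. A small sketch (the function name pick_shell is made up for illustration):

```shell
#!/bin/sh
# Hypothetical helper that mirrors the expansion used in the at job file.
pick_shell() {
    printf '%s\n' "${SHELL:-/bin/sh}"
}

( SHELL=/bin/tcsh; pick_shell )   # prints /bin/tcsh
( unset SHELL; pick_shell )       # prints /bin/sh
( SHELL=; pick_shell )            # empty counts as unset: prints /bin/sh
```

Because the job file exports SHELL=/bin/tcsh a few lines earlier, the command would be handed to tcsh in this case, and tcsh does not accept the sh-style 2>&1 redirection.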
The command executed via at is:
~/test.sh 2>&1 | mailx -s\"Cool title\" $my_email
The behavior of at command varies from one system to another. On Linux, the command is executed using /bin/sh. In fact, on my system (Linux Mint 14), it prints a warning message:
$ echo 'printenv > at.env' | at 19:24
warning: commands will be executed using /bin/sh
On Solaris, the command is executed by the shell specified by the current value of the $SHELL environment variable. Using an account where my default shell is /bin/tcsh on Solaris 9, I get:
% echo 'printenv > at.env' | at 19:25
commands will be executed using /bin/tcsh
job 1397874300.a at Fri Apr 18 19:25:00 2014
% echo 'printenv > at.env' | env SHELL=/bin/sh at 19:28
commands will be executed using /bin/sh
job 1397874480.a at Fri Apr 18 19:28:00 2014
Given that at's behavior is inconsistent (and frankly confusing), I suggest having it execute just a single command, with any I/O redirection being performed inside that command. That's the best way to ensure that the command will be executed correctly regardless of which shell is used to execute it.
For example (untested code follows):
echo '#!/bin/bash' > tmp.bash
echo "~/test.sh 2>&1 | mailx -s\"Cool title\" $my_email" >> tmp.bash
chmod +x tmp.bash
echo "./tmp.bash" | at 7:00 PM
