I wrote a shell script which in turn calls other shell scripts using nohup. After the script completes successfully, I still see Linux processes running for the custom script I wrote.
Contents of startAllComponents.sh:
start_Server()
{
SERVER_HOME=${1}
NOHUP_LOG_FILE=${2}
logmsg "Starting the server"
/usr/bin/nohup `${SERVER_HOME}/bin/server.sh >> ${NOHUP_LOG_FILE} 2>&1 ` &
sleep 5
PID=`ps -ef|grep ${SERVER_HOME}/jvm |grep -v grep| awk '{print $2}'`
if [ "${PID}" = "" ]
then
logmsg "Couldn't get the PID after starting the server"
else
logmsg "****** Server started with PID: ${PID} ****** "
fi
}
logmsg()
{
echo "`date '+%b %e %T'` : $1"$'\n' >> /tmp/STARTUP`date '+%Y%m%d'`_.log
}
#### Send an email #####
sendEmail()
{
RECIPIENTS="gut1kor@sample.com"
SMTP="1.1.1.1:25"
mailx -s "$SUBJECT" -S "smtp=smtp://$SMTP" $RECIPIENTS < /tmp/STARTUP`date '+%Y%m%d'`_.log
}
##### Main #####
INTS[0]="/opt/server/inst01;/home/gut1kor/nohup.inst01.out"
INTS[1]="/opt/server/inst02;/home/gut1kor/nohup.inst02.out"
INTS[2]="/opt/server/inst03;/home/gut1kor/nohup.inst03.out"
echo "##### Bringing up servers on `hostname`. #####"$'\n' > /tmp/STARTUP`date '+%Y%m%d'`_.log
IS_TOTAL=${#INTS[@]}
logmsg "Total Servers are: ${IS_TOTAL}"
if [ "$IS_TOTAL" -gt "0" ]
then
for((i=0;i<IS_TOTAL;i++)) do
IFS=";" read -a arr <<< "${INTS[$i]}"
start_Server ${arr[0]} ${arr[1]}
done
fi
sendEmail
The script works as expected in bringing up the server instances, but after execution I see two processes of the script still running for each instance.
[gut1kor@HOST1 startAll]$ ps -ef|grep startAllComponents.sh
gut1kor 63699 1 0 18:44 pts/2 00:00:00 /bin/sh ./startAllComponents.sh
gut1kor 63700 63699 0 18:44 pts/2 00:00:00 /bin/sh ./startAllComponents.sh
gut1kor 63889 61027 0 18:45 pts/2 00:00:00 grep startAllComponents.sh
Why are these processes still there even after the script has finished? What changes should I make in the script?
It is most likely due to the use of the nohup utility. The problem with using the command is that it forks a new process every time it is invoked from the start_Server() function.
From the man page:
nohup No Hang Up. Run a command immune to hangups, runs the given
command with hangup signals ignored, so that the command can
continue running in the background after you log out.
To kill all the processes started by nohup, you probably need to record the process ID of each command you start and kill it at the end of the script.
/usr/bin/nohup ${SERVER_HOME}/bin/server.sh >> ${NOHUP_LOG_FILE} 2>&1 &
echo $! >> save_pid.txt # Add this line
And at the end of the script:
sendEmail
while read p; do
kill -9 $p
done <save_pid.txt
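For context, a minimal sketch of how start_Server() from the question might look with the PID recorded (the /tmp/save_pid.txt path is an assumption, not from the original script):
start_Server()
{
    SERVER_HOME=${1}
    NOHUP_LOG_FILE=${2}
    logmsg "Starting the server"
    # Run server.sh directly under nohup, with no backticks or $() around it,
    # and remember the background PID via $!.
    /usr/bin/nohup ${SERVER_HOME}/bin/server.sh >> ${NOHUP_LOG_FILE} 2>&1 &
    echo $! >> /tmp/save_pid.txt   # assumed PID file location
    sleep 5
}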
Related
I keep getting zero status even after interrupting the script.
The first script
#!/bin/bash
## call the backup script
/usr/local/bin/backup 2>&1 >/dev/null
echo $?
backup
#!/bin/bash
exitscript() {
rm -f $LOCKFILE
echo "Script Status: $1 | tee -a ${LOG}"
echo "> End Date: $(date +'%d.%m.%Y %H:%M:%S')" | tee -a ${LOG}
exit $1
}
######START#######
trap "exitscript 1" 1 2 23 24 25
rsync ${args} ${src} ${dest} | tee -a ${RSYNC_LOG}
retcode=${PIPESTATUS[0]}
if [[ ${retcode} -ne 0 ]]; then
exitcode=1
fi
exitscript ${exitcode:-0}
When the first script is run, it returns an exit status of 0 although I have tried to kill the backup script before it ends (for that I created a very large file so that rsync takes a while to copy it, giving me time to kill the script before it finishes):
ps -ef | grep -i backup
kill $PID
Another thing is that even after killing the backup script, rsync still runs. I would like rsync to stop once the script is killed, and the first script to return a non-zero status code.
Much appreciation for any suggestions. Thanks!
I assume the missing quote in echo "Script Status: $1 | tee -a ${LOG} is not relevant to the question.
When you want a function to handle the trap, you need to export that function.
And when you want to kill the child processes, you should do that inside your trap function.
I tested these adjustments with a sleep command; it should work for rsync too.
#!/bin/bash
exitscript() {
echo "Script Status: $1"
(( $pleasekill > 0 )) && kill ${pleasekill}
echo "> End Date: $(date +'%d.%m.%Y %H:%M:%S')"
exit $1
}
# Export the function exitscript
export exitscript
######START#######
pleasekill=0
trap "exitscript 1" 1 2 23 24 25
# Start I/O-friendly rsync function
sleep 30 &
pleasekill=$!
wait
exitscript 2
When you test this with the first script, use ^C or kill -1 pid_of_backup.
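Applied back to the backup script, the rsync step might look like this (a sketch: the tee pipeline is replaced with a plain redirect so that $! is rsync's own PID; variable names are taken from the question):
# Background rsync so its PID is known to the trap, then wait for it.
rsync ${args} ${src} ${dest} >> ${RSYNC_LOG} 2>&1 &
pleasekill=$!
wait ${pleasekill}
retcode=$?
if [[ ${retcode} -ne 0 ]]; then
    exitcode=1
fi
exitscript ${exitcode:-0}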
I have a problem in my script: when I try to save the PID, the wrong PID gets saved.
I suspect that the PID of the script (start.sh) is written instead of that of the screen command.
echo "Trwa uruchamianie bota muzycznego..."
if [ -e "$BINARYNAME" ]; then
if [ ! -x "$BINARYNAME" ]; then
echo "${BINARYNAME} is not executable, trying to set it"
chmod u+x "${BINARYNAME}"
fi
if [ -x "$BINARYNAME" ]; then
export LD_LIBRARY_PATH="${LIBRARYPATH}:${LD_LIBRARY_PATH}"
screen -dmS "${BASENAME}" mono "${BINARYNAME}" > /dev/null &
TEST=$0
PID=$!
echo "${PID}"
ps -p ${PID} > /dev/null 2>&1
if [ "$?" -ne "0" ]; then
echo "Bot muzyczy nie został uruchomiony."
else
echo $PID > TS3AudioBot.pid
echo "Bot muzyczny został uruchomiony."
fi
else
echo "${BINARNAME} nie jest możliwy do wykrycia, nie można uruchomić bota muzycznego."
fi
else
echo "Could not find binary, aborting"
exit 5
fi
I believe you were expecting to get the pid of a screen process in $PID. What's happening is that screen exits immediately, and $! refers to the pid of the vanished screen process rather than the detached process that is running your mono command (if it's still running).
I substituted "sleep 2000 &" for "screen -dmS ${BASENAME}" mono ${BINARYNAME} >/dev/null &" in your script and the correct $PID, that of the sleep process, was saved in the variable and acted upon. This doesn't happen with screen for the reason I described above.
You might want to consider processing the output of "screen -list" in order to get at the pid of the detached process:
root@tutorial:/var/tmp# screen -dmS 'sleeper' sleep 2000
root@tutorial:/var/tmp# screen -list
There is a screen on:
7089.sleeper (07/02/2018 04:05:57 AM) (Detached)
1 Socket in /var/run/screen/S-root.
root@tutorial:/var/tmp# ps axlww | grep 7089
5 0 7089 1 20 0 25672 2396 poll_s Ss ? 0:00 SCREEN -dmS sleeper sleep 2000
4 0 7090 7089 20 0 5808 648 hrtime Ss+ pts/0 0:00 sleep 2000
0 0 7093 2607 20 0 12728 2192 pipe_w S+ ttyS1 0:00 grep 7089
From here your script could grab the pid of the sleeper.
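A sketch of that idea (the parsing of the screen -list output is an assumption and may need adjusting for your screen version):
# The number before the dot in "screen -list" output is the PID of the
# SCREEN process for that session, e.g. "7089.sleeper ... (Detached)".
PID=$(screen -list | grep "\.${BASENAME}[[:space:]]" | awk '{print $1}' | cut -d. -f1)
if [ -n "${PID}" ]; then
    echo "${PID}" > TS3AudioBot.pid
fi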
I'm running a bash script with multiple simultaneous commands (python scripts).
I'm trying to kill all the processes if one of them has failed.
The thing is that the python scripts are still running in the background, and if one of them fails, my bash script doesn't know about it.
Here's a snippet from my script:
set -a
trap cleanup_children SIGTERM
MY_PID=$$
function thread_listener () {
to_execute="$1"
echo "Executing $to_execute ..."
$to_execute &
PID=$!
trap 'echo killing $PID; kill $PID' SIGTERM
echo "Waiting for $PID ($to_execute) ..."
wait $PID || if `kill -0 $MY_PID &> /dev/null`; then kill $MY_PID; fi
}
function cleanup_children () {
for job in `jobs -p`
do
if `kill -0 $job &> /dev/null`; then
echo "Killing child number $job"
ps -p $job
kill $job
fi
done
}
function create_app1 () {
cd ${GIT_DIR}
python ./create-app.py -myapp
exit_code=$?
echo "Create app1 ISO result: ${exit_code}"
[ "${exit_code}" == "1" ] && exit 1
mv ${ISO_OUTPUT_DIR}/rhel-7.1.iso ${ISO_OUTPUT_DIR}/${ISO_NAME}.iso
}
function create_app2 () {
cd ${GIT_DIR}
python ./create-app.py -do-something
exit_code=$?
echo "Create app1 ISO result: ${exit_code}"
[ "${exit_code}" == "1" ] && exit 1
mv ${ISO_OUTPUT_DIR}/rhel-7.1.iso ${ISO_OUTPUT_DIR}/${ISO_NAME}.iso
}
export -f create_app1
export -f create_app2
echo "MY_PID=$MY_PID"
thread_listener create_app1 &
PID_APP1=$!
thread_listener create_app2 &
PID_APP2=$!
wait
kill $PID_APP1 2> /dev/null
kill $PID_APP2 2> /dev/null
Hm, this looks quite advanced ;). Do I assume correctly that you never see the "Create app1 ISO result" output because the python script does not terminate? It might be an issue with the signal not being properly dispatched to bash background jobs. It might also be related to your python code not properly reacting to the signal. Have you checked out https://docs.python.org/2/library/signal.html? You'd have to figure out the exact steps to interrupt your python code while it is executing. I'd suggest first making sure that the python code reacts to signals the way you want.
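To illustrate the signal-dispatch point in isolation, a minimal sketch (the sleep commands stand in for the python scripts; nothing here is taken from the original script):
#!/bin/bash
# bash does not forward a trapped signal to background jobs by itself,
# so the trap has to kill them explicitly.
trap 'kill $(jobs -p) 2>/dev/null' TERM

sleep 300 &
sleep 300 &
wait    # a trapped TERM interrupts the wait, runs the trap, then the script ends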
I have a shell snippet:
nohup sudo node server.js >> node.log 2>&1 &
if [ $? -eq 0 ]; then
echo $! $?
echo $! > pids
fi
What I expect is if the node server.js run normaly, then record the pid of this process to the file: pids.
But it doesn't work: is $? always 0 because it is the status of the sudo process?
And $! is also not the PID of the node command's process.
So how can I get the correct return code and pid of the node server.js in the above shell script?
My final solution:
#!/usr/bin/env bash
ROOT=$(cd `dirname $0`; pwd)
sudo kill -9 `cat ${ROOT}/pids` || true
nohup sudo node server.js >> node.log 2>&1 &
sleep 1
pid=$(ps --ppid $! | tail -1 | awk '{ print $1 }')
if echo $pid | egrep -q '^[0-9]+$'; then
echo $pid > ${ROOT}/pids
else
echo 'server not started!'
fi
When you run a command in the background, there's no sensible way for the shell to get the return code without waiting for the process to finish (i.e., waiting until node is no longer running). Therefore even these have return code 0:
$ false &
$ does_not_exist &
It seems what you want to do is to check whether a daemon started properly, which completely depends on the daemon. In your case you've started a Node.js server, so you could simply run something like this (untested):
test_if_server_is_running() {
tries=10
while [ "$tries" -gt 0 ]
do
let tries--
wget http://127.0.0.1/some_server_path && return
sleep 1
done
return 1 # Did not start up in at least 10 seconds
}
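Usage might then look something like this (the messages and the exit code are assumptions):
nohup sudo node server.js >> node.log 2>&1 &

if test_if_server_is_running; then
    echo "server is up"
else
    echo "server did not come up within 10 seconds" >&2
    exit 1
fi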
I am calling another shell script testarg.sh within my main script.
The log files of testarg.sh are stored in $CUSTLOGS in the format below:
testarg.DDMONYY.PID.log
example: testarg.09Jun10.21165.log
In the main script, after the testarg process completes, I need to grep the log file for the text "ERROR" and "COMPLETED SUCCESSFULLY".
How do I get the PID of the process and combine it with DDMONYY for grepping? I also need to check whether the file exists before grepping it.
$CUSTBIN/testarg.sh
rc=$?
if [ $rc -ne 0 ]; then
return $CODE_WARN
fi
You may background testarg.sh, which puts its pid into $!, and then wait for it:
#! /bin/bash
...
$CUSTBIN/testarg.sh &
LOGFILE=testarg.$(date +%d%b%y).$!.log # testarg.09Jun10.12345.log
wait $!
# ... $? is set as you expect ...
[ -f $LOGFILE ] && grep {pattern} $LOGFILE
...
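Putting that together with the ERROR / COMPLETED SUCCESSFULLY checks from the question (a sketch; the grep handling itself is an assumption):
$CUSTBIN/testarg.sh &
LOGFILE=$CUSTLOGS/testarg.$(date +%d%b%y).$!.log
wait $!
rc=$?
if [ $rc -ne 0 ]; then
    return $CODE_WARN
fi
if [ -f "$LOGFILE" ]; then
    grep "ERROR" "$LOGFILE"
    grep "COMPLETED SUCCESSFULLY" "$LOGFILE"
fi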
If you can modify testarg.sh and it doesn't otherwise output anything, just change it to output its log file with a line like:
echo testarg.$(date +%blah).$$.log
then use:
fspec=$($CUSTBIN/testarg.sh)
in your parent.
Alternatively, you can provide a wrapper function to do the work:
#!/bin/bash
function fgpid() {
"$#" &
pid=$!
ps -ef | grep ${pid} | sed 's/^/DEBUG:/' >&2 # debugging
wait ${pid}
echo ${pid}
}
fspec=testarg.$(date +%d%b%y).$(fgpid sleep 5).log
echo ${fspec}
This produces:
pax> ./qq.sh
DEBUG:pax 2656 2992 con 15:27:00 /usr/bin/sleep
testarg.09Jun10.2656.log
as expected.
Or this if you think your executable may output something. This one stores the PID into a variable:
#!/bin/bash
function fgpid() {
"$#" &
pid=$!
ps -ef | grep ${pid} | sed 's/^/DEBUG:/' >&2 # debugging
wait ${pid}
}
fgpid sleep 5
fspec=testarg.$(date +%d%b%y).${pid}.log
echo ${fspec}
There are two simple ways to get the PID of some process you've just spawned.
One would be to modify the program being spawned (the subprocess) to have it write its PID to a file. You'd then read it therefrom with something like:
$CUSTBIN/testarg.sh
TSTARGSPID=$(cat /var/run/custbin.testarg.pid)
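The write side would be a single line near the top of testarg.sh, using the same (assumed) path:
# Inside testarg.sh: record this script's own PID so the parent can read it.
echo $$ > /var/run/custbin.testarg.pid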
Another more elegant method would be:
$CUSTBIN/testarg.sh &
TSTARGSPID=$!
wait
# Do stuff with PID and output files