I see an incorrect pid - linux

I have a problem in my script: when I try to save the PID, the wrong PID gets saved.
I suspect that the PID of the script itself (start.sh) is being written instead of the PID of the screen command.
echo "Starting the music bot..."
if [ -e "$BINARYNAME" ]; then
if [ ! -x "$BINARYNAME" ]; then
echo "${BINARYNAME} is not executable, trying to set it"
chmod u+x "${BINARYNAME}"
fi
if [ -x "$BINARYNAME" ]; then
export LD_LIBRARY_PATH="${LIBRARYPATH}:${LD_LIBRARY_PATH}"
screen -dmS "${BASENAME}" mono "${BINARYNAME}" > /dev/null &
TEST=$0
PID=$!
echo "${PID}"
ps -p ${PID} > /dev/null 2>&1
if [ "$?" -ne "0" ]; then
echo "The music bot was not started."
else
echo "${PID}" > TS3AudioBot.pid
echo "The music bot has been started."
fi
else
echo "${BINARYNAME} could not be detected, cannot start the music bot."
fi
else
echo "Could not find binary, aborting"
exit 5
fi

I believe you were expecting to get the pid of a screen process in $PID. What's happening is that screen exits immediately, and $! refers to the pid of the vanished screen process rather than the detached process that is running your mono command (if it's still running).
I replaced the screen line (screen -dmS "${BASENAME}" mono "${BINARYNAME}" > /dev/null &) with "sleep 2000 &" in your script, and the correct PID, that of the sleep process, was saved in $PID and acted upon. This doesn't happen with screen for the reason I described above.
You might want to consider processing the output of "screen -list" in order to get at the pid of the detached process:
root@tutorial:/var/tmp# screen -dmS 'sleeper' sleep 2000
root@tutorial:/var/tmp# screen -list
There is a screen on:
7089.sleeper (07/02/2018 04:05:57 AM) (Detached)
1 Socket in /var/run/screen/S-root.
root@tutorial:/var/tmp# ps axlww | grep 7089
5 0 7089 1 20 0 25672 2396 poll_s Ss ? 0:00 SCREEN -dmS sleeper sleep 2000
4 0 7090 7089 20 0 5808 648 hrtime Ss+ pts/0 0:00 sleep 2000
0 0 7093 2607 20 0 12728 2192 pipe_w S+ ttyS1 0:00 grep 7089
From here your script could grab the pid of the sleeper.
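As a rough sketch of that last step, the session PID can be parsed out of the "screen -list" output. The function name screen_session_pid is made up for this illustration, and it assumes session lines of the form PID.NAME, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not part of screen): read `screen -list` style output
# on stdin and print the PID of the session called NAME.
# Assumes each session line starts with "<pid>.<name>".
screen_session_pid() {
    awk -v name="$1" '{
        dot = index($1, ".")
        if (dot > 1 && substr($1, dot + 1) == name && substr($1, 1, dot - 1) ~ /^[0-9]+$/) {
            print substr($1, 1, dot - 1)
            exit
        }
    }'
}
```

In the script this might be used as SCREEN_PID=$(screen -list | screen_session_pid "${BASENAME}"). If pgrep is available, pgrep -P "${SCREEN_PID}" should then list the command running inside the session (7090 in the listing above).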

Related

bash script exits with zero status even after kill signal

I keep getting zero status even after interrupting the script.
The first script
#!/bin/bash
## call the backup script
/usr/local/bin/backup 2>&1 >/dev/null
echo $?
backup
#!/bin/bash
exitscript() {
rm -f $LOCKFILE
echo "Script Status: $1 | tee -a ${LOG}"
echo "> End Date: $(date +'%d.%m.%Y %H:%M:%S')" | tee -a ${LOG}
exit $1
}
######START#######
trap "exitscript 1" 1 2 23 24 25
rsync ${args} ${src} ${dest} | tee -a ${RSYNC_LOG}
retcode=${PIPESTATUS[0]}
if [[ ${retcode} -ne 0 ]]; then
exitcode=1
fi
exitscript ${exitcode:-0}
When the first script is run, it returns an exit status of 0 even though I killed the backup script before it finished. (To test this, I created a very large file so that rsync takes long enough for me to kill the script before it ends.)
ps -ef | grep -i backup
kill $PID
Another thing is that rsync keeps running even after the backup script is killed. I would like rsync to stop once the script is killed, and my first script to return a non-zero status code.
Much appreciation for any suggestions. Thanks!
I assume the missing quote in echo "Script Status: $1 | tee -a ${LOG} is not relevant to the question.
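The handler function does not need to be exported: the trap runs in the same shell, and "export exitscript" would only export a variable of that name, so it is not used here.
What does matter is that bash only runs a trap after the current foreground command has finished. Run the long command in the background and wait for it, so the trap can fire right away.
And when you want to kill children, you should do that in your trap function.
I tested these adjustments with a sleep command; it should work for rsync too.
#!/bin/bash
exitscript() {
echo "Script Status: $1"
(( pleasekill > 0 )) && kill "${pleasekill}"
echo "> End Date: $(date +'%d.%m.%Y %H:%M:%S')"
exit "$1"
}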
######START#######
pleasekill=0
trap "exitscript 1" 1 2 23 24 25
# Start I/O-friendly rsync function
sleep 30 &
pleasekill=$!
wait
exitscript 2
When you test this with the first script, use ^C or kill -1 pid_of_backup. A plain kill sends SIGTERM (15), which is not in the trap list above.
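The same idea can be wrapped in a small reusable function. This is only a sketch: the name run_with_cleanup is invented for illustration, and it also traps TERM so that a plain kill is handled, which goes beyond the signal list used above.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command in the background, wait for it, and
# kill it if this (sub)shell receives HUP, INT or TERM in the meantime.
run_with_cleanup() {
    (
        child=0
        cleanup() {
            # Stop the child if it is still running; ignore "no such process".
            if [ "$child" -gt 0 ]; then
                kill "$child" 2>/dev/null || true
            fi
            exit "$1"
        }
        trap 'cleanup 1' HUP INT TERM
        "$@" &
        child=$!
        status=0
        # wait is interruptible, so a trapped signal is handled right away.
        wait "$child" || status=$?
        child=0                      # the child has finished, nothing to kill
        cleanup "$status"
    )
}
```

Called as run_with_cleanup rsync ${args} ${src} ${dest}, it returns the exit status of rsync when it finishes on its own, and 1 when one of the trapped signals arrives first. The tee pipeline from the question is not covered by this sketch.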

xinitrc.d content are not applied when running startx

I have a script which allows me to restart my running X server. However, whenever the X server comes back up, none of the contents of the xinitrc.d folder are applied.
rm /tmp/startx.logs
LOOPTC=0
while [ $LOOPTC -eq 0 ]
do
pidof TerminalConfig 1>/dev/null 2>/dev/null
LOOPTC=$?
sleep 1
echo Tc not closed >> /tmp/startx.logs
done
killall gdm 2>/dev/null &
pkill x
if grep ^AUTOLOGIN /etc/sysconfig/autologin | egrep "NO|no|No|nO" ; then
echo autologin off >> /tmp/startx.logs
LOOPX=0
while [ $LOOPX -eq 0 ]
do
pidof X 1>/dev/null 2>/dev/null
LOOPX=$?
sleep 1
echo X not closed >> /tmp/startx.logs
done
fi
clear >> /dev/tty1
for (( i=0; i<4; i++ )) ; do
sleep 1
echo Please wait while restarting X11 Windows... >> /dev/tty1
done
clear >> /dev/tty1
ps ax | grep startx > /tmp/startx.result
sed -e s/.*grep.*//g -e /^$/d /tmp/startx.result -i
echo $(date) :startx: >> /tmp/startx.logs
cat /tmp/startx.result >> /tmp/startx.logs
if [ -s /tmp/startx.result ] ; then
echo Thu Feb 1 22:50:08 UTC 2018 :startx already running, no need to execute startx >> /tmp/startx.logs
else
killall gdm 2>/dev/null &
startx
echo $(date) :startx not running, executing startx >> /tmp/startx.logs
fi
if grep --quiet if [ ! /etc/X11/xinit/xinitrc; then
cp -f /etc/X11/xinit/xinitrc.old /etc/X11/xinit/xinitrc
rm -f /tmp/startx.result
#rm -f /tmp/restart-x.sh
Note that this script is always run on the Xserver. All the xinitrc.d contents are always applied whenever I run startx from the console (without Xserver running).
I'm wondering, why aren't the xinitrc.d files applied even if I have already ensured that the contents of the xinitrc.d folder should be applied using a for-loop and placed it in the xinitrc file?

Shell script leaving process after successful execution

I wrote a shell script which in turn calls other shell scripts using nohup. After the script completes successfully, I still see Linux processes running for the custom script I wrote.
contents of startAllComponents.sh
start_Server()
{
SERVER_HOME=${1}
NOHUP_LOG_FILE=${2}
logmsg "Starting the server"
/usr/bin/nohup `${SERVER_HOME}/bin/server.sh >> ${NOHUP_LOG_FILE} 2>&1 ` &
sleep 5
PID=`ps -ef|grep ${SERVER_HOME}/jvm |grep -v grep| awk '{print $2}'`
if [ "${PID}" = "" ]
then
logmsg "Couldn't get the PID after starting the server"
else
logmsg "****** Server started with PID: ${PID} ****** "
fi
}
logmsg()
{
echo "`date '+%b %e %T'` : $1"$'\n' >> /tmp/STARTUP`date '+%Y%m%d'`_.log
}
#### Send an email #####
sendEmail()
{
RECIPIENTS="gut1kor@sample.com"
SMTP="1.1.1.1:25"
mailx -s "$SUBJECT" -S "smtp=smtp://$SMTP" $RECIPIENTS < /tmp/STARTUP`date '+%Y%m%d'`_.log
}
##### Main #####
INTS[0]="/opt/server/inst01;/home/gut1kor/nohup.inst01.out"
INTS[1]="/opt/server/inst02;/home/gut1kor/nohup.inst02.out"
INTS[2]="/opt/server/inst03;/home/gut1kor/nohup.inst03.out"
echo "##### Bringing up servers on `hostname`. #####"$'\n' > /tmp/STARTUP`date '+%Y%m%d'`_.log
IS_TOTAL=${#INTS[@]}
logmsg "Total Servers are: ${IS_TOTAL}"
if [ "$IS_TOTAL" -gt "0" ]
then
for((i=0;i<IS_TOTAL;i++)) do
IFS=";" read -a arr <<< "${INTS[$i]}"
start_Server ${arr[0]} ${arr[1]}
done
fi
sendEmail
The script works as expected in bringing up the server instances, but after the execution I see two processes for the script still running for each instance.
[gut1kor@HOST1 startAll]$ ps -ef|grep startAllComponents.sh
gut1kor 63699 1 0 18:44 pts/2 00:00:00 /bin/sh ./startAllComponents.sh
gut1kor 63700 63699 0 18:44 pts/2 00:00:00 /bin/sh ./startAllComponents.sh
gut1kor 63889 61027 0 18:45 pts/2 00:00:00 grep startAllComponents.sh
Why are these processes still there even after the script execution is done? What changes should I make in the script?
This is most likely due to the use of the nohup utility. The problem with using the command this way is that a new process is forked every time it is invoked from the start_Server() function.
From the man page
nohup No Hang Up. Run a command immune to hangups, runs the given
command with hangup signals ignored, so that the command can
continue running in the background after you log out.
To kill all the processes started by nohup, you probably need to save the process ID of each command as it is started and kill them at the end of the script.
/usr/bin/nohup $( ${SERVER_HOME}/bin/server.sh >> ${NOHUP_LOG_FILE} 2>&1 ) &
echo $! >> save_pid.txt # Add this line
At the end of the script.
sendEmail
while read p; do
kill -9 $p
done <save_pid.txt
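A possibly simpler alternative, offered as an assumption rather than something the answer above states: the backticks in the original line make the shell run server.sh inside a command substitution, and the backgrounded copy of the script waits there for it, which would explain the two leftover startAllComponents.sh processes. Without the substitution, nohup executes the command directly and $! is the PID of that command. The helper name start_detached below is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: start a command under nohup, append its output to
# LOGFILE, and print the PID of the started process.
start_detached() {
    logfile=$1
    shift
    # No backticks or $( ): nohup receives the command itself as arguments.
    nohup "$@" >> "$logfile" 2>&1 < /dev/null &
    echo "$!"
}
```

Inside start_Server() this could be called as PID=$(start_detached "${NOHUP_LOG_FILE}" "${SERVER_HOME}/bin/server.sh"). Note that this is the PID of server.sh, which may differ from the JVM process the original script looks up with ps.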

Why do `kill -0 $pid; echo $?` and `ps -p$pid; echo $?` sometimes differ?

I use kill -0 $pid (reading $pid from a PID file) to check if a daemon is running. I just discovered that even though kill -0 2661 returns 0, I can't see the process running in top, htop or ps aux. In particular, ps -p $pid returns 1.
Why is that?
Example output:
$ pid=2661; kill -0 $pid; echo $?
0
$ pid=2661; ps -p $pid; echo $?
PID TTY TIME CMD
1
The same happens when both commands are run on one line:
$ pid=2661; kill -0 $pid; echo $?; ps -p $pid; echo $?
0
PID TTY TIME CMD
1
Edit
It seems that this occurs more often than I thought. Here's a small snippet to check PIDs 1 to 2000 (only works as root):
# for pid in $(seq 1 2000); do killcode=$(kill -0 $pid 2>/dev/null; echo $?); pscode=$(ps -p $pid >/dev/null 2>&1; echo $?); if [ $killcode != $pscode ]; then echo $pid $killcode $pscode; fi done
820 0 1
821 0 1
822 0 1
974 0 1
977 0 1
1029 0 1
1030 0 1
...
ayecee in #linux on Freenode answered my question. If $pid is a running thread, kill -0 $pid will return 0, while ps -p $pid will return 1.
I have verified this answer by manually picking a thread PID from htop and then running the above mentioned commands.
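On Linux, one way to check this from a script is to compare the Tgid and Pid fields in /proc/ID/status: for a thread they differ, because Tgid holds the ID of the owning process. The sketch below assumes a Linux-style /proc; the function name is_thread_id and its optional second argument (an alternative proc root, mainly useful for testing) are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: succeed if ID names a thread rather than a process.
# Reads <proc_root>/<id>/status (proc_root defaults to /proc) and compares the
# Tgid (owning process ID) with the Pid (thread ID) field.
# Returns 0 for a thread, 1 for a process, 2 if the status file is unreadable.
is_thread_id() {
    status_file="${2:-/proc}/$1/status"
    [ -r "$status_file" ] || return 2
    awk '$1 == "Tgid:" { tgid = $2 }
         $1 == "Pid:"  { pid = $2 }
         END { exit ((tgid != pid) ? 0 : 1) }' "$status_file"
}
```

For example, is_thread_id 2661 && echo "2661 is a thread" would print the message for an ID where kill -0 succeeds but ps -p finds nothing.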

Get the pid and return status from nohup + sudo

I have a shell snippet:
nohup sudo node server.js >> node.log 2>&1 &
if [ $? -eq 0 ]; then
echo $! $?
echo $! > pids
fi
What I expect is that if node server.js runs normally, the PID of this process is recorded in the file pids.
But it doesn't work: $? is always 0. Is that because it is the status of the sudo process?
And $! is also not the PID of the node process.
So how can I get the correct return code and PID of node server.js in the above shell script?
My final solution:
#!/usr/bin/env bash
ROOT=$(cd `dirname $0`; pwd)
sudo kill -9 `cat ${ROOT}/pids` || true
nohup sudo node server.js >> node.log 2>&1 &
sleep 1
pid=$(ps --ppid $! | tail -1 | awk '{ print $1 }')
if echo $pid | egrep -q '^[0-9]+$'; then
echo $pid > ${ROOT}/pids
else
echo 'server not started!'
fi
When you run a command in the background, there's no sensible way for the shell to get the return code without waiting for the process to finish (i.e., until node is no longer running). Therefore even these have return code 0:
$ false &
$ does_not_exist &
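A small demonstration of the difference, as a sketch: $? right after the & only reports that the job was launched, while wait returns the job's real exit status once it has finished. The variable names are just for illustration.

```shell
#!/bin/sh
# Starting a background job always "succeeds" from the shell's point of view.
false &
launch_status=$?        # status of launching the job, not of `false`

# wait returns the job's real exit status once it has finished.
job_status=0
wait "$!" || job_status=$?

echo "launch=$launch_status job=$job_status"   # prints: launch=0 job=1
```

This only helps for commands that terminate; for a long-running server, wait would block until the server exits.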
It seems what you want to do is to check whether a daemon started properly, which completely depends on the daemon. In your case you've started a Node.js server, so you could simply run something like this (untested):
test_if_server_is_running() {
tries=10
while [ "$tries" -gt 0 ]
do
tries=$((tries - 1))
wget -q -O /dev/null http://127.0.0.1/some_server_path && return
sleep 1
done
return 1 # Did not start up in at least 10 seconds
}
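The polling loop can also be written so that the check itself is passed in as a command. This is a sketch only; retry_until_ok is a hypothetical name, and which check is appropriate still depends on the daemon.

```shell
#!/bin/sh
# Hypothetical helper: run a check command up to TRIES times, DELAY seconds
# apart, and succeed as soon as the check succeeds.
retry_until_ok() {
    tries=$1
    delay=$2
    shift 2
    while [ "$tries" -gt 0 ]; do
        "$@" && return 0
        tries=$((tries - 1))
        sleep "$delay"
    done
    return 1    # the check never succeeded
}
```

For the Node.js case this might be called as retry_until_ok 10 1 wget -q -O /dev/null http://127.0.0.1/some_server_path, writing the PID file only if it returns 0.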
