Why isn't this command substitution returning the expected value?

I'm using a shell script from the answer to this post to re-start a process if it exits. I have several processes that I need to start with this script, so I'm writing another script to launch them in order. I want to check that each process is launched successfully before continuing in the launch order. I've written a script similar to this.
#!/bin/sh
launch() {
exec "$1" "$2" &!
local pid="$(pidof $2)"
echo "PID: $pid"
if [ "$pid" == "" ]; then
return 1
fi
return 0
}
launch [looping-script.sh] [process-to-wrap]
exit 0
When I run this script with valid arguments for launch, the script and process start and show up in ps output, but the value of $pid is blank and the function returns 1. However, if I take the [looping-script.sh] argument out of the equation, like so:
#!/bin/sh
launch() {
exec "$1" &!
local pid="$(pidof $1)"
echo "PID: $pid"
if [ "$pid" == "" ]; then
return 1
fi
return 0
}
launch [process-to-wrap]
exit 0
Then $pid matches the value output by ps for the process and the function returns 0. Is there something incorrect about my call in the first script?
bash --version output if it's useful:
GNU bash, version 4.3.30(1)-release (arm-petalinux-linux-gnueabi)
Copyright (C) 2013 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

You're not allowing time for looping-script.sh to start process-to-wrap. Try adding a sleep command before checking for the PID.
exec "$1" "$2" &!
sleep 1
local pid="$(pidof $2)"
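A fixed one-second sleep is a guess at how long the wrapper needs. One alternative is to poll until the check succeeds or a retry limit is reached. This is a minimal sketch; wait_until is a hypothetical helper name and the one-second interval is an arbitrary assumption.

```shell
#!/bin/sh
# wait_until TRIES CMD [ARG...]: run CMD once per second until it succeeds
# or TRIES attempts have been used. (Hypothetical helper, not from the question.)
wait_until() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        # Output is discarded; only the exit status of CMD matters here.
        "$@" >/dev/null 2>&1 && return 0
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}
```

Inside launch, something like wait_until 5 pidof "$2" would then replace the fixed sleep, with pid="$(pidof "$2")" read afterwards.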

Why use pidof to get the process ID of the monitored command? It will not do what you want if there happens to be more than one matching process. The shell can tell you what the PID was of the last subshell it started, e.g. to run a background pipeline:
exec "$@" &
local pid=$!
echo "PID: $pid"
Note also the "$@", which accommodates any number of arguments to the command you want to launch.
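Put together, a launch function built on $! might look like the sketch below. It records the PID of the command it started (the looping script, not the wrapped process) and uses kill -0 to check that the command is still alive after a short grace period. The LAUNCH_PID variable and the one-second pause are assumptions made for illustration.

```shell
#!/bin/bash
# launch CMD [ARG...]: start CMD in the background and verify it is still alive.
# LAUNCH_PID and the one-second grace period are illustrative assumptions.
launch() {
    "$@" &
    LAUNCH_PID=$!
    sleep 1
    # kill -0 sends no signal; it only tests whether the PID can be signalled.
    if kill -0 "$LAUNCH_PID" 2>/dev/null; then
        echo "PID: $LAUNCH_PID"
        return 0
    fi
    echo "failed to start: $*" >&2
    return 1
}
```

A caller could then chain launches, for example launch ./looping-script.sh process-a && launch ./looping-script.sh process-b, where both names are placeholders.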

Related

Script to check if vim is open or another script is running?

I'm making a background script that requires a user to input a certain string (a function) to continue. The script runs fine, but will interrupt anything else that is open in vim or any script that is running. Is there a way I can test in my script if the command line is waiting for input to avoid interrupting something?
I'm running the script enclosed in parentheses to hide the job completion message, so I'm using (. nightFall &)
Here is the script so far:
#!/bin/bash
# nightFall
clear
text=""
echo "Night begins to fall... Now might be a good time to rest."
while [[ "$text" != "rest" ]]
do
read -p "" text
done
Thank you in advance!
If you launch nightFall from the shell you are monitoring, you can use "ps" with the parent PID to see how many processes are launched by the shell as well:
# bg.sh
for k in `seq 1 15`; do
N=$(ps -ef | grep -sw $PPID | grep -v $$ | wc -l)
(( N -= 2 ))
[ "$N" -eq 0 ] && echo "At prompt"
[ "$N" -ne 0 ] && echo "Child processes: $N"
sleep 1
done
Note that I subtract 2 from N: one for the shell process itself and one for the bg.sh script. The remainder is the number of other child processes the shell has.
Launch the above script from a shell in background:
bash bg.sh &
Then start any command (for example "sleep 15") and it will detect if you are at the prompt or in a command.

Shell scripts and how to avoid running the same script at the same time on a Linux machine

I have a centralized Linux server (Linux 5.X).
In some cases the get_hosts.ksh script on my Linux server could be run from several other hosts.
For example, get_hosts.ksh could run on my Linux machine three or more times at the same time.
My question:
How can I avoid running multiple instances of the process/script?
A common solution to this problem on *nix systems is to check for the existence of a lock file.
Usually the lock file contains the PID of the current process.
This is an example ksh script:
#!/bin/ksh
pid="/var/run/get_hosts.pid"
trap "rm -f $pid" SIGSEGV
trap "rm -f $pid" SIGINT
if [ -e $pid ]; then
exit # pid file exists, another instance is running, so now we politely exit
else
echo $$ > $pid # pid file doesn't exist, create one and go on
fi
# your normal workflow here...
rm -f $pid # remove pid file just before exiting
exit
UPDATE: In response to the OP's comment, I added handling for program interruptions and segfaults with the trap command.
The normal way of doing this is to write the process id into a file. The first thing the script does is check for the existence of the file, read the pid, check if a process with that pid exists, and for extra paranoia points, if that process actually runs the script. If yes, the script exits.
Here's a simple example. The process in question is a binary, and this script makes sure the binary runs only once. This is not exactly what you need, but you should be able to adapt this:
RUNNING=0
PIDFILE=$PATH_TO/var/run/example.pid
if [ -f $PIDFILE ]
then
PID=`cat $PIDFILE`
ps -eo pid | grep -w "$PID" >/dev/null 2>&1 # -w matches the whole PID, not a substring
if [ $? -eq 0 ]
then
RUNNING=1
fi
fi
if [ $RUNNING -ne 1 ]
then
run_binary & # start in the background so that $! holds its PID
PID=$!
echo $PID > $PIDFILE
fi
This is not very elaborate but should get you on the right track.
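The "does a process with that pid exist" step can also be written with kill -0 instead of parsing ps output. The following is a minimal sketch under that assumption; is_locked is a hypothetical helper name. Note that kill -0 also fails for a live process owned by another user, so this only suits scripts that always run as the same user.

```shell
#!/bin/sh
# is_locked PIDFILE: succeed if PIDFILE exists and names a live process.
# (Hypothetical helper; assumes all instances run as the same user.)
is_locked() {
    [ -f "$1" ] || return 1
    pid=$(cat "$1")
    # An empty file counts as "not locked"; kill -0 only tests for existence.
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}
```

A script would call it near the top, for example is_locked "$PIDFILE" && exit, and write $$ to the file otherwise.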
You can use a pid file to keep track of when the process is running. At the top of the script, check for the existence of the pid file and if it doesn't exist, create it and run the script, otherwise return.
Some sample code can be seen in this answer to a similar question.
You might consider using the (optional) lockfile(1) command (provided by procmail package on Debian).
I have a lot of scripts, and I use the code below to prevent multiple simultaneous runs:
PID="/var/scripts/PID.txt" # Temp file
if [ ! -f "$PID" ]; then
echo $$ > "$PID" # Print actual PID into a file
else
ps -p $(cat "$PID") > /dev/null && exit || echo $$ > "$PID"
fi
Building on wallenborn's answer I also added a "staleness" check just in case the PID lock file is beyond a certain expected age in seconds.
# prevent simultaneous executions within an hourish
pid_file="$HOME/.harness.pid"
max_stale_seconds=3600
if [ -f $pid_file ]; then
pid="$(cat "$pid_file")"
let age_in_seconds="$(date +%s) - $(date -r "$pid_file" +%s)"
if ps $pid >/dev/null && [ $age_in_seconds -lt $max_stale_seconds ]; then
exit 1
fi
fi
echo $$>"$pid_file"
trap "rm -f \"$pid_file\"" SIGSEGV
trap "rm -f \"$pid_file\"" SIGINT
This could be made "smarter" to kill off the other executions should the PID be valid but this would be dangerous. Consider a sudden power failure and reset situation where the PID file contains a number that may now reference a completely different process.
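The age calculation in the staleness check can be isolated into a small function, sketched below. It assumes a date that accepts -r FILE to print a file's modification time (as GNU date does); pid_file_age is a hypothetical name.

```shell
#!/bin/sh
# pid_file_age FILE: print how many whole seconds ago FILE was last modified.
# Assumes "date -r FILE" reports the file's modification time (GNU date).
pid_file_age() {
    now=$(date +%s)
    modified=$(date -r "$1" +%s) || return 1
    echo $(( now - modified ))
}
```

The check then reads [ "$(pid_file_age "$pid_file")" -lt "$max_stale_seconds" ].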

How do I know if a bash script is running with nohup?

I have a script to process records in some files; it usually takes 1-2 hours. While it's running, it prints the number of records processed so far.
Now, what I want to do is: when it's running with nohup, I don't want it to print the progress; it should print progress only when it is run manually.
My question is how do I know if a bash script is running with nohup?
Suppose the command is nohup myscript.sh &. In the script, how do I get the nohup from command line? I tried to use $0, but it gives myscript.sh.
Checking for file redirections is not robust, since nohup can be (and often is) used in scripts where stdin, stdout and/or stderr are already explicitly redirected.
Aside from these redirections, the only thing nohup does is ignore the SIGHUP signal (thanks to Blrfl for the link.)
So, really what we're asking for is a way to detect if SIGHUP is being ignored. In linux, the signal ignore mask is exposed in /proc/$PID/status, in the least-significant bit of the SigIgn hex string.
Provided we know the pid of the bash script we want to check, we can use egrep. Here I see if the current shell is ignoring SIGHUP (i.e. is "nohuppy"):
$ egrep -q "SigIgn:\s.{15}[13579bdf]" /proc/$$/status && echo nohuppy || echo normal
normal
$ nohup bash -c 'egrep -q "SigIgn:\s.{15}[13579bdf]" /proc/$$/status && echo nohuppy || echo normal'; cat nohup.out
nohup: ignoring input and appending output to `nohup.out'
nohuppy
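The same test can be wrapped in a function that inspects only the last hex digit of the mask, which is where the SIGHUP bit (signal 1) lives. This is a sketch for Linux systems with /proc mounted; ignores_sighup is a hypothetical name, and the return code 2 for "could not tell" is an assumption of this sketch.

```shell
#!/bin/sh
# ignores_sighup PID: return 0 if PID ignores SIGHUP, 1 if it does not,
# 2 if the SigIgn mask cannot be read. (Hypothetical helper; Linux /proc only.)
ignores_sighup() {
    mask=$(sed -n 's/^SigIgn:[[:space:]]*//p' "/proc/$1/status" 2>/dev/null) || mask=''
    case $mask in
        '') return 2 ;;               # no such process, or no SigIgn line
        *[13579bdfBDF]) return 0 ;;   # odd last digit: lowest bit is set
        *) return 1 ;;
    esac
}
```

Inside the script itself the call would be ignores_sighup $$.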
You could check if STDOUT is associated with a terminal:
[ -t 1 ]
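Applied to the progress output, that check could look like the sketch below. The progress function name and the message format are assumptions, not taken from the question.

```shell
#!/bin/sh
# progress COUNT: print a progress line only when stdout is a terminal.
# (Hypothetical helper; the message format is an assumption.)
progress() {
    if [ -t 1 ]; then
        printf 'processed %s records\n' "$1"
    fi
}
```

Under nohup, or whenever stdout is redirected to a file or pipe, the function prints nothing.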
You can either check if the parent pid is 1:
if [ $PPID -eq 1 ] ; then
echo "Parent pid=1 (running via nohup)"
else
echo "Parent pid<>1 (NOT running via nohup)"
fi
or if your script ignores the SIGHUP signal (see https://stackoverflow.com/a/35638712/1011025):
if egrep -q "SigIgn:\s.{15}[13579bdf]" /proc/$$/status ; then
echo "Ignores SIGHUP (running via nohup)"
else
echo "Doesn't ignore SIGHUP (NOT running via nohup)"
fi
One way, but not really portable would be to do a readlink on /proc/$$/fd/1 and test if it ends with nohup.out.
Assuming you are on the pts0 terminal (not really relevant, just to be able to show the result):
#!/bin/bash
if [[ $(readlink /proc/$$/fd/1) =~ nohup\.out$ ]]; then
echo "Running under nohup" >> /dev/pts/0
fi
But the traditional approach to such problems is to test if the output is a terminal:
[ -t 1 ]
Thank you guys. Checking STDOUT is a good idea. I just found another way to do it: test the tty.
Run tty -s and check its return code. If it's 0, the script is running on a terminal; if it's 1, it's running with nohup.

Linux Single Instance Kill if running too long

I am using the following script, taken from "How do I daemonize an arbitrary script in unix?", to keep a single instance of a script running on my server. I have a cronjob to run this every minute.
#!/bin/bash
if [[ $# < 1 ]]; then
echo "Name of pid file not given."
exit
fi
# Get the pid file's name.
PIDFILE=$1
shift
if [[ $# < 1 ]]; then
echo "No command given."
exit
fi
echo "Checking pid in file $PIDFILE."
#Check to see if process running.
PID=$(cat $PIDFILE 2>/dev/null)
if [[ $? = 0 ]]; then
ps -p $PID >/dev/null 2>&1
if [[ $? = 0 ]]; then
echo "Command $1 already running."
exit
fi
fi
# Write our pid to file.
echo $$ >$PIDFILE
# Get command.
COMMAND=$1
shift
# Run command
$COMMAND "$*"
Now I found out that my script had hung for some reason and therefore it was stuck. I'd like a way to check if the $PIDFILE is "old" and if so, kill the process. I know that's possible (check the timestamp on the file) but I don't know the syntax or if this is even a good idea. Also, when this script is running, the CPU should be pretty heavily used. If it hangs (rare but it happened at least once so far), the CPU usage drops to 0%. It would be nice if I could check that the process is really hung/not active, but I don't know if there's an easy way to do that (and I don't want to have many false positives where it gets killed but it's running fine).
To answer the question in your title, which seems quite different from your problem, use timeout.
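As a rough sketch of the timeout suggestion, assuming the GNU coreutils timeout command: it stops the command once the limit is reached and exits with status 124 in that case. run_limited is a hypothetical wrapper name.

```shell
#!/bin/sh
# run_limited SECONDS CMD [ARG...]: run CMD, stopping it after SECONDS.
# Relies on GNU coreutils timeout, which exits with 124 when the limit is hit.
run_limited() {
    limit=$1
    shift
    status=0
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        echo "timed out after ${limit}s: $*" >&2
    fi
    return "$status"
}
```

In the cron setup from the question, the last line of the script would then become something like run_limited 3600 "$COMMAND" "$@", with 3600 as a placeholder limit.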
Now, for your problem, I don't see where it could hang, unless you gave it a fifo queue for the pid file. Now, to run and respawn, you can just run this script once, on startup:
#!/bin/bash
while /bin/true; do
"$@"
wait
done
Which brings up another bug in the code you got from the other question: "$*" will pass all the arguments to the script as a single argument; without the quotes it will split arguments on white space. "$@" will pass them individually and handle white space properly.
Call with /path/to/script command [argument]....
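The difference between the three forms is easy to see by counting the arguments a function receives. This is a small illustration; count_args is a hypothetical helper and the sample arguments are made up.

```shell
#!/bin/sh
# count_args: print how many arguments it was called with.
count_args() { echo "$#"; }

# Two positional parameters, the first containing a space.
set -- "a b" c

count_args "$*"   # prints 1: everything joined into the single word "a b c"
count_args $*     # prints 3: split on white space into a, b and c
count_args "$@"   # prints 2: "a b" and c, exactly as given
```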

wait child process but get error: 'pid is not a child of this shell'

I wrote a script to fetch data from HDFS in parallel, then wait for the child processes in a for loop, but sometimes wait returns "pid is not a child of this shell". Other times it works fine, which is puzzling. I used "jobs -l" to list all the jobs running in the background, and I am sure these PIDs are child processes of the shell process. I also used "ps aux" to make sure these PIDs are not assigned to other processes. Here is my script.
PID=()
FILE=()
let serial=0
while read index_tar
do
echo $index_tar | grep index > /dev/null 2>&1
if [[ $? -ne 0 ]]
then
continue
fi
suffix=`printf '%03d' $serial`
mkdir input/output_$suffix
$HADOOP_HOME/bin/hadoop fs -cat $index_tar | tar zxf - -C input/output_$suffix \
&& mv input/output_$suffix/index_* input/output_$suffix/index &
PID[$serial]=$!
FILE[$serial]=$index_tar
let serial++
done < file.list
for((i=0;i<$serial;i++))
do
wait ${PID[$i]}
if [[ $? -ne 0 ]]
then
LOG "get ${FILE[$i]} failed, PID:${PID[$i]}"
exit -1
else
LOG "get ${FILE[$i]} success, PID:${PID[$i]}"
fi
done
Just find the process id of the process you want to wait for and replace that with 12345 in below script. Further changes can be made as per your requirement.
#!/bin/sh
PID=12345
while [ -e /proc/$PID ]
do
echo "Process: $PID is still running" >> /home/parv/waitAndRun.log
sleep .6
done
echo "Process $PID has finished" >> /home/parv/waitAndRun.log
/usr/bin/waitingScript.sh
http://iamparv.blogspot.in/2013/10/unix-wait-for-running-process-not-child.html
Either your while loop or the for loop runs in a subshell, which is why you cannot wait for a child of the (parent, outer) shell.
Edit: this might happen if the while loop or for loop is actually
(a) in a (...) block
(b) participating in a pipe (e.g. for....done|somepipe)
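The effect can be reproduced in a few lines. In this sketch the first sleep is started by the current shell, so wait can collect it; the second is started inside a subshell, so the outer shell never forked it and wait rejects the PID. The variable names are made up for the illustration.

```shell
#!/bin/bash
# Child started by this shell: wait returns its exit status.
sleep 1 &
own_pid=$!
own_status=0
wait "$own_pid" || own_status=$?

# Child started inside a subshell: only its PID reaches the outer shell,
# so wait reports that the PID is not a child of this shell (status 127).
sub_pid=$( (sleep 1 >/dev/null 2>&1 & echo "$!") )
sub_status=0
wait "$sub_pid" 2>/dev/null || sub_status=$?

echo "own child: $own_status, subshell child: $sub_status"
```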
If you're running this in a container of some sort, the condition apparently can be caused by a bug in bash that is easier to encounter in a containerized environment.
From my reading of the bash source (specifically see comments around RECYCLES_PIDS and CHILD_MAX in bash-4.2/jobs.c), it looks like in their effort to optimize their tracking of background jobs, they leave themselves vulnerable to PID aliasing (where a new process might obscure the status of an old one); to mitigate that, they prune their background process history (apparently as mandated by POSIX?). If you should happen to want to wait on a pruned process, the shell can't find it in the history and assumes this to mean that it never knew about it (i.e., that it "is not a child of this shell").
