Determining through Shell Script if a Linux process is running with given arguments - linux

The company I work to has a crontab set to run a given Shell Script every few minutes to perform certain complex operations without the users' intervention. This script basically executes multiple Perl scripts in a sequence, checking first if they are not running already, using the following structure as many times as there are customers:
for i in `seq 1 20`;
do
ps ax | grep ourFile10000008.p | grep pl 2>> /dev/null >> $LOG
if [ $? -eq 1 ] ; then
cd /path/to/the/script
perl ourFile10000008.pl 10000008 & 2>> $LOG
fi
ps ax | grep ourFile10000009.p | grep pl 2>> /dev/null >> $LOG
if [ $? -eq 1 ] ; then
cd /path/to/the/script
perl ourFile10000009.pl 10000009 & 2>> $LOG
fi
# (and so on, and so forth...)
done
This kind of works, except for the fact that there are now dozens of "ourFile" Perl script in our /path/to/the/script folder, and they are exact copies of each other! Every time a new customer comes online, we need to create a new replica, which makes maintaining this structure very hard to say the least.
I'm trying to make this structure run on a single file (named here as [theOneFile.pl]) that's another copy of those scripts but is called every time with a new argument. This works, but now I have to make sure I'm only running this file once per argument passed.
After some research, and thanks to This answer, I have successfully determined the argument behind a running [theOneFile.pl] through pgrep -af theOneFile.pl | tr '\000' ' '| awk '{print $4}' >> $LOG. However, this gives me a list of results to content with. To keep today's logic as intact as possible, I'm trying to determine only if there is one of these processes running with one specific argument at that given time (eg. theOneFile.pl 10000009), but I'm not sure how to do so. Any ideas?

pgref -f (which you are using) does match the pattern to the whole command line of a process, not just the process name. That said, you can use:
arg="foo"
pgrep -f "theOneFile.pl.*${arg}"
Well, the pgrep approach is prone to race conditions. Better would be to to change the script itself to use an exclusive lock - per argument.

Related

When piping in BASH, is it possible to get the PID of the left command from within the right command?

The Problem
Given a BASH pipeline:
./a.sh | ./b.sh
The PID of ./a.sh being 10.
Is there way to find the PID of ./a.sh from within ./b.sh?
I.e. if there is, and if ./b.sh looks something like the below:
#!/bin/bash
...
echo $LEFT_PID
cat
Then the output of ./a.sh | ./b.sh would be:
10
... Followed by whatever else ./a.sh printed to stdout.
Background
I'm working on this bash script, named cachepoint, that I can place in a pipeline to speed things up.
E.g. cat big_data | sed 's/a/b/g' | uniq -c | cachepoint | sort -n
This is a purposefully simple example.
The pipeline may run slowly at first, but on subsequent runs, it will be quicker, as cachepoint starts doing the work.
The way I picture cachepoint working is that it would use the first few hundred lines of input, along with a list of commands before it, in order to form a hash ID for the previously cached data, thus breaking the stdin pipeline early on subsequent runs, resorting instead to printing the cached data. Cached data would get deleted every hour or so.
I.e. everything left of | cachepoint would continue running, perhaps to 1,000,000 lines, in normal circumstances, but on subsequent executions of cachepoint pipelines, everything left of | cachepoint would exit after maybe 100 lines, and cachepoint would simply print the millions of lines it has cached. For the hash of the pipe sources and pipe content, I need a way for cachepoint to read the PIDs of what came before it in the pipeline.
I use pipelines a lot for exploring data sets, and I often find myself piping to temporary files in order to bypass repeating the same costly pipeline more than once. This is messy, so I want cachepoint.
This Shellcheck-clean code should work for your b.sh program on any Linux system:
#! /bin/bash
shopt -s extglob
shopt -s nullglob
left_pid=
# Get the identifier for the pipe connected to the standard input of this
# process (e.g. 'pipe:[10294010]')
input_pipe_id=$(readlink "/proc/self/fd/0")
if [[ $input_pipe_id != pipe:* ]]; then
echo 'ERROR: standard input is not a pipe' >&2
exit 1
fi
# Find the process that has standard output connected to the same pipe
for stdout_path in /proc/+([[:digit:]])/fd/1; do
output_pipe_id=$(readlink -- "$stdout_path")
if [[ $output_pipe_id == "$input_pipe_id" ]]; then
procpid=${stdout_path%/fd/*}
left_pid=${procpid#/proc/}
break
fi
done
if [[ -z $left_pid ]]; then
echo "ERROR: Failed to set 'left_pid'" >&2
exit 1
fi
echo "$left_pid"
cat
It depends on the fact that, on Linux, for a process with id PID the path /proc/PID/fd/0 looks like a symlink to the device connected to the standard input of the process and /proc/PID/fd/1 looks like a symlink to the device connected to the standard output of the process.

Multiple scripts making rest calls interfering

So I am running into a problem with unix scripts that use curl to make rest calls. I have one script, that runs two other scripts inside of it.
cat example.sh
FILE="file1.txt"
RECIP="wilfred#blamagam.com"
rm -f $FILE
./script1.sh > $FILE
mail -s "subject" $RECIP < $FILE
RECIP="bob#blamagam.com"
rm -f $FILE
./script2.sh > $FILE
mail -s "subject" $RECIP < $FILE
exit 0
Each script makes rest calls to the same service. It is my understanding that script1.sh should completely finish before script2.sh is ran, however that is not the case. In the logs for the rest service I see a rest call from the second script in the middle of the first one still executing. The second script then fails because of this (it does not get any data returned).
I am modifying this process so I am not the one who originally wrote it. I am not seeing any forked processes, or background processes at all and I have been banging my head against the wall.
I do know that script2.sh works. Whenever script1.sh takes under a minute script2.sh works just fine, but more often than not script1.sh takes over a min, causing the second script to fail.
This is ran by a cron, and the contents of the files are mailed out, so I cant just default to running them manually. Any suggestions for what to look into would be much appreciated!
EDIT: Here is a high pseudo code example
script1.sh
ITEMS=`/usr/bin/curl -m 10 -k -u userName:passWord -L https://server/rest-service/rest?where=clause=value;clause2=value2&sel=field 2>/dev/null | sed s/<\/\?Attribute[^>]*>/\n/g | grep -v '^<' | grep -v '^$' | sed 's/ //g'`
echo "\n Subject for these metrics"
echo "$ITEMS"
Both scripts have lots of entries like this. There are 2 or 3 for loops but they are simple and I do not see any background processes being called. Its a large script so I could only provide a snippet. Could the rest call into pipes be causing an issue?
Edit:
Just tested this on my system and it seems to work.
cat example.sh
FILE="file1.txt"
RECIP="wilfred#blamagam.com"
rm -f "$FILE"
(./script1.sh > "$FILE") &
procscript1=$!
wait "$procscript1"
mail -s "subject" "$RECIP" < "$FILE"
RECIP="bob#blamagam.com"
rm -f "$FILE"
(./script2.sh > "$FILE") &
procscript2=$!
wait "$procscript2"
mail -s "subject" "$RECIP" < "$FILE"
exit 0
Put the script executions in the background with the &.
Get the process id's for each script execution.
Use the wait command to block until the execution is done.

bash execute command in variable

I have a command in a variable in Bash:
check_ifrunning=\`ps aux | grep "programmname" | grep -v "grep" | wc -l\`
The command checks if a specific program is running at the moment.
Later in my script, I want to query the value of the variable on a point.
If the specific program is running, the script should sleep for 15 minutes.
I solved it like this:
while [ $check_ifrunning -eq 1 ]; do
sleep 300
done
Will the script execute the command in the variable for each single loop-run or will the value in the variable stay after the first execution?
I have more variables in my script which can change their value. This was just one simple example of this.
Notice that check_ifrunning is set only once, in
check_ifrunning=`ps aux | grep "programmname" | grep -v "grep" | wc -l`
and that it is set before the loop:
while [ $check_ifrunning -eq 1 ]; do
sleep 300
done
You could add, for debugging purposes, an echo check_ifrunning is $check_ifrunning statement inside your while loop just before the sleep ...
You probably simply want (using pidof(8)) - without defining or using any check_ifrunning Bash variable:
while [ -n "$(pidof programname)" ]; do
sleep 300
done
Because you want to test if programname is running at every start of the loop!
You should use the more nestable and more readable $(...) instead of backquotes.
Consider reading the Advanced Bash Scripting Guide...
If you are writing a Bash script, consider to start it with
#!/bin/bash -vx
while debugging. When you are satisfied, remove the -vx...
If you want to encapsulate your commands, the proper way to do that is a function.
running () {
ps aux | grep "$1" | grep -q -v grep
}
With grep -q you get the result as the exit code, not as output; you use it simply like
if running "$programname"; then
:
Ideally, the second grep is unnecessary, but I did not want to complicate the code too much. It still won't work correctly if you are looking for grep. The proper solution is pidof.
See also http://mywiki.wooledge.org/BashFAQ/050

Why are commands executed in backquotes giving me different results when done in as script?

I have a script that I mean to be run from cron that ensures that a daemon that I wrote is working. The contents of the script file are similar to the following:
daemon_pid=`ps -A | grep -c fsdaemon`
echo "daemon_pid: " $daemon_pid
if [ $daemon_pid -eq 0 ]; then
echo "restarting fsdaemon"
/etc/init.d/fsdaemon start
fi
When I execute this script from the command prompt, the line that echoes the value of $daemon_pid is reporting a value of 2. This value is two regardless of whether my daemon is running or not. If, however, I execute the command with back quotes and then examine the $daemon_pid variable, the value of $daemon_pid is now one. I have also tried single stepping through the script using bashdb and, when I examine the variables using that tool, they are what they should be.
My question therefore is: why is there a difference in the behaviour between when the script is executed by the shell versus when the commands in the script are executed manually? I'm sure that there is something very fundamental that I am missing.
You're very likely encountering the grep as part of the 'answer' from ps.
To help fully understand what is happening, turn off the -c option, to see what data is being returned from just ps -A | grep fsdameon.
To solve the issue, some systems have a p(rocess)grep (pgrep). That will work, OR
ps -A | grep -v grep | grep -c fsdaemon
Is a common idiom you will see, but at the expense of another process.
The cleanest solution is,
ps -A | grep -c '[f]sdaemon'
The regular expression syntax should work with all greps, on all systems.
I hope this helps.
The problem is that grep itself shows up... Try running this command with anything after grep -c:
eple:~ erik$ ps -a | grep -c asdfladsf
1
eple:~ erik$ ps -a | grep -c gooblygoolbygookeydookey
1
eple:~ erik$
What does ps -a | grep fsdaemon return? Just look at the processes actually listed... :)
Since this is Linux, why not try the pgrep? This saves you a pipe, and you don't end up with grep reporting back the daemon script itself running.
Aany process with arguments including that name will add to the count - grep, and your script.
psing for a process isn't really reliable, you should use a lock file.
As several people have pointed out already, your process count is inflated because ps | grep detects (1) the script itself and (2) the subprocess created by the backquotes, which inherits the name of the main script. So an easy solution is to change the name of the script to something that doesn't include the name you're looking for. But you can do better.
The "best-practice" solution that I would suggest is to use the facilities provided by your operating system. It's not uncommon for an init script to create a PID file as part of the process of starting your daemon; in other words, instead of just running the daemon itself, you use a wrapper script that starts the daemon and then writes the process ID to a file somewhere. If start-stop-daemon exists on your system (and I think it's fairly common these days), you can use that like so:
start-stop-daemon --start --quiet --background \
--make-pidfile --pidfile /var/run/fsdaemon.pid -- /usr/bin/fsdaemon
(obviously replace the path /usr/bin/fsdaemon as appropriate) to start it, and then
start-stop-daemon --stop --quiet --pidfile /var/run/fsdaemon.pid
to stop it. start-stop-daemon has other options that might be useful to you, which you can investigate by reading the man page.
If you don't have access to start-stop-daemon, you can write a wrapper script to do basically the same thing, something like this to start:
echo "$$" > /var/run/fsdaemon.pid
exec /usr/bin/fsdaemon
and this to stop:
kill $(< /var/run/fsdaemon/pid)
rm /var/run/fsdaemon.pid
(this is pretty crude, of course, but it should normally work).
Anyway, once you have the setup to generate a PID file, whether by using start-stop-daemon or not, you can update your check script to this:
daemon_pid=`ps --no-headers --pid $(< /var/run/fsdaemon.pid) | wc -l`
if [ $daemon_pid -eq 0 ]; then
echo "restarting fsdaemon"
/etc/init.d/fsdaemon restart
fi
(one would think there would be a concise command to check whether a given PID is running, but I don't know it).
If you don't want to (or can't) create a PID file, I would at least suggest pgrep instead of ps | grep, since pgrep will search directly for a process by name and won't find anything that just happens to include the same string.
daemon_pid=`pgrep -x -c fsdaemon`
if [ $daemon_pid -eq 0 ]; then
echo "restarting fsdaemon"
/etc/init.d/fsdaemon restart
fi
The -x means "match exactly", and -c works as with grep.
By the way, it seems a bit misleading to name your variable daemon_pid when it is actually a count.

What does "who | grep $1" command do in the shell script?

I am learning shell programming from the very basics using the book called Beginning Linux Programming (4th Edition). I am confused by this script with an until-clause:
#!/bin/bash
until who | grep "$1" > /dev/null
do
sleep 60
done
# Now ring the bell and announce the unexpected user.
echo -e '\a'
echo "***** $1 has just logged in *****"
exit 0
My quesiton is what is who | grep "$1" > /dev/null used for here? Why redirect the grep output to /dev/null?
The 'until' loop is used to test a condition, as you mentioned, and will run all the 'do|done' block until the condition present becomes true. In other words, it only executes the code block when the condition present is FALSE, and runs it until it becomes true. The script you are testing is useful for catching a logged in user that you pass as a parameter to the script (hence, the grep "$1", being $1 a positional parameter). It will sleep for a minute (sleep 60) until that user logs in to the system, and then it will exit the loop and do all the '$1 has just logged in' stuff. The redirection of grep output to /dev/null is used to not display the output of the grep comand (you could have used grep -q "$1" and that will achieve the same effect).
Hope to have clarified your doubts.
while and until (and, admittedly, if) look at the exit code of the test, not at any text that may or may not be generated on stdout (or stderr).
I suspect the reason redirection to /dev/null has been used is because the command only generates output if there is a match, most of the time there is (admittedly) none, but when there is, you're not interested in seeing the result.

Resources