count processes in shell script [duplicate] - linux

Possible Duplicate (closed 10 years ago): Quick-and-dirty way to ensure only one instance of a shell script is running at a time
I am new to shell scripting. What I want to do is avoid running multiple instances of a script. I have this shell script, cntps.sh:
#!/bin/bash
cnt=`ps -e|grep "cntps"|grep -v "grep"`
echo $cnt >> ~/cntps.log
if [ $cnt < 1 ];
then
#do something.
else
exit 0
fi
If I run it as $ ./cntps.sh, it echoes 2.
If I run it as $ . ./cntps.sh, it echoes 0.
If I run it from crontab, it echoes 3.
Could somebody explain to me why is this happening?
And what is the proper way to avoid running multiple instances of a script?

I changed your command slightly to output ps to a log file so we can see what is going on.
cnt=`ps -ef| tee log | grep "cntps"|grep -v "grep" | wc -l`
This is what I saw:
32427 -bash
20430 /bin/bash ./cntps.sh
20431 /bin/bash ./cntps.sh
20432 ps -ef
20433 tee log
20434 grep cntps
20435 grep -v grep
20436 wc -l
As you can see, my terminal's shell (32427) spawns a new shell (20430) to run the script. The script then spawns another child shell (20431) for command substitution (`ps -ef | ...`).
So, the count of two is due to:
20430 /bin/bash ./cntps.sh
20431 /bin/bash ./cntps.sh
In any case, this is not a good way to ensure that only one process is running. See this SO question instead.
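A minimal sketch of one common single-instance approach, using flock(1) on a dedicated file descriptor (the lock path /tmp/cntps.lock is arbitrary; flock is part of util-linux and may not exist on every platform):

```shell
#!/bin/bash
# Open a file descriptor on the lock file, then try to take an exclusive,
# non-blocking lock; fail fast if another instance already holds it.
exec 200>/tmp/cntps.lock
if ! flock -n 200; then
    echo "another instance is already running" >&2
    exit 1
fi
echo "lock acquired"
# ... script body runs here; the lock is released when the script exits ...
```

Because the lock is tied to the open file descriptor, it is released automatically even if the script crashes, which a hand-rolled lock file cannot guarantee.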

Firstly, I would recommend using pgrep rather than this method. Secondly, I presume you're missing a wc -l to count the number of instances from the script.
In answer to your counting problems:
if I run it this way $./cntps.sh, it echoes 2
This is because the backtick call (ps -e ...) runs in a subshell whose command line is also ./cntps.sh, so the grep counts two entries.
if I run it this way $. ./cntps.sh, it echoes 0
This happens because you're not running it, but sourcing it into the currently running shell, so no process named cntps.sh is running at all.
if I run it with crontab, it echoes 3
Two from the invocation, and one from the crontab invocation itself, which spawns sh -c 'path/to/cntps.sh'.
Please see this question for how to do a single instance shell script.
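As suggested above, pgrep can produce the count directly, with no grep -v dance (a sketch; cntps.sh is the script name from the question):

```shell
#!/bin/bash
# pgrep -c prints the number of matching processes; -f matches against the
# full command line, which is needed to find a script by its file name.
# pgrep never counts itself, so no "grep -v grep" filtering is required.
cnt=$(pgrep -c -f "cntps.sh" || true)   # pgrep exits 1 when nothing matches
echo "instances: $cnt"
```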

Use a "lock" file as a mutex:
#!/bin/bash
if [ ! -e /tmp/cntps.lock ]; then
    touch /tmp/cntps.lock        # create the lock file
    # ... execute script body here ...
    rm -f /tmp/cntps.lock        # delete the lock file
else
    echo "another instance is running!"
fi
exit 0
(Note that the check-then-create above is not atomic; mkdir, which fails if the directory already exists, or flock(1) closes that race.)

Related

Run command as another user in Linux with `su - user -c` creates a duplicate process

I want to run a process by a service as root user, because the daemon may have its own user.
But when I run it with system("su - root -c ./testbin"), the system shows two processes (I check this via ps aux | grep testbin):
su - root -c ./testbin
and
./testbin
How to achieve a single process?
The "su" process cannot be avoided, but you can get it to end before your testbin does.
Your original question using "sleep" looks like this:
(su - root -c "sleep 120" &) ; ps aux | grep sleep
If you execute that line multiple times you will see multiple "su" processes as a result of the grep.
Backgrounding the subprocess allows the su process to end, like this:
(su - root -c "sleep 120 &" &) ; ps aux | grep sleep
When you execute that line multiple times you can see that the "su" processes disappear from the list but that the sleep commands continue.
Note that the ampersand inside the double quotes backgrounds the subprocess, while the ampersand just before the closing parenthesis backgrounds the 'su' command itself; the latter is only there so the whole test fits on a single line and runs faster.
I checked whether an equivalent of 'execv' exists for the command line, but that does not seem to be the case. Also, 'su' runs with the permissions of the caller, while the subprocess of su runs with the permissions of the user it switched to. It seems logical that you cannot replace the 'su' process with its child the way 'execv' does in C, for security reasons.
ps aux | grep testbin | grep -v grep
I think you are just seeing your own "grep" process. Use the command above to exclude it.
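Alternatively, pgrep never matches its own process, so no extra filtering is needed at all (testbin is the binary name from the question):

```shell
#!/bin/bash
# -x requires an exact match on the process name, so neither pgrep itself nor
# this script can show up as a false positive in the result.
if pgrep -x testbin >/dev/null; then
    echo "testbin is running"
else
    echo "testbin is not running"
fi
```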

Launching a bash shell from a sudo-ed environment

Apologies for the confusing question title. I am trying to launch an interactive bash shell from a shell script (say shel2.sh) which has been launched by a parent script (shel1.sh) in a sudo-ed environment. (I am creating a guided deployment script for my software which needs to be installed as super-user, hence the sudo, but the user may still need access to a shell.)
Here's shel1.sh
#!/bin/bash
set -x
sudo bash << EOF
echo $?
./shel2.sh
EOF
echo shel1 done
And here's shel2.sh
#!/bin/bash
set -x
bash --norc --verbose --noprofile -i
echo $?
echo done
I expected this to launch an interactive bash shell which waits for my input before returning to shel1.sh. This is what I see:
+ ./shel1.sh
+ sudo bash
0
+ bash --norc --verbose --noprofile -i
bash-4.3# exit
+ echo 0
0
+ echo done
done
+ echo shel1 done
shel1 done
The bash-4.3# prompt receives an exit automatically and quits. Interestingly, if I invoke the bash shell with -l (or --login), the automatic entry is logout!
Can someone explain what is happening here ?
When you use a here document, you are tying up the shell's -- and its spawned child processes' -- standard input to the here document input.
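You can see this directly: feed bash a script on stdin, and a read inside it consumes the next script line instead of waiting for the terminal (a small demo, not from the question):

```shell
#!/bin/bash
# The here document IS bash's stdin here, so `read` swallows the second line
# of the "script" rather than reading from the terminal; only "done" prints.
bash <<'EOF'
read line
echo "this line is consumed by read, not executed"
echo done
EOF
```

The same mechanism explains why the interactive shell inside the here document exits immediately: its stdin is the (already exhausted) here document, not the terminal.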
You can avoid using a here document in many situations. For example, replace the here document with a single-quoted string.
#!/bin/bash
set -x
sudo bash -c '
# Aside: How is this actually useful?
echo $?
# Spawned script inherits the stdin of "sudo bash"
./shel2.sh'
echo shel1 done
Without more details, it's hard to see where exactly you want to go with this, but most modern Linux platforms have package managers which allow all kinds of hooks for installation, so that you would typically not need to do this sort of thing. Have you looked into that?

How do I find the current process running on a particular PBS job

I am trying to write a script to provide diagnostics on processes. I have submitted a script to a job scheduling server using qsub. I can easily find the node that the job gets sent to. But I would like to be able to find what process is currently being run.
ie. I have a list of different commands in the submitted script, how can I find the current one that is running, and the arguments passed to it?
example of commands in script
matlab -nodesktop -nosplash -r "display('here'),quit"
python runsomethings.py
I would like to see whether the nodes is currently executing the first or second line.
When you submit a job, pbs_server passes your task to pbs_mom. The pbs_mom process/daemon actually executes your script on the execution node. It
"creates a new session as identical user."
This means invoking a shell; you choose which shell with the shebang at the top of the script (e.g. #!/bin/bash).
Clearly, pbs_mom stores the process (shell) PID somewhere, both to kill the job and to monitor whether the job (the shell process) has finished.
Update, based on Dmitri Chubarov's comment: pbs_mom stores the subshell PID internally in memory after calling fork(), and in the .TK file under the torque installation directory (/var/spool/torque/mom_priv/jobs on my system).
Dumping file internals in decimal mode (<job_number>, <queue_name> should be your own values):
$ hexdump -d /var/spool/torque/mom_priv/jobs/<job_number>.<queue_name>.TK
revealed that, in my torque installation, it is stored at position
00000890 + offset 4*2 = 00000898 (the hex offset of the first byte of the PID in the .TK file) and has a length of 2 bytes.
For example, for shell PID=27110 I have:
0000890 00001 00000 00001 00000 27110 00000 00000 00000
Let's recover PID from .TK file:
$ hexdump -s 2200 -n 2 -d /var/spool/torque/mom_priv/jobs/<job_number>.<queue_name>.TK | tr -s ' ' | cut -s -d' ' -f 2
27110
This way you've found subshell PID.
Now, monitor the process list on the execution node and find the names of the child processes (the getcpid function is a slightly modified version of one posted earlier on SO):
function getcpid() {
    cpids=$(pgrep -P "$1" | xargs)
    for cpid in $cpids; do
        ps -p "$cpid" -o comm=
        getcpid "$cpid"
    done
}
At last,
getcpid <your_PID>
gives you the child processes' names (note that there will be some garbage lines, like task numbers). This way you will finally know what command is currently running on the execution node.
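For a quick local sanity check of the function, a backgrounded sleep can stand in for the job's current command (a demo, not part of the original answer):

```shell
#!/bin/bash
# Walk the process tree below a PID and print each descendant's command name.
getcpid() {
    local cpid
    for cpid in $(pgrep -P "$1"); do
        ps -p "$cpid" -o comm= || true   # the pid may already be gone
        getcpid "$cpid"
    done
}

sleep 30 &        # a child process for the demo
spid=$!
getcpid $$        # should list "sleep" among this shell's children
kill "$spid"      # clean up the demo child
```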
Of course, for each task monitored, you should obtain the PID and process name on the execution node after doing
ssh <your node>
You can automatically retrieve node name(s) in <node/proc+node/proc+...> format (process it further to obtain bare node names):
qstat -n <job number> | awk '{print $NF}' | grep <pattern_for_your_node_names>
Note:
The PID method is reliable and, I believe, optimal.
Searching by name is worse: it gives an unambiguous result only if you invoke different commands in your scripts and no other user runs the same software on the node.
ssh <your node>
ps aux | grep matlab
That will tell you whether matlab is running.
A simple and elegant way to do it is to print to a log file:
ARGS=" $A $B $test "
echo "running MATLAB now with args: $ARGS" >> $LOGFILE
matlab -nodesktop -nosplash -r "display('here'),quit"
PYARGS="$X $Y"
echo "running Python now with args: $PYARGS" >> $LOGFILE
python runsomethings.py
Then monitor the output of $LOGFILE using tail -f $LOGFILE.

Bash script commands not running

I have seafile (http://www.seafile.com/en/home/) running on my NAS, and I set up a crontab that runs a script every few minutes to check whether the seafile server is up and, if not, start it.
The script looks like this:
#!/bin/bash
# exit if process is running
if ps aux | grep "[s]eafile" > /dev/null
then
    exit
else
    # restart process
    /home/simon/seafile/seafile-server-latest/seafile.sh start
    /home/simon/seafile/seafile-server-latest/seahub.sh start-fastcgi
fi
Running /home/simon/seafile/seafile-server-latest/seafile.sh start and /home/simon/seafile/seafile-server-latest/seahub.sh start-fastcgi individually/manually works without a problem, but when I try to run this script file manually, neither of those lines executes and seafile/seahub do not start.
Is there an error in my script that is preventing execution of those two lines? I've made sure to chmod the script file to 755.
The problem is likely that when you pipe commands into one another, you have no guarantee that the second command doesn't start before the first (it can start, but then do nothing while it waits for input). For example:
oj#ironhide:~$ ps -ef | grep foo
oj 8227 8207 0 13:54 pts/1 00:00:00 grep foo
There is no process containing the word "foo" running on my machine, but the grep that I'm piping ps to appears in the process list that ps produces.
You could try using pgrep instead, which is pretty much designed for this sort of thing:
if pgrep "[s]eafile"
Or you could add another pipe to filter out results that include grep:
ps aux | grep "[s]eafile" | grep -v grep
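As an aside, the bracket pattern in the question's script already prevents the self-match: the regex [s]eafile matches the string "seafile", but grep's own command line contains the literal characters "[s]eafile", which that regex does not match. A quick demonstration (assuming no seafile process is actually running):

```shell
#!/bin/bash
# The grep process itself is never matched by its own bracketed pattern,
# so with no real seafile process the pipeline finds nothing.
if ps -ef | grep "[s]eafile"; then
    echo "matched: a seafile process is running"
else
    echo "no match: grep excluded itself"
fi
```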
If the name of this script matches the regex [s]eafile, it will trivially always take the exit branch.
You should probably be using pidof in preference to reinventing the yak shed anyway.
It turns out the script itself was working OK, although the change to using pgrep is much nicer. The problem was actually in the crontab (I didn't include the sh in the command).

Redirecting Output of Bash Child Scripts

I have a basic script that outputs various status messages. e.g.
~$ ./myscript.sh
0 of 100
1 of 100
2 of 100
...
I wanted to wrap this in a parent script, in order to run a sequence of child-scripts and send an email upon overall completion, e.g. topscript.sh
#!/bin/bash
START=$(date +%s)
/usr/local/bin/myscript.sh
/usr/local/bin/otherscript.sh
/usr/local/bin/anotherscript.sh
RET=$?
END=$(date +%s)
echo -e "Subject:Task Complete\nBegan on $START and finished at $END and exited with status $RET.\n" | sendmail -v group@mydomain.com
I'm running this like:
~$ topscript.sh >/var/log/topscript.log 2>&1
However, when I run tail -f /var/log/topscript.log to inspect the log I see nothing, even though running top shows myscript.sh is currently being executed, and therefore, presumably outputting status messages.
Why isn't the stdout/stderr from the child scripts being captured in the parent's log? How do I fix this?
EDIT: I'm also running these on a remote machine, connected via ssh with pseudo-tty allocation (ssh -t user@host). Could the pseudo-tty be interfering?
I just tried the following: I have three files t1.sh, t2.sh, and t3.sh, all with the following content:
#!/bin/bash
for ((i=0; i<10; i++)); do
    echo $i of 9
    sleep 1
done
And a script called myscript.sh with the following content:
#!/bin/bash
./t1.sh
./t2.sh
./t3.sh
echo "All Done"
When I run ./myscript.sh > topscript.log 2>&1 and then in another terminal run tail -f topscript.log I see the lines being output just fine in the log file.
Perhaps the things being run in your subscripts use a large output buffer? I know when I've run python scripts before, it has a pretty big output buffer so you don't see any output for a while. Do you actually see the entire output in the email that gets sent out at the end of topscript.sh? Is it just that while the processes run you're not seeing the output?
Try:
unbuffer topscript.sh >/var/log/topscript.log 2>&1
Note that unbuffer is not always available as a standard binary on older Unix platforms and may require finding and installing a package (it is part of expect).
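If unbuffer isn't available, GNU coreutils' stdbuf can serve a similar purpose by forcing line-buffered stdout on each child (a sketch; /bin/echo stands in for the question's child scripts, and the real use would be stdbuf -oL ./myscript.sh):

```shell
#!/bin/bash
# stdbuf -oL line-buffers the child's stdout, so each progress line is
# flushed to the log as it is produced instead of when a block buffer fills.
stdbuf -oL /bin/echo "0 of 100"
stdbuf -oL /bin/echo "1 of 100"
```

Note that stdbuf only affects programs that use C stdio buffering; it cannot change programs that manage their own buffers.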
I hope this helps.
