Can't solve this error when monitoring an output using sh - linux

I'm working on an optimization, and for that I need to link MATLAB code to a Linux program and keep monitoring its output. I did this using the sh script below, but it wasn't working well, since I couldn't keep track of more than one 'expression'.
#!/bin/bash
../program inputfile &> OutputFile.dat &
tail -f OutputFile.dat | sed -n '/NaN/q;/STOP/q'
killall program
I asked for help here and got a good solution, which partially solved the problem. When running the program from the prompt, it was possible to keep track of those expressions and kill the program when needed. The solution given was:
#!/bin/bash
( stdbuf -o 0 -e 0 ../program inputfile & ) &> OutputFile.dat
sed -n '/NaN/q;/STOP/q' <(tail -f OutputFile.dat)
killall program
When I implemented this in MATLAB and did the 'linkage', the code didn't respond well. After a few minutes of running, MATLAB got stuck and I couldn't get any response from the terminal. When I looked manually at the outputs of my program, I realized that there was no problem with the program itself, and the output was being written normally.
I can't solve this problem. I don't have a lot of experience with sh. I've searched for answers, but I couldn't find any. Alternative suggestions to achieve the same thing are also welcome.
Thanks in advance

The tail -f is causing the hang. You need to also kill the sed/tail process in order to continue.
#!/bin/bash
stdbuf -o 0 -e 0 ../program inputfile &> OutputFile.dat &
# get the process id (pid) of "program"
# (bash sets $! to the pid of the last background process)
program_pid=$!
# put this in the background, too
sed -n '/NaN/q;/STOP/q' <(tail -f OutputFile.dat) &
# get its pid
sed_pid=$!
# wait while "program" and sed are both still running
while ps -p $program_pid && ps -p $sed_pid; do
sleep 1
done >/dev/null
# one (or both) have now ended
if ps -p $program_pid >/dev/null; then
# "program" is still running, and sed must have found a match and ended
echo "found NaN or STOP; killing program"
kill $program_pid
elif ps -p $sed_pid >/dev/null; then
# sed is still running, so program must have finished ok
kill $sed_pid
fi
ref: https://stackoverflow.com/a/2041505/1563960
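An alternative that sidesteps tail -f altogether is to poll the log with grep while the program is alive. The following is only a minimal sketch under stated assumptions, not the method from the answer above: the function name monitor, its argument order, and the 0.2-second poll interval are all made up for illustration, and the whole log is re-read on every poll, which is acceptable only for modest log sizes.

```bash
#!/bin/bash
# monitor PATTERN LOGFILE COMMAND [ARGS...]   (hypothetical helper)
# Runs COMMAND in the background with stdout and stderr sent to LOGFILE,
# and kills it as soon as a line matching the extended regex PATTERN appears.
# Returns 1 if the pattern was seen, otherwise COMMAND's own exit status.
monitor() {
    local pattern=$1 log=$2 pid status=0
    shift 2
    "$@" > "$log" 2>&1 &
    pid=$!
    # Poll for as long as the command is still alive.
    while kill -0 "$pid" 2>/dev/null; do
        if grep -qE -- "$pattern" "$log"; then
            kill "$pid" 2>/dev/null
            wait "$pid" 2>/dev/null || true
            return 1
        fi
        sleep 0.2
    done
    wait "$pid" || status=$?
    # The command may have printed the pattern just before exiting.
    if grep -qE -- "$pattern" "$log"; then
        return 1
    fi
    return "$status"
}
```

A call matching the question's setup might look like monitor 'NaN|STOP' OutputFile.dat stdbuf -o 0 -e 0 ../program inputfile. Because nothing is left blocked on tail -f, the script returns to its caller (MATLAB, in the question) once the program has ended or been killed.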

Related

How to wait on a backgrounded sub-process with `wait` command [duplicate]

Is there any builtin feature in Bash to wait for a process to finish?
The wait command only allows one to wait for child processes to finish.
I would like to know if there is any way to wait for any process to finish before proceeding in any script.
A mechanical way to do this is as follows but I would like to know if there is any builtin feature in Bash.
while ps -p `cat $PID_FILE` > /dev/null; do sleep 1; done
To wait for any process to finish
Linux (doesn't work on Alpine, where ash doesn't support tail --pid):
tail --pid=$pid -f /dev/null
Darwin (requires that $pid has open files):
lsof -p $pid +r 1 &>/dev/null
With timeout (seconds)
Linux:
timeout $timeout tail --pid=$pid -f /dev/null
Darwin (requires that $pid has open files):
lsof -p $pid +r 1m%s -t | grep -qm1 $(date -v+${timeout}S +%s 2>/dev/null || echo INF)
There's no builtin. Use kill -0 in a loop for a workable solution:
anywait(){
    for pid in "$@"; do
        while kill -0 "$pid" 2>/dev/null; do
            sleep 0.5
        done
    done
}
Or as a simpler oneliner for easy one time usage:
while kill -0 PIDS 2> /dev/null; do sleep 1; done;
As noted by several commentators, if you want to wait for processes that you do not have the privilege to send signals to, you have to find some other way to detect whether the process is running to replace the kill -0 $pid call. On Linux, test -d "/proc/$pid" works; on other systems you might have to use pgrep (if available) or something like ps | grep "^$pid ".
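The two checks can be combined into one helper. This is a rough sketch rather than a tested portable tool: the names pid_exists and wait_for_pid, the optional timeout argument, and the 0.2-second poll interval are assumptions for illustration. On systems without /proc the fallback simply reports the process as gone, and a zombie that its parent never reaps still counts as existing.

```bash
#!/bin/bash
# pid_exists PID   (hypothetical helper)
# Succeeds if PID is alive. kill -0 is tried first; if it fails (for example
# because the process belongs to another user), /proc/PID is checked instead.
pid_exists() {
    kill -0 "$1" 2>/dev/null || [ -d "/proc/$1" ]
}

# wait_for_pid PID [TIMEOUT_SECONDS]   (hypothetical helper)
# Polls until PID is gone. Returns 0 when it has exited, 1 on timeout.
# A TIMEOUT_SECONDS of 0 (the default) means wait indefinitely.
wait_for_pid() {
    local pid=$1 timeout=${2:-0} polls=0
    while pid_exists "$pid"; do
        # Five polls of 0.2 s make up one second of timeout.
        if [ "$timeout" -gt 0 ] && [ "$polls" -ge $(( timeout * 5 )) ]; then
            return 1
        fi
        sleep 0.2
        polls=$(( polls + 1 ))
    done
    return 0
}
```

For example, wait_for_pid "$pid" 30 || echo "still running after 30 s".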
I found "kill -0" does not work if the process is owned by root (or other), so I used pgrep and came up with:
while pgrep -u root process_name > /dev/null; do sleep 1; done
This would have the disadvantage of probably matching zombie processes.
This bash script loop ends if the process does not exist, or it's a zombie.
PID=<pid to watch>
while s=`ps -p $PID -o s=` && [[ "$s" && "$s" != 'Z' ]]; do
sleep 1
done
EDIT: The above script was given below by Rockallite. Thanks!
My original answer below works on Linux, relying on procfs, i.e. /proc/. I don't know how portable it is:
while [[ ( -d /proc/$PID ) && ( -z `grep zombie /proc/$PID/status` ) ]]; do
sleep 1
done
It's not limited to shell, but OS's themselves do not have system calls to watch non-child process termination.
FreeBSD and Solaris have the handy pwait(1) utility, which does exactly what you want.
I believe, other modern OSes also have the necessary system calls too (MacOS, for example, implements BSD's kqueue), but not all make it available from command-line.
From the bash manpage
wait [n ...]
Wait for each specified process and return its termination status
Each n may be a process ID or a job specification; if a
job spec is given, all processes in that job's pipeline are
waited for. If n is not given, all currently active child processes
are waited for, and the return status is zero. If n
specifies a non-existent process or job, the return status is
127. Otherwise, the return status is the exit status of the
last process or job waited for.
Okay, so it seems the answer is -- no, there is no built in tool.
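The return statuses described in the manpage excerpt can be observed directly. The snippet below is a small illustration only; the variable names are arbitrary, and the shell's own PID is used as a convenient example of a process that is not one of its children.

```bash
#!/bin/bash
# Start a child that exits with status 3 and collect that status with wait.
(exit 3) &
child_pid=$!
child_status=0
wait "$child_pid" || child_status=$?
echo "child exited with status $child_status"

# The shell's own PID is not one of its children, so wait reports 127.
foreign_status=0
wait "$$" 2>/dev/null || foreign_status=$?
echo "wait on a non-child returned $foreign_status"
```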
After setting /proc/sys/kernel/yama/ptrace_scope to 0, it is possible to use the strace program. Further switches can be used to make it silent, so that it really waits passively:
strace -qqe '' -p <PID>
All these solutions are tested in Ubuntu 14.04:
Solution 1 (by using ps command):
Just to add up to Pierz answer, I would suggest:
while ps axg | grep -vw grep | grep -w process_name > /dev/null; do sleep 1; done
In this case, grep -vw grep ensures that grep matches only process_name and not grep itself. It has the advantage of supporting the cases where the process_name is not at the end of a line at ps axg.
Solution 2 (by using top command and process name):
while [[ $(awk '$12=="process_name" {print $0}' <(top -n 1 -b)) ]]; do sleep 1; done
Replace process_name with the process name that appears in top -n 1 -b. Please keep the quotation marks.
To see the list of processes that you wait for them to be finished, you can run:
while : ; do p=$(awk '$12=="process_name" {print $0}' <(top -n 1 -b)); [[ $p ]] || break; echo $p; sleep 1; done
Solution 3 (by using top command and process ID):
while [[ $(awk '$1=="process_id" {print $0}' <(top -n 1 -b)) ]]; do sleep 1; done
Replace process_id with the process ID of your program.
Blocking solution
Use wait in a loop to block until all of the given processes have terminated:
function anywait()
{
    for pid in "$@"
    do
        wait $pid
        echo "Process $pid terminated"
    done
    echo 'All processes terminated'
}
This function returns as soon as all processes have terminated, without polling, which makes it the most efficient solution. Note that wait only works for child processes of the current shell.
Non-blocking solution
Use kill -0 in a loop to wait for all processes to terminate, with the option of doing something between checks:
function anywait_w_status()
{
    for pid in "$@"
    do
        while kill -0 "$pid" 2>/dev/null
        do
            echo "Process $pid still running..."
            sleep 1
        done
    done
    echo 'All processes terminated'
}
The reaction time is limited by the sleep interval, which is needed to prevent high CPU usage.
A realistic usage:
Wait for all processes to terminate and inform the user about the PIDs that are still running.
function anywait_w_status2()
{
    while true
    do
        alive_pids=()
        for pid in "$@"
        do
            kill -0 "$pid" 2>/dev/null \
                && alive_pids+=("$pid")
        done
        if [ ${#alive_pids[@]} -eq 0 ]
        then
            break
        fi
        echo "Process(es) still running... ${alive_pids[@]}"
        sleep 1
    done
    echo 'All processes terminated'
}
Notes
These functions receive the PIDs as arguments and read them through "$@", which expands to the list of positional parameters.
Had the same issue, I solved the issue killing the process and then waiting for each process to finish using the PROC filesystem:
while [ -e /proc/${pid} ]; do sleep 0.1; done
There is no builtin feature to wait for any process to finish.
You could send kill -0 to any PID found, so you don't get puzzled by zombies and stuff that will still be visible in ps (while still retrieving the PID list using ps).
If you need to both kill a process and wait for it to finish, this can be achieved with killall(1) (based on process names), and start-stop-daemon(8) (based on a pidfile).
To kill all processes matching someproc and wait for them to die:
killall someproc --wait # wait forever until matching processes die
timeout 10s killall someproc --wait # timeout after 10 seconds
(Unfortunately, there's no direct equivalent of --wait with kill for a specific pid).
To kill a process based on a pidfile /var/run/someproc.pid using signal SIGINT, while waiting for it to finish, with SIGKILL being sent after 20 seconds of timeout, use:
start-stop-daemon --stop --signal INT --retry 20 --pidfile /var/run/someproc.pid
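For a single PID, the kill-then-wait behaviour can be approximated with a short polling loop. This is only a sketch under assumptions: kill_and_wait is a hypothetical name, and the SIGTERM-then-SIGKILL escalation and the 0.1-second poll interval are illustrative choices, loosely mirroring what --retry does for start-stop-daemon.

```bash
#!/bin/bash
# kill_and_wait PID [TIMEOUT_SECONDS]   (hypothetical helper)
# Sends SIGTERM to PID and polls until it is gone. If it is still alive after
# TIMEOUT_SECONDS (default 10), sends SIGKILL and returns 1; otherwise returns 0.
kill_and_wait() {
    local pid=$1 timeout=${2:-10} polls=0
    # If the signal cannot be sent, the process is already gone
    # (or we are not permitted to signal it); there is nothing to wait for.
    kill "$pid" 2>/dev/null || return 0
    while kill -0 "$pid" 2>/dev/null; do
        # Ten polls of 0.1 s make up one second of timeout.
        if [ "$polls" -ge $(( timeout * 10 )) ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 0.1
        polls=$(( polls + 1 ))
    done
    return 0
}
```

For example, kill_and_wait "$pid" 20 gives the process 20 seconds to exit cleanly before it is killed.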
Use inotifywait to monitor some file that gets closed, when your process terminates. Example (on Linux):
yourproc >logfile.log & disown
inotifywait -q -e close logfile.log
-e specifies the event to wait for, -q means minimal output only on termination. In this case it will be:
logfile.log CLOSE_WRITE,CLOSE
A single inotifywait command can be used to wait for multiple processes:
yourproc1 >logfile1.log & disown
yourproc2 >logfile2.log & disown
yourproc3 >logfile3.log & disown
inotifywait -q -e close logfile1.log logfile2.log logfile3.log
The output string of inotifywait will tell you, which process terminated. This only works with 'real' files, not with something in /proc/
Rauno Palosaari's solution for Timeout in Seconds Darwin, is an excellent workaround for a UNIX-like OS that does not have GNU tail (it is not specific to Darwin). But, depending on the age of the UNIX-like operating system, the command-line offered is more complex than necessary, and can fail:
lsof -p $pid +r 1m%s -t | grep -qm1 $(date -v+${timeout}S +%s 2>/dev/null || echo INF)
On at least one old UNIX, the lsof argument +r 1m%s fails (even for a superuser):
lsof: can't read kernel name list.
The m%s is an output format specification. A simpler post-processor does not require it. For example, the following command waits on PID 5959 for up to five seconds:
lsof -p 5959 +r 1 | awk '/^=/ { if (T++ >= 5) { exit 1 } }'
In this example, if PID 5959 exits of its own accord before the five seconds elapse, ${?} is 0. If not, ${?} is 1 after five seconds.
It may be worth expressly noting that in +r 1, the 1 is the poll interval (in seconds), so it may be changed to suit the situation.
On a system like OSX you might not have pgrep, so you can try this approach when looking for processes by name:
while ps axg | grep process_name$ > /dev/null; do sleep 1; done
The $ symbol at the end of the process name ensures that grep matches only process_name to the end of line in the ps output and not itself.

Shutdown computer when all instances of a given program have finished

I use the following script to check whether wget has finished downloading. To check for this, I look for its PID, and when it is not found the computer shuts down. This works fine for a single instance of wget; however, I'd like the script to look for all wget programs that are already running.
#!/bin/bash
while kill -0 $(pidof wget) 2> /dev/null; do
for i in '-' '/' '|' '\'
do
echo -ne "\b$i"
sleep 0.1
done
done
poweroff
EDIT: It would be great if the script checked that at least one instance of wget is running, and only then checked whether wget has finished and shut down the computer.
In addition to the other answers, you can satisfy your check for at least one wget pid by initially reading the result of pidof wget into an array, for example:
pids=($(pidof wget))
if ((${#pids[@]} > 0)); then
# do your loop
fi
This also brings up a way to routinely monitor the remaining pids as each wget operation completes, for example,
edit
npids=${#pids[@]} ## save original number of pids
while (( ${#pids[@]} > 0 )); do ## while pids remain
    for ((i = 0; i < npids; i++)); do ## loop, checking remaining pids
        [[ -n ${pids[i]} ]] || continue ## skip entries already removed
        kill -0 "${pids[i]}" 2>/dev/null || unset "pids[i]" ## remove finished pid from array
    done
    ## do your sleep and spin
done
poweroff
There are probably many more ways to do it. This is just one that came to mind.
I don't think kill is the right idea;
maybe something along these lines:
while [ 1 ]
do
live_wgets=0
for pid in `ps -ef | grep '[w]get' | awk '{print $2}'` ; # [w]get keeps grep from matching itself
do
live_wgets=$((live_wgets+1))
done
if test $live_wgets -eq 0; then # shutdown
sudo poweroff; # or whatever that suits
fi
sleep 5; # wait for sometime
done
You can adapt your script in the following way:
#!/bin/bash
spin[0]="-"
spin[1]="\\"
spin[2]="|"
spin[3]="/"
DOWNLOAD=`ps -ef | grep wget | grep -v grep`
while [ -n "$DOWNLOAD" ]; do
for i in "${spin[@]}"
do
DOWNLOAD=`ps -ef | grep wget | grep -v grep`
echo -ne "\b$i"
sleep 0.1
done
done
sudo poweroff
However I would recommend using cron instead of an active waiting approach or even use wait
How to wait in bash for several subprocesses to finish and return exit code !=0 when any subprocess ends with code !=0?

Grep not working in script but on console

I have a problem with a script. I have a voltage meter connected to a serial USB device(ttyUSB1).
The smart meter needs an initial sequence, shortly followed by a second command, before it gives all of its information. That works fine. For example, 1.8.0*00(000898.46) comes in; this is the line I am interested in. The number in brackets is the kWh value I want. If I open a second terminal and run cat /dev/ttyUSB1, it works fine and I can see the information coming in. After 4 to 5 seconds the line I want arrives. But the script is not working. If I start the script in one terminal, it keeps waiting; grep never finishes. If I then start it in a second terminal, the one in the first terminal finishes. Running just grep 1.8.0 /dev/ttyUSB1 -m1 in another terminal also works, but not in the script.
I tried different methods with read and so on; none worked. To be honest, I don't understand much about scripting and always succeed somehow, but here nothing helped :(
Please help. Thank you!
Arne
here the script:
#! /bin/bash
echo start
echo $'\x2f\x3f\x21\x0d' > /dev/ttyUSB1
sleep 1
echo ask
echo $'\x06\x30\x30\x30\x0d' > /dev/ttyUSB1
echo wait
grep 1.8.0 /dev/ttyUSB1 -m1
echo end
You can try creating a file with the voltmeter's output and grep from that file:
#! /bin/bash
dev=/dev/ttyUSB1
file=/tmp/testfile
(tail -f $dev | tee $file) & # let's continuously copy in background
echo start
echo $'\x2f\x3f\x21\x0d' > $dev
sleep 1
echo ask
echo $'\x06\x30\x30\x30\x0d' > $dev
echo wait
grep 1.8.0 $file # lets get the info from the file instead
echo end
sleep 1
exit
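One caveat with grepping the file once: according to the question the wanted line arrives only after 4 to 5 seconds, so a single grep that runs immediately can finish before the line has been written. A possible workaround is to poll the file until the line appears or a timeout expires. The helper below is a sketch under assumptions: the name wait_for_line, the 10-second default timeout, and the 0.2-second poll interval are invented for illustration. It also uses grep -F, so the dots in 1.8.0 are matched literally rather than as "any character".

```bash
#!/bin/bash
# wait_for_line TEXT FILE [TIMEOUT_SECONDS]   (hypothetical helper)
# Polls FILE until a line containing the literal string TEXT shows up, prints
# the first such line and returns 0. Returns 1 if TIMEOUT_SECONDS (default 10)
# pass without a match.
wait_for_line() {
    local text=$1 file=$2 timeout=${3:-10} deadline
    # SECONDS is bash's built-in counter of seconds since the shell started.
    deadline=$(( SECONDS + timeout ))
    until grep -m1 -F -- "$text" "$file" 2>/dev/null; do
        if [ "$SECONDS" -ge "$deadline" ]; then
            return 1
        fi
        sleep 0.2
    done
    return 0
}
```

In the script above, the grep line could then become something like wait_for_line '1.8.0' "$file" 10.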

Bash script optimization for waiting for a particular string in log files

I am using a bash script that calls multiple processes which have to start up in a particular order, and certain actions have to be completed (they then print out certain messages to the logs) before the next one can be started. The bash script has the following code which works really well for most cases:
tail -Fn +1 "$log_file" | while read line; do
if echo "$line" | grep -qEi "$search_text"; then
echo "[INFO] $process_name process started up successfully"
pkill -9 -P $$ tail
return 0
elif echo "$line" | grep -qEi '^error\b'; then
echo "[INFO] ERROR or Exception is thrown listed below. $process_name process startup aborted"
echo " ($line) "
echo "[INFO] Please check $process_name process log file=$log_file for problems"
pkill -9 -P $$ tail
return 1
fi
done
However, when we set the processes to print logging in DEBUG mode, they print so much logging that this script cannot keep up, and it takes about 15 minutes after the process is complete for the bash script to catch up. Is there a way of optimizing this, like changing 'while read line' to 'while read 100 lines', or something like that?
How about not forking up to two grep processes per log line?
tail -Fn +1 "$log_file" | grep -Ei "$search_text|^error\b" | while read line; do
So one long running grep process shall do preprocessing if you will.
Edit: As noted in the comments, it is safer to add --line-buffered to the grep invocation.
Some tips relevant for this script:
Checking that the service is doing its job is a much better check for daemon startup than looking at the log output
You can use grep ... <<<"$line" to execute fewer echos.
You can use tail -f | grep -q ... to avoid the while loop by stopping as soon as there's a matching line.
If you can avoid -i on grep it might be significantly faster to process the input.
Thou shalt not kill -9.
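Taking the "fewer processes per line" idea one step further, the matching can be done inside bash itself with [[ =~ ]], so that no grep is forked at all. This is a sketch, not the original poster's code: classify_startup is a hypothetical name, the return codes 0/1/2 are an assumed convention, and the regex for the error case only approximates grep's ^error\b.

```bash
#!/bin/bash
# classify_startup SEARCH_REGEX   (hypothetical helper)
# Reads log lines from stdin and matches them inside the shell, so no grep
# process is forked per line. Matching is case-insensitive.
# Returns 0 on the first line matching SEARCH_REGEX, 1 on the first line that
# starts with the word "error", and 2 if the input ends without either.
# The body runs in a subshell so that nocasematch does not leak to the caller.
classify_startup() (
    shopt -s nocasematch
    search_re=$1
    error_re='^error([^[:alnum:]_]|$)'   # approximates grep's ^error\b
    while IFS= read -r line; do
        if [[ $line =~ $search_re ]]; then
            echo "started: $line"
            exit 0
        elif [[ $line =~ $error_re ]]; then
            echo "error: $line"
            exit 1
        fi
    done
    exit 2
)
```

It would be fed the same way as the original loop, for example tail -Fn +1 "$log_file" | classify_startup "$search_text". Note that tail itself keeps running until its next write fails, so it may still need to be stopped explicitly.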

How to get watch to run a bash script with quotes

I'm trying to have a lightweight memory profiler for the matlab jobs that are run on my machine. There is either one or zero matlab job instance, but its process id changes frequently (since it is actually called by another script).
So here is the bash script that I put together to log memory usage:
#!/bin/bash
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
if [[ -n $pid ]]
then
\grep VmSize /proc/$pid/status
else
echo "no pid"
fi
when I run this script in bash like this:
./script.sh
it works fine, giving me the following result:
VmSize: 1289004 kB
which is exactly what I want.
Now, I want to run this periodically. So I run it with watch, like this:
watch ./script.sh
But in this case I only receive:
no pid
Please note that I know the matlab job is still running, because I can see it with the same pid on top, and besides, I know each matlab job take several hours to finish.
I'm pretty sure that something is wrong with the quotes I have when setting pid. I just can't figure out how to fix it. Anyone knows what I'm doing wrong?
PS.
In the man page of watch, it says that commands are executed by sh -c. I did run my script like sh -c ./script and it works just fine, but watch doesn't.
Why don't you use a loop with sleep command instead?
For example:
#!/bin/bash
while [ "1" ]
do
    # look up the pid on every pass, since it changes frequently
    pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
    if [[ -n $pid ]]
    then
        \grep VmSize /proc/$pid/status
    else
        echo "no pid"
    fi
    sleep 10
done
Here the script sleeps(waits) for 10 seconds. You can set the interval you need changing the sleep command. For example to make the script sleep for an hour use sleep 1h.
To exit the script press Ctrl - C
This
pid=`ps aux | grep '[M]ATLAB' | awk '{print $2}'`
could be changed to:
pid=$(pidof MATLAB)
I have no idea why it's not working in watch but you could use a cron job and make the script log to a file like so:
#!/bin/bash
pid=$(pidof MATLAB) # Just to follow previously given advice :)
if [[ -n $pid ]]
then
echo "$(date): $(\grep VmSize /proc/$pid/status)" >> logfile
else
echo "$(date): no pid" >> logfile
fi
The >> redirection creates logfile if it does not exist yet, so creating it beforehand with touch is optional.
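Since pidof can print several PIDs, a path like /proc/$pid/status only works when exactly one process matches. The sketch below handles any number of PIDs; it is Linux-only, and the helper names vmsize_kb and report_vmsize as well as the log line format are assumptions made for illustration.

```bash
#!/bin/bash
# vmsize_kb PID   (hypothetical helper)
# Prints the VmSize value (in kB) from /proc/PID/status, or nothing if the
# process does not exist. Linux only.
vmsize_kb() {
    awk '/^VmSize:/ { print $2 }' "/proc/$1/status" 2>/dev/null
}

# report_vmsize PID...   (hypothetical helper)
# Prints one timestamped line per PID, so the output can be appended to a log.
report_vmsize() {
    local pid size
    if [ "$#" -eq 0 ]; then
        echo "$(date '+%F %T') no pid"
        return 0
    fi
    for pid in "$@"; do
        size=$(vmsize_kb "$pid") || size=""
        if [ -n "$size" ]; then
            echo "$(date '+%F %T') pid=$pid VmSize=$size kB"
        else
            echo "$(date '+%F %T') pid=$pid not running"
        fi
    done
}
```

From cron or a sleep loop this could be called as report_vmsize $(pidof MATLAB) >> logfile, with the command substitution deliberately left unquoted so that each PID becomes its own argument.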
You might try just running ps command in watch. I have had issues in the past with watch chopping lines and such when they get too long.
It can be fixed by making the terminal you are running the command from wider or changing the column like this (may need to adjust the 160 to your liking):
export COLUMNS=160;

Resources