Why can the command 'exec' remove the blocking state of a FIFO file?

I am studying how to use multiple threads to process tasks, and I noticed that a FIFO file can help with that. Here is the script:
#!/bin/bash
my_cmd(){
echo "process $1"
sleep 3
}
ff="d:/myfifo/$$.fifo"
mkfifo $ff
exec 7<>$ff
for i in {1..10};do echo;done >&7
for i in {1..1000};do {
read -u 7
my_cmd $i
echo >&7
}& done
rm $ff
wait
echo "end"
This shell script runs normally (it processes 1000 commands, 10 at a time). I then modified the script slightly:
#!/bin/bash
my_cmd(){
echo "process $1"
sleep 3
}
ff="d:/myfifo/$$.fifo"
mkfifo $ff
exec 7<>$ff
for i in {1..10};do echo;done >$ff # modified
for i in {1..1000};do {
read <$ff # modified
my_cmd $i
echo >$ff # modified
}& done
wait
rm $ff
echo "end"
As expected, the second script also runs normally. But I made an error when I modified it again:
#!/bin/bash
my_cmd(){
echo "process $1"
sleep 3
}
ff="d:/myfifo/$$.fifo"
mkfifo $ff
# exec 7<>$ff modified
for i in {1..10};do echo;done >$ff
for i in {1..1000};do {
read <$ff
my_cmd $i
echo >$ff
}& done
wait
rm $ff
echo "end"
The script hangs waiting for input on the FIFO file, because the FIFO has entered a blocking state. It seems that the command 'exec 7<>$ff' lifted the blocking state of the FIFO file. Is this the case?

On Linux, at least (I'm not sure about other OSes, and POSIX doesn't define the behavior), opening a FIFO for both reading and writing succeeds at once, without blocking until the other end of the pipe is opened.
So when you commented out the exec 7<>$ff line, the next line, for i in {1..10};do echo;done >$ff, opens the FIFO for writing only and blocks until something else opens it for reading. In the original version, the exec had already opened it for reading, so there was no need to block.
The Linux fifo(7) documentation does note:
A process that uses both ends of the connection in order to communicate with itself should be very careful to avoid deadlocks.
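The difference between the two kinds of open can be observed directly. The following is a minimal sketch, assuming Linux, bash, and the GNU coreutils timeout command; the temporary path and the variable names are illustrative and not taken from the question.

```shell
#!/bin/bash
# Sketch only: assumes Linux and GNU coreutils `timeout`.
dir=$(mktemp -d)
ff="$dir/demo.fifo"
mkfifo "$ff"

# Write-only open with no reader: open() blocks, so timeout has to kill it.
timeout 1 bash -c 'echo > "$1"' _ "$ff"
write_only_status=$?    # timeout exits with 124 when the time limit expires

# Read-write open returns immediately on Linux and holds both ends open.
exec 7<>"$ff"
echo token >&7          # fd 7 is also a reader, so this write does not block
read -r -u 7 line       # read the token back from the same descriptor
exec 7>&-               # close the descriptor again
rm -rf "$dir"

echo "write-only status: $write_only_status, read back: $line"
```

The write-only attempt runs before fd 7 is opened on purpose: once the shell holds the FIFO open read-write, a reader exists and later write-only opens no longer block.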

Related

Bash - get process pid and parse output line by line

I am trying to write a bash script that runs a process and parses its output line by line (like here).
I would also like to get the process PID so for each line I can run ps and get CPU and memory usage (and print them with the output line).
I know I can get the PID with $! if I run the process in background, but then I won't know how to read the output.
Thanks in advance
You can create a FIFO and read that while the background process is running. For each line you read, you can do whatever you want with the child_pid.
First we need a small sample program:
bgp() { sleep 1; echo 1; sleep 1; echo 2; sleep 1; echo 3; }
Then create a fifo (maybe choose some path in /tmp)
tmp_fifo="/tmp/my_fifo"
rm "${tmp_fifo}" &>/dev/null
mkfifo "${tmp_fifo}"
Start your process in the background and redirect the output to the FIFO:
bgp > "${tmp_fifo}" &
child_pid=$!
Now read the output until the child process dies.
while true; do
if jobs %% >&/dev/null; then
if read -r -u 9 line; then
# Do whatever with $child_pid
echo -n "output from [$child_pid]: "
echo "$line"
fi
else
echo "info: child finished or died"
break
fi
done 9< "${tmp_fifo}"
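A simpler variant is possible when the background process is the only writer: reading then ends with EOF when that process closes the FIFO, so the loop does not need to poll jobs. This is a sketch under that assumption; the sample function bgp is shortened and the temporary directory is illustrative.

```shell
#!/bin/bash
# Sketch only: relies on the background process being the FIFO's sole writer.
bgp() { echo 1; echo 2; echo 3; }

tmp_dir=$(mktemp -d)
tmp_fifo="$tmp_dir/my_fifo"
mkfifo "$tmp_fifo"

bgp > "$tmp_fifo" &
child_pid=$!

collected=()
# read returns non-zero at EOF, i.e. once the writer has closed the FIFO.
while IFS= read -r line; do
    collected+=("output from [$child_pid]: $line")
done < "$tmp_fifo"

wait "$child_pid"       # reap the child and pick up its exit status
child_status=$?
rm -rf "$tmp_dir"

printf '%s\n' "${collected[@]}"
```

Because the loop only ends at EOF, no output is lost if the child exits before its last lines have been read.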

Filtering shell script output within itself, the script is not terminated

I want a Bash script to generate some output messages. The script is supposed to capture messages, do some filtering, transform, and then output them to the screen.
The filtered results are correct in the output, but the script is not terminated. I must press a return key to finish it. How do I fix it?
Demo script:
#!/bin/bash
exec &> >(
{
while read line; do
[ "$line" = "exit" ] && break
echo "`date +%H:%M:%S.%N` $line"
done
echo "while finish"
} )
for ((i=3;i--;)); do
echo "text $i"
done
echo "exit"
The script does terminate, but the script itself finishes before the background process that writes the output does, so the prompt is displayed first, then the output, leaving your terminal with a blank line that looks like the script is still running. You could type any command instead of hitting return, and that command would execute.
To avoid this, you need to run the while loop in an explicit background job that you can wait on before exiting your script.
mkfifo logpipe
trap 'wait $logger_pid; rm logpipe' EXIT
while read line; do
[ "$line" = "exit" ] && break
echo "$(date +%H:%M:%S.%N) $line"
done < logpipe &
logger_pid=$!
exec &> logpipe
# ==========
for ((i=3;i--;)); do
echo "text $i"
done
echo "exit"
The while loop runs in the background, reading its input from the named pipe logpipe. Once that is running, you can redirect all your output to the pipe and start your "main" script. The exit trap ensures that your script doesn't actually exit until the while loop completes; it also cleans up the named pipe for you.
You might not have noticed yet, but there is no guarantee that the while loop will receive the merged standard output and standard error in the exact order in which things are written to them. For instance,
echo out1
echo err1 >&2
echo out2
echo err2 >&2
may end up being read as
out1
err1
err2
out2
Each stream itself will remain in order, but the two could be arbitrarily merged.

Can I detect early exit from a long-running, backgrounded process?

I'm trying to improve the startup scripts for several servers running in a cluster environment. The server processes should run indefinitely but occasionally fail on startup, issuing e.g. Address already in use exceptions.
I'd like the exit code for the startup script to reflect these early terminations by, say, waiting for 1 second and telling me if the server seems to have started okay. I also need the server PID echoed.
Here's my best shot so far:
$ cat startup.sh
# start the server in the bg but if it fails in the first second,
# then kill startup.sh.
CMD="start_server -option1 foo -option2 bar"
eval "($CMD >> cc.log 2>&1 || kill -9 $$ &)"
SERVER_PID=$!
# the `kill` above only has 1 second to kill me-- otherwise my exit code is 0
sleep 1
echo $SERVER_PID
The exit code works fine but two problems remain:
If the server is long-running but eventually encounters an error, the parent startup.sh will have exited already and the $$ PID may have been reused by an unrelated process which this script will then kill off.
The SERVER_PID isn't correct since it's the PID of the subshell rather than the start_server command (which in this case is a grandchild of the startup.sh script).
Is there a simpler way to background the start_server process, get its PID, and use a timeout'ed check for error codes? I looked into bash builtins wait and timeout but they don't seem to work for processes that shouldn't exit in the end.
I can't change the server code and the startup script should not run indefinitely.
You can also use coproc (and look, I'm putting the command in an array, and also with proper quoting!):
#!/bin/bash
cmd=( start_server -option1 foo -option2 bar )
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; }
server_pid=$!
sleep 1
if [[ -z "${mycoprocfd[@]}" ]]; then
echo >&2 "Failure detected when starting server! Server died before 1 second."
exit 1
else
echo $server_pid
fi
The trick is that coproc puts the file descriptors of the redirections of stdin and stdout in a prescribed array (here mycoprocfd) and empties the array when the process exits. So you don't need to do clumsy stuff with the PID itself.
You can hence check for the server to never exit as so:
#!/bin/bash
cmd=( start_server -option1 foo -option2 bar )
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; }
server_pid=$!
read -u "${mycoprocfd[0]}"
echo >&2 "Oh dear, the server with PID $server_pid died after $SECONDS seconds."
exit 1
That's because read will read on the file descriptor given by coproc (but nothing is ever read here, since the stdout of your command has been redirected to a file!), and read exits when the file descriptor is closed, i.e., when the command launched by coproc exits.
I'd say this is a really elegant solution!
Now, this script will live as long as the coproc lives. I understand that's not what you want. In this case, you can time out the read with its -t option and use the fact that read's exit status is greater than 128 if it timed out. E.g., for a 4.5-second timeout:
#!/bin/bash
timeout=4.5
cmd=( start_server -option1 foo -option2 bar )
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; }
server_pid=$!
read -t $timeout -u "${mycoprocfd[0]}"
if (($?>128)); then
echo "$server_pid <-- all is good, it's still alive after $timeout seconds."
else
echo >&2 "Oh dear, the server with PID $server_pid died after $timeout seconds."
exit 1
fi
exit 0 # Yay
This is also very elegant :).
Use, extend, and adapt to your needs! (but with good practices!)
Hope this helps!
Remarks.
coproc is a bash keyword that appeared in bash 4.0. The solutions shown here are 100% pure bash (except the first one, with sleep, which is not the best one at all!).
The use of coproc in scripts is almost always superior to putting jobs in background with & and doing clumsy and awkward stuff with sleep and checking $!.
If you want coproc to keep quiet, whatever happens (e.g., if there's an error launching the command, which is fine here since you're handling everything yourself), do:
coproc mycoprocfd { "${cmd[@]}" >> cc.log 2>&1 ; } > /dev/null 2>&1
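The timeout check can be tried out without a real server. The sketch below wraps it in a hypothetical helper, still_alive_after, and uses sleep and sh as stand-ins for start_server; it assumes bash 4 or later for coproc and fractional read timeouts.

```shell
#!/bin/bash
# Sketch only: still_alive_after is a hypothetical helper, and the commands
# passed to it are stand-ins for a real server start command.
still_alive_after() {
    local timeout=$1; shift
    coproc probe { "$@" >/dev/null 2>&1; }
    local pid=$probe_PID fd=${probe[0]}
    # Nothing is ever written to the coproc's stdout, so read can only end
    # by timing out (status > 128) or at EOF when the process exits.
    read -r -t "$timeout" -u "$fd"
    local rc=$?
    if (( rc > 128 )); then
        return 0            # still running after the timeout
    fi
    wait "$pid" 2>/dev/null # process is gone; reap it
    return 1
}

# A command that outlives the timeout is reported as alive.
still_alive_after 0.5 sleep 2
alive_status=$?
kill "$probe_PID" 2>/dev/null   # clean up the stand-in "server"
wait "$probe_PID" 2>/dev/null

# A command that exits early is reported as dead.
still_alive_after 2 sh -c 'sleep 0.2; exit 1'
dead_status=$?

echo "alive check: $alive_status, early-exit check: $dead_status"
```

The PID and file descriptor are copied into local variables right after coproc starts, because bash unsets the coproc's variables once the process has exited.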
20 minutes of more googling revealed https://stackoverflow.com/a/6756971/494983 and kill -0 $PID from https://stackoverflow.com/a/14296353/494983.
So it seems I can use:
$ cat startup.sh
CMD="start_server -option1 foo -option2 bar"
eval "$CMD >> cc.log 2>&1 &"
SERVER_PID=$!
sleep 1
kill -0 $SERVER_PID
if [ $? != 0 ]; then
echo "Failure detected when starting server! PID $SERVER_PID doesn't exist!" 1>&2
exit 1
else
echo $SERVER_PID
fi
This wouldn't work for processes that I can't send signals to but works well enough in my case (where startup.sh starts the server itself).

Bash: How do I make sub-processes of a script be terminated, when the script is terminated?

The question applies to a script such as the following:
Script
#!/bin/sh
SRC="/tmp/my-server-logs"
echo "STARTING GREP JOBS..."
for f in `find ${SRC} -name '*log*2011*' | sort --reverse`
do
(
OUT=`nice grep -ci -E "${1}" "${f}"`
if [ "${OUT}" != "0" ]
then
printf '%7s : %s\n' "${OUT}" "${f}"
else
printf '%7s %s\n' "(none)" "${f}"
fi
) &
done
echo "WAITING..."
wait
echo "FINISHED!"
Current behavior
Pressing Ctrl+C in console terminates the script but not the already running grep processes.
Write a trap for Ctrl+C, and in the trap kill all of the subprocesses. Put this before your wait command.
function handle_sigint()
{
for proc in `jobs -p`
do
kill $proc
done
}
trap handle_sigint SIGINT
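To check that such a handler really terminates the background jobs, it can be called directly instead of pressing Ctrl+C. This is a minimal sketch assuming bash; kill_children is a hypothetical name, and the two sleep commands stand in for the grep jobs.

```shell
#!/bin/bash
# Sketch only: kill_children is a hypothetical handler name.
kill_children() {
    local pids
    pids=$(jobs -p)
    # Unquoted on purpose: $pids holds one PID per line.
    [ -n "$pids" ] && kill $pids 2>/dev/null
}
trap 'kill_children; exit 130' INT

sleep 30 & pid1=$!
sleep 30 & pid2=$!

kill_children    # what the trap would do on Ctrl+C
wait             # reap the terminated jobs

# Count how many of the background jobs are still around.
survivors=0
for pid in "$pid1" "$pid2"; do
    kill -0 "$pid" 2>/dev/null && survivors=$((survivors + 1))
done
echo "survivors: $survivors"
```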
A simple alternative is using a cat pipe. The following worked for me:
echo "-" > test.text;
for x in 1 2 3; do
( sleep $x; echo $x | tee --append test.text; ) &
done | cat
This works if I press Ctrl-C before the last number is printed to stdout. It also works if the text-generating command is something that takes a long time, such as "find /"; that is, it is not only the connection to stdout through cat that is killed but the child process itself.
For large scripts that make extensive use of subprocesses, the easiest way to ensure the intended Ctrl-C behaviour is to wrap the whole script in such a subshell, e.g.
#!/usr/bin/bash
(
...
) | cat
I am not sure, though, whether this has exactly the same effect as Andrew's answer (i.e. I'm not sure what signal is sent to the subprocesses). Also, I only tested this with cygwin, not with a native Linux shell.

Using named pipes with bash - Problem with data loss

Did some search online, found simple 'tutorials' to use named pipes. However when I do anything with background jobs I seem to lose a lot of data.
[[Edit: found a much simpler solution, see reply to post. So the question I put forward is now academic - in case one might want a job server]]
Using Ubuntu 10.04 with Linux 2.6.32-25-generic #45-Ubuntu SMP Sat Oct 16 19:52:42 UTC 2010 x86_64 GNU/Linux
GNU bash, version 4.1.5(1)-release (x86_64-pc-linux-gnu).
My bash function is:
function jqs
{
pipe=/tmp/__job_control_manager__
trap "rm -f $pipe; exit" EXIT SIGKILL
if [[ ! -p "$pipe" ]]; then
mkfifo "$pipe"
fi
while true
do
if read txt <"$pipe"
then
echo "$(date +'%Y'): new text is [[$txt]]"
if [[ "$txt" == 'quit' ]]
then
break
fi
fi
done
}
I run this in the background:
> jqs&
[1] 5336
And now I feed it:
for i in 1 2 3 4 5 6 7 8
do
(echo aaa$i > /tmp/__job_control_manager__ && echo success$i &)
done
The output is inconsistent.
I frequently don't get all success echoes.
I get at most as many new text echos as success echoes, sometimes less.
If I remove the '&' from the 'feed', it seems to work, but I am blocked until the output is read. Hence me wanting to let sub-processes get blocked, but not the main process.
The aim being to write a simple job control script so I can run say 10 jobs in parallel at most and queue the rest for later processing, but reliably know that they do run.
Full job manager below:
function jq_manage
{
export __gn__="$1"
pipe=/tmp/__job_control_manager_"$__gn__"__
trap "rm -f $pipe" EXIT
trap "break" SIGKILL
if [[ ! -p "$pipe" ]]; then
mkfifo "$pipe"
fi
while true
do
date
jobs
if (($(jobs | egrep "Running.*echo '%#_Group_#%_$__gn__'" | wc -l) < $__jN__))
then
echo "Waiting for new job"
if read new_job <"$pipe"
then
echo "new job is [[$new_job]]"
if [[ "$new_job" == 'quit' ]]
then
break
fi
echo "In group $__gn__, starting job $new_job"
eval "(echo '%#_Group_#%_$__gn__' > /dev/null; $new_job) &"
fi
else
sleep 3
fi
done
}
function jq
{
# __gn__ = first parameter to this function, the job group name (the pool within which to allocate __jN__ jobs)
# __jN__ = second parameter to this function, the maximum of job numbers to run concurrently
export __gn__="$1"
shift
export __jN__="$1"
shift
export __jq__=$(jobs | egrep "Running.*echo '%#_GroupQueue_#%_$__gn__'" | wc -l)
if (($__jq__ '<' 1))
then
eval "(echo '%#_GroupQueue_#%_$__gn__' > /dev/null; jq_manage $__gn__) &"
fi
pipe=/tmp/__job_control_manager_"$__gn__"__
echo $@ >$pipe
}
Calling
jq <name> <max processes> <command>
jq abc 2 sleep 20
will start one process.
That part works fine. Start a second one, fine.
One by one by hand seem to work fine.
But starting 10 in a loop seems to lose the system, as in the simpler example above.
Any hints as to what I can do to solve this apparent loss of IPC data would be greatly appreciated.
Regards,
Alain.
Your problem is the if statement below:
while true
do
if read txt <"$pipe"
....
done
What is happening is that your job queue server is opening and closing the pipe each time around the loop. This means that some of the clients are getting a "broken pipe" error when they try to write to the pipe - that is, the reader of the pipe goes away after the writer opens it.
To fix this, change the loop in the server so that it opens the pipe once for the entire loop:
while true
do
if read txt
....
done < "$pipe"
Done this way, the pipe is opened once and kept open.
You will need to be careful of what you run inside the loop, as all processing inside the loop will have stdin attached to the named pipe. You will want to make sure you redirect stdin of all your processes inside the loop from somewhere else, otherwise they may consume the data from the pipe.
Edit: With the problem now being that you are getting EOF on your reads when the last client closes the pipe, you can use jilles' method of duplicating the file descriptors, or you can just make sure you are a client too and keep the write side of the pipe open:
while true
do
if read txt
....
done < "$pipe" 3> "$pipe"
This will hold the write side of the pipe open on fd 3. The same caveat applies to this file descriptor as to stdin: you will need to close it so that child processes don't inherit it. It probably matters less than with stdin, but it would be cleaner.
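The effect of opening the pipe once and holding a writer open can be checked with a small self-contained test. This is a sketch assuming bash; the temporary path, the message count of eight, and the counter variable are illustrative, and the loop stops after the eighth message rather than on a quit command.

```shell
#!/bin/bash
# Sketch only: path and message count are illustrative.
dir=$(mktemp -d)
pipe="$dir/jobs.fifo"
mkfifo "$pipe"

# Eight background clients, each opening the FIFO, writing one line, closing it.
for i in 1 2 3 4 5 6 7 8; do
    echo "aaa$i" > "$pipe" &
done

received=0
while read -r txt; do
    received=$((received + 1))
    echo "new text is [[$txt]]"
    (( received == 8 )) && break
done < "$pipe" 3> "$pipe"   # fd 3 keeps a writer open, so read never sees EOF

wait            # reap the client processes
rm -rf "$dir"
echo "received $received messages"
```

Since the reader stays open for the whole loop, no client ever writes to a FIFO without a reader, which is the situation that lost data in the original function.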
As said in other answers you need to keep the fifo open at all times to avoid losing data.
However, once all writers have left after the fifo has been open (so there was a writer), reads return immediately (and poll() returns POLLHUP). The only way to clear this state is to reopen the fifo.
POSIX does not provide a solution to this but at least Linux and FreeBSD do: if reads start failing, open the fifo again while keeping the original descriptor open. This works because in Linux and FreeBSD the "hangup" state is local to a particular open file description, while in POSIX it is global to the fifo.
This can be done in a shell script like this:
while :; do
exec 3<tmp/testfifo
exec 4<&-
while read x; do
echo "input: $x"
done <&3
exec 4<&3
exec 3<&-
done
Just for those that might be interested, [[re-edited]] following comments by camh and jilles, here are two new versions of the test server script.
Both versions now work exactly as hoped.
camh's version for pipe management:
function jqs # Job queue manager
{
pipe=/tmp/__job_control_manager__
trap "rm -f $pipe; exit" EXIT TERM
if [[ ! -p "$pipe" ]]; then
mkfifo "$pipe"
fi
while true
do
if read -u 3 txt
then
echo "$(date +'%Y'): new text is [[$txt]]"
if [[ "$txt" == 'quit' ]]
then
break
else
sleep 1
# process $txt - remember that if this is to be a spawned job, we should close fd 3 and 4 beforehand
fi
fi
done 3< "$pipe" 4> "$pipe" # 4 is just to keep the pipe opened so any real client does not end up causing read to return EOF
}
jilles' version for pipe management:
function jqs # Job queue manager
{
pipe=/tmp/__job_control_manager__
trap "rm -f $pipe; exit" EXIT TERM
if [[ ! -p "$pipe" ]]; then
mkfifo "$pipe"
fi
exec 3< "$pipe"
exec 4<&-
while true
do
if read -u 3 txt
then
echo "$(date +'%Y'): new text is [[$txt]]"
if [[ "$txt" == 'quit' ]]
then
break
else
sleep 1
# process $txt - remember that if this is to be a spawned job, we should close fd 3 and 4 beforehand
fi
else
# Close the pipe and reconnect it so that the next read does not end up returning EOF
exec 4<&3
exec 3<&-
exec 3< "$pipe"
exec 4<&-
fi
done
}
Thanks to all for your help.
As camh & Dennis Williamson say, don't break the pipe.
Now I have smaller examples, direct on the command line:
Server:
(
for i in {0,1,2,3,4}{0,1,2,3,4,5,6,7,8,9};
do
if read s;
then echo ">>$i--$s//";
else
echo "<<$i";
fi;
done < tst-fifo
)&
Client:
(
for i in {%a,#b}{1,2}{0,1};
do
echo "Test-$i" > tst-fifo;
done
)&
Can replace the key line with:
(echo "Test-$i" > tst-fifo&);
All client data sent to the pipe gets read, though with option two of the client one may need to start the server a couple of times before all data is read.
But although the read waits for data in the pipe to start with, once data has been pushed, it reads the empty string forever.
Any way to stop this?
Thanks for any insights again.
On the one hand the problem is worse than I thought:
Now there seems to be a case in my more complex example (jq_manage) where the same data is being read over and over again from the pipe (even though no new data is being written to it).
On the other hand, I found a simple solution (edited following Dennis' comment):
function jqn # compute the number of jobs running in that group
{
__jqty__=$(jobs | egrep "Running.*echo '%#_Group_#%_$__groupn__'" | wc -l)
}
function jq
{
__groupn__="$1"; shift # job group name (the pool within which to allocate $__jmax__ jobs)
__jmax__="$1"; shift # maximum of job numbers to run concurrently
jqn
while (($__jqty__ '>=' $__jmax__))
do
sleep 1
jqn
done
eval "(echo '%#_Group_#%_$__groupn__' > /dev/null; $@) &"
}
Works like a charm.
No socket or pipe involved.
Simple.
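The same idea of capping the number of running jobs can also be written without the one-second polling loop. The sketch below swaps in a different technique, bash's wait -n, and therefore assumes bash 4.3 or later; the job bodies, the limit of 2, and the variable names are illustrative.

```shell
#!/bin/bash
# Sketch only: uses wait -n (bash 4.3+) instead of polling with sleep.
max_jobs=2
results=$(mktemp)
peak=0

for i in 1 2 3 4 5 6; do
    # Block until fewer than max_jobs background jobs are running.
    while (( $(jobs -rp | wc -l) >= max_jobs )); do
        wait -n
    done
    { sleep 0.2; echo "job $i" >> "$results"; } &
    # Track the highest number of jobs seen running at once.
    running=$(jobs -rp | wc -l)
    (( running > peak )) && peak=$running
done

wait    # let the remaining jobs finish
completed=$(wc -l < "$results")
rm -f "$results"
echo "completed: $completed, peak concurrency: $peak"
```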
run say 10 jobs in parallel at most and queue the rest for later processing, but reliably know that they do run
You can do this with GNU Parallel. You will not need any of this scripting.
http://www.gnu.org/software/parallel/man.html#options
You can set max-procs "Number of jobslots. Run up to N jobs in parallel." There is an option to set the number of CPU cores you want to use. You can save the list of executed jobs to a log file, but that is a beta feature.

Resources