Getting BASH command PID - linux

I have this piece of code:
#!/bin/bash
streamURL=http://devimages.apple.com/iphone/samples/bipbop/gear4/prog_index.m3u8
(
    echo "Debugging for stream: $streamURL";
    echo "Starting debugging...";
    vlc -vvv --color $streamURL --file-caching=10000 2>&1 | sed "s/^/ `date`/";
) | tee debug.txt &
PROCESS_PID=$!
ps -e | grep $PROCESS_PID
echo " killing process pid: "
echo $PROCESS_PID;
kill -9 $PROCESS_PID
ps -e | grep vlc #still there
My problem is I can't manage to save the "vlc ..." command's PID into a variable in order to kill it later. If I move "PROCESS_PID=$!" right after it, the variable is empty. And I still need the pipe after it for sed. Any suggestions?

You can get the pid by twiddling file descriptors, but it's painful. For example:
{ PID=$({ ( echo foo;
            echo bar;
            sh -c 'echo $$ >&5; exec echo baz' ) |
          tr a o; } 5>&1 1>&3 ); } 3>&1
will assign the pid of 'echo baz' to PID. Replace that echo with your vlc and replace the tr with your sed and you should have a solution.
To try and provide a somewhat simplified explanation of what's going on here, first notice that we are using command substitution to make the assignment to PID. The $() syntax simply takes the command inside the parentheses and assigns its output to the variable. It is important to remember that "output" here means "whatever is printed to file descriptor 1".

Inside the sh command, we print a pid to file descriptor 5 and then exec echo. By using exec, that echo runs under the same pid that the previous echo just wrote to fd 5. Now echo foo, echo bar and echo baz are all writing into the pipe that goes to tr. The output of tr is redirected to fd 3 (before the edit, this was fd 2; which file descriptor to use is mostly arbitrary, but modifying 2 is a bad idea in case any errors are generated), and file descriptor 5 is redirected to fd 1, so that it becomes the "output" of the command substitution that is assigned to PID. Then, outside the command substitution, we redirect fd 3 back to fd 1, so the output goes where it was originally desired. Hopefully, this paragraph is more explanatory than obfuscating: if confused, look at the code for clarification!
Unfortunately, it gets uglier if you want to run in the background:
{ PID=$({ ( echo foo;
            echo bar;
            sh -c 'echo $$ >&5; exec 5>&-; exec echo baz' >&3 & ) |
          tr a o; } 5>&1 1>&3 ); } 3>&1
Here, you need to close file descriptor 5 to ensure that the process substitution completes.

You can't assign a variable in a subshell and get it back outside it.
In this case, if you kill $! you'll kill tee, and the subshell will then (AFAIK) receive SIGPIPE the next time it writes to the pipe, terminating the whole thing. So there's generally no need for the PID inside the subshell.
I'm not sure, but the problem might be that you're nuking the process from orbit with SIGKILL rather than killing it softly with a plain kill $PID. It might be that the SIGPIPE never materializes in that case, because tee doesn't get to clean up after itself.
In other words, just use kill $process_id. Be aware that killing a process is not synchronous - you're just sending it a signal and carrying on. See Kill bash processes “nicely” for details.
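As a sketch, here is the question's script rewritten along those lines (the sleep stands in for whatever the script does before the kill; this assumes, per the above, that vlc dies from SIGPIPE once tee is gone):

#!/bin/bash
streamURL=http://devimages.apple.com/iphone/samples/bipbop/gear4/prog_index.m3u8
(
    echo "Debugging for stream: $streamURL";
    vlc -vvv --color "$streamURL" --file-caching=10000 2>&1 | sed "s/^/ `date`/";
) | tee debug.txt &
PROCESS_PID=$!
sleep 10                        # stand-in for the real work
kill "$PROCESS_PID"             # plain SIGTERM, no -9
wait "$PROCESS_PID" 2>/dev/null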

Related

Bash command with pipe ('|') always returns exit code of 0, even in error case [duplicate]

I want to execute a long-running command in Bash, and both capture its exit status and tee its output.
So I do this:
command | tee out.txt
ST=$?
The problem is that the variable ST captures the exit status of tee and not of command. How can I solve this?
Note that command is long running and redirecting the output to a file to view it later is not a good solution for me.
There is an internal Bash variable called $PIPESTATUS; it’s an array that holds the exit status of each command in your last foreground pipeline of commands.
<command> | tee out.txt ; test ${PIPESTATUS[0]} -eq 0
Or another alternative which also works with other shells (like zsh) would be to enable pipefail:
set -o pipefail
...
The first option does not work with zsh due to slightly different syntax.
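For illustration, a minimal bash demonstration of the pipefail alternative:

set -o pipefail
false | tee out.txt
echo $?            # prints 1, the status of false, not tee's 0
set +o pipefail    # restore the default if the rest of the script relies on it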
Dumb solution: Connecting them through a named pipe (mkfifo). Then the command can be run second.
mkfifo pipe
tee out.txt < pipe &
command > pipe
echo $?
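A slightly fuller sketch of the same idea, with cleanup (the temporary directory is purely illustrative):

dir=$(mktemp -d)
mkfifo "$dir/pipe"
tee out.txt < "$dir/pipe" &
command > "$dir/pipe"
status=$?
wait               # let tee drain the fifo before removing it
rm -r "$dir"
echo "$status"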
using bash's set -o pipefail is helpful
pipefail: the return value of a pipeline is the status of
the last command to exit with a non-zero status,
or zero if no command exited with a non-zero status
There's an array that gives you the exit status of each command in a pipe.
$ cat x| sed 's///'
cat: x: No such file or directory
$ echo $?
0
$ cat x| sed 's///'
cat: x: No such file or directory
$ echo ${PIPESTATUS[*]}
1 0
$ touch x
$ cat x| sed 's'
sed: 1: "s": substitute pattern can not be delimited by newline or backslash
$ echo ${PIPESTATUS[*]}
0 1
This solution works without using bash-specific features or temporary files. Bonus: in the end the exit status is actually an exit status and not some string in a file.
Situation:
someprog | filter
you want the exit status from someprog and the output from filter.
Here is my solution:
((((someprog; echo $? >&3) | filter >&4) 3>&1) | (read xs; exit $xs)) 4>&1
echo $?
See my answer for the same question on unix.stackexchange.com for a detailed explanation and an alternative without subshells and some caveats.
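Substituting the question's names into that pattern (command and tee out.txt in place of someprog and filter):

((((command; echo $? >&3) | tee out.txt >&4) 3>&1) | (read xs; exit $xs)) 4>&1
echo "command exited with: $?"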
By combining PIPESTATUS[0] and the result of executing the exit command in a subshell, you can directly access the return value of your initial command:
command | tee ; ( exit ${PIPESTATUS[0]} )
Here's an example:
# the "false" shell built-in command returns 1
false | tee ; ( exit ${PIPESTATUS[0]} )
echo "return value: $?"
will give you:
return value: 1
So I wanted to contribute an answer like lesmana's, but I think mine is perhaps a little simpler, and a slightly more advantageous pure-Bourne-shell solution:
# You want to pipe command1 through command2:
exec 4>&1
exitstatus=`{ { command1; printf $? 1>&3; } | command2 1>&4; } 3>&1`
# $exitstatus now has command1's exit status.
I think this is best explained from the inside out: command1 will execute and print its regular output on stdout (file descriptor 1), then once it's done, printf will execute and print command1's exit code on its stdout, but that stdout is redirected to file descriptor 3.
While command1 is running, its stdout is being piped to command2 (printf's output never makes it to command2 because we send it to file descriptor 3 instead of 1, which is what the pipe reads). Then we redirect command2's output to file descriptor 4, so that it also stays out of file descriptor 1, because we want file descriptor 1 free for a little bit later: we will bring the printf output on file descriptor 3 back down into file descriptor 1, because that is what the command substitution (the backticks) will capture, and that is what will get placed into the variable.
The final bit of magic is that first exec 4>&1 we did as a separate command - it opens file descriptor 4 as a copy of the external shell's stdout. Command substitution will capture whatever is written on standard out from the perspective of the commands inside it - but since command2's output is going to file descriptor 4 as far as the command substitution is concerned, the command substitution doesn't capture it - however once it gets "out" of the command substitution it is effectively still going to the script's overall file descriptor 1.
(The exec 4>&1 has to be a separate command because many common shells don't like it when you try to write, inside a command substitution, to a file descriptor that is opened in the "external" command that is using the substitution. So this is the simplest portable way to do it.)
You can look at it in a less technical and more playful way, as if the outputs of the commands are leapfrogging each other: command1 pipes to command2, then the printf's output jumps over command 2 so that command2 doesn't catch it, and then command 2's output jumps over and out of the command substitution just as printf lands just in time to get captured by the substitution so that it ends up in the variable, and command2's output goes on its merry way being written to the standard output, just as in a normal pipe.
Also, as I understand it, $? will still contain the return code of the second command in the pipe, because variable assignments, command substitutions, and compound commands are all effectively transparent to the return code of the command inside them, so the return status of command2 should get propagated out - this, and not having to define an additional function, is why I think this might be a somewhat better solution than the one proposed by lesmana.
Per the caveats lesmana mentions, it's possible that command1 will at some point end up using file descriptors 3 or 4, so to be more robust, you would do:
exec 4>&1
exitstatus=`{ { command1 3>&-; printf $? 1>&3; } 4>&- | command2 1>&4; } 3>&1`
exec 4>&-
Note that I use compound commands in my example, but subshells (using ( ) instead of { }) will also work, though they may be less efficient.
Commands inherit file descriptors from the process that launches them, so the entire second line will inherit file descriptor four, and the compound command followed by 3>&1 will inherit the file descriptor three. So the 4>&- makes sure that the inner compound command will not inherit file descriptor four, and the 3>&- will not inherit file descriptor three, so command1 gets a 'cleaner', more standard environment. You could also move the inner 4>&- next to the 3>&-, but I figure why not just limit its scope as much as possible.
I'm not sure how often things use file descriptor three and four directly - I think most of the time programs use syscalls that return not-used-at-the-moment file descriptors, but sometimes code writes to file descriptor 3 directly, I guess (I could imagine a program checking a file descriptor to see if it's open, and using it if it is, or behaving differently accordingly if it's not). So the latter is probably best to keep in mind and use for general-purpose cases.
(command | tee out.txt; exit ${PIPESTATUS[0]})
Unlike @cODAR's answer, this returns the original exit code of the first command, not just 0 for success and 127 for failure. But as @Chaoran pointed out, you can just call ${PIPESTATUS[0]}. It is important, however, that everything is put into parentheses.
In Ubuntu and Debian, you can apt-get install moreutils. This contains a utility called mispipe that returns the exit status of the first command in the pipe.
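mispipe takes both commands as strings; for example:

mispipe "command1" "tee out.txt"; echo $?   # exit status of command1, not tee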
Outside of bash, you can do:
bash -o pipefail -c "command1 | tee output"
This is useful for example in ninja scripts where the shell is expected to be /bin/sh.
The simplest way to do this in plain bash is to use process substitution instead of a pipeline. There are several differences, but they probably don't matter very much for your use case:
When running a pipeline, bash waits until all processes complete.
Sending Ctrl-C to bash makes it kill all the processes of a pipeline, not just the main one.
The pipefail option and the PIPESTATUS variable are irrelevant to process substitution.
Possibly more
With process substitution, bash just starts the process and forgets about it; it's not even visible in jobs.
Mentioned differences aside, consumer < <(producer) and producer | consumer are essentially equivalent.
If you want to flip which one is the "main" process, you just flip the commands and the direction of the substitution to producer > >(consumer). In your case:
command > >(tee out.txt)
Example:
$ { echo "hello world"; false; } > >(tee out.txt)
hello world
$ echo $?
1
$ cat out.txt
hello world
$ echo "hello world" > >(tee out.txt)
hello world
$ echo $?
0
$ cat out.txt
hello world
As I said, there are differences from the pipe expression. The process may never stop running, unless it is sensitive to the pipe closing. In particular, it may keep writing things to your stdout, which may be confusing.
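One practical wrinkle, as a sketch: if the script must not exit before tee has flushed out.txt, recent bash (4.4 and later, as I understand it) sets $! to the PID of the last process substitution, so you can wait for it:

command > >(tee out.txt)
status=$?
wait $!            # bash >= 4.4; older versions cannot wait on a process substitution
echo "$status"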
PIPESTATUS[@] must be copied to an array immediately after the pipe command returns.
Any reads of PIPESTATUS[@] will erase the contents.
Copy it to another array if you plan on checking the status of all pipe commands.
"$?" is the same value as the last element of "${PIPESTATUS[@]}",
and reading it seems to destroy "${PIPESTATUS[@]}", but I haven't absolutely verified this.
declare -a PSA
cmd1 | cmd2 | cmd3
PSA=( "${PIPESTATUS[@]}" )
This will not work if the pipe is in a sub-shell. For a solution to that problem,
see bash pipestatus in backticked command?
Based on @brian-s-wilson's answer, this bash helper function:
pipestatus() {
    local S=("${PIPESTATUS[@]}")
    if test -n "$*"
    then test "$*" = "${S[*]}"
    else ! [[ "${S[*]}" =~ [^0\ ] ]]
    fi
}
used thus:
1: get_bad_things must succeed, but it should produce no output; we do, however, want to see any output that it does produce
get_bad_things | grep '^'
pipestatus 0 1 || return
2: the whole pipeline must succeed
thing | something -q | thingy
pipestatus || return
Pure shell solution:
% rm -f error.flag; echo hello world \
| (cat || echo "First command failed: $?" >> error.flag) \
| (cat || echo "Second command failed: $?" >> error.flag) \
| (cat || echo "Third command failed: $?" >> error.flag) \
; test -s error.flag && (echo Some command failed: ; cat error.flag)
hello world
And now with the second cat replaced by false:
% rm -f error.flag; echo hello world \
| (cat || echo "First command failed: $?" >> error.flag) \
| (false || echo "Second command failed: $?" >> error.flag) \
| (cat || echo "Third command failed: $?" >> error.flag) \
; test -s error.flag && (echo Some command failed: ; cat error.flag)
Some command failed:
Second command failed: 1
First command failed: 141
Please note the first cat fails as well, because its stdout gets closed on it. The order of the failed commands in the log is correct in this example, but don't rely on it.
This method allows for capturing stdout and stderr for the individual commands so you can then dump that as well into a log file if an error occurs, or just delete it if no error (like the output of dd).
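A hedged sketch of that extension (cmd1, cmd2 and the log file name are placeholders):

rm -f error.flag cmd1.err cmd2.err
(cmd1 2>cmd1.err || echo "First command failed: $?" >> error.flag) \
| (cmd2 2>cmd2.err || echo "Second command failed: $?" >> error.flag)
if test -s error.flag
then cat error.flag cmd1.err cmd2.err >> error.log
else rm -f cmd1.err cmd2.err
fi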
It may sometimes be simpler and clearer to use an external command, rather than digging into the details of bash. pipeline, from the minimal process scripting language execline, exits with the return code of the second command*, just like a sh pipeline does, but unlike sh, it allows reversing the direction of the pipe, so that we can capture the return code of the producer process (the below is all on the sh command line, but with execline installed):
$ # using the full execline grammar with the execlineb parser:
$ execlineb -c 'pipeline { echo "hello world" } tee out.txt'
hello world
$ cat out.txt
hello world
$ # for these simple examples, one can forego the parser and just use "" as a separator
$ # traditional order
$ pipeline echo "hello world" "" tee out.txt
hello world
$ # "write" order (second command writes rather than reads)
$ pipeline -w tee out.txt "" echo "hello world"
hello world
$ # pipeline execs into the second command, so that's the RC we get
$ pipeline -w tee out.txt "" false; echo $?
1
$ pipeline -w tee out.txt "" true; echo $?
0
$ # output and exit status
$ pipeline -w tee out.txt "" sh -c "echo 'hello world'; exit 42"; echo "RC: $?"
hello world
RC: 42
$ cat out.txt
hello world
Using pipeline has the same differences to native bash pipelines as the bash process substitution used in answer #43972501.
* Actually pipeline doesn't exit at all unless there is an error. It executes into the second command, so it's the second command that does the returning.
Why not use stderr? Like so:
(
    # Our long-running process that exits abnormally
    ( for i in {1..100} ; do echo ploop ; sleep 0.5 ; done ; exit 5 )
    echo $? 1>&2 # We pass the exit status of our long-running process to stderr (fd 2).
) | tee ploop.out
So ploop.out receives the stdout. stderr receives the exit status of the long running process. This has the benefit of being completely POSIX-compatible.
(Well, with the exception of the range expression in the example long-running process, but that's not really relevant.)
Here's what this looks like:
...
ploop
ploop
ploop
ploop
ploop
ploop
ploop
ploop
ploop
ploop
5
Note that the return code 5 does not get output to the file ploop.out.
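If you then want that status in a variable, one hedged way (assuming the long-running process itself writes nothing to stderr) is to point the subshell's stderr at a file:

( ( for i in 1 2 3; do echo ploop; done; exit 5 ); echo $? 1>&2 ) 2>status.txt | tee ploop.out
read status < status.txt
echo "captured: $status"   # captured: 5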

Finding the process group id created through setsid

In a shell script, I see that using setsid, we can create a new process group, but I am not able to find a reliable way to get the group id after the creation. My requirement is simple: launch a process, and after it is done, clean up any descendants (if any). I don't want to kill the main process, hence I have to wait for it to end. After that, I can kill the leftover child processes with kill -- -pgid, if I have somehow got the group id. The missing piece is: how do I get the group id?
This script is what I finally came up with. Hope this helps someone.
$! gives the pid, and ps has to be used to find its pgid.
ps prints the pgid with a leading space; the next line's variable expansion removes it.
Finally, after waiting for the main process, the script kills the group.
#!/bin/sh -x
setsid "$#" &
pid=$!
gidspace=$(ps -o pgid= $pid)
gid="${gidspace## }"
echo "gid $gid"
echo "waiting"
wait $pid
ps -s $gid -o pid,ppid,pgid,command
kill -- -$gid
I managed to do it with a coproc, and a sleep to ensure we have enough time to read back the pid. This is bash-specific of course, and the only way to avoid using a hackish sleep inside a coproc is to write to a temp file and wait for the command to terminate (no need for coproc then).
Using a coproc
Note that I open filehandle 3 to write the pgid back to the parent shell and close it before executing the command.
#!/bin/bash -x
coproc setsid bash -c 'ps -o pgid= $BASHPID >&3; exec 3>&-; exec "$@" & sleep 1' -- "$@" 3>&1
read -u ${COPROC[0]} gid
echo "gid $gid"
ps -s $gid -o pid,ppid,pgid,command
kill -- -$gid
Using a temp file
To avoid having to pass the temp file to the subshell (and the risk that the parent dies and removes it before the child writes to it), I again open fd 3 so the child can write its pgid to it.
#!/bin/bash -x
t=$(mktemp)
trap 'rm -f "$t"' EXIT
exec {fh}>"$t"
setsid bash -c 'ps -o pgid= $BASHPID >&3; exec 3>&-; exec "$@" &' -- "$@" 3>&${fh}
read gid <$t
echo "gid $gid"
ps -s $gid -o pid,ppid,pgid,command
kill -- -$gid

Bash: Using SSH to start a long-running remote command and collect its PID

When I do the following, I have to press Ctrl-C afterwards or the shell acts weird. Left/right arrow keys, for example, don't move the cursor correctly, and the text gets messed up.
# read -r pid < <(ssh 10.10.10.46 'sleep 50 & echo $!') ; echo $pid
2135
# Killed by signal 2.
^C
#
I need this for a script, so I'd like to know why Ctrl-C is needed and whether it is possible to work around it.
Update
It looks like it opens an extra Bash shell, and that is the one that needs to be exited.
The command I am actually interested in is:
read -r pid < <(ssh 10.10.10.46 "mbuffer -4 -v 0 -q -I 8023 > /tmp/mtest & echo $!"); echo $pid
Try this instead:
read -r pid \
< <(ssh 10.10.10.46 'nohup mbuffer >/tmp/mtest </dev/null 2>/tmp/mtest.err & echo $!')
Three important changes:
Use of nohup (you could also get a similar effect with the bash built-in disown)
Redirection of stdin, stdout and stderr (preventing them from holding handles that connect, eventually, to your terminal).
Use of single quotes for the remote command (with double-quotes, expansions happen before ssh is started, so the $! you get is the PID of the most recently started local background process).
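The third point is easy to demonstrate (host as in the question):

read -r pid < <(ssh 10.10.10.46 'sleep 50 & echo $!'); echo $pid   # the remote sleep's PID
read -r pid < <(ssh 10.10.10.46 "sleep 50 & echo $!"); echo $pid   # a local $!, likely empty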

Bash: How do I make sub-processes of a script be terminated, when the script is terminated?

The question applies to a script such as the following:
Script
#!/bin/sh
SRC="/tmp/my-server-logs"
echo "STARTING GREP JOBS..."
for f in `find ${SRC} -name '*log*2011*' | sort --reverse`
do
    (
        OUT=`nice grep -ci -E "${1}" "${f}"`
        if [ "${OUT}" != "0" ]
        then
            printf '%7s : %s\n' "${OUT}" "${f}"
        else
            printf '%7s %s\n' "(none)" "${f}"
        fi
    ) &
done
echo "WAITING..."
wait
echo "FINISHED!"
Current behavior
Pressing Ctrl+C in console terminates the script but not the already running grep processes.
Write a trap for Ctrl+c and in the trap kill all of the subprocesses. Put this before your wait command.
function handle_sigint()
{
    for proc in `jobs -p`
    do
        kill $proc
    done
}
trap handle_sigint SIGINT
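A compact variant of the same trap, as a sketch (put it before the wait, as above):

trap 'kill $(jobs -p) 2>/dev/null' SIGINT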
A simple alternative is using a cat pipe. The following worked for me:
echo "-" > test.text;
for x in 1 2 3; do
    ( sleep $x; echo $x | tee --append test.text; ) &
done | cat
This works even if I press Ctrl-C before the last number is printed to stdout. It also works if the text-generating command is something that takes a long time, such as "find /"; i.e. it is not only the connection to stdout through cat that is killed, but the child process itself.
For large scripts that make extensive use of subprocesses, the easiest way to ensure the intended Ctrl-C behaviour is to wrap the whole script in such a subshell, e.g.
#!/usr/bin/bash
(
...
) | cat
I am not sure though if this has the exactly same effect as Andrew's answer (i.e. I'm not sure what signal is sent to the subprocesses). Also I only tested this with cygwin, not with a native Linux shell.

Suppress Notice of Forked Command Being Killed

Let's suppose I have a bash script (foo.sh) that in a very simplified form, looks like the following:
echo "hello"
sleep 100 &
ps ax | grep sleep | grep -v grep | awk '{ print $1 } ' | xargs kill -9
echo "bye"
The third line imitates pkill, which I don't have by default on Mac OS X, but you can think of it as the same as pkill. However, when I run this script, I get the following output:
hello
foo: line 4: 54851 Killed sleep 100
bye
How do I suppress the line in the middle so that all I see is hello and bye?
While disown may have the side effect of silencing the message, this is how you start the process in a way that the message is truly silenced, without having to give up job control of the process.
{ command & } 2>/dev/null
If you still want the command's own stderr (just silencing the shell's message on stderr) you'll need to send the process' stderr to the real stderr:
{ command 2>&3 & } 3>&2 2>/dev/null
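Applied to the question's script, a minimal sketch (using kill $! as recommended below, instead of the ps pipeline):

echo "hello"
{ sleep 100 & } 2>/dev/null
{ kill $! && wait $!; } 2>/dev/null
echo "bye"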
To learn about how redirection works:
From the BashGuide: http://mywiki.wooledge.org/BashGuide/TheBasics/InputAndOutput#Redirection
An illustrated tutorial: http://bash-hackers.org/wiki/doku.php/howto/redirection_tutorial
And some more info: http://bash-hackers.org/wiki/doku.php/syntax/redirection
And by the way; don't use kill -9.
I also feel obligated to comment on your:
ps ax | grep sleep | grep -v grep | awk '{ print $1 } ' | xargs kill -9
This will scorch the eyes of any UNIX/Linux user with a clue. Moreover, every time you parse ps, a fairy dies. Do this instead:
kill $!
Even tools such as pgrep are essentially broken by design. While they do a better job of matching processes, the fundamental flaws are still there:
Race: By the time you get a PID output and parse it back in and use it for something else, the PID might already have disappeared or even replaced by a completely unrelated process.
Responsibility: In the UNIX process model, it is the responsibility of a parent to manage its child, nobody else should. A parent should keep its child's PID if it wants to be able to signal it and only the parent can reliably do so. UNIX kernels have been designed with the assumption that user programs will adhere to this pattern, not violate it.
How about disown? This mostly works for me on Bash on Linux.
echo "hello"
sleep 100 &
disown
ps ax | grep sleep | grep -v grep | awk '{ print $1 } ' | xargs kill -9
echo "bye"
Edit: Matched the poster's code better.
The message is real. The code killed the grep process as well.
Run ps ax | grep sleep and you should see your grep process on the list.
What I usually do in this case is ps ax | grep sleep | grep -v grep
EDIT: This is an answer to older form of question where author omitted the exclusion of grep for the kill sequence. I hope I still get some rep for answering the first half.
Yet another way to disable job termination messages is to put your command to be backgrounded in a sh -c 'cmd &' construct.
And as already pointed out, there is no need to imitate pkill; you may store the value of $! in another variable instead.
echo "hello"
sleep_pid=`sh -c 'sleep 30 & echo ${!}' | head -1`
#sleep_pid=`sh -c '(exec 1>&-; exec sleep 30) & echo ${!}'`
echo kill $sleep_pid
kill $sleep_pid
echo "bye"
Have you tried deactivating job control? It's a non-interactive shell, so I would guess it's off by default, but it does not hurt to try... It's regulated by the -m (monitor) shell option, toggled with set -m / set +m.
