Capturing output from a background subshell in bash? - linux

I'm trying to run multiple subshells in a bash script and capture each one's stdout in a variable. When I run a subshell in the background, I would expect to be able to use wait to let it complete and then use the assigned variable later in the program, but it doesn't seem to work.
Simple example script:
l=$(ls) &
wait $!
echo "L=$l"
Then when I run it:
$ bash -x test2.sh
+ wait 16821
++ ls
+ l='test1.sh test2.sh'
+ echo L=
L=
The output from my test program would suggest the variable l should be assigned the result of the subshell, but when I use echo it is empty...
If I don't background the subshell (or use wait) then it works as expected...
l=$(ls)
echo "L=$l"
Results in:
$ bash -x test1.sh
++ ls
+ l='test1.sh test2.sh'
+ echo 'L=test1.sh test2.sh'
L=test1.sh test2.sh
Am I missing something obvious or ... ?

From bash manpage (emphasis mine):
Command substitution, commands grouped with parentheses, and
asynchronous commands are invoked in a subshell environment that is
a duplicate of the shell environment, except that traps caught by the
shell are re‐set to the values that the shell inherited from its
parent at invocation. Builtin commands that are invoked as part of a
pipeline are also executed in a subshell environment. Changes made to the
subshell environment cannot affect the shell's execution environment.
So l=$(ls) & behaves the way (l=$(ls)) would if it were not backgrounded: the assignment happens inside a subshell, and the parent shell's l is never set.
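One possible workaround, as a minimal sketch rather than the only way to do it: let the background job write to a file, then read that file in the parent shell after wait. The printf is a stand-in for a slow command such as ls, and tmpfile is a name made up for this example.

```bash
#!/bin/bash
# Sketch: a background job cannot set the parent's variables,
# so pass its output back through a temporary file instead.

tmpfile=$(mktemp)                      # hypothetical scratch file

printf '%s\n' one two >"$tmpfile" &    # stand-in for a slow command
wait "$!"                              # block until that job has exited

l=$(cat "$tmpfile")                    # this assignment runs in the parent shell
rm -f -- "$tmpfile"

echo "L=$l"
```

With several background jobs, each one would need its own file, and a plain wait with no argument waits for all of them.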

Related

Calling `ksh` from `ksh` script stops execution

I am executing a ksh script from another ksh script. The called script ends by executing ksh which simply stops the caller script from continuing.
MRE
#!/bin/ksh
# Caller Script
. ~/called
# Does not continue to next echo.
echo "DONE!"
#!/bin/ksh
#Called script
# Some exports..
ENV=calledcalled ksh
Output with set -x
++ ksh
++ ENV=calledcalled
.kshrc executed
If I run calledcalled directly in my caller it works fine (i.e. it continues with the next commands). Why does this happen? I checked $? and it is 0. I also tried ./called || true. Please let me know if more information is needed.
Note: Called script is outside my control.
This is completely normal and expected. Remember, when you run cmd1; cmd2, cmd2 doesn't run until cmd1 exits.
When your script runs ksh (and is invoked from a terminal or other context where reading from stdin doesn't cause an immediate EOF), nothing is making that new copy of ksh exit -- it waits for code to run to be given to it on stdin as normal -- so that script is just sitting around waiting for the copy of ksh to exit before it does anything else.
There are plenty of ways you can work around this. A few easy ones:
Ensure that stdin is empty so the child interpreter can't wait for input
. ~/called </dev/null
Define a function named ksh that doesn't do anything at all.
ksh() { echo "Not actually running ksh" >&2; }
. ~/called
Set ENV (a variable which, when defined, tells any shell to run the code in that file before doing anything else) to the filename of a script that, when run, causes any interactive shell to exit immediately.
exit_script=$(mktemp -t exit_script.XXXXXX)
printf '%s\n' 'case $- in *i*) exit 0;; esac' >"$exit_script"
ENV=$exit_script . ~/called
rm -f -- "$exit_script"
The above are just a few approaches; you can surely imagine many more with just a little thought and experimentation.

If bash pipeline commands run in subshell, why echo command can access the non-exported variables?

Bash manual says "Each command in a pipeline is executed as a separate process (i.e., in a subshell)". I test two simple commands.
Scene 1
cd /home/work
str=hello
echo $str | tee a.log
It outputs:
hello
It seems that the echo command is not executed in a subshell, as it can access the non-exported variable $str.
Scene 2
cd /home/work
cd src | pwd
pwd
It outputs:
/home/work
It looks like the cd command is executed in a subshell, as it doesn't affect the working directory of the original shell.
Can anyone explain why the behaviors are not consistent?
Well, because this is how it was designed. A "subshell" inherits the whole context, not only exported variables.
Bash manual says "Each command in a pipeline is executed as a separate process (i.e., in a subshell)"
The Bash manual is available here. The sentence you are mentioning literally has a link that solves the mystery:
Each command in a pipeline is executed in its own subshell, which is a separate process (see Command Execution Environment).
Then you can check the "Command Execution Environment" section. From it (emphasis mine):
The shell has an execution environment, which consists of the following:
...
shell parameters that are set by variable assignment ...
...
...
Command substitution, commands grouped with parentheses, and asynchronous commands are invoked in a subshell environment that is a duplicate of the shell environment, ....
A subshell gets the whole environment (well, except traps). External commands, on the other hand:
When a simple command other than a builtin or shell function is to be executed, it is invoked in a separate execution environment that consists of the following. ....
...
shell variables and functions marked for export, ...
If bash pipeline commands run in subshell, why echo command can access the non-exported variables?
Because a subshell inherits the parent environment, including all the non-exported variables.
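The difference between the two kinds of environment can be checked with a short sketch. The variable names here are made up for illustration, and bash -c stands in for "a simple command other than a builtin or shell function".

```bash
#!/bin/bash
# Sketch: a subshell is a copy of the shell; an external command is not.

unset str                # drop any inherited export attribute
str=hello                # plain shell variable, not exported

# Pipeline element: runs in a subshell, so it still sees str.
from_pipeline=$(echo "$str" | cat)

# Separate program: only exported variables reach it, so str is unset there.
from_child=$(bash -c 'echo "${str-unset}"')

# cd inside a pipeline changes only the subshell's working directory.
before=$PWD
cd / | true
after=$PWD

echo "pipeline sees:  $from_pipeline"
echo "child sees:     $from_child"
echo "directory kept: $before -> $after"
```

So both scenes in the question are consistent: the subshell can read everything the parent had, but nothing it changes flows back.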

Difference between bash pid and $$

I'm a bash scripting beginner, and I have a "homework" to do. I figured most of the stuff out but there is a part which says that I have to echo the pid of the parent bash and the pid of the two subshells that I will be running. So I looked online and found this (The Linux documentation project):
#!/bin/bash4
echo "\$\$ outside of subshell = $$" # 9602
echo "\$BASH_SUBSHELL outside of subshell = $BASH_SUBSHELL" # 0
echo "\$BASHPID outside of subshell = $BASHPID" # 9602
echo
( echo "\$\$ inside of subshell = $$" # 9602
echo "\$BASH_SUBSHELL inside of subshell = $BASH_SUBSHELL" # 1
echo "\$BASHPID inside of subshell = $BASHPID" ) # 9603
# Note that $$ returns PID of parent process.
So here are my questions:
1) What does the first echo print? Is this the pid of the parent bash?
2) Why does the 2nd echo print out 0?
3) Is $BASH_SUBSHELL a command or a variable?
4) I'm doing everything on a Mac. I will try all of this on a Linux machine in a few days, but whenever I run this script $BASHPID doesn't return anything; I just get a new line. Is this because I'm running this on a Mac and $BASHPID doesn't work on a Mac?
Looking at documentation on this, it looks like:
$$ means the process ID that the script file is running under. For any given script, when it is run, it will have only one "main" process ID. Regardless of how many subshells you invoke, $$ will always return the first process ID associated with the script. BASHPID will show you the process ID of the current instance of bash, so in a subshell it will be different than the "top level" bash which may have invoked it.
BASH_SUBSHELL indicates the "subshell level" you're in. If you're not in any subshell level, your level is zero. If you start a subshell within your main program, that subshell level is 1. If you start a subshell within that subshell, the level would be 2, and so on.
BASH_SUBSHELL is a variable.
Maybe BASHPID isn't supported by the version of bash you have? I doubt it's a "Mac" problem.
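The subshell levels described above can be seen with a small sketch. It assumes bash, since BASH_SUBSHELL is bash-specific; the variable names are made up for the example, and command substitution is used only as a convenient way to capture what each subshell prints.

```bash
#!/bin/bash
# Sketch: BASH_SUBSHELL grows by one per nested subshell, while $$ stays fixed.

level0=$BASH_SUBSHELL                     # 0 at the top level of a script
level1=$( echo "$BASH_SUBSHELL" )         # command substitution is one subshell
level2=$( ( echo "$BASH_SUBSHELL" ) )     # parentheses inside it add another

pid_top=$$
pid_in_subshell=$( echo "$$" )            # still the top-level shell's PID

echo "levels: $level0 $level1 $level2"
echo "\$\$ at top level: $pid_top, inside a subshell: $pid_in_subshell"
```

Note the space in `$( (`: without it, bash would try to parse `$((` as arithmetic expansion.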
It'd be best to get well-acquainted with bash(1):
BASHPID
Expands to the process ID of the current bash process. This differs from $$ under certain circumstances, such as subshells that do not require bash to be re-initialized.
[...]
BASH_SUBSHELL
Incremented by one each time a subshell or subshell
environment is spawned. The initial value is 0.
$BASHPID was introduced with bash-4.0-alpha. If you run bash --version you can find out what version of bash(1) you're using.
If you're going to be doing much bash(1) work, you'll also need the following:
Greg's bash FAQ
TLDP bash reference card

question on linux command

What do the two ampersands in the following command do:
(make foo&)&
The ( and ) run the command in a subshell. This means that a separate shell is spawned off and the command is run. This is probably because they wanted to use shell specific operation (backgrounding - other examples are redirection etc.). The first & in the command backgrounds the command run in the subshell (ie. make foo). The second ampersand backgrounds the subshell itself so that you get back your command prompt immediately.
You can see the effects here
Foreground on the current shell
(bb-python2.6)noufal#NibrahimT61% ls # shell waits for process to complete
a b c d e
Background on the current shell
(bb-python2.6)noufal#NibrahimT61% ls& #Shell returns immediately.
[1] 3801
a b c d e
[1] + done /bin/ls -h -p --color=auto -X
Using a subshell (Foreground)
(bb-python2.6)noufal#NibrahimT61% (ls&) # Current shell waits for subshell to finish.
a b c d e
In this case, the current shell waits for the subshell to finish even though the job in the subshell itself is backgrounded.
Using a subshell (Background)
(bb-python2.6)-130- noufal#NibrahimT61% (ls &)&
[1] 3829
a b c d e
[1] + exit 130 (; /bin/ls -h -p --color=auto -X &; )
The foreground shell returns immediately (it doesn't wait for the subshell, which itself doesn't wait for the ls to finish). Observe the difference in the command executed.
A side note on the need to run some commands in a subshell: suppose you wanted to run a "shell command" (i.e. one that uses shell-specific features like redirects, job IDs, etc.). You'd have to run that command either in a subshell (using parentheses) or via the -c option of a shell like bash. You can't just exec such things directly, because the shell is needed to process the job ID or whatever. Ampersanding that will have the subshell return immediately. The second ampersand in your code looks redundant (as the other answer suggests): a case of "make sure that it's backgrounded".
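As a rough sketch of the inner ampersand's effect in a script (assuming bash; sleep 3 is only a stand-in for make foo, and the variable names are made up): the subshell exits as soon as it has launched the job, so the current shell gets control back at once and never sees the job in its own job table.

```bash
#!/bin/bash
# Sketch: (cmd &) starts cmd in the background of a throwaway subshell,
# so the current shell neither waits for it nor lists it as a job.

start=$(date +%s)
( sleep 3 >/dev/null 2>&1 & )        # sleep 3 stands in for "make foo"
elapsed=$(( $(date +%s) - start ))   # well under 3 seconds

job_pids=$(jobs -p)                  # empty: the sleep belongs to the subshell

echo "returned after about ${elapsed}s; jobs in this shell: '${job_pids}'"
```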
It's difficult to say without context, but & in shell commands runs the command in the background and immediately continues, without waiting for the command to finish. Maybe the Makefile author wanted to run several commands in parallel. The second ampersand would be redundant though, as are the parentheses.
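Note that the ampersand is not a line continuation character in makefiles (that is the backslash); inside a recipe it is passed to the shell, where it backgrounds the command.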
Hard to say for sure since there isn't enough context in your question.

What is '$$' in the bash shell?

I'm beginner at bash shell programming. Can you tell me about '$$' symbols in the bash shell?
If I try the following
#> echo $$
it prints
#>18756
Can you tell me what this symbol is used for and when?
It's the process id of the bash process itself.
You might use it to track your process over its life - use ps -p to see if it's still running, send it a signal using kill (to pause the process for example), change its priority with renice, and so on.
Process ids are often written to log files, especially when multiple instances of a script run at once, to help track performance or diagnose problems.
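For instance, a minimal sketch of tagging log lines with the PID. The log function is a hypothetical helper written for this example, not something bash provides.

```bash
#!/bin/bash
# Sketch: tag log lines with $$ so concurrent runs can be told apart.

log() {
    # hypothetical helper; $$ is the PID of the shell running this script
    printf '[%s] %s\n' "$$" "$*"
}

first=$(log "starting up")    # $$ is unchanged inside the command substitution
echo "$first"

# kill -0 sends no signal; it only checks that the process exists.
if kill -0 "$$" 2>/dev/null; then
    log "process $$ is still running"
fi
```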
Here's the bash documentation outlining special parameters.
BASHPID, mentioned by ghostdog74, was added at version 4.0. Here's an example from Mendel Cooper's Advanced Bash-Scripting Guide that shows the difference between $$ and $BASHPID:
#!/bin/bash4
echo "\$\$ outside of subshell = $$" # 9602
echo "\$BASH_SUBSHELL outside of subshell = $BASH_SUBSHELL" # 0
echo "\$BASHPID outside of subshell = $BASHPID" # 9602
echo
( echo "\$\$ inside of subshell = $$" # 9602
echo "\$BASH_SUBSHELL inside of subshell = $BASH_SUBSHELL" # 1
echo "\$BASHPID inside of subshell = $BASHPID" ) # 9603
# Note that $$ returns PID of parent process.
If you have bash, a relatively close equivalent is the BASHPID variable. See man bash:
BASHPID
Expands to the process id of the current bash process. This differs from $$ under certain circumstances, such as subshells
that do not require bash to be re-initialized.
