Variable scope at the shell level - Linux

Recently I have been reading the Advanced Bash-Scripting Guide, and something about variable scope between parent and child shells puzzles me. Here it is:
Scene:
There are several ways to spawn a child shell:
first, (command-lists);
second, executing a non-built-in command or a script, and so on.
When we run a script from the parent shell, the child script cannot see the variables of the parent shell. Why, then, can the child shell in the (command-lists) construct see the parent shell's variables?
e.g.
(command-lists)
$ a=100
$ (echo $a)
100
$
run a script
$ cat b.sh
echo $a
$ a=100
$ ./b.sh
# empty
How?

In the case where you have a sub-shell run in the original script:
(command1; command2; ...)
the sub-shell is a direct copy of the original shell created by fork(), and therefore has its own copy of every variable that was available in the original shell.
Suppose the commands (command1, command2 etc) in the sub-shell are themselves shell scripts. Those commands are executed by the sub-shell calling fork() and then exec() to create a new shell, and the new shell does not inherit the non-exported variables from the original shell.
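The difference is visible in a couple of lines (a minimal sketch; a is an arbitrary variable name here):

```shell
#!/bin/bash
a=100                            # not exported
( echo "subshell: $a" )          # fork() only: the copy sees a=100
bash -c 'echo "new shell: $a"'   # fork()+exec(): a is not inherited, prints blank
```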
Addressing your examples directly:
$ a=100
$ (echo $a)
100
$
Here, the sub-shell has its own copy of all the variables (specifically, a) that the parent shell had access to. Any changes made in the sub-shell will not be reflected in the parent shell, of course, so:
$ a=100
$ (echo $a; a=200; echo $a)
100
200
$ echo $a
100
$
Now your second example:
$ cat b.sh
echo $a
$ a=100
$ ./b.sh
$ . ./b.sh
100
$ source ./b.sh
100
$ a=200 ./b.sh
200
$ echo $a
100
$ export a
$ ./b.sh
100
$
The variable a is not exported, so the first time b.sh is run, it has no value for $a and echoes an empty line. The next two examples are a 'cheat': the shell reads the script b.sh as if it were part of the current shell (no fork()), so the variables are still accessible to b.sh, and it echoes 100 each time. (Dot or . is the older mechanism for reading a script in the current shell; the Bourne shell in 7th Edition UNIX used it. The source command is borrowed from the C shell as an equivalent mechanism.)
The command a=200 ./b.sh exports a for the duration of the command, so b.sh sees and echoes the modified value 200 but the main shell has a unchanged. Then when a is exported, it is available to b.sh automatically, hence it sees and echoes the last 100.

Related

Exporting environment variables to both bash and csh using a bash script with functions

I have a bash shell-script with a function which exports an environment variable.
For the sake of argument, let's use the following example:
#!/bin/bash
function my_function()
{
export my_env_var=$1
}
Since the whole purpose is to export the variable to the main shell, I source the script.
When the main shell is bash this works fine:
<bash-shell>
> source ~/tmp/my_test.sh
> my_function test
> echo $my_env_var
test
But other customers use csh and there things start to fail if I use the same command with the same script, since csh does not know functions :-(
<csh-shell>
% source ~/tmp/my_test.sh
Badly placed ()'s
I already tried to wrap it in a wrapper-script:
#!/bin/sh
bash -c 'source ~/tmp/my_test.sh; my_function test'
echo my_env_var = $my_env_var
But my_env_var is not exported in this way:
<csh-shell>
% source ~/tmp/my_test2.sh
my_env_var: Undefined variable.
Whereas it is known in the bash shell, as can be seen by changing the 2nd script to:
#!/bin/sh
bash -c 'source ~/tmp/my_test.sh; my_function test; echo my_env_var in bash = $my_env_var'
echo my_env_var = $my_env_var
<csh-shell>
% source ~/tmp/my_test2.sh
my_env_var in bash = test
my_env_var: Undefined variable.
What am I missing / doing wrong so the script exports the variable when it is called from bash and when it is called from csh?
The Bourne shell and csh are not compatible; many commands are different, and csh lacks many features (it doesn't have functions at all). Plus, sooner or later you're going to have someone who uses fish, which is different again. The only way to make a non-trivial script work for both is to write it twice.
That said, if you want to set some environment variables then the general strategy is to create a script which outputs the required commands; this can be in any language (shell, Python, C); for example:
#!/bin/sh
# ... do work here ...
var="foo"
# Getting the shell in a cross-platform way isn't too easy. This was only tested
# on Linux. Can add a "-c" or "-f" flag if you need cross-platform support.
shell=$(ps -ho comm $(ps -ho ppid $$))
case "$shell" in
(csh|tcsh) echo "setenv VAR $var" ;;
(fish) echo "set -Ux VAR $var" ;;
(*) echo "export VAR=$var"
esac
And when you run it, it outputs the appropriate commands:
% ./work
export VAR=foo
% tcsh
> ./work
setenv VAR foo
> fish
martin@x270 ~> ./work
set -Ux VAR foo
And to actually set it, eval the output like so:
% eval $(./work)
% echo $VAR
foo
% tcsh
> eval `./work`
> echo $VAR
foo
> fish
martin@x270 ~> eval (./work)
martin@x270 ~> echo $VAR
foo
The downside of this is that informational messages, warnings, etc. will also get eval'd; to solve this make sure to always output these to stderr:
echo >&2 "warning: foo"
If you don't want to run eval you can also use something slightly more complicated which prints VAR=foo and then create a Bourne and csh wrapper script to parse those lines, but "output the variables you want to set, instead of directly setting them" is the general approach to take to make something work in multiple incompatible shells.
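As a sketch of that eval-free variant for a Bourne-family caller (the `work` function below is a stand-in for the hypothetical ./work script, and the values are assumed to contain no spaces or newlines):

```shell
#!/bin/sh
# Stand-in for ./work: prints plain NAME=value lines, one per line.
work() { printf 'VAR=foo\nOTHER=bar\n'; }

# A here-document keeps the loop in the current shell; piping into the
# loop would run it in a subshell and the exports would be lost.
while IFS='=' read -r name value; do
  export "$name=$value"
done <<EOF
$(work)
EOF
```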

Why does "pgrep -f bash" emit two numbers instead of one?

When I run this script in shell:
printf "Current bash PID is `pgrep -f bash`\n"
using this command:
$ bash script.sh
I get back this output:
Current bash PID is 5430
24390
Every time I run it, I get a different number:
Current bash PID is 5430
24415
Where is the second line coming from?
When you use backticks (or the more modern $(...) syntax for command substitution), you create a subshell: a fork()ed-off, independent copy of the shell process with its own PID, so pgrep finds two separate copies of the shell. (Moreover, pgrep may also find copies of bash running on the system that are completely unrelated to the script at hand.)
If you want the PID of the current copy of bash, you can just look it up directly (printf is better practice than echo when the contents can contain backslashes, or when echo -n or the nonstandard bash extension echo -e is needed; neither is the case here, so echo is fine):
echo "Current bash PID is $$"
Note that even when executed in a subshell, $$ expands to the PID of the parent shell. With bash 4.0 or newer, you can use $BASHPID to look up the current PID even in a subshell.
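A two-line illustration of the difference (bash 4.0+ for BASHPID):

```shell
#!/bin/bash
# $$ is fixed at the main shell's PID, even inside a subshell;
# BASHPID always reflects the current process.
echo "main:     $$ $BASHPID"      # the two numbers match here
( echo "subshell: $$ $BASHPID" )  # $$ is unchanged, BASHPID differs
```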
See the related question Bash - Two processes for one script

Bash script to append argument to $PATH

I'm trying to write as simple a bash script as possible to append one argument to the $PATH environment variable if argument isn't already part of the $PATH. I know there are other simple ways to do it by not using a bash script; however, I want to use a bash script. I've experimented with export but I haven't had any luck. Right now my simple code looks like this:
#!/bin/bash
if [[ "$(echo $PATH)" != *"$1"* ]]
then
PATH=$PATH:$1
fi
But:
$ ./script /home/scripts
$ echo $PATH
(returns unaltered PATH)
Try it with source or .:
source ./script /home/scripts
. ./script /home/scripts
It's because your script runs in its own interpreter, and that interpreter instance (which is where the variable $PATH is getting set) dies when the script ends. You have to ask your current interpreter to run the script instead (that's what source and . are for).
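As an aside, a colon-anchored check avoids false substring matches (e.g. /home/script matching inside /home/scripts); a sketch, still meant to be sourced:

```shell
#!/bin/bash
# Usage: . ./append-path.sh /home/scripts
case ":$PATH:" in
  *":$1:"*) ;;            # already a PATH component: leave PATH alone
  *) PATH="$PATH:$1" ;;   # otherwise append it
esac
```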

What's the point of eval/bash -c as opposed to just evaluating a variable?

Suppose you have the following command stored in a variable:
COMMAND='echo hello'
What's the difference between
$ eval "$COMMAND"
hello
$ bash -c "$COMMAND"
hello
$ $COMMAND
hello
? Why is the last version almost never used if it is shorter and (as far as I can see) does exactly the same thing?
The third form is not at all like the other two -- but to understand why, we need to go into the order of operations bash follows when interpreting a command, and look at which of those steps are taken when each method is used.
Bash Parsing Stages
Quote Processing
Splitting Into Commands
Special Operator Parsing
Expansions
Word Splitting
Globbing
Execution
Using eval "$string"
eval "$string" follows all the above steps starting from #1. Thus:
Literal quotes within the string become syntactic quotes
Special operators such as >() are processed
Expansions such as $foo are honored
Results of those expansions are split on whitespace into separate words
Those words are expanded as globs if they parse as such and have matches available, and finally the command is executed.
Using sh -c "$string"
...performs the same as eval does, but in a new shell launched as a separate process; thus, changes to variable state, current directory, etc. will expire when this new process exits. (Note, too, that that new shell may be a different interpreter supporting a different language; i.e., sh -c "foo" will not support the same syntax that bash, ksh, zsh, etc. do.)
Using $string
...starts at step 5, "Word Splitting".
What does this mean?
Quotes are not honored.
printf '%s\n' "two words" will thus parse as printf %s\n "two words", as opposed to the usual/expected behavior of printf %s\n two words (with the quotes being consumed by the shell).
Splitting into multiple commands (on ;s, &s, or similar) does not take place.
Thus:
s='echo foo && echo bar'
$s
...will emit the following output:
foo && echo bar
...instead of the following, which would otherwise be expected:
foo
bar
Special operators and expansions are not honored.
No $(foo), no $foo, no <(foo), etc.
Redirections are not honored.
>foo or 2>&1 is just another word created by string-splitting, rather than a shell directive.
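For instance (a throwaway variable name):

```shell
#!/bin/bash
# After bare word splitting, ">out" is just an argument to echo,
# so it is printed literally and no file named "out" is created.
s='echo hello >out'
$s
```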
$ bash -c "$COMMAND"
This version starts up a new bash interpreter, runs the command, and then exits, returning control to the original shell. You don't need to be running bash at all in the first place to do this, you can start a bash interpreter from tcsh, for example. You might also do this from a bash script to start with a fresh environment or to keep from polluting your current environment.
EDIT:
As @CharlesDuffy points out, starting a new bash shell in this way will clear shell variables, but environment variables will be inherited by the spawned shell process.
Using eval causes the shell to parse your command twice. In the example you gave, executing $COMMAND directly or doing an eval are equivalent, but have a look at the answer here to get a more thorough idea of what eval is good (or bad) for.
There are at least times when they are different. Consider the following:
$ cmd="echo \$var"
$ var=hello
$ $cmd
$var
$ eval $cmd
hello
$ bash -c "$cmd"
$ var=world bash -c "$cmd"
world
which shows the different points at which variable expansion is performed. It's even more clear if we do set -x first
$ set -x
$ $cmd
+ echo '$var'
$var
$ eval $cmd
+ eval echo '$var'
++ echo hello
hello
$ bash -c "$cmd"
+ bash -c 'echo $var'
$ var=world bash -c "$cmd"
+ var=world
+ bash -c 'echo $var'
world
We can see here much of what Charles Duffy talks about in his excellent answer. For example, attempting to execute the variable directly prints $var because parameter expansion and those earlier steps had already been done, and so we don't get the value of var, as we do with eval.
The bash -c option only inherits exported variables from the parent shell, and since I didn't export var it's not available to the new shell.

Why aren't positional parameters changed every time one runs a command

I am learning shell scripting and I have this situation.
We say that positional parameters are environmental variables, but why don't they change every time a command is executed?
Take a look at this
set v1 v2 v3 v4
old=$#
#Just a random command
ls -l
new=$#
echo $old $new
It outputs 4 4. If environmental variables are global, why isn't it 4 1, as I ran ls -l and it should have updated positional variables?
Interesting question - you've got a good point.
To understand it, you need to understand what happens when you run any command, like ls -l. It has nothing to do with "variables being restored" or anything similar...
When you run any command:
the bash FORKS itself into two identical copies
one copy (called the child) replaces itself with the wanted command (e.g. with ls -l)
at this moment, the child process gets the correct count of positional parameters $#
remember - this happens in the child process; the second (parent) process knows NOTHING about it
the parent simply waits until the child finishes (and of course ITS $# does not change, because from the parent's point of view nothing happened - it only waited)
when the child (ls -l) finishes, the parent continues to run - and of course its $# had no reason to change...
ps: the above is simplified. In fact, after the fork the two copies are not fully identical but differ in one number - fork() returns the child's process ID to the parent, while in the child it returns 0.
If environmental variables are global, why isn't it 4 1
I presume that you are asking why running the command ls -l does not change the positional parameters from four to one with the one being -l.
It does set them to -l for the program ls. When the program ls queries its positional parameters, it is told that it has a single one consisting of -l. Once ls terminates, however, the positional parameters are returned to what they were before.
If environmental variables are global,
Even for global environmental variables, changes made in a child process never appear in the parent process. The communication of environmental variables is a one-way street: from parent to child only.
For example:
$ cat test1.sh
echo "in $0, before, we have $# pos. params with values=$*"
bash test2.sh calling test2 from test1
echo "in $0, after , we have $# pos. params with values=$*"
$ cat test2.sh
echo "in $0, we have $# pos. params with values=$*"
$ bash test1.sh -l
in test1.sh, before, we have 1 pos. params with values=-l
in test2.sh, we have 4 pos. params with values=calling test2 from test1
in test1.sh, after , we have 1 pos. params with values=-l
And, another example, this one showing that a child's changes to an environment variable do not affect the parent:
$ cat test3.sh
export myvar=1
echo "in $0, before, myvar=$myvar"
bash test4.sh
echo "in $0, after, myvar=$myvar"
$ cat test4.sh
export myvar=2
echo "in $0, myvar=$myvar"
$ bash test3.sh
in test3.sh, before, myvar=1
in test4.sh, myvar=2
in test3.sh, after, myvar=1
I don't think $# applies to an interactive shell. It works fine in a script. Try this.
$ cat try.sh
#!/bin/bash
echo $*
echo $#
$ ./try.sh one
one
1
$ ./try.sh one two
one two
2
$ ./try.sh one two three
one two three
3
