Why does space in Bash string subscripts matter here?

I'm experimenting with bash scripting and noticed the following behavior:
file="test.jar"
echo "${file: -4}" #prints .jar
echo "${file:-4}" #prints test.jar
Very confusing behavior actually. Can someone explain why the second case prints the whole test.jar?

This is due to a clash in the syntax: ${string:-anything} means "use anything as a default if string is unset or null", so here you simply get the whole string back. For a negative substring offset you need either a space or parentheses:
${string: -4}
${string:(-4)}
Read up on bash string manipulation (Shell Parameter Expansion).
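For a quick illustration, here is a minimal sketch reusing the file variable from the question:
file="test.jar"
echo "${file: -4}"    # .jar      (space: substring with a negative offset from the end)
echo "${file:(-4)}"   # .jar      (parentheses work too)
echo "${file:-4}"     # test.jar  (":-" means: default to 4 if file is unset or null)
unset file
echo "${file:-4}"     # 4         (file is unset now, so the default kicks in)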

This is a compromise due to the timeline of features being added to Bash.
The ${parameter:-word} or ${parameter-word} syntax for "replace parameter with word if parameter is unset or null (:-), or just unset (-)" has been around pretty much forever; the - version was already in the Version 7 Bourne Shell.
The ${parameter:offset:length} and ${parameter:offset} syntax for "substring of parameter starting at offset (with optional length length)" was introduced in Bash 2.0 (no conflict so far).
Negative offsets and length specifications for the substring construct were introduced in Bash 4.2. This leads to a problem:
$ string=01234567890abcdefgh
$ echo ${string:7:-2} # Not ambiguous
7890abcdef
$ echo ${string:-7} # Interpreted as "undefined/null, or..."
01234567890abcdefgh
$ echo ${string: -7} # Interpreted as offset from the end
bcdefgh
$ echo ${string:(-7)} # Interpreted as offset from the end
bcdefgh
The space before - or the parentheses around the negative offset are there to tell this expansion apart from the :- (default value) expansion.
If the space is not there, the expansion ${file:-4} is interpreted as "expand to 4 if the parameter file is unset or null, and to the value of file otherwise".
References:
BashFAQ/061: Is there a list of which features were added to specific releases (versions) of Bash?
Bash hackers wiki: Bash changes
Shell parameter expansion in the bash manual
Bash NEWS file describing the features added per version

Related

Strange output from bash one liner

While in the course of learning bash, I often tweak an existing thing and look at its output.
~$ for i in {1..19}; do echo "Everything in UNIX is a file."; sleep 1; done
I had this, and out of curiosity I tweaked it into the following:
~$ for i in {1..19 * 2}; do echo "Everything in UNIX is a file."; echo "The value of i is ${i}"; sleep 1; done
Now, to my surprise, I started getting the following output:
Everything in UNIX is a file.
The value of i is OneDrive
Everything in UNIX is a file.
The value of i is opera autoupdate
Everything in UNIX is a file.
The value of i is Personal_Workspace
Everything in UNIX is a file.
The value of i is Pictures
Everything in UNIX is a file.
The value of i is PrintHood
Everything in UNIX is a file.
The value of i is Recent
Everything in UNIX is a file.
The value of i is Roaming
Everything in UNIX is a file.
The value of i is Saved Games
Everything in UNIX is a file.
The value of i is Searches
Some of the values of i are the names of files and directories in my home directory (I am in my home directory while executing this script).
What I was expecting was that i would range from 1 to 19*2 = 38, i.e. take the values 1, 2, 3, ..., 38.
But obviously it didn't. Why?
In bash, brace expansion happens before everything else. You were expecting arithmetic expansion, but it never gets a chance here because of the order of expansions in the shell; moreover, {1..19 * 2} is not a valid brace expression (the spaces split it into separate words), so your code ends up with the literal words {1..19 and 2}, plus *.
Since * has a special meaning in the shell, it undergoes pathname expansion (globbing) and expands to all files/directories in the current folder. In the full output you would also see entries for the other two words, {1..19 and 2}, taken literally.
From the bash(1) man page, under the section EXPANSION:
The order of expansions is: brace expansion, tilde expansion, parameter, variable and arithmetic expansion and command substitution (done in a left-to-right fashion), word splitting, and pathname expansion.
You are much better off using a for loop with the ((..)) construct if your scripts target the Bourne Again shell (bash):
for ((i=1; i<=38; i++)); do echo "The value of i is ${i}"; done
The * symbol in the unix shell is a wildcard. Because the spaces make {1..19 * 2} an invalid brace expression, it falls apart into the words {1..19, * and 2}; the * then expands to all files in the current directory, and those names, plus the two literal words, become the values of i in your loop.
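To see for yourself which words the shell actually produces, and what a correct arithmetic version looks like, here is a minimal sketch (file1 and file2 below just stand in for whatever happens to be in your current directory):
$ printf '<%s>\n' {1..19 * 2}   # invalid brace expression, so * still globs
<{1..19>
<file1>
<file2>
<2}>
$ for ((i=1; i<=19*2; i++)); do echo "The value of i is ${i}"; done   # bash arithmetic loop
$ for i in $(seq 1 $((19*2))); do echo "The value of i is ${i}"; done # arithmetic expansion + external seq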

Passing \* as a parameter

Using ksh. Trying to reuse a current script without modifying it, which basically boils down to something like this:
`expr 5 $1 $2`
How do I pass in a multiplication operator (*) as parameter $1?
I first attempted using "*" and even \*, but that isn't working.
I've tried multiple backslash-escape and quote combinations, but I think I'm doing it wrong.
Without modifying the script, I don't think this can be done:
On calling, you can pass a literal * as '*', "*" or \* (any will do): this will initially protect the * from shell expansions (interpretation by the shell).
The callee (the script) will then receive literal * (as $1), but due to unquoted use of $1 inside the script, * will invariably be subject to filename expansion (globbing), and will expand to all (non-hidden) filenames in the current folder, breaking the expr command.
Trying to add an extra layer of escaping - such as "'*'" or \\\* - will NOT work, because the extra escaping will become an embedded, literal part of the argument - the target script will see literal '*' or \* and pass it as-is to expr, which will fail, because neither is a valid operator.
Here's a workaround:
Change to an empty directory.
By default, ksh will return any glob (pattern) as-is if there are no matching filenames. Thus, * (or any other glob) will be left unmodified in an empty directory, because there's nothing to match (thanks, @Peter Cordes).
For the calling script / interactive shell, you could disable globbing altogether by running set -f, but note that this will not affect the called script.
It's now safe to invoke your script with '*' (or any other glob), because it will simply be passed through; e.g., script '*' 2 will now yield 10, as expected.
If both the shell you invoke from and the script's shell are ksh (or bash) with their default configuration, you can even get away with script * 2; i.e., you can make do without quoting * altogether.
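A minimal sketch of that workaround in action (here script stands for the unmodified ksh script containing expr 5 $1 $2, assumed to be on your PATH):
$ cd "$(mktemp -d)"   # a freshly created, empty directory: nothing for * to match
$ script '*' 2        # the unquoted $1 inside the script globs to nothing and stays *
10
$ script '*' 3
15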
Glob expansion happens very late, after parameter expansion, and word-splitting (in that order). Quote-removal doesn't happen on the results of earlier expansions, just what was on the command line to start with. This rules out passing in a quoted \* or similar (see mklement0's answer), by using an extra layer of quoting.
It also rules out passing in space-padded *: Word-splitting removes the spaces before pathname (glob) expansion, so it still ends up expanding * to all the filenames in the directory.
foo(){ printf '"%s"\n' "$@"; set -x; expr 5 $1 $2; set +x; }   # "$@" prints each arg; $1/$2 stay unquoted, like the script
$ foo ' * ' 4
" * "
"4"
+ expr 5 ...contents of my directory... 4
expr: syntax error
+ set +x
You should fix this buggy script before someone runs it with an arg that breaks it in a dangerous way, rather than a merely inconvenient one.
If you don't need to support exactly the same operators as expr, you might want to use arithmetic expansion to do it without running an external command:
result=$((5 $1 $2)) # arithmetic expansion for the right-hand side
# or
((result=5 "$1" "$2")) # whole command is an arithmetic expression.
Double quotes around the parameters are optional inside an arithmetic command ((...)), but you must not use them inside an arithmetic expansion $((...)) (in bash; apparently this works in ksh).
Normally it's a good habit to just always quote, unless you specifically want word splitting and glob expansion.
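If you go that route, a minimal sketch of a replacement script could look like this (calc.sh is a hypothetical name; it mirrors the `expr 5 $1 $2` script from the question):
#!/bin/ksh
# calc.sh: like `expr 5 $1 $2`, but via arithmetic expansion,
# so no external command and no globbing of an unquoted *
echo "$((5 $1 $2))"
Calling it with a quoted * now works even in a non-empty directory, because pathname expansion does not happen inside $(( )):
$ ./calc.sh '*' 2
10
$ ./calc.sh + 7
12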

Bash variable defaulting doesn't work if followed by pipe (bash bug?)

I've just discovered a strange behaviour in bash that I don't understand. The expression
${variable:=default}
sets variable to the value default if it isn't already set. Consider the following examples:
#!/bin/bash
file ${foo:=$1}
echo "foo >$foo<"
file ${bar:=$1} | cat
echo "bar >$bar<"
The output is:
$ ./test myfile.txt
myfile.txt: ASCII text
foo >myfile.txt<
myfile.txt: ASCII text
bar ><
You will notice that the variable foo is assigned the value of $1 but the variable bar is not, even though the result of its defaulting is presented to the file command.
If you remove the innocuous pipe into cat from line 4 and re-run it, then both foo and bar get set to the value of $1.
Am I missing something here, or is this potentially a bash bug?
(GNU bash, version 4.3.30)
In the second case, file is a member of a pipeline and, like every pipeline member, runs in its own subshell. When file and its subshell end, $bar with its new value from $1 no longer exists.
Workaround:
#!/bin/bash
file ${foo:=$1}
echo "foo >$foo<"
: "${bar:=$1}" # Parameter Expansion before subshell
file "$bar" | cat
echo "bar >$bar<"
It's not a bug. Parameter expansion happens when the command is evaluated, not when it is parsed, but a command that is part of a pipeline is not evaluated until the new process has been started. Changing this, aside from likely breaking some existing code, would require an extra level of expansion before evaluation occurs.
A hypothetical bash session:
> foo=5
> bar='$foo'
> echo "$bar"
$foo
# $bar expands to '$foo' before the subshell is created, but then `$foo` expands to 5
# during the "normal" round of parameter expansion.
> echo "$bar" | cat
5
To avoid that, bash would need some way of marking pieces of text that result from the new first round of pre-evaluation parameter expansion, so that they do not undergo a second
round of evaluation. This type of bookkeeping would quickly lead to unmaintainable code as more corner cases are found to be handled. Far simpler is to just accept that parameter expansions will be deferred until after the subshell starts.
The other alternative is to allow each component to run in the current shell, something that is allowed by the POSIX standard, but is not required, either. bash made the choice long ago to execute each component in a subshell, and reversing that would break too much existing code that relies on the current behavior. (bash 4.2 did introduce the lastpipe option, allowing the last component of a pipeline to execute in the current shell if explicitly enabled.)
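A minimal sketch of that lastpipe behavior (bash 4.2 or later; lastpipe only takes effect when job control is off, which is the default for non-interactive scripts):
#!/bin/bash
shopt -s lastpipe
echo "myfile.txt" | read -r bar   # read is the last pipeline component,
                                  # so it runs in the current shell here
echo "bar >$bar<"                 # prints: bar >myfile.txt<
                                  # without lastpipe it would print: bar ><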

Bash arrays and negative subscripts, yes or no?

The GNU bash manual tells me
An indexed array is created automatically if any variable is assigned
to using the syntax
name[subscript]=value
The subscript is treated as an arithmetic expression that must
evaluate to a number. If subscript evaluates to a number less than
zero, it is used as an offset from one greater than the array’s
maximum index (so a subscript of -1 refers to the last element of the
array).
So I figure I will give it a try and get the following result:
$ muh=(1 4 'a' 'bleh' 2)
$ echo $muh
1
$ echo ${muh[*]}
1 4 a bleh 2 # so far so good so now I'll try a negative ...
$ echo ${muh[-1]}
-bash: muh: bad array subscript # didn't go as planned!
Did I do something wrong, or is the website wrong, or is gnu bash that different from the bash I am running under CentOS? Thanks!
If you just want the last element:
$ echo ${muh[*]: -1}
2
If you want the next-to-last element:
$ echo ${muh[*]: -2:1}
bleh
According to Greg Wooledge's wiki (which links to the bash changelog), the negative index syntax was added to bash in version 4.2 alpha.
Bash before 4.2 (like the default one on Macs these days) doesn't support negative subscripts. Apart from the "substring expansion" used in the accepted answer, a possibly cleaner workaround is to compute the desired index from the array length inside the brackets:
$ array=(one two three)
$ echo "${array[${#array[#]}-1]}"
three
With this approach, you can pack other parameter expansion operations into the term, e.g. "remove matching prefix pattern" th:
$ echo "${array[${#array[#]}-1]#th}"
ree
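If a script has to run on both older and newer bash, a small sketch like this (reusing the muh array from the question) picks whichever form is supported:
muh=(1 4 'a' 'bleh' 2)
if (( BASH_VERSINFO[0] > 4 || (BASH_VERSINFO[0] == 4 && BASH_VERSINFO[1] >= 2) )); then
    last=${muh[-1]}             # negative subscripts, bash 4.2+
else
    last=${muh[${#muh[@]}-1]}   # works on older bash too
fi
echo "$last"   # 2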
If you do man bash, the section on arrays does not list this behavior. It might be something new (GNU?) in bash.
It fails for me on CentOS 6.3 (bash 4.1.2).
The negative subscript works perfectly fine for me on Ubuntu 14.04 with GNU bash version 4.3.11(1); however, it returns
line 46: [-1]: bad array subscript
when I try to run the same script on 4.2.46(1).

How to encode url in bash script?

EDIT (Side Question)
Can someone please explain what this line does?
eval website=\${$#}
The script reads a lot of parameters; it's called somewhat like this:
./script.sh -t 30 -n 100 -a test http://www.google.com
I have trouble reading the url ( http://www.google.com )
I am opening firefox using URLs passed to a bash script. How do I encode them? Some of these URLs are causing issues.
Some code
eval website=\${$#}  # takes the last argument
firefox -width 1280 -height 8000 ${website} &
Problematic URL
http://www.airportbusiness.com//print/Airport-Business-Magazine/Expo-Returns-to-Vegas/1$41912
In firefox, it opens as
http://www.airportbusiness.com//print/Airport-Business-Magazine/Expo-Returns-to-Vegas/141912
The $ sign gets removed.
The easiest way is probably to escape the characters that cause some problems.
Unless your URL contains some unusual characters such as ' or \, you should be fine just by putting it between two ' characters:
$ firefox 'YOUR_URL'
This will prevent the content of YOUR_URL from being expanded by the shell.
Edit, to reflect updated answer:
You can see with the echo command how bash expands your parameters.
In your example, bash thinks $ introduces a variable (here the positional parameter $4), so it substitutes $4 with its value, which is not defined (and thus just removes $4):
$ echo http://www.airportbusiness.com//print/Airport-Business-Magazine/Expo-Returns-to-Vegas/1$41912
http://www.airportbusiness.com//print/Airport-Business-Magazine/Expo-Returns-to-Vegas/11912
$ echo 'http://www.airportbusiness.com//print/Airport-Business-Magazine/Expo-Returns-to-Vegas/1$41912'
http://www.airportbusiness.com//print/Airport-Business-Magazine/Expo-Returns-to-Vegas/1$41912
Never use eval, and always quote your variables. The first argument is stored in the parameter 1:
firefox "$1" &
This line:
eval website=\${$#}
sets the variable to the last positional parameter, regardless of how many there are.
Change it to:
website=${@: -1}
which is a Bashism, by the way.
Here are a few other Bashisms that accomplish the same thing:
echo "${!#}"
echo "${#:$#}"
echo "${BASH_ARGV[0]}"
