Passing string with a space between words through various function layers - string

I have two functions in Bash. One is a generic run function that accepts an input and evaluates it, printing the command and testing the exit code. It is used in a large script to ensure each command executes successfully before continuing.
The second one is a complex function that does some Git history parsing. Only the problematic line is shown.
I am calling this function from a for-loop that iterates over a list of terms to search. The issue is that spaces are not handled correctly when they occur between other words. I have tried running my script through shell-checking websites, and all of the suggestions seem to break my code.
function run() {
echo "> ${1}"
eval "${1}"
# Test exit code of the eval, and exit if non-zero
}
function searchCommitContents() {
run 'result=$(git log -S'"${1}"' --format=format:%H)'
# Do something with result, which is a list of matching SHA1 hashes for the commits
echo "${result}"
}
# Main
declare -a searchContents=('foo' 'bar' ' foo ' 'foo bar')
for i in "${searchContents[@]}"
do
searchCommitContents "${i}"
done
Here is the output I get:
> result=$(git log -Sfoo --format=format:%H)
<results>
> result=$(git log -Sbar --format=format:%H)
<results>
> result=$(git log -S foo --format=format:%H)
<results>
> result=$(git log -Sfoo bar --format=format:%H)
fatal: ambiguous argument 'bar': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
I tried adding additional single and double quotes in various places so that the 'foo bar' string would not be split into two words. I also tried escaping the dollar sign, like so: -S'"\${1}"', based on other questions on this site.

Why are you printing result=$(? It's an internal variable, it can be anything, there is no need for it in logs.
Print the command that you are executing, not the variable name.
run() {
echo "+ $*" >&2
"$@"
}
searchCommitContents() {
local result
result=$(run git log -S"${1}" --format=format:%H)
: do stuff to "${result}"
echo "$result"
}
With this version there is no issue with an input that has a space in the middle.
If you want a quoted string in the log, use printf "%q" or ${...@Q} in newer Bash; I don't really enjoy either quoting style and just use $*. I really like /bin/printf from GNU coreutils, but it's a separate process. While ${...@Q} is the fastest, it's (still) not portable enough for me (I have some old Bash around).
# compare
$ set -- a 'b c' d
$ echo "+ $*" >&2
+ a b c d
$ echo "+$(printf " %q" "$@")" >&2
+ a b\ c d
$ echo "+" "${@@Q}" >&2
+ 'a' 'b c' 'd'
$ echo "+$(/bin/printf " %q" "$@")" >&2
+ a 'b c' d
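As a hedged, self-contained sketch of the same idea, the following shows an argument containing spaces surviving two function layers when every layer forwards "$@" instead of building a string for eval. count_args and outer are hypothetical helpers added only for this illustration; they are not part of the original script.

```shell
#!/usr/bin/env bash
# run: log the command to stderr in a re-parsable form, then execute it as-is.
run() {
    printf '+' >&2
    printf ' %q' "$@" >&2
    printf '\n' >&2
    "$@"
}

# Hypothetical helper: reports how many arguments it received.
count_args() {
    echo "$#"
}

# Hypothetical middle layer: forwards its arguments untouched.
outer() {
    run count_args "$@"
}

outer 'foo bar' ' foo '
```

Because no layer re-parses the arguments, count_args should still report two arguments, and no nested quoting is needed at the call site.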

See these lines:
> result=$(git log -Sfoo bar --format=format:%H)
fatal: ambiguous argument 'bar': unknown revision or path not in the working tree.
Specifically this: -Sfoo bar. It should be -S"foo bar" or -S "foo bar", because an argument containing spaces has to be quoted. However, each time the string is re-parsed by the shell (here, by eval), one layer of quotes ('' or "") is stripped, so the quotes need to be nested.
So in this line:
declare -a searchContents=('foo' 'bar' ' foo ' 'foo bar')
change 'foo bar' to '"foo bar"' or "'foo bar'" or "\"foo bar\"".
This is a case of two layers of nested quotes. The more layers there are, the trickier it gets. Here's an example of four layers of quotes I once did.

Related

Simplify debugging output. Specific request for in-place command text replacement in BASH

OK, so, I have a debugging setup a little like this:
DVAR=();
function DBG() {
if [[ $1 == -s ]]; then shift; stackTrace; fi;
if [[ ! -z ${DVAR[@]} ]]; then
for _v in ${!DVAR[@]}; do
echo "${DVAR[$_v]}" >> $LOG;
unset DVAR[$_v];
done;
fi
local tmp=("$*")
[[ ! -z $tmp ]]&&echo "$tmp" >> $LOG||continue;
}
every once in a while I call it either directly or, and I'd like to take this approach more, by repeatedly adding things to the array and calling it later. SPECIFICALLY, I'd like to be using this:
DVAR+="${0##*/}${FUNCNAME[0]}:$LINENO === assorted local variables and stuff here ====="
That first part is quite a mouthful and really clutters up my code. I'd REALLY rather be able to say something like:
DBG === assorted local variables and stuff here=====
I've tried messing around with alias and even eval, all to no. . evail. ahem
Thoughts anyone?
Like @ufopilot said, you should add your line as a new entry in the array with DVAR+=("...."), rather than concatenating it onto the first element as one long string. Here's an example explaining it:
Concatenating:
$ DVAR=()
$ DVAR+="foo"
$ DVAR+="bar"
$ declare -p DVAR
declare -a DVAR=([0]="foobar")
$ echo ${DVAR[@]}
foobar
Appending new entry:
$ DVAR=()
$ DVAR+=(foo)
$ DVAR+=(bar)
$ declare -p DVAR
declare -a DVAR=([0]="foo" [1]="bar")
$ echo "${DVAR[@]}"
foo bar
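Building on the append form, a small helper can add the caller prefix for you, so each call site only supplies the message. This is a sketch under assumptions: dbg_add and demo are hypothetical names, and the file:function:line prefix layout is only one possible choice.

```shell
#!/usr/bin/env bash
DVAR=()

# Hypothetical helper: queue one message, prefixed with the caller's
# source file, function name and line number.
dbg_add() {
    local src=${BASH_SOURCE[1]##*/}
    DVAR+=("${src}:${FUNCNAME[1]}:${BASH_LINENO[0]} $*")
}

# Hypothetical caller, used only to show the prefix.
demo() {
    dbg_add "=== assorted local variables here ==="
}

demo
printf '%s\n' "${DVAR[@]}"
```

Inside dbg_add, index 1 of FUNCNAME and BASH_SOURCE describes the caller, and BASH_LINENO[0] is the line where dbg_add was invoked. When called from the top level of a script, the function name is reported as main.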
Here's an example of a function I put together for debugging purposes some time ago. The function is get_stack, which gets the name of the function that called it and the file that calling function lives in, along with the trace, so you can see the call history.
File test.sh:
#!/usr/bin/env bash
foo() {
get_stack
echo -e "$stack_trace" # Note the quotation to get indentation
}
get_stack () {
stack_trace=""
local i stack_size=${#FUNCNAME[@]}
local indent=" "
local newline="" # newline only after first line
# Offset to skip get_stack function
for (( i=1; i<$stack_size; i++ )); do
local func="${FUNCNAME[$i]}"
[ x$func = x ] && func=MAIN
local linen="${BASH_LINENO[$(( i - 1 ))]}"
local src="${BASH_SOURCE[$i]}"
[ x"$src" = x ] && src=non_file_source
stack_trace+="${newline}${indent}-> $src [$func]: $linen"
newline="\n"
indent="$indent "
done
}
echo "stack from test.sh"
foo
File test2.sh:
#!/usr/bin/env bash
source test.sh
echo "stack from test2.sh"
foo
Output:
stack from test.sh
-> test.sh [foo]: 4
-> test.sh [source]: 28
-> ./test2.sh [main]: 3
stack from test2.sh
-> test.sh [foo]: 4
-> ./test2.sh [main]: 6
In my script I have a down-right arrow character that looks better than "->", but I can't figure out how to get Stack Overflow to display it properly. The character is \u21b3.
As you can see in the stack trace, "foo" ran upon sourcing the file, just as it's supposed to do. But now it is clear why it ran and output text!
I think this demonstrates well how you can use the array FUNCNAME to walk backwards in the call stack. BASH_SOURCE is also an array, with matching indices, but it shows which file each function call came from. Modifying the "foo" function to inspect these arrays:
foo() {
get_stack
echo -e "$stack_trace"
echo "${BASH_SOURCE[@]}"
echo "${FUNCNAME[@]}"
}
yields:
stack from test.sh
-> test.sh [foo]: 4
-> test.sh [source]: 30
-> ./test2.sh [main]: 3
test.sh test.sh ./test2.sh
foo source main
stack from test2.sh
-> test.sh [foo]: 4
-> ./test2.sh [main]: 6
test.sh ./test2.sh
foo main
As you can see, the first set of outputs belongs to test.sh, which came from the sourcing. You can see this in FUNCNAME: "foo source main". The "main" is the main scope, that is, the code that runs in the main body of the file rather than in a function. In this case, test.sh contains the "foo" call, which runs when the file is sourced.
I hope you see how the indices belong to the same call, but yield different info.
Now for your part, you wanted to add only the string and not the whole same-y info again and again. Since I'm not sure you want the stack trace or not I just added the message string to the first calling function instance. I also fixed up some old code here to make it a little better.
New improved function with message:
get_stack () {
local msg="$*"
stack_trace=""
local i stack_size=${#FUNCNAME[@]}
local indent=" "
# Offset to skip get_stack function
for (( i=1; i<$stack_size; i++ )); do
local func="${FUNCNAME[$i]}"
[ x$func = x ] && func=MAIN
local linen="${BASH_LINENO[$(( i - 1 ))]}"
local src="${BASH_SOURCE[$i]}"
[ x"$src" = x ] && src=non_file_source
stack_trace+="${newline:=\n}${indent}-> $src [$func:$linen]${msg:+": $msg"}"
msg=""
indent="$indent "
done
}
Output:
$ ./test2.sh
stack from test.sh
-> test.sh [foo:4]: my message
-> test.sh [source:28]
-> ./test2.sh [main:3]
stack from test2.sh
-> test.sh [foo:4]: my message
-> ./test2.sh [main:6]
Note that ${var:=X} will initialize "var" with "X" if var was uninitialized or set to empty string, ${var:+X} will replace "var" with "X" if var is initialized/set to something that is not the empty string, and finally as bonus ${var:-X} will replace "var" with "X" if var is uninitialized/set empty string. This is variable substitution in bash which is quite handy!
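The three expansions can be compared side by side in a short sketch. demo_expansions is a hypothetical wrapper used only to keep the example self-contained.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper showing :-, :+ and := on the same variable.
demo_expansions() {
    local var
    unset var
    echo "1: ${var:-fallback}"   # substitutes the default, var stays unset
    echo "2: ${var:+alternate}"  # var is unset, so this expands to nothing
    echo "3: ${var:=assigned}"   # substitutes the default AND assigns it
    echo "4: ${var:+alternate}"  # var is now non-empty, so the alternate is used
    echo "5: $var"               # the value stored by :=
}

demo_expansions
```

Only the := form changes the variable; the other two affect just the text of that one expansion.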
There are some pieces you can cut and paste into your function, or if you need a stack-based log you can use my function as a base.
Hope this helps you in your endeavors!

Shell script error "syntax error at line 145: `<<' unmatched" [duplicate]

For personal development and projects I work on, we use four spaces instead of tabs.
However, I need to use a heredoc, and I can't do so without breaking the indentation flow.
The only working way to do this I can think of would be this:
usage() {
cat << ' EOF' | sed -e 's/^ //';
Hello, this is a cool program.
This should get unindented.
This code should stay indented:
something() {
echo It works, yo!;
}
That's all.
EOF
}
Is there a better way to do this?
Let me know if this belongs on the Unix/Linux Stack Exchange instead.
(If you are using bash 4, scroll to the end for what I think is the best combination of pure shell and readability.)
For heredocs, using tabs is not a matter of preference or style; it's how the language is defined.
usage () {
⟶# Lines between EOF are each indented with the same number of tabs
⟶# Spaces can follow the tabs for in-document indentation
⟶cat <<-EOF
⟶⟶Hello, this is a cool program.
⟶⟶This should get unindented.
⟶⟶This code should stay indented:
⟶⟶ something() {
⟶⟶ echo It works, yo!;
⟶⟶ }
⟶⟶That's all.
⟶EOF
}
Another option is to avoid a here document altogether, at the cost of having to use more quotes and line continuations:
usage () {
printf '%s\n' \
"Hello, this is a cool program." \
"This should get unindented." \
"This code should stay indented:" \
" something() {" \
" echo It works, yo!" \
" }" \
"That's all."
}
If you are willing to forego POSIX compatibility, you can use an array to avoid the explicit line continuations:
usage () {
message=(
"Hello, this is a cool program."
"This should get unindented."
"This code should stay indented:"
" something() {"
" echo It works, yo!"
" }"
"That's all."
)
printf '%s\n' "${message[@]}"
}
The following uses a here document again, but this time with bash 4's readarray command to populate an array. Parameter expansion takes care of removing a fixed number of spaces from the beginning of each line.
usage () {
# No tabs necessary!
readarray message <<' EOF'
Hello, this is a cool program.
This should get unindented.
This code should stay indented:
something() {
echo It works, yo!;
}
That's all.
EOF
# Each line is indented an extra 8 spaces, so strip them
printf '%s' "${message[@]# }"
}
One last variation: you can use an extended pattern to simplify the parameter expansion. Instead of having to count how many spaces are used for indentation, simply end the indentation with a chosen non-space character, then match the fixed prefix. I use : . (The space following the colon is for readability; it can be dropped with a minor change to the prefix pattern.)
(Also, as an aside, one drawback to your very nice trick of using a here-doc delimiter that starts with whitespace is that it prevents you from performing expansions inside the here-doc. If you wanted to do so, you'd have to either leave the delimiter unindented, or make one minor exception to your no-tab rule and use <<-EOF and a tab-indented closing delimiter.)
usage () {
# No tabs necessary!
closing="That's all"
readarray message <<EOF
: Hello, this is a cool program.
: This should get unindented.
: This code should stay indented:
: something() {
: echo It works, yo!;
: }
: $closing
EOF
shopt -s extglob
printf '%s' "${message[@]#+( ): }"
shopt -u extglob
}
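The point about expansions depends only on whether the here-doc delimiter is quoted. A minimal sketch of the difference, where name, literal and expanded are hypothetical variables chosen for the illustration:

```shell
#!/usr/bin/env bash
name="world"

# Quoted delimiter: the body is taken literally, nothing is expanded.
literal=$(cat <<'EOF'
Hello, $name
EOF
)

# Unquoted delimiter: parameter expansion happens inside the body.
expanded=$(cat <<EOF
Hello, $name
EOF
)

printf '%s\n' "$literal" "$expanded"
```

The first line printed keeps the text $name as written, while the second has it replaced by the variable's value.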
geta() {
local _ref=$1
local -a _lines
local _i
local _leading_whitespace
local _len
IFS=$'\n' read -rd '' -a _lines ||:
_leading_whitespace=${_lines[0]%%[^[:space:]]*}
_len=${#_leading_whitespace}
for _i in "${!_lines[@]}"; do
printf -v "$_ref"[$_i] '%s' "${_lines[$_i]:$_len}"
done
}
gets() {
local _ref=$1
local -a _result
local IFS
geta _result
IFS=$'\n'
printf -v "$_ref" '%s' "${_result[*]}"
}
This is a slightly different approach which requires Bash 4.1 due to printf's assigning to array elements. (for prior versions, substitute the geta function below). It deals with arbitrary leading whitespace, not just a predetermined amount.
The first function, geta, reads from stdin, strips leading whitespace and returns the result in the array whose name was passed in.
The second, gets, does the same thing as geta but returns a single string with newlines intact (except the last).
If you pass in the name of an existing variable to geta, make sure it is already empty.
Invoke geta like so:
$ geta hello <<'EOS'
> hello
> there
>EOS
$ declare -p hello
declare -a hello='([0]="hello" [1]="there")'
gets:
$ unset -v hello
$ gets hello <<'EOS'
> hello
> there
> EOS
$ declare -p hello
declare -- hello="hello
there"
This approach should work for any combination of leading whitespace characters, so long as they are the same characters for all subsequent lines. The function strips the same number of characters from the front of each line, based on the number of leading whitespace characters in the first line.
The reason all the variables start with underscore is to minimize the chance of a name collision with the passed array name. You might want to rewrite this to prefix them with something even less likely to collide.
To use in OP's function:
gets usage_message <<'EOS'
Hello, this is a cool program.
This should get unindented.
This code should stay indented:
something() {
echo It works, yo!;
}
That's all.
EOS
usage() {
printf '%s\n' "$usage_message"
}
As mentioned, for Bash older than 4.1:
geta() {
local _ref=$1
local -a _lines
local _i
local _leading_whitespace
local _len
IFS=$'\n' read -rd '' -a _lines ||:
_leading_whitespace=${_lines[0]%%[^[:space:]]*}
_len=${#_leading_whitespace}
for _i in "${!_lines[@]}"; do
eval "$(printf '%s+=( "%s" )' "$_ref" "${_lines[$_i]:$_len}")"
done
}

Bash IFS, getting zapped in function call

Platform CentOS Linux release 7.6.1810, working in bash.
GNU bash, version 4.2.46(2)-release (x86_64-redhat-linux-gnu)
This is an idiom I've seen recommended for parsing text in bash in general and in particular for returning multiple values from a function.
IFS=":" read A B <<< $(echo ONE:TWO)
I'm getting unexpected behaviour when I call a function, yyy in the example here
IFS=":" read Y1 Y2 <<< $(yyy)
where yyy itself also wants to do a similar call.
The effect is that within yyy(), even though I explicitly specify the IFS
IFS=":" read C1 C2 <<< $( echo "A:B" )
The fields are parsed, but both values are assigned to C1, which gets the value "A B". If the function is called in isolation it works as expected.
This is a test case, distilled down from a much larger script. I want to know what is happening with IFS here. In the failure case (the second example below), setting IFS=":" in the caller somehow causes the result fields to be aggregated. The first and third calls to yyy() below work as expected; the output is shown after the code.
#!/bin/bash
debug() { echo "$1" 1>&2 ; }
yyy() {
debug "in yyy"
# why are the two values assigned to A here if the caller specified IFS?
IFS=":" read A B <<< $(echo ONE:TWO)
debug "A=$A"
debug "B=$B"
echo "$A:$B"
}
# this works as expected
read Y1 Y2 <<< $(yyy)
echo -e "===\n"
# this causes the read in yyy() to aggregate
IFS=":" read Y1 Y2 <<< $(yyy)
echo -e "===\n"
# This is a workaround that enables yyy() to work correctly
# But why do I need to do this?
OUT="$(yyy)"
IFS=":" read Y1 Y2 <<< $(echo $OUT)
This is the output
in yyy
A=ONE
B=TWO
===
in yyy
A=ONE TWO
B=
===
in yyy
A=ONE
B=TWO
Note that in the second case A gets the value ONE TWO
This seems to be a bug in bash-4.2 as discussed here, IFS incorrectly splitting herestrings in bash 4.2. Should work on the versions above that.
These are the results on the same version you have, GNU bash 4.2.46(2), when I ran the function yyy in debug mode (by running set -x at the prompt).
++ IFS=:
++ read A B
+++ echo ONE:TWO
++ debug 'A=ONE TWO'
++ echo 'A=ONE TWO'
A=ONE TWO
++ debug B=
++ echo B=
B=
++ echo 'ONE TWO:'
The above is a snippet of the set -x output. As you can see, when echo ONE:TWO is printed as a result of the command substitution, no word splitting is expected to happen, because the line doesn't contain any character of the default IFS value (space, tab, or newline).
So you would expect read with IFS=: to split the whole string and put the values into the variables A and B, but somehow the : character is lost and the string ONE TWO is stored in the first variable.
Look at the output of the function execution in GNU bash, version 4.4.12(1) which exhibits the right behavior.
++ IFS=:
++ read A B
+++ echo ONE:TWO
++ debug A=ONE
++ echo A=ONE
A=ONE
++ debug B=TWO
++ echo B=TWO
B=TWO
++ echo ONE:TWO
There have been a lot of IFS-related bugs up to version 4.4.0; see bash/CHANGES. So a personal recommendation is to upgrade your bash version to a more recent stable one. Also see Trying to split a string into two variables
Similar bug on version 4.4.0(1)-release
You would expect ONE:TWO to be unmodified when the $(..) is expanded, for the reasons mentioned earlier. But here too the delimiter character is lost and the variable A is set to ONE TWO
IFS=":" read A B <<< $(echo ONE:TWO)
echo "$A"
ONE TWO
Surprisingly, the above code works on 4.2.46(2), which means 4.4.0(1) broke functionality that worked in earlier releases.
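A commonly suggested way to sidestep this class of problem is to quote the expansion on the right-hand side of <<<, so that it is never a candidate for word splitting. The sketch below assumes that quoting is sufficient on the affected versions; it has not been verified here against 4.2.46 or 4.4.0 specifically. pair is a hypothetical stand-in for yyy.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for yyy: parses a pair and re-emits it.
pair() {
    local A B
    # Quoting "$(...)" keeps the here-string from being word-split.
    IFS=":" read -r A B <<< "$(echo ONE:TWO)"
    echo "$A:$B"
}

# The caller quotes its here-string in the same way.
IFS=":" read -r Y1 Y2 <<< "$(pair)"
echo "Y1=$Y1 Y2=$Y2"
```

On a Bash without the bug this prints Y1=ONE Y2=TWO, the same result as the unquoted form, so the quoting costs nothing where it is not needed.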

How to return a string value from a Bash function

I'd like to return a string from a Bash function.
I'll write the example in java to show what I'd like to do:
public String getSomeString() {
return "tadaa";
}
String variable = getSomeString();
The example below works in bash, but is there a better way to do this?
function getSomeString {
echo "tadaa"
}
VARIABLE=$(getSomeString)
There is no better way I know of. Bash knows only status codes (integers) and strings written to the stdout.
You could have the function take a variable as the first arg and modify the variable with the string you want to return.
#!/bin/bash
set -x
function pass_back_a_string() {
eval "$1='foo bar rab oof'"
}
return_var=''
pass_back_a_string return_var
echo $return_var
Prints "foo bar rab oof".
Edit: added quoting in the appropriate place to allow whitespace in string to address @Luca Borrione's comment.
Edit: As a demonstration, see the following program. This is a general-purpose solution: it even allows you to receive a string into a local variable.
#!/bin/bash
set -x
function pass_back_a_string() {
eval "$1='foo bar rab oof'"
}
return_var=''
pass_back_a_string return_var
echo $return_var
function call_a_string_func() {
local lvar=''
pass_back_a_string lvar
echo "lvar='$lvar' locally"
}
call_a_string_func
echo "lvar='$lvar' globally"
This prints:
+ return_var=
+ pass_back_a_string return_var
+ eval 'return_var='\''foo bar rab oof'\'''
++ return_var='foo bar rab oof'
+ echo foo bar rab oof
foo bar rab oof
+ call_a_string_func
+ local lvar=
+ pass_back_a_string lvar
+ eval 'lvar='\''foo bar rab oof'\'''
++ lvar='foo bar rab oof'
+ echo 'lvar='\''foo bar rab oof'\'' locally'
lvar='foo bar rab oof' locally
+ echo 'lvar='\'''\'' globally'
lvar='' globally
Edit: demonstrating that the original variable's value is available in the function, as was incorrectly criticized by @Xichen Li in a comment.
#!/bin/bash
set -x
function pass_back_a_string() {
eval "echo in pass_back_a_string, original $1 is \$$1"
eval "$1='foo bar rab oof'"
}
return_var='original return_var'
pass_back_a_string return_var
echo $return_var
function call_a_string_func() {
local lvar='original lvar'
pass_back_a_string lvar
echo "lvar='$lvar' locally"
}
call_a_string_func
echo "lvar='$lvar' globally"
This gives output:
+ return_var='original return_var'
+ pass_back_a_string return_var
+ eval 'echo in pass_back_a_string, original return_var is $return_var'
++ echo in pass_back_a_string, original return_var is original return_var
in pass_back_a_string, original return_var is original return_var
+ eval 'return_var='\''foo bar rab oof'\'''
++ return_var='foo bar rab oof'
+ echo foo bar rab oof
foo bar rab oof
+ call_a_string_func
+ local 'lvar=original lvar'
+ pass_back_a_string lvar
+ eval 'echo in pass_back_a_string, original lvar is $lvar'
++ echo in pass_back_a_string, original lvar is original lvar
in pass_back_a_string, original lvar is original lvar
+ eval 'lvar='\''foo bar rab oof'\'''
++ lvar='foo bar rab oof'
+ echo 'lvar='\''foo bar rab oof'\'' locally'
lvar='foo bar rab oof' locally
+ echo 'lvar='\'''\'' globally'
lvar='' globally
All answers above ignore what has been stated in the man page of bash.
All variables declared inside a function will be shared with the calling environment.
All variables declared local will not be shared.
Example code
#!/bin/bash
f()
{
echo function starts
local WillNotExists="It still does!"
DoesNotExists="It still does!"
echo function ends
}
echo $DoesNotExists #Should print empty line
echo $WillNotExists #Should print empty line
f #Call the function
echo $DoesNotExists #Should print It still does!
echo $WillNotExists #Should print empty line
And output
$ sh -x ./x.sh
+ echo
+ echo
+ f
+ echo function starts
function starts
+ local 'WillNotExists=It still does!'
+ DoesNotExists='It still does!'
+ echo function ends
function ends
+ echo It still 'does!'
It still does!
+ echo
Also under pdksh and ksh this script does the same!
Since version 4.3 (February 2014), Bash has explicit support for reference variables, or name references (namerefs). They give the same indirection as "eval" with comparable performance, can make scripts clearer, and make it harder to forget the 'eval' and have to fix the resulting error:
declare [-aAfFgilnrtux] [-p] [name[=value] ...]
typeset [-aAfFgilnrtux] [-p] [name[=value] ...]
Declare variables and/or give them attributes
...
-n Give each name the nameref attribute, making it a name reference
to another variable. That other variable is defined by the value
of name. All references and assignments to name, except for
changing the -n attribute itself, are performed on the variable
referenced by name's value. The -n attribute cannot be applied to
array variables.
...
When used in a function, declare and typeset make each name local,
as with the local command, unless the -g option is supplied...
and also:
PARAMETERS
A variable can be assigned the nameref attribute using the -n option to the
declare or local builtin commands (see the descriptions of declare and local
below) to create a nameref, or a reference to another variable. This allows
variables to be manipulated indirectly. Whenever the nameref variable is
referenced or assigned to, the operation is actually performed on the variable
specified by the nameref variable's value. A nameref is commonly used within
shell functions to refer to a variable whose name is passed as an argument to
the function. For instance, if a variable name is passed to a shell function
as its first argument, running
declare -n ref=$1
inside the function creates a nameref variable ref whose value is the variable
name passed as the first argument. References and assignments to ref are
treated as references and assignments to the variable whose name was passed as
$1. If the control variable in a for loop has the nameref attribute, the list
of words can be a list of shell variables, and a name reference will be
established for each word in the list, in turn, when the loop is executed.
Array variables cannot be given the -n attribute. However, nameref variables
can reference array variables and subscripted array variables. Namerefs can be
unset using the -n option to the unset builtin. Otherwise, if unset is executed
with the name of a nameref variable as an argument, the variable referenced by
the nameref variable will be unset.
For example (EDIT 2: (thank you Ron) namespaced (prefixed) the function-internal variable name, to minimize external variable clashes, which should finally answer properly, the issue raised in the comments by Karsten):
# $1 : string; your variable to contain the return value
function return_a_string () {
declare -n ret=$1
local MYLIB_return_a_string_message="The date is "
MYLIB_return_a_string_message+=$(date)
ret=$MYLIB_return_a_string_message
}
and testing this example:
$ return_a_string result; echo $result
The date is 20160817
Note that the bash "declare" builtin, when used in a function, makes the declared variable "local" by default, and "-n" can also be used with "local".
I prefer to distinguish "important declare" variables from "boring local" variables, so using "declare" and "local" in this way acts as documentation.
EDIT 1 - (Response to comment below by Karsten) - I cannot add comments below any more, but Karsten's comment got me thinking, so I did the following test which WORKS FINE, AFAICT - Karsten if you read this, please provide an exact set of test steps from the command line, showing the problem you assume exists, because these following steps work just fine:
$ return_a_string ret; echo $ret
The date is 20170104
(I ran this just now, after pasting the above function into a bash term - as you can see, the result works just fine.)
Like bstpierre above, I use and recommend the use of explicitly naming output variables:
function some_func() # OUTVAR ARG1
{
local _outvar=$1
local _result # Use some naming convention to avoid OUTVARs to clash
... some processing ....
eval $_outvar=\$_result # Instead of just =$_result
}
Note the use of quoting the $. This will avoid interpreting content in $result as shell special characters. I have found that this is an order of magnitude faster than the result=$(some_func "arg1") idiom of capturing an echo. The speed difference seems even more notable using bash on MSYS where stdout capturing from function calls is almost catastrophic.
It's ok to send in a local variables since locals are dynamically scoped in bash:
function another_func() # ARG
{
local result
some_func result "$1"
echo result is $result
}
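If you would rather not use eval for the assignment, printf -v can store a value into a variable whose name is held in another variable. The following is a sketch of that alternative, not the method described above; the function bodies are hypothetical and reuse the names some_func and another_func only to mirror the example.

```shell
#!/usr/bin/env bash
# Hypothetical variant of some_func: $1 is the NAME of the variable to fill.
some_func() {
    local _outvar=$1
    local _result='$(not run) * "kept" as-is'
    # printf -v assigns the formatted text to the named variable;
    # the value is treated as data and never re-parsed by the shell.
    printf -v "$_outvar" '%s' "$_result"
}

another_func() {
    local result
    some_func result
    echo "result is $result"
}

another_func
```

The value arrives unchanged, special characters included. The name-clash caveat is the same as with eval: the output name must not match a local of the called function.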
You could also capture the function output:
#!/bin/bash
function getSomeString() {
echo "tadaa!"
}
return_var=$(getSomeString)
echo $return_var
# Alternative syntax:
return_var=`getSomeString`
echo $return_var
It looks odd, but is better than using global variables IMHO. Passing parameters works as usual; just put them inside the parentheses or backticks.
The most straightforward and robust solution is to use command substitution, as other people wrote:
assign()
{
local x
x="Test"
echo "$x"
}
x=$(assign) # This assigns string "Test" to x
The downside is performance as this requires a separate process.
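The separate process has a second consequence beyond speed: any variable the function changes while running inside $(...) is changed only in that subshell. A small sketch, with counter and next_id as hypothetical names:

```shell
#!/usr/bin/env bash
counter=0

# Hypothetical function that both "returns" a string and updates state.
next_id() {
    counter=$((counter + 1))
    echo "id-$counter"
}

first=$(next_id)                 # runs in a subshell
echo "$first counter=$counter"   # the parent's counter is still 0

next_id >/dev/null               # runs in the current shell
echo "counter=$counter"          # now the parent's counter is 1
```

So command substitution is the robust choice for returning a value, but a function that also needs to update shell state has to be called directly.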
The other technique suggested in this topic, namely passing the name of a variable to assign to as an argument, has side effects, and I wouldn't recommend it in its basic form. The problem is that you will probably need some variables in the function to calculate the return value, and it may happen that the name of the variable intended to store the return value will interfere with one of them:
assign()
{
local x
x="Test"
eval "$1=\$x"
}
assign y # This assigns string "Test" to y, as expected
assign x # This will NOT assign anything to x in this scope
# because the name "x" is declared as local inside the function
You might, of course, not declare internal variables of the function as local, but you really should always do it as otherwise you may, on the other hand, accidentally overwrite an unrelated variable from the parent scope if there is one with the same name.
One possible workaround is an explicit declaration of the passed variable as global:
assign()
{
local x
eval declare -g $1
x="Test"
eval "$1=\$x"
}
If name "x" is passed as an argument, the second row of the function body will overwrite the previous local declaration. But the names themselves might still interfere, so if you intend to use the value previously stored in the passed variable prior to write the return value there, be aware that you must copy it into another local variable at the very beginning; otherwise the result will be unpredictable!
Besides, declare -g only works in Bash 4.2 or later. More portable code might use an explicit conditional with the same effect:
assign()
{
if [[ $1 != x ]]; then
local x
fi
x="Test"
eval "$1=\$x"
}
Perhaps the most elegant solution is just to reserve one global name for function return values and use it consistently in every function you write.
As previously mentioned, the "correct" way to return a string from a function is with command substitution. In the event that the function also needs to output to console (as @Mani mentions above), create a temporary fd in the beginning of the function and redirect to console. Close the temporary fd before returning your string.
#!/bin/bash
# file: func_return_test.sh
returnString() {
exec 3>&1 >/dev/tty
local s=$1
s=${s:="some default string"}
echo "writing directly to console"
exec >&3 3>&- # restore the captured stdout, then close the temporary fd
echo "$s"
}
my_string=$(returnString "$*")
echo "my_string: [$my_string]"
executing script with no params produces...
# ./func_return_test.sh
writing directly to console
my_string: [some default string]
hope this helps people
-Andy
You could use a global variable:
declare globalvar='some string'
string ()
{
eval "$1='some other string'"
} # ---------- end of function string ----------
string globalvar
echo "'${globalvar}'"
This gives
'some other string'
To illustrate my comment on Andy's answer, with additional file descriptor manipulation to avoid use of /dev/tty:
#!/bin/bash
exec 3>&1
returnString() {
exec 4>&1 >&3
local s=$1
s=${s:="some default string"}
echo "writing to stdout"
echo "writing to stderr" >&2
exec >&4-
echo "$s"
}
my_string=$(returnString "$*")
echo "my_string: [$my_string]"
Still nasty, though.
The way you have it is the only way to do this without breaking scope. Bash doesn't have a concept of return types, just exit codes and file descriptors (stdin/out/err, etc)
Addressing Vicky Ronnen's heads-up, consider the following code:
function use_global
{
eval "$1='changed using a global var'"
}
function capture_output
{
echo "always changed"
}
function test_inside_a_func
{
local _myvar='local starting value'
echo "3. $_myvar"
use_global '_myvar'
echo "4. $_myvar"
_myvar=$( capture_output )
echo "5. $_myvar"
}
function only_difference
{
local _myvar='local starting value'
echo "7. $_myvar"
local use_global '_myvar'
echo "8. $_myvar"
local _myvar=$( capture_output )
echo "9. $_myvar"
}
declare myvar='global starting value'
echo "0. $myvar"
use_global 'myvar'
echo "1. $myvar"
myvar=$( capture_output )
echo "2. $myvar"
test_inside_a_func
echo "6. $_myvar" # this was local inside the above function
only_difference
will give
0. global starting value
1. changed using a global var
2. always changed
3. local starting value
4. changed using a global var
5. always changed
6.
7. local starting value
8. local starting value
9. always changed
The normal scenario is probably the syntax used in test_inside_a_func, so both methods work in the majority of cases. Capturing the output is the safer method, though: it works in every situation and mimics the return value of a function in other languages, as Vicky Ronnen correctly pointed out.
The options have been all enumerated, I think. Choosing one may come down to a matter of the best style for your particular application, and in that vein, I want to offer one particular style I've found useful. In bash, variables and functions are not in the same namespace. So, treating the variable of the same name as the value of the function is a convention that I find minimizes name clashes and enhances readability, if I apply it rigorously. An example from real life:
UnGetChar=
function GetChar() {
# assume failure
GetChar=
# if someone previously "ungot" a char
if ! [ -z "$UnGetChar" ]; then
GetChar="$UnGetChar"
UnGetChar=
return 0 # success
# else, if not at EOF
elif IFS= read -N1 GetChar ; then
return 0 # success
else
return 1 # EOF
fi
}
function UnGetChar(){
UnGetChar="$1"
}
And, an example of using such functions:
function GetToken() {
# assume failure
GetToken=
# if at end of file
if ! GetChar; then
return 1 # EOF
# if start of comment
elif [[ "$GetChar" == "#" ]]; then
while [[ "$GetChar" != $'\n' ]]; do
GetToken+="$GetChar"
GetChar
done
UnGetChar "$GetChar"
# if start of quoted string
elif [ "$GetChar" == '"' ]; then
# ... et cetera
As you can see, the return status is there for you to use when you need it, or ignore if you don't. The "returned" variable can likewise be used or ignored, but of course only after the function is invoked.
Of course, this is only a convention. You are free to fail to set the associated value before returning (hence my convention of always nulling it at the start of the function) or to trample its value by calling the function again (possibly indirectly). Still, it's a convention I find very useful if I find myself making heavy use of bash functions.
As opposed to the sentiment that this is a sign one should e.g. "move to perl", my philosophy is that conventions are always important for managing the complexity of any language whatsoever.
In my programs, by convention, this is what the pre-existing $REPLY variable is for, which read uses for that exact purpose.
function getSomeString {
REPLY="tadaa"
}
getSomeString
echo $REPLY
This echoes
tadaa
But to avoid conflicts, any other global variable will do.
declare result
function getSomeString {
result="tadaa"
}
getSomeString
echo $result
If that isn’t enough, I recommend Markarian451’s solution.
The key problem of any 'named output variable' scheme where the caller can pass in the variable name (whether using eval or declare -n) is inadvertent aliasing, i.e. name clashes: From an encapsulation point of view, it's awful to not be able to add or rename a local variable in a function without checking ALL the function's callers first to make sure they're not wanting to pass that same name as the output parameter. (Or in the other direction, I don't want to have to read the source of the function I'm calling just to make sure the output parameter I intend to use is not a local in that function.)
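A small sketch of that clash, using printf -v for the assignment. The names set_result and caller are hypothetical and exist only for this demonstration.

```bash
#!/usr/bin/env bash
# set_result is a hypothetical function that writes its result into the
# variable whose name is passed as $1.
set_result() {
    local result="computed"        # internal local that happens to be called "result"
    printf -v "$1" '%s' "$result"  # the name in $1 is resolved from the current scope outward
}

caller() {
    local out="" result=""
    set_result out      # fine: set_result has no local named "out"
    set_result result   # clash: the assignment lands on set_result's own local
    printf 'out=%s result=%s\n' "$out" "$result"
}

caller   # prints: out=computed result=
```

The second call silently loses the value, because the name "result" resolves to the local inside set_result rather than to the caller's variable.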
The only way around that is to use a single dedicated output variable like REPLY (as suggested by Evi1M4chine) or a convention like the one suggested by Ron Burk.
However, it's possible to have functions use a fixed output variable internally, and then add some sugar over the top to hide this fact from the caller, as I've done with the call function in the following example. Consider this a proof of concept, but the key points are
The function always assigns the return value to REPLY, and can also return an exit code as usual
From the perspective of the caller, the return value can be assigned to any variable (local or global) including REPLY (see the wrapper example). The exit code of the function is passed through, so using them in e.g. an if or while or similar constructs works as expected.
Syntactically the function call is still a single simple statement.
The reason this works is because the call function itself has no locals and uses no variables other than REPLY, avoiding any potential for name clashes. At the point where the caller-defined output variable name is assigned, we're effectively in the caller's scope (technically in the identical scope of the call function), rather than in the scope of the function being called.
#!/bin/bash
function call() { # var=func [args ...]
REPLY=; "${1#*=}" "${@:2}"; eval "${1%%=*}=\$REPLY; return $?"
}
function greet() {
case "$1" in
us) REPLY="hello";;
nz) REPLY="kia ora";;
*) return 123;;
esac
}
function wrapper() {
call REPLY=greet "$@"
}
function main() {
local a b c d
call a=greet us
echo "a='$a' ($?)"
call b=greet nz
echo "b='$b' ($?)"
call c=greet de
echo "c='$c' ($?)"
call d=wrapper us
echo "d='$d' ($?)"
}
main
Output:
a='hello' (0)
b='kia ora' (0)
c='' (123)
d='hello' (0)
You can echo a string, but catch it by piping (|) the function to something else.
You can do it with expr, though ShellCheck reports this usage as deprecated.
bash pattern to return both scalar and array value objects:
definition
url_parse() { # parse 'url' into: 'url_host', 'url_port', ...
local "$@" # inject caller 'url' argument in local scope
local url_host="..." url_path="..." # calculate 'url_*' components
declare -p ${!url_*} # return only 'url_*' object fields to the caller
}
invocation
main() { # invoke url parser and inject 'url_*' results in local scope
eval "$(url_parse url=http://host/path)" # parse 'url'
echo "host=$url_host path=$url_path" # use 'url_*' components
}
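A runnable sketch of the same declare -p pattern. Since the url_* calculations are elided above, this uses a hypothetical parse_pair function that splits a key=value string instead; the pair_* names are assumptions made for the example.

```bash
#!/usr/bin/env bash
# parse_pair is a hypothetical stand-in for url_parse: it splits "key=value"
# and reports the pieces as declarations the caller can eval.
parse_pair() {
    local input=$1
    local pair_key=${input%%=*}      # text before the first '='
    local pair_value=${input#*=}     # text after the first '='
    local -a pair_parts=("$pair_key" "$pair_value")
    # Emit declarations for a scalar, another scalar, and an array.
    declare -p pair_key pair_value pair_parts
}

main() {
    # Inside a function, the eval'ed declare statements create local variables.
    eval "$(parse_pair 'name=John Smith')"
    printf '%s|%s|%s\n' "$pair_key" "$pair_value" "${#pair_parts[@]}"
}

main   # prints: name|John Smith|2
```

Because declare -p quotes the values it prints, the space in "John Smith" survives the round trip through eval.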
Although there were a lot of good answers, they all did not work the way I wanted them to. So here is my solution with these key points:
Helping the forgetful programmer
At least I would struggle to always remember the error checking after something like this: var=$(myFunction)
Allows assigning values with newline chars \n
Some solutions do not allow for that as some forgot about the single quotes around the value to assign. Right way: eval "${returnVariable}='${value}'" or even better: see the next point below.
Using printf instead of eval
Just try passing something like myFunction "date && var2" to some of the supposed solutions here. eval will execute whatever is given to it. I only want to assign values, so I use printf -v "${returnVariable}" "%s" "${value}" instead.
Encapsulation and protection against variable name collision
If a different user, or at least someone with less knowledge about the function (likely me in a few months' time), is using myFunction, I do not want them to have to know that they must use a global return value name or that some variable names are forbidden. That is why I added a name check at the top of myFunction:
if [[ "${1}" = "returnVariable" ]]; then
echo "Cannot give the output to \"returnVariable\" as a variable with the same name is used in myFunction()!"
echo "If that is still what you want to do please do that outside of myFunction()!"
return 1
fi
Note that this check could also be put into a function of its own if you have to check a lot of variables.
If I still want to use the same name (here: returnVariable), I just create a buffer variable, give that to myFunction, and then copy the value to returnVariable.
So here it is:
myFunction():
myFunction() {
if [[ "${1}" = "returnVariable" ]]; then
echo "Cannot give the output to \"returnVariable\" as a variable with the same name is used in myFunction()!"
echo "If that is still what you want to do please do that outside of myFunction()!"
return 1
fi
if [[ "${1}" = "value" ]]; then
echo "Cannot give the output to \"value\" as a variable with the same name is used in myFunction()!"
echo "If that is still what you want to do please do that outside of myFunction()!"
return 1
fi
local returnVariable="${1}"
local value=$'===========\nHello World\n==========='
echo "setting the returnVariable now..."
printf -v "${returnVariable}" "%s" "${value}"
}
Test cases:
var1="I'm not greeting!"
myFunction var1
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "var1:\n%s\n" "${var1}"
# Output:
# setting the returnVariable now...
# myFunction(): SUCCESS
# var1:
# ===========
# Hello World
# ===========
returnVariable="I'm not greeting!"
myFunction returnVariable
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "returnVariable:\n%s\n" "${returnVariable}"
# Output
# Cannot give the output to "returnVariable" as a variable with the same name is used in myFunction()!
# If that is still what you want to do please do that outside of myFunction()!
# myFunction(): FAILURE
# returnVariable:
# I'm not greeting!
var2="I'm not greeting!"
myFunction "date && var2"
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "var2:\n%s\n" "${var2}"
# Output
# setting the returnVariable now...
# ...myFunction: line ..: printf: `date && var2': not a valid identifier
# myFunction(): FAILURE
# var2:
# I'm not greeting!
myFunction var3
[[ $? -eq 0 ]] && echo "myFunction(): SUCCESS" || echo "myFunction(): FAILURE"
printf "var3:\n%s\n" "${var3}"
# Output
# setting the returnVariable now...
# myFunction(): SUCCESS
# var3:
# ===========
# Hello World
# ===========
#Implement a generic return stack for functions:
STACK=()
push() {
STACK+=( "${1}" )
}
pop() {
export $1="${STACK[${#STACK[@]}-1]}"
unset 'STACK[${#STACK[@]}-1]';
}
#Usage:
my_func() {
push "Hello world!"
push "Hello world2!"
}
my_func ; pop MESSAGE2 ; pop MESSAGE1
echo ${MESSAGE1} ${MESSAGE2}
agt@agtsoft:~/temp$ cat ./fc
#!/bin/sh
fcall='function fcall { local res p=$1; shift; fname $*; eval "$p=$res"; }; fcall'
function f1 {
res=$[($1+$2)*2];
}
function f2 {
local a;
eval ${fcall//fname/f1} a 2 3;
echo f2:$a;
}
a=3;
f2;
echo after:a=$a, res=$res
agt@agtsoft:~/temp$ ./fc
f2:10
after:a=3, res=

How can I pass a complete argument list in bash while keeping multiword arguments together?

I am having some issues with word-splitting in bash variable expansion. I want to be able to store an argument list in a variable and run it, but any quoted multiword arguments aren't evaluating how I expected them to.
I'll explain my problem with an example. Let's say I had a function decho that printed each positional parameter on its own line:
#!/bin/bash -u
while [ $# -gt 0 ]; do
echo $1
shift
done
Ok, if I go decho a b "c d" I get:
[~]$ decho a b "c d"
a
b
c d
Which is what I expect and want. But on the other hand if I get the arguments list from a variable I get this:
[~]$ args='a b "c d"'
[~]$ decho $args
a
b
"c
d"
Which is not what I want. I can go:
[~]$ echo decho $args | bash
a
b
c d
But that seems a little clunky. Is there a better way to make the expansion of $args in decho $args be word-split the way I expected?
You can use:
eval decho $args
You can move the eval inside the script:
#!/bin/bash -u
eval set -- $*
for i;
do
echo $i;
done
Now you can do:
$ args='a b "c d"'
$ decho $args
a
b
c d
but you'll have to quote the arguments if you pass them on the CL:
$ decho 'a b "c d"'
a
b
c d
Have you tried:
i=1
for arg in "$@"
do
echo "arg $i:$arg:"
let "i+=1"
done
Should yield something like:
arg 1: a
arg 2: c d
in your case.
Straight from memory, no guarantee :-)
hmmm.. eval decho $args works too:
[~]$ eval decho $args
a
b
c d
And I may be able to do something with bash arrays using "${array[@]}" (which works like "$@"), but then I would have to write code to load the array, which would be a pain.
It is fundamentally flawed to attempt to pass an argument list stored in a variable, to a command.
Presumably, if you have code somewhere to create a variable containing the intended args. for a command, then you can change it to instead store the args into an array variable:
decho_argv=(a b 'c d') # <-- easy!
Then, rather than changing the command "decho" to accommodate the args taken from a plain variable (which will break its ability to handle normal args) you can do:
decho "${decho_argv[@]}" # USE DOUBLE QUOTES!!!
However, if you are in the situation where you take arbitrary input that is expected to be string fields corresponding to the intended positional arguments of a command, and you want to pass those arguments to the command, then you should read the data into an array instead of using a plain variable.
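A minimal sketch of reading such input into an array, assuming the input arrives as one intended argument per line. The decho function here is a stand-in for the question's script, and the sample input is invented for the example.

```bash
#!/usr/bin/env bash
# Stand-in for the question's decho: print each argument on its own line.
decho() {
    printf '%s\n' "$@"
}

# Assumed input format: one intended argument per line. The last field
# contains a shell metacharacter to show that it is never interpreted.
input=$'a\nb\nc d\nx; echo oops'

decho_argv=()
while IFS= read -r field; do
    decho_argv+=("$field")     # each line becomes exactly one array element
done <<< "$input"

decho "${decho_argv[@]}"
```

Each line stays a single argument, and the text after the semicolon is passed along as plain data because nothing is ever handed to eval.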
Note that suggestions to use eval to set the positional parameters from the contents of an ordinary variable are extremely dangerous.
Exposing the contents of a variable to quote removal and word splitting on the command line leaves no way to stop shell metacharacters in that string from causing havoc.
E.g., imagine in the following example if the word "man" was replaced with the two words "rm" and "-rf" and the final arg word was "*":
Do Not Do This:
> args='arg1 ; man arg4'
> eval set -- $args
No manual entry for arg4
> eval set -- "$args" # This is no better
No manual entry for arg4
> eval "set -- $args" # Still hopeless
No manual entry for arg4
> eval "set -- '$args'" # making it safe also makes it not work at all!
> echo "$1"
arg1 ; man arg4

Resources