Does the `-r` flag with `read` cause or NOT cause character escape? - linux

I am very confused about the read -r flag, or the meaning of "escape" in this context. The manual says regarding this flag:
-r = do not allow backslashes to escape any characters
But this seems to me to be the OPPOSITE of what the flag does. For example, running:
read -d '' VAR <<EOF
This is the \t first line
This is the second line
EOF
echo $VAR
... gives:
This is the t first line
This is the second line
But that seems to me as though the 't' character has NOT been escaped by the backslash. Conversely, when I add the -r flag, I get the following:
This is the first line
This is the second line
... where it appears to me as though the 't' character HAS been escaped due to the -r flag. So am I misunderstanding the meaning of the word "escape", or misunderstanding something else going on here?

I strongly suspect your confusion is caused by the manner in which you are determining the final content of the string. When backslashes are treated as escape characters (e.g., when you do not use -r), \t is treated the same as a t. When they are not, it is treated as the literal two characters \t. Consider:
$ cat a.sh
#!/bin/sh
read a << 'EOF'
a: Without -r: foo\tbar
EOF
read -r b << 'EOF'
b: With -r : foo\tbar
EOF
printf "a = %s\n" "$a"
printf "b = %s\n" "$b"
printf "printf interprets the string: $a\n"
printf "printf interprets the string: $b\n"
$ ./a.sh
a = a: Without -r: footbar
b = b: With -r : foo\tbar
printf interprets the string: a: Without -r: footbar
printf interprets the string: b: With -r : foo bar

Thanks to everyone for their input. OK, this is one of those pesky things in bash that is clearer to me now, but had me confused initially. Here's my summary understanding.
There are, in a sense, three strings at play here:
The string of characters input to the heredoc,
The string of characters output from the heredoc and input to read,
The string of characters output from read and input to VAR
The string of characters being fed into the heredoc is, of course, whatever you type between the delimiters. But the string of characters output by the heredoc will depend on its own rules (viz. on whether the delimiter is quoted or not).
Next, the string of characters output by the heredoc goes into read, but the string that read outputs (and saves into VAR) depends on the presence or absence of the -r flag. If the input to read contains backslashes, then read without -r first processes each backslash-prefixed sequence as an escape (removing the backslash and keeping the following character literally), thus modifying the string before saving it into VAR.
But read -r will not attempt to interpret the backslashes, leaving the input text "as is" when outputting to VAR. Hence, the original \t is preserved with read -r and thus interpreted as a tab in the final echo $VAR.
My confusion primarily lay in my lack of discernment of the three separate strings of characters at play here (not echo vs printf).
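The three stages described above can be separated in a small sketch. It assumes a POSIX-style shell; the variable names cooked and raw are made up for illustration. The here-document delimiter is quoted so that the heredoc passes the backslash through untouched, and printf '%s' is used for display so that nothing reinterprets the backslash at output time.

```shell
#!/bin/sh
# Stage 1 -> 2: a quoted delimiter ('EOF') makes the heredoc deliver its text verbatim.
# Stage 2 -> 3: read decides what to do with the backslash.

# Without -r, read treats the backslash as an escape and drops it.
read cooked <<'EOF'
foo\tbar
EOF

# With -r, read stores the backslash and the t unchanged.
read -r raw <<'EOF'
foo\tbar
EOF

# printf '%s' prints the stored characters without interpreting escapes.
printf 'without -r: %s\n' "$cooked"   # footbar
printf 'with -r:    %s\n' "$raw"      # foo\tbar
```

Only the read step differs between the two variables, so any difference in what is stored comes from -r alone.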

The escaping that a backslash does as input to read, is to prevent the next character from being treated as a separator:
$ read -r a b <<< 'foo\ bar'; printf "<%s> <%s>\n" "$a" "$b"
<foo\> <bar>
$ read a b <<< 'foo\ bar'; printf "<%s> <%s>\n" "$a" "$b"
<foo bar> <>
Without -r, backslashes are removed as part of the escape processing. With -r, they are kept as-is.
Having the \t turn into a hard tab is due to echo: some implementations do that by default, some don't.
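The same separator-escaping effect can be seen with a non-whitespace separator. This is a minimal sketch, assuming a POSIX-style shell; the sample input and the variable names a, b, c and d are invented for illustration, and IFS is set to a colon only for the duration of each read.

```shell
#!/bin/sh
# The input line is: one\:two:three  (quoted delimiter keeps it verbatim)

# Without -r: the backslash protects the first colon from splitting
# and is then removed.
IFS=: read a b <<'EOF'
one\:two:three
EOF

# With -r: the backslash is ordinary data, so the first colon splits.
IFS=: read -r c d <<'EOF'
one\:two:three
EOF

printf '<%s> <%s>\n' "$a" "$b"   # <one:two> <three>
printf '<%s> <%s>\n' "$c" "$d"   # <one\> <two:three>
```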

Related

newlines in a bash variable (grep output) [duplicate]

Here is a series of cases where echo $var can show a different value than what was just assigned. This happens regardless of whether the assigned value was "double quoted", 'single quoted' or unquoted.
How do I get the shell to set my variable correctly?
Asterisks
The expected output is /* Foobar is free software */, but instead I get a list of filenames:
$ var="/* Foobar is free software */"
$ echo $var
/bin /boot /dev /etc /home /initrd.img /lib /lib64 /media /mnt /opt /proc ...
Square brackets
The expected value is [a-z], but sometimes I get a single letter instead!
$ var=[a-z]
$ echo $var
c
Line feeds (newlines)
The expected value is a list of separate lines, but instead all the values are on one line!
$ cat file
foo
bar
baz
$ var=$(cat file)
$ echo $var
foo bar baz
Multiple spaces
I expected a carefully aligned table header, but instead multiple spaces either disappear or are collapsed into one!
$ var="       title     |    count"
$ echo $var
title | count
Tabs
I expected two tab separated values, but instead I get two space separated values!
$ var=$'key\tvalue'
$ echo $var
key value
In all of the cases above, the variable is correctly set, but not correctly read! The right way is to use double quotes when referencing:
echo "$var"
This gives the expected value in all the examples given. Always quote variable references!
Why?
When a variable is unquoted, it will:
1. Undergo field splitting where the value is split into multiple words on whitespace (by default):
Before: /* Foobar is free software */
After: /*, Foobar, is, free, software, */
2. Each of these words will undergo pathname expansion, where patterns are expanded into matching files:
Before: /*
After: /bin, /boot, /dev, /etc, /home, ...
3. Finally, all the arguments are passed to echo, which writes them out separated by single spaces, giving
/bin /boot /dev /etc /home Foobar is free software Desktop/ Downloads/
instead of the variable's value.
When the variable is quoted it will:
1. Be substituted for its value.
2. There is no step 2.
This is why you should always quote all variable references, unless you specifically require word splitting and pathname expansion. Tools like shellcheck are there to help, and will warn about missing quotes in all the cases above.
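One way to see field splitting directly is to count the arguments a command receives. The sketch below assumes a POSIX-style shell with the default IFS; count_args is a hypothetical helper written for this illustration, and the sample value deliberately contains no glob characters so that only splitting is at play.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
    printf '%s\n' "$#"
}

var='foo   bar baz'

unquoted=$(count_args $var)     # split on whitespace into three words
quoted=$(count_args "$var")     # passed as one word, spaces intact

printf 'unquoted: %s argument(s)\n' "$unquoted"   # 3
printf 'quoted:   %s argument(s)\n' "$quoted"     # 1
```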
You may want to know why this is happening. In addition to the great explanation by that other guy, see Why does my shell script choke on whitespace or other special characters?, written by Gilles on Unix & Linux:
Why do I need to write "$foo"? What happens without the quotes?
$foo does not mean “take the value of the variable foo”. It means
something much more complex:
First, take the value of the variable.
Field splitting: treat that value as a whitespace-separated list of fields, and build the resulting list. For example, if the variable
contains foo * bar then the result of this step is the 3-element
list foo, *, bar.
Filename generation: treat each field as a glob, i.e. as a wildcard pattern, and replace it by the list of file names that match this
pattern. If the pattern doesn't match any files, it is left
unmodified. In our example, this results in the list containing foo,
followed by the list of files in the current directory, and finally
bar. If the current directory is empty, the result is foo, *,
bar.
Note that the result is a list of strings. There are two contexts in
shell syntax: list context and string context. Field splitting and
filename generation only happen in list context, but that's most of
the time. Double quotes delimit a string context: the whole
double-quoted string is a single string, not to be split. (Exception:
"$@" to expand to the list of positional parameters, e.g. "$@" is
equivalent to "$1" "$2" "$3" if there are three positional
parameters. See What is the difference between $* and $@?)
The same happens to command substitution with $(foo) or with
`foo`. On a side note, don't use `foo`: its quoting rules are
weird and non-portable, and all modern shells support $(foo) which
is absolutely equivalent except for having intuitive quoting rules.
The output of arithmetic substitution also undergoes the same
expansions, but that isn't normally a concern as it only contains
non-expandable characters (assuming IFS doesn't contain digits or
-).
See When is double-quoting necessary? for more details about the
cases when you can leave out the quotes.
Unless you mean for all this rigmarole to happen, just remember to
always use double quotes around variable and command substitutions. Do
take care: leaving out the quotes can lead not just to errors but to
security holes.
In addition to other issues caused by failing to quote, -n and -e can be consumed by echo as arguments. (Only the former is legal per the POSIX spec for echo, but several common implementations violate the spec and consume -e as well).
To avoid this, use printf instead of echo when details matter.
Thus:
$ vars="-e -n -a"
$ echo $vars # breaks because -e and -n can be treated as arguments to echo
-a
$ echo "$vars"
-e -n -a
However, correct quoting won't always save you when using echo:
$ vars="-n"
$ echo "$vars"
$ ## not even an empty line was printed
...whereas it will save you with printf:
$ vars="-n"
$ printf '%s\n' "$vars"
-n
Use double quotes to get the exact value, like this:
echo "${var}"
and your value will be read correctly.
The output of echo $var depends heavily on the value of the IFS variable. By default it contains the space, tab, and newline characters:
[ks@localhost ~]$ echo -n "$IFS" | cat -vte
^I$
This means that when the shell performs field splitting (or word splitting), it uses all of these characters as word separators. This is what happens when a variable is referenced without double quotes in order to echo it ($var), and so the expected output is altered.
One way to prevent word splitting (besides using double quotes) is to set IFS to null. See http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_05 :
If the value of IFS is null, no field splitting shall be performed.
Setting to null means setting to empty value:
IFS=
Test:
[ks@localhost ~]$ echo -n "$IFS" | cat -vte
^I$
[ks@localhost ~]$ var=$'key\nvalue'
[ks@localhost ~]$ echo $var
key value
[ks@localhost ~]$ IFS=
[ks@localhost ~]$ echo $var
key
value
[ks@localhost ~]$
The answer from ks1322 helped me to identify the issue while using docker-compose exec:
If you omit the -T flag, docker-compose exec adds a special character that breaks the output: we see b instead of 1b:
$ test=$(/usr/local/bin/docker-compose exec db bash -c "echo 1")
$ echo "${test}b"
b
$ echo "${test}" | cat -vte
1^M$
With -T flag, docker-compose exec works as expected:
$ test=$(/usr/local/bin/docker-compose exec -T db bash -c "echo 1")
$ echo "${test}b"
1b
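The ^M in the dump above is a carriage return that survives command substitution, since only trailing newlines are stripped. When the producing command cannot be changed, one possible workaround is to delete carriage returns before capturing. This is only a sketch: fake_exec is a hypothetical stand-in that imitates the CR LF output and does not call docker-compose.

```shell
#!/bin/sh
# fake_exec is a hypothetical stand-in that prints "1" followed by CR LF,
# imitating the 1^M$ output shown above.
fake_exec() {
    printf '1\r\n'
}

raw=$(fake_exec)                    # substitution strips the \n but keeps the \r
clean=$(fake_exec | tr -d '\r')     # delete carriage returns before capturing

printf 'raw length:   %s\n' "${#raw}"     # 2 (the digit plus the carriage return)
printf 'clean length: %s\n' "${#clean}"   # 1
printf '%s\n' "${clean}b"                 # 1b
```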
In addition to quoting the variable, one could also translate its output with tr, converting spaces to newlines.
$ echo $var | tr " " "\n"
foo
bar
baz
Although this is a little more convoluted, it adds flexibility to the output, as you can substitute any character as the separator between the values.

How to concatenate strings with escape characters in bash? [duplicate]

This
STR="Hello\nWorld"
echo $STR
produces as output
Hello\nWorld
instead of
Hello
World
What should I do to have a newline in a string?
Note: This question is not about echo.
I'm aware of echo -e, but I'm looking for a solution that allows passing a string (which includes a newline) as an argument to other commands that do not have a similar option to interpret \n's as newlines.
If you're using Bash, you can use backslash-escapes inside of a specially-quoted $'string'. For example, adding \n:
STR=$'Hello\nWorld'
echo "$STR" # quotes are required here!
Prints:
Hello
World
If you're using pretty much any other shell, just insert the newline as-is in the string:
STR='Hello
World'
Bash recognizes a number of other backslash escape sequences in the $'' string. Here is an excerpt from the Bash manual page:
Words of the form $'string' are treated specially. The word expands to
string, with backslash-escaped characters replaced as specified by the
ANSI C standard. Backslash escape sequences, if present, are decoded
as follows:
\a alert (bell)
\b backspace
\e
\E an escape character
\f form feed
\n new line
\r carriage return
\t horizontal tab
\v vertical tab
\\ backslash
\' single quote
\" double quote
\nnn the eight-bit character whose value is the octal value
nnn (one to three digits)
\xHH the eight-bit character whose value is the hexadecimal
value HH (one or two hex digits)
\cx a control-x character
The expanded result is single-quoted, as if the dollar sign had not
been present.
A double-quoted string preceded by a dollar sign ($"string") will cause
the string to be translated according to the current locale. If the
current locale is C or POSIX, the dollar sign is ignored. If the
string is translated and replaced, the replacement is double-quoted.
Echo is so nineties and so fraught with perils that its use should result in core dumps no less than 4GB. Seriously, echo's problems were the reason why the Unix Standardization process finally invented the printf utility, doing away with all the problems.
So to get a newline in a string, there are two ways:
# 1) Literal newline in an assignment.
FOO="hello
world"
# 2) Command substitution.
BAR=$(printf "hello\nworld\n") # Alternative; note: final newline is deleted
printf '<%s>\n' "$FOO"
printf '<%s>\n' "$BAR"
There! No SYSV vs BSD echo madness, everything gets neatly printed and fully portable support for C escape sequences. Everybody please use printf now for all your output needs and never look back.
What I did based on the other answers was
NEWLINE=$'\n'
my_var="__between eggs and bacon__"
echo "spam${NEWLINE}eggs${my_var}bacon${NEWLINE}knight"
# which outputs:
spam
eggs__between eggs and bacon__bacon
knight
I find the -e flag elegant and straightforward
bash$ STR="Hello\nWorld"
bash$ echo -e $STR
Hello
World
If the string is the output of another command, I just use quotes
indexes_diff=$(git diff index.yaml)
echo "$indexes_diff"
The problem isn't with the shell. The problem is actually with the echo command itself, and the lack of double quotes around the variable interpolation. You can try using echo -e, but that isn't supported on all platforms, which is one of the reasons printf is now recommended for portability.
You can also try and insert the newline directly into your shell script (if a script is what you're writing) so it looks like...
#!/bin/sh
echo "Hello
World"
#EOF
or equivalently
#!/bin/sh
string="Hello
World"
echo "$string" # note double quotes!
The only simple alternative is to actually type a new line in the variable:
$ STR='new
line'
$ printf '%s' "$STR"
new
line
Yes, that means writing Enter where needed in the code.
There are several equivalents to a new line character.
\n ### A common way to represent a new line character.
\012 ### Octal value of a new line character.
\x0A ### Hexadecimal value of a new line character.
But all those require "an interpretation" by some tool (POSIX printf):
echo -e "new\nline" ### on POSIX echo, `-e` is not required.
printf 'new\nline' ### Understood by POSIX printf.
printf 'new\012line' ### Valid in POSIX printf.
printf 'new\x0Aline'
printf '%b' 'new\0012line' ### Valid in POSIX printf.
And therefore, the tool is required to build a string with a new-line:
$ STR="$(printf 'new\nline')"
$ printf '%s' "$STR"
new
line
In some shells, the sequence $' is a special shell expansion.
Known to work in ksh93, bash and zsh:
$ STR=$'new\nline'
Of course, more complex solutions are also possible:
$ echo '6e65770a6c696e650a' | xxd -p -r
new
line
Or
$ echo "new line" | sed 's/ \+/\n/g'
new
line
Put a $ right before single quotation marks, as in $'...\n...' below; double quotation marks, however, don't work.
$ echo $'Hello\nWorld'
Hello
World
$ echo $"Hello\nWorld"
Hello\nWorld
Disclaimer: I first wrote this and then stumbled upon this question. I thought this solution wasn't yet posted, and saw that tlwhitec did post a similar answer. Still I'm posting this because I hope it's a useful and thorough explanation.
Short answer:
This seems to be quite a portable solution, as it works in quite a few shells (see comment).
This way you can get a real newline into a variable.
The benefit of this solution is that you don't have to use newlines in your source code, so you can indent
your code any way you want, and the solution still works. This makes it robust. It's also portable.
# Robust way to put a real newline in a variable (bash, dash, ksh, zsh; indentation-resistant).
nl="$(printf '\nq')"
nl=${nl%q}
Longer answer:
Explanation of the above solution:
The newline would normally be lost due to command substitution, but to prevent that, we add a 'q' and remove it afterwards. (The reason for the double quotes is explained further below.)
We can prove that the variable contains an actual newline character (0x0A):
printf '%s' "$nl" | hexdump -C
00000000 0a |.|
00000001
(Note that the '%s' was needed, otherwise printf will translate a literal '\n' string into an actual 0x0A character, meaning we would prove nothing.)
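The stripping of trailing newlines, and the effect of the sentinel character, can be checked by comparing string lengths. This is a minimal sketch assuming a POSIX-style shell; the names stripped and kept are invented for this illustration.

```shell
#!/bin/sh
# Command substitution removes every trailing newline from the captured output.
stripped=$(printf 'x\n\n\n')

# A sentinel character after the newlines protects them; it is removed afterwards.
kept=$(printf 'x\n\n\nq')
kept=${kept%q}

printf 'stripped length: %s\n' "${#stripped}"   # 1
printf 'kept length:     %s\n' "${#kept}"       # 4
```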
Of course, instead of the solution proposed in this answer, one could use this as well (but...):
nl='
'
... but that's less robust and can be easily damaged by accidentally indenting the code, or by forgetting to outdent it afterwards, which makes it inconvenient to use in (indented) functions, whereas the earlier solution is robust.
Now, as for the double quotes:
The reason for the double quotes " surrounding the command substitution as in nl="$(printf '\nq')" is that you can then even prefix the variable assignment with the local keyword or builtin (such as in functions), and it will still work on all shells, whereas otherwise the dash shell would have trouble, in the sense that dash would otherwise lose the 'q' and you'd end up with an empty 'nl' variable (again, due to command substitution).
That issue is better illustrated with another example:
dash_trouble_example() {
e=$(echo hello world) # Not using 'local'.
echo "$e" # Fine. Outputs 'hello world' in all shells.
local e=$(echo hello world) # But now, when using 'local' without double quotes ...:
echo "$e" # ... oops, outputs just 'hello' in dash,
# ... but 'hello world' in bash and zsh.
local f="$(echo hello world)" # Finally, using 'local' and surrounding with double quotes.
echo "$f" # Solved. Outputs 'hello world' in dash, zsh, and bash.
# So back to our newline example, if we want to use 'local', we need
# double quotes to surround the command substitution:
# (If we didn't use double quotes here, then in dash the 'nl' variable
# would be empty.)
local nl="$(printf '\nq')"
nl=${nl%q}
}
Practical example of the above solution:
# Parsing lines in a for loop by setting IFS to a real newline character:
nl="$(printf '\nq')"
nl=${nl%q}
IFS=$nl
for i in $(printf '%b' 'this is line 1\nthis is line 2'); do
echo "i=$i"
done
# Desired output:
# i=this is line 1
# i=this is line 2
# Exercise:
# Try running this example without the IFS=$nl assignment, and predict the outcome.
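For line-by-line processing, a commonly used alternative to changing IFS for a for loop is a while loop around read -r. It leaves the global IFS alone and does not subject the lines to pathname expansion. This is a minimal sketch, assuming a POSIX-style shell; the counter variable and sample text are made up for illustration.

```shell
#!/bin/sh
count=0
# IFS= (for this read only) keeps leading and trailing whitespace;
# -r keeps backslashes as ordinary data.
while IFS= read -r line; do
    count=$((count + 1))
    printf 'i=%s\n' "$line"
done <<'EOF'
this is line 1
this is line 2
EOF
printf 'lines read: %s\n' "$count"   # 2
```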
I'm no bash expert, but this one worked for me:
STR1="Hello"
STR2="World"
NEWSTR=$(cat << EOF
$STR1
$STR2
EOF
)
echo "$NEWSTR"
I found this easier for formatting the text.
Those picky ones that need just the newline and despise the multiline code that breaks indentation, could do:
IFS="$(printf '\nx')"
IFS="${IFS%x}"
Bash (and likely other shells) gobble all the trailing newlines after command substitution, so you need to end the printf string with a non-newline character and delete it afterwards. This can also easily become a oneliner.
IFS="$(printf '\nx')" IFS="${IFS%x}"
I know this is two actions instead of one, but my indentation and portability OCD is at peace now :) I originally developed this to be able to split newline-only separated output and I ended up using a modification that uses \r as the terminating character. That makes the newline splitting work even for the dos output ending with \r\n.
IFS="$(printf '\n\r')"
On my system (Ubuntu 17.10) your example just works as desired, both when typed from the command line (into sh) and when executed as a sh script:
[bash]§ sh
$ STR="Hello\nWorld"
$ echo $STR
Hello
World
$ exit
[bash]§ echo "STR=\"Hello\nWorld\"
> echo \$STR" > test-str.sh
[bash]§ cat test-str.sh
STR="Hello\nWorld"
echo $STR
[bash]§ sh test-str.sh
Hello
World
I guess this answers your question: it just works. (I have not tried to figure out details such as at what moment exactly the substitution of the newline character for \n happens in sh).
However, I noticed that this same script would behave differently when executed with bash and would print out Hello\nWorld instead:
[bash]§ bash test-str.sh
Hello\nWorld
I've managed to get the desired output with bash as follows:
[bash]§ STR="Hello
> World"
[bash]§ echo "$STR"
Note the double quotes around $STR. This behaves identically if saved and run as a bash script.
The following also gives the desired output:
[bash]§ echo "Hello
> World"
I wasn't really happy with any of the options here. This is what worked for me.
str=$(printf "%s" "first line")
str=$(printf "$str\n%s" "another line")
str=$(printf "$str\n%s" "and another line")
This isn't ideal, but I had written a lot of code and defined strings in a way similar to the method used in the question. The accepted solution required me to refactor a lot of the code so instead, I replaced every \n with "$'\n'" and this worked for me.
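One caveat with the printf snippet above: $str is placed inside the printf format string, so any % or backslash already in the accumulated text would be interpreted as a format directive or escape. Below is a minimal sketch of a variant that passes the accumulated text as an argument instead; the sample strings are made up to include such characters.

```shell
#!/bin/sh
str='100% done'
# The format string is fixed; both pieces are plain %s arguments,
# so the % and the backslash in the data are never interpreted by printf.
str=$(printf '%s\n%s' "$str" 'path C:\temp')
str=$(printf '%s\n%s' "$str" 'last line')

printf '%s\n' "$str"
```

The result is three lines, with the percent sign and the backslash preserved as typed.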

How to make printf in bash script with a variable which comes from txt text with NEW LINES [duplicate]

Here are a series of cases where echo $var can show a different value than what was just assigned. This happens regardless of whether the assigned value was "double quoted", 'single quoted' or unquoted.
How do I get the shell to set my variable correctly?
Asterisks
The expected output is /* Foobar is free software */, but instead I get a list of filenames:
$ var="/* Foobar is free software */"
$ echo $var
/bin /boot /dev /etc /home /initrd.img /lib /lib64 /media /mnt /opt /proc ...
Square brackets
The expected value is [a-z], but sometimes I get a single letter instead!
$ var=[a-z]
$ echo $var
c
Line feeds (newlines)
The expected value is a list of separate lines, but instead all the values are on one line!
$ cat file
foo
bar
baz
$ var=$(cat file)
$ echo $var
foo bar baz
Multiple spaces
I expected a carefully aligned table header, but instead multiple spaces either disappear or are collapsed into one!
$ var="       title     |    count"
$ echo $var
title | count
Tabs
I expected two tab separated values, but instead I get two space separated values!
$ var=$'key\tvalue'
$ echo $var
key value
In all of the cases above, the variable is correctly set, but not correctly read! The right way is to use double quotes when referencing:
echo "$var"
This gives the expected value in all the examples given. Always quote variable references!
Why?
When a variable is unquoted, it will:
1. Undergo field splitting where the value is split into multiple words on whitespace (by default):
Before: /* Foobar is free software */
After: /*, Foobar, is, free, software, */
2. Each of these words will undergo pathname expansion, where patterns are expanded into matching files:
Before: /*
After: /bin, /boot, /dev, /etc, /home, ...
3. Finally, all the arguments are passed to echo, which writes them out separated by single spaces, giving
/bin /boot /dev /etc /home Foobar is free software Desktop/ Downloads/
instead of the variable's value.
When the variable is quoted it will:
1. Be substituted for its value.
2. There is no step 2.
This is why you should always quote all variable references, unless you specifically require word splitting and pathname expansion. Tools like shellcheck are there to help, and will warn about missing quotes in all the cases above.
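The two expansion steps can be observed by counting how many arguments a command actually receives. A minimal sketch, where count_args and the scratch directory are hypothetical and exist only for this demonstration:

```bash
#!/usr/bin/env bash
# Sketch: count the arguments a command receives with and without quotes.
# count_args is a hypothetical helper, not a standard command.
count_args() { echo "$#"; }

scratch=$(mktemp -d)                       # empty directory with known contents
touch "$scratch/one" "$scratch/two" "$scratch/three"

var="foo * bar"

# Unquoted: split into foo, *, bar; then * is replaced by the three file names.
unquoted=$(cd "$scratch" && count_args $var)

# Quoted: one argument, value untouched.
quoted=$(cd "$scratch" && count_args "$var")

echo "unquoted: $unquoted arguments, quoted: $quoted argument"
rm -rf "$scratch"
```

With three files in the directory the unquoted call sees five arguments, while the quoted call sees exactly one.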
You may want to know why this is happening. In addition to the great explanation by that other guy, see Why does my shell script choke on whitespace or other special characters?, written by Gilles on Unix & Linux:
Why do I need to write "$foo"? What happens without the quotes?
$foo does not mean “take the value of the variable foo”. It means
something much more complex:
First, take the value of the variable.
Field splitting: treat that value as a whitespace-separated list of fields, and build the resulting list. For example, if the variable
contains foo * bar then the result of this step is the 3-element
list foo, *, bar.
Filename generation: treat each field as a glob, i.e. as a wildcard pattern, and replace it by the list of file names that match this
pattern. If the pattern doesn't match any files, it is left
unmodified. In our example, this results in the list containing foo,
followed by the list of files in the current directory, and finally
bar. If the current directory is empty, the result is foo, *,
bar.
Note that the result is a list of strings. There are two contexts in
shell syntax: list context and string context. Field splitting and
filename generation only happen in list context, but that's most of
the time. Double quotes delimit a string context: the whole
double-quoted string is a single string, not to be split. (Exception:
"$@" to expand to the list of positional parameters, e.g. "$@" is
equivalent to "$1" "$2" "$3" if there are three positional
parameters. See What is the difference between $* and $@?)
The same happens to command substitution with $(foo) or with
`foo`. On a side note, don't use `foo`: its quoting rules are
weird and non-portable, and all modern shells support $(foo) which
is absolutely equivalent except for having intuitive quoting rules.
The output of arithmetic substitution also undergoes the same
expansions, but that isn't normally a concern as it only contains
non-expandable characters (assuming IFS doesn't contain digits or
-).
See When is double-quoting necessary? for more details about the
cases when you can leave out the quotes.
Unless you mean for all this rigmarole to happen, just remember to
always use double quotes around variable and command substitutions. Do
take care: leaving out the quotes can lead not just to errors but to
security
holes.
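The exception for the positional parameters mentioned in the quote can be checked directly. A small sketch (count_args is again a hypothetical helper) comparing the quoted and unquoted forms:

```bash
#!/usr/bin/env bash
# Sketch: how the positional parameters expand under different quoting.
# count_args is a hypothetical helper used only for this demonstration.
count_args() { echo "$#"; }

set -- "first arg" "second" "third arg"   # three positional parameters

at_quoted=$(count_args "$@")      # each parameter stays one word
star_quoted=$(count_args "$*")    # all parameters joined into one string
at_unquoted=$(count_args $@)      # every parameter is split on whitespace again

echo "\"\$@\": $at_quoted  \"\$*\": $star_quoted  \$@: $at_unquoted"
```

Here the quoted list form yields 3 arguments, the quoted string form yields 1, and the unquoted form yields 5 because the embedded spaces are split again.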
In addition to other issues caused by failing to quote, -n and -e can be consumed by echo as arguments. (Only the former is legal per the POSIX spec for echo, but several common implementations violate the spec and consume -e as well).
To avoid this, use printf instead of echo when details matter.
Thus:
$ vars="-e -n -a"
$ echo $vars # breaks because -e and -n can be treated as arguments to echo
-a
$ echo "$vars"
-e -n -a
However, correct quoting won't always save you when using echo:
$ vars="-n"
$ echo "$vars"
$ ## not even an empty line was printed
...whereas it will save you with printf:
$ vars="-n"
$ printf '%s\n' "$vars"
-n
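If a script prints a lot of variable content, the printf form can be wrapped once and reused. A sketch, where println is a made-up name rather than an existing command:

```bash
#!/usr/bin/env bash
# Sketch: a hypothetical println helper that prints its arguments literally,
# followed by a newline, without treating any of them as options or escapes.
println() { printf '%s\n' "$*"; }

vars="-n"
println "$vars"            # the value itself is printed
println "-e" 'a\tb'        # neither -e nor \t is interpreted
```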
Use double quotes to get the exact value, like this:
echo "${var}"
and it will read your value correctly.
The output of echo $var depends heavily on the value of the IFS variable. By default it contains space, tab, and newline characters:
[ks@localhost ~]$ echo -n "$IFS" | cat -vte
 ^I$
This means that when the shell performs field splitting (or word splitting) it uses all these characters as word separators. This is what happens when a variable is referenced without double quotes in order to echo it ($var), and so the expected output is altered.
One way to prevent word splitting (besides using double quotes) is to set IFS to null. See http://pubs.opengroup.org/onlinepubs/009695399/utilities/xcu_chap02.html#tag_02_06_05 :
If the value of IFS is null, no field splitting shall be performed.
Setting to null means setting to an empty value:
IFS=
Test:
[ks@localhost ~]$ echo -n "$IFS" | cat -vte
 ^I$
[ks@localhost ~]$ var=$'key\nvalue'
[ks@localhost ~]$ echo $var
key value
[ks@localhost ~]$ IFS=
[ks@localhost ~]$ echo $var
key
value
[ks@localhost ~]$
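Changing IFS for the whole session affects every later unquoted expansion, so it may be safer to confine the change. A minimal sketch, assuming bash for the local keyword; count_unquoted is a hypothetical helper:

```bash
#!/usr/bin/env bash
# Sketch (assumes bash for `local`): confine the null IFS to one function
# so the rest of the script keeps the default splitting behaviour.
# count_unquoted is a hypothetical helper used only for this demonstration.
count_unquoted() {
    local IFS=           # null IFS: no field splitting inside this function
    local value=$1
    set -- $value        # deliberately unquoted, to observe the effect
    echo "$#"
}

var=$'key\nvalue'

set -- $var              # default IFS: the value is split at the newline
default_count=$#
null_count=$(count_unquoted "$var")

echo "default IFS: $default_count words, null IFS: $null_count word"
```

Note that a null IFS only switches off field splitting; pathname expansion still applies to unquoted values, so quoting remains the more complete fix.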
The answer from ks1322 helped me to identify the issue while using docker-compose exec:
If you omit the -T flag, docker-compose exec appends a carriage return (shown as ^M below) that breaks the output; we see b instead of 1b:
$ test=$(/usr/local/bin/docker-compose exec db bash -c "echo 1")
$ echo "${test}b"
b
$ echo "${test}" | cat -vte
1^M$
With -T flag, docker-compose exec works as expected:
$ test=$(/usr/local/bin/docker-compose exec -T db bash -c "echo 1")
$ echo "${test}b"
1b
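When the -T flag is not an option, the stray carriage return can also be removed after the fact. A sketch, assuming bash; the printf call only simulates output ending in CR LF and stands in for the docker-compose command, and clean is a name chosen for the example:

```bash
#!/usr/bin/env bash
# Sketch (assumes bash): strip a trailing carriage return from captured output.
# printf here is a stand-in for a command whose output ends in CR LF.
test=$(printf '1\r\n')     # $(...) drops the final newline, the CR remains
clean=${test%$'\r'}        # remove one trailing CR, if there is one

echo "${clean}b"
```

The % expansion leaves the value unchanged when no carriage return is present, so it is harmless to apply in both cases.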
In addition to quoting the variable, one could also translate the output of the variable using tr, converting spaces to newlines.
$ echo $var | tr " " "\n"
foo
bar
baz
Although this is a little more convoluted, it adds flexibility to the output, as you can substitute any character as the separator between the values.
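As a rough illustration of swapping in a different separator, here is the same idea producing a comma-separated line. The variable csv is made up for the example, and $var is left unquoted on purpose so that the shell joins the lines with single spaces first:

```bash
#!/usr/bin/env bash
# Sketch: turn the space-joined output into a comma-separated line.
var=$'foo\nbar\nbaz'

# Unquoted on purpose: the shell splits the value and echo joins it with spaces,
# then tr replaces each space with the chosen separator.
csv=$(echo $var | tr ' ' ',')

echo "$csv"
```

One caveat: any space inside a value is converted as well, so this only suits values that contain no whitespace of their own.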

How does one ‘contract’ strings to escape special characters in Bash?

There are many ways to expand an escaped string, but how can a shell command be made to take a string as an argument and escape it?
Here are some examples of different ways of expansion:
$ echo -e '\x27\\012\b34\n56\\\aa7\t8\r 9\0\0134\047'
'\0134
9\'7 8
$ echo $'\x27\\012\b34\n56\\\aa7\t8\r 9\0\0134\047'
'\0134
9\a7 8
$ PS1='(5)$ ' # At least tab-width - 3 long; 5 columns given typical tab-width.
(5)$ printf %b '\x27\\012\b34\n56\\\aa7\t8\r 9\0\0134\047'
'\0134
9\'(5)$
Note: there's actually a tab character between the 7 and 8 above, but the markup rendering seems to break it.
Yes, all sorts of craziness in there. ;-)
Anyway, I'm looking for the reverse of such escape expansion commands. If the command was called escape, it would satisfy these properties:
$ echo -ne "$(escape "$originalString")"
Should output the verbatim value of originalString as would ‘echo -n "$originalString"’. I.e. it should be an identity.
Likewise:
$ escape "$(echo -ne "$escapedString")"
Should output the string escaped again, though not necessarily in the same way as before. E.g. \0134 may become \\ or vice versa.
Don't use echo -e -- it's very poorly specified in POSIX, and considered deprecated for all but the simplest uses. Bash has extensions to its printf that provide a better-supported approach:
printf -v escaped_string %q "$raw_string"
...gives you a shell-escaped string from a raw one (storing it in a variable named escaped_string), and
printf -v raw_string %b "$escaped_string"
...gives you a raw string from a backslash-escaped one, storing it in raw_string.
Note that the two escape syntaxes are not equivalent -- strings escaped with printf %q are ready for eval, rather than for printf %b.
That is, you can safely run:
eval "myvar=$escaped_string"
...when escaped_string has been created with printf %q as above.
That said: What's the use case? It's strongly preferred to handle raw strings as raw strings (using NUL terminators when delimiting is necessary), rather than converting them to and from an escaped form.
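To make the difference between the two syntaxes concrete, here is a small round-trip sketch, assuming bash. The names restored and from_b are made up for the example, and the exact text that %q produces is not shown because it can differ between bash versions:

```bash
#!/usr/bin/env bash
# Sketch (assumes bash): round-trip a string through printf %q and eval,
# then expand backslash escapes separately with printf %b.
raw_string=$'it\'s a\ttab and a\nnewline'

printf -v escaped_string %q "$raw_string"   # shell-escaped form
eval "restored=$escaped_string"             # safe only because %q produced it

printf 'escaped: %s\n' "$escaped_string"
[ "$restored" = "$raw_string" ] && echo "round trip OK"

# %b handles backslash escapes, which is a different syntax from %q output.
printf -v from_b %b 'tab:\there'
```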

Resources