I have read a BASH script, and found the following line:
lines="$lines"$'\n'
After testing, I can see that this line appends a newline ("\n") to the string "$lines".
But after checking the Bash manual, I can't find anything saying that "$" can be used as a concatenation symbol. Could anyone explain this usage of "$"? Thanks very much in advance!
A slightly closer read of the Bash Manual under Quoting would reveal where this gem is hidden. Specifically:
Words of the form $'string' are treated specially. The word expands to
string, with backslash-escaped characters replaced as specified by the
ANSI C standard.
Used with \n, it produces a newline. You most often see this form of quoting in connection with the Bash IFS (internal field separator), whose default value is space, tab, newline, written as:
IFS=$' \t\n'
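A minimal sketch of the difference, assuming Bash (the variable names ansi and plain are made up for illustration). It also shows that no concatenation operator is involved: adjacent quoted pieces simply form one word.

```shell
#!/usr/bin/env bash
# $'\n' is expanded by the shell into a real newline character;
# "\n" inside ordinary double quotes stays as backslash followed by n.
ansi=$'\n'
plain="\n"

printf '%s\n' "${#ansi}"    # length of the ANSI-C quoted string
printf '%s\n' "${#plain}"   # length of the double-quoted string

# Adjacent quoted pieces are joined into a single word,
# which is all that lines="$lines"$'\n' does.
lines="first"
lines="$lines"$'\n'
lines="${lines}second"
printf '%s\n' "$lines"
```

The two lengths differ (one character versus two), and the final printf shows "first" and "second" on separate lines.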
More precisely, why does
"`command "$variable"`"
treat the outer quotes as enclosing the inner quotes, instead of expanding the variable outside any quotes?
The exact command I used to test this is similar to an example brought up in another stackoverflow question about the correct method of quoting when using command substitution:
fileName="some path with/spaces"
echo "`dirname "$fileName"`"
which correctly echoes "some path with", instead of complaining because of an invalid number of arguments.
I read Bash's man page, which states in chapter "EXPANSION", section "Command Substitution", that the new-style $() substitution preserves the meaning of any character between the parentheses. Regarding backquotes, however, it only mentions that backslashes work in a limited way:
When the old-style backquote form of substitution is used, backslash retains
its literal meaning except when followed by $, `, or \. The first backquote
not preceded by a backslash terminates the command substitution.
My first thought was that backticks do the same, aside from the mentioned exception, thus "quoting" the inner double quotes; however, I was told that this is not the case.
The second observation that pointed me to this direction was that
a=\$b
b=hello
echo `echo $a`
prints "$b". Had the backticks let the dollar sign get interpreted, the first variable substitution should have occurred before the subshell was invoked, with the subshell expanding the string "$b", resulting in "hello."
According to the above excerpt from the man page, I can even make sure the dollar sign is actually quoted, by using
echo `echo \$a`
and the results would still be the same.
A third observation gives me some doubts though:
echo `echo \\a`
Result: "\a"
echo \a
Result: a
Here it seems like both backslashes were retained until the subshell came into play, even though the man page states that backslashes within backquotes do not have their literal meaning when followed by another backslash.
EDIT: ^ Everything works as expected in this regard, I must have used the wrong shell (tcsh in my other terminal, and with a different character from "a").
Although I have not been able to find out what actually happens, while I was searching for the answer, I came across some people mentioning the term "quoting context" with regards to command substitution, but without any explanation as to what it means or where it is described.
I have not found any real reference to "quoting contexts" in either Bash references (gnu.org, tldp, man bash) or via DuckDuckGo.
Besides knowing what is going on, I would like a reference or some guidance on how this behavior can be derived from the documentation, because I suspect I have failed to put together some pieces from which it follows naturally. Otherwise I'll just forget the answer.
To those recommending people to use the new-style dollar sign and parentheses substitution: on ca. 50 years old Unix machines with tens or hundreds of different proprietary environments (can't throw out a shell for a newer one), when one has to write scripts compatible between most shells that anyone might be using, it is not an option.
Thanks to anyone who can help me.
POSIX has this to say in 2.2.3 (emphasis mine):
` (backquote)
The backquote shall retain its special meaning introducing the other form of command substitution (see Command Substitution). The portion of the quoted string from the initial backquote and the characters up to the next backquote that is not preceded by a <backslash>, having escape characters removed, defines that command whose output replaces "`...`" when the word is expanded. Either of the following cases produces undefined results:
A single-quoted or double-quoted string that begins, but does not end, within the "`...`" sequence
A "`...`" sequence that begins, but does not end, within the same double-quoted string
This, to me, pretty much defines what other people might (informally?) call a quoting context comprising everything within two consecutive backquotes.
In a way, the backquote is the fourth quote in addition to single quote, double quote and backslash. Note that within double quotes, single quotes lose their quoting capability as well, so it should be no surprise that backquotes change the function of double quotes within them.
I tried your example with other shells, like the Almquist shell on FreeBSD and zsh. As expected, they output some path with.
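As a rough illustration of that quoting context, assuming Bash (other shells may differ in corner cases), the following sketch runs the example from the question in both substitution styles. The variable names old_style and new_style are made up here.

```shell
#!/usr/bin/env bash
fileName="some path with/spaces"

# The double quotes inside the backquotes belong to the inner command,
# so dirname receives the whole path as a single argument.
old_style="`dirname "$fileName"`"

# The $(...) form behaves the same way for this example.
new_style="$(dirname "$fileName")"

printf '%s\n' "$old_style"
printf '%s\n' "$new_style"
```

Both lines print the directory part of the path, which suggests that the inner quotes are parsed as part of the substituted command rather than as closing the outer quotes.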
The problem
I have multiple property lines in a single string separated by \n like this:
LINES2="Abc1.def=$SOME_VAR\nAbc2.def=SOMETHING_ELSE\n"$LINES
The LINES variable
might contain an undefined set of characters
may be empty. If it is empty, I want to avoid the trailing \n.
I am open for any command line utility (sed, tr, awk, ... you name it).
Attempts
I tried this to no avail:
sed -z 's/\\n$//g' <<< $LINES2
I also had no luck with tr, since it does not accept regex.
Idea
There might be an approach to convert the \n to something else. But since $LINES can contain arbitrary characters, this might be dangerous.
Sources
I skimmed through the following questions:
How can I replace a newline (\n) using sed?
sed with literal string--not input file
Here's one solution:
LINES2="Abc1.def=$SOME_VAR"$'\n'"Abc2.def=SOMETHING_ELSE${LINES:+$'\n'$LINES}"
The syntax ${name:+value} means "insert value if the variable name exists and is not empty." So in this case, it inserts a newline followed by $LINES if $LINES is not empty, which seems to be precisely what you want.
I use $'\n' because "\n" is not a newline character. A more readable solution would be to define a shell variable whose value is a single newline.
It is not necessary to quote strings in shell assignment statements, since the right-hand side of an assignment does not undergo word-splitting nor glob expansion. Not quoting would make it easier to interpolate a $'\n'.
It is not usually advisable to use UPPER-CASE for shell variables because the shell and the OS use upper-case names for their own purposes. Your local variables should normally be lower case names.
So if I were not basing the answer on the command in the question, I would have written:
lines2=Abc1.def=$someVar$'\n'Abc2.def=SOMETHING_ELSE${lines:+$'\n'$lines}
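The "more readable" variant mentioned above could look roughly like this. It is only a sketch: build_props, props, and the property values are hypothetical names chosen for the example, and Bash is assumed.

```shell
#!/usr/bin/env bash
nl=$'\n'   # a single newline, kept in a variable for readability

# build_props is a hypothetical helper: it joins two fixed property
# lines and appends "$1" on its own line only when "$1" is non-empty.
# The result is stored in the global variable props.
build_props() {
    local extra=$1
    props="Abc1.def=x${nl}Abc2.def=y${extra:+$nl$extra}"
}

build_props "Abc3.def=z"
with_extra=$props

build_props ""
without_extra=$props

printf '[%s]\n' "$with_extra"
printf '[%s]\n' "$without_extra"
```

The brackets in the output make it easy to see that the second result has no trailing newline, because ${extra:+...} expands to nothing when extra is empty.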
I have searched for the list of metacharacters in Bash, but space is not listed.
I wonder if I'm right by assuming that space is the "token separation character" in Bash, since it not only works as such with Shell programs or builtins but also when creating an array through compound assignment - quotes escape spaces, just like they do most other metacharacters.
They cannot be escaped by backslashes, though.
Parameters are passed to programs and functions separated by spaces, for example.
Can someone explain how (and when) bash interprets spaces? Thanks!
I've written an example:
$ a=(zero one two)
$ echo ${a[0]}
zero
$ a=("zero one two")
$ echo ${a[0]}
zero one two
From the man page:
metacharacter
A character that, when unquoted, separates words. One of the following:
| & ; ( ) < > space tab
^^^^^
According to the Posix shell specification for Token Recognition, any shell (which pretends to be Posix-compliant) should interpret whitespace as separating tokens:
If the current character is an unquoted <newline>, the current token shall be delimited.
If the current character is an unquoted <blank>, any token containing the previous character is delimited and the current character shall be discarded.
Here <blank> refers to the character class blank as defined by LC_CTYPE at the time the shell starts. In almost all cases, that character class consists precisely of the space and tab characters.
It's important to distinguish between the shell mechanism for recognizing tokens, and the use of $IFS to perform word-splitting. Word splitting is performed (in most contexts) after brace, tilde, parameter and variable, arithmetic and command expansions. Consider, for example:
$ # Setting IFS does not affect token recognition
$ bash -c 'IFS=:; arr=(foo:bar); echo "${arr[0]}"'
foo:bar
$ # But it does affect word splitting after variable expansion
$ bash -c 'IFS=: foobar=foo:bar; arr=($foobar); echo "${arr[0]}"'
foo
Yes it is. From the Bash Reference Manual's Definitions section:
blank
A space or tab character.
…
metacharacter
A character that, when unquoted, separates words. A metacharacter is a blank or one of the following characters: ‘|’, ‘&’, ‘;’, ‘(’, ‘)’, ‘<’, or ‘>’.
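The words "when unquoted" are the key part of that definition. A small sketch, assuming Bash (count_args is a hypothetical helper that prints how many words it received), shows that any quoting mechanism, including a backslash, stops a blank from separating words:

```shell
#!/usr/bin/env bash
# count_args is a hypothetical helper that reports how many words it received.
count_args() { echo "$#"; }

count_args zero one two      # unquoted blanks separate three words
count_args "zero one two"    # double quotes protect the blanks: one word
count_args 'zero one two'    # single quotes do too: one word
count_args zero\ one\ two    # a backslash quotes the following blank: one word

# The same rules apply inside a compound array assignment.
a=(zero\ one two)
echo "${#a[@]}"              # number of array elements
echo "${a[0]}"               # first element, with its embedded blank
```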
I have a theoretical question about the syntax of Bash.
I am running Bash 4.3.11(1) in Linux Ubuntu 14.04.
In the Bash Reference Manual on the official GNU website, subsection 9.3.1 says:
!string
Refer to the most recent command preceding the current position
in the history list starting with string.
In general it's understood that string is, syntactically speaking, a sequence of characters ending before the first blank or newline.
However, when describing quoting in subsection 3.1.2., we can read in paragraph 3.1.2.2. what follows:
Enclosing characters in single quotes (‘'’) preserves the literal
value of each character within the quotes.
In particular, blanks inside single quotes do not break the string into separate words.
So an expression like !'some text' should search Bash's history list for the most recent command starting with 'some text'.
However, the string is split at the blank between some and text when I type it in my terminal, since the following error message is shown:
bash: !'some: event not found
Is this behaviour a bug in the implementation of the shell, or am I misunderstanding the expansion rules of Bash for this example?
I wouldn't call the observed behaviour a bug, because there is no specification for history expansion other than the observed behaviour of the bash shell itself. But it is certainly the case that the precise mechanics of parsing a history expansion expression is not well documented and has a lot of possibly surprising corner cases.
The bash manpage does state that history expansion "is performed immediately after a complete line is read, before the shell breaks it into words" (emphasis added), while the bash manual mentions that history expansion is provided by the History library. This is the root cause of most of the history expansion parsing oddities: history expansion works on raw unparsed input without any assistance from the bash tokenizer, and is mostly done with an external library which is not bash-specific. Since tokenizing bash input is non-trivial, it is not really surprising that the relatively simple parsing rules used during history expansion are only a rough approximation to a real bash parse.
For example, the bash manual does indicate that you can prevent a history expansion character (!) from being recognized as such by backslash-quoting it. But it is not explicitly documented that any \ which immediately precedes an ! will inhibit recognition of the history expansion, even if the backslash was itself quoted with a backslash. So the ! in \\!word does not cause the previous command starting with word to be substituted. (\\word is a common way to execute the command word instead of the alias word, so the example is not entirely contrived.)
A longer discussion of some of the corner cases of the recognition of the history expansion character can be found in this answer.
The issue raised by this question is slightly different, since it is about the next phase of the history expansion parse. Once it has been established that a particular character is a history expansion character, it is then necessary to parse the "event" which follows; as indicated by the bash manual, the event can take several forms, one of which is !string, representing the most recent command which starts with "string".
It is implied that this form will only be used if no other form applies, which means that string may not start with a digit or -, !, # or ?. It also may not start with whitespace or = (since those would inhibit history expansion) and in some circumstances ( or " (which may inhibit history expansion). And finally, it may not start with ^, $, % or *, which would be interpreted as a word designator (from the default event, which is the previous command).
The bash manual does not specify what terminates the string. It is semi-documented in the history library manual, which mentions that a history search string (or "event" as it is called in the bash manual) is terminated by whitespace, :, or any of the characters in the history configuration variable history_search_delimiter_chars. (For the record, bash currently (v4.3) sets that variable to ";&()|<>".)
As indicated earlier, quoting is taken into account when deciding whether or not to recognize a history expansion character; as it turns out, if the history expansion occurs inside a double-quoted string, then the closing double-quote is also considered a history search delimiter character. And that, as far as I know, is the entire list of characters which will delimit !string.
Nowhere in either the bash or the history documentation does it state that a history search delimiter character can be made non-special by quoting, and indeed this does not happen. An open quote, whether double or single, or even a backslash following the ! will be treated as just part of the string to be searched for, without any special processing.
Parsing of the substring-match history expansion -- !?string? -- is completely different. That string can only be terminated by a ? or by a newline. (As the bash manual says, the trailing ? is optional if terminated by a newline.)
Once the history expansion character has been recognized and the history search string has been identified, it may then be necessary to split the retrieved history entry into words. Again, the bash manual is slightly cavalier about corner cases, when it says that "the line is broken into words in the same fashion that Bash does, so that several words surrounded by quotes are considered one word."
A pedant would observe that "in the same fashion that Bash does" is not quite the same as saying "exactly as Bash would do", and in fact the second part of the sentence is literally true: several words surrounded by quotes are considered one word even if the quotes are not really matching quotes. For example, the line:
command "$(echo " foo bar ")"
is considered by the history library to consist of the following five words:
0. command
1. "$(echo "
2. foo
3. bar
4. ")"
although the bash parse would be quite different. By contrast, bash and the history library agree on the parsing of
command "$(echo ' foo bar ')"
as two words.
If I want to replace for example the placeholder {{VALUE}} with another string which can contain any characters, what's the best way to do it?
Using sed s/{{VALUE}}/$(value)/g might fail if $(value) contains a slash...
oldValue='{{VALUE}}'
newValue='new/value'
echo "${var//$oldValue/$newValue}"
Note that oldValue is not a regexp here; it works like a glob pattern. Otherwise:
echo "$var" | sed 's/{{VALUE}}/'"${newValue//\//\/}"'/g'
Sed also accepts other delimiters, as in 's|something|someotherthing|g', but if you can't control the input string, you'll have to escape it with some function before passing it to sed.
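One possible shape for such an escaping function, as a sketch only: escape_sed_replacement is a hypothetical name, Bash is assumed, the delimiter is assumed to be "/", and the replacement is assumed to be a single line (embedded newlines are not handled).

```shell
#!/usr/bin/env bash
# escape_sed_replacement is a hypothetical helper: it puts a backslash in
# front of every character that is special in the replacement part of
# s/.../.../ (backslash, ampersand, and the "/" delimiter).
escape_sed_replacement() {
    local in=$1 out= ch i
    for ((i = 0; i < ${#in}; i++)); do
        ch=${in:i:1}
        case $ch in
            '\' | '&' | '/') out+="\\$ch" ;;
            *) out+=$ch ;;
        esac
    done
    printf '%s' "$out"
}

var='path={{VALUE}};'
newValue='new/value & "more" \ stuff'

escaped=$(escape_sed_replacement "$newValue")
result=$(printf '%s\n' "$var" | sed "s/{{VALUE}}/$escaped/g")
printf '%s\n' "$result"
```

Double quotes in newValue cause no trouble here, because the result of a shell expansion is not parsed again for quotes before it is handed to sed.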
The question asked basically duplicates How can I escape forward slashes in a user input variable in bash?, Escape a string for sed search pattern, Using sed in a makefile; how to escape variables?, Use slashes in sed replace, and many other questions. “Use a different delimiter” is the usual answer. Pianosaurus's answer and Ben Blank's answer list characters (backslash and ampersand) that need to be escaped in the shell, besides whatever character is used as an alternate delimiter. However, they don't address the quoting-a-quote problem that will occur if your “string which can contain any characters” contains a double quote. The same kind of problem can affect the ${parameter/pattern/string} shell variable expansion mentioned in a previous answer.
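For the shell variable expansion, one way to sidestep most of these problems is to quote the pattern and the replacement inside the expansion, so that both are taken literally. The following is a sketch assuming a reasonably recent Bash; the sample strings are made up.

```shell
#!/usr/bin/env bash
var='a {{VALUE}} b * c'
oldValue='{{VALUE}}'
newValue='new/value "quoted"'

# Quoting "$oldValue" inside the expansion makes Bash match it literally
# rather than as a glob pattern. The assignment needs no outer quotes,
# since assignments undergo neither word splitting nor globbing.
result=${var//"$oldValue"/"$newValue"}
printf '%s\n' "$result"

# The same idea with a search string that is itself a glob character.
star='*'
result2=${var//"$star"/STAR}
printf '%s\n' "$result2"
```

Without the inner quotes, the pattern in the second substitution would be the glob *, which matches the entire string instead of the literal asterisk.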
Some other questions besides the few mentioned above suggest using awk, and that is usually a good approach for changes that are too complicated to do easily with sed. Also consider perl and python. Besides single- and double-quoted strings, python has u'...' unicode quoting, r'...' raw quoting, ur'...' quoting, and triple quoting with ''' or """ delimiters. The question as stated doesn't provide enough context for specific awk/perl/python solutions.