What is wrong with 'echo 'a\\b' | grep "\\"' - linux

When I am running this
echo 'a\\b' | grep '\\'
it correctly identifies the backslashes
but when I used the double quotes
echo 'a\\b' | grep "\\"
It is not running and it is returning a trailing backlash error, I can't figure out why that happens. It should work exactly similar to single quotes as I have escaped the backslashes.

When using double quotes the \\ gets evaluated into \. Single quotes leaves things as-is. Do an echo "\\" to see what I mean.

There are multiple places where backslash escapes can be processed:
In your shell: This happens when you use double quotes and not single quotes. Note that this is a very limited processing and escapes like "\n" will not work. This means when you do echo "a\\b" echo will get a\b as first argument while when you're executing echo 'a\\b' echo will get a\\b as first argument.
In the program you're calling: grep will parse the input as POSIX regular expression which has its own set of escapes. echo may or may not handle escape codes by default. My /bin/echo will not process escape codes by default, my bash bulitin echo also won't but my zsh builtin echo will. If a certain behaviour is wanted this can be specified by -e to enable escapes or by -E to disable escapes.
So if you're calling grep "\\" the shell will process the \\ escape sequence and pass \ to grep which will try to parse that as a regular expression and correclty complains about a trailing backslash.
So when you use double quotes you have to double escape everything resulting in the rather clumsy echo 'a\\b' | grep "\\\\".
To get the least amount of escaping you can use echo 'a\\b' | grep '\\' or even echo 'a\\b' | grep -F '\', where the -F flag tells grep to interpret the pattern as verbatim string and not as a regular expression. If you happen to have an echo that processes escapes by default (this will result in echo 'a\\b' printing a\b) you also need to add -E to the echo command.

grep does a bit of basic regular expression matching.
When in double quotes, the shell parses \\, so \\ will resolve to a single backslash as a parameter.
Additionally, grep does a bit of basic regular expression matching, so to match a single backslash you need to pass two backslashes to grep.
So by calling grep "\\" you actually get grep \, which does not parse as a regular expression, therefore makes grep fail.
To correctly match only double backslashes, do grep -F '\\' (which means fixed string matching instead of regex), grep '\\\\' or grep "\\\\\\\\".

Related

Problem with using grep to match the whole word

I am trying to match a whole string in a list of new line separated strings. Here is my example:
[hemanth.a#gateway ~]$ echo $snapshottableDirs
/user/hemanth.a/dummy1 /user/hemanth.a/dummy3
[hemanth.a#gateway ~]$ echo $snapshottableDirs | tr -s ' ' '\n'
/user/hemanth.a/dummy1
/user/hemanth.a/dummy3
[hemanth.a#gateway ~]$ echo $snapshottableDirs | tr -s ' ' '\n' | grep -w '/user/hemanth.a'
/user/hemanth.a/dummy1
/user/hemanth.a/dummy3
My aim is to only find a match if and only if the string /user/hemanth.a exists as a whole word(in a new line) in the list of strings. But the above command is also returning strings that contain /user/hemanth.a.
This is a sample scenario. There is no guarantee that all the strings that I would want to match will be in the form of /user/xxxxxx.x. Ideally I would want to match the exact string if it exists in a new line as a whole word in the list.
Any help would be appreciated. thank you.
Update: Using fgrep -x '/user/hemanth.a' is probably a better solution here, as it avoids having to escape characters such as $ to prevent grep from interpreting them as meta-characters. fgrep performs a literal string match as opposed to a regular expression match, and the -x option tells it to only match whole lines.
Example:
> cat testfile.txt
foo
foobar
barfoo
barfoobaz
> fgrep foo testfile.txt
foo
foobar
barfoo
barfoobaz
> fgrep -x foo testfile.txt
foo
Original answer:
Try adding the $ regex metacharacter to the end of your grep expression, as in:
echo $snapshottableDirs | tr -s ' ' '\n' | grep -w '/user/hemanth.a$'.
The $ metacharacter matches the end of the line.
While you're at it, you might also want to use the ^ metacharacter, which matches the beginning of the line, so that grep '/user/hemanth.a$' doesn't accidentally also match something like /user/foo/user/hemanth.a.
So you'd have this:
echo $snapshottableDirs | tr -s ' ' '\n' | grep '^/user/hemanth\.a$'.
Edit: You probably don't actually want the -w here, so I've removed that from my answer.
Edit 2: #U. Windl brings up a good point. The . character in a regular expression is a metacharacter that matches any character, so grep /user/hemanth.a might end up matching things you're not expecting, such as /user/hemanthxa, etc. Or perhaps more likely, it would also match the line /user/hemanth/a. To fix that, you need to escape the . character. I've updated the grep line above to reflect this.
Update: In response to your question in the comments about how to escape a string so that it can be used in a grep regular expression...
Yes, you can escape a string so that it should be able to be used in a regular expression. I'll explain how to do so, but first I should say that attempting to escape strings for use in a regex can become very complicated with lots of weird edge cases. For example, an escaped string that works with grep won't necessarily work with sed, awk, perl, bash's =~ operator, or even grep -e.
On top of that, if you change from single quotes to double quotes, you might then have to add another level of escaping so that bash will expand your string properly.
For example, if you wanted to search for the literal string 'foo [bar]* baz$'using grep, you'd have to escape the [, *, and $ characters, resulting in the regular expression:
'foo \[bar]\* baz\$'
But if for some reason you decided to pass that expression to grep as a double-quoted string, you would then have to escape the escapes. Otherwise, bash would interpret some of them as escapes. You can see this if you do:
echo "foo \[bar]\* baz\$"
foo \[bar]\* baz$
You can see that bash interpreted \$ as an escape sequence representing the character $, and thus swallowed the \ character. This is because normally, in double quoted strings $ is a special character that begins a parameter expansion. But it left \[ and \* alone because [ and * aren't special inside a double-quoted string, so it interpreted the backslashes as literal \ characters. To get this expression to work as an argument to grep in a double-quoted string, then, you would have to escape the last backslash:
# This command prints nothing, because bash expands `\$` to just `$`,
# which grep then interprets as an end-of-line anchor.
> echo 'foo [bar]* baz$' | grep "foo \[bar]\* baz\$"
# Escaping the last backslash causes bash to expand `\\$` to `\$`,
# which grep then interprets as matching a literal $ character
> echo 'foo [bar]* baz$' | grep "foo \[bar]\* baz\\$"
foo [bar]* baz$
But note that "foo \[bar]\* baz \\$" will not work with sed, because sed uses a different regex syntax in which escaping a [ causes it to become a meta-character, whereas in grep you have to escape it to prevent it from being interpreted as a meta-character.
So again, yes, you can escape a literal string for use as a grep regular expression. But if you need to match literal strings containing characters that will need to be escaped, it turns out there's a better way: fgrep.
The fgrep command is really just shorthand for grep -F, where the -F tells grep to match "fixed strings" instead of regular expression. For example:
> echo '[(*\^]$' | fgrep '[(*\^]$'
[(*\^]$
This works because fgrep doesn't know or care about regular expressions. It's just looking for the exact literal string '[(*\^]$'. However, this sort of puts you back at square one, because fgrep will match on substrings:
> echo '/users/hemanth/dummy' | fgrep '/users/hemanth'
/users/hemanth/dummy
Thankfully, there's a way around this, which it turns out was probably a better approach than my initial answer, considering your specific needs. The -x option to fgrep tells it to only match the entire line. Note that -x is not specific to fgrep (since fgrep is really just grep -F anyway). For example:
> echo '/users/hemanth/dummy' | fgrep -x '/users/hemanth' # prints nothing
This is equivalent to what you would have gotten by escaping the grep regex, and is almost certainly a better answer than my previous answer of enclosing your regex in ^ and $.
Now, as promised, just in case you want to go this route, here's how you would escape a fixed string to use as a grep regex:
# Suppose we want to match the literal string '^foo.\ [bar]* baz$'
# It contains lots of stuff that grep would normally interpret as
# regular expression meta-characters. We need to escape those characters
# so grep will interpret them as literals.
> str='^foo.\ [bar]* baz$'
> echo "$str"
^foo.\ [bar]* baz$
> regex=$(sed -E 's,[.*^$\\[],\\&' <<< "$str")
> echo "$regex"
\^foo\.\\ \[bar]\* baz\$
> echo "$str" | grep "$regex"
^foo.\ [bar]* baz$
# Success
Again, for the reasons cited above, I don't recommend this approach, especially not when fgrep -x exists.
Read "Anchoring" in man grep:
Anchoring
The caret ^ and the dollar sign $ are meta-characters that respectively
match the empty string at the beginning and end of a line.
Also be aware that . matches any character (from said manual page):
The period . matches any single character.

When grep "\\" XXFile I got "Trailing Backslash"

Now I want to find whether there are lines containing '\' character. I tried grep "\\" XXFile but it hints "Trailing Backslash". But when I tried grep '\\' XXFile it is OK. Could anyone explain why the first case cannot run? Thanks.
The difference is in how the shell treats the backslashes:
When you write "\\" in double quotes, the shell interprets the backslash escape and ends up passing the string \ to grep. Grep then sees a backslash with no following character, so it emits a "trailing backslash" warning. If you want to use double quotes you need to apply two levels of escaping, one for the shell and one for grep. The result: "\\\\".
When you write '\\' in single quotes, the shell does not do any interpretation, which means grep receives the string \\ with both backslashes intact. Grep interprets this as an escaped backslash, so it searches the file for a literal backslash character.
If that's not clear, we can use echo to see exactly what the shell is doing. echo doesn't do any backslash interpretation itself, so what it prints is what the shell passed to it.
$ echo "\\"
\
$ echo '\\'
\\
You could have written the command as
grep "\\\\" ...
This has two pairs of backslashes which bash will convert to two single backslashes. This new pair will then be passed to grep as an escaped backslash getting you what you want.

Replace forward slash with double backslash enclosed in double quotes

I'm desperately trying to replace a forward slash (/) with double backslash enclosed in double quotes ("\\")
but
a=`echo "$var" | sed 's/^\///' | sed 's/\//\"\\\\\"/g'`
does not work, and I have no idea why. It always replaces with just one backslash and not two
When / is part of a regular expression that you want to replace with the s (substitute) command of sed, you can use an other character instead of slash in the command's syntax, so you write, for example:
sed 's,/,\\\\,g'
above , was used instead of the usual slash to delimit two parameters of the s command: the regular expression describing part to be replaced and the string to be used as the replacement.
The above will replace every slash with two backslashes. A backslash is a special (quoting) character, so it must be quoted, here it's quoted with itself, that's why we need 4 backslashes to represent two backslashes.
$ echo /etc/passwd| sed 's,/,\\\\,g'
\\etc\\passwd
How about this?
a=${var//\//\\\\}
Demo in a shell:
$ var=a/b/c
$ a=${var//\//\\\\}
$ echo "$a"
a\\b\\c
Another way of doing it: tr '/' '\'
$ var=a/b/c
$ echo "$var"
a/b/c
$ tr '/' '\' <<< "$var"
a\b\c

Escape double quote in grep

I wanted to do grep for keywords with double quotes inside. To give a simple example:
echo "member":"time" | grep -e "member\""
That does not match. How can I fix it?
The problem is that you aren't correctly escaping the input string, try:
echo "\"member\":\"time\"" | grep -e "member\""
Alternatively, you can use unescaped double quotes within single quotes:
echo '"member":"time"' | grep -e 'member"'
It's a matter of preference which you find clearer, although the second approach prevents you from nesting your command within another set of single quotes (e.g. ssh 'cmd').

Escape file name for use in sed substitution

How can I fix this:
abc="a/b/c"; echo porc | sed -r "s/^/$abc/"
sed: -e expression #1, char 7: unknown option to `s'
The substitution of variable $abc is done correctly, but the problem is that $abc contains slashes, which confuse sed. Can I somehow escape these slashes?
Note that sed(1) allows you to use different characters for your s/// delimiters:
$ abc="a/b/c"
$ echo porc | sed -r "s|^|$abc|"
a/b/cporc
$
Of course, if you go this route, you need to make sure that the delimiters you choose aren't used elsewhere in your input.
The GNU manual for sed states that "The / characters may be uniformly replaced by any other single character within any given s command."
Therefore, just use another character instead of /, for example ::
abc="a/b/c"; echo porc | sed -r "s:^:$abc:"
Do not use a character that can be found in your input. We can use : above, since we know that the input (a/b/c/) doesn't contain :.
Be careful of character-escaping.
If using "", Bash will interpret some characters specially, e.g. ` (used for inline execution), ! (used for accessing Bash history), $ (used for accessing variables).
If using '', Bash will take all characters literally, even $.
The two approaches can be combined, depending on whether you need escaping or not, e.g.:
abc="a/b/c"; echo porc | sed 's!^!'"$abc"'!'
You don't have to use / as pattern and replace separator, as others already told you. I'd go with : as it is rather rarely used in paths (it's a separator in PATH environment variable). Stick to one and use shell built-in string replace features to make it bullet-proof, e.g. ${abc//:/\\:} (which means replace all : occurrences with \: in ${abc}) in case of : being the separator.
$ abc="a/b/c"; echo porc | sed -r "s:^:${abc//:/\\:}:"
a/b/cporc
backslash:
abc='a\/b\/c'
space filling....
As for the escaping part of the question I had the same issue and resolved with a double sed that can possibly be optimized.
escaped_abc=$(echo $abc | sed "s/\//\\\AAA\//g" | sed "s/AAA//g")
The triple A is used because otherwise the forward slash following its escaping backslash is never placed in the output, no matter how many backslashes you put in front of it.

Resources