Trying to understand meaning of non-shell metacharacters in linux - linux

Can anyone help me understand the meaning of
grep "[0-9]\{2\}" filename
It is a command that uses non-shell metacharacters.

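In a basic regular expression (grep's default syntax), the curly braces of a quantifier must be backslashed. [0-9]\{2\} matches two consecutive digits.
grep '[0-9]\{2\}' # n.b. the single quotes
Double quotes also work here: inside double quotes the shell only treats a backslash as special before $, `, ", \ or a newline, so \{ reaches grep unchanged. Single quotes are still the safer habit for regular expressions.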
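As a rough illustration of the quantifier, here is a small sketch. The sample lines are made up for the example, and the -E form is shown only for comparison:

```shell
# Sample input, invented for illustration.
sample=$(printf '%s\n' 'a1' 'b22' 'c333' 'none')

# BRE (grep's default): \{2\} means "two of the preceding item in a row".
bre=$(printf '%s\n' "$sample" | grep '[0-9]\{2\}')

# ERE (grep -E): braces act as quantifiers without backslashes.
ere=$(printf '%s\n' "$sample" | grep -E '[0-9]{2}')

# Inside double quotes the shell leaves \{ untouched.
dq=$(printf '%s' "[0-9]\{2\}")

printf '%s\n' "$bre"
```

Lines with at least two adjacent digits are selected, so c333 matches as well as b22; the pattern is not anchored.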

Related

How to correctly detect and replace apostrophe (') with sed?

I'm having a directory with many files having special characters and spaces. I want to perform an operation with all these files so I'm trying to store all filenames in a list.txt and then run the command with this list.
The special characters in my list are & []'.
So basically I want to use sed to replace each occurrence with \ + the character in question.
E.g. : filename .txt => filename\ .txt etc...
The thing is I have trouble handling apostrophes.
Here is my command as of now :
ls | sed 's/\ /\\ /g' | sed 's/\&/\\&/g' | sed "s/\'/\\'/g" | sed 's/\[/\\[/g' | sed 's/\]/\\]/g'
At first I had issues with, I believe, the apostrophes in the string command in conflict with the apostrophes surrounding the string. So I used double quotes instead, but it still doesn't work.
I've tried all these and nothing worked :
sed "s/\'/\\'/g" (escaping the apostrophe)
sed "s/'/\'/g" (escaping nothing)
sed "s/'/\\'/g" (escaping the backslash)
sed 's/"'"/\"'"/g' (double quoting single quote)
As a disclaimer, I must say I'm completely new to sed. I ran my first sed command today, so maybe I'm doing something wrong without realizing it.
PS : I've seen these threads, but no answer worked for me :
https://unix.stackexchange.com/questions/157076/how-to-remove-the-apostrophe-and-delete-the-space
How to replace to apostrophe ' inside a file using SED
This may do:
cat file
avbadf
test&rr
more [ yes
this ]
and'df
sed -r 's/(\x27|&|\[|\])/\\\1/g' file
avbadf
test\&rr
more \[ yes
this \]
and\'df
\x27 is equal to single quote '
\x22 is equal to double quote "
Whoops, I found the answer to my question. Here is the working input :
sed "s/'/\\\'/g"
This will effectively replace any ' with \'.
However I'm having trouble understanding exactly what's happening here.
So if I understand correctly, we are escaping the backslash and the apostrophe in the replacement string. Now, if somebody could answer some of these, I would be grateful :
Why don't we need to escape the first quote (the one in the pattern to find) ?
Why do we have to escape the backslash whereas for the other characters, there's no need ?
Why do we need to escape the second quote (the one in the replacement string) ?
I think all of your sed matches actually need that replacement pattern. This one seems to work for all examples:
ls | sed "s/\ /\\\ /g" | sed "s/\&/\\\&/g" | sed "s/\[/\\\[/g" | sed "s/\]/\\\]/g" | sed "s/'/\\\'/g"
So it is s/regex/replacement/command and 'regex' and 'replacement' have different sets of special characters.
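The only one that's different is s/'/\\\'/g, and only in that the apostrophe needs no escaping on the regex side, because a plain ' has no special meaning there. (GNU sed does define \' as a regex operator that matches the end of the pattern space, but that applies to the regex, not the replacement.) The extra backslashes in the replacement are needed because two layers process them: inside double quotes the shell turns \\ into \, and sed then treats a backslash followed by another character as an escape. So "\\\'" reaches sed as \\', which sed reads as a literal backslash followed by '.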
For example, \5 is a special character in the replacement expression, so to replace:
filename5.txt -> filename\5.txt
You would also need, as with apostrophe:
sed "s/5/\\\5/g"
With only two backslashes, the shell would pass \5 to sed, and sed would read that as a back-reference to the fifth group instead of a backslash followed by 5.
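To see the two layers separately, a small sketch like the following may help. It uses printf '%s' to print the script text the shell hands to sed; the input line and'df is taken from the sample file earlier in this thread:

```shell
# printf '%s' prints its argument verbatim, so this shows the script text
# that sed receives after the shell has processed the double quotes.
two=$(printf '%s' "s/'/\\'/g")      # shell turns \\ into \    -> s/'/\'/g
three=$(printf '%s' "s/'/\\\'/g")   # \\ becomes \, \' is kept -> s/'/\\'/g
printf '%s\n' "$two" "$three"

# With three backslashes, sed sees \\ (a literal backslash) followed by '.
out=$(printf '%s\n' "and'df" | sed "s/'/\\\'/g")
printf '%s\n' "$out"   # and\'df
```

Writing the script in single quotes where possible removes the shell layer, which leaves only sed's own escaping to reason about.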
Please try the following:
sed 's/[][ &'\'']/\\&/g' file
Using the same example as @Jotne, the result will be:
avbadf
test\&rr
more\ \[\ yes
this\ \]
and\'df
[How it works]
The regex part in the sed s command above just defines a character
class of & []', which should be escaped with a backslash.
The right square bracket ] does not need escaping when put
immediately after the left square bracket [.
The confusing part is the handling of the single quote.
We cannot put a single quote within single quotes even if we escape it.
The workaround is as follows: Say we have an assignment str='aaabbb'.
To put a single quote between "aaa" and "bbb", we can write
str='aaa'\''bbb'.
It may look puzzling but it just concatenates the three sequences;
1) to close the single-quoted string as 'aaa'.
2) to put a single quote with an escaping backslash as \'.
3) to restart the single-quoted string as 'bbb'.
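The concatenation can be checked with a short sketch. The input line passed to sed is invented for illustration:

```shell
# Three pieces concatenated by the shell: 'aaa'  \'  'bbb'
str='aaa'\''bbb'
printf '%s\n' "$str"   # aaa'bbb

# The same trick inside the sed script: sed receives s/[][ &']/\\&/g
escaped=$(printf '%s\n' "it's [a] b&c" | sed 's/[][ &'\'']/\\&/g')
printf '%s\n' "$escaped"   # it\'s\ \[a\]\ b\&c
```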
Hope this helps.

Grep issue after escaping backslash

I am trying to grep the string
"SNTCHDCS06-Filesystem D:\\ Label:Data Serial Number f8271450"
from a csv file, but somehow I failed miserably.
I understand that I need to add two backslashes on top of the two backslashes (one level for the shell, one for grep), but it doesn't work after that.
The below command works.
[root@nagiospdc01 folder]# grep -e "^SNTCHDCS06-Filesystem D:\\\\" in/masterlist.csv
SNTCHDCS06-Filesystem D:\\ Label:Data Serial Number f8271450,SNTCHDCS06,10.24.64.210,Active Directory,AD Server,UCS,Filesystem D:\\ Label:Data Serial Number f8271450,Windows Team,0,XM_OPS_WIN,Windows Team,Y,Y,N,N,Y,Y,Y,Y,Y,N,N,Y,Y,ITOC
When I try to grep for the space after that, grep doesn't work and simply fails.
[root@nagiospdc01 folder]# grep -e "^SNTCHDCS06-Filesystem D:\\\\ " in/masterlist.csv
Appreciate if someone can enlighten me on the correct grep syntax and command.
Use single quotes,
grep -e '^SNTCHDCS06-Filesystem D:\\ Label:Data Serial Number f8271450'
When using double quotes \\ gets converted to a single \ by bash. However bash does not look inside single quotes.
From the Bash manual:
3.1.2.2 Single Quotes
Enclosing characters in single quotes (') preserves the literal value of each character within the quotes. A single quote may not occur between single quotes, even when preceded by a backslash.
3.1.2.3 Double Quotes
Enclosing characters in double quotes (") preserves the literal value of all characters within the quotes, with the exception of $, `, \, and, when history expansion is enabled, !. The characters $ and ` retain their special meaning within double quotes (see Shell Expansions). The backslash retains its special meaning only when followed by one of the following characters: $, `, ", \, or newline. Within double quotes, backslashes that are followed by one of these characters are removed. Backslashes preceding characters without a special meaning are left unmodified. A double quote may be quoted within double quotes by preceding it with a backslash. If enabled, history expansion will be performed unless an ! appearing in double quotes is escaped using a backslash. The backslash preceding the ! is not removed.
The special parameters * and @ have special meaning when in double quotes (see Shell Parameter Expansion).
The -F option in grep lets you search for fixed strings. The single quotes also help, because they pass the exact string to grep.
-F, --fixed-strings :
Interpret PATTERN as a list of fixed strings, separated by newlines, any of which is to be matched.
grep -F 'SNTCHDCS06-Filesystem D:\\ Label:Data Serial Number f8271450' masterlist.csv
*SNTCHDCS06-Filesystem D:\\ Label:Data Serial Number f8271450*
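A small sketch of the difference between a fixed-string search and a regex search. The two file lines are shortened, made-up stand-ins for the CSV rows, and the temporary file is only there for the demonstration:

```shell
tmp=$(mktemp)
# Line 1 contains two backslashes, line 2 contains one.
printf '%s\n' 'Filesystem D:\\ Label:Data' 'Filesystem E:\ Label:Logs' > "$tmp"

# -F: every character is literal, so \\ matches two backslashes.
fixed=$(grep -F 'D:\\ Label' "$tmp")

# Without -F the pattern is a regex, where \\ matches ONE backslash.
regex=$(grep 'E:\\ Label' "$tmp")

rm -f "$tmp"
printf '%s\n' "$fixed" "$regex"
```

Which form fits depends on whether the file really holds one backslash or two at that position, so it is worth checking the raw data first.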

Egrep results are current command and garbage

I'm trying to egrep lines that contain nothing but a single occurrence of "Hihihihihihihi!", with arbitrarily many 'hi's
Here is what I write
egrep "^Hi(hi)*!$" myfile.txt
But it didn't work. After pressing enter, the command was displayed again:
egrep "^Hi(hi)*myfile.txt" myfile.txt
Anyone can help me?
Thanks!
The shell is interpreting !$ as a history expansion and substituting the last argument of the previous command.
To disable these shell substitutions, change the double quotes to single quotes.
egrep '^Hi(hi)*!$' myfile.txt
Alternatively, you can use the -x switch to match only whole lines, obviating the need for the ^ and $ characters, and thus avoiding the fatal !$ argument substitution:
egrep -x "Hi(hi)*!" myfile.txt
You don't say what shell, but I suspect the problem you have is that the exclamation mark (!) is extra special to the shell. You need to escape that:
egrep "^Hi(hi)*\!$" myfile.txt
Should work in most shells where that's true.
Changing the double quotes to single quotes is not enough for all shells, the exclamation is still special inside single quotes. I just tested all this in the tcsh, other shells will have differences.
Try it with single quotes. I think the $ is being interpreted by bash as something, not sure what:
egrep '^Hi(hi)*!$' myfile.txt
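For comparison, here is a sketch of the anchored and the -x forms run side by side. The sample lines are invented, and grep -E is used in place of egrep, which recent GNU grep releases flag as obsolescent. History expansion is normally off in scripts, so the !$ problem itself only shows up at an interactive prompt:

```shell
# Sample input, invented for illustration.
sample=$(printf '%s\n' 'Hi!' 'Hihihi!' 'Hihihi! there' 'hi!')

# Anchored pattern in single quotes: the shell passes !$ through untouched.
anchored=$(printf '%s\n' "$sample" | grep -E '^Hi(hi)*!$')

# -x matches whole lines only, so no ^ or $ is needed.
whole=$(printf '%s\n' "$sample" | grep -E -x 'Hi(hi)*!')

printf '%s\n' "$anchored"
```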

Do bash script and command grep treat single quote differently

In Advanced Bash-Scripting Guide, I find
Within single quotes, every special character except ' gets
interpreted literally.
So I think grep '\<the\>' file.txt would search \<the\>, instead of word the. But it searches the indeed.
#!/bin/bash
grep '\<the\>' file.txt
Added
Maybe I didn't describe my question clearly. In the man page,
Enclosing characters in single quotes preserves the literal value of each character within the quotes.
So my question is: given that bash treats characters enclosed in single quotes as literal values, why is '\<the\>' treated as the in grep? Is that grep's own behavior, separate from bash?
Indeed, bash will pass your string literally.
It is grep that interprets the string (as a regular expression). If you want to avoid that, use grep -F. With that option, grep will search literally for the given string.
You need to add another backslash \ to match the whole pattern, as the symbols \< and \> are special to grep. Quoting the manpage: man grep
The Backslash Character and Special Expressions
The symbols \< and \> respectively match the empty string at the beginning and end
of a word.
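The two interpretations can be compared with a short sketch. The sample lines are made up, and \< and \> are assumed to be supported by the installed grep, as they are in GNU grep:

```shell
# Sample input, invented for illustration.
sample=$(printf '%s\n' 'the cat' 'other' 'a \<the\> b')

# Regex search: \< and \> are word boundaries, so "other" does not match.
words=$(printf '%s\n' "$sample" | grep '\<the\>')

# Fixed-string search: the backslashes and angle brackets are literal.
literal=$(printf '%s\n' "$sample" | grep -F '\<the\>')

printf '%s\n' "$words"
```

In both cases the shell passes the same seven characters to grep; only grep's handling of them differs.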

When grep "\\" XXFile I got "Trailing Backslash"

Now I want to find whether there are lines containing '\' character. I tried grep "\\" XXFile but it hints "Trailing Backslash". But when I tried grep '\\' XXFile it is OK. Could anyone explain why the first case cannot run? Thanks.
The difference is in how the shell treats the backslashes:
When you write "\\" in double quotes, the shell interprets the backslash escape and ends up passing the string \ to grep. Grep then sees a backslash with no following character, so it emits a "trailing backslash" warning. If you want to use double quotes you need to apply two levels of escaping, one for the shell and one for grep. The result: "\\\\".
When you write '\\' in single quotes, the shell does not do any interpretation, which means grep receives the string \\ with both backslashes intact. Grep interprets this as an escaped backslash, so it searches the file for a literal backslash character.
If that's not clear, we can use echo to see exactly what the shell is doing. echo doesn't do any backslash interpretation itself, so what it prints is what the shell passed to it.
$ echo "\\"
\
$ echo '\\'
\\
You could have written the command as
grep "\\\\" ...
This has two pairs of backslashes which bash will convert to two single backslashes. This new pair will then be passed to grep as an escaped backslash getting you what you want.
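Both working forms can be tried on a throwaway file. This is only a sketch with made-up contents:

```shell
tmp=$(mktemp)
printf '%s\n' 'no backslash here' 'path\to\file' > "$tmp"

# Single quotes: grep receives \\ and searches for one literal backslash.
single=$(grep '\\' "$tmp")

# Double quotes: the shell halves \\\\ to \\ before grep sees it.
double=$(grep "\\\\" "$tmp")

rm -f "$tmp"
printf '%s\n' "$single"
```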
