Who reads the regex, Shell or the command? - linux

When we use a regex to limit results, or for any other purpose, who interprets it: the command itself or the shell?

If you look at ls *.txt | sed -e 's/[AB]/a/', the *.txt is interpreted by the shell (this is not a regex; it is called globbing), and the regex in 's/[AB]/a/' is interpreted by sed.
See http://wiki.bash-hackers.org/syntax/expansion/globs for more about how bash does it.
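A minimal sketch of that division of labour. The temporary directory and the file names are invented for the demonstration; the point is only to show what each program gets to see.

```shell
# Sketch: the shell expands globs; the command interprets the regex.
# The directory and file names below are made up for illustration.
tmpdir=$(mktemp -d)
touch "$tmpdir/Alpha.txt" "$tmpdir/Beta.txt" "$tmpdir/notes.md"

# Unquoted *.txt: the shell replaces it with the matching file names
# before printf ever runs, so printf only sees the resulting names.
expanded=$(cd "$tmpdir" && printf '%s\n' *.txt)
echo "$expanded"

# Quoted 's/[AB]/a/': the shell passes the string through untouched,
# and sed is the program that reads [AB] as a bracket expression.
renamed=$(cd "$tmpdir" && printf '%s\n' *.txt | sed -e 's/[AB]/a/')
echo "$renamed"

rm -rf "$tmpdir"
```

The first echo lists the two .txt names; the second shows the same names after sed has replaced the first A or B on each line.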

Related

Change variable evaluation method in all script from $VAR_NAME to ${VAR_NAME}

We have a couple of scripts in which we want to change the variable evaluation style from $VAR_NAME to ${VAR_NAME}.
This is required so that the scripts use a uniform style for variable evaluation.
I am thinking of using sed for this. I wrote a sample command, which looks as follows:
echo "\$VAR_NAME" | sed 's/^$[_a-zA-Z0-9]*/${&}/g'
output for the same is
${$VAR_NAME}
Now I don't want the $ inside the {}. How can I remove it?
Are there any better suggestions for accomplishing this task?
EDIT
Following command works
echo "\$VAR_NAME" | sed -r 's/\$([_a-zA-Z]+)/${\1}/g'
EDIT1
I used following command to do replacement in script file
sed -i -r 's:\$([_a-zA-Z0-9]+):${\1}:g' <ScriptName>
Since the first part of your sed command searches for the $ and VAR_NAME, the whole $VAR_NAME part will be put inside the ${} wrapper.
You could match the $ with a lookbehind in your regular expression, so that it is not part of the match. The replacement can then simply be {&}, because the $ stays to the left of the matched text.
http://www.regular-expressions.info/lookaround.html
http://www.perlmonks.org/?node_id=518444
I don't think sed supports this kind of regular expression, but you can make a command that begins perl -pe instead. I believe the following perl command may do what you want.
perl -p -e 's/(?<=\$)[_a-zA-Z0-9]+/{$&}/g'
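For comparison, here is a sketch of the capture-group approach from the question's edit, run against a made-up sample line. It assumes a sed that accepts -E (GNU and BSD sed do). The pattern is slightly tightened so that a name must start with a letter or underscore; that is my choice, not something from the original commands.

```shell
# Sketch: rewrite $NAME as ${NAME} with a capture group, no lookbehind needed.
# The sample text is invented; sed -E is assumed to be available.
sample='echo $HOME and ${USER} and $PATH_2'

# \$ matches a literal dollar sign; the group captures the name after it.
# ${USER} is left alone because "{" cannot start the captured name.
converted=$(printf '%s\n' "$sample" |
    sed -E 's/\$([_a-zA-Z][_a-zA-Z0-9]*)/${\1}/g')
echo "$converted"
```

Because the $ is matched outside the group, only the name ends up inside the braces.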

Bash tool to search for regex pattern in file

I want a bash tool that does the following: Given a file and a regex pattern, it outputs all matches of that pattern in the file.
Any tool like that? Grep or something? (I don't know how to use it to make that.)
You can do it with grep. There are two ways I like to use it; both print every line that contains a match:
cat file.txt | grep 'regex_pattern'
and
grep 'regex_pattern' file.txt
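Both forms print whole lines. If only the matched text is wanted, GNU and BSD grep offer the -o option. A small sketch with invented input:

```shell
# Sketch: -o prints each match on its own line instead of the whole line.
# The input text and the pattern are made up for illustration.
input='order 12 shipped, order 345 pending
no numbers here
order 6 cancelled'

# Default behaviour: every line containing at least one match.
lines=$(printf '%s\n' "$input" | grep '[0-9][0-9]*')

# With -o: only the matched substrings, one per output line.
matches=$(printf '%s\n' "$input" | grep -o '[0-9][0-9]*')

echo "$lines"
echo "$matches"
```

Here the plain grep prints the first and third lines, while grep -o prints 12, 345 and 6 on separate lines.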

Bash to transform string `3.11.0.17.16` into `3.11.0-17-generic`

I'm trying to transform this 3.11.0.17.16 into 3.11.0-17-generic using only bash and unix tools. The 16 in the original string can be anything. I feel like sed is the answer, but I'm not comfortable with its flavor of regex. How would you do this?
Version using awk instead of sed:
echo "3.11.0.17.16" | awk -F. '{printf "%s.%s.%s-%s-generic\n",$1,$2,$3,$4}'
echo "3.11.0.17.16" | sed 's/\.\([0-9][0-9]*\)\.[0-9][0-9]*$/-\1-generic/'
3.11.0-17-generic
This only accepts digits in the final component. If you want to accept arbitrary characters other than . there (you can't allow . or the match will become ambiguous) then write instead
echo "3.11.0.17.gr#wl1x" | sed 's/\.\([0-9][0-9]*\)\.[^.][^.]*$/-\1-generic/'
In a portable sed invocation you are limited to POSIX basic regular expressions, which most importantly means you cannot use +, ?, or |, and ( ) { } are ordinary characters unless \-escaped. Many sed implementations accept an -E option that brings their regex syntax in line with egrep, but it was only added to POSIX in the 2024 revision, so you cannot rely on it on older systems.
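To make the difference concrete, here is the same substitution written both ways. This is a sketch that assumes the local sed accepts -E; the BRE line is the portable one.

```shell
# Sketch: one substitution in BRE (portable) and in ERE (needs sed -E).
v='3.11.0.17.16'

# BRE: groups are \( \), and "one or more" is spelled [0-9][0-9]*.
bre=$(printf '%s\n' "$v" |
    sed 's/\.\([0-9][0-9]*\)\.[0-9][0-9]*$/-\1-generic/')

# ERE: groups are plain ( ), and + is available.
ere=$(printf '%s\n' "$v" |
    sed -E 's/\.([0-9]+)\.[0-9]+$/-\1-generic/')

echo "$bre"
echo "$ere"
```

Both commands print 3.11.0-17-generic; only the spelling of the regex changes.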
Substring removal using bash parameter expansion and extended globs
shopt -s extglob
version=3.11.0.17.16
version=${version%.+(!(.))}
printf "%s-%s-generic\n" ${version%.+(!(.))} ${version##*.}
3.11.0-17-generic
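Another shell-only possibility, shown here as a sketch rather than a recommendation, is to split the string on dots with read. The variable names major, minor, patch, abi and rest are labels I made up for the fields.

```shell
# Sketch: split on "." with read instead of using a regex or extglob.
# Field names are hypothetical labels chosen for this example.
version='3.11.0.17.16'

# IFS=. applies only to this read; everything after the fourth field
# lands in "rest" and is simply ignored.
IFS=. read -r major minor patch abi rest <<EOF
$version
EOF

result="$major.$minor.$patch-$abi-generic"
echo "$result"
```

This also works in plain POSIX sh, since it needs neither extglob nor an external tool.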
If you anchor the regex you are trying to match onto the last 3 sets of digits you would get
echo "3.11.0.17.16" | sed 's!\([0-9]*\)\.\([0-9]*\)\.\([0-9]*\)$!\1-\2-generic!'

Question about shell commands and grep

Does anyone know why
grep "p\{2\}" textfile
will find "apple" if it's in the file, but
grep p\{2\} textfile
won't?
I'm new to using a command line and regular expressions, and this is puzzling me.
Although this has already been answered, since you are new to all this, here is how to debug it:
-- get the pid of current shell (using ps).
PID TTY TIME CMD
1611 pts/0 00:00:00 su
1619 pts/0 00:00:00 bash
1763 pts/0 00:00:00 ps
-- from some other shell, attach strace (system call tracer) to the required pid (here 1619):
strace -f -o <output_file> -p 1619
-- Run both the commands that you tried
-- open the output file and look for exec family calls for the required process, here: grep
The output on my machine is something like:
1723 execve("/bin/grep", ["grep", "--color=auto", "p{2}", "foo"], [/* 19 vars */]) = 0
1725 execve("/bin/grep", ["grep", "--color=auto", "p\\{2\\}", "foo"], [/* 19 vars */]) = 0
Now you can see how grep was invoked differently in the two cases and can work out the problem yourself. :)
still the -e flag mystery is yet to be solved....
Without the quotes, the shell will try to expand the argument. In your case the backslash and the curly brackets '{}' have a special meaning to the shell, much like the asterisk '*', which expands as a wildcard.
With quotes, your complete regex gets passed directly to grep. Without the quotes, grep sees your regex as p{2}.
Edit:
To clarify, without the quotes your backslashes are removed by the shell before your regex is passed to grep.
Try:
echo grep p\{2\} test.txt
And you'll see your output as...
grep p{2} test.txt
The quotes prevent the shell from removing the backslashes before they get to grep. You could also escape your backslashes, and it will work without quotes: grep p\\{2\\} test.txt
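A slightly more explicit way to see the same thing is to print each argument in brackets. In this sketch, show_args is a hypothetical helper defined only for the demonstration, not a standard tool.

```shell
# Sketch: show exactly what the shell hands to a command, one
# argument per line in angle brackets. show_args is a made-up helper.
show_args() { printf '<%s>\n' "$@"; }

unquoted=$(show_args p\{2\})    # the shell strips the backslashes
quoted=$(show_args "p\{2\}")    # double quotes keep them
doubled=$(show_args p\\{2\\})   # each \\ collapses to one backslash

echo "$unquoted"
echo "$quoted"
echo "$doubled"
```

Only the first form loses its backslashes, which is why grep then receives p{2} instead of p\{2\}.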
The first one treats the pattern as a regex, so it matches pp:
echo "apple" | grep 'p\{2\}'
The second one treats the pattern literally, so it matches p{2}:
echo "ap{2}le" | grep p\{2\}
From the grep man page
In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
so these two are functionally equivalent
egrep p{2}
and
grep "p\{2\}"
The first uses EREs (Extended Regular Expressions); the second uses BREs (Basic Regular Expressions). In your example you're using grep (which uses BREs unless you pass the -E switch) and the pattern is enclosed in quotes, so "\{" is interpreted as a special BRE character.
Your second instance doesn't work because you're just looking for the literal string p{2}, which doesn't exist in your file.
You can demonstrate that grep is interpreting your string as a BRE by trying:
grep "p\{2"
grep will complain
grep: Unmatched \{
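The BRE and ERE behaviour can be checked side by side. This is a sketch with two invented sample words, one containing pp and one containing the literal text p{2}.

```shell
# Sketch: the same braces under BRE (grep) and ERE (grep -E).
# The sample words are made up for illustration.
words='apple
ap{2}le'

# BRE: \{2\} is a repeat count, so this matches "pp".
bre_interval=$(printf '%s\n' "$words" | grep 'p\{2\}')

# BRE: unescaped braces are ordinary characters, so this matches "p{2}".
bre_literal=$(printf '%s\n' "$words" | grep 'p{2}')

# ERE: {2} is a repeat count without any backslashes.
ere_interval=$(printf '%s\n' "$words" | grep -E 'p{2}')

echo "$bre_interval"
echo "$bre_literal"
echo "$ere_interval"
```

The first and third commands print apple; the second prints ap{2}le.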

shell scripting for token replacement in all files in a folder

Hi,
I am not very good with Linux shell scripting. I am trying the following shell script to replace
the revision number token $rev -<rev number> in all html files under a specified directory:
cd /home/myapp/test
set repUpRev = "`svnversion`"
echo $repUpRev
grep -lr -e '\$rev -'.$repUpRev.'\$' *.html | xargs sed -i 's/'\$rev -'.$repUpRev.'\$'/'\$rev -.*$'/g'
This does not seem to work. What is wrong with the above code?
rev=$(svnversion)
sed -i.bak "s/$rev/some other string/g" *.html
What is $rev in the regexp string? Is it another variable, or are you looking for the literal string '$rev'? If the latter, I would suggest adding '\' before the $, otherwise it is treated as a special regexp character...
This is how you show the last line:
grep -lr -e '\$rev -'.$repUpRev.'\$' *.html | xargs sed -i 's/'\$rev -'.$repUpRev.'\$'/'\$rev -.*$'/g'
It would help if you showed some input data.
The -r option makes the grep recursive. That means it will operate on files in the directory and its subdirectories. Is that what you intend?
The dots in your grep and sed stand for any character. If you want literal dots, you'll need to escape them.
The final escaped dollar sign in the grep and sed commands will be seen as a literal dollar sign. If you want to anchor to the end of the line you should remove the escape.
On the right-hand side of a sed s command, .* is not a pattern; it is inserted as the literal string ".*". If you want to include what was matched on the left side, you need to use capture groups. The g modifier on the s command is only needed if the pattern appears more than once in a line.
Using quote, unquote, quote, unquote is hard to read. Use double quotes to permit variable expansion.
Try your grep command by itself without the xargs and sed to see if it's producing a list of files.
This may be closer to what you want:
grep -lr -e "\$rev -.$repUpRev.$" *.html | xargs sed -i "s/\$rev -.$repUpRev.$/\$rev -REPLACEMENT_TEXT/g"
but you'll still need to determine if the g modifier, the dots, the final dollar signs, etc., are what you intend.
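Putting these points together, here is one possible sketch. It assumes the token looks like $rev -<digits>$, uses invented sample files in a temporary directory, and hard-codes a revision number where real use would call svnversion (whose output can also contain letters or a colon, which this pattern does not handle). It writes through a temporary file instead of using sed -i, purely for portability.

```shell
# Sketch: update a "$rev -<number>$" token in every .html file.
# Token format, sample files and the fixed revision are assumptions;
# in real use repUpRev would come from $(svnversion).
workdir=$(mktemp -d)
printf '%s\n' '<p>Build $rev -100$</p>' > "$workdir/index.html"
printf '%s\n' '<p>No token here</p>'    > "$workdir/about.html"

repUpRev=142

# Single-quoted parts keep \$ intact for sed (a literal dollar sign);
# the variable is spliced in between double quotes.
for f in "$workdir"/*.html; do
    sed 's/\$rev -[0-9][0-9]*\$/$rev -'"$repUpRev"'$/' "$f" > "$f.tmp" &&
        mv "$f.tmp" "$f"
done

updated=$(cat "$workdir/index.html")
untouched=$(cat "$workdir/about.html")
echo "$updated"
echo "$untouched"
rm -rf "$workdir"
```

The file with the token gets the new number, and the file without it is left unchanged.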
