What's the meaning of this sed command? sed 's%^.*/%%' [duplicate] - linux

This question already has answers here:
Using different delimiters in sed commands and range addresses
(3 answers)
Closed 1 year ago.
I saw a bash command sed 's%^.*/%%'
The usual syntax for sed is sed 's/pattern/str/g', but this one uses s%^.* where I would expect the s/pattern part of 's/pattern/str/g'.
My questions:
What does s%^.* mean?
What is the meaning of %% in the second part of sed 's%^.*/%%'?

The % is an alternative delimiter so that you don't need to escape the forward slash contained in the matching portion.
So if you were to write the same expression with / as a delimiter, it would look like:
sed 's/^.*\///'
which is also kind of difficult to read.
Either way, the expression will look for a forward slash in a line; if there is a forward slash, remove everything up to and including that slash.
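As a quick check of that behavior, here is a minimal sketch. The sample paths and the helper name strip_to_last_slash are made up for illustration. Because .* is greedy, the match extends to the last slash on the line, so only the final path component survives, and a line without a slash passes through unchanged.

```shell
# Hypothetical helper wrapping the command from the question.
strip_to_last_slash() {
  sed 's%^.*/%%'
}

printf '%s\n' '/usr/local/bin/sed' 'relative/path.txt' 'no-slash-here' | strip_to_last_slash
# prints:
#   sed
#   path.txt
#   no-slash-here
```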

The usual delimiter is /, and the usage is sed 's/pattern/str/'.
At times you find that the delimiter is present in the pattern. In such a case you have a choice to either escape the delimiter found in the pattern or to use a different delimiter. In your case a different delimiter % has been used.
The latter way is recommended as it keeps your command short and clean.

Related

Syntax error: invalid token operator (error token is " [duplicate]

This question already has answers here:
How to remove carriage return from a variable in shell script
(6 answers)
Closed 1 year ago.
Doing a basic shell script I'm having trouble with getting the length of a file with curl and later dividing it, and the text:
syntax error: invalid token operator (error token is"
keeps appearing.
Simplified code would be:
head=$(curl -sI $dir | awk '/Content-Lenth/{print $2}')
res=$(($head/3))
I've read solutions to similar problems that suggest removing the \r the variable probably has at its end, so I put this between the two lines, but the problem is still there:
head=${head//\r}
Any idea why this happens? Is it because of an \r character at the end? If it's the case, how to remove it?
I was able to reproduce your issue, which for me was indeed caused by a carriage return at the end of the value of $head. That arises from the HTTP protocol specification: HTTP uses carriage return / newline sequences as line terminators (like DOS and Windows), and awk will not recognize the carriage return as whitespace.
Your attempt to remove the carriage return ...
head=${head//\r}
... does not serve that purpose. The \ is a general escape character to the shell, preserving the literal meaning of whatever character follows. Since the character r has no special significance to the shell, the above code is equivalent to
# Oops, not what you meant:
head=${head//r}
I used this alternative:
# Strip all non-digits:
head=${head//[^0-9]}
, and that resolved the issue for me.
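If you would rather remove only the carriage return than every non-digit, a portable option is tr. The sketch below simulates the header value with printf instead of calling curl, and strip_cr is a hypothetical helper name. In bash specifically, ${head//$'\r'/} should do the same job, since $'\r' is how bash spells a literal carriage return.

```shell
# Simulated value as it would arrive from an HTTP header: digits plus a trailing CR.
head=$(printf '1234\r')

# Hypothetical helper: delete every carriage return from its argument.
strip_cr() {
  printf '%s' "$1" | tr -d '\r'
}

clean=$(strip_cr "$head")
res=$((clean / 3))
echo "$res"
# prints: 411
```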
Do also note, however, that
you have misspelled "Length", and
HTTP header names are not case sensitive, but by default, awk regular expressions are. Therefore, your awk command might in some cases fail to match the wanted header even after the spelling is corrected.
Here's a version of your script that addresses all of these issues (note that IGNORECASE is specific to GNU awk):
head=$(curl -sI "$dir" | awk 'BEGIN{IGNORECASE=1} /Content-Length/{print $2}')
head=${head//[^0-9]}
res=$(($head/3))
The last two lines could be merged into one, but I have kept them separate for clarity. That might also serve well if you want to use the value of $head for other purposes, too.
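Where GNU awk is not available, one way to get a case-insensitive match is to lowercase the header name with tolower before comparing. This is only a sketch: the response headers are simulated with printf rather than fetched with curl, and content_length is a hypothetical helper name.

```shell
# Hypothetical helper: read HTTP headers on stdin, print the Content-Length value.
content_length() {
  # Compare the lowercased first field so any capitalization matches,
  # then drop the carriage return that terminates each header line.
  awk 'tolower($1) == "content-length:" { print $2 }' | tr -d '\r'
}

printf 'HTTP/1.1 200 OK\r\ncontent-length: 900\r\n\r\n' | content_length
# prints: 900
```

In the real script the printf would be replaced by curl -sI "$dir".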

sed is misbehaving when replacing certain regular expressions

I am trying to remove numbers - but only when they immediately follow periods. Similar replaces seem to work correctly, but not with periods.
I have tried the following which was given as a solution in another post:
echo "fr.r1.1.0" | sed s/\.[0-9][0-9]*/\./g
I get fr..... It seems that even though I escape the period it is matching arbitrary characters instead of only periods.
This expression seems to work for the previous example:
echo "fr.r1.1.0" | sed s/[[:punct:]][0-9][0-9]*/\./g
and gives me fr.r1.. but then for
echo "ge.s1_1.0" | sed s/[[:punct:]][0-9][0-9]*/\./g
I get ge.s1.. instead of ge.s1_1.
You will have to put the sed instructions between single quotes to avoid interpretation of some of the special characters by your shell:
echo "fr.r1.1.0" | sed 's/\.[0-9][0-9]*/\./g'
fr.r1..
Also, you do not need to escape the dot in the replacement part, and with GNU sed [0-9][0-9]* can be shortened to [0-9]\+ (the \+ operator is a GNU extension), giving the simplified command:
echo "fr.r1.1.0" | sed 's/\.[0-9]\+/./g'
fr.r1..
Last but not least, as POSIX [:punct:] character class is defined as
punctuation (all graphic characters except letters and digits)
https://en.wikibooks.org/wiki/Regular_Expressions/POSIX_Basic_Regular_Expressions
it will also include underscore (and a lot of other stuff), therefore, if you want to limit your matches to . followed by digits you will need to explicitly use dot (escaped or via its ascii value)
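To see both inputs from the question side by side, here is a small sketch using the quoted command with the portable [0-9][0-9]* form. The helper name strip_digits_after_dot is made up for illustration.

```shell
# Hypothetical helper: drop digits that immediately follow a literal dot.
strip_digits_after_dot() {
  sed 's/\.[0-9][0-9]*/./g'
}

echo "fr.r1.1.0" | strip_digits_after_dot
# prints: fr.r1..
echo "ge.s1_1.0" | strip_digits_after_dot
# prints: ge.s1_1.
```

The underscore in the second input is left alone because only a literal dot can start a match.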

sed is matching passed variable subsets, not exact matches

I'm having partial success using sed to replace variables in a text file, but I'm stuck on one case.
A script reads input from a list - say the $roll_symbol is C20.
sed replaces C20, GC20, and KC20 (because C20 matches part of the string).
I searched the web and tried the variations I found - no success.
I tried these variations without success:
escape the reserved character $
escape braces
escape both
use double quotes instead of single quotes.
The best version so far (which only partially works):
sed -i 's/'${roll_symbol}'/'${roll_symbol}\,${contract_month}'/g' $OUTPUT_DIRECTORY/$OUTPUT_FILE;
You need to tell sed what characters are legal before the start of your match to limit where it can match. To only match at start-of-word boundaries try \<.
sed -i "s/\<${roll_symbol}/${roll_symbol},${contract_month}/g" "$OUTPUT_DIRECTORY/$OUTPUT_FILE";
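If longer symbols such as C200 must also be left alone, an end-of-word anchor \> can be added. The sketch below assumes GNU sed (both \< and \> are GNU extensions), uses a made-up contract_month value, and wraps the command in a hypothetical helper tag_symbol that reads stdin instead of editing a file in place.

```shell
roll_symbol=C20
contract_month=H5   # hypothetical value for illustration

# Hypothetical helper: append ",<month>" only to whole-word occurrences of the symbol.
tag_symbol() {
  sed "s/\<${roll_symbol}\>/${roll_symbol},${contract_month}/g"
}

printf '%s\n' 'C20 GC20 KC20 C200' | tag_symbol
# prints: C20,H5 GC20 KC20 C200
```

This assumes the symbol contains no regex metacharacters; a symbol with dots or brackets would need escaping first.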

linux bash replace placeholder with unknown text which can contain any characters

If I want to replace for example the placeholder {{VALUE}} with another string which can contain any characters, what's the best way to do it?
Using sed s/{{VALUE}}/$(value)/g might fail if $(value) contains a slash...
oldValue='{{VALUE}}'
newValue='new/value'
echo "${var//$oldValue/$newValue}"
Note that oldValue is treated as a glob pattern here, not as a regexp. Otherwise, with sed:
echo "$var" | sed 's/{{VALUE}}/'"${newValue//\//\/}"'/g'
Sed also accepts other delimiters, as in 's|something|someotherthing|g', but if you can't control the input string, you'll have to escape it before passing it to sed.
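One possible shape for such an escaping function is sketched below. It assumes a single-line value and a / delimiter in the final command, and the name escape_replacement is hypothetical. It puts a backslash in front of the three characters that are special in a sed replacement: the backslash itself, the ampersand, and the delimiter.

```shell
# Hypothetical helper: make a single-line string safe to use as the
# replacement part of s/.../.../ (delimiter assumed to be /).
escape_replacement() {
  # The helper uses | as its own delimiter so the / in the bracket is unambiguous.
  printf '%s\n' "$1" | sed 's|[\\/&]|\\&|g'
}

newValue='new/value & more\stuff'
printf '%s\n' 'x={{VALUE}}' | sed "s/{{VALUE}}/$(escape_replacement "$newValue")/g"
# prints: x=new/value & more\stuff
```

Values containing newlines are not handled by this sketch.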
The question asked basically duplicates How can I escape forward slashes in a user input variable in bash?, Escape a string for sed search pattern, Using sed in a makefile; how to escape variables?, Use slashes in sed replace, and many other questions. “Use a different delimiter” is the usual answer. Pianosaurus's answer and Ben Blank's answer list characters (backslash and ampersand) that need to be escaped in the shell, besides whatever character is used as an alternate delimiter. However, they don't address the quoting-a-quote problem that will occur if your “string which can contain any characters” contains a double quote. The same kind of problem can affect the ${parameter/pattern/string} shell variable expansion mentioned in a previous answer.
Some other questions besides the few mentioned above suggest using awk, and that is usually a good approach to changes that are more complicated than are easy to do with sed. Also consider perl and python. Besides single- and double-quoted strings, python has u'...' unicode quoting, r'...' raw quoting, ur'...' quoting (Python 2 only), and triple quoting with ''' or """ delimiters. The question as stated doesn't provide enough context for specific awk/perl/python solutions.

sed regex with variables to replace numbers in a file

I'm trying to replace numbers in my text file by adding one to them, i.e.
sed 's/3/4/g' path.txt
sed 's/2/3/g' path.txt
sed 's/1/2/g' path.txt
Instead of this, can I automate it, i.e. find a \d and add one to it in the replacement?
Something like
sed 's/\([0-8]\)/\1+1/g' path.txt
I also want to capture more than one digit, i.e. ([0-9])\t([0-9]), and change each one while keeping the tab in between.
Thanks
edited #2
Using the perl example,
I also would like it to work with more digits i.e.
perl -pi~ -e 's/(\d+)\.(\d+)\.(\d+)\.(\d+)/ ($1+1)\.($2+1)\.($3+1)\.($4+1) /ge' output.txt
Any tips on making the above work?
There is no support for arithmetic in sed, but you can easily do this in Perl.
perl -pe 's/(\d+)/ $1+1 /ge'
With the /e option, the replacement expression needs to be valid Perl code. So to handle your final updated example, you need
perl -pi~ -e 's/(\d+)\.(\d+)\.(\d+)\.(\d+)/ ($1+1) . "." . ($2+1) . "." . ($3+1) . "." . ($4+1) /ge' output.txt
where strings are properly quoted and adjacent strings are concatenated with ., the Perl string concatenation operator. The parentheses around each addition are needed because + and . have the same precedence and associate left to right. (The numbers are coerced into strings when they are concatenated with a string.)
... Though of course, the first script already does that more elegantly, since with the /g flag it increments every sequence of digits by one, anywhere in the string.
Triplee's perl solution is the more generic answer, but Michal's sed solution works well for this particular case. However, Michal's sed solution is more easily written:
sed y/12345678/23456789/ path.txt
and is better implemented as
tr 12345678 23456789 < path.txt
This utterly fails to handle 2 digit numbers (as in the edited question).
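A short sketch makes the failure concrete. The helper name bump_digits is made up, and the inputs are arbitrary samples. Each character is mapped independently, so 19 becomes 29 rather than 20.

```shell
# Hypothetical helper wrapping the tr command above.
bump_digits() {
  tr 12345678 23456789
}

printf '%s\n' '3' '19' | bump_digits
# prints:
#   4
#   29
```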
You can do it with sed but it's not easy, see this thread.
And it's hard with awk too, see this.
I'd rather use perl for this (something like this can be seen in action at ideone):
perl -pe 's/([0-8])/$1+1/e'
(The ideone.com example must do some looping itself, as ideone does not set -pe by default.)
You can't do addition directly in sed. You could do it in awk by matching numbers using a regex in each line and increasing the value, but it's quite complicated. If you do not need to handle arbitrary numbers but only a limited set, like single-digit numbers from 0 to 8, you can put several replacement commands on a single sed command line by separating them with semicolons:
sed 's/8/9/g ; s/7/8/g; s/6/7/g; s/5/6/g; s/4/5/g; s/3/4/g; s/2/3/g; s/1/2/g; s/0/1/g' path.txt
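For arbitrary multi-digit numbers, the awk route mentioned above could look roughly like the sketch below. It is one possible approach rather than a canonical one, and the helper name increment_numbers is hypothetical. It repeatedly finds the next run of digits with match(), appends the text before it plus the incremented number, and continues with the remainder of the line.

```shell
# Hypothetical helper: add one to every run of digits on each input line.
increment_numbers() {
  awk '{
    out = ""
    rest = $0
    while (match(rest, /[0-9]+/)) {
      n = substr(rest, RSTART, RLENGTH) + 1         # the matched number, plus one
      out = out substr(rest, 1, RSTART - 1) n       # text before the match, then the new number
      rest = substr(rest, RSTART + RLENGTH)         # keep scanning after the match
    }
    print out rest
  }'
}

printf '%s\n' '1.2.3.9' 'a19 b99' | increment_numbers
# prints:
#   2.3.4.10
#   a20 b100
```

Leading zeros are not preserved (007 becomes 8), and very large numbers are limited by awk's floating-point arithmetic.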
This might work for you (GNU sed & Bash):
sed 's/[0-9]/$((&+1))/g;s/.*/echo "&"/e' file
This adds one to every individual digit. To increment whole numbers instead:
sed 's/[0-9]\+/$((&+1))/g;s/.*/echo "&"/e' file
N.B. This method is fraught with problems and may cause unexpected results.
