fedora sed command replace special characters - linux

I am totally new to sed, and as part of a script I am trying to replace a specific string in a file. I know that special characters need to be escaped with a backslash, but the problem is that if the special character comes first in the string being matched, the string is not replaced.
For example, my file contains
sldgfkls $bdxcv sldflksd
Now if i write the below code
sed -i 's/\b\$bdxcv\b/abcd/' filename
Then the above word is not replaced. But if the file contains
sldgfkls a$bdxcv sldflksd
Now if i write the below code
sed -i 's/\ba\$bdxcv\b/abcd/' filename
Then the above word is replaced.....
Please Help me here....

Clearly, \b does not consider a dollar sign to be a word character, so there is no word boundary for it to match between space and $.
Perhaps you want this instead:
sed -i 's/\(^\|[\t ]\)\$bdxcv\b/\1abcd/' filename
Assuming yours is GNU sed, see https://www.gnu.org/software/sed/manual/sed.html which contains this definition:
A “word” character is any letter or digit or the underscore character.
and thus not dollar sign.
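To see the difference, here is a small sketch that runs both substitutions on the sample line. It assumes GNU sed, since \b and \| are GNU extensions; the variable names are only for illustration.

```shell
line='sldgfkls $bdxcv sldflksd'

# \b before \$ cannot match: the space and the $ are both non-word
# characters, so there is no word boundary between them.
unchanged=$(printf '%s\n' "$line" | sed 's/\b\$bdxcv\b/abcd/')

# Match start-of-line or whitespace instead and put it back with \1.
replaced=$(printf '%s\n' "$line" | sed 's/\(^\|[\t ]\)\$bdxcv\b/\1abcd/')

printf '%s\n' "$unchanged" "$replaced"
```

The first line of output is the input unchanged; the second has abcd in place of $bdxcv.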

sed cannot operate on strings, only on regular expressions. Trying to figure out which characters need to be escaped to disable their regexp (or sed delimiter, or sed backreference) meaning, so that a regexp in sed behaves as if it were a string, is a fool's errand. Just use a tool that can operate on strings, e.g. awk.
$ awk '{for (i=1;i<=NF;i++) if ($i == "$bdxcv") $i="abcd"} 1' file
sldgfkls abcd sldflksd
The above uses string comparison and string assignment - no need to escape anything unless one of the strings contained the string delimiter, ".
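If the string can appear inside a field rather than as a whole field, a literal replacement can be sketched with index() and substr(). This is a variation on the idea above, not the answer's own code; replace_literal is a hypothetical helper name, and the strings are passed through the environment because awk would interpret backslash escapes in -v assignments.

```shell
# Replace every literal occurrence of $1 with $2 on each input line.
# No regular expressions are involved, so nothing needs escaping.
replace_literal() {
    old=$1 new=$2 awk '
        BEGIN { old = ENVIRON["old"]; new = ENVIRON["new"]; len = length(old) }
        {
            out = ""
            # Copy text up to each occurrence, append the replacement,
            # and continue with the remainder of the line.
            while (len > 0 && (pos = index($0, old)) > 0) {
                out = out substr($0, 1, pos - 1) new
                $0 = substr($0, pos + len)
            }
            print out $0
        }'
}

printf '%s\n' 'sldgfkls $bdxcv sldflksd' | replace_literal '$bdxcv' 'abcd'
```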

Replace the word with "\" using sed command fails? [duplicate]

I am using sed in a shell script to edit filesystem path names. Suppose I want to replace
/foo/bar
with
/baz/qux
However, sed's s/// command uses the forward slash / as the delimiter. If I do that, I see an error message emitted, like:
▶ sed 's//foo/bar//baz/qux//' FILE
sed: 1: "s//foo/bar//baz/qux//": bad flag in substitute command: 'b'
Similarly, sometimes I want to select line ranges, such as the lines between a pattern foo/bar and baz/qux. Again, I can't do this:
▶ sed '/foo/bar/,/baz/qux/d' FILE
sed: 1: "/foo/bar/,/baz/qux/d": undefined label 'ar/,/baz/qux/d'
What can I do?
In an address (a search pattern), you can use an alternative regex delimiter by putting a backslash before its first occurrence:
sed '\,some/path,d'
And just use it as is for the s command:
sed 's,some/path,other/path,'
You probably want to protect other metacharacters, though; this is a good place to use Perl and quotemeta, or equivalents in other scripting languages.
From man sed:
/regexp/
Match lines matching the regular expression regexp.
\cregexpc
Match lines matching the regular expression regexp. The c may be any character other than backslash or newline.
s/regular expression/replacement/flags
Substitute the replacement string for the first instance of the regular expression in the pattern space. Any character other than backslash or newline can be used instead of a slash to delimit the RE and the replacement. Within the RE and the replacement, the RE delimiter itself can be used as a literal character if it is preceded by a backslash.
Perhaps the closest to a standard, the POSIX/IEEE Open Group Base Specification says:
[2addr] s/BRE/replacement/flags
Substitute the replacement string for instances of the BRE in the
pattern space. Any character other than backslash or newline can
be used instead of a slash to delimit the BRE and the replacement.
Within the BRE and the replacement, the BRE delimiter itself can be
used as a literal character if it is preceded by a backslash.
When there is a slash / in the original string or the replacement string, we need to escape it using \. The following command works on Ubuntu 16.04 (sed 4.2.2).
sed 's/\/foo\/bar/\/baz\/qux/' file
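As a quick sketch of both forms from the answers above, using | as the delimiter (any character other than backslash or newline would do; the sample lines and variable names are made up for illustration):

```shell
# s command: with | as the delimiter, slashes in the paths need no escaping.
renamed=$(printf '%s\n' 'cd /foo/bar/src' | sed 's|/foo/bar|/baz/qux|')

# Address: the custom delimiter is introduced with a backslash.
# This deletes from the first line matching foo/bar through the next
# line matching baz/qux.
kept=$(printf '%s\n' 'one' 'x foo/bar' 'two' 'baz/qux y' 'three' |
    sed '\|foo/bar|,\|baz/qux|d')

printf '%s\n' "$renamed" "$kept"
```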

Remove double quotes within the column value using Unix

I am processing a 90-column CSV file that is semicolon-separated (;). (Case can be ignored. I am aware the file format is a mess, but I cannot change that.)
Input Rows :
"AAAAA";"ABABDBDA";"ASDASDA"asads";"123";"456"
"AAAAA";"ABABDBDA";"12322AAasd"asads";"123";"456"
"Lmnop";"asdasads";"mer";"123;2343;asa"dwd";"456"
Output Expected :
"AAAAA";"ABABDBDA";"ASDASDA asads";"123";"456"
"AAAAA";"ABABDBDA";"12322AAasd asads";"123";"456"
"Lmnop";"asdasads";"mer";"123;2343;asa dwd";"456"
(The double quote can be replaced by a space or removed.) Note that even though this is a ';'-separated file, some rows have ';' within the quoted data of a column.
Issue: in some rows I am getting an extra double quote within the quoted data.
Please advise me on how to handle this in Unix.
One trick you can use is to remove every " that is not at a field boundary. A simple sed script can be
$ sed -E 's/([^;])"([^;])/\1 \2/g' file
Note that if you allow escaped quote marks in your fields, this is going to remove them as well.
Note also the example in the comments, which one pass of this sed command does not cover. Matches cannot overlap, so a single character cannot be the right-hand context of one match and the left-hand context of the next; "a"b"c"; won't be handled correctly.
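One possible way around that limitation, sketched here as an assumption rather than as part of the answer, is to drop the g flag and loop with a label until no substitution is made. The function names are hypothetical.

```shell
# Single pass with the g flag: misses quotes that share a neighbour.
fix_quotes_once() { sed -E 's/([^;])"([^;])/\1 \2/g'; }

# Loop: replace one stray quote at a time and branch back to :a
# while the last substitution succeeded.
fix_quotes_loop() { sed -E -e ':a' -e 's/([^;])"([^;])/\1 \2/' -e 'ta'; }

printf '%s\n' '"x";"a"b"c";"y"' | fix_quotes_once
printf '%s\n' '"x";"a"b"c";"y"' | fix_quotes_loop
```

The single pass leaves the quote between b and c in place; the loop replaces both.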
What would you think of the following solution:
Replace all ";" by ;
Remove all remaining "
Replace all ; back into ";"
Add additional " characters, at the beginning and at the end of every line.
The whole thing can be done with tr or sed or whatever command you prefer.
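The steps above can be sketched with sed. One deliberate change: instead of a bare ; this uses a control character as the temporary placeholder, because some fields contain ; themselves and would otherwise be split in step 3. The placeholder (ASCII unit separator) is an assumption that it never occurs in the data, and fix_line_quotes is a hypothetical name.

```shell
sep=$(printf '\037')   # placeholder, assumed absent from the input

fix_line_quotes() {
    sed -e "s/\";\"/$sep/g" \
        -e 's/"//g' \
        -e "s/$sep/\";\"/g" \
        -e 's/^/"/' -e 's/$/"/'
    # 1: hide real field boundaries  2: drop every remaining quote
    # 3: restore the boundaries      4: re-add the outer quotes
}

printf '%s\n' '"Lmnop";"asdasads";"mer";"123;2343;asa"dwd";"456"' | fix_line_quotes
```

This variant removes the stray quote rather than replacing it with a space.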
mawk 'NF*(gsub(__," ",$!(NF=NF))^_ +gsub(OFS,FS) +gsub("^ | $",__))' \
__='\42' FS='\442\73\42' OFS='\31\17'
"AAAAA";"ABABDBDA";"ASDASDA asads";"123";"456"
"AAAAA";"ABABDBDA";"12322AAasd asads";"123";"456"
"Lmnop";"asdasads";"mer";"123;2343;asa dwd";"456"
This transformation is easy with a tool that provides regular expressions with zero-length assertions (lookbehind and lookahead). As you applied the unix tag, there is a good chance you have the perl command, so I propose the following solution. Let the content of file.txt be
"AAAAA";"ABABDBDA";"ASDASDA"asads";"123";"456"
"AAAAA";"ABABDBDA";"12322AAasd"asads";"123";"456"
"Lmnop";"asdasads";"mer";"123;2343;asa"dwd";"456"
then
perl -p -e 's/(?<=[[:alnum:]])"(?=[[:alnum:]])/ /g' file.txt
gives output
"AAAAA";"ABABDBDA";"ASDASDA asads";"123";"456"
"AAAAA";"ABABDBDA";"12322AAasd asads";"123";"456"
"Lmnop";"asdasads";"mer";"123;2343;asa dwd";"456"
Explanation: I tell perl that I want to use it sed-style via -p -e, then I provide a substitution (s): a " that comes after an alphanumeric character (letter or digit) and before an alphanumeric character is replaced by a space. This is applied to all such ", that is, globally (g).
Note: you can port this answer to any other tool that supports regular expressions with zero-length assertions.
(tested in perl 5, version 26, subversion 3)
When you consider the combination ";" as a delimiter, you can use
awk -F '";"' '{
printf "\"";
for (i=1;i<NF;i++) {
gsub("\"","", $i);
printf("%s\";\"",$i)
};
print $NF
}' inputfile
This might work for you (GNU sed):
sed -E ':a;s/^(("[^"]*";)*"[^"]*)"([^;])/\1 \3/;ta' file
Iterate starting from the start of the line, match zero or more correctly double quoted fields followed by an incorrect double quote and replace that double quote by a space.

How to split Lines using shell SED or something similar

I have a file containing the following
String, SomeotherString Additional, StringNew String
I would like to have the following output:
String, Someother
String Additional, String
New String
The delimiter is always a capital letter following a small letter without space. I tried
sed 's/\([a-z][A-Z]\)/\n\1/g' <<< 'String, SomeotherString Additional, StringNew String'
However, this leads to:
String, Someothe
rString Additional, Strin
gNew String
Thanks for your help
With sed:
sed 's/\([a-z]\)\([A-Z]\)/\1\n\2/g'
Matches a small letter (sub-expression 1) followed by a capital letter (sub-expression 2) and replaces them with the part matching sub-expression 1, a newline character, and the part matching sub-expression 2.
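The previous should work with GNU sed. POSIX does not define \n in the replacement, so some other seds may need a backslash followed by a literal newline there instead. With GNU sed and others that support it, you can use -E (also -r in GNU sed) to enable extended regexps, so that you don't have to put backslashes before the parentheses.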
sed -E 's/([a-z])([A-Z])/\1\n\2/g'
At least GNU sed also supports named character classes, so you can easily match other letters than a-z and A-Z too:
sed -E 's/([[:lower:]])([[:upper:]])/\1\n\2/g'
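For a sed that does not understand \n in the replacement, a sketch of the portable form uses a backslash followed by a real newline. split_on_case is a hypothetical wrapper name used only for this example.

```shell
# Insert a newline between a lowercase letter and the uppercase letter
# that follows it. The replacement contains a backslash and then an
# actual line break, which POSIX sed accepts.
split_on_case() {
    sed 's/\([[:lower:]]\)\([[:upper:]]\)/\1\
\2/g'
}

printf '%s\n' 'String, SomeotherString Additional, StringNew String' | split_on_case
```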
There is more than one way to do this, but here's one that uses perl:
echo 'StringSomeotherstringAdditionalString' | perl -pe 's/([a-z])([A-Z])/$1\n$2/g'
([a-z])([A-Z]) matches a small letter followed by a capital letter;
$1\n$2 puts a newline between them, so no empty line is produced before the first capital.

How to correctly detect and replace apostrophe (') with sed?

I'm having a directory with many files having special characters and spaces. I want to perform an operation with all these files so I'm trying to store all filenames in a list.txt and then run the command with this list.
The special characters in my list are & []'.
So basically I want to use sed to replace each occurrence with \ followed by the character in question.
E.g. : filename .txt => filename\ .txt etc...
The thing is I have trouble handling apostrophes.
Here is my command as of now :
ls | sed 's/\ /\\ /g' | sed 's/\&/\\&/g' | sed "s/\'/\\'/g" | sed 's/\[/\\[/g' | sed 's/\]/\\]/g'
At first I had issues with, I believe, the apostrophes in the string command in conflict with the apostrophes surrounding the string. So I used double quotes instead, but it still doesn't work.
I've tried all these and nothing worked :
sed "s/\'/\\'/g" (escaping the apostrophe)
sed "s/'/\'/g" (escaping nothing)
sed "s/'/\\'/g" (escaping the backslash)
sed 's/"'"/\"'"/g' (double quoting single quote)
As a disclaimer, I must say I'm completely new to sed. I just ran my first sed command today, so maybe I'm doing something wrong that I didn't realize.
PS: I've seen these threads, but no answer worked for me:
https://unix.stackexchange.com/questions/157076/how-to-remove-the-apostrophe-and-delete-the-space
How to replace to apostrophe ' inside a file using SED
This may do:
cat file
avbadf
test&rr
more [ yes
this ]
and'df
sed -r 's/(\x27|&|\[|\])/\\\1/g' file
avbadf
test\&rr
more \[ yes
this \]
and\'df
\x27 is equal to single quote '
\x22 is equal to double quote "
Whoops, I found the answer to my question. Here is the working input :
sed "s/'/\\\'/g"
This will effectively replace any ' with \'.
However I'm having trouble understanding exactly what's happening here.
So if I understand correctly, we are escaping the backslash and the apostrophe in the replacement string. Now, if somebody could answer these questions, I would be grateful:
Why don't we need to escape the first quote (the one in the pattern to find) ?
Why do we have to escape the backslash whereas for the other characters, there's no need ?
Why do we need to escape the second quote (the one in the replacement string) ?
I think all of your sed matches actually need that replacement pattern. This one seems to work for all examples:
ls | sed "s/\ /\\\ /g" | sed "s/\&/\\\&/g" | sed "s/\[/\\\[/g" | sed "s/\]/\\\]/g" | sed "s/'/\\\'/g"
So it is s/regex/replacement/flags, and the regex and the replacement have different sets of special characters.
The apostrophe itself is not special in either part, which is why it needs no escaping in the pattern. What matters is the backslash, which is processed twice. Inside double quotes the shell turns \\ into a single \ and leaves \' untouched, so sed receives s/'/\\'/g. In the replacement, sed then reads \\ as one literal backslash, followed by a plain '. With only two backslashes sed receives \' in the replacement, which it treats as just '. (In GNU sed, \' is special only in the regex, where it matches the end of the pattern space.)
The same applies to any other character. For example, to replace:
filename5.txt -> filename\5.txt
You would also need, as with the apostrophe:
sed "s/5/\\\5/g"
With only two backslashes sed would see \5 in the replacement, which is a back-reference to a fifth group that does not exist here.
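A short sketch can make the two layers visible: printing the double-quoted script with printf shows what sed actually receives after the shell has processed the backslashes. The variable names are only for illustration.

```shell
# What the shell hands to sed after double-quote processing:
printf '%s\n' "s/'/\\'/g"     # two backslashes typed, sed sees one
printf '%s\n' "s/'/\\\'/g"    # three backslashes typed, sed sees two

# One backslash reaches sed: \' in the replacement is just ', so nothing changes.
noop=$(printf '%s\n' "it's" | sed "s/'/\\'/g")

# Two backslashes reach sed: \\ is a literal backslash, followed by '.
escaped=$(printf '%s\n' "it's" | sed "s/'/\\\'/g")

printf '%s\n' "$noop" "$escaped"
```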
Please try the following:
sed 's/[][ &'\'']/\\&/g' file
Using the same example as @Jotne, the result will be:
avbadf
test\&rr
more\ \[\ yes
this\ \]
and\'df
[How it works]
The regex part in the sed s command above just defines a character
class of & []', which should be escaped with a backslash.
The right square bracket ] does not need escaping when put
immediately after the left square bracket [.
The confusing part is the handling of the single quote.
We cannot put a single quote within single quotes, even if we escape it.
The workaround is as follows. Say we have an assignment str='aaabbb'.
To put a single quote between "aaa" and "bbb", we can write
str='aaa'\''bbb'.
It may look puzzling but it just concatenates the three sequences;
1) to close the single-quoted string as 'aaa'.
2) to put a single quote with an escaping backslash as \'.
3) to restart the single-quoted string as 'bbb'.
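A minimal sketch of both points, the quote concatenation and the character class, with escape_name as a hypothetical wrapper around the sed command above:

```shell
# Three pieces concatenated by the shell: 'aaa'  \'  'bbb'
str='aaa'\''bbb'
printf '%s\n' "$str"

# Same trick inside the sed script: the class is ] [ space & and '
escape_name() { sed 's/[][ &'\'']/\\&/g'; }

printf '%s\n' "and'df" 'more [ yes' | escape_name
```

In the replacement, \\ is a literal backslash and & stands for the matched character, so each special character gets a backslash in front of it.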
Hope this helps.
