How to make GNU sed remove certain characters from a line - linux

I have the following line:
�5=?�#A00165:69:HKJ3YDMXX:1:1101:16812:7341 1:N:0:TCTTAAAG
and would like to remove the characters �5=?� in front of #. The desired output is:
#A00165:69:HKJ3YDMXX:1:1101:16812:7341 1:N:0:TCTTAAAG
I used GNU sed (v4.8) with the following command:
sed "s/.*#/#/"
but this did not remove �5=?�, though it worked in the GNU sed live editor.
I would really appreciate any help with this.
My system is 3.10.0-1160.71.1.el7.x86_64

Using sed, remove everything before the first occurrence of #:
$ sed 's/^[^#]*//' input_file
#A00165:69:HKJ3YDMXX:1:1101:16812:7341 1:N:0:TCTTAAAG
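To try this without the original file, here is a minimal sketch that rebuilds the sample line with printf. It assumes the stray characters are U+FFFD encoded as UTF-8 (bytes 357 277 275 in octal), and it sets LC_ALL=C so that sed treats the input as plain bytes. In a UTF-8 locale GNU sed does not match byte sequences that are not valid UTF-8 with . or [^#], which may or may not be what happened in the question.

```shell
# Rebuild the sample line; \357\277\275 is U+FFFD in UTF-8 (an assumption about the input).
line=$(printf '\357\277\2755=?\357\277\275#A00165:69:HKJ3YDMXX:1:1101:16812:7341 1:N:0:TCTTAAAG')

# Delete the run of non-# characters at the start of the line.
# LC_ALL=C makes sed work byte by byte, so odd bytes cannot block the match.
result=$(printf '%s\n' "$line" | LC_ALL=C sed 's/^[^#]*//')

printf '%s\n' "$result"
```

If the same command without LC_ALL=C leaves the prefix in place on your file, the locale is the likely culprit.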

This might work for you (GNU sed):
sed -E 's/(\o357\o277\o275)5=\?\1//g' file
This removes all occurrences of �5=?�.
N.B. To translate the octal strings use sed -n l file to display the file as is. The triplets \357\277\275 can be matched in the LHS of the substitute command by using \o357\o277\o275.
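A small sketch of that workflow, assuming GNU sed and a shortened, made-up sample line. LC_ALL=C is an assumption added here so that every non-ASCII byte is listed as an octal escape and matched as a single byte.

```shell
# Shortened sample: U+FFFD, "5=?", U+FFFD, then "#A" (hypothetical test data).
sample=$(printf '\357\277\2755=?\357\277\275#A')

# Step 1: list the line unambiguously; non-printable bytes appear as \ooo and $ marks the end.
listing=$(printf '%s\n' "$sample" | LC_ALL=C sed -n l)
printf '%s\n' "$listing"

# Step 2: reuse the octal triplets from the listing as \oNNN escapes in the pattern.
cleaned=$(printf '%s\n' "$sample" | LC_ALL=C sed -E 's/(\o357\o277\o275)5=\?\1//g')
printf '%s\n' "$cleaned"
```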

Related

How to use sed to replace a filename in text file

I have a file:
dynamicclaspath.cfg
VENDOR_JAR=/clear-as-1-d/apps/sterling/jar/struts/2_5_18/1_0_0/log4j-core-2.10.0.jar
VENDOR_JAR=/clear-as-1-d/apps/sterling/jar/log4j/2_17_1/log4j-core-2.10.0.jar
I want to replace any occurrence of log4j-core* with log4j-core-2.17.1.jar
I tried this but I know I'm missing a regex:
sed -i '/^log4j-core/ s/[-]* /log4j-core-2.17.1.jar/'
With your shown samples, please try the following sed program. The -E option enables ERE (extended regular expressions). The substitution uses a capturing group to store everything up to and including the last / before the file name, then matches log4j-core- through the .jar at the end of the value. The replacement is the first capturing group followed by the new log4j file name, as the OP requires.
sed -E 's/(^.*\/)log4j-core-.*\.jar$/\1log4j-core-2.17.1.jar/' Input_file
Using sed
$ sed -E 's/(log4j-core-)[0-9.]+/\12.17.1./' input_file
VENDOR_JAR=/clear-as-1-d/apps/sterling/jar/struts/2_5_18/1_0_0/log4j-core-2.17.1.jar
VENDOR_JAR=/clear-as-1-d/apps/sterling/jar/log4j/2_17_1/log4j-core-2.17.1.jar
It depends on possible other contents in your input file how specific the search pattern must be.
sed 's/log4j-core-.*\.jar/log4j-core-2.17.1.jar/' inputfile
or
sed 's/log4j-core-[0-9.]*\.jar/log4j-core-2.17.1.jar/' inputfile
or (if log4j-core*.jar is always the last part of the line)
sed 's/log4j-core.*/log4j-core-2.17.1.jar/' inputfile
sed -i 's#2\.10\.0\.jar$#2.17.1.jar#' file
That seems to work.
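To compare the patterns safely before using -i, a quick dry run on sample text helps. This sketch uses the two lines from the question plus one made-up OTHER_JAR line (hypothetical) to show that the more specific pattern leaves unrelated jars alone.

```shell
# Two lines from the question plus one hypothetical unrelated entry.
input='VENDOR_JAR=/clear-as-1-d/apps/sterling/jar/struts/2_5_18/1_0_0/log4j-core-2.10.0.jar
VENDOR_JAR=/clear-as-1-d/apps/sterling/jar/log4j/2_17_1/log4j-core-2.10.0.jar
OTHER_JAR=/opt/lib/commons-io-2.10.0.jar'

# Replace only log4j-core-<version>.jar; no -i, so nothing on disk is modified.
result=$(printf '%s\n' "$input" | sed 's/log4j-core-[0-9.]*\.jar/log4j-core-2.17.1.jar/')

printf '%s\n' "$result"
```

Once the output looks right, the same expression can be run with -i against the real file.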

SED : replace part of syntax to lower case

Original Text file:
.ABC0 (ABC0),
.EFG2 (EFG2),
.ZZZ3 (ZZZ3),
How to convert this part to
.ABC0 (abc0),
.EFG2 (efg2),
.ZZZ3 (zzz3),
with SED command easily?
I can't get it to work:
echo ".ABC(ABC)," | sed -e 's/\(.*\.[A-Z]*\(\)\([A-Z]*\)\)/\1\L\2\E/ p'
You were almost there.
sed -e 's/\(\.[A-Z0-9]*\)\( ([A-Z0-9]*)\)/\1\L\2\E/g'
Remove the .* from the beginning, it matches as much as it can, i.e. it skips to the last occurrence of the pattern.
Include the digits in the character classes.
Use ( without a backslash to match a parenthesis literally.
This might work for you (GNU sed):
sed 's/([^)]*)/\L&/g' file
Replace the contents of each pair of parentheses with its lowercase equivalent.
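A quick check of that one-liner on the question's sample lines. Note that \L is a GNU sed extension, so this sketch assumes GNU sed is the sed on your PATH.

```shell
# Lowercase everything matched inside (...), leaving the rest of the line untouched.
result=$(printf '%s\n' '.ABC0 (ABC0),' '.EFG2 (EFG2),' '.ZZZ3 (ZZZ3),' | sed 's/([^)]*)/\L&/g')

printf '%s\n' "$result"
```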

Conditional replace using sed

My question is probably rather simple. I'm trying to replace sequences of strings that are at the beginning of lines in a file. For example, I would like to replace any instance of the pattern "GN" with "N" or "WR" with "R", but only if they are the first 2 characters of that line. For example, if I had a file with the following content:
WRONG
RIGHT
GNOME
I would like to transform this file to give
RONG
RIGHT
NOME
I know i can use the following to replace any instance of the above example;
sed -i 's/GN/N/g' file.txt
sed -i 's/WR/R/g' file.txt
The issue is that I want this to happen only if the above patterns are the first 2 characters in any given line. Possibly an IF statement, although i'm not sure what the condition would look like. Any pointers in the right direction would be much appreciated, thanks.
Just add the circumflex (^) and remove the g suffix (unnecessary, since you want at most one replacement per line). You can also combine both substitutions in one script:
sed -i 's/^GN/N/;s/^WR/R/' file.txt
Use the start-of-string regexp anchor ^:
sed -i 's/^GN/N/' file.txt
sed -i 's/^WR/R/' file.txt
Since sed is line-oriented, start-of-string == start-of-line.
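To see the difference the anchor makes, here is a small sketch without -i. The extra word SIGN is made up for illustration: it contains GN, but not at the start of the line.

```shell
words=$(printf '%s\n' WRONG RIGHT GNOME SIGN)

# Anchored: only a leading GN or WR is rewritten.
anchored=$(printf '%s\n' "$words" | sed 's/^GN/N/;s/^WR/R/')

# Unanchored: GN is rewritten anywhere in the line, so SIGN is changed too.
unanchored=$(printf '%s\n' "$words" | sed 's/GN/N/g;s/WR/R/g')

printf 'anchored:\n%s\n\nunanchored:\n%s\n' "$anchored" "$unanchored"
```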

I am having trouble with Sed

I am trying to use the sed command to replace this line:
charmm.c36a4.20140107.newcali4.fixhcali.grange.b
with:
charmm.20140911.c36a4.3rd.ghost2.model3rd
When I use:
sed -i '/s/firstline/secondline/g'
It doesn't work. I think the periods are messing it up. How do I get around this?
sed uses regular expressions, so . matches any character. If you want to only match the . character itself, tell sed to look for \.
So, to change the first line into the second line:
sed -e 's/charmm\.c36a4\.20140107\.newcali4\.fixhcali\.grange\.b/charmm.20140911.c36a4.3rd.ghost2.model3rd/g' < filetochange > newfile
Here, I added "g" so the substitution is global, i.e. if there are several instances on the same line, all of them will be changed. If you remove the "g", only the first occurrence on each line is changed.
This reads from filetochange and writes to newfile.
If you run:
sed -i -e 's/charmm\.c36a4\.20140107\.newcali4\.fixhcali\.grange\.b/charmm.20140911.c36a4.3rd.ghost2.model3rd/g' filetochange
the change is made directly in filetochange. Be careful, though: a badly written sed -i can mess up the file and make it unusable.
The s command follows this syntax:
s/pattern/replacement/
You need to drop the / in front of the s command.
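To illustrate why escaping the periods matters, here is a small sketch. The input charmmXc36a4 is a made-up lookalike, and MATCH is just a marker string.

```shell
# Unescaped dot: matches any character, so the lookalike is (wrongly) replaced.
loose=$(printf '%s\n' 'charmmXc36a4' | sed 's/charmm.c36a4/MATCH/')

# Escaped dot: matches only a literal period, so the lookalike is left alone.
strict=$(printf '%s\n' 'charmmXc36a4' | sed 's/charmm\.c36a4/MATCH/')

printf 'loose:  %s\nstrict: %s\n' "$loose" "$strict"
```

Only the pattern side needs the backslashes; in the replacement a period is always literal.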

sed help: matching and replacing a literal "\n" (not the newline)

I have a file which contains several instances of \n.
I would like to replace them with actual newlines, but sed doesn't recognize the \n.
I tried:
sed -r -e 's/\n/\n/'
sed -r -e 's/\\n/\n/'
sed -r -e 's/[\n]/\n/'
and many other ways of escaping it.
Is sed able to recognize a literal \n? If so, how?
Is there another program that can read the file, interpreting the \n's as real newlines?
Please try this:
sed -i 's/\\n/\n/g' input_filename
What exactly works depends on your sed implementation. This is poorly specified in POSIX so you see all kinds of behaviors.
The -r option is also not part of the POSIX standard; but your script doesn't use any of the -r features, so let's just take it out. (For what it's worth, it changes the regex dialect supported in the match expression from POSIX "basic" to "extended" regular expressions; some sed variants have an -E option which does the same thing. In brief, the extended syntax lets you write capturing parentheses and repetition braces without backslashes, and adds operators such as + and ?.)
On BSD platforms (including macOS), you will generally want to backslash the literal newline, like this:
sed 's/\\n/\
/g' file
On some other systems, like Linux (also depending on the precise sed version installed -- some distros use GNU sed, others favor something more traditional, still others let you choose) you might be able to use a literal \n in the replacement string to represent an actual newline character; but again, this is nonstandard and thus not portable.
If you need a properly portable solution, probably go with Awk or (gasp) Perl.
perl -pe 's/\\n/\n/g' file
In case you don't have access to the manuals, the /g flag says to replace every occurrence on a line; the default behavior of the s/// command is to only replace the first match on every line.
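A small sketch of the backslash-newline form on made-up input. printf '%s\n' is used instead of echo because echo may or may not interpret \n itself, depending on the shell.

```shell
# The input holds two-character \n sequences (a backslash followed by n), not real newlines.
result=$(printf '%s\n' 'first\nsecond\nthird' | sed 's/\\n/\
/g')

# Each literal \n is now a real line break.
printf '%s\n' "$result"
```

In the pattern, \\n means a backslash followed by n; in the replacement, the backslash at the end of the line stands for a newline.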
awk seems to handle this fine:
echo "test \n more data" | awk '{sub(/\\n/,"**")}1'
test ** more data
Here you need to escape the \ using \\
$ echo "\n" | sed -e 's/[\\][n]/hello/'
sed works on one line at a time, and the trailing newline is removed when the line is read into the buffer, so a single line never contains \n. Use commands such as N, or H together with G, to fill the buffer with more than one line; then \n does appear inside it. Be careful: ^ and $ then no longer mark the start and end of a line but the start and end of the whole buffer, because of the \n inside.
\n is recognized in the search pattern, not in the replace pattern. Two ways for using it (sample):
sed 's/\(\n\)bla/\1blabla\1/'
sed 's/\nbla/\
blabla\
/'
The first reuses a \n already in the buffer as a back-reference (shorter code in the replacement);
the second uses a real newline.
So basically
sed "N
$ s/\(\n\)/\1/g
"
works (but is a bit useless). I imagine that s/\(\n\)\n/\1/g is more like what you want.
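To make the role of N concrete, here is a minimal sketch on two made-up lines. Without N the pattern \nbla can never match, because the buffer holds a single line; with N the second line is appended and the embedded newline becomes visible to s.

```shell
# Without N: each line is processed alone, so \n never appears in the buffer.
without_n=$(printf '%s\n' 'alpha' 'bla' | sed 's/\nbla/ + bla/')

# With N: the next line is appended, the buffer is "alpha\nbla", and the match succeeds.
with_n=$(printf '%s\n' 'alpha' 'bla' | sed 'N;s/\nbla/ + bla/')

printf 'without N:\n%s\n\nwith N:\n%s\n' "$without_n" "$with_n"
```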

Resources