A good way to use sed to find and replace characters with 2 delimiters - linux

I am trying to find and replace items using bash. I was able to use sed to strip out some of the characters, but I think I might be using it in the wrong manner.
I am basically trying to remove the characters after ";" and before "," including removing ","
sed -e 's/\(;\).*\(,\)/\1\2/'
That is what I used to replace it with nothing. However, it ends up replacing everything in the middle so my output came out like this:
cmd2="BMC,./socflash_x64 if=B600G3_BMC_V0207.ima;,reboot -f"
This is the original text of what I need to replace
cmd2="BMC,./socflash_x64 if=B600G3_BMC_V0207.ima;X,sleep 120;after_BMC,./run-after-bmc-update.sh;hba_fw,./hba_fw.sh;X,sleep 5;DB,2;X,reboot -f"
Is there any way to make it look like this output?
./socflash_x64 if=B600G3_BMC_V0207.ima;sleep 120;./run-after-bmc-update.sh;./hba_fw.sh;sleep 5;reboot -f
If there is any way to make this happen other than bash, I am fine with any language.

Non-greedy search can (mostly) be simulated in programs that don't support it by replacing match-any (dot .) with a negated character class.
Your original command is
sed -e 's/\(;\).*\(,\)/\1\2/'
You want to match everything in between the semi-colon and the comma, but not another comma (non-greedy). Replace .* with [^,]*
sed -e 's/\(;\)[^,]*\(,\)/\1\2/'
You may also want to exclude semi-colons themselves, making the expression
sed -e 's/\(;\)[^,;]*\(,\)/\1\2/'
Note this would treat a string like "asdf;zxcv;1234,qwer" differently, since one would match ;zxcv;1234, and the other would match only ;1234,
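Applied to the string from the question, the negated class can be combined with the g flag so that every ;label, pair is handled, plus one extra expression for the leading label. This is only a minimal sketch: it assumes the value of cmd2 is already in a shell variable and that labels never contain ; or , (the variable name result is just for illustration). Note that the DB,2 entry is reduced to 2 rather than dropped.

```shell
# Value of cmd2 from the question.
cmd2='BMC,./socflash_x64 if=B600G3_BMC_V0207.ima;X,sleep 120;after_BMC,./run-after-bmc-update.sh;hba_fw,./hba_fw.sh;X,sleep 5;DB,2;X,reboot -f'

# First expression: drop the leading label up to the first comma.
# Second expression: after every ';', drop the label up to the next comma.
result=$(printf '%s\n' "$cmd2" | sed -e 's/^[^,]*,//' -e 's/;[^,;]*,/;/g')

printf '%s\n' "$result"
# prints: ./socflash_x64 if=B600G3_BMC_V0207.ima;sleep 120;./run-after-bmc-update.sh;./hba_fw.sh;sleep 5;2;reboot -f
```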

In perl:
perl -pe 's/;.*?,/;/g;' -pe 's/^[^,]*,//' foo.txt
will output:
./socflash_x64 if=B600G3_BMC_V0207.ima;sleep 120;./run-after-bmc-update.sh;./hba_fw.sh;sleep 5;2;reboot -f
The .*? is a non-greedy match up to the next comma. The second command removes everything from the beginning of the line through the first comma.

Something like:
echo $cmd2 | tr ';' '\n' | cut -d',' -f2- | tr '\n' ';' ; echo
result is:
./socflash_x64 if=B600G3_BMC_V0207.ima;sleep 120;./run-after-bmc-update.sh;./hba_fw.sh;sleep 5;2;reboot -f;
However, I think your requirements are a bit more complex, because 'DB,2' seems to be a special case. After the first "tr" command, insert a "grep" or "grep -v" to include or exclude such cases.
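A rough sketch of that idea, assuming the special case is simply "drop any entry whose label is DB" (that rule is a guess based on the desired output in the question). It also swaps the final tr for paste, which joins the lines without leaving a trailing semicolon:

```shell
# Value of cmd2 from the question.
cmd2='BMC,./socflash_x64 if=B600G3_BMC_V0207.ima;X,sleep 120;after_BMC,./run-after-bmc-update.sh;hba_fw,./hba_fw.sh;X,sleep 5;DB,2;X,reboot -f'

result=$(printf '%s\n' "$cmd2" |
  tr ';' '\n' |          # one "label,command" entry per line
  grep -v '^DB,' |       # assumption: entries labelled DB are unwanted
  cut -d',' -f2- |       # keep everything after the first comma
  paste -s -d ';' -)     # join the lines back together with ';'

printf '%s\n' "$result"
# prints: ./socflash_x64 if=B600G3_BMC_V0207.ima;sleep 120;./run-after-bmc-update.sh;./hba_fw.sh;sleep 5;reboot -f
```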

Related

Linux Bash. Delete line if field exactly matches

I have something like this in a file named file.txt
AA.201610.pancake.Paul
AA.201610.hello.Robert
A.201610.hello.Mark
Now, I ONLY have the first three fields, stored in 3 variables like:
field1="A"
field2="201610"
field3='hello'.
I'd like to remove a line only if its first 3 fields match these values exactly. In the case described above, I want only the third line to be removed from file.txt. Is there a way to do that? And is there a way to do it in place, in the same file?
I tried with:
sed -i /$field1"."$field2"."$field3"."/Id file.txt
but of course this removes both the second and the third line
I suggest using awk for this as sed can only do regex search and that requires escaping all special meta-chars and anchors, word boundaries etc to avoid false matches.
Suggested awk with non-regex matching:
awk -F '[.]' -v f1="$field1" -v f2="$field2" -v f3="$field3" '
!($1==f1 && $2==f2 && $3==f3)' file
AA.201610.pancake.Paul
AA.201610.hello.Robert
Use ^ to anchor the pattern at the beginning of the line. Also note that . in a regex means "any character" and not a literal period. You have to escape it: either \. (be careful with shell escaping and the difference between single and double quotes) or [.]
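Put together, the anchored version might look like the sketch below. It assumes the three fields contain no regex metacharacters of their own; the sample lines are fed through a pipe here, and the variable kept is only for illustration.

```shell
field1="A"
field2="201610"
field3="hello"

# ^ anchors the match at the start of the line; [.] matches a literal period.
kept=$(printf '%s\n' \
    'AA.201610.pancake.Paul' \
    'AA.201610.hello.Robert' \
    'A.201610.hello.Mark' |
  sed "/^${field1}[.]${field2}[.]${field3}[.]/d")

printf '%s\n' "$kept"
# prints:
# AA.201610.pancake.Paul
# AA.201610.hello.Robert
```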
Sed cannot do string matches, only regexp matches which becomes horrendously complicated to work around when you simply want to match a literal string (see Is it possible to escape regex metacharacters reliably with sed). Just use awk:
$ awk -v str="${field1}.${field2}.${field3}." 'index($0,str)!=1' file
AA.201610.pancake.Paul
AA.201610.hello.Robert
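The question also asked about changing the same file. awk has no portable in-place option, so one common approach is to write to a temporary file and move it over the original. A minimal sketch, assuming mktemp is available (it is widespread but not required by POSIX); the directory and file names are made up for the demonstration:

```shell
# Build a scratch copy of the sample file so the demo is self-contained.
workdir=$(mktemp -d)
file="$workdir/file.txt"
printf '%s\n' \
  'AA.201610.pancake.Paul' \
  'AA.201610.hello.Robert' \
  'A.201610.hello.Mark' > "$file"

field1="A"
field2="201610"
field3="hello"

# Keep every line that does NOT start with the literal string, then
# replace the original only if awk succeeded.
awk -v str="${field1}.${field2}.${field3}." 'index($0, str) != 1' "$file" > "$file.tmp" &&
  mv "$file.tmp" "$file"

cat "$file"
# prints:
# AA.201610.pancake.Paul
# AA.201610.hello.Robert
```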
The question was about bash so in bash:
#!/usr/bin/env bash
field1="A"
field2="201610"
field3='hello'
IFS=
while read -r i
do
case "$i" in
"${field1}.${field2}.${field3}."*) ;;
*) echo -E "$i"
esac
done < file.txt

sed help: matching and replacing a literal "\n" (not the newline)

I have a file which contains several instances of \n.
I would like to replace them with actual newlines, but sed doesn't recognize the \n.
I tried:
sed -r -e 's/\n/\n/'
sed -r -e 's/\\n/\n/'
sed -r -e 's/[\n]/\n/'
and many other ways of escaping it.
Is sed able to recognize a literal \n? If so, how?
Is there another program that can read the file, interpreting the \n's as real newlines?
Can you please try this
sed -i 's/\\n/\n/g' input_filename
What exactly works depends on your sed implementation. This is poorly specified in POSIX so you see all kinds of behaviors.
The -r option is also not part of the POSIX standard; but your script doesn't use any of the -r features, so let's just take it out. (For what it's worth, it changes the regex dialect supported in the match expression from POSIX "basic" to "extended" regular expressions; some sed variants have an -E option which does the same thing. In brief, things like capturing parentheses and repeating braces are "extended" features.)
On BSD platforms (including MacOS), you will generally want to backslash the literal newline, like this:
sed 's/\\n/\
/g' file
On some other systems, like Linux (also depending on the precise sed version installed -- some distros use GNU sed, others favor something more traditional, still others let you choose) you might be able to use a literal \n in the replacement string to represent an actual newline character; but again, this is nonstandard and thus not portable.
If you need a properly portable solution, probably go with Awk or (gasp) Perl.
perl -pe 's/\\n/\n/g' file
In case you don't have access to the manuals, the /g flag says to replace every occurrence on a line; the default behavior of the s/// command is to only replace the first match on every line.
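To try the backslash-newline form without a file, the input can be generated with printf, whose %s format leaves the backslash untouched (unlike echo, whose handling of \n varies between shells). A small sketch; the sample text and the variable out are made up:

```shell
# The single-quoted string contains a literal backslash followed by n, twice.
input='first\nsecond\nthird'

# The replacement is a backslash followed by a real line break.
out=$(printf '%s\n' "$input" | sed 's/\\n/\
/g')

printf '%s\n' "$out"
# prints:
# first
# second
# third
```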
awk seems to handle this fine:
echo "test \n more data" | awk '{sub(/\\n/,"**")}1'
test ** more data
Here you need to escape the \ using \\
$ echo "\n" | sed -e 's/[\\][n]/hello/'
sed works one line at a time, so a pattern space holding a single line never contains a newline (sed strips the trailing newline when it reads the line into the buffer). Use N (or H followed by G) to get more than one line into the buffer; then \n appears inside it. Be careful: ^ and $ then no longer mean start and end of line but start and end of the whole buffer, because of the \n inside.
Portably, \n is recognized in the search pattern, not in the replacement. Two ways to work around that (sample):
sed s/\(\n\)bla/\1blabla\1/
sed s/\nbla/\
blabla\
/
The first reuses a \n that is already in the pattern space as a back reference (shorter code in the replacement);
the second uses a real newline.
So basically
sed "N
$ s/\(\n\)/\1/g
"
works (but is a bit useless). I imagine that s/\(\n\)\n/\1/g is more like what you want.
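The effect of N is easiest to see on a two-line input. In this small sketch (the sample text and variable names are made up), the same substitution does nothing when sed sees one line at a time and joins the lines once N has pulled the second line into the buffer:

```shell
# Without N each line is processed alone, so there is no \n to match.
unchanged=$(printf 'one\ntwo\n' | sed 's/\n/ /')

# With N the buffer holds "one\ntwo", so the embedded newline can be replaced.
joined=$(printf 'one\ntwo\n' | sed 'N;s/\n/ /')

printf '%s\n' "$joined"
# prints: one two
```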

Why can the s command of sed be followed by a comma?

I saw someone use an expression like: sed -e 's, *$,,'
does anybody know why we can use it like this, and what does it do?
I thought the s command should be sed -e 'addr,addrs/reg/sub/' ?
From Using different delimiters in sed:
sed takes whatever follows the "s" as the separator
It is a good way to avoid escaping too much. Code is more readable if you use a delimiter that is not present in the string you want to handle.
For example let's say we want to replace lo/bye from a string. With / as delimiter it would be a little messy:
$ echo "hello/bye" | sed 's/lo\/bye/aa/g'
helaa
So if we define another separator it is more clear:
$ echo "hello/bye" | sed 's|lo/bye|aa|g'
helaa
$ echo "hello/bye" | sed 's,lo/bye,aa,g'
helaa
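As for the expression in the question: with , as the delimiter, 's, *$,,' has the pattern " *$" (any run of spaces at the end of the line) and an empty replacement, so it strips trailing spaces. A quick sketch, with brackets added around the output to make the whitespace visible (the sample text is made up):

```shell
# Comma-delimited form from the question.
trimmed=$(printf 'hello   \n' | sed -e 's, *$,,')

# The same command written with the usual slash delimiter.
trimmed_slash=$(printf 'hello   \n' | sed -e 's/ *$//')

printf '[%s]\n' "$trimmed"
# prints: [hello]
```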

How to replace a special character with another character using shell

I have a string variable x=tmp/variable/custom-sqr-sample/test/example
in the script; what I want to do is replace every “-” with /.
After that, I should get the following string:
x=tmp/variable/custom/sqr/sample/test/example
Can anyone help me?
I tried the following syntax, but it did not work:
exa=tmp/variable/custom-sqr-sample/test/example
exa=$(echo $exa|sed 's/-///g')
sed supports almost any delimiter, which comes in handy when the pattern contains a /. Common choices are | and #; pick one that does not appear in the string you need to work on.
$ echo $x
tmp/variable/custom-sqr-sample/test/example
$ sed 's#-#/#g' <<< $x
tmp/variable/custom/sqr/sample/test/example
In the command you tried above, all you need is to escape the slash, i.e.
echo $exa | sed 's/-/\//g'
but choosing a different delimiter is nicer.
The tr tool may be a better choice than sed in this case:
x=tmp/variable/custom-sqr-sample/test/example
echo "$x" | tr -- - /
(The -- isn't strictly necessary, but keeps tr (and humans) from mistaking - for an option.)
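Since the goal was to change the variable rather than just print it, the output of tr can be captured with command substitution. A minimal sketch using the variable from the question:

```shell
x=tmp/variable/custom-sqr-sample/test/example

# Translate every '-' to '/' and store the result back in x.
x=$(printf '%s\n' "$x" | tr -- - /)

printf '%s\n' "$x"
# prints: tmp/variable/custom/sqr/sample/test/example
```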
In bash, you can use parameter substitution:
$ exa=tmp/variable/custom-sqr-sample/test/example
$ exa=${exa//-/\/}
$ echo $exa
tmp/variable/custom/sqr/sample/test/example

Removing Parts of String With Sed

I have lines of data that looks like this:
sp_A0A342_ATPB_COFAR_6_+_contigs_full.fasta
sp_A0A342_ATPB_COFAR_9_-_contigs_full.fasta
sp_A0A373_RK16_COFAR_10_-_contigs_full.fasta
sp_A0A373_RK16_COFAR_8_+_contigs_full.fasta
sp_A0A4W3_SPEA_GEOSL_15_-_contigs_full.fasta
How can I use sed to delete parts of string after 4th column (_ separated) for each line.
Finally yielding:
sp_A0A342_ATPB_COFAR
sp_A0A342_ATPB_COFAR
sp_A0A373_RK16_COFAR
sp_A0A373_RK16_COFAR
sp_A0A4W3_SPEA_GEOSL
cut is a better fit.
cut -d_ -f 1-4 old_file
This simply means use _ as delimiter, and keep fields 1-4.
If you insist on sed:
sed 's/\(_[^_]*\)\{4\}$//'
This left hand side matches exactly four repetitions of a group, consisting of an underscore followed by 0 or more non-underscores. After that, we must be at the end of the line. This is all replaced by nothing.
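Because this expression counts four groups from the end of the line, it only agrees with cut when every line has exactly eight fields, as the sample data does. The sketch below compares the two on one sample line and on a made-up five-field line (a_b_c_d_e is purely illustrative), where they differ:

```shell
line='sp_A0A342_ATPB_COFAR_6_+_contigs_full.fasta'
short='a_b_c_d_e'   # hypothetical line with only five fields

# Eight fields: removing the last four leaves the first four.
from_sed=$(printf '%s\n' "$line" | sed 's/\(_[^_]*\)\{4\}$//')
from_cut=$(printf '%s\n' "$line" | cut -d_ -f 1-4)

# Five fields: sed still removes the last four, cut still keeps the first four.
short_sed=$(printf '%s\n' "$short" | sed 's/\(_[^_]*\)\{4\}$//')
short_cut=$(printf '%s\n' "$short" | cut -d_ -f 1-4)

printf '%s\n' "$from_sed" "$from_cut" "$short_sed" "$short_cut"
# prints:
# sp_A0A342_ATPB_COFAR
# sp_A0A342_ATPB_COFAR
# a
# a_b_c_d
```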
sed -e 's/\([^_]*\)_\([^_]*\)_\([^_]*\)_\([^_]*\)_.*/\1_\2_\3_\4/' infile > outfile
Match "any number of not '_'", saving what was matched between \( and \), followed by '_'. Do this 4 times, then match anything for the rest of the line (to be ignored). Substitute with each of the matches separated by '_'.
Here's another possibility:
sed -E -e 's|^([^_]+(_[^_]+){3}).*$|\1|'
where -E, like -r in GNU sed, turns on extended regular expressions for readability.
Just because you can do it in sed, though, doesn't mean you should. I like cut much much better for this.
AWK likes to play in the fields:
awk 'BEGIN{FS=OFS="_"}{print $1,$2,$3,$4}' inputfile
or, more generally:
awk -v count=4 'BEGIN{FS="_"}{for(i=1;i<=count;i++){printf "%s%s",sep,$i;sep=FS};printf "\n"}'
sed -e 's/_[0-9][0-9]*_[+-]_contigs_full.fasta$//g'
Still the cut answer is probably faster and just generally better.
Yes, cut is way better, and yes matching the back of each is easier.
I finally got a match using the beginning of each line:
sed -r 's/(([^_]*_){3}([^_]*)).*/\1/' oldFile > newFile

Resources