sed regex not matching - text

I have more than one instance of a string per line and I only need to prepend to one of them. I marked the other ones I don't want to change as Sports CommunicationP. I am using
sed -i.bak 's/Sports Communication[^P]/College of Architecture\, Arts and Humanities\/Sports Communication/' out.csv
to try to prepend to the unaltered string. Below is the CSV content I am interested in:
2.5,"Sports Communication,","The Campus
I am probably missing something simple. I tried /g at the end to see if there was more than one match I had missed altering. I also took away the [^P] to make sure there were no typos, and then it matches.
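One detail worth checking here: [^P] is part of the match, so whatever character follows "Sports Communication" (the comma in the sample) is consumed and dropped by the replacement. The following is only a sketch of one way to keep that character, assuming the goal is to change occurrences that are not followed by P. The second input line is a hypothetical example of a marked entry, and | is used as the delimiter purely to avoid escaping the slash.

```shell
# Sample line from the question plus a hypothetical line carrying the "P" marker.
input='2.5,"Sports Communication,","The Campus
3.0,"Sports CommunicationP,","The Campus'

# \([^P]\) captures the character after "Communication"; \1 writes it back.
# Using | as the delimiter means the / in the replacement needs no escaping.
result=$(printf '%s\n' "$input" |
  sed 's|Sports Communication\([^P]\)|College of Architecture, Arts and Humanities/Sports Communication\1|')

printf '%s\n' "$result"
```

Note that [^P] still requires some character to follow, so an occurrence at the very end of a line would not match with this pattern.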

Related

Wildcard sed search/remove within other text in the same line

I'm trying to remove a matching string with partial wildcards using sed, and the searches I've done for answers on this site either don't seem to apply or I can't convert them to my situation.
Below is the string of text I need to remove:
www.foo.com.cp123.bar.com
It is in a file with other entries on the same line. The line that has my entries always starts with serveralias:, however, as below:
serveralias: www.domain.com mail.domain.com www.foo.com.cp123.bar.com domain.com
I can identify what I need to remove via the 'cp123.bar.com' text as that always stays the same. It's the preceding 'www.foo.com' that changes. It can appear just once or multiple times within the line, but it will always end in 'cp123.bar.com'. I've tried the following two commands based on my research:
sed 's/\ .*cp123.bar.com\ //g' file.txt
sed 's/\ [^:]+$cp123.bar.com\ //g' file.txt
I'm using the spaces between each entry as the start and stop point for the find/replace(delete), but that's a band-aid and not always going to work since the entry I need to delete is occasionally at the end of the line (without a space afterward). If I don't include the spaces, though, everything gets removed since I'm using wildcards, including the www.domain.com, mail.domain.com, etc. text I need to keep there. Running either of the sed commands above doesn't do anything, just prints what's currently in the file.
Any ideas on what I need to change? I'm happy to clarify anything if need be.
GNU sed needs the -r flag (also spelled -E) to use extended regular expressions. Without it, sed uses basic regular expressions, in which an unescaped + is an ordinary character. Thus, a
sed -r 's/ +[^ ]+\.cp123\.bar\.com//g'
will do what you want. It removes the following substrings:
one or more spaces
followed by one or more non-space characters
followed by .cp123.bar.com
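The pattern described above can be tried on the sample line from the question. This is a minimal sketch: the second input (the entry at the end of the line) is a hypothetical variant of the case the asker mentions, strip_alias is a made-up helper name, and -E is used as the spelling of -r that both GNU and BSD sed accept.

```shell
# Sample line from the question.
line='serveralias: www.domain.com mail.domain.com www.foo.com.cp123.bar.com domain.com'
# Hypothetical variant with the unwanted entry at the end of the line.
tail_line='serveralias: www.domain.com www.foo.com.cp123.bar.com'

# Remove: one or more spaces, one or more non-space characters, then .cp123.bar.com
strip_alias() { sed -E 's/ +[^ ]+\.cp123\.bar\.com//g'; }

result_mid=$(printf '%s\n' "$line" | strip_alias)
result_end=$(printf '%s\n' "$tail_line" | strip_alias)

printf '%s\n' "$result_mid" "$result_end"
```

Because the leading spaces are part of the match and nothing is required after .cp123.bar.com, the entry is removed whether it sits in the middle or at the end of the line.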

Using SED to replace capture group with regex pattern

I need some help with a sed command that I thought would help solve an issue I have. I basically have long text files that look something like this:
>TRINITY_DN112253_co_g1_i2 Len=3873 path=[38000:0-183]
ACTCACGCCCACATAAT
The ACT text blocks continue on, and then more blocks of text follow the same pattern, except that the text after the > differs slightly in its numbers. I want to trim only this header (the line starting with >) so that it keeps everything up to the very last "_". The sed command I thought seemed logical is the following:
sed -i 's/>.*/TRINITY.*_/'
However, sed is literally changing each header to TRINITY.*_ rather than capturing the block I thought it would. Any help is appreciated!
Also, just to make things clear, I thought that my sed command would convert the top header block into this:
>TRINITY_DN112253_co_g1_
ACTCACGCCCACATAAT
This might help:
sed '/^>/s/[^_]*$//' file
Output:
>TRINITY_DN112253_co_g1_
ACTCACGCCCACATAAT
See: The Stack Overflow Regular Expressions FAQ
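The command above can be checked against the sample from the question. The original attempt printed TRINITY.*_ literally because the replacement side of s/// is plain text, not a pattern. The sketch below instead deletes the trailing run of non-underscore characters on header lines only; the variable names are illustrative.

```shell
# Two-line sample from the question: a header line and a sequence line.
input='>TRINITY_DN112253_co_g1_i2 Len=3873 path=[38000:0-183]
ACTCACGCCCACATAAT'

# /^>/ restricts the substitution to header lines;
# [^_]*$ matches everything after the last underscore, which is then deleted.
result=$(printf '%s\n' "$input" | sed '/^>/s/[^_]*$//')

printf '%s\n' "$result"
```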

Linux: Replace first string in file with contents of other file containing quotes and slashes.

I have spent all day today trying to find a proper solution, but I am not able to. My problem:
I have an XML file that contains several instances of the same tag.
Example:
<TASK INSTANCE />
<WORKFLOWLINK CONDITION=""/>
<WORKFLOWLINK CONDITION=""/>
I want to add the contents of another XML file before the first <WORKFLOWLINK. The issue I've run into is that this file is full of double quotes and slashes. I've tried replacing them and escaping them, but to no avail.
My attempts mainly culminated in something like:
sed -e "0,/<WORKFLOWLINK/ /<WORKFLOWLINK/{ r ${filename}" -e "}" ${sourcefile}
If this isn't clear enough I'll get the exact data so you can see.
For the fun of sed:
sed -e "0,/<WORKFLOWLINK/{/<WORKFLOWLINK/{r ${sourcefile}" -e"}}"
The trick is to start a new "pattern/command" pair after your first address range condition 0,/<WORKFLOWLINK/.
Two nested patterns/addresses are not understood, there must be a command after the first pattern. Using an additional pair of curlies {} does that for you.
Apart from the brain exercise of doing it in sed, @EdMorton is right to recommend using an XML processor. His request for an MCVE is also appropriate. I had to do some guessing to see what you want, and I hope I guessed right.
The MCVE should at least have included:
the error message or problem description defining your problem
the initialisation of your environment variables
some sample input; not the original data
You surely would have had an answer earlier and (in case mine does not satisfy you) probably a better one by now.
So, before your next question, please take the https://stackoverflow.com/tour
GNU sed version 4.2.1
GNU bash, version 3.1.17(1)-release (i686-pc-msys)
Everyone,
Thank you for thinking with me, even if I apparently broke some rules.
I have figured out a solution, granted it is not as pretty as can be, but for a one time action it is good enough.
I have moved from a single command to a combination of first detecting the location I want to put my data:
sed -e "0,/<WORKFLOWLINK/ s/<WORKFLOWLINK/##MARKER##\n\t<WORKFLOWLINK/"
which will put the marker string in the desired location.
After this I replace the marker with the contents of the file I have. I managed to make the individual statements working when I was trying to do it all in a single statement before, so I just execute them separately.
sed -e "/##MARKER##/{r ${sourcefile}" -e 'd}'
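The two steps can be tried end to end as follows. This is a sketch under several assumptions: GNU sed is required (the 0,/regex/ address and \n, \t in the replacement are GNU extensions), the file names and the SESSION line are made up for illustration, and here filename holds the content to insert while sourcefile is the XML being edited, as in the question.

```shell
# Hypothetical working files; names and contents are illustrative only.
workdir=$(mktemp -d)
sourcefile="$workdir/workflow.xml"
filename="$workdir/insert.xml"

printf '%s\n' '<TASK INSTANCE />' '<WORKFLOWLINK CONDITION=""/>' \
  '<WORKFLOWLINK CONDITION=""/>' > "$sourcefile"
printf '%s\n' '<SESSION NAME="s_load" PATH="/opt/etl"/>' > "$filename"

# Step 1 (GNU sed): put a marker line before the first <WORKFLOWLINK only.
marked=$(sed -e "0,/<WORKFLOWLINK/ s/<WORKFLOWLINK/##MARKER##\n\t<WORKFLOWLINK/" "$sourcefile")

# Step 2: queue the file contents with r, then delete the marker line itself.
result=$(printf '%s\n' "$marked" | sed -e "/##MARKER##/{r ${filename}" -e 'd}')

printf '%s\n' "$result"
rm -r "$workdir"
```

Since r copies the file verbatim, the quotes and slashes in the inserted content never pass through the s command and need no escaping.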

sed -- matching first occurrence in search

I'm writing a shell script to modify a file and I have a line something like this in it:
sed s/here \(.*\n\)/gone \1/g
Unfortunately, the search seems to match the longest string (i.e., it goes all the way to the last \n -- thus giving me just one replacement) but I want it to match only up to the first \n it finds (so I can get replacements on every line).
Is this possible?
Thanks for your help!
Looks like you want the feature called non-greedy (or lazy) matching. Unfortunately, sed does not provide such a feature. To emulate it, match any run of characters other than the separator, followed by the separator itself, like this:
s/here \([^\n]*\n\)/gone \1/g
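The difference between the greedy pattern and the negated-class emulation is easier to see on a single line. The sketch below uses > as the separator instead of a newline, which is an assumption made for the demo: sed normally reads one line at a time with the trailing newline removed, so a one-line sample with a visible separator shows the effect more directly.

```shell
# One line with two bracketed chunks; > plays the role of the separator.
input='a<b>c<d>e'

# Greedy: .* runs to the LAST > on the line, so both chunks go at once.
greedy=$(printf '%s\n' "$input" | sed 's/<.*>//')

# Emulated non-greedy: [^>]* cannot cross a >, so the match stops at the first one.
first=$(printf '%s\n' "$input" | sed 's/<[^>]*>//')

# With g, each chunk is matched and removed separately.
every=$(printf '%s\n' "$input" | sed 's/<[^>]*>//g')

printf '%s\n' "$greedy" "$first" "$every"
# prints: ae, then ac<d>e, then ace
```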

How to delete text in a file based on regular expression using vim

I have an XML file like this:
<fruit><apple>100</apple><banana>200</banana></fruit>
<fruit><apple>150</apple><banana>250</banana></fruit>
Now I want delete all the text in the file except the words in tag apple. That is, the file should contain:
100
150
How can I achieve this?
:%s/.*apple>\(.*\)<\/apple.*/\1/
That should do what you need. Worked for me.
Basically, it grabs everything up to and including the opening apple tag, captures everything between the opening and closing apple tags in a group, and then matches the rest of the line. It replaces the whole line with the first backreference, which is the text between the apple tags.
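For comparison outside Vim, the same capture-and-backreference idea can be sketched with sed. This is not part of the answer above, just an illustration on the two sample lines, assuming one apple tag per line; [^<]* is used for the captured text so the group cannot run past the closing tag.

```shell
# The two sample lines from the question.
input='<fruit><apple>100</apple><banana>200</banana></fruit>
<fruit><apple>150</apple><banana>250</banana></fruit>'

# Match the whole line, capture the text between the apple tags,
# and replace the line with that capture (\1).
result=$(printf '%s\n' "$input" | sed 's/.*<apple>\([^<]*\)<\/apple>.*/\1/')

printf '%s\n' "$result"
# prints 100 and 150 on separate lines
```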
I personally use this:
%s;.*<apple>\(\d*\)</apple>.*;\1;
Since the text contains '/', which is the default separator, using ';' as the separator makes the command clearer.
And I found that the non-greedy match @Conspicuous Compiler mentioned should be
\{-}
instead of "{-}" in Vim.
However, after I changed Conspicuous' solution to
%s/.*apple>(.\{-\})<\/apple.*/\1^M/g
my Vim said it can't find the pattern.
In this case, one can use the general technique for collecting pattern matches
explained in my answer to the question "How to extract regex matches
using Vim".
In order to collect and store all of the matches in a list, run the Ex command
:let t=[] | %s/<apple>\(.\{-}\)<\/apple>\zs/\=add(t,submatch(1))[1:0]/g
The command purposely does not change the buffer's contents, only collects the
matched text. To set the contents of the current buffer to the
newline-separated list of matches, use the command
:0pu=t | +,$d_
