Sed - delete all chars before dash - linux

I have a file with the contents below:
devtools-cloudformation
devtools-common-rpm
I want to remove devtools- and keep the rest of the characters unchanged. I tried the command below, but it removes everything up to the last dash, not just the first:
sed 's/.*-//' projects.txt

This worked for me
sed 's/-/\n/;s/.*\n//' projects.txt

If I understand correctly, you want to delete everything up to the first dash.
Try this:
sed 's/[^-]*-//'
This deletes the first dash as well.
In case you want to keep that first dash:
sed 's/[^-]*-/-/'
The reason your solution doesn't work is that the regular expression .*- means 'anything followed by a dash', and .* is greedy, so it matches as much of the line as it can.
In devtools-common-rpm, the longest match is devtools-common-, and that is what gets removed.
The expression I suggest says 'anything but a dash, followed by a dash', so the match cannot run past the first dash.
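To see the difference side by side, here is a small sketch that runs both expressions on the two sample lines from the question. The variable names are only for illustration.

```shell
# Sample input taken from the question.
input='devtools-cloudformation
devtools-common-rpm'

# Greedy .* runs to the LAST dash on each line.
greedy=$(printf '%s\n' "$input" | sed 's/.*-//')

# [^-]* cannot cross a dash, so the match ends at the FIRST dash.
first_only=$(printf '%s\n' "$input" | sed 's/[^-]*-//')

printf 'greedy:\n%s\n\nfirst dash only:\n%s\n' "$greedy" "$first_only"
```

The greedy version leaves only rpm on the second line, while the negated class leaves common-rpm.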

Related

Wildcard sed search/remove within other text in the same line

I'm trying to remove a matching string with partial wildcards using sed, and the searches I've done for answers on this site either don't seem to apply or I can't convert them to my situation.
Below is the string of text I need to remove:
www.foo.com.cp123.bar.com
It is in a file with other entries on the same line. The line that has my entries always starts with serveralias:, however, as below:
serveralias: www.domain.com mail.domain.com www.foo.com.cp123.bar.com domain.com
I can identify what I need to remove via the 'cp123.bar.com' text as that always stays the same. It's the preceding 'www.foo.com' that changes. It can appear just once or multiple times within the line, but it will always end in 'cp123.bar.com'. I've tried the following two commands based on my research:
sed 's/\ .*cp123.bar.com\ //g' file.txt
sed 's/\ [^:]+$cp123.bar.com\ //g' file.txt
I'm using the spaces between each entry as the start and stop point for the find/replace(delete), but that's a band-aid and not always going to work since the entry I need to delete is occasionally at the end of the line (without a space afterward). If I don't include the spaces, though, everything gets removed since I'm using wildcards, including the www.domain.com, mail.domain.com, etc. text I need to keep there. Running either of the sed commands above doesn't do anything, just prints what's currently in the file.
Any ideas on what I need to change? I'm happy to clarify anything if need be.
sed needs the -r flag (GNU sed; -E does the same and is more portable) to use extended regular expressions. Without it, sed uses basic regular expressions, where a plain + is a literal character rather than a quantifier. Thus, a
sed -r 's/ +[^ ]+\.cp123\.bar\.com//g'
will do what you want. It removes the following substrings:
one or more spaces
followed by one or more non-space
followed by .cp123.bar.com
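As a quick check, the sketch below applies that expression to a line modeled on the one in the question. The second entry, www.baz.org.cp123.bar.com, is a made-up addition to cover the case where a match sits at the end of the line, and -E is used as the equivalent of -r.

```shell
# Line from the question, plus a hypothetical second entry at the end of the line.
line='serveralias: www.domain.com mail.domain.com www.foo.com.cp123.bar.com domain.com www.baz.org.cp123.bar.com'

# Remove every " <something>.cp123.bar.com" entry; no trailing space is required,
# so an entry at the end of the line is removed too.
cleaned=$(printf '%s\n' "$line" | sed -E 's/ +[^ ]+\.cp123\.bar\.com//g')

printf '%s\n' "$cleaned"
```

Because the pattern consumes the space before each entry rather than the one after it, the remaining entries stay separated by single spaces.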

Linux Sed command replace after special character

How can I use the sed command in Linux to replace the value in a key-value pair? I want to replace the characters that occur after ":".
For example
App.log.level: "xyz"
It sounds like you just want something like sed 's/:.*$/: YOURTEXTHERE/' where the general format is sed 's/REPLACE_THIS/WITH_THIS/g'
The /:.*$/ bit means I want to replace all text from a colon to the end of the line. The : YOURTEXTHERE is what you're replacing with. (I'm putting the colon back in and putting the extra text.) Since I'm only doing one replacement per line, I don't need the g at the end (although it wouldn't hurt anything.)
A real example:
>> echo App.log.level: \"xyz\" | sed 's/:.*$/: YOURTEXTHERE/'
App.log.level: YOURTEXTHERE
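If the file holds more than one key, the substitution can be limited to one of them with an address in front of the s command. This is only a sketch: the App.name line and the "debug" value are invented for the example.

```shell
# Hypothetical config content; only the App.log.level line comes from the question.
config='App.name: "demo"
App.log.level: "xyz"'

# The address /^App\.log\.level:/ restricts the substitution to lines starting with that key.
updated=$(printf '%s\n' "$config" | sed '/^App\.log\.level:/ s/:.*$/: "debug"/')

printf '%s\n' "$updated"
```

Lines that do not start with App.log.level: pass through unchanged.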

Vim or sed : Replace character(s) within a pattern

I want to replace underscores with hyphens wherever the character ('_') is preceded and followed by uppercase letters, e.g. QWQW_IOIO, OP_FD_GF_JK, TRT_JKJ, etc. The replacement is needed throughout one document.
I tried to replace this in vim using:
:%s/[A-Z]_[A-Z]/[A-Z]-[A-Z]/g
But that replaced QWQW_IOIO with QWQ[A-Z]-[A-Z]OIO :(
I tried using a sed command:
sed -i '/[A-Z]_[A-Z]/ s/_/-/g' ./file_name
This resulted in replacement over the whole line. e.g.
QWQW_IOIO variable may contain '_' or '-' line was replaced by
QWQW-IOIO variable may contain '-' or '-'
You had the right idea with your first vim approach. But you need to use a capturing group to remember what character was found in the [A-Z] section. Those are nicely explained here and under :h /\1. As a side note, I would recommend using \u instead of [A-Z], since it is both shorter and faster. That means the solution you want is:
:%s/\(\u\)_\(\u\)/\1-\2/g
Or, if you would like to use the "very magic" setting (\v) to make it more readable:
:%s/\v(\u)_(\u)/\1-\2/g
Another option would be to limit the part of the search that gets replaced with the \zs and \ze atoms:
:%s/\u\zs_\ze\u/-/g
This is the shortest solution I'm aware of.
This should do what you want, assuming GNU sed.
sed -i -r -e 's/([A-Z]+)_([A-Z]+)/\1-\2/g' ./file_name
Explanation:
-r flag enables extended regex
[A-Z]+ is "one or more uppercase letters"
() groups a pattern together and creates a numbered memorized match
\1, \2 put those memorized matches in the replacement.
So basically this finds a chunk of uppercase letters followed by an underscore, followed by another chunk of uppercase letters, memorizes only the letter chunks as 2 groups,
([A-Z]+)_([A-Z]+)
Then it replays those groups, but with a hyphen in between instead of an underscore.
\1-\2
The g flag at the end says to do this even if the pattern shows up multiple times on one line.
Note that this falls apart a little in this case:
QWQW_IOIO_ABAB
Because it matches the first time, but not the second; the second part won't match because IOIO was consumed by the first match. So that would result in
QWQW-IOIO_ABAB
This version drops the + so it only matches one uppercase letter, and won't break in the same way:
sed -i -r -e 's/([A-Z])_([A-Z])/\1-\2/g' ./file_name
It still has a small flaw, if you have a string like this:
A_B_C
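Same issue as before, just one letter now instead of multiple: the result is A-B_C, because B was consumed by the first match and is no longer available in front of the second underscore.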
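One possible way around the A_B_C case is to repeat the substitution until nothing changes, using a sed label and the t (branch if a substitution succeeded) command. This is a sketch rather than part of the answer above; the function name fix_underscores is made up, and it reads standard input instead of editing a file in place.

```shell
fix_underscores() {
  # :a defines a label; "ta" jumps back to it whenever the s command
  # made a replacement, so the line is rescanned until nothing matches.
  sed -E -e ':a' -e 's/([A-Z])_([A-Z])/\1-\2/' -e 'ta'
}

r1=$(printf '%s\n' 'A_B_C' | fix_underscores)
r2=$(printf '%s\n' "QWQW_IOIO variable may contain '_' or '-'" | fix_underscores)

printf '%s\n%s\n' "$r1" "$r2"
```

Each pass starts again from the beginning of the line, so a letter consumed by one match is available again on the next pass. Underscores that are not between uppercase letters, like the quoted '_', are left alone.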

sed is matching passed variable subsets, not exact matches

I'm partially successfully using sed to replace variables in a text file. I'm stuck on an exception.
A script reads input from a list - say the $roll_symbol is C20.
sed replaces C20, GC20, and KC20 (because C20 matches part of the string).
I searched the web and tried the variations I found - no success.
I tried these variations without success:
escape the reserved character $
escape braces
escape both
use double quotes instead of single quotes.
The best version so far (it only partially works):
sed -i 's/'${roll_symbol}'/'${roll_symbol}\,${contract_month}'/g' $OUTPUT_DIRECTORY/$OUTPUT_FILE;
You need to tell sed what characters are legal before the start of your match to limit where it can match. To only match at start-of-word boundaries try \<.
sed -i "s/\<${roll_symbol}/${roll_symbol},${contract_month}/g" "$OUTPUT_DIRECTORY/$OUTPUT_FILE";
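A small sketch of the word-boundary idea on a test line, assuming GNU sed (\< and \> are GNU extensions). The value MAR and the sample line are invented for the example, and \> is added here so that a longer symbol such as C201 is left alone as well.

```shell
roll_symbol='C20'
contract_month='MAR'   # hypothetical value, not from the question

# Sample line with the symbol on its own and embedded in longer symbols.
line='C20 GC20 KC20 C201'

# \< anchors the match at the start of a word, \> at the end of a word.
result=$(printf '%s\n' "$line" | sed "s/\<${roll_symbol}\>/${roll_symbol},${contract_month}/g")

printf '%s\n' "$result"
```

Only the standalone C20 gets the contract month appended; GC20, KC20 and C201 are not touched.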

Replacing comma on specific lines only

I have a dataset that is comma separated. But I have a little problem with its format. I want everything to be in the form x,x,x
Below is a sample of my dataset:
995970,16779453
995971,16828069
995972,
995973,16828069
995974,16827226
As you can see, most of my dataset is in the proper format but I have those commas on single id#'s also (my data is in form id#, connection#). How would I go about removing the commas on those single id#'s? I can't seem to figure it out just using a text editor. Any suggestions?
Edit: can I use some sort of regex expression to only remove it from those ids that have a specified length?
Edit2: Ok I figured it out using some regex, thanks for all the help!
In vi one would do something like
:%s/,$//
This means
: (enter a line mode command)
% (try the command on every line)
s (substitute)
,$ (match a comma at the end of a line)
(empty replacement text)
Sometimes you need something like /, *$/ to match a comma followed by 0 or more trailing spaces. You can get vi on Windows in various ways; one way is to install Cygwin.
You can select regular expression mode in Notepad++ and do find and replace using the following regex ,$. Leave the replace field blank.
With the sed command:
sed 's/, *$//' < FILE
or in place (requires GNU sed):
sed -i -e 's/, *$//' FILE
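To check the effect on the sample data, here is a short sketch. It assumes the pattern is anchored with $ so that only a comma at the end of a line (optionally followed by spaces) is removed, and the commas between id and connection are kept.

```shell
# Sample rows from the question.
data='995970,16779453
995971,16828069
995972,
995973,16828069'

# , *$ matches a comma plus any trailing spaces, but only at the end of the line.
cleaned=$(printf '%s\n' "$data" | sed 's/, *$//')

printf '%s\n' "$cleaned"
```

Only the 995972 row changes; the complete id,connection rows pass through as they are.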
