Say I have a string:
ap=test:::bc=exam:::dc=comic:::mp=calc:::
On a Linux box, I need to remove one pair, say bc=exam. The key is always the same, but the value can be anything (letters or digits), and the key-value pair can appear anywhere in the string.
I've got as far as
sed -e 's/:::bc=\(.*:::\)*/\1/'
which only removes the key and a delimiter.
or
sed -e 's/:::bc=.*\(:::\)*/\1/'
which is removing everything from the key on.
Thanks in advance.
Since your values do not contain colons, you can match them with a negated bracket expression, [^:]*:
sed 's/:::bc=[^:]*//' file
The pattern :::bc=[^:]* matches :::bc= and then zero or more characters other than a colon.
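The command above needs a ::: in front of the key, so it would not match a bc pair at the very start of the string. A minimal sketch of a variant that works at any position, assuming every pair is terminated by ::: as in the sample and that values never contain a colon (remove_bc is a hypothetical helper name, not something from the question):

```shell
# Hypothetical helper: delete the bc=... pair wherever it sits in the line.
# Assumes every pair ends with ":::" and values contain no ":".
remove_bc() {
  # (^|:::) requires the key to start the line or follow a delimiter,
  # so a longer key such as "abc" is left alone; \1 puts that delimiter back.
  sed -E 's/(^|:::)bc=[^:]*:::/\1/'
}

printf '%s\n' 'ap=test:::bc=exam:::dc=comic:::mp=calc:::' | remove_bc
# → ap=test:::dc=comic:::mp=calc:::
printf '%s\n' 'bc=42:::ap=test:::dc=comic:::' | remove_bc
# → ap=test:::dc=comic:::
```

If a value could contain a single colon, [^:]* would stop too early and the pattern would need to be adapted.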
I have a file with two fields, and I need to change the first field's values from lowercase to uppercase. Can anyone suggest how I can do this?
sample file data
e6|VerizonOctoberWB_PromoE7E6
e2|VerizonOctoberWB_UnlimwP_E1E2
e5|VerizonOctoberWB_PromoLI_E5
In the sample data above, I need to change the first field values (e6, e2, e5).
Given your small and poorly formatted sample:
cat up
e6|VerizonOctoberWB_PromoE7E6
e2|VerizonOctoberWB_UnlimwP_E1E2
e5|VerizonOctoberWB_PromoLI_E5
sed -r 's/^([^|]+)/\U\1\E/g' up
E6|VerizonOctoberWB_PromoE7E6
E2|VerizonOctoberWB_UnlimwP_E1E2
E5|VerizonOctoberWB_PromoLI_E5
Edit 1: added explanation:
Search for and remember everything from the beginning of the line up to the first separator |, then replace it with \U (start upper-casing), \1 (the remembered string), and \E (stop upper-casing).
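\U and \E are GNU sed extensions, so the command above may not work with other sed implementations. A minimal sketch of the same transformation with awk, assuming | is the only field separator (upper_first_field is a hypothetical helper name):

```shell
# Hypothetical helper: upper-case the first |-separated field of every line.
upper_first_field() {
  # FS splits on "|", OFS joins the fields back with "|" after $1 is modified;
  # the trailing 1 prints every (rebuilt) line.
  awk 'BEGIN { FS = OFS = "|" } { $1 = toupper($1) } 1'
}

printf '%s\n' 'e6|VerizonOctoberWB_PromoE7E6' 'e5|VerizonOctoberWB_PromoLI_E5' | upper_first_field
# → E6|VerizonOctoberWB_PromoE7E6
# → E5|VerizonOctoberWB_PromoLI_E5
```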
I have a text file with ~70k lines like this:
/dir1/dir2/dir3/2013/04/04/file.pdf
and I need to convert it to:
dir4/dir5/2013/04/4/file.pdf
It's important that the leading 0 in the 6th place is removed; values in this place go from 1 to 31. Can anyone help with this?
Using sed:
sed -E 's#(/[^/]*){3}(/[0-9]+/[0-9]+/)0?([0-9]+.*)#dir4/dir5\2\3#' your_file
We match the three dirs in a first group that is discarded (we would use a non-capturing group if sed supported it), then the year and month in a second group, then optionally the leading 0 of the day, then the rest of the day and the filename in a third group. The replacement just specifies the new path root and then refers to the second and third groups. I'm using # as the delimiter to avoid having to escape every / in the pattern and replacement; any character that doesn't appear in them would work as well.
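To check that only a leading 0 is dropped and that days such as 10 or 30 keep their zero, the command can be wrapped in a small function and fed a few sample paths. This is only a sketch: rewrite_path is a hypothetical name, and the extra paths are made-up test inputs in the same shape as the one in the question.

```shell
# Hypothetical wrapper around the sed command above so it can read from a pipe.
rewrite_path() {
  sed -E 's#(/[^/]*){3}(/[0-9]+/[0-9]+/)0?([0-9]+.*)#dir4/dir5\2\3#'
}

printf '%s\n' \
  '/dir1/dir2/dir3/2013/04/04/file.pdf' \
  '/dir1/dir2/dir3/2013/04/10/file.pdf' \
  '/dir1/dir2/dir3/2013/12/30/file.pdf' | rewrite_path
# → dir4/dir5/2013/04/4/file.pdf
# → dir4/dir5/2013/04/10/file.pdf
# → dir4/dir5/2013/12/30/file.pdf
```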
I have a .txt file that looks like this (about 400 rows):
lettuceFMnode_1240 J_C7R5_99354_KNKSR3_Oligomycin 81.52
lettuceFMnode_3755 H_C1R3_99940_KNKSF2_Tubulysin 70
lettuceFMnode_17813 G_C4R5_80184_KNKS113774F_Tetronasin 79.57
lettuceFMnode_69469 J_C11R7_99276_KNKSF2_Nystatin 87.27
I want to edit the names in the entire 2nd column so that only the last part will stay (meaning delete anything before that, so in fact leaving what comes after the last _).
I looked into different solutions using a combination of cut and sed, but couldn't understand how the code should be built.
Would appreciate any tips and help!
Thank you!
Here's one way:
perl -pe 's/^\S+\s+\K\S+_//'
For every line of input (-p) we execute some code (-e ...).
The code performs a substitution (s/PATTERN/REPLACEMENT/).
The pattern matches as follows:
^ beginning of string
\S+ 1 or more non-whitespace characters (the first column)
\s+ 1 or more whitespace characters (the space after the first column)
\K do not treat the text matched so far as part of the final match
\S+ 1 or more non-whitespace characters (the second column)
_ an underscore
Because + is greedy (it matches as many characters as possible), \S+_ will match everything up to the last _ in the second column.
Because we used \K, only the rest of the pattern (i.e. the part of the match that lies in the second column) gets replaced.
The replacement string is empty, so the match is effectively removed.
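sed has no \K, but the same "match the first column, keep it, delete up to the last underscore of the second column" idea can be sketched with a capture group that is put back in the replacement. This is an illustration rather than part of the answer above; strip_col2_prefix is a hypothetical name, and columns are assumed to be separated by spaces or tabs.

```shell
# Hypothetical helper: keep only what follows the last "_" in the second column.
strip_col2_prefix() {
  # Group 1 = first column plus the whitespace after it (the role \K plays in Perl);
  # [^[:space:]]+_ then matches up to the last underscore of the second column.
  sed -E 's/^([^[:space:]]+[[:space:]]+)[^[:space:]]+_/\1/'
}

printf '%s\n' \
  'lettuceFMnode_1240 J_C7R5_99354_KNKSR3_Oligomycin 81.52' \
  'lettuceFMnode_3755 H_C1R3_99940_KNKSF2_Tubulysin 70' | strip_col2_prefix
# → lettuceFMnode_1240 Oligomycin 81.52
# → lettuceFMnode_3755 Tubulysin 70
```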
With sed:
sed 's/ [^ ]*_/ /' file
Replace the first space, the non-space characters after it ([^ ]*, greedy, so up to the last underscore of that column) and the _ itself with one space.
I have an output like this:
a/foo bar /
b/c/foo sth /xyz
cc/bar ghj /axz/byz
What I want is just this line:
a/foo bar /
To be clearer, I want the lines ending with a specific string: I want to grep lines that have a / character as their last column.
You can use $ like this:
$ grep '/$' file
a/foo bar /
As $ stands for end of line, /$ matches those lines whose last character is a /.
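A quick way to try this without a file is to pipe the three sample lines into grep. The sketch below also includes a more lenient variant that tolerates spaces or tabs after the slash; that is an assumption about possible input, not something the question asks for, and both function names are hypothetical.

```shell
# Hypothetical helper: keep only lines whose last character is a slash.
ends_with_slash() {
  grep '/$'
}

# Hypothetical variant: also accept trailing spaces or tabs after the slash.
ends_with_slash_lenient() {
  grep '/[[:space:]]*$'
}

printf '%s\n' 'a/foo bar /' 'b/c/foo sth /xyz' 'cc/bar ghj /axz/byz' | ends_with_slash
# → a/foo bar /
```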
grep '/$'
The slash is not a special character for grep, and $ anchors the match at the end of a line.
You can also match only the lines where the slash is the entire last column (and that column is not the only one on the line).
I assumed that the last column of a line is a string preceded by one or more whitespace characters and followed by nothing. This assumption does not cover a line with only one column, because a single column needs no whitespace in front of it.
With Perl-compatible regular expressions enabled (-P):
grep -P '\s+/$'
\s matches any whitespace character (space, tab, newline)
+ matches the preceding element one or more times
$ matches the end of the line
Or use a bracket expression with a character class (see Character Classes and Bracket Expressions in the grep manual):
grep '[[:space:]]\+/$'
OR
grep '[[:blank:]]\+/$'
‘[:blank:]’ Blank characters: space and tab.
‘[:space:]’ Space characters: in the ‘C’ locale, this is tab, newline, vertical tab, form feed, carriage return, and space. In GNU grep, \s is a synonym for [[:space:]].
As fedorqui pointed out, the backslash in \+ is what gives + its "one or more" meaning in a basic regular expression; without it, + would match a literal plus sign. Thanks for the explanation.
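\+ in a basic regular expression is a GNU extension. With -E (extended regular expressions) the + quantifier needs no backslash, which is the more portable spelling. A minimal sketch, with last_column_is_slash as a hypothetical helper name and two extra made-up lines to show what is rejected:

```shell
# Hypothetical helper: match lines whose last column is exactly "/",
# i.e. a slash at end of line preceded by at least one space or tab.
last_column_is_slash() {
  grep -E '[[:blank:]]+/$'
}

printf '%s\n' 'a/foo bar /' 'b/c/foo sth /xyz' '/' 'a/' | last_column_is_slash
# → a/foo bar /
```

The lines "/" and "a/" are rejected because nothing blank precedes the final slash, which is the single-column limitation described above.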
I rarely use Perl regular expressions, so apologies if the -P variant is off, but I hope this helps you find a slash in the last column. For more on matching whitespace followed by a slash at the end of a line, see:
grep with regexp: whitespace doesn't match unless I add an assertion
Regular expressions in Perl
I want to replace a special character (=) only at first occurrence.
Example:
abc=abc.def=
Expected output is:
abc.def=
I tried the following command: sed -e 's/\([^=]*\)\(=.*\)/\2/'
but the output I am getting is:
=abc.def=
Note that your example suggests you want to remove everything up to and including the first equals.
Move the equals sign into the first part of the regex, delete the remaining part of the regex (because you only need to match the part you want to remove) and replace the match with "nothing" to remove it:
sed -e 's/^[^=]*=//'
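A short check of that command on the example input, plus an alternative for the case where the value is already in a shell variable (an assumption, since the question does not say where the string comes from). strip_to_first_equals and the variable s are hypothetical names.

```shell
# Hypothetical helper: delete everything up to and including the first "=".
strip_to_first_equals() {
  sed -e 's/^[^=]*=//'
}

printf '%s\n' 'abc=abc.def=' | strip_to_first_equals
# → abc.def=

# Same result with POSIX parameter expansion:
# ${s#*=} removes the shortest prefix that matches "*=".
s='abc=abc.def='
printf '%s\n' "${s#*=}"
# → abc.def=
```

Lines without any = are left unchanged by both forms.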