Find two lines and replace with one - linux

I am looking for a way to search text files on a Linux server for a two-line pattern such as:
Text 123
Blue Green
and then replace it with a single line, every time it occurs in a file:
Order Blue Green
I am not sure what the easiest way to solve this would be. I have seen many guides using sed, but only for finding and replacing a single line.

You asked about sed, so here is an answer in sed.
Let me mention, however, that while sed is fun for this kind of exercise, you should probably choose something more flexible and easier to learn, Perl for example.
How it works:
look for the first line: /Text 123/
when found, start a loop: :a
append the next line to the pattern space: N
collapse a repeated pair of the searched text into a single copy and print it: s/Text 123\nText 123/Text 123/p
loop for as long as that substitution succeeds: ta
otherwise try the real replacement: s/Text 123\nBlue Green/Order Blue Green/
rely on the joined lines being printed unchanged if no substitution triggers
Code:
sed "/Text 123/{:a;N;s/Text 123\nText 123/Text 123/p;ta;s/Text 123\nBlue Green/Order Blue Green/}"
Test input:
Text 123
Do not replace
Lala
Text 123
Blue Green
lulu
Text 123
Do not replace either
Text 123
Text 123
Blue Green
preceding should be replaced
Output:
Text 123
Do not replace
Lala
Order Blue Green
lulu
Text 123
Do not replace either
Text 123
Order Blue Green
preceding should be replaced
Platform: Windows and GNU sed version 4.2.1
Note:
On that platform, the sed line can use environment variables for the two text fragments, which you probably want to do:
sed "/%EnvVar2%/{:a;N;s/%EnvVar2%\n%EnvVar2%/%EnvVar2%/p;ta;s/%EnvVar2%\n%EnvVar%/Order %EnvVar%/}"
Platform 2:
still Windows,
using GNU bash, version 3.1.17(1)-release (i686-pc-msys)
and GNU sed version 4.2.1 (same as above).
On this platform, variables can be used like this, for example:
sed "/${EnvVar2}/{:a;N;s/${EnvVar2}\n${EnvVar2}/${EnvVar2}/p;ta;s/${EnvVar2}\n${EnvVar}/Order ${EnvVar}/}"
On this platform it is important to use "..." in order to be able to use variables;
it does not work with '...'.
As Ed Morton has hinted, on all platforms be careful when the text you want to replace itself looks like a variable reference, e.g. "Text $123" in bash. In that case, where you are not using variables but the text merely looks like one, using '...' instead of "..." is the way to go.
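When the fragments do come from variables, one way to sidestep the problem is to escape regex metacharacters before splicing them into the sed script. The following is only a sketch, assuming a POSIX shell and basic regular expressions; quote_re is a hypothetical helper name, not something from the answer above, and it covers the search side only (a replacement containing & or \ would need separate escaping).

```shell
# quote_re (hypothetical helper): backslash-escape the characters that are
# special in a basic regular expression, plus the "/" delimiter.
quote_re() {
  printf '%s\n' "$1" | sed 's/[][\.*^$/]/\\&/g'
}

first='Text $123'   # single quotes: the shell leaves $123 alone

# Double quotes let the shell splice the escaped pattern into the script.
printf '%s\n' 'Text $123' | sed "s/$(quote_re "$first")/found/"
```

The value is assigned in single quotes so the shell never expands $123; only the already-escaped text ends up inside the double-quoted script.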

sed is for simple substitutions on individual lines, that is all. If you find yourself trying to use constructs other than s, g, and p (with -n), then you are on the wrong track, as all other sed constructs became obsolete in the late 1970s when awk was invented.
Your problem is not a substitution on individual lines but on a multi-line record. With GNU awk, which supports a multi-character RS, that is:
$ awk -v RS='^$' -v ORS= '{gsub(/Text 123\nBlue Green/,"Order Blue Green")}1' file
Order Blue Green
but there are several other approaches depending on your real needs.
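One such alternative, sketched here under assumptions, avoids the GNU-only RS by holding back a "Text 123" line until the next line shows whether it should be merged. It assumes both lines match exactly (whole-line string comparison rather than a regex), and join_pair is a hypothetical helper name.

```shell
# join_pair (hypothetical helper): replace the line pair
#   Text 123 / Blue Green
# with the single line "Order Blue Green", using any POSIX awk.
join_pair() {
  awk '
    # A held "Text 123" followed by "Blue Green": emit the merged line.
    held && $0 == "Blue Green" { print "Order Blue Green"; held = 0; next }
    # A held "Text 123" followed by anything else: release it unchanged.
    held { print prev; held = 0 }
    # Hold back a "Text 123" line until the next line has been seen.
    $0 == "Text 123" { prev = $0; held = 1; next }
    { print }
    # Release a "Text 123" that was the last line of the input.
    END { if (held) print prev }
  ' "$@"
}

printf 'Text 123\nDo not replace\nText 123\nText 123\nBlue Green\n' | join_pair
```

Because only one line is ever buffered, this reads the file as a stream instead of loading it whole as a single record.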

Related

color list in a general way

I thought this might be easy but not so much
I want to color the output of a command based on delimiters, in my case
apt-show-versions -u
and want to color the packages' names based on the colon separator, or on the word 'to'. It seems to call for a parser rather than a filter.
I am using a color xterm on Linux and PuTTY.
Others have been interested in similar functionality; I suggest you check the tools (grc/grcat) mentioned there.
You might be able to get away with sed-magic, though. I'm not sure what you want to colour exactly, and neither do I know what the output of apt-show-versions looks like, but this colours everything preceding a colon and the word "to":
cat << EOF | sed -e "s/^[^:]*/\x1b[31m&\x1b[0m/g" | sed -e "s/to/\x1b[31m&\x1b[0m/g"
foo: 1 to 2
bar: 3 to 4
quux: 5 to 6
EOF
You can paste that into a terminal and see if it's what you're looking for. Essentially, it searches for occurrences of regular expressions and surrounds them with ANSI colour codes:
s/X/Y&Y/g : replace X by surrounding with Y, in the entire input (g flag), or, quoting man sed:
s/regexp/replacement/
Attempt to match regexp against the pattern space. If successful, replace that portion matched with replacement. The replacement may contain the special character & to refer to that portion of the pattern space which matched, and the special escapes \1 through \9 to refer to the corresponding matching sub-expressions in the regexp.
^[^:]* : from beginning of line, match everything until you encounter a :
\x1b : hex 1B (decimal 27), the ASCII escape character that starts an escape sequence (see here for more!)
[31m : ANSI colour code for red
[0m : ANSI colour code for "reset to normal output"
If anything, this post taught me that sed captures matches in & ;-) Hope you gained some insight, too!
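The \x1b escape in the replacement is understood by GNU sed; for other sed implementations, a possible variant is to let printf produce the ESC byte and splice it into the script. This is a sketch under that assumption, and colour_names is a hypothetical helper name.

```shell
# colour_names (hypothetical helper): wrap everything before the first colon
# of each line in red, using a literal ESC byte produced by printf.
colour_names() {
  esc=$(printf '\033')
  sed "s/^[^:]*/${esc}[31m&${esc}[0m/"
}

printf 'foo: 1 to 2\nbar: 3 to 4\n' | colour_names
```

Note that a line without any colon is coloured in full, since ^[^:]* then matches the whole line.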

Linux Sed command replace after special character

How can I use the sed command in Linux to replace the value in a key-value pair? I want to replace the characters that occur after the ":".
For example
App.log.level: “xyz”
It sounds like you just want something like sed 's/:.*$/: YOURTEXTHERE/' where the general format is sed 's/REPLACE_THIS/WITH_THIS/g'
The /:.*$/ bit means I want to replace all text from a colon to the end of the line. The : YOURTEXTHERE is what you're replacing with. (I'm putting the colon back in and putting the extra text.) Since I'm only doing one replacement per line, I don't need the g at the end (although it wouldn't hurt anything.)
A real example:
>> echo App.log.level: \"xyz\" | sed 's/:.*$/: YOURTEXTHERE/'
App.log.level: YOURTEXTHERE
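If the file holds several keys, the substitution above would rewrite every line that contains a colon. A possible refinement, sketched here, anchors the pattern to one specific key and keeps the key via a capture group. set_log_level is a hypothetical helper name, and the new value is assumed not to contain "/", "&" or backslashes.

```shell
# set_log_level (hypothetical helper): replace only the value of the
# App.log.level key; every other line passes through unchanged.
set_log_level() {
  # $1 is the new value, assumed free of "/", "&" and backslashes.
  sed 's/^\(App\.log\.level\):.*$/\1: "'"$1"'"/'
}

printf 'App.log.level: "xyz"\nOther.key: "abc"\n' | set_log_level debug
```

The dots in the key are escaped so that they match literal dots rather than any character.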

extract first instance per line (maybe grep?)

I want to extract the first instance of a string per line in linux. I am currently trying grep but it yields all the instances per line. Below I want the strings (numbers and letters) after "tn="...but only the first set per line. The actual characters could be any combination of numbers or letters. And there is a space after them. There is also a space before the tn=
Given the following file:
hello my name is dog tn=12g3 fun 23k3 hello tn=1d3i9 cheese 234kd dks2 tn=6k4k ksk
1263 chairs are good tn=k38493kd cars run vroom it95958 tn=k22djd fair gold tn=293838 tounge
Desired output:
12g3
k38493kd
Here's one way you can do it if you have GNU grep, which (mostly) supports Perl Compatible Regular Expressions with -P. Also, the non-standard switch -o is used to only print the part matching the pattern, rather than the whole line:
grep -Po '^.*?tn=\K\S+' file
The pattern matches the start of the line ^, followed by any characters .*?, where the ? makes the match non-greedy. After the first match of tn=, \K "kills" the previous part so you're only left with the bit you're interested in: one or more non-space characters \S+.
As in Ed's answer, you may wish to add a space before tn to avoid accidentally matching something like footn=.... You might also prefer to use something like \w to match "word" characters (equivalent to [[:alnum:]_]).
Just split the input on the tn= separator and pick the second field. Then split that again to get everything up to the first space:
$ awk -F"tn=" '{split($2,a, " "); print a[1]}' file
12g3
k38493kd
Alternatively, with match(), which also requires the space before tn=:
$ awk 'match($0,/ tn=[[:alnum:]]+/) {print substr($0,RSTART+4,RLENGTH-4)}' file
12g3
k38493kd

delete strings from file bash

I have a file with tons of call logs and I am trying to clean it up using bash. I figured out how to search for a string and delete the entire line it is on but that isn't what I want to accomplish.
As an example of what I want to search for: there are tons of MAC addresses in the file, such as MAC:00-0A-DD-84-01-33, and I want to remove them all.
There is also a call ID at the beginning of each line that looks like: 354469805 or 354469894 and I want to remove all of those as well.
I'm just starting in bash so please excuse my ignorance. I am entering 2 lines of the call log below for clarification. I want to delete the 3544 number, the MAC address, and the word TelePacific.
354469725 06/24/2013 09:34 00:03:26 Chante Squires 105 TelePacific MAC:00-0A-DD-84-01-1D TelePacific 17025290701 1
354469732 06/24/2013 09:59 00:01:16 Chante Squires 105 TelePacific MAC:00-0A-DD-84-01-1D TelePacific 12132238375 1
You could use sed:
sed -i 's/^[0-9]\{9\}\|MAC:[0-9A-Fa-f]\{2\}\([-\:][0-9A-Fa-f]\{2\}\)\{5\}//g' input.log
Between the 's/ and //g' is a regular expression that matches the removal criteria in your question. The s command in front means "substitute" (search and replace). The empty replacement between the last two slashes means that matches are replaced with nothing. The g flag at the end means "replace all matches", not just the first one on each line. Finally, the -i switch means "edit the file in place".
This solution assumes that your call IDs are all 9 digits and that the MAC address has six groups of two hexadecimal digits separated by dashes or colons.
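The expression above leaves the word TelePacific in place, which the question also wanted removed. A possible extension, sketched here with extended regular expressions (sed -E), removes all three items together with the space that follows each. It assumes fields are separated by plain spaces, as in the sample lines, and clean_log is a hypothetical helper name.

```shell
# clean_log (hypothetical helper): drop the leading 9-digit call ID,
# every MAC address, and every occurrence of the word TelePacific.
clean_log() {
  sed -E \
    -e 's/^[0-9]{9} +//' \
    -e 's/MAC:[0-9A-Fa-f]{2}([-:][0-9A-Fa-f]{2}){5} *//g' \
    -e 's/TelePacific *//g'
}

printf '%s\n' '354469725 06/24/2013 09:34 00:03:26 Chante Squires 105 TelePacific MAC:00-0A-DD-84-01-1D TelePacific 17025290701 1' | clean_log
```

Splitting the work into three expressions keeps each pattern readable and avoids the GNU-specific \| alternation.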
One way with awk (you will lose extra tabs and spaces; every field will be separated by a single space):
awk '{for(i=2;i<NF;i++) if(8>i || i>10) printf "%s ", $i; print $NF}' log

Replacing comma on specific lines only

I have a dataset that is comma separated. But I have a little problem with its format. I want everything to be in the form x,x,x
Below is a sample of my dataset:
995970,16779453
995971,16828069
995972,
995973,16828069
995974,16827226
As you can see, most of my dataset is in the proper format but I have those commas on single id#'s also (my data is in form id#, connection#). How would I go about removing the commas on those single id#'s? I can't seem to figure it out just using a text editor. Any suggestions?
Edit: can I use some sort of regex expression to only remove it from those ids that have a specified length?
Edit2: Ok I figured it out using some regex, thanks for all the help!
In vi one would do something like
:%s/,$//
This means
: (enter a line mode command)
% (try the command on every line)
s (substitute)
,$ (match a comma at the end of a line)
(empty replacement text)
Sometimes you need something like /, *$/ to match a comma followed by 0 or more trailing spaces. You can get vi on Windows in various ways; one way is to install Cygwin.
You can select regular expression mode in Notepad++ and do find and replace using the following regex ,$. Leave the replace field blank.
With the sed command:
sed 's/, *$//' < FILE
or in place (requires GNU sed):
sed -i -e 's/, *$//' FILE
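For the follow-up in the question about only touching ids of a specified length, one possible approach is to put the digit count into the pattern. This is a sketch using extended regular expressions (sed -E); strip_comma_for_len and its length parameter are hypothetical.

```shell
# strip_comma_for_len (hypothetical helper): remove a trailing comma only
# from lines that consist of an id with exactly $1 digits.
strip_comma_for_len() {
  sed -E 's/^([0-9]{'"$1"'}),[[:space:]]*$/\1/'
}

printf '995971,16828069\n995972,\n12,\n' | strip_comma_for_len 6
```

Lines whose id has a different length, or that already carry a connection number, pass through unchanged.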

Resources