sed to insert on first match only - linux

UPDATED:
Using sed, how can I insert (NOT SUBSTITUTE) a new line on only the first match of keyword for each file.
Currently I have the following but this inserts for every line containing Matched Keyword and I want it to only insert the New Inserted Line for only the first match found in the file:
sed -ie '/Matched Keyword/ i\New Inserted Line' *.*
For example:
Myfile.txt:
Line 1
Line 2
Line 3
This line contains the Matched Keyword and other stuff
Line 4
This line contains the Matched Keyword and other stuff
Line 6
changed to:
Line 1
Line 2
Line 3
New Inserted Line
This line contains the Matched Keyword and other stuff
Line 4
This line contains the Matched Keyword and other stuff
Line 6

You can sort of do this in GNU sed:
sed '0,/Matched Keyword/s/.*Matched Keyword/New Inserted Line\n&/'
But it's not portable. Since portability is good, here it is in awk:
awk '/Matched Keyword/ && !x {print "Text line to insert"; x=1} 1' inputFile
Or, if you want to pass a variable to print:
awk -v "var=$var" '/Matched Keyword/ && !x {print var; x=1} 1' inputFile
These both insert the text line before the first occurrence of the keyword, on a line by itself, per your example.
Remember that with both sed and awk, the matched keyword is a regular expression, not just a keyword.
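The question asks for one insertion per file. When several files are passed to a single awk invocation, a flag like x stays set after the first file, so one possible sketch resets it whenever FNR is 1. The scratch directory, file names and sample contents below are made up for illustration, and the flag is named seen here.

```shell
#!/bin/sh
# Hypothetical demo files in a scratch directory.
dir=$(mktemp -d)
printf '%s\n' 'Line 1' 'has Matched Keyword' 'has Matched Keyword again' > "$dir/a.txt"
printf '%s\n' 'has Matched Keyword' 'Line 2' > "$dir/b.txt"

# FNR restarts at 1 for every input file, so the flag is cleared per file.
result=$(awk 'FNR == 1 { seen = 0 }
              /Matched Keyword/ && !seen { print "New Inserted Line"; seen = 1 }
              1' "$dir/a.txt" "$dir/b.txt")
printf '%s\n' "$result"
rm -r "$dir"
```

This prints the combined result to standard output; it does not edit the files in place.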
UPDATE:
Since this question is also tagged bash, here's a simple solution that is pure bash and doesn't require sed:
#!/bin/bash
n=0
# IFS= and -r keep leading whitespace and backslashes in each line intact
while IFS= read -r line; do
    if [[ "$line" =~ 'Matched Keyword' && $n = 0 ]]; then
        echo "New Inserted Line"
        n=1
    fi
    printf '%s\n' "$line"
done
As it stands, this works as a filter in a pipe. You can easily wrap it in something that acts on files instead.
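One way to wrap such a loop so that it acts on files could look like the sketch below. The helper name insert_before_first and the demo file are assumptions, not part of the answer. The sketch is written in plain sh syntax so that it also runs under bash, and it matches the keyword as a literal substring.

```shell
#!/bin/sh
# insert_before_first KEYWORD TEXT FILE... (hypothetical helper name)
# The body runs in a subshell, so its variables do not leak to the caller.
insert_before_first() (
    keyword=$1 text=$2
    shift 2
    for file in "$@"; do
        tmp=$(mktemp) || exit 1
        inserted=0
        # The || test also handles a last line with no trailing newline.
        while IFS= read -r line || [ -n "$line" ]; do
            if [ "$inserted" -eq 0 ]; then
                case $line in
                    *"$keyword"*) printf '%s\n' "$text"; inserted=1 ;;
                esac
            fi
            printf '%s\n' "$line"
        done < "$file" > "$tmp"
        # Copy back instead of mv so the file keeps its permissions.
        cat "$tmp" > "$file" && rm -f "$tmp"
    done
)

# Demo on a made-up file.
demo=$(mktemp)
printf '%s\n' 'Line 1' 'has Matched Keyword' 'has Matched Keyword again' > "$demo"
insert_before_first 'Matched Keyword' 'New Inserted Line' "$demo"
result=$(cat "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```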

If you want one with sed*:
sed '0,/Matched Keyword/s/.*Matched Keyword/New Inserted Line\n&/' myfile.txt
*only works with GNU sed

This might work for you:
sed -i -e '/Matched Keyword/{i\New Inserted Line' -e ':a;n;ba}' file
You're nearly there! Just create a loop to read from the Matched Keyword to the end of the file.
After inserting a line, the remainder of the file can be printed out by:
Introducing a loop label :a (here a is an arbitrary name).
Printing the current line and fetching the next into the pattern space with the n command.
Redirecting control back using the ba command, which is essentially a goto to the a label. The end-of-file condition is naturally taken care of by the n command, which terminates any further sed commands if it tries to read past the end-of-file.
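The steps above can be sketched as a multi-line sed script with one command per line and a comment for each. The demo file and its contents are made up for illustration.

```shell
#!/bin/sh
demo=$(mktemp)
printf '%s\n' 'Line 1' 'has Matched Keyword' 'Line 2' 'has Matched Keyword again' > "$demo"

result=$(sed '/Matched Keyword/{
# Output the new text ahead of the current line.
i\
New Inserted Line
# Label marking the start of the print loop.
:a
# Print the current line, then load the next one (sed stops at end of file).
n
# Jump back to the label, so no further matching is attempted.
ba
}' "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

Because control never returns to the top of the script after the first match, later lines containing the keyword are printed unchanged.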
With a little help from bash, a true one liner can be achieved:
sed $'/Matched Keyword/{iNew Inserted Line\n:a;n;ba}' file
Alternative:
sed 'x;/./{x;b};x;/Matched Keyword/h;//iNew Inserted Line' file
This uses the Matched Keyword as a flag in the hold space and once it has been set any processing is curtailed by bailing out immediately.

If you want to append a line after the first match only, use awk instead of sed, as below:
awk '{print} /Matched Keyword/ && !n {print "New Inserted Line"; n++}' myfile.txt
Output:
Line 1
Line 2
Line 3
This line contains the Matched Keyword and other stuff
New Inserted Line
Line 4
This line contains the Matched Keyword and other stuff
Line 6

Related

Create newline before line matching a pattern, if there is no newline yet

I have a text file with a few lines in it. What I am trying to do is find all lines matching a pattern and, if the line before them is not blank (i.e. it is non-empty), insert a blank line there.
Something like this, but it is not working properly:
sed -i '/[a-zA-Z0-9]/{N;/PATTERN/{s/PATTERN/\nPATTERN/}}' FILENAME
I know it could probably be done more easily and nicely in awk or perl/bash, but I would prefer a one-line/one-step solution.
Sample input file:
LINE1
LINE2
PATTERN
LINE3
PATTERN
LINE4
Expected output:
LINE1
LINE2

PATTERN
LINE3

PATTERN
LINE4
I'm not very good at sed but here's how I'd do it in awk:
awk 'prev != "" && /PATTERN/ { print "" } { prev = $0; print }' file
If prev (the previous line) is not empty and the current line matches /PATTERN/ then print a blank line. Unconditionally save the current line for comparison with the next, and print the current line.
To achieve an "in-place" edit (like sed -i), just redirect the command to a temporary file and then overwrite the original:
awk 'prev != "" && /PATTERN/ { print "" } { prev = $0; print }' file > tmp && mv tmp file
Note that since prev is initially unset, this won't print a newline at the start of the output, even if the first line matches /PATTERN/. To get around this, you can change the condition to:
(NR == 1 || prev != "") && /PATTERN/
You can also achieve the in-place edit with GNU awk, using the -i inplace option.
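For the temporary-file approach, a slightly more defensive sketch might use mktemp rather than a fixed name like tmp, and only overwrite the original if awk succeeded. The helper name blank_before_pattern and the demo file are assumptions for illustration.

```shell
#!/bin/sh
# blank_before_pattern FILE (hypothetical helper name)
# PATTERN is the placeholder regex from the question.
blank_before_pattern() (
    file=$1
    tmp=$(mktemp) || exit 1
    if awk 'prev != "" && /PATTERN/ { print "" } { prev = $0; print }' "$file" > "$tmp"
    then
        # Copy back rather than mv so the file keeps its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
)

# Demo on a made-up file in which the second PATTERN already follows a blank line.
demo=$(mktemp)
printf '%s\n' 'LINE1' 'LINE2' 'PATTERN' 'LINE3' '' 'PATTERN' 'LINE4' > "$demo"
blank_before_pattern "$demo"
result=$(cat "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```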
Take a look at this GNU sed (note that awk is a better tool for the job):
sed -i '/PATTERN/{x;/^$/!i\

x};h' input
h is a command that saves the contents of the pattern space into the hold buffer. It saves the line at the end of each cycle so that it can be used as the "previous" line in the next cycle.
x exchanges the contents of the hold and pattern spaces. Whenever the current line matches your /PATTERN/, the previously saved line is put into the pattern space. If the previous line is NOT empty (/^$/!), a blank line is inserted with the i command. The current line is then put back into the pattern space with the x command.
If you want to add a newline even if the first line matches /PATTERN/, use:
sed -i '/PATTERN/{1h;x;/^$/! ...
Further reading:
GNU sed: Less Frequently-Used Commands
grymoire.com sed tutorial

Remove new line character by checking the expression, using sed

I have to write a script that updates a file in the following way.
raw file:
<?blah blah blah?>
<pen>
<?pineapple?>
<apple>
<pen>
Final file:
<?blah blah blah?><pen>
<?pineapple?><apple><pen>
Wherever a newline character in the file is not followed by
<?
we have to remove that newline, so that the line is appended to the end of the previous line.
Also it will be really helpful if you explain how your sed works.
Perl solution:
perl -pe 'chomp; substr $_, 0, 0, "\n" if $. > 1 && /^<\?/'
-p reads the input line by line, printing each line after changes
chomp removes the final newline
substr with 4 arguments modifies the input string, here it prepends newline if it's not the first line ($. is the input line number) and the line starts with <?.
Sed solution:
sed ':a;N;$!ba;s/\n\(<[^?]\)/\1/g' file > newfile
The basic idea is to replace every
\n followed by < not followed by ?
with what you matched except the \n.
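As a rough walk-through of that sed command, the sketch below splits it into separate -e expressions with a comment per step and runs it on the sample from the question. The temporary demo file is an assumption for illustration.

```shell
#!/bin/sh
demo=$(mktemp)
printf '%s\n' '<?blah blah blah?>' '<pen>' '<?pineapple?>' '<apple>' '<pen>' > "$demo"

# :a      label for the loop
# N       append the next input line to the pattern space
# $!ba    unless this is the last line, jump back to the label
# s/../   drop each newline that is followed by "<" and a character other than "?"
result=$(sed -e ':a' -e 'N' -e '$!ba' -e 's/\n\(<[^?]\)/\1/g' "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

Note that the loop collects the whole file in the pattern space before substituting, so this is best suited to files that fit comfortably in memory.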
If you are happy with a solution that puts every <? at the start of a line, you can combine tr with sed.
tr -d '\n' < inputfile | sed 's/<?/\n&/g;s/^\n//;$s/$/\n/'
Explanation:
I use tr ... < inputfile and not cat inputfile | tr ..., avoiding an additional cat call.
The sed command has 3 parts.
s/<?/\n&/g inserts a newline before every <?; the & re-inserts the matched string (here always <?, so using & saves only one character).
s/^\n// removes the empty first line that results when the input itself starts with <?.
With $s/$/\n/ a newline is appended at the end of the last line.
EDIT: When you only want newlines before <? where you had them already,
you can use awk:
awk 'NR > 1 && $1 ~ /^<\?/ {print ""} {printf("%s",$0)} END {print ""}' inputfile
Explanation:
Consider the newline as the start of the line, not the end. Then your question transposes into "write a newline when the line starts with <?". You must escape the ? and use ^ for the start of the line. Note that print "" writes only a newline, whereas a bare print would write the whole line; NR > 1 avoids an empty first line.
awk 'NR > 1 && $1 ~ /^<\?/ {print ""}'
Next print the line you read without a newline character.
And you want a newline at the end.

delete a line after a pattern only if it is blank using sed or awk

Using sed or awk, I want to delete a blank line only if it directly follows the line matching my pattern.
for example if I have
G

O TO P999-ERREUR
END-IF.
the pattern in this case is G
I want to have this output
G
O TO P999-ERREUR
END-IF.
This will do the trick:
$ awk -v n=-2 'NR==n+1 && !NF{next} /G/ {n=NR}1' file
G
O TO P999-ERREUR
END-IF.
Explanation:
-v n=-2 # Set n=-2 before the script is run to avoid not printing the first line
NR == n+1 # If the current line number is equal to the matching line + 1
&& !NF # And the line is empty
{next} # Skip the line (don't print it)
/G/ # The regular expression to match
{n = NR} # Save the current line number in the variable n
1 # Truthy value used as shorthand to print every (non-skipped) line
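The same rule can also be sketched with a flag instead of line-number arithmetic. This is an alternative formulation, not the answer's method; the variable name after and the demo file are assumptions. The second blank line in the demo is there to show that only a blank line directly after the match is dropped.

```shell
#!/bin/sh
demo=$(mktemp)
printf '%s\n' 'G' '' 'O TO P999-ERREUR' '' 'END-IF.' > "$demo"

# after is 1 only while the previous line matched /G/.
result=$(awk 'after && !NF { after = 0; next }   # blank line right after a match: skip it
              { after = ($0 ~ /G/) }             # remember whether this line matched
              1                                  # print every line that was not skipped
             ' "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```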
Using sed
sed '/GG/{N;s/\n$//}' file
When it sees GG, it appends the next line and removes the newline between them if that next line is empty.
Note that this removes only one blank line after the match, and the line must be truly empty, i.e. contain no spaces or tabs.
This might work for you (GNU sed):
sed -r 'N;s/(G.*)\n\s*$/\1/;P;D' file
Keep a moving window of two lines throughout the length of the file and remove a newline (and any whitespace) if it follows the intended pattern.
Using ex (edit in-place):
ex +'/G/j' -cwq foo.txt
or print to the standard output (from file or stdin):
ex -s +'/GG/j|%p|q!' file_or_/dev/stdin
where:
/GG/j - joins the next line when the pattern is found
%p - prints the buffer
q! - quits
For conditional checking (if there is a blank line), try:
ex -s +'%s/^\(G\)\n/\1/' +'%p|q!' file_or_/dev/stdin

print specific line if it is matches with the line after it

I have a log file containing the following info:
<msisdn>37495989804</msisdn>
<address>10.14.14.26</address>
<msisdn>37495371855</msisdn>
<address>10.14.0.172</address>
<msisdn>37495989832</msisdn>
<address>10.14.14.29</address>
<msisdn>37495479810</msisdn>
<address>10.14.1.11</address>
<msisdn>37495429157</msisdn>
<address>10.14.0.213</address>
<msisdn>37495275824</msisdn>
<msisdn>37495739176</msisdn>
<address>10.14.2.86</address>
<msisdn>37495479840</msisdn>
<address>10.14.1.12</address>
<msisdn>37495706059</msisdn>
<msisdn>37495619889</msisdn>
<address>10.14.1.198</address>
<msisdn>37495574341</msisdn>
<address>10.14.1.148</address>
<msisdn>37495391624</msisdn>
<address>10.14.0.188</address>
<msisdn>37495989796</msisdn>
<address>10.14.14.24</address>
<msisdn>37495835940</msisdn>
<address>10.14.2.164</address>
<msisdn>37495743249</msisdn>
<address>10.14.2.94</address>
<msisdn>37495674117</msisdn>
<address>10.14.1.236</address>
<msisdn>37495754536</msisdn>
<address>10.14.2.120</address>
<msisdn>37495576434</msisdn>
<msisdn>37495823889</msisdn>
<address>10.14.2.159</address>
There are some lines where the 'msisdn' line is not followed by an 'address' line, like this:
<msisdn>37495576434</msisdn>
<msisdn>37495823889</msisdn>
I would like to write a script which will output only the lines ('msisdn' lines), that aren't followed by 'address'. Expected output:
<msisdn>37495275824</msisdn>
<msisdn>37495706059</msisdn>
<msisdn>37495576434</msisdn>
A solution using awk or sed would be perfect.
Thanks.
One way with awk:
awk '/address/{p=0}p{print a;p=0}/msisdn/{a=$0;p=1}' log
You can use pcregrep to match an msisdn line whose next line is not an address line, and then use awk to show only the first line of each match:
pcregrep -M '(.*</msisdn>)\n.*<msi' log | awk 'NR % 2 == 1'
This might work for you (GNU sed):
sed -r '$!N;/(<msisdn>).*\n.*\1/P;D' file
This reads two lines into the pattern space and tries to match the pattern <msisdn> in both of them. If the pattern matches, it prints the first line. The first line is then deleted and the process begins again; however, since the pattern space still contains the second line (now the first), the automatic reading of a line is skipped and processing resumes at $!N.
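The two-line window described above can also be sketched in awk by remembering the previous line. The variable name prev and the sample numbers and addresses below are made up for illustration.

```shell
#!/bin/sh
demo=$(mktemp)
printf '%s\n' '<msisdn>111</msisdn>' '<address>10.0.0.1</address>' \
              '<msisdn>222</msisdn>' '<msisdn>333</msisdn>' \
              '<address>10.0.0.2</address>' > "$demo"

# Print the previous line when both it and the current line are msisdn lines.
result=$(awk '/<msisdn>/ && prev ~ /<msisdn>/ { print prev } { prev = $0 }' "$demo")
printf '%s\n' "$result"
rm -f "$demo"
```

Like the sed version, this does not report an msisdn line that is the very last line of the file.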
Perl has its own way to do this:
perl -lne 'if($prev && $_!~/\./){print $prev}unless(/\./){$prev=$_}else{undef $prev}' your_file
Tested Below:
> perl -lne 'if($prev && $_!~/\./){print $prev}unless(/\./){$prev=$_}else{undef $prev}' temp
<msisdn>37495275824</msisdn>
<msisdn>37495706059</msisdn>
<msisdn>37495576434</msisdn>
>

vim/vi/sed: Act on a certain number of lines from the end of the file

Just as we can delete (or substitute, or yank, etc.) the 4th to 6th lines from the beginning of a file in vim:
:4,6d
I'd like to delete (or substitute, or yank, etc.) the 4th to the 6th lines counting from the end of a file. That is, if the file has 15 lines, I'd do:
:10,12d
But one can't do this when they don't know how many lines are in the files -- and I'm going to use it on a batch of many files. How do I do this in vim and sed?
I did in fact look at this post, but have not found it useful.
Well, using vim you can try the following, which is fairly intuitive anyway:
:$-5,$-3d
Now, using sed I couldn't find an exact way to do it, but if you can use something other than sed, here is a solution with head and tail (a negative count for head -n is a GNU extension):
head -n -6 file.txt && tail -n 3 file.txt
In Vim, you can subtract the line numbers from $, which stands for the last line, e.g. this will work on the last 3 lines:
:$-2,$substitute/...
In sed, this is not so easy, because it works on the stream of characters, and cannot simply go back. You would have to store a number of last seen lines in the hold space, and at the end of the stream work on the hold space.
Here are some recipes from sed1line.txt:
# print the last 10 lines of a file (emulates "tail")
sed -e :a -e '$q;N;11,$D;ba'
# print the last 2 lines of a file (emulates "tail -2")
sed '$!N;$!D'
# delete the last 2 lines of a file
sed 'N;$!P;$!D;$d'
# delete the last 10 lines of a file
sed -e :a -e '$d;N;2,10ba' -e 'P;D' # method 1
sed -n -e :a -e '1,10!{P;N;D;};N;ba' # method 2
To act on the 4th to the 6th lines from the end of a file, use tac to reverse the file:
tac filename | sed 4,6d | tac
You can use 2 passes with awk, first pass to count the number of lines and the second to print or delete whatever lines you like, e.g. to delete, and then to print, the 4th to 6th lines from the end (fromEnd is 1 for the last line):
awk 'NR==FNR{numLines++;next} {fromEnd = numLines - FNR + 1} fromEnd > 6 || fromEnd < 4' file file
awk 'NR==FNR{numLines++;next} {fromEnd = numLines - FNR + 1} fromEnd <= 6 && fromEnd >= 4' file file
This might work for you (GNU sed):
sed -r ':a;${s/([^\n]*\n){3}//;q};N;7,$!ba;P;D' file
This works by making a moving window of 6 lines in the pattern space (PS) and then deleting the first three of them on encountering the last line.
:a is a loop label
${s/([^\n]*\n){3}//;q} delete the first three lines of the PS at end of file and quit.
N appends a newline and then the next line to the PS.
7,$!ba if the current line is not in the range 7 to $ (end of file), that is, for lines 1 to 6, loop back to the label :a.
P;D for lines 7 to $ (end of file), print up to the first newline in the PS, then delete up to and including the first newline and begin a new cycle.
The second to last clause creates the window by default in that the lines 1 to 6 are appended into the PS. From line 7 to the end a line is added at the end and the first line is printed then deleted.
Alternatively:
sed -e ':a' -e '$s/\([^\n]*\n\)\{3\}//' -e '$q' -e 'N' -e '7,$!ba' -e 'P' -e 'D' file
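If reading the file twice is acceptable, another possible sketch counts the lines first and then hands sed ordinary line numbers. The helper name delete_from_end and its argument order are assumptions for illustration.

```shell
#!/bin/sh
# delete_from_end FILE NEAR FAR (hypothetical helper name)
# Deletes the NEAR-th to FAR-th lines counted from the end and prints the rest.
delete_from_end() (
    file=$1 near=$2 far=$3
    # wc -l counts newline characters; the arithmetic also strips any padding.
    total=$(( $(wc -l < "$file") ))
    first=$((total - far + 1))
    last=$((total - near + 1))
    # Refuse to run when the file is shorter than FAR lines.
    [ "$first" -ge 1 ] || exit 1
    sed "${first},${last}d" "$file"
)

# Demo: a made-up 15-line file, deleting the 4th to 6th lines from the end.
demo=$(mktemp)
printf '%s\n' 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 > "$demo"
result=$(delete_from_end "$demo" 4 6)
printf '%s\n' "$result"
rm -f "$demo"
```

For the 15-line demo this removes lines 10 to 12, matching the :10,12d example in the question. A file whose last line lacks a trailing newline would be counted one line short.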

Resources