Remove line break every nth line using sed - linux

Is there a way to use sed to remove/substitute a pattern in a file on every 3n+1 and 3n+2 line?
For example, turn
Line 1n/
Line 2n/
Line 3n/
Line 4n/
Line 5n/
Line 6n/
Line 7n/
...
To
Line 1 Line 2 Line 3n/
Line 4 Line 5 Line 6n/
...
I know this can probably be handled by awk. But what about sed?

Well, I'd just use awk for that [1] since it's a little more complex but, if you're really intent on using sed, the following command will combine groups of three lines into a single line (which appears to be what you're after based on the title and text, despite the strange use of n/ for newline):
sed '$!N;$!N;s/\n/ /g'
See the following transcript for how to test this:
$ printf 'Line 1\nLine 2\nLine 3\nLine 4\nLine 5\n' | sed '$!N;$!N;s/\n/ /g'
Line 1 Line 2 Line 3
Line 4 Line 5
The sub-commands are as follows:
$!N will append the next line to the pattern space, but only if you're not on the last line (you do this twice to get three lines). Each line in the pattern space is separated by a newline character.
s/\n/ /g replaces all the newlines in the pattern space with a space character, effectively combining the three lines into one.
[1] With something like:
awk '{if(NR%3==1){s="";if(NR>1){print ""}};printf s"%s", $0;s=" "} END{if(NR)print ""}'
This is complicated by the likelihood you don't want an extraneous space at the end of each line, necessitating the introduction of the s variable.
Since the sed variant is smaller (and less complex once you understand it), you're probably better off sticking with it. Well, at least up to the point where you want to combine groups of 17 lines, or do something else more complex than sed was meant to handle :-)
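For larger groups such as the 17 lines mentioned above, the awk route generalizes more easily. A minimal sketch, assuming single-space joins; the helper name join_every is illustrative and not part of the answer:

```shell
# join_every N: join each group of N input lines with single spaces.
# (join_every is an illustrative name, not part of the answer above.)
join_every() {
    awk -v n="$1" '
        # Emit the separator before each line: nothing before the first line,
        # a newline at the start of every later group, a space otherwise.
        { printf "%s%s", (NR == 1 ? "" : ((NR - 1) % n == 0 ? "\n" : " ")), $0 }
        END { if (NR) print "" }   # terminate the last (possibly short) group
    '
}

printf 'Line %d\n' 1 2 3 4 5 | join_every 3
```

Printing the separator before each line rather than after it avoids a trailing space when the last group is short.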

The example merges 3 consecutive lines, although the description says something different. To generate the example output, you can use this awk idiom:
awk 'ORS=NR%3?FS:RS' <(seq 1 9)
1 2 3
4 5 6
7 8 9
In your case the record separator needs to be defined up front to include the literal n/ (a multi-character RS is not guaranteed by POSIX; gawk supports it):
awk -v RS="n/\\n" 'ORS=NR%3?FS:RS'
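The idiom is terse: the assignment ORS=NR%3?FS:RS is itself the pattern, and because the assigned value (a space or a newline) is a non-empty string, the pattern is true and awk performs its default print using that ORS. An expanded equivalent, as a sketch (fold3 is an illustrative name):

```shell
fold3() {
    awk '{
        # Choose the separator printed after this record: a space inside a
        # group, a newline after every third record.
        ORS = (NR % 3 ? " " : "\n")
        print
    }'
}

seq 1 9 | fold3
```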

OK. The following are general ways to deal with it using awk and sed.
awk:
awk 'NR % 3 { sub(/pattern/, "substitution") } { print }' file | paste -d' ' - - -
sed:
sed 's/pattern/substitution/; n; s/pattern/substitution/; n' file | paste -d' ' - - -
Both of them replace pattern with substitution on lines 3n+1 and 3n+2 and leave line 3n untouched.
paste - - - reads standard input three times per output line, which folds the stream into groups of 3.
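Applied to the question's sample, the awk variant could look like the sketch below. It assumes the pattern to remove is a trailing literal n/; the function name strip_and_fold is illustrative:

```shell
strip_and_fold() {
    # Remove a trailing literal "n/" on lines 3n+1 and 3n+2 only
    # (NR % 3 is non-zero there), then fold every three lines into one.
    awk 'NR % 3 { sub(/n\/$/, "") } { print }' | paste -d' ' - - -
}

printf 'Line %dn/\n' 1 2 3 4 5 6 | strip_and_fold
```

When the number of input lines is not a multiple of 3, paste pads the last output line with delimiters, so it ends in trailing spaces.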

Related

Append then delete line to another line, only if it does not contain character

In my text file, there are 6 lines in a group separated by two blank lines. I have printed the line number for each line to the text document.
365:--------------------------------------------------------------------------------
366:--------------------------------------------------------------------------------
367:--------------------------------------------------------------------------------
368:--------------------------------------------------------------------------x-----
369:--------------------4-----------------------------------------------------------
370:--0-----------------------------------------------------------------------------
371:
372:
373:--------------------------------------------------------------------|
374:--------------------------------------------------------------------|
375:------------0--------2--------3h----2h----0-----2-------------------|
376:---2-----------------------------------------------------2----------|
377:--------------------------------------------------------------------|
378:--------------------------------------------------------------------|
Currently only 80 characters are printed to a line, so the rest of the data continues in the next group. For example, Line 365 corresponds to Line 373.
For only lines that do not contain a vertical bar (i.e., lines 365-370), I am trying to 1) append the line that is 8 lines away, then 2) delete the appended line after it has been printed.
So, ideally:
365:----------------------------------------------------------------------------------------------------------------------------------------------------|
366:----------------------------------------------------------------------------------------------------------------------------------------------------|
367:--------------------------------------------------------------------------------------------0--------2--------3h----2h----0-----2-------------------|
368:--------------------------------------------------------------------------x--------2-----------------------------------------------------2----------|
369:--------------------4-------------------------------------------------------------------------------------------------------------------------------|
370:--0-------------------------------------------------------------------------------------------------------------------------------------------------|
I can isolate the lines that do not contain a vertical bar using grep
grep -vn \| song.txt
I know that SED or AWK are likely my best bet, but I'm not sure how to proceed from here.
Just massage this approach to suit:
$ seq 16 | awk 'NR>8{print a[NR%8], $0} {a[NR%8]=$0}'
1 9
2 10
3 11
4 12
5 13
6 14
7 15
8 16
e.g. assuming 2 blank lines at the end of your input to make it blocks of 8 lines:
$ awk 'NR>8{print a[NR%8] $0} {a[NR%8]=$0}' file
--------------------------------------------------------------------------------------------------------------------------------------------------|
--------------------------------------------------------------------------------------------------------------------------------------------------|
------------------------------------------------------------------------------------------0--------2--------3h----2h----0-----2-------------------|
-------------------------------------------------------------------------x-------2-----------------------------------------------------2----------|
-------------------4------------------------------------------------------------------------------------------------------------------------------|
-0------------------------------------------------------------------------------------------------------------------------------------------------|
or if you don't have those blank lines after the last block:
$ awk '!NF{next} ++cnt>6{print a[cnt%6] $0} {a[cnt%6]=$0}' file
--------------------------------------------------------------------------------------------------------------------------------------------------|
--------------------------------------------------------------------------------------------------------------------------------------------------|
------------------------------------------------------------------------------------------0--------2--------3h----2h----0-----2-------------------|
-------------------------------------------------------------------------x-------2-----------------------------------------------------2----------|
-------------------4------------------------------------------------------------------------------------------------------------------------------|
-0------------------------------------------------------------------------------------------------------------------------------------------------|
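A sketch of a variant that does not depend on line positions. It assumes that, as in the question, the first half of each row is a line without a vertical bar, its continuation is a line that contains one, and both halves appear in the same order; merge_halves is an illustrative name:

```shell
merge_halves() {
    awk '
        NF == 0 { next }                              # skip the blank separator lines
        index($0, "|") == 0 { left[++n] = $0; next }  # no bar: first half, remember it
        { print left[++m] $0 }                        # bar: append to the matching first half
    '
}

printf '%s\n' '--x--' '-0---' '' '' '--2-|' '----|' | merge_halves
```

Because the halves are paired by running counters rather than by NR, the same function also copes with several wrapped groups in one file.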
A little bit ugly, but working:
Split your input:
grep -Ev '^$|\|' song.txt >file1
grep -F '|' song.txt >file2
And put it together:
paste -d "" file1 file2
I usually use the vim program for this type of work. For example, assuming you have a file named file_name.txt with the following content
-------------------------8----
------------0--------2--------|
---2--------------------------|
------------------aaa---------|
---------------984asds--------|
---------t6776----------------|
with the following command
vim -c ":6y" -c ":put" -c ":1" -c ":join!" -c ":6d" -c ":wq" file_name.txt
the program opens file_name.txt on the first line, copies the sixth line, pastes the copied text as the second line (the line after the cursor), goes to the first line, joins the first line with the second (join! adds no space between them), deletes the line that was copied (the sixth line), and saves and closes the file. This command produces the following result:
-------------------------8-------------------984asds--------|
------------0--------2--------|
---2--------------------------|
------------------aaa---------|
---------t6776----------------|
This might work for you (GNU utils):
sed '/^$/d' file |
split -nr/6 --filter 'cat'|
paste -sd'\0'|
sed 's/|/&\n/g;s/\n$//'
This removes any blank lines using sed, splits the input into 6 parts using a round-robin method and, instead of writing separate files, sends all the parts interleaved to stdout. The lines are then pasted into one long line and split back into shorter lines, using | as the record separator.

How to remove all data from a file before a line containing string by passing variable in linux

I am trying to trim all the data above a matching line in a file, where the string to match is passed in through a variable:
varfile=$(cat variable.txt)
echo "$varfile"
if [ -z "$varfile" ]; then
echo "null"
else
echo "data"
sed "1,/$varfile/d" fileee.txt
fi
Here I take a string from the file variable.txt, look for that text in fileee.txt, and remove all the data above the matching line.
Example: variable.txt contains 3.
I find 3 in fileee.txt and remove the data above it.
INPUT:
1
2
3
4
OUTPUT:
3
4
I suppose the issue here is that you want to remove all lines before the match, but not the matching line itself?
One way, with GNU sed, is to explicitly add a print for the matching line first:
pattrn=3
seq 1 4 | sed -e "/$pattrn/p;1,/$pattrn/d"
Though this will duplicate any further lines that match the pattern.
Better, invert the sense of the match:
seq 1 4 | sed -ne "/$pattrn/,\$p"
That is, don't print by default (-n), but print (p) anything from a match to the end ($, escaped because of the double-quoted string)
Even better would be to use awk:
pattrn=3
seq 1 4 | awk -v pat="$pattrn" '$0 ~ pat {p=1} p'
This sets a flag on the first line that matches the pattern ($0 is the whole line, ~ is a regex match), then prints every line while that flag is set.
The awk solution is also better in that special characters in the pattern don't cause issues (at least not as many); in the sed case, if the pattern contains a slash /, it will terminate the regex in the sed code, and cause syntax errors or allow for code injection.
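If the string should be matched literally rather than as a regex, awk's index() avoids regex metacharacters altogether. A minimal sketch; the function name from_match and the sample input are illustrative:

```shell
from_match() {
    # $1 is taken as a literal string, not a regex, so "/" and "." are safe.
    awk -v s="$1" 'index($0, s) { found = 1 } found'
}

printf '%s\n' 'a/b' 'x.y' 'x/y' 'tail' | from_match 'x/y'
```

Note that awk -v still interprets backslash escape sequences in the assigned value; passing the string through the environment and reading it via ENVIRON avoids that.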
I used seq from GNU coreutils here only to make up the sequence of numbers for input.

vim/vi/sed: Act on a certain number of lines from the end of the file

Just as we can delete (or substitute, or yank, etc.) the 4th to 6th lines from the beginning of a file in vim:
:4,6d
I'd like to delete (or substitute, or yank, etc.) the 4th to the 6th lines counted from the end of a file. That is, if the file has 15 lines, I'd do:
:10,12d
But one can't do this when they don't know how many lines are in the files -- and I'm going to use it on a batch of many files. How do I do this in vim and sed?
I did in fact look at this post, but have not found it useful.
Well, using vim you can try the following, which is quite intuitive anyway:
:$-5,$-3d
Now, using sed I couldn't find an exact way to do it, but if you can use something other than sed, here is a solution with head and tail (the negative count for head -n requires GNU head):
head -n -6 file.txt && tail -n 3 file.txt
In Vim, you can subtract the line numbers from $, which stands for the last line, e.g. this will work on the last 3 lines:
:$-2,$substitute/...
In sed, this is not so easy, because it processes the stream line by line and cannot simply go back. You would have to keep a number of the last seen lines in a buffer (the pattern space or the hold space), and work on that buffer at the end of the stream.
Here are some recipes from sed1line.txt:
# print the last 10 lines of a file (emulates "tail")
sed -e :a -e '$q;N;11,$D;ba'
# print the last 2 lines of a file (emulates "tail -2")
sed '$!N;$!D'
# delete the last 2 lines of a file
sed 'N;$!P;$!D;$d'
# delete the last 10 lines of a file
sed -e :a -e '$d;N;2,10ba' -e 'P;D' # method 1
sed -n -e :a -e '1,10!{P;N;D;};N;ba' # method 2
To act on the 4th to the 6th lines from the end of a file, use tac to reverse the file:
tac filename | sed 4,6d | tac
You can use 2 passes with awk, the first pass to count the number of lines and the second to delete or print whatever lines you like. The first command below deletes the 4th to 6th lines from the end; the second prints only those lines:
awk 'NR==FNR{numLines++;next} {fromEnd = numLines - FNR + 1} fromEnd > 6 || fromEnd < 4' file file
awk 'NR==FNR{numLines++;next} {fromEnd = numLines - FNR + 1} fromEnd <= 6 && fromEnd >= 4' file file
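The two-pass form needs a real file, since it reads the input twice. For piped input, a single-pass sketch that buffers all lines in memory might look like this (del_from_end and its arguments are illustrative names):

```shell
del_from_end() {
    # $1, $2: first and last position to delete, counted from the end
    # (1 = last line). The whole input is buffered in memory.
    awk -v lo="$1" -v hi="$2" '
        { line[NR] = $0 }
        END {
            for (i = 1; i <= NR; i++) {
                k = NR - i + 1            # position of line i from the end
                if (k < lo || k > hi) print line[i]
            }
        }
    '
}

seq 1 15 | del_from_end 4 6
```

For a 15-line input this drops lines 10 to 12, matching the :10,12d example in the question.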
This might work for you (GNU sed):
sed -r ':a;${s/([^\n]*\n){3}//;q};N;7,$!ba;P;D' file
This works by making a moving window of 6 lines in the pattern space (PS) and then deleting the first three of them on encountering the last line.
:a is a loop label
${s/([^\n]*\n){3}//;q} delete the first three lines of the PS at end of file and quit.
N append a newline and then the next line to the PS.
7,$!ba if the current line is not in the range 7 to $ (end-of-file), that is, for lines 1 to 6, loop back to the beginning, i.e. label :a
P;D for the line range 7 to $ (end-of-file), print up to the first newline in the PS, then delete up to and including the first newline and begin a new cycle.
The second to last clause creates the window by default in that the lines 1 to 6 are appended into the PS. From line 7 to the end a line is added at the end and the first line is printed then deleted.
Alternatively:
sed -e ':a' -e '$s/\([^\n]*\n\)\{3\}//' -e '$q' -e 'N' -e '7,$!ba' -e 'P' -e 'D' file

sed to insert on first match only

UPDATED:
Using sed, how can I insert (NOT SUBSTITUTE) a new line on only the first match of keyword for each file.
Currently I have the following but this inserts for every line containing Matched Keyword and I want it to only insert the New Inserted Line for only the first match found in the file:
sed -ie '/Matched Keyword/ i\New Inserted Line' *.*
For example:
Myfile.txt:
Line 1
Line 2
Line 3
This line contains the Matched Keyword and other stuff
Line 4
This line contains the Matched Keyword and other stuff
Line 6
changed to:
Line 1
Line 2
Line 3
New Inserted Line
This line contains the Matched Keyword and other stuff
Line 4
This line contains the Matched Keyword and other stuff
Line 6
You can sort of do this in GNU sed:
sed '0,/^.*Matched Keyword/s//New Inserted Line\n&/'
But it's not portable. Since portability is good, here it is in awk:
awk '/Matched Keyword/ && !x {print "Text line to insert"; x=1} 1' inputFile
Or, if you want to pass a variable to print:
awk -v "var=$var" '/Matched Keyword/ && !x {print var; x=1} 1' inputFile
These both insert the text line before the first occurrence of the keyword, on a line by itself, per your example.
Remember that with both sed and awk, the matched keyword is a regular expression, not just a keyword.
UPDATE:
Since this question is also tagged bash, here's a simple solution that is pure bash and doesn't require sed:
#!/bin/bash
n=0
# IFS= and -r keep leading whitespace and backslashes intact
while IFS= read -r line; do
    if [[ "$line" =~ 'Matched Keyword' && $n = 0 ]]; then
        echo "New Inserted Line"
        n=1
    fi
    printf '%s\n' "$line"
done
As it stands, this acts as a filter in a pipe. You can easily wrap it in something that acts on files instead.
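One possible wrapper that acts on files, as a sketch in POSIX shell. It assumes the keyword is matched as a literal substring; the function name insert_before_first and its argument order are illustrative:

```shell
insert_before_first() {
    # $1: literal keyword, $2: line to insert, remaining args: files to edit
    kw=$1 new=$2
    shift 2
    for f in "$@"; do
        tmp=$(mktemp) || return 1
        n=0
        # "|| [ -n "$line" ]" also handles a last line without a newline
        while IFS= read -r line || [ -n "$line" ]; do
            case $line in
                *"$kw"*) if [ "$n" -eq 0 ]; then printf '%s\n' "$new"; n=1; fi ;;
            esac
            printf '%s\n' "$line"
        done < "$f" > "$tmp" && cat "$tmp" > "$f"
        rm -f "$tmp"
    done
}
```

Writing the result back with cat rather than mv keeps the original file's permissions and ownership. It would be called as insert_before_first 'Matched Keyword' 'New Inserted Line' Myfile.txt.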
If you want one with sed*:
sed '0,/^.*Matched Keyword/s//New Inserted Line\n&/' myfile.txt
*only works with GNU sed
This might work for you:
sed -i -e '/Matched Keyword/{i\New Inserted Line' -e ':a;n;ba}' file
You're nearly there! Just create a loop to read from the Matched Keyword to the end of the file.
After inserting a line, the remainder of the file can be printed out by:
Introducing a loop place holder :a (here a is an arbitrary name).
Print the current line and fetch the next into the pattern space with the n command.
Redirect control back using the ba command, which is essentially a goto to the a place holder. The end-of-file condition is naturally taken care of by the n command, which terminates any further sed commands if it tries to read past the end-of-file.
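The same loop can be sketched in a multi-line form that avoids the GNU one-line syntax for i. The $! guard is an addition here, so the script does not rely on what n does at end-of-file; the wrapper name insert_first_sed is illustrative:

```shell
insert_first_sed() {
    sed '/Matched Keyword/{
i\
New Inserted Line
:a
$!{
n
ba
}
}' "$@"
}

printf '%s\n' 'Line 1' 'has Matched Keyword a' 'Line 4' 'has Matched Keyword b' |
    insert_first_sed
```

After the first match the inner loop consumes every remaining line, so the insert command is never reached a second time.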
With a little help from bash, a true one liner can be achieved:
sed $'/Matched Keyword/{iNew Inserted Line\n:a;n;ba}' file
Alternative:
sed 'x;/./{x;b};x;/Matched Keyword/h;//iNew Inserted Line' file
This uses the Matched Keyword as a flag in the hold space and once it has been set any processing is curtailed by bailing out immediately.
If you want to append a line after first match only, use AWK instead of SED as below
awk '{print} /Matched Keyword/ && !n {print "New Inserted Line"; n++}' myfile.txt
Output:
Line 1
Line 2
Line 3
This line contains the Matched Keyword and other stuff
New Inserted Line
Line 4
This line contains the Matched Keyword and other stuff
Line 6

How can I swap two lines using sed?

Does anyone know how to replace line a with line b and line b with line a in a text file using the sed editor?
I can see how to replace a line in the pattern space with a line that is in the hold space (i.e., /^Paco/x or /^Paco/g), but what if I want to take the line starting with Paco and replace it with the line starting with Vinh, and also take the line starting with Vinh and replace it with the line starting with Paco?
Let's assume for starters that there is one line with Paco and one line with Vinh, and that the line Paco occurs before the line Vinh. Then we can move to the general case.
#!/bin/sed -f
/^Paco/ {
:notdone
N
s/^\(Paco[^\n]*\)\(\n\([^\n]*\n\)*\)\(Vinh[^\n]*\)$/\4\2\1/
t
bnotdone
}
After matching /^Paco/ we read into the pattern buffer until s// succeeds (or EOF: the pattern buffer will be printed unchanged). Then we start over searching for /^Paco/.
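A worked run of the script above, as a sketch. It assumes GNU sed (for \n inside a bracket expression); the sample input and the wrapper name swap_paco_vinh are illustrative:

```shell
swap_paco_vinh() {
    sed '/^Paco/{
:notdone
N
s/^\(Paco[^\n]*\)\(\n\([^\n]*\n\)*\)\(Vinh[^\n]*\)$/\4\2\1/
t
bnotdone
}' "$@"
}

printf '%s\n' 'Anna' 'Paco 1' 'Bob' 'Vinh 2' 'Zed' | swap_paco_vinh
```

Lines are accumulated from the Paco line onward until a Vinh line arrives; the substitution then swaps the first and last lines of the buffer and leaves everything between them in place.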
tr '\n' 'ç' < input | sed 's/\(ç__firstline__\)\(ç__secondline__\)/\2\1/g' | tr 'ç' '\n' > output
Replace __firstline__ and __secondline__ with your desired regexps. Be sure to substitute any instances of . in your regexp with [^ç]. If your text actually has ç in it, substitute something else that your text doesn't have. Note that GNU tr works on bytes, so in a UTF-8 locale a multibyte character such as ç will not translate cleanly; in that case pick a single-byte placeholder instead.
Try this awk script (the transcript below runs it as test.sh):
s1="$1"
s2="$2"
awk -vs1="$s1" -vs2="$s2" '
{ a[++d]=$0 }
$0~s1{ h=$0;ind=d}
$0~s2{
a[ind]=$0
for(i=1;i<d;i++ ){ print a[i]}
print h
delete a;d=0;
}
END{ for(i=1;i<=d;i++ ){ print a[i] } }' file
output
$ cat file
1
2
3
4
5
$ bash test.sh 2 3
1
3
2
4
5
$ bash test.sh 1 4
4
2
3
1
5
Use sed only for simple substitutions (or not at all). For anything more complicated, use a programming language.
A simple example from the GNU sed texinfo doc:
Note that on implementations other than GNU `sed' this script might
easily overflow internal buffers.
#!/usr/bin/sed -nf
# reverse all lines of input, i.e. first line became last, ...
# from the second line, the buffer (which contains all previous lines)
# is *appended* to current line, so, the order will be reversed
1! G
# on the last line we're done -- print everything
$ p
# store everything on the buffer again
h
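The same script condensed to a one-liner, as a sketch with each command annotated (the wrapper name reverse_lines is illustrative):

```shell
reverse_lines() {
    # 1!G : on every line but the first, append the hold space (all earlier
    #       lines, already reversed) after the current line
    # $p  : on the last line, print the accumulated result
    # h   : copy the pattern space back into the hold space
    sed -n '1!G;$p;h' "$@"
}

seq 1 3 | reverse_lines
```

Since the whole input ends up in the hold space, this is only practical for inputs that fit comfortably in memory.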
