Replacing multiple lines using the sed command - Linux

I have a text file, file.log, that contains the following text:
file.log
ab
cd
ef
I want to replace "ab\ncd" with "ab\n" and the final file.log should look like this:
ab
ef
This is the sed command I am using, but it doesn't recognize the newline character when matching the pattern:
sed -i 's/\(.*\)\r \(.*\)/\1\r/g' file.log
There are 3 spaces after '\r', but the command makes no change.
\(.*\) - Matches any character (.) repeated 0 or more times (*)
\r - For newline
\1 - Substitution for the first matching pattern. In this case, it's 'ab'
Can you help me figure out what's wrong with the above command?

The issue is that sed is a stream editor, which reads the input file line by line.
So when it reads the line
ab
from the input file, it doesn't know whether that line is followed by the line
cd
By the time it reads the line cd, sed has already removed the line ab from the pattern space, so the pattern can never match the current pattern space. Note also that \r is a carriage return; in sed a newline is written \n.
Solution
A solution is to read the entire file, appending each line to the hold space, and then run the substitution on the whole content:
$ sed -n '1h; 1!H;${g;s/ab\ncd/ab\n/g;p}' input
ab
ef
What it does
1h Copies the first line into the hold space.
1!H Appends every line except the first (1!) to the hold space.
$ Matches the last line and runs the commands in {..}.
g Copies the contents of the hold space back to the pattern space.
s/ab\ncd/ab\n/g Makes the substitution.
p Prints the entire pattern space.
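As a quick check of the hold-space approach, here is a minimal sketch that feeds the three sample lines through the same script. It makes one assumption that is not in the answer: the replacement is `ab` rather than `ab\n`, because the extra `\n` in the replacement would leave an empty line between `ab` and `ef`. The variable name `result` is only for illustration.

```shell
# Sample input from the question: three lines ab, cd, ef.
# 1h     : copy the first line into the hold space
# 1!H    : append every later line to the hold space
# ${...} : on the last line, fetch everything back and substitute once
result=$(printf 'ab\ncd\nef\n' | sed -n '1h;1!H;${g;s/ab\ncd/ab/;p;}')
printf '%s\n' "$result"
```

This prints ab and ef on two lines. Keep in mind that the whole file ends up in memory, so it is best suited to small files.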

sed processes the input file line by line, so the command in the question cannot match across lines. You need to include N, which appends the next line to the pattern space.
$ sed 'N;s~ab\ncd~ab\n~g' file
ab
ef
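One thing to watch with a bare N: it joins lines in fixed pairs, so the match is missed when ab and cd fall into different pairs. A rough sketch of the difference, using a made-up four-line input and the common `$!N; ...; P; D` sliding-window idiom (my addition, not part of the answer above):

```shell
input=$(printf '%s\n' x ab cd ef)

# Plain N joins lines in fixed pairs (x+ab, cd+ef), so "ab\ncd" is never
# inside one pattern space and nothing is replaced.
paired=$(printf '%s\n' "$input" | sed 'N;s/ab\ncd/ab/')

# Sliding window: append the next line, substitute, print the first line
# of the window (P), drop it (D), and repeat with the remainder.
sliding=$(printf '%s\n' "$input" | sed '$!N;s/ab\ncd/ab/;P;D')

printf 'paired:\n%s\nsliding:\n%s\n' "$paired" "$sliding"
```

The paired run leaves all four lines untouched, while the sliding-window run drops the cd line.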

A couple of other options:
perl -i -0pe 's/^ab\n\Kcd$//mg' file.log
which changes every occurrence of the pattern in the file.
If there's just one occurrence, good ol' ed:
ed file.log <<END_SCRIPT
/^ab$/+1 c
.
wq
END_SCRIPT

Related

Delete last line break using sed [duplicate]

How can I delete the last \n from a file? The file ends with a blank line created by the line break after the last line of text. I'm using this command:
sed '/^\s*$/d'
But that last blank line is not removed.
Why is sed printing a newline?
The POSIX standard for sed states:
Whenever the pattern space is written to standard output or a named file, sed shall immediately follow it with a <newline>.
A bit more details can be found in this answer.
Removing the last <newline>:
truncate: if you want to delete just one character from a file, you can do:
truncate -s -1 <file>
This makes the file one byte shorter, i.e. it removes the last byte.
From man truncate:
-s, --size=SIZE set or adjust the file size by SIZE bytes
SIZE may also be prefixed by one of the following modifying characters:
'+' extend by, '-' reduce by, '<' at most, '>' at least,
'/' round down to multiple of, '%' round up to multiple of.
Other answers can be found in How can I delete a newline if it is the last character in a file?
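Since truncate removes the last byte whatever it is, it may be worth checking that the byte really is a newline first. A minimal sketch, assuming GNU coreutils truncate; the helper name `strip_final_newline` and the temporary file are made up for the demo:

```shell
# Hypothetical helper: remove the final byte only when it is a newline.
strip_final_newline() {
    # $(...) strips a trailing newline, so an empty result means the last
    # byte is a newline (the -s test rules out an empty file).
    if [ -s "$1" ] && [ -z "$(tail -c 1 "$1")" ]; then
        truncate -s -1 "$1"
    fi
}

demo=$(mktemp)
printf 'a\nb\n' > "$demo"
strip_final_newline "$demo"
size_after_first=$(wc -c < "$demo")    # 4 bytes become 3
strip_final_newline "$demo"            # last byte is now "b": nothing happens
size_after_second=$(wc -c < "$demo")
echo "$size_after_first $size_after_second"
rm -f "$demo"
```

Calling it twice is harmless: the second call sees that the file no longer ends in a newline and leaves it alone.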
1) DELETE LAST EMPTY LINE FROM A FILE:
First of all, the command you are currently using will delete ALL empty and blank lines!
NOT ONLY THE LAST ONE.
If you want to delete the last line if it is empty/blank then you can use the following command:
sed '${/^[[:blank:]]*$/d}' test
INPUT:
cat -vTE test
a$
$
b$
$
c$
^I ^I $
OUTPUT:
sed '${/^[[:blank:]]*$/d}' test
a
b
c
Explanations:
The first $ tells sed to apply the commands in braces to the last line only.
/^[[:blank:]]*$/ is the condition: if the last line is empty or consists only of blank characters, sed deletes the pattern buffer, so that line is not printed.
You can redirect the output of the sed command to a new file, or make the change in place with the -i option (back up your file first!), or use -i.bak to have sed keep a backup of the original before modifying it.
IMPORTANT:
If your file comes from Windows and contains carriage returns (\r), this sed command will not work! You will need to remove those characters first, using either dos2unix or tr -d '\r'.
For files containing carriage returns <CR> (\r or ^M):
BEFORE FIXING THE FILE:
cat:
cat -vTE test
a$
$
b$
$
c$
^I ^I ^M$
od:
od -c test
0000000 a \n \n b \n \n c \n \t \t \r \n
0000016
sed:
sed '${/^[[:blank:]]*$/d}' test
a
b
c
AFTER FIXING THE FILE:
dos2unix test
dos2unix: converting file test to Unix format ...
cat:
cat -vTE test
a$
$
b$
$
c$
^I ^I $
od:
od -c test
0000000 a \n \n b \n \n c \n \t \t \n
0000015
sed:
sed '${/^[[:blank:]]*$/d}' test
a
b
c
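The before/after behaviour can be reproduced without a file. A small sketch, using made-up sample data whose last line holds a tab, a space, and a carriage return; the variable names are only for illustration:

```shell
# Without removing the carriage return, the last line is not "blank" to sed.
unfixed=$(printf 'a\n\nb\n\t \r\n' | sed '${/^[[:blank:]]*$/d;}')

# Stripping \r first lets the same sed command delete the last line.
cleaned=$(printf 'a\n\nb\n\t \r\n' | tr -d '\r' | sed '${/^[[:blank:]]*$/d;}')

lines_unfixed=$(printf '%s\n' "$unfixed" | wc -l)   # 4 lines remain
lines_cleaned=$(printf '%s\n' "$cleaned" | wc -l)   # 3 lines remain
echo "$lines_unfixed $lines_cleaned"
```

Piping through tr avoids modifying the original file, which may be preferable to running dos2unix in place.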
2) DELETE LAST EOL CHARACTER FROM A FILE:
For this particular purpose, I would recommend using perl:
perl -pe 'chomp if eof' test
a
b
c
You can add the -i option to make the change in place (take a backup of your file before running the command). Last but not least, you might have to remove carriage returns from your file as described above.
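If perl is not available, a rough shell-only sketch can rely on the fact that command substitution strips trailing newlines. Note that it differs from `chomp if eof`: it removes every trailing newline rather than just one, and it only suits text files small enough to hold in a shell variable. The temporary file names are made up for the demo.

```shell
src=$(mktemp)
dst=$(mktemp)
printf 'a\nb\nc\n' > "$src"

# $(cat ...) drops all trailing newlines; printf '%s' adds none back.
printf '%s' "$(cat "$src")" > "$dst"

bytes_before=$(wc -c < "$src")   # 6: three letters, three newlines
bytes_after=$(wc -c < "$dst")    # 5: the final newline is gone
echo "$bytes_before -> $bytes_after"
rm -f "$src" "$dst"
```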
Your question isn't clear but this might be what you're asking for:
$ cat file
a
b
c
$ awk 'NR>1{print p} {p=$0}' file
a
b
c
$
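You can also use the sed one-liner below to remove trailing blank line(s). It relies on N discarding the pattern space when there is no next line, which GNU sed does not do by default (it prints the pattern space before exiting):
sed -e :a -e '/^\n*$/N;/\n$/ba'
A variant that deletes explicitly on the last line also works with GNU sed:
sed -e :a -e '/^\n*$/{$d;N;ba' -e '}'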

SED - insert a blank line after every input line that consists of capital letters and spaces

I have a text file and I need a command using sed to insert a blank line after every line that that consists of capital letters and spaces only.
This might work for you (GNU sed):
sed '/^[[:blank:][:upper:]][[:blank:][:upper:]]*$/G' file
This uses G to append a newline followed by the contents of the hold space (empty by default) to each line that consists of one or more blank or uppercase characters.
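A quick way to see the effect, using four made-up lines (only the first and third consist solely of capitals and blanks):

```shell
# Lines made only of uppercase letters and blanks get a blank line after them.
spaced=$(printf '%s\n' 'LINE ONE' 'Line two' 'END' 'last' |
    sed '/^[[:blank:][:upper:]][[:blank:][:upper:]]*$/G')
printf '%s\n' "$spaced"
```

A blank line appears after LINE ONE and after END, while the mixed-case lines are printed unchanged.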
Given:
$ cat file
LINE LINE LINE
Line Line Line
Line 1
LINE 2
END!
====
You can use s/// to add a \n to the line.
With POSIX sed, use a literal newline in the sed script:
$ sed 's/^\([[:upper:][:blank:]]*\)$/\1\
/' file
LINE LINE LINE
Line Line Line
Line 1
LINE 2
END!
====
With GNU sed, you can use the representation of \n:
$ sed 's/^\([[:upper:][:blank:]]*\)$/\1\n/' file
You can also use a\ to append in sed. I could not get the append command to work reliably across POSIX, BSD, and GNU sed, since POSIX and BSD do not support \n.
With GNU sed (note space after a\):
$ sed '/^[[:upper:][:blank:]]*$/a\ ' file
BSD:
$ sed '/^[[:upper:][:blank:]]*$/a\
\
' file
Those are not exactly equivalent since the GNU version has a space on the blank line.
The version of POSIX sed I have did not work with either of those...
Given the platform and version differences of sed, you might consider awk for this, since simple awk programs are easier to make portable.
This works on every awk I have:
$ awk '1; /^[[:upper:][:blank:]]*$/{print ""}' file
With awk you can also avoid doubling blank lines by requiring at least one non-blank field (NF is nonzero only when the line has a field), like so:
$ awk '1; /^[[:upper:][:blank:]]+$/ && NF {print ""}' file
Sure. Just insert lines with a:
sed '/^[[:blank:]A-Z]*$/a\'
The a command appends the text that follows it after every matching line (end the text with a backslash). So the above command inserts an empty line after every line that consists solely of capital letters and spaces. That's exactly what you want.

Hidden line in file?

I have a UTF-8/no BOM file (converted from ISO-8859-1) that has 31214 lines. I have already run dos2unix on the file. When I open it in notepad++, I see a blank line underneath. When I remove this blank line, the line count reduces by one. I save it under a different name and when I tail the file, the prompt displays on the same line. From bash, how do I delete the blank line in the 1st file to produce the result displayed below in the 2nd file?
The goal is to do this from bash w/o manually deleting the line in notepad++
1st file:
[user#server]$ cat file1.txt | wc -l
31214
[user#server]$ tail file1.txt
T 31212 Data 20170517
[user#server]$
2nd file (edited with notepad++)
[user#server]$ cat file2.txt | wc -l
31213
[user#server]$ tail file2.txt
T 31212 Data 20170517[user#server]$
That's the trailing newline of the last line. Some editors allow you to go to the nonexisting "empty" line at the end, some don't show it. Again, some programs may allow you to remove the final newline, but note that e.g. POSIX in effect requires it to be there, and some standard utilities act oddly if it isn't present.
E.g. wc -l counts the number of newlines in the input file (printf "foo\nbar" | wc -l shows 1) so removing the final newline does decrease the line count.
Also, Bash prints the prompt wherever it was that the cursor was left on the screen, so if you print something that doesn't have the trailing newline, the prompt will be placed where the final incomplete line ended, as you saw.
There's no need to remove that final newline, just leave it there.
To remove the last character of the last line (note that this is the character before the final newline, not the newline itself), it is possible, as explained here, to use
sed -i '$ s/.$//' your.file
which substitutes nothing for the last character on the last line of the file (if you want to delete something else from the end of the file, replace the regex .$ with something-else$). -i means 'substitute in place' (on FreeBSD/macOS you need to add an empty string as an argument: sed -i "" '$ s/.$//' your.file)
The file2.txt is missing a trailing newline.
Yes, a text file should end on a newline character.
Given that you do know that a trailing newline is missing, this command should be enough to correct the problem:
$ echo >> file2.txt
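If you are not sure whether the newline is missing, a guarded version avoids adding a second one. A minimal sketch; the helper name `ensure_trailing_newline` and the temporary file are made up for the demo:

```shell
# Hypothetical helper: append a newline only if the file lacks one.
ensure_trailing_newline() {
    # A non-empty result from $(tail -c 1) means the last byte is not a newline.
    if [ -s "$1" ] && [ -n "$(tail -c 1 "$1")" ]; then
        echo >> "$1"
    fi
}

demo=$(mktemp)
printf 'foo\nbar' > "$demo"           # no trailing newline
lines_before=$(wc -l < "$demo")       # wc -l counts newlines: 1
ensure_trailing_newline "$demo"
ensure_trailing_newline "$demo"       # second call is a no-op
lines_after=$(wc -l < "$demo")        # 2
bytes_after=$(wc -c < "$demo")        # 8
echo "$lines_before $lines_after $bytes_after"
rm -f "$demo"
```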

Inserting string in file in nth line after pattern using sed

I want to insert a word on the nth line after a pattern using sed.
I tried to modify this command, but it inserts only on the first line after the pattern.
sed -i '/myPattern/a \ LineIWantToinser ' myFile
What command should I use to insert, for example, on the third line after the pattern?
The easiest way to do it with GNU sed (maybe a more direct solution exists) is
sed -n '/pattern/=' file
to see the line number where the pattern is (grep -n can also be used here).
Then, if line number + number of lines is, for example, 123:
sed '123aSOME INSERTED TEXT AFTER THAT LINE' file
where a is the append command (the text goes after that line; with i it goes before the line).
P.S. I'm eager to see if #neronlevelu (or another sed lover) will find a better sed solution.
Edit: I've found it. It seems that a (append) or i (insert) must be the first thing on its line when used inside { } with ; like this:
sed '/pattern/{N;N;N;
a SOME TEXT FOR INSERTING
}' file
sed '/pattern/{N;N;N;a\
Line to add after 3 lines with pattern as starting counter
}' YourFile
The number of N commands sets how many lines sit between the pattern and the inserted line.
There is no check for end of file, or for the pattern occurring again within those 3 lines (not specified by the OP).
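To see where the text lands, here is a small sketch of the N;N;N approach on made-up input (the pattern `pat` and the text `INSERTED` are placeholders). It uses the multi-line `a\` form, which should be accepted by both GNU and BSD sed:

```shell
# Sample input: "pat" is on line 2, so the text should land after line 5.
result=$(printf '%s\n' 1 pat 3 4 5 6 | sed '/pat/{
N
N
N
a\
INSERTED
}')
printf '%s\n' "$result"
```

The output is 1, pat, 3, 4, 5, INSERTED, 6. If fewer than three lines follow the pattern, nothing is inserted, which matches the caveat about end of file.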
A version with bash and ed:
ed -s myFile <<<$'/myPattern/+3a\n LineIWantToinser \n.\nwq'
ed enables us to use the line addressing /myPattern/+3.

How can I remove the last character of a file in unix?

Say I have some arbitrary multi-line text file:
sometext
moretext
lastline
How can I remove only the last character (the e, not the newline or null) of the file without making the text file invalid?
A simpler approach (outputs to stdout, doesn't update the input file):
sed '$ s/.$//' somefile
$ is a Sed address that matches the last input line only, thus causing the following function call (s/.$//) to be executed on the last line only.
s/.$// replaces the last character on the (in this case last) line with an empty string; i.e., effectively removes the last char. (before the newline) on the line.
. matches any character on the line, and following it with $ anchors the match to the end of the line; note how the use of $ in this regular expression is conceptually related, but technically distinct from the previous use of $ as a Sed address.
Example with stdin input (assumes Bash, Ksh, or Zsh):
$ sed '$ s/.$//' <<< $'line one\nline two'
line one
line tw
To update the input file too (do not use if the input file is a symlink):
sed -i '$ s/.$//' somefile
Note:
On macOS, you'd have to use -i '' instead of just -i; for an overview of the pitfalls associated with -i, see the bottom half of this answer.
If you need to process very large input files and/or performance / disk usage are a concern and you're using GNU utilities (Linux), see ImHere's helpful answer.
truncate
truncate -s-1 file
Removes one (-1) byte from the end of the file, in place, just as >> appends to the same file.
The problem with this approach is that it doesn't retain a trailing newline if one existed.
The solution is:
if [ -n "$(tail -c1 file)" ] # the file has no trailing newline
then
    truncate -s-1 file # remove one char, as the question requests
else
    truncate -s-2 file # remove the last two characters
    echo "" >> file # add the trailing newline back
fi
This works because tail takes the last byte (not char).
It takes almost no time even with big files.
Why not sed
The problem with a sed solution like sed '$ s/.$//' file is that it reads the whole file first (taking a long time with large files), then you need a temporary file (of the same size as the original):
sed '$ s/.$//' file > tempfile
rm file; mv tempfile file
And then move the tempfile to replace the file.
Here's another using ex, which I find not as cryptic as the sed solution:
printf '%s\n' '$' 's/.$//' wq | ex somefile
The $ goes to the last line, the s deletes the last character, and wq is the well known (to vi users) write+quit.
After a whole bunch of playing around with different strategies (and avoiding sed -i or perl), the best way I found to do this was with:
sed '$! { P; D; }; s/.$//' somefile
If the goal is to remove the last character in the last line, this awk should do:
awk '{a[NR]=$0} END {for (i=1;i<NR;i++) print a[i];sub(/.$/,"",a[NR]);print a[NR]}' file
sometext
moretext
lastlin
It stores all the data in an array, then prints it out, changing the last line.
Just a remark: sed -i does not edit the file itself; it writes a temporary file and renames it over the original.
So if you are tailing the file, you may get a "No such file or directory" warning until you reissue the tail command.
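One way to observe this is to compare the file's inode number before and after the edit. A small sketch, assuming GNU sed (for the bare -i) and a made-up temporary file:

```shell
demo=$(mktemp)
printf 'sometext\nlastline\n' > "$demo"

# ls -i prints the inode number followed by the file name.
set -- $(ls -i "$demo"); inode_before=$1
sed -i '$ s/.$//' "$demo"      # GNU sed syntax; BSD/macOS needs -i ''
set -- $(ls -i "$demo"); inode_after=$1

last_line=$(tail -n 1 "$demo")
echo "inode $inode_before -> $inode_after, last line: $last_line"
rm -f "$demo"
```

The two inode numbers should differ, which is why a program that holds the old file open keeps seeing the old content.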
EDITED ANSWER
I put your text in a test file on my Desktop, saved as "old_file.txt":
sometext
moretext
lastline
Afterwards I wrote a small script that takes the old file and removes the last character of the last line:
#!/bin/bash
old_file='/root/Desktop/old_file.txt'
new_file='/root/Desktop/my_new_file'
# Copy every line except the last one unchanged.
sed '$d' "$old_file" > "$new_file"
# Append the last line with its final character removed.
sed -n '$p' "$old_file" | sed 's/.$//' >> "$new_file"
Opening the new file showed the following output:
sometext
moretext
lastlin
I apologize for my previous answer (I wasn't reading carefully).
sed '$ s/.$//' filename | tee newFilename
This should do the job.
A couple of perl solutions, for comparison/reference:
(echo 1a; echo 2b) | perl -e '$_=join("",<>); s/.$//; print'
(echo 1a; echo 2b) | perl -e 'while(<>){ if(eof) {s/.$//}; print }'
I find the first approach, reading the whole file into memory, generally quite useful (less so for this particular problem). You can then use regexes that span multiple lines, for example to combine every 3 lines of a certain format into 1 summary line.
For this problem, truncate would be faster and the sed version is shorter to type. Note that truncate requires a file to operate on, not a stream. Normally I find sed to lack the power of perl and I much prefer the extended-regex / perl-regex syntax. But this problem has a nice sed solution.
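For what it's worth, the "combine every 3 lines" case mentioned above does not strictly need perl. A rough sketch with sed's N, assuming the input has a multiple of three lines (the sample letters are made up):

```shell
# N;N pulls two more lines into the pattern space; the substitution then
# turns the embedded newlines into spaces.
summary=$(printf '%s\n' a b c d e f | sed 'N;N;s/\n/ /g')
printf '%s\n' "$summary"
```

This prints "a b c" and "d e f". For anything beyond simple joins, the perl approach is probably easier to read.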

Resources