How to remove line breaks generated by sed - linux

I have a file called sms:
gsm versi jadul
29 sender: +62896666666
date: 15/02/14,03:55:12
reboot router
when I type in:
sed -n '6p' sms > /tmp/result
The /tmp/result always looks like this:
Notice the line break there. I want to get rid of the line break on the second line, so that the final result looks like this:
How do I do that?

You could trim it off with tr like this:
sed -n '6p' sms | tr -d '\n' > /tmp/result

You can use awk instead of sed:
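awk 'NR==6 {printf "%s", $0}' sms > result
NR==6 selects line number 6.
printf "%s", $0 prints that line without a trailing \n. Passing $0 as an argument rather than as the format string keeps any % in the line from being interpreted.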

There's nothing wrong with your sed command; your input file contains trailing control-Ms (carriage returns). Remove them with dos2unix or similar before running sed.

A correct implementation of the POSIX sed command does not add such a blank line. 6p should print the sixth line. I cannot reproduce the issue on, for example, Ubuntu 12 Linux. You have some line ending problem or some such issue.
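To check whether carriage returns are the cause, something like the following may help. It is only a sketch: the two-line sample file is made up for illustration and stands in for the sms file, and tr is used here in place of dos2unix in case the latter is not installed.

```shell
# Hypothetical sample with DOS (CRLF) line endings, standing in for the sms file
tmp=$(mktemp)
printf 'date: 15/02/14,03:55:12\r\nreboot router\r\n' > "$tmp"

# od -c makes the carriage return visible as \r in front of \n
sed -n '2p' "$tmp" | od -c

# Delete only the carriage return; the newline that ends the line is kept
clean=$(sed -n '2p' "$tmp" | tr -d '\r')
printf '%s\n' "$clean"

rm -f "$tmp"
```

If od shows no \r, the stray line break comes from somewhere else and the file should be inspected further.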

Related

How to extract email headers extending on multiple lines from file

I am trying to extract the To header from an email file using sed on linux.
The problem is that the To header could be on multiple lines.
e.g:
To: name1#mydomain.org, name2#mydomain.org,
name3#mydomain.org, name4#mydomain.org,
name5#mydomain.org
Message-ID: <46608700.369886.1549009227948#domain.org>
I tried the following:
sed -n -e '/^[Tt]o: / { N; p; }' _message_file_ |
awk '{$1=$1;printf("%s ",$0)};NR%2==0{print ""}'
The sed command extracts the line starting with To and next line.
I pipe the output to awk to put everything on a single line.
The full command outputs in one line:
To: name1#mydomain.org, name2#mydomain.org, name3#mydomain.org, name4#mydomain.org
I don't know how to keep going and test if the next line starts with whitespace and add it to the result.
What I want is all the addresses
To: name1#mydomain.org, name2#mydomain.org, name3#mydomain.org, name4#mydomain.org, name5#mydomain.org
Any help will be appreciated.
formail is a good solution but here's how to do it with sed:
sed -e '/^$/q;/^To:/!d;n;:c;/^\s/!d;n;bc' message_file
/^$/q; - (optional) quit if we run out of headers
/^To:/!d; - if not a To: header, stop processing this line
n; - otherwise, implicitly print it, and load next line
:c; - c is a label we can branch to
/^\s/!d; - if not a continuation, stop processing this line
n; - otherwise, implicitly print it, and load next line
bc - branch back to label c (ie. loop)
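If the goal is the single joined line from the question, the same "To: line plus continuation lines" logic can be sketched in awk. This is an illustration, not part of the sed answer: the sample message is made up, it assumes continuation lines start with a space or tab, and the variable names (inhdr, line) are arbitrary.

```shell
# Hypothetical message; continuation lines are assumed to start with whitespace
msg=$(mktemp)
cat > "$msg" <<'EOF'
To: name1#mydomain.org, name2#mydomain.org,
 name3#mydomain.org, name4#mydomain.org,
 name5#mydomain.org
Message-ID: <46608700.369886.1549009227948#domain.org>

body text
EOF

to_header=$(awk '
  /^$/              { exit }                             # blank line: end of headers
  /^[Tt]o:/         { inhdr = 1; line = $0; next }       # start collecting
  inhdr && /^[ \t]/ { sub(/^[ \t]+/, ""); line = line " " $0; next }  # folded line
  inhdr             { exit }                             # next header: stop
  END               { if (line != "") print line }
' "$msg")

printf '%s\n' "$to_header"
rm -f "$msg"
```

The exit statements still run the END block, which is what prints the collected header.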
Both formail and reformail have a -c option to do exactly that.
From man reformail:
-c Concatenate multi-line headers. Headers split on multiple lines
are combined into a single line.
So you don't need to pipe the output to awk, and can just do
reformail -c -X To: < $your_message_file
However, emails normally use CRLF line endings, and the output on screen may be garbled because of the CR characters. To remove them, you can use Perl's generic \R line ending in a regex on the output:
reformail -c -X To: < $your_message_file | perl -pe 's/\R/\n/g'
or do it on the input if you prefer:
perl -pe 's/\R/\n/g' $your_message_file | reformail -c -X To:
On Debian and derived systems like Ubuntu, you can install them with
apt install maildrop for reformail, which is part of Courier's maildrop
or apt install procmail for formail (but procmail seems to be abandoned now).
I did it like this:
cat _message_file | formail -X To: | awk '{$1=$1;printf("%s ",$0)};NR%2==0{print ""}'
Or:
formail -X To: < _message_file | awk '{$1=$1;printf("%s ",$0)};NR%2==0{print ""}'
This might work for you (GNU sed):
sed -n '/^To:/{:a;N;/^ /Ms/\s*\n\s*/ /;ta;P}' file
Turn off implicit printing by using the -n option. Gather up the lines starting with white space, removing white space either side of the newline and replace it by a single space, starting from the line that begins To:. When matching fails, print the first line in the pattern space.
To print addresses as is, use:
sed '/^\S/h;G;/^To:/MP;d' file
It could be as straightforward as this:
sed -n '/^To:/{
:a
p
n
/^[[:space:]]/ba
}'
Be silent, but starting from the To: header, print the text line by line while it is still part of the header.

Separate a text file with sed

I have the following sample file:
evtlog.161202.002609.debugevtlog.161201.162408.debugevtlog.161202.011046.debugevtlog.161202.002809.debugevtlog.161201.160035.debugevtlog.161201.155140.debugevtlog.161201.232156.debugevtlog.161201.145017.debugevtlog.161201.154816.debug
I want to separate the string and add a newline after matching "debug" like this:
evtlog.161202.002609.debug
evtlog.161201.162408.debug
So far I tried almost everything with sed, but it doesn't seem to do what I want.
sed 's/debug/{G}' latest_evtlogs.out
sed '/debug/i "SAD"' latest_evtlogs.out
etc...
sed 's/debug/\n/g' latest_evtlogs.out doesn't work when I add it as a pipe in the script, but it does when I run it manually.
Here's how I generate the file:
printf $(ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/\n/g') >> latest_evtlogs.out
Initially I wanted to just add newline with awk, but it doesn't work either.
Any ideas why I can't separate the string with a newline?
I'm using :
Distributor ID: Debian
Description: Debian GNU/Linux 5.0.10 (lenny)
Release: 5.0.10
Codename: lenny
Just add a new line after debug:
sed 's/debug/&\n/g' file
Note that & in the replacement stands for the matched text, so it puts "debug" back before the newline.
This returns:
evtlog.161202.002609.debug
evtlog.161201.162408.debug
evtlog.161202.011046.debug
evtlog.161202.002809.debug
evtlog.161201.160035.debug
evtlog.161201.155140.debug
evtlog.161201.232156.debug
evtlog.161201.145017.debug
evtlog.161201.154816.debug
The problem is that you are using the output of sed in an unquoted command substitution. In this context the shell performs word splitting on the output and treats newlines like spaces, so printf sees each line as a separate argument. It interprets the first one as the format argument and ignores the rest, as there are no printf placeholders in that format.
It should work if you drop the outer printf $() from your command and just redirect the output from your pipeline to your file:
ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/\n/g' >> latest_evtlogs.out
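The word splitting is easy to observe in isolation. The following is a small sketch with made-up file names; it uses set -- only to count how many arguments the shell produces.

```shell
# Unquoted command substitution: the output is split on whitespace,
# so each line becomes a separate argument
set -- $(printf 'a.debug\nb.debug\n')
unquoted_count=$#

# Quoted command substitution: one argument, inner newline preserved
quoted=$(printf 'a.debug\nb.debug\n')
set -- "$quoted"
quoted_count=$#

echo "unquoted: $unquoted_count argument(s), quoted: $quoted_count argument(s)"
```

With the outer printf $( ... ) from the question, only the first of those split arguments ends up as the format string.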
Maybe Perl is "happier" than sed on your system:
perl -pe 's/debug/&\n/g' < YourLogFile
G (get) appends the contents of the hold buffer to the pattern space (usually just the current line read from the input file), so it cannot be used here.
i (insert) prints the specified text to standard output before the current line, so it cannot be used either.
What you want is to replace every debug with debug^J, where ^J is a newline. Depending on the sed version, you can do:
sed 's/debug/&\n/g' input_file
But \n in the replacement is, as far as I know, not strictly specified in POSIX sed. One can however use ANSI-C quoting ($'...') in shells that support it, such as bash:
sed 's/debug/&'$'\n''/g' input_file
Or a multi line string:
sed 's/debug/&\
/g' input_file
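As a quick check of the backslash-newline form, here is a sketch run against a shortened copy of the sample string (two entries instead of nine, chosen only to keep the example small):

```shell
# Shortened sample: two log names glued together without a separator
joined='evtlog.161202.002609.debugevtlog.161201.162408.debug'

# A backslash followed by a literal newline in the replacement inserts a newline
split=$(printf '%s\n' "$joined" | sed 's/debug/&\
/g')

printf '%s\n' "$split"
```

This form does not depend on the shell's quoting extensions, so it should behave the same under sh and bash.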
Thank you all for the answers. I finally did it like this:
echo $(ls -l $EVTLOG_PATH/evtlog|tail -n 10|awk '{printf $8 , "%s\n\n"}'|sed 's/debug/&\n/g') > temp.out
sed 's/ /\n/g' /share/sqa/dumps/5314577631/checks/temp.out > latest_evtlogs.out
It's not at all elegant, but it finally works.
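A possibly simpler route is to keep the newlines in the first place. In the original pipeline, awk's printf is given $8 as the format string and "%s\n\n" as an argument, so no newline is ever printed; print $8 emits one name per line. The sketch below uses a made-up listing in place of the real ls -l output, and the field number 8 is taken from the question (it depends on the ls date format).

```shell
# Hypothetical `ls -l` style lines; the file name is field 8 as in the question
listing='-rw-r--r-- 1 user group 10 2016-12-02 00:26 evtlog.161202.002609.debug
-rw-r--r-- 1 user group 10 2016-12-01 16:24 evtlog.161201.162408.debug'

# print adds a newline after each name, so no later splitting is needed
names=$(printf '%s\n' "$listing" | awk '{ print $8 }')

printf '%s\n' "$names"
```

In the real script the awk output could be redirected straight to latest_evtlogs.out, without echo, printf $( ... ), or a second sed pass.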

remove \n and keep space in linux

I have a file that contains a hidden \n at the end of each line:
input:
s3741206\n
s2561284\n
s4411364\n
s2516482\n
s2071534\n
s2074633\n
s7856856\n
s11957134\n
s682333\n
s9378200\n
s1862626\n
I want to remove the trailing \n.
desired output:
s3741206
s2561284
s4411364
s2516482
s2071534
s2074633
s7856856
s11957134
s682333
s9378200
s1862626
However, I tried this:
tr -d '\n' < file1 > file2
but the result comes out as below, without any spaces or newlines:
s3741206s2561284s4411364s2516482s2071534s2074633s7856856s11957134s682333s9378200s1862626
I also tried sed $'s/\n//g' -i file1, and it doesn't work on macOS.
Thank you.
This is a possible solution using sed:
sed 's/\\n/ /g'
with awk
awk '{sub(/\\n/,"")} 1' < file1 > file2
What you are describing so far in your question+comments doesn't make sense. How can you have a multi-line file with a hidden newline character at the end of each line? What you show as your input file:
s3741206\n
s2561284\n
s4411364\n
etc.
where each "\n" above according to your comment is a single newline character "\n" is impossible. If those "\n"s were newline characters then your file would simply look like:
s3741206
s2561284
s4411364
etc.
There's really only 2 possibilities I can think of:
You are wrongly interpreting what you are seeing in your input file
and/or using the wrong terminology and you actually DO have \r\n
at the end of every line. Run cat -v file to see the \rs as
^Ms and run dos2unix or similar (e.g. sed 's/\r$//' file) to
remove the \rs - you do not want to remove the \ns or you will
no longer have a POSIX text file and so POSIX tools will exhibit
undefined behavior when run on it. If that doesn't work for you then
copy/paste the output of cat -v file into your question so we can
see for sure what is in your file.
Or:
It's also entirely possible that your file is a perfectly fine POSIX
text file as-is and you are incorrectly assuming you will have a
problem for some reason so also include in your question a
description of the actual problem you are having, include an example
of the command you are executing on that input file and the output
you are getting and the output you expected to get.
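To tell the cases apart, a byte dump is less ambiguous than looking at the file in an editor. The sketch below builds a made-up file whose lines really do end in a literal backslash and n, which is the case the sed and awk answers above assume.

```shell
# Hypothetical file: each line ends with the two characters \ and n,
# followed by a real newline (printf %s does not interpret the backslash)
f=$(mktemp)
printf '%s\n' 's3741206\n' 's2561284\n' > "$f"

# od -c shows a literal backslash as "\\", a real newline as "\n"
# and a carriage return as "\r"
od -c "$f" | head -n 2

# Remove only the literal backslash-n; the real newlines stay in place
cleaned=$(sed 's/\\n$//' "$f")
printf '%s\n' "$cleaned"

rm -f "$f"
```

If od shows \r \n pairs instead, the CRLF explanation applies and dos2unix is the better fix.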
You could use bash-native string substitution
$ cat /tmp/newline
s3741206\n
s2561284\n
s4411364\n
s2516482\n
s2071534\n
s2074633\n
s7856856\n
s11957134\n
s682333\n
s9378200\n
s1862626\n
$ for LINE in $(cat /tmp/newline); do echo "${LINE%\\n}"; done
s3741206
s2561284
s4411364
s2516482
s2071534
s2074633
s7856856
s11957134
s682333
s9378200
s1862626

Append text to file without line breaking

On a Linux machine, I have list of IPs as follows:
107.6.38.55
108.171.207.62
108.171.244.138
108.171.246.87
I want to use some function to add the word "or" at the end of each line without breaking each line, like this:
107.6.38.55 or
108.171.207.62 or
108.171.244.138 or
108.171.246.87 or
Every implementation I have experimented with in sed or awk has given me incorrect results as it keeps trying to line break or add input in strange spots. What is the easiest way to achieve this goal?
With awk '$0=$0" or"' and the sed suggestions I've tried thus far I get the following formatting:
107.6.38.55
or
108.171.207.62
or
108.171.244.138
or
108.171.246.87
or
Not sure what you have been trying but the following works for me on Ubuntu 12.04
awk '{print $0" or"}'
Or as fedorqui suggests
awk '$0=$0" or"'
Or as glenn jackman suggests
awk '{print $0, "or"}'
[EDIT]
It turns out the OP's file had CRLF line breaks so dos2unix had to be run first to address the format issue
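For reference, the effect and the fix can be reproduced like this. It is a sketch with a made-up two-line file, and tr stands in for dos2unix in case that tool is not available.

```shell
# Hypothetical IP list with CRLF line endings
ips=$(mktemp)
printf '107.6.38.55\r\n108.171.207.62\r\n' > "$ips"

# Without stripping, " or" lands after the carriage return on each line
broken=$(sed 's/$/ or/' "$ips")

# Strip the carriage returns first, then append " or" to every line
result=$(tr -d '\r' < "$ips" | sed 's/$/ or/')

printf '%s\n' "$result"
rm -f "$ips"
```

The broken variant still contains the carriage returns, which is what makes the terminal output look wrong.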
The following two worked for me:
sed 's/.*/& or/'
sed 's/$/ or/'
Or use ed, the standard text editor:
With bash you can use the lovely here-strings together with ANSI-C quotings
ed -s filename <<< $',s/.$/& or/\nwq'
or a pipe with printf
printf "%s\n" ',s/.$/& or/' 'wq' | ed -s filename
or if you like echo better
{ echo ',s/.$/& or/'; echo "wq"; } | ed -s filename
or interactively (if you love question marks):
$ ed filename
,s/.$/& or/
wq
Remark. I'm using the substitution s/.$/& or/ and not s/$/ or/ just so as not to append or in an empty line.
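The difference between the two substitutions shows up with a blank line in the input. A small sketch, using sed instead of ed for brevity and made-up addresses:

```shell
# Three lines of input, the middle one empty
with_dot=$(printf '1.2.3.4\n\n5.6.7.8\n' | sed 's/.$/& or/')
with_eol=$(printf '1.2.3.4\n\n5.6.7.8\n' | sed 's/$/ or/')

printf '%s\n---\n%s\n' "$with_dot" "$with_eol"
```

The .$ pattern needs at least one character on the line, so the empty line is left untouched; $ alone matches there too and produces a line containing only " or".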

Insert newline before first line

I am trying to insert a newline before the first line of text in a file. The only solution I have found so far is this:
sed -e '1 i
')
I do not like to have an actual newline in my shell script. Can this be solved any other way using the standard (GNU) UNIX utilities?
For variety:
echo | cat - file
Here's a pure sed solution with no specific shell requirements:
sed -e '1 s|^|\n|'
EDIT:
Please note that there has to be at least one line of input for this (and anything else using a line address) to work.
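The empty-input case can be checked as follows. This is a sketch: it substitutes a visible X instead of a newline so that the byte counts are easy to read, and it compares against the echo-plus-cat approach from the earlier answer.

```shell
# Zero-length input file
empty=$(mktemp)

# sed never sees a line 1, so the substitution never runs: 0 bytes out
sed_bytes=$(sed '1 s|^|X|' "$empty" | wc -c | tr -d ' ')

# echo emits its newline regardless of the input: 1 byte out
echo_bytes=$( { echo; cat "$empty"; } | wc -c | tr -d ' ')

echo "sed: $sed_bytes byte(s), echo+cat: $echo_bytes byte(s)"
rm -f "$empty"
```

Which behaviour is preferable depends on whether an empty file should stay empty.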
A $ before a single-quoted string will cause bash to interpret escape sequences within it.
sed -e '1 i'$'\n'
You could use awk:
$ awk 'FNR==1{print ""} 1' file
Which will work with any number of files.

Resources