Replace real newlines with \n in text file [duplicate] - linux

This question already has answers here:
How can I replace each newline (\n) with a space using sed?
(43 answers)
Closed 3 years ago.
On a Pi, in a text file like this
line1
line2
line3
...
how can I translate that to a file with just one line formatted like this
line1\nline2\nline3\n...
NB The real file is 50MB and 200000 lines long

You can use sed
sed ':a;N;$!ba;s/\n/\\n/g' my.txt >> new_my.txt
This reads the whole file into the pattern space in a loop, then replaces each newline with a literal \n and writes the result to a new file.
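A quick way to check this on a tiny sample (the file name my.txt is just a placeholder):
printf 'line1\nline2\nline3\n' > my.txt
sed ':a;N;$!ba;s/\n/\\n/g' my.txt
line1\nline2\nline3
Note that the file's final newline is added by sed on output rather than sitting in the pattern space, so no literal \n appears after the last line.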

With GNU sed you can:
sed -z -i -e 's/\n/\\n/g' file
to replace all newlines with the literal \n sequence. This can use a lot of memory, since -z makes sed read the whole file into memory.
With awk you can print each line with a literal \n appended:
awk '{printf "%s\\n", $0}'
You can use xargs to split the input on newlines and run printf:
cat file | xargs -d $'\n' printf '%s\\n'
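For example, with the sample lines from the question (assuming GNU xargs for the -d option):
printf 'line1\nline2\nline3\n' | xargs -d $'\n' printf '%s\\n'
line1\nline2\nline3\n
The result is a single line with no trailing newline.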

Delete empty lines from a text file via Bash including empty spaces characters [duplicate]

This question already has answers here:
Delete empty lines using sed
(17 answers)
Closed 6 years ago.
I tried to use the sed command to remove the empty lines:
sed -i '/^$/d' file.txt
My sample txt file looks like this (the second line contains only space characters). The sed command removes the truly empty lines but not the lines that contain only whitespace.
Sample text
Sample text
So is there a way to accomplish this via bash?
My intended output is:
Sample text
Sample text
Use the character class [:blank:] to match a space or tab:
With sed:
sed -i '/^[[:blank:]]*$/ d' file.txt
With perl:
perl -ne 'print if !/^[[:blank:]]*$/' file.txt
With awk:
awk '!/^[[:blank:]]*$/' file.txt
With grep:
grep -v '^[[:blank:]]*$' file.txt
If the tool does not support in-place editing, use a temporary file, e.g. for grep:
grep -v '^[[:blank:]]*$' file.txt >file.txt.tmp && mv file.txt{.tmp,}
sed -i '/^ *$/d' file.txt
or to also match other white space characters such as tabs, etc:
sed -i '/^[[:space:]]*$/d' file.txt
The * matches zero or more instances of the preceding character.
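A quick check on the sample from the question, where the second line contains only spaces (file.txt is a placeholder name):
printf 'Sample text\n   \nSample text\n' > file.txt
sed -i '/^[[:space:]]*$/d' file.txt
cat file.txt
Sample text
Sample text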

Combining SED commands [duplicate]

This question already has answers here:
Combining two sed commands
(2 answers)
Closed 1 year ago.
How can I combine two sed commands into one command?
I am currently running these two commands.
The first removes the first character of each line:
sed -i 's/\(.\{1\}\)//'
The second squeezes runs of spaces in each line:
sed -i 's/  */ /g'
There are 3.4 billion lines in the 237GB file it is parsing, and I don't want to have to run through it twice.
The sed command below combines the two. Use ; as a separator to join the sed operations:
sed -i 's/\(.\{1\}\)//;s/  */ /g' file
Another way:
sed -i -e 's/\(.\{1\}\)//' -e 's/  */ /g' file
You can try awk:
awk '{sub(/./,"");$1=$1}1' file
sub(/./,"") removes the first character.
$1=$1 forces awk to rebuild the line, collapsing runs of spaces into single spaces.
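As a quick illustration on a made-up line, where X stands in for the first character that gets stripped:
echo 'Xfoo   bar  baz' | sed 's/\(.\{1\}\)//;s/  */ /g'
foo bar baz
The awk version produces the same result on this sample.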

SED: Displaying the first 10 lines matching an expression

How can I use sed to find lines containing the word linux, and then display only the first 10 such lines?
EX.:
cat file | sed -e '/linux/!d' -e '10!d' ### does not work: this does not display the first 10 lines containing linux
cat file | sed '/linux/!d' | sed '10!d' ### this works
How can I make it work with a single sed?
cat file | sed -e '/linux/!d; ...?; 10!d'
...? - something that stores the matching lines in a buffer and then cuts off after 10 lines?
Can someone explain this to me?
I would use awk:
awk '/linux/ && c<10 {print;c++} c==10 {exit}' file
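A quick check with a short sample (borrowing the sample input from the comments below, and lowering the limit to 2 so the cutoff is visible):
echo -e 'windows\nlinux\nwindows\nlinux\nlinux\nwindows' | awk '/linux/ && c<2 {print;c++} c==2 {exit}'
linux
linux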
This might work for you (GNU sed):
sed -nr '/linux/{p;G;/(.*\n){10}/q;h}' file
Print the line if it contains the required string. If the required number of lines has already been printed, quit; otherwise store the current line together with the previously matched lines in the hold space.
You could use perl:
perl -ne 'if (/linux/) {print; ++$a;}; last if $a==10' inputfile
Using GNU sed:
sed -rn "/linux/{p;x;s/^/P/;ta;:a;s/^P{10}$//;x;Tb;Q;:b}" filename
Thanks, you are all great. All of the examples look very nice. Wow :) It is a pity that I cannot come up with these myself.
I had not seen the -r option in sed before. I need to learn it.
echo -e 'windows\nlinux\nwindows\nlinux\nlinux\nwindows' | sed -nr '/linux/{p;G;/(.*\n){2}/q;h}'
It works very well.
echo -e 'windows\nlinux\nwindows\nlinux\nlinux\nwindows' | sed -nr '/linux/{p;G;/(.*\n){2}/q;h}' | sed '2s/linux/debian/'
Can I ask for one more example? How can I get this result with a single sed?

Clean letters and characters in files leaving only numbers using bash

I am reading files and I am doing something like:
cat file | sed s/\ //g |awk '$0 !~ /[^0-9]/'
With this line I want to remove everything that is not a number.
But I have a problem: when the file is not sorted the command works fine, but with a sorted file it does not work and the output is empty.
Can anyone help me?
Using grep -o '[0-9]+' does not work for me, because:
I have a file like:
311435ll3e
kk13322;.
erre433
The output is:
311435
3
13322
433
The 3 from the first line ends up on its own output line; the output I need is:
3114353
13322
433
As a general rule, there is no reason to have both awk and sed appearing in the same pipe, due to a large overlap of capability, and frequently the same is true of awk/grep/sed combinations.
If you just want to suppress the non-digit characters within lines of characters, use (e.g.) sed -e 's/[^0-9]//g' file. To do it in place with no backup, use sed -i -e 's/[^0-9]//g' file, or in place with a backup to a .bak file, sed -i.bak -e 's/[^0-9]//g' file.
To suppress blank lines, you can append |egrep -v '^$' after the sed, but it's more efficient to just use sed's d command to delete the pattern space and start next cycle if the pattern space is empty. For example,
sed -e 's/[^0-9]//g; /^$/d' file
does a d if the line is empty after substitution.
The form suggested in 1_CR's comment,
sed -e 's/[^0-9]//g' -e '/./!d'
is an alternative. That form tests if the line has at least one character in it, and if so does not do a d.
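For instance, on the sample input from the question this produces exactly the desired output:
printf '311435ll3e\nkk13322;.\nerre433\n' | sed -e 's/[^0-9]//g; /^$/d'
3114353
13322
433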
If you want to suppress everything in the file that's not digits, use tr -cd 0-9 < file. This suppresses line feeds also.
Note, the form tr -cd [0-9] < file or tr -cd '[0-9]' < file is not correct; it will fail to suppress ] and [ characters because tr will regard them as part of SET1.
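A small illustration of that pitfall, using a made-up input that contains brackets:
printf 'a[1]b[2]\n' | tr -cd 0-9
12
printf 'a[1]b[2]\n' | tr -cd '[0-9]'
[1][2]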

Turning multiple lines into one comma separated line [duplicate]

This question already has answers here:
Concise and portable "join" on the Unix command-line
(10 answers)
Closed 8 years ago.
I have the following data in multiple lines:
foo
bar
qux
zuu
sdf
sdfasdf
What I want to do is to convert them to one comma separated line:
foo,bar,qux,zuu,sdf,sdfasdf
What's the best unix one-liner to do that?
Using paste command:
paste -d, -s file
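For example, with the sample from the question fed via stdin:
printf 'foo\nbar\nqux\nzuu\nsdf\nsdfasdf\n' | paste -d, -s -
foo,bar,qux,zuu,sdf,sdfasdf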
file
aaa
bbb
ccc
ddd
xargs
cat file | xargs
result
aaa bbb ccc ddd
xargs improved
cat file | xargs | sed -e 's/ /,/g'
result
aaa,bbb,ccc,ddd
There are many ways it can be achieved. The tool you use mostly depends on your own preference or experience.
Using tr command:
tr '\n' ',' < somefile
Using awk:
awk -F'\n' '{if(NR == 1) {printf $0} else {printf ","$0}}' somefile
xargs -a your_file | sed 's/ /,/g'
This is a shorter way.
Based on your input example, this awk line works (without a trailing comma):
awk -vRS="" -vOFS=',' '$1=$1' file
test:
kent$ echo "foo
bar
qux
zuu
sdf
sdfasdf"|awk -vRS="" -vOFS=',' '$1=$1'
foo,bar,qux,zuu,sdf,sdfasdf
Perl one-liner:
perl -pe'chomp, s/$/,/ unless eof' file
or, if you want to be more cryptic:
perl '-peeof||chomp&&s/$/,/' file
sed -n 's/.*/&,/;H;$x;$s/,\n/,/g;$s/\n\(.*\)/\1/;$s/\(.*\),/\1/;$p'
perl -pi.bak -e 'unless(eof){s/\n/,/g}' your_file
This will create a backup of the original file with a .bak extension and then modify the original file in place.
