Turning multiple lines into one comma separated line [duplicate] - linux

I have the following data in multiple lines:
foo
bar
qux
zuu
sdf
sdfasdf
What I want to do is to convert them to one comma separated line:
foo,bar,qux,zuu,sdf,sdfasdf
What's the best unix one-liner to do that?

Using paste command:
paste -d, -s file

file
aaa
bbb
ccc
ddd
xargs
cat file | xargs
result
aaa bbb ccc ddd
xargs improved
cat file | xargs | sed -e 's/ /,/g'
result
aaa,bbb,ccc,ddd

There are many ways it can be achieved. The tool you use mostly depends on your own preference or experience.
Using tr (note that this leaves a trailing comma and no final newline):
tr '\n' ',' < somefile
Using awk:
awk '{printf "%s%s", (NR == 1 ? "" : ","), $0} END {print ""}' somefile
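The difference between tr and paste shows up at the end of the output. Here is a minimal sketch, assuming a POSIX shell with the standard tr and paste utilities; the helper names join_tr and join_paste are made up for illustration:

```shell
# join_tr: hypothetical helper. tr turns EVERY newline into a comma,
# including the last one, so the result ends with a comma.
join_tr() {
    tr '\n' ','
}

# join_paste: hypothetical helper. paste -s joins all input lines with
# the -d delimiter and puts no delimiter after the last line.
join_paste() {
    paste -s -d, -
}

printf 'foo\nbar\nqux\n' | join_tr; echo    # echo supplies the missing newline
printf 'foo\nbar\nqux\n' | join_paste
```

If the trailing comma matters, paste is probably the simpler choice, since nothing has to be trimmed afterwards.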

xargs -a your_file | sed 's/ /,/g'
This is a shorter way.

Based on your input example, this awk line works (without a trailing comma):
awk -vRS="" -vOFS=',' '$1=$1' file
test:
kent$ echo "foo
bar
qux
zuu
sdf
sdfasdf"|awk -vRS="" -vOFS=',' '$1=$1'
foo,bar,qux,zuu,sdf,sdfasdf

Perl one-liner:
perl -pe'chomp, s/$/,/ unless eof' file
or, if you want to be more cryptic:
perl '-peeof||chomp&&s/$/,/' file

sed -n 's/.*/&,/;H;$x;$s/,\n/,/g;$s/\n\(.*\)/\1/;$s/\(.*\),/\1/;$p'

perl -pi.bak -e 'unless(eof){s/\n/,/g}' your_file
This creates a backup of the original file with a .bak extension and then modifies the original file in place.

Related

SED: Displaying the first 10 lines of sophisticated expression

How can I use sed to find lines containing the word linux and then display only the first 10 of them?
Example:
cat file | sed -e '/linux/!d' -e '10!d' ### does not give the result I want
cat file | sed '/linux/!d' | sed '10!d' ### works
How can I make it work with a single sed?
cat file | sed -e '/linux/!d; ...?; 10!d'
...? - do the linux lines have to be stored in a buffer and cut down to 10 afterwards?
Can someone explain this to me?
I would use awk:
awk '/linux/ && c<10 {print;c++} c==10 {exit}' file
This might work for you (GNU sed):
sed -nr '/linux/{p;G;/(.*\n){10}/q;h}' file
Print the line if it contains the required string. If the required number of lines has already been printed quit, otherwise store the line and previous lines in the hold space.
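If sed is not a hard requirement, the same "stop after N matches" behaviour can be sketched with grep's -m option. This is a different technique from the hold-space approach above. It assumes a grep that supports -m (GNU and BSD grep do; it is not required by POSIX), and the helper name first_matches is hypothetical:

```shell
# first_matches PATTERN COUNT [FILE...]: hypothetical helper.
# Prints at most COUNT lines matching PATTERN, then stops reading.
# Assumes grep supports -m (GNU/BSD extension).
first_matches() {
    pattern=$1
    count=$2
    shift 2
    grep -m "$count" -- "$pattern" "$@"
}

# Only the first two of the three linux lines are printed.
printf 'windows\nlinux 1\nwindows\nlinux 2\nlinux 3\n' | first_matches linux 2
```

Note that with several files, -m limits the matches per file, not in total.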
You could use perl:
perl -ne 'if (/linux/) {print; ++$a;}; last if $a==10' inputfile
Using GNU sed:
sed -rn "/linux/{p;x;s/^/P/;ta;:a;s/^P{10}$//;x;Tb;Q;:b}" filename
Thanks. You are great. All of the examples look very nice. Wooow :) It is a pity that I cannot do that.
I had not seen the 'r' option in sed before. I need to learn it.
echo -e 'windows\nlinux\nwindows\nlinux\nlinux\nwindows' | sed -nr '/linux/{p;G;/(.*\n){2}/q;h}'
It works very well.
echo -e 'windows\nlinux\nwindows\nlinux\nlinux\nwindows' | sed -nr '/linux/{p;G;/(.*\n){2}/q;h}' | sed '2s/linux/debian/'
May I ask for one more example? How can I get this result with a single sed?

How to extract patterns from a text file in bash

I have a text file that contains:
toto.titi.any=val1
toto.tata.any=val2
toto.tete.any=val2
How can I extract titi, tata and tete from this file?
It should be something like this:
$ cat myfile.txt | sed '......'
and the output should be:
titi
tata
tete
Do you really need sed? You could use cut:
cut -d. -f2 filename
With awk you can do:
awk -F. '{print $2}' file
awk or cut would be a better choice for this problem.
Here are a sed line and a grep option (grep -P requires GNU grep):
sed -r 's/.*\.([^.]*)\..*/\1/'
grep -Po '\.\K[^.]*(?=\.)'
awk and cut are best for this.
You can also read the file line by line and print the portion you need. Setting IFS only for the read command avoids changing it for the rest of the shell session:
$ while IFS=. read -r first interested others ; do echo "$interested"; done < file
titi
tata
tete
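sed -n 's/^[^.]*\.\([^.]*\)\..*/\1/p' myfile.txt
This displays the second dot-separated value from each line that contains at least two dots. The dots are escaped so that they match literal dots rather than any character.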
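For comparison, the same extraction can be sketched without any external tool, using POSIX parameter expansion. The helper name second_field is made up for illustration, and the sketch assumes every input line contains at least two dots, as in the question's sample:

```shell
# second_field: hypothetical helper. Reads lines on stdin and prints the
# text between the first and the second dot of each line.
second_field() {
    while IFS= read -r line; do
        rest=${line#*.}               # drop everything up to and including the first dot
        printf '%s\n' "${rest%%.*}"   # keep only what precedes the next dot
    done
}

printf 'toto.titi.any=val1\ntoto.tata.any=val2\ntoto.tete.any=val2\n' | second_field
```

A line without any dot would be printed unchanged by this sketch, so filter such lines first if they can occur.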

Linux cut string

In Linux (CentOS) I have a file that contains additional information that I want to remove. I want to generate a new file with all characters up to the first |.
The file has the following information:
ALFA12345|7890
Beta0-XPTO-2|30452|90 385|29
ZETA2334423 435; 2|2|90dd5|dddd29|dqe3
The output expected will be:
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
That is, all characters from the first | onward (inclusive) are removed.
Any suggestion for a script that reads File1 and generates File2 with this specific requirement?
Try
cut -d'|' -f1 oldfile > newfile
And, to round out the "big 3", here's the awk version:
awk -F\| '{print $1}' in.dat
You can use a simple sed script.
sed 's/^\([^|]*\).*/\1/g' in.dat
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
Redirect to a file to capture the output.
sed 's/^\([^|]*\).*/\1/g' in.dat > out.dat
And with grep:
$ grep -o '^[^|]*' file1
ALFA12345
Beta0-XPTO-2
ZETA2334423 435; 2
$ grep -o '^[^|]*' file1 > file2
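The commands above agree on the sample data, but they can differ on edge cases. The sketch below uses made-up input (the function name pipe_sample and its lines are assumptions, not from the question) to compare cut and grep -o on a line that starts with | and on a line with no | at all. GNU grep documents -o as printing only non-empty matches, so with GNU grep the line that starts with | is expected to disappear from the output, while cut prints an empty line for it:

```shell
# pipe_sample: hypothetical test data with two edge cases.
pipe_sample() {
    printf 'ALFA12345|7890\n|leading-pipe\nno-pipe-here\n'
}

echo 'cut:'
pipe_sample | cut -d'|' -f1          # keeps one output line per input line

echo 'grep -o:'
pipe_sample | grep -o '^[^|]*'       # may skip lines whose match is empty
```

If the output file must have exactly as many lines as the input, cut, awk, or sed is probably the safer choice.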

Delete empty lines using sed

I am trying to delete empty lines using sed:
sed '/^$/d'
but I have no luck with it.
For example, I have these lines:
xxxxxx

yyyyyy

zzzzzz
and I want it to be like:
xxxxxx
yyyyyy
zzzzzz
What should be the code for this?
You may have spaces or tabs in your "empty" line. Use POSIX classes with sed to remove all lines containing only whitespace:
sed '/^[[:space:]]*$/d'
A shorter version that uses the \s shorthand (a GNU extension, not part of POSIX ERE), for example with GNU sed:
sed -r '/^\s*$/d'
(Note that sed does NOT support PCRE.)
I am missing the awk solution:
awk 'NF' file
Which would return:
xxxxxx
yyyyyy
zzzzzz
How does this work? Since NF stands for "number of fields", those lines being empty have 0 fields, so that awk evaluates 0 to False and no line is printed; however, if there is at least one field, the evaluation is True and makes awk perform its default action: print the current line.
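To see the field counts that drive this, here is a small sketch that prints NF for each line instead of filtering. The helper name show_nf and the sample input are made up for illustration:

```shell
# show_nf: hypothetical helper. For each input line, prints the number
# of fields awk sees and whether the bare pattern NF would keep the line.
show_nf() {
    awk '{ printf "NF=%d kept=%s\n", NF, (NF ? "yes" : "no") }'
}

# Input lines: text, empty line, whitespace-only line (space, tab, space), two words.
printf 'xxxxxx\n\n \t \nyy zz\n' | show_nf
```

The whitespace-only line also reports NF=0, which is why awk 'NF' removes such lines too, unlike sed '/^$/d'.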
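sed:
sed '/^[[:space:]]*$/d' file
sed '/^\s*$/d' file
sed '/^$/d' file
sed -n '/^\s*$/!p' file
grep:
grep . file
grep -v '^$' file
grep -v '^\s*$' file
grep -v '^[[:space:]]*$' file
awk:
awk '/./' file
awk 'NF' file
awk 'length' file
awk '/^[ \t]*$/ {next;} {print}' file
awk '!/^[ \t]*$/' file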
sed '/^$/d' should be fine. Are you expecting to modify the file in place? If so, you should use the -i flag.
Maybe those lines are not empty. If that's the case, look at the question "Remove empty lines from txtfiles, remove spaces from start and end of line"; I believe that's what you're trying to achieve.
I believe this is the easiest and fastest one:
cat file.txt | grep .
If you need to ignore all white-space lines as well then try this:
cat file.txt | grep '\S'
Example:
s=$(printf '\na\nb\n\nBelow is TAB:\n\t\nBelow is space:\n \nc\n'); echo "$s" | grep . | wc -l; echo "$s" | grep '\S' | wc -l
outputs
7
5
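Another option without sed, awk, perl, etc.:
strings "$file" > "$output"
strings prints the sequences of printable characters in files. Note that by default it only prints sequences of at least 4 characters, so short non-empty lines are dropped as well.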
With help from the accepted answer here and the accepted answer above, I have used:
$ sed 's/^ *//; s/ *$//; /^$/d; /^\s*$/d' file.txt > output.txt
`s/^ *//` => left trim
`s/ *$//` => right trim
`/^$/d` => remove empty line
`/^\s*$/d` => delete lines which may contain white space
This covers all the bases and works perfectly for my needs. Kudos to the original posters @Kent and @kev.
The command you are trying is correct; just use the -E flag with it.
sed -E '/^$/d'
The -E flag makes sed use extended regular expressions.
You can say:
sed -n '/ / p' filename #there is a space between '//'
You are most likely seeing the unexpected behavior because your text file was created on Windows, so the end of line sequence is \r\n. You can use dos2unix to convert it to a UNIX style text file before running sed or use
sed -r "/^\r?$/d"
to remove blank lines whether or not the carriage return is there.
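The effect of Windows line endings can be reproduced with a short sketch. The helper names drop_empty, drop_blank, and crlf_sample are made up for illustration. It also shows that the POSIX class [[:space:]] includes the carriage return, so that pattern copes with CRLF files without mentioning \r explicitly:

```shell
# crlf_sample: hypothetical test data with Windows (CRLF) line endings
# and one "empty" line that actually contains a carriage return.
crlf_sample() {
    printf 'xxxxxx\r\n\r\nyyyyyy\r\n'
}

# drop_empty: the pattern from the question; matches only truly empty lines.
drop_empty() {
    sed '/^$/d'
}

# drop_blank: also matches lines consisting only of whitespace,
# and [[:space:]] includes the carriage return.
drop_blank() {
    sed '/^[[:space:]]*$/d'
}

crlf_sample | drop_empty | wc -l    # the CR-only line survives
crlf_sample | drop_blank | wc -l    # the CR-only line is removed
```

The remaining lines still end in \r\n after drop_blank; run dos2unix first if the output should use UNIX line endings.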
This works in awk as well.
awk '!/^$/' file
xxxxxx
yyyyyy
zzzzzz
You can do something like that using "grep", too:
egrep -v "^$" file.txt
My bash-specific answer is to recommend using perl substitution operator with the global pattern g flag for this, as follows:
$ perl -pe s'/^\n|^[\ ]*\n//g' $file
xxxxxx
yyyyyy
zzzzzz
This answer illustrates accounting for whether or not the empty lines have spaces in them ([\ ]*), as well as using | to separate multiple search terms/fields. Tested on macOS High Sierra and CentOS 6/7.
FYI, the OP's original code sed '/^$/d' $file works just fine in bash Terminal on macOS High Sierra and CentOS 6/7 Linux at a high-performance supercomputing cluster.
If you want to use modern Rust tools, you can consider:
ripgrep:
cat datafile | rg '.'    # a line containing only spaces is considered non-empty
cat datafile | rg '\S'    # a line containing only spaces is considered empty
rg '\S' datafile    # a line containing only spaces is considered empty (-N can be added to remove line numbers for on-screen display)
sd:
cat datafile | sd '^\n' ''    # a line containing only spaces is considered non-empty
cat datafile | sd '^\s*\n' ''    # a line containing only spaces is considered empty
sd '^\s*\n' '' datafile    # edits the file in place
Using vim editor to remove empty lines
:%s/^$\n//g
On FreeBSD 10.1, this was the only sed solution that worked for me:
sed -e '/^[ ]*$/d' "testfile"
Inside the [] there are a space and a tab character.
test file contains:
fffffff next 1 tabline ffffffffffff
ffffffff next 1 Space line ffffffffffff
ffffffff empty 1 lines ffffffffffff
============ EOF =============
NF is the awk built-in variable that holds the number of fields; used as a pattern, it deletes empty lines from a file:
awk NF filename
and by using sed
sed -r "/^\r?$/d"

sed extract text between two patterns where second pattern may be either of one

I am trying to extract text between pattern1 (fixed) and pattern2 (this can be p2-1/p2-2).
Can you please tell me how to achieve this in a single command?
The block I want starts with start and ends with either end or close.
File1:
======
junktest
data
start
stackoverflow
sed
close
File2:
======
data2
start
stackoverflow
end
I can extract text from File1 with
sed -n "/start/,/close/p"
And from File2 with
sed -n "/start/,/end/p"
I need a single sed command that handles both cases,
something like:
sed -n "/start/, /close or end /p"
Both GNU sed and BSD sed:
sed -nE '/start/,/close|end/p' file
This awk looks better
awk '/start/,/end|close/' file
sed -n -E "/Word1/,/Word2-1/p" | sed -n -E "/Word1/,/Word2-2/p"
Easy with awk:
$ awk '/start/{p=1}p{print}/end|close/{p=0}' file
