How to merge two CSV files with Linux column wise? [duplicate] - linux

This question already has answers here:
combining columns of 2 files using shell script
(1 answer)
Are shell scripts sensitive to encoding and line endings?
(14 answers)
Closed 1 year ago.
I am looking for a simple one-liner (if possible) to merge two files column-wise and save the result into a new file.
Edited in response to the first answer by @heitor:
Using paste file1.csv file2.csv, what happened is:
For instance, file1.csv:
A B
1 2
file2:
C D
3 4
By doing paste -d , file1.csv file2.csv > output.csv I got:
A B
C D
1 2
3 4
not:
A B C D
1 2 3 4
By doing cat file1.csv file2.csv I got:
A B
1 2
C D
3 4
Neither of them is what I want. Any idea?

Use paste -d , to merge the two files and > to redirect the command output to another file:
$ paste -d , file1.csv file2.csv > output.csv
E.g.:
$ cat file1.csv
A,B
$ cat file2.csv
C,D
$ paste -d , file1.csv file2.csv > output.csv
$ cat output.csv
A,B,C,D
-d , tells paste to use , as the delimiter to join the columns.
> tells the shell to write the output of the paste command to the file output.csv.
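One likely cause of the "stacked" output the asker reported (and the reason the linked question about line endings is relevant) is Windows CRLF line endings: a stray carriage return at the end of each line makes paste's output look broken. A minimal sketch, recreating the asker's files with CRLF endings as an assumption:

```shell
# Recreate the asker's files with Windows (CRLF) line endings -- a likely
# cause of the mangled paste output (the stray \r ends each visual line early):
printf 'A,B\r\n1,2\r\n' > file1.csv
printf 'C,D\r\n3,4\r\n' > file2.csv

# Strip the carriage returns before pasting:
tr -d '\r' < file1.csv > f1.csv
tr -d '\r' < file2.csv > f2.csv
paste -d , f1.csv f2.csv > output.csv
cat output.csv
# A,B,C,D
# 1,2,3,4
```

If the inputs already have plain LF endings, the tr step is a harmless no-op.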

Indeed, using paste is pretty simple:
$ cat file1.csv
A B
1 2
$ cat file2.csv
C D
3 4
$ paste -d " " file1.csv file2.csv
A B C D
1 2 3 4
With the -d option I replaced the default tab character with a space.
Edit:
If you want to redirect that to another file:
paste -d " " file1.csv file2.csv > file3.csv
$ cat file3.csv
A B C D
1 2 3 4

Related

unix join command to return all columns in one file

I have two files that I am joining on one column. After the join, I just want the output to be all of the columns, in the original order, from only one of the files. For example:
cat file1.tsv
1 a ant
2 b bat
3 c cat
8 d dog
9 e eel
cat file2.tsv
1 I
2 II
3 III
4 IV
5 V
join -1 1 -2 1 file1.tsv file2.tsv -t $'\t' -o 1.1,1.2,1.3
1 a ant
2 b bat
3 c cat
I know I can use the -o 1.1,1.2,... notation, but my file has over two dozen columns. Is there some wildcard that I can use to say -o 1.* or something?
I'm not aware of wildcards in the format string.
From your desired output I think that what you want may be achievable like so without having to specify all the enumerations:
grep -f <(awk '{print $1}' file2.tsv ) file1.tsv
1 a ant
2 b bat
3 c cat
Or as an awk-only solution:
awk '{if(NR==FNR){a[$1]++}else{if($1 in a){print}}}' file2.tsv file1.tsv
1 a ant
2 b bat
3 c cat
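Another option, if you want to stay with join itself: generate the -o field list instead of typing it. This is a sketch, not part of the original answers; it assumes GNU-style seq -f, and 3 stands in for your actual column count:

```shell
# Sample data matching the question (tab-separated):
printf '1\ta\tant\n2\tb\tbat\n3\tc\tcat\n8\td\tdog\n9\te\teel\n' > file1.tsv
printf '1\tI\n2\tII\n3\tIII\n4\tIV\n5\tV\n' > file2.tsv

# Build "1.1,1.2,1.3" programmatically rather than by hand:
fields=$(seq -f '1.%g' 3 | paste -sd, -)
join -t $'\t' -o "$fields" file1.tsv file2.tsv
# prints only file1's columns for the matching rows
```

For two dozen columns you would just replace 3 with 24; the seq | paste pipeline scales to any count.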

script to change format of text [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 7 years ago.
I want to change a text file that contains, for example:
a 1
b 2
a 3
b 4
to:
a b
1 2
3 4
Any idea how I can accomplish that? I am not familiar with awk, but I think it is the way to go.
I'm assuming the input is always two columns, the first column contains
the column headers of the output repeated over and over, and that the
output may contain one or more columns.
$ cat t.awk
# print fields via "%s" so data containing % is not treated as a format string
{ sep = (FNR % n == 0) ? "\n" : " " }
NR==FNR { printf "%s%s", $1, sep; if (sep == "\n") nextfile; next }
{ printf "%s%s", $2, sep }
The number of output columns is set with -v when invoking awk. Note also
that the input file needs to be provided twice on the command line as the
script does a quick initial pass to print out the output column headers
before starting over to print the data (too lazy to deal with arrays).
Two-column output:
$ cat file.txt
a 1
b 2
a 3
b 4
a 5
b 6
$ awk -v n=2 -f t.awk file.txt file.txt
a b
1 2
3 4
5 6
Three-column output:
$ cat file2.txt
a 1
b 2
c 3
a X
b Y
c Z
a #
b #
c $
$ awk -v n=3 -f t.awk file2.txt file2.txt
a b c
1 2 3
X Y Z
# # $
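The two-pass trick above can also be avoided. Here is a one-pass sketch (my own variant, not the answerer's code) that accumulates the headers and the values in two strings under the same assumptions about a regular n-row cycle:

```shell
# Sample input as in the question:
printf 'a 1\nb 2\na 3\nb 4\na 5\nb 6\n' > file.txt

awk -v n=2 '
    FNR <= n { hdr = hdr $1 ((FNR % n) ? " " : "\n") }  # headers come from the first n lines
    { row = row $2 ((FNR % n) ? " " : "\n") }           # values accumulate row by row
    END { printf "%s%s", hdr, row }
' file.txt
# a b
# 1 2
# 3 4
# 5 6
```

The file is read only once, at the cost of buffering the whole output in memory, which is fine for typical text files.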

Cat headers and renaming a column header using awk?

I've got an input file (input.txt) like this:
name value1 value2
A 3 1
B 7 4
C 2 9
E 5 2
And another file with a list of names (names.txt) like so:
B
C
Using grep -f, I can get all the lines with names "B" and "C"
grep -wFf names.txt input.txt
to get
B 7 4
C 2 9
However, I want to keep the header at the top of the output file, and also rename the column name "name" with "ID". And using grep, to keep the rows with names B and C, the output should be:
ID value1 value2
B 7 4
C 2 9
I'm thinking awk should be able to accomplish this, but being new to awk I'm not sure how to approach this. Help appreciated!
While it is certainly possible to do this in awk, the fastest way to solve your actual problem is to simply prepend the header you want in front of the grep output.
echo "ID value1 value2" > Output.txt && grep -wFf names.txt input.txt >> Output.txt
Update: Since the OP has multiple files, we can modify the above line to pull the header out of the input file instead.
head -n 1 input.txt | sed 's/name/ID/' > Output.txt && grep -wFf names.txt input.txt >> Output.txt
Here is how to do it with awk
awk 'FNR==NR {a[$1];next} FNR==1 {$1="ID";print} {for (i in a) if ($1==i) print}' names.txt input.txt
ID value1 value2
B 7 4
C 2 9
Store the names in an array a.
Then test whether field #1 matches an entry in array a.
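A slightly tighter sketch of the same idea (my variant, not the answerer's exact code): since awk arrays are hash tables, `$1 in a` is a direct lookup, so no inner loop over the array is needed.

```shell
# Sample data as in the question:
printf 'name value1 value2\nA 3 1\nB 7 4\nC 2 9\nE 5 2\n' > input.txt
printf 'B\nC\n' > names.txt

awk 'NR==FNR {a[$1]; next}          # first file: remember the wanted names
     FNR==1  {$1="ID"; print; next} # header line of second file: rename column 1
     $1 in a' names.txt input.txt
# ID value1 value2
# B 7 4
# C 2 9
```

Note that assigning to $1 makes awk rebuild the line with its default output separator, which is why the printed header is space-separated.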

combining the text files in to one text file

I have a requirement like the following.
I am using Linux.
I have a set of text files, like text1.txt, text2.txt, text3.txt.
Now I am combining them into one final text file.
text1.txt
1
NULL
NULL
4
text2.txt
1
2
NULL
4
text3.txt
a
b
c
d
I am using the following command:
paste -d ' ' text1.txt text2.txt text3.txt >> text4.txt
I am getting:
text4.txt
1 1 a
2 b
c
4 4 d
But I want the output like the following:
text4.txt
1 1 a
NULL 2 b
NULL NULL c
4 4 d
NOTE: NULL means a space.
I am passing text4 as input to another loop, where I read the variables by position.
Thanks in advance.
I expect that you want TABs separating your records in text4.txt... what about this?
NLINES=$(wc -l text1.txt | awk '{print $1}')
rm -f text4.txt
for i in $(seq 1 $NLINES); do
rec1=$(sed -n "$i p" text1.txt)
rec2=$(sed -n "$i p" text2.txt)
rec3=$(sed -n "$i p" text3.txt)
echo -e "$rec1\t$rec2\t$rec3" >> text4.txt
done
But actually paste, without "-d ' '" gave the same exact result!
you can achieve same with AWK command
awk '{a[FNR]=a[FNR]$0" "}END{for(i=1;i<=length(a);i++)print a[i]}' text1.txt text2.txt text3.txt >> text4.txt
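To demonstrate the point about plain paste: with its default tab delimiter, empty lines stay in place as empty fields instead of the columns collapsing leftward. A sketch with blank lines standing in for the asker's NULLs:

```shell
# Recreate the asker's inputs (blank lines where the listings show NULL):
printf '1\n\n\n4\n' > text1.txt
printf '1\n2\n\n4\n' > text2.txt
printf 'a\nb\nc\nd\n' > text3.txt

# No -d option: paste uses TAB and preserves empty fields positionally.
paste text1.txt text2.txt text3.txt > text4.txt
cat text4.txt   # line 2 is "<TAB>2<TAB>b", line 3 is "<TAB><TAB>c"
```

The problem with -d ' ' was never the delimiter itself but that a single space between two empty fields is indistinguishable from the separator, so positional parsing downstream breaks; tabs keep every field slot visible.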

how to sort a file according to another file?

Is there a unix oneliner or some other quick way on linux to sort a file according to a permutation set by the sorting of another file?
i.e.:
file1: (separated by CRLFs, not spaces)
2
3
7
4
file2:
a
b
c
d
sorted file1:
2
3
4
7
so the result of this one liner should be
sorted file2:
a
b
d
c
paste file1 file2 | sort | cut -f2
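One caveat worth adding to the one-liner above: a plain sort is lexicographic, so a key like 10 would sort before 2. If the keys in file1 are numeric, sorting numerically on the first field avoids that; a sketch with the question's data:

```shell
# The question's inputs:
printf '2\n3\n7\n4\n' > file1
printf 'a\nb\nc\nd\n' > file2

# Sort numerically on the pasted key column, then drop it:
paste file1 file2 | sort -n -k1,1 | cut -f2
# a
# b
# d
# c
```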
Below is a perl one-liner that will print the contents of file2 based on the sorted input of file1.
perl -n -e 'BEGIN{our($x,$t,@a)=(0,1)}if($t){$a[$.-1]=$_}else{$a[$.-1].=$_ unless($.>$x)};if(eof){$t=0;$x=$.;close ARGV};END{foreach(sort @a){($j,$l)=split(/\n/,$_,2);print qq($l)}}' file1 file2
Note: If the files are different lengths, the output will only print up to the shortest file length.
For example, if file-A has 5 lines and file-B has 8 lines then the output will only be 5 lines.
