I have the following file:
1 2 3
1 4 5
1 6 7
2 3 5
5 2 1
and I want to sort the file by the second column, from the largest number (in this case 6) down to the smallest. I've tried
sort +1 -2 file.dat
but it sorts in ascending order (rather than descending).
The results should be:
1 6 7
1 4 5
2 3 5
5 2 1
1 2 3
sort -nrk 2,2
does the trick.
-n for numeric sorting, -r for reverse order, and -k 2,2 to use only the second column as the sort key.
Have you tried -r? From the man page:
-r, --reverse
reverse the result of comparisons
As mentioned, most versions of sort have the -r option; if yours doesn't, try tac:
$ sort -nk 2,2 file.dat | tac
1 6 7
1 4 5
2 3 5
5 2 1
1 2 3
$ sort -nrk 2,2 file.dat
1 6 7
1 4 5
2 3 5
5 2 1
1 2 3
tac - concatenate and print files in reverse
I have a problem with sorting my file. My file look like this
geom-10-11.com 1
geom-1-10.com 9
geom-1-11.com 10
geom-1-2.com 1
geom-1-3.com 2
geom-1-4.com 3
geom-1-5.com 4
geom-1-6.com 5
geom-1-7.com 6
geom-1-8.com 7
geom-1-9.com 8
geom-2-10.com 8
geom-2-11.com 9
geom-2-3.com 1
geom-2-4.com 2
geom-2-5.com 3
geom-2-6.com 4
geom-2-7.com 5
geom-2-8.com 6
geom-2-9.com 7
geom-3-10.com 7
geom-3-11.com 8
geom-3-4.com 1
geom-3-5.com 2
geom-3-6.com 3
geom-3-7.com 4
geom-3-8.com 5
geom-3-9.com 6
geom-4-10.com 6
geom-4-11.com 7
geom-4-5.com 1
geom-4-6.com 2
geom-4-7.com 3
geom-4-8.com 4
geom-4-9.com 5
geom-5-10.com 5
geom-5-11.com 6
geom-5-6.com 1
geom-5-7.com 2
geom-5-8.com 3
geom-5-9.com 4
geom-6-10.com 4
geom-6-11.com 5
geom-6-7.com 1
geom-6-8.com 2
geom-6-9.com 3
geom-7-10.com 3
geom-7-11.com 4
geom-7-8.com 1
geom-7-9.com 2
geom-8-10.com 2
geom-8-11.com 3
geom-8-9.com 1
geom-9-10.com 1
geom-9-11.com 2
So I used sort -k1.6 -k2 -n and I got
geom-1-2.com 1
geom-1-3.com 2
geom-1-4.com 3
geom-1-5.com 4
geom-1-6.com 5
geom-1-7.com 6
geom-1-8.com 7
geom-1-9.com 8
geom-1-10.com 9
geom-1-11.com 10
geom-2-3.com 1
geom-2-4.com 2
geom-2-5.com 3
geom-2-6.com 4
geom-2-7.com 5
geom-2-8.com 6
geom-2-9.com 7
geom-2-10.com 8
geom-2-11.com 9
geom-3-4.com 1
geom-3-5.com 2
geom-3-6.com 3
geom-3-7.com 4
geom-3-8.com 5
geom-3-9.com 6
geom-3-10.com 7
geom-3-11.com 8
geom-4-5.com 1
geom-4-6.com 2
geom-4-7.com 3
geom-4-8.com 4
geom-4-9.com 5
geom-4-10.com 6
geom-4-11.com 7
geom-5-6.com 1
geom-5-7.com 2
geom-5-8.com 3
geom-5-9.com 4
geom-5-10.com 5
geom-5-11.com 6
geom-6-7.com 1
geom-6-8.com 2
geom-6-9.com 3
geom-6-10.com 4
geom-6-11.com 5
geom-7-8.com 1
geom-7-9.com 2
geom-7-10.com 3
geom-7-11.com 4
geom-8-9.com 1
geom-8-10.com 2
geom-8-11.com 3
geom-9-10.com 1
geom-9-11.com 2
geom-10-11.com 1
But when I tried uniq -f1 or sort -k1.6 -k2 -n -u, I got the same long sorted output. So I used
sort -k1.6 -k2 -n -c
and got a message saying that the file is out of order
(sort: glist2:2: disorder: geom-1-2.com 1).
I tried using just sort -k2 -n -u but got
geom-10-11.com 1
geom-1-3.com 2
geom-1-4.com 3
geom-1-5.com 4
geom-1-6.com 5
geom-1-7.com 6
geom-1-8.com 7
geom-1-9.com 8
geom-1-10.com 9
geom-1-11.com 10
That is not what I need, I need to have
geom-1-2.com 1
geom-1-3.com 2
geom-1-4.com 3
geom-1-5.com 4
geom-1-6.com 5
geom-1-7.com 6
geom-1-8.com 7
geom-1-9.com 8
geom-1-10.com 9
geom-1-11.com 10
So I need geom-1-X at the beginning, not geom-10-X. It would be great to use just uniq, because I have much bigger files with more geometries (thousands of lines) but the same structure. Thank you for your answers.
You can use this:
grep -E '^geom-1-' file | sort -k1.8n
grep filters the lines you want; sort then sorts numerically on the first field, starting at its 8th character.
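If you want the whole file ordered this way (geom-1-* before geom-10-*) rather than just one group, one option is to treat - as the field delimiter and sort both embedded numbers numerically. A sketch, assuming the numbers always sit in the second and third --separated fields:

```shell
# A few sample lines from the question
printf 'geom-10-11.com 1\ngeom-1-10.com 9\ngeom-1-2.com 1\ngeom-2-3.com 1\n' > file

# Fields split on "-": field 2 is the first number; field 3 starts with
# the second number (numeric parsing stops at ".com")
sort -t- -k2,2n -k3,3n file
```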
I have a file with one single column and 10 rows. Every row has the same number of characters (5). From this file I would like to get a file with 10 rows and 5 columns, where each column has one character only. I have no idea how to do that in Linux. Any help? Would AWK do this?
The real data has many more rows (>4K) and characters (>500K) though. Here is a short version of the real data:
31313
30442
11020
12324
00140
34223
34221
43124
12211
04312
Desired output:
3 1 3 1 3
3 0 4 4 2
1 1 0 2 0
1 2 3 2 4
0 0 1 4 0
3 4 2 2 3
3 4 2 2 1
4 3 1 2 4
1 2 2 1 1
0 4 3 1 2
Thanks!
I think that this does what you want:
$ awk -F '' '{ $1 = $1 }1' file
3 1 3 1 3
3 0 4 4 2
1 1 0 2 0
1 2 3 2 4
0 0 1 4 0
3 4 2 2 3
3 4 2 2 1
4 3 1 2 4
1 2 2 1 1
0 4 3 1 2
The input field separator is set to the empty string, so every character is treated as a field. $1 = $1 means that awk "touches" every record, causing it to be reformatted, inserting the output field separator (a space) between every character. 1 is the shortest "true" condition, causing awk to print every record.
Note that the behaviour of setting the field separator to an empty string isn't well-defined, so may not work on your version of awk. You may find that setting the field separator differently e.g. by using -v FS= works for you.
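A portable sketch that avoids the empty-FS behaviour entirely, walking the line with substr (standard POSIX awk only):

```shell
# Sample input (from the question)
printf '31313\n30442\n' > file

awk '{
  out = ""
  for (i = 1; i <= length($0); i++)              # one character at a time
    out = out (i > 1 ? " " : "") substr($0, i, 1)
  print out
}' file
```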
Alternatively, you can do more or less the same thing in Perl:
perl -F -lanE 'say "@F"' file
-a splits each input record into the special array @F. -F followed by nothing sets the input field separator to the empty string, so every character becomes its own element. The double quotes around @F mean that the list separator (a space by default) is inserted between the elements.
You can use this sed as well (s/./& /g appends a space after every character; s/ $// removes the trailing space):
sed 's/./& /g; s/ $//' file
3 1 3 1 3
3 0 4 4 2
1 1 0 2 0
1 2 3 2 4
0 0 1 4 0
3 4 2 2 3
3 4 2 2 1
4 3 1 2 4
1 2 2 1 1
0 4 3 1 2
Oddly enough, this isn't trivial to do with most standard Unix tools (update: except, apparently, with awk). I would use Python:
python -c 'import sys; map(sys.stdout.write, map(" ".join, sys.stdin))' < in.txt > new.txt
(This isn't the greatest idiomatic Python, and it relies on Python 2, where map is eager, but it suffices for a simple one-liner.)
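A Python 3 version needs an explicit loop (map is lazy there, so the writes would never happen); a sketch of the same idea:

```shell
# Sample input (from the question)
printf '31313\n30442\n' > in.txt

python3 -c '
import sys
for line in sys.stdin:
    print(" ".join(line.rstrip("\n")))
' < in.txt > new.txt

cat new.txt
```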
Another Unix toolchain for the same task:
$ while read line; do echo "$line" | fold -w1 | xargs; done < file
3 1 3 1 3
3 0 4 4 2
1 1 0 2 0
1 2 3 2 4
0 0 1 4 0
3 4 2 2 3
3 4 2 2 1
4 3 1 2 4
1 2 2 1 1
0 4 3 1 2
I have a datafile with 10 columns as given below
ifile.txt
2 4 4 2 1 2 2 4 2 1
3 3 1 5 3 3 4 5 3 3
4 3 3 2 2 1 2 3 4 2
5 3 1 3 1 2 4 5 6 8
I want to add an 11th column which will show the average of each row across the 10 columns, i.e. AVE(2 4 4 2 1 2 2 4 2 1) and so on. Though my script below works well, I would like to make it simpler and shorter. I appreciate, in advance, any help or suggestions.
awk '{for(i=1;i<=NF;i++){s+=$i;ss+=$i}m=s/NF;$(NF+1)=ss/NF;s=ss=0}1' ifile.txt
This should work
awk '{for(i=1;i<=NF;i++)x+=$i;$(NF+1)=x/NF;x=0}1' file
For each field you add the value to x in the loop.
Next you set field 11 to the sum in x divided by the number of fields, NF.
Reset x to zero for the next line.
1 equates to true and performs the default action in awk which is to print the line.
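As a quick check, running that awk on the sample ifile.txt appends each row's average as the 11th column:

```shell
# Sample input (from the question)
printf '2 4 4 2 1 2 2 4 2 1\n3 3 1 5 3 3 4 5 3 3\n4 3 3 2 2 1 2 3 4 2\n5 3 1 3 1 2 4 5 6 8\n' > ifile.txt

awk '{for(i=1;i<=NF;i++)x+=$i;$(NF+1)=x/NF;x=0}1' ifile.txt
# 2 4 4 2 1 2 2 4 2 1 2.4
# 3 3 1 5 3 3 4 5 3 3 3.3
# 4 3 3 2 2 1 2 3 4 2 2.6
# 5 3 1 3 1 2 4 5 6 8 3.8
```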
Is this helpful?
awk '{for(i=1;i<=NF;i++)s+=$i;print $0,s/NF;s=0}' ifile.txt
or
awk '{for(i=1;i<=NF;i++)ss+=$i;$(NF+1)=ss/NF;ss=0}1' ifile.txt
I have a datafile with 10 columns as given below
ifile.txt
2 4 4 2 1 2 2 4 2 1
3 3 1 5 3 3 4 5 3 3
4 3 3 2 2 1 2 3 4 2
5 3 1 3 1 2 4 5 6 8
I want to add an 11th column which will show the standard deviation of each row across the 10 columns, i.e. STDEV(2 4 4 2 1 2 2 4 2 1) and so on.
I am able to do this by taking the transpose, using the following command, and then transposing the result again:
awk '{x[NR]=$0; s+=$1} END{a=s/NR; for (i in x){ss += (x[i]-a)^2} sd = sqrt(ss/NR); print sd}'
Can anybody suggest a simpler way, so that I can do it directly along each row?
You can do the same with one pass as well.
awk '{for(i=1;i<=NF;i++){s+=$i;ss+=$i*$i}m=s/NF;$(NF+1)=sqrt(ss/NF-m*m);s=ss=0}1' ifile.txt
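If you want both the mean and the standard deviation, the same one-pass idea extends naturally; a sketch appending them as two extra columns (using the population standard deviation, as above):

```shell
# Two sample rows from the question
printf '2 4 4 2 1 2 2 4 2 1\n3 3 1 5 3 3 4 5 3 3\n' > ifile.txt

awk '{
  s = ss = 0
  for (i = 1; i <= NF; i++) { s += $i; ss += $i * $i }
  m = s / NF                             # row mean
  print $0, m, sqrt(ss / NF - m * m)     # population standard deviation
}' ifile.txt
```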
Do you mean something like this?
awk '{for(i=1;i<=NF;i++)s+=$i;M=s/NF;
for(i=1;i<=NF;i++)sd+=(($i-M)^2);$(NF+1)=sqrt(sd/NF);M=sd=s=0}1' file
2 4 4 2 1 2 2 4 2 1 1.11355
3 3 1 5 3 3 4 5 3 3 1.1
4 3 3 2 2 1 2 3 4 2 0.916515
5 3 1 3 1 2 4 5 6 8 2.13542
You just use the fields instead of transposing and using the rows.
I have the following file:
BTA Pos KLD
4 79.7011 5.7711028907
4 79.6231 5.7083918219
5 20.9112 4.5559494707
5 50.7354 4.2495580809
5 112.645 4.0936819092
6 72.8212 4.9384741047
6 18.3889 7.3631759258
I want to use AWK or bash commands to sort on the second column within each value of the first column, so the output is as follows:
4 79.6231 5.7083918219
4 79.7011 5.7711028907
5 20.9112 4.5559494707
5 50.7354 4.2495580809
5 112.645 4.0936819092
6 18.3889 7.3631759258
6 72.8212 4.9384741047
Sort numerically on column one, then on column two (the header line stays first because non-numeric fields compare as 0 under -n):
$ sort -nk1,1 -nk2,2 file
BTA Pos KLD
4 79.6231 5.7083918219
4 79.7011 5.7711028907
5 20.9112 4.5559494707
5 50.7354 4.2495580809
5 112.645 4.0936819092
6 18.3889 7.3631759258
6 72.8212 4.9384741047
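If you want the header dropped instead, matching the desired output exactly, a sketch that strips the first line before sorting:

```shell
# A few sample lines from the question, header included
printf 'BTA Pos KLD\n4 79.7011 5.7711028907\n4 79.6231 5.7083918219\n6 18.3889 7.3631759258\n' > file

tail -n +2 file | sort -nk1,1 -nk2,2
```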