How does Linux store negative numbers in text files?

I made a file using ed and named it numeric. Its contents are as follows:
-100
-10
0
99
11
-56
12
Then I executed this command on terminal:
sort numeric
And the result was:
0
-10
-100
11
12
-56
99
And of course this output was not at all expected!

sort has to be asked to sort numerically (otherwise it defaults to lexicographic sorting):
$ sort -n numbers.dat
-100
-56
-10
0
11
12
99
Watch out for the "-n" parameter (see manual)

Text files are text files, they contain text. Your numbers are sorted alphabetically. If you want sort to sort based on numerical value, use sort -n.
Also, your sort result is strange, when I run the same test I get:
$ sort numeric
-10
-100
-56
0
11
12
99
Sorted alphabetically, as expected.
See https://glot.io/snippets/e555jjumx6
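The difference between the two outputs above is probably a matter of locale: in many UTF-8 locales, GNU sort's collation rules largely ignore punctuation such as "-", while the C locale compares raw bytes. The following is a minimal sketch, assuming a POSIX shell with mktemp available; the temporary directory is only there for illustration. It forces the C locale so that the result does not depend on the machine's settings.

```shell
# Recreate the sample file (the name "numeric" comes from the question).
tmpdir=$(mktemp -d)
printf '%s\n' -100 -10 0 99 11 -56 12 > "$tmpdir/numeric"

# Byte-wise comparison: "-" (0x2D) sorts before every digit.
bytewise=$(LC_ALL=C sort "$tmpdir/numeric")

# Numeric comparison: lines are ordered by their value.
numeric=$(LC_ALL=C sort -n "$tmpdir/numeric")

printf 'byte-wise:\n%s\n\nnumeric:\n%s\n' "$bytewise" "$numeric"
rm -r "$tmpdir"
```

Setting LC_ALL=C for a single command is a common way to get reproducible sort output in scripts.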

Use sort -n to make sort do numerical sorting instead of alphabetical

That's because by default, sort is alphabetical, not numeric. sort -n does numbers.
Otherwise you'll get
1
10
2
3
etc.

Related

Problem getting desired output using Grep with just numbers as pattern

I am trying to grep the rows of file 2 that match the values in file 1, but the output contains extra lines.
File 1 looks like this:
$ head b.txt
5
11
26
27
File 2, a.txt, looks like
1509 5
1506 11
1507 12
339 26
1000 27
1000 100
Command I use:
grep -wFf b.txt a.txt
Results I want:
1509 5
1506 11
339 26
1000 27
It is giving me all I have in b.txt, but some extra lines too, e.g.,
1509 5
1506 11
1507 12
339 26
1000 27
1000 100
How can I fix this?
I simulated your problem and believe I know what's going on. With an empty line at the end of b.txt, I get the same output as you do. If I remove the empty line at the end of b.txt, I get your desired output.
➜ ~ grep -wFf b.txt a.txt
1509 5
1506 11
339 26
1000 27
From grep's manpage:
-f file, --file=file
Read one or more newline separated patterns from file. Empty pattern lines match every input line. Newlines are not considered part of a pattern. If file is empty, nothing is matched.
I believe "Empty pattern lines match every input line" is the cause of your erroneous output.
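If editing b.txt by hand is not convenient, the empty lines can be filtered out before the patterns reach grep. This is a sketch under the assumption that the stray line is truly empty (no spaces or carriage returns); the name b.clean.txt is made up for the example.

```shell
tmpdir=$(mktemp -d)
# b.txt with a trailing empty line, as suspected in the answer above.
printf '5\n11\n26\n27\n\n' > "$tmpdir/b.txt"
printf '%s\n' '1509 5' '1506 11' '1507 12' '339 26' '1000 27' '1000 100' > "$tmpdir/a.txt"

# Drop empty pattern lines first, then use the cleaned list with grep -f.
grep -v '^$' "$tmpdir/b.txt" > "$tmpdir/b.clean.txt"
matches=$(grep -wFf "$tmpdir/b.clean.txt" "$tmpdir/a.txt")

printf '%s\n' "$matches"
rm -r "$tmpdir"
```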
Maybe you want to join files.
join -12 -21 -o1.1,1.2 <(<a.txt sort -k2) <(<b.txt sort)
will output:
1506 11
339 26
1000 27
1509 5
The command joins the second field of a.txt with the first field of b.txt. To "join" means to find the lines where the specified fields have equal values in both files. join needs its inputs to be sorted on the join fields, so each file is piped through sort first. Sadly, this method does not preserve the original order of the lines, because they have to be reordered for join to work.
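If the original line order matters, a two-file awk pass is one possible alternative to join. This is only a sketch, not part of the answer above; the array name keep is arbitrary.

```shell
tmpdir=$(mktemp -d)
printf '5\n11\n26\n27\n' > "$tmpdir/b.txt"
printf '%s\n' '1509 5' '1506 11' '1507 12' '339 26' '1000 27' '1000 100' > "$tmpdir/a.txt"

# While reading the first file (NR == FNR), remember each value of b.txt
# as an array key. For the second file, print the lines of a.txt whose
# second field is one of those keys.
kept=$(awk 'NR == FNR { keep[$1]; next } $2 in keep' "$tmpdir/b.txt" "$tmpdir/a.txt")

printf '%s\n' "$kept"
rm -r "$tmpdir"
```

Because awk reads a.txt once from top to bottom, the matching lines come out in the order they appear in a.txt, and no sorting is needed.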

How to find pattern and make operation in another field in awk?

I have a file with 4 space-separated columns, as shown below:
1_86500000 50 1_87500000 19
1_87500000 13 1_89500000 42
1_89500000 25 1_90500000 10
1_90500000 3 1_91500000 11
1_91500000 23 1_92500000 29
1_92500000 34 1_93500000 4
1_93500000 39 1_94500000 49
1_94500000 35 1_95500000 26
2_35500000 1 2_31500000 81
2_31500000 12 2_4150000 50
The first and third columns are not in phase, so I cannot simply divide one value by the other.
Each key appears in $1, in $3, or in both, so a solution would be to look up each key, divide its value from one column pair by its value from the other, and use 0 where the key is missing, as this expected result shows:
P.S. The second field in this expected result is only illustrative, to show the division.
1_86500000 0/50 0
1_87500000 19/13 1.46154
1_89500000 42/25 1.68
1_90500000 10/3 3.333
1_91500000 11/23 0.47826
1_92500000 29/34 0.85294
1_93500000 4/39 0.10256
1_94500000 49/35 1.4
2_35500000 0/1 0
2_31500000 81/12 6.75
2_4150000 50/0 50
I have not achieved anything on my own beyond the attempts below, so I have no real starting point.
I tried separating the fields joined by _ to see whether I could match rows by subtracting the coordinates. A result of 0 would mean that the columns were in phase and correct, but I could not get any further.
awk '{if( ($5-$2)==0) print $1,$2,$3,$4,$5,$6}' file
I tried to match both columns but I only got phased results:
awk '{if(($1==$3)) print $1,$4/$2}' file
Can you help me?
awk to the rescue!
$ awk '{d[$1]=$2; n[$3]=$4}
END {for(k in n)
if(k in d) {print k,n[k]"/"d[k],n[k]/d[k]; delete d[k]}
else print k,n[k]"/0",n[k];
for(k in d) print k,"0/"d[k],0}' file | sort
1_86500000 0/50 0
1_87500000 19/13 1.46154
1_89500000 42/25 1.68
1_90500000 10/3 3.33333
1_91500000 11/23 0.478261
1_92500000 29/34 0.852941
1_93500000 4/39 0.102564
1_94500000 49/35 1.4
1_95500000 26/0 26
2_31500000 81/12 6.75
2_35500000 0/1 0
2_4150000 50/0 50
Your division-by-zero result is a little strange, though!
Explanation: keep two arrays, one for the numerators and one for the denominators. Once the file has been scanned, go over the numerator array, find the corresponding denominator, and do the division. For the denominators that were not used, apply the convention given in the question.

Sorting of data in descending order

Allow me to clarify my query:
I have a database with thousands of character strings, each followed by a value (based on a scoring matrix):
GKCHGYEGRGFQGRHYEGRSDGPNGQL 25
WGCGGYESRGFQGRHYEGGGDCPNGQG 56
GLCCGYEGRGFQCRHYEGGGDGPNDQL 43
GKGCGYEGRGFQGRHYEHGIDKDHFFR 24
PYGSGGNRARRSGCSWMLYEQVNYSGD 4
DFTEDLRCLQDVFAFNEIVSLNVLERL 3
REDYRRQSIYELSNYRCRQYLTDPSDY 18
There are equal values also present. I am trying to sort the data in descending order using:
sort -n -r file.txt
But the data is still out of order. I also tried adding the -k argument.
Is it possible to get the following result:
GKCHGYEGRGFQGRHYEGRSDGPNGQL 56
WGCGGYESRGFQGRHYEGGGDCPNGQG 56
GLCCGYEGRGFQCRHYEGGGDGPNDQL 56
GKGCGYEGRGFQGRHYEHGIDKDHFFR 43
PYGSGGNRARRSGCSWMLYEQVNYSGD 25
DFTEDLRCLQDVFAFNEIVSLNVLERL 25
REDYRRQSIYELSNYRCRQYLTDPSDY 24
and so on.
I am new to Linux. Any help will be appreciated.
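sort -k 2 -nr file.txt
This sorts numerically on the second field, in reverse order, and prints the result. Plain sort -n -r compares from the start of each line, where there is no number, so every line gets the same numeric value.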
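A slightly stricter variant limits the key to the second field only (-k2,2) and attaches the n and r modifiers to that key. The sketch below uses the sample data from the question; the -s option is an assumption that GNU or BSD sort is in use, since POSIX does not define it.

```shell
tmpdir=$(mktemp -d)
cat > "$tmpdir/file.txt" <<'EOF'
GKCHGYEGRGFQGRHYEGRSDGPNGQL 25
WGCGGYESRGFQGRHYEGGGDCPNGQG 56
GLCCGYEGRGFQCRHYEGGGDGPNDQL 43
GKGCGYEGRGFQGRHYEHGIDKDHFFR 24
PYGSGGNRARRSGCSWMLYEQVNYSGD 4
DFTEDLRCLQDVFAFNEIVSLNVLERL 3
REDYRRQSIYELSNYRCRQYLTDPSDY 18
EOF

# -k2,2 limits the key to the second field; n = numeric, r = descending.
# -s keeps lines with equal scores in their original relative order.
sorted=$(LC_ALL=C sort -s -k2,2nr "$tmpdir/file.txt")

printf '%s\n' "$sorted"
rm -r "$tmpdir"
```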

Sorting based on first column and highest number in third column

I have a text file that needs to be sorted, my goal is to only keep the longest sequences in each of my modules. My text file looks like this:
1 abc 35
1 def 90
1 ghi 100
2 jui 500
3 yui 500
3 iop 300
My goal is to sort unique modules (first column) by keeping highest number from column 3, just like this:
1 ghi 100
2 jui 500
3 yui 500
So far I checked the sort options but without success, I guess awk could also do it!
I tried:
sort -u -k1,1 Black.txt | sort -k3n,3
Any help would be much appreciated!
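Sort by the first column, with the third column in descending numeric order within each module, then keep only the first line for each first-column value (GNU sort -u outputs the first line of each run of lines that compare equal):
sort -k1,1 -k3,3nr Black.txt | sort -u -k1,1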
output
1 ghi 100
2 jui 500
3 yui 500
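Since the question mentions awk, here is a sketch of a single-pass alternative that does not rely on which duplicate sort -u keeps. The array names max and line are arbitrary, and on a tie the first line seen for a module wins, which is an assumption about the desired behaviour.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' '1 abc 35' '1 def 90' '1 ghi 100' '2 jui 500' '3 yui 500' '3 iop 300' > "$tmpdir/Black.txt"

# For each module (column 1), remember the line with the largest column 3.
# "$3 + 0" forces a numeric comparison. The final sort only restores a
# predictable order, because awk's "for (m in line)" order is unspecified.
best=$(awk '
    !($1 in max) || $3 + 0 > max[$1] { max[$1] = $3 + 0; line[$1] = $0 }
    END { for (m in line) print line[m] }
' "$tmpdir/Black.txt" | LC_ALL=C sort -k1,1n)

printf '%s\n' "$best"
rm -r "$tmpdir"
```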

Sort range Linux

Hello everyone. I have some questions about sorting in bash. I am working with Ubuntu 14.04.
The first question is: why if I have file some.txt with this content:
b 8
b 9
a 8
a 9
And when I type this :
sort -n -k 2 some.txt
the result will be:
a 8
b 8
a 9
b 9
which means that the file is sorted first by the second field and then by the first field, but I thought that it would stay stable, i.e.
b 8
a 8
...
...
Maybe a lexicographical sort is applied when two rows compare equal, or something else?
The second question is: why doesn't the following work:
sort -n -k 1,2 try.txt
The file try.txt is like this:
8 2
8 11
8 0
8 5
9 2
9 0
The third question is not actually about sorting, but it comes up when I try to do this:
sort blank.txt > blank.txt
After this the blank.txt file is empty. Why is that ?
Apparently GNU sort is not stable by default; add the -s option:
Finally, as a last resort when all keys compare equal, sort compares entire lines as if no ordering options other than --reverse (-r) were specified. The --stable (-s) option disables this last-resort comparison so that lines in which all fields compare equal are left in their original relative order.
(https://www.gnu.org/software/coreutils/manual/html_node/sort-invocation.html)
There's no way to answer your question if you don't show the text file
Redirections are handled by the shell before it hands off control to the program. The > redirection truncates the file if it exists, so sort is then given an empty file.
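The first and third points can be tried out with the sample data from the question. This is a sketch assuming GNU sort (the -s option is not in POSIX); for sorting a file onto itself it uses the -o option, which the GNU manual describes as reading all input before opening the output file.

```shell
tmpdir=$(mktemp -d)
printf '%s\n' 'b 8' 'b 9' 'a 8' 'a 9' > "$tmpdir/some.txt"

# Default: ties on the key are broken by comparing whole lines.
default_order=$(LC_ALL=C sort -n -k 2 "$tmpdir/some.txt")

# With -s, lines with equal keys keep their input order.
stable_order=$(LC_ALL=C sort -s -n -k 2 "$tmpdir/some.txt")

# Sorting a file in place: unlike "sort file > file", where the shell
# truncates the file before sort runs, -o may name the input file.
LC_ALL=C sort -o "$tmpdir/some.txt" "$tmpdir/some.txt"
in_place=$(cat "$tmpdir/some.txt")

printf 'default:\n%s\n\nstable:\n%s\n\nin place:\n%s\n' \
    "$default_order" "$stable_order" "$in_place"
rm -r "$tmpdir"
```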
For #2, you don't actually explain what's not working. Expanding your sample data:
$ cat try.txt
8 2
8 11
9 2
9 0
11 11
11 2
I assume you want to know why the 2nd column is not sorted numerically. Let's go back to the sort manual:
‘-n’
‘--numeric-sort’
‘--sort=numeric’
Sort numerically. The number begins each line and consists of ...
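With -k 1,2 the key runs from field 1 through field 2 as one string, and -n reads only the number at the start of that key, so only the first column is compared numerically. After some trial and error, I found this combination, which sorts each column numerically: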
$ sort -k1,1n -k2,2n try.txt
8 2
8 11
9 0
9 2
11 2
11 11
