Sort a tab-delimited file based on a column with the sort command in bash [duplicate] - linux

This question already has answers here:
Sorting a tab delimited file
(11 answers)
Closed 7 years ago.
I am trying to sort this file based on the fourth column, so that the lines are reordered by its values.
File:
2 1:103496792:A 0 103496792
3 1:103544434:A 0 103544434
4 1:103548497:A 0 103548497
1 1:10363487:T 0 10363487
I want it sorted like this:
1 1:10363487:T 0 10363487
2 1:103496792:A 0 103496792
3 1:103544434:A 0 103544434
4 1:103548497:A 0 103548497
I tried this command:
sort -t$'\t' -k1,1 -k2,2 -k3,3 -k 4,4 <filename>
But I get an "illegal variable name" error. Can somebody help me with this?

To sort on the fourth column, use just the -k 4,4 key selector.
sort -t $'\t' -k 4,4 <filename>
You might also want -V, which sorts numbers more naturally, yielding 1 2 10 rather than the lexicographic 1 10 2.
sort -t $'\t' -k 4,4 -V <filename>
If you're getting errors about the $'\t', make sure your shell is bash: $'\t' is a bash/zsh construct, and "illegal variable name" is typically a csh/tcsh complaint. Perhaps you're missing #!/bin/bash at the top of your script?
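For reference, here is what this looks like on the sample data, assuming the file really is tab-separated as described (file is a placeholder name). Note that -n, or -V, is what makes the eight-digit 10363487 sort before the nine-digit values:
sort -t $'\t' -k 4,4 -n file
1 1:10363487:T 0 10363487
2 1:103496792:A 0 103496792
3 1:103544434:A 0 103544434
4 1:103548497:A 0 103548497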

The $'\t' part is what your shell is rejecting; that syntax only works in bash/zsh, and "illegal variable name" suggests the command is being run by csh/tcsh instead.
A shell-agnostic alternative is to type a literal tab as the separator (press Ctrl+V and then Tab where TAB is shown below), and add -n so the column is compared numerically:
sort -t 'TAB' -n -k 4,4 <filename>

I'm trying to write a bash script and I don't get the result displayed in the terminal

I'm trying to display the 3rd to the 7th line of a file, but nothing is displayed in the terminal. I'm using this command:
head -n 7 /etc/passwd | tail -n +3
I want the result to be shown in the terminal.
You can try it this way:
head -n 7 /etc/passwd | tail -n 5
For example:
seq 20 | head -n 7 | tail -n 5
Output :
3
4
5
6
7
Explanation:
head -n 7 -- prints the first 7 lines (so lines 1..7 are passed on)
tail -n 5 -- prints the last 5 of those lines (so the first two are skipped and lines 3..7 are printed)
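More generally, to print lines N through M of a file you can wrap the same idea in a small helper. This is just a sketch; the name print_range is made up for illustration:
print_range() {   # usage: print_range START END FILE
    head -n "$2" "$3" | tail -n "+$1"
}
print_range 3 7 /etc/passwd   # prints lines 3 through 7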
head -n 7 /etc/passwd | tail -n +3
That seems to do exactly what you want when I run it. Have you verified the contents of /etc/passwd? If you do not have permission to read the file, or it is empty, you will get no output.
I would check other places in the script. Change the first line of your script to:
#!/bin/bash -v
to have it echo the commands so you can make sure what you think is being executed actually is.
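For example, with a hypothetical test.sh containing just:
#!/bin/bash -v
head -n 7 /etc/passwd | tail -n +3
running ./test.sh echoes each line of the script as bash reads it, followed by the pipeline's output, so you can see exactly what is being executed.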
What do you get then? Show the output, but choose another file ;)
It sometimes helps to just vary the commands, e.g. running head or tail twice in different combinations, to arrive at the same result.

Bash descending filename sorting [duplicate]

This question already has answers here:
Sort files numerically in bash
(3 answers)
Closed 8 years ago.
I've been trying to sort my filenames with commands similar to ls -1 | sort -n -t "_" -k1 but just can't get it to work. Please help.
I have:
10_filename
11_filename
12_filename
1_filename
2_filename
I want to get:
1_filename
2_filename
...
10_filename
11_filename
Please try the following; this will solve the issue:
ls -1v
The -v option sorts by the version numbers embedded in the names (a natural sort).
Try this:
ls -1 *_filename | sort -n
or
ls -1 | sort -n
ls -1 | sort -t '_' +1 +0n
Below is a heavier option, but it works if your sort does not accept key fields and only does plain string sorting:
ls -1 | sed 's/^\([0-9]*\)_\(.*\)/000\1_\1_\2/;s/^0*\([0-9]\{3\}\)/\1/;s/\([0-9]\{1,\}_[0-9]\{1,\}_\)\(.*\)/\2_\1/' | sort -n | sed 's/\(.*\)_[0-9]\{1,\}_\([0-9]\{1,\}\)_$/\2_\1/'
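If your sort rejects the obsolete +1 +0n key syntax used above, a modern -k spelling that does the same job is a single numeric key on the first underscore-separated field (shown here on the sample names):
$ ls -1 | sort -t '_' -k 1,1n
1_filename
2_filename
10_filename
11_filename
12_filename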

Insert character in a file with bash

Hello, I have a problem in bash.
I have a file and I am trying to insert a period at the end of each line:
cat file | sed s/"\n"/\\n./g > salida.csv
but it does not work =(.
I need this because I want to count the lines containing a given word (the lines with the same country), and if I just grep for colombia, it also matches colombias.
My other question is: how can I count the lines with the same country?
For example, given this input:
1 colombia
2 brazil
3 ecuador
4 colombias
5 colombia
I want to get these counts:
colombia 2
colombias 1
ecuador 1
brazil 1
How about:
cut -f2 -d' ' salida.csv | sort | uniq -c
Since a sed solution was already posted (sed is probably the best tool for this task), I'll contribute an awk one:
awk '$NF=$NF"."' file > salida.csv
Update:
$ cat input
1 colombia
2 brazil
3 ecuador
4 colombias
5 colombia
$ awk '{a[$2]++}END{for (i in a) print i, a[i]}' input
brazil 1
colombias 1
ecuador 1
colombia 2
...and, please stop updating your question with different questions...
Your command line has a few problems. Some that matter, some that are style choices, but here's my take:
Unnecessary cat. sed can take a filename as an argument.
Your sed command doesn't need the g. Since each line only has one end, there's no reason to tell it to look for more.
Don't look for the newline character, just match the end of line with $.
That leaves you with:
sed 's/$/./' file > salida.csv
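For instance, on two of the sample lines from the question:
$ printf '1 colombia\n2 brazil\n' | sed 's/$/./'
1 colombia.
2 brazil.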
Edit:
If your real question is "How do I grep for colombia, but not match colombias?", you just need to use the -w flag to match whole words:
grep -w colombia file
If you want to count them, just add -c:
grep -c -w colombia file
Read the grep(1) man page for more information.
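On the sample data (saved here as input, matching the awk example above), the difference looks like this:
$ grep colombia input
1 colombia
4 colombias
5 colombia
$ grep -w colombia input
1 colombia
5 colombia
$ grep -c -w colombia input
2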

Using sort in Linux, how can I make 12 come after 2?

I have a file containing numbers:
$ cat file
1
3
13
2
4
12
When I use cat file | sort, it displays like this:
$ cat file | sort
1
12
13
2
3
4
How can I get output like this instead:
1
2
3
4
12
13
Use the -n option to enable numerical sorting:
$ cat file | sort -n
This is faster and more portable than -g, which is a GNU extension.
Use the -g option of sort for general numeric sorting (it can be slow for large inputs):
$ sort -g file
or:
$ sort -n file
The difference can be found in a related question.
UPD: removed the useless cat, as pointed out in the comments.
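To make the -n / -g difference concrete (GNU coreutils behaviour): -n only parses a leading integer or decimal, while -g also understands scientific notation:
$ printf '1e3\n200\n' | sort -n
1e3
200
$ printf '1e3\n200\n' | sort -g
200
1e3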

How to catch duplicate entries in a text file in Linux [duplicate]

This question already has answers here:
How to delete duplicate lines in a file without sorting it in Unix
(9 answers)
Closed 4 years ago.
Text file:
1 1
2 2
3 3
1 1
I want to catch "1 1" as a duplicate.
Your question is not quite clear, but you can filter out duplicate lines with uniq:
sort file.txt | uniq
or simply
sort -u file.txt
(thanks RobEarl)
You can also print only repeating lines with
sort file.txt | uniq -d
One way using GNU awk:
awk 'array[$0]++' file.txt
Results:
1 1
You can do this easily with either of the following:
sort -u file.txt
OR
awk '!x[$0]++' file.txt
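If you also want to know how many times each duplicated line occurs, here is a small awk sketch (no sorting needed; it keeps the counts in memory):
$ awk '{count[$0]++} END {for (line in count) if (count[line] > 1) print count[line], line}' file.txt
2 1 1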
