Sort number values separated by a dot or any other separator character - Sort version values in RHEL5 - linux

Linux RHEL5 machine
How can I sort the following input so that I get 1.0.0.1019 in a latest variable? I tried -t, -k and -n, but that didn't help, or maybe I'm missing something.
$ echo '1.0.0
1.0.0.1018
1.0.0.1019
1.0.0.1019
1.0.0.7' | sort -u

Could you please try the following and let me know if it helps (tested with GNU sort):
echo "1.0.0
1.0.0.1018
1.0.0.1019
1.0.0.1019
1.0.0.7" | sort --version-sort --field-separator=. --key=4 -r
The above puts 1.0.0.1019 first (the latest one); if you want it in last place instead, remove -r from the command.

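sort -t. -k1,1n -k2,2n -k3,3n -k4,4n
Fields are separated by '.'
Each of the first four fields is a separate numeric key, compared in that order.
Note that sort -n -t. -k1,4 does not do this: -k1,4 defines a single key running from field 1 through field 4, and -n reads only the leading number of that key.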
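If your sort does not know --version-sort (older GNU coreutils releases predate it, and that may well include the one on RHEL5), a per-field numeric sort is one way to fill the latest variable from the question. This is only a sketch: it assumes at most four dot-separated numeric fields, and the variable name versions is made up for the example.

```shell
# Sample input from the question.
versions='1.0.0
1.0.0.1018
1.0.0.1019
1.0.0.1019
1.0.0.7'

# One numeric key per dot-separated field; -u drops lines whose keys are all equal.
# A missing fourth field (as in 1.0.0) compares as 0, so it sorts first.
latest=$(printf '%s\n' "$versions" | sort -u -t. -k1,1n -k2,2n -k3,3n -k4,4n | tail -n 1)
echo "$latest"
```

tail -n 1 picks the highest version because the sort is ascending; with -r you would use head -n 1 instead.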


How to create a script that takes a list of words as input and prints only the words that appear exactly once

Requirements for input and output files:
Input format: One line, one word
Output format: One line, one word
Words should be sorted
I tried to use this command to solve this question
sort list | uniq
but it fails.
Anyone who can help me to solve it?
Try below :
cat <file_name> | sort | uniq -c | grep -e '^\s*1\s' | awk '{print $NF}'
Explanation:
cat <file_name> | sort | uniq -c --> Sorts all the entries and prints each distinct name prefixed with its count.
grep -e '^\s*1\s' --> This regex keeps only the entries whose count is exactly 1; anchoring the 1 directly after the leading blanks stops counts such as 11 or 21 from matching.
awk is used to remove the count and print just the name.
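The count test can also be done in awk itself, which avoids the regex altogether. A small sketch with a made-up word list (the variable names words and once are just for illustration):

```shell
# Hypothetical sample list, one word per line.
words='pear
apple
fig
apple
pear
kiwi'

# uniq -c prefixes each distinct word with its count;
# awk keeps only the rows whose count field is exactly 1 and prints the word.
once=$(printf '%s\n' "$words" | sort | uniq -c | awk '$1 == 1 {print $2}')
echo "$once"
```

Here only fig and kiwi occur once, so those are the two lines printed, already in sorted order.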
It would be nice, simple and elegant to use this command to perform this task.
cat <file_name> | sort |uniq -u
And it would do the task perfectly.
The answer given by @Evans Fone helped me.
If you're trying to implement a script that runs as:
cat list | ./scriptname
Then do the following:
Step 1:
Type
emacs scriptname
Step 2:
Press
ENTER
Step 3:
Type
#!/bin/bash
sort |uniq -u
Step 4:
Press
CTRL+X
CTRL+S
CTRL+X
CTRL+C
sort | uniq -u
as simple as that.
Without a file argument, sort reads its input from standard input, sorts it and pipes it to uniq -u, which prints only the words that occur exactly once.
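The editor steps above can also be done without emacs. Note that the script has to be made executable (chmod +x) before it can be run by name. This sketch writes the script into a temporary directory, which is only a stand-in for wherever you keep your scripts, and feeds it a made-up word list:

```shell
# Create the script in a temporary directory (hypothetical location).
dir=$(mktemp -d)
cat > "$dir/scriptname" <<'EOF'
#!/bin/bash
# Read words on stdin, sort them, keep only lines that occur exactly once.
sort | uniq -u
EOF
chmod +x "$dir/scriptname"

# Equivalent of: cat list | ./scriptname
result=$(printf '%s\n' banana apple banana cherry | "$dir/scriptname")
echo "$result"
```

banana appears twice, so only apple and cherry are printed.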

Pipelining cut sort uniq

Trying to get a certain field from a sam file, sort it and then find the number of unique numbers in the file. I have been trying:
cut -f 2 practice.sam > field2.txt | sort -o field2.txt sortedfield2.txt |
uniq -c sortedfield2.txt
The cut is working to pull out the numbers from field two, however when trying to sort the numbers into a new file or the same file I am just getting a blank. I have tried breaking the pipeline into sections but still getting the same error. I am meant to use those three functions to achieve the output count.
Use
cut -f 2 practice.sam | sort | uniq -c
In your original code, you're redirecting the output of cut to field2.txt and at the same time, trying to pipe the output into sort. That won't work (unless you use tee). Either separate the commands as individual commands (e.g., use ;) or don't redirect the output to a file.
Ditto the second half, where you write the output to sortedfield2.txt and thus end up with nothing going to stdout, and nothing being piped into uniq.
So an alternative could be:
cut -f 2 practice.sam > field2.txt ; sort -o sortedfield2.txt field2.txt ; uniq -c sortedfield2.txt
which is the same as
cut -f 2 practice.sam > field2.txt
sort -o sortedfield2.txt field2.txt
uniq -c sortedfield2.txt
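If you want the intermediate file and a single pipeline, tee is the piece that makes that possible. A rough sketch, using a tiny made-up tab-separated file in place of the real practice.sam (the contents and the temporary directory are assumptions for the example):

```shell
# Hypothetical stand-in for practice.sam: tab-separated, numbers in field 2.
workdir=$(mktemp -d)
printf 'r1\t16\tchr1\nr2\t0\tchr1\nr3\t16\tchr2\nr4\t4\tchr2\n' > "$workdir/practice.sam"

# tee saves a copy of field 2 to a file while the same stream continues down the pipe.
# The final awk only reformats "count value" from uniq -c into "value:count".
counts=$(cut -f 2 "$workdir/practice.sam" | tee "$workdir/field2.txt" | sort -n | uniq -c | awk '{print $2 ":" $1}')
echo "$counts"
```

Each output line is a distinct value from field 2 followed by how often it occurred; piping the same stream to wc -l instead would give the number of distinct values.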
You can use this command:
cut -f 2 practice.sam | sort | uniq > sorted.txt
Your code fails with "No such file or directory" because of the way the pipe is used. This link explains how pipes work:
https://www.guru99.com/linux-pipe-grep.html

Recursively grep unique pattern in different files

Sorry title is not very clear.
So let's say I'm grepping recursively for urls like this:
grep -ERo '(http|https)://[^/"]+' /folder
and in folder there are several files containing the same URL. My goal is to output this URL only once. I tried piping the grep output to uniq or sort -u, but that doesn't help.
example result:
/www/tmpl/button.tpl.php:http://www.w3.org
/www/tmpl/header.tpl.php:http://www.w3.org
/www/tmpl/main.tpl.php:http://www.w3.org
/www/tmpl/master.tpl.php:http://www.w3.org
/www/tmpl/progress.tpl.php:http://www.w3.org
If you only want the address and never the file where it was found in, there is a grep option -h to suppress file output; the list can then be piped to sort -u to make sure every address appears only once:
$ grep -hERo 'https?://[^/"]+' folder/ | sort -u
http://www.w3.org
If you don't want the https?:// part, you can use Perl regular expressions (-P instead of -E) with variable length look-behind (\K):
$ grep -hPRo 'https?://\K[^/"]+' folder/ | sort -u
www.w3.org
If the structure of the output is always:
/some/path/to/file.php:http://www.someurl.org
you can use the command cut :
cut -d ':' -f 2- should work. Basically, it cuts each line into fields separated by a delimiter (here ":") and you select the 2nd and following fields (-f 2-)
After that, you can use sort -u to filter (uniq on its own only removes duplicates that are adjacent).
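Put together, the cut approach might look like the sketch below. The sample lines imitate the grep output from the question; https://example.org is a made-up extra URL, and LC_ALL=C is only there to make the sort order predictable.

```shell
# Lines shaped like the recursive grep output (file:URL).
grep_output='/www/tmpl/button.tpl.php:http://www.w3.org
/www/tmpl/header.tpl.php:http://www.w3.org
/www/tmpl/main.tpl.php:https://example.org'

# Drop the file name (field 1), keep everything after the first colon, then deduplicate.
urls=$(printf '%s\n' "$grep_output" | cut -d ':' -f 2- | LC_ALL=C sort -u)
echo "$urls"
```

-f 2- matters here: the URL itself contains a colon, so selecting only field 2 would cut it off after http.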
Pipe to Awk:
grep -ERo 'https?://[^/"]+' /folder |
awk -F: '!a[substr($0,length($1)+2)]++'
The basic Awk idiom !a[key]++ is true the first time we see key, and forever false after that. Extracting the URL (or a reasonable approximation) into the key requires a bit of additional trickery.
This prints the whole input line if the key is one we have not seen before, i.e. it will print the file name and the URL for the first occurrence of each URL from the grep output.
Doing the whole thing in Awk should not be too hard, either.
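To see the Awk idiom on its own, here is a small sketch run against lines shaped like the grep output in the question. The third URL, https://example.org, is invented for the example, and seen is just a more descriptive array name.

```shell
# Lines shaped like the recursive grep output (file:URL).
sample='/www/tmpl/button.tpl.php:http://www.w3.org
/www/tmpl/header.tpl.php:http://www.w3.org
/www/tmpl/main.tpl.php:https://example.org'

# Key = everything after the first colon, i.e. the URL.
# The line is printed only the first time its key is seen.
first_hits=$(printf '%s\n' "$sample" | awk -F: '!seen[substr($0, length($1) + 2)]++')
echo "$first_hits"
```

Unlike sort -u, this keeps the original order and still shows which file each URL was first found in.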

Bash sort and multi-character tab error

I have data in the following form
C1510438;;C0220832;;2
C0026030;;C0034693;;1
C1257960;;C0007452;;1
C0061461;;C0027922;;2
C0011744;;C0037494;;3
C0014180;;C0034493;;3
When I try to sort on the 3rd field, the command returns the error
sort -t ';;' -k 3 -r -n -o output.txt input.txt
sort: multi-character tab `;;'
I also try with
sort -t $';;' -k 3 -r -n -o output.txt input.txt
but the command returns same error.
Any idea what to do?
The -t option expects a single separator character, but you give it two. A way to do what you want would be to consider that the separator is only a single ;, and thus the third column would become the fifth one:
sort -t ';' -k 5 -r -n -o output.txt input.txt
Since the -t option expects a single separator character, a good way to handle this would be to use a replace tool to temporarily replace the separator(s) with a new one, do the sort, and then restore the original separator(s) as needed for further processing. I have a file that uses "," as separator which I can temporarily replace with a | (pipe), do my sort, and then restore "," as separator.
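The replace, sort, restore idea could look roughly like this for the ;; data in the question. It is a sketch that assumes the data never contains a tab character, which is used here as the temporary single-character separator:

```shell
# A few rows from the question.
data='C1510438;;C0220832;;2
C0026030;;C0034693;;1
C0011744;;C0037494;;3'

# Swap ';;' for a tab (assumed absent from the data), sort on field 3
# numerically in reverse, then swap the tab back to ';;'.
tab=$(printf '\t')
sorted=$(printf '%s\n' "$data" | sed "s/;;/$tab/g" | sort -t "$tab" -k3,3nr | sed "s/$tab/;;/g")
echo "$sorted"
```

The rows come out with the third field descending (3, 2, 1) and the original ;; separators intact.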

does linux sort have incompatible arguments

I wanted to sort a file in numerical order and remove duplicates at the same time with sort -nu [filename].
$ *** | sort -n | wc
201172
$ *** | sort -nu | wc
9599
$ *** | sort -un | wc
9599
$ *** | sort -n | sort -u | wc
201149
$ *** | sort -u | wc
201149
Why is there a decrease in the number of lines with sort -un? I tried running the above commands on a small numeric file to see if there is any problem, and it worked as expected.
Am I missing something obvious, or
are those options incompatible with each other? I've checked man sort for this, but no information was provided about this combination.
Thanks in advance.
EDIT
How should I fix this ? (using the n and u options separately ?)
-u removes duplicates.
So yeah, obviously it will reduce lines if the key is repeated within the file.
The difference with
sort -n | sort -u
then is that the second sort -u pipe command considers the full line, not just the numeric key.
So you need to understand what -u and -n mean.
man sort
-u Unique: suppresses all but one in each set
of lines having equal keys. If used with the
-c option, checks that there are no lines
with duplicate keys in addition to checking
that the input file is sorted.
-n Restricts the sort key to an initial numeric string,
consisting of optional blank characters, optional
minus sign, and zero or more digits with an optional
radix character and thousands separators (as defined
in the current locale), which is sorted by arithmetic
value. An empty digit string is treated as zero.
Leading zeros and signs on zeros do not affect ordering.
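A small experiment makes the difference visible. The input below is invented: two lines share the leading number 10 but differ in the text after it, which is the kind of data where sort -nu drops more than sort -u does.

```shell
# Hypothetical lines: same leading number, different text after it.
lines='10 apples
10 pears
2 figs
2 figs'

# -nu: uniqueness is judged on the numeric key only (10 and 2).
by_key=$(printf '%s\n' "$lines" | sort -nu | wc -l | tr -d ' ')
# sort -u without -n: uniqueness is judged on the whole line.
by_line=$(printf '%s\n' "$lines" | sort -n | sort -u | wc -l | tr -d ' ')
# Numeric order, but only exact duplicate lines removed.
numeric_unique=$(printf '%s\n' "$lines" | sort -n | uniq)

echo "sort -nu: $by_key lines; sort -n | sort -u: $by_line lines"
echo "$numeric_unique"
```

So if you want numeric order but only want to lose lines that are identical in full, sort -n | uniq is one option: sort -n breaks ties by comparing the whole line, so identical lines end up adjacent for uniq.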
