how to sort this in bash - linux

Hello, I have a file containing these lines:
apple
12
orange
4
rice
16
How can I use bash to sort it by the numbers?
Suppose each number is the price of the item on the line above it.
I want the output formatted like this:
12 apple
4 orange
16 rice
or
apple 12
orange 4
rice 16
Thanks

A solution using paste + sort to get each product sorted by its price:
$ paste - - < file|sort -k 2nr
rice 16
apple 12
orange 4
Explanation
From paste man:
Write lines consisting of the sequentially corresponding lines from
each FILE, separated by TABs, to standard output. With no FILE, or
when FILE is -, read standard input.
paste reads the stream coming from stdin (your < file) and treats each - as a separate input file backed by that same stream. With - - it takes one line for the first column and the next line for the second, so every two input lines become one row with two columns.
sort uses the key -k 2nr to sort paste's output by the second column in reverse numerical order.
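A minimal sketch of the same pipeline, with the sample data fed in through printf so it runs without creating a file. The function name pair_lines is made up for illustration, and the explicit TAB separator for sort is my own addition, on the assumption that item names might contain spaces.

```bash
# pair_lines is a hypothetical helper name, not part of any tool.
# paste - - joins every two input lines with a TAB; sort then orders the
# rows by the second TAB-separated field, numerically, highest first.
pair_lines() {
    tab=$(printf '\t')
    paste - - | sort -t "$tab" -k 2,2nr
}

printf '%s\n' apple 12 orange 4 rice 16 | pair_lines
```

Dropping the r from the key (-k 2,2n) gives ascending prices instead.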

you can use awk:
awk '!(NR%2){printf "%s %s\n" ,$0 ,p}{p=$0}' inputfile
(slightly adapted from this answer)
If you want to sort the output afterwards, you can use sort (quite logically):
awk '!(NR%2){printf "%s %s\n" ,$0 ,p}{p=$0}' inputfile | sort -n
this would give:
4 orange
12 apple
16 rice

Another solution using awk
$ awk '/[0-9]+/{print prev, $0; next} {prev=$0}' input
apple 12
orange 4
rice 16

while read -r line1 && read -r line2;do
printf '%s %s\n' "$line1" "$line2"
done < input_file
If you want the lines sorted by price, pipe the result to sort -k2n (a plain -k2 would compare the prices as text and put 16 before 4):
while read -r line1 && read -r line2;do
printf '%s %s\n' "$line1" "$line2"
done < input_file | sort -k2n
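One thing the loop above does quietly is drop a trailing line that has no partner. Here is a rough variant that warns about it and puts the price first, so a plain sort -n is enough. The function name join_pairs and the warning text are my own invention, and it assumes the file alternates name and price lines as in the question.

```bash
# join_pairs is a hypothetical helper: reads name/price line pairs from
# stdin and prints "price name" for each complete pair.
join_pairs() {
    local name price
    while IFS= read -r name; do
        if ! IFS= read -r price; then
            # Odd number of lines: the last name has no price line.
            printf 'warning: no price for "%s"\n' "$name" >&2
            break
        fi
        printf '%s %s\n' "$price" "$name"
    done
}

printf '%s\n' apple 12 orange 4 rice 16 | join_pairs | sort -n
```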

You can do this using paste and awk
$ paste - - <lines.txt | awk '{printf("%s %s\n",$2,$1)}'
12 apple
4 orange
16 rice

An awk-based solution that needs no external paste or sort, no regex, no modulo (%) arithmetic, and no awk or bash loops:
{m,g}awk '(_*=--_) ? (__ = $!_)<__ : ($++NF = __)_' FS='\n'
12 apple
4 orange
16 rice

Related

Use values in a column to separate strings in another column in bash

I am trying to separate a column of strings using the values from another column, maybe an example will be easier for you to understand.
The input is a table, with strings in column 2 separated with a comma ,.
The third column is the number of the field that should be output, with , as the delimiter within the second column.
Ben mango,apple 1
Mary apple,orange,grape 2
Sam apple,melon,* 3
Peter melon 1
The output should look like this, where records that correspond to an asterisk should not be outputted (the Sam row is not outputted):
Ben mango
Mary orange
Peter melon
I am able to generate the desired output using a for loop, but I think it is quite cumbersome:
IFS=$'\n'
for i in $(cat input.txt)
do
F=`echo $i | cut -f3`
paste <(echo $i | cut -f1) <(echo $i | cut -f2 | cut -d "," -f$F) | grep -v "\*"
done
Is there any one-liner to do it maybe using sed or awk? Thanks in advance.
The key to doing it in awk is the split() function, which populates an array based on a regular expression that matches the delimiters to split a string on:
$ awk '{ split($2, fruits, /,/); if (fruits[$3] != "*") print $1, fruits[$3] }' input.txt
Ben mango
Mary orange
Peter melon
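If the third column can point past the end of the list, the one-liner above prints the name with an empty value. A small sketch that uses the count returned by split() to skip such rows. The function name pick_field and the extra "Ann" row are made up for illustration.

```bash
# pick_field is a hypothetical wrapper; it reads stdin or the files given.
pick_field() {
    awk '{
        # split() returns how many items it stored in the array.
        n = split($2, items, /,/)
        # Skip rows whose index is out of range or whose pick is "*".
        if ($3 >= 1 && $3 <= n && items[$3] != "*")
            print $1, items[$3]
    }' "$@"
}

printf '%s\n' 'Ben mango,apple 1' 'Mary apple,orange,grape 2' \
    'Sam apple,melon,* 3' 'Peter melon 1' 'Ann kiwi 2' | pick_field
```

With this input the Sam row (asterisk) and the Ann row (index 2 of a one-item list) are both left out.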

How to remove lines based on another file? [duplicate]

Now I have two files as follows:
$ cat file1.txt
john 12 65 0
Nico 3 5 1
king 9 5 2
lee 9 15 0
$ cat file2.txt
Nico
king
Now I would like to remove each line that contains a name from the second file in its first column.
Ideal result:
john 12 65 0
lee 9 15 0
Could anyone tell me how to do that? I have tried the code like this:
for i in 'less file2.txt'; do sed "/$i/d" file1.txt; done
But it does not work properly.
You don't need to loop; just use grep with the -v option to invert the match and -w to make each pattern match only whole words. Note that -w matches the name anywhere on the line, not only in the first column.
grep -wvf file2.txt file1.txt
This job suits awk:
awk 'NR == FNR {a[$1]; next} !($1 in a)' file2.txt file1.txt
john 12 65 0
lee 9 15 0
Details:
NR == FNR { # While processing the first file
a[$1] # store the first field in an array a
next # move to next line
}
!($1 in a) # while processing the second file
# if first field doesn't exist in array a then print
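The grep and awk answers differ when a name shows up outside the first column. A quick sketch to see that, using temporary files. The last row of file1.txt is altered from the question's data (a hypothetical "king" in the fourth column) purely to trigger the difference.

```bash
tmpdir=$(mktemp -d)
printf '%s\n' 'john 12 65 0' 'Nico 3 5 1' 'king 9 5 2' 'lee 9 15 king' > "$tmpdir/file1.txt"
printf '%s\n' Nico king > "$tmpdir/file2.txt"

# grep -w matches the word anywhere on the line, so the "lee" row goes too.
by_grep=$(grep -wvf "$tmpdir/file2.txt" "$tmpdir/file1.txt")

# awk only looks at the first field, so the "lee" row is kept.
by_awk=$(awk 'NR == FNR {a[$1]; next} !($1 in a)' "$tmpdir/file2.txt" "$tmpdir/file1.txt")

printf 'grep:\n%s\nawk:\n%s\n' "$by_grep" "$by_awk"
rm -rf "$tmpdir"
```

So if the names can only ever appear in column one, either works; otherwise the awk version is the safer pick.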

Randomly selecting (units) from a file where a unit is 2 lines.

I want to select random units from a file, where each unit consists of 2 lines.
For example a file looks like this
Adam
Apple
Mindy
Candy
Steve
Chips
David
Meat
Carol
Carrots
And I want to randomly select, let's say, 2 units.
For example
Adam
Apple
David
Meat
or
Steve
Chips
Carol
Carrots
I've tried using shuf and sort -R, but they only shuffle single lines. Could someone help me please?
Thank you.
You could do it with shuf by joining the lines before shuffling (that might not be a bad idea for a file format in general, if the lines describe a single item):
$ < file sed -e 'N;s/\n/:/' | shuf | head -1 | tr ':' '\n'
Carol
Carrots
The sed loads two lines at a time, and joins them with a colon.
Pick a random number in the correct range, ensure that it is odd (if desired), then use sed to print the 2 lines:
$ a=$(expr $RANDOM % \( $(wc -l < input) / 2 \) \* 2 + 1)
$ sed -n -e ${a}p -e $((a+1))p input
Rather than selecting lines to print, you could walk the file and print each "unit" with a particular probability. For example, to print (roughly) 10% of the "units" in the file, you could do:
awk 'BEGIN{srand()} NR%2 && (rand() < .1) {print; getline; print}' input
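To get exactly N units rather than one unit or a rough percentage, the join-then-shuffle idea can be combined with shuf -n. A sketch, assuming GNU shuf is available and that no line contains a TAB (paste uses TAB as the joiner here). The name pick_units is made up.

```bash
# pick_units N: hypothetical helper that reads 2-line units from stdin
# and prints N of them, chosen at random without repetition.
pick_units() {
    local count=$1
    # Join each pair with a TAB, pick whole rows, then split them again.
    paste - - | shuf -n "$count" | tr '\t' '\n'
}

printf '%s\n' Adam Apple Mindy Candy Steve Chips David Meat Carol Carrots | pick_units 2
```

The selected units come out in random order, not in file order.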

Cannot get this simple sed command

This sed command is described as follows
Delete the cars that are $10,000 or more. Pipe the output of the sort into a sed to do this, by quitting as soon as we match a regular expression representing 5 (or more) digits at the end of a record (DO NOT use repetition for this):
So far the command is:
$ grep -iv chevy cars | sort -nk 5
I have to add another pipe at the end of that command I think which "quits as soon as we match a regular expression representing 5 or more digits at the end of a record"
I tried things like
$ grep -iv chevy cars | sort -nk 5 | sed "/[0-9][0-9][0-9][0-9][0-9]/ q"
and other variations within the // but nothing works! What is the command which matches a regular expression representing 5 or more digits and quits according to this question?
Nominally, you should add a $ before the second / to match 5 digits at the end of the record. If you omit the $, then any sequence of 5 digits will cause sed to quit, so if there is another number (a VIN, perhaps) before the price, it might match when you didn't intend it to.
grep -iv chevy cars | sort -nk 5 | sed '/[0-9][0-9][0-9][0-9][0-9]$/q'
On the whole, it's safer to use single quotes around the regex, unless you need to substitute a shell variable into it (or unless the regex contains single quotes itself). You can also specify the repetition:
grep -iv chevy cars | sort -nk 5 | sed '/[0-9]\{5,\}$/q'
The \{5,\} part matches 5 or more digits. If for any reason that doesn't work, you might find you're using GNU sed and you need to do something like sed --posix to get it working in the normal mode. Or you might be able to just remove the backslashes. There certainly are options to GNU sed to change the regex mechanism it uses (as there are with GNU grep too).
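One detail worth checking: by default q prints the current line before quitting, so the first car at $10,000 or more still appears in the output. If that line should go too, a possible variant is to turn off automatic printing with -n and print explicitly. This is only a sketch with made-up two-column data, and whether the assignment wants that first match printed is a guess on my part.

```bash
# cheap_cars is a hypothetical helper. It expects input already sorted by
# price in ascending order, with the price as the last thing on each line.
cheap_cars() {
    # -n disables automatic printing, so q exits without printing the
    # matching line; every line before it is printed by p.
    sed -n -e '/[0-9][0-9][0-9][0-9][0-9]$/q' -e p
}

printf '%s\n' 'ford 1000' 'fiat 5000' 'audi 10000' 'bmw 20000' | cheap_cars
```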
Another way.
Since you didn't post a sample file, I made a guess at its layout.
Here I'm looking for lines with the word "chevy" where the field 5 is less than 10000.
awk '/chevy/ {if ( $5 < 10000 ) print $0} ' cars
I forgot the equivalent of grep's -i flag, so the corrected command (IGNORECASE is a gawk extension) is:
awk 'BEGIN{IGNORECASE=1} /chevy/ {if ( $5 < 10000 ) print $0} ' cars
$ cat > cars
Chevy 2 3 4 10000
Chevy 2 3 4 5000
chEvy 2 3 4 1000
CHEVY 2 3 4 10000
CHEVY 2 3 4 2000
Prevy 2 3 4 1000
Prevy 2 3 4 10000
$ awk 'BEGIN{IGNORECASE=1} /chevy/ {if ( $5 < 10000 ) print $0} ' cars
Chevy 2 3 4 5000
chEvy 2 3 4 1000
CHEVY 2 3 4 2000
grep -iv chevy cars | sort -nk 5 | sed '/[0-9][0-9][0-9][0-9][0-9]$/d'

Sorting space delimited numbers with Linux/Bash

Is there a Linux utility or a Bash command I can use to sort a space delimited string of numbers?
Here's a simple example to get you going:
echo "81 4 6 12 3 0" | tr " " "\n" | sort -g
tr translates the spaces delimiting the numbers into newlines, because sort uses newlines as delimiters (i.e. it is for sorting lines of text). The -g option tells sort to sort by "general numerical value".
man sort for further details about sort.
This is a variation on @JamesMorris's answer:
echo "81 4 6 12 3 0" | xargs -n1 | sort -g | xargs
Instead of tr, I use xargs -n1 to convert to new lines. The final xargs is to convert back, to a space separated sequence of numbers.
This is a variation on ghostdog74's answer that's too big to fit in a comment. It shows digits instead of names of numbers and both the original string and the result are in space-delimited strings (instead of an array which becomes a newline-delimited string).
$ s="3 2 11 15 8"
$ sorted=$(echo $(printf "%s\n" $s | sort -n))
$ echo $sorted
2 3 8 11 15
$ echo "$sorted"
2 3 8 11 15
If you didn't use the echo when setting the value of sorted, then the string has newlines in it. In that case echoing it without quotes puts it all on one line, but, as echoing it with quotes would show, each number would appear on its own line. This is the case whether the original is an array or a string.
# demo
$ s="3 2 11 15 8"
$ sorted=$(printf "%s\n" $s | sort -n)
$ echo $sorted
2 3 8 11 15
$ echo "$sorted"
2
3
8
11
15
$ s=(one two three four)
$ sorted=$(printf "%s\n" "${s[@]}"|sort)
$ echo $sorted
four one three two
Using Bash parameter expansion (to replace spaces with newlines) we can do:
str="3 2 11 15 8"
sort -n <<< "${str// /$'\n'}"
# alternative
NL=$'\n'
str="3 2 11 15 8"
sort -n <<< "${str// /${NL}}"
If you actually have a space-delimited string of numbers, then one of the other answers provided would work fine. If your list is a bash array, then:
oldIFS="$IFS"
IFS=$'\n'
array=($(sort -g <<< "${array[*]}"))
IFS="$oldIFS"
might be a better solution. The newline delimiter would help if you want to generalize to sorting an array of strings instead of numbers.
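If touching IFS feels fragile, another option in bash 4 or later is mapfile, which reads sort's output line by line into an array. A minimal sketch; the variable name sorted is arbitrary, and this assumes bash rather than plain sh.

```bash
array=(3 2 11 15 8)

# printf puts one element per line, sort -n orders them numerically,
# and mapfile -t stores each output line as one element of "sorted".
mapfile -t sorted < <(printf '%s\n' "${array[@]}" | sort -n)

echo "${sorted[@]}"
```

Because each line becomes one element, this also copes with elements that contain spaces, as long as none contains a newline.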
Improving on Evan Krall's nice Bash "array sort" by limiting the scope of IFS to a single command:
printf "%q\n" "${IFS}"
array=(3 2 11 15 8)
array=($(IFS=$'\n' sort -n <<< "${array[*]}"))
echo "${array[@]}"
printf "%q\n" "${IFS}"
$ awk 'BEGIN{split(ARGV[1], numbers);for(i in numbers) {print numbers[i]} }' \
"6 7 4 1 2 3" | sort -n
I added this to my .zshrc (or .bashrc) file:
#sort a space-separated list of words (e.g. a list of HTML classes)
sortwords() {
echo $1 | xargs -n1 | sort -g | xargs
}
Call it from the terminal like this:
sortwords "banana date apple cherry"
# apple banana cherry date
Thanks to @FranMowinckel and others for inspiration.

Resources