Sort the tab-delimited numbers on each line of a file - linux

I'm trying to sort the numbers on each line of a file individually. The numbers within one line are separated by tabs. (I used spaces but they're actually tabs.)
For example, for the following input
5 8 7 6
1 5 6 8
8 9 7 1
the desired output would be:
5 6 7 8
1 5 6 8
1 7 8 9
My attempt so far is:
let i=1
while read line
do
echo "$line" | tr " " "\n" | sort -g
cut -f $i fileName | paste -s >> tempFile$$
((++i))
done < fileName

This is the best I got - I'm sure it can be done in 6 characters with awk/sed/perl:
while read line
do
echo $(printf "%d\n" $line | sort -n) | tr ' ' \\t >> another-file.txt
done < my-input-file.txt

Using a few features that are specific to GNU awk:
$ awk 'BEGIN{ PROCINFO["sorted_in"] = "@ind_num_asc" }
{ delete a; n = 0; for (i=1;i<=NF;++i) a[$i];
for (i in a) printf "%s%s", i, (++n<NF?FS:RS) }' file
5 6 7 8
1 5 6 8
1 7 8 9
Each field is set as a key in the array a. In GNU awk it is possible to specify the order in which the for (i in a) loop traverses the array - here, I've set it to do so in ascending numerical order.
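One caveat with this approach, since the fields become array keys: duplicate values collapse into a single key, so a line like 5 5 7 6 would come out one field short (and the ++n<NF separator logic would misfire). A sketch that keeps duplicates by counting occurrences instead:
$ awk 'BEGIN{ PROCINFO["sorted_in"] = "@ind_num_asc" }
{ delete a; n = 0; for (i=1;i<=NF;++i) a[$i]++;
for (i in a) while (a[i]-- > 0) printf "%s%s", i, (++n<NF?FS:RS) }' file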

Here is a bash script that can do it. It takes a filename argument or reads stdin; it was tested on CentOS and assumes the default IFS=$' \t\n'.
#!/bin/bash
if [ "$1" ] ; then exec < "$1" ; fi
cat - | while read line
do
set $line
echo $(for var in "$@"; do echo $var; done | sort -n) | tr " " "\t"
done
If you want to put the output in another file run it as:
cat input_file | sorting_script > another_file
or
sorting_script input_file > another_file
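A variant of the same idea using a bash array instead of the positional parameters, so the script's own arguments aren't clobbered by set (a sketch making the same IFS assumptions):
#!/bin/bash
if [ "$1" ] ; then exec < "$1" ; fi
while read -r -a fields ; do
    printf '%s\n' "${fields[@]}" | sort -n | paste -sd '\t' -
done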

Consider using perl for this:
perl -ape '@F=sort @F;$_="@F\n"' input.txt
Here -a turns on automatic field splitting (like awk does) into the array @F, -p makes it execute the script for each line and print $_ each time, and -e specifies the script directly on the command line.
Not quite 6 characters, I'm afraid, Sean.
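Worth noting: Perl's sort compares as strings by default, which happens to give the right order for the single-digit sample but not for multi-digit numbers (10 sorts before 9). A numeric comparator fixes that:
perl -ape '@F = sort { $a <=> $b } @F; $_ = "@F\n"' input.txt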
This should have been simple in awk, but it doesn't quite have the features needed. If there had been an array $@ corresponding to the fields $1, $2, etc., then the solution would have been awk '{asort $@}' input.txt, but sadly no such array exists. The loops required to move the fields into an array and out of it again make it longer than the bash version:
awk '{for(i=1;i<=NF;i++)a[i]=$i;asort(a);for(i=1;i<=NF;i++)printf("%s ",a[i]);printf("\n")}' input.txt
So awk isn't the right tool for the job here. It's also a bit odd that sort itself doesn't have a switch for sorting the fields within a line rather than the lines themselves.

Using awk
$ cat file
5 8 7 6
1 5 6 8
8 9 7 1
$ awk '{c=1;while(c!=""){c=""; for(i=1;i<NF;i++){n=i+1; if($i>$n){c=$i;$i=$n;$n=c}}}}1' file
5 6 7 8
1 5 6 8
1 7 8 9
More readable version (it's a straight bubble sort on the fields: each pass swaps adjacent out-of-order fields, and the outer loop stops once a pass makes no swap):
awk '{
    c=1
    while (c != "")
    {
        c = ""
        for (i=1; i<NF; i++)
        {
            n = i+1
            if ($i > $n)
            {
                c = $i
                $i = $n
                $n = c
            }
        }
    }
}1
' file
If you have ksh, you may try this (set -s sorts the positional parameters lexically, which matches numeric order here because all the fields are single digits):
#!/usr/bin/env ksh
while read line ; do
set -s +A cols $line
echo ${cols[*]}
done < "input_file"
Test
[akshay@localhost tmp]$ cat test.ksh
#!/usr/bin/env ksh
cat <<EOF | while read line ; do set -s +A cols $line; echo ${cols[*]};done
5 8 7 6
1 5 6 8
8 9 7 1
EOF
[akshay@localhost tmp]$ ksh test.ksh
5 6 7 8
1 5 6 8
1 7 8 9

Related

How do I turn a text file with a single column into a matrix?

I have a text file that has a single column of numbers, like this:
1
2
3
4
5
6
I want to convert it into two columns, filled in left-to-right order, like this:
1 2
3 4
5 6
I can do it with:
awk '{print>"line-"NR%2}' file
paste line-0 line-1 >newfile
But I think the reliance on two intermediate files will make it fragile in a script.
I'd like to use something like cat file | mystery-zip-command >newfile
You can use paste to do this:
paste -d " " - - < file > newfile
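Each - tells paste to read one line at a time from standard input, so naming it twice consumes the input in alternating pairs; a third - would give three columns:
paste -d " " - - - < file > newfile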
You can also use pr:
pr -ats" " -2 file > newfile
-a - use round robin order
-t - suppress header and trailer
-s " " - use single space as the delimiter
-2 - two column output
See also:
Convert a text file into columns
Another alternative:
$ seq 6 | xargs -n2
1 2
3 4
5 6
or with awk
$ seq 6 | awk '{ORS=NR%2?FS:RS}1'
1 2
3 4
5 6
If you want the output to be terminated with a newline in the case of an odd number of input lines:
$ seq 7 | awk '{ORS=NR%2?FS:RS}1; END{ORS=NR%2?RS:FS; print ""}'
1 2
3 4
5 6
7
awk 'NR % 2 == 1 { printf("%s", $1) }
NR % 2 == 0 { printf(" %s\n", $1) }
END { if (NR % 2 == 1) print "" }' file
The odd lines are printed with no newline after them, to print the first column. The even lines are printed with a space first and a newline after, to print the second column. At the end, if there were an odd number of lines, we print a newline so we don't end in the middle of the line.
With bash:
while IFS= read -r odd; do IFS= read -r even; echo "$odd $even"; done < file
Output:
1 2
3 4
5 6
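With an odd number of lines the second read fails, $even stays empty, and the last line prints with a trailing space. A small guard handles that, if it matters:
while IFS= read -r odd; do
    if IFS= read -r even; then echo "$odd $even"; else echo "$odd"; fi
done < file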
$ seq 6 | awk '{ORS=(NR%2?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2
3 4
5 6
$
$ seq 7 | awk '{ORS=(NR%2?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2
3 4
5 6
7
$
Note that it always adds a terminating newline - that is important as future commands might depend on it, e.g.:
$ seq 6 | awk '{ORS=(NR%2?FS:RS); print}' | wc -l
3
$ seq 7 | awk '{ORS=(NR%2?FS:RS); print}' | wc -l
3
$ seq 7 | awk '{ORS=(NR%2?FS:RS); print} END{if (ORS==FS) printf RS}' | wc -l
4
Just change the single occurrence of 2 to 3 or however many columns you want if your requirements change:
$ seq 6 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
$ seq 7 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
7
$ seq 8 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
7 8
$ seq 9 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
7 8 9
$
Short awk approach:
awk '{print ( ((getline nl) > 0)? $0" "nl : $0 )}' file
The output:
1 2
3 4
5 6
(getline nl) > 0 - getline reads the next record and assigns it to the variable nl; it returns 1 if it finds a record and 0 when it reaches the end of the file.
Short GNU sed approach:
sed 'N;s/\n/ /' file
N - add a newline to the pattern space, then append the next line of input to the pattern space
s/\n/ / - replace newline with whitespace within captured pattern space
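One portability note: with an odd number of input lines, GNU sed prints the final unpaired line as-is, while POSIX sed exits on N at end of input without printing it. Guarding the last line makes the command portable:
sed '$!N;s/\n/ /' file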
Another take: flatten everything onto one line with tr, then let sed break after every second field (GNU sed, hence -r):
seq 6 | tr '\n' ' ' | sed -r 's/([^ ]* [^ ]* )/\1\n/g'

How to read n-th line from a text file in bash?

Say I have a text file called "demo.txt" who looks like this:
1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16
Now I want to read a certain line, say line 2, with a command which will look something like this:
Line2 = read 2 "demo.txt"
So when I'll print it:
echo "$Line2"
I'll get:
5 6 7 8
I know how to use the sed command to print the n-th line of a file, but not how to read it into a variable. I also know the read command, but I don't know how to use it to read a certain line.
Thanks in advance for the help.
Using head and tail
$ head -2 inputFile | tail -1
5 6 7 8
OR
a generalized version
$ line=2
$ head -"$line" input | tail -1
5 6 7 8
Using sed
$ sed -n '2 p' input
5 6 7 8
$ sed -n "$line p" input
5 6 7 8
What it does:
-n suppresses normal printing of the pattern space.
'2 p' specifies the line number, 2 ($line in the general version); p is the command that prints the current pattern space.
input is the input file.
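For big files it is worth quitting as soon as the line has been printed, so sed does not read the rest of the file:
$ sed -n "$line{p;q;}" input
5 6 7 8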
Edit
To capture the output in a variable, use command substitution:
$ content=`sed -n "$line p" input`
$ echo $content
5 6 7 8
OR
$ content=$(sed -n "$line p" input)
$ echo $content
5 6 7 8
To read the output into a bash array:
$ content=( $(sed -n "$line p" input) )
$ echo ${content[0]}
5
$ echo ${content[1]}
6
Using awk
Perhaps an awk solution might look like
$ awk -v line=$line 'NR==line' input
5 6 7 8
Thanks to Fredrik Pihl for the suggestion.
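Since the question mentions read, here is a pure-bash sketch (no external commands) that walks the file to the n-th line:
line=2
n=0
while IFS= read -r content; do
    ((++n == line)) && break
done < demo.txt
echo "$content"
With bash 4+, mapfile can read the whole file into an array instead:
mapfile -t lines < demo.txt
echo "${lines[line-1]}"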
Perl has convenient support for this, too, and it's actually the most intuitive!
The flip-flop operator can be used with line numbers:
$ printf "0\n1\n2\n3\n4" | perl -ne 'printf if 2 .. 4'
1
2
3
Note that it's 1-based.
You can also mix regular expressions:
$ printf "0\n1\nfoo\n3\n4" | perl -ne 'printf if /foo/ .. -1'
foo
3
4
(-1 never matches a line number, so the range stays open through the last line)

Bash - finding minimum number per line

I am trying to get more familiar with awk statements, especially ones that can be done with just one line. I have a file that looks like this
9 5 0 2
8 7 4 3
4 8 2 1
I want the output to look like
0
3
1
Is there a way I can do this with just a one liner using awk? Thank you.
Using awk:
awk '{min=$1; for (i=2; i<=NF; i++) if ($i < min) min=$i; print min}' file
0
3
1
There are languages with built-in "min" functions:
ruby -ane 'puts $F.min' file
Or available libraries
perl -MList::Util=min -lane 'print min @F' file
Limiting to shell:
min() { printf "%s\n" "$@" | sort -n | head -1; }
while read -a nums; do
echo $(min "${nums[@]}")
done < file
GNU awk, which you'll find in most Linux distributions, has a built-in sort function, asort; it sorts the array's values in place, so a[1] ends up holding the smallest.
echo -e "9 5 0 2\n8 7 4 3\n4 8 2 1" |
awk '{ split($0,a); asort(a); print a[1]; }'
0
3
1

Searching a column in a unix file?

I have the data file below:
136110828724515000007700877
137110904734015000007700877
138110911724215000007700877
127110626724515000007700871
127110626726015000007700871
131110724724515000007700871
134110814725015000007700871
134110814734015000007700871
104110122726027000001810072
107110208724527000002900000
And I want to extract the value of column 3 (the third character of each line), i.e. the digits 6787714447.
I tried using:
awk "print $3" <filename>
but it didn't work. What should I use instead?
It is a better job for cut:
$ cut -c 3 < file
6
7
8
7
7
1
4
4
4
7
As per man cut:
-c, --characters=LIST
select only these characters
To make them all appear on one line, pipe the output through tr -d '\n':
$ cut -c 3 < file | tr -d '\n'
6787714447
Or even through sed as well, to keep a newline at the end:
$ cut -c 3 < file | tr -d '\n' | sed 's/$/\n/'
6787714447
With grep (-o prints only the match; \K excludes the first two characters from the reported match):
$ grep -oP "^..\K." file
6
7
8
7
7
1
4
4
4
7
with sed:
$ sed -r 's/..(.).*/\1/' file
6
7
8
7
7
1
4
4
4
7
with awk:
$ awk '{split ($0, a, ""); print a[3]}' file
6
7
8
7
7
1
4
4
4
7
Cut is probably the simpler/cleaner option, but here are two alternatives:
AWK version:
awk '{print substr($1, 3, 1) }' <filename>
Python version (written so it works under both Python 2 and 3):
python -c 'print("\n".join(x[2] for x in open("<filename>")))'
EDIT: Please see 1_CR's comments and disregard this option in favour of his.

AWK--Comparing the value of two variables in two different files

I have two text files, A.txt and B.txt. Each line of A.txt contains a single number:
A.txt
100
222
398
B.txt
1 2 103 2
4 5 1026 74
7 8 209 55
10 11 122 78
What I am looking for is something like this:
for each line of A
search B;
if (the value of third column in a line of B - the value of the variable in A > 10)
print that line of B;
Any awk for doing that??
How about something like this?
I had some trouble understanding your question, but maybe this will give you some pointers.
#!/bin/bash
# Read intresting values from file2 into an array,
for line in $(cat 2.txt | awk '{print $3}')
do
arr+=($line)
done
# Linecounter,
linenr=0
# Loop through every line in file 1,
for val in $(cat 1.txt)
do
# Increment linecounter,
((linenr++))
# Loop through every element in the array (containing values from 3 colum from file2)
for el in "${!arr[@]}";
do
# If that value - the value from file 1 is bigger than 10, print values
if [[ $((${arr[$el]} - $val )) -gt 10 ]]
then
sed -n "$(($el+1))p" 2.txt
# echo "Value ${arr[$el]} (on line $(($el+1)) from 2.txt) - $val (on line $linenr from 1.txt) equals $((${arr[$el]} - $val )) and is hence bigger than 10"
fi
done
done
Note: this is a quick and dirty thing and there is room for improvement, but I think it'll do the job.
Use awk like this:
cat f1
1
4
9
16
cat f2
2 4 10 8
3 9 20 8
5 1 15 8
7 0 30 8
awk 'FNR==NR{a[NR]=$1;next} $3-a[FNR] < 10' f1 f2
2 4 10 8
5 1 15 8
UPDATE: Based on OP's edited question:
awk 'FNR==NR{a[NR]=$1;next} {for (i in a) if ($3-a[i] > 10) print}' f1 f2
See how simple the awk-based solution is compared to the nested for loops.
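One caveat, in case several values from the first file can be in range for the same line of B: the loop above prints such a line once per matching value. Breaking out after the first match avoids the duplicates:
awk 'FNR==NR{a[NR]=$1;next} {for (i in a) if ($3-a[i] > 10) {print; next}}' f1 f2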
