How can I print the upper triangle of a matrix - Linux

Using the awk command, I tried to print the upper triangle of a matrix:
awk '{for (i=1;i<=NF;i++) if (i>=NR) printf $i FS "\n"}' matrix
but the output is shown as a single row.

Consider this sample matrix:
$ cat matrix
1 2 3
4 5 6
7 8 9
To print the upper-right triangle:
$ awk '{for (i=1;i<=NF;i++) printf "%s%s",(i>=NR)?$i:" ",FS; print""}' matrix
1 2 3
  5 6
    9
Or:
$ awk '{for (i=1;i<=NF;i++) printf "%2s",(i>=NR)?$i:" "; print""}' matrix
 1 2 3
   5 6
     9
To print the upper-left triangle:
$ awk '{for (i=1;i<=NF+1-NR;i++) printf "%s%s",$i,FS; print""}' matrix
1 2 3
4 5
7
Or:
$ awk '{for (i=1;i<=NF+1-NR;i++) printf "%2s",$i; print""}' matrix
 1 2 3
 4 5
 7
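The "%2s" variants above keep columns aligned only while every entry fits in two characters. A minimal sketch, assuming a whitespace-separated matrix, that pads each cell to the width of the widest entry instead (upper_right is a hypothetical helper name, not part of the answer above):

```shell
# Hypothetical helper: print the upper-right triangle of a matrix,
# padding every cell to the width of the widest entry.
upper_right() {
  awk '
    {
      # Remember every cell, the widest entry, and the largest field count.
      for (i = 1; i <= NF; i++) {
        cell[NR, i] = $i
        if (length($i) > w) w = length($i)
      }
      if (NF > nf) nf = NF
    }
    END {
      for (r = 1; r <= NR; r++) {
        line = ""
        for (i = 1; i <= nf; i++)
          # Cells left of the diagonal are replaced by blank padding.
          line = line sprintf("%" w "s%s", ((i >= r) ? cell[r, i] : ""), ((i < nf) ? " " : ""))
        print line
      }
    }' "$@"
}

printf '1 2 3\n4 5 6\n7 8 9\n' | upper_right
```

The whole matrix is held in memory here, which is the price of knowing the column width before printing the first row.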

This might work for you (GNU sed):
sed -r ':a;n;H;G;s/\n//;:b;s/^\S+\s*(.*)\n.*/\1/;tb;$!ba' file
Use the hold space as a counter for the lines that have already been processed and, for each current line, remove that many fields from the front of it.
N.B. The counter is updated after the current line is printed; otherwise the first line would lose its first field.
On reflection, a more elegant alternative is:
sed -r '1!G;h;:a;s/^\S+\s*(.*)\n.*/\1/;ta' file
And to print the upper-left triangle:
sed -r '1!G;h;:a;s/^([^\n]*)\S+[^\n]*(.*)\n.*/\1\2/;ta' file

$ awk '{for (i=NR;i<=NF;i++) printf "%s%s",$i,(i<NF?FS:RS)}' file
1 2 3
5 6
9

Related

Bash column sum over a table of variable length

I'm trying to get the column sums (except for the first column) of a tab-delimited file containing numbers.
To find out the number of columns and store it in a variable I use:
cols=$(awk '{print NF}' file.txt | sort -nu | tail -n 1
Next, I want to calculate the sum of all numbers in each column and store it in a variable, again in a for loop:
for c in 2:$col
do
num=$(cat file.txt | awk '{sum+$2 ; print $0} END{print sum}'| tail -n 1
done
this
num=$(cat file.txt | awk '{sum+$($c) ; print $0} END{print sum}'| tail -n 1
on its own with a fixed number and without variable input works fine, but I cannot get it to accept the for-loop variable.
Thanks for the support
P.S. It would also be fine if I could sum all columns (except the first one) at once without the loop trouble.
Assuming you want the sums of the individual columns,
$ cat file
1 2 3 4
5 6 7 8
9 10 11 12
$ awk '
{for (i=2; i<=NF; i++) sum[i] += $i}
END {for (i=2; i<=NF; i++) printf "%d%s", sum[i], OFS; print ""}
' file
18 21 24
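If the per-column loop from the question is still wanted, note that single quotes stop the shell from expanding $c inside the awk program, and that sum+$2 does not accumulate anything (it needs +=). A minimal sketch of passing the column number with awk's -v option; colsum and the temporary file are illustrative names, not taken from the question or the answers:

```shell
# Hypothetical helper: sum column number $1 of the file named by $2.
colsum() {
  awk -v c="$1" '{ sum += $c } END { print sum + 0 }' "$2"
}

# Sample tab-separated data matching the table above.
tmp=$(mktemp)
printf '1\t2\t3\t4\n5\t6\t7\t8\n9\t10\t11\t12\n' > "$tmp"

# Largest field count in the file, computed without sort and tail.
cols=$(awk '{ if (NF > max) max = NF } END { print max }' "$tmp")

sums=""
for c in $(seq 2 "$cols"); do
  sums="$sums$(colsum "$c" "$tmp") "
done
echo "$sums"
rm -f "$tmp"
```

This rereads the file once per column, so the single-pass array version above is the better choice for large tables.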
In case you're not bound to awk, there's a nice tool for "command-line statistical operations" on textual files called GNU datamash.
With datamash, summing (probably the simplest operation of all) a 2nd column is as easy as:
$ datamash sum 2 < table
9
Assuming the table file holds tab-separated data like:
$ cat table
1 2 3 4
2 3 4 5
3 4 5 6
To sum all columns from 2 to n use column ranges (available in datamash 1.2):
$ n=4
$ datamash sum 2-$n < table
9 12 15
To include headers, see the --headers-out option

How do I turn a text file with a single column into a matrix?

I have a text file that has a single column of numbers, like this:
1
2
3
4
5
6
I want to convert it into two columns, in the left to right order this way:
1 2
3 4
5 6
I can do it with:
awk '{print>"line-"NR%2}' file
paste line-0 line-1 >newfile
But I think the reliance on two intermediate files will make it fragile in a script.
I'd like to use something like cat file | mystery-zip-command >newfile
You can use paste to do this:
paste -d " " - - < file > newfile
You can also use pr:
pr -ats" " -2 file > newfile
-a - use round robin order
-t - suppress header and trailer
-s " " - use single space as the delimiter
-2 - two column output
See also:
Convert a text file into columns
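Each "-" passed to paste stands for one output column read from standard input, so the column count can be made a parameter. A rough sketch under that assumption (to_columns is a hypothetical name):

```shell
# Hypothetical helper: reshape one value per line into n columns with paste.
to_columns() {
  n=$1
  dashes=""
  i=0
  while [ "$i" -lt "$n" ]; do
    dashes="$dashes -"
    i=$((i + 1))
  done
  # $dashes is intentionally unquoted so that each "-" becomes its own argument.
  paste -d ' ' $dashes
}

seq 6 | to_columns 3
```

When the number of input lines is not a multiple of n, paste still emits the delimiters for the missing cells, so the last row ends in trailing spaces.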
another alternative
$ seq 6 | xargs -n2
1 2
3 4
5 6
or with awk
$ seq 6 | awk '{ORS=NR%2?FS:RS}1'
1 2
3 4
5 6
If you want the output to end with a newline when the input has an odd number of lines:
$ seq 7 | awk '{ORS=NR%2?FS:RS}1; END{ORS=NR%2?RS:FS; print ""}'
1 2
3 4
5 6
7
awk 'NR % 2 == 1 { printf("%s", $1) }
NR % 2 == 0 { printf(" %s\n", $1) }
END { if (NR % 2 == 1) print "" }' file
The odd lines are printed with no newline after them, to print the first column. The even lines are printed with a space first and a newline after, to print the second column. At the end, if there were an odd number of lines, we print a newline so we don't end in the middle of the line.
With bash:
while IFS= read -r odd; do IFS= read -r even; echo "$odd $even"; done < file
Output:
1 2
3 4
5 6
$ seq 6 | awk '{ORS=(NR%2?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2
3 4
5 6
$
$ seq 7 | awk '{ORS=(NR%2?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2
3 4
5 6
7
$
Note that it always adds a terminating newline - that is important as future commands might depend on it, e.g.:
$ seq 6 | awk '{ORS=(NR%2?FS:RS); print}' | wc -l
3
$ seq 7 | awk '{ORS=(NR%2?FS:RS); print}' | wc -l
3
$ seq 7 | awk '{ORS=(NR%2?FS:RS); print} END{if (ORS==FS) printf RS}' | wc -l
4
Just change the single occurrence of 2 to 3 or however many columns you want if your requirements change:
$ seq 6 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
$ seq 7 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
7
$ seq 8 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
7 8
$ seq 9 | awk '{ORS=(NR%3?FS:RS); print} END{if (ORS==FS) printf RS}'
1 2 3
4 5 6
7 8 9
$
Short awk approach:
awk '{print ( ((getline nl) > 0)? $0" "nl : $0 )}' file
The output:
1 2
3 4
5 6
(getline nl)>0 - getline reads the next record into the variable nl. It returns 1 if it finds a record, 0 at the end of the file, and -1 on error
Short GNU sed approach:
sed 'N;s/\n/ /' file
N - append a newline and then the next line of input to the pattern space
s/\n/ / - replace that newline with a space within the pattern space
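With an odd number of lines, N has no next line to read when it reaches the last one. GNU sed prints the pattern space before exiting in that case, but other sed implementations may discard it. A commonly used guard, sketched here with a hypothetical join_pairs wrapper, is to run N only when not on the last line:

```shell
# Hypothetical helper: join consecutive pairs of lines with a space.
# $!N runs N on every line except the last, so a trailing unpaired line
# is printed as-is instead of depending on how sed handles N at end of input.
join_pairs() {
  sed '$!N;s/\n/ /' "$@"
}

seq 7 | join_pairs
```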
seq 6 | tr '\n' ' ' | sed -r 's/([^ ]* [^ ]* )/\1\n/g'

Change format of text file

I have a file with many lines of tab separated data in the following format:
1 1 2 2
3 3 4 4
5 5 6 6
...
and I would like to change the format to:
1 1
2 2
3 3
4 4
5 5
6 6
Is there a not too complicated way to do this? I don't have any experience with using awk, sed, etc.
Thanks
If you just want to group your file in blocks of X columns, you can make use of xargs -nX:
$ xargs -n2 < file
1 1
2 2
3 3
4 4
5 5
6 6
To have more control and print an empty line after every 4th field (that is, after each input line), you can also use this awk:
$ awk 'BEGIN{FS=OFS="\t"} {for (i=1;i<=NF;i++) printf "%s%s", $i, (i%2?OFS:RS); print ""}' file
1 1
2 2

3 3
4 4

5 5
6 6
# <-- note there is an empty line here
Explanation
After odd fields, it prints OFS.
After even fields, it prints RS.
Note that FS is the field separator, which defaults to a space, OFS is the output field separator, and RS is the record separator, which defaults to a newline. As the file is tab-separated, we set both FS and OFS to a tab in the BEGIN block.
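The same odd/even idea can be written with the group size as a parameter. A minimal sketch, assuming tab-separated input on standard input and no blank line between groups (regroup is a hypothetical name):

```shell
# Hypothetical helper: re-wrap each tab-separated line into rows of n fields.
regroup() {
  awk -v n="$1" 'BEGIN { FS = OFS = "\t" }
    {
      for (i = 1; i <= NF; i++)
        # End the row after every n-th field and after the last field.
        printf "%s%s", $i, (i % n && i < NF ? OFS : ORS)
    }'
}

printf '1\t1\t2\t2\n3\t3\t4\t4\n' | regroup 2
```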
This is probably the simplest way which allows for customisation
awk '{print $1,$2"\n"$3,$4}' file
For a line between
awk '{print $1,$2"\n"$3,$4"\n"}' file
although fedorqui's answer with xargs is probably the simplest if this isn't needed.
As Ed pointed out, this wouldn't work if there were blanks in the fields; this could be resolved using
awk 'BEGIN{FS=OFS="\t"} {print $1,$2 ORS $3,$4 ORS}' file
Through perl,
perl -pe 's/\t(\d\t\d)$/\n$1\n/g' file
Feed the above command's output to sed to delete the last blank line:
perl -pe 's/\t(\d\t\d)$/\n$1\n/g' file | sed '$d'

Bash - finding minimum number per line

I am trying to get more familiar with awk statements, especially ones that can be done with just one line. I have a file that looks like this
9 5 0 2
8 7 4 3
4 8 2 1
I want the output to look like
0
3
1
Is there a way I can do this with just a one liner using awk? Thank you.
Using awk:
awk '{min=$1; for (i=2; i<=NF; i++) if ($i < min) min=$i; print min}' file
0
3
1
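The comparison $i < min is numeric only when awk treats both sides as numbers; otherwise it falls back to string comparison. A small sketch that forces numeric comparison by adding 0 to each field (row_min is a hypothetical name):

```shell
# Hypothetical helper: print the numeric minimum of each line.
# Adding 0 converts a field to a number, so "10" is never compared as a string.
row_min() {
  awk '{
    min = $1 + 0
    for (i = 2; i <= NF; i++)
      if ($i + 0 < min) min = $i + 0
    print min
  }' "$@"
}

printf '9 5 0 2\n8 7 4 3\n4 8 2 1\n' | row_min
```

An empty input line would print 0 with this version, since an empty field converts to 0.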
There are languages with built-in "min" functions:
ruby -ane 'puts $F.min_by(&:to_f)' file
Or available libraries
perl -MList::Util=min -lane 'print min @F' file
Limiting to shell:
min() { printf "%s\n" "$@" | sort -n | head -1; }
while read -a nums; do
echo $(min "${nums[@]}")
done < file
GNU awk, which you'll find in most Linux distributions, has a built-in sort function, asort.
echo -e "9 5 0 2\n8 7 4 3\n4 8 2 1" |
awk '{ split($0,a); asort(a); print a[1]; }'
0
3
1

select the second line to last line of a file

How can I select the lines from the second line to the line before the last line of a file by using head and tail in unix?
For example if my file has 15 lines I want to select lines from 2 to 14.
tail -n +2 /path/to/file | head -n -1
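The negative count in head -n -1 may not be supported by every head implementation. Where it is missing, one alternative (a sketch, with middle_lines as a hypothetical name) is to let sed delete the first and last lines:

```shell
# Hypothetical helper: drop the first and the last line of the input.
# 1d deletes line 1, $d deletes the final line; everything else is printed.
middle_lines() {
  sed '1d;$d' "$@"
}

seq 15 | middle_lines
```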
perl -ne 'print if($.!=1 and !(eof))' your_file
tested below:
> cat temp
1
2
3
4
5
6
7
> perl -ne 'print if($.!=1 and !(eof))' temp
2
3
4
5
6
>
Alternatively, in awk you can use:
awk '{a[count++]=$0}END{for(i=1;i<count-1;i++) print a[i]}' your_file
To print all lines but first and last ones you can use this awk as well:
awk 'NR==1 {next} {if (f) print f; f=$0}'
This always prints the previous line. To prevent the first one from being printed, we skip the line when NR is 1. Then, the last one won't be printed because when reading it we are printing the penultimate!
Test
$ seq 10 | awk 'NR==1 {next} {if (f) print f; f=$0}'
2
3
4
5
6
7
8
9
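One caveat with the test if (f): it is false when the saved line is empty or is the number 0, so such lines would be skipped. A minimal sketch that keys on the record number instead (inner_lines is a hypothetical name):

```shell
# Hypothetical helper: print every line except the first and the last.
# From line 3 onward the previously saved line is printed, so line 1 is
# never printed and the last line is saved but never emitted.
inner_lines() {
  awk 'NR > 2 { print prev } { prev = $0 }' "$@"
}

printf '1\n0\n\n4\n5\n' | inner_lines
```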
