Find minimum value using awk - linux

I am trying to find the minimum value from a file.
input.txt
1
2
4
5
6
4
This is the code that I am using:
awk '{sum += $1; min = min < $1 ? min : $1} !(FNR%6){print min;sum=min = ""}' input.txt
But it is not working. Can anybody see the error in my code?

Use the script below to find the minimum value in a text file:
awk 'min=="" || $1 < min {min=$1} END {print min}' input.txt
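As for why the code in the question fails: min starts out uninitialized, and an uninitialized awk variable compares as 0 against a number. With positive input, `min < $1` is therefore always true and min is never given a real value. A minimal sketch of that behaviour next to an initialized version (the function names naive_min and safe_min are made up for illustration):

```shell
# naive_min: the comparison from the question, with min never initialized.
# The brackets make the empty result visible.
naive_min() {
  awk '{ min = min < $1 ? min : $1 } END { print "[" min "]" }'
}

# safe_min: seed min from the first line, then keep the smaller value.
# "$1 + 0" forces a numeric comparison.
safe_min() {
  awk 'NR == 1 || $1 + 0 < min { min = $1 + 0 } END { if (NR) print min }'
}

printf '1\n2\n4\n5\n6\n4\n' | naive_min   # prints []
printf '1\n2\n4\n5\n6\n4\n' | safe_min    # prints 1
```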

Set min to $1 on the first line
awk 'NR == 1 {min = $1} {sum += $1; min = min < $1 ? min : $1} !(FNR%6){print min;sum=min = ""}' input.txt
output:
1
Note that sum isn't used, so you could simplify to this:
awk 'NR == 1 {min = $1;} {min = min < $1 ? min : $1} !(FNR%6){print min;}' input.txt
To allow any number of lines:
awk 'NR == 1 {min = $1;} {min = min < $1 ? min : $1} END{print min;}' input.txt
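If the intent of the `!(FNR%6)` test in the question was one minimum per block of six lines, min has to be re-seeded at the start of every block rather than reset to an empty string. A rough sketch under that assumption (block_min and its block-size argument are illustrative names, not part of the original code):

```shell
# block_min N: print the minimum of column 1 for every N input lines (default 6).
block_min() {
  awk -v n="${1:-6}" '
    (NR - 1) % n == 0 { min = $1 + 0 }   # first line of a block seeds min
    $1 + 0 < min      { min = $1 + 0 }   # keep the smaller value
    NR % n == 0       { print min }      # block complete
    END { if (NR % n) print min }        # trailing partial block, if any
  '
}

printf '1\n2\n4\n5\n6\n4\n' | block_min 6   # prints 1
```

With a block size of 2 and the input 5, 3, 9, 2, 8 this prints 3, 2 and 8 on separate lines; the last value comes from the incomplete final block.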

Related

Using awk, subtract with previous row in all columns and print the result

I need a one-liner for Linux using awk that subtracts the previous row from each row, in all columns, and prints the differences.
I have input as
2021-02-15_16 101242 102108 17572 84538
2021-02-15_17 101235 102077 17625 84445
Expected output
2021-02-15_17 -7 -31 53 -93
I tried this by myself but with no luck.
cat test |awk 'NR==1{s=$3;next}{s-=$3}END{print s}' --> this displays only for 1 column
cat test | awk 'NR==1 {for(i=3; i<=NF; i++){s=$i;next}{s-=$i}{print s}}'
You may use this awk:
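awk 'NR > 1 {for (i=2; i<=5; ++i) {t=$i; $i-=a[i]; a[i]=t} print; next} {split($0,a)}' file
2021-02-15_17 -7 -31 53 -93
The loop stores each original value back into a before moving on, so with more than two rows every row is still compared against the previous row's values rather than against the differences just printed.
To make it more readable:
awk 'NR > 1 {
    for (i=2; i<=5; ++i) {
        t = $i          # original value of this column
        $i -= a[i]      # difference from the previous row
        a[i] = t        # remember the original for the next row
    }
    print
    next
}
{
    split($0,a)         # first row: just remember it
}' file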
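When the number of columns is not known in advance, the same idea can loop up to NF instead of a fixed 5. A small sketch, assuming whitespace-separated input with a label in column 1 (row_diff and prev are names chosen here for illustration):

```shell
# row_diff: for every row after the first, print column 1 followed by the
# difference between each numeric column and the same column of the previous row.
row_diff() {
  awk '
    NR > 1 {
      for (i = 2; i <= NF; ++i) {
        cur = $i              # original value
        $i = cur - prev[i]    # difference from the previous row
        prev[i] = cur         # keep the original for the next row
      }
      print
      next
    }
    { split($0, prev) }       # first row: nothing to subtract from yet
  '
}

printf '2021-02-15_16 101242 102108 17572 84538\n2021-02-15_17 101235 102077 17625 84445\n' | row_diff
```

For the two sample rows this prints `2021-02-15_17 -7 -31 53 -93`.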

How to copy a value from one column to another?

I have CSV data with two price columns. If a value exists in the $4 column I want to copy it over the $3 column of the same row. If $4 is empty then $3 should be left as is.
Neither of these work:
awk -F',' '{ if (length($4) == 0) $3=$4 }'
awk -F',' '{ if(!length($4) == 0 ) print $4 }'
This will output every line with the sample table
awk -F',' '{ if(!length($4) == 0 ) print $0 }' inputfile
This will output nothing with the sample table
awk -F',' '{ if(length($4) == 0 ) print $3 }' inputfile
I've cleaned my two input files, fixed the header row, and joined them using sed, awk, sort, and join. Now what I am left with is a CSV which looks like this:
itemnumber,available,regprice,mapprice
00061,9,19.30,
00061030,31,2.87,3.19
00062,9,15.44,
00062410,2,3.59,3.99
00064,9,15.44,
00066850,29,2.87,3.99
00066871,49,4.19,5.99
00066878,3,5.63,7.99
I need to overwrite the $3 column if the $4 column in the same row has a value. The end result would be:
itemnumber,available,regprice,mapprice
00061,9,19.30,
00061030,31,3.19,3.19
00062,9,15.44,
00062410,2,3.99,3.99
00064,9,15.44,
00066850,29,3.99,3.99
00066871,49,5.99,5.99
00066878,3,7.99,7.99
$ awk 'BEGIN{FS=OFS=","} (NR>1) && ($4!=""){$3=$4} 1' file
itemnumber,available,regprice,mapprice
00061,9,19.30,
00061030,31,3.19,3.19
00062,9,15.44,
00062410,2,3.99,3.99
00064,9,15.44,
00066850,29,3.99,3.99
00066871,49,5.99,5.99
00066878,3,7.99,7.99
Let's have a look at all the things you tried:
1. awk -F',' '{ if (length($4) == 0) $3=$4 }'
This states: if the length of field 4 is zero, set field 3 equal to field 4. You do not ask awk to print anything, so it prints nothing. This would have printed something:
awk -F',' '{ if (length($4) == 0) $3=$4 }{print $0}'
but with the field separators of every modified line replaced by a space, so you should have done:
awk 'BEGIN{FS=OFS=","}{ if (length($4) == 0) $3=$4 }{print $0}'
2. awk -F',' '{ if(!length($4) == 0 ) print $4 }'
Here you state: if it is not true that the length of field 4 equals zero, print field 4.
As you mention that nothing is printed, it most likely indicates that you have hidden characters in field 4, such as a CR (See: Remove carriage return in Unix), or even just spaces. You could attempt something like
awk -F',' '{sub(/ *\r?$/,"")}{ if(!length($4) == 0 ) print $4 }'
3. awk -F',' '{ if(!length($4) == 0 ) print $0 }' inputfile
See 2.
4. awk -F',' '{ if(length($4) == 0 ) print $3 }' inputfile
This confirms my suspicion from 2.
My solution for your problem would be based on the suggestion in 2 and the solution of Ed Morton.
awk 'BEGIN{FS=OFS=","} {sub(/ *\r?$/,"")}(NR>1) && ($4!=""){$3=$4} 1' file
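Two behaviours this answer relies on are easy to check in isolation: assigning to a field makes awk rebuild the record with OFS, and a trailing carriage return makes a visually empty last field non-empty. A quick sketch (the helper names are invented for this demonstration):

```shell
# Assigning to a field rebuilds $0 using OFS, which is a single space by default.
rebuild_default() { awk -F, '{ $2 = "X"; print }'; }

# Setting OFS as well keeps the commas.
rebuild_comma() { awk 'BEGIN { FS = OFS = "," } { $2 = "X"; print }'; }

# Length of the last field; a CR left over from a DOS-format file counts as one character.
last_field_length() { awk -F, '{ print length($NF) }'; }

printf 'a,b,c\n'   | rebuild_default     # prints a X c
printf 'a,b,c\n'   | rebuild_comma       # prints a,X,c
printf 'a,b,\r\n'  | last_field_length   # prints 1
printf 'a,b,\n'    | last_field_length   # prints 0
```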
Here's code that matches your results:
awk -F, -v OFS=, '
NR == 1
NR > 1 {
if ( $4 == "" )
print $1,$2,$3,$4
else
print $1,$2,$4,$4 }
' "$@"
I've run into trouble in the past with expressions like $3 = $4, so I just print out all of the fields.
Edit: I got shamed by Ed Morton for avoiding the $3 = $4 without troubleshooting. I gave it another shot here below:
awk -F, -v OFS=, '
NR == 1
NR > 1 {
if ( $4 != "" )
$3 = $4
print
}
' "$@"
The above achieves the same results.
Tried on GNU awk:
awk -F, -vOFS=, '/[0-9.]+/{if($4)$3=$4} {print}' file

AWK script to print line with the largest number of fields

The script below displays the largest number of fields in twister.txt.
awk '{if (NF > max) max = NF} END{print max}' twister.txt
My question is: how do you display the line itself that has the largest number of fields in twister.txt?
awk '{if (NF > max) {max = NF; line=$0}} END{print line}' twister.txt
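That one-liner keeps only the first line that reaches the maximum. If several lines tie and all of them are wanted, one possible variation collects them in an array (widest_lines is a made-up name, and printing every tied line is an assumption about what is wanted):

```shell
# widest_lines: print every input line whose field count equals the maximum.
widest_lines() {
  awk '
    NF > max  { max = NF; n = 0 }     # new maximum: discard earlier candidates
    NF == max { line[++n] = $0 }      # remember each line that matches it
    END { for (i = 1; i <= n; i++) print line[i] }
  '
}

printf 'a b\nc\nd e\n' | widest_lines
```

For this input both `a b` and `d e` are printed, in input order.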

Merging Multiple records into a Unique records with all the non-null values

Suppose I have 3 records :
P1||1234|
P1|56001||
P1|||NJ
I want to merge these 3 records into one with all the attributes. Final record :
P1|56001|1234|NJ
Is there any way to achieve this in Unix/Linux?
I assume you are asking for a solution with bash, awk, sed, etc.
You could try something like
$ cat test.txt
P1||1234|
P1|56001||
P1|||NJ
$ cat test.txt | awk -F'|' '{ for (i = 1; i <= NF; i++) print $i }' | grep -E '.+' | sort | uniq | awk 'BEGIN{ c = "" } { printf "%s%s", c, $0; c = "|" } END{ printf "\n" }'
1234|56001|NJ|P1
Briefly, the first awk splits each line on the '|' separator and prints every field on its own line. grep -E removes the empty lines. After that, sort and uniq remove duplicate values. Finally, the second awk joins the lines with the '|' separator.
Update:
If I understand correctly, here's what you are looking for:
$ cat test.txt | awk -F'|' '{ for (i = 1; i <= NF; i++) if($i) col[i]=$i } END{ for (i = 1; i <= length(col); i++) printf col[i] (i == length(col) ? "\n" : "|")}'
P1|56001|1234|NJ
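The same column-wise merge can be extended to files that contain more than one key in the first field. A sketch under the assumptions that the key is always column 1, that an empty string means "no value", and that a later non-empty value overrides an earlier one (merge_records, val and order are illustrative names):

```shell
# merge_records: merge all rows that share the same first field, keeping
# each non-empty value in its original column position.
merge_records() {
  awk -F'|' -v OFS='|' '
    {
      if (!($1 in seen)) { seen[$1] = 1; order[++nkeys] = $1 }  # remember key order
      if (NF > maxnf) maxnf = NF
      for (i = 2; i <= NF; i++)
        if ($i != "") val[$1, i] = $i
    }
    END {
      for (k = 1; k <= nkeys; k++) {
        key = order[k]
        out = key
        for (i = 2; i <= maxnf; i++) out = out OFS val[key, i]
        print out
      }
    }
  '
}

printf 'P1||1234|\nP1|56001||\nP1|||NJ\n' | merge_records   # prints P1|56001|1234|NJ
```

Comparing against the empty string instead of testing `if ($i)` keeps a literal 0 from being treated as missing.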
In your example, the first row has 1234 and the second row has 56001.
I don't get why 56001 goes before 1234 in your final result; I assume it is a typo/mistake.
An awk one-liner could do the job:
awk -F'|' '{for(i=2;i<=NF;i++)if($i)a[$1]=(a[$1]?a[$1]"|":"")$i}END{print $1"|"a[$1]}'
with your data:
kent$ echo "P1||1234|
P1|56001||
P1||NJ"|awk -F'|' '{for(i=2;i<=NF;i++)if($i)a[$1]=(a[$1]?a[$1]"|":"")$i}END{print $1"|"a[$1]}'
P1|1234|56001|NJ

How to split values in a column into separate column

My tab-delimited file looks like this:
ID Pop snp1 snp2 snp3 snp4 snp5
AD62 1 0/1 1/1 . 1/1 0/.
AD75 1 0/0 1/1 . ./0 1/0
AD89 1 . 1/0 1/1 0/0 1/.
I want to separate the columns (starting from column 3) so that the values separated by the "/" character are delimited into a column of its own. However there are also columns whereby the values are missing (they only contain the "." character) and I want this to be treated as though it was "./." so that the two "." characters are then divided into their own columns. For example:
ID Pop snp1 snp2 snp3 snp4 snp5
AD62 1 0 1 1 1 . . 1 1 0 .
AD75 1 0 0 1 1 . . . 0 1 0
AD89 1 . . 1 0 1 1 0 0 1 .
Thanks
You can use sed:
sed -e 's/ \. /\.\t\. /g' -e 's/\//\t/g' <your_file>
I tried this and it works well; you can tweak it to suit your requirements.
Assuming data is in data.txt file.
cat data.txt | sed 1d | tr '/' '\t'| sed 's/\./.\t./g'
This produces the output, but you will need a workaround for the spaces and tabs that get mixed up.
This might work for you (GNU sed):
sed '1s/\t/&&/3g;s/\t\.\t/\t.\t.\t/g;y/\//\t/' file
A fairly robust way, using awk and a few if statements:
awk '{ for (i = 1; i <= NF; i++) if (i >= 3 && i < NF && NR == 1) printf "%s\t\t", $i; else if (i == NF && NR == 1) print $i; else if ($i == "." && NR >= 2) printf "%s", (i == NF ? ".\t.\n" : ".\t.\t"); else { sub ("/", "\t", $i); if (i == NF) printf "%s\n", $i; else { printf "%s\t", $i; } } }' file.txt
Broken out on multiple lines:
awk '{ for (i = 1; i <= NF; i++)
    if (i >= 3 && i < NF && NR == 1) printf "%s\t\t", $i;
    else if (i == NF && NR == 1) print $i;
    else if ($i == "." && NR >= 2) printf "%s", (i == NF ? ".\t.\n" : ".\t.\t");  # end the line if "." is the last field
    else {
        sub ("/", "\t", $i);
        if (i == NF) printf "%s\n", $i;
        else {
            printf "%s\t", $i;
        }
    }
}' file.txt
HTH
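Another way to approach it, sketched here with split() rather than sub(): treat a lone "." as "./.", split every genotype on "/", and join the pieces with tabs. This assumes the input really is tab-delimited and that each header name from column 3 on should be followed by an empty cell so it lines up with the two new columns (split_genotypes is an invented name):

```shell
# split_genotypes: expand each "a/b" value from column 3 onward into two
# tab-separated columns; a lone "." is treated as "./.".
split_genotypes() {
  awk '
    BEGIN { FS = OFS = "\t" }
    NR == 1 {                               # header: pad each snp name with an empty cell
      out = $1 OFS $2
      for (i = 3; i <= NF; i++) out = out OFS $i (i < NF ? OFS : "")
      print out
      next
    }
    {
      out = $1 OFS $2
      for (i = 3; i <= NF; i++) {
        v = ($i == "." ? "./." : $i)        # missing genotype
        split(v, g, "/")
        out = out OFS g[1] OFS g[2]
      }
      print out
    }
  '
}

printf 'ID\tPop\tsnp1\tsnp2\tsnp3\nAD62\t1\t0/1\t1/1\t.\nAD89\t1\t.\t1/0\t1/.\n' | split_genotypes
```

Building the output line in a separate variable avoids modifying the fields in place, so the last column needs no special case.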
