AWK: write to a new column based on an if/else on another column - Linux

I have a CSV file with columns A, B, C, D. Column D contains values on a scale of 0 to 1. I want to use AWK to write to a new column E based on the values in column D.
For example:
if value in column D <0.7, value in column E = 0.
if value in column D>=0.7, value in column E =1.
I am able to print the output of column E but am not sure how to write it to a new column. It's possible to write the output of my code to a new file and then paste it back into the old file, but I was wondering if there is a more efficient way. Here is my code:
awk -F"," 'NR>1 {if ($3>=0.7) $4= "1"; else if ($3<0.7) $4= "0"; print $4;}' test_file.csv

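The awk command below should give you the intended output. It adds E to the header line and writes to a new file, which must not be the same as the input file:
awk -F, 'NR==1 {print $0 ",E"; next} {print $0 "," ($4>=0.7 ? 1 : 0)}' yourfile.csv > new_file.csv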

You can use:
awk -F, 'NR>1 {$0 = $0 FS (($4 >= 0.7) ? 1 : 0)} 1' test_file.csv
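To keep the header aligned with the data rows and update the file itself, one option is to name the new column on the first line and go through a temporary file. This is only a minimal sketch: the sample rows, the temporary directory and the .tmp file name are made up for illustration, and it assumes the file has a header row.

```shell
# Hypothetical sample data; column D is the 4th field.
tmpdir=$(mktemp -d)
printf 'A,B,C,D\nx,1,2,0.35\ny,3,4,0.7\nz,5,6,0.92\n' > "$tmpdir/test_file.csv"

# Header line gets the new column name E.
# Data lines get 1 if D >= 0.7, otherwise 0.
result=$(awk -F, 'NR==1 {print $0 FS "E"; next}
                  {print $0 FS (($4 >= 0.7) ? 1 : 0)}' "$tmpdir/test_file.csv")
printf '%s\n' "$result"

# awk cannot safely read and overwrite the same file in one step,
# so write to a temporary file first and then move it into place.
printf '%s\n' "$result" > "$tmpdir/test_file.csv.tmp" &&
  mv "$tmpdir/test_file.csv.tmp" "$tmpdir/test_file.csv"
```

With GNU awk, the -i inplace extension may be an alternative to the temporary file, but that depends on the awk installed.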

Related

Linux SHELL script, read each row for different number of columns

I have file and for example values in it:
1 value1.1 value1.2
2 value2.1
3 value3.1 value3.2 value3.3
I need to read values from it using a shell script, but the number of columns in each row is different.
I know that if for example I want to read second column I will do it like this (for row number as input parameter)
$ awk -v key=1 '$1 == key { print $2 }' input.txt
value1.1
But as I mentioned number of columns is different for each row.
How to make this read dynamic?
For example:
if input parameter is 1 it means I should read columns from the first row so output should be
value1.1 value1.2
if input parameter is 2 it means I should read columns from the second row so output should be
value2.1
if input parameter is 3 it means I should read columns from the third row so output should be
value3.1 value3.2 value3.3
The point is that the number of columns is not static and I should read columns from that specific row until the end of the row.
Thank you
Then you can simply say:
awk -v key=1 'NR==key' input.txt
UPDATED
If you want to process the column data, there are several ways.
With awk you can say something like:
awk -v key=3 'NR==key {
for (i=1; i<=NF; i++)
printf "column %d = %s\n", i, $i
}' input.txt
which outputs:
column 1 = 3
column 2 = value3.1
column 3 = value3.2
column 4 = value3.3
In awk you can access each column value by $1, $2, $3 directly or by $i indirectly where variable i holds either of 1, 2, 3.
If you prefer going with bash, try something like:
line=$(awk -v key=3 'NR==key' input.txt)
set -- $line # split into columns
for ((i=1; i<=$#; i++)); do
echo column $i = ${!i}
done
which outputs the same results.
In bash the indirect access is a little bit complex and you need to say ${!i} where i is a variable name.
Hope this helps.
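If the first field is meant as the key rather than the line number, the match can be done on $1 and only the remaining fields printed. A minimal sketch, assuming the input layout shown in the question (recreated here in a temporary file); row_values is a hypothetical helper name, not something from the original answer:

```shell
# Recreate the sample input from the question in a temporary file.
input=$(mktemp)
printf '1 value1.1 value1.2\n2 value2.1\n3 value3.1 value3.2 value3.3\n' > "$input"

# Hypothetical helper: select the row whose first field equals the key,
# then print fields 2..NF on one line, however many there are.
row_values() {
  awk -v key="$1" '$1 == key {
    for (i = 2; i <= NF; i++)
      printf "%s%s", $i, (i < NF ? OFS : ORS)
  }' "$input"
}

row_values 1   # value1.1 value1.2
row_values 2   # value2.1
row_values 3   # value3.1 value3.2 value3.3
```

Because the loop runs up to NF, the same command works for rows of any width.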

insert values of a column into other column

I have a tab-delimited .txt file with two columns and long list of values in both columns
col1 col2
1 a
2 b
3 c
... ...
I want to convert this now to
col1
1
a
2
b
3
c
so that the values from column 2 are inserted into column 1 at the correct location.
Is there any way to do this, maybe using awk, or something else through the command line?
You can ask awk to print first column and then second column. By using print for each case, you ensure you have a new line in between them:
awk -F"\t" '{print $1; print $2}' file
Or the following if you just want to print the 1st column on the first line:
awk -F"\t" 'NR==1 {print $1; next} {print $1; print $2}' file
The second command returns the following for your given input:
col1
1
a
2
b
3
c
this should do:
awk -F"\t" -v OFS="\n" '{$1=$1}7' file
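The last one-liner relies on two awk idioms: assigning to a field ($1=$1) makes awk rebuild the record using OFS, and a bare non-zero pattern such as 7 (more commonly written 1) triggers the default print action. A small sketch on made-up data that combines this with the header handling from the earlier answer; the temporary file stands in for the real one:

```shell
# Hypothetical tab-delimited sample matching the question.
file=$(mktemp)
printf 'col1\tcol2\n1\ta\n2\tb\n3\tc\n' > "$file"

# Line 1: print only the first header field.
# Other lines: $1 = $1 rebuilds the record with OFS (a newline),
# and the trailing 1 prints the rebuilt record.
result=$(awk -F'\t' -v OFS='\n' 'NR == 1 { print $1; next } { $1 = $1 } 1' "$file")
printf '%s\n' "$result"
```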

How to use awk to get the result of computation of column1 value of the same column2 value in 2 csv files in Ubuntu?

I am using Ubuntu and have a CSV file, file1.csv, with 2 columns that looks like
a,1
b,2
c,3
...
and another file, file2.csv, with 2 columns that looks like
a,4
b,3
d,2
...
Some column 1 values appear in file1.csv but not in file2.csv and vice versa, and these values should not be in result.csv. Say the second-column value in file1.csv is x and the second-column value in file2.csv with the same first-column value is y. How can I use awk to compute (x-y)/(x+y) for the matching lines of the 2 CSV files in Ubuntu to get a result.csv like this:
a,-0.6
b,-0.2
-0.6 is computed by (1-4)/(1+4)
-0.2 is computed by (2-3)/(2+3)
What about this?
$ awk 'BEGIN{FS=OFS=","} FNR==NR {a[$1]=$2; next} {if ($1 in a) print $1,(a[$1]-$2)/(a[$1]+$2)}' f1 f2
a,-0.6
b,-0.2
Explanation
BEGIN{FS=OFS=","} set input and output field separators as comma.
FNR==NR {a[$1]=$2; next} when processing the first file, store the values in the array a[] as a[first col]=second col.
{if ($1 in a) print $1,(a[$1]-$2)/(a[$1]+$2)} when looping through second file, on each line do: check if the first col is stored in the a[] array; if so, print (x-y)/(x+y), being x=stored value and y=current second column.
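One case the command above does not cover is x+y being zero, which makes awk abort with a division-by-zero error. A hedged variant that skips such keys is sketched below; the sample files (including the extra key e, whose values sum to zero) are made up for illustration:

```shell
# Hypothetical sample files; key e has x + y = 0 and is skipped.
dir=$(mktemp -d)
printf 'a,1\nb,2\nc,3\ne,5\n'  > "$dir/file1.csv"
printf 'a,4\nb,3\nd,2\ne,-5\n' > "$dir/file2.csv"

result=$(awk 'BEGIN { FS = OFS = "," }
  FNR == NR { a[$1] = $2; next }        # first file: remember x per key
  ($1 in a) && (a[$1] + $2 != 0) {      # second file: key in both, x+y nonzero
    print $1, (a[$1] - $2) / (a[$1] + $2)
  }' "$dir/file1.csv" "$dir/file2.csv")
printf '%s\n' "$result"
```

Redirecting the output with > result.csv would produce the file asked for in the question.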

comparing two files with different columns

I have two files (count.txt, count1.txt). I need to do the following:
1. Get the lines from count.txt and count1.txt where the 1st column is equal.
2. If they are equal, compare the 2nd columns, like ((2nd column value in count.txt + 5) <= 2nd column value in count1.txt).
count.txt
order1,150
order2,165
order3,125
count1.txt
order1,155
order2,170
order3,125
order4,123
and I want the output like below,
Output.txt
order1,155
order2,170
I have used the nawk command below for the 1st point, but am not able to complete the 2nd point. Please suggest how to achieve this.
nawk -F"," 'NR==FNR {a[$1];next} ($1 in a)' count.txt count1.txt
nawk -F"," 'NR==FNR {a[$1]=$2;next} ($1 in a) && (a[$1]+5)<=$2' count.txt count1.txt
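The second command stores the count.txt value for each order in the array a while reading the first file, then prints a count1.txt line only when its order exists in a and its value is at least 5 higher. A minimal sketch that recreates the sample files in a temporary directory; it uses awk rather than nawk on the assumption that either is available:

```shell
# Recreate the sample files from the question.
dir=$(mktemp -d)
printf 'order1,150\norder2,165\norder3,125\n' > "$dir/count.txt"
printf 'order1,155\norder2,170\norder3,125\norder4,123\n' > "$dir/count1.txt"

result=$(awk -F, 'NR == FNR { a[$1] = $2; next }   # first file: value per order
  ($1 in a) && (a[$1] + 5 <= $2)                   # second file: print matches
  ' "$dir/count.txt" "$dir/count1.txt")
printf '%s\n' "$result"
```

order3 is dropped because 125 + 5 is greater than 125, and order4 is dropped because it does not appear in count.txt.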

how to conditionally replace values in columns with value of specific column in the same line by Unix and awk commands

I want to conditionally replace values in columns with value of specific column in the same line in one file, by Unix and awk commands.
For example, I have myfile.txt (3 lines, 5 columns, tab-delimited):
1 A . C .
2 C T . T
3 T C C .
There are "." in columns 3 to 5. I want to replace those "." in columns 3 - 5 with the value in column 2 on the same line.
Could you please show me any directions on that?
This seems to do what you're asking for:
% awk 'BEGIN {
FS = OFS = "\t"
}
{
for (column = 3; column <= NF; ++column) {
if ($column == ".") {
$column = $2
}
}
print
}
' test.tsv
1 A A C A
2 C T C T
3 T C C T
You've asked a few questions (and accepted no answers!) on awk now. May
I humbly suggest a tutorial?
awk 'BEGIN{FS=OFS="\t"} {for(i=3;i<=5;i++) if($i==".") $i=$2; print}' myfile.txt
