Add the elements of 2 rows based on pattern condition - linux

I want to add 2 rows based on a pattern
I have this table
1 - 513 1478 966 1
2 - 1594 2130 537 1
3 + 2171 2539 369 1
4 - 2587 3159 573 1
What Iam looking for is to add an $7 column with the first element starts with 0 and if the $2 is "-" then substract -1 from $7 else add +1 in the $7
like this:
1 - 513 1478 966 1 -1
2 - 1594 2130 537 1 -2
3 + 2171 2539 369 1 -1
4 - 2587 3159 573 1 -2 `
I wrote this
awk '$7==0,i=1;{for i in $1 do {if($2="-"){$7=$7+1}else{$7=$7-1} done print}'
The issue with my code is that if I remove the for condition turns the entire $2 in - and the entire $7 is -1

Your code is not working at all. It complaints about a couple of syntax errors. In any case, I think you are overthinking the problem. If I didn't understood you wrong, the solution is simpler:
awk 'BEGIN {v=0} {if ($2=="-") {v=v-1} else {v=v+1}; $7=v; print}'
Use a var v to keep the last value and add or substract one depending on $2 content. Once v is updated, assign it to $7 and print the entire record. In the next line you already have the last valu of the seventh column in v.

using #RavinderSingh13's trick
$ awk '{print $0 "\t" (c+=$2"1")}' file
1 - 513 1478 966 1 -1
2 - 1594 2130 537 1 -2
3 + 2171 2539 369 1 -1
4 - 2587 3159 573 1 -2

This should be as simple as the following.
awk 'BEGIN{OFS="\t\t"} {$2=$2"1";$(NF+1)=$2*$NF+prev;prev=$NF} 1' Input_file
Brief explanation:
Adding 1 to $2 2nd field value of every line.
Multiplying value of $2 with last field(which means multiplying 1 with +ve or -ve to last field) and saving its value to newly created last field.
Now adding previous line's last field value to it here as per OP's question.
Saving current last field(new created one by using $(NF+1)) to prev variable so that it could be used to add in next line's calculations.
Detailed explanation:
awk ' ##Starting awk program from here.
BEGIN{ ##Starting BEGIN section of this awk program from here.
OFS="\t\t" ##Setting value of 2 times TAB for each line here.
} ##Close BEGIN section of this code here.
{
$2=$2"1" ##Concatenating 1 to value of $2 here.
$(NF+1)=$2*$NF+prev ##Creating new last field whose value is $2*$NF and adding prev variable to it.
prev=$NF ##Setting current last field value to variable prev here.
}
1 ##Printing edited/non-edited lines here.
' Input_file ##mentioning Input_file name here.
Output will be as follows for provided samples.
1 -1 513 1478 966 1 -1
2 -1 1594 2130 537 1 -2
3 +1 2171 2539 369 1 -1
4 -1 2587 3159 573 1 -2

Related

how to write awk code with specific condition

I want to create a code that operates on a certain number of a row of data, for which I just want to count negative numbers to make them positive by multiplying by the number itself negative
example
data
10
11
-12
-13
-14
expected output
10
11
144
169
196
this is what I've been try
awk 'int($0)<0 {$4 = int($0) + 360}
END {print $4}' data.txt
but I don't even get the output, anyone can help me?
awk '$0 < 0 { $0 = $0 * $0 } 1' data.txt
The first condition multiplies the value by itself when it's negative. The condition 1 is always true, so the line is printed unconditionally.
Also:
awk '{print($0<0)?$0*$0:$0}' input
$ awk '{print $0 ^ (/-/ ? 2 : 1)}' file
10
11
144
169
196
You could also match only digits that start with - and in that case multiply them by themselves
awk '{print (/^-[0-9]+$/ ? $0 * $0 : $0)}' data.txt
Output
10
11
144
169
196

How to replace a number to another number in a specific column using awk

This is probably basic but I am completely new to command-line and using awk.
I have a file like this:
1 RQ22067-0 -9
2 RQ34365-4 1
3 RQ34616-4 1
4 RQ34720-1 0
5 RQ14799-8 0
6 RQ14754-1 0
7 RQ22101-7 0
8 RQ22073-1 0
9 RQ30201-1 0
I want the 0s to change to 1 in column3. And any occurence of 1 and 2 to change to 2 in column3. So essentially only changing numbers in column 3. But I am not changing the -9.
1 RQ22067-0 -9
2 RQ34365-4 2
3 RQ34616-4 2
4 RQ34720-1 1
5 RQ14799-8 1
6 RQ14754-1 1
7 RQ22101-7 1
8 RQ22073-1 1
9 RQ30201-1 1
I have tried using (see below) but it has not worked
>> awk '{gsub("0","1",$3)}1' PRS_with_minus9.pheno.txt > PRS_with_minus9_modified.pheno
>> awk '{gsub("1","2",$3)}1' PRS_with_minus9.pheno.txt > PRS_with_minus9_modified.pheno
Thank you.
With this code in your question:
awk '{gsub("0","1",$3)}1' PRS_with_minus9.pheno.txt > PRS_with_minus9_modified.pheno
awk '{gsub("1","2",$3)}1' PRS_with_minus9.pheno.txt > PRS_with_minus9_modified.pheno
you're running both commands on the same input file and writing their
output to the same output file so only the output of the 2nd script
will be present in the output, and
you're trying to change 0 to 1
first and THEN change 1 to 2 so the $3s that start out as 0 would
end up as 2, you need to change the order of the operations.
This is what you should be doing, using your existing code:
awk '{gsub("1","2",$3); gsub("0","1",$3)}1' PRS_with_minus9.pheno.txt > PRS_with_minus9_modified.pheno
For example:
$ awk '{gsub("1","2",$3); gsub("0","1",$3)}1' file
1 RQ22067-0 -9
2 RQ34365-4 2
3 RQ34616-4 2
4 RQ34720-1 1
5 RQ14799-8 1
6 RQ14754-1 1
7 RQ22101-7 1
8 RQ22073-1 1
9 RQ30201-1 1
The gsub() should also just be sub()s as you only want to perform each substitution once, and you don't need to enclose the numbers in quotes so you could just do:
awk '{sub(1,2,$3); sub(0,1,$3)}1' file
You can check the value of column 3 and then update the field value.
Check for 1 as the first rule because if the first check is for 0, the value will be set to 1 and the next check will set the value to 2 resulting in all 2's.
awk '
{
if($3==1) $3 = 2
if($3==0) $3 = 1
}
1' file
Output
1 RQ22067-0 -9
2 RQ34365-4 2
3 RQ34616-4 2
4 RQ34720-1 1
5 RQ14799-8 1
6 RQ14754-1 1
7 RQ22101-7 1
8 RQ22073-1 1
9 RQ30201-1 1
With your shown samples and ternary operators try following code. Simple explanation would be, checking condition if 3rd field is 1 then set it to 2 else check if its 0 then set it to 0 else keep it as it is, finally print the line.
awk '{$3=$3==1?2:($3==0?1:$3)} 1' Input_file
Generic solution: Adding a Generic solution here, where we can have 3 awk variables named: fieldNumber in which you could mention all field numbers which we want to check for. 2nd one is: existValue which we want to match(in condition) and 3rd one is: newValue new value which needs to be there after replacement.
awk -v fieldNumber="3" -v existValue="1,0" -v newValue="2,1" '
BEGIN{
num=split(fieldNumber,arr1,",")
num1=split(existValue,arr2,",")
num2=split(newValue,arr3,",")
for(i=1;i<=num1;i++){
value[arr2[i]]=arr3[i]
}
}
{
for(i=1;i<=num;i++){
if($arr1[i] in value){
$arr1[i]=value[$arr1[i]]
}
}
}
1
' Input_file
This might work for you (GNU sed):
sed -E 's/\S+/\n&\n/3;h;y/01/12/;G;s/.*\n(.*)\n.*\n(.*)\n.*\n.*/\2\1/' file
Surround 3rd column by newlines.
Make a copy.
Replace all 0's by 1's and all 1's by 2's.
Append the original.
Pattern match on newlines and replace the 3rd column in the original by the 3rd column in the amended line.
Also with awk:
awk 'NR > 1 {s=$3;sub(/1/,"2",s);sub(/0/,"1",s);$3=s} 1' file
1 RQ22067-0 -9
2 RQ34365-4 2
3 RQ34616-4 2
4 RQ34720-1 1
5 RQ14799-8 1
6 RQ14754-1 1
7 RQ22101-7 1
8 RQ22073-1 1
9 RQ30201-1 1
the substitutions are made with sub() on a copy of $3 and then the copy with the changes is assigned to $3.
When you don't like the simple
sed 's/1$/2/; s/0$/1/' file
you might want to play with
sed -E 's/(.*)([01])$/echo "\1$((\2+1))"/e' file

How can i print in a new column the total count for 1 column in KSH?

Im trying to add a new column in txt file with the total count without delete other columns, already found a post here in stackoverflow but
cut -f1 test.txt | uniq -c | join -2 2 test.txt -
is not working for me, neither the other option in post, i mean in that post didnt answer my question, can someone help?
test.txt:
Apple_1 1 300
Apple_2 1 500
Apple_2 500 1500
Apple_2 1500 2450
Apple_3 1 1250
Apple_3 1250 2000
EXPECTED OUTPUT:
Apple_1 1 300 1
Apple_2 1 500 3
Apple_2 500 1500 3
Apple_2 1500 2450 3
Apple_3 1 1250 2
Apple_3 1250 2000 2
Already tried with some regex and awk such:
awk '{count[$1]++} END {for (word in count) print word, count[word]}' file
awk '{ print $0 "\t" ++count[$1] }'
You need to go through the file twice, once to calculate the counts, and a second time to print all the lines with the count appended.
awk 'FNR == NR {count[$1]++; next}
{print $0 "\t" count[$1]}' file file
FNR == NR is true when reading the first file.

Select rows in one file based on specific values in the second file (Linux)

I have two files:
One is "total.txt". It has two columns: the first column is natural numbers (indicator) ranging from 1 to 20, the second column contains random numbers.
1 321
1 423
1 2342
1 7542
2 789
2 809
2 5332
2 6762
2 8976
3 42
3 545
... ...
20 432
20 758
The other one is "index.txt". It has three columns:(1.indicator, 2:low value, 3: high value)
1 400 5000
2 600 800
11 300 4000
I want to output the rows of "total.txt" file with first column matches with the first column of "index.txt" file. And at the same time, the second column of output results must be larger than (>) the second column of the "index.txt" and smaller than (<) the third column of the "index.txt".
The expected result is as follows:
1 423
1 2342
2 809
2 5332
2 6762
11 ...
11 ...
I have tried this:
awk '$1==(awk 'print($1)' index.txt) && $2 > (awk 'print($2)' index.txt) && $1 < (awk 'print($2)' index.txt)' total.txt > result.txt
But it failed!
Can you help me with this? Thank you!
You need to read both files in the same awk script. When you read index.txt, store the other columns in an array.
awk 'FNR == NR { low[$1] = $2; high[$1] = $3; next }
$2 > low[$1] && $2 < high[$1] { print }' index.txt total.txt
FNR == NR is the common awk idiom to detect when you're processing the first file.
Use join like Barmar said:
# To join on the first columns
join -11 -21 total.txt index.txt
And if the files aren't sorted in lexical order by the first column then:
join -11 -21 <(sort -k1,1 total.txt) <(sort -k1,1 index.txt)

Compare columns from two files and print not match

I want to compare the first 4 columns of file1 and file2. I want to print all lines from file1 + the lines from file2 that are not in file1.
File1:
2435 2 2 7 specification 9-8-3-0
57234 1 6 4 description 0-0 55211
32423 2 44 3 description 0-0 24242
File2:
2435 2 2 7 specification
7624 2 2 1 namecomplete
57234 1 6 4 description
28748 34 5 21 gateway
32423 2 44 3 description
832758 3 6 namecomplete
output:
2435 2 2 7 specification 9-8-3-0
57234 1 6 4 description 0-0 55211
32423 2 44 3 description 0-0 24242
7624 2 2 1 namecomplete
28748 34 5 21 gateway
832758 3 6 namecomplete
I don't understand how to print things that don't match.
You can do it with an awk script like this:
script.awk
FNR == NR { mem[ $1 $2 $3 $4 $5 ] = 1;
print
next
}
{ key = $1 $2 $3 $4 $5
if( ! ( key in mem) ) print
}
And run it like this: awk -f script.awk file1 file2 .
The first part memorizes the first 5 fields, prints the whole line and moves to the next line. This part is exclusively applied to lines from the first file.
The second part is only applied to lines from the second file. It checks if the line is not in mem, in that case the line was not in file1 and it is printed.

Resources