How to write awk code with a specific condition - Linux

I want to write code that operates on a column of data: I just want to find the negative numbers and make them positive by multiplying each one by itself.
example
data
10
11
-12
-13
-14
expected output
10
11
144
169
196
This is what I've tried:
awk 'int($0)<0 {$4 = int($0) + 360}
END {print $4}' data.txt
but I don't even get any output. Can anyone help me?

awk '$0 < 0 { $0 = $0 * $0 } 1' data.txt
The first pattern-action pair multiplies the value by itself when it's negative. The pattern 1 is always true, so every (possibly modified) line is printed unconditionally.
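For reference, here is this command run on the sample data above (assuming it is saved as data.txt):
$ awk '$0 < 0 { $0 = $0 * $0 } 1' data.txt
10
11
144
169
196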

Also:
awk '{print ($0 < 0 ? $0 * $0 : $0)}' input

Another option is to raise the value to the power 2 when the line contains a minus sign, and to the power 1 (leaving it unchanged) otherwise:
$ awk '{print $0 ^ (/-/ ? 2 : 1)}' file
10
11
144
169
196

You could also match only lines that are negative integers (a - followed by digits) and, in that case, multiply them by themselves:
awk '{print (/^-[0-9]+$/ ? $0 * $0 : $0)}' data.txt
Output
10
11
144
169
196

Related

How to use awk '{print $1*Number}' from the second line, or tell it to ignore NaN values?

I have a file called 'waterproofposters.jsonl' with this type of content:
Regular price
100
200
300
400
500
And I need to take 2% off each value. I have used the following code:
awk '{print $1*0.98}' waterproofposters.jsonl
And then I have the following output:
0
98
196
294
392
490
And now I'm stuck, because I need to have 'Regular price' on the first line instead of '0'.
I thought to replace the '0' with 'Regular price' using:
find . -name "waterproof.jsonl" | xargs sed -i -e 's/0/Regular price/g'
But it replaces every '0' with 'Regular price'.
To print the first line as-is:
awk '{print (NR>1 ? $0*0.98 : $0)}'
To print lines that are not a number as-is:
awk '{print ($0+0 == $0 ? $0*0.98 : $0)}'
I'm using $0 instead of $1 in the multiplication because:
1. They're the same thing in your numerical input, and
2. I aesthetically prefer using the same value across the whole script rather than different values for the numeric vs non-numeric lines, and
3. When you use a specific field, awk has to do field-splitting, so it's a bit more efficient not to reference a field when the whole record will do.
Here's both of the above working with the posted sample input:
$ awk '{print (NR>1 ? $0*0.98 : $0)}' file
Regular price
98
196
294
392
490
$ awk '{print ($0+0 == $0 ? $0*0.98 : $0)}' file
Regular price
98
196
294
392
490
and here's the difference between the two, given input that has a non-numeric value mid-file:
$ cat file
Regular price
100
200
foobar
400
500
$ awk '{print (NR>1 ? $0*0.98 : $0)}' file
Regular price
98
196
0
392
490
$ awk '{print ($0+0 == $0 ? $0*0.98 : $0)}' file
Regular price
98
196
foobar
392
490
You can certainly achieve what you need with a single awk call, but the answer to why your sed -i -e 's/0/Regular price/g' command did not work as expected is that you used 0 as the regex pattern, and 0 matches a zero character anywhere in the line.
You want to replace 0s that are the only character on a line.
Hence, you need the ^ and $ anchors to match the start and end of the line respectively:
sed -i 's/^0$/Regular price/'
If you need to replace on the first line only add the 1 address before the substitution command:
sed -i '1 s/^0$/Regular price/'
Note you do not need g, since you only expect one replacement per line and g is only needed when performing multiple replacements on a line. By default, all lines will get processed.
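Putting the two steps together on the question's data (a sketch using the awk command from the question):
$ awk '{print $1*0.98}' waterproofposters.jsonl | sed '1 s/^0$/Regular price/'
Regular price
98
196
294
392
490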
I would do it the following way using GNU AWK. Let file.txt content be
Regular price
100
200
300
400
500
then
awk 'NR==1{print}NR>=2{print $1*0.98}' file.txt
output
Regular price
98
196
294
392
490
Explanation: if it is the 1st line, just print it; if it is the 2nd or a later line, print 0.98 times the value of the 1st column.
(tested in GNU Awk 5.0.1)

Select rows in one file based on specific values in the second file (Linux)

I have two files:
One is "total.txt". It has two columns: the first column is natural numbers (indicator) ranging from 1 to 20, the second column contains random numbers.
1 321
1 423
1 2342
1 7542
2 789
2 809
2 5332
2 6762
2 8976
3 42
3 545
... ...
20 432
20 758
The other one is "index.txt". It has three columns (1: indicator, 2: low value, 3: high value):
1 400 5000
2 600 800
11 300 4000
I want to output the rows of "total.txt" whose first column matches the first column of "index.txt", and whose second column is at the same time larger than (>) the second column and smaller than (<) the third column of the matching "index.txt" row.
The expected result is as follows:
1 423
1 2342
2 809
2 5332
2 6762
11 ...
11 ...
I have tried this:
awk '$1==(awk 'print($1)' index.txt) && $2 > (awk 'print($2)' index.txt) && $1 < (awk 'print($2)' index.txt)' total.txt > result.txt
But it failed!
Can you help me with this? Thank you!
You need to read both files in the same awk script. When you read index.txt, store the other columns in an array.
awk 'FNR == NR { low[$1] = $2; high[$1] = $3; next }
$2 > low[$1] && $2 < high[$1] { print }' index.txt total.txt
FNR == NR is the common awk idiom for detecting that you're processing the first file: FNR is the line number within the current file and NR the line number across all files, so they're only equal while the first file is being read.
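The same script, expanded with comments (functionally identical):
awk '
    FNR == NR {                       # true only while reading index.txt
        low[$1]  = $2                 # low bound, keyed by indicator
        high[$1] = $3                 # high bound, keyed by indicator
        next                          # skip the second block for index.txt lines
    }
    $2 > low[$1] && $2 < high[$1]     # total.txt: print rows inside the bounds
' index.txt total.txt > result.txt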
Use join like Barmar said:
# To join on the first columns
join -11 -21 total.txt index.txt
And if the files aren't sorted in lexical order by the first column then:
join -11 -21 <(sort -k1,1 total.txt) <(sort -k1,1 index.txt)
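Note that join only pairs rows by the indicator; the range check still has to happen afterwards, e.g. with a small awk filter (a sketch; after the join, $2 is the value and $3/$4 are the low/high bounds):
join -11 -21 <(sort -k1,1 total.txt) <(sort -k1,1 index.txt) | awk '$2 > $3 && $2 < $4 {print $1, $2}'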

Difference between two files after average using shell script or awk

I have two files. Each has one column, with missing values coded as 9999 or 9000, e.g.
ifile1.txt ifile2.txt
30 20
9999 10
10 40
40 30
10 31
29 9000
9000 9999
9999 9999
31 1250
550 29
I would like to calculate the difference between the averages of the above two files without considering the missing values, i.e.
average (ifile1.txt) - average (ifile2.txt)
I tried this, but I'm not getting the result:
ave1=$(awk '!/\9999/ && !/\9000/{sum += $1; count++} END {print count ? (sum/count) : count;sum=count=0}' ifile1.txt)
ave2=$(awk '!/\9999/ && !/\9000/{sum += $1; count++} END {print count ? (sum/count) : count;sum=count=0}' ifile2.txt)
result=$(ave1-ave2)
echo $result
awk '!/9000|9999/{a[FILENAME]+=$0;b[FILENAME]++}END{for(i in a)c=c?c-a[i]/b[i]:a[i]/b[i];print c}' file1 file2
Update:
awk '!/9000|9999/{a[ARGIND]+=$0;b[ARGIND]++}END{print a[1]/b[1]-a[2]/b[2]}' file1 file2
or
awk '!/9000|9999/{a[ARGIND]+=$0;b[ARGIND]++}END{for(i=1;i<=ARGIND;i++)c=c?c-a[i]/b[i]:a[i]/b[i];print c}' file1 file2
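Note that ARGIND is GNU awk-specific. A portable variant of the same idea (a sketch) can key on FNR == NR instead:
awk '!/9000|9999/{if(FNR==NR){s1+=$0;n1++}else{s2+=$0;n2++}}END{print s1/n1-s2/n2}' file1 file2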
Your awk will compute the averages, but bash won't do floating-point arithmetic. You can always use bc, though.
$ echo "$ave1 - $ave2" | bc
-101.429
Also, for arithmetic expressions you have to use $(( ... )) rather than $( ... ); note that shell arithmetic is integer-only, which is why bc is used above.
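For example (a quick sketch with made-up integer values):
$ x=7; y=3
$ echo $(( x - y ))     # arithmetic expansion: integers only
4
$ echo "$x - $y" | bc   # bc also handles non-integer results
4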

Print $i unless $i is less than 10, using awk or otherwise

I have some data with a series of values on each line like this:
49.01024263 49.13389087 49.38177387 (more numbers...)
42.71585143 43.48711477 44.25625756 (etc.)
43.18826160 43.15332580 43.13094893
30.69076014 28.74489096 26.85725970
Eventually the numbers reach values less than 10; at that point I'd like to delete all the remaining numbers in that line.
So far I have this, but it's returning several errors.
awk '{for (i=1;i++)do{if ($i > 10.0 ) print $i ; next ; else ; exit}}' input > output
What could I be doing wrong?
Any better ways to carry out this task?
Try this line, which prints fields from left to right until it hits the first value that is not greater than 10, then stops processing that line:
awk '{for(i=1;i<=NF;i++)if($i>10)printf "%s ",$i;else break;print ""}' file
test with an example:
kent$ cat f
30 20 15 9 8
50 40 30 20 7 2000
100 200 300 400 5 444
kent$ awk '{for(i=1;i<=NF;i++)if($i>10)printf "%s ",$i;else break;print ""}' f
30 20 15
50 40 30 20
100 200 300 400
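The same one-liner, expanded with comments (functionally identical):
awk '{
    for (i = 1; i <= NF; i++)     # walk the fields left to right
        if ($i > 10)
            printf "%s ", $i      # keep fields above the threshold
        else
            break                 # stop at the first field <= 10
    print ""                      # terminate the output line
}' file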

Slice 3TB log file with sed, awk & xargs?

I need to slice several TB of log data, and would prefer the speed of the command line.
I'll split the file up into chunks before processing, but need to remove some sections.
Here's an example of the format:
uuJ oPz eeOO 109 66 8
uuJ oPz eeOO 48 0 221
uuJ oPz eeOO 9 674 3
kf iiiTti oP 88 909 19
mxmx lo uUui 2 9 771
mxmx lo uUui 577 765 27878456
The gaps between the first 3 alphanumeric strings are spaces. Everything after that is tabs. Lines are separated with \n.
I want to keep only the last line in each group.
If there's only 1 line in a group, it should be kept.
Here's the expected output:
uuJ oPz eeOO 9 674 3
kf iiiTti oP 88 909 19
mxmx lo uUui 577 765 27878456
How can I do this with sed, awk, xargs and friends, or should I just use something higher level like Python?
awk -F '\t' '
NR==1 {key=$1}
$1!=key {print line; key=$1}
{line=$0}
END {print line}
' file_in > file_out
With the field separator set to a tab, $1 is the whole space-separated prefix (e.g. uuJ oPz eeOO). The script remembers each line and prints the remembered one whenever the key changes; the END block flushes the last group.
Try this:
awk 'BEGIN{FS="\t"}
{if($1!=prevKey) {if (NR > 1) {print lastLine}; prevKey=$1} lastLine=$0}
END{print lastLine}'
It saves the last line and prints it only when it notices that the key has changed.
This might work for you:
sed ':a;$!N;/^\(\S*\s\S*\s\S*\)[^\n]*\n\1/s//\1/;ta;P;D' file
Whenever two consecutive lines share the same first three fields, the earlier of the two is deleted, so only the last line of each group survives.
