Nawk Commmad in Unix - linux

I have a nawk command as shown.
Can anyone explain what this command is suppose to do.
set sampFile = $cur_dir/${qtr}.SAMP
nawk -F "," '{OFS=","; if (($4 == "0000" || $4 == "00000000")) {print $0} }' $samp_input_file >! $sampFile

Given a CSV file pointed to by the variable $samp_input_file this command will print the lines where the 4th field is either 0000 or 00000000 and store the output in the file pointed to by $sampFile.
1,2,3,00
2,2,3,0000
3,2,3,000
4,2,3,00000000
5,2,3,0000
# Cleaner version
awk '{FS=OFS=","; if ($4 == "0000" || $4 == "00000000") print}' file
2,2,3,0000
4,2,3,00000000
5,2,3,0000

Related

Regex issue for match a column value

I wrote a script to extract a column value from a file which doesn't matches the pattern defined in col metadata file.
But it is not returning the right output. Can anyone point out the issue here? I was trying to match string with double quotes .quotes also needs to be matched.
Code:
`awk -F'|' -v n="$col_pos" -v m="$col_patt" 'NR!=1 && $n !~ "^" m "$" {
printf "%s:%s:%s\n", FILENAME, FNR, $0 > "/dev/stderr"
count++
}
END {print count}' $input_file`
run output :-
++ awk '-F|' -v n=4 -v 'm="[a-z]+#gmail.com"' 'NR!=1 && $n !~ "^" m "$" {
printf "%s:%s:%s\n", FILENAME, FNR, $0 > "/dev/stderr"
count++
}
END {print count}' /test/data/infa_shared/dev/SrcFiles/datawarehouse/poc/BNX.csv
10,22,"00AF","abc#gmail.com",197,10,1/1/2020 12:06:10.260 PM,"BNX","Hard b","50","Us",1,"25" -- this line is not expected in output as it matches the email pattern "[a-z]+#gmail.com". pattern is extracted from the below file
Input file for pattern extraction file_col_metadata
FILE_ID~col_POS~COL_START_POS~COL_END_POS~datatype~delimited_ind~col_format~columnlength
5~4~~~char~Y~"[a-z]+#gmail.com"~100
If you replace awk -F'|' ... with awk -F',' ... it will work.

How to copy a value from one column to another?

I have CSV data with two price columns. If a value exists in the $4 column I want to copy it over the $3 column of the same row. If $4 is empty then $3 should be left as is.
Neither of these work:
awk -F',' '{ if (length($4) == 0) $3=$4 }'
awk -F',' '{ if(!length($4) == 0 ) print $4 }'
This will output every line with the sample table
awk -F',' '{ if(!length($4) == 0 ) print $0 }' inputfile
This will output nothing with the sample table
awk -F',' '{ if(length($4) == 0 ) print $3 }' inputfile
I've cleaned my two input files, fixed the header row, and joined them using sed, awk, sort, and join. Now what I am left with is a CSV which looks like this:
itemnumber,available,regprice,mapprice
00061,9,19.30,
00061030,31,2.87,3.19
00062,9,15.44,
00062410,2,3.59,3.99
00064,9,15.44,
00066850,29,2.87,3.99
00066871,49,4.19,5.99
00066878,3,5.63,7.99
I need to overwrite the $3 column if the $4 column in the same row has a value. The end result would be:
itemnumber,available,regprice,mapprice
00061,9,19.30,
00061030,31,3.19,3.19
00062,9,15.44,
00062410,2,3.99,3.99
00064,9,15.44,
00066850,29,3.99,3.99
00066871,49,5.99,5.99
00066878,3,7.99,7.99
$ awk 'BEGIN{FS=OFS=","} (NR>1) && ($4!=""){$3=$4} 1' file
itemnumber,available,regprice,mapprice
00061,9,19.30,
00061030,31,3.19,3.19
00062,9,15.44,
00062410,2,3.99,3.99
00064,9,15.44,
00066850,29,3.99,3.99
00066871,49,5.99,5.99
00066878,3,7.99,7.99
Let's have a look at all the things you tried:
awk -F',' '{ if (length($4) == 0) $3=$4 }'
This states, if the length if field 4 is zero then set field 3 equal to field 4. You do not ask awk to print anything, so it will not do anything. This would have printed something:
awk -F',' '{ if (length($4) == 0) $3=$4 }{print $0}'
but with all field separators equal to a space, you should have done:
awk 'BEGIN{FS=OFS=","}{ if (length($4) == 0) $3=$4 }{print $0}'
awk -F',' '{ if(!length($4) == 0 ) print $4 }'
Here you state, if the length of field 4 equals zero is not true, print field 4.
As you mention that nothing is printed, it most likely indicates that you have hidden characters in field 4, such as a CR (See: Remove carriage return in Unix), or even just spaces. You could attempt something like
awk -F',' '{sub(/ *\r?$/,""){ if(!length($4) == 0 ) print $4 }'`**
awk -F',' '{ if(!length($4) == 0 ) print $0 }' inputfile
See 2
awk -F',' '{ if(length($4) == 0 ) print $3 }' inputfile
This confirms my suspicion of 2
My solution for your problem would be based on the suggestion of 2 and the solution of Ed Morton.
awk 'BEGIN{FS=OFS=","} {sub(/ *\r?/,"")}(NR>1) && ($4!=""){$3=$4} 1' file
Here's code that matches your results:
awk -F, -v OFS=, '
NR == 1
NR > 1 {
if ( $4 == "" )
print $1,$2,$3,$4
else
print $1,$2,$4,$4 }
' $*
I've run into trouble in the past with expressions like $3 = $4, so I just print out all of the fields.
Edit: I got shamed by Ed Morton for avoiding the $3 = $4 without troubleshooting. I gave it another shot here below:
awk -F, -v OFS=, '
NR == 1
NR > 1 {
if ( $4 != "" )
$3 = $4
print
}
' $*
The above achieves the same results.
tried on gnu awk
awk -F, -vOFS=, '/[0-9.]+/{if($4)$3=$4} {print}' file

using sed or similar to modify lines in a file

I have a huge file composed of the following:
this is text
1234.1234567
this is another text
1234.1234567
and so on
I would like to transfer it to:
this is text:1234.1234567
this is another text:1234.1234567
is this possible using sed? or any other similar command?
Thanks
If you just want to join lines using : as separator, you could use paste:
paste -d : - - < file.txt
Or using awk:
awk -v sep=: '{ if (NR % 2 == 0) { print prev sep $0 } else prev = $0 }' file.txt
If you have lines containing just alphabets and other containing floating point numbers, you can do the following:
awk '/[a-zA-Z]+/ {printf "%s:", $0}
/[0-9.]+/ {print $0}' data
data is the filename. You can redirect the output to another file.

awk compare value with a input

can a user input variable($userinput) compare with a value?
awk -F: '$1 < $userinput { printf .... }'
This comparison expression seems ok to me, but it gives an error?
Try doing this :
awk -vuserinput="$userinput" -F: '$1 < userinput {}'
A real example :
read -p "Give me an integer >>> " int
awk -v input=$int '$1 < input {print $1, "is less than", input}' <<< 1

String comparison in awk

I need to compare two strings in alphabetic order, not only equality test. I want to know is there way to do string comparison in awk?
Sure it can:
pax$ echo 'hello
goodbye' | gawk '{if ($0 == "hello") {print "HELLO"}}'
HELLO
You can also do inequality (ordered) testing as well:
pax> printf 'aaa\naab\naac\naad\n' | gawk '{if ($1 < "aac"){print}}'
aaa
aab
You can do string comparison in awk using standard boolean operators, unlike in C where you would have to use strcmp().
echo "xxx yyy" > test.txt
cat test.txt | awk '$1!=$2 { print($1 $2); }'
You can check the answer in the nawk manual
echo aaa bbb | awk '{ print ($1 >= $2) ? "true" : "false" }'

Resources