Why does AWK use my arguments as input files? - linux

I'm writing an awk script to parse something for me. For convenience, I want the awk script to be directly executable in Linux. Here is my code:
#!/usr/bin/awk -f
BEGIN {
FILENAME=ARGV[1]
sub_name=ARGV[2]
run=ARGV[3]
count=0
}
{
if ($4 == "ARGV[2]" && $8 == ARGV[3])
{
print $15
count=count+1
}
}
END {
print count
}
When I invoke my awk script in Linux like this:
./my_script 001.log type1 2
awk will say:
awk: ./awk_script:23: fatal: cannot open file `type1' for reading (No such
file or directory)
I just want to use the argument "type1" as a variable in my script, not as an input file for parsing. How can I stop awk from treating it as an input file?
Thank you,

Don't use a shebang to execute the awk script as it just complicates things:
/usr/bin/awk -v sub_name="$2" -v run="$3" '
{
if ($4 == sub_name && $8 == run)
{
print $15
count=count+1
}
}
END {
print count
}
' "$1"
Note that your script could be cleaned up a bit:
/usr/bin/awk -v sub_name="$2" -v run="$3" '
($4 == sub_name) && ($8 == run) {
print $15
count++
}
END { print count+0 }
' "$1"

Delete the non-file arguments from ARGV:
delete ARGV[2]
delete ARGV[3]
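That is, if you keep the shebang approach from the question, read the values in BEGIN and then delete them so awk no longer tries to open them as files (a minimal sketch; deleted ARGV entries are simply skipped when awk looks for input files):
#!/usr/bin/awk -f
BEGIN {
sub_name = ARGV[2]
run = ARGV[3]
delete ARGV[2]
delete ARGV[3]
}
$4 == sub_name && $8 == run {
print $15
count++
}
END { print count+0 }
Invoked as ./my_script 001.log type1 2, only 001.log is left in ARGV as an input file.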

If you want to use them as variables then you have to use the -v option. The way you are trying to do it suggests that the second argument is an input/output file.
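For example (a minimal sketch using the variable names and values from the question):
awk -v sub_name="type1" -v run="2" '$4 == sub_name && $8 == run { print $15 }' 001.log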

Related

Bash Script Awk Condition

I have a problem with this code. I can't figure out what I have to write as the condition to cut my file with awk.
i=0
while [ $i -lt 10 ]; #from 1 to 9, Ap1..Ap9
do
case $i in
1) RX="54:75:D0:3F:1E:F0";;
2) RX="54:75:D0:3F:4D:00";;
3) RX="54:75:D0:3F:51:50";;
4) RX="54:75:D0:3F:53:60";;
5) RX="54:75:D0:3F:56:10";;
6) RX="54:75:D0:3F:56:E0";;
7) RX="54:75:D0:3F:5A:B0";;
8) RX="54:75:D0:3F:5F:90";;
9) RX="D0:D0:FD:68:BC:70";;
*) echo "Numero invalido!";;
esac
echo "RX = $RX" #check
awk -F, '$2 =="$RX" { print $0 }' File1 > File2[$i] #this is the line!
i=$(( $i + 1 ))
done
The echo command prints correctly, but when I use the same "$RX" as the condition in awk it doesn't work (it prints a blank page).
my File1 :
1417164082794,54:75:D0:3F:53:60,54:75:D0:3F:1E:F0,-75,2400,6
1417164082794,54:75:D0:3F:56:10,54:75:D0:3F:1E:F0,-93,2400,4
1417164082794,54:75:D0:3F:56:E0,54:75:D0:3F:1E:F0,-89,2400,4
1417164082794,54:75:D0:3F:5A:B0,54:75:D0:3F:1E:F0,-80,2400,4
1417164082794,54:75:D0:3F:53:60,54:75:D0:3F:1E:F0,-89,5000,2
Could you tell me the right expression "awk -F ..."?
Thank you very much!
To pass variables from shell to awk use -v:
awk -F, -v R="$RX" '$2 ==R { print $0 }' File1 > File2[$i]
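In the context of the loop, that line would become something like this (the output name "File2_$i" is an assumption, since File2[$i] is not a valid shell redirection target):
awk -F, -v R="$RX" '$2 == R' File1 > "File2_$i"
A condition with no action prints the matching line by default, so { print $0 } can be dropped.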
@Ricky - any time you write a loop in shell just to manipulate text, you have the wrong approach. It's just not what the shell was created to do - it's what awk was created to do, and the shell was created to invoke commands like awk.
Just use a single awk command: instead of reading File1 10 times and switching on variables for every line of the file, do it all once, something like this:
BEGIN {
split(file2s,f2s)
split("54:75:D0:3F:1E:F0\
54:75:D0:3F:4D:00\
54:75:D0:3F:51:50\
54:75:D0:3F:53:60\
54:75:D0:3F:56:10\
54:75:D0:3F:56:E0\
54:75:D0:3F:5A:B0\
54:75:D0:3F:5F:90\
D0:D0:FD:68:BC:70", rxs)
for (i in rxs) {
rx2file2s[rxs[i]] = f2s[i]
}
}
{
if ($2 in rx2file2s) {
print > rx2file2s[$2]
}
else {
print NR, $2, "Numero invalido!" | "cat>&2"
}
}
which you'd then invoke as awk -v file2s="${File2[*]}" -f script.awk File1
I say "something like" because you didn't provide any sample input (File1 contents) or expected output (File2* values and contents) so I couldn't test it but it will be very close to what you need if not exactly right.

awk if-else loop not working in ksh

I have code where a ClearCase command is piped into awk, and the if-else is not working.
The code is below:
#!/bin/ksh
export dst_region=$1
cleartool lsview -l | gawk -F":" \ '{ if ($0 ~ /Global path:/) { if($dst_region == "ABC" || $dst_region -eq "ABC") { system("echo dest_region is ABC");}
else { system("echo dest_region is not ABC"); } }; }'
But when I execute the above script, I get incorrect output:
$ ksh script.sh ABCD
dest_region is ABC
$ ksh script.sh ABC
dest_region is ABC
Could anyone please help with this issue?
It would be useful if you explained exactly what you are trying to do, but your awk script can be cleaned up a lot:
gawk -F":" -vdst_region="$1" '/Global path:/ { if (dst_region == "ABC") print "dest_region is ABC"; else print "dest_region is not ABC" }'
General points:
I have used -v to create an awk variable from the value of $1, the first argument to the script. This means that you can use it a lot more easily in the script.
awk's structure is condition { action }, so wrapping the whole one-liner in an if is unnecessary
$0 ~ /Global path:/ can be changed to simply /Global path:/
the two sides of the || looked like they were trying to both do the same thing, so I got rid of the one that doesn't work in awk. Strings are compared using ==.
system("echo ...") is completely unnecessary. Use awk's built in print
You could go one step further and remove the if-else entirely:
gawk -F":" -vdst_region="$1" '/Global path:/ { printf "dest region is%s ABC", (dst_region=="ABC"?"":" not") }'

How to compare two columns in multiple files in linux with awk

I have this code
[motaro#Cyrax ]$ awk '{print $1}' awk1.txt awk2.txt
line1a
line2a
file1a
file2a
It shows the columns from both files.
How can I find $1 (of file1) and $1 (of file2) separately?
As per the comments above, for three or more files, set the conditionals like:
FILENAME == ARGV[1]
For example:
awk 'FILENAME == ARGV[1] { print $1 } FILENAME == ARGV[2] { print $1 } FILENAME == ARGV[3] { print $1 }' file1.txt file2.txt file3.txt
Alternatively, if you have a glob of files:
Change the conditionals to:
FILENAME == "file1.txt"
For example:
awk 'FILENAME == "file1.txt" { print $1 } FILENAME == "file2.txt" { print $1 } FILENAME == "file3.txt" { print $1 }' *.txt
You may also want to read more about the variables ARGC and ARGV. Please let me know if anything requires more explanation. Cheers.
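A quick way to see what ARGC and ARGV hold:
awk 'BEGIN { for (i = 0; i < ARGC; i++) print i, ARGV[i] }' awk1.txt awk2.txt
which prints:
0 awk
1 awk1.txt
2 awk2.txt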
I am not sure exactly what you need.
Probably you need the predefined variable FILENAME:
awk '{print $1,FILENAME}' awk1.txt awk2.txt
The above command will output:
line1a awk1.txt
line2a awk1.txt
file1a awk2.txt
file2a awk2.txt
awk 'NR==FNR{a[FNR]=$0;next} {print a[FNR],$0}' file_1 file_2
found here
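NR==FNR is only true while the first file is being read, so its lines are stored in a; the second block then prints each stored line next to the corresponding line of the second file. Assuming awk1.txt and awk2.txt contain just the single columns shown earlier, this would print:
line1a file1a
line2a file2a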

I want to combine two awk scripts

I have two scripts with AWK which work perfectly.
myScript3.awk
#!/usr/bin/awk -f
BEGIN {
FS=">|</"
OFS=","
}
{
data[count++] = $2
print $2
}
END{
print data[2],data[6],data[3], FILENAME
}
The above script will scan the xml document and return the 2nd, 6th, and 3rd elements along with the file name.
for filename in *.xml
do
awk -f myscript3.awk $filename >> out.txt
done
The above script will scan the entire folder for xml files and then execute myscript on each.
I have to merge these two scripts into one.
Thanks for your help
Note about calling conventions: if you're running the script as awk -f script you do not need the shebang (#!) line at the beginning. Alternatively you can run it with the shebang as ./script if script is executable.
Answer
awk has BEGINFILE and ENDFILE; replace BEGIN/END with them and give the xml files as arguments. The following should work:
Edit
As noted by Dennis in the comments below, there's no need for BEGINFILE. Also note that this requires a fairly recent version of GNU awk to work.
myScript3.awk
BEGIN {
FS=">|</"
OFS=","
}
{
data[count++] = $2
print $2
}
ENDFILE {
print data[2],data[6],data[3], FILENAME
}
Run it like this:
awk -f myscript.awk *.xml
#!/bin/bash
AWKPROG='BEGIN {FS=">|</"; OFS=","}
{ data[count++] = $2; print $2 }
END {print data[2], data[6], data[3], FILENAME}'
for filename in *.xml; do awk "$AWKPROG" "$filename"; done >> out.txt
Warning: Untested.

GROUP BY/SUM from shell

I have a large file containing data like this:
a 23
b 8
a 22
b 1
I want to be able to get this:
a 45
b 9
I can first sort this file and then do it in Python by scanning the file once. What is a good direct command-line way of doing this?
Edit: the modern (GNU/Linux) solution, as mentioned in comments years ago ;-).
awk '{
arr[$1]+=$2
}
END {
for (key in arr) printf("%s\t%s\n", key, arr[key])
}' file \
| sort -k1,1
The originally posted solution, based on old Unix sort options:
awk '{
arr[$1]+=$2
}
END {
for (key in arr) printf("%s\t%s\n", key, arr[key])
}' file \
| sort +0n -1
I hope this helps.
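For example, with the sample data above saved in a file named file:
$ awk '{ arr[$1]+=$2 } END { for (key in arr) printf("%s\t%s\n", key, arr[key]) }' file | sort -k1,1
a 45
b 9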
No need for awk here, or even sort -- if you have Bash 4.0, you can use associative arrays:
#!/bin/bash
declare -A values
while read key value; do
values["$key"]=$(( $value + ${values[$key]:-0} ))
done
for key in "${!values[#]}"; do
printf "%s %s\n" "$key" "${values[$key]}"
done
...or, if you sort the file first (which will be more memory-efficient; GNU sort is able to do tricks to sort files larger than memory, which a naive script -- whether in awk, python or shell -- typically won't), you can do this in a way which will work in older versions (I expect the following to work through bash 2.0):
#!/bin/bash
read cur_key cur_value
while read key value; do
if [[ $key = "$cur_key" ]] ; then
cur_value=$(( cur_value + value ))
else
printf "%s %s\n" "$cur_key" "$cur_value"
cur_key="$key"
cur_value="$value"
fi
done
printf "%s %s\n" "$cur_key" "$cur_value"
This Perl one-liner seems to do the job:
perl -nle '($k, $v) = split; $s{$k} += $v; END {$, = " "; foreach $k (sort keys %s) {print $k, $s{$k}}}' inputfile
This can be easily achieved with the following single-liner:
cat /path/to/file | termsql "SELECT col0, SUM(col1) FROM tbl GROUP BY col0"
Or:
termsql -i /path/to/file "SELECT col0, SUM(col1) FROM tbl GROUP BY col0"
Here a Python package, termsql, is used, which is a wrapper around SQLite. Note that currently it's not uploaded to PyPI, and it can only be installed system-wide (setup.py is a little broken), like:
pip install --user https://github.com/tobimensch/termsql/archive/master.zip
Update
In 2020 version 1.0 was finally uploaded to PyPI, so pip install --user termsql can be used.
One way using perl:
perl -ane '
next unless @F == 2;
$h{ $F[0] } += $F[1];
END {
printf qq[%s %d\n], $_, $h{ $_ } for sort keys %h;
}
' infile
Content of infile:
a 23
b 8
a 22
b 1
Output:
a 45
b 9
With GNU awk (versions less than 4):
WHINY_USERS= awk 'END {
for (E in a)
print E, a[E]
}
{ a[$1] += $2 }' infile
With GNU awk >= 4:
awk 'END {
PROCINFO["sorted_in"] = "#ind_str_asc"
for (E in a)
print E, a[E]
}
{ a[$1] += $2 }' infile
With a sort + awk combination one could try the following, without creating an array.
sort -k1 Input_file |
awk '
prev!=$1 && prev{
print prev,(prevSum?prevSum:"N/A")
prev=prevSum=""
}
{
prev=$1
prevSum+=$2
}
END{
if(prev){
print prev,(prevSum?prevSum:"N/A")
}
}'
Explanation: a detailed explanation of the above.
sort -k1 file1 | ##Using sort command to sort Input_file by 1st field and sending output to awk as an input.
awk ' ##Starting awk program from here.
prev!=$1 && prev{ ##Checking condition prev is NOT equal to first field and prev is NOT NULL.
print prev,(prevSum?prevSum:"N/A") ##Printing prev and prevSum(if its NULL then print N/A).
prev=prevSum="" ##Nullify prev and prevSum here.
}
{
prev=$1 ##Assigning 1st field to prev here.
prevSum+=$2 ##Adding 2nd field to prevSum.
}
END{ ##Starting END block of this awk program from here.
if(prev){ ##Checking condition if prev is NOT NULL then do following.
print prev,(prevSum?prevSum:"N/A") ##Printing prev and prevSum(if its NULL then print N/A).
}
}'
