command substitution in bash with awk - linux

Why does this work:
This
var=hello
myvar=`echo hello hi | awk "{ if (\\\$1 == \"$var\" ) print \\\$2; }"`
echo $myvar
gives
hi
But this does not?
This
var=hello
echo hello hi | awk "{ if (\\\$1 == \"$var\" ) print \\\$2; }"
gives
awk: cmd. line:1: Unexpected token
I am using
GNU bash, version 4.1.5(1)-release (i486-pc-linux-gnu)
on
Linux 2.6.32-34-generic-pae #77-Ubuntu SMP Tue Sep 13 21:16:18 UTC 2011 i686 GNU/Linux

The correct way to pass shell variables into an AWK program is to use AWK's variable passing feature instead of trying to embed the shell variable. And by using single quotes, you won't have to do a bunch of unnecessary escaping.
echo "hello hi" | awk -v var="$var" '{ if ($1 == var ) print $2; }'
Also, you should use $() instead of backticks.
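Putting both suggestions together, a minimal sketch might look like this (the variable name result is only illustrative):

```shell
#!/bin/bash
# Pass the shell variable to awk with -v and capture the output with $(...).
var=hello

# Single quotes keep the shell away from the awk program, so $1 and $2
# reach awk untouched and no backslashes are needed.
result=$(echo "hello hi" | awk -v var="$var" '$1 == var { print $2 }')

echo "$result"   # prints: hi
```

One caveat: awk interprets backslash escape sequences in a value assigned with -v, so values that contain backslashes may need different handling.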

If your awk is like mine, it will tell you where it fails:
var=hello
echo hello hi | awk "{ if (\\\$1 == \"$var\" ) print \\\$2; }"
awk: syntax error at source line 1
context is
{ if >>> (\ <<< $1 == "hello" ) print \$2; }
awk: illegal statement at source line 1
Furthermore, if you replace awk with echo, you can see clearly why it fails:
echo hello hi | echo "{ if (\\\$1 == \"$var\" ) print \\\$2; }"
{ if (\$1 == "hello" ) print \$2; }
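There are extra backslashes in the resulting command. Inside backquotes the shell strips one level of backslashes before running the command, so the triple backslashes were needed there; without the backquotes they reach awk literally.
So the solution is to remove a pair of backslashes: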
echo hello hi | awk "{ if (\$1 == \"$var\" ) print \$2; }"
hi
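To see the two quoting levels side by side, here is a small sketch that runs the same awk program with and without backquotes. The names with_backquotes and without_backquotes are made up for the comparison.

```shell
#!/bin/bash
var=hello

# Inside backquotes the shell first removes one backslash in front of a
# dollar sign, a backquote or a backslash, so three backslashes plus a
# dollar sign shrink to one backslash plus a dollar sign.
with_backquotes=`echo hello hi | awk "{ if (\\\$1 == \"$var\" ) print \\\$2; }"`

# With $(...) there is no extra pass, so one backslash is enough.
without_backquotes=$(echo hello hi | awk "{ if (\$1 == \"$var\" ) print \$2; }")

echo "$with_backquotes $without_backquotes"   # prints: hi hi
```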

Related

Shell Bash Script

I am making a bash script. I have to get 3 variables
VAR1=$(cat /path to my file/ | grep "string1" | awk '{ print $2 }'
VAR2=$(cat /path to my file/ | grep "string2" | awk '{ print $2 }'
VAR3=$(cat /path to my file/ | grep "string3" | awk '{ print $4 }'
My problem is that if I write
echo $VAR1
echo $VAR2
echo $VAR3
I can see values correctly
But when I try to write them in one line like this
echo "VAR1: $VAR1 VAR2: $VAR2 VAR3: $VAR3"
The value of $VAR3 is written at the beginning of the output, overwriting the values of $VAR1 and $VAR2.
I hope my explanation is clear. If anything is unclear, please let me know.
Thanks and regards.
Rambert
It seems to me that $VAR3 contains \r (a carriage return), which makes the terminal move the cursor to the beginning of the line. Strip the \r from the values (for example with tr -d '\r'), and use printf to print them:
printf "VAR1: %s VAR2: %s VAR3: %s\n" "$VAR1" "$VAR2" "$VAR3"
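Note that printf with %s still writes a carriage return that is part of the value, so removing it is the more reliable cure. Here is a sketch of one way to check for and strip a trailing \r, using a made-up value in place of the real file contents:

```shell
#!/bin/bash
# Hypothetical value with a trailing carriage return, as produced by
# reading a file with DOS (\r\n) line endings.
cr=$(printf '\r')
var3="value3$cr"

# od -c makes the otherwise invisible \r visible.
printf '%s' "$var3" | od -c

# Remove one trailing carriage return, if present.
var3=${var3%"$cr"}

printf 'VAR3: %s\n' "$var3"   # prints: VAR3: value3
```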
Also note that the way you extract the values is highly inefficient and can be reduced to one call to awk:
read -r var1 var2 var3 _ < <(awk '/string1/ { a=$2 }
/string2/ { b=$2 }
/string3/ { c=$4 }
END { print(a, b, c) }' /path/to/file)
printf "VAR1: %s VAR2: %s VAR3: %s\n" "$var1" "$var2" "$var3"
A nitpick: by convention, uppercase variable names are used for environment and shell variables, so I changed them all to lowercase.
<(...) is a process substitution and will make ... write to a "file" and return the file name:
$ echo <(ls)
/dev/fd/63
And command < file is a redirection that makes the standard input of command come from the file named file.
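A small sketch of why it matters that read is fed through < <(...) rather than a pipe: in bash each part of a pipeline normally runs in a subshell, so variables set by read at the end of a pipe are lost. This assumes bash with its default settings (the lastpipe option off); the variable names are illustrative.

```shell
#!/bin/bash
# With a pipe, read runs in a subshell and its variables vanish afterwards.
unset a b
echo "one two" | read -r a b
piped="${a:-unset} ${b:-unset}"

# With process substitution, read runs in the current shell.
read -r a b < <(echo "one two")
substituted="$a $b"

echo "pipe: $piped"                 # prints: pipe: unset unset
echo "substitution: $substituted"   # prints: substitution: one two
```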
You could write :
cat /path to my file/ | grep "string1" | awk '{ print $2 }'
as
awk '/string1/{print $2}' /path/to/file
In other words you could do with awk alone what you intended to do with cat, grep & awk
So finally get :
VAR1=$(awk '/string1/{print $2}' /path/to/file) #mind the closing ')'
Regarding the issue you face, it looks like you have carriage returns (\r) in your variables. In bash, echo does not interpret backslash escape sequences without the -e option, but a literal carriage return character in the data is still sent to the terminal, which moves the cursor to the beginning of the line, as andlrc pointed out in his answer. Using printf, as he suggested, is worth a try.
Notes :
Another subtle point to keep in mind is to avoid uppercase variable names like VAR1 in user scripts, since they can clash with environment and shell variables. Replace it with var1 or similar.
When assigning a value to a variable, spaces are not allowed around =, so
VAR1="Note there are no spaces around = sign"
is the right usage

Linux - parsing data, what language to use

I am looking to parse data out of a 'column' based format. I am running into issues where I feel I am 'hacking' bash/awk commands to pull the strings and numbers. If the numbers/text come in different formats then the script might fail unexpectedly and I will have errors.
Data:
RSSI (dBm): -86 Tx Power: 0
RSRP (dBm): -114 TAC: 4r5t (12341)
RSRQ (dB): -10 Cell ID: efefwg (4261431)
SINR (dB): 2.2
My method:
Using bash and awk
#!/bin/bash
DATA_OUTPUT=$(get_data)
RSSI=$(echo "${DATA_OUTPUT}" | awk '$1 == "RSSI" {print $3}')
RSRP=$(echo "${DATA_OUTPUT}" | awk '$1 == "RSRP" {print $3}')
RSRQ=$(echo "${DATA_OUTPUT}" | awk '$1 == "RSRQ" {print $3}')
SINR=$(echo "${DATA_OUTPUT}" | awk '$1 == "SINR" {print $3}')
TX_POWER=$(echo "${DATA_OUTPUT}" | awk '$4 == "Tx" {print $6}')
echo "$SINR"
echo ">$SINR<"
However, the output of the above comes out very strange:
2.2 # that's fine!
<2.2 # what??? expecting >2.2<
Little things like this make me question using awk and bash to parse the data. Should I use C++ or some other language? Or is there a better way of doing this?
Thank you
This should be your starting point (the match() can be simplified or removed if your input data is tab-separated or fixed width fields):
$ cat file
RSSI (dBm): -86 Tx Power: 0
RSRP (dBm): -114 TAC: 4r5t (12341)
RSRQ (dB): -10 Cell ID: efefwg (4261431)
SINR (dB): 2.2
.
$ cat tst.awk
{
tail = $0
while ( match(tail,/[^:]+:[[:space:]]+[^[:space:]]+[[:space:]]*([^[:space:]]*$)?/) )
{
nvPair = substr(tail,RSTART,RLENGTH)
sub(/ \([^)]+\):/,":",nvPair) # remove (dB) or (dBm)
sub(/:[[:space:]]+/,":",nvPair) # remove spaces after :
sub(/[[:space:]]+$/,"",nvPair) # remove trailing spaces
split(nvPair,tmp,/:/)
name2value[tmp[1]] = tmp[2] # name2value["RSSI"] = "-86"
tail = substr(tail,RSTART+RLENGTH)
}
}
END {
for (name in name2value) {
value = name2value[name]
printf "%s=\"%s\"\n", name, value
}
}
.
$ awk -f tst.awk file
Tx Power="0"
RSSI="-86"
TAC="4r5t (12341)"
Cell ID="efefwg (4261431)"
RSRP="-114"
RSRQ="-10"
SINR="2.2"
Hopefully it's clear that in the above script after the match() loop you can simply say things like print name2value["Tx Power"] to print the value of that key phrase.
If your data was created in DOS, run dos2unix or tr -d '\r' on it first to remove the carriage returns (shown as ^M by some tools).
Your data contains DOS-style \r\n line endings. When you do this
echo ">$SINR<"
The actual output is
>2.2\r<
The carriage return sends the cursor back to the start of the line.
You can do this:
DATA_OUTPUT=$(get_data | sed 's/\r$//')
But instead of parsing the output over and over, I'd rewrite like this:
while read -ra fields; do
case ${fields[0]} in
RSSI) rssi=${fields[2]};;
RSRP) rsrp=${fields[2]};;
RSRQ) rsrq=${fields[2]};;
SINR) sinr=${fields[2]};;
esac
if [[ ${fields[3]} == "Tx" ]]; then tx_power=${fields[5]}; fi
done < <(get_data | sed 's/\r$//' )
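To try the loop without the real device, here is a self-contained sketch. get_data is a stand-in function that prints the sample data with DOS line endings (the real command is not shown in the question), and tr -d '\r' is used instead of sed to remove the carriage returns.

```shell
#!/bin/bash
# Stand-in for the real get_data command; emits \r\n line endings.
get_data() {
    printf 'RSSI (dBm): -86 Tx Power: 0\r\n'
    printf 'RSRP (dBm): -114 TAC: 4r5t (12341)\r\n'
    printf 'RSRQ (dB): -10 Cell ID: efefwg (4261431)\r\n'
    printf 'SINR (dB): 2.2\r\n'
}

# Split each line into whitespace-separated fields and pick values by label.
while read -ra fields; do
    case ${fields[0]} in
        RSSI) rssi=${fields[2]} ;;
        RSRP) rsrp=${fields[2]} ;;
        RSRQ) rsrq=${fields[2]} ;;
        SINR) sinr=${fields[2]} ;;
    esac
    if [[ ${fields[3]} == "Tx" ]]; then tx_power=${fields[5]}; fi
done < <(get_data | tr -d '\r')

echo ">$sinr<"   # prints: >2.2<
echo "rssi=$rssi rsrp=$rsrp rsrq=$rsrq tx_power=$tx_power"
```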

error bash extracting second column of a matched pattern

I am trying to search for a pattern and extract just the second column from the results. The command works well on the command line but not inside a bash script.
#!/bin/bash
set a = grep 'NM_033356' test.txt | awk '{ print $2 }'
echo $a
It doesn't print any output at all.
Input
NM_033356 2
NM_033356 5
NM_033356 7
Your code:
#!/bin/bash
set a = grep 'NM_033356' test.txt | awk '{ print $2 }'
echo $a
Change it to:
#!/bin/bash
a="$(awk '$1=="NM_033356"{ print $2 }' test.txt)"
echo "$a"
Code changes are based on your sample input.
.......
a="$(awk '/NM_033356/ { print $2 }' test.txt)"
Try this:
a=`grep 'NM_033356' test.txt | awk '{ print $2 }'`
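For context on why the original line prints nothing: in bash, set a = ... does not assign a variable (that is csh syntax). It replaces the positional parameters with its arguments, and in the question only the (empty) output of set is piped into awk. Here is a sketch using a temporary copy of the sample input; the extra NM_000001 line and the helper variable names are made up for the demonstration.

```shell
#!/bin/bash
file=$(mktemp)
printf 'NM_033356 2\nNM_033356 5\nNM_033356 7\nNM_000001 9\n' > "$file"

# In bash, set with arguments replaces the positional parameters;
# it does not create a variable named a.
unset a
set a = something
first_param=$1            # now holds the literal word "a"
still_unset=${a-unset}    # expands to "unset" because a was never assigned

# Assignment: no spaces around =, and $(...) to capture the output.
a=$(awk '$1 == "NM_033356" { print $2 }' "$file")
echo "$a"

rm -f "$file"
```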

Why a ~ in double quotes within awk caused syntax error

I was going to write a script for my device.
Here is my initial code:
dev_name=random_string
major=`awk "\$2 ~ /^${dev_name}\$/ { print \$1 }" /proc/devices`
Then an error happened:
awk: ~ /^random_string$/ { print }
awk: ^ syntax error
Meanwhile, I did an experiment:
var1=random_string
echo "\$ /^$var1\$/ \$"
The output was
$ /^random_string$/ $
It seems the syntax should be correct, can anybody give me an answer?
You need additional escapes inside backticks. Try using major=$( .. ) instead.
In this case you can also bypass the need for escaping, using the -v option of awk, like this
major=`awk -v dev="$dev_name" '$2 ~ dev { print $1 }' /proc/devices`
Your expression inside backticks goes through two rounds of unescaping.
awk "\$2 ~ /^${dev_name}\$/ { print \$1 }" /proc/devices
...first has one level of backslashes removed when bash reads the backtick substitution, giving...
awk "$2 ~ /^${dev_name}$/ { print $1 }" /proc/devices
...which the shell started by the backticks then expands inside the double quotes, so awk receives the program...
~ /^random_string$/ { print }
...since the shell's positional parameters $1 and $2 are empty.
What you want to do is to escape $1 and $2 twice;
awk "\\\$2 ~ /^${dev_name}\$/ { print \\\$1 }" /proc/devices
...to make the program that awk finally receives...
$2 ~ /^random_string$/ { print $1 }
That's how I solve the problem.
Let's scrutinize this line:
dev_name='loop' ; major=` awk "\\\$2 ~ /^\${dev_name}\\\$/ { print \\\$1 }" /proc/devices` ; echo $major
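Bash processes the backslashes twice: once when it reads the backtick substitution, and again when the inner command's double quotes are expanded. After both passes awk receives $2 ~ /^loop$/ { print $1 }, so this prints the expected result.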
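For comparison, here is a sketch of the two alternatives suggested earlier in this thread, run against a small made-up stand-in for /proc/devices so it works anywhere. Note that $2 ~ dev is an unanchored regex match, so this sketch uses $2 == dev for an exact comparison; that substitution is an assumption of the sketch, not part of the original answer.

```shell
#!/bin/bash
# Made-up excerpt in the "major name" layout of /proc/devices.
devices=$(mktemp)
printf '  7 loop\n  8 sd\n 99 loopback\n' > "$devices"

dev_name=loop

# $(...) needs only one level of escaping inside the double quotes.
major_regex=$(awk "\$2 ~ /^${dev_name}\$/ { print \$1 }" "$devices")

# -v plus single quotes needs no escaping at all.
major_exact=$(awk -v dev="$dev_name" '$2 == dev { print $1 }' "$devices")

echo "$major_regex $major_exact"   # prints: 7 7
rm -f "$devices"
```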

awk save command ouput to variable

I need to execute a command per line of some file. For example:
file1.txt 100 4
file2.txt 19 8
So my awk script need to execute something like
command $1 $2 $3
and save the output of command $1 $2 $3, so system() will not work and neither will getline. (I can't pipe the output if I do something like this.)
The restriction is to use only awk. (I already have a solution with a bash script plus awk, but I want an awk-only one, just to learn more about this.)
What's wrong with using getline?
$ ./test.awk test.txt
# ls -F | grep test
test.awk*
test.txt
# cat test.txt | nl
1 ls -F | grep test
2 cat test.txt | nl
3 cat test.awk
# cat test.awk
#!/usr/bin/awk -f
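{
# POSIX allows cmd | getline var to update NR, so remember the record number first
n = NR
cmd[n] = $0
# compare with > 0: getline returns -1 on error, which would otherwise loop forever
while (($0 | getline line) > 0) output[n] = output[n] line RS
close($0) # release the pipe once the command's output has been read
}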
END {
for (i in cmd) print "# " cmd[i] ORS output[i]
}
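Applied to the input from the question, a minimal sketch might look like this. expr stands in for the unspecified command, and the variable names are illustrative.

```shell
#!/bin/bash
out=$(printf 'file1.txt 100 4\nfile2.txt 19 8\n' | awk '{
    # Build the command line for this record; expr is only a placeholder.
    cmd = "expr " $2 " + " $3
    result = ""
    # Read the first line of the command output into an awk variable.
    if ((cmd | getline result) <= 0)
        result = "(no output)"
    close(cmd)   # release the pipe so many records do not exhaust file descriptors
    print $1, result
}')
printf '%s\n' "$out"
```

With the two sample records this prints file1.txt 104 and file2.txt 27. Only the first output line of each command is kept; a while loop around getline would collect the rest.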
Awk's system() function passes the string to /bin/sh, so you can use redirect operators, like ">file.out" if you want.
awk '{system("command " $1 " " $2 " " $3 ">" $1 ".out");}'
Edit: ok, by save, you mean into an awk variable. ephemient is on the right track, then. That's what awk's getline does, like backticks or $(cmd) in shell/perl. In fact, google for awk backticks found this:
http://www.omnigroup.com/mailman/archive/macosx-admin/2006-May/054665.html
You say you can't use getline because then you couldn't pipe. But you can work around that with tee and file-descriptor tricks. This works if /bin/sh is bash:
{ "set +o posix; command " $1 " " $2 " " $3 " | tee >(grep foo)" | getline var; print toupper(var); } # bash-only, and broken.
set +o posix is necessary because awk runs bash as sh, which makes it go into posix mode after reading its startup files. Hmm, I'm not having any luck getting that to work, and it requires bash anyway.
Ok, this works:
$ touch foo bar
$ echo "foo bar" |
awk '{ "{ ls " $1 " " $2 " " $3 " | tee /dev/fd/10 | grep foo > /dev/tty; } 10>&1" | getline var; print toupper(var); }'
foo
BAR
