shell scripts variable passed to awk and double quotes needed to preserve - linux

I have some logs called ts.log that look like
[957670][DEBUG:2016-11-30 16:49:17,968:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{pstm-9805256} Parameters: []
[957670][DEBUG:2016-11-30 16:49:17,968:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{pstm-9805256} Types: []
[957670][DEBUG:2016-11-30 16:50:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} ResultSet
[957670][DEBUG:2016-11-30 16:51:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} Header: [LAST_INSERT_ID()]
[957670][DEBUG:2016-11-30 16:52:17,969:com.ibatis.common.logging.log4j.Log4jImpl.debug(Log4jImpl.java:26)]{rset-9805257} Result: [731747]
[065417][DEBUG:2016-11-30 16:53:17,986:sdk.protocol.process.InitProcessor.process(InitProcessor.java:61)]query String=requestid=10547
I have a script that contains something like:
#!/bin/bash
begin=$1
cat ts.log | awk -F '[ ,]' '{if($2 ~/^[0-2][0-9]:[0-6][0-9]:[0-6][0-9]&& $2>="16:50:17"){print $0}}'
Instead of hard-coding the time (16:50:17), I want to pass the shell's $1 to awk, so that all I need to run is ./script hh:mm:ss. The script would look like:
#!/bin/bash
begin=$1
cat ts.log | awk -v var=$begin -F '[ ,]' '{if($2 ~/^[0-2][0-9]:[0-6][0-9]:[0-6][0-9]&& $2>="var"){print $0}}'
But the double quotes need to be there OR it won't work.
I tried 2>"\""var"\""
but it doesn't work.
So is there a way to keep the double quotes there?
The preferred result is to run ./script with a time as $1 and have it extract the log from that time onward.

There are many ways to do what you want.
Option 1: Using double quotes enclosing awk program
#!/bin/bash
begin=$1
awk -F '[ ,]' "\$2 ~ /^..:..:../ && \$2 >= \"${begin}\" " ts.log
Inside a double-quoted string, bash performs variable substitution, so $begin or ${begin} is replaced with the value of the shell variable (whatever the user passed in).
Undesired effect: awk field references that start with $ must be escaped with '\', or bash will try to expand them before it executes awk. Also, because the value of $begin becomes part of the awk program text, unexpected input can change the program itself.
To get a double quote character (") inside a bash double-quoted string, escape it with '\'; in bash, " \"16:50\" " yields "16:50". (This won't work in single-quoted strings, where bash performs neither variable expansion nor escape processing.)
To see which substitutions bash makes when it executes the script, run it with the debug option (it's very enlightening):
$ bash -x yourscript.sh 16:50
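If you want to see the result of the quoting without running awk at all, a minimal sketch like the following may help. It builds the same program text as Option 1 in a shell variable and prints it; the variable name prog and the sample value 16:50 are assumptions made for this illustration only.

```shell
#!/bin/bash
# Assumed sample value; in the real script this would come from $1.
begin="16:50"

# Same quoting as Option 1: \$ keeps $2 for awk, \" yields a literal quote.
# "prog" is a hypothetical name used only for this illustration.
prog="\$2 ~ /^..:..:../ && \$2 >= \"${begin}\""

# Show the program text exactly as awk would receive it.
printf '%s\n' "$prog"
```

The printed line should be the awk program with the shell variable already substituted and the double quotes intact, which is what the question was trying to achieve.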
Option 2: Using awk variables
#!/bin/bash
begin=$1
awk -F '[ ,]' -v begin="$begin" '$2 ~ /^..:..:../ && $2 >= begin' ts.log
Here an awk variable begin is created with the option -v varname=value. The shell variable is double-quoted so that the shell passes its value to awk as a single word.
A variable created this way can be used anywhere in the awk program like any other awk variable (no double quotes or $ needed).
There are other options, but I think you can work with these two.
In both options I've changed your script a bit:
It doesn't need cat to feed data to awk, because awk can run your program on one or more data files given as arguments after the program.
Your awk program doesn't need print at all (as @fedorqui said), because a basic awk program is made of pattern {action} pairs, where the pattern is the same condition you used in the if statement, and the default action is {print $0}.
I've also changed the time pattern, primarily to make the script clearer; in a log file there's almost no chance that some other 8-character string has colons in those two positions (in a regexp, . matches any character).
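Putting Option 2 together, here is a small self-contained sketch you can run to check the behaviour. The temporary file and its three shortened log lines are made-up sample data, and filter_from is a hypothetical helper name, not something from your script.

```shell
#!/bin/bash
# Made-up sample data: a trimmed-down stand-in for ts.log.
log=$(mktemp)
cat > "$log" <<'EOF'
[957670][DEBUG:2016-11-30 16:49:17,968:Log4jImpl.debug]{pstm-9805256} Types: []
[957670][DEBUG:2016-11-30 16:50:17,969:Log4jImpl.debug]{rset-9805257} ResultSet
[957670][DEBUG:2016-11-30 16:51:17,969:Log4jImpl.debug]{rset-9805257} Header
EOF

# Hypothetical helper: $1 is the start time (hh:mm:ss), $2 is the log file.
# With -F '[ ,]' the time is the second field, compared as a string.
filter_from() {
    awk -F '[ ,]' -v begin="$1" '$2 ~ /^..:..:../ && $2 >= begin' "$2"
}

result=$(filter_from "16:50:17" "$log")
printf '%s\n' "$result"
rm -f "$log"
```

With this sample data only the 16:50:17 and 16:51:17 lines should come out. The comparison is a plain string comparison, which works here because the times are zero-padded and all in the same hh:mm:ss format.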

Related

numeric variable in egrep regular expression bash script

So I am trying to make a script that contains egrep and accepts a numeric variable
#!/bin/bash
var=$1
list="egrep "^.{$var}$ /usr/share/dict/words"
cat list
For example, if var is 5, I would like this script to print out every line with 5 characters. For some reason the script does not do that. Help would be greatly appreciated!
Your script doesn't work because there are several problems with these lines:
list="egrep "^.{$var}$ /usr/share/dict/words"
cat list
The first line isn't complete; it's missing a closing quote.
Even if you fixed that, you would be assigning a literal string to list, not the output of a command.
The regular expression and the filename should be separate arguments.
cat doesn't print a variable's content; echo does that.
So:
#!/bin/bash
var="$1"
list="$(egrep '^.{'"$var"'}$' /usr/share/dict/words)"
echo "$list"
should work.
Or even better, you can use just an awk command:
awk 'length==5' /usr/share/dict/words
with $1 or any other variable:
awk -v n="$1" 'length==n' /usr/share/dict/words
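As a quick way to try this without depending on /usr/share/dict/words, here is a minimal sketch that runs the same idea against a small made-up word list. It uses grep -E, the POSIX spelling of egrep; the temporary file, its contents and the variable names are assumptions for this illustration.

```shell
#!/bin/bash
# Made-up word list standing in for /usr/share/dict/words.
words=$(mktemp)
printf '%s\n' apple pear banana grape fig > "$words"

n=5   # in the real script this would be "$1"

# Inside double quotes $n is expanded by the shell, while \$ stays a
# literal end-of-line anchor for grep.
matches=$(grep -E "^.{$n}\$" "$words")
printf '%s\n' "$matches"
rm -f "$words"
```

Only the five-letter words (apple and grape) should be printed.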

Using awk to substitute variables into result of a ls command [duplicate]

I am trying to display which apps are deployed on each host. Using a bash script I had some success, except that I want to replace the "hostname" part below with the actual host name stored in the ${i} variable. I can't just substitute it, because I cannot put one curly brace inside another. Even if I could, I would still have trouble, because ${i} would be replaced by the result of ls. How do I fix this?
hosts=(usa01 london2)
for i in ${hosts[@]}; do
    echo ---${i}---
    ssh ttoleung@${i} ls /apps | awk '{ printf("%s:%s\n", "hostname", $0) }'
done
Current output, based on code fragment above:
---usa01---
hostname:E2.gui
hostname:E1.server
---london2---
hostname:E1.gui
Desired output:
---usa01---
usa01:E2.gui
usa01:E1.server
---london2---
london2:E1.gui
Replace:
awk '{ printf("%s:%s\n", "hostname", $0) }'
With:
awk -v h="$i" '{ printf("%s:%s\n", h, $0) }'
-v h="$i" tells awk to create an awk variable h and assign to it the value of the shell variable $i.
Aside: we used h="$i" rather than h=$i because it is good practice to put shell variables inside double-quotes unless you want the shell to perform word-splitting and pathname expansion.
As an introductory note, it is not safe to expand an array without double quotes unless you have a specific reason to, because elements that contain spaces would be split into separate words.
So change
for i in ${hosts[@]};
to
for i in "${hosts[@]}"; # Note the quoted array
Now, coming to your problem, you can pass bash variables to awk using its -v parameter. So change
ssh ttoleung@${i} ls /apps | awk '{ printf("%s:%s\n", "hostname", $0) }'
to
ssh ttoleung@${i} ls /apps | awk -v hname="${i}" '{ printf("%s:%s\n", hname, $0) }'
Here we pass the shell parameter ${i} to the awk variable hname.
Side note: don't parse ls output, for the reasons mentioned [here]. In your case, though, it doesn't make much of a difference.
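To try the -v approach without real hosts, the following sketch replaces the ssh call with a local function. list_apps is a hypothetical stand-in for ssh ttoleung@"$host" ls /apps, its output is made up to match the listing in the question, and the host list is written inline rather than as an array.

```shell
#!/bin/bash
# Hypothetical stand-in for: ssh ttoleung@"$host" ls /apps
list_apps() {
    case "$1" in
        usa01)   printf '%s\n' E2.gui E1.server ;;
        london2) printf '%s\n' E1.gui ;;
    esac
}

report=$(
    for host in usa01 london2; do
        echo "---${host}---"
        # -v copies the shell variable into the awk variable h.
        list_apps "$host" | awk -v h="$host" '{ printf("%s:%s\n", h, $0) }'
    done
)
printf '%s\n' "$report"
```

Each app line should now be prefixed with the host it came from, in the same shape as the desired output in the question.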

Using awk command in Bash

I'm trying to loop an awk command using bash script and I'm having a hard time including a variable within the single quotes for the awk command. I'm thinking I should be doing this completely in awk, but I feel more comfortable with bash right now.
#!/bin/bash
index="1"
while [ $index -le 13 ]
do
awk "'"/^$index/ {print}"'" text.txt
done
Use the standard approach, awk's -v option, to pass the variable:
awk -v idx="$index" '$0 ~ "^"idx' text.txt
Here I have set the awk variable idx to the value of the shell variable $index. Inside awk, I have simply used idx as an awk variable.
$0 ~ "^"idx matches if the record starts with (^) whatever the variable idx contains; if so, print the record.
awk '/^'"$index"'/' text.txt
# The awk program is split in two and the bash variable is sandwiched
# in between, inside double quotes (the ^ anchor is kept from your pattern).
# Note that awk prints by default, so idiomatic awk omits the '{print}' too.
should do. Alternatively, use grep:
grep "^$index" text.txt # Mind the double quotes
Note: -le compares integers, so you may change index="1" to index=1. Also, the loop in the question never changes index, so it needs something like index=$((index + 1)) inside the body in order to terminate.
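For completeness, here is a minimal sketch of the whole loop using -v, with the increment included. The temporary file and its four lines are made-up sample data, and the loop runs to 3 instead of 13 just to keep the output short.

```shell
#!/bin/bash
# Made-up sample data standing in for text.txt.
text=$(mktemp)
printf '%s\n' "1 alpha" "2 beta" "13 gamma" "20 delta" > "$text"

out=$(
    index=1
    while [ "$index" -le 3 ]; do
        # Print every line that starts with the current index.
        awk -v idx="$index" '$0 ~ "^" idx' "$text"
        index=$((index + 1))
    done
)
printf '%s\n' "$out"
rm -f "$text"
```

Note that a prefix match on 1 also picks up the line starting with 13, and 2 picks up 20. If the index is meant to be a whole first field, comparing $1 == idx inside awk is probably closer to what you want.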

Escaping ~! in awk (bash command), backslash not last character on line

I am trying to run a bash command in the following format:
declare "test${nb}"="$(cat file.txt | awk '{if($3>0.5 && $3 !~ "ddf") $2="NA"; print $1,$2}')"
where $nb is an int (e.g. 2) and file.txt contains a table with various numeric and string values (I can provide more details if needed, but it should not be relevant here)
when running this, the shell substitutes !~ for the name of a file that I have (not sure why). I tried escaping this using the backslash like this:
declare "test${nb}"="$(cat file.txt | awk '{if($3>0.5 && $3 \!~ "ddf") $2="NA"; print $1,$2}')"
but then I get this error:
awk: {if($3>0.5 && $3 \!~ "ddf") $2="NA"; print $1,$2}
awk: ^ backslash not last character on line
I also tried having the table contained in the variable "var" and writing it this way:
declare test[$nb]=$(echo "$var" | awk '{if($3>0.5 && $3 !~ "ddf") $2="NA"; print $1,$2}')
Then there is no error, but the output is just the first field of first column of the table, which is not the case when I don't expand the variable name. For example, if I do this:
declare test2=$(echo "$var" | awk '{if($3>0.5 && $3 !~ "ddf") $2="NA"; print $1,$2}')
then it works perfectly and test2 has the expected value. But I need to be able to use any number instead of 2 (something like test[$nb]).
any idea how I could fix this? Any help will be very appreciated!
thanks
Lose the quotes:
$ declare x="$( echo '!' )"
-bash: !': event not found
$ declare x=$( echo '!' )
$ echo "$x"
!
You have a lot of other issues with your statement, though, including UUOC, using a scalar to emulate an array, non idiomatic awk syntax, etc. Try this instead:
declare test[$nb]=$( awk '{print $1, (($3 > 0.5) && ($3 !~ /ddf/) ? "NA" : $2)}' file.txt )
Are you entering the command interactively, or running it from a file (i.e. a script)? History expansion of !~ occurs only in interactive use. By the way, the bash man page says: "If enabled, history expansion will be performed unless an ! appearing in double quotes is escaped using a backslash. The backslash preceding the ! is not removed."
This means that, as long as you are scripting (or, more precisely, as long as you are not in an interactive shell), you don't have to worry about this special meaning of '!'.
If you have an interactive shell and history expansion bothers you, you can turn it off with
set +H
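To check this from a script, where history expansion is not active, a small sketch like the one below may be enough. The three-row table is made-up sample data (the question did not show file.txt), and result is just an illustrative variable name.

```shell
#!/bin/bash
# Made-up sample table standing in for file.txt.
table=$(mktemp)
printf '%s\n' "r1 v1 0.9" "r2 v2 0.1" "r3 v3 ddf7" > "$table"

# The !~ sits inside single quotes within $( ... ); when run as a script,
# no history expansion touches it.
result=$(awk '{print $1, (($3 > 0.5) && ($3 !~ /ddf/) ? "NA" : $2)}' "$table")
printf '%s\n' "$result"
rm -f "$table"
```

With this data only the first row should get NA in its second column: the second row fails the numeric test and the third row matches ddf.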

AWK with If condition

I am trying to replace the following string, for example:
from
['55',2,1,10,30,23],
to
['55',2,555,10,30,23],
OR
['55',2,1,10,30,23],
to
['55',2,1,10,9999,23],
I searched around and found this:
$ echo "[55,2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="[55"){$2=10}{print}}'
[55,10,1,10,30,23],
but it's not working in my case, since there is a ' around the value of $1 in my if condition:
$ echo "['55',2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="['55'"){$2=10}{print}}'
['55',2,1,10,30,23],
The problem is not in the awk code, it's the shell expansion. You cannot have single quotes in a singly-quoted shell string. This is the same problem you run into when you try to put the input string into single quotes:
$ echo '['55',2,1,10,30,23],'
[55,2,1,10,30,23],
-- the single quotes are gone! And this makes sense, because they did their job of quoting the [ and the ,2,1,10,30,23], (the 55 is unquoted here), but it is not what we wanted.
A solution is to quote the sections between them individually and squeeze them in manually:
$ echo '['\''55'\'',2,1,10,30,23],'
['55',2,1,10,30,23],
Or, in this particular case, where nothing nefarious is between where the single quotes should be,
echo '['\'55\'',2,1,10,30,23],' # the 55 is now unquoted.
Applied to your awk code, that looks like this:
$ echo "['55',2,1,10,30,23]," | awk -F',' 'BEGIN{OFS=","}{if($1=="['\'55\''"){$2=10}{print}}'
['55',10,1,10,30,23],
Alternatively, since this doesn't look very nice if you have many single quotes in your code, you can write the awk code into a file, say foo.awk, and use
echo "['55',2,1,10,30,23]," | awk -F, -f foo.awk
Then you don't have to worry about shell quoting mishaps in the awk code because the awk code is not subject to shell expansion anymore.
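As a rough sketch of the file-based variant: the script below writes the awk program to a temporary file through a quoted here-document (so the shell leaves it alone) and runs it with -f. The temporary file plays the role of foo.awk; splitting the program into three rules is my own layout, not something from the question.

```shell
#!/bin/bash
# Temporary file playing the role of foo.awk.
prog=$(mktemp)

# The quoted 'EOF' delimiter stops the shell from touching the contents,
# so single quotes can be written in the awk code as-is.
cat > "$prog" <<'EOF'
BEGIN { OFS = "," }
$1 == "['55'" { $2 = 10 }
{ print }
EOF

fixed=$(echo "['55',2,1,10,30,23]," | awk -F, -f "$prog")
printf '%s\n' "$fixed"
rm -f "$prog"
```

This should print the same line as the inline version above, with the second field changed to 10.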
I think matching and replacing is not the problem for you. The problem you are facing is how to match a single quote ' in a field.
To avoid escaping each ' in your code, and to make the code more readable, you can assign the quote to a variable and use that variable in your code, for example like this:
echo "['55' 1
['56' 1"|awk -v q="'" '$1=="["q"55"q{$2++}7'
['55' 2
['56' 1
In the above example, the 2nd field is incremented only on the line with ['55'. (The trailing 7 is simply a pattern that is always true, so every line is printed.)
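Applied to the comma-separated input from the question, the same trick might look like the sketch below. It sets the third field to 555, which is the first replacement the question asks for; the variable name changed is only for illustration.

```shell
#!/bin/bash
# q holds a single quote, so the awk program itself needs no escaping.
changed=$(echo "['55',2,1,10,30,23]," |
    awk -F, -v OFS=, -v q="'" '$1 == "[" q "55" q { $3 = 555 } 1')
printf '%s\n' "$changed"
```

Changing $3 = 555 to $5 = 9999 would give the second variant from the question.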

Resources