How to insert shell variable inside awk command - linux

I'm writing a script that passes a shell variable into an awk command, but when I run it nothing happens. When I ran that one line on its own in the shell, I found that the variable was not expanded the way I expected. Here's the code:
#!/bin/bash

# Created By Rafael Adel

# This script is to start dwm with customizations needed


while true;do
    datestr=`date +"%r %d/%m/%Y"`
    batterystr=`acpi | grep -oP "([a-zA-Z]*), ([0-9]*)%"`
    batterystate=`echo $batterystr | grep -oP "[a-zA-Z]*"`
    batterypercent=`echo $batterystr | grep -oP "[0-9]*"`

    for nic in `ls /sys/class/net`
    do
        if [ -e "/sys/class/net/${nic}/operstate" ]
        then
            NicUp=`cat /sys/class/net/${nic}/operstate`
            if [ "$NicUp" == "up" ]
            then
                netstr=`ifstat | awk -v interface=${nic} '$1 ~ /interface/ {printf("D: %2.1fKiB, U: %2.1fKiB",$6/1000, $8/1000)}'`
                break
            fi
        fi
    done


    finalstr="$netstr | $batterystr | $datestr"

    xsetroot -name "$finalstr"
    sleep 1
done &

xbindkeys -f /etc/xbindkeysrc

numlockx on

exec dwm
This line:
netstr=`ifstat | awk -v interface=${nic} '$1 ~ /interface/ {printf("D: %2.1fKiB, U: %2.1fKiB",$6/1000, $8/1000)}'`
is what causes the netstr variable not to get assigned at all. I guess that's because interface is not replaced with the value of ${nic}.
Could you tell me what's wrong here? Thanks.

If you want to match lines against the value of your variable, you have two choices:
interface=eth0
awk "/$interface/{print}"
or
awk -v interface=eth0 '$0 ~ interface{print}'
See http://www.gnu.org/software/gawk/manual/gawk.html#Using-Shell-Variables
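To make the difference concrete, here is a minimal sketch. The sample lines are made up and only stand in for real ifstat output, and the names literal_match, dynamic_match and exact_match are assumptions for illustration. It shows that /interface/ matches the literal text "interface", while a bare variable name is used as a dynamic regex:

```shell
# Made-up sample lines standing in for `ifstat` output (layout is illustrative only).
sample='eth0 10 0 20 0 1500 0 3000 0
eth01 1 0 2 0 9 0 9 0
interface 3 0 4 0 8 0 8 0'

nic=eth0

# /interface/ is a regex literal: it matches the text "interface", not the variable.
literal_match=$(printf '%s\n' "$sample" | awk -v interface="$nic" '$1 ~ /interface/ { print $1 }')

# A bare variable name on the right of ~ is treated as a dynamic regex.
dynamic_match=$(printf '%s\n' "$sample" | awk -v interface="$nic" '$1 ~ interface { print $1 }')

# String equality avoids partial matches such as eth0 also matching eth01.
exact_match=$(printf '%s\n' "$sample" | awk -v interface="$nic" '$1 == interface { print $1 }')

echo "literal: $literal_match"
echo "dynamic: $dynamic_match"
echo "exact:   $exact_match"
```

For an interface name, the equality test is probably the closest to what the question intends, since a regex match also accepts longer names that contain the value.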

It's as I thought: awk sets the variable properly, but between slashes (that is, inside a regex literal) an awk variable is not substituted, so /interface/ matches the literal text.
I had no issue matching against a shell variable inside an awk program (for simple regexp cases):
sawk1='repo\s+module2'
sawk2='#project2\s+=\s+module2$'
awk "/${sawk1}/,/${sawk2}/"'{print}' aFile
(Here the /xxx/,/yyy/ displays everything between xxx and yyy)
(Note the double-quoted "/${sawk1}/,/${sawk2}/", followed by the single-quoted '{print}')
This works just fine, and comes from "awk: Using Shell Variables in Programs":
A common method is to use shell quoting to substitute the variable’s value into the program inside the script.
For example, consider the following program:
printf "Enter search pattern: "
read pattern
awk "/$pattern/ "'{ nmatches++ }
END { print nmatches, "found" }' /path/to/data
The awk program consists of two pieces of quoted text that are concatenated together to form the program.
The first part is double-quoted, which allows substitution of the pattern shell variable inside the quotes.
The second part is single-quoted.
It does add the caveat though:
Variable substitution via quoting works, but can potentially be messy.
It requires a good understanding of the shell’s quoting rules (see Quoting), and it’s often difficult to correctly match up the quotes when reading the program.
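One way to sidestep the quoting juggling is to hand the pattern to awk through the environment. This is only a sketch: the ENVIRON array is standard awk but is not mentioned in the answers above, and the sample text and the variable names data and count are made up for illustration.

```shell
# Made-up sample text standing in for aFile.
data=$(printf 'repo   module2\nrepo module3\nmy repo module2 copy\nother line\n')

# The pattern never becomes part of the awk program text, so the program can stay
# fully single-quoted. Unlike -v, values read from ENVIRON are not escape-processed.
pattern='repo[[:space:]]+module2'

count=$(printf '%s\n' "$data" |
    pattern="$pattern" awk '$0 ~ ENVIRON["pattern"] { n++ } END { print n + 0 }')

echo "$count lines matched"
```

The awk program is a single quoted string here, so there are no quote pairs to match up when reading it.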

Related

Different script behavior between Ubuntu 20.04.2 and 21.04 terminated

I can run the command below in a shell script with no problem on Ubuntu 21.04:
grep -h "new tip" logs/node.log | tail -1000 | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | awk -F, -v threshold=2 '{"date +%s.%3N -d\""$1"\""|getline actualTime; delay=actualTime-$2-1591566291; sum+=delay; if (delay > threshold ) print $1 " " delay;} END {print "AVG=", sum/NR}'
but when I run the exact same script on Ubuntu 20.04.2, I get this error :
/bin/sh: 1: Syntax error: Unterminated quoted string
It's definitely the exact same script, because I scp'd it from the 21.04 server to the 20.04.2 one. I couldn't find any topics on Stack Overflow or elsewhere on the internet that address this difference. Both Ubuntu installations are on Linux cloud servers. The only way I found to run the script with no error is to take out this part of the awk program: "date +%s.%3N -d\""$1"\""|getline actualTime;
I tried playing around with the reference to the $1 field, but nothing worked. I also tried nawk instead of awk, with no luck. As a last resort I could upgrade the OS from 20.04 to 21.04.
Has anyone seen this before?
Added: Thanks all for the quick replies. Here are the first lines of the log file that the script is running against
[Relaynod:cardano.node.ChainDB:Notice:60] [2021-06-30 02:20:14.36 UTC] Chain extended, new tip: de56b9f458e8942ca74c6a1913dc58fa896823dc19b366285e15481f434ed337 at slot 33453323
[Relaynod:cardano.node.ChainDB:Notice:60] [2021-06-30 02:20:15.17 UTC] Chain extended, new tip: e88ea4f438944bd15186fe93f321c117ec769cfbd33667654634f4510cfd3780 at slot 33453324
Just to make sure it's not a data issue. I ran the script on the Ubuntu 21 server against the file (worked), then copied the file to the Ubuntu 20 server and ran the exact same script against the copied file, and get the error.
I'll try out the suggestions on this topic and will let everyone know the answer.
New update: after laptop crash and replacement, remembered to come back to this post. I ended up using mktime like Ed mentioned. It's working now.
The shell script:
#!/bin/bash
grep -h "new tip" logs/node.log | tail -1000 | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | mawk -F, -v threshold=2 -f check_delay.awk
The awk script:
BEGIN { ENVIRON["TZ"] = "UTC"; }
{
    year = substr($1,1,4);
    month = substr($1,6,2);
    day = substr($1,9,2);
    hour = substr($1,12,2);
    min = substr($1,15,2);
    sec = substr($1,18,2);
    timestamp = year" "month" "day" "hour" "min" "sec;
    actualTime = mktime(timestamp) + 7200;
    delay = actualTime-$2-1591566291;
    sum += delay;
    if (delay >= threshold)
        print $1 " " delay;
}
END { print "AVG=", sum/NR }
You're spawning a shell to call date with whatever value happens to be in $1, so the result depends on your data. Look:
$ echo '3/27/2021' | awk '{"date +%s.%3N -d\""$1"\"" | getline; print}'
1616821200.000
$ echo 'a"b' | awk '{"date +%s.%3N -d\""$1"\"" | getline; print}'
sh: -c: line 0: unexpected EOF while looking for matching `"'
sh: -c: line 1: syntax error: unexpected end of file
a"b
and what this command outputs from a log file:
sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p"
will vary greatly depending on the contents of specific lines in the log file since the parts you're trying to isolate aren't anchored and use .*s when you presumably meant to use [^]]*s. For example:
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50'
[foo] [3/27/2021] 15 something [probably] happened at line 50
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50' | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p"
3/27/2021] 15 something [probably,50
$ echo '[foo] [3/27/2021] 15 something [probably] happened at line 50' | sed -rn "s/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p" | awk -F, -v threshold=2 '{"date +%s.%3N -d\""$1"\""|getline actualTime; delay=actualTime-$2-1591566291; sum+=delay; if (delay > threshold ) print $1 " " delay;} END {print "AVG=", sum/NR}'
date: invalid date ‘3/27/2021] 15 something [probably’
AVG= -1591566341
If you want to do that then you could introduce a check for a valid date to avoid THAT specific error, e.g. (but obviously create a better date verification regexp than this):
$ echo 'a"b' | awk '$1 ~ "[0-9]/[0-9]+/[0-9]" {"date +%s.%3N -d\""$1"\"" | getline; print}'
$
but it's still fragile and extremely slow.
You're using GNU sed (for -r), so you have or can get GNU awk, which has built-in time functions. You shouldn't be spawning a subshell to call date in the first place; just use mktime() (see https://stackoverflow.com/a/68180908/1745001), which avoids cryptic errors like that and runs orders of magnitude faster.
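As a sketch of the anchoring point made above, the following compares the original expression with a variant that uses [^]]* and line anchors. It assumes a sed that accepts -E (GNU sed treats it like -r); the variable names greedy and anchored are made up for illustration, and the input is the sample line from this answer, not real log data.

```shell
line='[foo] [3/27/2021] 15 something [probably] happened at line 50'

# Original expression: each .* is greedy, so the second group runs on to the last "]".
greedy=$(printf '%s\n' "$line" |
    sed -En 's/\[(.*)\].\[(.*)\].* ([0-9]+)/\2,\3/p')

# [^]]* cannot cross a closing bracket, and ^ and $ pin the match to the whole line.
anchored=$(printf '%s\n' "$line" |
    sed -En 's/^\[([^]]*)\] \[([^]]*)\].* ([0-9]+)$/\2,\3/p')

echo "greedy:   $greedy"
echo "anchored: $anchored"
```

With the tighter expression, the date field that reaches awk is just the bracketed value, which is easier to validate before doing anything else with it.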

How can I fix my bash script to find a random word from a dictionary?

I'm studying bash scripting and I'm stuck fixing an exercise of this site: https://ryanstutorials.net/bash-scripting-tutorial/bash-variables.php#activities
The task is to write a bash script to output a random word from a dictionary whose length is equal to the number supplied as the first command line argument.
My idea was to create a sub-dictionary, assign each word a line number, select a random number from those line numbers and filter the output. That worked for a similar, simpler script, but not for this one.
This is the code I used:
6 DIC='/usr/share/dict/words'
7 SUBDIC=$( egrep '^.{'$1'}$' $DIC )
8
9 MAX=$( $SUBDIC | wc -l )
10 RANDRANGE=$((1 + RANDOM % $MAX))
11
12 RWORD=$(nl "$SUBDIC" | grep "\b$RANDRANGE\b" | awk '{print $2}')
13
14 echo "Random generated word from $DIC which is $1 characters long:"
15 echo $RWORD
and this is the error I get using as input "21":
bash script.sh 21
script.sh: line 9: counterintelligence's: command not found
script.sh: line 10: 1 + RANDOM % 0: division by 0 (error token is "0")
nl: 'counterintelligence'\''s'$'\n''electroencephalograms'$'\n''electroencephalograph': No such file or directory
Random generated word from /usr/share/dict/words which is 21 characters long:
I tried in bash to split the code in smaller pieces obtaining no error (input=21):
egrep '^.{'21'}$' /usr/share/dict/words | wc -l
3
but once in the script, lines 9 and 10 give errors.
Where do you think the error is?
problems
SUBDIC=$( egrep '^.{'$1'}$' $DIC ) will store all words of the given length in the SUBDIC variable, so its content is now something like foo bar baz.
MAX=$( $SUBDIC | ... ) will try to run the command foo bar baz, which is obviously bogus; it should be more like MAX=$(echo $SUBDIC | ... )
MAX=$( ... | wc -l ) will count the lines; when using the above-mentioned echo $SUBDIC you will have multiple words, but all on one line...
RWORD=$(nl "$SUBDIC" | ...) same problem as above: there's only one line (also note @armali's answer that nl requires a file or stdin)
RWORD=$(... | grep "\b$RANDRANGE\b" | ...) might match the dictionary entry catch 22
RWORD=$(... | awk '{print $2}') likely won't handle lines containing spaces
a simple solution
Doing a "random sort" over all the possible words and taking the first line should be sufficient:
egrep "^.{$1}$" "${DIC}" | sort -R | head -1
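The same idea (filter by length, then pick one at random) can also be sketched in awk alone, which avoids building a regex from $1. This is only an illustration: the helper name random_word_of_length is made up, and the word list is passed on stdin here instead of being read from /usr/share/dict/words.

```shell
# Hypothetical helper: reads one word per line on stdin and prints a single
# randomly chosen word whose length equals the first argument.
random_word_of_length() {
    awk -v n="$1" '
        length($0) == n { words[++count] = $0 }   # keep only words of length n
        END {
            srand()                                # seed from the current time
            if (count) print words[int(rand() * count) + 1]
        }'
}

# Made-up word list; prints either "pear" or "plum".
printf '%s\n' apple pear plum grape fig | random_word_of_length 4
```

Against the real dictionary this would be called as random_word_of_length "$1" < /usr/share/dict/words. Note that srand() seeds from the clock, so two calls within the same second may return the same word.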
MAX=$( $SUBDIC | wc -l ) - A pipe connects the output of one command to another command, and $SUBDIC isn't a command; an appropriate syntax is MAX=$( <<<"$SUBDIC" wc -l ).
nl "$SUBDIC" - The argument to nl has to be a filename, which "$SUBDIC" isn't; an appropriate syntax is nl <<<"$SUBDIC".
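The line-count problem described in the answers above comes down to quoting. A minimal sketch with a made-up three-word list (the names subdic, unquoted_lines and quoted_lines are assumptions for illustration) shows the difference using printf, which also works in shells without here-strings:

```shell
# Made-up stand-in for the words egrep would return, one per line.
subdic='alpha
beta
gamma'

# Unquoted: the shell splits the value into words and echo joins them with
# spaces, so wc sees a single line.
unquoted_lines=$(echo $subdic | wc -l | tr -d ' ')

# Quoted: the embedded newlines survive, so wc sees one line per word.
quoted_lines=$(printf '%s\n' "$subdic" | wc -l | tr -d ' ')

echo "unquoted: $unquoted_lines, quoted: $quoted_lines"
```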
This code will do it. My test dictionary of words is in a file named file. It's a good idea to get all words of a given length first, but put them in an array rather than a plain variable. Then pick a random index and echo that element.
dic=( $(sed -n "/^.\{$1\}$/p" file) )
ind=$((RANDOM % ${#dic[@]}))
echo "${dic[$ind]}"
I am also doing this activity and came up with a simple solution.
I created this script:
#!/bin/bash
awk "NR==$1 {print}" /usr/share/dict/words
If you want a random word, run the script from the terminal like this:
./script.sh $RANDOM
If you want to print the word on a specific line, run it like this:
./script.sh 465
cat /usr/share/dict/american-english | head -n $RANDOM | tail -n 1
$RANDOM returns a different random number each time it is referred to.
This simple line outputs a random word from the mentioned dictionary.
Otherwise, as umläute mentioned, you can do:
cat /usr/share/dict/american-english | sort -R | head -1

Can't input date variable in bash

I have a directory /user/reports containing many files; one of them is:
report.active_user.30092018.77325.csv
I need the number after the date, i.e. 77325, from the file name above.
I created the command below to extract that value from the file name:
ls /user/reports | awk -F. '/report.active_user.30092018/ {print $(NF-1)}'
Now I want to pass the current date into the command above as a variable and get the result:
ls /user/reports | awk -F. '/report.active_user.$(date +'%d%m%Y')/ {print $(NF-1)}'
But I'm not getting the required output.
I also tried a bash script:
#!/usr/bin/env bash
_date=`date +%d%m%Y`
active=$(ls /user/reports | awk -F. '/report.active_user.${_date}/ {print $(NF-1)}')
echo $active
But the output is still blank.
Please help with the proper syntax.
As @cyrus said, you must use double quotes in your variable assignment, because single quotes produce a literal string in which variables are not expanded.
Bad use case
number=10
string='I m sentence with or wihtout var $number'
echo $string
Correct use case
number=10
string_with_number="I m sentence with var $number"
echo $string_with_number
You can use single quotes, as long as they don't enclose the whole string
number=10
string_with_number='I m sentence with var '$number
echo $string_with_number
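Applied to the awk command from the question, one option is to keep the awk program single-quoted and pass the date in with -v. This is a sketch under assumptions: the file names are made-up stand-ins for the directory listing, the date is fixed so the example is repeatable, and the names today, names and number are chosen for illustration.

```shell
# Fixed date for a repeatable example; in a real script: today=$(date +%d%m%Y)
today=30092018

# Made-up file names standing in for the contents of /user/reports.
names='report.active_user.30092018.77325.csv
report.active_user.29092018.11111.csv
report.inactive_user.30092018.22222.csv'

# With -F. the fields are: report, active_user, <date>, <number>, csv.
# Appending "" forces a string comparison, so leading zeros in dates are kept.
number=$(printf '%s\n' "$names" |
    awk -F. -v d="$today" '$2 == "active_user" && ($3 "") == (d "") { print $(NF-1) }')

echo "$number"
```

Comparing fields instead of matching a regex also avoids the unescaped dots in the original pattern, which match any character.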
Don't parse ls
You don't need awk for this: you can manage with the shell's capabilities
for file in report.active_user."$(date "+%d%m%Y")"*; do
tmp=${file%.*} # remove the extension
number=${tmp##*.} # remove the prefix up to and including the last dot
echo "$number"
done
See https://www.gnu.org/software/bash/manual/bashref.html#Shell-Parameter-Expansion

Using printf to display an extracted string through grep and to be use as user input in a script

Good day,
This is kind of lengthy. I'm hoping for help with a problem that may be simple for others but is taking me almost forever to figure out.
I have this file (EOL.txt) which consists of the following sample lists:
35 - 5976
36 - 5976C0
53 - 5976C2
64 - 5976D0
69 - 43593
72 - 43593C0
I'm using the following command to extract the leftmost number, since it corresponds to a routine number used by another script:
grep 5976C2 EOL.txt | head -n1 | cut -d- -f1
After I get that number, I pass it along with other data to another script (N.csh, syntax as follows) that executes yet another one (Test.csh):
$./N.csh 53 XXXX.XX "01 02 03"
N.csh --> printf "$1\n$2\n$3\nYYYY\n1\nN\n" | /export/home/Script/Test.csh
What I want to do now is incorporate the grep command into N.csh so that I won't have to run it separately. It should look like this:
$./N.csh 5976C2 XXXX.XX "01 02 03"
I tried the following command, but it's not working.
grep $1 EOL.txt | head -n1 | cut -d- -f1 >> A ; set B=`cat A` ; printf %s "$B\n$2\n$3\n82869\n1\nN\n"
I'm new to this stuff; any help will be highly appreciated.
Thanks a lot in advance.
Mike
You can use the following in the file N.csh:
set mynumber = `grep $1 EOL.txt | head -n1 | cut -d- -f1`
printf "$mynumber\n$2\n$3\n...
and then invoke N.csh like
./N.csh 5976C2 XXXX.XX "01 02 03"
Note that after set mynumber =, in the first line, there is a "backtick" - a reversed single quote. The shell executes the commands delimited by two backticks, takes the output, and puts it back in place of the original contents, so the first line turns into set mynumber = 53.
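For comparison, here is how the same lookup might look in a POSIX sh script rather than csh, where $(...) plays the role of the backticks. This is a sketch, not a drop-in replacement for N.csh: the helper name lookup_routine is made up, the list is inlined instead of read from EOL.txt, and awk is swapped in for grep and cut so that the code must match the third column exactly.

```shell
# Sample lines copied from the question's EOL.txt.
eol='35 - 5976
36 - 5976C0
53 - 5976C2
64 - 5976D0'

# Hypothetical helper: print the routine number (column 1) of the first line
# whose third column equals the given code. "" forces a string comparison.
lookup_routine() {
    awk -v code="$1" '($3 "") == (code "") { print $1; exit }'
}

routine=$(printf '%s\n' "$eol" | lookup_routine 5976C2)

# The captured value can then be fed on as the first input line.
printf '%s\n%s\n%s\n' "$routine" 'XXXX.XX' '01 02 03'
```

The exact comparison matters for codes such as 5976, which a plain grep would also find inside 5976C0 and 5976C2.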

Filtering Linux command output

I need to get a row based on column value just like querying a database. I have a command output like this,
Name ID Mem VCPUs State
Time(s)
Domain-0 0 15485 16 r-----
1779042.1
prime95-01 512 1
-b---- 61.9
Here I need to list only those rows where state is "r". Something like this,
Domain-0 0 15485 16
r----- 1779042.1
I have tried using grep and awk, but I have not succeeded.
Any help is much appreciated.
Regards,
Raaj
There is a variety of tools available for filtering.
If you only want lines with "r-----" grep is more than enough:
command | grep "r-----"
Or
cat filename | grep "r-----"
grep can handle this for you:
yourcommand | grep -- 'r-----'
It's often useful to save the (full) output to a file to analyse later. For this I use tee.
yourcommand | tee somefile | grep 'r-----'
If you want to find the line containing "-b----" a little later on without re-running yourcommand, you can just use:
grep -- '-b----' somefile
No need for cat here!
I recommend putting -- after your call to grep since your patterns contain minus-signs and if the minus-sign is at the beginning of the pattern, this would look like an option argument to grep rather than a part of the pattern.
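A small sketch of the point about leading minus signs, using two lines adapted from the question as inline sample data (the variable names output, blocked and blocked_e are made up for illustration). Both -- and -e tell grep that what follows is the pattern and not an option:

```shell
# Sample rows adapted from the question, one row per line.
output='Domain-0 0 15485 16 r----- 1779042.1
prime95-01 512 1 -b---- 61.9'

# "--" ends option parsing, so '-b----' is taken as the pattern.
blocked=$(printf '%s\n' "$output" | grep -- '-b----')

# "-e PATTERN" is an alternative that marks the pattern explicitly.
blocked_e=$(printf '%s\n' "$output" | grep -e '-b----')

echo "$blocked"
```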
try:
awk '$5 ~ /^r.*/ { print }'
Like this:
cat file | awk '$5 ~ /^r.*/ { print }'
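Here is a self-contained sketch of the column test above. The listing is made up (the guest-a and guest-b rows and their IDs are hypothetical), every row is assumed to have all six columns on one line, and the names listing and running are chosen for illustration.

```shell
# Hypothetical listing with every column present on each line.
listing='Name ID Mem VCPUs State Time(s)
Domain-0 0 15485 16 r----- 1779042.1
guest-a 1 512 1 -b---- 61.9
guest-b 2 1024 2 r----- 12.5'

# Skip the header (NR > 1) and keep rows whose fifth field starts with "r".
# awk splits on runs of whitespace, so column padding does not matter.
running=$(printf '%s\n' "$listing" | awk 'NR > 1 && $5 ~ /^r/ { print $1 }')

echo "$running"
```

If a row is missing a column, as one row in the question appears to be, the field numbers shift and the test on $5 looks at the wrong value for that row.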
grep solution:
command | grep -E "^([^ ]+ ){4}r"
What this does (-E switches on extended regexp):
The first caret (^) matches the beginning of the line.
[^ ] matches exactly one occurrence of a non-space character; the following modifier (+) allows it to match more occurrences as well.
Grouped together with the trailing space in ([^ ]+ ), it matches any sequence of non-space characters followed by a single space. The modifier {4} requires this construct to be matched exactly four times.
The single "r" is then the literal character you are searching for.
In plain words this could be written like "If the line starts <^> with four strings that are each followed by a space <([^ ]+ ){4}> and the next character is <r>, then the line matches."
A very good introduction into regular expressions has been written by Jan Goyvaerts (http://www.regular-expressions.info/quickstart.html).
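One caveat with the expression explained above is that it expects exactly one space between columns. The sketch below checks it against a single-spaced and a padded copy of the same row, and then tries a variant with " +" that tolerates padding. The padded row and the names single, padded, strict_count and relaxed_count are assumptions for illustration.

```shell
single='Domain-0 0 15485 16 r----- 1779042.1'
padded='Domain-0     0   15485    16   r-----   1779042.1'

# Exactly one space after each of the four leading columns: only the first row matches.
strict_count=$(printf '%s\n' "$single" "$padded" | grep -cE '^([^ ]+ ){4}r')

# " +" allows one or more spaces after each column: both rows match.
relaxed_count=$(printf '%s\n' "$single" "$padded" | grep -cE '^([^ ]+ +){4}r')

echo "strict: $strict_count, relaxed: $relaxed_count"
```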
Filtering with awk on Linux:
First, select the matching row and store it in file2:
awk '/Domain-0 0 15485 /' file1 >file2
Output:-
Domain-0 0 15485 16
r----- 1779042.1
Then run this awk command on file2:
awk '{print $1,$2,$3,$4,"\n",$5,$6}' file2
Final Output:-
Domain-0 0 15485 16
r----- 1779042.1

Resources