How this AWK is replacing an IF? - linux

I've been reviewing some bash scripts written by other people at work and I found this line that I'm trying to understand:
[[ $(awk 'BEGIN{print ('$CAPACITY'>=0.9)}') -eq 1 ]] && echo "Capacity at 90 Percent"
It is my understanding that this line is replacing an if statement. Could someone help me out by explaining what this line really does? Thanks

This makes me very sad and pessimistic about the future of civilization...
Let's break this down:
[[ $(awk 'BEGIN{print ('$CAPACITY'>=0.9)}') -eq 1 ]] && echo "Capacity at 90 Percent"
Note the $(....). This tells the shell to execute the program inside, and replace the contents of $(...) with the value. For example:
$ file_name="/usr/local/bin/foo"
$ short_name="$(basename $file_name)"
$ echo $short_name
foo
In the second line, we are running the command basename $file_name. This prints foo. Then, the shell will substitute foo for $(basename $file_name) before assigning short_name. Here's the same thing with the debugger on:
$ set -xv
$ file_name="/usr/local/bin/foo"
file_name="/usr/local/bin/foo"
+ file_name=/usr/local/bin/foo
$ short_name=$(basename $file_name)
short_name=$(basename $file_name)
basename $file_name
++ basename /usr/local/bin/foo
+ short_name=foo
$ echo $short_name
echo $short_name
+ echo foo
foo
$ set +xv # Turn off the debugger.
You can see how the shell executes $(...) and replaces it.
Thus, the user is actually running the program:
awk 'BEGIN{print ('$CAPACITY'>=0.9)}'
However, take a look at the quotation marks:
awk 'BEGIN{print ('$CAPACITY'>=0.9)}'
    +++++++++++++++         +++++++++
The parts with pluses under them are inside single quotes, so the shell cannot interpolate them. HOWEVER, $CAPACITY sits outside the quotes. In other words, the shell replaces $CAPACITY with its value before the awk command is executed. Thus, if $CAPACITY is .8, the awk command will become:
awk 'BEGIN{print (.8>=0.9)}'
That's the very first part of the explanation.
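To see exactly what awk receives, you can print the spliced program text instead of running it. This is only a small sketch, assuming CAPACITY is set to .8; the variable name prog is made up for illustration:

```shell
CAPACITY=.8
# Two single-quoted pieces with an unquoted $CAPACITY between them;
# the shell joins all three into one word before awk ever runs.
prog='BEGIN{print ('$CAPACITY'>=0.9)}'
printf '%s\n' "$prog"    # prints: BEGIN{print (.8>=0.9)}
```

Because the value is pasted straight into the awk program text, whatever $CAPACITY contains becomes awk code, which is one reason to be careful with this style.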
Now on to the next part. How much do you know about awk?
Awk is a programming language that's usually part of Unix/Linux distributions. Awk normally works on files and assumes a loop around the file with each line being read in and operated on. For example:
$ awk '{print $1}' foo.txt
Let's assume that each line in foo.txt consists of several fields that are separated by spaces. The file foo.txt is read in and each line is passed through to the awk program and the awk program will print out the first field of each line.
However, there is no file for awk to operate on. This developer is using the special pattern BEGIN, which is executed before any lines are read in. Since the awk program consists only of a BEGIN block, awk never tries to read any input; it simply executes this statement (assuming capacity is at 80%):
.8 >= .9
Like in the shell and other programming languages, this statement evaluates to true or false. In awk, a comparison that is true has the value 1, and one that is false has the value 0. In this case, it is false, so its value is 0.
The print statement then writes that value to standard output. Thus, if the capacity is at 80%, the awk expression .8 >= .9 is false and awk prints 0.
Now, the entire $(awk 'BEGIN{print ('$CAPACITY'>=0.9)}') will be replaced with 0. Your [[ ... ]] test now becomes:
[[ 0 -eq 1 ]] && echo "Capacity at 90 Percent"
Well, [[ 0 -eq 1 ]] is false.
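To check what awk prints for a given capacity, and therefore what number the [[ ... ]] test ends up comparing against 1, here is a small sketch (the variable names below and above are only for illustration):

```shell
# A BEGIN-only awk program reads no input; it just prints the value
# of the comparison: 1 when true, 0 when false.
below=$(awk 'BEGIN{print (.8>=0.9)}')
above=$(awk 'BEGIN{print (.95>=0.9)}')
echo "below=$below above=$above"    # prints: below=0 above=1
```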
Now the final part.
The two commands && and || are list operators. Their name comes from the C programming operators of the same name, and the way C short circuits tests. For example,
if ( ( bar > 20 ) && ( foo < 30 ) ) {
is a typical C if statement, with foo and bar being variables. I am asking whether bar is greater than 20 AND foo is less than 30 before doing something.
C will first evaluate bar > 20 and decide whether it is true or false. If bar > 20 is false, there is no need to test foo < 30 because no matter what the result is, the statement is still false. What if bar is indeed greater than 20? You have to run the next part of the if statement.
Imagine this:
if ( ( bar > 20 ) || ( foo < 30 ) ) {
This says if bar is greater than 20 OR foo < 30. In this case, C will evaluate whether bar is greater than 20. If it is, there is no need to test whether or not foo is less than 30. The statement will be true no matter what the value of foo is. What if bar isn't greater than 20? Then, I have to test the value of foo.
So, if I have && and the first statement is false, don't do the second statement (the entire expression is false anyway). If the first statement is true, I have to run the second statement (because I don't know whether or not that entire statement is true or not).
If I have ||, the complete opposite happens. If the first statement is true, don't do the second statement (because the entire expression is true). If that first statement is false, I have to run the second statement.
The gist of this is:
[ "$foo" = "$bar" ] && echo "Foo equals bar"
is the same as:
if [ "$foo" = "$bar" ]
then
echo "Foo equals bar"
fi
Because if $foo does equal $bar, I have to execute the second part of the statement!
And, this:
[ "$foo" = "$bar" ] || echo "Foo and Bar are not equal"
is the same as:
if [ "$foo" != "$bar" ]
then
echo "Foo and Bar are not equal"
fi
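The equivalence can be tried out side by side. In this sketch the function names report_short and report_long are invented for illustration; both are meant to print the same thing for the same arguments:

```shell
# Short-circuit form: each echo runs only when the test allows it.
report_short() {
    [ "$1" = "$2" ] && echo "equal"
    [ "$1" = "$2" ] || echo "not equal"
}

# Spelled-out form with if/else.
report_long() {
    if [ "$1" = "$2" ]; then
        echo "equal"
    else
        echo "not equal"
    fi
}

report_short foo foo    # prints: equal
report_long  foo bar    # prints: not equal
```

One caveat: chaining both operators on one line, as in a && b || c, is not a true if/else, because c also runs when b itself fails.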
So, first the shell substitutes in the value of the shell variable $CAPACITY into your little awk script.
Next, the awk script runs, testing whether or not the substituted value of $CAPACITY is greater than or equal to 0.9. Since the awk program consists only of a BEGIN block, awk doesn't attempt to read in from STDIN.
Next, awk evaluates that boolean expression to 1 or 0 (depending on whether or not it's true). Then, the print statement writes that value to standard output.
The shell now substitutes that zero or non-zero value for that entire $(...) phrase. This is run through a test to see if it is or isn't equal to 1.
Finally if that test statement is equal to 1, the && will tell the shell to evaluate that echo statement. Thus, if the shell variable $CAPACITY is .9 or greater, that echo statement will print.
That's a lot of machinations just to compare .8 (or whatever the capacity is) with .9, so why did the developer do this?
Probably because the Bash shell can only do integer arithmetic. Since $CAPACITY is a fraction, you can't do this:
if [[ $CAPACITY -ge .9 ]]
then
echo "Capacity is at 90%"
fi
Instead of using awk, I would probably have used bc (GNU bc prints 1 when the comparison is true and 0 when it is false):
OVER_CAPACITY=$(bc <<<"$CAPACITY >= .9")
if [[ $OVER_CAPACITY -eq 1 ]]
then
echo "Capacity is over 90%"
else
echo "Every thing is okay"
fi
It would have been a few more lines, but I hope it makes things a bit easier to understand and the file easier to maintain.

The complete line can be thought of as [[ if something is true ]] && (= then) do another thing
To understand what is going on in this code, turn on your mental shell parser, and find the innermost construct that will produce output. In this case:
awk 'BEGIN{print ('$CAPACITY'>=0.9)}'
execute that on a cmd-line by itself. Obviously the variable CAPACITY has to be set with a value.
Then you can use the shell debug/trace facility (set -vx) to see every thing executing
CAPACITY=0.95
set -vx
[[ $(awk 'BEGIN{print ('$CAPACITY'>=0.9)}') -eq 1 ]] && echo "Capacity at 90 Percent"
+ awk 'BEGIN{print (0.95>=0.9)}'
+ [[ 1 -eq 1 ]]
+ echo 'Capacity at 90 Percent'
Capacity at 90 Percent
IHTH

It's not, the [[ and ]] are an improvement upon the test builtin and the && is an AND
So, what this line is doing is equivalent to:
if [[ $(awk 'BEGIN{print ('$CAPACITY'>=0.9)}') -eq 1 ]] ; then
echo "Capacity at 90 Percent"
fi
In effect, the line is saying TEST this condition AND do this other thing only if it's TRUE
Similarly, you could do [[ something_to_test ]] || do this if something_to_test is false
which means, TEST this condition OR do this other thing
These are bash shell one-line shortcuts.

You got a lot of good explanations, now rewrite the whole thing as:
awk -v cap="$CAPACITY" 'BEGIN{ if (cap>=0.9) print "Capacity at 90 Percent" }'
for clarity and simplicity.
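If the script still needs a shell-level test (for example to do more than print), one possible variation is to let awk's exit status carry the result instead of parsing printed output. This is a sketch under that assumption; float_ge is a hypothetical helper name, not something from the original script:

```shell
# float_ge A B: succeed (exit status 0) when A >= B, compared as numbers.
# The values are passed with -v, so they are never pasted into awk code.
float_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN { exit !(a >= b) }'
}

CAPACITY=0.95
if float_ge "$CAPACITY" 0.9; then
    echo "Capacity at 90 Percent"
fi
```

The helper then reads like any other command in if, && or || constructs.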

The [[ .... ]] construct returns true or false, so the expression inside [[ .... ]] must be a logical test.

Related

Linux script reading an ini file and splitting into variables by a specified character

I'm stuck on the following task: Let's pretend we have an .ini file in a folder. The file contains lines like this:
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
I had to choose the biggest subnet mask. So my attempt was:
declare -A data
for f in datadir/name
do
while read line
do
r=(${line//=/ })
let data[${r[0]}]=${r[1]}
done < $f
done
This is how far I got. (Yes, I know the file called name is not an .ini file but a .txt; I even had trouble creating an .ini file, and the teacher didn't give us a file like that for our exam.)
It splits the line at the =, but refuses to read the IP address because of the (first) . character.
(The error message I got was "Invalid arithmetic operator".)
If someone could help me and explain how I can write a script for tasks like this, I would be really thankful!
Both previously presented solutions operate (and do what they're designed to do); I thought I'd add something left-field as the specifications are fairly loose.
$ cat freasy
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
I'd argue that the biggest subnet mask is the one with the lowest numerical value (holds the most hosts).
$ sort -t/ -k2,2nr freasy| tail -n1
eno1=10.0.0.254/24
Don't use let. It's for arithmetic.
$ help let
let: let arg [arg ...]
Evaluate arithmetic expressions.
Evaluate each ARG as an arithmetic expression.
Just use straight assignment:
declare -A data
for f in datadir/name
do
while read line
do
r=(${line//=/ })
data[${r[0]}]=${r[1]}
done < $f
done
Result:
$ declare -p data
declare -A data=([tun0]="10.10.10.1/32" [eno1]="10.0.0.254/24" [eno2]="172.16.4.129/25" [eno3]="192.168.2.1/25" )
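A variation on the same loop lets read do the splitting, which avoids the unquoted array assignment. This is only a sketch: the names key and value are illustrative, and the input is inlined with a here-document instead of being read from datadir/name:

```shell
#!/bin/bash
declare -A data
# With IFS set to "=" for this read only, key gets the text before the
# first "=" and value gets the rest of the line.
while IFS='=' read -r key value; do
    [ -n "$key" ] && data[$key]=$value
done <<'EOF'
eno1=10.0.0.254/24
eno2=172.16.4.129/25
eno3=192.168.2.1/25
tun0=10.10.10.1/32
EOF

echo "${data[eno2]}"    # prints: 172.16.4.129/25
```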
awk provides a simple solution to find the max value following the '/' that will be orders of magnitude faster than a bash script or Unix pipeline using:
awk -F"=|/" '$3 > max { max = $3 } END { print max }' file
Example Use/Output
$ awk -F"=|/" '$3 > max { max = $3 } END { print max }' file
32
Above awk separates the fields using either '=' or '/' as field separator and then keeps the max of the 3rd field $3 and outputs that value using the END {...} rule.
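If the interface name is wanted along with the value, the same rule can remember the first field too. A minimal sketch, with the sample lines fed in through printf rather than a file, and name and result being illustrative variable names:

```shell
result=$(
    printf '%s\n' 'eno1=10.0.0.254/24' 'eno2=172.16.4.129/25' \
                  'eno3=192.168.2.1/25' 'tun0=10.10.10.1/32' |
    # Split on "=" or "/", keep the largest 3rd field and its 1st field.
    awk -F'[=/]' '$3 > max { max = $3; name = $1 } END { print name, max }'
)
echo "$result"    # prints: tun0 32
```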
Bash Solution
If you did want a bash script solution, then you can isolate the wanted parts of each line using [[ .. =~ .. ]] to populate the BASH_REMATCH array and then compare ${BASH_REMATCH[3]} against a max variable. The [[ .. ]] expression with =~ treats everything on the right side as an Extended Regular Expression and stores each parenthesized group (...) as an element in the array BASH_REMATCH, e.g.
#!/bin/bash
[ -z "$1" ] && { printf "filename required\n" >&2; exit 1; }
declare -i max=0
while read -r line; do
[[ $line =~ ^(.*)=(.*)/(.*)$ ]]
((${BASH_REMATCH[3]} > max)) && max=${BASH_REMATCH[3]}
done < "$1"
printf "max: %s\n" "$max"
Using Only POSIX Parameter Expansions
Using parameter expansion with substring removal supported by POSIX shell (Bourne shell, dash, etc..), you could do:
#!/bin/sh
[ -z "$1" ] && { printf "filename required\n" >&2; exit 1; }
max=0
while read -r line; do
[ "${line##*/}" -gt "$max" ] && max="${line##*/}"
done < "$1"
printf "max: %s\n" "$max"
Example Use/Output
After making yourscript.sh executable with chmod +x yourscript.sh, you would do:
$ ./yourscript.sh file
max: 32
(same output for both shell script solutions)
Let me know if you have further questions.

How to test for certain characters in a file

I am currently running a script with an if statement. Before I run the script, I want to make sure the file provided as the first argument has certain characters.
If the file does not have those certain characters in certain spots then the output would be else "File is Invalid" on the command line.
For the if statement to be true, the file needs to have at least one hyphen in Field 1 line 1 and at least one comma in Field one Line one.
How would I create an if statement with perhaps a test command to validate those certain characters are present?
Thanks
I'm new to Linux/Unix. This is my homework, so I haven't really tried anything yet; I'm only brainstorming possible solutions.
function usage
{
echo "usage: $0 filename ..."
echo "ERROR: $1"
}
if [ $# -eq 0 ]
then
usage "Please enter a filename"
else
name="Yaroslav Yasinskiy"
echo $name
date
while [ $# -gt 0 ]
do
if [ -f $1 ]
then
if <--------- here is where the answer would be
starting_data=$1
echo
echo $1
cut -f3 -d, $1 > first
cut -f2 -d, $1 > last
cut -f1 -d, $1 > id
sed 's/$/:/' last > last1
sed '/last:/ d' last1 > last2
sed 's/^ *//' last2 > last3
sed '/first/ d' first > first1
sed 's/^ *//' first1 > first2
sed '/id/ d' id > id1
sed 's/-//g' id1 > id2
paste -d\ first2 last3 id2 > final
cat final
echo ''
else
echo
usage "Coult not find file $1"
fi
shift
done
fi
In answer to your direct question:
For the if statement to be true, the file needs to have at least one
hyphen in Field 1 line 1 and at least one comma in Field one Line one.
How would I create an if statement with perhaps a test command to
validate those certain characters are present?
Bash provides all the tools you need. While you can call awk, you really just need to read the first line of the file into two variables (say a and b) and then use [[ $a =~ regex ]], where regex is an extended regular expression that verifies that the first field (contained in $a) contains both a '-' and a ','.
For details on the [[ =~ ]] expression, see bash(1) - Linux manual page under the section labeled [[ expression ]].
Let's start with read. When you provide two variables, read will read the first field (based on normal word-splitting given by IFS, the Internal Field Separator, which defaults to space, tab, and newline: $' \t\n'). So by doing read -r a b you read the first field into a and the rest of the line into b (you don't care about b for your test).
Your regex can be ([-]+.*[,]+|[,]+.*[-]+), which is an (x|y) expression, i.e. x OR y, where x is [-]+.*[,]+ (one or more '-', then any characters, then one or more ',') and y is [,]+.*[-]+ (one or more ',', then any characters, then one or more '-'). So by using the '|' your regex will accept either a hyphen, zero or more characters, and then a comma, or a comma, zero or more characters, and then a hyphen, in the first field.
How do you read the line? With simple redirection, e.g.
read -r a b < "$1"
So your conditional test in your script would look something like:
if [ -f $1 ]
then
read -r a b < "$1"
if [[ $a =~ ([-]+.*[,]+|[,]+.*[-]+) ]] # <-- here is where the ...
then
starting_data=$1
...
else
echo "File is Invalid" >&2 # redirection to 2 (stderr)
fi
else
echo
usage "Coult not find file $1"
fi
shift
...
Example Test Files
$ cat valid
dog-food, cat-food, rabbit-food
50lb 16lb 5lb
$ cat invalid
dogfood, catfood, rabbitfood
50lb 16lb 5lb
Example Use/Output
$ read -r a b < valid
if [[ $a =~ ([-]+.*[,]+|[,]+.*[-]+) ]]; then
echo "file valid"
else
echo "file invalid"
fi
file valid
and for the file without the certain characters:
$ read -r a b < invalid
if [[ $a =~ ([-]+.*[,]+|[,]+.*[-]+) ]]; then
echo "file valid"
else
echo "file invalid"
fi
file invalid
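If the regex feels heavy, the same "contains a hyphen and a comma" check can probably be written with plain pattern matching inside [[ ]]. This is a sketch, not the method from the answer above: first_field_ok is a made-up function name, and the two test files are recreated in a temporary directory:

```shell
#!/bin/bash
# first_field_ok FILE: succeed when the first whitespace-separated field
# of the first line contains both a '-' and a ','.
first_field_ok() {
    local a b
    read -r a b < "$1"
    [[ $a == *-* && $a == *,* ]]
}

tmp=$(mktemp -d)
printf 'dog-food, cat-food, rabbit-food\n50lb 16lb 5lb\n' > "$tmp/valid"
printf 'dogfood, catfood, rabbitfood\n50lb 16lb 5lb\n' > "$tmp/invalid"
for f in valid invalid; do
    if first_field_ok "$tmp/$f"; then
        echo "$f: file valid"
    else
        echo "$f: file invalid"
    fi
done
rm -r "$tmp"
# prints:
# valid: file valid
# invalid: file invalid
```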
Now you really have to concentrate on eliminating the spawning of at least a dozen subshells: you call cut 3 times, sed 7 times, paste once and then cat. While it is good that you are thinking through what you need to do and getting it working, as mentioned in my comment, any time you are looping you want to reduce the number of subshells spawned to the greatest extent possible. I suspect, as @Mig answered, that awk is the proper tool and can likely replace all 12 subshells with a single call to awk.
I personally would use awk for this whole part, since you want to test fields and create a string with concatenated fields. Awk is perfect for that.
But here is a small script which shows how you could just test your file's first line:
if [[ $(head -n 1 file.csv | awk '$1~/-/ && $1~/,/ {print "MATCH"}') == 'MATCH' ]]; then
echo "yes"
else
echo "no"
fi
It looks overkill when not doing the whole thing in awk but it works. I am sure there is a way to test only one regex, but that would involve knowing which flavour of awk you have because I think they don't all use the same regex engine. Therefore I left this out for the sake of simplicity.
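A slightly leaner variant is to drop head and the string comparison and let awk report the result through its exit status. This is a sketch under that assumption; first_line_ok is a hypothetical helper name and the sample file is created on the fly:

```shell
# first_line_ok FILE: awk looks at line 1 only and exits 0 when its
# first field contains both a '-' and a ','.
first_line_ok() {
    awk 'NR == 1 { ok = ($1 ~ /-/ && $1 ~ /,/); exit }
         END     { exit !ok }' "$1"
}

sample=$(mktemp)
printf 'dog-food, cat-food\n50lb 16lb\n' > "$sample"
if first_line_ok "$sample"; then
    echo "yes"
else
    echo "no"
fi
rm -f "$sample"
# prints: yes
```

The exit in the first rule stops awk from reading the rest of the file, and an empty file counts as no match because ok is never set.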

Decrement variables that contain letters

I have a set of valid characters [0-9a-zA-Z] and a variable that is assigned one of these characters. What I want to do is to be able to decrement that variable to the next in the set.
I can't figure out how to decrement letters; it works for numbers only.
#!/bin/bash
test=b
echo $test # this shows 'b'
let test-=1
echo $test # I want it to be 'a'
The advantage of
test=$(tr 1-9a-zA-Z 0-9a-zA-Y <<<"$test")
is that it correctly (I think) decrements a to 9 and A to z. And if that is not the order you want, it is easy to adjust.
See man tr for details. This is the GNU version of tr; character ranges are not guaranteed by POSIX, but most tr implementations have them. <<< "here strings" are also a common extension, which bash implements.
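To see the mapping at the boundaries, the tr call can be wrapped in a small function and tried on a few characters. This is a sketch; decrement_char is an illustrative name, and it pipes through printf instead of using a here string:

```shell
# decrement_char C: map one character to its predecessor in 0-9a-zA-Z.
# '0' is not in the first set, so it is passed through unchanged.
decrement_char() {
    printf '%s' "$1" | tr '1-9a-zA-Z' '0-9a-zA-Y'
}

for c in b a A 0; do
    printf '%s -> %s\n' "$c" "$(decrement_char "$c")"
done
# prints:
# b -> a
# a -> 9
# A -> z
# 0 -> 0
```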
test=$(printf "\\$(printf '%03o' "$(($(printf '%d' "'$test") - 1 ))")")
you could try this:
#!/bin/bash
test=b
if [[ $test == A || $test == a || $test == 0 ]]
then
echo "character already at lowest value"
else
# convert $test to decimal digit
test_digit=$(printf '%d' "'$test")
decremented=$(( test_digit - 1 ))
# print $decremented as a char
printf "\\$(printf '%03o' "$decremented")\n"
fi
reference:
http://mywiki.wooledge.org/BashFAQ/071
If we set a variable (say a) to the whole string of characters:
$ a=$( IFS=''; set -- {0..9} {a..z} {A..Z}; echo "$*"); echo "$a"
0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ
We may take advantage of the fact that bash "arithmetic" may use a base up to 62 (in the same order as the letters presented).
$ test="A"; echo "${a:$((62#$test-1)):1}"
z
This works only for "one character" (and not zero 0).
It may be expanded to several characters, but that is not being asked.

bash shell script concatenate string with period char

I am trying to create following string
Beta-3.8.0
but shell script always omits the . period char no matter what I do.
echo "$readVersion"
if [ -z $readVersion ]
then
echo "readVersion is empty"
exit 1
fi;
IFS=.
set $readVersion
newVersion=$(echo "$2 + 1" | bc)
newBranch="Beta-$1.$newVersion.$3"
echo $newBranch
prints:
3.8.0
Beta-3 9 0
I have also tried
newBranch='Beta-'$1'.'$newVersion'.'$3
or
newBranch="Beta-{$1}.{$newVersion}.{$3}"
although echo "$1.$newVersion.$3" seems to print the right value. Why doesn't the variable work?
I need the variable to use later on in the script...
You can save and restore the IFS once you are done.
oldIFS=$IFS
IFS=.
set $readVersion
newVersion=$(echo "$2 + 1" | bc)
IFS=$oldIFS
newBranch="Beta-$1.$newVersion.$3"
echo "$newBranch"
Or you can quote when printing:
echo "$newBranch"
The former is a better idea IMO since it conveys your intention and would make the rest of the code use the "correct" IFS. The latter just circumvents the problem.
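Another way to keep the IFS change from leaking is to set it for a single read command only. This is a sketch assuming the version always has three dot-separated parts; the names major, minor and patch are illustrative:

```shell
#!/bin/bash
readVersion=3.8.0
# The IFS=. prefix applies to this one read command; the shell's own
# IFS is left untouched afterwards.
IFS=. read -r major minor patch <<< "$readVersion"
newBranch="Beta-$major.$((minor + 1)).$patch"
echo "$newBranch"    # prints: Beta-3.9.0
```

Shell arithmetic is enough for the increment here, so bc is not needed for integer version parts.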

bash palindrome grep loop if then else missing '

My system administration professor just started teaching us bash, and he wants us to write a bash script using grep to find all 3-45 letter palindromes in the Linux dictionary without using reverse. I'm getting an error on my if statement saying I'm missing a '
UPDATED CODE:
front='\([a-z]\)'
front_s='\([a-z]\)'
numcheck=1
back='\1'
middle='[a-z]'
count=3
while [ $count -ne "45" ]; do
if [[ $(($count % 2)) == 0 ]]
then
front=$front$front_s
back=+"\\$numcheck$back"
grep "^$front$back$" /usr/share/dict/words
count=$((count+1))
else
grep "^$front$middle$back$" /usr/share/dict/words
numcheck=$((numcheck+1))
count=$((count+1))
fi
done
You have four obvious problems here:
First about a misplaced and unescaped backslash:
back="\\$numcheck$back" # and not back="$numcheck\$back"
Second is that you only want to increment numcheck if count is odd.
Third: in the line
front=$front$front
you're doubling the number of patterns in front! Hey, that yields exponential growth, hence the "Argument list too long" explosion. To fix this, add a variable, say, front_step:
front_step='\([a-z]\)'
front=$front_step
and when you increment front:
front=$front$front_step
With these fixed, you should be good!
The fourth flaw is that grep's back-references may only have one digit: from man grep:
Back References and Subexpressions
The back-reference \n, where n is a single digit, matches the substring
previously matched by the nth parenthesized subexpression of the
regular expression.
In your approach, we'll need up to 22 back-references. That's too many for grep. I doubt there are any such long palindromes, though.
Also, you're grepping the file 43 times… that's a bit too much.
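For comparison, here is a sketch that sidesteps the back-reference limit entirely by comparing characters from both ends in pure bash. It does not use grep, so it may not satisfy the assignment as stated, and is_palindrome is a made-up function name:

```shell
#!/bin/bash
# is_palindrome WORD: walk inward from both ends and fail on the first
# pair of characters that differ.
is_palindrome() {
    local w=$1 i j
    for (( i = 0, j = ${#w} - 1; i < j; i++, j-- )); do
        [[ ${w:i:1} == "${w:j:1}" ]] || return 1
    done
    return 0
}

for w in level noon hello; do
    is_palindrome "$w" && echo "$w is a palindrome"
done
# prints:
# level is a palindrome
# noon is a palindrome
```

Fed from a single while read -r w loop over the dictionary, a check like this reads the file once instead of once per word length.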
Try this:
#!/bin/bash
for w in $(grep -E '^[[:alnum:]]{3,45}$' /usr/share/dict/words); do
    # Reverse the word: one character per line, flip the lines, rejoin.
    rev_w=$(echo "$w" | sed 's/\(.\)/\1\n/g' | tac | tr -d '\012')
    if [[ "$w" == "$rev_w" ]]; then
        echo "$w == is a palindrome"
    fi
done
OR
#!/bin/bash
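front_step='\([a-z]\)'
front=$front_step
numcheck=1
back='\1'
middle='[a-z]'
count=3
while [ $count -ne "45" ]; do
if [[ $(($count % 2)) == 0 ]]
then
## Append one more group; front=$front$front would double the pattern each time.
front=$front$front_step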
back="\\$numcheck$back"
grep "^$front$back$" /usr/share/dict/words
else
grep "^$front$middle$back$" /usr/share/dict/words
## Thanks to gniourf for catching this.
numcheck=$((numcheck+1))
fi
count=$((count+1))
## Uncomment the following if you want to see one by one and run script using bash -x filename.sh
#echo Press any key to continue: ; read toratora;
done
