Counter increment in Bash loop not working - linux

I have the following simple script where I am running a loop and want to maintain a COUNTER. I am unable to figure out why the counter is not updating. Is it due to subshell that's getting created? How can I potentially fix this?
#!/bin/bash
WFY_PATH=/var/log/nginx
WFY_FILE=error.log
COUNTER=0
grep 'GET /log_' $WFY_PATH/$WFY_FILE | grep 'upstream timed out' | awk -F ', ' '{print $2,$4,$0}' | awk '{print "http://domain.example"$5"&ip="$2"&date="$7"&time="$8"&end=1"}' | awk -F '&end=1' '{print $1"&end=1"}' |
(
while read WFY_URL
do
echo $WFY_URL #Some more action
COUNTER=$((COUNTER+1))
done
)
echo $COUNTER # output = 0

First, you are not increasing the counter. Changing COUNTER=$((COUNTER)) into COUNTER=$((COUNTER + 1)) or COUNTER=$[COUNTER + 1] will increase it.
Second, as you surmise, it is trickier to propagate variables from a subshell back to the parent shell. Variables set in a subshell are local to that child process and are not available outside it.
One way to solve it is using a temp file for storing the intermediate value:
TEMPFILE=/tmp/$$.tmp
echo 0 > $TEMPFILE
# Loop goes here
# Fetch the value and increase it
COUNTER=$[$(cat $TEMPFILE) + 1]
# Store the new value
echo $COUNTER > $TEMPFILE
# Loop done, script done, delete the file
unlink $TEMPFILE
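A minimal sketch of the temp-file idea as a complete script, assuming mktemp is available. The printf input is only a stand-in for the real grep/awk pipeline, and the counter is written once after the loop rather than on every iteration.

```shell
#!/bin/bash
# Sketch: carry a counter out of a pipeline subshell through a temp file.
# The printf input below is sample data standing in for the real pipeline.
TEMPFILE=$(mktemp) || exit 1
trap 'rm -f "$TEMPFILE"' EXIT          # remove the file even if the script aborts

COUNTER=0
printf '%s\n' url1 url2 url3 | {
    while read -r WFY_URL
    do
        COUNTER=$((COUNTER + 1))       # changes only the subshell's copy
    done
    echo "$COUNTER" > "$TEMPFILE"      # persist the final value once
}

COUNTER=$(cat "$TEMPFILE")             # read it back in the parent shell
echo "$COUNTER"
```

Using mktemp instead of /tmp/$$.tmp avoids predictable file names, and the trap takes care of cleanup.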

COUNTER=1
while [ Your != "done" ]
do
echo " $COUNTER "
COUNTER=$[$COUNTER +1]
done
TESTED BASH: Centos, SuSE, RH

COUNTER=$((COUNTER+1))
is quite a clumsy construct in modern programming.
(( COUNTER++ ))
looks more "modern". You can also use
let COUNTER++
if you think that improves readability. Sometimes, Bash gives too many ways of doing things - Perl philosophy I suppose - when perhaps the Python "there is only one right way to do it" might be more appropriate. That's a debatable statement if ever there was one! Anyway, I would suggest the aim (in this case) is not just to increment a variable but (general rule) to also write code that someone else can understand and support. Conformity goes a long way to achieving that.
HTH

Try to use
COUNTER=$((COUNTER+1))
instead of
COUNTER=$((COUNTER))

I think this single awk call is equivalent to your grep|grep|awk|awk pipeline: please test it. Your last awk command appears to change nothing at all.
The problem with COUNTER is that the while loop is running in a subshell, so any changes to the variable vanish when the subshell exits. You need to access the value of COUNTER in that same subshell. Or take @DennisWilliamson's advice, use a process substitution, and avoid the subshell altogether.
awk '
/GET \/log_/ && /upstream timed out/ {
split($0, a, ", ")
split(a[2] FS a[4] FS $0, b)
print "http://example.com" b[5] "&ip=" b[2] "&date=" b[7] "&time=" b[8] "&end=1"
}
' "$WFY_PATH/$WFY_FILE" | {
while read WFY_URL
do
echo $WFY_URL #Some more action
(( COUNTER++ ))
done
echo $COUNTER
}

count=0
base=1
(( count += base ))

Instead of using a temporary file, you can avoid creating a subshell around the while loop by using process substitution.
while ...
do
...
done < <(grep ...)
By the way, you should be able to transform all that grep, grep, awk, awk, awk into a single awk.
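A small runnable sketch of the difference, assuming a non-interactive Bash script with the lastpipe option off (the default). The make_urls function is a hypothetical stand-in for the real grep/awk pipeline.

```shell
#!/bin/bash
# Hypothetical stand-in for the real grep/awk pipeline.
make_urls() { printf '%s\n' url1 url2 url3; }

# Piping into the loop: the loop runs in a subshell.
PIPED=0
make_urls | while read -r WFY_URL; do PIPED=$((PIPED + 1)); done
echo "piped: $PIPED"                   # the parent's PIPED is unchanged

# Process substitution: the loop runs in the current shell.
COUNTER=0
while read -r WFY_URL
do
    COUNTER=$((COUNTER + 1))
done < <(make_urls)
echo "process substitution: $COUNTER"
```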
Starting with Bash 4.2, there is a lastpipe option that runs the last command of a pipeline in the current shell context. The lastpipe option has no effect if job control is enabled.
bash -c 'echo foo | while read -r s; do c=3; done; echo "$c"'
bash -c 'shopt -s lastpipe; echo foo | while read -r s; do c=3; done; echo "$c"'
3

minimalist
counter=0
((counter++))
echo $counter

There were two conditions that together caused the expression ((var++)) to fail for me:
I set bash to strict mode (set -euo pipefail), and I started my increment at zero (0).
Starting at one (1) is fine, but starting at zero makes the post-increment expression evaluate to 0, so (( )) returns exit status 1, which strict mode treats as a failure and aborts the script.
I can use either ((var+=1)) or var=$((var+1)) to avoid this behavior.
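The behavior can be reproduced with a short sketch like the one below. The variable names are illustrative; the `|| status=$?` guard is only there so that the failing status can be inspected without set -e aborting the script.

```shell
#!/bin/bash
set -euo pipefail

zero=0
status=0
# Post-increment yields the old value 0, so (( )) returns exit status 1.
# The "||" guard keeps set -e from aborting here.
((zero++)) || status=$?
echo "zero=$zero status=$status"

var=0
((var += 1))          # expression value is 1, so the exit status is 0
var=$((var + 1))      # a plain assignment, exit status 0
echo "var=$var"
```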

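This is all you need to do:
((COUNTER++))
(A bare $((COUNTER++)) on a line by itself expands to the old value, which the shell then tries to run as a command. Use the $((...)) form only inside another command, such as echo $((COUNTER++)).)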
Here's an excerpt from Learning the bash Shell, 3rd Edition, pp. 147, 148:
bash arithmetic expressions are equivalent to their counterparts in
the Java and C languages. Precedence and associativity are the same
as in C. Table 6-2 shows the arithmetic operators that are supported.
Although some of these are (or contain) special characters, there is
no need to backslash-escape them, because they are within the $((...))
syntax.
..........................
The ++ and -- operators are useful when you want to increment or
decrement a value by one. They work the same as in Java and C,
e.g., value++ increments value by 1. This is called post-increment;
there is also a pre-increment: ++value. The difference becomes evident
with an example:
$ i=0
$ echo $i
0
$ echo $((i++))
0
$ echo $i
1
$ echo $((++i))
2
$ echo $i
2
See http://www.safaribooksonline.com/a/learning-the-bash/7572399/

This is a simple example
COUNTER=1
for i in {1..5}
do
echo $COUNTER;
#echo "Welcome $i times"
((COUNTER++));
done

The source script has a problem with the subshell.
In the first example you probably do not need the subshell at all, although we do not know what is hidden under "Some more action".
The most popular answer has a hidden bug: it restores the counter inside the loop, which increases I/O and will not work with a subshell.
Do not forget to add the '\' sign at the end of each continued line; it tells the bash interpreter that the command continues on the next line. I hope this helps you or anybody else. In my opinion, though, this script should be fully converted to an AWK script, or else rewritten in Python using regular expressions. Perl would also work, but its popularity has declined over the years, so Python is the better choice.
Corrected version without subshell. In Bash a while loop at the end of a pipeline still runs in a subshell even without the brackets, so the loop reads from a process substitution instead:
#!/bin/bash
WFY_PATH=/var/log/nginx
WFY_FILE=error.log
COUNTER=0
while read -r WFY_URL
do
echo "$WFY_URL" #Some more action
COUNTER=$((COUNTER+1))
done < <(
grep 'GET /log_' "$WFY_PATH/$WFY_FILE" | grep 'upstream timed out' |\
awk -F ', ' '{print $2,$4,$0}' |\
awk '{print "http://example.com"$5"&ip="$2"&date="$7"&time="$8"&end=1"}' |\
awk -F '&end=1' '{print $1"&end=1"}'
)
echo $COUNTER # output = number of URLs read
Version with subshell if it is really needed
#!/bin/bash
TEMPFILE=/tmp/$$.tmp #I've got it from the most popular answer
WFY_PATH=/var/log/nginx
WFY_FILE=error.log
COUNTER=0
grep 'GET /log_' $WFY_PATH/$WFY_FILE | grep 'upstream timed out' |\
awk -F ', ' '{print $2,$4,$0}' |\
awk '{print "http://example.com"$5"&ip="$2"&date="$7"&time="$8"&end=1"}' |\
awk -F '&end=1' '{print $1"&end=1"}' |\
(
while read WFY_URL
do
echo $WFY_URL #Some more action
COUNTER=$((COUNTER+1))
done
echo $COUNTER > $TEMPFILE #store counter only once, do it after loop, you will save I/O
)
COUNTER=$(cat $TEMPFILE) #restore counter
unlink $TEMPFILE
echo $COUNTER # output = number of URLs read

It seems that you didn't update the counter in the script; use counter++.

Related

awk usage in a variable

actlist file contains around 15 records. I want to print/store each row in a variable to perform further action. script runs but echo $j displays blank value. What is the issue?
my script:
#/usr/bin/sh
acList=/root/john/actlist
Rowcount=`wc -l $acList | awk -F " " '{print $1}'`
for ((i=1; i<=Rowcount; i++)); do
j=`awk 'FNR == $i{print}' $acList`
echo $j
done
file: actlist
cat > actlist
5663233332 2223 2
5656556655 5545 5
4454222121 5555 5
.
.
.
The issue is related to quoting and to the way the shell interpolates variables.
More specifically, when you write
j=`awk "FNR == $i{print}" $acList`
the AWK code is enclosed in double quotes. This is necessary if you want the shell to substitute $i with the actual value stored in the i variable.
On the other hand, if you write
j=`awk 'FNR == $i{print}' $acList`
i.e. with single quotes, the shell passes $i to AWK literally. AWK then treats i as its own (unset) variable, so $i means $0, the whole line, and the comparison with FNR does not select the intended row.
Hence the fixed code will read:
#!/bin/bash
acList=/root/john/actlist
Rowcount=`wc -l $acList | awk -F " " '{print $1}'`
for ((i=1; i<=Rowcount; i++)); do
j=`awk "FNR == $i{print}" $acList`
echo $j
done
Remember: it is always the shell that does variable interpolation before calling other commands.
Having said that, there are some places, in supplied code, where some improvements could be devised. But that's another story.
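One such improvement, sketched here under assumptions, is to pass the shell value to AWK with -v instead of splicing it into the program text, so the AWK code can stay in single quotes. The temp file and its three sample rows stand in for /root/john/actlist, and the AWK variable name n is illustrative.

```shell
#!/bin/bash
# Sample data standing in for /root/john/actlist.
acList=$(mktemp) || exit 1
printf '%s\n' '5663233332 2223 2' '5656556655 5545 5' '4454222121 5555 5' > "$acList"

i=2
# -v copies the shell value into the AWK variable n before the program runs.
j=$(awk -v n="$i" 'FNR == n { print; exit }' "$acList")
echo "$j"

rm -f "$acList"
```

The exit after print stops AWK from reading the rest of the file once the wanted row has been found.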
Unfortunately, all your script does is print the contents of the input file, so without more information we can't help you figure out the right approach for whatever it is you really want to do. Chances are this is the right starting point:
acList=/root/john/actlist
awk '
{ print }
' "$acList"
I think you would probably be better off with this for parsing your file:
#!/bin/bash
acList=/root/john/actlist
while read a b c; do
echo $a, $b, $c
done < "$acList"
Output:
5663233332, 2223, 2
5656556655, 5545, 5
4454222121, 5555, 5
Updated
Whilst the above demonstrates the concept I was suggesting, as @EdMorton rightly says in the comments section, the following code would be more robust for a production environment.
#!/bin/bash
acList=/root/john/actlist
# No IFS= here: an empty IFS would put the whole line into a.
while read -r a b c; do
echo "$a, $b, $c"
done < "$acList"

Command to count the characters present in the variable

I am trying to count the number of characters present in the variable. I used the below shell command. But I am getting error - command not found in line 4
#!/bin/bash
for i in one; do
n = $i | wc -c
echo $n
done
Can someone help me in this?
In bash you can just write ${#string}, which will return the length of the variable string, i.e. the number of characters in it.
Something like this:
#!/bin/bash
for i in one; do
n=$(echo $i | wc -c)
echo $n
done
Assignments in bash cannot have spaces around the equals sign. In addition, you want to capture the output of the command you run and assign that to n. As written, the statement runs a command named n with the arguments = and $i, which is why you get "command not found".
Use the following instead:
#!/bin/bash
for i in one; do
n=`echo $i | wc -c`
echo $n
done
It can be as simple as that:
str="abcdef"; wc -c <<< "$str"
7
But mind you that end of line counts as a character:
str="abcdef"; cat -A <<< "$str"
abcdef$
If you need to remove it:
str="abcdef"; tr -d '\n' <<< "$str" | wc -c
6
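The approaches above can be compared side by side in a short sketch. The variable names are illustrative, and tr -d ' ' is only there because some wc implementations pad their output with spaces. Note that wc -c counts bytes, so for multibyte text its result can differ from ${#str}.

```shell
#!/bin/bash
str="abcdef"

len=${#str}                                                # length in characters, no subprocess
with_newline=$(wc -c <<< "$str" | tr -d ' ')               # the here-string appends a newline
without_newline=$(printf '%s' "$str" | wc -c | tr -d ' ')  # printf adds no newline

echo "$len $with_newline $without_newline"
```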

Operation Precedence

I want to store the result of a command into an array variable. I'm having trouble because the command itself contains variables that must be resolved before its execution. For example:
for ((i=1; i<=4; i++)); do
NEXT=$(( i + 1 ))
MYARRAY[i]=$(cat $VARIABLE | uniq | sed -n '$NEXTp')
done
The "cat $VARIABLE" command is being processed correctly. The problem is with "$NEXT" substitution that is immediately followed by a "p" character. How can I force the script to resolve $NEXT variable before executing the command and store the results inside MYARRAY[i]?
Thanks.
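The typical solution is ${NEXT}p, placed inside double quotes rather than single quotes so that the shell expands it: sed -n "${NEXT}p"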
Notice that what you are doing is fairly atypical. It is more usual to assign to an array using something like:
IFS='
'
MYARRAY=( $( < $VARIABLE uniq | sed -n 1,5p ))
This will assign MYARRAY[0], which your original code does not do, but it's not clear to me whether that is intentional or an attempt to adjust the indexing. As always, a useless use of cat (UUOC) is to be discouraged. Although uniq can take $VARIABLE as an argument, redirection is a good idiom, so I use it in the example to demonstrate a simple way to eliminate UUOC in 99.9% of the cases where it appears.
You could add the 'p' to NEXT before using it in the expression:
for ((i=1; i<=4; i++)); do
NEXT=$(expr $i + 1)
NEXT+='p'
MYARRAY[i]=$(cat $VARIABLE | uniq | sed -n $NEXT)
done
You can use script like this:
for ((i=1; i<=4; i++)); do
NEXT=$(expr $i + 1)
MYARRAY[i]=$(cat $VARIABLE | uniq | sed -n $NEXT'p')
done
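For reference, a self-contained sketch of the loop with the braces-and-double-quotes form. The temp file and its contents are made-up sample data standing in for the file named by $VARIABLE.

```shell
#!/bin/bash
# Made-up sample input standing in for the file named by $VARIABLE.
VARIABLE=$(mktemp) || exit 1
printf '%s\n' a a b c c d e > "$VARIABLE"

MYARRAY=()
for ((i=1; i<=4; i++)); do
    NEXT=$((i + 1))
    # Double quotes let the shell expand NEXT; the braces keep the "p" separate.
    MYARRAY[i]=$(uniq "$VARIABLE" | sed -n "${NEXT}p")
done
echo "${MYARRAY[@]}"

rm -f "$VARIABLE"
```

With this sample input, uniq yields a, b, c, d, e, so the loop stores lines 2 through 5 in MYARRAY[1] through MYARRAY[4].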

Variable scope for bash shell scripts and functions in the script

I'm a little confused with my script regarding functions, variable scope, and possibly subshells.
I saw in another post that pipes spawn a subshell and the parent shell can't access variables from the subshell. Is this the same case with cmds run in backticks too?
To not bore people, I've shortened my 100+ line script but I tried to remember to leave in the important elements (i.e. backticks, pipes etc). Hopefully I didn't leave anything out.
global1=0
global2=0
start_read=true
function testfunc {
global1=9999
global2=1111
echo "in testfunc"
echo $global1
echo $global2
}
file1=whocares
file2=whocares2
for line in `cat $file1`
do
for i in `grep -P "\w+ stream" $file2 | grep "$line"` # possible but unlikely problem spot
do
end=$(echo $i | cut -d ' ' -f 1-4 | cut -d ',' -f 1) # possible but unlikely spot
duration=`testfunc $end` # more likely problem spot
done
done
echo "global1 = $global1"
echo "global2 = $global2"
So when I run my script, the last line says global1 = 0. However, in my function testfunc, global1 gets set to 9999 and the debug msgs print out that within the function at least, it is 9999.
Two questions here:
Do the backticks spawn a subshell and thus make my script not work?
How do I work around this issue?
You can try something like
global1=0
global2=0
start_read=true
function testfunc {
global1=9999
global2=1111
echo "in testfunc"
echo $global1
echo $global2
duration=something
}
file1=whocares
file2=whocares2
for line in `cat $file1`
do
for i in `grep -P "\w+ stream" $file2 | grep "$line"` # possible but unlikely problem spot
do
end=$(echo $i | cut -d ' ' -f 1-4 | cut -d ',' -f 1) # possible but unlikely spot
testfunc $end # more likely problem spot
done
done
echo "global1 = $global1"
echo "global2 = $global2"
Do the backticks spawn a subshell and thus make my script not work?
Yes, they do, and any changes made to variables in a subshell are not visible in the parent shell.
How do I work around this issue?
You can try this loop, which avoids the command substitutions in the loop headers:
while read line
do
while read i
do
end=$(echo $i | cut -d ' ' -f 1-4 | cut -d ',' -f 1)
duration=$(testfunc "$end")
done < <(grep -P "\w+ stream" "$file2" | grep "$line")
done < "$file1"
PS: testfunc will still be called in a subprocess because of the command substitution, so its changes to global1 and global2 will still be lost.
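A stripped-down sketch of the effect, with a simplified testfunc. The "result" it computes and the variable after_subst are purely illustrative. Calling the function directly and letting it store its result in a global keeps everything in the current shell.

```shell
#!/bin/bash
global1=0
duration=""

# Simplified stand-in for the real testfunc.
testfunc() {
    global1=9999
    duration="len-${#1}"      # illustrative result stored in a global, not echoed
}

out=$(testfunc "abc")         # command substitution: the function runs in a subshell
after_subst=$global1          # the parent's global1 is untouched
echo "after command substitution: $after_subst"

testfunc "abc"                # direct call: runs in the current shell
echo "after direct call: $global1 $duration"
```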

Parsing a CSV string in Shell Script and writing it to a File

I am not a Linux scripting expert and I have exhausted my knowledge on this matter. Here is my situation.
I have a list of states passed as a command line argument to a shell script ( e.g "AL,AK,AS,AZ,AR,CA..." ). The Shell script needs to extract each of the state code and write it to a file ( states.txt) , with each state in one line. See below
AL
AK
AS
AZ
AR
CA
..
..
How can this be achieved using a linux shell script.
Thanks in advance.
Use tr:
echo "AL,AK,AS,AZ,AR,CA" | tr ',' '\n' > states.txt
echo "AL,AK,AS,AZ,AR,CA" | awk -F, '{for (i = 1; i <= NF; i++) print $i}';
Naive solution:
echo "AL,AK,AS,AZ,AR,CA" | sed 's/,/\n/g'
I think awk is the simplest solution, but you could try using cut in a loop.
Sample script (outputs to stdout, but you can just redirect it):
#!/bin/bash
# Check for input
if (( ${#1} == 0 )); then
echo No input data supplied
exit
fi
# Initialise first input
i=$1
# While $i still contains commas
while { echo $i| grep , > /dev/null; }; do
# Get first item of $i
j=`echo $i | cut -d ',' -f '1'`
# Shift off the first item of $i
i=`echo $i | cut --complement -d ',' -f '1'`
echo $j
done
# Display the last item
echo $i
Then you can just run it as ./script.sh "AL,AK,AS,AZ,AR,CA" > states.txt (assuming you save it as script.sh in the local directory and give it execute permission)
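If the script is run by Bash, the splitting can also be done without external commands. This is a sketch that uses a fixed sample string where a real script would use "$1"; the array name states is illustrative.

```shell
#!/bin/bash
# Sample data; in a real script this would be "$1".
input="AL,AK,AS,AZ,AR,CA"

# Split on commas into an array; IFS is changed only for this read.
IFS=',' read -r -a states <<< "$input"

# Write one state per line.
printf '%s\n' "${states[@]}" > states.txt
```

Setting IFS as a prefix to read limits the change to that one command, so the rest of the script keeps the default word splitting.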

Resources