Why do I get an extra 0 on my script


I don't know why I get an extra 0 when I run my script.
The script runs a SQL query and saves the result to a file, valor.txt.
This is my array: array=(50 60 70)
valor.txt:
count | trn_hst_id | trn_msg_host
-------+------------+--------------
11 | 50 | Aprobada
2 | 70 | Aprobada
(2 rows)
Code:
function service_status {
cd
cat valor.txt | grep $1 | gawk '{print $1}' FS="|" | sed "s/ //g"
if [ $? -eq 0 ]; then
echo -n 0
else
echo -n $1
fi
}
echo "<prtg>"
# <-- Start
for i in "${array[@]}"
do
echo -n " <result>
<channel>$i</channel>
<value>"
service_status $i
echo "</value>
</result>"
done
# End -->
echo "</prtg>"
exit
And this is my output.
<prtg>
<result>
<channel>50</channel>
<value>11
0</value>
</result>
<result>
<channel>60</channel>
<value>0</value>
</result>
<result>
<channel>70</channel>
<value>2
0</value>
</result>
</prtg>
Why do I get the 0 here?
<value>2
0</value>

If I understand your comment correctly, you want to print the count. That is the value of the count column, if present in valor.txt, or 0 if the trn_hst_id in array is not in valor.txt. This should work (though not tested):
function service_status {
val=$(cat ~/valor.txt | grep $1 | gawk '{print $1}' FS="|" | sed "s/ //g")
# ^^ so you don't need to "cd" each time
# Save the value into "$val"
echo -n "${val:-0}" # If there is nothing in $val, print a 0
}
The "${val:-0}" sequence expands as "$val", if $val has text in it, or as a literal 0 otherwise. If the $1 wasn't in valor.txt, $val will be empty, so you will get a zero. See the wiki for more about how :- and friends work.
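As a minimal sketch of the ${val:-0} expansion on its own (show_count is a hypothetical helper name used only for illustration, not part of the original script):

```shell
#!/bin/bash
# show_count is a hypothetical helper used only for illustration.
# It prints its first argument, or 0 when the argument is empty or missing.
show_count() {
    local val=$1
    printf '%s' "${val:-0}"
}

show_count 11; echo    # a non-empty value is printed unchanged
show_count ""; echo    # an empty value falls back to 0
```

The same expansion works directly on the result of a command substitution, which is how the function above uses it.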

The "0" comes from the echo -n 0 inside the function. The pipeline has already printed the count, and then the 'if' tests $?, which is the exit status of the last command in the pipeline (sed). sed succeeds whether or not grep found anything, so echo -n 0 runs every time.
It is not clear from the code why it is written this way. It is evidently meant to extract certain values from a file, but the 'if' condition checks the wrong thing: the return code of 'sed', which I bet is not what was intended (a better candidate would be the return code of 'grep').
So I would write the function like this:
function service_status {
cd
var=$(cat valor.txt | grep $1 | gawk '{print $1}' FS="|" | sed "s/ //g")
if [ -z "$var" ]; then
echo -n 0
else
echo -n "$var"
fi
}
The variable 'var' will contain the result of the "search command". If the search returns nothing, 'var' is empty and the function prints '0'; otherwise it prints the content of 'var'.
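One caveat with grep $1 is that it matches the id anywhere on the line, so an id of 50 would also match a row whose count happens to be 50. A possible sketch that compares only the trn_hst_id column, assuming the psql-style layout shown in the question (lookup_count and its second argument, the file path, are hypothetical names introduced here):

```shell
#!/bin/bash
# lookup_count is a hypothetical name: $1 is the trn_hst_id to look up,
# $2 is the file holding the query output.
lookup_count() {
    awk -F'|' -v id="$1" '
        {
            count = $1; host = $2
            gsub(/ /, "", count)      # strip the column padding
            gsub(/ /, "", host)
        }
        host == id { printf "%s", count; found = 1; exit }
        END { if (!found) printf "0" }    # id not present: report 0
    ' "$2"
}

# Demo on the sample data from the question, fed through stdin.
lookup_count 70 /dev/stdin <<'EOF'
 count | trn_hst_id | trn_msg_host
-------+------------+--------------
    11 |         50 | Aprobada
     2 |         70 | Aprobada
(2 rows)
EOF
echo
```

Inside the loop, service_status $i could then be replaced by a call such as lookup_count "$i" ~/valor.txt.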

Related

Bash function with input fails awk command

I am writing a function in a Bash script that should return the lines of CSV files (with headers) that contain more commas than the header line. This can happen because some values inside these files contain commas. For quality control, I must identify these lines so they can be cleaned up later. What I have currently:
#!/bin/bash
get_bad_lines () {
local correct_no_of_commas=$(head -n 1 $1/$1_0_0_0.csv | tr -cd , | wc -c)
local no_of_files=$(ls $1 | wc -l)
for i in $(seq 0 $(( ${no_of_files}-1 )))
do
# Check that the file exist
if [ ! -f "$1/$1_0_${i}_0.csv" ]; then
echo "File: $1_0_${i}_0.csv not found!"
continue
fi
# Search for error-lines inside the file and print them out
echo "$1_0_${i}_0.csv has over $correct_no_of_commas commas in the following lines:"
grep -o -n '[,]' "$1/$1_0_${i}_0.csv" | cut -d : -f 1 | uniq -c | awk '$1 > $correct_no_of_commas {print}'
done
}
get_bad_lines products
get_bad_lines users
The output of this program is now all the comma-counts with all of the line numbers in all the files,
and I suspect this is due to the input $1 (foldername, i.e. products & users) conflicting with the call to awk with reference to $1 as well (where I wish to grab the first column being the count of commas for that line in the current file in the loop).
Is this the issue? And if so, could it be solved by referring to either the first column or the folder name through a different variable name, instead of both using $1?
Example, current output:
5 6667
5 6668
5 6669
5 6670
(should only show lines for that file having more than 5 commas).
I also tried declaring a variable in the call to awk, with the same effect (as in the accepted answer to "Awk field variable clash with function argument"):
get_bad_lines () {
local table_name=$1
local correct_no_of_commas=$(head -n 1 $table_name/${table_name}_0_0_0.csv | tr -cd , | wc -c)
local no_of_files=$(ls $table_name | wc -l)
for i in $(seq 0 $(( ${no_of_files}-1 )))
do
# Check that the file exist
if [ ! -f "$table_name/${table_name}_0_${i}_0.csv" ]; then
echo "File: ${table_name}_0_${i}_0.csv not found!"
continue
fi
# Search for error-lines inside the file and print them out
echo "${table_name}_0_${i}_0.csv has over $correct_no_of_commas commas in the following lines:"
grep -o -n '[,]' "$table_name/${table_name}_0_${i}_0.csv" | cut -d : -f 1 | uniq -c | awk -v table_name="$table_name" '$1 > $correct_no_of_commas {print}'
done
}

You can do the whole job in awk:
get_bad_lines () {
find "$1" -maxdepth 1 -name "$1_0_*_0.csv" | while read -r my_file ; do
awk -v table_name="$1" '
NR==1 { num_comma=gsub(/,/, ""); }
/,/ { if (gsub(/,/, ",", $0) > num_comma) wrong_array[wrong++]=NR":"$0;}
END { if (wrong > 0) {
print(FILENAME" has over "num_comma" commas in the following lines:");
for (i=0;i<wrong;i++) { print(wrong_array[i]); }
}
}' "${my_file}"
done
}
As for why your original awk command failed to print only the lines with too many commas: you are using the shell variable correct_no_of_commas inside a single-quoted awk program ('$1 > $correct_no_of_commas {print}'). The shell therefore performs no substitution, and awk reads $correct_no_of_commas as is. More precisely, awk looks up its own variable correct_no_of_commas, which is undefined in the awk script and so evaluates to an empty string (0 in a numeric context). awk then evaluates $1 > $"" as the matching condition, and since $"" is equivalent to $0, it compares the count in $1 with the full input line. The full input line, as printed by uniq -c, has the form <spaces><count><space><line number>, which does not look like a number, so awk compares the two as strings. The line begins with whitespace, which sorts before any digit, so $1 > $correct_no_of_commas is always true. The second attempt has the same problem: it passes table_name with -v, but correct_no_of_commas is still never passed to awk.
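If you prefer to keep the original pipeline, the shell value has to be handed to awk explicitly, for example with -v. A minimal sketch (more_than and the awk variable max are hypothetical names chosen for this example):

```shell
#!/bin/bash
# more_than is a hypothetical helper: it reads "count line-number" pairs on
# stdin and keeps those whose count exceeds the threshold given as $1.
more_than() {
    awk -v max="$1" '$1 > max'
}

# Sample input shaped like the uniq -c output in the question.
printf '%s\n' '      5 6667' '      7 6668' | more_than 5
```

In the question's function this would correspond to ending the pipeline with awk -v max="$correct_no_of_commas" '$1 > max'.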

You can identify all the bad lines with a single awk command
awk -F, 'FNR==1{print FILENAME; headerCount=NF;} NF>headerCount{print} ENDFILE{print "#######\n"}' /path/here/*.csv
If you want the line number also to be printed, use this
awk -F, 'FNR==1{print FILENAME"\nLine#\tLine"; headerCount=NF;} NF>headerCount{print FNR"\t"$0} ENDFILE{print "#######\n"}' /path/here/*.csv
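Note that ENDFILE is a gawk extension. If only the offending lines are needed, a variant that avoids it might look like the following sketch; bad_lines is a hypothetical name, and the file:line:content output format is a choice made for this example, not something from the answer above. Like the commands above, it counts every comma, including commas inside quoted values.

```shell
#!/bin/bash
# bad_lines is a hypothetical helper: for each CSV file given as an argument,
# it prints the lines that have more comma-separated fields than the header.
bad_lines() {
    awk -F, '
        FNR == 1    { header = NF; next }             # first line of each file
        NF > header { print FILENAME ":" FNR ":" $0 } # too many fields
    ' "$@"
}

# Demo on a throwaway file.
tmp=$(mktemp)
printf '%s\n' 'id,name,price' '1,pen,2' '2,ink, black,3' > "$tmp"
bad_lines "$tmp"    # reports line 3 of the temporary file
rm -f "$tmp"
```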

bash count sequential files

I'm pretty new to bash scripting, so some of the syntax may not be optimal. Please point out anything you notice.
I have files in a directory named sequentially.
Example: prob01_01 prob01_03 prob01_07 prob02_01 prob02_03 ....
I am trying to have the script iterate through the current directory and count how many extensions each problem has, and then print the name before the extension followed by the count.
Sample output for the above would be:
prob01 3
prob02 2
This is my code:
#!/bin/bash
temp=$(mktemp)
element=''
count=0
for i in *
do
current=${i%_*}
if [[ $current == $element ]]
then
let "count+=1"
else
echo $element $count >> temp
element=$current
count=1
fi
done
echo 'heres the temp:'
cat temp
rm 'temp'
The Problem:
Current output:
prob1 3
Desired output:
prob1 3
prob2 2
The last count isn't appended because it's not seeing a different element after it
My Guess on possible solutions:
Have the last append occur at the end of the for loop?

Your code has two problems.
The first is not what you asked about: you create a temporary file whose name is stored in $temp, but you then write to a file with the fixed name temp. Use "$temp" instead.
The second is that you only write a result when you see a new problem name, so the last one is never printed.
Fixing only these two problems results in:
results() {
if (( count == 0 )); then
return
fi
echo $element $count >> "${temp}"
}
temp=$(mktemp)
element=''
count=0
for i in prob*
do
current=${i%_*}
if [[ $current == $element ]]
then
let "count+=1" # Better is using ((count++))
else
results
element=$current
count=1
fi
done
results
echo 'heres the temp:'
cat "${temp}"
rm "${temp}"
You can get the same counts without a script:
ls prob* | cut -d"_" -f1 | sort | uniq -c
To display the output in the requested order (name first, then count), you need one more step:
ls prob* | cut -d"_" -f1 | sort | uniq -c | awk '{print $2 " " $1}'

You may use a printf + awk solution:
printf '%s\n' *_* | awk -F_ '{a[$1]++} END{for (i in a) print i, a[i]}'
prob01 3
prob02 2
We use printf to print each file that has at least one _
We use awk to get a count of each file's first element delimited by _ by using an associative array.
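The same associative-array idea can also be sketched in bash itself, which avoids piping file names through other tools. This assumes bash 4 or later (for local -A); count_prefixes is a hypothetical name, and like the question's own code it strips the last _ and everything after it.

```shell
#!/bin/bash
# count_prefixes is a hypothetical helper: $1 is the directory to scan
# (default: the current directory). Requires bash 4+ for associative arrays.
count_prefixes() {
    local dir=${1:-.} path name key
    local -A counts=()
    for path in "$dir"/*_*; do
        [ -e "$path" ] || continue      # the glob matched nothing
        name=${path##*/}                # strip the directory part
        key=${name%_*}                  # strip the last _ and what follows
        counts[$key]=$(( ${counts[$key]:-0} + 1 ))
    done
    # Associative arrays have no defined order, so sort the report.
    for key in "${!counts[@]}"; do
        printf '%s %s\n' "$key" "${counts[$key]}"
    done | sort
}
```

Calling count_prefixes with no argument in the directory from the question should list each problem name with its count.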

I would do it like this:
$ ls | awk -F_ '{print $1}' | sort | uniq -c | awk '{print $2 " " $1}'
prob01 3
prob02 2

Find the first missing file in a series of numbered files

I have directory containing files:
$> ls blender/output/celebAnim/
0100.png 0107.png 0114.png 0121.png 0128.png 0135.png 0142.png 0149.png 0156.png 0163.png 0170.png 0177.png 0184.png 0191.png 0198.png 0205.png 0212.png 0219.png 0226.png 0233.png 0240.png 0247.png 0254.png 0261.png 0268.png 0275.png 0282.png
0101.png 0108.png 0115.png 0122.png 0129.png 0136.png 0143.png 0150.png 0157.png 0164.png 0171.png 0178.png 0185.png 0192.png 0199.png 0206.png 0213.png 0220.png 0227.png 0234.png 0241.png 0248.png 0255.png 0262.png 0269.png 0276.png 0283.png
0102.png 0109.png 0116.png 0123.png 0130.png 0137.png 0144.png 0151.png 0158.png 0165.png 0172.png 0179.png 0186.png 0193.png 0200.png 0207.png 0214.png 0221.png 0228.png 0235.png 0242.png 0249.png 0256.png 0263.png 0270.png 0277.png 0284.png
0103.png 0110.png 0117.png 0124.png 0131.png 0138.png 0145.png 0152.png 0159.png 0166.png 0173.png 0180.png 0187.png 0194.png 0201.png 0208.png 0215.png 0222.png 0229.png 0236.png 0243.png 0250.png 0257.png 0264.png 0271.png 0278.png
0104.png 0111.png 0118.png 0125.png 0132.png 0139.png 0146.png 0153.png 0160.png 0167.png 0174.png 0181.png 0188.png 0195.png 0202.png 0209.png 0216.png 0223.png 0230.png 0237.png 0244.png 0251.png 0258.png 0265.png 0272.png 0279.png
0105.png 0112.png 0119.png 0126.png 0133.png 0140.png 0147.png 0154.png 0161.png 0168.png 0175.png 0182.png 0189.png 0196.png 0203.png 0210.png 0217.png 0224.png 0231.png 0238.png 0245.png 0252.png 0259.png 0266.png 0273.png 0280.png
0106.png 0113.png 0120.png 0127.png 0134.png 0141.png 0148.png 0155.png 0162.png 0169.png 0176.png 0183.png 0190.png 0197.png 0204.png 0211.png 0218.png 0225.png 0232.png 0239.png 0246.png 0253.png 0260.png 0267.png 0274.png 0281.png
For some script, I will need to find out what the number of the first missing file is. In the above output, it would be 0285.png. However, it is also possible that files in between are missing. In the end, I am only interested in the number 285, which is part of the file name.
This is part of recovery logic: The files should be created by the script, but this step can fail. Therefore I want to have a means to check which files are missing and try to create them in a second step.
This is what I got so far (from how to extract part of a filename before '.' or before extension):
ls blender/output/celebAnim/ | awk -F'[.]' '{print $1}'
What I cannot figure out is how do I find the smallest number missing from that result, above a certain offset? The offset in this case is 100.

You could loop over all numbers from 100 to 500 and check if the corresponding file exists; if it doesn't, you'd print the number you're looking at:
for i in {100..500}; do
[[ ! -f 0$i.png ]] && { echo "$i missing!"; break; }
done
This prints, for your example, 285 missing!.
This solution could be made a bit more flexible by, for example, looping over zero padded numbers and then extracting the unpadded number:
for i in {0100..0500}; do
[[ ! -f $i.png ]] && { echo "${i##*(0)} missing!"; break; }
done
This requires extended globs (shopt -s extglob) for the *(0) pattern ("zero or more repetitions of 0").
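Wrapped up as a reusable function, the loop could look like this sketch. The name first_missing, its three parameters, and the fixed four-digit .png naming are assumptions made for this example; counting with a plain integer and padding only when building the file name avoids both extglob and octal surprises.

```shell
#!/bin/bash
# first_missing is a hypothetical helper.
#   $1 = directory, $2 = first number to check, $3 = last number to check
# Prints the first number whose NNNN.png file is missing and returns 0,
# or prints nothing and returns 1 if every file in the range exists.
first_missing() {
    local dir=$1 start=$2 end=$3 i
    for (( i = start; i <= end; i++ )); do
        if [ ! -f "$dir/$(printf '%04d' "$i").png" ]; then
            echo "$i"
            return 0
        fi
    done
    return 1
}
```

For the directory in the question, first_missing blender/output/celebAnim 100 500 would be the corresponding call.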

begin=100
end=500
for i in `seq $begin 1 $end`; do
fname="0"$i".png"
if [ ! -f $fname ]; then
echo "$fname is missing"
fi
done

#!/bin/bash
search_dir=blender/output/celebAnim/
ls $search_dir > file_list
count=`wc -l file_list | awk '{ print $1 }'`
if [[ $count -eq 0 ]]
then
echo "No files in given directory!"
exit 1
fi
file_extension=`head -1 file_list | tail -1 | awk -F "." '{ print $2 }'`
init_file_value=`head -1 file_list | tail -1 | awk -F "." '{ print $1 }'`
i=2
while [ $i -le $count ]
do
next_file_value=`head -$i file_list | tail -1 | awk -F "." '{ print $1 }'`
next_value=$((10#$init_file_value+1)); # force base 10: a leading 0 would otherwise mean octal
if [ $next_file_value -ne $next_value ]
then
echo $next_value"."$file_extension
break
fi
init_file_value=$next_value;
i=$((i+1));
done

Try this:
ls blender/output/celebAnim/ | sort -r | head -n1 | awk -F'.' '{print $1+1}'
This command returns 285.
If you need 0285 instead, try:
ls blender/output/celebAnim/ | sort -r | head -n1 | awk -F'.' '{print 0($1+1)}'

Multiple variables into one variable with wildcard

I have this script:
#!/bin/bash
ping_1=$(ping -c 1 www.test.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//')
ping_2=$(ping -c 1 www.test1.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//')
ping_3=$(ping -c 1 www.test2.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//')
ping_4=$(ping -c 1 www.test3.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//' )
Then I would like to treat the outputs of ping_1-4 in one variable. Something like this:
#!/bin/bash
if [ "$ping_*" -gt 50 ]; then
echo "One ping is to high"
else
echo "The pings are fine"
fi
Is there a possibility in bash to read these variables with some sort of wildcard?
$ping_*
Did nothing for me.

The answer to your stated problem is that yes, you can do this with parameter expansion in bash (but not in sh):
#!/bin/bash
ping_1=foo
ping_2=bar
ping_etc=baz
for var in "${!ping_@}"
do
echo "$var is set to ${!var}"
done
will print
ping_1 is set to foo
ping_2 is set to bar
ping_etc is set to baz
Here's man bash:
${!prefix*}
${!prefix@}
Names matching prefix. Expands to the names of variables whose
names begin with prefix, separated by the first character of the
IFS special variable. When @ is used and the expansion appears
within double quotes, each variable name expands to a separate
word.
The answer to your actual problem is to use arrays instead.
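A minimal sketch of the array approach, using made-up timings in place of real ping results (count_above and the sample values are assumptions introduced for this example):

```shell
#!/bin/bash
# count_above is a hypothetical helper: $1 is the limit, the remaining
# arguments are integer timings. It prints how many timings exceed the limit.
count_above() {
    local limit=$1 t n=0
    shift
    for t in "$@"; do
        if [ "$t" -gt "$limit" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

pings=(12 48 73 20)    # made-up sample timings in ms, not real measurements
if [ "$(count_above 50 "${pings[@]}")" -gt 0 ]; then
    echo "One ping is too high"
else
    echo "The pings are fine"
fi
```

With an array, adding a fifth server only means adding one more element; no new variable name or extra test is needed.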

I don't think there's such wildcard.
But you could use a loop to iterate over values, for example:
exists_too_high() {
for value; do
if [ "$value" -gt 50 ]; then
return 0
fi
done
return 1
}
if exists_too_high "$ping_1" "$ping_2" "$ping_3" "$ping_4"; then
echo "One ping is to high"
else
echo "The pings are fine"
fi

You can combine the tests with the "or" (-o) operator, since a single slow ping is enough:
if [ "$ping_1" -gt 50 -o \
"$ping_2" -gt 50 -o \
"$ping_3" -gt 50 -o \
"$ping_4" -gt 50 ]; then
...
...
Or instead of defining a lot of variables, you can make an array and check with a loop:
pings+=($(ping -c 1 www.test.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//'))
pings+=($(ping -c 1 www.test1.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//'))
pings+=($(ping -c 1 www.test2.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//'))
pings+=($(ping -c 1 www.test3.com | tail -1| awk '{print $4}' | cut -d '/' -f 2 | sed 's/\.[^.]*$//' ))
too_high=0
for ping in "${pings[@]}"; do
if [ $ping -gt 50 ]; then
too_high=1
break
fi
done
if [ $too_high -eq 1 ]; then
echo "One ping is to high"
else
echo "The pings are fine"
fi

To complement the existing, helpful answers, here is an array-based solution that demonstrates:
several advanced Bash techniques (robust array handling, compound conditionals, handling the case where pinging fails);
an optimized way to extract the average timing from ping's output with a single sed command (works with both GNU and BSD/macOS sed);
reporting, by name, the servers that either took too long or failed to respond.
#!/usr/bin/env bash
# Determine the servers to ping as an array.
servers=( 'www.test.com' 'www.test1.com' 'www.test2.com' 'www.test3.com' )
# Initialize the array in which timings will be stored, paralleling the
# "${servers[@]}" array.
avgPingTimes=()
# Initialize the array that stores the names of the servers that either took
# too long to respond (on average), or couldn't be pinged at all.
failingServers=()
# Determine the threshold above which a timing is considered too high, in ms.
# Note that a shell variable name should contain at least 1 lowercase character,
# to avoid clashes with all-uppercase environment and special shell variables.
kMAX_TIME=50
# Determine how many pings to send per server to calculate the average timing
# from.
kPINGS_PER_SERVER=1
for server in "${servers[@]}"; do
# Ping the server at hand, extracting the integer portion of the average
# timing.
# Note that if pinging fails, $avgPingTime will be empty.
avgPingTime="$(ping -c "$kPINGS_PER_SERVER" "$server" |
sed -En 's|^.* = [^/]+/([^.]+).+$|\1|p')"
# Check if the most recent ping failed or took too long and add
# the server to the failure array, if so.
[[ -z $avgPingTime || $avgPingTime -gt $kMAX_TIME ]] && failingServers+=( "$server" )
# Add the timing to the output array.
avgPingTimes+=( "$avgPingTime" )
done
if [[ -n $failingServers ]]; then # pinging at least 1 server took too long or failed
echo "${#failingServers[@]} of the ${#servers[@]} servers took too long or couldn't be pinged:"
printf '%s\n' "${failingServers[@]}"
else
echo "All ${#servers[@]} servers responded to pings in a timely fashion."
fi
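To see what the sed expression extracts without actually pinging anything, it can be run against a single summary line. The sample line below is an assumption, modelled on the summary that Linux ping typically prints, and avg_ms is a hypothetical wrapper name:

```shell
#!/bin/bash
# avg_ms is a hypothetical wrapper around the sed command used above.
# It reads ping output on stdin and prints the integer part of the average.
avg_ms() {
    sed -En 's|^.* = [^/]+/([^.]+).+$|\1|p'
}

# Assumed sample of a ping summary line (min/avg/max/mdev).
echo 'rtt min/avg/max/mdev = 11.352/23.908/30.100/7.5 ms' | avg_ms    # prints 23
```

Because of the -n option and the p flag, lines that do not match the pattern produce no output, which is why a failed ping leaves $avgPingTime empty.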

Yes, bash can list the variables whose names begin with ping_ by using its built-in compgen -v command (see man bash under SHELL BUILTIN COMMANDS), e.g.:
for f in `compgen -v ping_` foo ; do
if [ "$f" = foo ]; then
echo "The pings are fine"
break
fi
eval p=\$$f
if [ "$p" -gt 50 ]; then
echo "One ping is too high"
break
fi
done
Note the added sentinel item foo: if the loop gets through all the ping_ variables and reaches it, "The pings are fine" is printed.

sorting a "key/value pair" array in bash

How do I sort a "python dictionary-style" array, e.g. ( "A: 2" "B: 3" "C: 1" ), in bash by the value? I think this code snippet will make my question a bit clearer.
State="Total 4 0 1 1 2 0 0"
W=$(echo $State | awk '{print $3}')
C=$(echo $State | awk '{print $4}')
U=$(echo $State | awk '{print $5}')
M=$(echo $State | awk '{print $6}')
WCUM=( "Owner: $W;" "Claimed: $C;" "Unclaimed: $U;" "Matched: $M" )
echo ${WCUM[@]}
This will simply print the array: Owner: 0; Claimed: 1; Unclaimed: 1; Matched: 2
How do I sort the array (or the output), eliminating any pair with a "0" value, so that the result looks like this:
Matched: 2; Claimed: 1; Unclaimed: 1
Thanks in advance for any help or suggestions. Cheers!!

A quick and dirty idea would be (this just sorts the output, not the array):
echo ${WCUM[@]} | sed -e 's/; /;\n/g' | awk -F: '!/ 0;?/ {print $0}' | sort -t: -k 2 -r | xargs

echo -e ${WCUM[@]} | tr ';' '\n' | sort -r -k2 | egrep -v ": 0$"
Sorting and filtering are independent steps, so if you only want to filter out the 0 values, it is much easier.
Append
| tr '\n' ';'
at the end to get a single line again.
nonull=$(for n in ${!WCUM[@]}; do echo ${WCUM[n]} | egrep -v ": 0;"; done | tr -d "\n")
I don't see a good reason to end $W, $C and $U with a semicolon but not $M, so instead of adapting my code to this distinction I would eliminate the special case. If that is not possible, I would temporarily append a semicolon to $M and remove it at the end.
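Following that suggestion, if the array elements are stored without trailing semicolons, filtering, sorting and joining can be kept as three separate steps. A sketch under that assumption (sort_pairs is a hypothetical name; the "; " separator is added only at the very end):

```shell
#!/bin/bash
# sort_pairs is a hypothetical helper: each argument is a "Label: N" pair
# without a trailing semicolon. It drops pairs whose value is 0, sorts the
# rest by value (highest first, ties keep their input order) and joins
# them with "; ".
sort_pairs() {
    printf '%s\n' "$@" |
        awk -F': ' '$2 != 0' |       # filter: drop zero values
        sort -s -t: -k2,2nr |        # sort: numeric, descending, stable
        paste -sd';' - |             # join the lines with ";"
        sed 's/;/; /g'               # cosmetic: "; " between pairs
}

WCUM=( "Owner: 0" "Claimed: 1" "Unclaimed: 1" "Matched: 2" )
sort_pairs "${WCUM[@]}"    # prints: Matched: 2; Claimed: 1; Unclaimed: 1
```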

Another attempt, using more of bash's own features. It still needs sort, which is the crucial part:
#! /bin/bash
State="Total 4 1 0 4 2 0 0"
string=$State
for i in 1 2 ; do # remove unnecessary fields
string=${string#* }
string=${string% *}
done
# Insert labels
string=Owner:${string/ /;Claimed:}
string=${string/ /;Unclaimed:}
string=${string/ /;Matched:}
# Remove zeros
string=(${string[@]//;/; })
string=(${string[@]/*:0;/})
string=${string[@]}
# Format
string=${string//;/$'\n'}
string=${string//:/: }
# Sort
string=$(sort -t: -nk2 <<< "$string")
string=${string//$'\n'/;}
echo "$string"
