Script exits with error when var=$(... | grep "value") is empty, but works when grep has results - linux

I have the following bash code (running on Red Hat) that exits when I enable set -o errexit and the variable in the code is empty, BUT works fine when the variable is set; the code is designed to test whether a screen session matching .monitor_* exists and, if so, do something.
I have the following turned on:
set -o errexit
set -o xtrace; PS4='$LINENO: '
If there is a session matching the above pattern it works; however, if nothing matches it just exits with no information other than the following output from xtrace
someuser:~/scripts/tests> ./if_test.sh
+ ./if_test.sh
+ PS4='$LINENO: '
4: set -o errexit
5: set -o pipefail
6: set -o nounset
88: /usr/bin/ls -lR /var/run/uscreens/S-storage-rsync
88: grep '.monitor_*'
88: awk '{ print $9 }'
88: /usr/bin/grep -Ev 'total|uscreens'
8: ms=
I tested the command I am using to set the ms var, and it agrees with the xtrace output: the variable is empty.
someuser:~/scripts/tests> test -n "${ms}"
+ test -n ''
I have tried using a select statement and got the same results... I can't figure it out; is anyone able to help? Thanks.
I read through all the possible solution recommendations, nothing seems to address my issue.
The code:
#!/usr/bin/env bash
set -o xtrace; PS4='$LINENO: '
set -o errexit
set -o pipefail
set -o nounset
ms="$(/usr/bin/ls -lR /var/run/uscreens/S-"${USER}" | /usr/bin/grep -Ev "total|uscreens" | grep ".monitor_*" | awk '{ print $9 }')"
if [[ -z "${ms}" ]]; then
    echo "Handling empty result"
elif [[ -n "${ms}" ]]; then
    echo "Handling non-empty result"
fi
The following answer was proposed: Test if a variable is set in bash when using "set -o nounset"; however, it doesn't address the issue at all. In my case the variable being tested is set, and as stated in my detail it's set to "", or nothing, so nounset isn't what's tripping the script.
It really seems to be the variable assignment that the shell isn't liking.
ms="$(/usr/bin/ls -lR /var/run/uscreens/S-"${USER}" | /usr/bin/grep -Ev "total|uscreens" | grep ".monitor_*" | awk '{ print $9 }')"

You're running set -o pipefail, so if any component in a pipeline has a nonzero exit status, the entire pipeline is treated as having a nonzero exit status.
Your pipeline runs grep. grep has a nonzero status whenever no matches are found.
You're running set -o errexit (aka set -e). With errexit enabled, the script terminates whenever any command fails (subject to a long and complicated set of exceptions; some of these are presented in the exercises section of BashFAQ #105, and others touched on in this excellent reference).
Thus, when you have no matches in your grep command, your script terminates on the command substitution running the pipeline in question.
If you want to exempt a specific command from set -e's behavior, the easiest way to do it is to simply append ||: (shorthand for || true), which marks the command as "checked".
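Applied to the assignment from the question, that looks something like this (a sketch; the pipeline is unchanged, only the trailing || : is added):
# "|| :" marks the whole assignment as checked, so an empty grep result
# leaves ms empty instead of terminating the script under errexit.
ms="$(/usr/bin/ls -lR /var/run/uscreens/S-"${USER}" \
    | /usr/bin/grep -Ev "total|uscreens" \
    | grep ".monitor_*" \
    | awk '{ print $9 }')" || :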

"You're running set -o pipefail. When grep doesn't match anything, it has a nonzero exit status, and with pipefail, that fails the entire pipeline. This is all behaving exactly the way you're telling your shell it should behave." – Charles Duffy
Charles's comment above was exactly what was going on: my script was working as intended, and I need to adjust the logic to work differently if I wish to keep set -o pipefail set.
Thank you for the help.
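For reference, one way to adjust that logic while keeping pipefail is to branch on the pipeline's status directly, since commands tested by if are exempt from errexit (a sketch based on the code above, not from the original thread):
# The assignment's exit status is the pipeline's status; testing it with
# "if" keeps errexit from firing, and the extra -n check guards the case
# where the pipeline succeeds but prints nothing.
if ms="$(/usr/bin/ls -lR /var/run/uscreens/S-"${USER}" \
        | /usr/bin/grep -Ev "total|uscreens" \
        | grep ".monitor_*" \
        | awk '{ print $9 }')" && [[ -n "${ms}" ]]; then
    echo "Handling non-empty result"
else
    echo "Handling empty result"
fi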

Related

Ignoring specific exit code in Bash

Note: this question does not duplicate Ignoring specific errors in a shell script.
Suppose it is needed to capture the leading characters of an encoded representation of a file.
In shell (tested in Bash), it is easy to use the following form:
encoded="$(< file base64 | head -c16)"
The statement functions as desired, except under certain alterations to the environment.
Consider the following:
set -o errexit -o pipefail
shopt -s inherit_errexit
encoded="$(< file base64 | head -c16)"
The final line would cause termination of the script because of the non-zero return status (141) given by base64, which is unhappy about the closed pipe once head has read its 16 characters and exited. The return status is propagated through the pipeline and then to the invoking shell.
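A minimal reproduction of that status, assuming a file large enough that base64 is still writing when head exits (see the note about file size further below):
set -o pipefail
base64 < file | head -c16 > /dev/null
echo $?   # 141 = 128 + SIGPIPE (signal 13): base64 killed by the closed pipe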
The undesired effect requires a workaround, such as follows:
set -o errexit -o pipefail
shopt -s inherit_errexit
encoded="$((< file base64 || :) | head -c16)"
The : has the same effect as the command true would have: it evaluates as a non-error.
However, this approach leads to a further unwanted effect.
The following shows a variation with a different error:
set -o errexit -o pipefail
shopt -s inherit_errexit
encoded="$((< /not/a/real/file base64 || :) | head -c16)"
echo $?
The printed code is zero. Now, a true error has been masked.
The most obvious solution, as follows, is rather verbose:
set -o errexit -o pipefail
shopt -s inherit_errexit
encoded="$((< /not/a/real/file base64 || [ $? == 141 ]) | head -c16)"
echo $?
Is a more compact form available? Is any environment alteration available such that statements masks only the particular status code, without the explicit inline expression?
First off, apparently, to actually provoke the 141 error the file needs to be fairly large, e.g.
head -c 1000000 /dev/urandom > file
Now, as you said, this script sh.sh will terminate before showing encoded: ...:
#!/bin/bash
set -o errexit -o pipefail
shopt -s inherit_errexit
encoded="$(< file base64 | head -c16)"
echo "encoded: $encoded"
Instead of checking for the error code, you could let base64 continue to pipe the rest of its data to /dev/null by invoking cat > /dev/null after head is done:
#!/bin/bash
set -o errexit -o pipefail
shopt -s inherit_errexit
encoded="$(< file base64 | ( head -c16 ; cat > /dev/null ) )"
echo "encoded: $encoded"
Now you will get encoded: NvyX2Zx4nTDjtQO8 or whatever.
And this does not mask other errors like the file not existing:
$ ./sh.sh
./sh.sh: line 5: file: No such file or directory
However, it will be less efficient because the whole file will be read.
For your specific example program, you could also just change your approach. 16 base 64 characters represent 16 * 6 = 96 bits, so 96/8 = 12 bytes of data from the file are needed:
encoded="$(head -c12 file | base64)"
this will not cause SIGPIPE.
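Putting that together under the same shell options (a sketch mirroring the sh.sh script above):
#!/bin/bash
set -o errexit -o pipefail
shopt -s inherit_errexit
# head reads only 12 bytes from the file, so base64 never writes
# to a closed pipe and no SIGPIPE is raised.
encoded="$(head -c12 file | base64)"
echo "encoded: $encoded"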

Problems of set -e with grep command [duplicate]

I am using the following options
set -o pipefail
set -e
in a bash script to stop execution on error. I have ~100 lines of script executing, and I don't want to check the return code of every line in the script.
But for one particular command, I want to ignore the error. How can I do that?
The solution:
particular_script || true
Example:
$ cat /tmp/1.sh
particular_script()
{
    false
}
set -e
echo one
particular_script || true
echo two
particular_script
echo three
$ bash /tmp/1.sh
one
two
three will never be printed.
Also, I want to add that when pipefail is on,
the shell treats the entire pipeline as having a non-zero exit code
when any command in the pipe has a non-zero exit code (with pipefail off, only the last one counts).
$ set -o pipefail
$ false | true ; echo $?
1
$ set +o pipefail
$ false | true ; echo $?
0
Just add || true after the command where you want to ignore the error.
Don't stop and also save exit status
In case you want your script not to stop if a particular command fails, and you also want to save the error code of the failed command:
set -e
EXIT_CODE=0
command || EXIT_CODE=$?
echo $EXIT_CODE
More concisely:
! particular_script
From the POSIX specification regarding set -e (emphasis mine):
When this option is on, if a simple command fails for any of the reasons listed in Consequences of Shell Errors or returns an exit status value >0, and is not part of the compound list following a while, until, or if keyword, and is not a part of an AND or OR list, and is not a pipeline preceded by the ! reserved word, then the shell shall immediately exit.
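A minimal sketch of that exemption (assuming plain false stands in for the failing command):
#!/usr/bin/env bash
set -e
! false               # the leading ! exempts this pipeline from errexit
echo "still running"  # printed even though false returned non-zero
Note that ! also inverts the command's exit status, so use it only when you genuinely don't care whether the command failed.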
Instead of "returning true", you can also use the "noop" or null utility (as referred to in the POSIX specs) : and just "do nothing". You'll save a few letters. :)
#!/usr/bin/env bash
set -e
man nonexistentthing || :
echo "It's ok.."
Thanks for the simple solution above:
<particular_script/command> || true
The following construction could be used for additional actions/troubleshooting of script steps and additional flow control options:
if <particular_script/command>
then
    echo "<particular_script/command> is fine!"
else
    echo "<particular_script/command> failed!"
    #exit 1
fi
We can also break off further actions and exit 1 if required.
I found another way to solve this:
set +e
find "./csharp/Platform.$REPOSITORY_NAME/obj" -type f -iname "*.cs" -delete
find "./csharp/Platform.$REPOSITORY_NAME.Tests/obj" -type f -iname "*.cs" -delete
set -e
You can turn off failing on errors with set +e; this will ignore all errors after that line. Once you are done, and you want the script to fail again on any error, you can use set -e.
After applying set +e, the find no longer fails the whole script when files are not found. At the same time, error messages from find are still printed, but the whole script continues to execute, so it is easy to debug if that is what causes the problem.
This is useful for CI & CD (for example in GitHub Actions).
If you want to prevent your script failing and collect the return code:
command () {
    return 1 # or 0 for success
}
set -e
command && returncode=$? || returncode=$?
echo $returncode
returncode is collected no matter whether command succeeds or fails.
output=$(<command> 2>&1) && exit_status=$? || exit_status=$?
echo $output
echo $exit_status
Example of using this to create a log file
log_event(){
    timestamp=$(date '+%D %T') # mm/dd/yy HH:MM:SS
    echo -e "($timestamp) $event" >> "$log_file"
}
output=$(<command> 2>&1) && exit_status=$? || exit_status=$?
if [ "$exit_status" = 0 ]
then
    event="$output"
    log_event
else
    event="ERROR $output"
    log_event
fi
I have been using the snippet below when working with CLI tools where I want to know whether some resource exists or not, but I don't care about the output.
if [ -z "$(cat no_exist 2>&1 >/dev/null)" ]; then
    echo "no_exist actually exists!"
fi
While || true is the preferred one, you can also do
var=$(echo $(exit 1)) # it shouldn't fail
I kind of like this solution :
: `particular_script`
The command/script between the backticks is executed and its output is passed as arguments to the command : (which is the equivalent of true).
$ false
$ echo $?
1
$ : `false`
$ echo $?
0

Check for zero lines output from command over SSH

If I do the following, and the network is down, then the zero case will be executed, which it shouldn't.
case "$(ssh -n $host zfs list -t snapshot -o name -H | grep "tank/fs" | wc -l | awk '{print $1}')" in
    0) # do something
       ;;
    1) # do something else
       ;;
    *) # fail
esac
Earlier in the script I check that I can SSH to $host, but today I found this problem, where the network failed right after my check.
If I check the return value from the SSH command, then I will always get the return value from awk as it is executed last.
Question
How do I ensure that I actually counted zero lines of zfs output, and not zero lines from a failed SSH connection?
Say:
set -o pipefail
at the beginning of your script (or before the case statement).
Moreover, check for the return code of the command before executing the case statement:
set -o pipefail
value=$(ssh -n $host zfs list -t snapshot -o name -H | grep "tank/fs" | wc -l | awk '{print $1}')
if [ $? -eq 0 ]; then
    case $value in
        ...
    esac
fi
From the manual:
pipefail
If set, the return value of a pipeline is the value of the last
(rightmost) command to exit with a non-zero status, or zero if all
commands in the pipeline exit successfully. This option is disabled by
default.
How about this (the commands after ssh shouldn't get executed locally):
"$(ssh -n $host 'zfs list -t snapshot -o name -H | grep "tank/fs" | wc -l | awk '\''{print $1}'\')"
Note how I single quoted the command so the whole pipeline runs on the remote machine. Since a single quote cannot be backslash-escaped inside single quotes, the embedded quotes around the awk program are written with the '\'' idiom. Check the return value of the call and only act when it returns success.

How can I use exit codes to run shell scripts sequentially?

Since CruiseControl is full of bugs that have wasted my entire week, I have decided the existing shell scripts I have are simpler and thus better.
Here is what I have so far
svn update /var/www/k12/
#svn log --revision "HEAD" /var/www/code/ | head -2 | tail -1 | awk '{print $1}' > /var/www/path/version.txt
# upload the files
rsync -ar --verbose --stats --progress --delete --exclude=*.svn /var/www/code/ example.com:/home/path
# bring database up to date
ssh example.com 'php /path/tasks/dbrefactor.php'
# notify me
ssh example.com 'php /path/tasks/build.php'
The only thing is, the other day I changed the paths and forgot to update the rsync call. As a result, the "notify me" step ran several times while I was figuring stuff out.
I know in Linux you can do command1 && command2, and if command1 "fails" command2 will not run, but how can I observe the "failure/success" exit codes for debugging purposes? Some of the scripts I wrote myself, and I'm sure I will need to do something special.
The best option, especially for unattended scripts, is to set the -e shell option:
#!/bin/sh -e
or
set -e
This will cause the shell to stop executing if any (untested) command exits with a nonzero error code.
-e  Exit immediately if a simple command (see SHELL GRAMMAR above)
    exits with a non-zero status. The shell does not exit if the
    command that fails is part of an until or while loop, part of
    an if statement, part of a && or || list, or if the command's
    return value is being inverted via !. A trap on ERR, if set,
    is executed before the shell exits.
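A minimal sketch applying this to the deployment steps from the question (the ERR trap mentioned in the excerpt is bash-specific; the paths are the question's own):
#!/usr/bin/env bash
set -e
# Report which step failed before errexit terminates the script.
trap 'echo "step failed with status $? near line $LINENO" >&2' ERR

svn update /var/www/k12/
rsync -ar --verbose --stats --progress --delete --exclude=*.svn /var/www/code/ example.com:/home/path
ssh example.com 'php /path/tasks/dbrefactor.php'
ssh example.com 'php /path/tasks/build.php'   # only reached if every step above succeeded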
The exit code of the previous process is available in the $? variable right after its execution. Usually (that's not required, but it's the convention everyone follows) the exit code of a successful command will be equal to 0, and any other value means an error.
Remember the caveats! One of them is that after these commands:
svn log --revision "HEAD" /var/www/code/ | head -2 | tail -1 | awk '{print $1}'
echo "$?"
the zero result would most likely be returned, because $? contains the return code of awk, the last command in the pipeline. To avoid that, set the pipefail option somewhere above that code:
set -o pipefail
The return value of the last-run command is stored in the variable $?. You can use that to determine which command to run next. Overview of special variables.
I think $? contains the last exit code:
if [[ $? -eq 0 ]]
then
    # notify me
    ssh example.com 'php /path/tasks/build.php'
fi
I would suggest exiting non-zero at the points where failure is expected, and checking before processing a step further:
if [ $? -ne 0 ]; then
    # there is a failure
fi
$? will always hold a non-zero number if the last process did not execute successfully.

Why does set -e cause my script to exit when it encounters the following?

I have a bash script that checks some log files created by a cron job that have time stamps in the filename (down to the second). It uses the following code:
CRON_LOG=$(ls -1 $LOGS_DIR/fetch_cron_{true,false}_$CRON_DATE*.log 2> /dev/null | sed 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*/\1 &/' | sort -n | cut -d ' ' -f2- | tail -1 )
if [ -f "$CRON_LOG" ]; then
    printf "Checking $CRON_LOG for errors\n"
else
    printf "\n${txtred}Error: cron log for $CRON_NOW does not exist.${txtrst}\n"
    printf "Either the specified date is too old for the log to still be around or there is a problem.\n"
    exit 1
fi
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
if [ -z "$CRIT_ERRS" ]; then
    printf "%74s[${txtgrn}PASS${txtrst}]\n"
else
    printf "%74s[${txtred}FAIL${txtrst}]\n"
    printf "Critical errors detected! Outputting to console...\n"
    echo $CRIT_ERRS
fi
So this bit of code works fine, but I'm trying to clean up my scripts now and implement set -e at the top of all of them. When I do it to this script, it exits with error code 1. Note that I have errors from the first statement dumping to /dev/null. This is because some days the file has the word "true" and other days "false" in it. Anyway, I don't think this is my problem, because the script outputs "Checking xxxxx.log for errors" before exiting when I add set -e to the top.
Note: the $CRON_DATE variable is derived from user input. I can run the exact same statement from the command line, ./checkcron.sh 01/06/2010, and it works fine without the set -e statement at the top of the script.
UPDATE: I added "set -x" to my script and narrowed the problem down. The last bit of output is:
Checking /map/etl/tektronix/logs/fetch_cron_false_010710054501.log for errors
++ cat /map/etl/tektronix/logs/fetch_cron_false_010710054501.log
++ grep ERROR
++ grep -v 'Duplicate tracking code'
+ CRIT_ERRS=
[1]+ Exit 1 ./checkLoad.sh...
So it looks like the problem is occurring on this line:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
Any help is appreciated. :)
Thanks,
Ryan
Adding set -x, which prints a trace of the script's execution, may help you diagnose the source of the error.
Edit:
Your grep is returning an exit code of 1 since it's not finding the "ERROR" string.
Edit 2:
My apologies regarding the colon. I didn't test it.
However, the following works (I tested this one before spouting off) and avoids calling the external cat. Because you're setting a variable using the results of a subshell and set -e looks at the subshell as a whole, you can do this:
CRIT_ERRS=$(grep "ERROR" "$CRON_LOG" | grep -v "Duplicate tracking code"; true)
bash -c 'f=`false`; echo $?'
1
bash -c 'f=`true`; echo $?'
0
bash -e -c 'f=`false`; echo $?'
bash -e -c 'f=`true`; echo $?'
0
Note that backticks (and $()) "return" the error code of the last command they run. Solution:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code" | cat)
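Following the pattern of the tests above, the | cat fix can be verified the same way (my own check, not part of the original answer):
bash -e -c 'f=`false | cat`; echo $?'
0
cat exits 0 regardless of what the earlier command did, so the substitution as a whole succeeds and errexit does not fire.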
Redirecting error messages to /dev/null does nothing about the exit status returned by the script. The reason your ls command isn't causing the error is because it's part of a pipeline, and the exit status of the pipeline is the return value of the last command in it (unless pipefail is enabled).
Given your update, it looks like the command that's failing is the last grep in the pipeline. grep only returns 0 if it finds a match; otherwise it returns 1, and if it encounters an error, it returns 2. This is a danger of set -e: things can fail even when you don't expect them to, because commands like grep return a non-zero status even when there hasn't been an error. It also fails to exit on errors earlier in a pipeline, and so may miss some errors.
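For instance (demonstrating grep's documented exit statuses):
$ echo hello | grep -q hello; echo $?
0
$ echo hello | grep -q nomatch; echo $?
1
$ grep -q pattern /no/such/file; echo $?
grep: /no/such/file: No such file or directory
2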
The solutions given by geocar or ephemient (piping through cat or using || : to ensure that the last command in the pipe returns successfully) should help you get around this, if you really want to use set -e.
Asking for set -e makes the script exit as soon as a simple command exits with a non-zero exit status. This combines perniciously with your ls command, which exits with a non-zero status when asked to list a non-existent file, which is always the case for you because the true and false variants don't co-exist.
