If [ $? -ne 0 ]; not working? - linux

I am trying to detect a running service and if not there, try to do something:
#!/bin/bash
service --status-all | grep 'My Service' &> /dev/null
if [ $? -ne 0 ]; then
echo "Service not there."
else
echo "Service is there."
fi
The service is clearly running, but I am still getting "Service not there."
I read about the exit code $?. I think maybe the exit code of one command in a series of commands might affect what we actually want to test?
So I am not sure what went wrong there.

To debug what is happening with your test, run one step at a time.
First do service --status-all by itself and check its output. Is the output what you expect it to be, and does it actually include the 'My Service' that you are grepping for?
Then run service --status-all | grep 'My Service' and check its output and exit code. Does it print the match, and is its exit code 0?
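For example, a quick check at the prompt (output will vary on your system):
$ service --status-all | grep 'My Service'
$ echo $?
If echo $? prints 0, grep matched; anything else means it did not.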
man grep tells us that:
The grep utility exits with one of the following values:
0 One or more lines were selected.
1 No lines were selected.
>1 An error occurred.
and also
-q, --quiet, --silent
Quiet mode: suppress normal output. grep will only search a file until a
match has been found, making searches potentially less expensive.
There are also improvements to this process that you can make...
if tests the return status of the command list it executes; if that status is zero, the then branch runs. Knowing this, you can test the return status of grep directly instead of the return status of test.
aside:
You are using the [ command, which is also the test command (try man test)
The test command exits with 0 when the test passes (succeeds), or with 1 when the test fails.
$ test 7 -eq 7;echo $?
0
$ test 7 -ne 7;echo $?
1
$ [ 7 -eq 2 ];echo $?
1
With this knowledge, again, you can directly test the exit code of grep.
Suppress grep's output with the "quiet" flag instead of a redirection, and use grep -F to match fixed strings (the same matching fgrep performs):
if service --status-all | grep -Fq 'My Service'
then
echo "Service IS there."
else
echo "Service NOT there."
fi

Related

how to get the status of the last command along with its output in linux

I am trying to get the status of the last command, and I also want the output stored in a log file.
{spark_home}/bin/spark-submit ... 2>&1 | tee -a log1.txt
if [ $? -eq 0 ]; then
echo "success"
else
echo "fail"
applicationId=$(grep command to get the app id from log1.txt)
fi
But since $? checks the status of the last command, it always shows 0 (success), because I am writing the output to the log file with tee. Can someone help me get the status of spark-submit while still writing its logs to the log file?
When you use a pipe in bash, you can read the status of every command in the pipeline from the PIPESTATUS array:
$ ls | grep spamandegg
$ echo ${PIPESTATUS[@]}
0 1
Here ls succeeded, but grep found no match.
The length of the array is equal to the number of commands in the pipeline.
For you, the exit status of your spark-submit command is in ${PIPESTATUS[0]}
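For instance, a minimal sketch applying this to the question, where false stands in for the real spark-submit invocation and log1.txt is the asker's log file:
false 2>&1 | tee -a log1.txt
status=${PIPESTATUS[0]} # status of the first command in the pipe, not tee
if [ "$status" -eq 0 ]; then
echo "success"
else
echo "fail"
fi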

Problems of set -e with grep command [duplicate]

I am using the following options
set -o pipefail
set -e
In a bash script to stop execution on error. The script is ~100 lines, and I don't want to check the return code of every line.
But for one particular command, I want to ignore the error. How can I do that?
The solution:
particular_script || true
Example:
$ cat /tmp/1.sh
particular_script()
{
false
}
set -e
echo one
particular_script || true
echo two
particular_script
echo three
$ bash /tmp/1.sh
one
two
"three" will never be printed.
Also, note that when pipefail is on, the shell treats the entire pipeline as having a non-zero exit code when any command in the pipe exits non-zero (with pipefail off, only the last command's status counts).
$ set -o pipefail
$ false | true ; echo $?
1
$ set +o pipefail
$ false | true ; echo $?
0
Just add || true after the command where you want to ignore the error.
Don't stop and also save exit status
If you want your script not to stop when a particular command fails, and you also want to save the failed command's exit code:
set -e
EXIT_CODE=0
command || EXIT_CODE=$?
echo $EXIT_CODE
More concisely:
! particular_script
From the POSIX specification regarding set -e:
When this option is on, if a simple command fails for any of the reasons listed in Consequences of Shell Errors or returns an exit status value >0, and is not part of the compound list following a while, until, or if keyword, and is not a part of an AND or OR list, and is not a pipeline preceded by the ! reserved word, then the shell shall immediately exit.
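A minimal sketch of that exemption in action:
#!/usr/bin/env bash
set -e
! true # exit status is 1, but the leading ! exempts the pipeline from set -e
echo "still running" # this line is reached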
Instead of "returning true", you can also use the "noop" or null utility (as referred in the POSIX specs) : and just "do nothing". You'll save a few letters. :)
#!/usr/bin/env bash
set -e
man nonexistentthing || :
echo "It's ok.."
Thanks for the simple solution from above:
<particular_script/command> || true
The following construction can be used for additional actions/troubleshooting of script steps and extra flow-control options:
if <particular_script/command>
then
echo "<particular_script/command> is fine!"
else
echo "<particular_script/command> failed!"
#exit 1
fi
We can break off further actions and exit 1 if required.
I found another way to solve this:
set +e
find "./csharp/Platform.$REPOSITORY_NAME/obj" -type f -iname "*.cs" -delete
find "./csharp/Platform.$REPOSITORY_NAME.Tests/obj" -type f -iname "*.cs" -delete
set -e
You can turn off failing on errors with set +e; all errors after that line are then ignored. Once you are done and want the script to fail again on any error, use set -e.
After set +e, the find commands no longer fail the whole script when files are not found. At the same time, error messages from find are still printed and the script continues to execute, so it is easy to debug if that step causes a problem.
This is useful for CI & CD (for example in GitHub Actions).
If you want to prevent your script failing and collect the return code:
command () {
return 1 # or 0 for success
}
set -e
command && returncode=$? || returncode=$?
echo $returncode
returncode is collected no matter whether command succeeds or fails.
output=$(command 2>&1) && exit_status=$? || exit_status=$?
echo $output
echo $exit_status
Example of using this to create a log file
log_event(){
timestamp=$(date '+%D %T') #mm/dd/yy HH:MM:SS
echo -e "($timestamp) $event" >> "$log_file"
}
output=$(command 2>&1) && exit_status=$? || exit_status=$?
if [ "$exit_status" = 0 ]
then
event="$output"
log_event
else
event="ERROR $output"
log_event
fi
I have been using the snippet below when working with CLI tools where I want to know whether some resource exists, but I don't care about the output.
# The command substitution captures only stderr; empty stderr means cat succeeded.
if [ -z "$(cat no_exist 2>&1 >/dev/null)" ]; then
echo "no_exist actually exists!"
fi
While || true is the preferred option, you can also do
var=$(echo $(exit 1)) # it shouldn't fail
I kind of like this solution:
: `particular_script`
The command/script between the backticks is executed and its output is fed as arguments to the command : (which is equivalent to true).
$ false
$ echo $?
1
$ : `false`
$ echo $?
0


Aborting a shell script if any command returns a non-zero value

I have a Bash shell script that invokes a number of commands.
I would like to have the shell script automatically exit with a return value of 1 if any of the commands return a non-zero value.
Is this possible without explicitly checking the result of each command?
For example,
dosomething1
if [[ $? -ne 0 ]]; then
exit 1
fi
dosomething2
if [[ $? -ne 0 ]]; then
exit 1
fi
Add this to the beginning of the script:
set -e
This will cause the shell to exit immediately if a simple command exits with a nonzero exit value. A simple command is any command not part of an if, while, or until test, or part of an && or || list.
See the bash manual on the "set" internal command for more details.
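A minimal sketch of the effect:
#!/bin/bash
set -e
echo "step 1"
false # exits non-zero: the shell stops here with status 1
echo "step 2" # never printed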
It's really annoying to have a script stubbornly continue when something fails in the middle and breaks assumptions for the rest of the script. I personally start almost all portable shell scripts with set -e.
If I'm working with bash specifically, I'll start with
set -Eeuo pipefail
This covers more error handling in a similar fashion. I consider these sane defaults for new bash programs. See the bash manual for more information on what these options do.
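As a rough annotated summary (see the manual for the precise semantics):
set -E # ERR traps are inherited by functions, command substitutions, and subshells
set -e # exit immediately when a command fails
set -u # treat expansion of an unset variable as an error
set -o pipefail # a pipeline fails if any command in it fails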
To add to the accepted answer:
Bear in mind that set -e sometimes is not enough, especially if you have pipes.
For example, suppose you have this script
#!/bin/bash
set -e
./configure > configure.log
make
... which works as expected: an error in configure aborts the execution.
Tomorrow you make a seemingly trivial change:
#!/bin/bash
set -e
./configure | tee configure.log
make
... and now it does not work: the exit status of a pipeline is that of its last command (tee here), so a failing configure no longer aborts the script. A workaround (Bash only) is pipefail:
#!/bin/bash
set -e
set -o pipefail
./configure | tee configure.log
make
The if statements in your example are unnecessary. Just do it like this:
dosomething1 || exit 1
If you take Ville Laurikari's advice and use set -e then for some commands you may need to use this:
dosomething || true
The || true will make the command pipeline have a true return value even if the command fails, so the -e option will not kill the script.
If you have cleanup you need to do on exit, you can also use 'trap' with the pseudo-signal ERR. This works the same way as trapping INT or any other signal; bash throws ERR if any command exits with a nonzero value:
# Create the trap with
# trap COMMAND SIGNAME [SIGNAME2 SIGNAME3...]
trap "rm -f /tmp/$MYTMPFILE; exit 1" ERR INT TERM
command1
command2
command3
# Partially turn off the trap.
trap - ERR
# Now a control-C will still cause cleanup, but
# a nonzero exit code won't:
ps aux | grep blahblahblah
Or, especially if you're using "set -e", you could trap EXIT; your trap will then be executed when the script exits for any reason, including a normal end, interrupts, an exit caused by the -e option, etc.
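A minimal sketch of the EXIT variant under set -e; the temporary file is just an illustration:
#!/bin/bash
set -e
tmpfile=$(mktemp)
trap 'rm -f "$tmpfile"' EXIT # fires on normal end, set -e exit, or interrupt
echo "working with $tmpfile"
false # set -e exits here; the EXIT trap still runs the cleanup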
The $? variable is rarely needed. The pseudo-idiom command; if [ $? -eq 0 ]; then X; fi should always be written as if command; then X; fi.
The cases where $? is required are when it needs to be checked against multiple values:
command
case $? in
(0) X;;
(1) Y;;
(2) Z;;
esac
or when $? needs to be reused or otherwise manipulated:
if command; then
echo "command successful" >&2
else
ret=$?
echo "command failed with exit code $ret" >&2
exit $ret
fi
Run it with -e or set -e at the top.
Also look at set -u.
On error, the snippet below prints a RED error message and exits.
Put this at the top of your bash script:
# BASH error handling:
# exit on command failure
set -e
# keep track of the last executed command
trap 'LAST_COMMAND=$CURRENT_COMMAND; CURRENT_COMMAND=$BASH_COMMAND' DEBUG
# on error: print the failed command
trap 'ERROR_CODE=$?; FAILED_COMMAND=$LAST_COMMAND; tput setaf 1; echo "ERROR: command \"$FAILED_COMMAND\" failed with exit code $ERROR_CODE"; tput sgr0;' ERR INT TERM
An expression like
dosomething1 && dosomething2 && dosomething3
will stop processing when one of the commands returns with a non-zero value. For example, the following command will never print "done":
cat nosuchfile && echo "done"
echo $?
1
#!/bin/bash -e
should suffice.
I am just throwing in another one for reference, since there was a follow-up question to Mark Edgar's answer; here is an additional example that touches on the topic overall:
[[ `cmd` ]] && echo success_else_silence
Note that, strictly, this tests whether cmd produced any output rather than its exit status, unlike the cmd || exit errcode form shown earlier.
For example, I want to make sure a partition is unmounted if mounted:
[[ `mount | grep /dev/sda1` ]] && umount /dev/sda1

Why does set -e cause my script to exit when it encounters the following?

I have a bash script that checks some log files created by a cron job that have time stamps in the filename (down to the second). It uses the following code:
CRON_LOG=$(ls -1 $LOGS_DIR/fetch_cron_{true,false}_$CRON_DATE*.log 2> /dev/null | sed 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*/\1 &/' | sort -n | cut -d ' ' -f2- | tail -1 )
if [ -f "$CRON_LOG" ]; then
printf "Checking $CRON_LOG for errors\n"
else
printf "\n${txtred}Error: cron log for $CRON_NOW does not exist.${txtrst}\n"
printf "Either the specified date is too old for the log to still be around or there is a problem.\n"
exit 1
fi
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
if [ -z "$CRIT_ERRS" ]; then
printf "%74s[${txtgrn}PASS${txtrst}]\n"
else
printf "%74s[${txtred}FAIL${txtrst}]\n"
printf "Critical errors detected! Outputting to console...\n"
echo $CRIT_ERRS
fi
So this bit of code works fine, but I'm trying to clean up my scripts now and implement set -e at the top of all of them. When I do that to this script, it exits with error code 1. Note that I have errors from the first statement dumping to /dev/null; this is because some days the file has the word "true" in it and other days "false". Anyway, I don't think this is my problem, because the script outputs "Checking xxxxx.log for errors." before exiting when I add set -e to the top.
Note: the $CRON_DATE variable is derived from user input. I can run the exact same statement from the command line, ./checkcron.sh 01/06/2010, and it works fine without the set -e statement at the top of the script.
UPDATE: I added "set -x" to my script and narrowed the problem down. The last bit of output is:
Checking /map/etl/tektronix/logs/fetch_cron_false_010710054501.log for errors
++ cat /map/etl/tektronix/logs/fetch_cron_false_010710054501.log
++ grep ERROR
++ grep -v 'Duplicate tracking code'
+ CRIT_ERRS=
[1]+ Exit 1 ./checkLoad.sh...
So it looks like the problem is occurring on this line:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
Any help is appreciated. :)
Thanks,
Ryan
Adding set -x, which prints a trace of the script's execution, may help you diagnose the source of the error.
Edit:
Your grep is returning an exit code of 1 since it's not finding the "ERROR" string.
Edit 2:
My apologies regarding the colon. I didn't test it.
However, the following works (I tested this one before spouting off) and avoids calling the external cat. Because you're setting a variable using the results of a subshell and set -e looks at the subshell as a whole, you can do this:
CRIT_ERRS=$(grep "ERROR" "$CRON_LOG" | grep -v "Duplicate tracking code"; true)
bash -c 'f=`false`; echo $?'
1
bash -c 'f=`true`; echo $?'
0
bash -e -c 'f=`false`; echo $?'
bash -e -c 'f=`true`; echo $?'
0
Note that backticks (and $()) "return" the error code of the last command they run. Solution:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code" | cat)
Redirecting error messages to /dev/null does nothing about the exit status returned by the script. The reason your ls command isn't causing the error is because it's part of a pipeline, and the exit status of the pipeline is the return value of the last command in it (unless pipefail is enabled).
Given your update, it looks like the command that's failing is the last grep in the pipeline. grep only returns 0 if it finds a match; otherwise it returns 1, and if it encounters an error, it returns 2. This is a danger of set -e: things can fail even when you don't expect them to, because commands like grep return a non-zero status even when there hasn't been an error. It also fails to exit on errors earlier in a pipeline, and so may miss some errors.
The solutions given by geocar or ephemient (piping through cat or using || : to ensure that the last command in the pipe returns successfully) should help you get around this, if you really want to use set -e.
Asking for set -e makes the script exit as soon as a simple command exits with a non-zero exit status. This combines perniciously with your ls command, which exits with a non-zero status when asked to list a non-existent file, which is always the case for you because the true and false variants don't co-exist.
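A minimal demonstration of the pipeline masking described above, using a placeholder filename:
#!/bin/bash
set -e
ls no_such_file 2>/dev/null | tail -n 1 # ls fails, but the pipeline's status is tail's
echo "still running" # reached, because pipefail is off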
