How can I use exit codes to run shell scripts sequentially? - linux

Since cruise control is full of bugs that have wasted my entire week, I have decided the existing shell scripts I have are simpler and thus better.
Here is what I have so far
svn update /var/www/k12/
#svn log --revision "HEAD" /var/www/code/ | head -2 | tail -1 | awk '{print $1}' > /var/www/path/version.txt
# upload the files
rsync -ar --verbose --stats --progress --delete --exclude=*.svn /var/www/code/ example.com:/home/path
# bring database up to date
ssh example.com 'php /path/tasks/dbrefactor.php'
# notify me
ssh example.com 'php /path/tasks/build.php'
Only thing is the other day I changed the paths and forgot to update the rsync call. As a result the "notify me" step ran several times while I was figuring stuff out.
I know that in Linux you can do command1 && command2, and if command1 "fails" command2 will not run, but how can I observe the "failure/success" exit codes for debugging purposes? Some of the scripts I wrote myself, and I'm sure I will need to do something special.

The best option, especially for unattended scripts, is to set the -e shell option:
#!/bin/sh -e
or
set -e
This will cause the shell to stop executing if any (untested) command exits with a nonzero error code.
-e Exit immediately if a simple command (see SHELL GRAMMAR
above) exits with a non-zero status. The shell does not
exit if the command that fails is part of an until or
while loop, part of an if statement, part of a && or ||
list, or if the command's return value is being inverted
via !. A trap on ERR, if set, is executed before the
shell exits.
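To see the effect of -e without risking your own script, here is a minimal sketch. It assumes a POSIX shell is available as sh; false is only a stand-in for any failing command, and the variable names are made up for illustration.

```shell
#!/bin/sh
# Run a small script body in a child shell with -e and see how far it gets.
# 'false' stands in for any command that fails.
output=$(sh -e -c 'echo "step 1"; false; echo "step 2"')
status=$?
echo "output: $output"    # only "step 1" was printed
echo "status: $status"    # non-zero, because 'false' aborted the child shell

# A failing command that is tested (here by 'if') does not trigger the exit.
tested=$(sh -e -c 'if false; then echo yes; else echo no; fi; echo "still running"')
echo "$tested"
```

The second case matches the "(untested)" wording above: a command whose status is examined by if, while, until, && or || is exempt from -e.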

The exit code of the most recently executed command is available in the $? variable immediately after it finishes. By convention (it is not required, but practically everyone follows it), a successful command exits with 0, and any other value means an error.
Be aware of the caveats! One of them is that after these commands:
svn log --revision "HEAD" /var/www/code/ | head -2 | tail -1 | awk '{print $1}'
echo "$?"
the result will most likely be zero, because $? holds the return code of awk, the last command in the pipeline. To avoid this, set the pipefail option somewhere above the code:
set -o pipefail
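A small sketch of the difference, assuming bash (pipefail and PIPESTATUS are not available in every sh). The pipeline false | true is only a stand-in for a pipeline whose first stage fails; the variable names are illustrative.

```shell
#!/usr/bin/env bash
# Default behaviour: the pipeline's status is that of its last command.
false | true
default_status=$?

# With pipefail: the status is that of the rightmost command that failed,
# or zero if every command in the pipeline succeeded.
set -o pipefail
false | true
pipefail_status=$?
set +o pipefail

# PIPESTATUS (bash only) holds the status of every stage of the last pipeline.
false | true
stages="${PIPESTATUS[*]}"

echo "default: $default_status, pipefail: $pipefail_status, stages: $stages"
```

PIPESTATUS has to be read immediately after the pipeline, because the next command overwrites it.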

The return value of the last-run command is stored in the variable $?. You can use that to determine which command to run next. Overview of special variables.

I think $? contains the last exit code:
if [[ $? -eq 0 ]]
then
# notify me
ssh example.com 'php /path/tasks/build.php'
fi

I would suggest calling exit with a non-zero value at the points where a failure can occur, and then checking the result before proceeding to the next step:
if [ $? -ne 0 ]
If that test is true, there was a failure.
$? always holds a non-zero number if the last process did not execute successfully.
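As a rough sketch of how the answers above fit together: testing the command directly with if avoids most uses of $?, and when the code itself is needed it should be saved at once. The functions upload and notify are hypothetical stand-ins for the rsync and "notify me" steps in the question, not real commands.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real steps.
upload() { return 3; }            # pretend the upload failed with status 3
notify() { echo "notified"; }

if upload; then
    notify
    result="notified"
else
    status=$?                     # save it at once: the next command overwrites $?
    echo "upload failed with exit code $status; skipping notify" >&2
    result="skipped"
fi
```

With this shape the notify step can only run after a successful upload, and the failing exit code is still available for debugging.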

Related

What's the meaning of a ! before a command in the shell?

What is the purpose of a shell command (part of a shell script) starting with an exclamation mark?
Concrete example:
In foo.sh:
#!/usr/bin/env bash
set -e
! docker stop foo
! docker rm -f foo
# ... other stuff
I know that without the space the exclamation mark is used for history replacements and ! <expression> according to the man page can be used to evaluate "True if expr is false". But in the example context that does not make sense to me.
TL;DR: This is just by-passing the set -e flag in the specific line where you are using it.
Adding to hek2mgl's correct and useful answer.
You have:
set -e
! command
Bash Reference Manual → Pipelines describes:
Each command in a pipeline is executed in its own subshell. The exit status of a pipeline is the exit status of the last command in the pipeline (...). If the reserved word ‘!’ precedes the pipeline, the exit status is the logical negation of the exit status as described above. The shell waits for all commands in the pipeline to terminate before returning a value.
This means that ! preceding a command is negating the exit status of it:
$ echo 23
23
$ echo $?
0
# But
$ ! echo 23
23
$ echo $?
1
Or:
$ echo 23 && echo "true" || echo "fail"
23
true
$ ! echo 23 && echo "true" || echo "fail"
23
fail
The exit status is useful in many ways. In your script, set -e makes the script exit whenever a command returns a non-zero status.
Thus, when you have:
set -e
command1
command2
If command1 returns a non-zero status, the script will finish and won't proceed to command2.
However, there is also an interesting point to mention, described in 4.3.1 The Set Builtin:
-e
Exit immediately if a pipeline (see Pipelines), which may consist of a single simple command (see Simple Commands), a list (see Lists), or a compound command (see Compound Commands) returns a non-zero status. The shell does not exit if the command that fails is part of the command list immediately following a while or until keyword, part of the test in an if statement, part of any command executed in a && or || list except the command following the final && or ||, any command in a pipeline but the last, or if the command’s return status is being inverted with !. If a compound command other than a subshell returns a non-zero status because a command failed while -e was being ignored, the shell does not exit. A trap on ERR, if set, is executed before the shell exits.
Taking all of these into consideration, when you have:
set -e
! command1
command2
What you are doing is bypassing the set -e flag for command1. Why?
If command1 runs properly, it will return a zero status. ! will negate it, but set -e won't trigger an exit, because the non-zero status comes from a return status inverted with !, as described above.
If command1 fails, it will return a non-zero status. ! will negate it, so the line will end up returning a zero status and the script will continue normally.
If you don't want the script to fail in both cases, error or success of the command, you can also use this alternative:
set -e
docker stop foo || true
The boolean or true makes the pipeline always have 0 as the return value.
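The cases discussed above can be compared side by side. This is only a sketch: each variant runs in its own child shell started with -e, and false / true stand in for a command such as docker stop foo failing or succeeding.

```shell
#!/bin/sh
# A plain failing command aborts the child shell before "reached" is printed.
plain=$(sh -e -c 'false; echo reached')
plain_status=$?

# An inverted failing command yields status 0, so the script continues.
negated=$(sh -e -c '! false; echo reached')
negated_status=$?

# An inverted succeeding command yields status 1, but -e ignores inverted statuses.
negated_ok=$(sh -e -c '! true; echo reached')

# "|| true" also lets the script continue after a failure.
or_true=$(sh -e -c 'false || true; echo reached')

echo "plain: '$plain' ($plain_status), negated: '$negated' ($negated_status)"
echo "negated_ok: '$negated_ok', or_true: '$or_true'"
```

One practical difference: after ! command the value of $? is inverted, whereas after command || true it is always 0, so neither form preserves the original exit code.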

Re-installing Linux O.S. and then running bunch of commands in a .sh script , how to stop the script if something fails?

If I copy and paste all the commands into the terminal, some do not even go through.
So the solution is perhaps to turn the file into an executable file and then execute it.
But what if some commands fail? The script keeps on executing the other commands.
Obviously there is no solution to this, right?
The easiest way to do this is to use the -e option in your shell. For example:
#!/bin/sh -e
command1
command2
In this script, if command1 fails, then the script as a whole will fail at that point without running any further commands.
You can check the error code from commands you run
#!/bin/bash
function test {
"$@"
status=$?
if [ $status -ne 0 ]; then
echo "error with $1"
exit 255
fi
return $status
}
test ls
test ps -ef
test not_a_command
This is taken from "Checking Bash exit status of several commands efficiently"; see that question for more information.
@Terminal, you were almost there.
If you just stick && on the end of each command, then execution will stop with the first failure (ie. the first command that returns a non-zero exit code).
Example:
#!/bin/sh
true &&
echo 'got here' &&
echo 'got here too' &&
false &&
echo 'also got here'
produces the output
got here
got here too
(Actually, I thought it would also require line-continuation markers, && \, but a quick test showed otherwise.)
Note: All of the above assumes that your shell is bash; I can't speak for other shells.

What does "who | grep $1" command do in the shell script?

I am learning shell programming from the very basics using the book called Beginning Linux Programming (4th Edition). I am confused by this script with an until-clause:
#!/bin/bash
until who | grep "$1" > /dev/null
do
sleep 60
done
# Now ring the bell and announce the unexpected user.
echo -e '\a'
echo "***** $1 has just logged in *****"
exit 0
My question is: what is who | grep "$1" > /dev/null used for here? Why redirect the grep output to /dev/null?
The 'until' loop tests a condition, as you mentioned, and runs the 'do ... done' block repeatedly until that condition becomes true. In other words, it executes the code block only while the condition is FALSE. The script you are testing is useful for catching the login of a user whose name you pass as a parameter to the script (hence the grep "$1", $1 being a positional parameter). It sleeps for a minute at a time (sleep 60) until that user logs in to the system, and then it exits the loop and does all the '$1 has just logged in' stuff. The redirection of grep's output to /dev/null is there to keep the output of the grep command from being displayed (you could have used grep -q "$1" to achieve the same effect).
Hope to have clarified your doubts.
while and until (and, admittedly, if) look at the exit code of the test, not at any text that may or may not be generated on stdout (or stderr).
I suspect the reason redirection to /dev/null has been used is because the command only generates output if there is a match, most of the time there is (admittedly) none, but when there is, you're not interested in seeing the result.
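A short sketch of both points, assuming a grep that supports -q (it is specified by POSIX). The user names and the check function are invented for illustration; check fails twice and then succeeds, standing in for who | grep -q "$1".

```shell
#!/bin/sh
# grep -q prints nothing; only its exit status says whether a match was found.
printf 'alice\nbob\n' | grep -q bob
found_status=$?        # 0: a match exists
printf 'alice\nbob\n' | grep -q carol
missing_status=$?      # 1: no match

# 'until' re-runs the test after each pass through the body and stops as soon
# as the test exits with status 0.
attempts=0
check() {
    attempts=$((attempts + 1))
    [ "$attempts" -ge 3 ]
}

until check; do
    echo "attempt $attempts: not yet"
    sleep 1
done
echo "condition met after $attempts attempts"
```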

Aborting a shell script if any command returns a non-zero value

I have a Bash shell script that invokes a number of commands.
I would like to have the shell script automatically exit with a return value of 1 if any of the commands return a non-zero value.
Is this possible without explicitly checking the result of each command?
For example,
dosomething1
if [[ $? -ne 0 ]]; then
exit 1
fi
dosomething2
if [[ $? -ne 0 ]]; then
exit 1
fi
Add this to the beginning of the script:
set -e
This will cause the shell to exit immediately if a simple command exits with a nonzero exit value. A simple command is any command not part of an if, while, or until test, or part of an && or || list.
See the bash manual on the "set" internal command for more details.
It's really annoying to have a script stubbornly continue when something fails in the middle and breaks assumptions for the rest of the script. I personally start almost all portable shell scripts with set -e.
If I'm working with bash specifically, I'll start with
set -Eeuo pipefail
This covers more error handling in a similar fashion. I consider these as sane defaults for new bash programs. Refer to the bash manual for more information on what these options do.
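For a feel of what some of these options add on top of -e, here is a minimal sketch, assuming bash is on the PATH. Each case runs in a child bash so that the failure does not end the demonstration itself; undefined_variable and the other names are made up for illustration. The -E option (inheritance of the ERR trap by functions) is not shown.

```shell
#!/usr/bin/env bash
# -u (nounset): expanding an unset variable aborts the script.
nounset_out=$(bash -c 'set -u; echo "value: $undefined_variable"; echo reached' 2>/dev/null)
nounset_status=$?

# -e alone ignores a failure in an earlier stage of a pipeline...
lenient=$(bash -c 'set -e; false | cat; echo reached')

# ...while -e together with pipefail aborts on it.
strict=$(bash -c 'set -eo pipefail; false | cat; echo reached')
strict_status=$?

echo "nounset: status $nounset_status, output '$nounset_out'"
echo "lenient: '$lenient'"
echo "strict:  status $strict_status, output '$strict'"
```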
To add to the accepted answer:
Bear in mind that set -e is sometimes not enough, especially if you have pipes.
For example, suppose you have this script
#!/bin/bash
set -e
./configure > configure.log
make
... which works as expected: an error in configure aborts the execution.
Tomorrow you make a seemingly trivial change:
#!/bin/bash
set -e
./configure | tee configure.log
make
... and now it does not work. This is explained here, and a workaround (Bash only) is provided:
#!/bin/bash
set -e
set -o pipefail
./configure | tee configure.log
make
The if statements in your example are unnecessary. Just do it like this:
dosomething1 || exit 1
If you take Ville Laurikari's advice and use set -e then for some commands you may need to use this:
dosomething || true
The || true will make the command pipeline have a true return value even if the command fails, so the -e option will not kill the script.
If you have cleanup you need to do on exit, you can also use 'trap' with the pseudo-signal ERR. This works the same way as trapping INT or any other signal; bash throws ERR if any command exits with a nonzero value:
# Create the trap with
# trap COMMAND SIGNAME [SIGNAME2 SIGNAME3...]
trap "rm -f /tmp/$MYTMPFILE; exit 1" ERR INT TERM
command1
command2
command3
# Partially turn off the trap.
trap - ERR
# Now a control-C will still cause cleanup, but
# a nonzero exit code won't:
ps aux | grep blahblahblah
Or, especially if you're using "set -e", you could trap EXIT; your trap will then be executed when the script exits for any reason, including a normal end, interrupts, an exit caused by the -e option, etc.
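A sketch of the EXIT variant, assuming sh and mktemp are available. The demonstration runs in a child shell so that its exit does not end the outer script; record and workfile are illustrative names.

```shell
#!/bin/sh
record=$(mktemp)        # the child writes the name of its temp file here

sh -e -c '
    workfile=$(mktemp)
    # The trap runs on any exit, including one caused by -e.
    trap "rm -f \"$workfile\"" EXIT
    echo "$workfile" > "$1"
    false               # aborts the child shell under -e
    echo "never reached"
' sh "$record"
child_status=$?

workfile=$(cat "$record")
rm -f "$record"
if [ -e "$workfile" ]; then leftover=yes; else leftover=no; fi
echo "child exited with $child_status; temp file left behind: $leftover"
```

Because the trap only removes the file and does not call exit itself, the child still reports the failing command's status.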
The $? variable is rarely needed. The pseudo-idiom command; if [ $? -eq 0 ]; then X; fi should always be written as if command; then X; fi.
The cases where $? is required is when it needs to be checked against multiple values:
command
case $? in
(0) X;;
(1) Y;;
(2) Z;;
esac
or when $? needs to be reused or otherwise manipulated:
if command; then
echo "command successful" >&2
else
ret=$?
echo "command failed with exit code $ret" >&2
exit $ret
fi
Run it with -e or set -e at the top.
Also look at set -u.
On error, the below script will print a RED error message and exit.
Put this at the top of your bash script:
# BASH error handling:
# exit on command failure
set -e
# keep track of the last executed command
trap 'LAST_COMMAND=$CURRENT_COMMAND; CURRENT_COMMAND=$BASH_COMMAND' DEBUG
# on error: print the failed command
trap 'ERROR_CODE=$?; FAILED_COMMAND=$LAST_COMMAND; tput setaf 1; echo "ERROR: command \"$FAILED_COMMAND\" failed with exit code $ERROR_CODE"; tput sgr0;' ERR INT TERM
An expression like
dosomething1 && dosomething2 && dosomething3
will stop processing when one of the commands returns with a non-zero value. For example, the following command will never print "done":
cat nosuchfile && echo "done"
echo $?
1
#!/bin/bash -e
should suffice.
I am adding another one for reference, since there was an additional question about Mark Edgars input; this example touches on the topic overall:
[[ `cmd` ]] && echo success_else_silence
This is similar in spirit to cmd || exit errcode, as someone showed, although [[ `cmd` ]] tests whether cmd printed any output rather than its exit status.
For example, I want to make sure a partition is unmounted if mounted:
[[ `mount | grep /dev/sda1` ]] && umount /dev/sda1

Why does set -e cause my script to exit when it encounters the following?

I have a bash script that checks some log files created by a cron job that have time stamps in the filename (down to the second). It uses the following code:
CRON_LOG=$(ls -1 $LOGS_DIR/fetch_cron_{true,false}_$CRON_DATE*.log 2> /dev/null | sed 's/^[^0-9][^0-9]*\([0-9][0-9]*\).*/\1 &/' | sort -n | cut -d ' ' -f2- | tail -1 )
if [ -f "$CRON_LOG" ]; then
printf "Checking $CRON_LOG for errors\n"
else
printf "\n${txtred}Error: cron log for $CRON_NOW does not exist.${txtrst}\n"
printf "Either the specified date is too old for the log to still be around or there is a problem.\n"
exit 1
fi
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
if [ -z "$CRIT_ERRS" ]; then
printf "%74s[${txtgrn}PASS${txtrst}]\n"
else
printf "%74s[${txtred}FAIL${txtrst}]\n"
printf "Critical errors detected! Outputting to console...\n"
echo $CRIT_ERRS
fi
So this bit of code works fine, but I'm trying to clean up my scripts now and implement set -e at the top of all of them. When I do it to this script, it exits with error code 1. Note that I have errors from the first statement dumping to /dev/null. This is because some days the file has the word "true" and other days "false" in it. Anyway, I don't think this is my problem, because the script outputs "Checking xxxxx.log for errors." before exiting when I add set -e to the top.
Note: the $CRON_DATE variable is derived from user input. I can run the exact same statement from the command line, "$./checkcron.sh 01/06/2010", and it works fine without the set -e statement at the top of the script.
UPDATE: I added "set -x" to my script and narrowed the problem down. The last bit of output is:
Checking /map/etl/tektronix/logs/fetch_cron_false_010710054501.log for errors
++ cat /map/etl/tektronix/logs/fetch_cron_false_010710054501.log
++ grep ERROR
++ grep -v 'Duplicate tracking code'
+ CRIT_ERRS=
[1]+ Exit 1 ./checkLoad.sh...
So it looks like the problem is occurring on this line:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code")
Any help is appreciated. :)
Thanks,
Ryan
Adding set -x, which prints a trace of the script's execution, may help you diagnose the source of the error.
Edit:
Your grep is returning an exit code of 1 since it's not finding the "ERROR" string.
Edit 2:
My apologies regarding the colon. I didn't test it.
However, the following works (I tested this one before spouting off) and avoids calling the external cat. Because you're setting a variable using the results of a subshell and set -e looks at the subshell as a whole, you can do this:
CRIT_ERRS=$(grep "ERROR" "$CRON_LOG" | grep -v "Duplicate tracking code"; true)
bash -c 'f=`false`; echo $?'
1
bash -c 'f=`true`; echo $?'
0
bash -e -c 'f=`false`; echo $?'
bash -e -c 'f=`true`; echo $?'
0
Note that backticks (and $()) "return" the error code of the last command they run. Solution:
CRIT_ERRS=$(cat $CRON_LOG | grep "ERROR" | grep -v "Duplicate tracking code" | cat)
Redirecting error messages to /dev/null does nothing about the exit status returned by the script. The reason your ls command isn't causing the error is because it's part of a pipeline, and the exit status of the pipeline is the return value of the last command in it (unless pipefail is enabled).
Given your update, it looks like the command that's failing is the last grep in the pipeline. grep only returns 0 if it finds a match; otherwise it returns 1, and if it encounters an error, it returns 2. This is a danger of set -e; things can fail even when you don't expect them to, because commands like grep return non-zero status even if there hasn't been an error. It also fails to exit on errors earlier in a pipeline, and so may miss some error.
The solutions given by geocar or ephemient (piping through cat or using || : to ensure that the last command in the pipe returns successfully) should help you get around this, if you really want to use set -e.
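A possible refinement, sketched here under the assumption that you want to tolerate "no match" but still stop on a real grep error: accept status 1 and treat anything higher as a failure. The log contents and variable names below are invented for the example.

```shell
#!/bin/sh
set -e
log=$(mktemp)
printf 'INFO start\nINFO done\n' > "$log"

# "|| true" keeps set -e from aborting when nothing matches...
crit_errs=$(grep "ERROR" "$log" || true)

# ...but it also hides real failures such as an unreadable file.
# To keep those visible, tolerate only status 1 ("no match").
status=0
matches=$(grep "ERROR" "$log") || status=$?
if [ "$status" -gt 1 ]; then
    echo "grep itself failed with status $status" >&2
    exit "$status"
fi
rm -f "$log"
echo "no critical errors found (grep status $status)"
```

The assignment is part of a || list, so set -e does not fire on it, and $? on the right-hand side still holds grep's status.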
Asking for set -e makes the script exit as soon as a simple command exits with a non-zero exit status. This combines perniciously with your ls command, which exits with a non-zero status when asked to list a non-existent file, which is always the case for you because the true and false variants don't co-exist.
