How to handle errors/exceptions in a shell script? - linux

Below is the script that I am executing in bash, and it works fine.
fileexist=0
for i in $( ls /data/read-only/clv/daily/Finished-HADOOP_EXPORT_&processDate#.done); do
    mv "$i" /data/read-only/clv/daily/archieve-wip/
    fileexist=1
done
Problem Statement:
The above shell script has to be run daily via a cron job, and it has no error/exception handling at all, so if anything goes wrong I have no idea what happened.
Other scripts depend on the data this script produces, so whenever something goes wrong I get complaints from the people who rely on that data.
Is there any way I can get notified if anything goes wrong in my script? For example, if the cluster is under maintenance while my script runs, it will certainly fail, and I would like to be notified so that I know something went wrong.
Hope my question is clear enough.
Any thoughts will be appreciated.

You can check for the exit status of each command, as freetx answered, but this is manual error checking rather than exception handling. The standard way to get the equivalent of exception handling in sh is to start the script with set -e. That tells sh to exit with a non-zero status as soon as any executed command fails (i.e. exits with a non-zero exit status).
If it is intended for some command in such a script to (possibly) fail, you can use the construct COMMAND || true, which will force a zero exit status for that expression. For example:
#!/bin/sh
# if any of the following fails, the script fails
set -e
mkdir -p destdir/1/2
mv foo destdir/1/2
touch /done || true # allowed to fail
Another way to ensure that you are notified when things go wrong in a script invoked by cron is to adhere to the Unix convention of printing nothing unless an error occurred. Successful runs will then pass without notice, and unsuccessful runs will cause the cron daemon to notify you of the error via email. Note that local mail delivery must be correctly configured on your system for this to work.
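For example, the daily job from the question could be run through a wrapper that stays silent on success; a minimal sketch, where /path/to/daily_export.sh is a placeholder for the actual script:
#!/bin/sh
# Capture all output; print it (and fail) only if the wrapped script fails,
# so cron sends mail exactly when something went wrong.
out=$(mktemp) || exit 1
trap 'rm -f "$out"' EXIT
if ! /path/to/daily_export.sh >"$out" 2>&1; then
    cat "$out"    # any output makes cron send the notification email
    exit 1
fi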

It's customary for every Unix command-line utility to return 0 on success and non-zero on failure. Therefore you can use the $? variable to inspect the last return value and handle things accordingly.
For instance:
> ls
file1 file2
> echo $?
0
> ls file.no.exist
> echo $?
1
Therefore, you can use this as rudimentary error detection to see if something went wrong. The normal approach would be:
some_command
if [ $? -gt 0 ]
then
    handle_error   # react to the failure here
fi
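Applied to the question above, the same pattern can send a notification on failure; a rough sketch, assuming a working mail(1) setup, that the date is available in a shell variable $processDate, and with a placeholder address:
mv /data/read-only/clv/daily/Finished-HADOOP_EXPORT_"$processDate".done /data/read-only/clv/daily/archieve-wip/
if [ $? -gt 0 ]
then
    echo "daily export move failed on $(hostname)" | mail -s "export script failed" you@example.com
    exit 1
fi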

Well, if the other scripts are on the same machine, they could pgrep for this script and, if it is found, sleep for a while and retry, rechecking until the process is gone.
If the script is on another machine (or even local), another method is to produce a temp file on the remote machine, accessible via a running HTTP server, that the other scripts can check for status, i.e. running or complete.
You could also wrap the script in another one that looks for these errors and emails you if it finds them; otherwise it sends the result as usual to whoever needs it.
go=0
check_running() {
    # count instances of the watched script; use -gt 1 instead if this
    # check runs inside your_script.sh itself, since pgrep will then
    # also match the current process
    running=$(pgrep -f your_script.sh | wc -l)
    if [ "$running" -gt 0 ]; then
        echo "already running $0 -- instances found $running"
        go=1
    fi
}

check_running
if [ $go -eq 0 ]; then
    execute_your_other_script   # placeholder for the dependent work
else
    sleep 120
    check_running
fi

Related

How to exit from a script once a certain condition is met in a function, without executing any further functions or lines of the script in bash

I am writing a bash script with some 15+ kubectl commands. After each kubectl command executes, I need to check whether its exit status ($?) is zero or not, i.e. whether it succeeded. If it is not zero, I should exit from the script without executing any further lines, commands, or functions. For this I wrote a function that I call after every command to perform the check, passing echo $? as an argument.
How do I exit completely from the script once $? != 0?
I'm sorry, but your script doesn't make much sense to me; I'll try to provide an answer to your question anyway.
In bash scripting, exit code 0 means success and anything else means an error, with some numbers having a special meaning.
To achieve what you want to do you need this simple function to be executed after every kubectl command:
function check() {
    [ $? -eq 0 ] || exit 1   # abort the whole script if the previous command failed
}

$command
check
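For instance, after each kubectl command (the manifest and deployment names below are made up):
kubectl apply -f deployment.yaml
check
kubectl rollout status deployment/my-app
check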

A way to specify a command to run if the previous one fails

Is it possible to trap an error (unknown command) from the CLI and do something when an error occurs?
To be more precise, I am looking for a way to do something like this:
if [ previousCommandFails ] ; then
echo lastCommand >> somewhere.txt
fi
Echo is just an example to say that I need access to this lastCommand. I want this to be default behaviour on my computer, so the code must be placed somewhere like ~/.bashrc.
You can try the following solution. I don't guarantee that it's a good solution, but it may help with your case.
Create a small script that tests the previous command, e.g. test.sh with this content:
if [ $? -ne 0 ]
then
    history 1 >> /path/to/failed_commands.txt
fi
Then set this variable:
PROMPT_COMMAND+="source /path/to/test.sh"
PROMPT_COMMAND If set, the value is executed as a command prior to
issuing each primary prompt.
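Note that appending without a separator can corrupt an existing PROMPT_COMMAND value (e.g. "foo" would become "foosource ..."); a slightly safer variant of the assignment, as a sketch:
PROMPT_COMMAND="${PROMPT_COMMAND:+$PROMPT_COMMAND; }source /path/to/test.sh"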
It depends on what you call a failure. If it is just a command returning a non-zero value, I am afraid that you have to explicitly test it after each command, or use a specialized shell (*).
But trap can be used to execute a specific command when a signal is received:
trap action signal
If this is not enough, you will have to get the source of a shell (POSIX shell or bash) and tweak it to meet your needs...
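On bash specifically (not POSIX sh), trap also accepts an ERR condition that fires whenever a command fails, which fits the ~/.bashrc use case; a minimal sketch (the log path is arbitrary):
trap 'echo "$BASH_COMMAND" >> ~/failed_commands.txt' ERR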

Don't update watch output unless the command succeeds

Is there a way to get watch to update the screen when the command succeeds? I have a command that infrequently succeeds, and I want it to show the last successful output.
Is there a way without a helper program?
watch does not let you conditionally show the output of a command, but you can start your command from a script and show its output depending on the exit status. Use something like this:
#!/bin/bash
# run the command; keep its output only if it succeeded
if cmd > /tmp/cmd_out.new; then
    mv /tmp/cmd_out.new /tmp/cmd_out
fi
# always display the most recent successful output
cat /tmp/cmd_out 2>/dev/null   # empty until the command has succeeded once
Of course, your command should return proper exit status (not just 0 in any case) or this method will not work.
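Assuming the wrapper above is saved as cmd_wrapper.sh (a made-up name) and made executable, you would then run it under watch as usual:
watch -n 10 ./cmd_wrapper.sh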

Concurrency with shell scripts in failure-prone environments

Good morning all,
I am trying to implement concurrency in a very specific environment, and keep getting stuck. Maybe you can help me.
This is the situation:
- I have N nodes that can read/write in a shared folder.
- I want to execute an application in one of them. This can be anything, like a shell script, an installed code, or whatever.
- To do so, I have to send the same command to all of them. The first one should start the execution, and the rest should see that somebody else is running the desired application and exit.
- The execution of the application can be killed at any time. This is important because it does not allow relying on any cleanup step after the execution.
- If the application gets killed, the user may want to execute it again. He would then send the very same command.
My current approach is to create a shell script that wraps the command to be executed. This could also be implemented in C, but not in Python or other languages, to avoid library dependencies.
#!/bin/sh
# (folder structure simplified for legibility)
mutex(){
    lockdir=".lock"
    firstTask=1 #false
    if mkdir "$lockdir" > /dev/null 2>&1   # &> is a bashism; this is the /bin/sh spelling
    then
        controlFile="controlFile"
        #if this is the first node, start coordinator
        if [ ! -f $controlFile ]; then
            firstTask=0 #true
            #tell the rest of the nodes that I am in control
            echo "some info" > $controlFile
        fi
        # remove the control file when the script finishes
        trap 'rm $controlFile' EXIT
    fi
    return $firstTask
}
#The basic idea is that one task executes the desired command, stated as arguments to this script. The rest do nothing.
if ! mutex ;
then
    exit 0
fi
#I am the first node and the only one reaching this point, so I execute whatever was passed in
"$@"
If there are no failures, this wrapper works great. The problem is that if the script is killed before the execution, the trap is not executed and the control file is not removed. Then, when we execute the wrapper again to restart the task, it won't work, as every node will think that somebody else is running the application.
A possible solution would be to remove the control file just before the "$@" call, but that would lead to a race condition.
Any suggestion or idea?
Thanks for your help.
Edit: edited with the correct solution for future reference.
Your trap syntax looks wrong. According to POSIX, it should be:
trap [action condition ...]
e.g.:
trap 'rm $controlFile' HUP INT TERM
trap 'rm $controlFile' 1 2 15
Note that $controlFile will not be expanded until the trap is executed if you use single quotes.
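Combining both, a sketch that removes the control file on normal exit as well as on the common termination signals (note that nothing can catch SIGKILL, so a kill -9 can still leave a stale control file behind):
cleanup() { rm -f "$controlFile"; }
trap cleanup EXIT
trap 'cleanup; trap - EXIT; exit 1' HUP INT TERM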

BASH: How monitor a script for execution failure

I'm using Linux and watching a script's execution so that it can be respawned when the script runs into an execution failure. Below is a simple script which should help demonstrate the problem.
Here's my script
#!/bin/bash
echo '**************************************'
echo '* Run IRC Bot *'
echo '**************************************'
echo '';
if [ -z "$1" ]
then
    echo 'Example usage: ' $0 'intelbot'
fi
until `php $1.php`;
do
    echo "IRC bot '$1' crashed with the code $?. Respawning.." >&2;
    sleep 5
done;
What kill option should I use to tell until: hey, I want this process to be killed, and I want you to get it working again?
Edit
The aim here was to manually check for a script-execution failure so the IRC bot could be respawned. The posted answer is very detailed, so +1 to the contributor: a supervisor is indeed the best way to tackle this problem.
First -- don't do this at all; use a proper process supervision system to automate restarting your program for you, not a shell script. Your operating system will ship with one, be it SysV init's /etc/inittab (which, yes, will restart programs so listed when they exit if given an appropriate flag), or the more modern upstart (shipped with Ubuntu), systemd (shipped with current Fedora and Arch Linux), runit, daemontools, supervisord, launchd (shipped with MacOS X), etc.
Second: The backticks actually make your code behave in unpredictable ways; so does the lack of quotes on an expansion.
`php $1.php`
...does the following:
1. Substitutes the value of $1 into a string; let's say it's my * code.php.
2. String-splits that value; in this case, it would change it into three separate arguments: my, *, and code.php.
3. Glob-expands those arguments; in this case, the * would be replaced with a separate argument for each file in the current directory.
4. Runs the resulting program.
5. Reads the output that program wrote to stdout, and runs that output as a separate command.
6. Returns the exit status of that separate command.
Instead:
until php "$1.php"; do
echo "IRC bot '$1' crashed with the code $?. Respawning.." >&2;
sleep 5
done;
Now, the exit status returned by PHP when it receives a SIGTERM is something that can be controlled by PHP's signal handler -- unless you tell us how your PHP code is written, only codes which can't be handled (such as SIGKILL) will behave in a manner that's entirely consistent, and because they can't be handled, they're dangerous if your program needs to do any kind of safe shutdown or cleanup.
If you want your PHP code to install a signal handler, so you can control its exit status when signaled, see http://php.net/manual/en/function.pcntl-signal.php
