I am creating a shell script that checks the status of a service running on another machine and, if it gets no response, performs some operation on the local system. I am using an if clause in the script for this task.
Sometimes, because of the network connection, it falsely assumes that the remote server is not responding and performs the tasks inside the if clause. I want to set up a retry so that it checks the if condition more than once when it gets no response on the first attempt.
Is there any way to set up something like a retry in a shell script for this purpose?
Below is some sample code:
RSI1_STATUS=$(psql -U username -h serverip -d postgres -t -c "select version();" )
if [ -z "$RSI1_STATUS" ] #Condition will be true if remote server is not active
then
touch /tmp/postgresql.trigger
fi
Now I want to check the if condition more than once if it is true on the first attempt.
You could add a retry loop with a fixed number of retries using a while loop:
retries=5
while ! check_network_connection && ((--retries)); do
sleep 1 # or probe the network, etc.
done
if [[ $retries -eq 0 ]]; then
echo "Error: Connection retries exhausted."
else
: # connection succeeded; an else branch needs at least one command, so put the real work here.
fi
Whether you want to sleep or do something else depends on your usage and the application.
Note: The "network connection" might have succeeded after checking in the loop. So if retries is 0, it doesn't necessarily mean that the connection is still down.
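Applied to the psql check from the question, the same idea could look roughly like the sketch below. The function names retry_check and remote_is_up are made up for illustration, and the attempt count and delay are arbitrary values to tune for your network.

```shell
#!/bin/bash
# retry_check ATTEMPTS DELAY COMMAND...
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
# (Hypothetical helper name, not part of the original answer.)
retry_check() {
    local attempts=$1 delay=$2 i
    shift 2
    for ((i = 1; i <= attempts; i++)); do
        "$@" && return 0
        # Only wait if another attempt is still to come.
        if ((i < attempts)); then sleep "$delay"; fi
    done
    return 1
}

# Hypothetical probe built from the question's command:
# succeeds only if psql returns a non-empty answer.
remote_is_up() {
    [ -n "$(psql -U username -h serverip -d postgres -t -c 'select version();' 2>/dev/null)" ]
}

# Example call, commented out so that sourcing the sketch has no side effects:
# retry_check 5 2 remote_is_up || touch /tmp/postgresql.trigger
```

With this shape the trigger file is only created after every attempt has failed, which is what the question asks for.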
I have a Jenkins pipeline job that goes through all our Jenkins servers and checks connectivity (it runs every few minutes).
ksh file:
#!/bin/ksh
JENKINS_URL=$1
curl --connect-timeout 10 "$JENKINS_URL" >/dev/null
status=$?
if [ "$status" = "7" ]; then
export SUBJECT="Connection refused or can not connect to URL $JENKINS_URL"
echo "$SUBJECT"|/usr/sbin/sendmail -t XXXX@gmail.com
else
echo "successfully connected $JENKINS_URL"
fi
exit 0
I would like to add another piece of code that records every time a server was down (including the name of the server and a timestamp) in a file and, when the server is up again, sends an email notification and records that in the file as well.
I don't want extra alerts: only one alert (to the file and by mail) when it goes down, and one when it comes up again. Any idea how to implement this?
A detailed answer was given by the unix.stackexchange community:
https://unix.stackexchange.com/questions/562594/how-to-set-and-record-alerts-for-jenkin-server-down-and-up
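For readers who just want the general shape, one common way to get "one alert per change" is to remember the last known state of each server in a small file and act only when the state flips. The sketch below is an illustration of that idea in bash, not the linked answer's code. The function name record_transition, the state directory and the log format are all assumptions, and NAME has to be a file-safe server name rather than a full URL.

```shell
#!/bin/bash
# record_transition NAME NEW_STATE STATE_DIR LOG_FILE
# Returns 0 (and appends a timestamped line to LOG_FILE) only when NEW_STATE
# differs from the state stored for NAME; returns 1 when nothing changed.
# A server with no stored state is assumed to have been "up".
record_transition() {
    local name=$1 new_state=$2 state_dir=$3 log_file=$4
    local state_file="$state_dir/$name.state" old_state="up"
    if [ -f "$state_file" ]; then
        old_state=$(<"$state_file")
    fi
    if [ "$new_state" != "$old_state" ]; then
        printf '%s\n' "$new_state" > "$state_file"
        printf '%s %s %s\n' "$(date '+%Y-%m-%d %H:%M:%S')" "$name" "$new_state" >> "$log_file"
        return 0
    fi
    return 1
}
```

The caller would run this after each curl check and send mail only when it returns 0, so repeated failures of the same server produce a single log line and a single mail.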
I'm trying to send a notification upon login via PAM, but I can't figure out how to send it to the user that is logging in.
I'm configuring PAM to execute a script every time a user logs in. The problem is that I need to send a notification if there have been any login attempts (it's part of a bigger security thing I'm trying to add, where my laptop takes a picture with the webcam upon failed logins and notifies me when I log in again, since my classmates like to try and guess my password for some reason).
The line in my .sh file that sends a user notification sends it to root, since that's the 'user' that executes the script. I want my script to send the notification to my current user (called "andreas"), but I'm having problems figuring this out.
Here is the line I added to the end of the PAM file system-login:
auth [default=ignore] pam_exec.so /etc/lockCam/call.sh
And here is the call.sh file:
#!/bin/sh
/etc/lockCam/notifier.sh &
The reason I'm calling another file is that I want it to run in the background WHILE the login process continues, so that it doesn't slow down logging in.
Here is the script that is then executed:
#!/bin/sh
#sleep 10s
echo -e "foo" > "/etc/lockCam/test"
#This line is simply to make sure that i know that my script was executed
newLogins=`sed -n '3 p' /etc/lockCam/lockdata`
if [ $newLogins -gt 0 ]
then
su andreas -c ' notify-send --urgency=critical --expire-time=6000 "Someone tried to log in!" "$newLogins new lockCam images!" && exit'
callsInRow=`sed -n '2 p' /etc/lockCam/lockdata`
crntS=$(date "+%S")
crntS=${crntS#0}
crntM=$(date "+%M")
crntM=${crntM#0}
crntH=$(date "+%H")
crntH=${crntH#0}
((crntTime = crntH * 60 * 60 + crntM * 60 + crntS))
#This whole process is absolutely stupid but i cant figure out a better way to make sure none of the integers are called "01" or something like that, which would trigger an error
echo -e "$crntTime\n$callsInRow\n0" > "/etc/lockCam/lockdata"
fi
exit 0
And this is where I THINK my error is: the line "su andreas -c ..." is most likely formatted wrong, or I'm doing something else wrong. Everything is executed upon login EXCEPT that the notification doesn't show up. If I execute the script from a terminal when I'm already logged in there is no notification either, unless I remove the "su andreas -c" part and simply do "notify-send ...", but that doesn't send out a notification when I log in, and I think that's because the notification is sent to the root user and not to "andreas".
I think your su needs to be passed the desktop user's D-Bus session bus address. The bus address can easily be obtained and used for X11 user sessions, but Wayland has tighter security: for Wayland, the user session actually has to run a proxy to receive the messages. (Have you considered that it might be easier to send an email?)
I have a notify-desktop gist on GitHub that works for X11 and should also work on Wayland (provided the proxy is running). For completeness I've appended the source code of the script to this post. It's extensively commented, and I think it contains the pieces necessary to get your own code working.
#!/bin/bash
# Provides a way for a root process to perform a notify send for each
# of the local desktop users on this machine.
#
# Intended for use by cron and timer jobs. Arguments are passed straight
# to notify send. Falls back to using wall. Care must be taken to
# avoid using this script in any potential fast loops.
#
# X11 users should already have a dbus address socket at /run/user/<userid>/bus
# and this script should work without requiring any initialisation. Should
# this not be the case, X11 users could initialise a proxy as per the wayland
# instructions below.
#
# Due to stricter security requirements Wayland lacks a dbus socket
# accessible to root. Wayland users will need to run a proxy to
# provide root with the necessary socket. Each user must add
# the following to a Wayland session startup script:
#
# notify-desktop --create-dbus-proxy
#
# That will start xdg-dbus-proxy process and make a socket available under:
# /run/user/<userid>/proxy_dbus_<desktop_sessionid>
#
# Once there is a listening socket, any root script or job can pass
# messages using the syntax of notify-send (man notify-send).
#
# Example messages
# notify-desktop -a Daily-backup -t 0 -i dialog-information.png "Backup completed without error"
# notify-desktop -a Remote-rsync -t 6000 -i dialog-warning.png "Remote host not currently on the network"
# notify-desktop -a Daily-backup -t 0 -i dialog-error.png "Error running backup, please consult journalctl"
# notify-desktop -a OS-Upgrade -t 0 -i dialog-warning.png "Update in progress, do not shutdown until further completion notice."
#
# Warnings:
# 1) There has only been limited testing on wayland
# 2) There has been no testing for multiple GUI sessions on one desktop
#
if [ "$1" == "--create-dbus-proxy" ]
then
if [ -n "$DBUS_SESSION_BUS_ADDRESS" ]
then
sessionid=$(cat /proc/self/sessionid)
xdg-dbus-proxy "$DBUS_SESSION_BUS_ADDRESS" "/run/user/$(id -u)/proxy_dbus_$sessionid" &
exit 0
else
echo "ERROR: no value for DBUS_SESSION_BUS_ADDRESS environment variable - not a wayland/X11 session?"
exit 1
fi
fi
function find_desktop_session {
for sessionid in $(loginctl list-sessions --no-legend | awk '{ print $1 }')
do
loginctl show-session -p Id -p Name -p User -p State -p Type -p Remote -p Display $sessionid |
awk -F= '
/[A-Za-z]+/ { val[$1] = $2; }
END {
if (val["Remote"] == "no" &&
val["State"] == "active" &&
(val["Type"] == "x11" || val["Type"] == "wayland")) {
print val["Name"], val["User"], val["Id"];
}
}'
done
}
count=0
while read -r -a desktop_info
do
if [ ${#desktop_info[@]} -eq 3 ]
then
desktop_user=${desktop_info[0]}
desktop_id=${desktop_info[1]}
desktop_sessionid=${desktop_info[2]}
proxy_bus_socket="/run/user/$desktop_id/proxy_dbus_$desktop_sessionid"
if [ -S $proxy_bus_socket ]
then
bus_address="$proxy_bus_socket"
else
bus_address="/run/user/$desktop_id/bus"
fi
sudo -u "$desktop_user" DBUS_SESSION_BUS_ADDRESS="unix:path=$bus_address" notify-send "$@"
count=$((count + 1))
fi
done <<<"$(find_desktop_session)"
# If no one has been notified fall back to wall
if [ $count -eq 0 ]
then
echo "$@" | wall
fi
# Don't want this to cause a job to stop
exit 0
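If you only need the single-user case from the question, the core of the script above boils down to running notify-send as that user with DBUS_SESSION_BUS_ADDRESS pointing at the user's bus socket. Here is a minimal sketch, assuming the socket lives at /run/user/<uid>/bus as described in the script's comments. The helper names bus_address_for and notify_user are made up for illustration.

```shell
#!/bin/bash
# Build the session bus address for a user, assuming the usual
# /run/user/<uid>/bus location mentioned in the script above.
bus_address_for() {
    printf 'unix:path=/run/user/%s/bus\n' "$(id -u "$1")"
}

# notify_user USER ARGS...
# Runs notify-send as USER against that user's session bus.
# Meant to be called by root, e.g. from the PAM-triggered script.
notify_user() {
    local user=$1
    shift
    sudo -u "$user" DBUS_SESSION_BUS_ADDRESS="$(bus_address_for "$user")" \
        notify-send "$@"
}

# Example (run as root), commented out so sourcing has no side effects:
# notify_user andreas --urgency=critical --expire-time=6000 \
#     "Someone tried to log in!" "$newLogins new lockCam images!"
```

Passing the message as separate arguments also lets the calling shell expand $newLogins, which does not happen inside the single-quoted su -c string in the question.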
It may look menacing, but the task is really simple:
server has the following directory structure:
/usr/multi
/1
job
file.a
file.b
/2
job
file.a
file.b
/3
job
file.a
file.b
And the following code:
#this is thread.sh
cd /usr/multi
#find the first directory that has a job file
id=$(ls */job)
#strip everything after "/" ("1/job" becomes "1")
id=${id%%/*}
#read job
read job <$id/job
if [ "$id" == "" ] || [ "$job" == "" ]
then
false
else
#mark that id as busy
mv $id/job $id/_job
#execute the job
script.sh $1 $job
#mark that id as available
mv $id/_job $id/job
fi
script.sh performs some operations (described in job file and received argument) on file.a and file.b.
Clients, on the other hand, execute this code:
#loop infinitely on failure, break loop on success
false
while [ "$?" != "0" ]
do
result=$(ssh $server "thread.sh 'some instructions'" </dev/null)
done
echo $result
So every client gets a separate id and gives the server some instructions to perform the job specified for that id. If there are more clients than available jobs on the server, clients will keep trying to grab the first available id and when they do, give some instructions to the server to perform corresponding job.
The problem is that every once in a while two clients get the same id; and thread.sh messes up the file.a and file.b.
My theory is that this happens when two clients request an id from the server at almost the same time, so that server cannot rename the job file quick enough for one client to see it as available, and the other one to see it as busy.
Should I put random sleep interval just before the if [ "$id" == "" ] || [ "$job" == "" ] so I will get some more randomness in the timing?
As you have correctly determined, your script is quite racy.
Simple locking in bash can be implemented using
set -o noclobber
and writing to a lockfile. If the lock is already being held (the file exists), your write attempt will fail in an atomic manner.
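A minimal sketch of that idea follows. The function names acquire_lock and release_lock and the lock path in the usage comment are assumptions for illustration, not part of thread.sh.

```shell
#!/bin/bash
# acquire_lock FILE
# Succeeds only for the one caller that manages to create FILE.
# With noclobber set, the redirection fails if FILE already exists.
acquire_lock() {
    # The subshell keeps noclobber from leaking into the caller.
    ( set -o noclobber; echo "$$" > "$1" ) 2>/dev/null
}

# release_lock FILE
# Removes the lock file so that another caller can take the lock.
release_lock() {
    rm -f "$1"
}

# Possible use in thread.sh (hypothetical lock path), commented out:
# if acquire_lock "/usr/multi/$id/lock"; then
#     script.sh "$1" "$job"
#     release_lock "/usr/multi/$id/lock"
# else
#     false   # somebody else holds this id, let the client retry
# fi
```

Because the test for existence and the creation of the file happen in one step, two clients arriving at the same time cannot both get the lock, which a random sleep would not guarantee.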
I run an automated backup shell script. It works great, but for some reason the FTP server blocks me for a few minutes. I would like to add a retry-and-wait feature. Below is a sample of my code.
echo "Moving to external server"
cd /root/backup/
/usr/bin/ftp -n -i $FTP_SERVER <<END_SCRIPT
user $FTP_USERNAME $FTP_PASSWORD
mput $FILE
bye
END_SCRIPT
After a failed login I get the message below:
Authentication failed. Blocked.
Login failed.
Incorrect sequence of commands: PASS required after USER
I need to capture such output and make the code sleep for a few minutes before trying again.
Any ideas?
If it's possible for you to install additional programs on the system in question, I encourage you to take a look at lftp.
With lftp it is possible to set parameters such as the time between reconnects manually.
To achieve your aim with lftp, you have to invoke the following:
lftp -u user,password ${FTP_SERVER} <<END
set ftp:retry-530 "Authentication failed"
set net:reconnect-interval-base 60
set net:reconnect-interval-multiplier 10
set net:max-retries 10
<some more custom commands>
END
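If the pattern after ftp:retry-530 matches the server's 530 reply, lftp retries instead of giving up. It waits 60 seconds (net:reconnect-interval-base) before the first retry, multiplies the wait by 10 (net:reconnect-interval-multiplier) after each further failure, subject to net:reconnect-interval-max, and stops after 10 unsuccessful tries (net:max-retries).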
The error message is probably going to stderr rather than stdout, so you will need to capture stderr as well:
while true
do
if ( script 2>&1 |grep -q 'Authentication failed' )
then
echo "authentication failed, sleeping for a while before trying again"
sleep 60
else
#everything worked, break out of the while loop
break
fi
done
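The loop above retries forever. If you would rather give up after a few tries, a bounded variant could look like the sketch below. The function name retry_on_output and its parameters are made up for illustration, and "script" in the answer corresponds to whatever command wraps your ftp here-document.

```shell
#!/bin/bash
# retry_on_output PATTERN MAX_TRIES DELAY COMMAND...
# Re-runs COMMAND while its combined stdout/stderr matches PATTERN.
# Returns 0 as soon as the output no longer matches, 1 after MAX_TRIES matches.
retry_on_output() {
    local pattern=$1 max_tries=$2 delay=$3 try output
    shift 3
    for ((try = 1; try <= max_tries; try++)); do
        output=$("$@" 2>&1)
        printf '%s\n' "$output"      # keep the output visible in the backup log
        if ! grep -q -- "$pattern" <<<"$output"; then
            return 0
        fi
        # Only wait if another try is still to come.
        if ((try < max_tries)); then sleep "$delay"; fi
    done
    return 1
}

# Example, commented out (upload_backup is a hypothetical function that
# contains the ftp here-document from the question):
# retry_on_output 'Authentication failed' 5 300 upload_backup
```

A non-zero return value then tells the rest of the backup script that the upload never went through.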
Sometimes when I execute a bash script with the curl command to upload some files to my ftp server, it will return some error like:
56 response reading failed
and I have to find the failing line and re-run it manually, after which it is OK.
I'm wondering if that could be re-run automatically when the error occurs.
My script looks like this:
#there are some files(A,B,C,D,E) in my to_upload directory,
# which I'm trying to upload to my ftp server with curl command
for files in `ls` ;
do curl -T $files ftp.myserver.com --user ID:pw ;
done
But sometimes A, B and C are uploaded successfully and only D is left with an "error 56", so I have to rerun the curl command manually. Besides, as Will Bickford said, I would prefer that no confirmation be required, because I'm always asleep when the script is running. :)
Here's a bash snippet I use to perform exponential back-off:
# Retries a command a configurable number of times with backoff.
#
# The retry count is given by ATTEMPTS (default 5), the initial backoff
# timeout is given by TIMEOUT in seconds (default 1.)
#
# Successive backoffs double the timeout.
function with_backoff {
local max_attempts=${ATTEMPTS-5}
local timeout=${TIMEOUT-1}
local attempt=1
local exitCode=0
while (( attempt <= max_attempts ))
do
if "$@"
then
return 0
else
exitCode=$?
fi
echo "Failure! Retrying in $timeout.." 1>&2
sleep $timeout
attempt=$(( attempt + 1 ))
timeout=$(( timeout * 2 ))
done
if [[ $exitCode != 0 ]]
then
echo "You've failed me for the last time! ($@)" 1>&2
fi
return $exitCode
}
Then use it in conjunction with any command that properly sets a failing exit code:
with_backoff curl 'http://monkeyfeathers.example.com/'
Perhaps this will help. It will try the command, and if it fails, it will tell you and pause, giving you a chance to fix run-my-script.
COMMAND=./run-my-script.sh
until $COMMAND; do
read -p "command failed, fix and hit enter to try again."
done
I have faced a similar problem where I need to make contact with servers using curl that are in the process of starting up and haven't started up yet, or services that are temporarily unavailable for whatever reason. The scripting was getting out of hand, so I made a dedicated retry tool that will retry a command until it succeeds:
#there are some files(A,B,C,D,E) in my to_upload directory,
# which I'm trying to upload to my ftp server with curl command
for files in `ls` ;
do retry curl -f -T $files ftp.myserver.com --user ID:pw ;
done
The curl command is given the -f option, which makes it exit with code 22 when an HTTP server returns an error status (400 or above). Other failures, such as the error 56 from the question, already produce a non-zero exit code.
By default the retry tool runs the curl command over and over until it returns status zero, backing off for 10 seconds between retries. In addition, retry reads from stdin once and once only, writes to stdout once and once only, and writes all stdout to stderr if the command fails.
Retry is available from here: https://github.com/minfrin/retry