Time out a curl command in a shell script (CentOS 7 / Linux)

checkServer(){
    response=$(curl --connect-timeout 10 --write-out %{http_code} --silent --output /dev/null localhost:8080/patuna/servicecheck)
    if [ "$response" = "200" ]; then
        echo "`date --rfc-3339=seconds` - Server is healthy, up and running"
        return 0
    else
        echo "`date --rfc-3339=seconds` - Server is not healthy (response code - $response), server is going to restart"
        startTomcat
    fi
}
Here I want to time out the curl command, but it does not work (CentOS 7 shell script). What I simply need is to time out the curl command.
The error is: curl: option --connect-timeout=: is unknown

checkServer(){
    response=$(curl --max-time 20 --connect-timeout 0 --write-out %{http_code} --silent --output /dev/null localhost:8080/patuna/servicecheck)
    if [ "$response" = "200" ]; then
        echo "`date --rfc-3339=seconds` - Server is healthy, up and running"
        return 0
    else
        echo "`date --rfc-3339=seconds` - Server is not healthy (response code - $response), server is going to restart"
        startTomcat
    fi
}

You can try the --max-time option.
Maximum time in seconds that you allow the whole operation to take. This is useful for preventing your batch jobs from hanging for hours due to slow networks or links going down. Since 7.32.0, this option accepts decimal values, but the actual timeout will decrease in accuracy as the specified timeout increases in decimal precision.
If you just want to check the HTTP status code, you might want to look at the --head option.
I suggest using --silent together with --show-error in case you also want to see the error message.
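For example, a minimal sketch combining those flags; the service URL is the one from the question and the 10/5-second limits are just placeholder values:
status=$(curl --silent --show-error --head --max-time 10 --connect-timeout 5 \
    --write-out '%{http_code}' --output /dev/null localhost:8080/patuna/servicecheck)
echo "HTTP status: $status"   # prints 000 if the request failed or timed out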

Related

How can this bash script be made shorter and better? Is there a way to combine multiple curl statements and if statements in bash?

Status_code_1=$(curl -o /dev/null -s -w "%{http_code}\n" -H "Authorization: token abc123" https://url_1.com)
Status_code_2=$(curl -o /dev/null -s -w "%{http_code}\n" -H "Authorization: token abc123" https://url_2.com)
if [ $Status_code_1 == "200" ]; then
    echo "url_1 is running successfully"
else
    echo "Error at url_1. Status code:" $Status_code_1
fi
if [ $Status_code_2 == "200" ]; then
    echo "url_2 is running successfully"
else
    echo "Error at url_2. Status code:" $Status_code_2
fi
The main script is scheduled to run every day and prints the success message each time. If a status code is anything other than 200, the script prints the error code for whichever URL is down ($Status_code_1 or $Status_code_2).
The code works fine, but I want to know how it can be made shorter. Can the curl commands on the first two lines be combined? They use the same authorization and everything; only the URL at the end differs. The later if statements are also pretty much the same; I am just running them separately for different URLs.
Is it possible to write the first two lines as one, and the same for both if statements? I know AND and OR can be used in if statements, but say we have 5 URLs and 2 are down: how would it print the names of those 2 URLs in that case?
Is there a way to combine multiple curl statements and if statements in bash?
curl can retrieve multiple URLs in one run, but then you're left to parse its output into per-URL pieces. Since you want to report on the response for each URL, it is probably to your advantage to run curl separately for each.
But you can make the script less repetitive by writing a shell function that performs one curl run and reports on the results:
test_url() {
    local status=$(curl -o /dev/null -s -w "%{http_code}\n" -H "Authorization: token abc123" "$1")
    if [ "$status" = 200 ]; then
        echo "$2 is running successfully"
    else
        echo "Error at $2. Status code: $status"
    fi
}
Then running it for a given URL is a one-liner:
test_url https://url_1.com url_1
test_url https://url_2.com url_2
Altogether, that's about the same length as the original, but this is the break-even point: each additional URL to test requires only one line, as opposed to six in your version. Also, if you want to change any detail of the lookup or the status reporting, you can do it in one place for all of them.
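If you have many URLs, you can also drive the function from a loop; a minimal sketch, using each URL as its own label (the URLs here are placeholders):
for url in https://url_1.com https://url_2.com https://url_3.com; do
    test_url "$url" "$url"
done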
To avoid repetition, you can encapsulate code you need to reuse in a function.
httpresult () {
    curl -o /dev/null -s -w "%{http_code}\n" -H "Authorization: token abc123" "$@"
}
check200 () {
    local status=$(httpresult "$1")
    if [ "$status" = "200" ]; then
        echo "$0: ${2-$1} is running successfully" >&2
    else
        echo "$0: error at ${2-$1}. Status code: $status" >&2
    fi
}
check200 "https://url_1.com/" "url_1"
check200 "https://url_2.com/" "url_2"
Splitting httpresult to a separate function isn't really necessary, but perhaps useful both as a demonstration of a more modular design, and as something you might reuse in other scripts too.
I changed the formatting of the status message to include the name of the script in the message, and to print diagnostics to standard error instead of standard output, in accordance with common best practices.
The check200 function accepts a URL and optionally a human-readable label to use in the diagnostic messages; if you omit it, they will simply contain the URL, too. It wasn't clear from your question whether the labels are important and useful.
Notice that the standard comparison operator in [ ... ] is =, not == (though Bash will accept both).
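For instance, both of these are accepted by Bash, but only the first is portable to other POSIX shells (status here is just a placeholder variable):
[ "$status" = "200" ] && echo "OK (POSIX-portable)"
[[ $status == 200 ]] && echo "OK (Bash-specific)"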

Keep track of execution time in a bash script and terminate current command if it takes too long

I'm trying to create a bash script to download files en masse from a certain website.
Their download links are sequential - e.g. it's just id=1, id=2, id=3 all the way up to 660000. The only requirement is that you have to be logged in, which makes this a bit harder. Oh, and the login will randomly time out after a few hours so I have to log back in.
Here's my current script, which works well about 99% of the time.
#!/bin/sh
cd downloads
for i in `seq 1 660000`
do
    lastname=""
    echo "Downloading file $i"
    echo "Downloading file $i" >> _downloadinglog.txt
    response=$(curl --write-out %{http_code} -b _cookies.txt -c _cookies.txt --silent --output /dev/null "[sample URL to make sure cookie is still logged in]")
    if ! [ $response -eq 200 ]
    then
        echo "Cookie didn't work, trying to re-log in..."
        curl -d "userid=[USERNAME]" -d "pwd=[PASSWORD]" -b _cookies.txt -c _cookies.txt --silent --output /dev/null "[login URL]"
        response=$(curl --write-out %{http_code} -b _cookies.txt -c _cookies.txt --silent --output /dev/null "[sample URL again]")
        if ! [ $response -eq 200 ]
        then
            echo "Something weird happened?? Response code $response. Logging in didn't fix the issue, fix it and resume from $(($i - 1))"
            echo "Something weird happened?? Response code $response. Logging in didn't fix the issue, fix it and resume from $(($i - 1))" >> _downloadinglog.txt
            exit 0
        fi
        echo "Downloading file $(($i - 1)) again in case cookie expiry caused it to fail"
        echo "Downloading file $(($i - 1)) again in case cookie expiry caused it to fail" >> _downloadinglog.txt
        lastname=$(curl --write-out %{filename_effective} -O -J -b _cookies.txt -c _cookies.txt "[URL to download files]?id=$(($i - 1))")
        echo "id $(($i - 1)) = $lastname" >> _downloadinglog.txt
        lastname=""
        echo "Downloading file $i"
        echo "Downloading file $i" >> _downloadinglog.txt
    fi
    lastname=$(curl --write-out %{filename_effective} -O -J -b _cookies.txt -c _cookies.txt "[URL to download files]?id=$i")
    echo "id $i = $lastname" >> _downloadinglog.txt
done
So basically what I have it doing is attempting to download a random file before moving to the next file in the set. If the download fails, we assume the login cookie is no longer valid and tell curl to log me back in.
This works great, and I was able to get several thousand files from it. But what would happen is - either my router goes down for a second or two, or THEIR site goes down for a minute or two, and curl will just sit there thinking it's downloading for hours. I once came back to it literally spending 24 hours on the same file. It doesn't seem to have the ability to know if the transfer timed out in the middle - only if it can't START the transfer.
I know there are ways to terminate execution of a command if you combine it with "sleep", but since this has to be "smart" and restart from where it left off, I can't just kill the whole script.
Any suggestions? I'm open to using something other than curl if I can use it to login via a terminal command.
You can try using the curl options --connect-timeout or --max-time.
--max-time should be your pick.
From the manual:
--max-time
Maximum time in seconds that you allow the whole operation to take. This is useful for preventing your batch jobs from hanging for hours due to slow networks or links going down. Since 7.32.0, this option accepts decimal values, but the actual timeout will decrease in accuracy as the specified timeout increases in decimal precision. See also the --connect-timeout option.
If this option is used several times, the last one will be used.
Then capture the result of the command in a var and process further based on the result.
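For example, a minimal sketch of how the download line from the question could be bounded; the 900-second limit is just an illustrative value and the bracketed URL placeholder is the one from the question:
lastname=$(curl --max-time 900 --write-out %{filename_effective} -O -J -b _cookies.txt -c _cookies.txt "[URL to download files]?id=$i")
if [ $? -ne 0 ]; then
    echo "Download of id $i failed or timed out" >> _downloadinglog.txt
fi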

How to check if the file is downloaded with curl

I am writing a shell script in which I need to download 6 files from the internet. In the script I have done it as
curl a
curl b
curl c
and so on
It works, but sometimes I get curl: (7) couldn't connect to host for some files in the middle of the script. For example, it will successfully download a but miss file b with the above error, and then download file c. I would like to catch this error so that the script only carries on once all of the files have downloaded successfully.
You can use a loop:
while true; do
    curl --fail b && break
done
The loop won't break until b is downloaded. You can make it a retry function which you can call if a download fails on the first try:
retry(){
    while true; do
        curl --fail "$1" && break ||
            echo "Download failed for url: $1, retrying..."
    done
}
Then do this:
curl --fail a || retry a
curl --fail b || retry b
curl --fail c || retry c
If you just want to silence the error messages:
curl a 2>/dev/null
curl b 2>/dev/null
...
Or if you want to just detect the error then:
if ! curl --fail a; then
    echo "Failed"
fi
Or, as a one-liner:
curl --fail a || echo Failed
If you want to exit after a failure and also show your own message:
curl --fail a 2>/dev/null || { echo failed; exit 1; }
You could chain them with &&...
curl --fail a && curl --fail b && curl --fail c...
Update: as @nwk pointed out below, we need to add --fail to make curl fail on bad HTTP codes.
Put set -e at the beginning of the shell script and use curl --fail. E.g.,
#!/bin/sh
set -e
curl --fail http://example.com/a
curl --fail http://example.com/b
curl --fail http://example.com/c
The set -e will make the script stop with an error on the first unsuccessful command (one with an exit status ≠ zero).
You can check the exit status of the last command by looking at the $? shell variable. Run the command below to check the status of the last command; anything other than 0 indicates an error.
echo $?
curl a
if [ "$?" -gt 0 ]
then
    echo "Error downloading file a. Exiting"
    exit
fi
curl b
if [ "$?" -gt 0 ]
then
    echo "Error downloading file b. Exiting"
    exit
fi
...
A simpler modified form, following @andlrc's suggestion:
if ! curl a ; then echo "Got error downloading a"; fi
if ! curl b ; then echo "Got error downloading b"; fi
if ! curl c ; then echo "Got error downloading c"; fi
for file2get in a b c d; do
    until curl --fail "$file2get"; do
        :
    done
done
Or add a retry counter to prevent endless looping, as in the sketch below.
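A minimal sketch of that idea, assuming a limit of 5 attempts per file (both the limit and the a b c d list are placeholders):
for file2get in a b c d; do
    for attempt in 1 2 3 4 5; do
        curl --fail "$file2get" && break
        echo "Attempt $attempt for $file2get failed" >&2
    done
done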

How to check status of URLs from text file using bash shell script

I have to check the status of 200 http URLs and find out which of these are broken links. The links are present in a simple text file (say URL.txt present in my ~ folder). I am using Ubuntu 14.04 and I am a Linux newbie. But I understand the bash shell is very powerful and could help me achieve what I want.
My exact requirement would be to read the text file which has the list of URLs and automatically check if the links are working and write the response to a new file with the URLs and their corresponding status (working/broken).
I created a file "checkurls.sh" and placed it in my home directory where the urls.txt file is also located. I gave execute privileges to the file using
$ chmod +x checkurls.sh
The contents of checkurls.sh is given below:
#!/bin/bash
while read url
do
    urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "$url" )
    echo "$url $urlstatus" >> urlstatus.txt
done < $1
Finally, I executed it from command line using the following -
$ ./checkurls.sh urls.txt
Voila! It works.
#!/bin/bash
while read -ru 4 LINE; do
    read -r REP < <(exec curl -IsS "$LINE" 2>&1)
    echo "$LINE: $REP"
done 4< "$1"
Usage:
bash script.sh urls-list.txt
Sample:
http://not-exist.com/abc.html
https://kernel.org/nothing.html
http://kernel.org/index.html
https://kernel.org/index.html
Output:
http://not-exist.com/abc.html: curl: (6) Couldn't resolve host 'not-exist.com'
https://kernel.org/nothing.html: HTTP/1.1 404 Not Found
http://kernel.org/index.html: HTTP/1.1 301 Moved Permanently
https://kernel.org/index.html: HTTP/1.1 200 OK
For the details, read the Bash manual; see man curl, help, and man bash as well.
What about adding some parallelism to the accepted solution? Let's modify the script chkurl.sh to be a little easier to read and to handle just one request at a time:
#!/bin/bash
URL=${1?Pass URL as parameter!}
curl -o /dev/null --silent --head --write-out "$URL %{http_code} %{redirect_url}\n" "$URL"
And now you check your list using:
cat URL.txt | xargs -P 4 -L1 ./chkurl.sh
This could finish the job up to 4 times faster.
Here is my full script, which checks URLs listed in a file passed as an argument, e.g. './checkurls.sh listofurls.txt'.
What it does:
check the URL using curl and return the HTTP status code
send an email notification when the URL returns a code other than 200
create a temporary lock file for failed URLs (file naming could be improved)
send an email notification when the URL becomes available again
remove the lock file once the URL becomes available, to avoid further notifications
log events to a file and handle increasing log file size (i.e. log rotation; uncomment the echo if logging of code 200 is required)
Code:
#!/bin/bash
EMAIL="your@email.com"
DATENOW=`date +%Y%m%d-%H%M%S`
LOG_FILE="checkurls.log"
c=0
while read url
do
    ((c++))
    LOCK_FILE="checkurls$c.lock"
    urlstatus=$(/usr/bin/curl -H 'Cache-Control: no-cache' -o /dev/null --silent --head --write-out '%{http_code}' "$url" )
    if [ "$urlstatus" = "200" ]
    then
        #echo "$DATENOW OK $urlstatus connection->$url" >> $LOG_FILE
        [ -e $LOCK_FILE ] && /bin/rm -f -- $LOCK_FILE > /dev/null && /bin/mail -s "NOTIFICATION URL OK: $url" $EMAIL <<< 'The URL is back online'
    else
        echo "$DATENOW FAIL $urlstatus connection->$url" >> $LOG_FILE
        if [ -e $LOCK_FILE ]
        then
            # no action - awaiting URL to be fixed
            :
        else
            /bin/mail -s "NOTIFICATION URL DOWN: $url" $EMAIL <<< 'Failed to reach or URL problem'
            /bin/touch $LOCK_FILE
        fi
    fi
done < $1

# REMOVE LOG FILE IF LARGER THAN ~120MB
# allow up to 2000 lines on average
maxsize=120000
size=$(/usr/bin/du -k "$LOG_FILE" | /bin/cut -f 1)
if [ $size -ge $maxsize ]; then
    /bin/rm -f -- $LOG_FILE > /dev/null
    echo "$DATENOW LOG file [$LOG_FILE] has been recreated" > $LOG_FILE
else
    # do nothing
    :
fi
Please note that changing the order of the listed URLs in the text file will affect any existing lock files (remove all .lock files to avoid confusion). This could be improved by using the URL as the lock file name, but certain characters such as : # / ? & would have to be handled for the operating system.
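A minimal sketch of that idea, assuming you are happy to replace every character outside A-Za-z0-9._- with an underscore (the checkurls_ prefix is just an example):
safe_name=$(printf '%s' "$url" | tr -c 'A-Za-z0-9._-' '_')
LOCK_FILE="checkurls_${safe_name}.lock"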
I recently released deadlink, a command-line tool for finding broken links in files. Install with
pip install deadlink
and use as
deadlink check /path/to/file/or/directory
or
deadlink replace-redirects /path/to/file/or/directory
The latter will replace permanent redirects (301) in the specified files.
If your input file contains one URL per line, you can use a script to read each line and try to ping it; if the ping succeeds, the host is reachable (note that ping checks the host, not the HTTP URL itself).
#!/bin/bash
INPUT="Urls.txt"
OUTPUT="result.txt"
while read line ;
do
    if ping -c 1 "$line" &> /dev/null
    then
        echo "$line valid" >> $OUTPUT
    else
        echo "$line not valid" >> $OUTPUT
    fi
done < $INPUT
exit
ping options:
-c count
Stop after sending count ECHO_REQUEST packets. With deadline option, ping waits for count ECHO_REPLY packets, until the timeout expires.
You can use this option as well to limit the waiting time:
-W timeout
Time to wait for a response, in seconds. The option affects only the timeout in the absence of any responses; otherwise ping waits for two RTTs.
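For example, a single probe that waits at most 2 seconds for a reply (the 2-second value is just an example, and $line is the variable from the script above):
ping -c 1 -W 2 "$line" &> /dev/null && echo "$line reachable"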
curl -s -I --http2 "http://$1" >> fullscan_curl.txt
grep HTTP fullscan_curl.txt >> fullscan_httpstatus.txt
It works for me.

Bash Ping Site Check Script (Longer Timeout)

I am currently using a Raspberry Pi as a ping server, and this is the script I am using to check for an OK response.
I'm not familiar with bash scripting, so this is a bit of a beginner question about the curl call: is there a way to increase the timeout? It keeps reporting sites as down when they are not.
#!/bin/bash
SITESFILE=/sites.txt #list the sites you want to monitor in this file
EMAILS=" " #list of email addresses to receive alerts (comma separated)

while read site; do
    if [ ! -z "${site}" ]; then
        CURL=$(curl -s --head $site)
        if echo $CURL | grep "200 OK" > /dev/null
        then
            echo "The HTTP server on ${site} is up!"
            sleep 2
        else
            MESSAGE="This is an alert that your site ${site} has failed to respond 200 OK."
            for EMAIL in $(echo $EMAILS | tr "," " "); do
                SUBJECT="$site (http) Failed"
                echo "$MESSAGE" | mail -s "$SUBJECT" $EMAIL
                echo $SUBJECT
                echo "Alert sent to $EMAIL"
            done
        fi
    fi
done < $SITESFILE
Yes, man curl:
--connect-timeout <seconds>
Maximum time in seconds that you allow the connection to the server to take. This only limits the connection phase; once curl has connected, this option is of no more use. See also the -m, --max-time option.
You can also consider using ping to test the connection before calling curl. Something like ping -c2 will give you 2 pings to test the connection. Then just check the return from ping (i.e. [[ $? -eq 0 ]] means ping succeeded, so go ahead and connect with curl); see the sketch below.
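A minimal sketch of that inside the existing loop, assuming the entries in the sites file are bare hostnames that ping can resolve:
if ping -c2 "${site}" > /dev/null 2>&1; then
    CURL=$(curl -s --head "${site}")
else
    echo "${site} did not answer ping, skipping the curl check"
fi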
You can also use [ -n "${site}" ] (site is set and non-empty) instead of [ ! -z "${site}" ]. Additionally, you will generally want to use the [[ ]] test keyword instead of single [ ] for test constructs in Bash. For ultimate portability, just use test -n "${site}" (always double-quote when using test).
I think you need the --max-time <seconds> option.
-m/--max-time <seconds>
Maximum time in seconds that you allow the whole operation to take. This is useful for preventing your batch jobs from hanging for hours due to slow networks or links going down.
--connect-timeout <seconds>
Maximum time in seconds that you allow the connection to the server to take. This only limits the connection phase; once curl has connected, this option is of no more use.
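Applied to the script above, the curl line could look like this; the 15-second and 5-second limits are just illustrative values:
CURL=$(curl -s --head --max-time 15 --connect-timeout 5 "$site")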
