How to check if a file was downloaded successfully with curl - Linux

I am writing a shell script that needs to download 6 files from the internet. In the script I have done it as:
curl a
curl b
curl c
and so on
It works, but sometimes I get curl: (7) couldn't connect to host for some of the files in the middle of the script. For example, it will successfully download a, miss file b with the above error, and then download file c. I would like to catch this error so that my script only proceeds once all the files have downloaded successfully.

You can use a loop:
while true; do
curl --fail b && break
done
The loop won't break until b is downloaded. You can make it a retry function which you can call if a download fails on the first try:
retry(){
    while true; do
        curl --fail "$1" && break
        echo "Download failed for url: $1, retrying..."
    done
}
Then do this:
curl --fail a || retry a
curl --fail b || retry b
curl --fail c || retry c
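Note that curl also has a built-in --retry option, which can replace the shell loop for transient failures (a sketch; the retry count of 5 is arbitrary, and by default curl only retries errors it considers transient):
curl --fail --retry 5 a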
If you just want to silence the error messages, then:
curl a 2>/dev/null
curl b 2>/dev/null
...
Or if you want to just detect the error then:
if ! curl --fail a; then
echo "Failed"
fi
or, as a one-liner:
curl --fail a || echo Failed
If you want to exit after a failure and also show your own message:
curl --fail a 2>/dev/null || { echo failed; exit 1; }

You could chain them with &&...
curl --fail a && curl --fail b && curl --fail c...
Update: as @nwk pointed out below, we need to add --fail to make curl fail on bad HTTP codes.

Put set -e at the beginning of the shell script and use curl --fail. E.g.,
#!/bin/sh
set -e
curl --fail http://example.com/a
curl --fail http://example.com/b
curl --fail http://example.com/c
The set -e will make the script stop with an error on the first unsuccessful command (one with an exit status ≠ zero).

You can check the exit status of the last command by looking at the $? shell variable.
Run the command below right after another command to check its status; anything other than 0 indicates an error.
echo $?
curl a
if [ "$?" -gt 0 ]
then
echo "Error downloading file a. Exiting"
exit
fi
curl b
if [ "$?" -gt 0 ]
then
echo "Error downloading file b. Exiting"
exit
fi
...
A simpler form, following @andlrc's suggestion:
if ! curl a ; then echo "Got error downloading a"; fi
if ! curl b ; then echo "Got error downloading b"; fi
if ! curl c ; then echo "Got error downloading c"; fi

for file2get in a b c d; do
    until curl --fail "$file2get"; do :; done
done
Or add an iteration counter to prevent endless looping, as sketched below.
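A minimal sketch of such a bounded retry loop (the limit of 5 attempts and the 1-second pause are arbitrary choices):
for file2get in a b c d; do
    attempts=0
    until curl --fail "$file2get"; do
        attempts=$((attempts + 1))
        if [ "$attempts" -ge 5 ]; then
            echo "Giving up on $file2get after $attempts attempts" >&2
            break
        fi
        sleep 1
    done
done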

Related

Using an "if" statement with curl in terminal?

I'm using this command to get the response code of a page using curl:
curl -s -o /dev/null -w "%{http_code}" 'https://www.example.com'
If the response code is 200, then I want to delete a certain file on my computer. If it isn't 200, nothing should be done.
What's the easiest way to do this?
You can store the result in a shell variable (via command substitution), and then test the value with a simple if and [[ command. For example, in bash:
#!/bin/bash
code=$(curl -s -o /dev/null -w "%{http_code}" 'https://www.example.com')
if [[ $code == 200 ]]; then
rm /path/to/file
# other actions
fi
If all you want is a simple rm, you can shorten it to:
#!/bin/bash
code=$(curl -s -o /dev/null -w "%{http_code}" 'https://www.example.com')
[[ $code == 200 ]] && rm /path/to/file
In a generic POSIX shell, you'll have to use a less flexible [ command and quote the variable:
#!/bin/sh
code=$(curl -s -o /dev/null -w "%{http_code}" 'https://www.example.com')
if [ "$code" = 200 ]; then
rm /path/to/file
fi
Additionally, to test for a complete class of codes (e.g. 2xx), you can use wildcards:
#!/bin/bash
[[ $code == 2* ]] && rm /path/to/file
or a case statement, as sketched below.
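A minimal sketch of the case approach, assuming $code was captured as above (the non-2xx branches are illustrative assumptions):
case $code in
    2*) rm /path/to/file ;;
    3*) echo "Redirected: $code" ;;
    *) echo "Request failed: $code" >&2 ;;
esac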

Checking if String is in a command answer

I'm struggling with a problem in Linux bash.
I want a script to execute a command
curl -s --head http://myurl/ | head -n 1
and, if the result of the command contains 200, have it execute another command; otherwise echo something.
What I have now:
CURLCHECK=curl -s --head http://myurl | head -n 1
if [[ $($CURLCHECK) =~ "200" ]]
then
echo "good"
else
echo "bad"
fi
The script prints:
HTTP/1.1 200 OK
bad
I tried many ways but none of them seem to work.
Can someone help me?
I would do something like this:
if curl -s --head http://myurl | head -n 1 | grep "200" >/dev/null 2>&1; then
echo good
else
echo bad
fi
You need to actually capture the output from the curl command:
CURLCHECK=$(curl -s --head http://myurl | head -n 1)
I'm surprised you're not getting a "-s: command not found" error
You can use this curl command with -w "%{http_code}" to get just the HTTP status code:
[[ $(curl -s -w "%{http_code}" -A "Chrome" -L "http://myurl/" -o /dev/null) == 200 ]] &&
echo "good" || echo "bad"
Using wget:
if wget -O /dev/null your_url 2>&1 | grep -F HTTP >/dev/null 2>&1; then
    echo good
else
    echo bad
fi

How to check status of URLs from text file using bash shell script

I have to check the status of 200 http URLs and find out which of these are broken links. The links are present in a simple text file (say URL.txt present in my ~ folder). I am using Ubuntu 14.04 and I am a Linux newbie. But I understand the bash shell is very powerful and could help me achieve what I want.
My exact requirement would be to read the text file which has the list of URLs and automatically check if the links are working and write the response to a new file with the URLs and their corresponding status (working/broken).
I created a file "checkurls.sh" and placed it in my home directory where the urls.txt file is also located. I gave execute privileges to the file using
$ chmod +x checkurls.sh
The contents of checkurls.sh is given below:
#!/bin/bash
while read url
do
urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "$url" )
echo "$url $urlstatus" >> urlstatus.txt
done < $1
Finally, I executed it from command line using the following -
$./checkurls.sh urls.txt
Voila! It works.
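One small robustness tweak (a sketch): IFS= and read -r keep the shell from trimming whitespace or interpreting backslashes in the URLs, and quoting "$1" handles input file names containing spaces:
#!/bin/bash
while IFS= read -r url; do
    urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "$url")
    echo "$url $urlstatus" >> urlstatus.txt
done < "$1"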
#!/bin/bash
while read -ru 4 LINE; do
read -r REP < <(exec curl -IsS "$LINE" 2>&1)
echo "$LINE: $REP"
done 4< "$1"
Usage:
bash script.sh urls-list.txt
Sample:
http://not-exist.com/abc.html
https://kernel.org/nothing.html
http://kernel.org/index.html
https://kernel.org/index.html
Output:
http://not-exist.com/abc.html: curl: (6) Couldn't resolve host 'not-exist.com'
https://kernel.org/nothing.html: HTTP/1.1 404 Not Found
http://kernel.org/index.html: HTTP/1.1 301 Moved Permanently
https://kernel.org/index.html: HTTP/1.1 200 OK
For the details, read the Bash manual; see man curl and man bash as well.
What about adding some parallelism to the accepted solution? Let's modify the script chkurl.sh to be a little easier to read and to handle just one request at a time:
#!/bin/bash
URL=${1?Pass URL as parameter!}
curl -o /dev/null --silent --head --write-out "$URL %{http_code} %{redirect_url}\n" "$URL"
And now you check your list using:
cat URL.txt | xargs -P 4 -L1 ./chkurl.sh
This could finish the job up to 4 times faster.
Herewith my full script that checks URLs listed in a file passed as an argument, e.g. ./checkurls.sh listofurls.txt.
What it does:
check each URL using curl and return the HTTP status code
send an email notification when a URL returns a code other than 200
create a temporary lock file for failed URLs (file naming could be improved)
send an email notification when a URL becomes available again
remove the lock file once a URL becomes available, to avoid further notifications
log events to a file and handle increasing log file size (AKA log rotation; uncomment the echo if code-200 logging is required)
Code:
#!/bin/bash
EMAIL="your@email.com"
DATENOW=`date +%Y%m%d-%H%M%S`
LOG_FILE="checkurls.log"
c=0
while read url
do
((c++))
LOCK_FILE="checkurls$c.lock"
urlstatus=$(/usr/bin/curl -H 'Cache-Control: no-cache' -o /dev/null --silent --head --write-out '%{http_code}' "$url" )
if [ "$urlstatus" = "200" ]
then
#echo "$DATENOW OK $urlstatus connection->$url" >> $LOG_FILE
[ -e $LOCK_FILE ] && /bin/rm -f -- $LOCK_FILE > /dev/null && /bin/mail -s "NOTIFICATION URL OK: $url" $EMAIL <<< 'The URL is back online'
else
echo "$DATENOW FAIL $urlstatus connection->$url" >> $LOG_FILE
if [ -e $LOCK_FILE ]
then
#no action - awaiting URL to be fixed
:
else
/bin/mail -s "NOTIFICATION URL DOWN: $url" $EMAIL <<< 'Failed to reach or URL problem'
/bin/touch $LOCK_FILE
fi
fi
done < $1
# REMOVE LOG FILE IF LARGER THAN ~120MB (du -k reports KB)
# allow up to 2000 lines average
maxsize=120000
size=$(/usr/bin/du -k "$LOG_FILE" | /bin/cut -f 1)
if [ $size -ge $maxsize ]; then
/bin/rm -f -- $LOG_FILE > /dev/null
echo "$DATENOW LOG file [$LOG_FILE] has been recreated" > $LOG_FILE
else
#do nothing
:
fi
Please note that changing the order of the listed URLs in the text file will affect any existing lock files (remove all .lock files to avoid confusion). This could be improved by deriving the lock file name from the URL itself, but characters such as : # / ? & would have to be handled for the operating system; one way to do that is sketched below.
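A minimal sketch of that approach, hashing the URL so the lock file name is stable regardless of list order (assumes md5sum is available, as it is on most Linux systems):
# Derive the lock file name from the URL itself instead of a counter,
# so reordering the list does not mix up lock files.
hash=$(printf '%s' "$url" | md5sum | cut -d' ' -f1)
LOCK_FILE="checkurls-$hash.lock"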
I recently released deadlink, a command-line tool for finding broken links in files. Install with
pip install deadlink
and use as
deadlink check /path/to/file/or/directory
or
deadlink replace-redirects /path/to/file/or/directory
The latter will replace permanent redirects (301) in the specified files.
If your input file contains one URL per line, you can use a script to read each line and try to ping it; if the ping succeeds, the host is reachable. (Note that ping expects a host name rather than a full URL, and it checks host reachability, not HTTP status.)
#!/bin/bash
INPUT="Urls.txt"
OUTPUT="result.txt"
while read line ;
do
if ping -c 1 $line &> /dev/null
then
echo "$line valid" >> $OUTPUT
else
echo "$line not valid " >> $OUTPUT
fi
done < $INPUT
exit
ping options:
-c count
Stop after sending count ECHO_REQUEST packets. With the deadline option, ping waits for count ECHO_REPLY packets, until the timeout expires.
You can use this option as well to limit the waiting time:
-W timeout
Time to wait for a response, in seconds. The option affects only the timeout in the absence of any responses; otherwise ping waits for two RTTs.
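Combined, for example (a sketch; the 2-second timeout is an arbitrary choice):
if ping -c 1 -W 2 "$line" > /dev/null 2>&1
then
    echo "$line reachable"
fi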
curl -s -I --http2 "http://$1" | tee -a fullscan_curl.txt | grep HTTP >> fullscan_httpstatus.txt
This appends the full response headers to fullscan_curl.txt and the status line to fullscan_httpstatus.txt. It works for me.

Check for existence of wget/curl

I'm trying to write a script to download a file using wget, or curl if wget doesn't exist, on Linux. How do I have the script check for the existence of wget?
Linux has a which command which will check for the existence of an executable on your path:
pax> which ls ; echo $?
/bin/ls
0
pax> which no_such_executable ; echo $?
1
As you can see, it sets the return code $? to easily tell if the executable was found, so you could use something like:
if which wget >/dev/null ; then
echo "Downloading via wget."
wget --option argument
elif which curl >/dev/null ; then
echo "Downloading via curl."
curl --option argument
else
echo "Cannot download, neither wget nor curl is available."
fi
wget http://download/url/file 2>/dev/null || curl -O http://download/url/file
One can also use command, type, or hash to check whether wget/curl exists. Another thread, "Check if a program exists from a Bash script", answers very nicely what to use in a Bash script to check if a program exists; a minimal example follows.
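For instance, a POSIX-friendly sketch using command -v:
if command -v wget >/dev/null 2>&1; then
    echo "wget is available"
elif command -v curl >/dev/null 2>&1; then
    echo "curl is available"
else
    echo "neither wget nor curl found" >&2
fi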
I would do this:
if [ ! -x /usr/bin/wget ] ; then
# some extra check if wget is not installed at the usual place
command -v wget >/dev/null 2>&1 || { echo >&2 "Please install wget or set it in your path. Aborting."; exit 1; }
fi
The first thing to do is to try to install wget with your usual package management system; it should tell you if it is already installed:
yum -y install wget
Otherwise, just launch a command like the one below:
wget http://download/url/file
If you receive no error, then it's OK.
A solution taken from the K3S install script (https://raw.githubusercontent.com/rancher/k3s/master/install.sh)
function download {
    url=$1
    filename=$2
    if [ -x "$(which wget)" ]; then
        wget -q "$url" -O "$filename"
    elif [ -x "$(which curl)" ]; then
        curl -o "$filename" -sfL "$url"
    else
        echo "Could not find curl or wget, please install one." >&2
    fi
}
# to use in the script:
download https://url /local/path/to/download
Explanation:
It looks for the location of wget and checks that a file exists there; if so, it does a script-friendly (i.e. quiet) download. If wget isn't found, it tries curl in a similarly script-friendly way.
(Note that the question doesn't specify Bash; however, my answer assumes it.)
Simply run:
wget http://download/url/file
and you will see from the output whether the endpoint is available or not.
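If you would rather test availability without downloading the file, a sketch using wget's --spider mode, which relies on the exit status instead of the printed statistics:
if wget -q --spider http://download/url/file; then
    echo "endpoint available"
else
    echo "endpoint unavailable"
fi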

Execute Shell script after other script got executed successfully

Problem Statement:
I have four shell scripts, and I want each one to execute only when the previous script has executed successfully. I am currently running them like this:
./verify-export-realtime.sh
sh -x lca_query.sh
sh -x liv_query.sh
sh -x lqu_query.sh
So, in order to make each script run only after the previous one was successful, do I need to do something like the below? I am not sure whether I am right. If any script fails for any reason, it will print "Failed due to some reason", right?
./verify-export-realtime.sh
RET_VAL_STATUS=$?
echo $RET_VAL_STATUS
if [ $RET_VAL_STATUS -ne 0 ]; then
echo "Failed due to some reason"
exit
fi
sh -x lca_query.sh
RET_VAL_STATUS=$?
echo $RET_VAL_STATUS
if [ $RET_VAL_STATUS -ne 0 ]; then
echo "Failed due to some reason"
exit
fi
sh -x liv_query.sh
RET_VAL_STATUS=$?
echo $RET_VAL_STATUS
if [ $RET_VAL_STATUS -ne 0 ]; then
echo "Failed due to some reason"
exit
fi
sh -x lqu_query.sh
The shell provides an operator && to do exactly this. So you could write:
./verify-export-realtime.sh && \
sh -x lca_query.sh && \
sh -x liv_query.sh && \
sh -x lqu_query.sh
or you could get rid of the line continuations (\) and write it all on one line
./verify-export-realtime.sh && sh -x lca_query.sh && sh -x liv_query.sh && sh -x lqu_query.sh
If you want to know how far it got, you can add extra commands that just set a variable:
done=0
./verify-export-realtime.sh && done=1 &&
sh -x lca_query.sh && done=2 &&
sh -x liv_query.sh && done=3 &&
sh -x lqu_query.sh && done=4
The value of $done at the end tells you how many commands completed successfully. $? will get set to the exit value of the last command run (which is the one that failed), or 0 if all succeeded
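For example, a sketch of reporting on it afterwards (capturing $? immediately, before any other command overwrites it):
rc=$?
if [ "$done" -lt 4 ]; then
    echo "Stopped after step $done (exit status $rc)" >&2
fi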
You can simply run a chain of scripts on the command line (or from another script) using the "&&" operator; the first failing command will break the chain:
$ script1.sh && echo "First done, running the second" && script2.sh && echo "Second done, running the third" && script3.sh && echo "Third done, cool!"
And so on. The operation will break once one of the steps fails.
That should be right. You can also print the error code if necessary by echoing the $? variable. You can also define your own return codes by returning your own values from those scripts and checking them in this main one; that might be more helpful than "Failed due to some reason".
If you want more flexible error handling:
script1.sh
rc=$?
if [ ${rc} -eq 0 ]; then
    echo "script1 pass, starting script2"
    script2.sh
    rc=$?
    if [ ${rc} -eq 0 ]; then
        echo "script2 pass"
    else
        echo "script2 failed"
    fi
else
    echo "script1 failed"
fi
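Nesting gets unwieldy beyond two or three scripts; a loop-based sketch of the same idea (the script names are placeholders):
for s in script1.sh script2.sh script3.sh; do
    if ! ./"$s"; then
        echo "$s failed" >&2
        exit 1
    fi
    echo "$s pass"
done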
The standard way to do this is to simply add a shell option that causes the script to abort if any simple command fails. Simply write the interpreter line as:
#!/bin/sh -e
or add the command:
set -e
(It is also common to do cmd1 && cmd2 && cmd3 as mentioned in other solutions.)
You absolutely should not attempt to print an error message. The command should print a relevant error message before it exits if it encounters an error. If the commands are not well behaved and do not write useful error messages, you should fix them rather than trying to guess what error they encountered. If you do write an error message, at the very least write it to the correct place. Errors belong on stderr:
echo "Some error occurred" >&2
As @William Pursell said, your scripts really should report their own errors. If you also need error reporting in the calling script, the easiest way to do it is like this:
if ! ./verify-export-realtime.sh; then
echo "Error running verify-export-realtime.sh; rest of script cancelled." >&2
elif ! sh -x lca_query.sh; then
echo "Error running lca_query.sh; rest of script cancelled." >&2
elif ! sh -x liv_query.sh; then
echo "Error running liv_query.sh; rest of script cancelled." >&2
elif ! sh -x lqu_query.sh; then
echo "Error running lqu_query.sh." >&2
fi
