How to ensure that user & pass are correct in curl using bash - linux

I wrote the following script:
#!/bin/bash
if [ $# -ne 1 ]; then
    echo "Usage: ./script <input-file>"
    exit 1
fi
while read user pass; do
    curl -iL --data-urlencode user="$user" --data-urlencode password="$pass" http://foo.com/signin 1>/dev/null 2>&1
    if [ $? -eq 0 ]; then
        echo "ok"
    elif [ $? -ne 0 ]; then
        echo "failed"
    fi
done < $1
Question:
Even when I run it with a wrong user and password, the result is always "ok".
How can I tell whether the credentials are actually correct?
Thanks

It's because your curl command completes successfully: the server responds, so curl exits with 0 regardless of whether the credentials were accepted. Running that command with a random user/pass gives this:
$ curl -iL --data-urlencode user=BLAHBLAH --data-urlencode password=BLAH http://foo.com/signin
HTTP/1.1 301 Moved Permanently
Server: nginx/1.0.5
Date: Wed, 19 Feb 2014 05:53:36 GMT
Content-Type: text/html
Content-Length: 184
Connection: keep-alive
Location: http://www.foo.com/signin
. . .
. . .
<body>
<!-- This file lives in public/500.html -->
<div class="dialog">
<h1>We're sorry, but something went wrong.</h1>
</div>
</body>
</html>
Hence,
$ echo $?
0
But modify the URL to garbage:
$ curl -iL --data-urlencode user=BLAHBLAH --data-urlencode password=BLAH http://foof.com/signin
curl: (6) Could not resolve host: foof.com
$ echo $?
6

Even when the login fails, the HTTP server will still return a page with an error message. Since curl is able to retrieve this page, it finishes successfully (exit code 0).
In order to get curl to fail on server errors, you need the --fail parameter. Although this may not be fail-safe according to the curl man page, it is worth a try.
If --fail does not work, you could parse the header in the output of your curl request or have a look at the --write-out parameter.
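For example, if foo.com reports a failed login with a non-2xx status code (that is an assumption about the site), a sketch along these lines checks the HTTP status reported by --write-out instead of curl's exit code:
#!/bin/bash
# Sketch only: assumes the sign-in endpoint answers bad credentials with a non-2xx status.
while read -r user pass; do
    status=$(curl -sL -o /dev/null -w '%{http_code}' \
        --data-urlencode user="$user" \
        --data-urlencode password="$pass" \
        http://foo.com/signin)
    if [ "$status" -ge 200 ] && [ "$status" -lt 300 ]; then
        echo "ok ($status)"
    else
        echo "failed ($status)"
    fi
done < "$1"
If the site always answers 200 and only the page body differs, save the body instead of discarding it and grep it for an error marker.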

Related

How to test and wait for an HTTP service using cURL?

Description
I have started a Docker HTTP service
I want to wait for the HTTP service to be up and running, using curl
Reproduction
I have written the following bash script
#!/bin/bash
found_message_in_a_bottle() {
    curl -H "Accept: application/json" --connect-timeout 2 -s "$1" 2>/dev/null | grep -q Welcome
}

wait_for_saltmaster() {
    while ! found_message_in_a_bottle "$1"; do
        echo wait for service...
        sleep 1
    done
}
I then test my service:
source myscript.sh
wait_for_saltmaster localhost:8080
Expected
I expect the loop to wait until the service responds with HTTP 200.
Result
For some services, the service returns HTTP 200 but the curl loop keeps retrying.
Question
Is there anything wrong with my cURL command?
If you are checking for the HTTP 200 response code, you should look at the response headers and check whether the status line contains HTTP 200.
In general, the response headers look like this:
HTTP/1.1 200 OK
Date: Mon, 11 Dec 2017 23:59:11 GMT
Server: Apache/2.2.32 (Unix) mod_wsgi/3.5 Python/2.7.13 PHP/7.1.8 mod_ssl/2.2.32 OpenSSL/1.0.2j DAV/2 mod_fastcgi/2.4.6 mod_perl/2.0.9 Perl/v5.24.0
X-Powered-By: PHP/7.1.8
Content-Length: 0
Content-Type: text/html; charset=UTF-8
You can check the response headers with this curl command: extract the first line and see whether it is 200 OK.
curl -H "Accept: application/json" --connect-timeout 2 -s -D - "$1" -o /dev/null 2>/dev/null | head -n1 | grep 200

How to check status of URLs from text file using bash shell script

I have to check the status of 200 http URLs and find out which of these are broken links. The links are present in a simple text file (say URL.txt present in my ~ folder). I am using Ubuntu 14.04 and I am a Linux newbie. But I understand the bash shell is very powerful and could help me achieve what I want.
My exact requirement would be to read the text file which has the list of URLs and automatically check if the links are working and write the response to a new file with the URLs and their corresponding status (working/broken).
I created a file "checkurls.sh" and placed it in my home directory where the urls.txt file is also located. I gave execute privileges to the file using
$ chmod +x checkurls.sh
The contents of checkurls.sh is given below:
#!/bin/bash
while read url
do
    urlstatus=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "$url")
    echo "$url $urlstatus" >> urlstatus.txt
done < $1
Finally, I executed it from the command line as follows:
$./checkurls.sh urls.txt
Voila! It works.
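If you want the output labelled working/broken, as the requirement above states, rather than raw status codes, a small variation could look like this; treating every 2xx/3xx code as "working" is an assumption:
#!/bin/bash
# Sketch: label each URL as working or broken based on its HTTP status code.
while read -r url; do
    code=$(curl -o /dev/null --silent --head --write-out '%{http_code}' "$url")
    if [ "$code" -ge 200 ] && [ "$code" -lt 400 ]; then
        echo "$url working ($code)"
    else
        echo "$url broken ($code)"
    fi
done < "$1" > urlstatus.txt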
#!/bin/bash
while read -ru 4 LINE; do
    read -r REP < <(exec curl -IsS "$LINE" 2>&1)
    echo "$LINE: $REP"
done 4< "$1"
Usage:
bash script.sh urls-list.txt
Sample:
http://not-exist.com/abc.html
https://kernel.org/nothing.html
http://kernel.org/index.html
https://kernel.org/index.html
Output:
http://not-exist.com/abc.html: curl: (6) Couldn't resolve host 'not-exist.com'
https://kernel.org/nothing.html: HTTP/1.1 404 Not Found
http://kernel.org/index.html: HTTP/1.1 301 Moved Permanently
https://kernel.org/index.html: HTTP/1.1 200 OK
For everything, read the Bash Manual. See man curl, help, man bash as well.
What about adding some parallelism to the accepted solution? Let's modify the script (chkurl.sh) to be a little easier to read and to handle just one request at a time:
#!/bin/bash
URL=${1?Pass URL as parameter!}
curl -o /dev/null --silent --head --write-out "$URL %{http_code} %{redirect_url}\n" "$URL"
And now you check your list using:
cat URL.txt | xargs -P 4 -L1 ./chkurl.sh
This could finish the job up to 4 times faster.
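To keep an output file like the urlstatus.txt of the accepted solution, the same invocation can simply be redirected; note that with -P 4 the order of the result lines will generally not match URL.txt:
xargs -P 4 -L1 ./chkurl.sh < URL.txt > urlstatus.txt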
Herewith my full script that checks URLs listed in a file passed as an argument e.g. 'checkurls.sh listofurls.txt'.
What it does:
check url using curl and return the HTTP status code
send email notification when the url returns a code other than 200
create a temporary lock file for failed urls (file naming could be improved)
send email notification when the url becomes available again
remove the lock file once the url becomes available, to avoid further notifications
log events to a file and handle increasing log file size (AKA log rotation; uncomment the echo if logging of code 200 is required)
Code:
#!/bin/bash
EMAIL="your#email.com"
DATENOW=`date +%Y%m%d-%H%M%S`
LOG_FILE="checkurls.log"
c=0
while read url
do
    ((c++))
    LOCK_FILE="checkurls$c.lock"
    urlstatus=$(/usr/bin/curl -H 'Cache-Control: no-cache' -o /dev/null --silent --head --write-out '%{http_code}' "$url")
    if [ "$urlstatus" = "200" ]
    then
        #echo "$DATENOW OK $urlstatus connection->$url" >> $LOG_FILE
        [ -e $LOCK_FILE ] && /bin/rm -f -- $LOCK_FILE > /dev/null && /bin/mail -s "NOTIFICATION URL OK: $url" $EMAIL <<< 'The URL is back online'
    else
        echo "$DATENOW FAIL $urlstatus connection->$url" >> $LOG_FILE
        if [ -e $LOCK_FILE ]
        then
            # no action - awaiting URL to be fixed
            :
        else
            /bin/mail -s "NOTIFICATION URL DOWN: $url" $EMAIL <<< 'Failed to reach or URL problem'
            /bin/touch $LOCK_FILE
        fi
    fi
done < $1

# REMOVE LOG FILE IF LARGER THAN ~120MB
# allow up to 2000 lines average
maxsize=120000
size=$(/usr/bin/du -k "$LOG_FILE" | /bin/cut -f 1)
if [ $size -ge $maxsize ]; then
    /bin/rm -f -- $LOG_FILE > /dev/null
    echo "$DATENOW LOG file [$LOG_FILE] has been recreated" > $LOG_FILE
else
    # do nothing
    :
fi
Please note that changing the order of the listed URLs in the text file will affect any existing lock files (remove all .lock files to avoid confusion). This could be improved by using the URL as the lock file name, but characters such as : # / ? & would have to be handled for the operating system.
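One way to derive the lock file name from the URL itself, so that reordering the list no longer matters, is to hash the URL; this is only a sketch and assumes md5sum is available:
# Sketch: a per-URL lock file name that stays stable when the list is reordered.
url_hash=$(printf '%s' "$url" | md5sum | cut -d' ' -f1)
LOCK_FILE="checkurls-$url_hash.lock"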
I recently released deadlink, a command-line tool for finding broken links in files. Install with
pip install deadlink
and use as
deadlink check /path/to/file/or/directory
or
deadlink replace-redirects /path/to/file/or/directory
The latter will replace permanent redirects (301) in the specified files.
If your input file contains one URL per line, you can use a script that reads each line and pings it; if the ping succeeds, the host behind the URL is at least reachable (ping does not check the HTTP layer itself):
#!/bin/bash
INPUT="Urls.txt"
OUTPUT="result.txt"
while read line; do
    if ping -c 1 $line &> /dev/null
    then
        echo "$line valid" >> $OUTPUT
    else
        echo "$line not valid" >> $OUTPUT
    fi
done < $INPUT
exit
ping options:
-c count
Stop after sending count ECHO_REQUEST packets. With the deadline option, ping waits for count ECHO_REPLY packets, until the timeout expires.
You can use this option as well to limit the waiting time:
-W timeout
Time to wait for a response, in seconds. The option affects only the timeout in the absence of any responses; otherwise ping waits for two RTTs.
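Note that ping expects a hostname, not a full URL; if your file contains http:// URLs, you may want to strip the scheme, path and port before pinging, along these lines (a sketch):
# Sketch: reduce "http://host:port/path" to "host" before pinging.
host=$(echo "$line" | sed -e 's|^[a-zA-Z]*://||' -e 's|/.*$||' -e 's|:.*$||')
ping -c 1 -W 2 "$host" &> /dev/null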
curl -s -I --http2 http://$1 >> fullscan_curl.txt | cut -d: -f1 fullscan_curl.txt | cat fullscan_curl.txt | grep HTTP >> fullscan_httpstatus.txt
It works for me.

How to check if HTTP returns HTTP/1.1 200 OK using wget

Here I am trying to check whether wget -S returns HTTP/1.1 200 OK, using a shell script.
I am using this wget command to get the HTTP status from the URL:
#!/bin/sh
#
URL="http://www.example.com"
wget -S $URL
If it returns HTTP/1.1 200 OK, the script should exit; otherwise it should run the scr.sh script.
How can I do this?
Here I'm using the exit code of a silent wget command:
wget -q --spider $URL
if [ $? -ne 0 ]; then
    echo "do something"
fi
If the response is an error, it runs some command.
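Applied to the original question (exit on a successful response, otherwise run scr.sh), that could look like this; the path to scr.sh is an assumption:
#!/bin/sh
URL="http://www.example.com"
# --spider makes wget check the URL without downloading it;
# wget exits 0 on a successful (2xx) response.
if wget -q --spider "$URL"; then
    exit 0
else
    ./scr.sh   # assumed to live in the current directory
fi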
Using wget:
SHOULD_EXIT=$(wget --server-response --content-on-error=off "${URL}" 2>&1 | awk -F':' '$1 ~ / Status$/ { print $2 ~ /200 OK/ }')
${SHOULD_EXIT} will be 1 for 200 OK and 0 otherwise.
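You could then branch on it, roughly like this (a sketch; scr.sh in the current directory is an assumption, and an empty ${SHOULD_EXIT} is treated as a failure):
if [ "${SHOULD_EXIT}" = "1" ]; then
    exit 0
else
    ./scr.sh
fi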

Linux cURL XML file POST - how to check for error/success?

I am using cURL to post an XML file on Linux as follows:
curl -X POST --header "Content-Type: text/xml" -d @test.xml "https://www.example.com"
How can I check the status of the cURL command to see if the file was posted or not?
Thanks for any help.
I'm not sure if I understood you correctly, but you can simply check the exit status of the curl command. The exit status of the last command is stored in the variable $?.
Example:
curl -X POST --header "Content-Type: text/xml" -d @test.xml "https://www.example.com"
ret=$? # store the exit status for later use in the error message
if [ $ret != 0 ]; then
    echo "POST failed with exit code $ret"
fi
The list of possible exit codes can be found at the bottom of the curl man page. They are very helpful for debugging.
You can get the response status of your operation using -w %{http_code}
curl -s -o out.txt -w %{http_code} http://www.example.com/
In this example, -s means silent mode and -o out.txt saves the response (usually HTML) into a file.
For the above command, you'll get 200 as the output when it succeeds.
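Combining the two checks for the original XML POST might look like this (a sketch; test.xml and the URL are the placeholders from the question, and treating any 2xx code as success is an assumption about the endpoint):
status=$(curl -s -o /dev/null -w '%{http_code}' -X POST \
    --header "Content-Type: text/xml" -d @test.xml "https://www.example.com")
ret=$?
if [ "$ret" -ne 0 ]; then
    echo "curl itself failed with exit code $ret"
elif [ "$status" -ge 200 ] && [ "$status" -lt 300 ]; then
    echo "POST succeeded with HTTP $status"
else
    echo "POST failed with HTTP $status"
fi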

Get final URL after curl is redirected

I need to get the final URL after a page redirect preferably with curl or wget.
For example http://google.com may redirect to http://www.google.com.
The contents are easy to get (e.g. curl --max-redirs 10 http://google.com -L), but I'm only interested in the final URL (in the example above, http://www.google.com).
Is there any way of doing this by using only Linux built-in tools? (command line only)
curl's -w option and the sub variable url_effective is what you are looking for.
Something like
curl -Ls -o /dev/null -w %{url_effective} http://google.com
More info
-L Follow redirects
-s Silent mode. Don't output anything
-o FILE Write output to <file> instead of stdout
-w FORMAT What to output after completion
You might want to add -I (that is an uppercase i) as well, which makes the command not download any "body", but it then also uses the HEAD method, which is not what the question asked about and risks changing what the server does. Sometimes servers don't respond well to HEAD even when they respond fine to GET.
Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":
#!/bin/bash
curl "$1" -s -L -I -o /dev/null -w '%{url_effective}'
-o output to /dev/null
-I don't actually download, just discover the final URL
-s silent mode, no progress bars
This made it possible to call the command from other scripts like this:
echo `finalurl http://someurl/`
as another option:
$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
But it doesn't go past the first one.
Thank you. I ended up implementing your suggestions: curl -i + grep
curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1
Returns blank if the website doesn't redirect, but that's good enough for me as it works on consecutive redirections.
Could be buggy, but at a glance it works ok.
You can usually do this with wget: wget --content-disposition "url". Additionally, if you add -O /dev/null you will not actually save the file.
wget -O /dev/null --content-disposition example.com
The parameters -L (--location) and -I (--head) still do an unnecessary HEAD request to the location URL.
If you are sure that you will have no more than one redirect, it is better to disable following the location and use the curl variable %{redirect_url}.
This command does only one HEAD request to the specified URL and takes redirect_url from the Location header:
curl --head --silent --write-out "%{redirect_url}\n" --output /dev/null "https://goo.gl/QeJeQ4"
Speed test
all_videos_link.txt contains 50 goo.gl/bit.ly links which redirect to YouTube
1. With follow location
time while read -r line; do
curl -kIsL -w "%{url_effective}\n" -o /dev/null $line
done < all_videos_link.txt
Results:
real 1m40.832s
user 0m9.266s
sys 0m15.375s
2. Without follow location
time while read -r line; do
curl -kIs -w "%{redirect_url}\n" -o /dev/null $line
done < all_videos_link.txt
Results:
real 0m51.037s
user 0m5.297s
sys 0m8.094s
curl can only follow HTTP redirects. To also follow meta refresh directives and JavaScript redirects, you need a full-blown browser like headless Chrome:
#!/bin/bash
real_url () {
    printf 'location.href\nquit\n' | \
        chromium-browser --headless --disable-gpu --disable-software-rasterizer \
            --disable-dev-shm-usage --no-sandbox --repl "$@" 2> /dev/null \
        | tr -d '>>> ' | jq -r '.result.value'
}
If you don't have chrome installed, you can use it from a docker container:
#!/bin/bash
real_url () {
    printf 'location.href\nquit\n' | \
        docker run -i --rm --user "$(id -u "$USER")" --volume "$(pwd)":/usr/src/app \
            zenika/alpine-chrome --no-sandbox --repl "$@" 2> /dev/null \
        | tr -d '>>> ' | jq -r '.result.value'
}
Like so:
$ real_url http://dx.doi.org/10.1016/j.pgeola.2020.06.005
https://www.sciencedirect.com/science/article/abs/pii/S0016787820300638?via%3Dihub
This would work:
curl -I somesite.com | perl -n -e '/^Location: (.*)$/ && print "$1\n"'
I'm not sure how to do it with curl, but libwww-perl installs the GET alias.
$ GET -S -d -e http://google.com
GET http://google.com --> 301 Moved Permanently
GET http://www.google.com/ --> 302 Found
GET http://www.google.ca/ --> 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Sat, 19 Jun 2010 04:11:01 GMT
Server: gws
Content-Type: text/html; charset=ISO-8859-1
Expires: -1
Client-Date: Sat, 19 Jun 2010 04:11:01 GMT
Client-Peer: 74.125.155.105:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=a1925ca9f8af11b9:TM=1276920661:LM=1276920661:S=ULFrHqOiFDDzDVFB; expires=Mon, 18-Jun-2012 04:11:01 GMT; path=/; domain=.google.ca
Title: Google
X-XSS-Protection: 1; mode=block
Can you try this?
#!/bin/bash
LOCATION=`curl -I 'http://your-domain.com/url/redirect?r=something&a=values-VALUES_FILES&e=zip' | perl -n -e '/^Location: (.*)$/ && print "$1\n"'`
echo "$LOCATION"
Note: when you execute the command, you have to put the URL in single quotes, as in curl -I 'http://your-domain.com', so that the shell does not interpret characters such as ? and &.
You could use grep. Doesn't wget tell you where it's redirecting to? Just grep that out.
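For example, wget prints the redirect chain on stderr, so a sketch along these lines could keep the last Location header it saw (the exact formatting of wget's output may vary between versions):
# Sketch: let wget follow the redirects and print the last Location header.
wget --server-response --spider http://google.com 2>&1 \
    | awk '/^  Location: /{url=$2} END{print url}'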
