CURL: grabbing liveleak video - linux

Can I grab a video using curl?
I was using a website to download videos from LiveLeak, but it stopped working. I need this for one of my scripts.
Basically, this is the link:
http://www.liveleak.com/e/955_1345380192
which is redirected to this:
http://edge.liveleak.com/80281E/u/u/ll2_player_files/mp55/player.swf?config=http://www.liveleak.com/player?a=config%26item_token=955_1345380192%26embed=1%26extra_params=
and that config link contains the video link. Every time I try to download it, I get
--->Make sure a file_url, file_token or playlist_token are set!
http://www.liveleak.com/player?a=config%26item_token=955_1345380192%26embed=1%26extra_params=
What I've tried so far:
curl http://edge.liveleak.com/80281E/u/u/ll2_player_files/mp55/player.swf?config=http://www.liveleak.com/player?a=config%26item_token=955_1345380192%26embed=1%26extra_params= -s -L -b LCOOKIE -c LCOOKIE -o LIVE
curl http://edge.liveleak.com/80281E/u/u/ll2_player_files/mp55/player.swf?config=http://www.liveleak.com/player?a=config%26item_token=955_1345380192%26embed=1%26extra_params= -I
curl http://edge.liveleak.com/80281E/u/u/ll2_player_files/mp55/player.swf?config=http://www.liveleak.com/player?a=config%26item_token=955_1345380192%26embed=1%26extra_params= -v
curl http://www.liveleak.com/player?a=config&item_token=955_1345380192&embed=1&extra_params=
wget http://www.liveleak.com/player?a=config&item_token=955_1345380192&embed=1&extra_params=
curl -A "Mozilla/5.0 (X11; U; Linux x86_64; ru; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.10 (maverick) Firefox/3.6.15" http://www.liveleak.com/player?a=config&item_token=955_1345380192&embed=1&extra_params=

Here is your epic one-liner:
UA="Mozilla/5.0 (X11; U; Linux x86_64; ru; rv:1.9.2.15) Gecko/20110303 Ubuntu/10.10 (maverick) Firefox/3.6.15"; curl -A "$UA" $(sed -n -e 's/.*<file>\(.*\)<\/file>.*/\1/p' <(wget -q -O - $(wget -U "$UA" -nv -r -np -nd -H --spider "http://www.liveleak.com/e/955_1345380192" 2>&1 | egrep ' URL:' | awk '{print $4}' | sed "s/.*\?config\=//g" | sed -e's/%\([0-9A-F][0-9A-F]\)/\\\\\x\1/g' | xargs echo -e)))
As requested, it uses curl (and a few additional tools); see the bash manual and the documentation of the other commands for an explanation.
Short summary: they moved the information about the video into an XML file. To make this easier next time, use the latest Firefox and its ability to spy on all HTTP requests and log their contents (no add-ons needed!)
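For reference, the core of the one-liner is just fetching that config XML and pulling the video URL out of its <file> element. A minimal sketch, assuming the config format has not changed (note that the config URL has to be quoted; unquoted, the shell cuts it at the first &, which is most likely what produces the "Make sure a file_url, file_token or playlist_token are set!" error above):
#!/bin/bash
# Sketch only: fetch the decoded player config XML, extract the video URL from
# its <file> element, then download the video (the output filename is arbitrary).
UA="Mozilla/5.0 (X11; U; Linux x86_64; rv:1.9.2.15) Gecko/20110303 Firefox/3.6.15"
CONFIG="http://www.liveleak.com/player?a=config&item_token=955_1345380192&embed=1&extra_params="
# The quotes around $CONFIG are essential; without them the shell splits the URL at each &.
video_url=$(curl -s -A "$UA" "$CONFIG" | sed -n 's/.*<file>\(.*\)<\/file>.*/\1/p')
curl -A "$UA" -L -o video.mp4 "$video_url"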

Related

Linux curl : no url found (or) curl: malformed url

I am setting up Docker on my Linux VM and have to run this command as part of the steps. Even though the command includes a URL, I keep getting these errors; I also tried changing -o to -O once, but it didn't help. What should I do?
This is the command I'm running:
sudo curl -L $(curl -L https://api.github.com/repos/docker/compose/releases/latest | grep "browser_download_url" | grep "$(uname -s)-$(uname -m)\"" | sed -nr 's/\s+"browser_download_url":\s+"(https.*)"/\1/p') -o /usr/local/bin/docker-compose
The grep that filters for the system you are running on matches against uname -s, which outputs Linux with an uppercase L, while the file names in the release use lowercase linux; this may be the cause of your errors. Try making that grep case-insensitive:
sudo curl -L $(curl -L https://api.github.com/repos/docker/compose/releases/latest | grep "browser_download_url" | grep -i "$(uname -s)-$(uname -m)\"" | sed -nr 's/\s+"browser_download_url":\s+"(https.*)"/\1/p') -o /usr/local/bin/docker-compose
Hope this helps!
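As an optional sanity check (assuming jq is installed), you can list the asset names the API actually returns and confirm the lowercase linux in them:
# Print the file name of every asset attached to the latest release;
# the lowercase "linux" in those names is why the case-insensitive grep is needed.
curl -sL https://api.github.com/repos/docker/compose/releases/latest | jq -r '.assets[].name'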

How can I use WGET to get only status info and save it somewhere?

Can I use wget to get, let's say, the status 200 OK and save that status somewhere? If not, how can I do that on Ubuntu Linux?
Thanks!
With curl you can
curl -L -o /dev/null -s -w "%{http_code}\n" http://google.com >> status.txt
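If you need this for several URLs, here is a small sketch (urls.txt is a hypothetical file with one URL per line) that records each URL next to its status:
# For each URL: fetch it silently, follow redirects, discard the body,
# and append "URL status" to status.txt.
while read -r url; do
    printf '%s %s\n' "$url" "$(curl -L -o /dev/null -s -w '%{http_code}' "$url")"
done < urls.txt >> status.txt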
With wget, you use --save-headers to add the headers to the output, send the output to the console using -O -, discard the error stream using 2>/dev/null, and keep only the status line using grep HTTP/.
You can then redirect that into a file using >status_file:
$ wget --save-headers -O - http://google.com/ 2>/dev/null | grep HTTP/ > status_file
The question asks for the status of the wget command to be stored somewhere. As another alternative, the following example stores the HTTP status returned by wget in a shell variable (wget_status) and then displays it with echo:
$ wget_status=$(wget --server-response ${URL} 2>&1 | awk '/^ HTTP/{print $2}')
$ echo $wget_status
200
After the wget command has run, you can act on the value of the wget_status variable.
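For example, here is a minimal sketch of branching on that value (the URL is a placeholder, and -O /dev/null is added so the page body itself is discarded):
URL="http://google.com"
# --server-response prints the headers of every response (including redirects)
# on stderr, so keep only the last HTTP status line.
wget_status=$(wget -O /dev/null --server-response "${URL}" 2>&1 | awk '/HTTP\/[0-9]/{print $2}' | tail -n 1)
if [ "$wget_status" = "200" ]; then
    echo "OK"
else
    echo "Request failed with status: $wget_status"
fi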
For more information consult the following link as a reference:
https://www.unix.com/shell-programming-and-scripting/148595-capture-http-response-code-wget.html
The tests were executed using Cloud Shell on a Linux system:
Linux cs-335831867014-default 5.10.90+ #1 SMP Wed Mar 23 09:10:07 UTC 2022 x86_64 GNU/Linux

Linux Shell : tail a log file and send each line to http service using curl

I'm collecting logs and need to send the logs to a real-time log monitor system.
The log monitor system receives messages via http service.
This command works:
tail -F app.2015-02-17.log | sed -e "s/\"//g" | awk '{print "{\"topic\":\"EC_IPService_Log\",\"message\":\""$0"\"}"}'|while read LINE ; do (curl -X POST -H "Authorization:Basic cm9ja2V0bXE6cm9ja2V0bXE=" -H "mq-version:1" -H "Content-Type:application/json" http://localhost:9999/MQ/sendMessage -d "$LINE") ; done
The problem is that this command does not run fast enough to keep up with the output of the tail command, so it falls further and further behind. After running it for 10 hours, it may still be handling log lines from an hour ago...
If I change the curl command to echo, like this:
tail -F app.2015-02-17.log | sed -e "s/\"//g" | awk '{print "{\"topic\":\"EC_IPService_Log\",\"message\":\""$0"\"}"}'|while read LINE ; do (echo "$LINE") ; done
This version runs fast enough.
Is there a better way to use curl to solve this issue?
Thanks in advance.
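The cost here is a brand-new curl process and HTTP connection for every single log line. One common way to ease that (a sketch only, not a tested drop-in; GNU sed and xargs plus an awk with fflush() are assumed) is to keep the pipeline unbuffered and run several curl processes in parallel with xargs:
tail -F app.2015-02-17.log \
  | sed -u -e 's/"//g' \
  | awk '{print "{\"topic\":\"EC_IPService_Log\",\"message\":\""$0"\"}"; fflush()}' \
  | xargs -d '\n' -P 8 -I {} \
      curl -s -o /dev/null -X POST \
           -H "Authorization:Basic cm9ja2V0bXE6cm9ja2V0bXE=" \
           -H "mq-version:1" \
           -H "Content-Type:application/json" \
           -d {} \
           http://localhost:9999/MQ/sendMessage
# sed -u and awk's fflush() stop lines from sitting in pipe buffers;
# xargs -I {} passes each JSON line as the request body and -P 8 keeps
# up to 8 curl POSTs in flight at once; -o /dev/null discards the responses.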

Curl: How to insert a value into a cookie?

How do I insert cookie values in curl? From the Firebug request headers I can see the following:
Cookie: PHPSESSID=gg792c2ktu6sch6n8q0udd94o0; was=1; uncheck2=1; uncheck3=1; uncheck4=1; uncheck5=0; hd=1; uncheck1=1
I have tried the following:
curl http://site.com/ -s -L -b cookie.c -c cookie.c -d "was=1; uncheck2=1; uncheck3=1; uncheck4=1; uncheck5=0; hd=1; uncheck1=1" > comic
and the only thing I see in cookie.c is:
PHPSESSID=gg792c2ktu6sch6n8q0udd94o0; was=1;
To pass cookie keys/values to cURL, you need the -b switch, not -d.
-d is for form data, which is separated by & and not by ; as in your curl command.
So:
curl http://site.com/ \
-s \
-L \
-b cookie.c \
-c cookie.c \
-b "was=1; uncheck2=1; uncheck3=1; uncheck4=1; uncheck5=0; hd=1; uncheck1=1"
> comic
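To verify which Cookie header curl actually ends up sending, you can inspect the request with -v (just a quick check; site.com is the placeholder from the question):
# -v prints the outgoing request headers on stderr, prefixed with "> ";
# grep keeps only the Cookie line so you can see the merged cookies.
curl -v -s -o /dev/null \
  -b cookie.c \
  -b "was=1; uncheck2=1; uncheck3=1; uncheck4=1; uncheck5=0; hd=1; uncheck1=1" \
  http://site.com/ 2>&1 | grep '^> Cookie'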
If you need to know the names of the forms to be POSTed, you can run the following command:
mech-dump --forms http://site.com/
It comes with the libwww-mechanize-perl package on Debian and its derivatives.

Get final URL after curl is redirected

I need to get the final URL after a page redirect, preferably with curl or wget.
For example http://google.com may redirect to http://www.google.com.
The contents are easy to get (e.g. curl --max-redirs 10 http://google.com -L), but I'm only interested in the final URL (in this example, http://www.google.com).
Is there any way of doing this by using only Linux built-in tools? (command line only)
curl's -w option and the variable url_effective are what you are looking for.
Something like
curl -Ls -o /dev/null -w %{url_effective} http://google.com
More info
-L Follow redirects
-s Silent mode. Don't output anything
-o FILE Write output to <file> instead of stdout
-w FORMAT What to output after completion
You might want to add -I (that is an uppercase i) as well, which will make the command not download any "body". However, it then also uses the HEAD method, which is not what the question asked for, and it risks changing what the server does; sometimes servers don't respond well to HEAD even when they respond fine to GET.
Thanks, that helped me. I made some improvements and wrapped that in a helper script "finalurl":
#!/bin/bash
curl "$1" -s -L -I -o /dev/null -w '%{url_effective}'
-o output to /dev/null
-I don't actually download, just discover the final URL
-s silent mode, no progress bars
This made it possible to call the command from other scripts like this:
echo `finalurl http://someurl/`
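If you also need the HTTP status code of the final response alongside the final URL, here is a small extension (a sketch; url_effective and http_code are both standard curl --write-out variables, and the process substitution requires bash):
# Fetch once, discard the body, and read both values into shell variables.
read -r final_url status < <(curl -Ls -o /dev/null -w '%{url_effective} %{http_code}' http://google.com)
echo "$final_url returned $status"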
As another option:
$ curl -i http://google.com
HTTP/1.1 301 Moved Permanently
Location: http://www.google.com/
Content-Type: text/html; charset=UTF-8
Date: Sat, 19 Jun 2010 04:15:10 GMT
Expires: Mon, 19 Jul 2010 04:15:10 GMT
Cache-Control: public, max-age=2592000
Server: gws
Content-Length: 219
X-XSS-Protection: 1; mode=block
<HTML><HEAD><meta http-equiv="content-type" content="text/html;charset=utf-8">
<TITLE>301 Moved</TITLE></HEAD><BODY>
<H1>301 Moved</H1>
The document has moved
here.
</BODY></HTML>
But it doesn't go past the first redirect.
Thank you. I ended up implementing your suggestions: curl -i + grep
curl -i http://google.com -L | egrep -A 10 '301 Moved Permanently|302 Found' | grep 'Location' | awk -F': ' '{print $2}' | tail -1
Returns blank if the website doesn't redirect, but that's good enough for me as it works on consecutive redirections.
Could be buggy, but at a glance it works ok.
You can usually do this with wget: wget --content-disposition "url". Additionally, if you add -O /dev/null, you will not actually be saving the file.
wget -O /dev/null --content-disposition example.com
The parameters -L (--location) and -I (--head) still do an unnecessary HEAD request to the location URL.
If you are sure that you will have no more than one redirect, it is better to disable following the location and use the curl variable %{redirect_url}.
This code does only one HEAD request to the specified URL and takes redirect_url from the Location header:
curl --head --silent --write-out "%{redirect_url}\n" --output /dev/null "https://goo.gl/QeJeQ4"
Speed test
all_videos_link.txt - 50 goo.gl and bit.ly links that redirect to YouTube
1. With follow location
time while read -r line; do
curl -kIsL -w "%{url_effective}\n" -o /dev/null $line
done < all_videos_link.txt
Results:
real 1m40.832s
user 0m9.266s
sys 0m15.375s
2. Without follow location
time while read -r line; do
curl -kIs -w "%{redirect_url}\n" -o /dev/null $line
done < all_videos_link.txt
Results:
real 0m51.037s
user 0m5.297s
sys 0m8.094s
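Note that %{redirect_url} is empty when the response carries no Location header, so if a URL might not redirect at all, a small fallback to the original URL is useful (a sketch building on the command above):
url="https://goo.gl/QeJeQ4"
# One HEAD request; if there is no redirect, %{redirect_url} comes back empty,
# so fall back to the URL we started with.
target=$(curl --head --silent --output /dev/null --write-out '%{redirect_url}' "$url")
echo "${target:-$url}"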
curl can only follow HTTP redirects. To also follow meta refresh directives and JavaScript redirects, you need a full-blown browser like headless Chrome:
#!/bin/bash
real_url () {
printf 'location.href\nquit\n' | \
chromium-browser --headless --disable-gpu --disable-software-rasterizer \
--disable-dev-shm-usage --no-sandbox --repl "$@" 2> /dev/null \
| tr -d '>>> ' | jq -r '.result.value'
}
If you don't have chrome installed, you can use it from a docker container:
#!/bin/bash
real_url () {
printf 'location.href\nquit\n' | \
docker run -i --rm --user "$(id -u "$USER")" --volume "$(pwd)":/usr/src/app \
zenika/alpine-chrome --no-sandbox --repl "$@" 2> /dev/null \
| tr -d '>>> ' | jq -r '.result.value'
}
Like so:
$ real_url http://dx.doi.org/10.1016/j.pgeola.2020.06.005
https://www.sciencedirect.com/science/article/abs/pii/S0016787820300638?via%3Dihub
This would work:
curl -I somesite.com | perl -n -e '/^Location: (.*)$/ && print "$1\n"'
I'm not sure how to do it with curl, but libwww-perl installs the GET alias.
$ GET -S -d -e http://google.com
GET http://google.com --> 301 Moved Permanently
GET http://www.google.com/ --> 302 Found
GET http://www.google.ca/ --> 200 OK
Cache-Control: private, max-age=0
Connection: close
Date: Sat, 19 Jun 2010 04:11:01 GMT
Server: gws
Content-Type: text/html; charset=ISO-8859-1
Expires: -1
Client-Date: Sat, 19 Jun 2010 04:11:01 GMT
Client-Peer: 74.125.155.105:80
Client-Response-Num: 1
Set-Cookie: PREF=ID=a1925ca9f8af11b9:TM=1276920661:LM=1276920661:S=ULFrHqOiFDDzDVFB; expires=Mon, 18-Jun-2012 04:11:01 GMT; path=/; domain=.google.ca
Title: Google
X-XSS-Protection: 1; mode=block
Can you try this?
#!/bin/bash
LOCATION=`curl -I 'http://your-domain.com/url/redirect?r=something&a=values-VALUES_FILES&e=zip' | perl -n -e '/^Location: (.*)$/ && print "$1\n"'`
echo "$LOCATION"
Note: when you execute the command curl -I http://your-domain.com, you have to put the URL in single quotes, like curl -I 'http://your-domain.com', otherwise characters such as & in the query string will be interpreted by the shell.
You could use grep. Doesn't wget tell you where it's redirecting to? Just grep that out.

Resources