Multiple downloads with wget at the same time - linux

I have a links.txt file with multiple download links, all protected by the same username and password.
My intention is to download multiple files at the same time: if the file contains 5 links, all 5 files should download at once.
I've tried the following, but without success:
cat links.txt | xargs -n 1 -P 5 wget --user user007 --password pass147
cat links.txt | xargs -n 1 -P 5 wget --user=user007 --password=pass147
Both give me this error:
Reusing existing connection to HTTP request sent,
awaiting response... 404 Not Found
This message appears for every link I try to download, except for the last link in the file, which starts to download.
I am currently using the following, but it downloads just one file at a time:
wget --user=admin --password=145788s -i links.txt

Use wget's -i and -b flags.
-b
Go to background immediately after startup. If no output file is specified via the -o, output is redirected to wget-log.
-i file
Read URLs from a local or external file. If - is specified as file, URLs are read from the standard input. (Use ./- to read from a file literally named -.)
Your command will look like:
wget --user user007 --password "pass147*" -b -i links.txt
Note: You should always quote strings that contain special characters (e.g. *).
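If the files really need to be fetched concurrently, the xargs approach from the question can still be made to work. One possible cause of a 404 on every link except the last is a links.txt saved with Windows (CRLF) line endings, which leaves a stray carriage return on each URL. That is an assumption; the question does not confirm it. A minimal sketch that cleans the list before handing it to xargs (the function names clean_url_list and parallel_fetch are made up for this example):

```shell
# Strip carriage returns and drop blank lines from a URL list file.
clean_url_list() {
    tr -d '\r' < "$1" | grep -v '^[[:space:]]*$'
}

# $1 = list file, $2 = number of parallel jobs, remaining arguments go to wget.
# Relies on xargs -P, which GNU and BSD xargs provide.
parallel_fetch() {
    list=$1
    jobs=$2
    shift 2
    clean_url_list "$list" | xargs -n 1 -P "$jobs" wget "$@"
}

# Example, using the credentials from the question:
# parallel_fetch links.txt 5 --user=user007 --password=pass147
```

Running clean_url_list links.txt on its own first is a quick way to check whether the list was the problem.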


wget recursion and file extraction

I'm trying to use wget to elegantly & politely download all the pdfs from a website. The pdfs live in various sub-directories under the starting URL. It appears that the -A pdf option is conflicting with the -r option. But I'm not a wget expert! This command:
wget -nd -np -r site/path
faithfully traverses the entire site downloading everything downstream of path (not polite!). This command:
wget -nd -np -r -A pdf site/path
finishes immediately having downloaded nothing. Running that same command in debug mode:
wget -nd -np -r -A pdf -d site/path
reveals that the sub-directories are ignored with the debug message:
Deciding whether to enqueue "https://site/path/subdir1". https://site/path/subdir1 (subdir1) does not match acc/rej rules. Decided NOT to load it.
I think this means that the sub directories did not satisfy the "pdf" filter and were excluded. Is there a way to get wget to recurse into sub directories (of random depth) and only download pdfs (into a single local dir)? Or does wget need to download everything and then I need to manually filter for pdfs afterward?
UPDATE: thanks to everyone for their ideas. The solution was to use a two step approach including a modified version of this:
Try this
The "-l" switch tells wget how many levels down from the primary URL to follow links; -l1 goes one level down. You can raise it to however many levels of links you want to follow.
wget -r -l1 -A.pdf
Refer to man wget for more details.
If the above doesn't work, try this.
First verify that the TOS of the website permit crawling it. Then, one solution is:
mech-dump --links '' |
grep 'pdf$' |
sed -E 's/\s+/%20/g' |
xargs -I% wget
The mech-dump command comes with Perl's WWW::Mechanize module (the libwww-mechanize-perl package on Debian and Debian-like distros).
To install mech-dump:
sudo apt-get update -y
sudo apt-get install -y libwww-mechanize-shell-perl
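The filtering stages of the pipeline above can be tried on their own without mech-dump. This is a minimal sketch, assuming one URL per line on standard input; the function name filter_pdf_links is made up for this example, and it encodes each single space rather than runs of whitespace:

```shell
# Keep only lines ending in .pdf (case-insensitive) and encode spaces as %20.
filter_pdf_links() {
    grep -i '\.pdf$' | sed 's/ /%20/g'
}

# Example: feed it any list of links, one per line.
# printf '%s\n' 'http://site/a b.pdf' 'http://site/index.html' | filter_pdf_links
```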
I haven't tested this, but you can still give it a try. I think you still need a way to get all the URLs of the website and pipe them to one of the solutions I have given.
You will need to have wget and lynx installed:
sudo apt-get install wget lynx
Prepare a script and name it however you want; for this example, pdflinkextractor:
#!/bin/bash
# The start URL is passed as the first argument.
WEBSITE="$1"
echo "Getting link list..."
lynx -cache=0 -dump -listonly "$WEBSITE" | grep ".*\.pdf$" | awk '{print $2}' | tee pdflinks.txt
echo "Downloading..."
wget -P pdflinkextractor_files/ -i pdflinks.txt
To run the file, make it executable and pass the start URL:
chmod 700 pdflinkextractor
$ ./pdflinkextractor <URL>
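To see what the grep and awk stages of the lynx pipeline do, it helps to run them on sample input. The sketch below assumes lynx prints its link list as numbered lines such as "   1. http://...", which is why the URL is taken from the second field. The function name extract_pdf_urls and the added sort -u (to drop a PDF that is linked more than once) are not part of the original script:

```shell
# Read a lynx -dump -listonly style listing on stdin and print unique PDF URLs.
extract_pdf_urls() {
    grep '\.pdf$' | awk '{print $2}' | sort -u
}

# Example with the real tool:
# lynx -cache=0 -dump -listonly "$WEBSITE" | extract_pdf_urls > pdflinks.txt
```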

Wget gives Error 403: forbidden
The web directory given above contains approximately 30 zip files.
I want to download all the files in that directory. I have tried every option I could find on Stack Overflow related to my question, but every time I get the same Error 403: forbidden.
I have tried the following commands:
wget --user-agent="Mozilla" -r -np
wget -r -l1 -H -t1 -nd -N -np -erobots=off
wget -U firefox -r -np
I tried to visit your link in a browser and received the same "Forbidden" message.
Try opening the link in a private window and see what happens.
It's quite possible that you are logged in to this site, so your browser has cookies that allow you to view this directory.
If so, you will need to find those cookies and pass them to wget as well so that it can access the protected resource.
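One way to hand cookies to wget is its --header option, which sends an extra request header; wget also has --load-cookies for a cookies.txt file exported from the browser. A minimal sketch of the first approach, where the function names and the cookie names in the example are hypothetical and the real ones have to be read from the browser:

```shell
# Join name=value arguments into a single Cookie request header.
cookie_header() {
    header=""
    for pair in "$@"; do
        if [ -z "$header" ]; then
            header=$pair
        else
            header="$header; $pair"
        fi
    done
    printf 'Cookie: %s\n' "$header"
}

# $1 = directory URL, remaining arguments = cookies as name=value pairs.
fetch_with_cookies() {
    url=$1
    shift
    wget --header="$(cookie_header "$@")" -r -np "$url"
}

# Example (hypothetical cookie names and values):
# fetch_with_cookies "$URL" sessionid=abc123 token=xyz
```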

How to use lftp to transfer segmented files?

I want to transfer a file from my server to another. The network between these servers isn't very good, so I want to use lftp to speed things up. My script looks like this:
lftp -u user,password -e "set sftp:connect-program 'ssh -a -x -i /key'; mirror --use-pget=5 -i data.tar.gz -r -R /data/ /tmp; quit" sftp://**.**.**.**:22
I found that data.tar.gz wasn't segmented, but when I use the same approach to download a file, it works.
What should I do?
Segmented uploads are not implemented in lftp. If you have ssh access to the server, log in there and use lftp to download the file instead. If there are many files, you can also upload different files in parallel using mirror's -P option.
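Run from the receiving server, the segmented download could look roughly like the sketch below, which uses lftp's pget command with -n for the number of segments. The helper name pget_command and the SOURCE_HOST placeholder are made up for this example, and the helper does not handle paths containing spaces:

```shell
# Build the lftp command string for a segmented download.
# $1 = remote file, $2 = number of segments, $3 = local file name.
pget_command() {
    printf 'pget -n %s %s -o %s; quit\n' "$2" "$1" "$3"
}

# Example, run on the destination server (SOURCE_HOST is a placeholder):
# lftp -u user,password \
#      -e "set sftp:connect-program 'ssh -a -x -i /key'; $(pget_command /data/data.tar.gz 5 /tmp/data.tar.gz)" \
#      sftp://SOURCE_HOST
```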

Using wget to download all zip files on an shtml page

I've been trying to download all the zip files on this website to an EC2 server. However, it is not recognizing the links and thus not downloading anything. I think it's because the shtml file requires that SSI be enabled and that's somehow causing a problem with wget. But I don't really understand that stuff.
This is the code I've been using unsuccessfully.
wget -r -l1 -H -t1 -nd -N -np -erobots=off
Thanks for any help you can provide!
The zip links aren't present in the page source; they're generated via JavaScript, which is why wget cannot download them. The file list is located under the node <fec_file status="Archive"></fec_file>
You can write a script to parse the XML file and convert the nodes to the actual links, because they follow a pattern.
As #cyrus mentioned, the files are also available over FTP. You can use wget -m to mirror the FTP site and -A zip to restrict the download to zip files, i.e.:
wget -A zip -m --user=anonymous
Or with wget -r:
wget -A zip --ftp-user=anonymous -r*

Ubuntu: Using curl to download an image

I want to download an image accessible from this link: into my local system. Now, I'm aware that the curl command can be used to download remote files through the terminal. So, I entered the following in my terminal in order to download the image into my local system:
However, this doesn't seem to work, so obviously there is some other way to download images from the Internet using curl. What is the correct way to download images using this command?
curl without any options performs a GET request and simply writes the data returned from the specified URI to standard output. It does not save the file to your local machine.
When you do,
$ curl
You will receive binary data:
|�>�$! <R�HP#T*�Pm�Z��jU֖��ZP+UAUQ#�
��{X\� K���>0c�yF[i�}4�!�V̧�H_�)nO#�;I��vg^_ ��-Hm$$N0.
���%Y[�L�U3�_^9��P�T�0'u8�l�4 ...
In order to save this, you can use:
$ curl > image.png
to store that raw image data inside of a file.
An easier way though, is just to use wget.
$ wget
$ ls
For those who don't have wget and don't want to install it, curl -O (capital letter "O", not a zero) will do the same thing as wget. E.g. my old netbook doesn't have wget, and it is a 2.68 MB install that I don't need.
curl -O
If you want to keep the original name, use uppercase -O
curl -O
If you want to save the remote file under a different name, use lowercase -o
curl -o myPic.png
Create a new file called files.txt and paste the URLs one per line. Then run the following command.
xargs -n 1 curl -O < files.txt
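A slightly more defensive variant of the same idea is sketched below. It adds --fail, so an HTTP error page is not saved in place of the image, and --location, so redirects are followed; --remote-name is the long form of -O. The function name download_list is made up for this example:

```shell
# Download every URL listed in a file (one per line), keeping the remote names.
download_list() {
    # The "|| [ -n "$url" ]" part also handles a last line with no trailing newline.
    while IFS= read -r url || [ -n "$url" ]; do
        # Skip blank lines.
        [ -n "$url" ] || continue
        curl --fail --location --remote-name "$url"
    done < "$1"
}

# Example:
# download_list files.txt
```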
For those who got a permission denied error when saving, here is the command that worked for me:
$ curl --output py.png
Try this:
$ curl > precomposed.png
