Download files using Shell script - linux

I want to download a number of files which are as follows:
http://example.com/directory/file1.txt
http://example.com/directory/file2.txt
http://example.com/directory/file3.txt
http://example.com/directory/file4.txt
.
.
http://example.com/directory/file199.txt
http://example.com/directory/file200.txt
Can anyone help me with this using shell scripting? Here is what I'm using, but it only downloads the first file.
for i in {1..200}
do
exec wget http://example.com/directory/file$i.txt;
done

wget http://example.com/directory/file{1..200}.txt
should do it. That expands to wget http://example.com/directory/file1.txt http://example.com/directory/file2.txt ....
Alternatively, your current code should work fine if you remove the call to exec, which is unnecessary and doesn't do what you seem to think it does: exec replaces the shell process with wget, so the loop never reaches its second iteration.
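For reference, here is the loop from the question with exec removed (otherwise unchanged):
#!/bin/bash
# Same loop as in the question, minus `exec`, so wget runs once per iteration
# instead of replacing the shell on the first pass.
for i in {1..200}
do
    wget "http://example.com/directory/file$i.txt"
done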

To download a list of files you can use wget -i <file>, where <file> is the name of a file containing the list of URLs to download.
For more details you can review the man page: man wget
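If you prefer that route, a minimal sketch (the urls.txt name is just an example):
# Generate the list of URLs, one per line, then hand the list to wget.
for i in {1..200}
do
    echo "http://example.com/directory/file$i.txt"
done > urls.txt
wget -i urls.txt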

Related

How to download the contents of a web page on to a .txt file using wget in Linux command line?

The page that I'm trying to download from (made up contact info):
http://www.filltext.com/?rows=1000&nro={index}&etunimi={firstName}&sukunimi={lastName}&email={email}&puhelinnumero={phone}&pituus={numberRange|150,200}&syntymaaika={date|10-01-1950,30-12-1999}&postinumero={zip}&kaupunki={city}&maa={country}&pretty=true
The command that I have been using (I have tried a lot of different options etc.):
wget -r -O -F [filename] URL
It works in the sense that it downloads the web page content to the file, but instead of being the raw data that is inside the cells, it's just a bunch of curly brackets.
How do I download the actual raw data instead of the JSON file? Any help would be very much appreciated!
Thank you.
Do you want something like wget google.com -q -O - ?
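Building on that comment: -O - writes the fetched document to standard output rather than to a file. One thing to watch is that the URL contains & characters, so it has to be quoted or the shell will split the command. A minimal sketch (the shortened URL and output filename are just examples):
# Quote the URL so the shell does not interpret the & characters;
# -q silences progress output, -O - sends the response body to stdout.
wget -q -O - 'http://www.filltext.com/?rows=1000&pretty=true' > output.json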

How to download multiple links into a folder via wget in linux

When I want to download a file in a folder in Linux via wget I use the following:
wget -P /patch/to/folder/ http://example.com/file.zip
Now let's say I want to download several files into the same folder, and my URLs are:
http://example.com/file1.zip
http://example.com/file2.zip
http://example.com/file3.zip
How can I achieve this with one command, downloading into the same folder /patch/to/folder/?
Thank you.
You can just append more URLs to your command:
wget -P /patch/to/folder/ http://example.com/file1.zip http://example.com/file2.zip http://example.com/file3.zip
If they only differ by number, you can have bash brace expansion automatically generate all the arguments before wget runs:
wget -P /patch/to/folder/ http://example.com/file{1..3}.zip
You can tell that this is possible from the invocation synopsis in man wget, where the ... is a convention meaning it can accept multiple URLs at a time:
SYNOPSIS
wget [option]... [URL]...
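If the URLs do not follow a numeric pattern, you can also combine -P with -i and a plain text file of URLs, as described in the first question above (the urls.txt name is just an example):
# urls.txt contains one URL per line:
#   http://example.com/file1.zip
#   http://example.com/file2.zip
#   http://example.com/file3.zip
wget -P /patch/to/folder/ -i urls.txt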

LFS version 7.8, wget is not working

I was trying to build the LFS project, following version 7.8 of the book, but I'm stuck because wget is not working.
When I execute the command
wget --input-file=wget-list --continue --directory-prefix=$LFS/sources
it returns the error:
wget-list: No such file or directory
No URLs found in wget-list.
I have created $LFS/sources directory.
Kindly let me know what I can do to get over this. Any help is appreciated.
You need to give the path to the file with --input-file, in this case wget-list, which you can get from the LFS website: http://www.linuxfromscratch.org/lfs/view/stable/wget-list
Once it is in your current directory, you can run:
wget --input-file=wget-list --continue --directory-prefix=$LFS/sources
It seems you don't have a file called wget-list in the current directory where you run the wget command.
The other possibility is that the wget-list file doesn't contain the URLs in a form wget can read.
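As a quick sanity check of the first point, you can verify the file is present before rerunning the command (a minimal sketch):
# Run this from the directory where you invoke wget.
if [ -f wget-list ]; then
    wget --input-file=wget-list --continue --directory-prefix=$LFS/sources
else
    echo "wget-list not found in $(pwd); download it from the LFS website first." >&2
fi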
I had this problem too, and I solved it like this:
First, save wget-list in /mnt/lfs/sources with this command:
sudo wget --input-file="http://www.linuxfromscratch.org/lfs/downloads/7.7/wget-list" --continue --directory-prefix=/mnt/lfs/sources
Then use this command to download all the files:
sudo wget -i /mnt/lfs/sources/wget-list --directory-prefix=$LFS/sources
The command that you are executing cannot find the input file. The file should be placed in the same directory from which you are trying to execute the command. Alternatively, you can simply execute the following command to fetch all the packages:
sudo wget --input-file="http://www.linuxfromscratch.org/lfs/downloads/7.8/wget-list" --continue --directory-prefix=$LFS/sources
I had a similar problem where it told me it couldn't find any of the URLs, yet it downloaded one file. The solution was to run the following command:
wget -nc --input-file="http://www.linuxfromscratch.org/lfs/view/stable/wget-list" --continue --directory-prefix=$LFS/sources
The -nc option is "no clobber", which stops it from downloading the same file twice; that was the problem I was having.

How can I write a bash script that pulls all of the http://whatever.com/something out of a file

I am migrating a site in PHP and someone has hardcoded all the links into a function call display image('http://whatever.com/images/xyz.jpg').
I can easily use TextMate to convert all of these to http://whatever.com/images/xyz.jpg.
But what I also need to do is bring the images down with it, for example with wget -i images.txt.
So I need to write a bash script to compile images.txt with all the links, to save me doing this manually, because there are a lot of them!
Any help you can give on this is greatly appreciated.
I found a one-liner that should work (replace index.php with your source file):
wget `cat index.php | grep -P -o 'http:(\.|-|\/|\w)*\.(gif|jpg|png|bmp)'`
If you fetch the file via a web server, won't you get the output of the PHP script? That will contain img tags, which you can extract using xml_grep or a similar tool.
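Putting that together with the wget -i images.txt step from the question, a two-step version might look like this (same pattern as the one-liner above; sort -u just removes duplicates):
# Collect every image URL from the PHP source into images.txt, one per line.
grep -P -o 'http:(\.|-|\/|\w)*\.(gif|jpg|png|bmp)' index.php | sort -u > images.txt
# Then download them all.
wget -i images.txt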

wget .listing file, is there a way to specify the name of it

OK, so I need to run wget, but I'm prohibited from creating 'dot' files in the location where I need to run it. So my question is: can I get wget to use a name other than .listing, one that I can specify?
Further clarification: this is to sync/mirror an FTP folder with a local one, so using the -O option is not really useful, as I require all the files to maintain their format.
You can use the -O option to set the output filename, as in:
wget -O file http://stackoverflow.com
You can also use wget --help to get a complete list of options.
For folks who come along afterwards and are surprised by an answer to the wrong question, here is a copy of one of the comments below:
@FelixD, yes, I unfortunately misunderstood the question. Looking at the code for wget version 1.19 (Feb 2017), specifically ftp.c, it appears that the .listing file is hardcoded in the macro LIST_FILENAME, and no override is possible. There are probably better options for mirroring FTP sites - maybe take a look at lftp and its mirror command, which also includes parallel downloads: lftp.yar.ru
@Paul: You can use the -O option specified by spong
No. You can't do this.
wget/src/ftp.c
/* File where the "ls -al" listing will be saved. */
#ifdef MSDOS
#define LIST_FILENAME "_listing"
#else
#define LIST_FILENAME ".listing"
#endif
I have the same problem:
wget seems to save the .listing file in the current directory where wget was called from, regardless of -O path/output_file.
As an ugly/desperate workaround, we can try running wget from random directories:
cd /temp/random_1; wget ftp://example.com/ -O /full/save_path/to_file_1.txt
cd /temp/random_2; wget ftp://example.com/ -O /full/save_path/to_file_2.txt
Note: the manual says that using the --no-remove-listing option will cause it to create .listing.1, .listing.2, etc., so that might be an option to avoid conflicts.
Note: the .listing file is not created at all if the FTP login fails.
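For completeness, the lftp mirror alternative mentioned in the copied comment above might look roughly like this; the host, credentials, and paths are placeholders, and since lftp is not wget, its mirroring does not involve wget's .listing handling at all:
# Mirror a remote FTP directory into a local one, downloading up to 4 files in parallel.
lftp -e "mirror --parallel=4 /remote/dir /local/dir; quit" ftp://user:password@ftp.example.com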
