How to append the wget downloaded file? - linux

I am running wget through a cron job to execute a script on a schedule. Every time, the output is downloaded and saved as a new file. I want to append the output to the same file instead. How can I do that?
I am talking about the content downloaded from the URL, not the log of the execution.

You can do it using the following command:
wget <URL> -O - >> <FILE_NAME>
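Here -O - tells wget to write the downloaded content to standard output, and the shell's >> appends that output to the file. A minimal sketch of how this might look in a crontab (the URL, file path, and schedule are placeholders, not taken from the question):
# Hypothetical crontab entry: fetch the page every hour and append it to one file
0 * * * * wget -q -O - https://example.com/report >> /home/user/report-archive.txt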

My first approach would be to download it to a temporary file, append the content of that new file to the previously downloaded file, and then delete the temporary file.
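A minimal sketch of that approach (the <URL> placeholder is from the answer above; the file paths are hypothetical):
wget -q <URL> -O /tmp/latest-download                     # download to a temporary file
cat /tmp/latest-download >> /path/to/combined-output.txt  # append to the accumulated file
rm /tmp/latest-download                                   # delete the temporary file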

Related

LFS version 7.8, wget is not working

I was trying to build the LFS project, following version 7.8 of the book, but I'm stuck because wget is not working.
When I execute the command
"wget --input-file=wget-list --continue --directory-prefix=$LFS/sources"
it returns an error:
"wget-list: No such file or directory
No URLs found in wget-list."
I have created the $LFS/sources directory.
Kindly let me know what I can do to get past this. Any help is appreciated.
You need to give --input-file the path to the file, in this case wget-list, which you can get from the LFS website: http://www.linuxfromscratch.org/lfs/view/stable/wget-list
Once wget-list is in your current directory, you can run:
"wget --input-file=wget-list --continue --directory-prefix=$LFS/sources"
It seems you don't have a file called wget-list in the current directory where you run the wget command.
The other possibility is that the wget-list file doesn't contain the URLs in a format wget can read.
I had this problem too, and I solved it by doing this:
First, save wget-list in /mnt/lfs/sources with this command:
sudo wget --input-file="http://www.linuxfromscratch.org/lfs/downloads/7.7/wget-list" --continue --directory-prefix=/mnt/lfs/sources
Then use this command to download all the files:
sudo wget -i /mnt/lfs/sources/wget-list --directory-prefix=$LFS/sources
The command you are executing cannot find the input file. The file should be in the same directory from which you run the command. Alternatively, you can simply run the following command to fetch all the packages:
sudo wget --input-file="http://www.linuxfromscratch.org/lfs/downloads/7.8/wget-list" --continue --directory-prefix=$LFS/sources
I had a similar problem where wget told me it couldn't find any of the URLs even though it downloaded one file. The solution was to run the following command:
wget -nc --input-file="http://www.linuxfromscratch.org/lfs/view/stable/wget-list" --continue --directory-prefix=$LFS/sources
The -nc option is "no clobber", which stops wget from downloading the same file twice, which was the problem I was having.

wget to download new wildcard files and overwrite old ones

I'm currently using wget to download specific files from a remote server. The files are updated every week but always have the same file names, e.g. a newly uploaded file1.jpg will replace the local file1.jpg.
This is how I am grabbing them, nothing fancy:
wget -N -P /path/to/local/folder/ http://xx.xxx.xxx.xxx/remote/files/file1.jpg
This downloads file1.jpg from the remote server if it is newer than the local version, then overwrites the local one with the new one.
Trouble is, I'm doing this for over 100 files every week and have set up cron jobs to fire the 100 different download scripts at specific times.
Is there a way I can use a wildcard for the file name and have just one script that fires every 5 minutes for example?
Something like....
wget -N -P /path/to/local/folder/ http://xx.xxx.xxx.xxx/remote/files/*.jpg
Will that work? Will it check the local folder for all current file names, see what is new and then download and overwrite only the new ones? Also, is there any danger of it downloading partially uploaded files on the remote server?
I know that some kind of file sync script between servers would be a better option but they all look pretty complicated to set up.
Many thanks!
Wildcards like *.jpg only work in wget for FTP URLs, not HTTP, so that command won't do what you hope. Instead, you can specify the files to be downloaded one by one in a text file, and then pass that file name using the option -i or --input-file.
e.g. contents of list.txt:
http://xx.xxx.xxx.xxx/remote/files/file1.jpg
http://xx.xxx.xxx.xxx/remote/files/file2.jpg
http://xx.xxx.xxx.xxx/remote/files/file3.jpg
....
then
wget .... --input-file list.txt
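A possible complete form, reusing the -N and -P flags from the question (the folder path is the question's placeholder):
wget -N -P /path/to/local/folder/ --input-file=list.txt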
Alternatively, if all your *.jpg files are linked from a particular HTML page, you can use recursive downloading, i.e. let wget follow the links on your page to all linked resources. You might need to limit the "recursion level" and file types to prevent downloading too much. See wget --help for more info.
wget .... --recursive --level=1 --accept=jpg --no-parent http://.../your-index-page.html

wget Downloading and replacing file only if target is newer than source

This is what I'm trying to achieve:
User uploads file1.jpg to Server A
Using wget, Server B downloads file1.jpg from Server A only if the file is newer than the one that already exists on Server B, and then replaces the file on Server B with the new one.
I know I can use:
wget -N http://www.mywebsite.com/files/file1.jpg
to check that the target file is newer than the source, but I'm a little confused about how to format the command to tell it what and where the actual source file is.
Is it something like this?
wget -N http://www.mywebsite.com/files/file1.jpg /serverb/files/file1.jpg
Cheers!
You can use the -P option to specify the directory where the file(s) will be downloaded:
$ wget -N -P /serverb/files/ http://www.mywebsite.com/files/file1.jpg
You also talk about downloading and replacing the file. Be aware that wget overwrites the file in place, so it is "broken" (partially written) while the download is in progress. I don't think you can do an atomic replacement of the file using only wget. You need a small script that downloads to a temporary file and then uses mv to atomically replace the file on Server B.
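A minimal sketch of that temp-file-and-move idea (the paths are illustrative; note that wget's -N timestamping does not combine usefully with -O, so this sketch always re-downloads):
# Download to a hidden temporary name, then mv it into place; on the same
# filesystem mv is atomic, so readers never see a half-written file.
wget -q -O /serverb/files/.file1.jpg.part http://www.mywebsite.com/files/file1.jpg \
  && mv /serverb/files/.file1.jpg.part /serverb/files/file1.jpg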

Using wget on a directory

I'm fairly new to shell scripting and I'm trying to use wget to download a .zip file from one directory to another. The only file in the directory I am copying from is the .zip file. However, when I run wget <IP address>/directory, it downloads an index.html file instead of the .zip. Is there something I am missing to get it to download the .zip without having to explicitly state it?
wget is a utility for downloading files from the web.
You mentioned you want to copy from one directory to another. Did you mean the directories are on the same server/node?
In that case you can simply use the cp command.
And if you want the file from another server/node (a file transfer), you can use scp or ftp.
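Hypothetical examples of both cases (the paths, host, and file name are placeholders):
cp /source/dir/archive.zip /dest/dir/                     # same machine
scp user@remote-host:/source/dir/archive.zip /dest/dir/   # different machine, over SSH
If you really do need to fetch it over HTTP with wget, a recursive download limited to zip files (wget --recursive --level=1 --accept=zip --no-parent <URL>/directory/) may work, assuming the index page the server returns links to the .zip.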

Download files using Shell script

I want to download a number of files which are as follows:
http://example.com/directory/file1.txt
http://example.com/directory/file2.txt
http://example.com/directory/file3.txt
http://example.com/directory/file4.txt
.
.
http://example.com/directory/file199.txt
http://example.com/directory/file200.txt
Can anyone help me with this using shell scripting? Here is what I'm using, but it downloads only the first file.
for i in {1..200}
do
exec wget http://example.com/directory/file$i.txt;
done
wget http://example.com/directory/file{1..200}.txt
should do it. That expands to wget http://example.com/directory/file1.txt http://example.com/directory/file2.txt ....
Alternatively, your current code should work fine if you remove the call to exec, which is unnecessary and doesn't do what you seem to think it does: exec replaces the current shell process with wget, so the loop never reaches a second iteration.
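The same loop with exec removed, for reference:
for i in {1..200}
do
    wget http://example.com/directory/file$i.txt
done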
To download a list of files you can use wget -i <file>, where <file> is the name of a file containing the list of URLs to download.
For more details you can review the help page: man wget
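A minimal sketch of that approach for this question (urls.txt is a hypothetical file name):
# Generate the 200 URLs into a list file, then hand the list to wget -i
for i in {1..200}; do echo "http://example.com/directory/file$i.txt"; done > urls.txt
wget -i urls.txt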
