Download online XML file using cron

I am very new to using cron jobs, and I was wondering if there is a way to download this XML file 'http://xml.weather.yahoo.com/forecastrss?p=75039' and save it to a folder on my hosting, somewhere like '/urban/test-page/'. I have no idea if this is even possible since the file is on another server, so I am unsure if I can do it this way, or if I should just use a VBScript to accomplish this task.
Thanks for any information that might help me with this.

You can use wget with the -P option to specify the directory to save into:
wget 'http://xml.weather.yahoo.com/forecastrss?p=75039' -P /urban/test-page/
(The URL is quoted because of the '?', which the shell could otherwise try to expand.) To specify the filename as well, use -O with the full path; note that -O takes its own path and ignores -P, e.g.:
wget 'http://xml.weather.yahoo.com/forecastrss?p=75039' -O /urban/test-page/forecast.xml
You will need to make sure that the job is run by a user with permissions for the folder you want to save to.
To set this up as a cron job, running once a day at 1 PM, add this crontab entry:
0 13 * * * wget 'http://xml.weather.yahoo.com/forecastrss?p=75039' -P /urban/test-page/
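If you also want a fixed filename and a record of failures, here is a minimal sketch of a crontab entry (the forecast.xml name and the wget.log path are examples, not from the original question):
0 13 * * * wget -nv 'http://xml.weather.yahoo.com/forecastrss?p=75039' -O /urban/test-page/forecast.xml >> /urban/test-page/wget.log 2>&1
-nv keeps the output to one line per download while still logging any errors.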

Related

Monthly script for copying a file from one server to another if it exists

Currently I follow these steps for copying a file from server_x to server_y (let's suppose I run this code on 20th April 2021 = 202104):
sshpass -p "my_server_y_password" ssh myusernameY@server_y
sshpass -p "my_server_x_password" scp myusernameX@server_x:server_x/file/path/file_202104.txt server_y/file/path
I want this code to run on the 20th of every month and also to use the YYYYMM date format when looking for the file. If the file does not exist, it shouldn't do anything.
I think the code should look like this (inside my_bash_script.sh):
year_month="$(date +'%Y%m')"
FILE="file/path/file_${year_month}.txt"
# somehow check if the file exists on server_x (currently I don't know how to do that on a different server)
if test -f "$FILE"; then
sshpass -p "my_server_y_password" ssh myusernameY@server_y
sshpass -p "my_server_x_password" scp "myusernameX@server_x:${FILE}" server_y/file/path
fi
And, after this, I should run my script
* * 20 * * my_bash_script.sh
I am sure there are much more elegant solutions to my problem. Also, I think some parts of my code don't even work properly. Please help me find a proper solution for my task.
Edit reasons: I replaced my former code with the crontab command as suggested by @looppool
Yes, there are much more elegant solutions to this. Use cron. See this answer as an example. Simply adjust the day-of-month field to suit:
* * 20 * * my_bash_script.sh
(with * in the minute and hour fields this runs every minute of the 20th; something like 0 9 20 * * my_bash_script.sh runs once, at 09:00 on the 20th).
For your updated question, you don't have to check for yourself whether a file is there or whether it has been updated. Simply use the rsync command to copy a file from one server to another; it will only update the files which have changed. rsync also understands the ssh protocol, and you'll easily find a lot of examples on the internet of how to do that.
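For instance, a hedged sketch of an rsync pull over ssh, reusing the user names, paths and sshpass usage from the question (all of them placeholders):
# Pull this month's file from server_x over ssh; rsync copies it only if it is new or has changed.
# If the file does not exist yet on server_x, rsync just reports an error and copies nothing.
sshpass -p "my_server_x_password" rsync -av -e ssh "myusernameX@server_x:server_x/file/path/file_$(date +'%Y%m').txt" server_y/file/path/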
However, how can I check the existence of a file on a remote server?
No, as mentioned above, you don't need to do that yourself, unless you want to do extra operations on existing files.
If you just want to sync files, then just simply try:
rsync path/to/local_file remote_host:path/to/remote_directory
Let me know if it can work for you.
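If you do want the explicit check, one hedged approach is to run test -f on the remote side over ssh and use its exit status (host names, passwords and paths below are the placeholders from the question):
#!/bin/bash
# Build the YYYYMM file name for the current month
year_month="$(date +'%Y%m')"
FILE="server_x/file/path/file_${year_month}.txt"
# test -f runs on server_x; ssh passes its exit status back,
# so the copy only happens when the file exists remotely
if sshpass -p "my_server_x_password" ssh myusernameX@server_x "test -f '${FILE}'"; then
    sshpass -p "my_server_x_password" scp "myusernameX@server_x:${FILE}" server_y/file/path
fi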

uploading file to google-drive using gdrive is not working on crontab

I wrote a backup script for my computer. The backup scenario is like this:
All directories under root are bundled into a tar.gz twice a day (at 3 AM and 12 noon), and this archive is uploaded to Google Drive using the gdrive app every day at 3 AM.
Here is the script:
#!/bin/bash
#Program: arklab backup script version 2.0
#Author: namil son
#Last modified date: 160508
#Contact: 21100352#handong.edu
#It should be executed as a super user
export LANG=en
MD=`date +%m%d`
TIME=`date +%y%m%d_%a_%H`
filename=`date +%y%m%d_%a_%H`.tar.gz
HOST=$HOSTNAME
backuproot="/local_share/backup/"
backup=`cat $backuproot/backup.conf`
gdriveID="blablabla" #This argument should be manually substituted to google-drive directory ID for each server.
#Start a new backup period at January first and June first.
if [ $MD = '0101' -o $MD = '0601' ]; then
mkdir $backuproot/`date +%y%m`
rm -rf $backuproot/`date --date '1 year ago' +%y%m`
echo $backuproot/`date +%y%m` > $backuproot/backup.conf #Save directory name for this period in backup.conf
backup=`cat $backuproot/backup.conf`
gdrive mkdir -p $gdriveID `date +%y%m` > $backup/dir
awk '{print $2}' $backup/dir > $backup/dirID #save the new folder's ID where the upload step reads it
rm -f $backup/dir
fi
#make tar ball
tar -g $backup/snapshot -czpf $backup/$filename / --exclude=/tmp/* --exclude=/mnt/* --exclude=/media/* --exclude=/proc/* --exclude=/lost+found/* --exclude=/sys/* --exclude=/local_share/backup/* --exclude=/home/* \
--exclude=/share/*
#upload backup file using gdrive under the path written in dirID
if [ `date +%H` = '03' ]; then
gdrive upload -p `cat $backup/dirID` $backup/$filename
gdrive upload -p `cat $backup/dirID` $backup/`date --date '15 hour ago' +%y%m%d_%a_%H`.tar.gz
fi
Here is the problem!
When I run this script from crontab, it works pretty well except for uploading the tarball to Google Drive, even though the whole script works perfectly when I run it manually. Only the upload step fails when it is run from crontab!
Can anybody help me?
Crontab entry is like this:
0 3,12 * * * sh /local_share/backup/backup2.0.sh &>> /local_share/backup/backup.sh.log
I had the same case.
This is my solution:
Change your gdrive command to the absolute path of the gdrive binary.
Example:
Don't set the cron job like this:
0 1 * * * gdrive upload abc.tar.gz
Use the absolute path:
0 1 * * * /usr/local/bin/gdrive upload abc.tar.gz
It will work perfectly.
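As a hedged aside (not part of the original answer): you can locate the binary with which gdrive and hard-code that path, or set PATH at the top of the crontab so command names resolve as they do in your login shell. The paths below are examples and may differ on your system:
#give cron a PATH that includes the directory where gdrive lives
PATH=/usr/local/bin:/usr/bin:/bin
0 3,12 * * * sh /local_share/backup/backup2.0.sh &>> /local_share/backup/backup.sh.log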
I had the exact same issue with minor differences. I'm using gdrive on a CentOS system. Setup was fine. As root, I set up gdrive. From the command line, 'drive list' worked fine. I used the following blog post to set up gdrive:
http://linuxnewbieguide.org/?p=1078
I wrote a PHP script to do a backup of some directories. When I ran the PHP script as root from the command line, everything worked and uploaded to Google Drive just fine.
So I threw:
1 1 * * * php /root/my_backup_script.php
Into root's crontab. The script executed fine, but the upload to Google Drive wasn't working. I did some debugging, and the line:
drive upload --file /root/myfile.bz2
Just wasn't working. The only command-line return was a null string. Very confusing. I'm no unix expert, but I thought when crontab runs as a user, it runs as a user (in this case root). To test, I did the following, and this is very insecure and not recommended:
I created a file with the root password at /root/.rootpassword
chmod 500 .rootpassword
Changed the crontab line to:
1 1 * * * cat /root/.rootpassword | sudo -kS php /root/my_backup_script.php
And now it works, but this is a horrible solution, as the root password is stored in a plain text file on the system. The file is readable only by root, but it is still a very bad solution.
I don't know why (again, no unix expert) I have to have root's crontab run the command via sudo to make this work. I know the issue is with the gdrive token generated during gdrive setup. When crontab runs, the token does not match and the upload fails. But when you have crontab sudo as root and run the PHP script, it works.
I have thought of a possible solution that doesn't require storing the root password in a text file on the system. I am tired right now and haven't tried it. I have been working on this issue for about 4 days, trying various Google Drive backup solutions... all failing. It basically goes like this:
Run the gdrive setup all within the PHP/Apache interpreter. This will (perhaps) set the gdrive token to apache. For example:
Create a PHP script at /home/public_html/gdrive_setup.php. This file needs to step through the entire gdrive and token setup.
Run the script in a browser, get gdrive and the token all set up.
Test gdrive, write a PHP script something like:
$cmd = exec("drive list");
echo $cmd;
Save as gdrive_test.php and run in a browser. If it outputs your google drive files, it's working.
Write up your backup script in php. Put it in a non-indexable web directory and call it something random like 2DJAj23DAJE123.php
Now whenever you pull up 2DJAj23DAJE123.php in a web browser your backup should run.
Finally, edit crontab for root and add:
1 1 * * * wget http://my-website.com/non-indexable-directory/2DJAj23DAJE123.php >/dev/null 2>&1
In theory this should work. No passwords are stored. The only security hole is someone else might be able to run your backup if they executed 2DJAj23DAJE123.php.
Further checks could be added, like checking the system time at the start of 2DJAj23DAJE123.php and make sure it matches the crontab run time, before executing. If the times don't match, just exit the script and do nothing.
The above is all theory and not tested. I think it should work, but I am very tired from this issue.
I hope this was helpful and not overly complicated, but Google Drive IS complicated since their switch over in authentication method earlier this year. Many of the posts/blog posts you find online will just not work.
Sometimes the problem occurs because of the gdrive config path: gdrive cannot find its default configuration, so to avoid such problems we add the --config flag:
gdrive upload --config /home/<you>/.gdrive -p <google_drive_folder_id> /path/to/file_to_be_uploaded
Source: GDrive w/ CRON
I had the same issue and fixed it by specifying where the drive command file is.
Ex:
/usr/sbin/drive upload --file xxx..

wget to download new wildcard files and overwrite old ones

I'm currently using wget to download specific files from a remote server. The files are updated every week, but always have the same file names, e.g. a newly uploaded file1.jpg will replace the local file1.jpg.
This is how I am grabbing them, nothing fancy:
wget -N -P /path/to/local/folder/ http://xx.xxx.xxx.xxx/remote/files/file1.jpg
This downloads file1.jpg from the remote server if it is newer than the local version, and overwrites the local one with the new one.
Trouble is, I'm doing this for over 100 files every week and have set up cron jobs to fire the 100 different download scripts at specific times.
Is there a way I can use a wildcard for the file name and have just one script that fires every 5 minutes for example?
Something like....
wget -N -P /path/to/local/folder/ http://xx.xxx.xxx.xxx/remote/files/*.jpg
Will that work? Will it check the local folder for all current file names, see what is new and then download and overwrite only the new ones? Also, is there any danger of it downloading partially uploaded files on the remote server?
I know that some kind of file sync script between servers would be a better option but they all look pretty complicated to set up.
Many thanks!
A literal wildcard like *.jpg won't work over HTTP (wget only expands wildcards for FTP URLs), but you can list the files to be downloaded one by one in a text file and pass that file name using the -i or --input-file option.
e.g. contents of list.txt:
http://xx.xxx.xxx.xxx/remote/files/file1.jpg
http://xx.xxx.xxx.xxx/remote/files/file2.jpg
http://xx.xxx.xxx.xxx/remote/files/file3.jpg
....
then
wget .... --input-file list.txt
Alternatively, if all your *.jpg files are linked from a particular HTML page, you can use recursive downloading, i.e. let wget follow links on your page to all linked resources. You might need to limit the "recursion level" and file types in order to prevent downloading too much. See wget --help for more info.
wget .... --recursive --level=1 --accept=jpg --no-parent http://.../your-index-page.html
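Putting the pieces together, a hedged sketch of a single cron entry that re-checks every URL in the list every 5 minutes (the directory and file paths are placeholders carried over from the question):
#every 5 minutes: re-download only the files whose remote copy is newer than the local one
*/5 * * * * wget -N -P /path/to/local/folder/ --input-file /path/to/list.txt -o /path/to/wget.log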

Keep files updated from remote server

I have a server at hostname.com/files. Whenever a file has been uploaded I want to download it.
I was thinking of creating a script that constantly checked the files directory. It would check the timestamp of the files on the server and download them based on that.
Is it possible to check the files timestamp using a bash script? Are there better ways of doing this?
I could just download all the files in the server every 1 hour. Would it therefore be better to use a cron job?
If you have a regular interval at which you'd like to update your files, yes, a cron job is probably your best bet. Just write a script that does the checking and run that at an hourly interval.
As @Barmar commented above, rsync could be another option. Put something like this in the crontab and you should be set:
# min hour day month day-of-week user command
17 * * * * user rsync -av http://hostname.com/ >> rsync.log
This would grab files from that location on the server and append the details to rsync.log at the 17th minute of every hour. Right now, though, I can't seem to get rsync to fetch files from a web server (rsync needs an rsync daemon or ssh access on the remote side; it cannot pull from a plain HTTP URL).
Another option using wget is:
wget -Nrb -np -o wget.log http://hostname.com/
where -N re-downloads only files newer than the timestamp on the local version, -b sends the process to the background, -r recurses into directories (effectively spidering the server's content below the starting URL), -np makes sure it doesn't ascend into a parent directory, and -o specifies a log file. This works against an arbitrary web server.
More details, as usual, will be in the man pages of rsync or wget.
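As a hedged example, running that hourly from a user crontab could look like the entry below (-b is dropped because cron already runs the job non-interactively; the paths and hostname are placeholders):
0 * * * * wget -Nr -np -P /home/user/files -o /home/user/wget.log http://hostname.com/files/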

wget .listing file, is there a way to specify the name of it

OK, so I need to run wget, but I'm prohibited from creating 'dot' files in the location where I need to run it. So my question is: can I get wget to use a name other than .listing, one that I can specify?
Further clarification: this is to sync/mirror an FTP folder with a local one, so using the -O option is not really useful, as I require all files to keep their original names.
You can use the -O option to set the output filename, as in:
wget -O file http://stackoverflow.com
You can also use wget --help to get a complete list of options.
For folks that come along afterwards, and are surprised by an answer to the wrong question, here is a copy of one of the comments below:
@FelixD, yes, unfortunately I misunderstood the question. Looking at the code for wget version 1.19 (Feb 2017), specifically ftp.c, it appears that the .listing file name is hardcoded in the macro LIST_FILENAME, with no override possible. There are probably better options for mirroring FTP sites - maybe take a look at lftp and its mirror command, which also supports parallel downloads: lftp.yar.ru
@Paul: You can use that -O option specified by spong
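Since lftp's mirror command is mentioned above, here is a hedged sketch of what such a run could look like (host, credentials and paths are placeholders; lftp keeps directory listings in memory, so no .listing file is written to disk):
#mirror the remote FTP folder into a local one, copying only files newer than the local copies
lftp -u myuser,mypassword -e "mirror --only-newer /remote/folder /local/folder; bye" ftp.example.com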
No. You can't do this.
wget/src/ftp.c
/* File where the "ls -al" listing will be saved. */
#ifdef MSDOS
#define LIST_FILENAME "_listing"
#else
#define LIST_FILENAME ".listing"
#endif
I have the same problem;
wget seems to save the .listing file in the current directory where wget was called from, regardless of -O path/output_file.
As an ugly/desperate solution we can try to run wget from random directories:
cd /temp/random_1; wget ftp://example.com/ -O /full/save_path/to_file_1.txt
cd /temp/random_2; wget ftp://example.com/ -O /full/save_path/to_file_2.txt
Note: the manual says that using the --no-remove-listing option will cause it to create .listing.1, .listing.2, etc., so that might be an option to avoid conflicts.
Note: the .listing file is not created at all if the FTP login failed.
