Possible to add current date to rsync working path? - linux

I have an hourly rsync cron task that adds new files to the backup server. The directory structure is /myfiles/year/month/date, where year, month and date are the actual dates of the files. The cron task is defined as a file in /etc/cron.d.
The problem is that I have to point rsync at the "root" /myfiles directory to make it replicate my folder structure in the backup location as each new day's directory appears. The number of files is substantial, up to 1000 files a day, so rsync has to iterate through a whole year of files to build its copy list even though I only need to copy today's files. As of April, a run takes ~25 minutes even with the --ignore-existing option.
Can someone help me create a script, or whatever works, to add the current year, month and date to the rsync working path in the cron task? The final result should look like this:
0 * * * * root rsync -rt --ignore-existing /myfiles/2020/04/26 user@myserver:/myfiles/2020/04/26
where /2020/04/26 is the variable part that changes every day.
I have very limited experience with *nix systems, so I feel this should be possible but have basically no clue where to start.

To add the current date to the path you can use the date utility or the printf builtin of the bash shell.
Using date
echo "/myfiles/$(date +%Y/%m/%d)"
Using printf
echo "/myfiles/$(printf '%(%Y/%m/%d)T')"
In your case, when using the printf builtin you need to set the shell to bash in the cron entry, since %(...)T is a bash extension.
0 * * * * root rsync -rt --ignore-existing "/myfiles/$(printf '\%(\%Y/\%m/\%d)T')" "user@myserver:/myfiles/$(printf '\%(\%Y/\%m/\%d)T')"
Using date, either define the PATH to include the directory where the date utility lives, or just use an absolute path:
0 * * * * root rsync -rt --ignore-existing "/myfiles/$(/bin/date +\%Y/\%m/\%d)" "user@myserver:/myfiles/$(/bin/date +\%Y/\%m/\%d)"
The date syntax should work on both GNU and BSD date.
The % needs to be escaped inside the cron entry.
See the local documentation in your cron(5) man page on how to add the PATH and SHELL variables. Normally SHELL=/bin/bash and PATH=/sbin:/bin:/usr/sbin:/usr/bin will do.
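An alternative that avoids the % escaping entirely is to build the path in a small wrapper script and call that from cron. A minimal sketch (the echo stands in for the real rsync call; user@myserver and /myfiles are the question's placeholders):

```shell
#!/bin/bash
# daily-rsync.sh -- build today's path once and reuse it for source and target
today=$(date +%Y/%m/%d)                  # e.g. 2020/04/26
src="/myfiles/$today"
dest="user@myserver:/myfiles/$today"
# swap the echo for the real transfer once the paths look right:
echo rsync -rt --ignore-existing "$src" "$dest"
```

The cron entry then shrinks to `0 * * * * root /usr/local/bin/daily-rsync.sh`, with no % characters to escape.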

Related

Renaming and moving file using crontab

I want to make a backup of my history.txt file, where I store some information of my system. I would like to do this using the crontab and this is what I have now in my crontab:
0 * * * * cp -a history.txt "history-$(date + "%Y%m%d-%h%m%s")" ; mv "history-$(date + "%Y%m%d-%h%m%s")" l-systems
As you can see, I want to perform the backup every hour and give the file a name with the date. First I make a cp of the file and rename it; after that I try to move the new file into a directory called l-systems. This doesn't work right now, can someone help?
I would advise you to make a backup shell script like so:
#!/bin/sh
# note: no space after +, and %H%M%S for hour-minute-second
# (the original "date + "%Y%m%d-%h%m%s"" fails because of the space,
# and %h/%m/%s are month name, month and epoch seconds, not time of day)
DATE=$(date +"%Y%m%d-%H%M%S")
cp -a history.txt "history-$DATE"
mv "history-$DATE" l-systems
Then call this script from crontab. Since we store the date in a variable and use it twice, both commands see the same name regardless of how long each takes.
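If the intermediate rename is not needed, the copy can go straight into the target directory in one step. A self-contained sketch (it runs in a throwaway temp directory and creates a sample history.txt, so nothing real is touched):

```shell
#!/bin/sh
# demo in a throwaway directory so the sketch is safe to run anywhere
cd "$(mktemp -d)" || exit 1
echo demo > history.txt
stamp=$(date +%Y%m%d-%H%M%S)      # note %H%M%S, not %h%m%s
mkdir -p l-systems                # create the target if it is missing
cp -a history.txt "l-systems/history-$stamp"
ls l-systems
```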

mysqldump problem in Crontab and bash file

I have created a cron tab to backup my DB each 30 minutes...
*/30 * * * * bash /opt/mysqlbackup.sh > /dev/null 2>&1
The crontab works well: every 30 minutes I get my backup from the script below.
#!/bin/sh
find /opt/mysqlbackup -type f -mtime +2 -exec rm {} +
mysqldump --single-transaction --skip-lock-tables --user=myuser --password=mypass mydb | gzip -9 > /opt/mysqlbackup/$(date +%Y-%m-%d-%H.%M)_mydb.sql.gz
But my problem is that the rm command to delete old data isn't working; nothing is ever deleted. Do you know why?
and also... the name of my backup is 2020-02-02-12.12_mydb.sql.gz?
I always have a ? at the end of my file name.. Do you know why ?
Thank you for your help
The question mark typically indicates a character that can't be displayed; the fact that it's at the end of a line makes me think that your script has Windows line endings rather than Unix. You can fix that with the dos2unix command:
dos2unix /path/to/script.sh
It's also good practice not to throw around MySQL passwords on the CLI or store them in executable scripts. You can accomplish this by using MySQL Option files, specifically the file that defines user-level options (~/.my.cnf).
This would require us to figure out which user is executing that cronjob, however. My assumption is that you did not make that definition inside the system-level crontab; if you had, you'd actually be trying to execute /opt/mysqlbackup.sh > /dev/null 2>&1 as the user bash. This user most likely doesn't (and shouldn't) exist, so cron would fail to execute the script entirely.
As this is not the case (you say it's executing the mysqldump just fine), this makes me believe you have the definition in a user-level crontab instead. Once we figure out which user that actually is as I asked for in my comment, we can identify the file permissions issue as well as create the aforementioned MySQL options file.
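For reference, the user-level option file mentioned above could look something like this (the user and password are the question's placeholders; the file should be chmod 600 so only its owner can read it):

```ini
# ~/.my.cnf -- read automatically by mysqldump and the mysql client
[client]
user=myuser
password=mypass
```

With that in place, the script can call mysqldump with no credentials on the command line, e.g. `mysqldump --single-transaction --skip-lock-tables mydb`.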
Using find with mtime is not the best choice. If for some reason mysqldump stops creating backups, then in two days all backups will be deleted.
You can use my Python script "rotate-archives" for smart delete backups. (https://gitlab.com/k11a/rotate-archives). The script adds the current date at the beginning of the file or directory name. Like 2020-12-31_filename.ext. Subsequently uses this date to decide on deletion.
Running a script on your question:
rotate-archives.py test_mode=off age_from-period-amount_for_last_timeslot=0-0-48 archives_dir=/mnt/archives
In this case, 48 new archives will always be saved. Old archives in excess of this number will be deleted.
An example of more flexible archives deletion:
rotate-archives.py test_mode=off age_from-period-amount_for_last_timeslot=7-5,31-14,365-180-5 archives_dir=/mnt/archives
As a result, archives from 7 to 30 days old will remain with an interval of 5 days between archives; archives from 31 to 364 days old with an interval of 14 days; and archives from 365 days old with an interval of 180 days, keeping at most 5 of them.
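The concern above can also be addressed without extra tooling by pruning only when a fresh backup actually exists. A sketch of the guard (run against a temp directory here so it is self-contained; `touch -d` is GNU coreutils):

```shell
#!/bin/bash
# prune old backups only if at least one backup from the last day exists
dir=$(mktemp -d)
touch "$dir/recent.sql.gz"               # stands in for last night's dump
touch -d '5 days ago' "$dir/old.sql.gz"  # stands in for a stale dump
if find "$dir" -type f -mtime -1 | grep -q .; then
    find "$dir" -type f -mtime +2 -exec rm {} +
fi
```

If mysqldump silently stops producing dumps, the -mtime -1 test finds nothing and the old backups are left alone.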

Cron job to SFTP files in a directory

Scenario: I need to create a cron job that scans through a directory and sftps each file to another machine
bash script : /home/user/sendFiles.sh
cron interval : 1 minute
directory: /home/user/myfiles
sftp destination: 10.10.10.123
Create the cron job
crontab -u user 1 * * * /home/user/sendFiles.sh
Create the Script
#!/bin/bash
/usr/bin/scp -r user@10.10.10.123:/home/user/myfiles .
#REMOVE FILES AFTER ALL HAVE BEEN SENT
rm -rf *
Problem: Not exactly sure if that cron tab is correct or how to sftp an entire directory with the cron tab
If it is going to be executed from a cronjob, I'm assuming the goal is to keep the directory in sync.
In that case, I would use rdiff-backup to make an incremental backup. That way, only the things that changed get transferred.
It still uses ssh to transfer the data, but with rdiff-backup instead of a plain scp. The major benefit of doing it this way is speed: it is faster to transfer only the parts that have changed.
This is the command to do a copy over ssh using rdiff-backup:
rdiff-backup /some/local-dir hostname.net::/whatever/remote-dir
Add that to a cronjob, making sure the user that runs the rdiff-backup has ssh keys set up, so it does not require a password.
(Read about ssh keys here: http://www.linuxproblem.org/art_9.html. Once they are set up, try a regular ssh to see if you can log in without a password.)
Something like this:
* * * * * rdiff-backup /some/local-dir hostname.net::/whatever/remote-dir
will do the copy every minute. (Your example, 1 * * * *, would execute at the first minute of each hour; that is, once an hour instead of once a minute.)
Keep in mind this can cause problems if the transfer is huge and hasn't finished before the next run starts. But I guess you want to transfer files that are not so huge, or your network is fast. Otherwise, change it to run every 5 minutes by using */5 * * * * instead.
And finally, read more about rdiff-backup here : http://www.nongnu.org/rdiff-backup/examples.html
rdiff-backup is a good option, there is also rsync
rsync -az user@10.10.10.123:/home/user/myfiles .
I notice you are also deleting files, is this simply because you don't want to recopy them? rsync will only copy updated files.
You might also be interested in unison which does a two way sync
These are both good answers; if you stick with scp you may want to make a slight change to your script:
#!/bin/bash
/usr/bin/scp -r user@10.10.10.123:/home/user/myfiles .
#REMOVE FILES AFTER ALL HAVE BEEN SENT
cd /home/user/myfiles # make sure you're in the right directory before rm
rm -rf *
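Another defensive tweak is to delete only when the copy actually succeeded, and never run a bare rm -rf *. A self-contained sketch of the pattern (cp stands in for scp so it runs without a network):

```shell
#!/bin/bash
# delete source files only if the transfer command exited successfully
src=$(mktemp -d); dst=$(mktemp -d)
echo data > "$src/file1"
if cp -r "$src/." "$dst/"; then
    rm -rf "${src:?}"/*            # ${src:?} aborts if the variable is empty
fi
```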

Keep files updated from remote server

I have a server at hostname.com/files. Whenever a file has been uploaded I want to download it.
I was thinking of creating a script that constantly checked the files directory. It would check the timestamp of the files on the server and download them based on that.
Is it possible to check the files timestamp using a bash script? Are there better ways of doing this?
I could just download all the files in the server every 1 hour. Would it therefore be better to use a cron job?
If you have a regular interval at which you'd like to update your files, yes, a cron job is probably your best bet. Just write a script that does the checking and run that at an hourly interval.
As #Barmar commented above, rsync could be another option. Put something like this in the crontab and you should be set:
# min hour day month day-of-week user command
17 * * * * user rsync -av http://hostname.com/ >> rsync.log
would grab files from the server in that location and append the details to rsync.log on the 17th minute of every hour. Right now, though, I can't seem to get rsync to get files from a webserver.
Another option, using wget, is:
wget -Nrb -np -o wget.log http://hostname.com/
where -N re-downloads only files newer than the timestamp on the local version, -r recurses into directories, -np keeps it from ascending into the parent directory (so it only spiders that part of the server's content), -b sends the process to the background, and -o specifies a log file. This works against an arbitrary web server.
More details, as usual, will be in the man pages of rsync or wget.
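Tying this to cron, an hourly entry for the wget variant might look like the following (the mirror directory is an assumption; -b is dropped because cron already runs the job unattended):

```
# min hour day month day-of-week  command
0 * * * * cd /home/user/mirror && wget -Nr -np -o wget.log http://hostname.com/
```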

Would this Cron job be possible?

I am using Red Hat Linux 5 and my application is a Java EE application.
We allow users to upload pictures on our website.
These pictures are stored inside a folder on our server.
Now my question is: on a daily basis, at a particular time, I want to move all the images from that folder into another folder, where the folder name is the day they were moved.
Please let me know if this is possible.
Thank you very much
man cron
man crontab
Write a small bash script with your desired behaviour, then add it to your crontab, or however cron jobs are set up in your distribution. (I'm using Arch Linux, so I don't want to give specific instructions, because of differences between distributions...)
Or use a java cron implementation and write everything in java.
You will have to create a cron job to do so, as well as a shell script.
In cron:
# At minute 1 of hour 1, every day, run the script
1 1 * * * /scripts/move_images
In /scripts/move_images:
#!/bin/bash
# Pick the date (YYYY-MM-DD)
date=$(date +%Y-%m-%d)
# Create the new dir
mkdir -p "/local_of_new_folder/$date"
# Move all images from the old folder to the new folder
mv /old_folder/* "/local_of_new_folder/$date"
Make the script executable:
chmod +x /scripts/move_images
Sorry about my English, I'm Brazilian :)
