Nightly backup and merging backups - Linux

I have a nightly backup script that takes any files modified on one server and then syncs them across to our backup server.
/var/backups/backup-2011-04-02/backuped/   <- the backed-up files and folders
The format above is the nightly incremental backup: everything is copied into a date-stamped folder, with the files themselves in a backuped/ folder underneath.
I'm thinking of a script that would run after the backup script and merge all the files in /var/backups/backup-2011-04-02/backuped/ into /var/www/live/documents.
So in theory I need to merge a number of different date-stamped folders from the backup into the live www directory on the backup server, in the right date order.
So what's the best way to go about this script?

You could run rsync on each backup directory into the destination, oldest first, so that the newest copy of each file wins:
$ for f in `ls -tr /var/backups`; do rsync -aL "/var/backups/$f/backuped/" /var/www/live/documents/; done
Of course you can put this line in a nightly cron job. The only thing to look out for is that the loop will choke if the filenames in your backup directory contain spaces, but it looks like yours don't, so you should be OK.
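If you wrap the loop in a small script, the cron entry stays readable. A minimal sketch, assuming the script is saved as /usr/local/bin/merge-backups.sh and that 03:30 is late enough for the nightly backup to have finished (both are assumptions):
# /etc/cron.d/merge-backups - run the merge nightly at 03:30 as root
30 3 * * * root /usr/local/bin/merge-backups.sh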

Related

How to find and remove partially transferred files after numerous failed rsync attempts

I have launched a few rsyncs over sshfs (SFTP) that have left temporary files behind.
Is there any way to clean up those files?
I don't want to re-run rsync with the --partial option, because there are many big files and it could take ages.
I tried to find them this way:
find . -name ".*.??????"
and it finds some temporary files, but I'm not 100% sure whether this pattern discovers all of them.
Is this solution sufficient?
You could run rsync again with both the --delete and --dry-run options, and perhaps with --itemize-changes. This would show you a list of all the changes that would be made; just take note of any deletions and ignore changed files. Unless your files have odd names, it should be obvious which are leftover rsync temp files and which are not.
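A sketch of such a dry run, with the local source and the sshfs mount point as assumed paths:
# -n changes nothing; lines starting with "*deleting" are cleanup candidates
rsync -avn --delete --itemize-changes /local/source/ /mnt/sshfs/target/ | grep '^\*deleting'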

How to archive files and subfolders in a location to another place in Linux

I am trying to create a shell script to copy folders, and the files within them, from one Linux machine to another. After copying I would like to delete only the files that were copied, retaining the folder structure as is.
E.g.
Machine X has a main folder named F with subfolders A, B, C, each containing 10 files.
I would like to make a copy such that machine Y has a folder named F with subfolders A, B, C containing the same files. Once the copy of all folders and files is complete, it should delete all the files in the source folder but retain the folders.
The code below is untested. Use with care and backup first.
Something like this should get you started:
#!/bin/bash
# Copy everything under $srcdir to the remote host, then delete the
# source files (but not the directories) once the copy has succeeded.
srcdir=...
set -ex   # abort on the first error and echo each command as it runs
rsync \
    --verbose \
    --recursive \
    "${srcdir}/" \
    user@host:/dstdir/
# Only reached if rsync exited successfully, thanks to set -e.
find "${srcdir}" -type f -delete
Set the srcdir variable and the remote argument to rsync to taste.
The rsync options are just from memory, so they may need tweaking. Read the documentation, especially options regarding deletion, backup, permissions and links.
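One option from that documentation worth singling out: rsync can delete each source file itself after it has transferred successfully, collapsing the copy and the find into one step. A sketch with the same placeholder paths:
# Files are removed from the source as they transfer; directories remain.
rsync --verbose --recursive --remove-source-files "${srcdir}/" user@host:/dstdir/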
(I'd rather not answer requests that show no signs of effort, but my fingers were itching, so there you go.)
scp the files, check the exit code of the scp, and then delete the files locally.
Something like scp files user@remotehost:/path/ && rm files
If scp fails, the second part of the command won't execute.

Changing file modification time for tar archive members

I have a tar archive on an NTFS drive on a Windows machine which contains a folder of files residing on a drive on my Linux machine. I try to update the archive from a bash shell script on my Linux machine with the -u (--update) tar option, so that only new versions of archive members are appended to the archive. However, due to the "time skew" between file times on the two filesystems, tar appends ALL the files in the folder to the archive, even if the folder does not contain any new versions of files at all.
So the problem is: how do I add to an archive on machine B only new versions of files from a folder on machine A, when there is time skew between the machines?
Is there a way to solve this so that the mtimes of individual files in the archive are preserved or changed only insignificantly (e.g. adjusted 10 minutes ahead to negate the skew)? This could probably be accomplished by calling tar individually to append each file, but is there a more optimal solution?
Maybe there is a way to change the mtime individually for each file as it is added to the archive? The --after-date option, for appending only files modified after a certain date, is apparently not quite a suitable filter for this task.
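One way to sidestep the skew entirely is the per-file route mentioned above, driven by a stamp file on the Linux side so that only one machine's clock is ever consulted. A rough sketch, assuming GNU find and GNU tar; the archive, folder and stamp paths are placeholders:
#!/bin/bash
archive=/mnt/ntfs/backup.tar   # hypothetical mount of the NTFS drive
folder=/home/user/data         # hypothetical source folder
stamp=/var/tmp/backup.stamp    # last-run marker, local to this machine
# First run: create the stamp far in the past so everything is appended.
[ -f "$stamp" ] || touch -d '1970-01-01' "$stamp"
# Append only files modified since the last run; comparing against a
# local stamp means the other machine's clock never enters into it.
find "$folder" -type f -newer "$stamp" -print0 |
    tar --append --file="$archive" --null --files-from=-
touch "$stamp"
Note that --append stores a second copy of a changed file rather than replacing it, but extraction takes the last occurrence, and the files' original mtimes are preserved untouched.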

Compare two folders containing source files & hardlinks, remove orphaned files

I am looking for a way to compare two folders containing source files and hard links (let's use /media/store/download and /media/store/complete as an example) and then remove orphaned files that don't exist in both folders. These files may have been renamed and may be stored in subdirectories.
I'd like to set this up as a cron script to run regularly. I just can't work out the logic of the script myself - could anyone be so kind as to help?
Many thanks
rsync can do what you want, using the --existing, --ignore-existing, and --delete options. You'll have to run it twice, once in each "direction" to clean orphans from both source and target directories.
rsync -avn --existing --ignore-existing --delete /media/store/download/ /media/store/complete
rsync -avn --existing --ignore-existing --delete /media/store/complete/ /media/store/download
--existing says don't copy orphan files
--ignore-existing says don't update existing files
--delete says delete orphans on target dir
The trailing slash on the source dir is significant for this task: it tells rsync to sync the directory's contents rather than the directory itself.
The 'n' in -avn means not to really do anything, and I always do a "dry run" with the -n option to make sure the command is going to do what I want, ESPECIALLY when using --delete. Once you're confident your command is correct, run it with just -av to actually do the work.
Perhaps rsync is of use?
Rsync is a fast and extraordinarily versatile file copying tool. It can copy locally, to/from another host over any remote shell, or to/from a remote rsync daemon. It offers a large number of options that control every aspect of its behavior and permit very flexible specification of the set of files to be copied. It is famous for its delta-transfer algorithm, which reduces the amount of data sent over the network by sending only the differences between the source files and the existing files in the destination. Rsync is widely used for backups and mirroring and as an improved copy command for everyday use.
Note it has a --delete option
--delete delete extraneous files from dest dirs
which could help with your specific use case above.
You can also use the diff command to list all the files that differ between two folders.
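For example, a recursive, brief comparison (both flags are standard diff options):
diff -rq /media/store/download /media/store/complete
Files present on only one side are reported as "Only in ..." lines, which are your orphan candidates.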

Bash - delete files by date/filename

I have a bash script which creates a mysqldump backup every hour in a certain directory.
The filenames of the backup files include the date and hour as per the following schema:
backupfile_<day>-<month>-<year>_<hour>.sql.gz
and to clarify here are some example filenames:
backupfile_30-05-2012_0800.sql.gz
backupfile_01-06-2012_0100.sql.gz
backupfile_05-06-2012_1500.sql.gz
Would someone help me with creating a script that will loop through all the files in the directory and then delete files, LEAVING the following:
Keep alternate hour backups older than a day
Keep twice daily backups older than a week
Keep once daily backups older than a month.
I have the following beginnings of the script:
#!/bin/bash
cd /backup_dir
for file in *
do
# do the magic to find out if this files time is up (i.e. needs to be deleted)
# delete the file
done
I have seen many fancy scripts like this for taking scheduled backups, and wonder why folks don't make use of the logrotate utility, available on most *nix distros today, which supports the following options of interest:
compress
Old versions of log files are compressed with gzip by default.
dateext
Archive old versions of log files adding a daily extension like YYYYMMDD instead
of simply adding a number.
olddir directory
Logs are moved into directory for rotation. The directory must be on the same
physical device as the log file being rotated, and is assumed to be relative to
the directory holding the log file unless an absolute path name is specified.
When this option is used all old versions of the log end up in directory. This
option may be overridden by the noolddir option.
notifempty
Do not rotate the log if it is empty (this overrides the ifempty option).
postrotate/endscript
The lines between postrotate and endscript (both of which must appear on lines by
themselves) are executed after the log file is rotated. These directives may
only appear inside of a log file definition. See prerotate as well.
You can parse the timestamps out of the filenames as you iterate over them (sketched below), or you can use the -cmin test of the find command (see man 1 find for details).
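A rough sketch of the filename-parsing route, assuming GNU date and the naming scheme from the question; the exact hours kept in each bucket (even hours, then 00:00 and 12:00, then 00:00 only) are assumptions to adapt:
#!/bin/bash
cd /backup_dir || exit 1
now=$(date +%s)
for file in backupfile_*.sql.gz; do
    # Pull day, month, year, hour and minute out of the filename.
    [[ $file =~ backupfile_([0-9]{2})-([0-9]{2})-([0-9]{4})_([0-9]{2})([0-9]{2}) ]] || continue
    ts=$(date -d "${BASH_REMATCH[3]}-${BASH_REMATCH[2]}-${BASH_REMATCH[1]} ${BASH_REMATCH[4]}:${BASH_REMATCH[5]}" +%s)
    age=$(( (now - ts) / 86400 ))       # age in whole days
    hour=$((10#${BASH_REMATCH[4]}))     # force base 10 so "08" isn't octal
    if   (( age > 30 )); then (( hour == 0 ))               || rm -- "$file"  # keep daily
    elif (( age > 7  )); then (( hour == 0 || hour == 12 )) || rm -- "$file"  # keep twice daily
    elif (( age > 1  )); then (( hour % 2 == 0 ))           || rm -- "$file"  # keep alternate hours
    fi
done
Replace rm with echo rm first to dry-run the retention rules before trusting them.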
