Exclude directories and delete old backup - linux

I am using the following simple script to take a backup of all of my websites via tar:
TIME=`date +%b-%d-%y`
FILENAME=backup-$TIME.tar.gz
#Parent backup directory
backup_parent_dir="/backup/httpdocs"
#Create backup directory and set permissions
backup_date=`date +%Y_%m_%d_%H_%M`
backup_dir="${backup_parent_dir}/${backup_date}"
echo "Backup directory: ${backup_dir}"
mkdir -p "${backup_dir}"
chmod 755 "${backup_dir}"
SRCDIR=/var/www/sites #Location of Important Data Directory (Source of backup).
tar -cpzf $backup_dir/$FILENAME $SRCDIR
Now, this is working fine, but there are two things I would like to do in the same script:
Can I exclude some folders within the /var/www/sites directory? For example, if I don't want /var/www/sites/abc.com/logs to be backed up, can I define that (and some other sub-directories) within this script?
The script creates a tarball of all sites in the specified folder /backup/httpdocs through a cron job that runs nightly. Old tarballs (older than 7 days) I currently have to delete manually, so is there any way, in the same script, to check whether any backup older than 7 days exists and delete it automatically?
EDIT:
Thanks everyone, this is what I am using now. It takes the backup excluding log files and deletes anything older than 7 days:
#!/bin/bash
#START
TIME=`date +%b-%d-%y` # This Command will add date in Backup File Name.
FILENAME=backup-$TIME.tar.gz # Here i define Backup file name format.
# Parent backup directory
backup_parent_dir="/backup/httpdocs"
# Create backup directory and set permissions
backup_date=`date +%Y_%m_%d_%H_%M`
backup_dir="${backup_parent_dir}/${backup_date}"
echo "Backup directory: ${backup_dir}"
mkdir -p "${backup_dir}"
chmod 755 "${backup_dir}"
SRCDIR=/var/www/vhosts # Location of Important Data Directory (Source of backup).
tar -cpzf $backup_dir/$FILENAME $SRCDIR --exclude="$SRCDIR/*/logs"
find ${backup_parent_dir} -mindepth 1 -type d -mtime +7 -exec rm -rfv "{}" \;
#END

Exclude from tar:
tar -cpzf $backup_dir/$FILENAME --exclude=/var/www/sites/abc.com/logs $SRCDIR
Find and delete old backups:
find ${backup_parent_dir} -type f -name 'backup-*.tar.gz' -mtime +7 -delete
The filter used by find is conservative: since the names must match backup-*.tar.gz, the -type f (files only) test is probably redundant; I added it just in case you also have directories with such names. You should double-check the -mtime +7 option, because "older than 7 days" is not precise enough on its own: depending on what you have in mind it may need to be +6, +7 or +8. Please have a look at the find man page and decide for yourself. Note that the selection of backups to delete is based not on their names but on their date of last modification, so if you modify them after they are created, this may not be what you want. Let us know.
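If you want to see what that expression will match before letting it delete anything, you can run the same find with -print first; a small sketch using the paths from the question:
# dry run: list the backups that would be removed
find /backup/httpdocs -type f -name 'backup-*.tar.gz' -mtime +7 -print
# once the list looks right, switch -print to -delete in the cron job
find /backup/httpdocs -type f -name 'backup-*.tar.gz' -mtime +7 -delete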

Use the --exclude option
tar -cpzf $backup_dir/$FILENAME $SRCDIR --exclude=$SRCDIR/*/logs
Name your backup files with an identifier that is derived from the day of the week. This will ensure the new file for each day will overwrite any existing file.
Day:
%a    Day of the week - abbreviated name (Mon)
%A    Day of the week - full name (Monday)
%u    Day of the week - number (Monday = 1)
%d    Day of the month - 2 digits (05)
%e    Day of the month - digit preceded by a space ( 5)
%j    Day of the year - (1-366)
%w    Day of the week - number (Sunday = 0)
From: http://ss64.com/bash/date.html
For example:
TARGET_BASENAME=`date +%u`
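For example, a minimal sketch of that approach, reusing the directories from the question (the backup-day-N.tar.gz name format is just an example): because %u repeats every 7 days, each nightly run overwrites the file from one week earlier and no separate cleanup is needed.
#!/bin/bash
backup_parent_dir="/backup/httpdocs"
SRCDIR=/var/www/sites
TARGET_BASENAME=`date +%u`                        # 1 = Monday ... 7 = Sunday
FILENAME="backup-day-${TARGET_BASENAME}.tar.gz"   # name repeats weekly, so it overwrites itself
mkdir -p "$backup_parent_dir"
tar -cpzf "${backup_parent_dir}/${FILENAME}" --exclude="$SRCDIR/*/logs" "$SRCDIR"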

Option 1, when there are not many files/directories to exclude:
tar -cpzf $backup_dir/$FILENAME --exclude=$SRCDIR/dir_ignore --exclude=$SRCDIR/*.log $SRCDIR
Or, if you have many entries to exclude, it is much better to do it in a file:
tar -cpzf $backup_dir/$FILENAME -X /path/to/exclude.txt $SRCDIR
where the /path/to/exclude.txt file looks like this:
/var/www/dir_to_ignore
/var/www/*.log
You cannot use variables in the exclude file, but you can use wildcards.
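One workaround, if you do want the excluded paths to follow a variable: have the script regenerate the exclude file just before calling tar, so the variable is expanded at the moment the file is written. A minimal sketch (the /tmp/exclude.txt path and the two patterns are only examples):
EXCLUDES=/tmp/exclude.txt
# quote the patterns so the shell expands $SRCDIR but leaves the wildcards for tar
printf '%s\n' "$SRCDIR/*/logs" "$SRCDIR/*.log" > "$EXCLUDES"
tar -cpzf $backup_dir/$FILENAME -X "$EXCLUDES" $SRCDIR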
The second question was answered very well by both guys before; I personally love
find ${backup_parent_dir} -type f -name 'backup-*.tar.gz' -mtime +7 -delete

Related

rsync - create daily backups that are deleted after 30 days?

I am attempting to write a script using rsync to save daily backups in new directories named after the date they are created, and to delete them 30 days after creation. The code below works, but it will quickly fill up my disk because the -u option will not see that several files in the directory structure already exist in a previous backup. Is there a better way to do this to preserve disk space/bandwidth? The --delete and --backup-dir options have been mentioned to me, but I have no idea how they would apply to this specific scenario.
#!/bin/bash
#User whose files are being backed up
BNAME=username
#directory to back up
BDIR=/home/username/BackThisUp
#directory to backup to
BackupDir=/var/home/username_local/BackupTo
#user
RUSER=$USER
#SSH Key
KEY=/var/home/username_local/.ssh
#Backupname
RBackup=`date +%F`
#Backup Server
BServ=backup.server
#Path
LPATH='Data for backup'
#date
DATE=`date +%F`
#make parent directory for backup
mkdir $BackupDir/$BNAME > /dev/null 2>&1
#Transfer new backups
rsync -avpHrz -e "ssh -i $KEY" $BNAME@$BServ:$BDIR $BackupDir/$BNAME/$DATE
find $BackupDir/$BNAME -type d -ctime +30 -exec rm -rf {} \;
I might do something simpler. Create a hash that only contains the day of the month.
For example, 8/11/2015 would hash to 11.
Then do something like
# this number changes based on date.
hash=`date +%d`
rm -rf backup_folder/$hash
# then recreate backup_folder/$hash
You'll have around 30 days of backups. You may want to zip/compress these folders, assuming you have 30 times the size of the folder available on the disk.
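A rough sketch of that idea, reusing the variables from the question (the key file path is an assumption, since -i expects a file rather than the .ssh directory):
#!/bin/bash
BNAME=username
BDIR=/home/username/BackThisUp
BackupDir=/var/home/username_local/BackupTo
KEY=/var/home/username_local/.ssh/id_rsa   # assumed identity file
BServ=backup.server
slot=`date +%d`                            # 01..31, so the slot repeats monthly
rm -rf "$BackupDir/$BNAME/$slot"           # drop last month's copy of this slot
mkdir -p "$BackupDir/$BNAME/$slot"
rsync -avpHrz -e "ssh -i $KEY" "$BNAME@$BServ:$BDIR" "$BackupDir/$BNAME/$slot"
This keeps roughly a month of daily backups at the cost of a full copy per day; compressing each slot after the rsync saves space if the disk allows it.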

Linux - Find command and tar command Failure

I am using a combination of the find and copy commands in my backup script.
It is used on a fairly large amount of data.
First, out of 25 files, it needs to find all the files older than 60 minutes,
then copy these files to a temp directory; each of these files is 1.52GB to 2GB.
One of these 25 files has data being appended to it continuously.
I have learnt from googling that a tar operation will fail if the file being tarred is updated while tar runs; is it the same with find and copy?
I am trying something like this,
/usr/bin/find $logPath -mmin +60 -type f -exec /bin/cp {} $logPath/$bkpDirectoryName \;
After this I have a step where I tar the files copied to the temp directory mentioned above ($bkpDirectoryName); here I use the command below:
/bin/tar -czf $bkpDir/$bkpDirectoryName.tgz $logPath/$bkpDirectoryName
and this also fails.
The same backup script had been running for many days and suddenly it started failing and causing me headaches! Can someone please help me with this?
Can you try these steps, please:
Instead of copying files older than 60 minutes, move them.
Run the tar on the moved files.
If you do the above, the file which is continuously appended to will not be moved.
In case any of your other 24 files might be updated after 60 minutes, you can do the following:
Once you move a file, touch a file with the same name, in case there are async updates which are not continuous.
When tarring the files, give the tar file a timestamp name. This way you have a rolling tar of your logs; a sketch follows below.
If nothing works due to some custom requirement on your side, try doing an rsync and then do the same operations on the rsynced files (i.e. find and tar, or just tar).
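A rough sketch of the move-then-tar variant, assuming $logPath, $bkpDirectoryName and $bkpDir are set as in the original script (the -maxdepth 1 is added on the assumption that the temp directory sits directly under $logPath, so find does not descend into it):
# move files untouched for 60+ minutes; the file still being appended to
# keeps a fresh mtime and therefore stays behind
/usr/bin/find "$logPath" -maxdepth 1 -mmin +60 -type f -exec /bin/mv {} "$logPath/$bkpDirectoryName" \;
# tar the moved files under a timestamped name, giving a rolling archive
STAMP=`date +%Y%m%d_%H%M`
/bin/tar -czf "$bkpDir/${bkpDirectoryName}_${STAMP}.tgz" "$logPath/$bkpDirectoryName"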
Try this:
output=`find $logPath -mmin 60 -type f`
if [ "temp$output" != "temp" ];then
cp -rf $output $other_than_logPath/$bkpDirectoryName/
else
echo sorry
fi
I think you are using +60 instead of 60.
I also want to know at what interval your script gets called.
#!/bin/bash
for find in `find / -name "*" -mmin 60`
do
cp $find / ## Choose directory
done
That's basically what you need; just change the directory, I guess.

7zip archiving files that are newer than a specific date

I create 7zip files like this from command line in Linux:
# 7za a /backup/files.7z /myfolder
After that I want to create another zip file that includes all the files inside /myfolder that are newer than dd-mm-YY.
Is it possible to archive files with respect to a file's last change time?
(I don't want to update "files.7z" file I need to create another zip file that includes only new files)
The proposal by Gooseman:
# find myfolder -mtime -10 -exec 7za a /backup/newfile.7z {} \;
adds, for each directory tree that contains new files, every file in that tree (because the directory itself also counts as new), and then adds the new files again even though they were just archived.
The following includes only new files but does not store the path names in the archive:
# find myfolder -type f -mtime -10 -exec 7za a /backup/newfile.7z {} \;
This stores only new files — with path names:
# find myfolder -type f -mtime -10 > /tmp/list.txt
# tar -cvf /tmp/newfile.tar -T /tmp/list.txt
# 7za a /backup/newfile.7z /tmp/newfile.tar
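If the file names may contain spaces, a null-delimited variant of the same pipeline is safer (this assumes GNU find and GNU tar, which provide -print0 and --null):
find myfolder -type f -mtime -10 -print0 > /tmp/list0.txt
tar -cvf /tmp/newfile.tar --null -T /tmp/list0.txt
7za a /backup/newfile.7z /tmp/newfile.tar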
You could try this command:
find myfolder -mtime -10 -exec 7za a /backup/newfile.7z {} \;
In order to find the number to use with the -mtime option, you could use some of these answers:
How to find the difference in days between two dates? In your case it would be the difference between the current date and your custom dd-mm-YY (in my example, dd-mm-YY is 10 days back from now).
From man find:
-mtime n
File's data was last modified n*24 hours ago. See the comments for -atime to understand how rounding affects the interpretation of file modification times.
A numeric argument of -n means less than n.
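As an illustration of turning a concrete cutoff date into that number, a small sketch assuming GNU date (the 16-04-2015 value and the dd-mm-YYYY layout are only examples):
cutoff="16-04-2015"                                            # dd-mm-YYYY
cutoff_iso=$(echo "$cutoff" | awk -F- '{print $3"-"$2"-"$1}')  # -> 2015-04-16
days=$(( ( $(date +%s) - $(date -d "$cutoff_iso" +%s) ) / 86400 ))
find myfolder -type f -mtime -$days -exec 7za a /backup/newfile.7z {} \;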

In Linux shell, how to cp/rm files by time?

In the Linux shell, when I run
ls -al -t
it shows the times of the files.
How can I cp/rm files by time? For example, copy all the files that were created today or yesterday. Thanks a lot.
Depending on what you actually want to do, find provides the -atime, -ctime and -mtime tests for selecting files by access, status-change or modification time, along with -newer and the minute-granularity -amin/-cmin/-mmin variants. You can combine them with -exec to copy, delete, or do whatever else you want. For example:
find -maxdepth 1 -mtime +1 -type f -exec cp '{}' backup \;
Will copy all the regular files in the current directory more than 1 day old to the directory backup (assuming the directory backup exists).
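Since the question mentions files created today or yesterday specifically, GNU find's -daystart option makes -mtime count from the start of today rather than from exactly 24 hours ago; a small sketch (the backup target directory is just an example):
# files modified today
find . -maxdepth 1 -daystart -mtime 0 -type f -exec cp '{}' backup \;
# files modified yesterday
find . -maxdepth 1 -daystart -mtime 1 -type f -exec cp '{}' backup \;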
Simple Example
find /path/to/folder/ -mtime 1 -exec rm {} \;   # deletes all files modified yesterday
For more examples, google for "bash find time".

Deleting files based on CreationTime

In a directory there are files which are generated daily.
The format of the files: one generated on 16th Apr 2012 is named TEST_20120416.
So I need to delete all the files which are older than 7 days. I tried doing this
#!/bin/ksh
find /data/Test/*.* -mtime -7 -exec rm -rf {} \;
exit 0
Now the problem is that the above code deletes based on modification time, but according to the requirement the files should be deleted based on creation time. Kindly help me out with deleting files based on the filename (the filename has a timestamp).
As you fortunately have the creation date encoded in the filename, this should work:
#!/bin/sh
REFDATE=$(date --date='-7 days' +%Y%m%d)
PREFIX=TEST_
find /data/Test/ -name "${PREFIX}*" | while read FNAME; do
    if [ "${FNAME##*$PREFIX}" -lt "$REFDATE" ]; then
        rm "$FNAME"
    fi
done
It will print warnings if you have some other files with names starting with TEST_ in which case some more filtering may be needed.
find /data/Test/*.* -ctime -7 -delete
find /data/Test/*.* will find all the files in the /data/Test folder, the -ctime -7 argument limits the search to files whose inode change time (ctime) falls within the last 7 days, and the -delete option deletes those files.
