Linux - find command and tar command failure

I am using a combination of the find and cp commands in my backup script.
It runs on a fairly large amount of data: out of 25 files, it first needs to find all the files older than 60 minutes
and then copy those files to a temp directory. Each of these files is 1.52 GB to 2 GB,
and one of the 25 files has data being appended to it continuously.
I have learnt from googling that a tar operation will fail if the file being tarred is updated while the archive is being created. Is the same true for find and cp?
I am trying something like this:
/usr/bin/find $logPath -mmin +60 -type f -exec /bin/cp {} $logPath/$bkpDirectoryName \;
After this I have a step where I tar the files copied to the temp directory mentioned above ($bkpDirectoryName), using the command below:
/bin/tar -czf $bkpDir/$bkpDirectoryName.tgz $logPath/$bkpDirectoryName
and this also fails.
The same backup script had been running fine for many days and has suddenly started failing, which is causing me a headache. Can someone please help me with this?

Can you try these steps, please:
Instead of copying files older than 60 minutes, move them.
Run the tar on the moved files (see the sketch below).
If you do the above, the file which is continuously being appended to will not be moved.
In case any of your other 24 files might be updated after 60 minutes, you can do the following:
Once you move a file, touch a file with the same name, in case there are async updates which are not continuous.
When tarring the files, give the tar file a timestamp name. This way you have a rolling tar of your logs.
If nothing works due to some custom requirement on your side, try doing an rsync and then do the same operations on the rsynced files (i.e. find and tar, or just tar).
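A rough sketch of that move-then-tar idea, reusing the variable names from the question ($logPath, $bkpDir and $bkpDirectoryName are assumed to be set as in the original script; -maxdepth 1 and the timestamp format are my additions):
#!/bin/bash
# Move (not copy) files older than 60 minutes into the staging directory;
# the file that is still being appended to is left behind.
# -maxdepth 1 keeps find from descending into the staging directory itself.
/usr/bin/find "$logPath" -maxdepth 1 -mmin +60 -type f \
    -exec /bin/mv {} "$logPath/$bkpDirectoryName/" \;
# Tar the moved files under a timestamped name so you get a rolling archive.
stamp=`date +%Y%m%d_%H%M%S`
/bin/tar -czf "$bkpDir/$bkpDirectoryName-$stamp.tgz" -C "$logPath" "$bkpDirectoryName"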

Try this:
output=`find $logPath -mmin 60 -type f`
if [ "temp$output" != "temp" ];then
cp -rf $output $other_than_logPath/$bkpDirectoryName/
else
echo sorry
fi
I think you are using +60 instead of 60.
I would also like to know at what interval your script gets called.

#!/bin/bash
for f in `find / -mmin 60 -type f`
do
    cp "$f" /   ## choose the destination directory
done
That's basically what you need; just change the target directory.

Related

Unix script to archive log files older than 15 days

I have log files in a directory that have been piling up for more than a year now. I've written the script below to archive the log files that are older than 15 days.
Script:
#!/bin/bash
files=($(find /opt/Informatica/9.5.1/server/infa_shared/SessLogs -type f -mtime +15))
file=SessLog_bkup_`date +"%y-%m-%d"`.tar.gz
Backup=/opt/Informatica/9.5.1/server/infa_shared/SessLogs/Backup
tar -zcf $file --remove-files "${files[@]}"
mv $file $Backup
But when I run the script it throws the error below.
Error:
./backuplogs.sh: line 5: /bin/tar: Argument list too long.
Please advise if I'm missing something in the script.
Thanks for the help,
Kiran
Your error message is due to failure of execve(2) of /bin/tar by your shell with E2BIG.
Read the man page of tar(1). You could use
-T, --files-from=FILE
Get names to extract or create from FILE.
and combine that with the other parts of your script (e.g. redirecting the find command output to a temporary file, to be passed to tar via -T).
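A minimal sketch of that approach, keeping the paths and names from the question (the temporary list file is illustrative):
#!/bin/bash
# Write the file names to a temp file instead of expanding them on the command line.
list=`mktemp /tmp/sesslogs.XXXXXX`
find /opt/Informatica/9.5.1/server/infa_shared/SessLogs -type f -mtime +15 > "$list"
# (You may also want to exclude the Backup subdirectory here, since it sits under SessLogs.)
file=SessLog_bkup_`date +"%y-%m-%d"`.tar.gz
Backup=/opt/Informatica/9.5.1/server/infa_shared/SessLogs/Backup
# tar reads the names from the list via -T, so the argument list stays tiny.
tar -zcf "$file" --remove-files -T "$list"
mv "$file" "$Backup"
rm -f "$list"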
But as commented by hek2mgl, you really want logrotate(8).
You could also use other archivers, e.g. afio(1).

Linux: large number of files not being deleted

I have a folder of cache files in a Linux VM that weren't being deleted for some reason.
I'm trying to delete them (or the folder itself), but nothing seems to work.
rm just gives me back "Argument list too long".
I'm now trying
find ./cache -type f -delete, checking with ls -l every once in a while, but I keep getting the same number of files.
I also tried
find ./cache -type f -exec rm -v {} \; but the same thing happens again.
I would be OK with just deleting the folder, as long as I can recreate it afterwards.
Thank you
EDIT: OK, I found out ls -l does not return the number of files; if however I do
ls | wc -l the system seems not to respond at all.
Use rm -R <directory> to remove a large number of files at once.
The Linux command line length is limited, so plain rm cannot work here.
The find command will work even though your directory is really big. Launch your find command and go to lunch.
EDIT – by the way, make sure you ls the same directory you want to remove files from, i.e. ./cache. That is not clear in your question.
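If you want to keep an eye on progress without an ls that never returns, one sketch (using the ./cache path from the question) is to let the delete run in one shell and count with find in another, since find streams names instead of sorting them all the way ls does:
# shell 1: delete
find ./cache -type f -delete
# shell 2: check progress every now and then
find ./cache -type f | wc -l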

Exclude directories and delete old backups

I am using the following simple script to take a backup of all of my websites via tar:
TIME=`date +%b-%d-%y`
FILENAME=backup-$TIME.tar.gz
#Parent backup directory
backup_parent_dir="/backup/httpdocs"
#Create backup directory and set permissions
backup_date=`date +%Y_%m_%d_%H_%M`
backup_dir="${backup_parent_dir}/${backup_date}"
echo "Backup directory: ${backup_dir}"
mkdir -p "${backup_dir}"
chmod 755 "${backup_dir}"
SRCDIR=/var/www/sites #Location of Important Data Directory (Source of backup).
tar -cpzf $backup_dir/$FILENAME $SRCDIR
Now, this is working fine, but there are 2 things I need, if they can be done via the same script:
Can I exclude some folders within the /var/www/sites directory, e.g. if I don't want the /var/www/sites/abc.com/logs folder to be backed up? Can I define it and some other sub-directories within this script?
This script tars all sites into the specified folder /backup/httpdocs through a cronjob that runs nightly, and I have to delete old tarballs (older than 7 days) manually. Is there any possibility, through the same script, that when it runs it checks whether any backup older than 7 days exists and deletes it automatically?
EDIT:
Thanks everyone. This is what I am using now; it takes the backup excluding log files and deletes anything older than 7 days:
#!/bin/bash
#START
TIME=`date +%b-%d-%y` # This command adds the date to the backup file name.
FILENAME=backup-$TIME.tar.gz # Here I define the backup file name format.
# Parent backup directory
backup_parent_dir="/backup/httpdocs"
# Create backup directory and set permissions
backup_date=`date +%Y_%m_%d_%H_%M`
backup_dir="${backup_parent_dir}/${backup_date}"
echo "Backup directory: ${backup_dir}"
mkdir -p "${backup_dir}"
chmod 755 "${backup_dir}"
SRCDIR=/var/www/vhosts # Location of Important Data Directory (Source of backup).
tar -cpzf $backup_dir/$FILENAME $SRCDIR --exclude=$SRCDIR/*/logs
find ${backup_parent_dir} -name '*' -type d -mtime +2 -exec rm -rfv "{}" \;
#END
Exclude from tar:
tar -cpzf $backup_dir/$FILENAME --exclude=/var/www/sites/abc.com/logs $SRCDIR
Find and delete old backups:
find ${backup_parent_dir} -type f -name 'backup-*.tar.gz' -mtime +7 -delete
The find filter is conservative: selecting names that match backup-*.tar.gz probably renders the -type f (files only) option useless; I added it just in case you also have directories with such names. The -mtime +7 option is for you to check, because "older than 7 days" is not precise enough: depending on what you have in mind it may be +6, +7 or +8. Please have a look at the find man page and decide for yourself. Note that the selection of backups to delete is not based on their names but on their date of last modification; if you modify them after they are created, it may not be what you want. Let us know.
Use the --exclude option
tar -cpzf $backup_dir/$FILENAME $SRCDIR --exclude=$SRCDIR/*/logs
Name your backup files with an identifier that is derived from the day of the week. This will ensure the new file for each day will overwrite any existing file.
Day:
%a   Day of the week - abbreviated name (Mon)
%A   Day of the week - full name (Monday)
%u   Day of the week - number (Monday = 1)
%d   Day of the month - 2 digits (05)
%e   Day of the month - digit preceded by a space ( 5)
%j   Day of the year - (001-366)
%w   Day of the week - number (Sunday = 0)
From: http://ss64.com/bash/date.html
For example:
TARGET_BASENAME=`date +%u`
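A sketch of how that naming could plug into the script from the question (variable names follow the original; the --exclude pattern is the one from the asker's EDIT, and writing into the parent backup directory is an assumption):
TARGET_BASENAME=`date +%u`                # 1-7, Monday = 1
FILENAME=backup-$TARGET_BASENAME.tar.gz   # e.g. backup-1.tar.gz every Monday
# Write into a fixed directory (here the parent backup directory) rather than a
# per-run timestamped one, otherwise nothing ever gets overwritten.
tar -cpzf $backup_parent_dir/$FILENAME $SRCDIR --exclude=$SRCDIR/*/logs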
Option 1, when there are not many files/directories to exclude:
tar -cpzf $backup_dir/$FILENAME --exclude=$SRCDIR/dir_ignore --exclude=$SRCDIR/*.log $SRCDIR
Or, if you have many entries to exclude, it is much better to put them in a file:
tar -cpzf $backup_dir/$FILENAME -X /path/to/exclude.txt $SRCDIR
where the /path/to/exclude.txt file looks like:
/var/www/dir_to_ignore
/var/www/*.log
You cannot use variables in it, but you can use wildcards.
The second question was answered very well by both answers above; I personally like
find ${backup_parent_dir} -type f -name 'backup-*.tar.gz' -mtime +7 -delete

Zipping and deleting files of a certain age

I'm trying to put together a command that will find files that haven't been modified in over 6 months and zip them in one command. Afterwards I want to delete all those files I just archived.
My current command to find the directories with the files is
find /var/www -type d -mtime -400 ! -mtime -180 | xargs ls -l > testd.txt
This gave me all the directories, including the files that are older than 6 months.
Now I was wondering if there was a way of zipping all the results and deleting them afterwards. Something along the lines of
find /var/www -type f -mtime -400 ! -mtime -180 | gzip -c archive.gz
If anyone knows the proper syntax to achieve this, I'd love to know. Thanks!
Edit: after a few tests, this command results in a corrupted file:
find /var/www -mtime -900 ! -mtime -180 | xargs tar -cf test4.tar
Any ideas?
Break this into several distinct steps that you can implement and thoroughly test separately:
1. Build a list of files to be archived and then deleted, saved to a temp file.
2. Use the list from step 1 to add the files to .tar.gz archives. Give the archive file a name following a specific pattern that won't appear in the files to be archived, and put it in a directory outside the hierarchy of files being archived.
3. Read back the files from the .tar.gz and compare them (or their hashes) to the original files to ENSURE that you got them all without corruption.
4. Use the list from step 1 to delete the files. Do not use a wildcard for deletion. Put in some guard code to prevent deletion of any file matching the name pattern of the archive .tar.gz file(s) created in step 2.
When testing a script that can do irreversible damage, always code the dangerous command with a leading echo and leave it that way until you are sure everything works. Only then remove the echo.
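A hedged sketch of those steps, keeping the /var/www tree and the 180/400-day window from the question; the list and archive locations are illustrative, and the delete stays behind an echo guard as advised:
#!/bin/bash
# Step 1: build the list of candidate files.
list=/tmp/archive-list.txt                         # illustrative path
find /var/www -type f -mtime -400 ! -mtime -180 > "$list"
# Step 2: archive them, writing the archive outside /var/www.
archive=/backup/www-archive-`date +%Y%m%d`.tar.gz  # illustrative path and name pattern
tar -czf "$archive" -T "$list"
# Step 3: compare the archive against the originals before deleting anything.
tar -dzf "$archive" || { echo "archive differs from originals, aborting"; exit 1; }
# Step 4: delete from the list only, still guarded by echo.
# Remove the echo only once you are sure everything works.
while IFS= read -r f; do
    echo rm -- "$f"
done < "$list"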
Consider zip; it should meet your requirements.
find ... | zip -m@ archive.zip
-m (move) deletes the input directories/files after making the specified zip archive.
-@ takes the list of input files from standard input.
You may find more options which are useful to you in the zip manual, e.g.
-r (recurse) travels the directory structure recursively.
-sf (show-files) shows the files that would be operated on, then exits.
-t or --from-date operates on files not modified prior to the specified date.
-tt or --before-date operates on files not modified after or at the specified date.
This could possibly make find expendable.
zip -mr --from-date 2012-09-05 --before-date 2013-04-13 archive /var/www

Shell script - Find files modified today, create directory, and move them there

I was wondering if there is a simple and concise way of writing a shell script that would go through a series of directories, (i.e., one for each student in a class), determine if within that directory there are any files that were modified within the last day, and only in that case the script would create a subdirectory and copy the files there. So if the directory had no files modified in the last 24h, it would remain untouched. My initial thought was this:
#!/bin/sh
cd /path/people/ #this directory has multiple subdirectories
for i in `ls`
do
    if find ./$i -mtime -1 -type f; then
        mkdir ./$i/updated_files
        # code to copy the files to the newly created directory
    fi
done
However, that seems to create /updated_files for all subdirectories, not just the ones that have recently modified files.
Heavier use of find will probably make your job much easier. Something like
find /path/people -mtime -1 -type f -printf "mkdir --parents %h/updated_files\n" | sort | uniq | sh
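A sketch that extends the same trick to do the copy as well (it assumes paths without spaces or shell metacharacters, since the generated text is fed straight to sh):
# The ! -path test keeps already-copied files from being picked up again on later runs.
find /path/people -mtime -1 -type f ! -path '*/updated_files/*' \
    -printf "mkdir --parents %h/updated_files && cp %p %h/updated_files/\n" | sh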
The problem is that you are assuming the find command will fail if it finds nothing. The exit code is zero (success) even if it finds nothing that matches.
Something like
UPDATEDFILES=`find ./$i -mtime -1 -type f`
[ -z "$UPDATEDFILES" ] && continue
mkdir ...
cp ...
...
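Put together, a loop along those lines could look like this (a sketch only; it assumes file names without spaces, since $UPDATEDFILES is split by the shell):
#!/bin/sh
cd /path/people/ || exit 1
for i in */
do
    # Skip the directory if nothing in it was modified in the last day.
    UPDATEDFILES=`find "./$i" -type f -mtime -1 ! -path "*/updated_files/*"`
    [ -z "$UPDATEDFILES" ] && continue
    mkdir -p "./$i/updated_files"
    for f in $UPDATEDFILES
    do
        cp "$f" "./$i/updated_files/"
    done
done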
