Zipping and deleting files of a certain age - Linux

I'm trying to put together a command that will find files that haven't been modified in over 6 months and zip them in one go. Afterwards I want to delete all the files I just archived.
My current command to find the directories with the files is
find /var/www -type d -mtime -400 ! -mtime -180 | xargs ls -l > testd.txt
This gave me all the directories that are older than 6 months, along with the files they contain.
Now I was wondering if there is a way of zipping all the results and deleting them afterwards. Something along the lines of
find /var/www -type f -mtime -400 ! -mtime -180 | gzip -c archive.gz
If anyone knows the proper syntax to achieve this, I'd love to know. Thanks!
Edit: after a few tests, this command results in a corrupted file:
find /var/www -mtime -900 ! -mtime -180 | xargs tar -cf test4.tar
Any ideas?

Break this into several distinct steps that you can implement and thoroughly test separately:
1. Build a list of files to be archived and then deleted, saved to a temp file.
2. Use the list from step 1 to add the files to .tar.gz archives. Give the archive file a name following a specific pattern that won't appear in the files to be archived, and put it in a directory outside the hierarchy of files being archived.
3. Read back the files from the .tar.gz and compare them (or their hashes) to the original files to ENSURE that you got them all without corruption.
4. Use the list from step 1 to delete the files. Do not use a wildcard for deletion. Put in some guard code to prevent deletion of any file matching the name pattern of the archive .tar.gz file(s) created in step 2.
When testing a script that can do irreversible damage, always code the dangerous command with a leading echo and leave it that way until you are sure everything works. Only then remove the echo.
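As a concrete illustration, here is a minimal bash sketch of those four steps, assuming GNU find/tar, a plain 180-day cutoff, and placeholder paths you would adapt; the rm stays behind an echo exactly as described above:

#!/bin/bash
# Assumed paths -- adjust to your environment.
SRC=/var/www                                  # hierarchy to scan
ARCHIVE=/root/archive-$(date +%Y%m%d).tar.gz  # kept outside of $SRC
LIST=$(mktemp)

# Step 1: build the list of files not modified in the last 180 days.
find "$SRC" -type f -mtime +180 > "$LIST"

# Step 2: archive exactly those files.
tar -czf "$ARCHIVE" -T "$LIST"

# Step 3: make sure the archive reads back cleanly (tar --diff can do a
# deeper comparison against the originals if you need it).
tar -tzf "$ARCHIVE" > /dev/null || { echo "archive is corrupt"; exit 1; }

# Step 4: delete the archived files -- guarded by echo until fully tested.
while IFS= read -r f; do
    echo rm -- "$f"    # remove the echo only after verifying everything
done < "$LIST"

rm -f "$LIST"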

Consider zip; it should meet your requirements.
find ... | zip -m -@ archive.zip
-m (move) deletes the input directories/files after making the specified zip archive.
-@ takes the list of input files from standard input.
You may find more options which are useful to you in the zip manual, e. g.
-r (recurse) travels the directory structure recursively.
-sf (show-files) shows the files that would be operated on, then exits.
-t or --from-date operates on files not modified prior to the specified date.
-tt or --before-date operates on files not modified after or at the specified date.
This could possibly make find expendable.
zip -mr --from-date 2012-09-05 --before-date 2013-04-13 archive /var/www
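If you would rather keep find in charge of the age test (for example to reuse the -mtime logic from the question), a rough equivalent, assuming your zip build accepts -@ together with -sf, would be:

# Dry run: -sf only lists what would be archived, nothing is written or moved.
find /var/www -type f -mtime +180 | zip -sf -@ archive.zip

# Real run: archive the files and delete the originals (-m).
find /var/www -type f -mtime +180 | zip -m -@ archive.zip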

Related

Bash script to delete a file in all subdirectories.

I have a directory filled with subdirectories exceeding 450 GB. Each of these subdirectories contains an instruction file. I have a script that copies the instruction file from the directory I am currently in into every subdirectory via:
#!/bin/bash
for d in */; do cp "INSTALLATION INSTRUCTIONS.rtf" "$d"; done
I need to remove all of these files in the subdirectories and replace them with new instructions. Can I simply write another script that does this:
#!/bin/bash
for d in */; do rm "INSTALLATION INSTRUCTIONS.rtf" "$d"; done
I am very hesitant and wanted to make sure, as these files are vitally important. I don't want to accidentally remove anything, and making a backup of 450+ GB is very taxing.
find . -mindepth 2 -name "INSTALLATION INSTRUCTIONS.rtf" -exec rm -f '{}' +
Since this is "vitally important" data, I would first list all files that match the file name you want to delete/overwrite, without taking any action on it (other than listing):
find /folder/ -type f -name "INSTALLATION INSTRUCTIONS.rtf" -print > /tmp/holder
That would create a list of matches in /tmp/holder. Then you could analyze this list before taking any action (either visually or programmatically) to make sure that it does not include anything you don't want to delete (when dealing with large amounts of data, strange things can happen, so be proactive about protecting the data).
If you are happy with what the list shows, then you could delete the old instructions, or if possible, overwrite them with the new file. Here's an example to overwrite the old file with the new one:
while read -r line; do cp --no-preserve=all /folder/newfile "$line"; done < /tmp/holder
The cp --no-preserve=all command (a GNU coreutils option) would ensure that the new file gets permissions appropriate to the folder where it ends up. You may change that to a plain cp if you don't want that to happen.
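Putting those pieces together, a cautious sketch of the whole replacement (using the same placeholder paths as above, with the copy left behind an echo until the list has been reviewed) could be:

#!/bin/bash
# Placeholder paths from the answer above -- adjust to your layout.
NEWFILE=/folder/newfile
LIST=/tmp/holder

# 1. Build the list of old instruction files.
find /folder/ -type f -name "INSTALLATION INSTRUCTIONS.rtf" -print > "$LIST"

# 2. Review it (wc -l as a quick sanity check, then read it yourself).
wc -l "$LIST"

# 3. Overwrite each old file with the new one -- echo-guarded until tested.
while IFS= read -r line; do
    echo cp --no-preserve=all "$NEWFILE" "$line"
done < "$LIST"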

How to create empty clone directories using bash script as subdirectories under another directory

I need to create empty clone directories, with the same names as existing subdirectories, under another directory in another location. I need to accomplish this using bash scripting.
To be more specific, I have a number of directories generated by a data logging system, each named after the day, month, and year of creation/recording. So I have a starting directory, say 24012016, and so on, with the day, month, and year incrementing; there are also gaps in the records for certain days due to technical reasons.
Each such directory contains files with two different extensions. I need a script that will create a directory with the same name, e.g. 24012016, as a subdirectory under another directory in another location, but without the files within it, and also copy the files with one of the two extensions into the new clone directory, repeating this process for all the directories.
Given old_root, the base directory to start copying from, and new_root, the base destination:
find <old_root> -type d -exec mkdir -p "<new_root>/{}" \;
the second part is unclear
For example, something like this:
cd "${where_to_search}"
find . -type d | while read -r dir; do
    mkdir -p "${new_parentdir}/${dir}"
done
find . -type f -name "*.${extension_to_copy}" | while read -r file; do
    cp "${file}" "${new_parentdir}/${file}"
done
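If rsync is available, another option is to let it recreate the directory tree and copy only one of the two extensions in a single pass. This is just a sketch, with .csv standing in for whichever extension you actually want:

# Recreate the directory structure under new_root and copy only *.csv files;
# everything else is excluded, so the other extension is left behind.
rsync -a --include='*/' --include='*.csv' --exclude='*' "${old_root}/" "${new_root}/"

The --include='*/' rule keeps every directory (so even empty ones are recreated), while the trailing --exclude='*' drops every file that did not match.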

Write a script to delete files if they exist in a different folder in Linux

I'm trying to write a script in Linux. I have some CSV files in two different folders (A and B); after some processing, copies of the rejected files are moved to a Bad folder.
So I want the bad files that have been copied to the Bad folder to be deleted from folders A and B.
Can you help me write this script for Linux?
Best
Let's say the name of the Bad folder is 'badFolder', and assume 'A', 'B' and 'badFolder' are in the same directory.
Steps to delete files from folder A and B:
step 1: change current directory to your 'badFolder'
cd badFolder
step 2: delete identical files
find . -type f -exec rm -f ../A/{} \;
find . -type f -exec rm -f ../B/{} \;
The argument -type f tells to look for files, not directories.
The -exec ... \; argument tells find that, once it finds a file in 'badFolder', it should run the command rm -f on the counterpart in the A (or B) directory.
Because rm is given with the -f option, it will silently ignore files that don't exist.
Also, it will not prompt before deleting files. This is very handy when deleting a large number of files. However, be sure that you really want to delete the files before running this script.
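If you would rather see what is going to be removed before it happens, an echo-guarded loop over badFolder (assuming it contains only files, no subdirectories) does the same job:

# Run from the directory that contains A, B and badFolder.
for f in badFolder/*; do
    name=$(basename "$f")
    echo rm -f "A/$name" "B/$name"   # drop the echo once the output looks right
done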
#!/bin/bash
#Set the working folder in which you want to delete the file
Working_folder=/<Folder>/<path>
cd "$Working_folder"
#command to delete all files present in the folder
rm <filenames separated by spaces>
echo "files are deleted"
#if you want to delete all files you can use a wildcard
# e.g. the command rm *.*
# if you want to delete only a particular kind of file, say .csv files, you can use the command rm *.csv
Set variables containing the paths of your A, B and BAD directories.
Then you can do something along the lines of
for file in "${PATH_TO_BAD}"/*
do
    name=$(basename "$file")
    rm "${PATH_TO_A}/${name}"
    rm "${PATH_TO_B}/${name}"
done
This iterates over the BAD directory and deletes any file it finds there from the A and B directories.

Script to zip complete file structure depending on file age

Alright, so I have a web server running CentOS at work that is hosting a few websites internally only. It's our development server and thus has lots [read: tons] of old junk websites and whatnot.
I was trying to put together a command that would find files that haven't been modified for over 6 months, group them all in a tarball and then delete them. Thus far I have tried many different types of find commands with various arguments. Our structure looks like this:
/var/www/joomla/username/fileshere/temp
/var/www/username/fileshere
So I tried something along the lines of:
find /var/www -mtime -900 ! -mtime -180 | xargs tar -cf test4.tar
Only to end up with a 10 MB tar, when the expected result would be over 50 GB.
I tried using gzip instead, but I ended up zipping MY WHOLE SERVER, making it unusable; I had to transfer the whole filesystem, reinstall a completely new server, and deal with a lot of trouble and... you get the idea. So I want to find the right command that won't blow up our server but will find all FILES and DIRECTORIES that haven't been modified for over 6 months.
Be careful with ctime.
ctime is when a file's inode was last changed (permissions, owner, etc.).
atime is when a file was last accessed (check whether your file system is mounted with the noatime or relatime options; in that case atime may not behave the way you expect).
mtime is when the data in a file was last modified.
Depending on what you are trying to do, mtime is probably your best option.
Besides, you should check the -print0 option. From man find:
-print0
True; print the full file name on the standard output, followed by a null character (instead of the newline character that -print uses). This allows file names that contain newlines or other types of white space to be correctly interpreted by programs that process the find output. This option corresponds to the -0 option of xargs.
I do not know exactly what you are trying to do, but this command could be useful for you:
find /var/www -mtime +180 -print0 | xargs -0 tar -czf example.tar.gz
Try this:
find /var/www -ctime +180 | xargs tar cf test.tar
The -ctime test compares the current time with each file's inode change time (not its modification time; see the note above), and if you use + instead of - it matches files changed more than x days ago.
Then just pass it to tar with xargs and you should be set.
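One caveat with plain xargs: if the file list is long enough to be split across several tar invocations, each run overwrites the archive, and names with spaces break. Assuming GNU tar (the CentOS default), a sketch that avoids both problems and also handles the deletion, writing the archive outside /var/www so it cannot end up archiving itself, is:

# --null / -print0 keep unusual file names intact; -T - reads the list from stdin.
# --remove-files deletes each file after it has been added to the archive.
find /var/www -type f -mtime +180 -print0 \
  | tar --null -T - -czf /root/archive.tar.gz --remove-files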

Bash script to recursively step through folders and delete files

Can anyone give me a bash script or one-line command I can run on Linux to recursively go through each folder from the current folder and delete all files or directories starting with '._'?
Change directory to the root directory you want (or change . to the directory) and execute:
find . -name "._*" -print0 | xargs -0 rm -rf
xargs allows you to pass several file names to a single command, so it will be faster than running rm once per file with the find -exec syntax. Also, you can run the find part on its own first, without the pipe, to view the files it will delete and make sure it is safe.
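That dry run is just the find half of the pipeline on its own:

# List what would be deleted, without deleting anything.
find . -name "._*" -print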
find . -name '._*' -exec rm -Rf {} \;
I've had a similar problem a while ago (I assume you are trying to clean up a drive that was connected to a Mac which saves a lot of these files), so I wrote a simple python script which deletes these and other useless files; maybe it will be useful to you:
http://github.com/houbysoft/short/blob/master/tidy
find /path -name "._*" -exec rm -fr "{}" +
Instead of deleting the AppleDouble files, you could merge them with the corresponding files. You can use dot_clean (a macOS tool).
dot_clean -- Merge ._* files with corresponding native files.
For each dir, dot_clean recursively merges all ._* files with their corresponding native files according to the rules specified with the given arguments. By default, if there is an attribute on the native file that is also present in the ._ file, the most recent attribute will be used.
If no operands are given, a usage message is output. If more than one directory is given, directories are merged in the order in which they are specified.
Because dot_clean works recursively by default, use:
dot_clean <directory>
If you want to turn off the recursive merge, use -f for a flat merge.
dot_clean -f <directory>
find . -name '._*' -delete
A bit shorter, and it performs better with an extremely long list of files.
