Bash script to remove old files and create a text file containing the count and size of the files deleted

I am an intern and was given the task of creating a Bash script that deletes files in a directory older than 60 days and then exports a text file containing the number of files deleted as well as the amount of data removed. I am still learning Bash and have this one-liner to remove old files:
`find $DIR -type f -mtime -60 -exec rm -rf {} \;`
I am still actively learning Bash, so extra notes on any responses would be greatly appreciated!
P.S. I found Bash Academy, but the site looks incomplete; any recommendations for further reading in my quest to learn Bash would also be greatly appreciated!

I would use the script below, say deleter.sh, for the purpose:
#!/bin/bash
myfunc()
{
    local totalsize=0
    echo "Removing files listed below:"
    echo "$@"
    sizes=( $(stat --format=%s "$@") )   # storing the sizes in an array
    for i in "${sizes[@]}"
    do
        (( totalsize += i ))             # calculating the total size
    done
    echo "Total space to be freed: $totalsize bytes"
    rm "$@"
    if [ $? -eq 0 ]                      # $? is the exit status of rm
    then
        echo "All files deleted"
    else
        echo "Some files couldn't be deleted"
    fi
}
export -f myfunc
find "$1" -type f -not -name "*deleter.sh" -mtime +60 \
     -exec bash -c 'myfunc "$@"' _ {} +
# -not -name "*deleter.sh" prevents the script from deleting itself.
# Note: -mtime +60 matches files older than 60 days.
Make it executable:
chmod +x ./deleter.sh
And run it as:
./deleter.sh '/path/to/your/directory'
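If you also need the text file with the deleted-file count and total bytes that the original task asks for, here is a minimal sketch assuming GNU find and stat; the report name deleted_report.txt is just an illustration, and counting with wc -l assumes no newlines in file names:
#!/bin/bash
# Count the matching files and sum their sizes before deleting them,
# then write both numbers to a (hypothetical) report file.
count=$(find "$1" -type f -mtime +60 | wc -l)
bytes=$(find "$1" -type f -mtime +60 -exec stat --format=%s {} + | awk '{ s += $1 } END { print s + 0 }')
printf 'Files deleted: %s\nBytes freed: %s\n' "$count" "$bytes" > deleted_report.txt
find "$1" -type f -mtime +60 -exec rm -f {} +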
References
See the find manpage for more info.
stat --format=%s gives the size in bytes, which we store in an array. See the stat manpage.
Feedback appreciated.

Related

How to delete older files but keep recent ones during backup?

I have a remote server that copies 30-some backup files to a local server every day and I want to remove the old backups if and only if a newer backup successfully copied.
With the different approaches I tried, I managed to erase older files, but I ran into the problem that if one new backup was found, ALL older ones were deleted.
I have something like (picture this with 20 virtual machines):
vm001-2019-08-01.bck
vm001-2019-07-28.bck
vm002-2019-08-01.bck
vm003-2019-07-29.bck
vm004-2019-08-01.bck
vm004-2019-07-31.bck
vm004-2019-07-30.bck
vm004-2019-07-29.bck
...
And I want to erase all of them, keeping only the most recent one for each machine.
i.e.: erase:
vm001-2019-07-28.bck
vm002-2019-07-29.bck
vm004-2019-07-31.bck
vm004-2019-07-30.bck
vm004-2019-07-29.bck
and keep only:
vm001-2019-08-01.bck
vm002-2019-08-01.bck
vm003-2019-07-29.bck
vm004-2019-08-01.bck
The problem I had is that if there is any recent backup of any machine, files like vm003-2019-07-29.bck get deleted because they are older, even though they belong to different machines.
I know there are several variants of this question on the site, but I can't quite get this to work.
I've been trying variants of this code:
#!/bin/bash
for i in ./*.bck
do
    echo "found" "$i"
    if [[ -n $(find "$i" -type f -mmin -1440) ]]
    then
        echo "$i"
        find "$i" -type f -mmin +1440 -exec rm -f "$i" {} +
    fi
done
(The echos are for debugging purposes only)
As it stands, this code finds both the newer and the older files, but doesn't delete anything. If I put find "$i" -type f -mmin +1440 -exec echo "$i" {} +, it never prints anything, as if find "$i" were not finding anything, yet when I run it as a standalone command in the terminal (minus the -exec part), it does.
I've tested this script by generating files with different timestamps using touch -d, but had no success.
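For reference, back-dated test files like the ones described can be created with GNU touch -d; the names below are just examples taken from the listing above:
touch -d "2019-08-01 00:00" vm001-2019-08-01.bck   # "new" backup
touch -d "2019-07-28 00:00" vm001-2019-07-28.bck   # older backup for the same VM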
Unless you add the -name test before the filename, find is going to consider "$i" to be the name of a directory to search in. So your find command should be:
find -name "$i" -type f -mmin -1440
which will search in the current directory. Or
find /path/to/dir -name "$i" -type f -mmin -1440
which will search in a directory named "/path/to/dir".
But, based on BashFAQ/099, I would do this to delete all but the newest file for each VM (untested):
#!/bin/bash
declare -A newest   # associative array storing the name of the newest file for each VM
for f in *
do
    vm=${f%%-*}     # extract the VM name from the filename (e.g. vm001 from vm001-2019-08-01.bck)
    if [[ -f $f && $f -nt ${newest["$vm"]} ]]
    then
        newest["$vm"]=$f
    fi
done
for f in *
do
    vm=${f%%-*}
    if [[ -f $f && $f != "${newest["$vm"]}" ]]
    then
        rm "$f"
    fi
done
This is set up to run against files in the current directory. It assumes that the files are named as shown in the question (the VM name is separated from the rest of the file name by a hyphen). In order to use an associative array, Bash 4 or higher is required.
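For reference, the ${f%%-*} expansion removes everything from the first hyphen onward, which is what groups files per VM; a quick illustration with a hypothetical name:
f=vm001-2019-08-01.bck
echo "${f%%-*}"   # prints: vm001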

Display contents of files which are greater than 0 bytes

I need to write a shell script.
I have a bunch of files in a directory. From there, I need to display the contents of the files which are greater than 0 bytes in size, and delete the files which are 0 bytes in size.
Please help. Thanks in advance.
I found an answer which works fine, but any more input is welcome.
The answer is the following, which I need to use in the shell script:
find . -size 0c -delete
Here's something that will work
#!/bin/bash
for f in ./* ; do
    if [ -f "$f" ] ; then
        if [ -s "$f" ] ; then
            ls "$f"
        else
            rm "$f"
        fi
    fi
done
Note, it just looks at the current directory. You could also pass in a directory as an argument to look in. Also, this won't pick up hidden files (.*).
The key to how it works is the pair of Bash conditional expressions -s (true if the file has a size greater than zero) and -f (true if it is a regular file).
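As a quick illustration of those two tests (notes.txt is a made-up name):
if [ -f notes.txt ] && [ -s notes.txt ]; then
    echo "notes.txt is a regular file with content"
fi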
To display the contents of all files under topdir that have non-zero size:
find topdir -type f \! -size 0c -exec cat {} +
To delete all completely empty files under the same directory:
find topdir -type f -size 0c -ok rm {} \;
Replace -ok with -exec (and change the \; at the end to +) if you don't want to confirm each removal.
This solution assumes a POSIX find.
for i in `ls` ; do
    if [ -s "$i" ] ; then
        cat "$i"
    else
        rm -f "$i"
    fi
done
If you have spaces in your filenames you may need to change the IFS environment variable, or think about using the find command instead.
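A find-based variant of that loop, as a sketch assuming GNU find (for -maxdepth) and restricted to the current directory, which copes with spaces in names:
find . -maxdepth 1 -type f ! -size 0c -exec cat {} \;   # display non-empty files
find . -maxdepth 1 -type f -size 0c -exec rm -f {} \;   # delete empty files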

Check if directory has changed

I am working on a backup script and I've run into a problem. I would like to back up my documents to an FTP server. Because I don't like encfs, I'm trying to do this with 7-Zip and encrypted archives. This works well, but I would like to create a new archive only when a file inside a subdirectory has changed, so that lftp only uploads the changed ones.
My code snippet looks like this:
cd /mnt/HD_a2/documents
for i in */
do 7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
done
How can I change this code so that it only creates a new archive when a file inside "$i" has changed within the last 7 days? (The script is executed by cron every 7 days.)
for i in */
do
    if [ `find "$i" -type f -mtime -7 | wc -l` -gt 0 ]
    then
        7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
    fi
done
So, Barmar's answer is almost right, but it does not count files correctly. I've looked around at other similar questions and it seems to be a common mistake (note that it is not critical for his solution, but it might confuse other programmers who touch it later, because it does not do what most people expect): nobody accounts for the fact that filenames can contain newlines. So here is a slightly better version that gives you the right file count:
for i in */
do
    fileCount=$(find "$i" -type f -mtime -8 -exec printf . \;)
    if (( ${#fileCount} > 0 )); then
        7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
    fi
done
But what if you have thousands of files? That would build a string with as many characters as you have files.
So we can use this instead:
for i in */
do
    fileCount=$(find "$i" -type f -mtime -8 -exec printf . \; | wc -c)
    if (( fileCount > 0 )); then
        7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
    fi
done
Or this:
for i in */
do
    fileCount=$(find "$i" -type f -mtime -8 -exec echo \; | wc -l)
    if (( fileCount > 0 )); then
        7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
    fi
done
This way the useless string data is piped straight into wc instead of being stored in a shell variable.
But we only need to know whether there is AT LEAST one file; we don't have to count all of them! That way we can stop immediately once something is found.
for i in */
do
    fileFound=$(find "$i" -type f -mtime -8 | head -n 1)
    if [[ $fileFound ]]; then
        7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
    fi
done
That's it. This solution will work MUCH faster because it does not have to look for other files and it does not even have to check their mtime. Try running this code without head and you'll notice a significant difference: from several hours to less than a second for my home folder. (I'm not even sure it would ever finish on my PC; I have millions of small files in my home folder...)
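With GNU find you can also let find itself stop at the first match via -print -quit instead of relying on head closing the pipe; a sketch, assuming GNU find is available:
for i in */
do
    # -print -quit prints the first matching file and exits immediately
    if [ -n "$(find "$i" -type f -mtime -8 -print -quit)" ]; then
        7za a -t7z /mnt/HD_a2/encrypted/ul_doc/"${i%/}.7z" -p1234 -mhe "$i"
    fi
done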

Folder/subfolder size then deletion

I am new to the world of Linux, so please accept my apologies for this question.
I have written a script which checks the size of a folder and ALL of its subfolders; if the size is greater than X bytes, then files which were modified more than X days ago are deleted.
I have set the folder size limit to zero for test purposes, so the script should perform the deletion, but it does NOT.
This script does NOT do what I expect and I do not know why.
Thanks for your help.
Script is:
#!/bin/sh
# 100GB SIZE LIMIT
SIZE="0"
# check the current size
CHECK="`du /media/nssvolumes/TEST/MB/`"
if ["$CHECK" -gt "$SIZE"]; then echo "$ACTION"
ACTION="`find /media/nssvolumes/TEST/MB/ -mindepth 0 -maxdepth 3 -mtime +1 -type f -exec rm -f {} \;`"
else exit
fi
Well, here we have a bunch of errors.
Supposing your OS is Linux, try running du from the command line to see what it returns, because it is NOT just a number.
Second problem: [ ] requires spaces around the brackets.
Also, integers do not require quotes.
You're printing $ACTION before its value is defined.
There is no need to call exit if the script has no more code to execute.
So the fixed script is:
#!/bin/sh
# 100GB SIZE LIMIT
SIZE=0
MY_DIR="/media/nssvolumes/TEST/MB/"
# check the current size
CHECK=$(du -bs "$MY_DIR" | awk '{print $1}')
if [ "$CHECK" -gt "$SIZE" ]; then
    echo "ACTION"
    find "$MY_DIR" -mindepth 0 -maxdepth 3 -mtime +1 -type f -exec rm {} \;
fi
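For illustration, plain du prints a size and a path, which is why the awk '{print $1}' step is needed to extract just the number; the output below is made up:
$ du -bs /media/nssvolumes/TEST/MB/
734003200   /media/nssvolumes/TEST/MB/
$ du -bs /media/nssvolumes/TEST/MB/ | awk '{print $1}'
734003200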

Find files older than X days excluding some other files

I'm trying to write a shell script, for Linux and Solaris, that finds some specific files older than X days and then deletes them. The trick is that during this process there are a couple of files that must not be deleted.
For example, from the following list of files I need to delete *.zip and keep *.log and *.something.*:
1.zip
2.zip
3.log
prefix.something.suffix
Finding the files and feeding them to rm was easy, but I'm having difficulties excluding files from the deletion list.
Experimenting around, I discovered that one can use multiple expressions grouped with logical operators, like this:
find -L path -type f \( -name '*.log' \) -a ! \( -name '*.zip' -o -name '*something*' \) -mtime +3
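Applied to the stated goal (delete old *.zip files while keeping *.log and *something*), the same grouping idea can be combined with a delete action; a sketch, using -exec rm rather than -delete for portability to Solaris:
find -L path -type f -name '*.zip' -a ! \( -name '*.log' -o -name '*something*' \) -mtime +3 -exec rm -f {} \;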
Or you could do this:
find /appl/ftp -type f -mtime +30 | grep -vf [exclude_file] | xargs rm -rf
I needed a way to provide a hard-coded list of files to exclude from removal, while removing everything else older than 30 days. Here is a little script that removes all files older than 30 days, except those listed in [exclude_file].
EXCL_FILES=`/bin/cat [exclude_file]`;
RM_FILES=`/usr/bin/find [path] -type f -mtime +30`;
for I in $RM_FILES;
do
    SKIP=0;
    for J in $EXCL_FILES;
    do
        # match the exclude entry against the file name, not the file contents
        echo "$I" | grep -q "$J";
        if [[ $? == 0 ]]; then SKIP=1; fi;
    done;
    if [[ $SKIP == 0 ]]; then
        /bin/rm "$I";
        if [[ $? != 0 ]]; then echo "PROBLEM: Could not remove $I"; exit 1; fi;
    fi;
done;
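For clarity, the [exclude_file] used by both snippets (a placeholder; substitute your own path) is simply a list of patterns, one per line, that grep matches against the file names; a made-up example:
vm-archive.log
prefix.something
Each line is treated as a basic regular expression, so a literal dot will also match any character.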
