Bash script to delete files in a directory if there are more than 5 - linux

This is a backup script that copies files from one directory to another. I use a for loop to check if there are more than five files. If there are, the loop should delete the oldest entries first.
I tried ls -tr | head -n -5 | xargs rm from the command line and it works successfully to delete older files if there are more than 5 in the directory.
However, when I put it into my for loop, I get an error rm: missing operand
Here is the full script. I don't think I am using the for loop correctly in the script, but I'm really not sure how to use the commands ls -tr | head -n -5 | xargs rm in a loop that iterates over the files in the directory.
timestamp=$(date +"%m-%d-%Y")
dest=${HOME}/mybackups
src=${HOME}/safe
fname='bu_'
ffname=${HOME}/mybackups/${fname}${timestamp}.tar.gz
# for loop for deletion of file
for f in ${HOME}/mybackups/*
do
ls -tr | head -n -5 | xargs rm
done
if [ -e $ffname ];
then
echo "The backup for ${timestamp} has failed." | tee ${HOME}/mybackups/Error_${timestamp}
else
tar -vczf ${dest}/${fname}${timestamp}.tar.gz ${src}
fi
Edit: I took out the for loop, so it's now just:
[...]
ffname=${HOME}/mybackups/${fname}${timestamp}.tar.gz
ls -tr | head -n -5 | xargs rm
if [ -e $ffname ];
[...]
The script WILL work if it is in the mybackups directory; however, I continue to get the same error if it is not in that directory. I think the script gets the file names but tries to remove them from the current directory. I tried several modifications but nothing has worked so far.

I get an error rm: missing operand
The cause of that error is that there are no files left to be deleted. To avoid that error, use the --no-run-if-empty option:
ls -tr | head -n -5 | xargs --no-run-if-empty rm
In the comments, mklement0 notes that this issue is peculiar to GNU xargs. BSD xargs does not run the command at all when its input is empty. Consequently, it does not need and does not support the --no-run-if-empty option.
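A quick way to see the difference is to use echo as a harmless stand-in for rm. This is a minimal sketch assuming GNU xargs; the two function names are only illustrative.

```shell
# With empty input, GNU xargs still runs the command once, with no arguments.
run_without_guard() { printf '' | xargs echo rm; }

# With -r (the short form of --no-run-if-empty), the command is skipped.
run_with_guard() { printf '' | xargs -r echo rm; }

run_without_guard   # GNU xargs runs "echo rm" with no file names
run_with_guard      # prints nothing
```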
More
Quoting from a section of code in the question:
for f in ${HOME}/mybackups/*
do
ls -tr | head -n -5 | xargs rm
done
Note that (1) f is never used for anything and (2) this runs the ls -tr | head -n -5 | xargs rm several times in a row when it needs to be run only once.
Obligatory Warning
Your approach parses the output of ls. This makes for a simple and easily understood command. It can work if all your files are sensibly named. It will not work in general. For more on this, see: Why you shouldn't parse the output of ls(1).
Safer Alternative
The following will work with all manner of file names, whether they contain spaces, tabs, newlines, or whatever:
find . -maxdepth 1 -type f -printf '%T@ %i\n' | sort -n | head -n -5 | while read tstamp inode
do
find . -inum "$inode" -delete
done
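A variant of the same idea can work on NUL-delimited paths instead of inode numbers. The following is a sketch, not a drop-in replacement: prune_keep_newest is a hypothetical helper name, and it assumes GNU find plus a GNU coreutils recent enough that sort, tail and cut accept -z.

```shell
#!/usr/bin/env bash
# prune_keep_newest DIR N: delete all but the N newest regular files in DIR.
# Hypothetical helper; assumes GNU find, sort, tail, cut and xargs.
prune_keep_newest() {
    local dir=$1 keep=$2
    find "$dir" -maxdepth 1 -type f -printf '%T@ %p\0' |
        sort -z -rn |                  # newest first, NUL-delimited records
        tail -z -n +"$((keep + 1))" |  # skip the N newest records
        cut -z -d ' ' -f 2- |          # drop the timestamp, keep the path
        xargs -0 -r rm -f --           # -r: do nothing if the list is empty
}
```

A call such as prune_keep_newest "${HOME}/mybackups" 5 would then keep the five newest backups. Because the records are NUL-delimited, file names containing spaces or newlines pass through unchanged.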

SMH. I ended up coming up with the simplest solution in the world by just cd-ing into the directory before I ran ls -tr | head -n -5 | xargs rm. Thanks for everyone's help!
timestamp=$(date +"%m-%d-%Y")
dest=${HOME}/mybackups
src=${HOME}/safe
fname='bu_'
ffname=${HOME}/mybackups/${fname}${timestamp}.tar.gz
cd ${HOME}/mybackups
ls -tr | head -n -5 | xargs rm
if [ -e $ffname ];
then
echo "The backup for ${timestamp} has failed." | tee ${HOME}/mybackups/Error_${timestamp}
else
tar -vczf ${dest}/${fname}${timestamp}.tar.gz ${src}
fi
This line ls -tr | head -n -5 | xargs rm came from here
ls -tr displays all the files, oldest first (-t sorts newest first, -r reverses the order).
head -n -5 displays all but the last 5 lines (i.e. it omits the 5 newest files).
xargs rm passes the selected file names to rm as arguments.
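One way to make the cd step harder to get wrong is to run the pipeline in a subshell that exits if the directory cannot be entered. This is a sketch under stated assumptions: prune_oldest is a hypothetical helper name, GNU head and xargs are available, and file names contain no newlines.

```shell
#!/usr/bin/env bash
# prune_oldest DIR KEEP: delete all but the KEEP newest entries in DIR.
# Hypothetical helper; assumes GNU head/xargs and no newlines in file names.
prune_oldest() {
    local dir=$1 keep=$2
    (
        # The subshell leaves the caller's working directory unchanged, and
        # "|| exit" stops before rm runs if the directory cannot be entered.
        cd "$dir" || exit 1
        ls -tr | head -n -"$keep" | xargs -r -d '\n' rm --
    )
}
```

In the backup script this would be called as prune_oldest "${HOME}/mybackups" 5 in place of the separate cd and ls lines.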

Related

Recursively go to directories that start with *TEST* and preserve only the latest 5 folders

Here is my directory structure.
./TEST1/automation
./TEST2_1/automation
./TEST3.4/automation
./general/automation
I want to preserve only the latest 5 sub-folders under every directory that matches TEST*/automation.
Currently, my script goes into each directory as below and executes the command:
./TEST1/automation
ls -dt */ | tail -n +5 | xargs rm -rf
./TEST2_1/automation
ls -dt */ | tail -n +5 | xargs rm -rf
./TEST3.4/automation
ls -dt */ | tail -n +5 | xargs rm -rf
Every time we add a new folder that starts with TEST, I have to manually update the script.
Basically, go into all directories that match TEST*/automation and preserve only the latest 5 folders.
Try this one:
find . -regex '.*/TEST.*/automation' -print0 | xargs -0 -I {} bash -c 'cd "{}" && ls -t | tail -n +6 | xargs -I % echo rm -rf -- "%"'
If the output looks alright (check that it does indeed show "rm ... " for all files/directories you want to get rid of), remove the echo.
Caveat: the second part of the command (the ls run by the first xargs) does not explicitly look for directories, so it will also list files. From your description it's unclear whether your automation directories contain both files and directories or just directories.
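The same job can also be sketched as a plain loop over the TEST*/automation glob, which picks up new TEST folders automatically. prune_automation_dirs is a hypothetical name; the sketch assumes GNU tail and xargs, directory names without newlines, and it only prints the rm commands so the result can be checked first.

```shell
#!/usr/bin/env bash
# prune_automation_dirs ROOT KEEP: for every ROOT/TEST*/automation directory,
# print an rm command for all but the KEEP newest sub-directories (dry run).
prune_automation_dirs() {
    local root=$1 keep=$2 dir
    for dir in "$root"/TEST*/automation/; do
        [ -d "$dir" ] || continue      # the glob matched nothing
        (
            cd "$dir" || exit 1
            # "*/" matches directories only; -d lists them, not their contents.
            ls -dt -- */ 2>/dev/null |
                tail -n +"$((keep + 1))" |
                xargs -r -d '\n' echo rm -rf --
        )
    done
}
```

Removing the echo turns the dry run into the real deletion. Because the glob ends in a slash, plain files inside the automation directories are ignored.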

Using pipes with find command in linux

I would like to find files in my home directory that start with '~', sort them numerically, print the first five and delete them using find command and pipes in Linux. I have a bash script:
#!/bin/bash
find ~/ -name "~*" | sort -n | head -5 | tee | xargs rm
This works fine for deleting files, but I was expecting the tee command to print the deleted files to standard output. All this command does is delete the files; there is no output in the terminal. What should I add or change?
Thank you.
You could just use the verbose flag on rm and it will tell you what it's deleting
find ~/ -name "~*" | sort -n | head -5 | xargs rm -v
Use man rm to see the docs
-v, --verbose
explain what is being done
You can use rm -v to print each filename as it is deleted:
find ~ -name '~*' -print0 | sort -zn | head -z -n 5 | xargs -0 rm -v
Also note the use of -print0 and the corresponding options in sort, head, and xargs to handle filenames that contain whitespace or glob characters.
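As for why tee appeared to print nothing: with no file argument, tee copies its input only to standard output, which here is the pipe to xargs. A sketch of one workaround is to give tee /dev/stderr as its file, so a copy of the list reaches the terminal. delete_first_five is a hypothetical name, and the sketch assumes GNU xargs and file names without newlines.

```shell
#!/usr/bin/env bash
# delete_first_five DIR: delete the first five files under DIR whose names
# start with "~" (sorted by path) and show the list on stderr.
# Hypothetical helper; assumes GNU xargs and no newlines in file names.
delete_first_five() {
    find "$1" -name '~*' -type f | sort | head -n 5 |
        tee /dev/stderr |              # copy the list to the terminal
        xargs -r -d '\n' rm --
}
```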

bash file cron job has issue when new file arrives at vendor

I have trimmed this post down to just what I think the main problem is.
There is a need to leave the two latest
how to delete all files except the latest three in a folder
ls -t1 /home/jdoe/checks/downloads/*.md5 | head -n +2 | xargs rm -r
This removes the oldest files..
And to test:
ls -t1 /home/jdoe/checks/downloads/*.md5 | head -n +2
We really want to leave the two (2) newest files:
ls -t1 /home/jdoe/checks/downloads/*.md5 | tail -n +2 | xargs rm -r
This does not seem to work..
And to test:
ls -t1 /home/jdoe/checks/downloads/*.md5 | tail -n +2
Thanks!
I was able to get some assistance from one of my co-workers and this appears to be what we need.
ls -1tr /home/jdoe/checks/downloads/*.md5 | head -n -2 | while IFS= read -r f; do
# uncomment the rm once the printed list looks right
#rm -f "$f"
echo "file to delete is $f"
done

Bash script to delete all but N files when sorted alphabetically

It's hard to explain in the title.
I have a bash script that runs daily to backup one folder into a zip file. The zip files are named worldYYYYMMDD.zip with YYYYMMDD being the date of backup. What I want to do is delete all but the 5 most recent backups. Sorting the files alphabetically will list the oldest ones first, so I basically need to delete all but the last 5 files when sorted in alphabetical order.
The following line should do the trick.
ls world*.zip | head -n -5 | xargs -r rm
ls: lists the files alphabetically (the default sort order)
head -n -5: prints all lines except the last 5
xargs -r rm: removes each given file. -r: don't run rm if the input is empty
How about this:
find /your/directory -name 'world*.zip' -mtime +5 | xargs rm
Test it first. This should remove all world*.zip files older than 5 days, so the logic is different from what you asked for.
I can't test it right now because I don't have a Linux machine, but I think it should be:
rm $(ls world*.zip | head -n -5)
ls | grep ".*[\.]zip" | sort | head -n -5 | while read -r file; do rm -- "$file"; done
sort sorts the files
head -n -5 returns all but the last 5 lines, i.e. all but the 5 most recent
the while loop does the deleting
ls world*.zip | sort -r | tail -n +6 | xargs rm
sort -r will sort in reversed order, so the newest will be at the top
tail -n +6 will output lines starting with the 6th, skipping the 5 newest
xargs rm will remove the files. xargs is used to pass stdin as parameters to rm.
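Because the shell expands a glob in sorted order, the same selection can be sketched without ls at all, using a bash array. prune_by_name is a hypothetical helper name, and the sketch assumes bash and date-stamped names such as worldYYYYMMDD.zip that sort oldest first.

```shell
#!/usr/bin/env bash
# prune_by_name KEEP FILE...: delete all but the last KEEP of the given files.
# Hypothetical helper; the caller passes a glob, which bash expands sorted.
prune_by_name() {
    local keep=$1; shift
    local files=("$@")
    local excess=$(( ${#files[@]} - keep ))
    (( excess > 0 )) || return 0       # KEEP or fewer files: nothing to do
    rm -f -- "${files[@]:0:excess}"    # the first "excess" names are the oldest
}
```

Run from the backup directory, prune_by_name 5 world*.zip would keep the five newest archives. File names with spaces are safe here because they are never split into words.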

Shell script to count files, then remove oldest files

I am new to shell scripting, so I need some help here. I have a directory that fills up with backups. If I have more than 10 backup files, I would like to remove the oldest files, so that the 10 newest backup files are the only ones that are left.
So far, I know how to count the files, which seems easy enough, but how do I then remove the oldest files, if the count is over 10?
if [ls /backups | wc -l > 10]
then
echo "More than 10"
fi
Try this:
ls -t | sed -e '1,10d' | xargs -d '\n' rm
This should handle all characters (except newlines) in a file name.
What's going on here?
ls -t lists all files in the current directory in decreasing order of modification time, i.e. the most recently modified files come first, one file name per line.
sed -e '1,10d' deletes the first 10 lines, i.e. the 10 newest files. I use this instead of tail because I can never remember whether I need tail -n +10 or tail -n +11.
xargs -d '\n' rm collects each input line (without the terminating newline) and passes each line as an argument to rm.
As with anything of this sort, please experiment in a safe place.
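For anyone who shares the tail doubt, a small check with seq settles it: tail -n +11 starts printing at line 11, so it skips the same 10 lines that sed '1,10d' deletes. The two function names are only illustrative.

```shell
# Both filters drop the first 10 lines and print the rest.
skip_with_sed()  { sed -e '1,10d'; }
skip_with_tail() { tail -n +11; }     # "+11" means "start at line 11"

seq 1 12 | skip_with_sed     # prints 11 and 12
seq 1 12 | skip_with_tail    # prints 11 and 12
```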
find is the common tool for this kind of task :
find ./my_dir -mtime +10 -type f -delete
EXPLANATIONS
./my_dir your directory (replace with your own)
-mtime +10 older than 10 days
-type f only files
-delete no surprise. Remove it to test your find filter before executing the whole command
And take care that ./my_dir exists to avoid bad surprises !
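The preview step and the existence check can be wrapped together in a small sketch. old_files is a hypothetical helper name and GNU find is assumed; note that this selects files by age, not by count.

```shell
#!/usr/bin/env bash
# old_files DIR DAYS [ACTION]: list (default) or delete regular files under
# DIR that match -mtime +DAYS. Hypothetical helper; assumes GNU find.
old_files() {
    local dir=$1 days=$2 action=${3:--print}
    # Refuse to run when the directory is missing.
    [ -d "$dir" ] || { echo "no such directory: $dir" >&2; return 1; }
    find "$dir" -type f -mtime +"$days" "$action"
}
```

old_files ./my_dir 10 only prints the matches, and old_files ./my_dir 10 -delete removes them once the list looks right.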
Make sure your pwd is the correct directory to delete the files from, then (assuming only regular characters in the filenames):
ls -A1t | tail -n +11 | xargs rm
keeps the newest 10 files. I use this with the camera program 'motion' to keep the most recent frame grab files. Thanks to all the preceding answers, because you showed me how to do it.
The proper way to do this type of thing is with logrotate.
I like the answers from @Dennis Williamson and @Dale Hagglund. (+1 to each)
Here's another way to do it using find (with the -newer test) that is similar to what you started with.
This was done in bash on cygwin...
if [[ $(ls /backups | wc -l) -gt 10 ]]
then
# the 11th newest file and everything older than it are removed
find /backups -type f ! -newer "/backups/$(ls -t /backups | sed '11!d')" -exec rm {} \;
fi
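When comparing a file count in bash, the operator matters: inside [[ ]], > compares strings, while -gt compares integers. A short sketch of the difference, with more_than as an illustrative helper name:

```shell
#!/usr/bin/env bash
# more_than COUNT LIMIT: succeeds when COUNT is numerically greater than LIMIT.
more_than() { [[ $1 -gt $2 ]]; }

# String comparison: "9" sorts after "10", so this branch runs.
if [[ 9 > 10 ]]; then echo "string comparison: 9 sorts after 10"; fi

# Integer comparison behaves as a count check should.
if more_than 9 10; then echo "not printed"; fi
if more_than 11 10; then echo "integer comparison: 11 is greater than 10"; fi
```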
Straightforward file counter:
max=12
n=0
ls -1t *.dat |
while read file; do
n=$((n+1))
if [[ $n -gt $max ]]; then
rm -f "$file"
fi
done
I just found this topic, and the solution from mikecolley helped me with the first step. As I needed a single-line solution for a homematic (raspberrymatic) script, I ran into the problem that this command only gave me the file names and not the whole path, which rm needs. The CUxD Exec command I use cannot start in a selected folder.
So here is my solution:
ls -A1t $(find /media/usb0/backup/ -type f -name 'homematic-raspi*.sbk') | tail -n +11 | xargs rm
Explaining:
find /media/usb0/backup/ -type f -name 'homematic-raspi*.sbk' searches the folder /media/usb0/backup/ for regular files only (-type f) whose names match homematic-raspi*.sbk (-name is case sensitive; use -iname for a case-insensitive match)
ls -A1t $(...) lists the files given by find, sorted by mtime (-t), one per line (-1); -A shows hidden entries but omits "." and ".."
tail -n +11 outputs everything from line 11 onward, i.e. all but the 10 newest files, for the following rm
xargs rm finally removes the remaining files in the list
Maybe this helps others from longer searching and makes the solution more flexible.
stat -c "%Y %n" * | sort -n | head -n -10 | \
cut -d ' ' -f 1 --complement | xargs -d '\n' rm
Breakdown: Get last-modified times for each file (in the format "time filename"), sort them from oldest to newest, keep all but the last ten entries, and then keep all but the first field (keep only the filename portion).
Edit: Using cut instead of awk since the latter is not always available
Edit 2: Now handles filenames with spaces
In a very limited chroot environment, we had only a couple of programs available to achieve what was initially asked. We solved it this way:
MIN_FILES=5
FILE_COUNT=$(ls -l | grep -c ^d )
if [ $MIN_FILES -lt $FILE_COUNT ]; then
while [ $MIN_FILES -lt $FILE_COUNT ]; do
FILE_COUNT=$[$FILE_COUNT-1]
FILE_TO_DEL=$(ls -t | tail -n1)
# be careful with this one
rm -rf "$FILE_TO_DEL"
done
fi
Explanation:
FILE_COUNT=$(ls -l | grep -c ^d ) counts the directories in the current folder (the lines of ls -l output that start with d). Instead of grep -c we could also pipe the matches to wc -l, but wc was not installed on that host.
FILE_COUNT=$[$FILE_COUNT-1] update the current $FILE_COUNT
FILE_TO_DEL=$(ls -t | tail -n1) Save the oldest file name in the $FILE_TO_DEL variable. tail -n1 returns the last element in the list.
Based on others' suggestions and some awk foo, I got this to work. I know this is an old thread, but I didn't find a decent answer here and this sorted it for me. This just deletes the oldest file, but you can change the head -n 1 to head -n 10 to get the oldest 10.
find "$DIR" -type f -printf '%T+ %p\n' | sort | head -n 1 | awk '{sub(/^[^ ]+ /, ""); print}' | xargs -d '\n' rm
Using inode numbers via stat & find command (to avoid pesky-chars-in-file-name issues):
stat -f "%m %i" * | sort -rn -k 1,1 | tail -n +11 | cut -d " " -f 2 | \
xargs -n 1 -I '{}' find "$(pwd)" -type f -inum '{}' -print
#stat -f "%m %i" * | sort -rn -k 1,1 | tail -n +11 | cut -d " " -f 2 | \
# xargs -n 1 -I '{}' find "$(pwd)" -type f -inum '{}' -delete

Resources