Find files older than X days excluding some other files - Linux

I'm trying to write a shell script, for Linux and Solaris, that finds specific files older than X days and then deletes them. The trick is that during this process there are a couple of files that must not be deleted.
For example, from the following list of files I need to delete *.zip and keep *.log and *.something.*:
1.zip
2.zip
3.log
prefix.something.suffix
Finding the files and feeding them to rm was easy, but I'm having difficulties excluding files from the deletion list.

Experimenting around, I discovered one can benefit from multiple complex expressions grouped with logical operators, like this:
find -L path -type f \( -name '*.zip' \) -a ! \( -name '*.log' -o -name '*something*' \) -mtime +3
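With the expression selecting the right files, the deletion can be appended directly; a minimal sketch, assuming a POSIX find (so it also covers the Solaris requirement) and with path still a placeholder:
find -L path -type f -name '*.zip' ! \( -name '*.log' -o -name '*something*' \) -mtime +3 -exec rm -f {} +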

Or you could do this:
find /appl/ftp -type f -mtime +30 | grep -vf [exclude_file] | xargs rm -rf

I needed to find a way to provide a hard-coded list of files to exclude from removal, but remove everything else that was older than 30 days. Here is a little script to remove all files older than 30 days, except files that are listed in [exclude_file].
EXCL_FILES=`/bin/cat [exclude_file]`;
RM_FILES=`/usr/bin/find [path] -type f -mtime +30`;
for I in $RM_FILES;
do
    KEEP=0;
    for J in $EXCL_FILES;
    do
        # Keep the file if its name matches an exclude entry.
        echo "$I" | grep -q "$J" && KEEP=1;
    done;
    if [[ $KEEP == 0 ]]; then
        /bin/rm "$I";
        if [[ $? != 0 ]]; then echo "PROBLEM: Could not remove $I"; exit 1; fi;
    fi;
done;
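If the exclude file holds one name per line, the nested loops collapse into the grep -vf pipeline shown earlier; a minimal sketch, assuming GNU xargs and no newlines in file names (-F treats the exclude entries as fixed strings; drop it if they are regular expressions):
find [path] -type f -mtime +30 | grep -vFf [exclude_file] | xargs -d '\n' rm -f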


BASH: Filter list of files by return value of another command

I have a series of directories with (mostly) video files in them, say
test1
    1.mpg
    2.avi
    3.mpeg
    junk.sh
test2
    123.avi
    432.avi
    432.srt
test3
    asdf.mpg
    qwerty.mpeg
I create a variable (video_dir) with the directory names (based on other parameters) and use that with find to generate the basic list. I then filter for file types based on another variable (video_types), because there are sometimes non-video files in the dirs, by piping through egrep. Then I shuffle the list around and save it out to a file. That file is later used by mplayer to slideshow through the list.
I currently use the following command to accomplish that. I'm sure it's a horrible way to do it, but it works for me and it's quite fast even on big directories.
video_dir="/test1 /test2"
video_types=".mpg$|.avi$|.mpeg$"
find ${video_dir} -type f |
egrep -i "${video_types}" |
shuf > "$TEMP_OUT"
I now would like to add the ability to filter out files based on the resolution height of the video file. I can get that from:
mediainfo --Output='Video;%Height%' filename
Which just returns a number. I have tried using the -exec functionality of find to run that command on each file.
find ${video_dir} -type f -exec mediainfo --Output='Video;%Height%' {} \;
but that just returns the list of heights, not the filenames, and I can't figure out how to reject ones based on a comparison, like <480.
I could do a for/next loop, but that seems like a bad (slow) idea.
Using info from @mark-setchell, I modified it to:
video_dir="test1"
find ${video_dir} -type f \
-exec bash -c 'h=$(mediainfo --Output="Video;%Height%" "$1"); [[ $h -gt 480 ]]' _ {} \; -print
Which works.
You can replace your egrep with the following so you are still inside the find command (-iname is case insensitive and -o represents a logical OR):
find test1 test2 -type f \
\( -iname "*.mpg" -o -iname "*.avi" -o -iname "*.mpeg" \) \
NEXT_BIT
The NEXT_BIT can then -exec bash and exit with status 0 or 1 depending on whether you want the current file included or excluded. So it will look like this:
-exec bash -c 'H=$(mediainfo --Output ... "$1"); [ $H -lt 480 ] && exit 1; exit 0' _ {} \;
So, taking note of @tripleee's advice in comments about superfluous exit statements, I get this:
find test1 test2 -type f \
\( -iname "*.mpg" -o -iname "*.avi" -o -iname "*.mpeg" \) \
-exec bash -c 'h=$(mediainfo ...options... "$1"); [ $h -ge 480 ]' _ {} \; -print
This Q&A was focused on one particular case, so the accepted answer is not as general as it could be.
find
If the list of files comes from find, one can use its filtering facilities, e.g. -exec:
find ${video_dir} -type f \
-exec COMMAND \; \
-print
Here:
COMMAND is not enclosed in quotes -- find reads everything after -exec and up to a \;
find will expand {} to the current file name (including path -- you might find -execdir helpful, which will cd to the file's directory and replace {} with the leaf file name)
The exit code of COMMAND is treated as follows:
0 -> true
non-0 -> false
Note that you can build more complex expressions (e.g. -not -exec ...), which will be evaluated "from left to right, according to the rules of precedence ... -and is assumed where the operator is omitted." (per man find)
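For example, to print the files that fail the height test instead, one could negate the -exec (a sketch reusing the hypothetical mediainfo test from earlier; the -and between the two expressions is implied):
find ${video_dir} -type f \
    -not -exec bash -c 'h=$(mediainfo --Output="Video;%Height%" "$1"); [ "$h" -ge 480 ]' _ {} \; \
    -print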
xargs
If the list of files comes from elsewhere (and is available on stdin), you can use xargs as follows (from If xargs is map, what is filter?):
ls | xargs -I{} bash -c "COMMAND '{}' && echo '{}'"
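If the names may contain spaces or quotes, a null-delimited variant of the same idea is safer (a sketch; COMMAND is still a placeholder):
find . -type f -print0 | xargs -0 -I{} bash -c 'COMMAND "$1" && printf "%s\n" "$1"' _ {}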
Here is my solution.
#!/bin/bash
shopt -s extglob nullglob
video_dir=(/test1 /test2)
while IFS= read -rd '' file; do
    if [[ $file = *.@(mpg|avi|mpeg|mp4) ]]; then
        h=$(mediainfo --Output="Video;%Height%" "$file")
        (( h >= 480 )) && echo "$file"
    fi
done < <(find "${video_dir[@]}" -type f -print0)
With this solution you can process everything inside the while read loop.
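For instance, to keep the question's shuffle step, the whole loop can feed shuf (a sketch reusing $TEMP_OUT from the question):
while IFS= read -rd '' file; do
    if [[ $file = *.@(mpg|avi|mpeg|mp4) ]]; then
        h=$(mediainfo --Output="Video;%Height%" "$file")
        (( h >= 480 )) && echo "$file"
    fi
done < <(find "${video_dir[@]}" -type f -print0) | shuf > "$TEMP_OUT"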

Delete all files older than 30 days, based on file name as date

I'm new to bash. I have a task to delete all files older than 30 days; I can figure this out based on the file's name, Y_M_D.ext, e.g. 2019_04_30.txt.
I know I can list all files with ls in the folder containing the files. I know I can get today's date with $ date and can configure that to match the file format with $ date "+%Y_%m_%d".
I know I can delete files using rm.
How do I tie all this together into a bash script that deletes files older than 30 days from today?
In pseudo-Python code I guess it would look like:
for file in folder:
    if file.name to date > 30 days from now:
        delete file
I am by no means a systems administrator, but you could consider a simple shell script along the lines of:
# Generate the date in the proper format
discriminant=$(date -d "30 days ago" "+%Y_%m_%d")
# Find files based on the filename pattern and test against the date.
find . -maxdepth 1 -type f -name "*_*_*.txt" -printf "%P\n" |
    while IFS= read -r FILE; do
        if [ "${discriminant}" ">" "${FILE%.*}" ]; then
            echo "${FILE}";
        fi
    done
Note that this will probably be considered a "layman" solution by a professional. Maybe this is handled better by awk, which I am unfortunately not accustomed to using.
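For reference, a sketch of the awk variant hinted at above; it relies on the same GNU date call and on the fact that the Y_m_d format compares correctly as a string (the printed names would then be handed to rm):
find . -maxdepth 1 -type f -name "*_*_*.txt" -printf "%P\n" |
    awk -v cutoff="$(date -d '30 days ago' '+%Y_%m_%d')" \
        'substr($0, 1, length(cutoff)) < cutoff'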
Here is another solution to delete log files older than 30 days:
#!/bin/bash
# A table that contains the paths of the directories to clean
rep_log=("/etc/var/log" "/test/nginx/log")
echo "Cleaning logs - $(date)."
# Loop over each path provided by rep_log
for element in "${rep_log[@]}"
do
    # Display the directory
    echo "$element";
    nb_log=$(find "$element" -type f -mtime +30 -name "*.log*" | wc -l)
    if [[ $nb_log != 0 ]]
    then
        find "$element" -type f -mtime +30 -name "*.log*" -delete
        echo "Successful!"
    else
        echo "No log to clean!"
    fi
done
This allows you to include multiple directories in which to delete files:
rep_log=("/etc/var/log" "/test/nginx/log")
We fill the variable: we search (in the directory provided) for files which are older than 30 days and whose name contains .log, then count the number of files:
nb_log=$(find "$element" -type f -mtime +30 -name "*.log*" | wc -l)
We then check whether the result is other than 0 (positive); if yes, we delete:
find "$element" -type f -mtime +30 -name "*.log*" -delete
To delete files older than X days, you can use this command and schedule it in /etc/crontab:
find /PATH/TO/LOG/* -mtime +10 | xargs -d '\n' rm
or
find /PATH/TO/LOG/* -type f -mtime +10 -exec rm -f {} \;
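A sample /etc/crontab entry for the second form could look like this (a sketch; the nightly 02:00 schedule and the root user are assumptions):
0 2 * * * root find /PATH/TO/LOG -type f -mtime +10 -exec rm -f {} \;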

Find all files whose filenames fall in a specific date range on Terminal/Linux

I have a surveillance camera which is capturing images based on my given condition. The images are saved on my Linux machine. The image naming convention is given below:
CAPTURE04.YYYYMMDDHHMMSS.jpg
The directory contains the following files:
CAPTURE04.20171020080501.jpg
CAPTURE04.20171021101309.jpg
CAPTURE04.20171021101913.jpg
CAPTURE04.20171021102517.jpg
CAPTURE04.20171021103422.jpg
CAPTURE04.20171022103909.jpg
CAPTURE04.20171022104512.jpg
CAPTURE04.20171022105604.jpg
CAPTURE04.20171022110101.jpg
CAPTURE04.20171022112513.jpg ... and so on.
However, now I'm actually trying to find a way to get all files within a specific datetime (filename) range by using a terminal command.
Note: Need to follow the filename (YYYYMMDDHHMMSS), not the file created/modified time.
For example, I need to get all files whose file names fall between 2017-10-20 08:30:00 and 2017-10-22 09:30:00.
I've been trying and searching around on Google and got the following command:
find -type f -newermt "2017-10-20 08:30:00" \! -newermt "2017-10-22 09:30:00" -name '*.jpg'
It returns the files which were created/modified in that given date range. But I need to find files based on the given filename range, so I think it does not work for my condition.
I also tried the following command:
find . -maxdepth 1 -size +1c -type f \( -name 'CAPTURE04.20171020083000*.jpg' -o -name 'CAPTURE04.2017102209300*.jpg' \) | sort -n
This is not working.. :(
Please help me to write the actual command. Thanks, in advance.
Complete find + bash solution:
find . -type f -regextype posix-egrep -regex ".*CAPTURE04\.[0-9]{14}\.jpg" -exec bash -c \
'fn=${0##*/}; d=${fn:10:-4};
[[ $d -ge 20171020083000 && $d -le 20171022093000 ]] && echo "$0"' {} \;
fn=${0##*/} - obtaining file basename
d=${fn:10:-4} - extracting datetime section from the file's basename
[[ $d -ge 20171020083000 && $d -le 20171022093000 ]] && echo "$0" - print the filepath only if its datetime "section" is in specified range
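As a quick sanity check: against the sample listing above, the range 20171020083000 to 20171022093000 selects only the 2017-10-21 captures:
./CAPTURE04.20171021101309.jpg
./CAPTURE04.20171021101913.jpg
./CAPTURE04.20171021102517.jpg
./CAPTURE04.20171021103422.jpg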
One way (bash), not an elegant one:
ls CAPTURE04.2017102008{30..59}*.jpg CAPTURE04.2017102009{00..30}*.jpg 2>/dev/null
As the maxdepth option is used, all files are in the current directory, so this can be done in a loop with globs:
for file in CAPTURE04.201710{20..22}*.jpg; do
    if [[ $file > CAPTURE04.20171020083000 && $file < CAPTURE04.20171022093000 ]]; then
        ... do something with "$file"
    fi
done

Find directories that don't contain "*.sql" files

I have several directories like below. I want to list the directories which don't have SQL files, for example "dev.mysite.com" in the example below. I'm using Ubuntu 14.04 LTS.
orange.com/
    orange.com.sql
    10.10.10.1/public_html/...
apple.edu.us/
    apple.edu.sql
    10.10.10.2/public_html/...
dev.mysite.com/
    10.10.10.3/public_html/
example.com/
    mysql_dbdump20150911.sql
    10.10.11.11/public_html/...
I've tried to achieve this using "find" with "cut" and "xargs", moving the directories with SQL files to the "dirwithsql" directory and manually taking the remaining ones as directories without an SQL file.
find . -maxdepth 2 -iname "*.sql" | cut -d'/' -f 2 | xargs -n 1 -I {} mv {} /backup/dirwithsql/{}
I've also tried:
find -maxdepth 2 ! -iname "*.sql" -exec dirname {} \;
But the above command shows all directories.
Is there any better method?
Thank you
I had a similar problem recently; this worked for me:
for f in $(find . -maxdepth 2 -type d); do if [[ $(ls -1 "$f" | grep '\.sql$' | wc -l) == 0 ]]; then echo "$f"; fi; done
find . -type d -print0 | while IFS= read -rd '' n; do [ -f "$n"/*.sql ] || echo "$n"; done
Shall work even with exotic directory names (spaces, newlines...)
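One caveat: the [ -f "$n"/*.sql ] test relies on the glob expanding to at most one name; with several .sql files in a directory, [ receives too many arguments and errors out. A glob-count sketch avoids that (bash, nullglob assumed):
shopt -s nullglob
find . -type d -print0 | while IFS= read -rd '' n; do
    sqls=("$n"/*.sql)                    # empty array when no .sql files exist
    (( ${#sqls[@]} == 0 )) && echo "$n"
done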

Linux: Move files to their child directory in a loop

Can you please suggest an efficient way to move files from one location to their subdirectory in a loop?
Ex:
/MY_PATH/User1/1234/Daily/abc.txt to /MY_PATH/User1/1234/Daily/Archive/abc.txt
/MY_PATH/User2/3456/Daily/def.txt to /MY_PATH/User2/3456/Daily/Archive/def.txt
/MY_PATH/User1/1111/Daily/hij.txt to /MY_PATH/User1/1111/Daily/Archive/hij.txt
/MY_PATH/User2/2222/Daily/def.txt to /MY_PATH/User2/2222/Daily/Archive/def.txt
I started this way, but need your suggestions on the best way to write it:
#!/bin/bash
dir1="/MyPath/"
subs=$(ls $dir1)
for i in $subs; do
    mv $dir1/$i/*/Daily $dir1/$i/*/Daily/Archive
done
My one-line bash:
for dir in $(
    find MY_PATH -mindepth 3 -maxdepth 3 -type d -name Daily
); do
    mkdir -p $dir/Archives
    find $dir -maxdepth 1 -mindepth 1 ! -name Archives \
        -exec mv -t $dir/Archives {} +
done
To quickly test:
mkdir -p MY_PATH/User{1,2,3,4}/{1234,2346,3333,2323}/Daily
touch MY_PATH/User{1,2,3,4}/{1234,2346,3333,2323}/Daily/{abc,bcd,def,feg,fds}.txt
for dir in $( find MY_PATH -mindepth 3 -maxdepth 3 -type d -name Daily );do
mkdir -p $dir/Archives; find $dir -maxdepth 1 -mindepth 1 ! -name Archives \
-exec mv -t $dir/Archives {} + ; done
ls -lR MY_PATH
This seems to match the OP's request.
For a more robust solution, here is one which works with spaces somewhere in the path.
Edited to include @mklement0's well-pointed suggestion.
while IFS= read -r dir; do
    mkdir -p "$dir"/Archives
    find "$dir" -maxdepth 1 -mindepth 1 ! -name Archives \
        -exec mv -t "$dir/Archives" {} +
done < <(
    find MY_PATH -mindepth 3 -maxdepth 3 -type d -name Daily
)
Same demo:
mkdir -p MY_PATH/User{1,2,3,"4 3"}/{1234,"23 6",3333,2323}/Daily
touch MY_PATH/User{1,2,3,"4 3"}/{1234,"23 6",3333,2323}/Daily/{abc,"b c",def,hgz0}.txt
while IFS= read -r dir; do mkdir -p "$dir"/Archives; find "$dir" -maxdepth 1 -mindepth 1 \
    ! -name Archives -exec mv -t "$dir/Archives" {} +; done < <(
    find MY_PATH -mindepth 3 -maxdepth 3 -type d -name Daily )
ls -lR MY_PATH
Assuming the directory structure is as you have shown in your examples, i.e.
MY_PATH/
    subdir-level-1/
        subdir-level-2/
            Daily/
                files
                Archive/
Here's what you can do:
shopt -s nullglob  # defend against globbing failure -- inspired by mklement0's answer
root="/MyPath"
for dir in "${root}"/*/*/Daily/; do
    # Create Archive in case it doesn't exist; to be pedantic you should look at
    # David C. Rankin's answer for error handling, but usually we know what
    # we're doing, so that's not necessary.
    mkdir -p "${dir}/Archive"
    find "${dir}" -maxdepth 1 -type f -print0 | xargs -0 mv -t "${dir}/Archive"
done
The reason I use find and xargs is to save a few processes; you can as well move files in each ${dir} one by one.
Update: @mklement0 suggested that find "${dir}" -maxdepth 1 -type f -print0 | xargs -0 mv -t "${dir}/Archive" can be further improved to
find "${dir}" -maxdepth 1 -type f -exec mv -t "${dir}/Archive" {} +
which is a very good point.
Try the following:
dir1="/MyPath"
for d in "$dir1"/*/*/Daily/; do
    [[ -d $d ]] || break           # break, if no subdirectories match
    for f in "$d"/*; do            # loop over files in */*/Daily/
        [[ -f "$f" ]] || continue  # skip non-files or if nothing matches
        mv "$f" "$d"/Archive/
    done
done
"$dir1"*/*/Daily/ matches all grandchild subdirectories of $dir1; thanks to the terminating /, only directories match; note that, as a result, $d ends in /.
Note that $d therefore ends in /, and, strictly speaking, needs no / later on when synthesizing paths with it (e.g., "$d"/*), but doing so does no harm and helps readability, as #4ae1e1 points out in a comment.
[[ -d $d ]] || break ensures that the loop is exited if no grandchild directories match (by default, a glob (pattern) that has no matches is passed as is to the loop).
for f in "$d"/* loops over all entries (files and/or subdirs.) in $d:
[[ -f "$f" ]] || continue ensures that only files are processed or, in the event that nothing matches, the loop is exited.
mv "$f" "$d"/Archive/ then moves each file to subdir. Archive.
You need to check for, and if not present, create the destination directory before moving the file to Archive. If you cannot create the directory (due to permissions or otherwise), you skip the move. The following does not assume any limitation on depth, but will omit any directory containing Archive as an intermediate subdirectory:
oldifs="$IFS"
IFS=$'\n'
for i in $(find /MY_PATH -type f); do
    [[ "$i" =~ Archive ]] && continue
    [ -d "${i%/*}/Archive" ] || mkdir -p "${i%/*}/Archive"
    [ -d "${i%/*}/Archive" ] || {
        printf "error: unable to create '%s'\n" "${i%/*}/Archive"
        continue
    }
    mv -fv "$i" "${i/Daily/Daily\/Archive}"
done
IFS="$oldifs"
Output when run
$ bash archive_daily.sh
mv -fv /MY_PATH/User1/1111/Daily/hij.txt /MY_PATH/User1/1111/Daily/Archive/hij.txt
mv -fv /MY_PATH/User1/1234/Daily/abc.txt /MY_PATH/User1/1234/Daily/Archive/abc.txt
mv -fv /MY_PATH/User2/3456/Daily/def.txt /MY_PATH/User2/3456/Daily/Archive/def.txt
mv -fv /MY_PATH/User2/2222/Daily/def.txt /MY_PATH/User2/2222/Daily/Archive/def.txt
Note: you can limit/tighten the file selection by adjusting the call to find populating the for loop (e.g. -name or -iname). This simply checks/moves every file to its Archive folder. To limit to only files with the .txt extension, you can specify find /MY_PATH -type f -name "*.txt". To limit to only files in the /MY_PATH/User1 and /MY_PATH/User2 directories with a .txt extension, use find /MY_PATH/User[12] -type f -name "*.txt".
Note 2: when looping over filenames, the paths and filenames should not contain characters that are non-standard for the current locale; in particular, you should not have '\n' as a character in a filename. Setting IFS is required to protect against word splitting on spaces in either the path or the filename.
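A null-delimited rewrite avoids the IFS juggling entirely (a sketch; GNU find assumed, logic otherwise unchanged):
while IFS= read -rd '' i; do
    [[ "$i" =~ Archive ]] && continue
    mkdir -p "${i%/*}/Archive" ||
        { printf "error: unable to create '%s'\n" "${i%/*}/Archive"; continue; }
    mv -fv "$i" "${i/Daily/Daily\/Archive}"
done < <(find /MY_PATH -type f -print0)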
Since you said efficient, anything with a subshell will fail in funny ways with lots of entries. You're better off using xargs:
#!/bin/bash
dir1="/MyPath/"
find "$dir1" -mindepth 3 -maxdepth 3 -type d -name Daily | while IFS= read -r i
do
    pushd .
    cd "$i"
    mkdir -p Archive
    find . -mindepth 1 -maxdepth 1 -type f -print0 | xargs -0 mv -t Archive
    popd
done
The outer find will look for your Daily directories. It's very specific in that they have to be at a certain depth and directories, not regular files. The results get piped into read, where each directory is entered, Archive is created, and files are batch-moved with xargs ... mv. Complete file lists and directory lists are never stored in memory, so it scales very well.
