Locate files that are older than 7 days and have the letter "t" as the third character of the file name - linux

I am trying to figure out how to find all the files that are older than 7 days and contain the letter "t" as the third character (of the filename).
I have only figured out how to find the files that are older than 7 days:
find /home -mtime +7 -print

To restrict to filenames having a "t" in the third position, like "25t.txt" or "data-19.doc", add this clause:
-name "??t*"
to the command. -name looks only at the base name, i.e. the file name with the leading directories removed.
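Putting the two tests together, the combined command can be tried out in a scratch directory first. This is a minimal sketch that assumes GNU find and GNU touch (for `touch -d`); the file names are invented for the demo.

```shell
# Scratch directory with a few test files (names are illustrative only).
demo=$(mktemp -d)
touch -d '10 days ago' "$demo/25t.txt" "$demo/data-19.doc" "$demo/abc.txt"
touch "$demo/22t-new.txt"            # right name, but modified just now

# Older than 7 days AND a "t" as the third character of the base name.
old_t_files=$(find "$demo" -type f -mtime +7 -name '??t*' | sort)
printf '%s\n' "$old_t_files"

rm -rf "$demo"
```

Only 25t.txt and data-19.doc should be listed: abc.txt fails the name test and 22t-new.txt fails the age test.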

You can also restrict the match with a regular expression:
find /home -mtime +7 -regextype posix-extended -regex '.*/[^/]{2}t[^/]*' -print
Explanation of the command:
-regex is matched against the whole path that find prints (for example "./dir/dir1/filename.extension"), not just the file name. The first part of the expression ( .*/ ) therefore consumes the leading directories up to the last "/". Then [^/]{2} matches the first two characters of the file name, "t" is the third character, and [^/]* matches the rest of the name, with or without an extension. Using [^/] rather than "." keeps the match inside the last path component, so a directory such as "abt" does not make every file below it match.
Note: you can substitute "t" with any character you want.
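Because -regex sees the whole path, a pattern in which "." is allowed to match "/" can be satisfied by a directory name instead of the file name. The sketch below compares a loose and a stricter pattern; it assumes GNU find, and the directory and file names are made up.

```shell
demo=$(mktemp -d)
mkdir "$demo/abt" "$demo/abc"
touch "$demo/25t.txt" "$demo/abt/xyz.txt" "$demo/abc/readme.md"

# "." also matches "/", so the directory name "abt" satisfies the pattern.
loose=$(cd "$demo" && find . -regextype posix-extended -type f \
            -regex '.*/.{2}t.*' | LC_ALL=C sort)

# [^/] cannot cross a "/", so only the last path component is examined.
strict=$(cd "$demo" && find . -regextype posix-extended -type f \
            -regex '.*/[^/]{2}t[^/]*' | LC_ALL=C sort)

printf 'loose:\n%s\nstrict:\n%s\n' "$loose" "$strict"
rm -rf "$demo"
```

The loose pattern also reports ./abt/xyz.txt, while the strict one reports only ./25t.txt.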

Related

regex for find cmd to output all files with an 'e' in them but not at start/end of name

I need a find command to output all filenames in a certain directory that contain an 'e' but the 'e' must not be at the start/end of the file name.
$ ls .
emil eva extreme let yes
The find command I am looking for should only output let and yes, not the other names.
Things I have tried so far:
find dir -name "*e*"
find dir -name "^e$"
find dir -name "[a-z]e[a-z]"
Similar questions I cannot figure out:
list all files that:
begin with a,z or y
do not begin with x,z or y
consist of only one char
consist of only two chars
consist of only two OR three chars
Thanks in advance :)
To understand the options used below, read the manual:
man find
I think you can easily solve the others; if not, ask them in a separate question.
Solution for the case with the letter "e":
find . -maxdepth 1 -type f -regextype sed -regex '^\./[^e].*e.*[^e]$'
explanation:
find . looks in the current directory
-maxdepth 1 does not go into subdirectories
-type f only displays files
-regextype sed sets the type of regular expressions to those for sed
-regex '^\./[^e].*e.*[^e]$' the regular expression
^ the beginning of the sequence
\. literal dot
/ literal '/'
[^e] character not being the letter e
.* any number of any characters
e the literal character e
$ end of the sequence
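The same filter, and the "similar questions" from the list above, can probably also be written with plain -name globs instead of a regular expression. The sketch below assumes GNU find; `list_names` is a hypothetical helper and the extra file names (a, ab, abc, zoo) are invented so that every pattern has something to match.

```shell
# Scratch directory; emil, eva, extreme, let, yes come from the question.
demo=$(mktemp -d)
( cd "$demo" && touch emil eva extreme let yes a ab abc zoo )

# Hypothetical helper: run find in the demo directory with the given tests
# and print the matches on one line, sorted bytewise.
list_names() {
    ( cd "$demo" && find . -maxdepth 1 -type f "$@" | LC_ALL=C sort | paste -s -d ' ' - )
}

middle_e=$(list_names -name '[!e]*e*[!e]')      # an "e", but not first or last
starts_azy=$(list_names -name '[azy]*')         # begins with a, z or y
not_xzy=$(list_names ! -name '[xzy]*')          # does not begin with x, z or y
one_char=$(list_names -name '?')                # exactly one character
two_or_three=$(list_names \( -name '??' -o -name '???' \))   # two or three characters

printf '%s\n' "$middle_e" "$starts_azy" "$not_xzy" "$one_char" "$two_or_three"
rm -rf "$demo"
```

In a glob, `[!e]` is the POSIX spelling of "any character except e", `?` matches exactly one character, and `-o` combines two -name tests with a logical OR.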

List files with names that contain alphabetic characters and any other symbols (i.e numbers, punctuation, etc.) and sort them by size

I need help modifying a command I have already written.
This is what I have been able to achieve so far:
find -type f -name '*[:alpha:]*' -exec ls -ltu {} \; | sort -k 5 -n -r
However, this command also finds filenames that consist solely of alphabetic characters, so I need to get rid of them too. I have tried changing the command like this:
find -type f -name '*[:alpha:]*' -and ! -name '[:alpha:]' -exec ls -ltu {} \; | sort -k 5 -n -r
But it does nothing. I understand that something is wrong with my name patterns, but I have no idea how to fix it.
Character classes like [:alpha:] may only be used inside a bracket expression [...], e.g. [0-9_[:alpha:]]. They may not be used alone.
[:alpha:] by itself is a bracket expression equivalent to [ahlp:]; it matches any one of the characters "a", "h", "l", "p" or ":". It does not match alphabetic characters in general.
To find files whose names contain both at least one alphabetic and at least one non-alphabetic character:
find dir -type f -name '*[[:alpha:]]*' -name '*[^[:alpha:]]*'
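The question also asks for the result to be sorted by size. One way to sketch that, assuming GNU find (for -printf), is to print the size in front of each path and sort numerically on it. The file names and sizes below are invented, and `[!...]` is used as the POSIX spelling of the negated bracket expression.

```shell
demo=$(mktemp -d)
printf 'xx'    > "$demo/a1"      # 2 bytes, a letter and a digit
printf 'xxxxx' > "$demo/b-2"     # 5 bytes, letter, punctuation, digit
printf 'xxx'   > "$demo/abc"     # letters only: should be skipped
printf 'x'     > "$demo/123"     # no letters: should be skipped

# Size in bytes, then the path; largest file first.
by_size=$(cd "$demo" && find . -type f \
              -name '*[[:alpha:]]*' -name '*[![:alpha:]]*' \
              -printf '%s %p\n' | sort -k1,1nr)
printf '%s\n' "$by_size"

rm -rf "$demo"
```

Printing the size directly avoids running ls once per file and then sorting on a column of its long listing.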

Separating the time from a date from a string

I have some files that contain in their name the following string: "20171011095942", which is the date and time "2017/10/11 09:59:42".
text_text_20171011095937_155.DAT.gz
text_text_20171011095942_156.DAT.gz
I need to select all files that start at the hour 09 and put them in another folder. If I use the command:
date -d '20171011095942' +'%R'
It says "invalid date". How can I separate the time from that string so I can then select only those files?
Thank you!
With find + mv commands:
find . -type f -regextype posix-egrep -regex ".*_2017101109[0-9]{4}_.*\.gz" -exec mv {} dest_dir/ \;
In the above command change dest_dir to your "another folder".
.*_2017101109[0-9]{4}_.*\.gz - regex pattern to match all filenames containing the needed sequence.
.* - matches any sequence of characters (including none)
_2017101109 - matches the crucial numeric sequence (<year><month><day><hours>)
[0-9]{4}_ - ensures that the above-mentioned sequence is followed by 4 digits, which stand for <minutes><seconds>
\.gz - ensures a file extension to be .gz
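To answer the literal question of separating the time from the string: date rejects the compact 14-digit form, but the stamp can be cut apart with standard text tools and, if needed, rewritten into a form that GNU date accepts. A minimal sketch, assuming GNU date for the `-d` option and the name layout shown in the question:

```shell
name='text_text_20171011095942_156.DAT.gz'

# Pull the 14-digit stamp out of the file name.
stamp=$(printf '%s\n' "$name" | sed -E 's/.*_([0-9]{14})_.*/\1/')

# Characters 9-10 of YYYYMMDDhhmmss are the hour.
hour=$(printf '%s\n' "$stamp" | cut -c9-10)

# Rewrite as "YYYY-MM-DD hh:mm:ss", which date -d can parse.
iso=$(printf '%s\n' "$stamp" |
      sed -E 's/^(....)(..)(..)(..)(..)(..)$/\1-\2-\3 \4:\5:\6/')
clock=$(date -d "$iso" +'%R')

printf 'stamp=%s hour=%s clock=%s\n' "$stamp" "$hour" "$clock"
if [ "$hour" = "09" ]; then
    echo "would move $name"      # replace echo with: mv "$name" dest_dir/
fi
```

For simply selecting files by hour, the string comparison on `$hour` is enough; the call to date is only needed when the time has to be reformatted.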

find -type f also returns non-matching files in matching directory

all files/folders in current directory:
./color_a.txt
./color_b.txt
./color_c.txt
./color/color_d.txt
./color/blue.txt
./color/red.txt
./color/yellow.txt
command used to find all files with the word color in name:
find ./*color* -type f
result:
./color_a.txt
./color_b.txt
./color_c.txt
./color/color_d.txt
./color/blue.txt
./color/red.txt
./color/yellow.txt
expected result:
./color_a.txt
./color_b.txt
./color_c.txt
./color/color_d.txt
The result also includes all the non-matching file names under a matching parent directory.
How could I get ONLY files with names directly matching the color pattern?
Thanks a lot!
What you probably want for filename filtering is a simple -name <glob-pattern> test:
find -name '*color*' -type f
From man find:
-name pattern
Base of file name (the path with the leading directories removed) matches shell
pattern pattern. Because the leading directories are removed, the file names
considered for a match with -name will never include a slash, so `-name a/b' will
never match anything (you probably need to use -path instead).
Just as a side note, when you wrote:
find ./*color* -type f
the shell expanded the (unquoted) glob pattern ./*color*, and what was really executed (what find saw) was this:
find ./color ./color_a.txt ./color_b.txt ./color_c.txt -type f
thus producing a list of files in all of those locations.
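The expansion can be made visible by printing what the shell produces from the unquoted glob and comparing it with a quoted -name pattern. This sketch recreates part of the directory layout from the question in a scratch directory:

```shell
demo=$(mktemp -d)
mkdir "$demo/color"
touch "$demo/color_a.txt" "$demo/color_b.txt" "$demo/color_c.txt" \
      "$demo/color/color_d.txt" "$demo/color/blue.txt"

# What the shell hands to find as starting points when the glob is unquoted.
expanded=$(cd "$demo" && printf '%s\n' ./*color*)

# Quoted pattern: find itself tests the base name of every file it visits.
matched=$(cd "$demo" && find . -type f -name '*color*' | LC_ALL=C sort)

printf 'starting points:\n%s\nmatched by -name:\n%s\n' "$expanded" "$matched"
rm -rf "$demo"
```

The directory ./color appears among the starting points, which is why everything below it was listed; with -name, blue.txt is no longer reported.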
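Alternatively, you can use the -regex option. It is matched against the whole path, so restrict the pattern to the last path component:
find . -regex '.*/[^/]*color[^/]*' -type f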

Delete files that don't match a particular string format

I have a set of files that are named similarly:
TEXT_TEXT_YYYYMMDD
Example file name:
My_House_20170426
I'm trying to delete all files that don't match this format. Every file should have a string of text followed by an underscore, followed by another string of text and another underscore, then a date stamp of YYYYMMDD.
Can someone provide some advice on how to build a find or a remove statement that will delete files that don't match this format?
Using find, add -delete to the end once you're sure it works.
# gnu find
find . -regextype posix-egrep -type f -not -iregex '.*/[a-z]+_[a-z]+_[0-9]{8}'
# OSX find
find -E . -type f -not -iregex '.*/[a-z]+_[a-z]+_[0-9]{8}'
This intentionally matches only alphabetic characters for TEXT. If you need digits as well, use [a-z0-9] in each TEXT part.
grep -v '(pattern)'
will filter out lines that match a pattern, leaving those that don't match. You might try piping in the output of ls. And if you're particularly brave, you could pipe the output to something like xargs rm. But deleting is kinda scary, so maybe save the output to a file first, look at it, then delete the files listed.
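The "look first, delete second" approach can be sketched with find alone, which avoids parsing the output of ls. This assumes GNU find; the review file and the test file names are arbitrary choices for the demo.

```shell
demo=$(mktemp -d)
list=$(mktemp)                       # review file; the name is arbitrary
touch "$demo/My_House_20170426" "$demo/My_House_2017" "$demo/notes.txt"

# Step 1: record what WOULD be deleted and look at it first.
( cd "$demo" && find . -regextype posix-egrep -type f \
      -not -iregex '.*/[a-z]+_[a-z]+_[0-9]{8}' -print ) | LC_ALL=C sort > "$list"
cat "$list"

# Step 2: once the list looks right, rerun the same tests with -delete.
( cd "$demo" && find . -regextype posix-egrep -type f \
      -not -iregex '.*/[a-z]+_[a-z]+_[0-9]{8}' -delete )

candidates=$(cat "$list")
remaining=$(ls "$demo")
printf 'remaining: %s\n' "$remaining"
rm -rf "$demo" "$list"
```

Only My_House_20170426 should survive; My_House_2017 is removed because its date stamp has fewer than eight digits.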
