find command regex with optional subexpression - linux

I have the following file list
file <-
file.2019041543764832 <-
file.2019041643764832 <-
file.2019041243764832
file.2019041143764832
I want to find all the marked files which are prefixed with file and optionally suffixed by the dates 20190415xxxxx or 20190416xxxxx
I have tried the following but it does not yield any output.
find . -regex 'file(\.2019041(5|6)[0-9].*)?' -regextype egrep
I need some help with the correct regex type and the correct syntax to achieve this.

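find . -regextype egrep -regex './file(\.2019041(5|6)[0-9]*)?'
or just, using the default (Emacs) regex syntax of GNU find:
find . -regex './file\(\.2019041[56][0-9]*\)?'
(When using find ., my version of find prefixes the matches with ./, so I added that to the regexp. Note that -regextype only affects the -regex tests that come after it on the command line, and that egrep syntax takes ( ) | without backslashes.)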
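As a quick check of the option ordering, here is a minimal sketch that recreates the file list from the question in a scratch directory and runs the search there. It assumes GNU find (for -regextype); the directory and the variable names demo_dir and matched are made up for illustration.

```shell
# Build a scratch directory holding the file names from the question.
demo_dir=$(mktemp -d)
touch "$demo_dir/file" \
      "$demo_dir/file.2019041543764832" \
      "$demo_dir/file.2019041643764832" \
      "$demo_dir/file.2019041243764832" \
      "$demo_dir/file.2019041143764832"

# -regextype is positional, so it precedes the -regex it should affect.
# posix-egrep syntax: ( ) and ? need no backslashes; \. is a literal dot.
matched=$(cd "$demo_dir" && find . -regextype posix-egrep \
    -regex '\./file(\.2019041[56][0-9]*)?' | LC_ALL=C sort)

# Print the three marked files, one per line.
printf '%s\n' "$matched"

# Clean up the scratch directory.
rm -rf "$demo_dir"
```

Because -regex has to match the whole path, the leading ./ is part of the pattern here as well.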

Related

How to delete files that match a specific pattern in Linux?

I have a set of images like these:
12345-image-1-medium.jpg 12345-image-2-medium.png 12345-image-3-large.jpg
What pattern should I write to select these images and delete them?
I also have these images, which I do not want to select:
12345-image-profile-small.jpg 12345-image-profile-medium.jpg 12345-image-profile-large.png
I tried this regex, but it did not work:
1234-image-[0-9]+-small.*
I think bash does not support regex the way JavaScript, Go, Python, or Java do.
for pic in 12345-image-[0-9]*.{jpg,png}; do rm -- "$pic"; done
For more information on wildcards, take a look here.
As long as none of your filenames contain an embedded '\n' character, the following find and grep will do:
find . -type f | grep '^.*/[[:digit:]]\{1,5\}-image-[[:digit:]]\{1,5\}'
It will find all files below the current directory and match (1 to 5 digits) followed by "-image-" followed by another (1 to 5 digits). In your case with the following files:
$ ls -1
123-image-99999-small.jpg
12345-image-1-medium.jpg
12345-image-2-medium.png
12345-image-3-large.jpg
12345-image-profile-large.png
12345-image-profile-medium.jpg
12345-image-profile-small.jpg
The files you request are matched in addition to 123-image-99999-small.jpg, e.g.
$ find . -type f | grep '^.*/[[:digit:]]\{1,5\}-image-[[:digit:]]\{1,5\}'
./123-image-99999-small.jpg
./12345-image-3-large.jpg
./12345-image-2-medium.png
./12345-image-1-medium.jpg
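You can use the above in a command substitution to remove the files (this additionally assumes the file names contain no whitespace, because the shell splits the substituted output into words), e.g.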
$ rm $(find . -type f | grep '^.*/[[:digit:]]\{1,5\}-image-[[:digit:]]\{1,5\}')
The remaining files are:
$ ls -1
12345-image-profile-large.png
12345-image-profile-medium.jpg
12345-image-profile-small.jpg
If Your find Supports -regextype
If your find supports the -regextype option, which lets you choose the regular expression syntax, you can use -regextype grep for grep syntax and remove the files with the -execdir option, using a pattern similar to the one above, e.g.
$ find . -type f -regextype grep -regex '^.*/[[:digit:]]\+-image-[[:digit:]]\+.*$' -execdir rm '{}' +
I do not know whether this is supported by BSD, Solaris, etc., so check before turning it loose in a script. Also note that [[:digit:]]\+ tests for 1 or more digits and is not limited to 5 digits as shown in your question.
OK, I solved it with this pattern:
12345-image-*[0-9]-*
e.g.:
rm -rf 12345-image-*[0-9]-*
It matches every file name that starts with 12345-image-, followed by anything that ends in a digit, then a - symbol and anything after that.
As I found out, this is globbing in bash, not regex,
and I found this app really useful.
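Before deleting anything with a glob, it can help to expand the pattern with printf and look at the result. The sketch below does that in a scratch directory with the file names from the question. It uses a slightly stricter variant of the pattern, 12345-image-[0-9]*-*, which requires a digit directly after 12345-image-; that variant and the names demo_dir and would_delete are assumptions made for illustration, not the poster's exact command.

```shell
# Scratch directory with the six file names from the question.
demo_dir=$(mktemp -d)
touch "$demo_dir/12345-image-1-medium.jpg" \
      "$demo_dir/12345-image-2-medium.png" \
      "$demo_dir/12345-image-3-large.jpg" \
      "$demo_dir/12345-image-profile-small.jpg" \
      "$demo_dir/12345-image-profile-medium.jpg" \
      "$demo_dir/12345-image-profile-large.png"

# Dry run: let the shell expand the glob and print the names.
# [0-9] is one digit; * is "any string" in a glob, not a regex repetition.
would_delete=$(cd "$demo_dir" && printf '%s\n' 12345-image-[0-9]*-*)
printf '%s\n' "$would_delete"

# Only after checking the list would one run: rm -- 12345-image-[0-9]*-*
rm -rf "$demo_dir"
```

The three numbered images are listed and the three profile images are left out, which is the selection the question asks for.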

Unix search ignore caps

I want to search for a specific file by name in Solaris.
But I don't know whether the file has caps in the name or not, so I want to ignore the caps.
If I use:
find . -name 'word'
it won't find a file that is named WoRd.
I know I have to use -i somehow, but I just can't manage to find the correct syntax.
Thanks.
Please try this one. It should solve your problem:
$ find . -iname 'word'
where the i in -iname makes the match case-insensitive.
Try this; -type f matches regular files only, and you were right to think of the -i flag, which belongs to grep here:
find . -type f -print | grep -i "word"
Also check out this answer that goes deeper: https://unix.stackexchange.com/questions/40766/help-understanding-find-syntax-on-solaris
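If the find at hand turns out not to accept -iname, one fallback that relies only on standard -name pattern matching is to spell out both cases of every letter in a bracket expression. A minimal sketch, with made-up file names and the hypothetical variables demo_dir and found:

```shell
# Scratch directory with one mixed-case match and two non-matches.
demo_dir=$(mktemp -d)
touch "$demo_dir/WoRd" "$demo_dir/sword" "$demo_dir/other.txt"

# Each bracket pair accepts either case of one letter, so the pattern
# matches word, WORD, WoRd, ... but still has to match the whole name.
found=$(cd "$demo_dir" && find . -type f -name '[Ww][Oo][Rr][Dd]' | LC_ALL=C sort)
printf '%s\n' "$found"

rm -rf "$demo_dir"
```

Unlike the grep -i pipeline, this matches the file name only, so sword is not reported.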

How to find all files that have 4 characters in their name

I want to find all files that have 4 characters in their name. I tried this command:
ls [0-9A-Za-z]{4}
and
ls *????
Neither works. Any help?
Try this:
ls [0-9A-Za-z][0-9A-Za-z][0-9A-Za-z][0-9A-Za-z]
or, more simply, as proposed in the comments by @Paul R:
ls ????
You can't mix regex notation with glob notation.
If you want to use regex, you can try find:
find . -type f -regextype posix-egrep -regex './\w{4}'
Note: \w is the same as [0-9A-Za-z_] in regex, and -type f restricts the matches to regular files.
There are two ways:
ls ????
or
echo ????
If you want to omit directories it's another story.
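For the case where directories should be left out, one possible approach is to let find apply the same ???? glob together with -type f. The sketch below assumes a find that supports -maxdepth (GNU and BSD find do; it is not required by POSIX), and the names demo_dir and four_char_files are made up for illustration.

```shell
# Scratch directory: two 4-character files, a 3- and a 5-character file,
# and a 4-character directory that should be skipped.
demo_dir=$(mktemp -d)
touch "$demo_dir/abcd" "$demo_dir/ab12" "$demo_dir/abc" "$demo_dir/abcde"
mkdir "$demo_dir/dir1"

# -maxdepth 1 keeps find from descending; -type f drops the directory;
# -name '????' is a glob for exactly four characters.
four_char_files=$(cd "$demo_dir" && find . -maxdepth 1 -type f -name '????' | LC_ALL=C sort)
printf '%s\n' "$four_char_files"

rm -rf "$demo_dir"
```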

find command search pattern

I have the 4 files below:
a_ROLLBACK2to3__test.sql,
a_1to2__test.sql,
a_2to3__test.sql,
a_2to2__test.sql
I want to write a find command that returns the files a_1to2__test.sql, a_2to3__test.sql and a_2to2__test.sql; the file a_ROLLBACK2to3__test.sql should not be included in the search.
My find command looks like this:
find . -name "*_*to*__*.sql"
but it returns all four files, and I don't want a_ROLLBACK2to3__test.sql.
Basically, files with ROLLBACK after the first _ should not be included.
Can anyone help me to write the search pattern for my requirement?
Thanks
Simply filter the results with grep:
find . -name '*_*to*__*.sql' | grep -v ROLLBACK
Or use the AND clause -a with negation !:
find . -name '*_*to*__*.sql' -a ! -name '*ROLLBACK*'
You could simply look for the underscore followed by a digit:
find . -name '*_[0-9]*to*__*.sql'
or for an underscore not followed by R:
find . -name '*_[!R]*to*__*.sql'
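To see that these variants select the same files, here is a small sketch that recreates the four file names from the question in a scratch directory and compares the negation form with the digit form. The variable names demo_dir, by_negation and by_digit are made up for illustration.

```shell
# Scratch directory with the four file names from the question.
demo_dir=$(mktemp -d)
touch "$demo_dir/a_ROLLBACK2to3__test.sql" "$demo_dir/a_1to2__test.sql" \
      "$demo_dir/a_2to3__test.sql" "$demo_dir/a_2to2__test.sql"

# Variant 1: match the broad pattern, then reject names containing ROLLBACK.
by_negation=$(cd "$demo_dir" && find . -name '*_*to*__*.sql' ! -name '*ROLLBACK*' | LC_ALL=C sort)

# Variant 2: require a digit right after an underscore.
by_digit=$(cd "$demo_dir" && find . -name '*_[0-9]*to*__*.sql' | LC_ALL=C sort)

printf '%s\n' "$by_negation"
[ "$by_negation" = "$by_digit" ] && echo "both variants agree"

rm -rf "$demo_dir"
```

The negation form states the intent most directly; the digit form depends on the version numbers always starting with a digit.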

Find file without spaces in file name

I'm trying to use find to locate all *.cpp files under the current directory that contain no spaces in either the dirname or the basename. I understand I need to use the -wholename flag, but I cannot find an appropriate regex syntax.
Use find with a regex:
find . -type f -regex "[^ ]*\.cpp"
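A minimal sketch of how such a regex behaves, using a made-up directory tree. Since -regex is matched against the whole path, a space in any directory component also rules the file out. The dot before cpp is escaped here so that it only matches a literal '.'; the names demo_dir and no_space are assumptions for illustration.

```shell
# Scratch tree: one clean path, a space in a file name, a space in a
# directory name, and a name that merely ends in "cpp" without a dot.
demo_dir=$(mktemp -d)
mkdir "$demo_dir/src" "$demo_dir/my dir"
touch "$demo_dir/src/main.cpp" \
      "$demo_dir/src/my file.cpp" \
      "$demo_dir/my dir/util.cpp" \
      "$demo_dir/src/notcpp"

# [^ ]* allows any run of non-space characters over the entire path;
# \.cpp requires the literal extension at the end.
no_space=$(cd "$demo_dir" && find . -type f -regex '[^ ]*\.cpp' | LC_ALL=C sort)
printf '%s\n' "$no_space"

rm -rf "$demo_dir"
```

With an unescaped dot, src/notcpp would also match, because '.' would then stand for any character.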
