How to recursively delete all files in a folder that don't match a given pattern - linux

I would like to delete all files in a given folder that don't match the pattern ^transactions_[0-9]+
Let's say I have these files in the folder:
file_list
transactions_010116.csv
transactions_020116.csv
transactions_check_010116.csv
transactions_check_020116.csv
I would like to delete transactions_check_010116.csv and transactions_check_020116.csv and leave the first two as they are, using ^transactions_[0-9]+
I've been trying to use find, with something like the command below, but this expression deletes everything in the folder, not just the files that don't match the pattern:
find /my_file_location -type f ! -regex '^transactions_[0-9]+' -delete
What I'm trying to do here is use a regex to find all files in the folder whose names don't start with transactions_[0-9]+ and delete them.

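Depending on your implementation, you may have to use the -E option (BSD/macOS find) to enable extended regular expressions; GNU find uses -regextype posix-extended instead. Another problem is that -regex is matched against the whole path, which starts with the directory you passed, so a pattern anchored to the bare file name never matches and every file gets deleted. The pattern must also allow for whatever follows the digits (here, the .csv extension).
So the correct command should be:
find -E /my_file_location ! -regex '.*/transactions_[0-9]+[^/]*' -type f -delete
But you should first run the same command with -print in place of -delete to be sure...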
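The preview-then-delete workflow could be tried out safely on a scratch copy first. The following is only a sketch, assuming GNU find (hence -regextype posix-extended rather than -E); the demo_dir variable and the temporary directory are hypothetical stand-ins for /my_file_location.

```shell
# Scratch directory with the four sample files from the question (hypothetical location).
demo_dir=$(mktemp -d)
touch "$demo_dir"/transactions_010116.csv "$demo_dir"/transactions_020116.csv \
      "$demo_dir"/transactions_check_010116.csv "$demo_dir"/transactions_check_020116.csv

# -regex is matched against the whole path, so allow any directory prefix
# and any suffix after the digits.
pattern='.*/transactions_[0-9]+[^/]*'

# Dry run: list what would be removed.
find "$demo_dir" -regextype posix-extended -type f ! -regex "$pattern" -print

# Real run: the same expression with -delete in place of -print.
find "$demo_dir" -regextype posix-extended -type f ! -regex "$pattern" -delete

# Show what is left in the scratch directory.
remaining=$(ls "$demo_dir")
echo "$remaining"
```

Only once the -print output looks right would you point the same expression at the real folder.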

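grep has a -v option that selects everything not matching the provided regex. Because find prints paths such as ./transactions_010116.csv, match on the file-name part of the path, and use -E so that + means "one or more":
find . -type f | grep -Ev '/transactions_[0-9]+[^/]*$' | xargs rm -f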

Related

How to delete files that have a specific pattern in Linux?

I have a set of images like these:
12345-image-1-medium.jpg 12345-image-2-medium.png 12345-image-3-large.jpg
What pattern should I write to select these images and delete them?
I also have these images, which I don't want to select:
12345-image-profile-small.jpg 12345-image-profile-medium.jpg 12345-image-profile-large.png
I have tried this regex, but it did not work:
1234-image-[0-9]+-small.*
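I think bash does not match file names with regexes the way JavaScript, Go, Python or Java would; it uses glob patterns (wildcards) instead. Requiring a digit right after "-image-" keeps the profile images out of the loop:
for pic in 12345-image-[0-9]*.{jpg,png}; do rm -- "$pic"; done
For more information on wildcards, see the Bash manual's section on filename expansion.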
So long as you do NOT have filenames with embedded '\n' characters, the following find and grep will do:
find . -type f | grep '^.*/[[:digit:]]\{1,5\}-image-[[:digit:]]\{1,5\}'
It will find all files below the current directory and match (1 to 5 digits) followed by "-image-" followed by another (1 to 5 digits). In your case with the following files:
$ ls -1
123-image-99999-small.jpg
12345-image-1-medium.jpg
12345-image-2-medium.png
12345-image-3-large.jpg
12345-image-profile-large.png
12345-image-profile-medium.jpg
12345-image-profile-small.jpg
The files you request are matched in addition to 123-image-99999-small.jpg, e.g.
$ find . -type f | grep '^.*/[[:digit:]]\{1,5\}-image-[[:digit:]]\{1,5\}'
./123-image-99999-small.jpg
./12345-image-3-large.jpg
./12345-image-2-medium.png
./12345-image-1-medium.jpg
You can use the above in a command substitution to remove the files, e.g.
$ rm $(find . -type f | grep '^.*/[[:digit:]]\{1,5\}-image-[[:digit:]]\{1,5\}')
The remaining files are:
$ ls -1
12345-image-profile-large.png
12345-image-profile-medium.jpg
12345-image-profile-small.jpg
If Your find Supports -regextype
If your find supports the regextype allowing you to specify which set of regular expression syntax to use, you can use -regextype grep for grep syntax and use something similar to the above to remove the files with the -execdir option, e.g.
$ find . -type f -regextype grep -regex '^.*/[[:digit:]]\+-image-[[:digit:]]\+.*$' -execdir rm '{}' +
I do not know whether this is supported by BSD, Solaris, etc., so check before turning it loose in a script. Also note, [[:digit:]]\+ tests for (1 or more) digits and is not limited to 5 digits as shown in your question.
OK, I solved it with this pattern:
12345-image-*[0-9]-*
e.g.:
rm -rf 12345-image-*[0-9]-*
It matches all file names that start with 12345-image-, followed by anything that ends in a digit, then a - symbol, and anything after that.
As I found out, this is globbing in bash, not regex.
I also found this app really useful.
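A quick way to check what a glob like this selects before deleting anything is to let the shell expand it with printf. This is a sketch on a scratch directory (demo_dir is a hypothetical name); the file names are the samples from the question.

```shell
# Scratch directory holding the sample names from the question (hypothetical location).
demo_dir=$(mktemp -d)
touch "$demo_dir"/12345-image-1-medium.jpg "$demo_dir"/12345-image-2-medium.png \
      "$demo_dir"/12345-image-3-large.jpg "$demo_dir"/12345-image-profile-small.jpg \
      "$demo_dir"/12345-image-profile-medium.jpg "$demo_dir"/12345-image-profile-large.png

# Preview: the shell expands the glob, printf prints one match per line.
printf '%s\n' "$demo_dir"/12345-image-*[0-9]-*

# Delete the matches; -- guards against names that begin with a dash.
rm -- "$demo_dir"/12345-image-*[0-9]-*

# Only the profile images should be left.
left=$(ls "$demo_dir")
echo "$left"
```

Since these are regular files, plain rm is enough here; -rf is not needed.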

Using "grep" to search for specific type of files in all subdirectories

I am trying to find a specific line in files that contains "Mutual_Values_23.0" in a directory that contains a lot of subdirectories. I know this line number is stored in a file which starts with "gnuout_mutual_....txt" (the ellipses part of the file name is the time stamp so that varies).
I wanted to know if there is a way to tell the "grep" command to look into the subdirectories but search only the files starting with "gnuout_mutual_....txt"
I have tried
grep -r "Mutual_Values_23.0" *
but that's taking a long time
You can use the following option of grep:
--include=GLOB
Search only files whose base name matches GLOB (using wildcard matching as described under --exclude).
And for the line number you should use the -n option.
From within the root of the folders you want to look into, you can use this final command:
grep -nr "Mutual_Values_23.0" --include="gnuout_mutual_*txt"
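To see the effect of --include on a small tree, something like the following could be used. It is a sketch assuming GNU grep; the directory layout, the timestamped file names and demo_dir are made up for illustration. -F is added so the dot in the search string is taken literally rather than as a regex wildcard.

```shell
# Hypothetical scratch tree: two run directories with matching and non-matching files.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/run1" "$demo_dir/run2"
printf 'header\nMutual_Values_23.0 = 7\n' > "$demo_dir/run1/gnuout_mutual_20160101.txt"
printf 'Mutual_Values_23.0 = 9\n' > "$demo_dir/run1/other.log"   # right content, wrong name
printf 'nothing here\n' > "$demo_dir/run2/gnuout_mutual_20160102.txt"

# -r recurse, -n show line numbers, -F fixed string, --include limits which files are read.
hits=$(cd "$demo_dir" && grep -rnF --include='gnuout_mutual_*.txt' 'Mutual_Values_23.0' .)
echo "$hits"
```

Files whose names do not match the glob (other.log here) are never opened, which is where the speed-up over a plain grep -r comes from.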
Use find to search all sub-directories for the "gnuout...txt" files containing the search string "Mutual_Values_23.0":
find . -mindepth 1 -name gnuout_mutual_\*.txt -type f -exec grep "Mutual_Values_23.0" {} +
If you make use of bash, you can use the globstar option:
globstar
If set, the pattern ** used in a pathname expansion context will
match all files and zero or more directories and subdirectories.
If the pattern is followed by a /, only directories and
subdirectories match.
So you can use it like:
$ shopt -s globstar
$ grep "search_string" **/glob-pattern
or in the case of the OP:
$ shopt -s globstar
$ grep Mutual_Values_23.0 **/gnuout_mutual_*.txt
GNU grep has the --include GLOB option where GLOB can be used to specify the file name pattern that you need to match.
grep -rn --include 'gnuout_mutual_*txt' 'Mutual_Values_23.0' .
You could use find to search for files and pass results to grep.
find /directory_where_to_search/ -iname 'gnuout_mutual_*.txt' | xargs grep 'Mutual_Values_23.0' -sl
If you instead want to find files whose names (rather than contents) contain the string, use this command:
$ find . -name "*Mutual_Values_23.0*"
Note: Run this command in the directory where you want to search your set of files.
Hope it helps, cheers!

Mass Find/Replace within files having specific filename under command line

I am looking for a quick command to search all .htaccess files for a specific IP address and change it to another IP address from the command line
something like
grep -rl '255.255.254.254' ./ | xargs sed -i 's/254/253/g'
I know the above example is a bad way to do it; it is just an example (and shows I did some searching to find a solution).
Search: files with filename .htaccess (within 2 levels deep of current path?)
Find: 255.255.254.254
Replace with: 255.255.253.253
Or is this too much to ask of my server, and would I be better off replacing them as I find them?
Try:
find . -type f -name '.htaccess' -execdir sed -i 's/255\.255\.254\.254/255.255.253.253/g' {} +
How it works:
find .
Start looking for files in the current directory.
-type f
Look only for regular files.
-name '.htaccess'
Look only for files named .htaccess.
-execdir sed -i 's/255\.255\.254\.254/255.255.253.253/g' {} +
For any such files found, run this sed command on them.
Because . is a wildcard and you likely want to match only literal periods, we escape them: \.
We use -execdir rather than the older -exec because it is more secure against race conditions.
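The question also asked to stay within two levels of the current path, which -maxdepth 2 can express. Below is a sketch on a throwaway tree, assuming GNU find and GNU sed (for -i without a suffix); demo_dir and the file contents are hypothetical. It uses plain -exec to keep the demo simple; -execdir can be swapped in as in the command above.

```shell
# Hypothetical scratch tree with .htaccess files at depth 1, 2 and 4.
demo_dir=$(mktemp -d)
mkdir -p "$demo_dir/site/deep/deeper"
printf 'Allow from 255.255.254.254\n' > "$demo_dir/.htaccess"
printf 'Allow from 255.255.254.254\n' > "$demo_dir/site/.htaccess"
printf 'Allow from 255.255.254.254\n' > "$demo_dir/site/deep/deeper/.htaccess"  # too deep
printf '255.255.254.254\n' > "$demo_dir/site/notes.txt"                         # wrong name

# -maxdepth 2 limits the search to the start directory and one level below it.
find "$demo_dir" -maxdepth 2 -type f -name '.htaccess' \
    -exec sed -i 's/255\.255\.254\.254/255.255.253.253/g' {} +

# Show which files still contain the old address.
grep -rl '255\.255\.254\.254' "$demo_dir"
```

Running the find without the -exec part first shows exactly which files would be edited.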

"find" specific contents [linux]

I would like to go through all the files in the current directory (and its sub-directories) and print the names of only those files that contain certain words.
More detail:
find -type f -name "*hello*" will give me all files that have "hello" in their names. Instead of that, I want to search through the files' contents and, if a file contains "hello", print the name of that file.
Is there a way to approach this?
You can use GNU find and GNU grep as follows:
find /path -type f -exec grep -Hi 'hello' {} +
This is efficient because {} + passes many files to each grep invocation instead of starting one grep per file. If you prefer a pipeline, use xargs with the -r flag, which runs the command only when find actually delivers file names. Note that the -0 flag of xargs expects NUL-separated input, so find must be told to produce it with -print0:
find /path -type f -print0 | xargs -r0 grep -Hi 'hello'
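Since the question asks for file names only, grep's -l flag (list matching files) may be a closer fit than -H. A small sketch with made-up files, assuming GNU find and xargs for -print0, -0 and -r; demo_dir and the file names are hypothetical, and one name contains a space to show that NUL separation keeps it intact.

```shell
# Hypothetical scratch directory: two files mention "hello", one does not.
demo_dir=$(mktemp -d)
printf 'say hello\n' > "$demo_dir/a.txt"
printf 'nothing to see\n' > "$demo_dir/b.txt"
printf 'Hello there\n' > "$demo_dir/c d.txt"   # name with a space, capital H

# -l prints only the names of files with a match, -i ignores case.
names=$(find "$demo_dir" -type f -print0 | xargs -0 -r grep -li 'hello' | sort)
echo "$names"
```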

Linux - Find files that do not contain certain characters

I understand that using something like [^a]* will output all the files that do not start with "a".
If I want to echo files that contain at least 5 characters that do not start with "abc" (but can contain "abc" in the middle of the filename), how should I go about doing so?
I have
echo [^abc]?????*
but the output also removes files like "123abc", which I don't quite understand.
You don't indicate which OS your question applies to, but one way to determine the set of matching files on Mac OS X or Linux would be:
find . -maxdepth 1 -type f -name "?????*" | grep -Ev '^\./abc'
Note that this will list only files in the current directory. If you want to include files in subdirectories, you'll need to remove the maxdepth argument.
Also note that these commands are case-sensitive. You'll need to use -iname and -i to make them case-insensitive.
EDIT:
If you really need to use the echo command, the following will work:
echo $(find . -maxdepth 1 -type f -name "?????*" | grep -Ev '^\./abc')
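The reason [^abc] misbehaves is that it is a character class: it matches one single character that is not a, b or c, so any name whose first letter is a, b or c is rejected regardless of what follows. An alternative that stays inside find is to negate a second -name test. This is a sketch with hypothetical file names in a scratch directory (demo_dir is made up).

```shell
# Hypothetical scratch directory with names of various lengths and prefixes.
demo_dir=$(mktemp -d)
touch "$demo_dir/abcdef" "$demo_dir/123abc" "$demo_dir/abcd" "$demo_dir/xy" "$demo_dir/hello"

# '?????*' requires at least five characters; ! -name 'abc*' drops names starting with "abc".
# The sed step strips the directory part so only the bare file names remain.
matches=$(find "$demo_dir" -maxdepth 1 -type f -name '?????*' ! -name 'abc*' | sed 's|.*/||' | sort)
echo "$matches"
```

Here 123abc survives because "abc" appears only in the middle of the name, while abcdef is dropped and abcd and xy are too short.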
