Recursively find files with a specific extension - linux

I'm trying to find files with specific extensions.
For example, I want to find all .pdf and .jpg files that are named Robert.
I know I can use this command:
$ find . -name '*.h' -o -name '*.cpp'
but I need to specify the name of the file itself in addition to the extensions.
I just want to see if there's a way to avoid writing the file name over and over again.
Thank you!

My preference:
find . \( -name '*.jpg' -o -name '*.png' \) -print | grep Robert

Using find's -regex argument:
find . -regex '.*/Robert\.\(h\|cpp\)$'
Or just using -name:
find . -name 'Robert.*' -a \( -name '*.cpp' -o -name '*.h' \)

find -name "*Robert*" \( -name "*.pdf" -o -name "*.jpg" \)
The -o represents an OR condition, and you can add as many patterns as you wish within the escaped parentheses. So this says to find all files whose names contain the word "Robert" anywhere and that end in either "pdf" or "jpg".
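For instance, to also match .png files (just an illustrative third extension, not from the question), add another -o branch inside the parentheses:
find . -name "*Robert*" \( -name "*.pdf" -o -name "*.jpg" -o -name "*.png" \)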

As an alternative to using find's -regex option, and since the question is tagged bash, you can use the brace expansion mechanism:
eval find . -false "-o -name Robert".{jpg,pdf}
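Brace expansion runs before find does; after the shell expands the braces and eval re-splits the words, find effectively receives (my expansion of the command above):
find . -false -o -name Robert.jpg -o -name Robert.pdf
The leading -false is just a placeholder so that every name test can be prefixed with the same "-o -name" string.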

This Q&A shows how to use find with a regular expression: How to use regex with find command?
The pattern could be something like
'^.*/Robert\.\(h\|cpp\)$'

As a script you can use:
find "${2:-.}" -iregex ".*${1:-Robert}\.\(h\|cpp\)$" -print
Save it as findcc, make it executable with
chmod 755 findcc
and use it as
findcc [name] [search_directory]
e.g.
findcc                # default name 'Robert' and default directory '.'
findcc Joe            # default directory '.'
findcc Joe /somewhere # no defaults
Note that you can't use
findcc /some/where    # i.e. a directory without the name
Also, as an alternative, you can make the script run
find "$1" -print | grep "${@:2}"
and call it as
findcc directory grep_options
e.g.
findcc . -P '/Robert\.(h|cpp)$'
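For reference, the first variant above written out as a complete findcc script might look like this (a minimal sketch with the same defaults):
#!/bin/bash
# findcc [name] [search_directory]
#   name             defaults to 'Robert'
#   search_directory defaults to '.'
find "${2:-.}" -iregex ".*${1:-Robert}\.\(h\|cpp\)$" -print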

Using bash globbing (if find is not a must)
ls Robert.{pdf,jpg}
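Note that plain globbing only matches in the current directory; to make it recursive you can enable bash's globstar option (available in bash 4 and later):
shopt -s globstar        # let ** match files in subdirectories too
ls **/Robert.{pdf,jpg}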

Recursively with ls (add -a to include hidden files and folders):
ftype="jpg"
ls -1R *.${ftype} 2> /dev/null

For finding files across the system using the file-name database:
locate -e --regex "\.(h|cpp)$"
Make sure the locate package (e.g. mlocate) is installed.
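To also restrict the match to the Robert base name from the question (assuming mlocate's --regex, which takes an extended regex):
locate -e --regex "/Robert\.(h|cpp)$"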

Related

Can't find a file by pattern [duplicate]

I am having a hard time getting find to look for matches in the current directory as well as its subdirectories.
When I run find *test.c it only gives me the matches in the current directory. (does not look in subdirectories)
If I try find . -name *test.c I would expect the same results, but instead it gives me only matches that are in a subdirectory. When there are files that should match in the working directory, it gives me: find: paths must precede expression: mytest.c
What does this error mean, and how can I get the matches from both the current directory and its subdirectories?
Try putting it in quotes -- you're running into the shell's wildcard expansion, so what you're actually passing to find will look like:
find . -name bobtest.c cattest.c snowtest.c
...causing the syntax error. So try this instead:
find . -name '*test.c'
Note the single quotes around your file expression -- these will stop the shell (bash) expanding your wildcards.
What's happening is that the shell is expanding "*test.c" into a list of files. Try escaping the asterisk as:
find . -name \*test.c
From find manual:
NON-BUGS
Operator precedence surprises
The command find . -name afile -o -name bfile -print will never print
afile because this is actually equivalent to find . -name afile -o \(
-name bfile -a -print \). Remember that the precedence of -a is
higher than that of -o and when there is no operator specified
between tests, -a is assumed.
“paths must precede expression” error message
$ find . -name *.c -print
find: paths must precede expression
Usage: find [-H] [-L] [-P] [-Olevel] [-D debugopts] [path...] [expression]
This happens because *.c has been expanded by the shell resulting in
find actually receiving a command line like this:
find . -name frcode.c locate.c word_io.c -print
That command is of course not going to work. Instead of doing things
this way, you should enclose the pattern in quotes or escape the
wildcard:
$ find . -name '*.c' -print
$ find . -name \*.c -print
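In other words, to actually print both files in the precedence example above, group the name tests explicitly (my illustration, not part of the quoted manual text):
find . \( -name afile -o -name bfile \) -print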
Try putting it in quotes:
find . -name '*test.c'
I see this question is already answered; I just want to share what worked for me. I was missing a space between \( and -name. So the correct way of choosing files while excluding some of them would be like below:
find . -name 'my-file-*' -type f -not \( -name 'my-file-1.2.0.jar' -or -name 'my-file.jar' \)
I came across this question when I was trying to find multiple filenames that I could not combine into a regular expression as described in @Chris J's answer; here is what worked for me:
find . -name one.pdf -o -name two.txt -o -name anotherone.jpg
-o or -or is logical OR. See Finding Files on Gnu.org for more information.
I was running this on Cygwin.
You can try this:
cat $(file $( find . -readable) | grep ASCII | tr ":" " " | awk '{print $1}')
With that, you can find all readable ASCII files and read them with cat.
If you also want to specify the size and require that the files are not executable:
cat $(file $( find . -readable ! -executable -size 1033c) | grep ASCII | tr ":" " " | awk '{print $1}')
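A rough variant of the same idea without the nested command substitutions (a sketch; it assumes paths without colons or newlines, plus GNU xargs for -d):
find . -readable ! -executable -size 1033c -exec file {} + \
    | awk -F: '/ASCII text/ {print $1}' \
    | xargs -d '\n' cat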
In my case I was missing the trailing / in the path.
find /var/opt/gitlab/backups/ -name '*.tar'

bash: filter the files where NOT to search

I have created a script that searches for the specified keywords in specified directories:
find $directory -type f -name "*.properties" -exec grep -Fi "$keyword" {} \;
The problem I faced is that $directory contains two types of files, config files and sample files (config / sample.config), where sample.config is an example only, so I'm not interested in including it in the search.
The question is how to exclude these 'sample.*' files from my results.
To exclude the sample.config files as asked, add ! -name sample.config to the find command, for example:
find $(<$SRC) -type f -name "*.properties" ! -name sample.config -exec grep -Fi "$keyword" --color {} +
However, *.properties can't match sample.config, so this will not change the result.
One command can probably search for $keyword across all four of your file types while excluding sample.*:
msr -rp dir1,dir2,dirN -f "\.(properties|pl|xml|ini)$" --nf "^sample\." -it "keyword"
Use -PAC or -P -A -C to remove color, line numbers, etc. and get plain results.
Use -l to just list the file paths and show the distribution: count + percentage.
msr.gcc* is a single-executable tool for searching/replacing in files or pipes, from the tools directory of my open project https://github.com/qualiu/msr, with cross-platform and OS-bit versions. Built-in docs: https://qualiu.github.io/msr/usage-by-running/msr-CentOS-7.html. For a vivid demo, a performance comparison with findstr and grep, tests, etc., see the project home.
Using the suggestion from @Nahuel, I've modified it a bit and it started working for me as:
find $(<$SRC) -type f -name "*.properties" ! -name "sample.*" -exec grep -Fi "$keyword" --color {} +

Find all files contained in a directory with a given name

I would like to recursively find all files contained in a directory named “name1” or “name2”,
for instance:
structure/of/dir/name1/file1.a
structure/of/dir/name1/file2.b
structure/of/dir/name1/file3.c
structure/of/dir/name1/subfolder/file1s.a
structure/of/dir/name1/subfolder/file2s.b
structure/of/dir/name2/file1.a
structure/of/dir/name2/file2.b
structure/of/dir/name2/file3.c
structure/of/dir/name2/subfolder/file1s.a
structure/of/dir/name2/subfolder/file2s.b
structure/of/dir/name3/name1.a ←this should not show up in the result
structure/of/dir/name3/name2.a ←this should not show up in the result
so when I start my magic command the expected output should be this and only this:
structure/of/dir/name1/file1.a
structure/of/dir/name1/file2.b
structure/of/dir/name1/file3.c
structure/of/dir/name2/file1.a
structure/of/dir/name2/file2.b
structure/of/dir/name2/file3.c
I scripted something, but it does not work because it matches the names anywhere in the path, not only in the folder names:
for entry in $(find $SEARCH_DIR -type f | grep 'name1\|name2'); do
    echo "FileName: $(basename $entry)"
done
If you can use the -regex option, avoiding subfolders with [^/]:
~$ find . -type f -regex ".*name1/[^/]*" -o -regex ".*name2/[^/]*"
./structure/of/dir/name2/file1.a
./structure/of/dir/name2/file3.c
./structure/of/dir/name2/subfolder
./structure/of/dir/name2/file2.b
./structure/of/dir/name1/file1.a
./structure/of/dir/name1/file3.c
./structure/of/dir/name1/file2.b
I'd use -path and -prune for this, since it's standard (unlike -regex which is GNU specific).
find . \( -path "*/name1/*" -o -path "*/name2/*" \) -prune -type f -print
But more importantly, never do for file in $(find ...). Use find's -exec or a while read loop instead, depending on what you really need to do with the matching files. See UsingFind and BashFAQ 20 for more on how to handle find safely.
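For example, a while read loop over NUL-delimited output (a sketch based on the -path command above) would look like:
find . \( -path "*/name1/*" -o -path "*/name2/*" \) -prune -type f -print0 |
    while IFS= read -r -d '' entry; do
        echo "FileName: $(basename "$entry")"
    done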

Exclude list of files from find

If I have a list of filenames in a text file that I want to exclude when I run find, how can I do that? For example, I want to do something like:
find /dir -name "*.gz" -exclude_from skip_files
and get all the .gz files in /dir except for the files listed in skip_files. But find has no -exclude_from flag. How can I skip all the files in skip_files?
I don't think find has an option like this; you could build a command using printf and your exclude list:
find /dir -name "*.gz" $(printf "! -name %s " $(cat skip_files))
Which is the same as doing:
find /dir -name "*.gz" ! -name first_skip ! -name second_skip .... etc
Alternatively you can pipe from find into grep:
find /dir -name "*.gz" | grep -vFf skip_files
This is what I usually do to remove some files from the result (in this case I looked for all text files but wasn't interested in a bunch of Valgrind memcheck reports we have here and there):
find . -type f -name '*.txt' ! -name '*mem*.txt'
It seems to be working.
I think you can try something like:
find /dir \( -name "*.gz" ! -name skip_file1 ! -name skip_file2 ...so on \)
find /var/www/test/ -type f \( -iname "*.*" ! -iname "*.php" ! -iname "*.jpg" ! -iname "*.png" \)
The above command gives a list of all files, excluding files with .php, .jpg and .png extensions. This command works for me in PuTTY.
Josh Jolly's grep solution works, but has O(N**2) complexity, making it too slow for long lists. If the lists are sorted first (O(N*log(N)) complexity), you can use comm, which has O(N) complexity:
find /dir -name '*.gz' |sort >everything_sorted
sort skip_files >skip_files_sorted
comm -23 everything_sorted skip_files_sorted | xargs . . . etc
See your system's man comm for details.
This solution will go through all files (not exactly excluding from the find command), but will produce an output skipping files from a list of exclusions.
I found that useful while running a time-consuming command (find /dir -exec md5sum {} \;).
You can create a shell script to handle the skipping logic and run commands on the files found (make it executable with chmod, replace echo with other commands):
$ cat skip_file.sh
#!/bin/bash
found=$(grep "^$1$" files_to_skip.txt)
if [ -z "$found" ]; then
    # run your command
    echo "$1"
fi
Create a file named files_to_skip.txt with the list of files to skip (in the directory you are running from).
Then use find using it:
find /dir -name "*.gz" -exec ./skip_file.sh {} \;
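Note that find passes each file's full path (e.g. /dir/sub/file.gz) as $1, so files_to_skip.txt has to list the paths in that same form. A hypothetical example of its contents:
/dir/sub/old-backup.gz
/dir/logs/trace.gz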
This should work:
find * -name "*.gz" $(printf "! -path %s " $(<skip_files.txt))
Working out
Assuming skip_files has a filename on each line, you can get the list of filenames via $(<skip_files.txt). E.g. echo $(<skip_files.txt) should print them all out.
For each filename you want to have a ! -path filename expression. To build this, use $(printf "! -path %s " $(<skip_files.txt))
Then, put it together with a filter on -name "*.gz"
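For instance, if skip_files.txt contained the two (hypothetical) entries a/first.gz and b/second.gz, the command would expand to:
find * -name "*.gz" ! -path a/first.gz ! -path b/second.gz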

How do I list the files containing particular strings from a group of directories in bash?

I want to list files containing EXACT strings like "hello", "how" and "todo" from a directory (which has multiple subdirectories). Also, I want to list only C (.c) and C++ (.cpp) files.
I have tried grep -R (grep -R "hello" /home) but am not satisfied. Please help me enhance my grep -R command or suggest an alternative. Thanks in advance.
If you want to find files, a good start is usually to use find.
If you want to find all .cpp and .c files that contain the strings "hello", "how" or "todo" in their content, use something like:
find /home \( -name "*.c" -or -name "*.cpp" \) \
-exec egrep -l "(hello|how|todo)" \{\} \;
If instead you want to find all .cpp and .c files that contain the strings "hello", "how" or "todo" in their filenames, use something like:
find /home \
\( \( -name "*.c" -or -name "*.cpp" \) \
-and \
\( -name "*hello*" -or -name "*how*" -or -name "*todo*" \) \
\)
There is a bit of quoting (using \) involved, as (), {} and ; are considered special characters by the shell.
In fact grep itself would be fine for this.
However, I would strongly suggest ack-grep. It is a good alternative to grep that just suits your need.
With ack-grep it is just as simple as
ack-grep --cc --cpp "(hello|how|todo)"
You can try the following:
grep -rn --include={*.c,*.cpp} yourdirectory -e ^h[a-z]*
This will search through all the files that have .c and .cpp extensions in the specified directory and find patterns starting with h (adapt the pattern to meet your own needs).
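Since the question asks for the EXACT strings "hello", "how" and "todo", one way to adapt the pattern is whole-word, extended-regex matching (a sketch, assuming "exact" means whole-word matches):
grep -rnw --include={*.c,*.cpp} -E 'hello|how|todo' yourdirectory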
