bash: filter the files where NOT to search - linux

I have created a script that searches for the specified keywords in specified directories:
find "$directory" -type f -name "*.properties" -exec grep -Fi "$keyword" {} +
The problem I faced is that $directory contains two types of files, config files and sample files (e.g. config and sample.config), where sample.config is only an example, so I don't want to include it in the search.
The question is: how do I exclude these 'sample.*' files from my results?

To exclude the sample.config files as the question asks, add ! -name sample.config to the find command, for example:
find $(<$SRC) -type f -name "*.properties" ! -name sample.config -exec grep -Fi "$keyword" --color {} +
However, *.properties can never match sample.config, so this will not change the result.

Probably one command can search for $keyword across all four of your file types while excluding sample.*:
msr -rp dir1,dir2,dirN -f "\.(properties|pl|xml|ini)$" --nf "^sample\." -it "keyword"
Use -PAC (or -P -A -C) to remove color, line numbers, etc. and get plain results.
Use -l to list just the file paths and show their distribution: count + percentage.
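For example, a sketch combining the flags just described (assuming the same directories, file types and keyword as above; not verified against the tool):
msr -rp dir1,dir2,dirN -f "\.(properties|pl|xml|ini)$" --nf "^sample\." -it "keyword" -PAC
msr -rp dir1,dir2,dirN -f "\.(properties|pl|xml|ini)$" --nf "^sample\." -it "keyword" -l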
msr.gcc* is a single executable to search/replace files or pipes, from the tools directory of my open project https://github.com/qualiu/msr, with builds for different platforms and architectures. Built-in docs: https://qualiu.github.io/msr/usage-by-running/msr-CentOS-7.html. For a vivid demo, a performance comparison with findstr and grep, tests, etc., see the project home page.

Using @Nahuel's suggestion, I've modified it a bit and it started working for me as:
find $(<$SRC) -type f -name "*.properties" ! -name "sample.*" -exec grep -Fi "$keyword" --color {} +

Related

How can I update the contents of a file by replacing strings using the grep and find commands?

I am finding XML files under a particular subdirectory that contain the word "responsible". Searching works fine, as shown below.
find . -name '*.xml' -exec grep -H 'responsible' {} \;
./dir1/d1.xml<responsible><></responsible>
./dir2/d2.xml<responsible><SYSTEM></responsible>
./dir3/d3.xml<responsible><SYSTEM></responsible>
... and so on.
Is there a way I can replace all occurrences of SYSTEM with a blank one?
The result I am looking for is:
./dir1/d1.xml<responsible><></responsible>
./dir2/d2.xml<responsible><></responsible>
./dir3/d3.xml<responsible><></responsible>
I would use a perl "pie" (perl -p -i -e) one-liner:
perl -p -i -e 's/<responsible><SYSTEM><\/responsible>/<responsible><><\/responsible>/' $(find ./ -name '*.xml')
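If the file list is long or contains spaces, a variant (same substitution, only the way the file names are fed to perl changes) is to let find call perl directly:
find ./ -name '*.xml' -exec perl -p -i -e 's/<responsible><SYSTEM><\/responsible>/<responsible><><\/responsible>/' {} +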

bash script: parameter to find all files beginning with a specified string

I am trying to make the code find all directories that start with the same letters; this is the code so far. I have two directories, lit and lite, and I should be able to see both when I search for lit.
for i in "$@"
do
    echo "the directory $(pwd)/$i was modified on $(date -d "$(stat -c '%y' "$i")" '+%d %m %H:%M')"
done
The find command can take the type of file you're looking for, and also perform a search for a given name.
find . -type d -name "lit*" -exec ls -ld {} \;
Here we set -type d for directories and -name "<search>*" for the name of the files to search for.
You can then execute a command for each result using the -exec parameter.
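If you also want the modification date the original script was trying to print, one possible sketch (assuming GNU stat; "lit*" is just the example prefix) is:
find . -maxdepth 1 -type d -name "lit*" -exec stat -c '%n was last modified on %y' {} \;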

How do I find the number of all .txt files in a directory and all its subdirectories, using specifically the find command and the wc command?

So far I have this:
find -name "*.txt"
I'm not quite sure how to use wc to find out the exact number of files. When using the command above, all the .txt files show up, but I need the exact number of files with the .txt extension. Please don't suggest using other commands as I'd like to specifically use find and wc. Thanks
Try:
find . -name '*.txt' | wc -l
The -l option to wc tells it to return just the number of lines.
Improvement (requires GNU find)
The above will give the wrong number if any .txt file name contains a newline character. This will work correctly with any file names:
find . -iname '*.txt' -printf '1\n' | wc -l
-printf '1\n' tells find to print just a line containing 1 for each file name found. This avoids problems with file names containing difficult characters.
Example
Let's create two .txt files, one with a newline in its name (creating the directories first so the touch succeeds):
$ mkdir -p dir1/dir2
$ touch dir1/dir2/a.txt $'dir1/dir2/b\nc.txt'
Now, let's run the find command:
$ find . -name '*.txt'
./dir1/dir2/b
c.txt
./dir1/dir2/a.txt
To count the files:
$ find . -name '*.txt' | wc -l
3
As you can see, the answer is off by one. The improved version, however, works correctly:
$ find . -iname '*.txt' -printf '1\n' | wc -l
2
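If your find has no -printf (it is a GNU extension), a similar effect can be sketched (an assumption on my part, not from the answer above) by letting find run the external printf utility once per file, at the cost of one process per file:
find . -iname '*.txt' -exec printf '1\n' \; | wc -l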
find -type f -name "*.h" -mtime +10 -print | wc -l
This worked out.

Recursively find files with a specific extension

I'm trying to find files with specific extensions.
For example, I want to find all .pdf and .jpg files that are named Robert.
I know I can do this command
$ find . -name '*.h' -o -name '*.cpp'
but I need to specify the name of the file itself besides the extensions.
I just want to see if there's a way to avoid writing the file name over and over again.
Thank you!
My preference:
find . \( -name '*.jpg' -o -name '*.png' \) -print | grep Robert
Using find's -regex argument:
find . -regex '.*/Robert\.\(h\|cpp\)$'
Or just using -name:
find . -name 'Robert.*' -a \( -name '*.cpp' -o -name '*.h' \)
find -name "*Robert*" \( -name "*.pdf" -o -name "*.jpg" \)
The -o represents an OR condition and you can add as many as you wish within the parentheses. So this says to find all files containing the word "Robert" anywhere in their names and whose names end in either "pdf" or "jpg".
As an alternative to using find's -regex option, since the question is tagged bash, you can use the brace expansion mechanism:
eval find . -false "-o -name Robert".{jpg,pdf}
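To see what the brace expansion produces before find runs, you can echo it; the -false term is there only so the leading -o stays valid:
$ echo find . -false "-o -name Robert".{jpg,pdf}
find . -false -o -name Robert.jpg -o -name Robert.pdf
The eval is needed so the shell re-splits each expanded word into separate arguments for find.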
This Q&A shows how to use find with regular expressions: How to use regex with find command?
The pattern could be something like
'.*/Robert\.\(h\|cpp\)$'
As a script you can use:
find "${2:-.}" -iregex ".*${1:-Robert}\.\(h\|cpp\)$" -print
save it as findcc
chmod 755 findcc
and use it as
findcc [name] [[search_direcory]]
e.g.
findcc # default name 'Robert' and directory .
findcc Joe # default directory '.'
findcc Joe /somewhere # no defaults
Note that you can't use
findcc /some/where  # e.g. without the name...
Also, as an alternative, you can use
find "$1" -print | grep "${@:2}"
and
findcc directory grep_options
like
findcc . -P '/Robert\.(h|cpp)$'
Using bash globbing (if find is not a must)
ls Robert.{pdf,jpg}
Recursively with ls (-a to include hidden folders):
ftype="jpg"
ls -1R *.${ftype} 2> /dev/null
For finding the files on the system using the file database:
locate -e --regex "\.(h|cpp)$"
Make sure the locate package (i.e. mlocate) is installed.

Exclude list of files from find

If I have a list of filenames in a text file that I want to exclude when I run find, how can I do that? For example, I want to do something like:
find /dir -name "*.gz" -exclude_from skip_files
and get all the .gz files in /dir except for the files listed in skip_files. But find has no -exclude_from flag. How can I skip all the files in skip_files?
I don't think find has an option like this; you could build a command using printf and your exclude list:
find /dir -name "*.gz" $(printf "! -name %s " $(cat skip_files))
Which is the same as doing:
find /dir -name "*.gz" ! -name first_skip ! -name second_skip .... etc
Alternatively you can pipe from find into grep:
find /dir -name "*.gz" | grep -vFf skip_files
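Note that -F matches substrings, so an entry in skip_files will also exclude any path that merely contains it; if the list holds the full paths exactly as find prints them, adding -x (whole-line matching, an extra precaution not in the original answer) makes the exclusion exact:
find /dir -name "*.gz" | grep -vxFf skip_files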
This is what I usually do to remove some files from the result (in this case I looked for all text files but wasn't interested in a bunch of valgrind memcheck reports we have here and there):
find . -type f -name '*.txt' ! -name '*mem*.txt'
It seems to be working.
I think you can try something like:
find /dir \( -name "*.gz" ! -name skip_file1 ! -name skip_file2 ...so on \)
find /var/www/test/ -type f \( -iname "*.*" ! -iname "*.php" ! -iname "*.jpg" ! -iname "*.png" \)
The above command gives a list of all files, excluding files with the .php, .jpg and .png extensions. This command works for me in PuTTY.
Josh Jolly's grep solution works, but has O(N**2) complexity, making it too slow for long lists. If the lists are sorted first (O(N*log(N)) complexity), you can use comm, which has O(N) complexity:
find /dir -name '*.gz' |sort >everything_sorted
sort skip_files >skip_files_sorted
comm -23 everything_sorted skip_files_sorted | xargs . . . etc
See man comm on your system for details.
This solution will go through all files (not exactly excluding from the find command), but will produce an output skipping files from a list of exclusions.
I found that useful while running a time-consuming command (find /dir -exec md5sum {} \;).
You can create a shell script to handle the skipping logic and run commands on the files found (make it executable with chmod, replace echo with other commands):
$ cat skip_file.sh
#!/bin/bash
found=$(grep "^$1$" files_to_skip.txt)
if [ -z "$found" ]; then
    # run your command
    echo "$1"
fi
Create a file with the list of files to skip named files_to_skip.txt (on the dir you are running from).
Then use find using it:
find /dir -name "*.gz" -exec ./skip_file.sh {} \;
This should work:
find * -name "*.gz" $(printf "! -path %s " $(<skip_files.txt))
How it works
Assuming skip_files has a filename on each line, you can get the list of filenames via $(<skip_files.txt). E.g. echo $(<skip_files.txt) should print them all out.
For each filename you want to have a ! -path filename expression. To build this, use $(printf "! -path %s " $(<skip_files.txt))
Then, put it together with a filter on -name "*.gz"
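As a hedged worked example (hypothetical file names): if skip_files.txt contains the two lines a/old.gz and b/old.gz, the command above expands to
find * -name "*.gz" ! -path a/old.gz ! -path b/old.gz
so those two paths are dropped while every other .gz file is still listed. Note that ! -path compares against the path exactly as find constructs it, so the entries must be written in that form.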
