grep and find: search only in files with a .txt extension

find . | xargs grep "Book" -sl | xargs grep -inw "Stars" -sl
This returns the names of the files that contain the words "Book" and "Stars".
How do you modify this so that it searches only *.txt files?
Also, is there a way to specify the search directory?

Without using too many options (such as -iname, -inw, and so on):
locate '*.txt' | xargs grep -rl Book | xargs grep -rl Stars
That is all.
UPDATE:
If you want to specify the directory, take this example:
Directory = /home/corbett/Documents
Thus:
locate '/home/corbett/Documents/*.txt' | xargs grep -rl Book | xargs grep -rl Stars
If the search should not be case sensitive, use the -i option of grep:
locate '/home/corbett/Documents/*.txt' | xargs grep -i -rl Book | xargs grep -i -rl Stars
locate vs. find:
find walks the file system at the time it runs, so it also sees newly created files.
locate is faster because it looks paths up in a prebuilt database, but files created since the last database update (updatedb) are missing from its results.

Just use the -iname parameter. It makes find search all files with extension ".txt" (case-insensitive). Also add the -type f parameter to ensure you are only searching files (this should allow you to remove the -s "suppress error" parameter passed to grep).
find . -iname "*.txt" -type f | xargs grep "Book" -sl | xargs grep -inw "Stars" -sl
You can also use -exec parameter instead.
find . -iname "*.txt" -type f -exec grep "Book" -lZ {} \; | xargs -0 grep -inw "Stars" -l
The -Z parameter for grep and the -0 parameter for xargs ensure that filenames piped from one to the other work even if they have spaces in them.
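As a rough illustration of why the null-terminated variant matters, here is a small sketch run against a scratch directory. It assumes GNU grep (for -Z) and an xargs that supports -0; the directory and file names are made up for the demonstration.

```shell
# Scratch data: only "my notes.txt" is a .txt file with both words.
demo=$(mktemp -d)
printf 'Book of Stars\n' > "$demo/my notes.txt"
printf 'Book only\n'     > "$demo/other.txt"
printf 'Book of Stars\n' > "$demo/skipped.log"

# grep -lZ ends each file name with a NUL byte and xargs -0 splits on NUL,
# so the space in "my notes.txt" does not break the pipeline.
both=$(cd "$demo" && find . -iname '*.txt' -type f -exec grep -lZ 'Book' {} \; \
  | xargs -0 grep -liw 'Stars')
printf '%s\n' "$both"
```

The .log file is skipped by -iname, and other.txt drops out in the second grep because it lacks "Stars".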

Related

How do I change a string in all files with the same name (e.g. data.txt) in all subdirectories on Linux using the terminal?

find . -name "data.txt" -print0 | grep -rl "pa028" ./ |xargs -0 sed -i '' -e 's/pa028/pa014/g'
I tried to replace pa028 with pa014 in the file named "data.txt" in all subdirectories. Can you please correct me?
You can't put grep between find -print0 and xargs -0 because grep operates on lines, and this pipeline contains null-separated text instead of lines. Additionally, grep -rl "pa028" ./ searches the directory named on its command line, so it will ignore the standard input you so expensively set up find to produce.
find . -name "data.txt" -exec grep -q "pa028" {} \; -print0 |
xargs -r -0 sed -i '' -e 's/pa028/pa014/g'
The logic here is to use -exec grep -q as a predicate to find so we produce a null-terminated list of matching files (for which the -exec returns true) to pass to xargs -r -0. (The -r option is important, too; you get weird errors if xargs runs anyway even though find produced no output.)
There is an extension to GNU grep to operate on null-terminated strings with -z and print null-terminated file names with -Z -l but that's a fairly recent development, so I'm not yet prepared to recommend that.
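A minimal sketch of the predicate approach on throwaway data. The directory layout is invented, and sed -i.bak is swapped in for sed -i '' because the suffix form is accepted by both GNU and BSD sed (it leaves a .bak backup behind). xargs -r is assumed to be available, as in GNU xargs.

```shell
# Scratch tree with two data.txt files; only one contains pa028.
demo=$(mktemp -d)
mkdir -p "$demo/a dir" "$demo/b"
printf 'id=pa028\n' > "$demo/a dir/data.txt"
printf 'id=zz999\n' > "$demo/b/data.txt"

# -exec grep -q acts as a test: -print0 runs only for files where grep matched.
find "$demo" -name 'data.txt' -exec grep -q 'pa028' {} \; -print0 |
  xargs -r -0 sed -i.bak -e 's/pa028/pa014/g'

changed=$(cat "$demo/a dir/data.txt")
untouched=$(cat "$demo/b/data.txt")
printf '%s\n%s\n' "$changed" "$untouched"
```

Only the file that matched is rewritten, so no backup appears next to b/data.txt.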

Grep - How to concatenate filename to each returned line of file content?

I have a statement which
Finds a set of files
Cats their contents out
Then greps their contents
It is this pipeline:
find . | grep -i "Test_" | xargs cat | grep -i "start-node name="
produces an output such as:
<start-node name="Start" secure="false"/>
<start-node name="Run" secure="false"/>
What I was hoping to get is something like:
filename1-<start-node name="Start" secure="false"/>
filename2-<start-node name="Run" secure="false"/>
An easier way may be to run grep directly on the files that find returns, without xargs and cat:
grep -i "start-node name=" `find . | grep -i "Test_"`
Because you cat all the files into a single stream, grep doesn't have any filename information. You want to give all the filenames to grep as arguments:
find ... | xargs grep "<start-node name=" /dev/null
Note two additional changes - I've dropped the -i flag, as it appears you're inspecting XML, and that's not case-insensitive; I've added /dev/null to the list of files, so that grep always has at least two files of input, even if find only gives one result. That's the portable way to get grep to print filenames.
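The effect of the extra /dev/null argument can be seen with a single file. This is only a sketch; the file name Test_one.xml is made up.

```shell
demo=$(mktemp -d)
printf '<start-node name="Start" secure="false"/>\n' > "$demo/Test_one.xml"

# One file operand: grep prints the matching line without a file name.
without=$(cd "$demo" && grep '<start-node name=' Test_one.xml)

# Two file operands (the second is empty): grep prefixes each match with its file name.
with_null=$(cd "$demo" && grep '<start-node name=' Test_one.xml /dev/null)

printf '%s\n%s\n' "$without" "$with_null"
```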
Now, let's look at the find command. Instead of finding all files, then filtering through grep, we can use the -iregex predicate of GNU find:
find . -iregex '.*Test_.*' \( -type 'f' -o -type 'l' \) | xargs grep ...
The mixed-case pattern suggests your filenames aren't really case-insensitive, and you might not want to grep symlinks (I'm sure you don't want directories and special files passed through), in which case you can simplify (and can use portable find again):
find . -name '*Test_*' -type 'f' | xargs grep ...
Now protect against the kind of filenames that trip up pipelines, and you have
find . -name '*Test_*' -type 'f' -print0 \
| xargs -0 grep -e "<start-node name=" -- /dev/null
Alternatively, if you have GNU grep, you don't need find at all:
grep --recursive --include '*[Tt]est_*' -e "<start-node name=" .
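A quick sketch of the grep-only variant, assuming GNU grep (for --recursive and --include) and an invented directory layout:

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
printf '<start-node name="Start" secure="false"/>\n' > "$demo/sub/Test_flow.xml"
printf '<start-node name="Run" secure="false"/>\n'   > "$demo/other.xml"

# --include filters on the file's base name; recursive mode prints file names by itself.
hits=$(cd "$demo" && grep --recursive --include '*[Tt]est_*' -e '<start-node name=' .)
printf '%s\n' "$hits"
```

other.xml also contains a start-node line, but its name does not match the --include glob, so it is never opened.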
If you just need to number them:
find . | grep -i "Test_" | xargs cat | grep -i "start-node name=" | awk 'BEGIN{n=0}{n=n+1;print "filename" n "-" $0}'
From man grep:
-H Always print filename headers with output lines.

How to find all files which include few specific strings but not necessarily in the same line?

In Linux, grep -r <string> <path> is a common way to find all instances of <string> in files under <path>, which basically gives you all the files under <path> which include <string>. But what if I want to find all files which include several strings? From grep -r <string1> <path> | grep <string2> I can get all files which include <string1> and <string2> in the same line, but how can I get the files which include <string1> and <string2> in separate lines?
You can try
grep -rl searchstring1 . | xargs grep -l searchstring2
to get a list of file names in directory . containing both searchstrings (not necessarily in the same line). You can cascade that in case you want more search strings:
grep -rl searchstring1 . \
| xargs grep -l searchstring2 \
| xargs grep -l searchstring3
This is tricky in case you have spaces and other nasty characters in the file names because then the xargs gets fooled. In such special cases (or just to be sure not to get that problem) you can use 0-byte terminated strings:
grep -rlZ searchstring1 . \
| xargs -0 grep -lZ searchstring2 \
| xargs -0 grep -l searchstring3
And to check the output you can use something like:
grep -rlZ searchstring1 . \
| xargs -0 grep -lZ searchstring2 \
| xargs -0 grep -lZ searchstring3 \
| xargs -0 grep -E 'searchstring1|searchstring2|searchstring3' /dev/null \
| less
A completely different approach is to use find directly (but that starts lots of grep processes and is therefore probably less efficient):
find . -type f \( \
-exec grep -q searchstring1 {} \; -a \
-exec grep -q searchstring2 {} \; -a \
-exec grep -q searchstring3 {} \; \) -print
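The two approaches above can be compared on a scratch directory. This sketch uses two made-up search strings (alpha and beta) and assumes GNU grep for -Z; sort is added only to make the two listings comparable.

```shell
demo=$(mktemp -d)
printf 'alpha\nbeta\n' > "$demo/both lines.txt"   # strings on separate lines
printf 'alpha beta\n'  > "$demo/same_line.txt"    # strings on the same line
printf 'alpha\n'       > "$demo/only_alpha.txt"   # second string missing

# Cascaded grep -l with NUL-terminated names.
via_grep=$(cd "$demo" && grep -rlZ 'alpha' . | xargs -0 grep -l 'beta' | sort)

# find with grep -q as a chain of tests, one grep process per file and string.
via_find=$(cd "$demo" && find . -type f \( \
  -exec grep -q 'alpha' {} \; -a \
  -exec grep -q 'beta' {} \; \) -print | sort)

printf '%s\n' "$via_grep"
```

Both listings contain the same two files; only_alpha.txt is dropped by either method.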

How to find text files not containing text on Linux?

How do I find files not containing some text on Linux? Basically I'm looking for the inverse of the following
find . -print | xargs grep -iL "somestring"
The command you quote, ironically enough, does exactly what you describe.
Test it!
echo "hello" > a
echo "bye" > b
grep -iL BYE a b
This prints only a.
I think you may be confusing -L and -l.
find . -print | xargs grep -iL "somestring"
is the inverse of
find . -print | xargs grep -il "somestring"
By the way, consider
find . -print0 | xargs -0 grep -iL "somestring"
Or even
grep -IRiL "somestring" .
You can do it with grep alone (without find).
grep -riL "somestring" .
Here is an explanation of the grep options used:
-L, --files-without-match
Only the names of files not containing selected lines are written.
-R, -r, --recursive
Recursively search subdirectories listed.
-i, --ignore-case
Perform case insensitive matching.
If you use lowercase -l you get the opposite (files with matches):
-l, --files-with-matches
Only the names of files containing selected lines are written.
Use find and grep to list the Markdown files that do not contain a match:
$ find . -name '*.md' -print0 | xargs -0 grep -iL "title"
Or use grep's -L directly, restricted to Markdown files, to list those that have no title:
$ grep -iL "title" -r ./* --include '*.md'
If you use find without restricting it to files, grep is also run on directories:
[root#vps test]# find | xargs grep -Li 1234
grep: .: Is a directory
.
./test.txt
./test2.txt
[root#vps test]#
Use grep directly:
# grep -Li 1234 /root/test/*
/root/test/test2.txt
/root/test/test.txt
[root#vps test]#
Or pass the -type f option to find. Even then, the find version takes more time, because it first builds the list of files and only then runs grep.
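For comparison, a sketch of the find variant restricted to regular files, which avoids the "Is a directory" complaint. The file names are invented, and xargs -0 is assumed to be available.

```shell
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'code 1234\n'    > "$demo/has.txt"
printf 'nothing here\n' > "$demo/sub/lacks.txt"

# -type f keeps directories out of grep's argument list;
# -L lists only the files without a match.
missing=$(cd "$demo" && find . -type f -print0 | xargs -0 grep -Li '1234')
printf '%s\n' "$missing"
```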

Unix Command to List files containing string but *NOT* containing another string

How do I recursively view a list of files that has one string and specifically doesn't have another string? Also, I mean to evaluate the text of the files, not the filenames.
Conclusion:
As per comments, I ended up using:
find . -name "*.html" -exec grep -lR 'base\-maps' {} \; | xargs grep -L 'base\-maps\-bot'
This returned files with "base-maps" and not "base-maps-bot". Thank you!!
Try this:
grep -rl <string-to-match> | xargs grep -L <string-not-to-match>
Explanation: grep -lr makes grep recursively (r) output a list (l) of all files that contain <string-to-match>. xargs passes these file names as arguments to grep -L, which outputs a file name only when that file does not contain <string-not-to-match>.
The use of xargs in the answers above is not necessary; you can achieve the same thing like this:
find . -type f -exec grep -q <string-to-match> {} \; -not -exec grep -q <string-not-to-match> {} \; -print
grep -q means run quietly but return an exit code indicating whether a match was found; find can then use that exit code to determine whether to keep evaluating the rest of its expression. If -exec grep -q <string-to-match> {} \; returns 0 (a match), find goes on to evaluate -not -exec grep -q <string-not-to-match> {} \;. If that grep returns non-zero (no match), -not turns the result into true, and find goes on to -print, which prints the name of the file.
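A small sketch of this predicate chain on made-up files, reusing the strings from the question. It uses !, the POSIX spelling of -not, so it should also work with a non-GNU find.

```shell
demo=$(mktemp -d)
printf 'base-maps\n'                > "$demo/keep.html"   # has the first string only
printf 'base-maps\nbase-maps-bot\n' > "$demo/drop.html"   # has both strings
printf 'unrelated\n'                > "$demo/none.html"   # has neither

# First test: grep must match. Second test: grep must NOT match (negated with !).
wanted=$(cd "$demo" && find . -type f \
  -exec grep -q 'base-maps' {} \; \
  ! -exec grep -q 'base-maps-bot' {} \; -print)
printf '%s\n' "$wanted"
```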
As another answer has noted, using find in this way has major advantages over grep -Rl where you only want to search files of a certain type. If, on the other hand, you really want to search all files, grep -Rl is probably quicker, as it uses one grep process to perform the first filter for all files, instead of a separate grep process for each file.
These answers seem off, as they match BOTH strings. The following command should work better:
grep -l <string-to-match> * | xargs grep -c <string-not-to-match> | grep ':0$'
Here is a more generic construction:
find . -name <nameFilter> -print0 | xargs -0 grep -Z -l <patternYes> | xargs -0 grep -L <patternNo>
This command outputs files whose name matches <nameFilter> (adjust find predicates as you need) which contain <patternYes>, but do not contain <patternNo>.
The enhancements are:
It works with filenames containing whitespace.
It lets you filter files by name.
If you don't need to filter by name (one often wants to consider all the files in current directory), you can strip find and add -R to the first grep:
grep -R -Z -l <patternYes> | xargs -0 grep -L <patternNo>
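Here is a sketch of the generic construction with a file name that contains a space. The names and patterns are illustrative only, and GNU grep is assumed for -Z.

```shell
demo=$(mktemp -d)
printf 'base-maps\n'                > "$demo/keep me.html"
printf 'base-maps\nbase-maps-bot\n' > "$demo/drop.html"
printf 'base-maps\n'                > "$demo/ignored.txt"   # wrong extension

# NUL-terminated names flow from find through the first grep into the second.
wanted=$(cd "$demo" && find . -name '*.html' -print0 \
  | xargs -0 grep -Z -l 'base-maps' \
  | xargs -0 grep -L 'base-maps-bot')
printf '%s\n' "$wanted"
```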
find . -maxdepth 1 -name "*.py" -exec grep -L "string-not-to-match" {} \;
This command lists all ".py" files in the current directory (not its subdirectories) that do not contain "string-not-to-match".
To find lines that match string A but do not also contain string B or string C, I use the following; the quotes allow a search string to contain a space:
grep -r <string A> | grep -v -e <string B> -e "<string C>" | awk -F ':' '{print $1}'
Explanation: grep -r recursively prints all matching lines in the format
filename: line
grep -v then excludes from those lines the ones that also contain either string B or string C (each given with -e). awk prints only the first field (the file name), using the colon as the field separator (-F).
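A sketch of this line-based filter on made-up files. sort -u is an addition here, so that a file with several surviving lines is listed only once. Note that the awk step assumes file names contain no colon.

```shell
demo=$(mktemp -d)
printf 'string A here\nstring A again\n' > "$demo/one.txt"
printf 'string A with string B\n'        > "$demo/two.txt"

# Keep lines with "string A", drop those that also mention B or C,
# then reduce "file:line" to the file name.
files=$(cd "$demo" && grep -r 'string A' . \
  | grep -v -e 'string B' -e 'string C' \
  | awk -F ':' '{print $1}' | sort -u)
printf '%s\n' "$files"
```

Because the filter works per line, a file is still listed if it has at least one "string A" line without B or C, even when B or C appear elsewhere in it.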

Resources