Run recursive grep using two patterns - linux

How can I use this grep pattern to recursively search a directory? I need for both of these to be on the same line in the file the string. I keep getting the message back this is a directory. How can I make it search recursively all files with the extension .cfc?
"<cffunction" and "inject="
grep -insR "<cffunction" | grep "inject=" /c/mydirectory/

Use find and exec:
find your_dir -name "*.cfc" -type f -exec grep -insE 'inject=.*<cffunction|<cffunction.*inject=' /dev/null {} +
find finds your *.cfc files recursively and feeds into grep, picking only regular files (-type f)
inject=.*<cffunction|<cffunction.*inject= catches lines that have your patterns in either order
{} + ensures each invocation of grep gets up to ARG_MAX files
/dev/null argument to grep ensures that the output is prefixed with the name of file even when there is a single *.cfc file

You've got it backwards, you should pipe your file search to the second command, like:
grep -nisr "inject=" /c/mydirectory | grep "<cffunction"
edit: to exclude some directories and search only in *.cfc files, use:
grep -nisr --exclude-dir={/abs/dir/path,rel/dir/path} --include \*.cfc "inject=" /c/mydirectory | grep "<cffunction"

Related

How to pipe grep into a program that uses a directory with only read permissions

I have a remote directory that I do not have any permissions to other than to read files.
Typically I run a custom server-wide script, dostuff, as such:
dostuff /path/to/images/*img
However, this directory accidentally has two distinct sets of files (same number of files each) with very similar names:
13-08_1_0XXX.img
13-08_1_0XXX_16YYYY.img
where Xs increments from 1-900 together, and Ys have somewhat arbitrary numbers.
For example, I can regex select filenames for the first set with:
find . | grep -E -w -o ".{12}.img"
So I tried
find /path/to/images/ | grep -E -w -o ".{12}.img" | dostuff
but that does not work. As I can't move the files or copy them elsewhere, I think the only solution is to figure out how to pipe them into the script as two individual sets of images. Any suggestions would be appreciated!
Assuming you have GNU findutils, following should work:
$ find /path/to/imagedir -regextype posix-extended \
-regex '.*/13-08_1_0([1-9][0-9][0-8]|900)(.img|_16[0-9]+{4}.img)' -exec dosomestuff {} \;
-regextype posix-extended -regex option in find gives grep -E type functionality.
{} is the matched pattern found.
-exec dosomestuff {} would _do_some_stuff_ to the matched {}.
You can put the find...|grep... command in backquotes, and make that the argument to dostuff. Better is piping to find ... | grep ... | xargs dostuff. xargs will invoke dostuff with the (stdin stream of) filenames as an argument list. If the list of files would exceed max bash command line length (32KB), xargs breaks it into multiple invocations of dostuff.

"find" specific contents [linux]

I would like to go through all the files in the current directory (or sub-directories) and echoes me back the name of files only if they contain certain words.
More detail:
find -type f -name "*hello *" will give me all file names that have "hello" in their names. But instead of that, I want to search through the files and if that file's content contains "hello" then prints out the name of the file.
Is there a way to approach this?
You can use GNU find and GNU grep as
find /path -type f -exec grep -Hi 'hello' {} +
This is efficient in a way that it doesn't invoke as many grep instances to as many files returned from find. This works in an underlying assumption that find returns a set of files for grep to search on. If you are unsure if the files may not be available, as a fool-proof way, you can use xargs with -r flag, in which case the commands following xargs are executed only if the piped commands return any results
find /path -type f | xargs -r0 grep -Hi 'hello'

How to know which file holds grep result?

There is a directory which contains 100 text files. I used grep to search a given text in the directory as follow:
cat *.txt | grep Ya_Mahdi
and grep shows Ya_Mahdi.
I need to know which file holds the text. Is it possible?
Just get rid of cat and provide the list of files to grep:
grep Ya_Mahdi *.txt
While this would generally work, depending on the number of .txt files in that folder, the argument list for grep might get too large.
You can use find for a bullet proof solution:
find --maxdepth 1 -name '*.txt' -exec grep -H Ya_Mahdi {} +

Grep into given filenames

I have a directory that contains many subdirectories that contain many files.
I list the contents of the current directory using ls *. I see that there are certain files that are relevant, in terms of their names. Therefore, the relevant files can be obtained as such ls * | grep "abc\|def\|ghi".
Now I want to search within the given filenames. So I try something like:
ls * | grep "abc\|def\|ghi" | zgrep -i "ERROR" *, however, this is not looking into the file contents, rather the names. Is there an easy way to do this with pipes?
To use grep to search the contents of files within a directory, try using the find command, using xargs to couple it with the grep command, like so:
find . -type f | xargs grep '...'
You can do it like this:
find -E . -type f -regex ".*/.*(abc|def).*" -exec grep -H ERROR {} \+
The -E allows use of extended regexes so you can use the pipe (|) for expressing alternations. The + at the end allows searching in as many files as possible for each invocation of -exec grep rather than needing a whole new process for every single file.
You should use xargs to grep each file contents:
ls * | grep "abc\|def\|ghi" | xargs zgrep -i "ERROR" *
I know you asked for a solution with pipes, but they are not necessary for this task. grep has many parameters, and can solve this problem alone:
grep . -rh --include "*abc*" --include "*def*" -e "ERROR"
Parameters:
--include : Search only files whose base name matches the give wildcard pattern (not regex!)
-h : Suppress the prefixing of file names on output.
-r : recursive
-e : regex filter pattern
grep -i "ERROR" `ls * | grep "abc\|def\|ghi"`

Find all directories containing a file that contains a keyword in linux

In my hierarchy of directories I have many text files called STATUS.txt. These text files each contain one keyword such as COMPLETE, WAITING, FUTURE or OPEN. I wish to execute a shell command of the following form:
./mycommand OPEN
which will list all the directories that contain a file called STATUS.txt, where this file contains the text "OPEN"
In future I will want to extend this script so that the directories returned are sorted. Sorting will determined by a numeric value stored the file PRIORITY.txt, which lives in the same directories as STATUS.txt. However, this can wait until my competence level improves. For the time being I am happy to list the directories in any order.
I have searched Stack Overflow for the following, but to no avail:
unix filter by file contents
linux filter by file contents
shell traverse directory file contents
bash traverse directory file contents
shell traverse directory find
bash traverse directory find
linux file contents directory
unix file contents directory
linux find name contents
unix find name contents
shell read file show directory
bash read file show directory
bash directory search
shell directory search
I have tried the following shell commands:
This helps me identify all the directories that contain STATUS.txt
$ find ./ -name STATUS.txt
This reads STATUS.txt for every directory that contains it
$ find ./ -name STATUS.txt | xargs -I{} cat {}
This doesn't return any text, I was hoping it would return the name of each directory
$ find . -type d | while read d; do if [ -f STATUS.txt ]; then echo "${d}"; fi; done
... or the other way around:
find . -name "STATUS.txt" -exec grep -lF "OPEN" \{} +
If you want to wrap that in a script, a good starting point might be:
#!/bin/sh
[ $# -ne 1 ] && echo "One argument required" >&2 && exit 2
find . -name "STATUS.txt" -exec grep -lF "$1" \{} +
As pointed out by #BroSlow, if you are looking for directories containing the matching STATUS.txt files, this might be more what you are looking for:
fgrep --include='STATUS.txt' -rl 'OPEN' | xargs -L 1 dirname
Or better
fgrep --include='STATUS.txt' -rl 'OPEN' |
sed -e 's|^[^/]*$|./&|' -e 's|/[^/]*$||'
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# simulate `xargs -L 1 dirname` using `sed`
# (no trailing `\`; returns `.` for path without dir part)
Maybe you can try this:
grep -rl "OPEN" . --include='STATUS.txt'| sed 's/STATUS.txt//'
where grep -r means recursive , -l means only list the files matching, '.' is the directory location. You can pipe it to sed to remove the file name.
You can then wrap this in a bash script file where you can pass in keywords such as 'OPEN', 'FUTURE' as an argument.
#!/bin/bash
grep -rl "$1" . --include='STATUS.txt'| sed 's/STATUS.txt//'
Try something like this
find -type f -name "STATUS.txt" -exec grep -q "OPEN" {} \; -exec dirname {} \;
or in a script
#!/bin/bash
(($#==1)) || { echo "Usage: $0 <pattern>" && exit 1; }
find -type f -name "STATUS.txt" -exec grep -q "$1" {} \; -exec dirname {} \;
You could use grep and awk instead of find:
grep -r OPEN * | awk '{split($1, path, ":"); print path[1]}' | xargs -I{} dirname {}
The above grep will list all files containing "OPEN" recursively inside you dir structure. The result will be something like:
dir_1/subdir_1/STATUS.txt:OPEN
dir_2/subdir_2/STATUS.txt:OPEN
dir_2/subdir_3/STATUS.txt:OPEN
Then the awk script will split this output at the colon and print the first part of it (the dir path).
dir_1/subdir_1/STATUS.txt
dir_2/subdir_2/STATUS.txt
dir_2/subdir_3/STATUS.txt
The dirname will then return only the directory path, not the file name, which I suppose it what you want.
I'd consider using Perl or Python if you want to evolve this further, though, as it might get messier if you want to add priorities and sorting.
Taking up the accepted answer, it does not output a sorted and unique directory list. At the end of the "find" command, add:
| sort -u
or:
| sort | uniq
to get the unique list of the directories.
Credits go to Get unique list of all directories which contain a file whose name contains a string.
IMHO you should write a Python script which:
Examines your directory structure and finds all files named STATUS.txt.
For each found file:
reads the file and executes mycommand depending on what the file contains.
If you want to extend the script later with sorting, you can find all the interesting files first, save them to a list, sort the list and execute the commands on the sorted list.
Hint: http://pythonadventures.wordpress.com/2011/03/26/traversing-a-directory-recursively/

Resources