How to perform grep operation on all files in a directory? - linux

I'm working with XenServer and want to run a command on each file in a directory, grep some strings out of the command's output, and append them to a file.
I'm clear on the command I want to use and how to grep out string(s) as needed.
But what I'm not clear on is how do I have it perform this command on each file, going to the next, until no more files are found.

In Linux, I normally use this command to recursively grep for a particular text within a directory:
grep -rni "string" *
where
r = recursive, i.e., search subdirectories within the current directory
n = to print the line numbers to stdout
i = case insensitive search

grep "$PATTERN" * would be sufficient. By default, grep skips all subdirectories. If you want to search through them as well, use grep -r "$PATTERN" *.

Use find. Seriously, it is the best way because then you can really see what files it's operating on:
find . -name "*.sql" -exec grep -H "slow" {} \;
Note: -H makes grep print the filename with each match. It is not Mac-specific; both GNU and BSD grep support it.

To search in all sub-directories, but only in specific file types, use grep with --include.
For example, searching recursively in current directory, for text in *.yml and *.yaml :
grep "text to search" -r . --include=*.{yml,yaml}
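A small sketch of the --include approach, run against a throwaway directory tree (the file names and the 'name:' pattern are made up for the demo). It passes one quoted --include per extension instead of relying on brace expansion, which should behave the same in GNU and BSD grep:

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/conf/sub"
printf 'name: demo\n'    > "$tmp/conf/a.yml"
printf 'name: other\n'   > "$tmp/conf/sub/b.yaml"
printf 'name: ignored\n' > "$tmp/conf/c.txt"

# Quoting each pattern keeps the shell from expanding it against files
# that happen to sit in the current directory.
found=$(cd "$tmp" && grep -rl --include='*.yml' --include='*.yaml' 'name:' . | sort)
printf '%s\n' "$found"
rm -rf "$tmp"
```

Only the two YAML files are listed; c.txt contains the text but is filtered out by --include.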

If you want to do multiple commands, you could use:
for I in *.sql
do
grep "foo" "$I" >> foo.log
grep "bar" "$I" >> bar.log
done
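For the original question (run a command on each file, grep its output, append to a file), a loop along these lines might do. This is only a sketch: wc -w stands in for whatever command you actually run, and the directory, file names and pattern are invented for the demo.

```shell
#!/bin/sh
# Sketch: run a command on every file in a directory, filter its output
# with grep, and append whatever matches to a single log file.
workdir=$(mktemp -d)
log=$(mktemp)
printf 'alpha\n'      > "$workdir/a.txt"
printf 'beta gamma\n' > "$workdir/b.txt"

for f in "$workdir"/*; do
    [ -f "$f" ] || continue      # skip subdirectories (and an unmatched glob)
    # Keep only output lines reporting exactly two words.
    # grep exits 1 when nothing matches, which is fine here.
    wc -w "$f" | grep '^ *2 ' >> "$log" || true
done

logged=$(cat "$log")
echo "$logged"
rm -rf "$workdir" "$log"
```

The loop stops by itself once the glob runs out of files, so there is no need to count them first.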

Related

Find all directories containing a file that contains a keyword in linux

In my hierarchy of directories I have many text files called STATUS.txt. These text files each contain one keyword such as COMPLETE, WAITING, FUTURE or OPEN. I wish to execute a shell command of the following form:
./mycommand OPEN
which will list all the directories that contain a file called STATUS.txt, where this file contains the text "OPEN"
In future I will want to extend this script so that the directories returned are sorted. Sorting will be determined by a numeric value stored in the file PRIORITY.txt, which lives in the same directory as STATUS.txt. However, this can wait until my competence level improves. For the time being I am happy to list the directories in any order.
I have searched Stack Overflow for the following, but to no avail:
unix filter by file contents
linux filter by file contents
shell traverse directory file contents
bash traverse directory file contents
shell traverse directory find
bash traverse directory find
linux file contents directory
unix file contents directory
linux find name contents
unix find name contents
shell read file show directory
bash read file show directory
bash directory search
shell directory search
I have tried the following shell commands:
This helps me identify all the directories that contain STATUS.txt
$ find ./ -name STATUS.txt
This reads STATUS.txt for every directory that contains it
$ find ./ -name STATUS.txt | xargs -I{} cat {}
This doesn't return any text, I was hoping it would return the name of each directory
$ find . -type d | while read d; do if [ -f STATUS.txt ]; then echo "${d}"; fi; done
... or the other way around:
find . -name "STATUS.txt" -exec grep -lF "OPEN" \{} +
If you want to wrap that in a script, a good starting point might be:
#!/bin/sh
[ $# -ne 1 ] && echo "One argument required" >&2 && exit 2
find . -name "STATUS.txt" -exec grep -lF "$1" \{} +
As pointed out by @BroSlow, if you are looking for directories containing the matching STATUS.txt files, this might be more what you are looking for:
fgrep --include='STATUS.txt' -rl 'OPEN' | xargs -L 1 dirname
Or better
fgrep --include='STATUS.txt' -rl 'OPEN' |
sed -e 's|^[^/]*$|./&|' -e 's|/[^/]*$||'
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# simulate `xargs -L 1 dirname` using `sed`
# (no trailing `\`; returns `.` for path without dir part)
Maybe you can try this:
grep -rl "OPEN" . --include='STATUS.txt'| sed 's/STATUS.txt//'
where -r means recursive, -l means list only the names of matching files, and '.' is the directory to search. Piping the result to sed strips the file name.
You can then wrap this in a bash script file where you can pass in keywords such as 'OPEN', 'FUTURE' as an argument.
#!/bin/bash
grep -rl "$1" . --include='STATUS.txt'| sed 's/STATUS.txt//'
Try something like this
find -type f -name "STATUS.txt" -exec grep -q "OPEN" {} \; -exec dirname {} \;
or in a script
#!/bin/bash
(($#==1)) || { echo "Usage: $0 <pattern>" && exit 1; }
find -type f -name "STATUS.txt" -exec grep -q "$1" {} \; -exec dirname {} \;
You could use grep and awk instead of find:
grep -r OPEN * | awk '{split($1, path, ":"); print path[1]}' | xargs -I{} dirname {}
The above grep will print every line containing "OPEN", searching recursively inside your directory structure, with each line prefixed by its file name. The result will be something like:
dir_1/subdir_1/STATUS.txt:OPEN
dir_2/subdir_2/STATUS.txt:OPEN
dir_2/subdir_3/STATUS.txt:OPEN
Then the awk script will split this output at the colon and print the first part of it (the dir path).
dir_1/subdir_1/STATUS.txt
dir_2/subdir_2/STATUS.txt
dir_2/subdir_3/STATUS.txt
The dirname will then return only the directory path, not the file name, which I suppose is what you want.
I'd consider using Perl or Python if you want to evolve this further, though, as it might get messier if you want to add priorities and sorting.
Taking up the accepted answer, it does not output a sorted and unique directory list. At the end of the "find" command, add:
| sort -u
or:
| sort | uniq
to get the unique list of the directories.
Credits go to Get unique list of all directories which contain a file whose name contains a string.
IMHO you should write a Python script which:
Examines your directory structure and finds all files named STATUS.txt.
For each found file:
reads the file and executes mycommand depending on what the file contains.
If you want to extend the script later with sorting, you can find all the interesting files first, save them to a list, sort the list and execute the commands on the sorted list.
Hint: http://pythonadventures.wordpress.com/2011/03/26/traversing-a-directory-recursively/
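The find-then-sort steps above can also be sketched in plain shell. This is a rough version, assuming each PRIORITY.txt holds a single integer and that directory names contain no newlines; list_by_priority is a made-up helper name, and a missing PRIORITY.txt is treated as priority 0.

```shell
#!/bin/sh
# Sketch: list directories whose STATUS.txt contains a keyword,
# ordered by the number stored in the PRIORITY.txt next to it.
list_by_priority() {
    keyword=$1
    root=${2:-.}
    find "$root" -type f -name STATUS.txt -exec grep -lF -- "$keyword" {} + |
    while IFS= read -r status; do
        dir=$(dirname "$status")
        prio=$(cat "$dir/PRIORITY.txt" 2>/dev/null || echo 0)
        printf '%s\t%s\n' "${prio:-0}" "$dir"   # priority first, so sort -n can use it
    done |
    sort -n | cut -f2-                          # order by priority, then drop that column
}

# Demo on a throwaway tree.
demo=$(mktemp -d)
mkdir -p "$demo/a" "$demo/b" "$demo/c"
echo OPEN     > "$demo/a/STATUS.txt"; echo 20 > "$demo/a/PRIORITY.txt"
echo OPEN     > "$demo/b/STATUS.txt"; echo 5  > "$demo/b/PRIORITY.txt"
echo COMPLETE > "$demo/c/STATUS.txt"; echo 1  > "$demo/c/PRIORITY.txt"
result=$(cd "$demo" && list_by_priority OPEN .)
printf '%s\n' "$result"
rm -rf "$demo"
```

In the demo, ./b (priority 5) is listed before ./a (priority 20), and ./c is left out because its status is COMPLETE.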

Search and replace entire files

I've seen numerous examples for replacing one string with another among multiple files but what I want to do is a bit different. Probably a lot simpler :)
Find all the files that match a certain string and replace them completely with the contents of a new file.
I have a find command that works
find /home/*/public_html -name "index.php" -exec grep "version:1.23" '{}' \; -print
This finds all the files I need to update.
Now how do I replace their entire content with the CONTENTS of /home/indexnew.txt (I could also name it /home/index.php)
I emphasize content because I don't want to change the name or ownership of the files I'm updating.
find ... | while read filename; do cat static_file > "$filename"; done
efficiency hint: use grep -q -- it will return "true" immediately when the first match is found, not having to read the entire file.
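Putting the two hints together, here is a sketch that does the test and the overwrite inside find itself. The paths are a throwaway stand-in for /home/*/public_html, and indexnew.txt is the replacement content from the question.

```shell
#!/bin/sh
tmp=$(mktemp -d)
mkdir -p "$tmp/site1" "$tmp/site2"
printf 'old page\nversion:1.23\n' > "$tmp/site1/index.php"
printf 'old page\nversion:2.00\n' > "$tmp/site2/index.php"
printf 'new page\n' > "$tmp/indexnew.txt"

# grep -q acts as a test: the second -exec only runs for files that matched.
# Redirecting into the existing file rewrites its content but keeps its
# name, owner and permissions.
find "$tmp" -name index.php \
    -exec grep -q 'version:1.23' {} \; \
    -exec sh -c 'cat "$1" > "$2"' sh "$tmp/indexnew.txt" {} \;

site1=$(cat "$tmp/site1/index.php")
site2=$(cat "$tmp/site2/index.php")
echo "site1 now contains: $site1"
rm -rf "$tmp"
```

Because find passes each file name as a separate argument, this variant also copes with spaces in paths, which the while read loop does not unless you use IFS= read -r.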
If you have a bunch of files you want to replace, and you can get all of their names using wildcards you can try piping output to the tee command:
cat my_file | tee /home/*/update.txt
This writes the text in my_file to every existing /home/*/update.txt. Note that the shell only expands the wildcard to files that already exist, so it will not create update.txt in directories that lack one.
Let me know if this helps or isn't what you want.
Rather than running grep and then using -print, you can add -l to grep so that it lists the matching files directly:
find /home/*/public_html -name "index.php" -exec grep -l "version:1.23" '{}' \; | xargs -I{} cp /home/index.php {}
Here is the option -l detail
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match. (-l is specified by
POSIX.)

Linux : Search for a Particular word in a List of files under a directory

I have a big list of log files in a particular directory, related to my Java application, on my remote Linux servers.
When I run ls on that directory it shows a list of files (nearly 100).
I need to find a particular word in those files. How can I do this?
The problem is that I cannot open each and every file and search for that word using /.
Please tell me how I can search for a word in the list of files provided.
You can use this command:
grep -rn "string" *
n for showing line number with the filename
r for recursive
grep is made for this.
Use:
grep myword * for a simple word
grep 'my sentence' * for a literal string
grep "I am ${USER}" * when you need variable replacement
You can also use regular expressions.
Add -r for recursive and -n to show the line number of matching lines.
And check man grep.
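To make the quoting difference concrete, a small sketch (the log content and the word variable are invented for the demo). Double quotes let the shell substitute the variable before grep sees it; single quotes pass the text through untouched, and -F tells grep to treat it as a fixed string rather than a regex.

```shell
#!/bin/sh
tmp=$(mktemp -d)
word='timeout'
printf 'connection timeout\nread timeout\nliteral $word here\n' > "$tmp/app.log"

expanded=$(grep -c "$word" "$tmp/app.log")    # searches for: timeout
literal=$(grep -cF '$word' "$tmp/app.log")    # searches for the text: $word
echo "expanded=$expanded literal=$literal"    # prints: expanded=2 literal=1
rm -rf "$tmp"
```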
This is a very frequent task in Linux. I use grep -rn '<word or regex>' . all the time to do this. -r is for recursive (folder and subfolders), -n makes it print the line numbers, and the dot stands for the current directory.
grep -rn '<word or regex>' <location>
do a
man grep
for more options
also you can try the following.
find . -name '*.java' -exec grep "<yourword>" /dev/null {} \;
It finds all files with a .java extension and searches each one for 'yourword'. If the word is present, it prints the matching line; the extra /dev/null argument makes grep prefix each match with the file name.
Hope it helps :)
You could club find with exec as follows to get the list of the files as well as the occurrence of the word/string that you are looking for
find . -exec grep "my word" '{}' \; -print
use this command
grep "your word" searchDirectory/*.log
Get more on this link
http://www.cyberciti.biz/faq/howto-recursively-search-all-files-for-words/
You are looking for grep command.
You can read 15 Practical Grep Command Examples In Linux / UNIX for some samples.

bash script to find pattern in text file and return entire line

I need to make a bash script which loops through a bunch of .txt files in a directory, then searches each .txt for a string, and returns the entire line that string appears on
I know how to look through all the .txt files in the directory,
I just need to be pointed in the right direction for searching the file itself, and returning a line based on a match in that line
Within one dir
grep "search string" *.txt
Search or go to sub-dir
find /full/path/to/dir -name "*.txt" -exec grep "search string" {} \;
you can use a loop:
for i in $(find . -name '*.txt'); do grep "search" "$i"; done
If you also want to print the filename with each match, add -H to grep
One slight addition. Add -n to print the line number of each match.
grep -rn "foo" *.txt
See grep --help for more.
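If you do want the explicit loop from the question, a minimal sketch might look like this (the directory, file names and the 'needle' pattern are placeholders). grep already prints the entire matching line, so the loop only has to feed it one file at a time.

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'first line\nneedle in a haystack\n' > "$tmp/one.txt"
printf 'nothing to see\n' > "$tmp/two.txt"

results=$(
    for f in "$tmp"/*.txt; do
        [ -f "$f" ] || continue
        # -H prefixes the file name, -n the line number; the whole line follows.
        grep -Hn 'needle' "$f" || true    # exit status 1 only means "no match in this file"
    done
)
printf '%s\n' "$results"
rm -rf "$tmp"
```

The single result line ends in one.txt:2:needle in a haystack; two.txt produces nothing.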

How can I use grep to find a word inside a folder?

In Windows, I would have done a search for finding a word inside a folder. Similarly, I want to know if a specific word occurs inside a directory containing many sub-directories and files. My searches for grep syntax shows I must specify the filename, i.e. grep string filename.
Now, I do not know the filename, so what do I do?
A friend suggested to do grep -nr string, but I don't know what this means and I got no results with it (there is no response until I issue a Ctrl + C).
grep -nr 'yourString*' .
The dot at the end searches the current directory. Meaning for each parameter:
-n Show relative line number in the file
'yourString*' String for search, followed by a wildcard character
-r Recursively search subdirectories listed
. Directory for search (current directory)
grep -nr 'MobileAppSer*' . (Would find MobileAppServlet.java or MobileAppServlet.class or MobileAppServlet.txt; 'MobileAppASer*.*' is another way to do the same thing.)
To check more parameters use man grep command.
grep -nr string my_directory
Additional notes: this satisfies the syntax grep [options] string filename because in Unix-like systems, a directory is a kind of file (there is a term "regular file" to specifically refer to entities that are called just "files" in Windows).
grep -nr string reads the content to search from the standard input, that is why it just waits there for input from you, and stops doing so when you press ^C (it would stop on ^D as well, which is the key combination for end-of-file).
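A quick sketch of both cases, using made-up sample text. Without a file operand grep filters standard input; with a directory operand and -r it walks the tree. As far as I know, recent GNU grep versions default to the current directory when -r is given without a path, so the hang described above mostly shows up with older or non-GNU greps.

```shell
#!/bin/sh
# No file operand: grep reads whatever arrives on standard input.
from_stdin=$(printf 'no match\nstring here\n' | grep -n 'string')
echo "$from_stdin"    # prints: 2:string here

# Directory operand plus -r: grep searches the files under it.
tmp=$(mktemp -d)
printf 'string in a file\n' > "$tmp/note.txt"
from_dir=$(cd "$tmp" && grep -nr 'string' .)
echo "$from_dir"      # prints: ./note.txt:1:string in a file
rm -rf "$tmp"
```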
GREP: Global Regular Expression Print/Parser/Processor/Program.
You can use this to search the current directory.
You can specify -R for "recursive", which means the program searches in all subfolders, and their subfolders, and their subfolder's subfolders, etc.
grep -R "your word" .
-n will print the line number, where it matched in the file.
-i will search case-insensitive (capital/non-capital letters).
grep -inR "your regex pattern" .
There's also:
find directory_name -type f -print0 | xargs -0 grep -li word
but that might be a bit much for a beginner.
find is a general purpose directory walker/lister, -type f means "look for plain files rather than directories and named pipes and what have you", -print0 means "print them on the standard output using null characters as delimiters". The output from find is sent to xargs -0 and that grabs its standard input in chunks (to avoid command line length limitations) using null characters as a record separator (rather than the standard newline) and then applies grep -li word to each set of files. On the grep, -l means "list the files that match" and -i means "case insensitive"; you can usually combine single character options so you'll see -li more often than -l -i.
If you don't use -print0 and -0 then you'll run into problems with file names that contain spaces so using them is a good habit.
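Here is a sketch showing the null-delimited pipeline coping with a file name that contains a space (the files are created just for the demo):

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'The Word is here\n' > "$tmp/my notes.txt"
printf 'nothing\n'          > "$tmp/plain.txt"

# -print0 / -0 pass names separated by NUL bytes, so "my notes.txt"
# reaches grep as one argument instead of being split at the space.
safe=$(find "$tmp" -type f -print0 | xargs -0 grep -li 'word')
echo "$safe"
rm -rf "$tmp"
```

With plain find | xargs, the same file would be handed to grep as two arguments, "…/my" and "notes.txt", and grep would complain that neither exists.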
grep -nr search_string search_dir
will do a RECURSIVE (meaning the directory and all its sub-directories) search for the search_string. (as correctly answered by usta).
The reason you were not getting any results with your friend's suggestion of:
grep -nr string
is because no directory was specified. If you are in the directory that you want to do the search in, you have to do the following:
grep -nr string .
It is important to include the '.' character, as this tells grep to search THIS directory.
Why not do a recursive search to find all instances in sub directories:
grep -r 'text' *
This works like a charm.
Similar to the answer posted by @eLRuLL, an easier way to specify a search that respects word boundaries is to use the -w option:
grep -wnr "yourString" .
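To see what -w changes, a tiny sketch on invented sample text, counting matching lines with and without it:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'cat\nconcatenate\nthe cat sat\n' > "$tmp/words.txt"

plain=$(grep -c 'cat' "$tmp/words.txt")     # substring match
whole=$(grep -cw 'cat' "$tmp/words.txt")    # whole-word match only
echo "plain=$plain whole=$whole"            # prints: plain=3 whole=2
rm -rf "$tmp"
```

The line "concatenate" contains cat as a substring, so it counts without -w but not with it.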
Another option that I like to use:
find folder_name -type f -exec grep your_text {} \;
-type f returns you only files and not folders
-exec and {} run grep on the files that were found in the search (the exact syntax is "-exec command {} \;").
grep -r "yourstring" *
Will find "yourstring" in any files and folders
Now if you want to look for two different strings at the same time, you can use the -E option and separate the alternatives with |. For example:
grep -rE "yourstring|yourotherstring" * will list the locations where yourstring or yourotherstring matches
The answer you selected is fine, and it works, but it isn't the correct way to do it, because:
grep -nr yourString* .
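This actually searches for the string "yourStrin" followed by "g" zero or more times.
So the proper way to do it is:
grep -nr '\w*yourString\w*' .
This command searches the current folder for the string with any number of word characters before and after it. The quotes keep the shell from stripping the backslashes or expanding the *.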
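A short sketch that shows the difference on made-up sample lines. It uses [[:alnum:]_]* as a portable spelling of \w* (an assumption on my part that you may not have GNU grep) and -o to print only the matched text:

```shell
#!/bin/sh
tmp=$(mktemp -d)
printf 'yourStrin\nyourStringggg\nmyyourStringValue\n' > "$tmp/sample.txt"

# In a regex, * repeats the previous character: 'yourString*' means
# "yourStrin" followed by zero or more "g", so even the first line matches.
star_count=$(grep -c 'yourString*' "$tmp/sample.txt")
plain_count=$(grep -c 'yourString' "$tmp/sample.txt")

# Word characters on both sides: prints each whole word containing yourString.
words=$(grep -o '[[:alnum:]_]*yourString[[:alnum:]_]*' "$tmp/sample.txt")

echo "star=$star_count plain=$plain_count"   # prints: star=3 plain=2
printf '%s\n' "$words"
rm -rf "$tmp"
```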
grep -R "string" /directory/
-R also follows symbolic links, whereas -r does not.
The following sample looks recursively for your search string in the *.xml and *.js files located somewhere inside the folders path1, path2 and path3.
grep -r --include=*.xml --include=*.js "your search string" path1 path2 path3
So you can search in a subset of the files for many directories, just providing the paths at the end.
Run(terminal) the following command inside the directory. It will recursively check inside subdirectories too.
grep -r 'your string goes here' *
Don't use grep. Download Silver Searcher or ripgrep. They're both outstanding, and way faster than grep or ack with tons of options.
