How to grep/find for a list of file names? - linux

For example, I have a text document containing a list of file names that I may have in a directory. I want to use grep or find to check whether those file names exist in a specific directory and its subdirectories. Currently I can do it manually via find . | grep filename, but that's one at a time, and when I have over 100 file names to check, that is really tedious and time-consuming.
What's the best way to go about this?

xargs is what you want here. For example:
Assume you have a file named filenames.txt that contains a list of files
a.file
b.file
c.file
d.file
e.file
and only e.file doesn't exist.
the command in terminal is:
cat filenames.txt | xargs -I {} find . -type f -name {}
The output of this command (assuming the files sit directly in the current directory) is:
./a.file
./b.file
./c.file
./d.file
Maybe this is helpful.
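If you also want to know which names are missing, a loop over the list can report both cases. This is only a sketch: the demo directory and its files are made up for illustration, and note that find -name treats each name as a glob pattern, so names containing * or ? would need escaping.

```shell
# Hypothetical demo tree: a.file and b.file exist, e.file does not.
demo=$(mktemp -d)
mkdir -p "$demo/sub"
touch "$demo/a.file" "$demo/sub/b.file"
printf '%s\n' a.file b.file e.file > "$demo/filenames.txt"

missing=""
while IFS= read -r name; do
    # Keep only the first hit; one match is enough to know the file exists.
    hit=$(find "$demo" -type f -name "$name" | head -n 1)
    if [ -n "$hit" ]; then
        echo "found:   $hit"
    else
        echo "missing: $name"
        missing="$missing $name"
    fi
done < "$demo/filenames.txt"
rm -rf "$demo"
```

This runs find once per name, so for very long lists on a big tree a single find piped into grep or awk will be faster.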

If the files haven't moved since updatedb last ran (often less than 24 hours ago), your fastest search is locate.
Read the file list into an array and pass it to locate. If the file names are common (or occur as part of other file names), filter the results with grep on the base directory where you expect to find them:
mapfile -t filearr < file.lst
locate "${filearr[@]}" | grep /path/where/to/find
Quoting "${filearr[@]}" keeps names containing whitespace or shell metacharacters intact; note that locate itself still interprets glob characters such as * in a name.

A friend had helped me figure it out via find . | grep -i -Ff filenames.txt
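One thing to watch with grep -Ff is that each listed name matches as a substring anywhere in the path, so a.file also matches data.file. The sketch below compares that with an awk filter that matches the last path component exactly. The demo tree and file names are invented for illustration.

```shell
# Hypothetical demo tree; the list lives outside the searched tree.
demo=$(mktemp -d)
mkdir -p "$demo/tree/sub"
touch "$demo/tree/a.file" "$demo/tree/sub/b.file" "$demo/tree/sub/data.file"
printf '%s\n' a.file b.file e.file > "$demo/filenames.txt"

# Substring match: "a.file" also hits "data.file".
loose=$(cd "$demo/tree" && find . -type f |
    grep -Ff "$demo/filenames.txt" | LC_ALL=C sort)

# Exact match on the basename: load the list first, then test the
# last /-separated field of every path that find prints.
exact=$(cd "$demo/tree" && find . -type f |
    awk -F/ 'NR == FNR { want[$0]; next } $NF in want' "$demo/filenames.txt" - |
    LC_ALL=C sort)

printf 'grep -Ff:\n%s\n\nawk exact:\n%s\n' "$loose" "$exact"
rm -rf "$demo"
```

The awk version assumes the list is not empty and that no listed name contains a slash.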

Related

Linux terminal: Recursive search for string only in files w given file extension; display file name and absolute path

I'm new to Linux terminal; using Ubuntu Peppermint 5.
I want to recursively search all directories for a given text string (eg 'mystring'), in all files which have a given file extension (eg. '*.doc') in the file name; and then display a list of the file names and absolute file paths of all matches. I don't need to see any lines of content.
This must be a common problem. I'm hoping to find a solution which does the search quickly and efficiently, and is also simple to remember and type into the terminal.
I've tried using 'cat', 'grep', 'find', and 'locate' with various options, and piped together in different combinations, but I haven't found a way to do the above.
Something similar was discussed on:
How to show grep result with complete path or file name
and:
Recursively search for files of a given name, and find instances of a particular phrase AND display the path to that file
but I can't figure a way to adapt these to do the above, and would be grateful for any suggestions.
According to the grep manual, you can do this using the --include option (combined with the -l option if you want only the file name; I usually use -n to show line numbers):
--include=glob
Search only files whose name matches glob, using wildcard matching as described under --exclude.
-l
--files-with-matches
Suppress normal output; instead print the name of each input file from which output would normally have been printed. The scanning of each file stops on the first match. (-l is specified by POSIX.)
A suitable glob would be "*.doc" (make sure it is quoted, so that the shell passes it to grep unexpanded).
GNU grep also has a recursive option -r (not in POSIX grep). Together with the globbing, you can search a directory-tree of ".doc" files like this:
grep -r -l --include="*.doc" "mystring" .
If you wanted to make this portable, then find is the place to start. But using grep's extension makes searches much faster, and is available on any Linux platform.
find . -name '*.doc' -exec grep -q 'mystring' {} \; -print
How it works:
find searches recursively from the given path .
for all files whose name matches '*.doc'
-exec grep executes grep on each file found
-q suppresses grep's output
and grep searches inside the file for 'mystring'
The grep expression ends with {} \;
and -print prints the name of each file in which grep finds mystring.
EDIT:
To get only results from the current directory, without recursion, you can add:
-maxdepth 1 to find (-maxdepth 0 would test only the starting point itself).
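Since the question asks for absolute paths, one way to get them is to give grep an absolute starting directory such as "$PWD" instead of a dot. A small sketch follows; the demo files are made up, and -r and --include are GNU grep extensions (BSD grep has them too).

```shell
# Hypothetical demo tree with one matching .doc file.
demo=$(mktemp -d)
mkdir -p "$demo/docs/old"
printf 'has mystring here\n'         > "$demo/docs/old/report.doc"
printf 'nothing relevant\n'          > "$demo/docs/empty.doc"
printf 'mystring, wrong extension\n' > "$demo/docs/notes.txt"

# Starting from "$PWD" makes every printed path absolute.
matches=$(cd "$demo/docs" && grep -rl --include='*.doc' 'mystring' "$PWD")
echo "$matches"
rm -rf "$demo"
```

Only report.doc is listed: empty.doc lacks the string and notes.txt is filtered out by --include.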

Search and replace entire files

I've seen numerous examples for replacing one string with another among multiple files but what I want to do is a bit different. Probably a lot simpler :)
Find all the files that match a certain string and replace them completely with the contents of a new file.
I have a find command that works
find /home/*/public_html -name "index.php" -exec grep "version:1.23" '{}' \; -print
This finds all the files I need to update.
Now how do I replace their entire content with the CONTENTS of /home/indexnew.txt (I could also name it /home/index.php)
I emphasize content because I don't want to change the name or ownership of the files I'm updating.
find ... | while IFS= read -r filename; do cat static_file > "$filename"; done
efficiency hint: use grep -q -- it will return "true" immediately when the first match is found, not having to read the entire file.
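The same idea can be done inside find itself, which avoids parsing file names from a pipe. In this sketch grep -q acts as a test, and the second -exec runs only for files where grep succeeded. The directory layout and file contents are invented for the demo. Redirecting with > rewrites the content in place, so name, ownership and permissions of the target stay as they were.

```shell
# Hypothetical layout standing in for /home/*/public_html.
demo=$(mktemp -d)
mkdir -p "$demo/u1/public_html" "$demo/u2/public_html"
printf 'version:1.23\n' > "$demo/u1/public_html/index.php"
printf 'version:2.00\n' > "$demo/u2/public_html/index.php"
printf 'new content\n'  > "$demo/indexnew.txt"

# $1 is the replacement file, $2 is the file find matched.
find "$demo"/*/public_html -name 'index.php' \
    -exec grep -q 'version:1.23' {} \; \
    -exec sh -c 'cat "$1" > "$2"' sh "$demo/indexnew.txt" {} \;

updated=$(cat "$demo/u1/public_html/index.php")
untouched=$(cat "$demo/u2/public_html/index.php")
echo "u1: $updated"
echo "u2: $untouched"
rm -rf "$demo"
```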
If you have a bunch of files you want to replace, and you can get all of their names using wildcards you can try piping output to the tee command:
cat my_file | tee /home/*/update.txt
This should write the text in my_file to update.txt in every directory under /home that already contains an update.txt; the wildcard only matches existing files, so no new files are created.
Let me know if this helps or isn't what you want.
I am not sure that running grep without -l and then using -print is better than adding -l to grep so that it lists the files directly.
find /home/*/public_html -name "index.php" -exec grep -l "version:1.23" '{}' \; | xargs -I {} cp /home/index.php {}
Here is the detail of the -l option:
-l, --files-with-matches
Suppress normal output; instead print the name of each input
file from which output would normally have been printed. The
scanning will stop on the first match. (-l is specified by
POSIX.)

linux include all directories

How would I type a file path in the Ubuntu terminal to include all files in all sub-directories?
If I had a main directory called "books" but had a ton of subdirectories with all sorts of different names containing files, how would I type a path to include all files in all subdirectories?
/books/???
From within the books top directory, you can use the command:
find . -type f
Then, if you wanted to, say run each file through cat, you could use the xargs command:
find . -type f | xargs cat
For more info, use commands:
man find
man xargs
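Plain find | xargs splits on whitespace, so it breaks on names like "sci fi". A variant that hands the names to the command directly is -exec ... {} +. Here is a small sketch; the books tree below is made up for the demo.

```shell
# Hypothetical books tree with a space in a directory and a file name.
demo=$(mktemp -d)
mkdir -p "$demo/books/sci fi"
printf 'one\n' > "$demo/books/a.txt"
printf 'two\n' > "$demo/books/sci fi/b c.txt"

# {} + passes as many file names as fit to a single cat invocation.
count=$(find "$demo/books" -type f -exec cat {} + | wc -l)
echo "lines read: $count"
rm -rf "$demo"
```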
It is unclear what you actually want. You will probably get a better solution if you ask directly about your original problem, rather than about another problem you came across while trying to work around it.
Do you mean something like the following?
file */*
where the first * expands for all subdirectories and the second * for all contained files ?
I have chosen the file command arbitrarily. You can choose whatever command you want to run on the files you get shell-expanded.
Also note that directories will also be included (if not excluded by name, e.g. *.png or *.txt).
The wildcard * is not exactly the file path to include all files in all subdirectories but it expands to all files (or directories) matching the wildcard expression as a list, e.g. file1 file2 file3 file4. See also this tutorial on shell expansion.
Note that there may be easy solutions to related problems, such as copying all files in all subdirectories (cp -a, for example; see man cp).
I also like find very much. It's quite easy to generate more flexible search patterns in combination with grep. To provide a random example:
du `find . | grep some_pattern_to_occur | grep -v some_pattern_to_not_occur`
./books/*
For example, assuming I'm in the parent directory of 'books':
ls ./books/*
EDIT:
Actually, to list all the tree recursively you should use:
ls -R ./books/*

Searching for information in files in several directories

I need to check several files in different locations for specific information.
How can I write a script that searches for the word given as an argument in several directories?
The directories are in different locations. For example:
/home/check1/
/opt/log/
/var/status/
Besides find, you could also use a loop:
for DIR in /home/check1 /opt/log /var/status; do
grep -R searchword "$DIR"
done
At the very simplest, it boils down to
find . -name '*.c' | xargs grep word
to find a given word in all the .c files in the current directory and below.
grep -R may also work for you, but it can be a problem if you don't want to search all files.
Use the grep -R (recursive) option and give grep multiple directory arguments.
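Wrapped up as a script, that could look roughly like this. The function name search_dirs and the demo directories are hypothetical stand-ins for the real paths in the question, and -R is a GNU/BSD grep option rather than POSIX.

```shell
# search_dirs WORD DIR...  (hypothetical helper name)
search_dirs() {
    word=$1
    shift
    # -F: fixed string, -l: print only the names of matching files,
    # -e/--: keep a word or path starting with "-" from being read as an option.
    grep -R -l -F -e "$word" -- "$@"
}

# Demo directories standing in for /home/check1, /opt/log and /var/status.
demo=$(mktemp -d)
mkdir -p "$demo/check1" "$demo/log" "$demo/status"
printf 'disk error\n' > "$demo/check1/a.log"
printf 'all fine\n'   > "$demo/log/b.log"
printf 'error: x\n'   > "$demo/status/c.log"

hits=$(search_dirs error "$demo/check1" "$demo/log" "$demo/status" | wc -l)
echo "files containing the word: $hits"
rm -rf "$demo"
```

In a real script you would call search_dirs "$1" /home/check1 /opt/log /var/status so the word comes from the command line.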
Try find (see http://content.hccfl.edu/pollock/Unix/FindCmd.htm) with your search words and the directories.
The man page of grep should explain what you need. Anyway, if you need to search recursively you can use:
grep -R --include=PATTERN "string_to_search" "$directory"
You can also use:
--exclude=PATTERN to skip some file
--exclude-dir=PATTERN to skip some directories
The other option is to use find to get the files and pipe the list to grep to search for the strings.

How can I use grep to find a word inside a folder?

In Windows, I would have done a search for finding a word inside a folder. Similarly, I want to know if a specific word occurs inside a directory containing many sub-directories and files. My searches for grep syntax show that I must specify the filename, i.e. grep string filename.
Now, I do not know the filename, so what do I do?
A friend suggested to do grep -nr string, but I don't know what this means and I got no results with it (there is no response until I issue a Ctrl + C).
grep -nr 'yourString*' .
The dot at the end searches the current directory. Meaning for each parameter:
-n Show relative line number in the file
'yourString*' String for search, followed by a wildcard character
-r Recursively search subdirectories listed
. Directory for search (current directory)
grep -nr 'MobileAppSer*' . (Would find MobileAppServlet.java or MobileAppServlet.class or MobileAppServlet.txt; 'MobileAppASer*.*' is another way to do the same thing.)
To see more options, run man grep.
grep -nr string my_directory
Additional notes: this satisfies the syntax grep [options] string filename because in Unix-like systems, a directory is a kind of file (there is a term "regular file" to specifically refer to entities that are called just "files" in Windows).
grep -nr string reads the content to search from the standard input, that is why it just waits there for input from you, and stops doing so when you press ^C (it would stop on ^D as well, which is the key combination for end-of-file).
GREP: Global Regular Expression Print/Parser/Processor/Program.
You can use this to search the current directory.
You can specify -R for "recursive", which means the program searches in all subfolders, and their subfolders, and their subfolder's subfolders, etc.
grep -R "your word" .
-n will print the line number, where it matched in the file.
-i will search case-insensitive (capital/non-capital letters).
grep -inR "your regex pattern" .
There's also:
find directory_name -type f -print0 | xargs -0 grep -li word
but that might be a bit much for a beginner.
find is a general purpose directory walker/lister, -type f means "look for plain files rather than directories and named pipes and what have you", -print0 means "print them on the standard output using null characters as delimiters". The output from find is sent to xargs -0 and that grabs its standard input in chunks (to avoid command line length limitations) using null characters as a record separator (rather than the standard newline) and then applies grep -li word to each set of files. On the grep, -l means "list the files that match" and -i means "case insensitive"; you can usually combine single character options so you'll see -li more often than -l -i.
If you don't use -print0 and -0 then you'll run into problems with file names that contain spaces so using them is a good habit.
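A quick way to convince yourself is to try it on a name with spaces. The demo directory below is invented, and -print0 and -0 are GNU/BSD extensions rather than POSIX.

```shell
# Hypothetical demo tree with spaces in a directory and a file name.
demo=$(mktemp -d)
mkdir -p "$demo/my docs"
printf 'A Word here\n' > "$demo/my docs/note one.txt"
printf 'nothing\n'     > "$demo/plain.txt"

# Null-delimited names survive the pipe intact; -i makes "Word" match "word".
found=$(find "$demo" -type f -print0 | xargs -0 grep -li word)
echo "$found"
rm -rf "$demo"
```

Without -print0 and -0, xargs would split "my docs/note one.txt" into separate arguments and grep would complain about files that do not exist.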
grep -nr search_string search_dir
will do a RECURSIVE (meaning the directory and all its sub-directories) search for the search_string (as correctly answered by usta).
The reason you were not getting any answers with your friend's suggestion of:
grep -nr string
is because no directory was specified. If you are in the directory that you want to do the search in, you have to do the following:
grep -nr string .
It is important to include the '.' character, as this tells grep to search THIS directory.
Why not do a recursive search to find all instances in sub directories:
grep -r 'text' *
This works like a charm.
Similar to the answer posted by @eLRuLL, an easier way to specify a search that respects word boundaries is to use the -w option:
grep -wnr "yourString" .
Another option that I like to use:
find folder_name -type f -exec grep your_text {} \;
-type f returns you only files and not folders
-exec and {} run grep on the files that were found in the search (the exact syntax is "-exec command {} \;").
grep -r "yourstring" *
Will find "yourstring" in any files and folders
If you want to look for two different strings at the same time, you can use the -E option and separate the alternatives with |. For example:
grep -rE "yourstring|yourotherstring" * will list the locations where yourstring or yourotherstring matches.
The answer you selected is fine, and it works, but it isn't the correct way to do it, because:
grep -nr yourString* .
This actually searches for the string "yourStrin" followed by "g" zero or more times. (Unquoted, the * may also be expanded by the shell before grep sees it.)
So the proper way to do it is:
grep -nr '\w*yourString\w*' .
This command searches the current folder for the string with any number of word characters before and after it.
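To see the difference for yourself, you can feed grep a few sample lines and count the matches. The sample lines are made up, and -w is a GNU/BSD option.

```shell
# Four made-up sample lines, one per line.
sample=$(printf '%s\n' 'yourStrin' 'yourString' 'myyourStringggs' 'other')

# "g*" means zero or more g, so even "yourStrin" matches.
star=$(printf '%s\n' "$sample" | grep -c 'yourString*')
# Plain substring search: matches anywhere inside a longer word.
plain=$(printf '%s\n' "$sample" | grep -c 'yourString')
# -w: the match must be a whole word.
word=$(printf '%s\n' "$sample" | grep -cw 'yourString')

echo "star=$star plain=$plain word=$word"
```

So a trailing * only loosens the match, while -w is what narrows it to the exact word.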
grep -R "string" /directory/
-R also follows symlinks, whereas -r does not.
The following sample looks recursively for your search string in the *.xml and *.js files located somewhere inside the folders path1, path2 and path3.
grep -r --include='*.xml' --include='*.js' "your search string" path1 path2 path3
So you can search in a subset of the files for many directories, just providing the paths at the end.
Run the following command in a terminal, inside the directory. It will recursively check inside subdirectories too.
grep -r 'your string goes here' *
Don't use grep. Download Silver Searcher or ripgrep. They're both outstanding, and way faster than grep or ack with tons of options.

Resources