Find all files matching 'name' on a Linux system, and search within them for 'text'

I need to find all instances of 'filename.ext' on a linux system and see which ones contain the text 'lookingfor'.
Is there a set of linux command line operations that would work?

find / -type f -name filename.ext -exec grep -l 'lookingfor' {} +
Using a + to terminate the command is more efficient than \; because find sends a whole batch of files to grep instead of sending them one by one. This avoids a fork/exec for each single file which is found.
A while ago I did some testing to compare the performance of xargs vs {} + vs {} \; and I found that {} + was faster. Here are some of my results:
time find . -name "*20090430*" -exec touch {} +
real 0m31.98s
user 0m0.06s
sys 0m0.49s
time find . -name "*20090430*" | xargs touch
real 1m8.81s
user 0m0.13s
sys 0m1.07s
time find . -name "*20090430*" -exec touch {} \;
real 1m42.53s
user 0m0.17s
sys 0m2.42s
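The difference in the number of started processes can be made visible with a small experiment. The sketch below is only an illustration: the temporary directory and the three file names are made up, and each started sh prints one line so that the invocations can be counted.

```shell
# Hypothetical demo directory with three files (names are made up).
demo_dir=$(mktemp -d)
touch "$demo_dir/a.txt" "$demo_dir/b.txt" "$demo_dir/c.txt"

# \; form: find starts one sh per file, so three lines are printed.
per_file=$(find "$demo_dir" -type f -name '*.txt' -exec sh -c 'echo started' sh {} \; | wc -l | tr -d ' ')

# + form: find hands all three paths to a single sh, so one line is printed.
batched=$(find "$demo_dir" -type f -name '*.txt' -exec sh -c 'echo started' sh {} + | wc -l | tr -d ' ')

echo "per-file form: $per_file processes; batched form: $batched process"
rm -rf "$demo_dir"
```

With only a handful of files the whole list fits on one command line, which is why the batched form starts a single process here; with very many files find would split them into several batches.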

Go to respective directory and type the following command.
find . -name "*.ext" | xargs grep 'lookingfor'

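A simpler one would be:
find / -type f -name filename.ext -print0 | xargs -0 grep 'lookingfor'
Passing -print0 to find and -0 to xargs separates the file names with NUL bytes, so names containing spaces or newlines are handled correctly. xargs itself takes care of splitting a large number of files into several grep invocations.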
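A minimal sketch of what -print0 and -0 change, assuming a find and xargs that support these options (GNU and BSD versions do). The temporary directory and the file name "my file.ext" are invented for the demonstration.

```shell
# Hypothetical demo: one file whose name contains a space.
demo_dir=$(mktemp -d)
printf 'lookingfor\n' > "$demo_dir/my file.ext"

# Newline-delimited output: xargs splits the path at the space, so grep is
# given two names that do not exist and finds nothing.
plain=$(find "$demo_dir" -type f -name '*.ext' | xargs grep -l 'lookingfor' 2>/dev/null || true)

# NUL-delimited output: the path reaches grep in one piece.
safe=$(find "$demo_dir" -type f -name '*.ext' -print0 | xargs -0 grep -l 'lookingfor')

echo "without -print0: '${plain}'"
echo "with -print0:    '${safe##*/}'"
rm -rf "$demo_dir"
```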

Try:
find / -type f -name filename.ext -exec grep -H -n 'lookingfor' {} \;
find searches recursively, starting from the root /, for files named filename.ext. For every file found it runs grep on that file, searching for lookingfor, and for each match prints the file name (-H) and the line number (-n).

I find the following command the simplest way:
grep -R --include="filename.ext" lookingfor
or add -i to search case insensitive:
grep -i -R --include="filename.ext" lookingfor

Related

Moving files with a specific modification date; "find | xargs ls | grep | -exec" fails w/ "-exec: command not found"

I am using CentOS 7.
I want to find files that have a specific name and a specific date and then move them to another folder. I am issuing the command
find -name 'fsimage*' | xargs ls -ali | grep 'Oct 20' | -exec mv {} /hdd/fordelete/ \;
and get the following error:
-bash: -exec: command not found
xargs: ls: terminated by signal 13
As another answer already explains, -exec is an action for find; you can't use it as a shell command. Conversely, xargs and grep are commands, and you can't use them as find actions, just as you can't use a pipe | inside find.
But more importantly, even though you could use ls and grep on find's result just to move files older than some amount of time, you shouldn't. Such a pipeline is fragile and fails on many corner cases, like symlinks, files with newlines in their names, etc.
Instead, use find. You'll find it quite powerful.
For example, to mv files modified more than 7 days ago, use the -mtime test:
find -name 'fsimage*' -mtime +7 -exec mv '{}' /some/dir/ \;
To mv files modified on a specific/reference date, e.g. 2017-10-20, you can use the -newerXY test:
find -name 'fsimage*' -newermt 2017-10-20 ! -newermt 2017-10-21 -exec mv '{}' /some/dir/ \;
Also, if your mv supports the -t option (to give the target directory first and multiple files after it), you can use the {} + placeholder in find to pass multiple files at once, reducing the total number of mv invocations (thanks @CharlesDuffy):
find -name 'fsimage*' -mtime +7 -exec mv -t /some/dir/ '{}' +
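The date window built from two -newermt tests can be tried out safely on throwaway files. This is a sketch under assumptions: the directory and the three file names are invented, the timestamps are set with touch -t, and -newermt is taken to be available (it is in GNU find and, as far as I know, in BSD find as well).

```shell
# Hypothetical files modified the day before, on, and the day after 2017-10-20.
demo_dir=$(mktemp -d)
touch -t 201710191200 "$demo_dir/fsimage_before"
touch -t 201710201200 "$demo_dir/fsimage_match"
touch -t 201710211200 "$demo_dir/fsimage_after"

# Newer than 2017-10-20 00:00 but not newer than 2017-10-21 00:00.
matched=$(find "$demo_dir" -name 'fsimage*' -newermt 2017-10-20 ! -newermt 2017-10-21 -exec basename {} \;)

echo "$matched"
rm -rf "$demo_dir"
```

Printing the matches first and swapping in mv only afterwards is a cheap way to check that the window selects the intended files.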
The -exec as you wrote it is meaningless: you seem to be mixing find syntax with that of the shell (-exec has to be passed to find as an argument).
There are probably more concise ways of doing this, but this should do what you expect:
find -name 'fsimage*' -type f | xargs ls -ali | grep 'Oct 20' | awk '{ print $NF }' | while read file; do mv "$file" /hdd/fordelete/ ; done
Nevertheless, you should take care not to just copy and paste things from the web that you do not really understand; you may wreck your system.

Using 'find' to return filenames without extension

I have a directory (with subdirectories), of which I want to find all files that have a ".ipynb" extension. But I want the 'find' command to just return me these filenames without the extension.
I know the first part:
find . -type f -iname "*.ipynb" -print
But how do I then get the names without the "ipynb" extension?
Any replies greatly appreciated...
To return only filenames without the extension, try:
find . -type f -iname "*.ipynb" -execdir sh -c 'printf "%s\n" "${0%.*}"' {} ';'
or (omitting -type f from now on):
find "$PWD" -iname "*.ipynb" -execdir basename {} .ipynb ';'
or:
find . -iname "*.ipynb" -exec basename {} .ipynb ';'
or:
find . -iname "*.ipynb" | sed "s/.*\///; s/\.ipynb//"
However, invoking basename once per file can be inefficient, so @CharlesDuffy's suggestion is:
find . -iname '*.ipynb' -exec bash -c 'printf "%s\n" "${@%.*}"' _ {} +
or:
find . -iname '*.ipynb' -execdir basename -s '.ipynb' {} +
Using + means that we're passing multiple files to each bash instance, so if the whole list fits into a single command line, we call bash only once.
To print full path and filename (without extension) in the same line, try:
find . -iname "*.ipynb" -exec sh -c 'printf "%s\n" "${0%.*}"' {} ';'
or:
find "$PWD" -iname "*.ipynb" -print | grep -o "[^\.]\+"
To print full path and filename on separate lines:
find "$PWD" -iname "*.ipynb" -exec dirname "{}" ';' -exec basename "{}" .ipynb ';'
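The parameter expansions used in these commands can be tried on their own, without find. A small sketch; the path is a made-up example:

```shell
# Hypothetical path, as find might print it.
path='./notebooks/analysis.v2.ipynb'

no_ext=${path%.*}      # strip the shortest trailing ".something"
file=${path##*/}       # strip the longest leading "*/", i.e. the directory part
name=${file%.ipynb}    # strip exactly the ".ipynb" suffix

printf '%s\n' "$no_ext" "$file" "$name"
```

Note that ${path%.*} removes only the last dot-separated component, so "analysis.v2" keeps its inner dot.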
Here's a simple solution:
find . -type f -iname "*.ipynb" | sed 's/\.ipynb$//1'
I found this in a bash oneliner that simplifies the process without using find
for n in *.ipynb; do echo "${n%.ipynb}"; done
If you need the name with its directory but without the extension:
find . -type f -iname "*.ipynb" -exec sh -c 'f=$(basename "$1" .ipynb);d=$(dirname "$1");echo "$d/$f"' sh {} \;
find . -type f -iname "*.ipynb" | grep -oP '.*(?=[.])'
The -o flag outputs only the matched part. The -P flag matches according to Perl regular expressions. This is necessary to make the lookahead (?=[.]) work.
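Perl one-liner
To print what you want:
find . | perl -a -F/ -lne 'print $F[-1] if /\.ipynb$/'
To print what you do not want, negate the condition:
find . | perl -a -F/ -lne 'print $F[-1] if !/\.ipynb$/'
NOTE
In a Perl regular expression the shell glob * is written .*, and a literal dot has to be escaped. So the glob *.ipynb corresponds to the pattern .*\.ipynb$ (or simply \.ipynb$).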
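Another suggestion uses tr. Note that tr -d deletes every occurrence of each listed character (., i, p, b, y, n) anywhere in the output, not the suffix string itself, so this only gives the right result if none of those characters appear elsewhere in the paths, including the leading ./ that find prints: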
find . -type f -iname "*.ipynb" -print | tr -d ".ipbyn"
If you don't know what the extension is, or there are multiple extensions, you could use this:
find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'
and for a list of files with no duplicates (originally differing in path or extension)
find . -type f -exec basename {} \;|perl -pe 's/(.*)\..*$/$1/;s{^.*/}{}'|sort|uniq
Another easy way which uses basename is:
find . -type f -iname '*.ipynb' -exec basename -s '.ipynb' {} +
Using + will reduce the number of invocations of the command (manpage):
-exec command {} +
This variant of the -exec action runs the specified command on
the selected files, but the command line is built by appending
each selected file name at the end; the total number of
invocations of the command will be much less than the number
of matched files. The command line is built in much the same
way that xargs builds its command lines. Only one instance of
'{}' is allowed within the command, and (when find is being
invoked from a shell) it should be quoted (for example, '{}')
to protect it from interpretation by shells. The command is
executed in the starting directory. If any invocation with
the `+' form returns a non-zero value as exit status, then
find returns a non-zero exit status. If find encounters an
error, this can sometimes cause an immediate exit, so some
pending commands may not be run at all. For this reason -exec
my-command ... {} + -quit may not result in my-command
actually being run. This variant of -exec always returns
true.
With -s, basename accepts multiple filenames and removes the specified suffix from each (manpage):
-a, --multiple
support multiple arguments and treat each as a NAME
-s, --suffix=SUFFIX
remove a trailing SUFFIX; implies -a

How to find total size of all files under the ownership of a user?

I'm trying to find out the total size of all files owned by a given user.
I've tried this:
find $myfolder -user $myuser -type f -exec du -ch {} +
But this gives me an error:
missing argument to exec
and I don't know how to fix it. Can somebody help me with this?
You just need to terminate the -exec. If you want the totals for each directory, -type d is possibly required.
find $myfolder -user $myuser -type d -exec du -ch {} \;
Use:
find $myfolder -user gisi -type f -print0 | xargs -0 du -sh
where user gisi is my cat ;)
Note the option -s for summarize
Further note that I'm using find ... -print0 which on the one hand separates filenames by 0 bytes, which are one of the few characters which are not allowed in filenames, and on the other hand xargs -0 which uses the 0 byte as the delimiter. This makes sure that even exotic filenames won't be a problem.
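When many files are involved, xargs may run du more than once, and each run then prints its own summary. If a single number is wanted, one alternative (a different technique from the du pipeline above, and measuring apparent size in bytes rather than disk usage) is to concatenate the files and count the bytes. The directory and file contents below are invented, and the current user's numeric ID stands in for $myuser.

```shell
# Hypothetical demo files of 5 and 7 bytes, owned by the current user.
demo_dir=$(mktemp -d)
printf '12345' > "$demo_dir/a"
printf '1234567' > "$demo_dir/b"
owner=$(id -u)   # assumption: the numeric uid is used in place of a user name

# Concatenate every matching file and count the bytes once.
total_bytes=$(find "$demo_dir" -user "$owner" -type f -exec cat {} + | wc -c | tr -d ' ')

echo "total: $total_bytes bytes"
rm -rf "$demo_dir"
```

This reads every file in full, so it is only sensible for modest amounts of data.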
Some versions of the find command do not accept "+" as the terminator of -exec.
Use "\;" instead of "+".

Loop over file names from `find`?

If I run this command:
sudo find . -name *.mp3
then I can get a listing of lots of mp3 files.
Now I want to do something with each mp3 file in a loop. For example, I could create a while loop, and inside assign the first file name to the variable file. Then I could do something with that file. Next I could assign the second file name to the variable file and do with that, etc.
How can I realize this using a linux shell command? Any help is appreciated, thanks!
For this, use the read builtin:
sudo find . -name '*.mp3' |
while IFS= read -r filename
do
    echo "$filename" # ... or any other command using $filename
done
Provided that your filenames don't use the newline (\n) character, this should work fine.
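If file names with newlines do have to be handled, one option is to let find pass the names as arguments to a small inline shell loop instead of piping them. A sketch with made-up file names; the printf line is a placeholder for the real work:

```shell
# Hypothetical demo directory; one name contains a space.
demo_dir=$(mktemp -d)
touch "$demo_dir/one.mp3" "$demo_dir/two words.mp3"

# find appends the matched paths after the "sh" placeholder (which becomes $0);
# "for file do" then walks over them as "$@", one intact path at a time.
processed=$(find "$demo_dir" -name '*.mp3' -exec sh -c '
    for file do
        printf "processing: %s\n" "${file##*/}"
    done' sh {} +)

printf '%s\n' "$processed"
rm -rf "$demo_dir"
```

Because the names travel as arguments rather than as lines of text, no character in a file name can be mistaken for a separator.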
My favourites are
find . -name '*.mp3' -exec cmd {} \;
or
find . -name '*.mp3' -print0 | xargs -0 cmd
While Loop
As others have pointed out, you can often use a while read loop to read filenames line by line. Its drawback is that it cannot handle newlines in filenames (but who uses those?).
xargs vs. -exec cmd {} +
Summarizing the comments saying that -exec...+ is better, I prefer xargs because it is more versatile:
works with other commands than just find
allows 'batching' (grouping) in command lines, say xargs -n 10 (ten at a time)
allows parallelizing, say xargs -P4 (max 4 concurrent processes running at a time)
does privilege separation (as in the OP's case, where he uses sudo find): using -exec would run all commands as the root user, whereas with xargs that isn't necessary:
sudo find -name '*.mp3' -print0 | sudo xargs -0 require_root.sh
sudo find -name '*.mp3' -print0 | xargs -0 nonroot.sh
in general, pipes are just more versatile (logging, sorting, remoting, caching, checking, parallelizing etc, you can do that)
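The batching behaviour of xargs -n mentioned in the list above can be seen without find at all. In this sketch the five input items are arbitrary placeholders and echo stands in for the real command:

```shell
# Five placeholder items, at most two per command line.
batched=$(printf '%s\n' a b c d e | xargs -n 2 echo)

# Each output line corresponds to one echo invocation.
printf '%s\n' "$batched"
```

Adding -P to the same xargs call would let several of those invocations run concurrently, in which case the order of the output lines is no longer guaranteed.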
How about using the -exec option to find?
find . -name '*.mp3' -exec mpg123 '{}' \;
That will call the command mpg123 for every file found, i.e. it will play all the files, in the order they are found.
for file in $(sudo find . -name '*.mp3')
do
    : # do something with "$file" (names containing whitespace get split)
done

how to find files containing a string using egrep

I would like to find the files containing a specific string under Linux.
I tried something like this, but it did not work:
find . -name *.txt | egrep mystring
Here you are sending the file names (output of the find command) as input to egrep; you actually want to run egrep on the contents of the files.
Here are a couple of alternatives:
find . -name "*.txt" -exec egrep mystring {} \;
or even better
find . -name "*.txt" -print0 | xargs -0 egrep mystring
Check the find manual page to see what the individual options do.
The first approach will spawn a new process for every file, while the second will pass more than one file as argument to egrep; the -print0 and -0 flags are needed to deal with potentially nasty file names (allowing to separate file names correctly even if a file name contains a space, for example).
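The difference between filtering find's output and searching file contents can be shown with two throwaway files. This is a sketch with invented names; it uses grep -E, which is equivalent to egrep (recent GNU grep releases warn that egrep is obsolescent).

```shell
# Hypothetical files: one matches by content, the other only by name.
demo_dir=$(mktemp -d)
printf 'contains mystring here\n' > "$demo_dir/notes.txt"
printf 'nothing relevant\n' > "$demo_dir/mystring.txt"

# Piping into grep filters the list of paths, so the file NAME is matched.
by_name=$(find "$demo_dir" -name '*.txt' | grep -E mystring)

# Running grep through -exec searches the file CONTENTS.
by_content=$(find "$demo_dir" -name '*.txt' -exec grep -El mystring {} +)

echo "matched by name:    ${by_name##*/}"
echo "matched by content: ${by_content##*/}"
rm -rf "$demo_dir"
```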
try:
find . -name '*.txt' | xargs egrep mystring
There are two problems with your version:
Firstly, *.txt will first be expanded by the shell, giving you a listing of files in the current directory which end in .txt, so for instance, if you have the following:
[dsm@localhost:~]$ ls *.txt
test.txt
[dsm@localhost:~]$
your find command will turn into find . -name test.txt. Just try the following to illustrate:
[dsm@localhost:~]$ echo find . -name *.txt
find . -name test.txt
[dsm@localhost:~]$
Secondly, egrep does not take filenames from STDIN. To convert them to arguments you need to use xargs
find . -name *.txt | egrep mystring
That will not work, as egrep will search for mystring in the output of find . -name *.txt, which is just the paths of the .txt files.
Instead, you can use xargs:
find . -name '*.txt' | xargs egrep mystring
You could use
find . -iname '*.txt' -exec egrep mystring \{\} \;
Here's an example that will return the file paths of all *.log files that have a line that begins with ERROR:
find . -name "*.log" -exec egrep -l '^ERROR' {} \;
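egrep also has a recursive option you can use. Restrict it to the wanted files with --include, since a bare *.log would be expanded by the shell in the current directory only:
egrep -R --include='*.log' "pattern" .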
If you only want the filenames:
find . -type f -name '*.txt' -exec egrep -l pattern {} \;
If you want filenames and matches:
find . -type f -name '*.txt' -exec egrep pattern {} /dev/null \;
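The extra /dev/null argument works because grep prefixes matches with the file name only when it is given more than one file. A minimal sketch with an invented file; grep -E is used as the equivalent of egrep:

```shell
# Hypothetical single file containing the pattern.
demo_dir=$(mktemp -d)
printf 'pattern here\n' > "$demo_dir/a.txt"

# One file argument: only the matching line is printed.
without=$(grep -E pattern "$demo_dir/a.txt")

# Two file arguments (the second one is empty): the file name is prefixed.
with=$(grep -E pattern "$demo_dir/a.txt" /dev/null)

printf '%s\n' "$without" "$with"
rm -rf "$demo_dir"
```

Where it is available, grep's -H option has the same effect without the dummy argument.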
