$(find -X) equivalent on linux

I'm trying to use the bash find command to create an array of elements and search elements inside it with a for loop.
I would need to do something like this:
for file in $(find -? dirname) ; do
echo element contains $file
done
I know that you can do find -X on mac (which parses spaces and \n as xargs does), but is there any way to do so on linux?
Thank you in advance for your reply

The OSX find manpage says of -X:
However, you may wish to consider the -print0 primary in conjunction
with ``xargs -0'' as an effective alternative.
So you could take that advice:
find dirname -print0 | xargs -0 grep foo # or whatever it is you wanted to do
Alternatively, find can execute a command for each found file itself:
find dirname -exec echo found {} \;
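Note the escaped ; that terminates the command: the backslash stops the shell from treating it as its own command separator, so find receives it. It's just something you have to suck up about find -exec.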
Or for xargs-like chunking:
find dirname -exec grep foo {} +
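If the goal is still a shell loop over an array of results, as in the question, one possible approach on Linux is to read NUL-delimited names into a bash array. This is only a sketch: it assumes bash 4.4 or later (for mapfile -d) and a find that supports -print0, and the directory and file names are made up for the demonstration.

```shell
#!/usr/bin/env bash
# Scratch directory with awkward names (hypothetical, created only for this sketch).
dir=$(mktemp -d)
touch "$dir/plain.txt" "$dir/with space.txt" "$dir/two  spaces.txt"

# Read NUL-delimited paths into an array; -d '' needs bash 4.4+.
mapfile -d '' -t files < <(find "$dir" -type f -print0)

for file in "${files[@]}"; do
    echo "element contains $file"
done

count=${#files[@]}
rm -rf "$dir"
```

Each array element is one complete path, so names with spaces stay in one piece.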

Related

How to grep contents from list of files from Linux ls or find command

I am running the command "find . -name '*.txt'" and getting a list of files.
The output looks like this:
./bsd/contrib/amd/ldap-id.txt
./bsd/contrib/expat/tests/benchmark/README.txt
./bsd/contrib/expat/tests/README.txt
./bsd/lib/libc/softfloat/README.txt
and so on,
Out of these files, how can I run grep to read their contents and keep only the files that contain a certain keyword, e.g. "version"?
xargs is a great way to accomplish this, and it's already been covered.
The -exec option of find is also useful for this. It will perform a command over all files returned from find.
To invoke grep as few times as possible, passing multiple filenames to each call:
find . -name '*.txt' -exec grep -H 'foo' {} +
Alternately, to invoke grep exactly once for each file found:
find . -name '*.txt' -exec grep -H 'foo' {} ';'
In either case, {} is like a placeholder for the values from find; if your shell is zsh, it may be necessary to escape it, as in '{}'.
There are several ways to accomplish this.
If there are non-.txt files which might usefully contain the keyword:
grep -r KEYWORD *
This uses the recursive directory search option of grep.
To search only .txt files:
find . -name '*.txt' -exec grep KEYWORD {} \;
or
find . -name '*.txt' -exec grep KEYWORD {} +
or
find . -name '*.txt' -execdir grep KEYWORD {} +
The first runs grep once for each matching file. The second runs grep far fewer times, accumulating many matched files before each invocation. The third form runs grep from within each directory that contains matching files, once per directory.
There is usually a function built into find for that, but to be portable across platforms, I typically use xargs. Say you want to find all the XML files in or below the current directory and list each occurrence of 'foo'; you can do this:
find ./ -type f -name '*.xml' -print0 | xargs -0 -n 1 grep -H foo
It should be self-explanatory except for the -print0, which separates filenames with NULs rather than newlines, and the -0, which tells xargs to use those NULs rather than interpreting spaces and quotes as syntax (which can confuse it if filenames contain either).
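To see the effect of -print0 and -0 on a name containing a space, here is a small sketch on a throwaway directory. The file names and contents are invented for the demonstration, and it assumes a find and xargs that support -print0 and -0 (GNU and BSD both do).

```shell
# Throwaway tree; the file names are invented for the demonstration.
dir=$(mktemp -d)
printf 'foo here\n' > "$dir/has space.xml"
printf 'nothing\n'  > "$dir/plain.xml"

# NUL-separated names survive the trip through xargs intact,
# so grep sees "has space.xml" as one argument.
matches=$(find "$dir" -type f -name '*.xml' -print0 | xargs -0 grep -l foo)
echo "$matches"

rm -rf "$dir"
```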

find -exec doesn't recognize argument

I'm trying to count the total lines in the files within a directory. To do this I am trying to use a combination of find and wc. However, when I run find . -exec wc -l {}\;, I receive the error find: missing argument to -exec. I can't see any apparent issues. Any ideas?
You simply need a space between {} and \;
find . -exec wc -l {} \;
Note that if there are any subdirectories below the current location, wc will print an error message for each of them that looks something like this:
wc: ./subdir: Is a directory
To avoid that problem, you may want to tell find to restrict the search to regular files:
find . -type f -exec wc -l {} \;
Another note: using the -exec option is a good idea. Too often people pipe commands together expecting the same result; here, for instance, that would be:
find . -type f | xargs wc -l
The problem with piping commands in this manner is that it breaks if any file name contains spaces. For instance, if a file were named "a b", wc would receive "a" and then "b" separately, and you would get two error messages: a: no such file and b: no such file.
Unless you know for a fact that your file names never contain spaces (or non-printable characters), if you do need to pipe commands together, you need to tell all the tools in the pipeline to use the NUL character (\0) as a separator instead of whitespace. So the previous command would become:
find . -type f -print0 | xargs -0 wc -l
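The difference between the two pipelines can be reproduced on a scratch directory. This is a sketch only; the file name "a b" comes from the example above, while the variable names are made up.

```shell
dir=$(mktemp -d)
printf 'one\ntwo\n' > "$dir/a b"

# Plain pipe: xargs splits "a b" on the space, so wc is handed two bogus names.
if find "$dir" -type f | xargs wc -l >/dev/null 2>&1; then
    plain=ok
else
    plain=failed
fi

# NUL-delimited pipe: the name arrives as a single argument.
safe=$(find "$dir" -type f -print0 | xargs -0 wc -l | awk '{print $1}')

echo "plain pipe: $plain, line count via -print0: $safe"
rm -rf "$dir"
```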
With version 4.0 or later of bash, you don't need your find command at all:
shopt -s globstar
wc -l **/*
There's no simple way to skip directories, which as pointed out by Gui Rava you might want to do, unless you can differentiate files and directories by name alone. For example, maybe directories never have . in their name, while all the files have at least one extension:
wc -l **/*.*
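If a loop is acceptable instead of a single wc call, one workaround is to test each globstar match and skip anything that is not a regular file. A minimal sketch, assuming bash 4.0 or later; the directory layout is invented for the demonstration.

```shell
#!/usr/bin/env bash
shopt -s globstar

dir=$(mktemp -d)
mkdir -p "$dir/sub/deeper"
printf 'a\nb\n' > "$dir/top"
printf 'c\n'    > "$dir/sub/deeper/leaf"

total=0
for path in "$dir"/**/*; do
    [[ -f $path ]] || continue        # skip directories and other non-files
    lines=$(wc -l < "$path")
    total=$((total + lines))
done
echo "total lines: $total"
rm -rf "$dir"
```

This runs wc once per file, so it gives up the single-command brevity of wc -l **/*.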

Remove files not containing a specific string

I want to find the files not containing a specific string (in a directory and its sub-directories) and remove those files. How I can do this?
The following will work:
find . -type f -print0 | xargs --null grep -Z -L 'my string' | xargs --null rm
This will first use find to print the names of all the files in the current directory and any subdirectories. These names are printed with a null terminator rather than the usual newline separator (try piping the output to od -c to see the effect of the -print0 argument).
Then the --null parameter to xargs tells it to accept null-terminated inputs. xargs will then call grep on a list of filenames.
The -Z argument to grep works like the -print0 argument to find, so grep will print out its results null-terminated (which is why the final call to xargs needs a --null option too). The -L argument to grep causes grep to print the filenames of those files on its command line (that xargs has added) which don't match the regular expression:
my string
If you want simple matching without regular expression magic then add the -F option. If you want more powerful regular expressions then give a -E argument. It's a good habit to use single quotes rather than double quotes, as this protects you against any shell magic being applied to the string (such as variable substitution).
Finally you call xargs again to get rid of all the files that you've found with the previous calls.
The problem with calling grep directly from find with -exec ... \; is that grep is invoked once per file rather than once for a whole batch of files, as xargs does. Batching is much faster if you have lots of files. Also don't be tempted to do stuff like:
rm $(some command that produces lots of filenames)
It's always better to pass it to xargs as this knows the maximum command-line limits and will call rm multiple times each time with as many arguments as it can.
Note that this solution would have been simpler without the need to cope with files containing white space and new lines.
Alternatively
grep -r -L -Z 'my string' . | xargs --null rm
will work too (and is shorter). The -r argument to grep causes it to read all files in the directory and recursively descend into any subdirectories. Use the find ... approach if you want to do some other tests on the files as well (such as age or permissions).
Note that any of the single letter arguments, with a single dash introducer, can be grouped together (for instance as -rLZ). But note also that find does not use the same conventions and has multi-letter arguments introduced with a single dash. This is for historical reasons and hasn't ever been fixed because it would have broken too many scripts.
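Before running a pipeline that ends in rm, it may be worth trying it on a scratch directory with a dry run first. The sketch below assumes GNU grep (for -Z and -L) and uses -0, the short form of --null for xargs; the file names are made up.

```shell
dir=$(mktemp -d)
printf 'contains my string\n' > "$dir/keep me.txt"
printf 'something else\n'     > "$dir/drop me.txt"

# Dry run: list what would be removed (tr makes the NUL separators readable).
find "$dir" -type f -print0 | xargs -0 grep -Z -L 'my string' | tr '\0' '\n'

# Real run: same pipeline, with rm as the final consumer.
find "$dir" -type f -print0 | xargs -0 grep -Z -L 'my string' | xargs -0 rm --

remaining=$(ls "$dir")
echo "remaining: $remaining"
rm -rf "$dir"
```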
With GNU grep and bash:
grep -rLZ "$str" . | while IFS= read -rd '' x; do rm "$x"; done
Use a find solution if portability is needed. This is slightly faster.
EDIT: This is how you SHOULD NOT do this! Reason is given here. Thanks to @ormaaj for pointing it out!
find . -type f | grep -v "exclude string" | xargs rm
Note: grep pattern will match against full file path from current directory (see find . -type f output)
One possibility is
find . -type f '!' -exec grep -q "my string" {} \; -exec echo rm {} \;
You can remove the echo if the output of this preview looks correct.
The equivalent with -delete is
find . -type f '!' -exec grep -q "my string" {} \; -delete
but then you don't get the nice preview option.
To remove files whose names do not contain a specific string (these patterns match file names, not file contents):
Bash:
Enable the extglob shell option to use extended patterns:
shopt -s extglob
Then remove all files whose names don't contain the string "fix":
rm !(*fix*)
To delete all the files whose names contain neither "fix" nor "class":
rm !(*fix*|*class*)
Zsh:
Enable the extended_glob zsh option to use extended patterns:
setopt extended_glob
Remove all files whose names don't contain the string, in this example "fix":
rm -- ^*fix*
To delete all the files whose names contain neither "fix" nor "class":
rm -- ^(*fix*|*class*)
This also works for extensions; you only need to change the pattern: (.zip), (.doc), etc.
Here are the sources:
https://www.tecmint.com/delete-all-files-in-directory-except-one-few-file-extensions/
https://codeday.me/es/qa/20190819/1296122.html
I can think of a few ways to approach this. Here's one: find and grep to generate a list of files with no match, and then xargs rm them.
find yourdir -type f -exec grep -F -L 'yourstring' '{}' + | xargs -d '\n' rm
This assumes GNU tools (grep -L and xargs -d are non-portable) and of course no filenames with newlines in them. It has the advantage of not running grep and rm once per file, so it'll be reasonably fast. I recommend testing it with "echo" in place of "rm" just to make sure it picks the right files before you unleash the destruction.
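This worked for me. The -f test limits deletion to regular files. Note that the loop matches the string against each file's path, not its contents, and that it breaks on paths containing whitespace.
myString="keepThis"
for x in $(find ./)
do if [[ -f $x && ! $x =~ $myString ]]
then rm -- "$x"
fi
done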
Another solution (although not as fast). The top solution didn't work in my case because the string I needed to use in place of 'my string' has special characters.
find -type f ! -name "*my string*" -exec rm {} \; -print

Loop over file names from `find`?

If I run this command:
sudo find . -name *.mp3
then I can get a listing of lots of mp3 files.
Now I want to do something with each mp3 file in a loop. For example, I could create a while loop, and inside assign the first file name to the variable file. Then I could do something with that file. Next I could assign the second file name to the variable file and do with that, etc.
How can I realize this using a linux shell command? Any help is appreciated, thanks!
For this, use the read builtin:
sudo find . -name '*.mp3' |
while IFS= read -r filename
do
echo "$filename" # ... or any other command using $filename
done
Provided that your filenames don't use the newline (\n) character, this should work fine.
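If newlines in file names are a concern, a variant of the same loop can read NUL-delimited names instead. This is a sketch assuming bash (read -d '' is not POSIX) and a find that supports -print0; the file names are invented.

```shell
#!/usr/bin/env bash
dir=$(mktemp -d)
touch "$dir/one.mp3" "$dir/two words.mp3" "$dir/notes.txt"

count=0
# IFS= keeps leading/trailing blanks, -r keeps backslashes, -d '' splits on NUL.
while IFS= read -r -d '' filename; do
    echo "processing: $filename"
    count=$((count + 1))
done < <(find "$dir" -name '*.mp3' -print0)

echo "handled $count files"
rm -rf "$dir"
```

Feeding the loop through process substitution rather than a pipe keeps it in the current shell, so variables such as count are still set after the loop ends.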
My favourites are
find . -name '*.mp3' -exec cmd {} \;
or
find . -name '*.mp3' -print0 | xargs -0 cmd
While Loop
As others have pointed out, you can frequently use a while read loop to read filenames line by line. It has the drawback of not allowing line endings in filenames (who uses those?).
xargs vs. -exec cmd {} +
In response to the comments saying that -exec ... + is better: I prefer xargs because it is more versatile:
it works with commands other than find
it allows 'batching' (grouping) of command lines, say xargs -n 10 (ten at a time)
it allows parallelizing, say xargs -P4 (at most 4 concurrent processes at a time)
it allows privilege separation (as in the OP's case, where he uses sudo find): using -exec would run all commands as the root user, whereas with xargs that isn't necessary:
sudo find -name '*.mp3' -print0 | sudo xargs -0 require_root.sh
sudo find -name '*.mp3' -print0 | xargs -0 nonroot.sh
in general, pipes are just more versatile (logging, sorting, remoting, caching, checking, parallelizing etc, you can do that)
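The batching point from the list above can be seen in a tiny sketch. The items a to e are placeholders standing in for file names.

```shell
# Five NUL-terminated items, handed to echo two at a time.
batched=$(printf '%s\0' a b c d e | xargs -0 -n 2 echo)
echo "$batched"
```

Each output line corresponds to one invocation of echo, so five items with -n 2 produce three invocations. Adding -P would run those invocations concurrently, at the cost of a nondeterministic output order.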
How about using the -exec option to find?
find . -name '*.mp3' -exec mpg123 '{}' \;
That will call the command mpg123 for every file found, i.e. it will play all the files, in the order they are found.
for file in $(sudo find . -name '*.mp3');
do
# do something with "$file" (this form breaks on names containing whitespace)
done

How do I include a pipe | in my linux find -exec command?

This isn't working. Can this be done in find? Or do I need to xargs?
find -name 'file_*' -follow -type f -exec zcat {} \| agrep -dEOE 'grep' \;
The solution is easy: execute via sh
... -exec sh -c "zcat {} | agrep -dEOE 'grep' " \;
The job of interpreting the pipe symbol as an instruction to run multiple processes and pipe the output of one process into the input of another process is the responsibility of the shell (/bin/sh or equivalent).
In your example you can choose to use your top-level shell to perform the piping, like so:
find -name 'file_*' -follow -type f -exec zcat {} \; | agrep -dEOE 'grep'
In terms of efficiency, this costs one invocation of find, numerous invocations of zcat, and one invocation of agrep.
Only a single agrep process is spawned, and it processes all the output produced by the numerous invocations of zcat.
If you for some reason would like to invoke agrep multiple times, you can do:
find . -name 'file_*' -follow -type f \
-printf "zcat %p | agrep -dEOE 'grep'\n" | sh
This constructs a list of commands using pipes to execute, then sends these to a new shell to actually be executed. (Omitting the final "| sh" is a nice way to debug or perform dry runs of command lines like this.)
In terms of efficiency, this costs one invocation of find, one invocation of sh, numerous invocations of zcat and numerous invocations of agrep.
The most efficient solution in terms of number of command invocations is the suggestion from Paul Tomblin:
find . -name "file_*" -follow -type f -print0 | xargs -0 zcat | agrep -dEOE 'grep'
... which costs one invocation of find, one invocation of xargs, a few invocations of zcat and one invocation of agrep.
find . -name "file_*" -follow -type f -print0 | xargs -0 zcat | agrep -dEOE 'grep'
You can also pipe to a while loop that can perform multiple actions on each file that find locates. Here is one that looks through jar archives for a given Java class file, in a folder with a large distribution of jar files:
find /usr/lib/eclipse/plugins -type f -name \*.jar | while IFS= read -r jar; do echo "$jar"; jar tf "$jar" | fgrep IObservableList ; done
The key point is that the while loop contains multiple commands, separated by semicolons, that reference the passed-in file name, and these commands can include pipes. In this example I echo the name of each jar file, then list the contents of the archive, filtering for a given class name. The output looks like:
/usr/lib/eclipse/plugins/org.eclipse.core.contenttype.source_3.4.1.R35x_v20090826-0451.jar
/usr/lib/eclipse/plugins/org.eclipse.core.databinding.observable_1.2.0.M20090902-0800.jar
org/eclipse/core/databinding/observable/list/IObservableList.class
/usr/lib/eclipse/plugins/org.eclipse.search.source_3.5.1.r351_v20090708-0800.jar
/usr/lib/eclipse/plugins/org.eclipse.jdt.apt.core.source_3.3.202.R35x_v20091130-2300.jar
/usr/lib/eclipse/plugins/org.eclipse.cvs.source_1.0.400.v201002111343.jar
/usr/lib/eclipse/plugins/org.eclipse.help.appserver_3.1.400.v20090429_1800.jar
In my bash shell (Xubuntu 10.04/Xfce) the matched class name really is shown in bold, because fgrep highlights the matched string; this makes it really easy to scan down the list of hundreds of jar files that were searched and spot any matches.
On Windows you can do the same thing with:
for /R %j in (*.jar) do @echo %j & @jar tf %j | findstr IObservableList
Note that on Windows the command separator is '&', not ';', and that the '@' suppresses the echo of the command, giving tidy output just like the Linux find output above. However, findstr does not make the matched string bold, so you have to look a bit closer at the output to see the matched class name. It turns out that the Windows 'for' command knows quite a few tricks, such as looping through text files...
enjoy
I found that running a string shell command (sh -c) works best, for example:
find -name 'file_*' -follow -type f -exec bash -c "zcat \"{}\" | agrep -dEOE 'grep'" \;
If you are looking for a simple alternative, this can be done using a loop:
for i in $(find -name 'file_*' -follow -type f); do
zcat $i | agrep -dEOE 'grep'
done
or, in a more general and easier-to-understand form:
for i in $(YOUR_FIND_COMMAND); do
YOUR_EXEC_COMMAND_AND_PIPES
done
and replace any {} by $i in YOUR_EXEC_COMMAND_AND_PIPES
Here's what you should do:
find -name 'file_*' -follow -type f -exec sh -c 'zcat "$1" | agrep -dEOE "grep"' sh {} \;
I tried a couple of these answers and they didn't work for me. @flolo's answer doesn't work correctly if your filenames have special characters. According to this answer:
The find command executes the command directly. The command, including the filename argument, will not be processed by the shell or anything else that might modify the filename. It's very safe.
You lose that safety if you put the {} inside the sh command string.
There is a potential problem with @Rolf W. Rasmussen's answer. Yes, it handles special characters (as far as I know), but if the find output is too long, you won't be able to execute xargs -0 ...: there is a command line character limit set by the kernel and sometimes your shell. Coincidentally, every time I want to pipe commands from a find, I run into this limit.
But, they do bring up a valid point regarding the performance limitations. I'm not sure how to overcome that, though personally, I've never run into a situation where my suggestion is too slow.
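One possible way to keep the safety of passing file names as arguments while starting fewer shells is to combine sh -c with the + terminator and loop over the arguments inside the shell. This is only a sketch: it substitutes gzip -dc and plain grep for zcat and agrep so that it runs without extra tools, and the file names and contents are invented.

```shell
dir=$(mktemp -d)
printf 'alpha\ngrep me\n' | gzip > "$dir/file_one two.gz"
printf 'also grep\n'      | gzip > "$dir/file_three.gz"

# The file names arrive as positional parameters, never as shell source text.
# The trailing "sh" fills $0; find appends the found paths as $1, $2, ...
out=$(find "$dir" -name 'file_*' -type f \
        -exec sh -c 'for f in "$@"; do gzip -dc "$f" | grep "grep"; done' sh {} +)
echo "$out"

hits=$(printf '%s\n' "$out" | grep -c .)
rm -rf "$dir"
```

This still runs one decompressor and one grep per file, but only one sh per batch of files instead of one per file.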
