How do I include a pipe | in my linux find -exec command? - linux

This isn't working. Can this be done in find? Or do I need to xargs?
find -name 'file_*' -follow -type f -exec zcat {} \| agrep -dEOE 'grep' \;

The solution is easy: execute via sh
... -exec sh -c "zcat {} | agrep -dEOE 'grep' " \;

The job of interpreting the pipe symbol as an instruction to run multiple processes and pipe the output of one process into the input of another process is the responsibility of the shell (/bin/sh or equivalent).
In your example you can either choose to use your top level shell to perform the piping like so:
find -name 'file_*' -follow -type f -exec zcat {} \; | agrep -dEOE 'grep'
In terms of efficiency, this costs one invocation of find, numerous invocations of zcat, and one invocation of agrep.
This would result in only a single agrep process being spawned which would process all the output produced by numerous invocations of zcat.
If you for some reason would like to invoke agrep multiple times, you can do:
find . -name 'file_*' -follow -type f \
-printf "zcat %p | agrep -dEOE 'grep'\n" | sh
This constructs a list of commands using pipes to execute, then sends these to a new shell to actually be executed. (Omitting the final "| sh" is a nice way to debug or perform dry runs of command lines like this.)
In terms of efficiency, this costs one invocation of find, one invocation of sh, numerous invocations of zcat, and numerous invocations of agrep.
The most efficient solution in terms of number of command invocations is the suggestion from Paul Tomblin:
find . -name "file_*" -follow -type f -print0 | xargs -0 zcat | agrep -dEOE 'grep'
... which costs one invocation of find, one invocation of xargs, a few invocations of zcat and one invocation of agrep.
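The trade-offs above can be tried on throwaway data. This is only a sketch: grep stands in for agrep, gzip -dc stands in for zcat, the sample files are made up, and the per-file form uses sh -c rather than the -printf ... | sh construction.

```shell
# Sketch: grep replaces agrep, gzip -dc replaces zcat, sample files are made up.
tmp=$(mktemp -d)
printf 'alpha\nneedle one\n' | gzip > "$tmp/file_1.gz"
printf 'needle two\nbeta\n'  | gzip > "$tmp/file_2.gz"

# 1. One grep reads the combined output of every decompressor run.
one_grep=$(find "$tmp" -name 'file_*' -type f -exec gzip -dc {} \; | grep -c needle)

# 2. One sh, one gzip and one grep per file; the file name arrives as "$1".
per_file=$(find "$tmp" -name 'file_*' -type f \
    -exec sh -c 'gzip -dc "$1" | grep -c needle' sh {} \; | tr '\n' ' ')

# 3. xargs batches the names, so gzip runs only a few times and grep once.
batched=$(find "$tmp" -name 'file_*' -type f -print0 | xargs -0 gzip -dc | grep -c needle)

echo "one grep: $one_grep; per file: $per_file; batched: $batched"
rm -rf "$tmp"
```

Forms 1 and 3 report a single total because only one grep runs; form 2 reports one count per file.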

find . -name "file_*" -follow -type f -print0 | xargs -0 zcat | agrep -dEOE 'grep'

You can also pipe to a while loop, which can perform several actions on each file that find locates. Here is one that searches the jar archives in a folder holding a large distribution of jar files for a given Java class file:
find /usr/lib/eclipse/plugins -type f -name \*.jar | while IFS= read -r jar; do echo "$jar"; jar tf "$jar" | fgrep IObservableList ; done
The key point is that the while loop contains multiple commands, separated by semicolons, that reference the passed-in file name, and these commands can include pipes. In this example I echo the name of each matching file, then list the contents of the archive, filtering for a given class name. The output looks like:
/usr/lib/eclipse/plugins/org.eclipse.core.contenttype.source_3.4.1.R35x_v20090826-0451.jar
/usr/lib/eclipse/plugins/org.eclipse.core.databinding.observable_1.2.0.M20090902-0800.jar
org/eclipse/core/databinding/observable/list/IObservableList.class
/usr/lib/eclipse/plugins/org.eclipse.search.source_3.5.1.r351_v20090708-0800.jar
/usr/lib/eclipse/plugins/org.eclipse.jdt.apt.core.source_3.3.202.R35x_v20091130-2300.jar
/usr/lib/eclipse/plugins/org.eclipse.cvs.source_1.0.400.v201002111343.jar
/usr/lib/eclipse/plugins/org.eclipse.help.appserver_3.1.400.v20090429_1800.jar
In my bash shell (xubuntu10.04/xfce) the matched class name really is shown in bold, because fgrep highlights the matched string; this makes it easy to scan down the list of hundreds of jar files that were searched and spot any matches.
On Windows you can do the same thing with:
for /R %j in (*.jar) do @echo %j & @jar tf %j | findstr IObservableList
Note that on Windows the command separator is '&', not ';', and that the '@' suppresses the echo of the command to give tidy output just like the Linux find output above. However, findstr does not make the matched string bold, so you have to look a bit closer at the output to see the matched class name. It turns out that the Windows 'for' command knows quite a few tricks, such as looping through text files...
enjoy

I found that running a string shell command (sh -c) works best, for example:
find -name 'file_*' -follow -type f -exec bash -c "zcat \"{}\" | agrep -dEOE 'grep'" \;

If you are looking for a simple alternative, this can be done using a loop (the shell splits find's output on whitespace, so this only suits file names without spaces or newlines):
for i in $(find -name 'file_*' -follow -type f); do
zcat "$i" | agrep -dEOE 'grep'
done
or, in a more general and easier-to-understand form:
for i in $(YOUR_FIND_COMMAND); do
YOUR_EXEC_COMMAND_AND_PIPES
done
and replace any {} with "$i" in YOUR_EXEC_COMMAND_AND_PIPES.

Here's what you should do:
find -name 'file_*' -follow -type f -exec sh -c 'zcat "$1" | agrep -dEOE "grep"' sh {} \;
I tried a couple of these answers and they didn't work for me. @flolo's answer doesn't work correctly if your filenames have special characters. According to this answer:
The find command executes the command directly. The command, including the filename argument, will not be processed by the shell or anything else that might modify the filename. It's very safe.
You lose that safety if you put the {} inside the sh command string.
There is a potential problem with @Rolf W. Rasmussen's answer. Yes, it handles special characters (as far as I know), but if the find output is too long, you won't be able to execute xargs -0 ...: there is a command line character limit set by the kernel and sometimes your shell. Coincidentally, every time I want to pipe commands from a find, I run into this limit.
But, they do bring up a valid point regarding the performance limitations. I'm not sure how to overcome that, though personally, I've never run into a situation where my suggestion is too slow.
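The difference between embedding {} in the script text and passing it as an argument can be seen with a deliberately awkward file name. This is a sketch assuming GNU find (which substitutes {} inside a larger argument); the file name is hypothetical, and grep and gzip -dc stand in for agrep and zcat.

```shell
# Sketch assuming GNU find; the hostile file name below is made up.
tmp=$(mktemp -d)
name='file_x; echo INJECTED needle'
printf 'needle\n' | gzip > "$tmp/$name"

# Unsafe: find pastes the name into the script text, so sh parses it as code.
unsafe=$(find "$tmp" -name 'file_*' -type f \
    -exec sh -c 'gzip -dc {} | grep needle' \; 2>/dev/null)

# Safe: the script text is fixed and the name arrives as the argument "$1".
safe=$(find "$tmp" -name 'file_*' -type f \
    -exec sh -c 'gzip -dc "$1" | grep needle' sh {} \;)

echo "unsafe: $unsafe"
echo "safe:   $safe"
rm -rf "$tmp"
```

In the unsafe form the text after the semicolon in the file name runs as a command of its own; in the safe form the file is simply decompressed and searched.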

Related

Command Linux to copy files from a certain weekday

I am figuring out a command to copy files that are modified on a Saturday.
find -type f -printf '%Ta\t%p\n'
This way the line starts with the weekday.
When I combine this with an egrep command using a regular expression (starts with "za"), it shows only the lines which start with "za".
find -type f -printf '%Ta\t%p\n' | egrep "^(za)"
("za" is the Dutch abbreviation for "zaterdag", which means Saturday.)
This works just fine.
Now I want to copy the files with this command:
find -type f -printf '%Ta\t%p\n' -exec cp 'egrep "^(za)" *' /home/richard/test/ \;
Unfortunately it doesn't work.
Any suggestions?
The immediate problem is that -printf and -exec are independent of each other. You want to process the result of -printf to decide whether or not to actually run the -exec part. Also, of course, passing an expression in single quotes simply passes a static string, and does not evaluate the expression in any way.
The immediate fix to the evaluation problem is to use a command substitution instead of single quotes, but the problem that the -printf function's result is not available to the command substitution still remains (and anyway, the command substitution would happen before find runs, not while it runs).
A common workaround would be to pass a shell script snippet to -exec, but that still doesn't expose the -printf function to the -exec part.
find whatever -printf whatever -exec sh -c '
case $something in za*) cp "$1" "$0"; esac' "$DEST_DIR" {} \;
so we have to figure out a different way to pass the $something here.
(The above uses a cheap trick to pass the value of $DEST_DIR into the subshell so we don't have to export it. The first argument to sh -c ... ends up in $0.)
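One way to get the weekday into the snippet is to compute it there instead of taking it from -printf. The following is a sketch of that idea, not the approach described in the rest of this answer. It assumes GNU date (where -r FILE prints a file's modification time) and GNU touch -d, uses the C locale so Saturday is "Sat" rather than "za", and passes the destination as $1 rather than $0. The file names and dates are made up.

```shell
# Sketch assuming GNU date (-r FILE) and GNU touch (-d); names are made up.
src=$(mktemp -d); dest=$(mktemp -d)
touch -d '2024-01-06 12:00' "$src/sat file.txt"   # a Saturday
touch -d '2024-01-07 12:00' "$src/sun.txt"        # a Sunday

# "$dest" becomes $1 inside the script; find appends the file names after it.
find "$src" -type f -exec sh -c '
    dest=$1; shift
    for f do
        if [ "$(LC_ALL=C date -r "$f" +%a)" = Sat ]; then
            cp -- "$f" "$dest"
        fi
    done' sh "$dest" {} +

copied=$(ls "$dest")
echo "copied: $copied"
rm -rf "$src" "$dest"
```

Because the file names travel as arguments, names with spaces are handled, at the cost of one date process per file.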
Here is a somewhat roundabout way to accomplish this. We create a format string which can be passed to sh for evaluation. In order to avoid pesky file names, we print the inode numbers of matching files, then pass those to a second instance of find for performing the actual copying.
find \( -false $(find -type f \
-printf 'case %Ta in za*) printf "%%s\\n" "-o -inum %i";; esac\n' |
sh) \) -exec cp -t "$DEST_DIR" {} +
Using the inode number means any file name can be processed correctly (including one containing newlines, single or double quotes, etc) but may increase running time significantly, because we need two runs of find. If you have a large directory tree, you will probably want to refactor this for your particular scenario (maybe run only in the current directory, and create a wrapper to run it in every directory you want to examine ... thinking out loud here; not sure it helps actually).
This uses features of GNU find which are not available e.g. in *BSD (including OSX). If you are not on Linux, maybe consider installing the GNU tools.
What you can do is a shell expansion. Something like
cp $(find -type f -printf '%Ta\t%p\n' | egrep "^(za)" | cut -f2-) "$DEST_DIR"
The cut strips the leading weekday column, so the result of the find and grep is just the filenames (with their paths); this will copy all the files that match your criteria to whatever you set $DEST_DIR to.
EDIT As mentioned in the comments, this won't work if your filenames contain spaces. If that's the case, you can do something like this:
find -type f -printf '%Ta\t%p\n' | egrep "^(za)" | while IFS=$'\t' read -r day file; do cp "$file" "$DEST_DIR"; done

How to grep contents from list of files from Linux ls or find command

I am running "find . -name '*.txt'" and getting a list of files.
The output looks like this:
./bsd/contrib/amd/ldap-id.txt
./bsd/contrib/expat/tests/benchmark/README.txt
./bsd/contrib/expat/tests/README.txt
./bsd/lib/libc/softfloat/README.txt
and so on,
Out of these files, how can I run grep to read their contents and keep only those files which contain a certain keyword, e.g. "version"?
xargs is a great way to accomplish this, and it's already been covered.
The -exec option of find is also useful for this. It will perform a command over all files returned from find.
To invoke grep as few times as possible, passing multiple filenames to each call:
find . -name '*.txt' -exec grep -H 'foo' {} +
Alternately, to invoke grep exactly once for each file found:
find . -name '*.txt' -exec grep -H 'foo' {} ';'
In either case, {} is like a placeholder for the values from find; if your shell is zsh, it may be necessary to escape it, as in '{}'.
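The difference between the two terminators can be checked by counting how often the command runs. A small sketch with made-up file names: sh -c 'echo run' stands in for grep so that each invocation prints exactly one line.

```shell
# Sketch: each invocation prints one line, so counting lines counts runs.
tmp=$(mktemp -d)
touch "$tmp/a.txt" "$tmp/b.txt" "$tmp/c.txt"

per_file=$(find "$tmp" -name '*.txt' -exec sh -c 'echo run' sh {} \; | grep -c run)
batched=$(find "$tmp" -name '*.txt' -exec sh -c 'echo run' sh {} + | grep -c run)

echo "with ';': $per_file runs; with '+': $batched run"
rm -rf "$tmp"
```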
There are several ways to accomplish this.
If there are non-.txt files which might usefully contain the keyword:
grep -r KEYWORD *
This uses the recursive directory search option of grep.
To search only .txt files:
find . -name '*.txt' -exec grep KEYWORD {} \;
or
find . -name '*.txt' -exec grep KEYWORD {} +
or
find . -name '*.txt' -execdir grep KEYWORD {} +
The first runs grep for each matching file. The second runs grep far fewer times, accumulating many matched files before invoking grep. The third form runs grep once in each directory that contains matching files.
There is usually a function built into find for that, but to be portable across platforms, I typically use xargs. Say you want to find all the xml files in or below the current directory and get a list of each occurrence of 'foo', you can do this:
find ./ -type f -name '*.xml' -print0 | xargs -0 -n 1 grep -H foo
It should be self-explanatory except for the -print0, which separates filenames with NULs rather than newlines, and the -0, which tells xargs to use those NULs rather than interpreting spaces and quotes as syntax (which can confuse it if filenames contain either).

find -exec doesn't recognize argument

I'm trying to count the total lines in the files within a directory. To do this I am trying to use a combination of find and wc. However, when I run find . -exec wc -l {}\;, I receive the error find: missing argument to -exec. I can't see any apparent issues, any ideas?
You simply need a space between {} and \;
find . -exec wc -l {} \;
Note that if there are any sub-directories from the current location, wc will generate an error message for each of them that looks something like that:
wc: ./subdir: Is a directory
To avoid that problem, you may want to tell find to restrict the search to files :
find . -type f -exec wc -l {} \;
Another note: using the -exec option is a good idea. Too many times people pipe commands together expecting the same result; here, for instance, that would be:
find . -type f | xargs wc -l
The problem with piping commands in such a manner is that it breaks if any file name has spaces in it. For instance, if a file were named "a b", wc would receive "a" and then "b" separately and you would get two error messages: a: no such file and b: no such file.
Unless you know for a fact that your file names never contain spaces (or non-printable characters), if you do need to pipe commands together, you need to tell all the tools you are piping together to use the NUL character (\0) as a separator instead of whitespace. So the previous command would become:
find . -type f -print0 | xargs -0 wc -l
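The breakage is easy to reproduce on a scratch directory. A minimal sketch with a made-up file name; it counts how many output lines mention the intact name.

```shell
# Sketch: one file whose name contains a space (the name is made up).
tmp=$(mktemp -d)
printf 'one\ntwo\n' > "$tmp/a b.txt"

# Whitespace-delimited: xargs splits the name in two, so wc never sees it.
# grep -c exits non-zero when the count is 0, hence the "|| true".
plain=$(find "$tmp" -type f | xargs wc -l 2>/dev/null | grep -c 'a b.txt' || true)

# NUL-delimited: the name reaches wc intact.
nul=$(find "$tmp" -type f -print0 | xargs -0 wc -l | grep -c 'a b.txt')

echo "plain pipe: $plain line(s) for the file; -print0/-0: $nul"
rm -rf "$tmp"
```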
With version 4.0 or later of bash, you don't need your find command at all:
shopt -s globstar
wc -l **/*
There's no simple way to skip directories, which as pointed out by Gui Rava you might want to do, unless you can differentiate files and directories by name alone. For example, maybe directories never have . in their name, while all the files have at least one extension:
wc -l **/*.*

$(find -X) equivalent on linux

I'm trying to use the bash find command to create an array of elements and search elements inside it with a for loop.
I would need to do something like this:
for file in $(find -? dirname) ; do
echo element contains $file
done
I know that you can do find -X on mac (which parses spaces and \n as xargs does), but is there any way to do so on linux?
Thank you in advance for your reply
The OSX find manpage says of -X:
However, you may wish to consider the -print0 primary in conjunction
with ``xargs -0'' as an effective alternative.
So you could take that advice:
find dirname -print0 | xargs -0 grep foo # or whatever it is you wanted to do
Alternatively, find can execute a command for each found file itself:
find dirname -exec echo found {} \;
Note the escaped ; to terminate -- it's just something you have to suck up about find -exec.
Or for xargs-like chunking:
find dirname -exec grep foo {} +
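If the goal is a for loop over the found names, the loop can live inside the command that -exec runs, which avoids word splitting altogether. A sketch with a made-up directory; the ${file##*/} expansion only trims the directory part for display.

```shell
# Sketch: a scratch directory stands in for dirname; file names are made up.
tmp=$(mktemp -d)
touch "$tmp/plain" "$tmp/with space"

# find passes the names as arguments; "for file do" iterates over them.
out=$(find "$tmp" -type f -exec sh -c '
    for file do
        echo "element contains ${file##*/}"
    done' sh {} + | sort)

echo "$out"
rm -rf "$tmp"
```

Each name stays a single element even when it contains spaces, because it never passes through an unquoted command substitution.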

Insert line into multi specified files

I want to insert a line at the start of multiple files of a specified type, located in the current directory or its subdirectories.
I know that using
find . -name "*.csv"
can help me to list the files I want to use for inserting.
and using
sed -i '1icolumn1,column2,column3' test.csv
can be used to insert one line at the start of a file,
but I do NOT know how to pipe the filenames from the "find" command to the "sed" command.
Could anybody give me any suggestion?
Or is there any better solution to do this?
BTW, is it possible to do this in a one-line command?
Try using xargs to pass the output of find as command-line arguments to the next command, here sed:
find . -type f -name '*.csv' -print0 | xargs -0 sed -i '1icolumn1,column2,column3'
Another option would be to use -exec option of find.
find . -type f -name '*.csv' -exec sed -i '1icolumn1,column2,column3' {} \;
Note: xargs has been observed to be more efficient, and it can run multiple processes in parallel using the -P option.
This way:
find . -type f -name "*.csv" -exec sed -i '1icolumn1,column2,column3' {} +
-exec does all the magic here. The relevant part of man find:
-exec command ;
Execute command; true if 0 status is returned. All following arguments
to find are taken to be arguments to the command until an argument consisting
of `;' is encountered. The string `{}' is replaced by the current file name
being processed everywhere it occurs in the arguments to the command, not just
in arguments where it is alone, as in some versions of find. Both of
these constructions might need to be escaped (with a `\') or quoted to protect
them from expansion by the shell. See the EXAMPLES section for examples of
the use of the -exec option. The specified command is run once for each
matched file. The command is executed in the starting directory. There
are unavoidable security problems surrounding use of the -exec action;
you should use the -execdir option instead
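The -exec ... {} + form can be tried safely on scratch files first. A sketch assuming GNU sed (for -i and the one-line 1i form); the directory layout and file names are made up.

```shell
# Sketch assuming GNU sed (-i and one-line "1i"); file names are made up.
tmp=$(mktemp -d)
mkdir "$tmp/sub"
printf '1,2,3\n' > "$tmp/top.csv"
printf '4,5,6\n' > "$tmp/sub/nested file.csv"

# With -i, sed treats every file separately, so "1" means each file's first line.
find "$tmp" -type f -name '*.csv' -exec sed -i '1icolumn1,column2,column3' {} +

first_top=$(head -n 1 "$tmp/top.csv")
first_nested=$(head -n 1 "$tmp/sub/nested file.csv")
echo "$first_top / $first_nested"
rm -rf "$tmp"
```

One thing to keep in mind: an empty file has no line 1, so sed inserts nothing into it.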
