Copy specific files recursively - linux

This problem has been discussed extensively but I couldn't find a solution that would help me.
I'm trying to selectively copy files from a directory tree into a specific folder. After reading some Q&A, here's what I tried:
cp `find . -name "*.pdf" -type f` ../collect/
I am in the right parent directory and there indeed is a collect directory a level above. Now I'm getting the error: cp: invalid option -- 'o'
What is going wrong?

To handle difficult file names:
find . -name "*.pdf" -type f -exec cp {} ../collect/ \;
By default, find will print the file names that it finds. If one uses the -exec option, it will instead pass the file names on to a command of your choosing, in this case a cp command which is written as:
cp {} ../collect/ \;
The {} tells find where to insert the file name. The end of the command given to -exec is marked by a semicolon. Normally, the shell would eat the semicolon. So, we escape the semicolon with a backslash so that it is passed as an argument to the find command.
Because find gives the file name to cp directly without interference from the shell, this approach works for even the most difficult file names.
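To see the difference on a throwaway tree, here is a minimal sketch (all directory and file names are invented for the demonstration). A name containing a space is split into several words by the backtick version; with GNU cp a resulting word such as -old.pdf is then parsed as options, which is one plausible way to end up with an "invalid option" error. The -exec version hands each path to cp as a single argument.

```shell
#!/bin/sh
# Throwaway tree; every name below is made up for illustration.
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src/sub dir" "$tmp/collect"
printf 'x\n' > "$tmp/src/plain.pdf"
printf 'x\n' > "$tmp/src/sub dir/notes -old.pdf"

cd "$tmp/src" || exit 1

# Unquoted command substitution: the shell splits "./sub dir/notes -old.pdf"
# into three words, so cp receives names that do not exist.
if cp `find . -name "*.pdf" -type f` ../collect/ 2>/dev/null; then
    split_result=ok
else
    split_result=failed
fi
rm -f ../collect/*

# find -exec passes each path to cp as one argument, spaces and all.
find . -name "*.pdf" -type f -exec cp {} ../collect/ \;
copied=$(ls ../collect | wc -l)

printf 'backtick version: %s; files copied by -exec: %s\n' "$split_result" "$copied"
```

Quoting the substitution would not rescue the backtick version either, since all names would then be passed as one single argument.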
More efficiency
The above runs cp on every file found. If there are many files, that would be a lot of processes started. If one has GNU tools, that can be avoided as follows:
find . -name '*.pdf' -type f -exec cp -t ../collect {} +
In this variant of the command, find will supply many file names for each single invocation of cp, potentially greatly reducing the number of processes that need to be started.
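A rough way to check the claim about process counts is to replace cp with a tiny inline script that prints one line per invocation. This is only a sketch; the five file names are invented, and with a very large number of files the + form may run more than once.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
for i in 1 2 3 4 5; do : > "$tmp/file$i.pdf"; done

# Each run of the inner sh prints exactly one line,
# so counting lines counts how often the command was started.
per_file=$(find "$tmp" -name '*.pdf' -type f -exec sh -c 'echo run' sh {} \; | wc -l)
batched=$(find "$tmp" -name '*.pdf' -type f -exec sh -c 'echo run' sh {} + | wc -l)

printf 'with ;: %s invocations, with +: %s\n' "$per_file" "$batched"
```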

Related

Command Linux to copy files from a certain weekday

I am figuring out a command to copy files that are modified on a Saturday.
find -type f -printf '%Ta\t%p\n'
This way the line starts with the weekday.
When I combine this with a 'egrep' command using a regular expression (starts with "za") it shows only the files which start with "za".
find -type f -printf '%Ta\t%p\n' | egrep "^(za)"
("za" is a Dutch abbreviation for "zaterdag", which means Saturday.)
This works just fine.
Now I want to copy the files with this command:
find -type f -printf '%Ta\t%p\n' -exec cp 'egrep "^(za)" *' /home/richard/test/ \;
Unfortunately it doesn't work.
Any suggestions?
The immediate problem is that -printf and -exec are independent of each other. You want to process the result of -printf to decide whether or not to actually run the -exec part. Also, of course, passing an expression in single quotes simply passes a static string, and does not evaluate the expression in any way.
The immediate fix to the evaluation problem is to use a command substitution instead of single quotes, but the problem that the -printf function's result is not available to the command substitution still remains (and anyway, the command substitution would happen before find runs, not while it runs).
A common workaround would be to pass a shell script snippet to -exec, but that still doesn't expose the -printf function to the -exec part.
find whatever -printf whatever -exec sh -c '
case $something in za*) cp "$1" "$0"; esac' "$DEST_DIR" {} \;
so we have to figure out a different way to pass the $something here.
(The above uses a cheap trick to pass the value of $DEST_DIR into the subshell so we don't have to export it. The first argument to sh -c ... ends up in $0.)
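The $0 trick can be checked in isolation. In this small sketch the destination path and the file name are placeholder values, not anything from the question:

```shell
#!/bin/sh
# DEST_DIR and the file name are placeholders for illustration only.
DEST_DIR=/tmp/dest

# The first argument after the script text becomes $0 of the inner shell,
# the next one becomes $1.
result=$(sh -c 'printf "%s|%s" "$0" "$1"' "$DEST_DIR" "some file.txt")
echo "$result"
```

The inner script is in single quotes, so the outer shell does not expand $0 and $1 itself.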
Here is a somewhat roundabout way to accomplish this. We create a format string which can be passed to sh for evaluation. In order to avoid pesky file names, we print the inode numbers of matching files, then pass those to a second instance of find for performing the actual copying.
find \( -false $(find -type f \
-printf 'case %Ta in za*) printf "%%s\\n" "-o -inum %i";; esac\n' |
sh) \) -exec cp -t "$DEST_DIR" {} \+
Using the inode number means any file name can be processed correctly (including one containing newlines, single or double quotes, etc) but may increase running time significantly, because we need two runs of find. If you have a large directory tree, you will probably want to refactor this for your particular scenario (maybe run only in the current directory, and create a wrapper to run it in every directory you want to examine ... thinking out loud here; not sure it helps actually).
This uses features of GNU find which are not available e.g. in *BSD (including OSX). If you are not on Linux, maybe consider installing the GNU tools.
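A less roundabout alternative, which is not the approach described above but only a sketch of a different one: let a small inline script ask date for the weekday of each file. This assumes GNU date, where -r FILE prints the file's modification time; %u is the ISO weekday number, so it does not depend on the locale. The directory layout and file names are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
DEST_DIR=$tmp/dest                      # hypothetical destination
mkdir -p "$tmp/src" "$DEST_DIR"
: > "$tmp/src/sat file.txt"
: > "$tmp/src/mon.txt"
touch -t 202301071200 "$tmp/src/sat file.txt"   # 7 Jan 2023, a Saturday
touch -t 202301091200 "$tmp/src/mon.txt"        # 9 Jan 2023, a Monday

# For each batch of files, copy those last modified on a Saturday.
# %u prints the ISO weekday number (1 = Monday ... 7 = Sunday).
find "$tmp/src" -type f -exec sh -c '
    for f do
        if [ "$(date -r "$f" +%u)" -eq 6 ]; then
            cp "$f" "$0"
        fi
    done' "$DEST_DIR" {} +

copied=$(ls "$DEST_DIR")
echo "copied: $copied"
```

This starts one date process per file, so it trades the second find run for many small processes.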
What you can do is use a command substitution. Something like
cp $(find -type f -printf '%Ta\t%p\n' | egrep "^(za)" | cut -f2-) "$DEST_DIR"
The cut -f2- strips the weekday column, so that the result of your find and grep is just the filenames (and full paths, at that); this will copy all the files that match your criteria to whatever you set $DEST_DIR to.
EDIT As mentioned in the comments, this won't work if your filenames contain spaces. If that's the case, you can do something like this (it still breaks on names containing newlines):
find -type f -printf '%Ta\t%p\n' | egrep "^(za)" | cut -f2- | while IFS= read -r file; do cp "$file" "$DEST_DIR"; done

Linux: how to look for files with a certain extension in hierarchy and execute command whenever one is found?

I have a directory hierarchy, whose names do not follow a pattern. E.g.
parent
bcgegec
hfiwehfiuwe
huiwwuifegeufg
whegwgefyfeg
hfeohfeiofe
chidchuehugfe
dedewdewf
tegtgetg
gtgetgtg
and so on.
Inside some of such directories there is a file with "gr" extension. I need to find each of such files, cd to its dir and execute "gnuplot" command having the .gr file as argument. I tried the following to nest two find commands, but the {} of the inner one does not work as I need. The outer find should iterate for every directory, and the inner find should look for the presence of the .gr file.
find $parentDir -type d -exec sh -c '(cd {} && find . -maxdepth 1 -name *.gr -exec /usr/bin/gnuplot {} \;)' \;
Perhaps this is what you are looking for:
find . -type f -name "*.gr" -execdir /usr/bin/gnuplot {} \;
Read through man find for other useful information.
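To see what -execdir does without having gnuplot installed, here is a sketch that substitutes a small inline script reporting the directory it was started in. The tree and the .gr file names are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/parent/a/b" "$tmp/parent/c"
: > "$tmp/parent/a/b/plot1.gr"
: > "$tmp/parent/c/plot2.gr"

# Stand-in for gnuplot: print the name of the working directory
# and the argument that find passed in.
report=$(find "$tmp/parent" -type f -name '*.gr' \
    -execdir sh -c 'printf "%s: %s\n" "$(basename "$(pwd)")" "$1"' sh {} \; | sort)
printf '%s\n' "$report"
```

Each line should show the directory that contains the file, which is where gnuplot would be run. Note that GNU find refuses to use -execdir when PATH contains a relative entry such as the current directory.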

Insert line into multi specified files

I want to insert a line at the start of multiple files of a specific type, located in the current directory or its subdirectories.
I know that using
find . -name "*.csv"
can help me to list the files I want to use for inserting.
and using
sed -i '1icolumn1,column2,column3' test.csv
can be used to insert one line at the start of a file,
but I do not know how to pipe the file names from the "find" command to the "sed" command.
Could anybody give me a suggestion?
Or is there a better solution?
BTW, is it possible to do this in a one-line command?
Try using xargs to pass the output of find as command-line arguments to the next command, here sed:
find . -type f -name '*.csv' -print0 | xargs -0 sed -i '1icolumn1,column2,column3'
Another option would be to use the -exec option of find:
find . -type f -name '*.csv' -exec sed -i '1icolumn1,column2,column3' {} \;
Note: xargs is usually more efficient than -exec ... \; because it passes many file names to each sed invocation, and it can run several processes in parallel with its -P option.
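The xargs form can be tried on a throwaway tree first. This is a sketch that assumes GNU sed (for -i and the one-line 1i syntax); the CSV names and contents are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/sub dir"
printf '1,2,3\n' > "$tmp/a.csv"
printf '4,5,6\n' > "$tmp/sub dir/b c.csv"

# -print0 and -0 keep names containing spaces or newlines intact.
find "$tmp" -type f -name '*.csv' -print0 |
    xargs -0 sed -i '1icolumn1,column2,column3'

first_a=$(head -n 1 "$tmp/a.csv")
first_b=$(head -n 1 "$tmp/sub dir/b c.csv")
echo "$first_a / $first_b"
```

One caveat: the 1i command does nothing on a completely empty file, because there is no first line to insert before.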
This way :
find . -type f -name "*.csv" -exec sed -i '1icolumn1,column2,column3' {} +
-exec does all the magic here. The relevant part of man find:
-exec command ;
Execute command; true if 0 status is returned. All following arguments
to find are taken to be arguments to the command until an argument consisting
of `;' is encountered. The string `{}' is replaced by the current file name
being processed everywhere it occurs in the arguments to the command, not just
in arguments where it is alone, as in some versions of find. Both of
these constructions might need to be escaped (with a `\') or quoted to protect
them from expansion by the shell. See the EXAMPLES section for examples of
the use of the -exec option. The specified command is run once for each
matched file. The command is executed in the starting directory. There
are unavoidable security problems surrounding use of the -exec action;
you should use the -execdir option instead.

Why does find -exec mv {} ./target/ + not work?

I want to know exactly what {} \; and {} \+ and | xargs ... do. Please clarify these with explanations.
The 3 commands below run and output the same result, but the first one takes a little more time and its output format is also a little different.
find . -type f -exec file {} \;
find . -type f -exec file {} \+
find . -type f | xargs file
That's because the first one runs the file command once for every file coming from the find command. So, basically it runs as:
file file1.txt
file file2.txt
But the latter two commands run the file command once for all files, like below:
file file1.txt file2.txt
Then I ran the following commands; the first one runs without problems, but the second one gives an error message.
find . -type f -iname '*.cpp' -exec mv {} ./test/ \;
find . -type f -iname '*.cpp' -exec mv {} ./test/ \+ #gives error:find: missing argument to `-exec'
For the command with {} \+, it gives me the error message
find: missing argument to `-exec'
Why is that? Can anyone please explain what I am doing wrong?
The manual page (or the online GNU manual) pretty much explains everything.
find -exec command {} \;
For each result, command {} is executed. All occurrences of {} are replaced by the filename. The ; is prefixed with a backslash to prevent the shell from interpreting it.
find -exec command {} +
Each result is appended to command and executed afterwards. Because of command-line length limits, the command may still be executed more than once, which the manual page confirms:
the total number of invocations of the command will be much less than the number of matched files.
Note this quote from the manual page:
The command line is built in much the same way that xargs builds its command lines
That's why no characters are allowed between {} and + except for whitespace. + makes find detect that the arguments should be appended to the command just like xargs.
The solution
Luckily, the GNU implementation of mv can accept the target directory as an argument, with either -t or the longer option --target-directory. Its usage is:
mv -t target file1 file2 ...
Your find command becomes:
find . -type f -iname '*.cpp' -exec mv -t ./test/ {} \+
From the manual page:
-exec command ;
Execute command; true if 0 status is returned. All following arguments to find are taken to be arguments to the command until an argument consisting of `;' is encountered. The string `{}' is replaced by the current file name being processed everywhere it occurs in the arguments to the command, not just in arguments where it is alone, as in some versions of find. Both of these constructions might need to be escaped (with a `\') or quoted to protect them from expansion by the shell. See the EXAMPLES section for examples of the use of the -exec option. The specified command is run once for each matched file. The command is executed in the starting directory. There are unavoidable security problems surrounding use of the -exec action; you should use the -execdir option instead.
-exec command {} +
This variant of the -exec action runs the specified command on the selected files, but the command line is built by appending each selected file name at the end; the total number of invocations of the command will be much less than the number of matched files. The command line is built in much the same way that xargs builds its command lines. Only one instance of `{}' is allowed within the command. The command is executed in the starting directory.
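Both behaviours can be checked on scratch files. The sketch below assumes GNU mv (for -t); the source tree and file names are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src/deep" "$tmp/test"
: > "$tmp/src/one.cpp"
: > "$tmp/src/deep/Two.CPP"

# {} followed by another argument before +: find does not accept this form.
if find "$tmp/src" -type f -iname '*.cpp' -exec mv {} "$tmp/test/" + 2>/dev/null; then
    misplaced=accepted
else
    misplaced=rejected
fi

# With -t the destination comes first, so {} can sit directly before +.
find "$tmp/src" -type f -iname '*.cpp' -exec mv -t "$tmp/test/" {} +
moved=$(ls "$tmp/test" | wc -l)

printf 'misplaced form: %s; files moved with -t: %s\n' "$misplaced" "$moved"
```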
I encountered the same issue on Mac OSX, using a ZSH shell: in this case there is no -t option for mv, so I had to find another solution.
However the following command succeeded:
find .* * -maxdepth 0 -not -path '.git' -not -path '.backup' -exec mv '{}' .backup \;
The secret was to quote the braces. No need for the braces to be at the end of the exec command.
I tested under Ubuntu 14.04 (with BASH and ZSH shells), it works the same.
However, when using the + sign, it seems indeed that it has to be at the end of the exec command.
The standard equivalent of find -iname ... -exec mv -t dest {} + for find implementations that don't support -iname or mv implementations that don't support -t is to use a shell to re-order the arguments:
find . -name '*.[cC][pP][pP]' -type f -exec sh -c '
exec mv "$@" /dest/dir/' sh {} +
By using -name '*.[cC][pP][pP]', we also avoid the reliance on the current locale to decide what's the uppercase version of c or p.
Note that +, contrary to ; is not special in any shell so doesn't need to be quoted (though quoting won't harm, except of course with shells like rc that don't support \ as a quoting operator).
The trailing / in /dest/dir/ is so that mv fails with an error instead of renaming foo.cpp to /dest/dir in the case where only one cpp file was found and /dest/dir didn't exist or wasn't a directory (or symlink to directory).
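A runnable sketch of the argument re-ordering idea follows. It is a small variation: instead of hard-coding the destination in the inline script, it passes a scratch directory as the first argument. "$@" expands to all file names of the batch as separate words; the paths and file names are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
mkdir -p "$tmp/src/deep" "$tmp/dest"
: > "$tmp/src/one.cpp"
: > "$tmp/src/deep/two.CPP"
: > "$tmp/src/keep.txt"

# The inline script receives the destination as $1, followed by the batch
# of file names; shift drops the destination so "$@" holds only the files.
find "$tmp/src" -name '*.[cC][pP][pP]' -type f -exec sh -c '
    dest=$1; shift
    exec mv "$@" "$dest"/' sh "$tmp/dest" {} +

moved=$(ls "$tmp/dest" | wc -l)
echo "files moved: $moved"
```

The literal sh after the script only fills $0, which the shell uses in its own error messages.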
find . -name "*.mp3" -exec mv --target-directory=/home/d0k/Музика/ {} \+
To restate the difference between + and \;: + appends the files to the end of the exec command and then runs it, whereas \; runs the command once for each file.
Escaping the + is not needed, but it is not the cause of the error either: find . -type f -iname '*.cpp' -exec mv {} ./test/ \+ fails because {} must come directly before the +, so find . -type f -iname '*.cpp' -exec mv {} ./test/ + fails in the same way.
I haven't used xargs in a long time, but I think it works like +.

Can the find command's "exec" feature start a program in the background?

I would like to do something like:
find . -iname "*Advanced*Linux*Program*" -exec kpdf {} & \;
Possible? Some other comparable method available?
Firstly, it won't work as you've typed, because the shell will interpret it as
find . -iname "*Advanced*Linux*Program*" -exec kpdf {} &
\;
which is an invalid find run in the background, followed by a command that doesn't exist.
Even escaping it doesn't work, since find -exec actually execs the argument list given, instead of giving it to a shell (which is what actually handles & for backgrounding).
Once you know that that's the problem, all you have to do is start a shell to give these commands to:
find . -iname "*Advanced*Linux*Program*" -exec sh -c '"$0" "$@" &' kpdf {} \;
On the other hand, given what you're trying to do, I would suggest one of
find ... -exec kfmclient exec {} \; # KDE
find ... -exec gnome-open {} \; # Gnome
find ... -exec xdg-open {} \; # any modern desktop
which will open the file in the default program as associated by your desktop environment.
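The backgrounding through sh -c can be tried without a PDF viewer. In this sketch the viewer is replaced by a stand-in that sleeps for a second and then leaves a marker file; the file names are invented.

```shell
#!/bin/sh
tmp=$(mktemp -d)
trap 'rm -rf "$tmp"' EXIT
: > "$tmp/a.pdf"
: > "$tmp/b.pdf"

# Stand-in for a viewer: wait a second, then leave a marker next to the file.
# The & is part of the sh -c script, so the inner shell does the backgrounding.
find "$tmp" -name '*.pdf' -type f -exec sh -c '
    { sleep 1; : > "$1.done"; } >/dev/null 2>&1 &' sh {} \;

right_after=$(ls "$tmp" | grep -c 'done$')   # find has already returned
sleep 3
later=$(ls "$tmp" | grep -c 'done$')

printf 'markers right after find: %s, after waiting: %s\n' "$right_after" "$later"
```

If find had waited for each stand-in to finish, the markers would already exist at the first check.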
If your goal is just not having to close one pdf in order to see the next one as opposed to display each pdf in its own separate instance, you might try
find . -iname "*Advanced*Linux*Program*" -exec kpdf {} \+ &
With the plussed variant, -exec builds the command line like xargs would, so all the files found would be handed to the same instance of kpdf. The & at the end then affects the whole find. With very large numbers of files found it might still open them in batches because command lines grow too long, but with respect to resource consumption on your system this may even be a good thing. ;)
kpdf has to be able to take a list of files on the command line for this to work; as I don't use it myself, I don't know whether it can.
