How does one find and copy files of the same extension, in different directories, to a single directory in linux?

So, how do I find and copy all *.a files that are in ~/DIR{1,2,3,...} to ~/tmp/foo?

I assume you mean recursively copying every file of type .a from some source location.
I haven't verified this yet, but it should do that:
find <root-of-search> -type f -name '*.a' -exec cp {} /tmp/foo \;
Replace <root-of-search> with the top of wherever you want to search from. The quotes around *.a keep the shell from expanding the pattern before find sees it. Depending on your shell, you might have to escape the ending semicolon by putting it in single quotes rather than backslashing it.
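Here is a minimal sketch of that command, run against a throwaway tree so nothing real is touched. The directory and file names (DIR1, DIR2, foo, libx.a and so on) are made up for illustration, and mktemp -d is assumed to be available.

```shell
# Build a scratch tree with *.a files in two separate directories (hypothetical names).
scratch=$(mktemp -d)
mkdir -p "$scratch/DIR1/sub" "$scratch/DIR2" "$scratch/foo"
touch "$scratch/DIR1/sub/libx.a" "$scratch/DIR2/liby.a" "$scratch/DIR2/notes.txt"

# Search both directories recursively and copy every regular *.a file into foo.
find "$scratch/DIR1" "$scratch/DIR2" -type f -name '*.a' -exec cp {} "$scratch/foo" \;

# Count what arrived in the destination, then clean up.
count=$(ls "$scratch/foo" | wc -l)
echo "files copied: $count"
rm -rf "$scratch"
```

Keep in mind that files with the same name in different source directories end up overwriting each other in the single destination directory.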

In a bash shell:
cp ~/DIR*/*.a ~/tmp/foo

find ~/DIR{1,2,...} -name '*.a' -print0 | xargs -0 -I '{}' cp '{}' ~/tmp/foo

Related

How do I write a command to search and remove files, even in cases where the files can't be found?

I’m using Amazon Linux with the bash shell. I want to find and remove some PDF files in a single line, so I tried
find /home/jboss/.jenkins/jobs/myco/workspace/ebook/ -name '*.pdf' | xargs rm
This works fine if there are PDF files. But if there are none, I get the error
rm: missing operand
Is there any way to write the above statement in a single line so that it will not fail, even if there are no files to remove?

This can easily be achieved using the -r flag to xargs.
I also recommend the version that tolerates special characters in file names:
find /home/jboss/.jenkins/jobs/myco/workspace/ebook/ -name '*.pdf' -print0 | xargs -0 -r rm
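A small sketch of the no-match case, assuming GNU xargs (where -r is available). The scratch directory is hypothetical and deliberately contains no PDF files.

```shell
# Scratch directory that contains no PDF files at all.
scratch=$(mktemp -d)

# With -r, GNU xargs does not run rm when its input is empty,
# so there is no "missing operand" error and the pipeline succeeds.
find "$scratch" -name '*.pdf' -print0 | xargs -0 -r rm
status=$?
echo "exit status: $status"
rm -rf "$scratch"
```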

Have you tried doing it all within the find command?
find /home/jboss/.jenkins/jobs/myco/workspace/ebook/ -name '*.pdf' -exec rm -f {} \;
I've always used the construct above, though I believe you can also use the -delete switch, which may be a bit more efficient. If you do use it, remember to put -delete at the end, because find evaluates its expression from left to right.
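A hedged sketch of the -delete variant on a scratch tree (-delete is an extension found in GNU and BSD find, not in POSIX). The ebook directory and file names here are invented for the demonstration.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/ebook/ch1"
touch "$scratch/ebook/a.pdf" "$scratch/ebook/ch1/b.pdf" "$scratch/ebook/keep.txt"

# Tests come first and -delete comes last, so only files that
# passed -type f and -name '*.pdf' are removed.
find "$scratch/ebook" -type f -name '*.pdf' -delete

remaining=$(find "$scratch/ebook" -type f | wc -l)
echo "files left: $remaining"
rm -rf "$scratch"
```

Running it again on a tree with no PDFs simply deletes nothing and still exits successfully.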

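You don't even need find if the files all sit directly in one directory: the shell expands basic patterns before rm runs, and -f keeps rm quiet when nothing matches. Note that this does not descend into subdirectories. Just do the following: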
rm -f path/*.pdf

how to cp files with spaces in the filename when files are provided by find

I would like to ensure that all files found by find with a given criteria are properly copied to the required location.
$from = '/some/path/to/the/files'
$ext = 'custom_file_extension'
$dest = '/new/destination/for/the/files/with/given/extension'
cp 'find $from -name "*.$ext"' $dest
The problem here is that when a file with the proper extension has a space in its name, cp cannot copy it properly.

You don't do that. You can't splat filenames with spaces that way.
You either use something from http://mywiki.wooledge.org/BashFAQ/001 to read the output from find line by line or into an array, or you use find -exec to do the copy work.
Something like this:
from='/some/path/to/the/files'
ext='custom_file_extension'
dest='/new/destination/for/the/files/with/given/extension'
find "$from" -name "*.$ext" -exec cp -t "$dest" {} +
Using -exec command + here means that find will only execute as many cp commands as it needs based on command length limits. Using -exec command ; here would run one cp-per-file-found (but is more portable to older systems).
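The -t option in that cp command (a GNU cp extension, suggested in a comment by gniourf_gniourf) names the destination up front. That is what makes -exec command + work correctly, because with + the {} has to be the last argument.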
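To check the behaviour with awkward names, here is a sketch on a scratch tree. It assumes GNU cp for -t; the src and dest directories, the dat extension and the file names are placeholders.

```shell
scratch=$(mktemp -d)
from="$scratch/src"; dest="$scratch/dest"; ext='dat'
mkdir -p "$from/deep dir" "$dest"
touch "$from/plain.dat" "$from/deep dir/with space.dat"

# find hands each name to cp as one argument, so spaces are harmless.
# -t "$dest" puts the destination first, leaving {} last before the +.
find "$from" -type f -name "*.$ext" -exec cp -t "$dest" {} +

[ -f "$dest/with space.dat" ] && echo "copied the file with a space in its name"
count=$(ls "$dest" | wc -l)
rm -rf "$scratch"
```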

Use -exec:
find "$from" -name "*.$ext" -exec cp {} "$dest" \;

You can copy the files one by one (this only covers files directly in $from, not in subdirectories):
for file in "$from"/*."$ext"; do
cp "$file" "$dest"
done
I just use a glob here, and that is enough when no recursion is needed. I think parsing the output of find may cause problems if a file name contains unusual characters.

The solution for this sort of problem is xargs -0 and the -print0 flag for find.
-print0 instructs find to terminate each result with a NUL character instead of a newline, while -0 tells xargs to expect input in that format.
Finally, the -J option for xargs (a BSD extension, available on macOS for example; GNU xargs does not have it) allows one to put the arguments in the right place for a copy.
find "$from" -name "*.$ext" -print0 | xargs -0 -J % cp % "$dest"
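Since -J only exists in BSD xargs, a roughly equivalent sketch for other systems uses -I, which both GNU and BSD xargs accept. Unlike -J, it runs one cp per file. The paths and the dat extension below are placeholders.

```shell
scratch=$(mktemp -d)
from="$scratch/src"; dest="$scratch/dest"; ext='dat'
mkdir -p "$from" "$dest"
touch "$from/one two.dat" "$from/three.dat"

# -I % runs cp once per NUL-terminated name, placing the name where % appears.
find "$from" -type f -name "*.$ext" -print0 | xargs -0 -I % cp % "$dest"

count=$(ls "$dest" | wc -l)
echo "copied: $count"
rm -rf "$scratch"
```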

It's better to use the -exec argument of the find command to do this:
find . -type f -name "*.ext" -exec cp {} ./destination_dir \;
I've checked this case with files containing spaces and it works for me. Also, don't forget to specify '-type f' if you want to find only files, not directories.

Copy specific files recursively

This problem has been discussed extensively but I couldn't find a solution that would help me.
I'm trying to selectively copy files from a directory tree into a specific folder. After reading some Q&A, here's what I tried:
cp `find . -name "*.pdf" -type f` ../collect/
I am in the right parent directory and there indeed is a collect directory a level above. Now I'm getting the error: cp: invalid option -- 'o'
What is going wrong?

To handle difficult file names:
find . -name "*.pdf" -type f -exec cp {} ../collect/ \;
By default, find will print the file names that it finds. If one uses the -exec option, it will instead pass the file names on to a command of your choosing, in this case a cp command which is written as:
cp {} ../collect/ \;
The {} tells find where to insert the file name. The end of the command given to -exec is marked by a semicolon. Normally, the shell would eat the semicolon. So, we escape the semicolon with a backslash so that it is passed as an argument to the find command.
Because find gives the file name to cp directly without interference from the shell, this approach works for even the most difficult file names.
More efficiency
The above runs cp on every file found. If there are many files, that would be a lot of processes started. If one has GNU tools, that can be avoided as follows:
find . -name '*.pdf' -type f -exec cp -t ../collect {} +
In this variant of the command, find will supply many file names for each single invocation of cp, potentially greatly reducing the number of processes that need to be started.
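The difference in process count can be seen with a small experiment in which echo stands in for cp, so each invocation prints exactly one line. The scratch directory and file names are made up.

```shell
scratch=$(mktemp -d)
touch "$scratch/a.pdf" "$scratch/b.pdf" "$scratch/c.pdf"

# With \; the command runs once per file; with + the names are batched together.
per_file=$(find "$scratch" -name '*.pdf' -type f -exec echo {} \; | wc -l)
batched=$(find "$scratch" -name '*.pdf' -type f -exec echo {} + | wc -l)

echo "invocations, one per file: $per_file"
echo "invocations, batched:      $batched"
rm -rf "$scratch"
```

With three files, the first form runs the command three times and the second only once.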

Find multiple files and rename them in Linux

I have files like a_dbg.txt, b_dbg.txt ... on a Suse 10 system. I want to write a bash shell script that renames these files by removing "_dbg" from them.
A Google search suggested the rename command, so I executed rename _dbg.txt .txt *dbg* in CURRENT_FOLDER.
My actual CURRENT_FOLDER contains the below files.
CURRENT_FOLDER/a_dbg.txt
CURRENT_FOLDER/b_dbg.txt
CURRENT_FOLDER/XX/c_dbg.txt
CURRENT_FOLDER/YY/d_dbg.txt
After executing the rename command,
CURRENT_FOLDER/a.txt
CURRENT_FOLDER/b.txt
CURRENT_FOLDER/XX/c_dbg.txt
CURRENT_FOLDER/YY/d_dbg.txt
It does not work recursively. How can I make this command rename files in all subdirectories? Besides XX and YY there will be many subdirectories whose names are unpredictable, and CURRENT_FOLDER will also contain some other files.

You can use find to find all matching files recursively:
find . -iname "*dbg*" -exec rename _dbg.txt .txt '{}' \;
EDIT: What are the '{}' and \;?
The -exec argument makes find execute rename for every matching file found. '{}' will be replaced with the path name of the file. The last token, \; is there only to mark the end of the exec expression.
All that is described nicely in the man page for find:
-exec utility [argument ...] ;
True if the program named utility returns a zero value as its
exit status. Optional arguments may be passed to the utility.
The expression must be terminated by a semicolon (``;''). If you
invoke find from a shell you may need to quote the semicolon if
the shell would otherwise treat it as a control operator. If the
string ``{}'' appears anywhere in the utility name or the
arguments it is replaced by the pathname of the current file.
Utility will be executed from the directory from which find was
executed. Utility and arguments are not subject to the further
expansion of shell patterns and constructs.
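The rename command differs between distributions (the util-linux and Perl variants take different arguments), so as a hedged alternative the same -exec mechanism can drive plain mv. This sketch recreates the question's layout in a scratch directory. The sh -c wrapper and the ${f%_dbg.txt} suffix stripping are illustrative choices, not part of the answer above.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/XX" "$scratch/YY"
touch "$scratch/a_dbg.txt" "$scratch/XX/c_dbg.txt" "$scratch/YY/d_dbg.txt" "$scratch/other.txt"

# find passes the matching paths to a small sh script as its arguments.
# For each one, strip the _dbg.txt suffix and append .txt again.
find "$scratch" -type f -name '*_dbg.txt' -exec sh -c '
  for f do
    mv -- "$f" "${f%_dbg.txt}.txt"
  done
' sh {} +

left=$(find "$scratch" -name '*_dbg*' | wc -l)
renamed=$(find "$scratch" -type f -name '*.txt' | wc -l)
echo "names still containing _dbg: $left, .txt files present: $renamed"
rm -rf "$scratch"
```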

For renaming recursively I use the following commands:
find -iname \*.* | rename -v "s/ /-/g"

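A small script I wrote to rename all files with a .txt extension to a .cpp extension under /tmp and its subdirectories, recursively:
#!/bin/bash
# Read NUL-terminated names so paths containing spaces or newlines stay intact.
find /tmp -type f -name '*.txt' -print0 |
while IFS= read -r -d '' file
do
# Strip the trailing .txt and append .cpp.
mv -- "$file" "${file%.txt}.cpp"
done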

with bash:
shopt -s globstar nullglob
rename _dbg.txt .txt **/*dbg*

find -execdir rename also works for non-suffix replacements on basenames
https://stackoverflow.com/a/16541670/895245 works directly only for suffixes, but this will work for arbitrary regex replacements on basenames:
PATH=/usr/bin find . -depth -execdir rename 's/_dbg\.txt$/.txt/' '{}' \;
or to affect files only:
PATH=/usr/bin find . -type f -execdir rename 's/_dbg\.txt$/.txt/' '{}' \;
-execdir first cds into the directory before executing only on the basename.
Tested on Ubuntu 20.04, find 4.7.0, rename 1.10.
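To see what the command actually receives, here is a sketch comparing -exec and -execdir with echo in place of rename. It assumes GNU find, and it pins PATH to /usr/bin:/bin on the assumption that find and echo live there. The XX directory and file name are taken from the question's example layout.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/XX"
touch "$scratch/XX/c_dbg.txt"

# -exec passes the full path as found from the starting point.
with_exec=$(find "$scratch" -type f -exec echo {} \;)
# -execdir runs inside XX and passes only ./c_dbg.txt (GNU find behaviour).
with_execdir=$(PATH=/usr/bin:/bin find "$scratch" -type f -execdir echo {} \;)

printf '%s\n' "-exec saw:    $with_exec"
printf '%s\n' "-execdir saw: $with_execdir"
rm -rf "$scratch"
```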
Convenient and safer helper for it
find-rename-regex() (
set -eu
find_and_replace="$1"
PATH="$(echo "$PATH" | sed -E 's/(^|:)[^\/][^:]*//g')" \
find . -depth -execdir rename "${2:--n}" "s/${find_and_replace}" '{}' \;
)
Sample usage to replace spaces ' ' with hyphens '-'.
Dry run that shows what would be renamed to what without actually doing it:
find-rename-regex ' /-/g'
Do the replace:
find-rename-regex ' /-/g' -v
Command explanation
The awesome -execdir option does a cd into the directory before executing the rename command, unlike -exec.
-depth ensures that the renaming happens first on children, and then on parents, to prevent potential problems with missing parent directories.
-execdir is required because rename does not play well with non-basename input paths, e.g. the following fails:
rename 's/findme/replaceme/g' acc/acc
The PATH hacking is required because -execdir has one very annoying drawback: find is extremely opinionated and refuses to do anything with -execdir if you have any relative paths in your PATH environment variable, e.g. ./node_modules/.bin, failing with:
find: The relative path ‘./node_modules/.bin’ is included in the PATH environment variable, which is insecure in combination with the -execdir action of find. Please remove that entry from $PATH
See also: https://askubuntu.com/questions/621132/why-using-the-execdir-action-is-insecure-for-directory-which-is-in-the-path/1109378#1109378
-execdir is a GNU find extension to POSIX. rename is Perl based and comes from the rename package.
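The sed expression used for the PATH hacking can be tried on its own. The sample value below is hypothetical; only the sed command comes from the helper.

```shell
# Hypothetical PATH value mixing absolute and relative entries.
sample='/usr/bin:./node_modules/.bin:/bin:tools'

# Drop every entry that does not start with a slash, as the helper does.
cleaned=$(printf '%s\n' "$sample" | sed -E 's/(^|:)[^\/][^:]*//g')
echo "$cleaned"   # prints /usr/bin:/bin
```

One caveat of this expression: a relative entry in the very first position is removed without its following colon, so a leading colon stays behind.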
Rename lookahead workaround
If your input paths don't come from find, or if you've had enough of the relative path annoyance, we can use some Perl lookahead to safely rename directories as in:
git ls-files | sort -r | xargs rename 's/findme(?!.*\/)\/?$/replaceme/g' '{}'
I haven't found a convenient analogue for -execdir with xargs: https://superuser.com/questions/893890/xargs-change-working-directory-to-file-path-before-executing/915686
The sort -r is required so that files are processed before their respective directories: in plain sort order, longer paths come after shorter ones with the same prefix, and -r reverses that.
Tested in Ubuntu 18.10.

The script above can be written in one line:
find /tmp -type f -name '*.txt' -exec bash -c 'mv -- "$1" "${1%.txt}.cpp"' _ {} \;

If you just want to rename and don't mind using an external tool, then you can use rnm. The command would be:
#on current folder
rnm -dp -1 -fo -ssf '_dbg' -rs '/_dbg//' *
-dp -1 will make it recursive to all subdirectories.
-fo implies file only mode.
-ssf '_dbg' searches for files with _dbg in the filename.
-rs '/_dbg//' replaces _dbg with empty string.
You can run the above command with the path of the CURRENT_FOLDER too:
rnm -dp -1 -fo -ssf '_dbg' -rs '/_dbg//' /path/to/the/directory

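You can use the command below. With --no-act it only shows what would be renamed; remove that option to perform the rename.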
rename --no-act 's/\.html$/\.php/' *.html */*.html

This command worked for me. Remember first to install the perl rename package:
find -iname \*.* | grep oldname | rename -v "s/oldname/newname/g"

To expand on the excellent answer by @CiroSantilliПутлерКапут六四事: do not match files in the find that we don't have to rename.
I have found this to improve performance significantly on Cygwin.
Please feel free to correct my ineffective bash coding.
FIND_STRING="ZZZZ"
REPLACE_STRING="YYYY"
FIND_PARAMS="-type d"
find-rename-regex() (
set -eu
find_and_replace="${1}/${2}/g"
echo "${find_and_replace}"
find_params="${3}"
mode="${4}"
if [ "${mode}" = 'real' ]; then
PATH="$(echo "$PATH" | sed -E 's/(^|:)[^\/][^:]*//g')" \
find . -depth -name "*${1}*" ${find_params} -execdir rename -v "s/${find_and_replace}" '{}' \;
elif [ "${mode}" = 'dryrun' ]; then
echo "${mode}"
PATH="$(echo "$PATH" | sed -E 's/(^|:)[^\/][^:]*//g')" \
find . -depth -name "*${1}*" ${find_params} -execdir rename -n "s/${find_and_replace}" '{}' \;
fi
)
find-rename-regex "${FIND_STRING}" "${REPLACE_STRING}" "${FIND_PARAMS}" "dryrun"
# find-rename-regex "${FIND_STRING}" "${REPLACE_STRING}" "${FIND_PARAMS}" "real"

In case anyone is comfortable with fd and rnr, the command is:
fd -t f -x rnr '_dbg.txt' '.txt'
rnr only command is:
rnr -f -r '_dbg.txt' '.txt' *
rnr has the benefit of being able to undo the command.

On Ubuntu (after installing rename), this simpler solution worked the best for me. This replaces space with underscore, but can be modified as needed.
find . -depth | rename -d -v -n "s/ /_/g"
The -depth flag is telling find to traverse the depth of a directory first, which is good because I want to rename the leaf nodes first.
The -d flag on rename tells it to only rename the filename component of the path. I don't know how general the behavior is but on my installation (Ubuntu 20.04), it could be the file or the directory as long as it is the leaf node of the path.
I recommend the -n (no action) flag first along with -v, so you can see what would get renamed and how.
Using the two flags together, it renames all the files in a directory first and then the directory itself. Working backwards. Which is exactly what I needed.
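The ordering that -depth produces can be checked without renaming anything. This is just a sketch on a scratch directory with made-up names.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/my dir"
touch "$scratch/my dir/my file"

# With -depth, the contents of a directory are listed before the directory itself.
order=$(cd "$scratch" && find . -depth)
echo "$order"

first=$(printf '%s\n' "$order" | head -n 1)
rm -rf "$scratch"
```

The file is listed first, then its directory, then the starting point, which is the order a rename needs so that parent paths stay valid while their children are processed.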

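classic solution (the for loop splits the output of find on whitespace, so it only works when no path contains spaces):
for f in $(find . -name "*dbg*"); do mv "$f" "$(echo "$f" | sed 's/_dbg//')"; done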

Linux: how to replace all instances of a string with another in all files of a single type

I want to replace for example all instances of "123" with "321" contained within all .txt files in a folder (recursively).
I thought of doing this
sed -i 's/123/321/g' | find . -name \*.txt
but before possibly screwing all my files I would like to ask if it will work.

You have the sed and the find back to front. With GNU sed and the -i option, you could use:
find . -name '*.txt' -type f -exec sed -i s/123/321/g {} +
The find finds files with extension .txt and runs the sed -i command on groups of them (that's the + at the end; it's standard in POSIX 2008, but not all versions of find necessarily support it). In this example substitution, there's no danger of misinterpretation of the s/123/321/g command so I've not enclosed it in quotes. However, for simplicity and general safety, it is probably better to enclose the sed script in single quotes whenever possible.
You could also use xargs (and again using GNU extensions -print0 to find and -0 and -r to xargs):
find . -name '*.txt' -type f -print0 | xargs -0 -r sed -i 's/123/321/g'
The -r means 'do not run if there are no arguments' (that is, when find doesn't find anything). The -print0 and -0 work in tandem, generating file names ending with the C null byte '\0' instead of a newline, and avoiding misinterpretation of file names containing newlines, blanks and so on.
Note that before running the script on the real data, you can and should test it. Make a dummy directory (I usually call it junk), copy some sample files into the junk directory, change directory into the junk directory, and test your script on those files. Since they're copies, there's no harm done if something goes wrong. And you can simply remove everything in the directory afterwards: rm -fr junk should never cause you anguish.
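Such a test run might look like the following sketch, assuming GNU sed for -i. The junk directory is created with mktemp -d here, and the sample files and their contents are invented.

```shell
# Scratch copy to experiment on, in the spirit of the junk directory.
junk=$(mktemp -d)
mkdir -p "$junk/sub"
printf 'id 123 and 123\n' > "$junk/a.txt"
printf 'code 123\n' > "$junk/sub/b.txt"
printf 'code 123\n' > "$junk/sub/c.log"

# Only regular *.txt files are edited in place; c.log must stay untouched.
find "$junk" -name '*.txt' -type f -exec sed -i 's/123/321/g' {} +

txt_result=$(cat "$junk/a.txt")
log_result=$(cat "$junk/sub/c.log")
echo "a.txt: $txt_result"
echo "c.log: $log_result"
rm -fr "$junk"
```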
