I want to rename more than 20k files with a script, but it fails: "argument list too long" - Linux

I create these files like this (using ${i} in the name so that each iteration produces a distinct file):
for ((i=0; i<20000; i++))
do
    touch "www.google.com electronic&information this is bash test ${i}.txt"
done
Then I want to replace "www.google.com" with "HNSD" in the 20k filenames; all the filenames include whitespace.
My first attempt was the following:
rename 's/www.google.com/HNSD/g' *
...but this yielded an "argument list too long" error.
My second attempt is the following:
#!/bin/bash
function _rename ()
{
    while read line1; do
        #rename 's/www.google.com/HNSD/g' $line1
        sed -i "s/www.google.com/HNSD/g" $line1
        #echo $line1
    done
}
ls -1 | _rename
...but this doesn't rename the files given. How should this be done?

Running a single mv command per file won't cause this problem unless the filenames themselves are so long (or your environment is so large, since environment variables share space with argv) that you can't fit both source and destination names on one command line. The following is thus generally quite safe:
for f in *www.google.com*; do
    mv "$f" "${f//www.google.com/HNSD}"
done
Here, we're using parameter expansion to perform the rename operation.
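For illustration, here is the ${parameter//pattern/replacement} expansion on its own, with a made-up filename:
$ f='www.google.com electronic&information this is bash test 42.txt'
$ echo "${f//www.google.com/HNSD}"
HNSD electronic&information this is bash test 42.txt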
The original code was problematic because it tried to put all filenames on the command line at once. If you use xargs or find -exec {} + to split your operation into multiple invocations, you won't have that problem. If you have rename, and want to call it as few times as possible to process all directory entries for the current location, that would look like the following:
printf '%s\0' * | xargs -0 rename 's/www.google.com/HNSD/'
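To see the batching behaviour itself, here is a toy demonstration (the -n2 cap is artificial, just to force several invocations of echo):
$ printf '%s\0' one two three four five | xargs -0 -n2 echo
one two
three four
five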

Two options:
Use find, and rename using rename (or similar) within the -exec action:
find . -type f -name '*www.google.com*' -exec rename 's/www.google.com/HNSD/g' {} +
Leverage a for loop to iterate over the files:
for f in *www.google.com*; do rename 's/www.google.com/HNSD/g' "$f"; done
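If your rename is the Perl flavour, either variant can be previewed with its -n (no act) flag before committing to anything, e.g.:
find . -type f -name '*www.google.com*' -exec rename -n 's/www.google.com/HNSD/g' {} +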

Related

How to read out a file line by line and for every line do a search with find and copy the search result to destination?

I hope you can help me with the following problem:
The Situation
I need to find files in various folders and copy them to another folder. The files and folders can contain white spaces and umlauts.
The filenames contain an ID and a string like:
"2022-01-11-02 super important file"
The filenames I need to find are collected in a textfile named ids.txt. This file only contains the IDs but not the whole filename as a string.
What I want to achieve:
I want to read out ids.txt line by line.
For every line in ids.txt I want to do a find search and copy (cp) the result to the destination.
So far I tried:
for n in $(cat ids.txt); do find /home/alex/testzone/ -name "$n" -exec cp {} /home/alex/testzone/output \; ; done
while read -r ids; do find /home/alex/testzone -name "$ids" -exec cp {} /home/alex/testzone/output \; ; done < ids.txt
The output folder remains empty. Not using -exec also gives no search results.
I was thinking that -name "$ids" is the root cause here. My files contain the ID plus a string, so I should search for names containing the ID followed by a variable part (a star).
As the argument for -name I also tried "$ids *", "$ids"" *", and so on, with no luck.
Is there an argument that I can use in conjunction with find instead of using the star in the -name argument?
Do you have any solution for me to automate this process in a bash script that reads out the ids.txt file, searches for the filenames, and copies them over to the specified folder?
In the end I would like to create a bash script that takes ids.txt and the search-folder and the output-folder as arguments like:
my-id-search.sh /home/alex/testzone/ids.txt /home/alex/testzone/ /home/alex/testzone/output
EDIT:
This is some example content of the ids.txt file where only ids are listed (not the whole filename):
2022-01-11-01
2022-01-11-02
2020-12-01-62
EDIT II:
Going on with the solution from tripleee:
#!/bin/bash
grep . $1 | while read -r id; do
    echo "The search term is: $id"; echo
    find /home/alex/testzone -name "$id*" -exec cp {} /home/alex/testzone/ausgabe \;
done
In case my ids.txt file contains empty lines, the -name "$id*" becomes -name "*", which in turn finds and copies all files.
Trying to prevent empty lines from being read does not seem to work. They should be filtered out by the grep . $1 | expression. What am I doing wrong?
If your destination folder is always the same, the quickest and absolutely most elegant solution is to run a single find command to look for all of the files.
sed 's/.*/-o\n-name\n&*/' ids.txt |
xargs bash -c 'find /home/alex/testzone \( -false "$@" \) -exec cp -t /home/alex/testzone/output {} +' _
The -false predicate is a bit of a hack to allow the list of actual predicates to start with -o (as in "or").
This could fail if ids.txt is too large to fit into a single xargs invocation, or if your sed does not understand \n to mean a literal newline.
(Here's a fix for the latter case:
xargs printf '-o\n-name\n%s*\n' <ids.txt |
...
Still the inherent problem with using xargs find like this is that xargs could split the list between -o and -name or between -name and the actual file name pattern if it needs to run more than one find command to process all the arguments.
A slightly hackish solution to that is to ensure that each pair is a single string, and then separately split them back out again:
xargs printf '-o_-name_%s*\n' <ids.txt |
xargs bash -c 'arr=("$@"); find . \( -false ${arr[@]/-o_-name_/-o -name } \) -exec cp {} "$0" \;' /home/alex/testzone/ausgabe
where we temporarily hold the arguments in an array in which each file name pattern and its flags form a single string, and then split the flags back out into separate tokens. This still won't work correctly if the file names you operate on contain literal shell metacharacters like * etc.)
A more mundane solution fixes your while read attempt by adding the missing wildcard in the -name argument. (I also took the liberty to rename the variable, since read will only read one argument at a time, so the variable name should be singular.)
while read -r id; do
    find /home/alex/testzone -name "$id*" -exec cp {} /home/alex/testzone/output \;
done < ids.txt
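To get the my-id-search.sh invocation the question asks for (ids file, search folder and output folder as arguments), a minimal sketch wrapping that same loop could look like this (untested; argument validation omitted):
#!/bin/bash
# Usage: my-id-search.sh ids.txt /search/folder /output/folder
ids=$1; searchdir=$2; outdir=$3
grep . "$ids" | while read -r id; do    # grep . drops empty lines (whitespace-only lines still pass)
    find "$searchdir" -name "$id*" -exec cp {} "$outdir" \;
done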
Please try the following bash script copier.sh
#!/bin/bash
IFS=$'\n'          # make newlines the only separator
set -f             # disable globbing
file="files.txt"   # name of the file containing the filenames
finish="finish"    # destination directory
while read -r n ; do
    du -a | awk '{for(i=2;i<=NF;++i)printf $i" " ; print " "}' | grep "$n" | sed 's/ *$//g' | xargs -I '{}' cp '{}' "$finish"
done < "$file"
which recursively copies all the files named in files.txt from . and its subdirectories to ./finish
This new version works even if there are spaces in the directory names or file names.

Renaming particular files in subfolders with the full directory path

Experts, I have many folders; inside each folder there are many sub-folders, and the sub-folders contain many files. However, in all the sub-folders one file name is the same: input.ps. Now I want to rename each input.ps to its full path plus the file name,
so input.ps in all directories should be renamed to home_wuan_data_filess_input.ps
i tried
#!/bin/sh
for file in /home/wuan/data/filess/*.ps
do
mv $file $file_
done
But it does not do what I expect. I hope experts will help me. Thanks in advance.
so input.ps in all directories should be renamed to home_wuan_data_filess_input.ps
You may use this find solution:
find /home/wuan/data/filess -type f -name 'input*.ps' -exec bash -c '
for f; do fn="${f#/}"; echo mv "$f" "${fn//\//_}"; done' _ {} +
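The echo makes this a dry run; for /home/wuan/data/filess/input.ps it prints the intended command:
mv /home/wuan/data/filess/input.ps home_wuan_data_filess_input.ps
Remove the echo once the output looks right.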
while read -r line;
do
    fil=${line//\//_};   # Convert all / characters to _ to form the new file name
    fil=${fil:1};        # Remove the leading _ (which came from the leading /)
    dir=${line%/*};      # Extract the directory
    echo "mv $line $dir/$fil";   # Echo the move command
    # mv "$line" "$dir/$fil";    # Remove the comment to perform the actual command
done <<< "$(find /path/to/dir -name "input.ps")"
Ok, so f will end up being
/home/wuan/data/filess/input.ps
What we need here is the path, and the full, snake-cased name. Let's start by getting the path:
for f in /home/wuan/data/filess/*.ps; do
    path="${f%/*}";
This keeps the substring of f up to the last occurrence of /, effectively giving us the path.
Next, we want to snake_case all the things, which is even easier:
for f in /home/wuan/data/filess/*.ps; do
    path="${f%/*}";
    newname="${f#/}"; newname="${newname//\//_}"
This strips the leading /, then replaces every remaining / with _, giving the name you want the new file to have. Now let's put all of this together, and move the file f to path/newname:
for f in /home/wuan/data/filess/*.ps; do
    path="${f%/*}";
    newname="${f#/}"; newname="${newname//\//_}"
    mv "${f}" "${path}/${newname}"
done
That should do the trick
There are many sites listing the bash string manipulations you can use.
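For quick reference, the expansions used above behave like this (illustrative values):
f=/home/wuan/data/filess/input.ps
echo "${f%/*}"     # /home/wuan/data/filess (strip from the last / onward)
echo "${f#/}"      # home/wuan/data/filess/input.ps (strip the leading /)
echo "${f//\//_}"  # _home_wuan_data_filess_input.ps (replace every / with _)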
Sorry for the delayed update, the power in my building just cut out :)

Remove Brackets from all the filenames in Linux

I have a command that removes the brackets from filenames.
$ find . -name "*.mkv" -exec rename 's/[\)\(]//g' {} \;
I have managed to make a statement that removes all the () from the filenames in a directory, but whenever I run into a directory named like, for example, amazing.(2018), it shows an error:
No directory can be found.
Please provide an alternative; I need this to work, and I want it to be recursive.
Better call sed:
find . -type f -iname "*.mkv" |
while IFS= read -r line; do
    mv "$line" "$(printf %s "$line" | sed -re 's/(\[|\])//g')"
done
input:
'1[a].mkv' '2[a].mkv' '3[a].mkv'
output:
1a.mkv 2a.mkv 3a.mkv
Ensure that you run "find" with the "-depth" parameter, so that leaves are renamed first, i.e., a directory name is changed only after all files and directories inside it have been renamed. Otherwise, if you first change the name of a directory, the files and directories inside it are no longer accessible (they are not in the list that "find" built at the beginning).
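Combining that advice with the rename approach from the question, a sketch (assuming GNU find's -execdir and the Perl rename; -execdir applies the substitution to the basename only, so already-renamed parent paths don't matter):
find . -depth -name '*[()]*' -execdir rename 's/[)(]//g' '{}' \;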

Is there an efficient way to replace spaces with _ in filenames? (thousands of files)

I have a directory "FolderName" with 10,000 new files every day. Almost a half of those files are named as follows:
filename_yyyy-mm-dd_hh:mm
while the other half are named:
filename_yyyy-mm-dd hh:mm
(with space instead of underscore)
The script I'd like to set up should do the following:
Rename only the files containing a space in their name, skipping files that need no processing.
I cannot find a way to make the script efficient: I really need to skip the good half, but my script tries to mv every file, so it takes quite long and is inefficient. Any good idea for a better design?
Thanks everybody
Check out the rename utility, a tool designed just for this; it's powerful and fast. (The syntax shown here is the util-linux flavour of rename, which replaces the first occurrence of a fixed string; the Perl flavour takes an s/// expression instead.)
rename -n ' ' _ *\ *
Or:
find /path/to/dir -type f -name '* *' -exec rename -n ' ' _ {} \;
The -n flag is for a dry run: it prints what would happen without actually renaming anything. If the output looks good, remove the flag and rerun.
With the legacy rename you can easily convert a single space to an underscore:
$ rename ' ' '_' files
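For comparison, the Perl flavour of rename takes a substitution expression instead; a rough equivalent:
rename 's/ /_/' *\ *     # first space in each name
rename 's/ /_/g' *\ *    # every space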
find . -name '* *' | while IFS= read -r f; do mv "$f" "$(echo "$f" | sed 's/ /_/')"; done will do it. There should be a better way using find's -exec option as well.
EDIT:
If the function is put in a script, then find can be used directly:
cat <<'EOF' > space_to_underscore
#!/usr/bin/env bash
mv "$1" "$(sed 's/ /_/' <(echo "$1"))"
EOF
chmod +x space_to_underscore
find . -name '* *' -exec ./space_to_underscore {} \;
(Quoting the heredoc delimiter as 'EOF' is important: it stops the shell from expanding $1 and $(...) while writing the script.)
This will be faster than using a for loop.
Rename only the files containing a space in their name, skipping files that need no processing:
for file in *\ * ; do mv "$file" "${file// /_}" ; done
Move all the files older than one week to a "NewFolderName" folder:
find . -mtime +7 -exec mv {} NewFolderName/ \;

Add prefix to all images (recursive)

I have a folder with more than 5000 images, all with the JPG extension.
What I want to do is recursively add the "thumb_" prefix to all images.
I found a similar question: Rename Files and Directories (Add Prefix), but I only want to add the prefix to files with the JPG extension.
One possible solution:
find . -name '*.jpg' -printf "'%p' '%h/thumb_%f'\n" | xargs -n2 echo mv
Principle: find all needed files, and prepare arguments for the standard mv command.
Notes:
arguments for mv are surrounded by ' to allow spaces in filenames.
The drawback is that this will not work with filenames that contain the ' apostrophe itself, as many mp3 files do. If you need to move stranger filenames, check below.
the above command is a dry run (it only shows the mv commands with their arguments). For real work, remove the echo before mv.
ANY filename renaming. In the shell you need a delimiter. The problem is that the filename (stored in a shell variable) can itself contain the delimiter, so:
mv $file $newfile     # will fail if the filename contains a space, TAB or newline
mv "$file" "$newfile" # will fail if either filename contains a " (when the command line is built as a string, as with xargs)
the correct solutions are either:
prepare the filename with proper escaping
use a scripting language that easily understands ANY filename
Preparing the correct escaping in bash is possible with its internal printf and the %q formatting directive (= print quoted). But this solution is long and boring.
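For what it's worth, a quick illustration of %q with a made-up name:
$ f='file with "quotes" and spaces.jpg'
$ printf 'mv %q %q\n' "$f" "thumb_$f"
mv file\ with\ \"quotes\"\ and\ spaces.jpg thumb_file\ with\ \"quotes\"\ and\ spaces.jpg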
IMHO, the easiest way is using perl and the NUL-delimited -print0, as in the next command.
find . -name \*.jpg -print0 | perl -MFile::Basename -0nle 'rename $_, dirname($_)."/thumb_".basename($_)'
The above uses perl's power to mangle the filenames and finally renames the files.
Beware of filenames with spaces in them (the for ... in ... expression trips over those), and be aware that the result of a find . ... will always start with ./ (and hence a naive approach can try to give you names like thumb_./file.JPG, which isn't quite correct).
This is therefore not a trivial thing to get right under all circumstances. The expression I've found to work correctly (with spaces, subdirs and all that) is:
find . -iname \*.JPG -exec bash -c 'mv "$1" "`echo $1 | sed \"s/\(.*\)\//\1\/thumb_/\"`"' -- '{}' \;
Even that can fall foul of certain names (with quotes in) ...
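As a sketch of one way around such quoting trouble, bash parameter expansion avoids the nested sed/backtick quoting entirely; ${f%/*} is the directory part, ${f##*/} the basename, and files are processed in batches:
find . -iname '*.JPG' -exec bash -c 'for f; do mv "$f" "${f%/*}/thumb_${f##*/}"; done' _ {} +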
In OS X 10.8.5, find does not have the -printf option. The port that contained rename seemed to depend upon a WebkitGTK development package that was taking hours to install.
This one line, recursive file rename script worked for me:
find . -iname "*.jpg" -print | while read name; do cur_dir=$(dirname "$name"); cur_file=$(basename "$name"); mv "$name" "$cur_dir/thumb_$cur_file"; done
I was actually renaming CakePHP view files with an 'admin_' prefix, to move them all to an admin section.
You can use that same answer, just use *.jpg, instead of just *.
for file in *.JPG; do mv "$file" "thumb_$file"; done
if it's multiple directory levels under the current one:
for file in $(find . -name '*.JPG'); do mv $file $(dirname $file)/thumb_$(basename $file); done
proof:
jcomeau@intrepid:/tmp$ mkdir test test/a test/a/b test/a/b/c
jcomeau@intrepid:/tmp$ touch test/a/A.JPG test/a/b/B.JPG test/a/b/c/C.JPG
jcomeau@intrepid:/tmp$ cd test
jcomeau@intrepid:/tmp/test$ for file in $(find . -name '*.JPG'); do mv $file $(dirname $file)/thumb_$(basename $file); done
jcomeau@intrepid:/tmp/test$ find .
.
./a
./a/b
./a/b/thumb_B.JPG
./a/b/c
./a/b/c/thumb_C.JPG
./a/thumb_A.JPG
jcomeau@intrepid:/tmp/test$
Use rename for this:
rename 's/([^\/]+\.JPG)$/thumb_$1/' `find . -type f -name '*.JPG'`
For only the jpg files in the current folder:
for f in *.jpg ; do mv "$f" "PRE_$f" ; done
