Recursively remove pattern from filenames without changing paths - linux

I have thousands of files in a directory tree with filenames like:
/Folder 0001 - 0500/0001 - Portrait - House.jpg
/Folder 2500 - 3000/2505 - Landscape - Mountain.jpg
Using the linux command line I would like to remove everything up to the first word in the filenames, i.e. "0001 - " and "2505 - ". The new filenames would look like:
/Folder 0001 - 0500/Portrait - House.jpg
/Folder 2500 - 3000/Landscape - Mountain.jpg
I have modified a script that kind of works:
find . -type f -name "*-*" -exec bash -c 'f="$1"; g="${f/[[:digit:]]/ -/ /}"; echo mv -- "$f" "$g"' _ '{}' \;
The problem here is that it butchers part of the path instead of the filename, so actual output generates filenames like:
/Folder -/ /001 - 0500/0001 - Portrait - House.jpg
/Folder -/ /500 - 3000/2505 - Landscape - Mountain.jpg
How can I modify this script to rename files using the pattern I described?

find . -mindepth 2 -type f -name "*-*" -exec bash -c '
  shopt -s extglob
  for arg do
    dir=${arg%/*}
    basename_old=${arg##*/}
    basename_new=${basename_old##+([[:digit:]]) - }
    [[ "$basename_new" = "$basename_old" ]] && continue  # skip when no rename needed
    printf "%q " mv -- "$dir/$basename_old" "$dir/$basename_new"
    printf "\n"
  done
' _ {} +
You can see this code running at https://ideone.com/YJNL9c
Using parameter expansions to split the directory name out from the filename allows these to be manipulated individually.
${arg%/*} removes the last / and everything after it from arg -- thus removing the filename and leaving the directories, so long as the path has at least one directory segment (providing that assurance is the reason for the -mindepth 2).
${arg##*/} removes the longest match to */ from the beginning -- thus removing the directories, leaving the basic filename.
By enabling the extglob shell option, we get regex-like capabilities in our fnmatch/glob-style expressions, including the ability to match one or more digits; this is why +([[:digit:]]) -  matches "one or more digits, followed by ' - '".
By using printf '%q ' instead of echo when generating shell commands, we generate safely-quoted output even without control of our filenames.
By using -exec ... {} +, we're passing multiple arguments to each bash instance, rather than invoking a separate interpreter for each file found. With for arg do, we iterate over all those arguments.
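To see those pieces in isolation, here is a quick interactive sketch using one of the example paths from the question:
path='Folder 0001 - 0500/0001 - Portrait - House.jpg'
shopt -s extglob
echo "${path%/*}"                    # -> Folder 0001 - 0500
base=${path##*/}                     # -> 0001 - Portrait - House.jpg
echo "${base##+([[:digit:]]) - }"    # -> Portrait - House.jpg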

Related

How to rename a file whose name contains a backslash in bash?

I got a tar file; after extracting, there are many files named like
a
b\c
d\e\f
g\h
I want to convert their names into files in sub-directories, like
a
b/c
d/e/f
g/h
I face a problem: when a variable contains a backslash, it changes the original file name. I want to write a script to rename them.
Parameter expansion is the way to go. You have everything you need in bash, no need to use external tools like find.
$ touch a\\b c\\d\\e
$ ls -l
total 0
-rw-r--r-- 1 ghoti staff 0 11 Jun 23:13 a\b
-rw-r--r-- 1 ghoti staff 0 11 Jun 23:13 c\d\e
$ for file in *\\*; do
> target="${file//\\//}"; mkdir -p "${target%/*}"; mv -v "$file" "$target"; done
a\b -> a/b
c\d\e -> c/d/e
The for loop breaks out as follows:
for file in *\\*; do - select all files whose names contain backslashes
target="${file//\\//}"; - swap backslashes for forward slashes
mkdir -p "${target%/*}"; - create the target directory by stripping the filename from $target
mv -v "$file" "$target"; - move the file to its new home
done - end the loop.
The only tricky bit here, I think, is the second line: ${file//\\//} is an instance of ${var//pattern/replacement}, where the pattern is an escaped backslash (\\) and the replacement is a single forward slash.
Have a look at man bash and search for "Parameter Expansion" to learn more about this.
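As a minimal illustration of that substitution on its own (with a made-up name):
$ file='b\c\d'
$ echo "${file//\\//}"
b/c/d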
Alternately, if you really want to use find, you can still take advantage of bash's Parameter Expansion:
find . -name '*\\*' -type f \
-exec bash -c 't="${0//\\//}"; mkdir -p "${t%/*}"; mv -v "$0" "$t"' {} \;
This uses find to identify each file and process it with an -exec option that basically does the same thing as the for loop above. One significant difference here is that find will traverse subdirectories (limited by the -maxdepth option), so ... be careful.
Renaming a file with backslashes is simple: mv 'a\b' 'newname' (just quote it), but you'll need more than that.
You need to:
find all files with a backslash (e.g. a\b\c)
split path from filename (e.g. a\b from c)
create a complete path (e.g. a/b, dir b under dir a)
move the old file under a new name, under a created path (e.g. rename a\b\c to file named c in dir a/b)
Something like this:
#!/bin/bash
find . -name '*\\*' | while IFS= read -r f; do
    base="${f%\\*}"
    file="${f##*\\}"
    path="${base//\\//}"
    mkdir -p "$path"
    mv "$f" "$path/$file"
done
(Edit: correct handling of filenames with spaces)

Bulk rename files in Unix with current date as suffix

I am trying to bulk rename all the files in the current folder with date suffix:
rename 's/(.*)/$1_$(date +%F)/' *
But that command renames info.txt to info.txt_1000 4 24 27 30 46 113 128 1000date +%F). I want the result to be info.txt_2016-10-13.
You want $1 to be passed literally to rename, yet have $(date +%F) be expanded by the shell. The latter won't happen when you use single quotes, only with double quotes. (The stray numbers in your result come from Perl itself: inside the replacement string, Perl interpolates $( -- its real group-ID variable -- which expands to your group IDs.) The solution is to use double quotes and escape $1 so the shell doesn't expand it.
rename "s/(.*)/\$1_$(date +%F)/" *
Portable POSIX shell solution
Since you said "in Unix" and the rename command isn't portable (it's actually part of the perl package), here is a solution that should work in more environments:
for file in *; do mv "$file" "${file}_$( date +%F )"; done
This creates a loop and then moves each individual file to the new name. Like the question, it uses date +%F via shell command substitution to insert the "full date" (YYYY-mm-dd). Command substitution must use double quotes (") and not single quotes (') because single quotes shut off shell interpretation (this goes for variables as well), as noted in John's answer.
Argument list too long
Your shell will complain if the directory has too many files in it, with an error like "Argument list too long." This is because * expands to the contents of the directory, all of which become arguments to the command you're running.
To solve that, create a script and then feed it through xargs as follows:
#!/bin/sh
if [ -z "$1" ]; then set *; fi # default to all files in curent directory
for file in "$#"; do mv "$file" "${file}_$( date +%F )"; done
Using ls to generate a file list isn't always wise for scripts (it can do weird things in certain contexts). Use find instead:
find . -type f -not -name '*_20??-??-??' -print0 |xargs -0 sh /path/to/script.sh
Note, that command is recursive (change that by adding -maxdepth 1 after the dot). find is extremely capable. In this example, it finds all files (-type f) that do not match the shell glob *_20??-??-?? (* matches any number of any characters, ? matches exactly one of any character, so this matches abc_2016-10-14 but not abc_2016-10-14-def). This uses find … -print0 and xargs -0 to ensure spacing is properly preserved (instead of spaces, these use the null character as the delimiter).
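If you want to sanity-check that glob against sample names, case makes a quick test (both lines below are safe to paste into a shell):
$ case abc_2016-10-14 in *_20??-??-??) echo match;; *) echo no match;; esac
match
$ case abc_2016-10-14-def in *_20??-??-??) echo match;; *) echo no match;; esac
no match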
You can also do this with perl rename:
find . -type f -not -name '*_20??-??-??' -print0 |xargs -0 \
rename "s/(.*)/\$1_$( date +%F )/"

No such file or directory when piping. Each command works separately, but not when piping

I have 2 folders: folder_a & folder_b. In each of these folders there are a bunch of files. I am trying to use sed to move all of these files out of those folders and into my current working directory.
My folder structure looks like this:
mytest:
a:
1.txt
2.txt
3.txt
b:
4.txt
5.txt
The command I am trying to use is:
find . -type d ! -iname '*.*' # find all folders other than root
| sed -r 's/.*/&\/*/' # add '/*' to each of the arguments
| sed -r 'p;s/.*/./' # output: a/* . b/* .
| xargs -n 2 mv # should be creating two commands: 'mv a/* .' and 'mv b/* .'
Unfortunately I get an error:
mv: cannot stat './aaa/*': No such file or directory
I also get the same error when I try this other strategy (using ls instead of mv):
for dir in */; do
ls $dir;
done;
Even if I use sed to replace the spaces in each directory name with '\ ', or surround the directory names with quotes I get the same error.
I'm not sure if these 2 examples are related in my misunderstanding of bash but they both seem to demonstrate my ignorance of how bash translates the output from one command into the input of another command.
Can anyone shed some light on this?
Update: Completely rewritten.
As @EtanReisner and @melpomene have noted, mv */* . or, more specifically, mv a/* b/* . is the most straightforward solution, but you state that this is in part a learning exercise, so the remainder of the answer shows an efficient find-based solution and explains the problem with the original command.
An efficient find-based solution
Generally, if feasible, it's best and most efficient to let find itself do the work, without involving additional tools; find's -exec action is like a built-in xargs, with {} representing the path at hand (with terminator \;) / all paths (with +):
find . -type f -exec echo mv -t . {} +
To be safe, this will just print the mv commands that would be executed; remove the echo to actually execute them.
This will execute a single[1] mv command to which all matching files are passed, and -t . moves them all to the current dir.
[1] If the resulting command line is too long (which is unlikely), it is split up into multiple commands, just as with xargs.
Operating on files (-type f) bypasses the need for globbing, as find will then enumerate all files for you (it also bypasses the need to exclude . explicitly).
Note that this solution works on entire subtrees, not just (immediate) subdirectories.
It's tempting to consider turning on Bash 4's globstar option and using mv */** ., but that won't work, because it will attempt to move directories as well, not just the files in them.
A caveat re -exec with +: it only works if {} - the placeholder for all paths - is the token immediately before the +.
Since you're on Linux, we can satisfy this condition by specifying the target folder for mv with option -t before the {}; on BSD-based systems such as OSX, you could not do that, because mv doesn't support -t there, so you'd have to use terminator \;, which means that mv is called once for every path, which is obviously much slower.
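As a dry-run sketch of both variants (remove the echo to actually move files; the \; form trades speed for mv portability):
find . -type f -exec echo mv -t . {} +    # GNU mv: one invocation for many paths
find . -type f -exec echo mv {} . \;      # portable: one invocation per path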
Why your command didn't work:
As @EtanReisner points out in a comment, xargs invokes the command specified without (implicitly) involving a shell, so globbing won't work; you can verify this with the following command:
echo '*' | xargs echo # -> '*' - NO globbing
If we leave the globbing issue aside, additional work would have been necessary to make your xargs command work correctly with folder names with embedded spaces (or other shell metacharacters):
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -n 2 echo mv # NOTE: still won't work due to lack of globbing
Note how the (combined) sed command now produces a single output line '<input-path>'/* ., with the input path enclosed in embedded single-quotes, which is required for xargs to recognize <input-path> as a single argument, even if it contains embedded spaces.
(If your filenames contain single-quotes, you'd have to do more work; also note that since now all arguments for a given dir. are on a single line, you could use xargs -L 1 ....)
Also note how -mindepth 1 (only process paths at the subdirectory level or below) is used to skip processing of . itself.
The only way to make globbing happen is to get the shell involved:
find . -mindepth 1 -type d |
sed -r "s/.*/'&'\/* ./" | # -> '<input-path>'/* . (including single-quotes)
xargs -I {} sh -c 'echo mv {}' # works, but is inefficient
Note the use of xargs' -I option to treat each input line as its own argument ({} is a self-chosen placeholder for the input).
sh -c invokes the (default) shell to execute the resulting command, at which globbing does happen.
However, overall, this is quite inefficient:
A pipeline with 3 segments is used.
A shell instance is invoked for every input path, which in turn calls the mv utility.
Compare this to the efficient find-only solution above, which (typically) creates only 2 processes in total.

Add prefix to all images (recursive)

I have a folder with more than 5000 images, all with JPG extension.
What I want to do is recursively add the "thumb_" prefix to all images.
I found a similar question: Rename Files and Directories (Add Prefix), but I only want to add the prefix to files with the JPG extension.
One possible solution:
find . -name '*.jpg' -printf "'%p' '%h/thumb_%f'\n" | xargs -n2 echo mv
Principle: find all needed files, and prepare arguments for the standard mv command.
Notes:
arguments for mv are surrounded by ' to allow spaces in filenames.
The drawback: this will not work with filenames that contain the ' apostrophe itself, as many mp3 files do. If you need to move stranger filenames, see below.
the above command is a dry run (it only shows the mv commands with args). For real work, remove the echo preceding mv.
Renaming ANY filename: in the shell you need a delimiter, and the problem is that the filename (stored in a shell variable) can itself contain the delimiter, so:
mv $file $newfile #will fail, if the filename contains space, TAB or newline
mv "$file" "$newfile" #will fail, if the any of the filenames contains "
the correct solutions are either:
prepare a filename with a proper escaping
use a scripting language that easily understands ANY filename
Preparing the correct escaping in bash is possible with its built-in printf and the %q (print quoted) format directive, but this solution is long and boring.
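For the curious, here is a rough dry-run sketch of that printf %q route, reusing this question's thumb_ prefix (pipe the output to sh to actually rename):
find . -name '*.jpg' -exec bash -c 'for f; do printf "mv -- %q %q\n" "$f" "${f%/*}/thumb_${f##*/}"; done' _ {} +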
IMHO, the easiest way is using perl with NUL-delimited find -print0, like next.
find . -name \*.jpg -print0 | perl -MFile::Basename -0nle 'rename $_, dirname($_)."/thumb_".basename($_)'
The above uses perl's power to munge the filenames and finally renames the files.
Beware of filenames with spaces in (the for ... in ... expression trips over those), and be aware that the result of a find . ... will always start with ./ (and hence try to give you names like thumb_./file.JPG which isn't quite correct).
This is therefore not a trivial thing to get right under all circumstances. The expression I've found to work correctly (with spaces, subdirs and all that) is:
find . -iname \*.JPG -exec bash -c 'mv "$1" "`echo $1 | sed \"s/\(.*\)\//\1\/thumb/\"`"' -- '{}' \;
Even that can fall foul of certain names (with quotes in) ...
In OS X 10.8.5, find does not have the -printf option. The port that contained rename seemed to depend upon a WebkitGTK development package that was taking hours to install.
This one-line recursive file rename script worked for me:
find . -iname "*.jpg" -print | while read name; do cur_dir=$(dirname "$name"); cur_file=$(basename "$name"); mv "$name" "$cur_dir/thumb_$cur_file"; done
I was actually renaming CakePHP view files with an 'admin_' prefix, to move them all to an admin section.
You can use that same answer; just use *.jpg instead of just *.
for file in *.JPG; do mv $file thumb_$file; done
if it's multiple directory levels under the current one:
for file in $(find . -name '*.JPG'); do mv $file $(dirname $file)/thumb_$(basename $file); done
proof:
jcomeau@intrepid:/tmp$ mkdir test test/a test/a/b test/a/b/c
jcomeau@intrepid:/tmp$ touch test/a/A.JPG test/a/b/B.JPG test/a/b/c/C.JPG
jcomeau@intrepid:/tmp$ cd test
jcomeau@intrepid:/tmp/test$ for file in $(find . -name '*.JPG'); do mv $file $(dirname $file)/thumb_$(basename $file); done
jcomeau@intrepid:/tmp/test$ find .
.
./a
./a/b
./a/b/thumb_B.JPG
./a/b/c
./a/b/c/thumb_C.JPG
./a/thumb_A.JPG
jcomeau@intrepid:/tmp/test$
Use rename for this:
rename 's/([^\/]+\.JPG)$/thumb_$1/' `find . -type f -name '*.JPG'`
For jpg files in the current folder only:
for f in *.jpg ; do mv "$f" "PRE_$f" ; done

Strip leading dot from filenames bash script

I have some files in a bunch of directories that have a leading dot and thus are hidden. I would like to revert that and strip the leading dot.
I was unsuccessful with the following:
for file in `find files/ -type f`;
do
base=`basename $file`
if [ `$base | cut -c1-2` = "." ];
then newname=`$base | cut -c2-`;
dirs=`dirname $file`;
echo $dirs/$newname;
fi
done
Which fails on the condition statement:
[: =: unary operator expected
Furthermore, some files have a space in them, and the for loop splits their names.
Any help would be appreciated.
The easiest way to delete something from the start of a variable is to use ${var#pattern}.
$ FILENAME=.bashrc; echo "${FILENAME#.}"
bashrc
$ FILENAME=/etc/fstab; echo "${FILENAME#.}"
/etc/fstab
See the bash man page:
${parameter#word}
${parameter##word}
The word is expanded to produce a pattern just as in pathname expansion. If the pattern matches the beginning of the value of parameter, then the result of the expansion is
the expanded value of parameter with the shortest matching pattern (the ‘‘#’’ case) or the longest matching pattern (the ‘‘##’’ case) deleted.
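A short illustration of shortest vs. longest match, with a made-up value:
$ v=archive.tar.gz
$ echo "${v#*.}"     # shortest match of *. deleted
tar.gz
$ echo "${v##*.}"    # longest match of *. deleted
gz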
By the way, with a more selective find command you don't need to do all the hard work. You can have find only match files with a leading dot:
find files/ -type f -name '.*'
Throwing that all together, then:
find files/ -type f -name '.*' -printf '%p\0' |
while IFS= read -r -d '' path; do
    dir=$(dirname "$path")
    file=$(basename "$path")
    mv "$dir/$file" "$dir/${file#.}"
done
Additional notes:
To handle file names with spaces properly you need to quote variable names when you reference them. Write "$file" instead of just $file.
For extra robustness the -printf '%p\0' and read -d '' use NUL characters as delimiters, so even file names with embedded newlines '\n' will work.
find files/ -name '.*' -printf '%f\n'|while read f; do
mv "files/$f" "files/${f#.}"
done
This script works with any file you can throw at it, even if the names contain spaces, newlines or other nefarious characters.
It works no matter how many subdirectories deep the hidden file is.
Unlike the other answers thus far, you don't have to change the rest of the script when you change the path given to find.
Note: I included an echo so that you can test it like a dry run. Remove the single echo if you are satisfied with the results.
find . -type f -name '.*' -exec sh -c 'for arg; do d="${arg%/*}"; f="${arg##*/}"; echo mv "$arg" "$d/${f#.}"; done' _ {} +
