I have a directory "FolderName" with 10,000 new files every day. Almost half of those files are named as follows:
filename_yyyy-mm-dd_hh:mm
while the other half are named:
filename_yyyy-mm-dd hh:mm
(with space instead of underscore)
The script I'd like to set up should do the following:
Rename only the files containing a space in their name, skipping files that need no processing.
I cannot find a way to make the script efficient. I really need to skip the good half, but my script tries to mv every file, so it is slow and inefficient. Any good ideas for a better design?
Thanks everybody
Check out the rename utility, a Perl script designed just for this, it's powerful and fast.
rename -n ' ' _ *\ *
Or:
find /path/to/dir -type f -name '* *' -exec rename -n ' ' _ {} \;
The -n flag is for a dry run: it prints what would happen without actually renaming anything. If the output looks good, remove the flag and rerun.
With the legacy rename you can easily convert a single space to an underscore:
$ rename ' ' '_' files
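find . -name '* *' | while IFS= read -r f; do mv "$f" "$(printf '%s\n' "$f" | sed 's/ /_/g')"; done will do it, as long as no directory name contains a space and no name contains a newline. There should be a better way using find's -exec option as well.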
EDIT:
If the function is put in a script, then find can be used directly:
cat <<'EOF' > space_to_underscore
#!/usr/bin/env bash
mv "$1" "$(sed 's/\ /_/' <(echo "$1"))"
EOF
chmod +x space_to_underscore
find . -name '* *' -exec ./space_to_underscore {} \;
This will be faster than using a for loop.
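The helper script can also be folded into the find call itself. What follows is a minimal sketch in plain POSIX sh, not a hardened tool: the function name spaces_to_underscores is made up for illustration, and it assumes its argument is a directory and that no file name ends in a newline. Only the last path component is rewritten, so directories containing spaces are left alone, and because -name '* *' filters inside find, files that are already correctly named never reach mv.

```shell
# Hypothetical helper: rename regular files under directory "$1" whose
# name contains a space, replacing every space in the name with "_".
spaces_to_underscores() {
    find "$1" -type f -name '* *' -exec sh -c '
        for path do
            dir=${path%/*}      # everything before the last slash
            base=${path##*/}    # the file name itself
            new=$(printf "%s" "$base" | tr " " "_")
            mv -- "$path" "$dir/$new"
        done
    ' sh {} +
}
```

Calling spaces_to_underscores FolderName hands the matching names to sh in batches, so the cost is a handful of shell start-ups rather than one per file.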
Rename only the files containing a space in their name, skipping files that need no processing:
for file in *\ * ; do mv "$file" "${file// /_}" ; done
Move all the files older than one week to a "NewFolderName" folder:
find . -maxdepth 1 -type f -mtime +7 -exec mv {} NewFolderName/ \;
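A note on the age test, with a small sketch. find counts age in whole 24-hour periods, so -mtime 7 matches only files that are exactly seven periods old, while -mtime +7 matches everything older than that. The function name move_old_files and its two arguments are hypothetical; -maxdepth is not in POSIX, although GNU and BSD find both accept it.

```shell
# Hypothetical helper: move regular files directly inside "$1" that were
# last modified more than 7 full days ago into directory "$2".
move_old_files() {
    src=$1
    dest=$2
    mkdir -p -- "$dest" || return 1
    # -maxdepth 1 keeps find out of subdirectories (including $dest itself
    # when it lives inside $src); -type f skips the directories.
    find "$src" -maxdepth 1 -type f -mtime +7 -exec mv -- {} "$dest"/ \;
}
```

For the layout in the question this would be called as move_old_files FolderName FolderName/NewFolderName.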
I need to create a clone of a directory tree so I can clean up duplicate files.
I don't need copies of the files, I just need the files, so I want to create a matching tree with hard links.
I threw this together in a couple of minutes when I realized my backup was going to take hours.
It just echoes the commands, which I redirect to a file to examine before I run it.
Of course the usual problems, like files and directories containing quotes or commas, have not been addressed (bash scripting sucks for this, doesn't it? This and files containing leading dashes).
Isn't there some utility that already does this in a robust fashion?
BASEDIR=$1
DESTDIR=$2
for DIR in `find "$BASEDIR" -type d`
do
RELPATH=`echo $DIR | sed "s,$BASEDIR,,"`
DESTPATH=${DESTDIR}/$RELPATH
echo mkdir -p \"$DESTPATH\"
done
for FILE in `find "$BASEDIR" -type f`
do
RELPATH=`echo $FILE | sed "s,$BASEDIR,,"`
DESTPATH=${DESTDIR}/$RELPATH
echo ln \"$FILE\" \"$DESTPATH\"
done
Generally using find like that is a bad idea - you are basically relying on separating filenames on whitespace, when in fact all forms of whitespace are valid as filenames on most UNIX systems. Find itself has the ability to run single commands on each file found, which is generally a better thing to use. I would suggest doing something like this (I'd use a couple of scripts for this for simplicity, not sure how easy it would be to do it all in one):
main.sh:
BASEDIR="$1" #I tend to quote all variables - good habit to avoid problems with spaces, etc.
DESTDIR="$2"
find "$BASEDIR" -type d -exec ./handle_file.sh \{\} "$BASEDIR" "$DESTDIR" \; # \{\} is replaced with the filename, \; tells find the command is over
find "$BASEDIR" -type f -exec ./handle_file.sh \{\} "$BASEDIR" "$DESTDIR" \;
handle_file.sh:
FILENAME="$1"
BASEDIR="$2"
DESTDIR="$3"
RELPATH="${FILENAME#"$BASEDIR"}" # bash string substitution double quoting, to stop BASEDIR being interpreted as a pattern
DESTPATH="${DESTDIR}/$RELPATH"
if [ -f "$FILENAME" ]; then
echo ln \""$FILENAME"\" \""$DESTPATH"\"
elif [ -d "$FILENAME" ]; then
echo mkdir -p \""$DESTPATH"\"
fi
I've tested this with a simple tree with spaces, asterisks, apostrophes and even a carriage return in filenames and it seems to work.
Obviously remove the escaped quotes and the "echo" (but leave the real quotes) to make it work for real.
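For comparison, the two scripts can be collapsed into one function. This is a sketch under stated assumptions rather than a drop-in replacement: the name clone_tree_with_hardlinks is invented here, both trees must be on the same filesystem (hard links cannot cross filesystems), and the destination must not lie inside the source. On systems with GNU coreutils, cp -al SRC DEST is commonly used for the same purpose.

```shell
# Hypothetical helper: recreate the directory tree of "$1" under "$2" and
# hard-link every regular file into the matching place.
clone_tree_with_hardlinks() {
    src=$1
    mkdir -p -- "$2" || return 1
    dest=$(cd -- "$2" && pwd) || return 1   # absolute, because we cd below
    (
        cd -- "$src" || exit 1
        # Directories first, so every link target has a parent.
        find . -type d -exec sh -c '
            dest=$1; shift
            for d do mkdir -p -- "$dest/$d"; done
        ' sh "$dest" {} +
        # Then one hard link per regular file; paths start with "./",
        # so names with leading dashes are harmless.
        find . -type f -exec sh -c '
            dest=$1; shift
            for f do ln -- "$f" "$dest/$f"; done
        ' sh "$dest" {} +
    )
}
```

Because find passes each name as a separate argument, spaces, quotes and newlines in names never go through word splitting.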
I'm trying to rename a load of files (I count over 200) that either have the company name in the filename, or in the text contents. I basically need to change any references to "company" to "newcompany", maintaining capitalisation where applicable (i.e. "Company" becomes "Newcompany", "company" becomes "newcompany"). I need to do this recursively.
Because the name could occur pretty much anywhere I've not been able to find example code anywhere that meets my requirements. It could be any of these examples, or more:
company.jpg
company.php
company.Class.php
company.Company.php
companysomething.jpg
Hopefully you get the idea. I not only need to do this with filenames, but also the contents of text files, such as HTML and PHP scripts. I'm presuming this would be a second command, but I'm not entirely sure what.
I've searched the codebase and found nearly 2000 mentions of the company name in nearly 300 files, so I don't fancy doing it manually.
Please help! :)
bash has powerful looping and substitution capabilities:
for filename in `find /root/of/where/files/are -name '*company*'`; do
mv "$filename" "${filename/company/newcompany}"
done
for filename in `find /root/of/where/files/are -name '*Company*'`; do
mv "$filename" "${filename/Company/Newcompany}"
done
For the file and directory names, use for, find, mv and sed.
For each path (f) that has company in the name, rename it (mv) from f to the new name where company is replaced by newcompany.
for f in `find . -name '*company*'` ; do mv "$f" "`echo "$f" | sed s/company/newcompany/`" ; done
For the file contents, use find, xargs and sed.
For every file, replace company with newcompany in its contents, keeping the original file with the extension .backup.
find . -type f -print0 | xargs -0 sed -i.backup 's/company/newcompany/g'
I'd suggest you take a look at man rename, an extremely powerful Perl utility for, well, renaming files.
Standard syntax is
rename 's/\.htm$/\.html/' *.htm
The clever part is that the tool accepts any Perl regexp as a pattern for the filename to be changed.
You might want to run it with the -n switch, which makes the tool only report what it would have changed.
I can't figure out a nice way to keep the capitalization right now, but since you can already search through the file structure, you can issue several rename commands with different capitalization until all files are changed.
To loop through all files below current folder and to search for a particular string, you can use
find . -type f -exec grep -n -i STRING_TO_SEARCH_FOR /dev/null {} \;
The output from that command can be directed to a file (after some filtering to just extract the file names of the files that need to be changed).
find . -type ... > files_to_operate_on
Then wrap that in a while read loop and do some perl-magic for inplace-replacement
while IFS= read -r file
do
perl -pi -e 's/stringtoreplace/replacementstring/g' "$file"
done < files_to_operate_on
There are few right ways to recursively process files. Here's one:
while IFS= read -d $'\0' -r file ; do
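# replace the lowercase form first; otherwise the "company" inside "Newcompany" would be replaced a second time
newfile="${file//company/newcompany}"
newfile="${newfile//Company/Newcompany}"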
mv -f "$file" "$newfile"
done < <(find /basedir/ -iname '*company*' -print0)
This will work with all possible file names, not just ones without whitespace in them.
Presumes bash.
For changing the contents of files I would advise caution because a blind replacement within a file could break things if the file is not plain text. That said, sed was made for this sort of thing.
while IFS= read -d $'\0' -r file ; do
sed -i '' -e 's/company/newcompany/g;s/Company/Newcompany/g' "$file"
done < <(find /basedir/ -iname '*company*' -print0)
For this run I recommend adding some additional switches to find to limit the files it will process, perhaps
find /basedir/ \( -iname '*company*' -and \( -iname '*.txt' -or -iname '*.html' \) \) -print0
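One detail worth checking in any two-pass replacement like this is the order of the passes, because the new name contains the old one. If "Company" is rewritten first, the second pass finds the lowercase "company" inside the freshly written "Newcompany" and produces "Newnewcompany". The sketch below (the function name replace_company is hypothetical) runs the lowercase pass first; it works as a filter, so it can be tried on sample names before touching any file.

```shell
# Hypothetical filter: reads names or text on stdin, writes the rewritten form.
replace_company() {
    # Lowercase form first. "Company" is untouched by this pass because the
    # match is case-sensitive, and the second pass cannot match "newcompany".
    sed -e 's/company/newcompany/g' -e 's/Company/Newcompany/g'
}

printf '%s\n' 'company.Company.php' | replace_company
# prints: newcompany.Newcompany.php
```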
I'm trying to build a script that lists all the zip files in a set of directories, with some filters and get it to spit them out to file but when a filename has a space in it it seems to appear on a new line.
This list will eventually be used as an input to tar to gzip all the zip files, script is below:
#!/bin/bash
rm -f set1.txt
rm -f set2.txt
for line in $(find /home -type d -name assets ;);
do
echo $line >> set1.txt
for line in $(find $line -type f -name \*.zip -mtime +2 ;);
do
echo \"$line\" >> set2.txt
done;
done;
This works as expected until you get a space in a filename then set2.txt contains entries like this:
"/home/xxxxxx/oldwebroot/htdocs/upload/assets/jobbags/rbjbCost"
"in"
"use"
"sept"
"2010.zip"
Does anyone know how I can get it to keep these filenames with spaces in them on a single line, with the whole lot wrapped in one set of quotes?
Thanks!
The correct way to loop over a set of files located via find is with a while read construct, thus:
while IFS= read -r -d '' line ; do
echo "$line" >> set1.txt
while IFS= read -r -d '' file ; do
printf '"%s"\n' "$file" >> set2.txt
done < <(find "$line" -type f -name \*.zip -mtime +2 -print0)
done < <(find /home -type d -name assets -print0)
For clarity I have given the inner loop variable a different name.
If you didn't have bash you'd have to issue the find command separately and redirect the output to a file, then read the file with while read ; do .. done < filename.
Note that each expansion of each variable is double-quoted. This is necessary.
Note also, however, that for what you want you can simply use the -printf switch to find, if you have GNU find.
find /home -type f -path '*/assets/*.zip' -mtime +2 -printf '"%p"\n' > set2.txt
Although, as @sarnold notes, this is not safe.
You should probably be executing your tar(1) command through some other mechanism; the find(1) program supports a -print0 option to request ASCII NUL-separated filename output, and the xargs(1) program supports a -0 option to tell it that the input is separated by ASCII NUL characters. (Since NUL is the only character that is not allowed in filenames, this is the only way to get reliable filename handling.)
Simply using the -print0 and -0 options will help but this still leaves the script open to another problem -- xargs(1) might decide to execute the tar(1) command two, three, or more times, depending upon its input. The last execution is the one that will "win", and the data from earlier invocations will be lost for ever. (This is useless as a backup.)
So you should also look into adding the --append command line option to tar(1), too, so that it will add to the archive. It might make sense to perform the compression after all the files have been added, via gzip(1) or bzip2(1). (This does mean you need to remove the archive before a "fresh run" of this script.)
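Another way to sidestep the repeated-invocation problem, assuming GNU tar and a find that supports -print0, is to let tar read the NUL-separated list itself, so there is exactly one tar process and xargs is not involved. This is only a sketch: the function name archive_old_zips and its two arguments are made up for illustration, and the filters are copied from the question.

```shell
# Hypothetical helper: archive every *.zip under an "assets" directory below
# "$1" that was modified more than 2 days ago into the gzipped tarball "$2".
archive_old_zips() {
    root=$1
    archive=$2
    # --null tells tar that the names read via --files-from are NUL-terminated,
    # which matches find -print0; "-" means read the list from stdin.
    find "$root" -type f -path '*/assets/*.zip' -mtime +2 -print0 |
        tar --null --files-from=- -czf "$archive"
}
```

With absolute paths GNU tar strips the leading "/" from member names and says so on stderr; that message is informational.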
I have to rename a complete folder tree recursively so that no uppercase letter appears anywhere (it's C++ source code, but that shouldn't matter).
Bonus points for ignoring CVS and Subversion version control files/folders. The preferred way would be a shell script, since a shell should be available on any Linux box.
There were some valid arguments about details of the file renaming.
I think files with the same lowercase names should be overwritten; it's the user's problem. When checked out on a case-ignoring file system, it would overwrite the first one with the latter, too.
I would consider A-Z characters and transform them to a-z, everything else is just calling for problems (at least with source code).
The script would be needed to run a build on a Linux system, so I think changes to CVS or Subversion version control files should be omitted. After all, it's just a scratch checkout. Maybe an "export" is more appropriate.
Smaller still I quite like:
rename 'y/A-Z/a-z/' *
On case insensitive filesystems such as OS X's HFS+, you will want to add the -f flag:
rename -f 'y/A-Z/a-z/' *
A concise version using the "rename" command:
find my_root_dir -depth -exec rename 's/(.*)\/([^\/]*)/$1\/\L$2/' {} \;
This avoids problems with directories being renamed before files and trying to move files into non-existing directories (e.g. "A/A" into "a/a").
Or, a more verbose version without using "rename".
for SRC in `find my_root_dir -depth`
do
DST=`dirname "${SRC}"`/`basename "${SRC}" | tr '[A-Z]' '[a-z]'`
if [ "${SRC}" != "${DST}" ]
then
[ ! -e "${DST}" ] && mv -T "${SRC}" "${DST}" || echo "${SRC} was not renamed"
fi
done
P.S.
The latter allows more flexibility with the move command (for example, "svn mv").
for f in `find`; do mv -v "$f" "`echo $f | tr '[A-Z]' '[a-z]'`"; done
Just simply try the following if you don't need to care about efficiency.
zip -r foo.zip foo/*
unzip -LL foo.zip
One can simply use the following which is less complicated:
rename 'y/A-Z/a-z/' *
This works on CentOS/Red Hat Linux or other distributions without the rename Perl script:
for i in $( ls | grep [A-Z] ); do mv -i "$i" "`echo $i | tr 'A-Z' 'a-z'`"; done
Source: Rename all file names from uppercase to lowercase characters
(In some distributions the default rename command comes from util-linux, and that is a different, incompatible tool.)
This works if you already have or set up the rename command (e.g. through brew install in Mac):
rename --lower-case --force somedir/*
The simplest approach I found on Mac OS X was to use the rename package from http://plasmasturm.org/code/rename/:
brew install rename
rename --force --lower-case --nows *
--force Rename even when a file with the destination name already exists.
--lower-case Convert file names to all lower case.
--nows Replace all sequences of whitespace in the filename with single underscore characters.
Most of the answers above are dangerous, because they do not deal with names containing odd characters. Your safest bet for this kind of thing is to use find's -print0 option, which will terminate filenames with ASCII NUL instead of \n.
Here is a script which only alters file names and not directory names, so as not to confuse find:
find . -type f -print0 | xargs -0n 1 bash -c \
's=$(dirname "$0")/$(basename "$0");
d=$(dirname "$0")/$(basename "$0"|tr "[A-Z]" "[a-z]"); mv -f "$s" "$d"'
I tested it, and it works with filenames containing spaces, all kinds of quotes, etc. This is important because if you run, as root, one of those other scripts on a tree that includes the file created by
touch \;\ echo\ hacker::0:0:hacker:\$\'\057\'root:\$\'\057\'bin\$\'\057\'bash
... well guess what ...
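To see why the terminator matters, here is a small sketch (the function name count_entries is made up; -print0 is a GNU/BSD extension, not POSIX). It counts what find reports by counting NUL terminators, which stays correct even when a name contains a newline, whereas counting lines does not.

```shell
# Hypothetical helper: count the entries find reports under "$1",
# one NUL byte per entry, regardless of what the names contain.
count_entries() {
    # tr -cd '\0' deletes every byte except NUL; wc -c counts what is left.
    find "$1" -print0 | tr -cd '\0' | wc -c
}
```

On a directory holding a single file whose name contains a newline, find | wc -l reports one entry too many, while count_entries reports the true number.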
Here's my suboptimal solution, using a Bash shell script:
#!/bin/bash
# First, rename all folders
for f in `find . -depth ! -name CVS -type d`; do
g=`dirname "$f"`/`basename "$f" | tr '[A-Z]' '[a-z]'`
if [ "xxx$f" != "xxx$g" ]; then
echo "Renaming folder $f"
mv -f "$f" "$g"
fi
done
# Now, rename all files
for f in `find . ! -type d`; do
g=`dirname "$f"`/`basename "$f" | tr '[A-Z]' '[a-z]'`
if [ "xxx$f" != "xxx$g" ]; then
echo "Renaming file $f"
mv -f "$f" "$g"
fi
done
Folders are all renamed correctly, and mv isn't asking questions when permissions don't match, and CVS folders are not renamed (CVS control files inside that folder are still renamed, unfortunately).
Since "find -depth" and "find | sort -r" both return the folder list in a usable order for renaming, I preferred using "-depth" for searching folders.
One-liner:
for F in K*; do NEWNAME=$(echo "$F" | tr '[:upper:]' '[:lower:]'); mv "$F" "$NEWNAME"; done
Or even:
for F in K*; do mv "$F" "${F,,}"; done
Note that this will convert only files/directories starting with letter K, so adjust accordingly.
The original question asked for ignoring SVN and CVS directories, which can be done by adding -prune to the find command. E.g to ignore CVS:
find . -name CVS -prune -o -exec mv '{}' `echo {} | tr '[A-Z]' '[a-z]'` \; -print
[edit] I tried this out, and embedding the lower-case translation inside the find didn't work for reasons I don't actually understand. So, amend this to:
$> cat > tolower
#!/bin/bash
mv "$1" "`echo "$1" | tr '[:upper:]' '[:lower:]'`"
^D
$> chmod u+x tolower
$> find . -name CVS -prune -o -exec ./tolower '{}' \;
Ian
Not portable, Zsh only, but pretty concise.
First, make sure zmv is loaded.
autoload -U zmv
Also, make sure extendedglob is on:
setopt extendedglob
Then use:
zmv '(**/)(*)~CVS~**/CVS' '${1}${(L)2}'
To recursively lowercase files and directories where the name is not CVS.
Using Larry Wall's filename fixer:
$op = shift or die $help;
chomp(@ARGV = <STDIN>) unless @ARGV;
for (@ARGV) {
$was = $_;
eval $op;
die $@ if $@;
rename($was,$_) unless $was eq $_;
}
It's as simple as
find | fix 'tr/A-Z/a-z/'
(where fix is of course the script above)
for f in `find -depth`; do mv ${f} ${f,,} ; done
find -depth prints each file and directory, with a directory's contents printed before the directory itself. ${f,,} lowercases the file name.
This works nicely on macOS too:
ruby -e "Dir['*'].each { |p| File.rename(p, p.downcase) }"
This is a small shell script that does what you requested:
root_directory="${1?-please specify parent directory}"
do_it () {
awk '{ lc= tolower($0); if (lc != $0) print "mv \"" $0 "\" \"" lc "\"" }' | sh
}
# first the folders
find "$root_directory" -depth -type d | do_it
find "$root_directory" ! -type d | do_it
Note the -depth action in the first find.
Use typeset:
typeset -l new # Always lowercase
find $topPoint | # Not using xargs to make this more readable
while read old
do new="$old" # $new is a lowercase version of $old
mv "$old" "$new" # Quotes for those annoying embedded spaces
done
On Windows, emulations, like Git Bash, may fail because Windows isn't case-sensitive under the hood. For those, add a step that mv's the file to another name first, like "$old.tmp", and then to $new.
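The two-step move could look like the sketch below. It is plain POSIX sh, the function name lowercase_via_temp is invented for illustration, and it assumes that no file called "<name>.tmp" already exists next to the one being renamed.

```shell
# Hypothetical helper: lowercase the last path component of "$1",
# going through a temporary name on the way.
lowercase_via_temp() {
    old=$1
    dir=$(dirname -- "$old")
    base=$(basename -- "$old")
    lower=$(printf '%s' "$base" | tr '[:upper:]' '[:lower:]')
    # Nothing to do if the name is already lowercase.
    [ "$base" = "$lower" ] && return 0
    # Two moves, so a case-insensitive filesystem never sees a
    # rename from a file onto what it considers the same file.
    mv -- "$old" "$old.tmp" && mv -- "$old.tmp" "$dir/$lower"
}
```

It can be dropped into the loop above in place of the plain mv, for example lowercase_via_temp "$old".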
With MacOS,
Install the rename package,
brew install rename
Use,
find . -iname "*.py" -type f | xargs -I% rename -c -f "%"
This command finds all the files with a .py extension and converts the filenames to lower case.
`-f` - forces a rename
For example,
$ find . -iname "*.py" -type f
./sample/Sample_File.py
./sample_file.py
$ find . -iname "*.py" -type f | xargs -I% rename -c -f "%"
$ find . -iname "*.py" -type f
./sample/sample_file.py
./sample_file.py
Lengthy But "Works With No Surprises & No Installations"
This script handles filenames with spaces, quotes, other unusual characters and Unicode, works on case insensitive filesystems and most Unix-y environments that have bash and awk installed (i.e. almost all). It also reports collisions if any (leaving the filename in uppercase) and of course renames both files & directories and works recursively. Finally it's highly adaptable: you can tweak the find command to target the files/dirs you wish and you can tweak awk to do other name manipulations. Note that by "handles Unicode" I mean that it will indeed convert their case (not ignore them like answers that use tr).
# adapt the following command _IF_ you want to deal with specific files/dirs
find . -depth -mindepth 1 -exec bash -c '
for file do
# adapt the awk command if you wish to rename to something other than lowercase
newname=$(dirname "$file")/$(basename "$file" | awk "{print tolower(\$0)}")
if [ "$file" != "$newname" ] ; then
# the extra step with the temp filename is for case-insensitive filesystems
if [ ! -e "$newname" ] && [ ! -e "$newname.lcrnm.tmp" ] ; then
mv -T "$file" "$newname.lcrnm.tmp" && mv -T "$newname.lcrnm.tmp" "$newname"
else
echo "ERROR: Name already exists: $newname"
fi
fi
done
' sh {} +
References
My script is based on these excellent answers:
https://unix.stackexchange.com/questions/9496/looping-through-files-with-spaces-in-the-names
How to convert a string to lower case in Bash?
In OS X, mv -f shows "same file" error, so I rename twice:
for i in `find . -name "*" -type f |grep -e "[A-Z]"`; do j=`echo $i | tr '[A-Z]' '[a-z]' | sed s/\-1$//`; mv $i $i-1; mv $i-1 $j; done
I needed to do this on a Cygwin setup on Windows 7 and found that I got syntax errors with the suggestions from above that I tried (though I may have missed a working option). However, this solution straight from Ubuntu forums worked out of the can :-)
ls | while read upName; do loName=`echo "${upName}" | tr '[:upper:]' '[:lower:]'`; mv "$upName" "$loName"; done
(NB: I had previously replaced whitespace with underscores using:
for f in *\ *; do mv "$f" "${f// /_}"; done
)
Slugify Rename (regex)
It is not exactly what the OP asked for, but what I was hoping to find on this page:
A "slugify" version for renaming files so they are similar to URLs (i.e. only include alphanumeric, dots, and dashes):
rename "s/[^a-zA-Z0-9\.]+/-/g" filename
I would reach for Python in this situation, to avoid optimistically assuming paths without spaces or slashes. I've also found that python2 tends to be installed in more places than rename.
#!/usr/bin/env python2
import sys, os
def rename_dir(directory):
    print('DEBUG: rename('+directory+')')
    # Rename current directory if needed
    os.rename(directory, directory.lower())
    directory = directory.lower()
    # Rename children
    for fn in os.listdir(directory):
        path = os.path.join(directory, fn)
        os.rename(path, path.lower())
        path = path.lower()
        # Rename children within, if this child is a directory
        if os.path.isdir(path):
            rename_dir(path)
# Run program, using the first argument passed to this Python script as the name of the folder
rename_dir(sys.argv[1])
If you use Arch Linux, you can install the rename package from AUR, which provides the renamexm command as the /usr/bin/renamexm executable, along with a manual page.
It is a really powerful tool to quickly rename files and directories.
Convert to lowercase
rename -l Developers.mp3 # or --lowcase
Convert to UPPER case
rename -u developers.mp3 # or --upcase, long option
Other options
-R --recursive # directory and its children
-t --test # Dry run, output but don't rename
-o --owner # Change file owner as well to user specified
-v --verbose # Output what file is renamed and its new name
-s/str/str2 # Substitute string on pattern
--yes # Confirm all actions
You can fetch the sample Developers.mp3 file from here, if needed ;)
None of the solutions here worked for me because I was on a system that didn't have access to the perl rename script, plus some of the files included spaces. However, I found a variant that works:
find . -depth -exec sh -c '
t=${0%/*}/$(printf %s "${0##*/}" | tr "[:upper:]" "[:lower:]");
[ "$t" = "$0" ] || mv -i "$0" "$t"
' {} \;
Credit goes to "Gilles 'SO- stop being evil'", see this answer on the similar question "change entire directory tree to lower-case names" on the Unix & Linux StackExchange.
I believe the one-liners can be simplified:
for f in **/*; do mv "$f" "${f:l}"; done
( find YOURDIR -type d | sort -r;
find YOURDIR -type f ) |
grep -v /CVS | grep -v /SVN |
while IFS= read -r f; do mv -v "$f" "`echo "$f" | tr '[A-Z]' '[a-z]'`"; done
First rename the directories bottom up sort -r (where -depth is not available), then the files.
Then grep -v /CVS instead of find ...-prune because it's simpler.
For large directories, for f in ... can overflow some shell buffers.
Use find ... | while read to avoid that.
And yes, this will clobber files which differ only in case...
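If clobbering is a concern, the collisions can be listed before anything is renamed. A minimal sketch, with a made-up function name: it lowercases every path find prints and reports the ones that occur more than once. It assumes ASCII names without embedded newlines, and it only makes sense on a case-sensitive filesystem, since elsewhere the colliding names cannot exist side by side.

```shell
# Hypothetical helper: print each lowercased path under "$1" that two or
# more existing paths would map to.
find_case_collisions() {
    # uniq -d prints one copy of every line that is repeated in sorted input.
    find "$1" | tr '[:upper:]' '[:lower:]' | sort | uniq -d
}
```

An empty result means lowercasing the tree should not overwrite anything.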
find . -depth -name '*[A-Z]*'|sed -n 's/\(.*\/\)\(.*\)/mv -n -v -T \1\2 \1\L\2/p'|sh
I haven't tried the more elaborate scripts mentioned here, but none of the single commandline versions worked for me on my Synology NAS. rename is not available, and many of the variations of find fail because it seems to stick to the older name of the already renamed path (eg, if it finds ./FOO followed by ./FOO/BAR, renaming ./FOO to ./foo will still continue to list ./FOO/BAR even though that path is no longer valid). Above command worked for me without any issues.
What follows is an explanation of each part of the command:
find . -depth -name '*[A-Z]*'
This will find any file from the current directory (change . to whatever directory you want to process), using a depth-first search (eg., it will list ./foo/bar before ./foo), but only for files that contain an uppercase character. The -name filter only applies to the base file name, not the full path. So this will list ./FOO/BAR but not ./FOO/bar. This is ok, as we don't want to rename ./FOO/bar. We want to rename ./FOO though, but that one is listed later on (this is why -depth is important).
This command in itself is particularly useful for finding the files that you want to rename in the first place. Use it after the complete rename command to search for files that still haven't been renamed because of file name collisions or errors.
sed -n 's/\(.*\/\)\(.*\)/mv -n -v -T \1\2 \1\L\2/p'
This part reads the files outputted by find and formats them in a mv command using a regular expression. The -n option stops sed from printing the input, and the p command in the search-and-replace regex outputs the replaced text.
The regex itself consists of two captures: the part up until the last / (which is the directory of the file), and the filename itself. The directory is left intact, but the filename is transformed to lowercase. So, if find outputs ./FOO/BAR, it will become mv -n -v -T ./FOO/BAR ./FOO/bar. The -n option of mv makes sure existing lowercase files are not overwritten. The -v option makes mv output every change that it makes (or doesn't make - if ./FOO/bar already exists, it outputs something like ./FOO/BAR -> ./FOO/BAR, noting that no change has been made). The -T is very important here - it makes mv treat the target as a normal file rather than as a directory to move things into. This will make sure that ./FOO/BAR isn't moved into ./FOO/bar if that directory happens to exist.
Use this together with find to generate a list of commands that will be executed (handy to verify what will be done without actually doing it)
sh
This is pretty self-explanatory. It routes all the generated mv commands to the shell interpreter. You can replace it with bash or any shell of your liking.
Using bash, without rename:
find . -exec bash -c 'mv "$0" "${0,,}"' {} \;