This question already has answers here:
How to perform a for-each loop over all the files under a specified path?
I'm trying to rename several files. So I need those file names first.
I'm using:
for FILE in $(find . -type f -name "*.flv" -exec basename {} \; ); do
echo "$FILE"
done
When I try just the find command, it returns the file names correctly, but when I use the for loop, I was expecting FILE to contain the entire name of a single file; instead, it holds the split-up words of the filename.
So how can I get the entire name, not just separated words of it?
There are several ways to get that to work. The simplest is to use find's exec fully:
find . -type f -name "*.flv" -exec bash -c 'f=$(basename "$1"); printf "%s\n" "$f"' _ {} \;
In other words, you can put complex scripts in the -exec clause if you like.
As a second choice, consider this loop:
find . -name '*.flv' -print0 | while IFS= read -d '' -r file
do
f=$(basename "$file")
printf "%s\n" "$f"
done
Using a for loop over the result of an unquoted command substitution causes the result to be split on space, tab and newline by default (that is IFS's default value).
POSIXly, you don't need anything other than find and an inline-script:
$ find . -type f -name "*.flv" -exec sh -c '
for f do
printf "%s\n" "${f##*/}"
done
' sh {} +
With GNU find, you don't need the inline-script:
$ find . -type f -name "*.flv" -printf '%f\n'
Looking at the title of the question (avoiding splitting a string when using for in):
Do not let the IFS field separators apply in the loop:
:~> a="sdad asd asda ad
> fdvbdsvf
> dfvsdfv
> 4"
:~> for s in $a; do
echo "== $s ==";
done
== sdad ==
== asd ==
== asda ==
== ad ==
== fdvbdsvf ==
== dfvsdfv ==
== 4 ==
:~> (IFS=; for s in $a; do
echo "== $s ==";
done)
== sdad asd asda ad
fdvbdsvf
dfvsdfv
4 ==
I used round brackets for the last command, so that the changed value of IFS is limited to that subprocess.
Instead of using find, use rename command which is designed to rename multiple files.
For example:
rename 's/foo/bar/' **/*.flv
which would replace foo with bar in the names of all *.flv files recursively. This relies on recursive globbing with **: in Bash 4.x enable it with shopt -s globstar; zsh supports it out of the box.
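For example, a minimal sketch assuming Bash 4+ and the Perl-based rename (its -n flag only previews the changes):
shopt -s globstar
rename -n 's/foo/bar/' **/*.flv   # dry run: print what would be renamed
rename 's/foo/bar/' **/*.flv      # actually rename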
Or if you're using find with a loop, you can use one of the following (a short sketch follows the list):
-print0 when piping to external programs such as xargs (with -0),
use -exec cmd to run command directly on the file ({})
use -execdir cmd to execute command in the directory where the file is present
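For example, a minimal sketch of the -exec and -execdir forms (echo and pwd stand in for a real command):
find . -type f -name '*.flv' -exec echo found: {} \;     # run the command once per file
find . -type f -name '*.flv' -execdir pwd \;             # run it from each file's own directory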
I have multiple files in multiple directories and I have to rename these files from lowercase to uppercase; the file extension may vary and needs to stay in lowercase (files whose extensions are in uppercase should have those renamed too).
NB: I have the rename version from util-linux on CentOS Linux 7.
I tried this:
find /mydir -depth | xargs -n 1 rename -v 's/(.*)\/([^\/]*)/$1\/\U$2/' {} \;
find /mydir -depth | xargs -n 1 rename -v 's/(.*)\/([^\/]*)/$2\/\L$2/' {} \;
but it's not working: it changes nothing and I get no output.
I tried another solution:
for SRC in `find my_root_dir -depth`
do
DST=`dirname "${SRC}"`/`basename "${SRC}" | tr '[a-z]' '[A-Z]'`
if [ "${SRC}" != "${DST}" ]
then
[ ! -e "${DST}" ] && mv -T "${SRC}" "${DST}" || echo "${SRC} was not renamed"
fi
done
This one partially works, but it transforms the file extensions to uppercase too.
Any suggestions on how to keep/transform the extensions in lowercase?
Thank you!
Possible solution with Perl rename:
find /mydir -depth -type f -exec rename -v 's/(.*\/)?([^.]*)/$1\U$2/' {} +
The commands in the question have several problems.
You seem to confuse the syntax of find's -exec action and xargs.
find /mydir -depth -type f -exec rename -v 'substitution_command' {} \;
find /mydir -depth -type f| xargs -n 1 rename -v 'substitution_command'
The xargs version has problems in case a file name contains a space.
If you replace \; with +, multiple file names are passed to one invocation of rename.
The substitution command is only supported by the Perl version of the rename command. You might have to install this version. See Get the Perl rename utility instead of the built-in rename
The substitution did not work in my test. I successfully used
rename -v 's/(.*\/)?([^.]*)/$1\U$2/' file ...
The first group (.*\/)? optionally matches a sequence of characters with a trailing /. This is used to copy the directory unchanged.
The second group ([^.]*) matches a sequence of characters except ..
This is the file name part before the first dot (if any) which will be converted to uppercase. In case the file name has more than one extension, all will remain unchanged, e.g.
Path/To/Foo.Bar.Baz -> Path/To/FOO.Bar.Baz
rename-independent solution (using a plain loop together with mv)
You can rename all files in a directory with a following command:
for i in $( ls | grep [A-Z] ); do mv -i $i `echo $i | tr 'A-Z' 'a-z'`; done
The first part (for i in $( ls | grep [A-Z] )) selects every file name that contains an uppercase character and loops until all files have been "scanned".
The second part (echo $i | tr 'A-Z' 'a-z') turns all uppercase characters into lowercase ones.
Perl-based rename dependent solution
rename -f 'y/A-Z/a-z/' *
This command changes uppercase characters to lowercase ones. The -f option allows overwriting of existing files, but it is not strictly necessary.
Here is a trick with awk that generates all the required mv commands:
awk '{f=$0;split($NF,a,".");$NF=tolower(a[1])"."toupper(a[2]);print "mv "f" "$0}' FS=/ OFS=/ <<< $(find . -type f)
Inspect the result, and run all mv commands together:
bash <<< $(awk '{f=$0;split($NF,a,".");$NF=tolower(a[1])"."toupper(a[2]);print "mv "f" "$0}' FS=/ OFS=/ <<< $(find . -type f))
Explanation of the awk script (script.awk):
BEGIN { # preprocessing configuration
FS="/"; # set awk field separtor to /
OFS="/"; # set awk output field separtor to /
}
{ # for each line in input list
filePath = $0; # save the whole filePath in variable
# fileName is contained in last field $NF
# split fileName by "." to head: splitedFileNameArr[1] and tail: splitedFileNameArr[2]
split($NF,splitedFileNameArr,".");
# recreate fileName from lowercase(head) "." uppercase(tail)
$NF = tolower(splitedFileNameArr[1]) "." toupper(splitedFileNameArr[2]);
# generate a "mv" command from original filePath and regenerated fileName
print "mv "filePath" "$0;
}
Testing:
mkdir {a1,B2}/{A1,b2} -p; touch {a1,B2}/{A1,b2}/{A,b}{b,C}.{c,D}{d,C}
find . -type f
./a1/A1/Ab.cC
./a1/A1/Ab.cd
./a1/A1/Ab.DC
./a1/A1/Ab.Dd
./B2/b2/AC.DC
./B2/b2/AC.Dd
.....
./B2/b2/bC.cd
./B2/b2/bC.DC
./B2/b2/bC.Dd
awk -f script.awk <<< $(find . -type f)
.....
mv ./a1/b2/Ab.cd ./a1/b2/ab.CD
mv ./a1/b2/Ab.DC ./a1/b2/ab.DC
mv ./a1/b2/Ab.Dd ./a1/b2/ab.DD
mv ./B2/A1/bC.Dd ./B2/A1/bc.DD
.....
mv ./B2/b2/bC.DC ./B2/b2/bc.DC
mv ./B2/b2/bC.Dd ./B2/b2/bc.DD
bash <<< $(awk -f script.awk <<< $(find . -type f))
find . -type f
I have a bash script, which contains the following lines:
for ((iTime=starttime;iTime<=endtime;iTime++))
do
find . -name "*${iTime}*" -exec cp --parents \{\} ${dst} \;
done
I have a structure with a few folders including subfolders and many files at the bottom of the tree. These files are labeled with date and time info in the filename, like "filename_2021063015300000_suffix". The time is in the format yyyymmddhhmmss plus two digits for 1/10 and 1/100 seconds. I have a lot of files, which means that my approach is very slow. The files have a time distance of a few minutes, so only a couple of files (e.g. 10 per subfolder out of >10000) should be copied.
How can I find all the files in the time range and copy them with one find-and-copy command? Maybe get a list of all the files to copy with one find command and then copy that list of file paths? But how can I do this?
If your time span is reasonably limited, just inline the acceptable file names into the single find command.
find . \( -false $(for ((iTime=starttime;iTime<=endtime;iTime++)); do printf ' %s' -o -name "*$iTime*"; done) \) -exec cp --parents \{\} ${dst} \;
The initial -false predicate inside the parentheses is just to simplify the following predicates so that they can all start with -o -name.
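To illustrate, with the toy values starttime=100 and endtime=102 the command that actually runs is roughly:
find . \( -false -o -name "*100*" -o -name "*101*" -o -name "*102*" \) -exec cp --parents {} "$dst" \;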
This could end up with an "argument list too long" error if your list of times is long, though. Perhaps a more robust solution is to pass the time resolution into the command.
find . -type f -exec bash -c '
  for f; do
    for ((iTime='"$starttime"'; iTime<='"$endtime"'; iTime++)); do
      if [[ $f == *"$iTime"* ]]; then
        cp --parents "$f" "$0"
        break
      fi
    done
  done' "$dst" {} +
The script inside -exec could probably be more elegant; if your file names have reasonably regular format, maybe just extract the timestamp and compare it numerically to check whether it's in range. Perhaps also notice how we abuse the $0 parameter after bash -c '...' to pass in the value of $dst.
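To make the $0 trick concrete, here is a tiny standalone illustration (hypothetical arguments):
bash -c 'echo "dst=$0"; for f; do echo "file=$f"; done' /new/dir a.txt b.txt
# prints dst=/new/dir, then file=a.txt and file=b.txt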
Lose the find. I created -
filename_2020063015300000_suffix
filename_2021053015300000_suffix
filename_2021063015300000_suffix
filename_2022063015300000_suffix
foo/filename_2021053015312345_suffix
bar/baz/filename_2021053015310101_suffix
So if I execute
starttime=2021000000000000
endtime=2022000000000000
shopt -s globstar
for f in **/*_[0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9][0-9]_*; do # for all these
ts=${f//[^0-9]/} # trim to date
(( ts >= starttime )) || continue # skip too old
(( ts <= endtime )) || continue # skip too new
echo "$f" # list matches
done | xargs -I{} echo cp {} /new/dir/ # pass to xargs
I get
cp bar/baz/filename_2021053015310101_suffix /new/dir/
cp filename_2021053015300000_suffix /new/dir/
cp filename_2021063015300000_suffix /new/dir/
cp foo/filename_2021053015312345_suffix /new/dir/
There are ways to simplify that glob. If you use extglob you can make it shorter, and check more carefully with a regex - for example,
shopt -s globstar extglob
for f in **/*_+([0-9])_*; do
[[ "$f" =~ _[0-9]{16}_ ]] || continue;
It starts looking complicated and hard to maintain for the next guy, though.
Try these; replace dst, starttime and endtime for your case. Both work for me on Ubuntu 16.04.
find . -type f -regextype sed -regex "[^_]*_[0-9]\{16\}_[^_]*" -exec bash -c 'dt=$(echo "$0" | grep -oP "\d{16}"); [ "$dt" -gt "$2" ] && [ "$dt" -lt "$3" ] && cp -p "$0" "$1"' {} 'dst/' 'starttime' 'endtime' \;
$0 is the filename which contains the datetime, $1 is the dst directory path, $2 is starttime, $3 is endtime.
Or
find . -type f -regextype sed -regex "[^_]*_[0-9]\{16\}_[^_]*" | awk -v dst='/tmp/test_find/' '{if (0 == system("[ $(echo \"" $0 "\"" " | grep -oP \"" "(?<=_)\\d+(?=_)\") -gt starttime ] && [ $(echo \"" $0 "\"" " | grep -oP \"" "(?<=_)\\d+(?=_)\") -lt endtime ]")) {system("cp -p " $0 " " dst)}}'
Both of them first use find with a sed-style regex to find file names matching a pattern like _2021063015300000_ (this example has 16 digits, though the yyyymmddhhmmss format you mention has only 14).
The first then uses -exec bash -c to extract the datetime from the filename, compare it with the start and end times, and execute the cp action.
The second uses awk to extract the datetime and compare it with the start and end times via the system command, and finally executes cp to the dst directory, also via system.
PS: this pattern depends on the datetime being the only thing between two underscores in the filename.
I am trying to list all the subfolders within a folder:
find . -type d -maxdepth 1 -mindepth 1 2>/dev/null | while read dir
do
echo $dir
done
However, what I get printed out is
./dir1
./dir2
while I would need only
dir1
dir2
Complete use case:
later, I would like to create a new file with name of the folder e.g:
find . -type d -maxdepth 1 -mindepth 1 2>/dev/null | while read dir
do
echo 'MOVING TO'$dir
cd $dir
#SUMMARYLOG=$dir_log_merged # HERE IS WHERE THE ./ IS PROBLEMATIC
# QUESTION EDITED
SUMMARYLOG=${dir}_log_merged # HERE IS WHERE THE ./ IS PROBLEMATIC
echo -e "\n""\n"'SUMMARY LOGS TO '$SUMMARYLOG
touch $SUMMARYLOG
pwd
find . -size +0c -type f -name '*.err' | xargs -I % sh -c 'echo % >> {$SUMMARYLOG}; cat % >> "{$SUMMARYLOG}"; echo -e "\n" >> "{$SUMMARYLOG}"'
cat $SUMMARYLOG
cd ..
done
Basically, I would like to merge a set of .err files in each of the subfolders and create one file with the subfolder name.
I cannot create my $SUMMARYLOG, so I think the problem is in the find output ./dir...
Instead of find acrobatics, you could use a glob and parameter expansion:
for d in */; do echo "${d%/}"; done
where the "${d%/}" removes the trailing slash from each directory name.
If you have hidden directories, you have to add a second glob as */ ignores them:
for d in */ .[!.]*/; do echo "${d%/}"; done
where .[!.]*/ is a glob for "begins with . and is followed by anything but another .", to exclude the . and .. entries.
Apart from that, if you have $dir, you can't use $dir_log_merged to append _log_merged to it, as Bash will look for a variable called dir_log_merged. You have to use ${dir}_log_merged instead.
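A quick illustration of why the braces matter (dir1 is just an example name):
dir=./dir1
echo "$dir_log_merged"     # empty: Bash looks up a variable named dir_log_merged
echo "${dir}_log_merged"   # ./dir1_log_merged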
Another set of problems is in your xargs command that starts with
sh -c 'echo % >> {$SUMMARYLOG};
Single quotes prevent variables from expanding
SUMMARYLOG would be invisible in the subshell; you'd have to export it first
{$SUMMARYLOG} expands to the contents of $SUMMARYLOG (empty string, in your case), then surrounds that with {}, which is why you see the {} file being created
You can't use % like this within the sh -c command. You have to use it as an argument to sh -c and then refer to it like this:
sh -c 'echo "$1"' _ %
with _ as a dummy argument that becomes $0 within the sh -c command.
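Putting those points together, a corrected sketch of the xargs line could look like this (assuming SUMMARYLOG has been exported beforehand):
export SUMMARYLOG
find . -size +0c -type f -name '*.err' | xargs -I % sh -c 'echo "$1" >> "$SUMMARYLOG"; cat "$1" >> "$SUMMARYLOG"; echo >> "$SUMMARYLOG"' _ %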
And finally, I would solve your task as follows:
for f in */*.err; do
! [[ -s $f ]] && continue # Skip empty files
{
echo "${f##*/}" # Basename of file
cat "$f" # File contents
echo # Empty line
} >> "${f%/*}/${f%/*}_log_merged" # Dirname plus new filename
done
I want to iterate over a list of files. This list is the result of a find command, so I came up with:
getlist() {
for f in $(find . -iname "foo*")
do
echo "File found: $f"
# do something useful
done
}
It's fine except if a file has spaces in its name:
$ ls
foo_bar_baz.txt
foo bar baz.txt
$ getlist
File found: foo_bar_baz.txt
File found: foo
File found: bar
File found: baz.txt
What can I do to avoid the split on spaces?
You could replace the word-based iteration with a line-based one:
find . -iname "foo*" | while read f
do
# ... loop body
done
There are several workable ways to accomplish this.
If you wanted to stick closely to your original version it could be done this way:
getlist() {
IFS=$'\n'
for file in $(find . -iname 'foo*') ; do
printf 'File found: %s\n' "$file"
done
}
This will still fail if file names have literal newlines in them, but spaces will not break it.
However, messing with IFS isn't necessary. Here's my preferred way to do this:
getlist() {
while IFS= read -d $'\0' -r file ; do
printf 'File found: %s\n' "$file"
done < <(find . -iname 'foo*' -print0)
}
If you find the < <(command) syntax unfamiliar you should read about process substitution. The advantage of this over for file in $(find ...) is that files with spaces, newlines and other characters are correctly handled. This works because find with -print0 will use a null (aka \0) as the terminator for each file name and, unlike newline, null is not a legal character in a file name.
The advantage to this over the nearly-equivalent version
getlist() {
find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
printf 'File found: %s\n' "$file"
done
}
is that any variable assignment in the body of the while loop is preserved. That is, if you pipe to while as above then the body of the while runs in a subshell, which may not be what you want.
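A minimal demonstration of the difference (count is just an illustrative variable):
count=0
find . -iname 'foo*' -print0 | while IFS= read -d '' -r file; do ((count++)); done
echo "$count"    # prints 0: the loop body ran in a subshell, so the increments were lost
count=0
while IFS= read -d '' -r file; do ((count++)); done < <(find . -iname 'foo*' -print0)
echo "$count"    # prints the number of matching files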
The advantage of the process substitution version over find ... -print0 | xargs -0 is minimal: The xargs version is fine if all you need is to print a line or perform a single operation on the file, but if you need to perform multiple steps the loop version is easier.
EDIT: Here's a nice test script so you can get an idea of the difference between different attempts at solving this problem
#!/usr/bin/env bash
dir=/tmp/getlist.test/
mkdir -p "$dir"
cd "$dir"
touch 'file not starting foo' foo foobar barfoo 'foo with spaces' \
    'foo with'$'\n'newline 'foo with trailing whitespace '
# while with process substitution, null terminated, empty IFS
getlist0() {
while IFS= read -d $'\0' -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done < <(find . -iname 'foo*' -print0)
}
# while with process substitution, null terminated, default IFS
getlist1() {
while read -d $'\0' -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done < <(find . -iname 'foo*' -print0)
}
# pipe to while, newline terminated
getlist2() {
find . -iname 'foo*' | while read -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# pipe to while, null terminated
getlist3() {
find . -iname 'foo*' -print0 | while read -d $'\0' -r file ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# for loop over subshell results, newline terminated, default IFS
getlist4() {
for file in "$(find . -iname 'foo*')" ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# for loop over subshell results, newline terminated, newline IFS
getlist5() {
IFS=$'\n'
for file in $(find . -iname 'foo*') ; do
printf 'File found: '"'%s'"'\n' "$file"
done
}
# see how they run
for n in {0..5} ; do
printf '\n\ngetlist%d:\n' $n
eval getlist$n
done
rm -rf "$dir"
There is also a very simple solution: rely on bash globbing
$ mkdir test
$ cd test
$ touch "stupid file1"
$ touch "stupid file2"
$ touch "stupid file 3"
$ ls
stupid file 3 stupid file1 stupid file2
$ for file in *; do echo "file: '${file}'"; done
file: 'stupid file 3'
file: 'stupid file1'
file: 'stupid file2'
Note that I am not sure this behavior is the default one, but I don't see any special setting in my shopt, so I would say it should be "safe" (tested on OS X and Ubuntu).
find . -iname "foo*" -print0 | xargs -L1 -0 echo "File found:"
find . -name "fo*" -print0 | xargs -0 ls -l
See man xargs.
Since you aren't doing any other type of filtering with find, you can use the following as of bash 4.0:
shopt -s globstar
getlist() {
for f in **/foo*
do
echo "File found: $f"
# do something useful
done
}
The **/ will match zero or more directories, so the full pattern will match foo* in the current directory or any subdirectories.
I really like for loops and array iteration, so I figure I will add this answer to the mix...
I also liked marchelbling's stupid file example. :)
$ mkdir test
$ cd test
$ touch "stupid file1"
$ touch "stupid file2"
$ touch "stupid file 3"
Inside the test directory:
readarray -t arr <<< "`ls -A1`"
This adds each file listing line into a bash array named arr with any trailing newline removed.
Let's say we want to give these files better names...
for i in ${!arr[@]}
do
newname=`echo "${arr[$i]}" | sed 's/stupid/smarter/; s/ /_/g'`;
mv "${arr[$i]}" "$newname"
done
${!arr[@]} expands to 0 1 2, so "${arr[$i]}" is the ith element of the array. The quotes around the variables are important to preserve the spaces.
The result is three renamed files:
$ ls -1
smarter_file1
smarter_file2
smarter_file_3
find has an -exec argument that loops over the find results and executes an arbitrary command. For example:
find . -iname "foo*" -exec echo "File found: {}" \;
Here {} represents the found files, and wrapping it in "" allows for the resultant shell command to deal with spaces in the file name.
In many cases you can replace that last \; (which runs the command once per file) with a +, which will put multiple files into one invocation of the command (not necessarily all of them at once, though; see man find for more details).
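For example, a sketch of the + form for the same task:
find . -iname "foo*" -exec echo "File found:" {} +
Note that with + all the names are handed to a single echo, so they appear on one line rather than one per line.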
I recently had to deal with a similar case, and I built a FILES array to iterate over the filenames:
eval FILES=($(find . -iname "foo*" -printf '"%p" '))
The idea here is to surround each filename with double quotes, separate them with spaces and use the result to initialize the FILES array.
The use of eval is necessary to evaluate the double quotes in the find output correctly for the array initialization.
To iterate over the files, just do:
for f in "${FILES[#]}"; do
# Do something with $f
done
In some cases, for instance if you just need to copy or move a list of files, you can pipe that list to awk as well.
The escaped quotes \" ... \" around the field $0 are important (in short: your files, one line of the list = one file).
find . -iname "foo*" | awk '{print "mv \""$0"\" ./MyDir2" | "sh" }'
Ok - my first post on Stack Overflow!
Though my problems with this have always been in csh, not bash, the solution I present will, I'm sure, work in both. The issue is with the shell's interpretation of what "ls" returns. We can remove "ls" from the problem by simply using the shell expansion of the * wildcard, but this gives a "no match" error if there are no files in the current (or specified) folder. To get around this, we simply extend the expansion to include dot-files, thus: * .* - this will always yield results, since the entries . and .. are always present. So in csh we can use this construct ...
foreach file (* .*)
echo $file
end
if you want to filter out the standard dot-files then that is easy enough ...
foreach file (* .*)
if ("$file" == .) continue
if ("file" == ..) continue
echo $file
end
The code in the first post on this thread would be written thus:-
getlist() {
for f in * .*
do
echo "File found: $f"
# do something useful
done
}
Hope this helps!
Another solution for the job...
The goal was to:
select/filter filenames recursively in directories
handle each name (whatever spaces appear in the path...)
#!/bin/bash -e
## Trick in order to handle files with spaces in their path...
OLD_IFS=${IFS}
IFS=$'\n'
files=($(find ${INPUT_DIR} -type f -name "*.md"))
for filename in ${files[*]}
do
# do your stuff
# ....
done
IFS=${OLD_IFS}
A breadth-first listing is important here. Also, limiting the search depth would be nice.
$ find . -type d
/foo
/foo/subfoo
/foo/subfoo/subsub
/foo/subfoo/subsub/subsubsub
/bar
/bar/subbar
$ find . -type d -depth
/foo/subfoo/subsub/subsubsub
/foo/subfoo/subsub
/foo/subfoo
/foo
/bar/subbar
/bar
$ < what goes here? >
/foo
/bar
/foo/subfoo
/bar/subbar
/foo/subfoo/subsub
/foo/subfoo/subsub/subsubsub
I'd like to do this using a bash one-liner, if possible. If there were a javascript-shell, I'd imagine something like
bash("find . -type d").sort( function (x) x.findall(/\//g).length; )
The find command supports the -printf option, which recognizes a lot of placeholders.
One such placeholder is %d, which renders the depth of the given path, relative to where find started.
Therefore you can use the following simple one-liner:
find -type d -printf '%d\t%P\n' | sort -r -nk1 | cut -f2-
It is quite straightforward, and does not depend on heavy tooling like perl.
How it works:
it internally generates a list of files, each rendered as a two-field line
the first field contains the depth, which is used for (reverse) numerical sorting and is then cut away
the result is a simple file listing, one file per line, in deepest-first order
If you want to do it using standard tools, the following pipeline should work:
find . -type d | perl -lne 'print tr:/::, " $_"' | sort -n | cut -d' ' -f2
That is,
find and print all the directories here in depth first order
count the number of slashes in each directory and prepend it to the path
sort by depth (i.e., number of slashes)
extract just the path.
To limit the depth found, add the -maxdepth argument to the find command.
If you want the directories listed in the same order that find output them, use "sort -n -s" instead of "sort -n"; the "-s" flag stabilizes the sort (i.e., preserves input order among items that compare equally).
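For example, a sketch combining both suggestions (limit the search to three levels and keep find's order among equal depths):
find . -maxdepth 3 -type d | perl -lne 'print tr:/::, " $_"' | sort -n -s | cut -d' ' -f2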
You can use the find command:
find /path/to/dir -type d
So the example below lists the directories in the current directory:
find . -type d
My feeling is that this is a better solution than the ones mentioned previously. It involves grep and such, plus a loop, but I find it works very well, specifically for cases where you want results line by line rather than waiting for the whole find to finish first.
It is more resource intensive because of:
Lots of forking
Lots of finds
Each directory before the current depth is hit by find as many times as there are levels of total depth in the file structure (this shouldn't be a problem if you have practically any amount of RAM...)
This is good because:
It uses bash and basic gnu tools
It can be broken whenever you want (like you see what you were looking for fly by)
It works per line and not per find, so subsequent commands don't have to wait for a find and a sort
It works based on the actual file system separation, so if you have a directory with a slash in it, it won't be listed deeper than it is; if you have a different path separator configured, you still are fine.
#!/bin/bash
depth=0
while find -mindepth $depth -maxdepth $depth | grep '.'
do
depth=$((depth + 1))
done
You can also fit it onto one line fairly(?) easily:
depth=0; while find -mindepth $depth -maxdepth $depth | grep --color=never '.'; do depth=$((depth + 1)); done
But I prefer small scripts over typing...
I don't think you could do it using built-in utilities, since when traversing a directory hierarchy you almost always want a depth-first search, either top-down or bottom-up. Here's a Python script that will give you a breadth-first search:
import os, sys
rootdir = sys.argv[1]
queue = [rootdir]
while queue:
file = queue.pop(0)
print(file)
if os.path.isdir(file):
queue.extend(os.path.join(file,x) for x in os.listdir(file))
Edit:
Using os.path-module instead of os.stat-function and stat-module.
Using list.pop and list.extend instead of del and += operators.
I tried to find a way to do this with find but it doesn't appear to have anything like a -breadth option. Short of writing a patch for it, try the following shell incantation (for bash):
LIST="$(find . -mindepth 1 -maxdepth 1 -type d)";
while test -n "$LIST"; do
for F in $LIST; do
echo $F;
test -d "$F" && NLIST="$NLIST $(find $F -maxdepth 1 -mindepth 1 -type d)";
done;
LIST=$NLIST;
NLIST="";
done
I sort of stumbled upon this accidentally so I don't know if it works in general (I was testing it only on the specific directory structure you were asking about)
If you want to limit the depth, put a counter variable in the outer loop, like so (I'm also adding comments to this one):
# initialize the list of subdirectories being processed
LIST="$(find . -mindepth 1 -maxdepth 1 -type d)";
# initialize the depth counter to 0
let i=0;
# as long as there are more subdirectories to process and we haven't hit the max depth
while test "$i" -lt 2 -a -n "$LIST"; do
# increment the depth counter
let i++;
# for each subdirectory in the current list
for F in $LIST; do
# print it
echo $F;
# double-check that it is indeed a directory, and if so
# append its contents to the list for the next level
test -d "$F" && NLIST="$NLIST $(find $F -maxdepth 1 -mindepth 1 -type d)";
done;
# set the current list equal to the next level's list
LIST=$NLIST;
# clear the next level's list
NLIST="";
done
(replace the 2 in -lt 2 with the depth)
Basically this implements the standard breadth-first search algorithm using $LIST and $NLIST as a queue of directory names. Here's the latter approach as a one-liner for easy copy-and-paste:
LIST="$(find . -mindepth 1 -maxdepth 1 -type d)"; let i=0; while test "$i" -lt 2 -a -n "$LIST"; do let i++; for F in $LIST; do echo $F; test -d "$F" && NLIST="$NLIST $(find $F -maxdepth 1 -mindepth 1 -type d)"; done; LIST=$NLIST; NLIST=""; done
Without the desired ordering:
find . -maxdepth <depth> -type d
To get the desired ordering, you have to do the recursion yourself, with this small shell script:
#!/bin/bash
r ()
{
let level=$3+1
if [ $level -gt $4 ]; then return 0; fi
cd "$1"
for d in *; do
if [ -d "$d" ]; then
echo $2/$d
fi;
done
for d in *; do
if [ -d "$d" ]; then
(r "$d" "$2/$d" $level $4)
fi;
done
}
r "$1" "$1" 0 "$2"
Then you can call this script with parameters base directory and depth.
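For example, assuming the script above is saved as breadth.sh (a hypothetical name) and made executable:
./breadth.sh /foo 3    # list directories under /foo, up to three levels deep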
Here's a possible way, using find. I've not thoroughly tested it, so user beware...
depth=0
output=$(find . -mindepth $depth -maxdepth $depth -type d | sort);
until [[ ${#output} -eq 0 ]]; do
echo "$output"
let depth=$depth+1
output=$(find . -mindepth $depth -maxdepth $depth -type d | sort)
done
Something like this:
find . -type d |
perl -lne'push @_, $_;
print join $/,
sort {
length $a <=> length $b ||
$a cmp $b
} @_ if eof'