how to exclude all subdirectories of a given directory in the search path of the find command in unix - linux

I need to backup all the directory hierarchy of our servers, thus I need to list all the sub directories of some of the directories in the server.
The problem is that one of those sub directories contains tens of thousands of sub directories (a file holding only the names of the sub directories could take a couple of hundred megabytes, and the corresponding find command takes very long).
For example, if I have a directory A and one sub directory A/a that contains tens of thousands of sub directories, I want to use the find command to list all the sub directories of A excluding all the sub directories of A/a but not excluding A/a itself.
I tried many variations of -prune using the answers in this question to no avail.
Is there a way to use the find command in UNIX to do this?
UPDATE:
The answer by @devnull worked very well, but now I have another problem, so I will refine my question a little:
I used the following command:
find /var/www -type d \( ! -wholename "/var/www/web-release-data/*" ! -wholename "/var/www/web-development-data/*" \)
The new problem is that find, for some reason, still traverses the whole directory tree of "/var/www/web-release-data/" and "/var/www/web-development-data/", so it is very slow, and I fear it could take hours.
Is there any way to make find completely exclude those directories and not traverse their respective directory hierarchies?

The following should work for you:
find A -type d \( ! -wholename "A/a/*" \)
This would list all subdirectories of A including A/a but excluding subdirectories of A/a.
Example:
$ mkdir -p A/{a..c}/{1..4}
$ find A -type d \( ! -wholename "A/a/*" \)
A
A/c
A/c/4
A/c/2
A/c/3
A/c/1
A/a
A/b
A/b/4
A/b/2
A/b/3
A/b/1

Another solution:
find A \! -path "A/a/*"
If you don't want a as well, use
find A \! -path "A/a/*" -a \! -path "A/a"

Have you tried rsync(1)? It has an option --exclude=PATTERN which might work well here:
rsync -avz --exclude=A/a <source> <target>
Using rsync you wouldn't need to use find(1)

To exclude 2 subdirs:
find . -type d ! -wholename "dir/name/*" -a ! -wholename "dir/name*"

To answer your updated question, you can do
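find /var/www \( -wholename "/var/www/web-release-data/*" -o -wholename "/var/www/web-development-data/*" \) -prune -o -type d -print
The parentheses are needed because -o binds more loosely than the implicit -a; without them -prune would apply only to the second pattern.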
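A throwaway tree can confirm which directories a -prune expression actually descends into. The following is only a sketch: the directory names big, huge and small are invented for the demonstration, and it assumes a find that supports -path (GNU and BSD find both do).

```shell
# Build a scratch tree with two "large" directories and one normal one.
top=$(mktemp -d)
mkdir -p "$top/big/x1/deep" "$top/big/x2/deep" \
         "$top/huge/z1/deep" "$top/small/y1/deep"

# Prune everything *below* big and huge; the two directories themselves
# do not match the "/*" patterns, so they are still printed.
# The parentheses make -prune apply to both -path tests.
listing=$(cd "$top" &&
    find . \( -path './big/*' -o -path './huge/*' \) -prune -o -type d -print |
    LC_ALL=C sort)
printf '%s\n' "$listing"

rm -rf "$top"
```

Note that find still reads the entries directly inside big and huge in order to test them against the patterns; what it skips is descending any further.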

Related

How can I remove specific directories that all start with a common letter?

I have many EC2 instance folders in a directory that I need to delete. Using -delete doesn't work because the directories are not empty. I tried looking for a way to get -rmdir -f to work, with no success. The instance folder names all start with "i-", which led me to try a wildcard like "i-*" to delete all directories starting with those characters. How can I do this? The directories will never be empty.
Assuming your current dir is the folder in question, how about:
find . -type d -name 'i-*'
If that lists the directories you want to remove, then change it to:
find . -type d -name 'i-*' -exec rm -r {} \;
In the command-line shell (bash, the Bourne Again shell, or similar):
rm -r i-*
will remove every file or directory in the current directory whose name begins with "i-", together with everything inside it (-r means recursive).
To delete the directories matching the pattern graphene-80* directly under /tmp, use
rm -rf /tmp/graphene-80*/
Here, the trailing / ensures that only directories whose names match the graphene-80* pattern are deleted (or symbolic links to directories), and not files etc.
To find the matching directories elsewhere under /tmp and delete them wherever they may be, use
find /tmp -type d -name 'graphene-80*' -prune -exec rm -rf {} +
To additionally see the names of the directories as they are deleted, insert -print before -exec.
The two tests -type d and -name 'graphene-80*' select directories with the names that we're looking for. The -prune removes each found directory from the search path (we don't want to look inside these directories as they are being deleted), and the -exec, finally, does the actual removal by calling rm.
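The same pattern can be tried safely in a scratch directory before pointing it at real data. This is a minimal sketch; the file and directory names below are made up, and it runs the -print variant first as a dry run.

```shell
# Scratch tree: two matching directories, one matching regular file,
# and one unrelated directory.
scratch=$(mktemp -d)
mkdir -p "$scratch/graphene-801/sub" "$scratch/nested/graphene-802" "$scratch/keep"
: > "$scratch/graphene-803"   # regular file; -type d leaves it alone

# Dry run: show what would be removed, without removing anything.
find "$scratch" -type d -name 'graphene-80*' -prune -print

# Real run: -prune stops find from descending into directories
# that rm is about to delete.
find "$scratch" -type d -name 'graphene-80*' -prune -exec rm -rf {} +

remaining=$(cd "$scratch" && find . | LC_ALL=C sort)
printf '%s\n' "$remaining"

rm -rf "$scratch"
```

Only the matching directories disappear; the regular file with a matching name survives because of -type d.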

Exclude range of directories in find command

I have a directory called test which has sub folders named like 01, 02, ..., 31. All of these sub folders contain .bz2 files. I need to search for all the files with the .bz2 extension using the find command, but excluding a particular range of directories. I know about find . -name "*.bz2" -not -path "./01/*", but repeating -not -path "./01/*" would be painful if I wanted to skip 10 directories. So how would I skip the 01..19 subdirectories in my find command?
You can use wildcards in the pattern for the option -not -path:
find ./ -type f -name "*.bz2" -not -path "./0*/*" -not -path "./1*/*"
This will exclude all directories starting with 0 or 1. Or even better:
find ./ -type f -name "*.bz2" -not -path "./[01]*/*"
Firstly, you can help find by using -prune rather than -not -path - that will avoid even looking inside the relevant directories.
To your main point - you can build a wildcard for your example (numeric 01 to 19):
find . -path './0[1-9]' -prune -o -path './1[0-9]' -prune -o -print
If your range is less convenient (e.g. 05 to 25) you might want to build the range into a bash variable, then interpolate that into the find command:
a=("-path ./"{05..25}" -prune -o")
find . ${a[*]} -print
(you might want to echo "${a[*]}" or printf '%s\n' ${a[*]} to see how it's working)
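The same idea can be written without relying on word splitting of an unquoted array, by building the argument list one word at a time. The sketch below is an illustration only: find_outside_range is a hypothetical helper name, it expects plain integers (5, not 05) for the range, and it assumes two-digit directory names directly under the root.

```shell
# Hypothetical helper: list *.bz2 files under $1, skipping the two-digit
# subdirectories numbered $2 through $3 (inclusive).
find_outside_range() {
    root=$1
    n=$2
    last=$3
    set --                      # reuse the positional parameters as an argument list
    while [ "$n" -le "$last" ]; do
        if [ "$#" -gt 0 ]; then
            set -- "$@" -o      # join the -path tests with -o
        fi
        set -- "$@" -path "$(printf './%02d' "$n")"
        n=$((n + 1))
    done
    # Prune the listed directories, print matching files everywhere else.
    (cd "$root" && find . \( "$@" \) -prune -o -type f -name '*.bz2' -print)
}

# Demonstration on a scratch tree.
scratch=$(mktemp -d)
for d in 04 05 15 25 26; do
    mkdir "$scratch/$d" && : > "$scratch/$d/data.bz2"
done
found=$(find_outside_range "$scratch" 5 25 | LC_ALL=C sort)
printf '%s\n' "$found"
rm -rf "$scratch"
```

Because every word is a separate argument, directory names never pass through the shell's word splitting or globbing.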
Personally, I find the find command somewhat cumbersome as a standalone tool. Therefore, I always end up using find just for the recursive file search and grep for the actual exclusion/inclusion. Finally, I hand the results to a third command that performs the action, such as rm to remove files.
My generic command would look something like this:
find [root-path] | grep (-v)? -E "cond1|cond2|...|condN" | [action-performing-tool]
root-path is where to start the search recursively
the optional -v inverts the matching results.
cond1 - condN are the conditions for the matching. When -v is involved, these are the conditions not to match.
the action-performing-tool does the actual work
For example you want to remove all files not matching some conditions in the current directory:
find . -not -name "\." | grep -v -E "cond1|cond2|cond3|...|condN" | xargs rm -rf
As you can see, we search in the current directory, indicated by the dot as root-path; then we invert the matching results, because we want all files not matching our conditions; and finally we pass all files found to rm in order to delete them. I add -rf to delete recursively and without prompting. I used the find command with -not -name "." to exclude the current directory, normally indicated by a dot.
For the actual question: assume we have a directory containing .git and .metadata directories and we want to exclude them from our search:
find . -not -name "\." | grep -v -E "\.git|\.metadata" | [action-performing-tool]
Hope that helps!
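One caveat with this kind of pipeline is that file names containing spaces or newlines can be split by xargs. A possible variant, assuming GNU find, GNU grep and GNU xargs (the -print0, -z and -0/-r options are not part of POSIX), passes NUL-terminated names through every stage. The file names below are invented for the demonstration, and printf stands in for the action-performing tool.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/.git/objects" "$scratch/.metadata" "$scratch/src dir"
: > "$scratch/.git/config"
: > "$scratch/.metadata/index"
: > "$scratch/src dir/main file.c"   # name with spaces
: > "$scratch/digit.txt"             # contains "git" but is not under .git

# NUL-terminated names survive spaces and newlines at every stage.
# The pattern only matches whole path components named .git or .metadata.
kept=$(cd "$scratch" &&
    find . -type f -print0 |
    grep -z -v -E '/\.(git|metadata)(/|$)' |
    xargs -0 -r printf '%s\n' |
    LC_ALL=C sort)
printf '%s\n' "$kept"

rm -rf "$scratch"
```

Anchoring the pattern on path separators also avoids accidentally excluding files such as digit.txt.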
If you want to exclude a child directory under a parent directory, then this might be useful:
E.g. you have a parent directory "ParentDir" with two child directories, "Child1" and "Child2". You want to read files from "Child2" only and skip "Child1". Then this will help:
find ./ParentDir ! -path "./ParentDir/Child1*" -name "*.<extension>"

How can I recursively delete a set of directories across a large array of different folders using bash?

I have several config folders (ex: .gnome, .mozilla) that I need to delete across a large array of directories. They all start with two alphabetical characters (ex: ag52156,ge51789) and are located in the same place.
I don't write bash so I wouldn't know how to start tackling this in the first place - but what should I look into so that I can write this?
Try this :
find [a-z][a-z]* -type d \( -name .gnome -o -name .mozilla \) -exec rm -r {} \;

Remove files for a lot of directories - Linux

How can I remove all .txt files present in several directories
Dir1 >
Dir11/123.txt
Dir12/456.txt
Dir13/test.txt
Dir14/manifest.txt
In my example I want to run the remove command from Dir1.
I know the linux command rm, but i don't know how can I make this works to my case.
PS.: I'm using ubuntu.
To do what you want recursively, find is the most used tool in this case. Combined with the -delete switch, you can do it with a single command (no need to use -exec (and forks) in find like other answers in this thread) :
find Dir1 -type f -name "*.txt" -delete
If you use bash 4 or later, you can also do:
( shopt -s globstar; rm Dir1/**/*.txt )
We're not going to enter sub directories so no need to use find; everything is at the same level. I think this is what you're looking for: rm */*.txt
Before you run this you can try echo */*.txt to see if the correct files are going to be removed.
Using find would be useful if you want to search subfolders of subfolders, etc.
There is no Dir1 in the current folder so don't do find Dir1 .... If you run the find from the prompt above this will work:
find . -type f -name "*.txt" -delete
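Since -delete acts immediately, it can be worth running the same expression with -print first. The sketch below does that on a scratch tree with invented file names; note that -delete is an extension found in GNU and BSD find rather than POSIX, and that it must come after the tests, because find evaluates the expression left to right.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/Dir11" "$scratch/Dir12/deeper"
: > "$scratch/Dir11/123.txt"
: > "$scratch/Dir12/456.txt"
: > "$scratch/Dir12/deeper/notes.txt"
: > "$scratch/Dir12/keep.log"

# Dry run: list what the delete command would remove.
find "$scratch" -type f -name '*.txt' -print

# Same tests, now with -delete as the final action.
find "$scratch" -type f -name '*.txt' -delete

left=$(cd "$scratch" && find . -type f | LC_ALL=C sort)
printf '%s\n' "$left"

rm -rf "$scratch"
```

Only the non-.txt file is left afterwards; the directories themselves are untouched.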

Find Directories With No Files in Unix/Linux

I have a list of directories
/home
/dir1
/dir2
...
/dir100
Some of them have no files in them. How can I use Unix find to find those?
I tried
find . -name "*" -type d -size 0
Doesn't seem to work.
Does your find have predicate -empty?
You should be able to use find . -type d -empty
If you're a zsh user, you can always do this. If you're not, maybe this will convince you:
echo **/*(/^F)
**/* will expand to every child node of the present working directory and the () is a glob qualifier. / restricts matches to directories, and F restricts matches to non-empty ones. Negating it with ^ gives us all empty directories. See the zshexpn man page for more details.
-empty reports empty leaf dirs.
If you want to find empty trees then have a look at:
http://code.google.com/p/fslint/source/browse/trunk/fslint/finded
Note that script can't be used without the other support scripts,
but you might want to install fslint and use it directly?
You can also use:
find . -type d -links 2
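On most Unix filesystems a directory has 2 links (its entry in the parent and its own .), plus one for the .. of each subdirectory. Regular files do not add to the count.
As a result, the answer by Pimin Konstantin Kefalou prints directories that have only 2 links, which means they have no subdirectories but may still contain files.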
The easiest way I have found is:
for directory in $(find . -type d); do
if [ -z "$(find "$directory" -maxdepth 1 -type f)" ]; then echo "$directory"
fi
done
The -z test prints the directories that contain no regular files. "$directory" is quoted, but the for loop itself still splits names that contain spaces.
You can replace . by your reference folder.
I haven't been able to do it with one find instruction.
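A variant that copes with spaces in names lets find hand the directories to a small inline sh script instead of looping over its output. This is only a sketch: it assumes a find with -maxdepth (GNU and BSD find have it), the directory names are invented, and it reports directories with no regular file directly inside them, so a directory holding only subdirectories is also listed.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/has file" "$scratch/only subdir/child" "$scratch/empty"
: > "$scratch/has file/a.txt"
: > "$scratch/only subdir/child/b.txt"

# find passes directory names as separate arguments to sh, so no
# word splitting happens on spaces.
no_files=$(cd "$scratch" && find . -type d -exec sh -c '
    for dir do
        # Print the directory if no regular file sits directly inside it.
        if [ -z "$(find "$dir" -maxdepth 1 -type f)" ]; then
            printf "%s\n" "$dir"
        fi
    done' sh {} + | LC_ALL=C sort)
printf '%s\n' "$no_files"

rm -rf "$scratch"
```

The top-level directory appears in the result as well, since it contains only subdirectories.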
