"find" command but it stops going deep if it finds a directory starting with "." - linux

I have to make a script that goes through a whole folder (/home, in my case).
I have to save all the files except the ones that start with ., and also, if I find a directory that starts with ., I don't need to care what's inside; I don't have to read it at all.
For the first part we use the command
for path in $(find /home \! -name ".*");do
where path is a variable that contains the path. But we don't know how to do the directory part.
I thought I'd split the path on the / and then check whether any component starts with a ".". In that case, an if would skip saving the file, but I don't know how to split a string, save the pieces in a variable, and then loop over them.

You can prune all files whose names start with a dot (.).
From the man page of GNU find:
-prune True; if the file is a directory, do not descend into it. If -depth is given, false; no effect. Because -delete implies -depth, you cannot usefully use -prune and -delete together.
You should not loop over the result from find. You will get unexpected results if you have filenames with spaces or newlines.
Use xargs or -exec, e.g.
find /home -path "*/.*" -prune -o -print0 | xargs -0I{} sh -c 'echo "doing something with $1"' sh {}
or
find /home -path "*/.*" -prune -o -exec sh -c 'for i; do echo "doing something with $i"; done' sh {} +
The -prune part removes all filenames (files and directories) starting with a dot and does not descend into directories starting with a dot.
All other filenames are printed with a NUL character instead of a newline (-o -print0) and piped to xargs, or a shell script is executed with your action (as few times as possible).
To save all filenames into a file:
find /home -path "*/.*" -prune -o -print > allfiles.txt

Try this
for path in $(find /home -type d -name ".*" -prune -o -type f \! -name ".*" -print);do echo $path; done

I think I would do something like this:
for path in $(find . -type f | egrep -v '/\.[^\/]+\/'); do
...
Note that you may have to take extra steps if some of your files have spaces in their names.
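One way to take those extra steps, as a rough sketch (assuming GNU find and GNU grep for -print0 and the NUL-delimited -z mode, plus bash for read -d ''):
find . -type f -print0 | grep -vzE '/\.[^/]+/' |
while IFS= read -r -d '' path; do
    echo "doing something with $path"   # placeholder action
done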

Related

find command: delete everything but one folder

I have this command:
find ~/Desktop/testrm -mindepth 1 -path ~/Desktop/testrm/.snapshot -o -mtime +2 -prune -exec rm -rf {} +
I want it to work as is, but it must avoid removing a specific directory ($ROOT_DIR/$DATA_DIR):
it must remove the files inside that directory, but not the directory itself
the "r" flag of rm is needed because it also has to delete other directories
-prune is not suitable since it would discard that directory's contents and its sub-directories as well
You can exclude individual paths using the short circuiting behavior of -o (like you already did with ~/Desktop/testrm/.snapshot).
However, for each excluded path you also have to exclude all of its parent directories. Otherwise you would delete a/b/c by deleting a/b/ or a/ with rm -rf.
In the following script, the function orParents generates a part of the find command. Example:
find $(orParents a/b/c) ... would run
find -path a/b/c -o -path a/b -o -path a -o ....
#! /usr/bin/env bash
orParents() {
    p="$1"
    while
        printf -- '-path %q -o ' "$p"   # trailing space keeps the generated words separated
        p=$(dirname "$p")
        [ "$p" != . ]
    do :; done
}
find ~/Desktop/testrm -mindepth 1 \
$(orParents "$ROOT_DIR/$DATA_DIR") -path ~/Desktop/testrm/.snapshot -o \
-mtime +2 -prune -exec rm -rf {} +
Warning: You have to make sure that $ROOT_DIR/$DATA_DIR does not end with a / and does not contain glob characters like *, ?, and [].
Spaces are ok as printf %q escapes them correctly. However, find -path interprets its argument as a glob pattern independently. We could do a double quoting mechanism. Maybe something like printf %q "$(sed 's/[][*?\]/\\&/' <<< "$p")", but I'm not so sure about how exactly find -path interprets its argument.
Alternatively, you could write a script isParentOf and do ...
find ... -exec isParentOf "$ROOT_DIR/$DATA_DIR" {} \; -o ...
... to exclude $ROOT_DIR/$DATA_DIR and all of its parents. This is probably safer and more portable, but slower and a hassle to set up (find -exec bash -c ... and so on) if you don't want to add a script file to your path.
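A rough sketch of what such an isParentOf script could look like (the name and interface are only what this answer suggests; it is an untested illustration that compares paths textually):
#!/bin/sh
# isParentOf PROTECTED CANDIDATE
# Exit 0 if CANDIDATE is PROTECTED itself or one of its parent directories,
# so that "find ... -exec isParentOf "$ROOT_DIR/$DATA_DIR" {} \; -o ..." skips it.
protected=$1
candidate=$2
p=$protected
while :; do
    [ "$p" = "$candidate" ] && exit 0
    parent=$(dirname "$p")
    [ "$parent" = "$p" ] && exit 1   # reached / or . without a match
    p=$parent
done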

Find all files contained in a directory with a given name

I would like to recursively find all files contained in a directory named "name1" or "name2"
for instance:
structure/of/dir/name1/file1.a
structure/of/dir/name1/file2.b
structure/of/dir/name1/file3.c
structure/of/dir/name1/subfolder/file1s.a
structure/of/dir/name1/subfolder/file2s.b
structure/of/dir/name2/file1.a
structure/of/dir/name2/file2.b
structure/of/dir/name2/file3.c
structure/of/dir/name2/subfolder/file1s.a
structure/of/dir/name2/subfolder/file2s.b
structure/of/dir/name3/name1.a ←this should not show up in the result
structure/of/dir/name3/name2.a ←this should not show up in the result
so when I start my magic command the expected output should be this and only this:
structure/of/dir/name1/file1.a
structure/of/dir/name1/file2.b
structure/of/dir/name1/file3.c
structure/of/dir/name2/file1.a
structure/of/dir/name2/file2.b
structure/of/dir/name2/file3.c
I scripted something, but it does not work because it matches the file names as well, not only the folder names:
for entry in $(find $SEARCH_DIR -type f | grep 'name1\|name2');
do
echo "FileName: $(basename $entry)"
done
If you can use the -regex option, avoiding subfolders with [^/]:
~$ find . -type f -regex ".*name1/[^/]*" -o -regex ".*name2/[^/]*"
./structure/of/dir/name2/file1.a
./structure/of/dir/name2/file3.c
./structure/of/dir/name2/subfolder
./structure/of/dir/name2/file2.b
./structure/of/dir/name1/file1.a
./structure/of/dir/name1/file3.c
./structure/of/dir/name1/file2.b
I'd use -path and -prune for this, since it's standard (unlike -regex which is GNU specific).
find . \( -path "*/name1/*" -o -path "*/name2/*" \) -prune -type f -print
But more importantly, never do for file in $(find...). Use find's -exec or a while read loop instead, depending on what you really need to do with the matching files. See UsingFind and BashFAQ 20 for more on how to handle find safely.
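A minimal sketch of that while read pattern (assuming bash, and GNU find for -print0):
find . \( -path "*/name1/*" -o -path "*/name2/*" \) -prune -type f -print0 |
while IFS= read -r -d '' file; do
    echo "FileName: $(basename "$file")"
done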

Getting all files from various folders and copying them with unique names

Currently using this command to get all my "fanart" from my TV folder, and dump it into a single folder.
find /volume1/tv/ -type f \( -name '*fanart.jpg'* -o -path '*/fanart/*.jpg' -o -path '*/extrafanart/*.jpg' \) -exec cp {} /volume1/tv/_FANART \;
Here's the issue: a lot of these files have the same name, and can't be dumped into the same folder. Example:
Folder A
fanart.jpg
Folder B
fanart.jpg
Is there a way to copy these files from their respective folders and give them a unique name in the destination folder? Name needn't be anything descriptive, random is just fine.
Thanks!
find /volume1/tv/ -type f \( -name '*fanart.jpg'* -o -path '*/fanart/*.jpg' -o -path '*/extrafanart/*.jpg' \) -exec cp --backup=numbered {} /volume1/tv/_FANART \;
cp --backup=numbered {}
If the file exists, this will not overwrite but make a backup with a number assigned.
The backup files may be hidden by your file manager; press Ctrl+H to show hidden files.
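For illustration (GNU cp; the paths are hypothetical), the copy already in the destination is renamed to a numbered backup and the newest copy keeps the plain name:
$ cp --backup=numbered "Folder A/fanart.jpg" /volume1/tv/_FANART/
$ cp --backup=numbered "Folder B/fanart.jpg" /volume1/tv/_FANART/
$ ls /volume1/tv/_FANART
fanart.jpg  fanart.jpg.~1~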
You could copy the files while giving them names according to their locations in the original directory tree. For instance (":" is legal but unusual in filenames), your "find" command could call a shell script (rather than "cp" directly), which might look like this:
#!/bin/sh
case "x$1" in
    x/volume1/tv/_FANART/*)
        ;;
    *)
        target=`echo "$1" | sed -e 's,^/volume1/tv/,,' -e s,/,:,g`
        cp "$1" "$2/$target"
        ;;
esac
and the corresponding "-exec" would be
-exec myscript "{}" /volume1/tv/_FANART \;
By the way, the source and destination in the original example are in the same directory tree, "/volume1/tv", which is why the sample script uses a case statement - to exclude files already copied to the _FANART folder.
If you want to use the md5sum as the new name:
find /volume1/tv/ -type d -path '/volume1/tv/_FANART' -prune -o -type f \( -name '*fanart.jpg'* -o -path '*/fanart/*.jpg' -o -path '*/extrafanart/*.jpg' \) -exec sh -c 'md5=$(md5sum < "$0") && md5=${md5%% *}.jpg && echo cp "$0" "/volume1/tv/_FANART/$md5"' {} \;
Everything happens in the sh command (the commands are separated by &&, but I omitted the && below for clarity):
md5=$(md5sum < "$0")
md5=${md5%% *}.jpg
cp "$0" "/volume1/tv/_FANART/$md5"'
The $0 expands to the filename being processed. We first compute the md5sum of the file, then keep only the hash itself (md5sum prints a hyphen after the hash when reading from stdin) and append .jpg, and finally we copy the file into the target folder under the computed name.
Notes.
I added
-type d -path '/volume1/tv/_FANART' -prune -o
to your command to omit this folder, since you very likely don't want to process it; it would actually be weird to process it, as its content is changed throughout find's traversal.
I left an echo in the command, so that absolutely nothing is copied (as is, it's 100% safe, you can just copy and paste it in your terminal): it only shows what commands are going to be performed (and you'll also see how fast/slow it is).
The command is 100% safe regarding funny filenames with spaces, newlines, globs, etc.
I used md5sum < file and not md5sum file, because if the filename file contains special characters (like backslashes or newlines), md5sum (at least my version) prepends the hash with a backslash. Weird. By not passing a filename, we're safe; this won't happen.
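For illustration of that quirk (GNU coreutils; exact behavior may differ between versions), with a hypothetical empty file whose name contains a backslash:
$ md5sum 'weird\name.jpg'
\d41d8cd98f00b204e9800998ecf8427e  weird\\name.jpg
$ md5sum < 'weird\name.jpg'
d41d8cd98f00b204e9800998ecf8427e  -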

How to write a unix command or script to remove files of the same type in all sub-folders under current directory?

Is there a way to remove all temp files and executables under one folder AND its sub-folders?
All that I can think of is:
$rm -rf *.~
but this removes only the temp files under the current directory. It DOES NOT remove temp files in SUB-folders, and it doesn't remove any executables either.
I know there are similar questions which get very well answered, like this one:
find specific file type from folder and its sub folder
but that is a java code, I only need a unix command or a short script to do this.
Any help please?
Thanks a lot!
Perl from the command line; it should delete a file if its name ends with ~ or it is executable:
perl -MFile::Find -e 'find(sub{ unlink if -f and (/~\z/ or (stat)[2] & 0111) }, ".")'
You can achieve the result with find:
find /path/to/directory \( -name '*.~' -o \( -perm /111 -a -type f \) \) -exec rm -f {} +
This will execute rm -f <path> for any <path> under (and including) /path/to/directory which:
matches the glob expression *.~
or which has an executable bit set (be it owner, group or world)
The above applies to the GNU version of find.
A more portable version is:
find /path/to/directory \( -name '*.~' -o \( \( -perm -01 -o -perm -010 -o -perm -0100 \) \
-a -type f \) \) -exec rm -f {} +
find . -name "*~" -exec rm {} \;
or whatever pattern is needed to match the tmp files.
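For example, if your temp files also used a hypothetical .tmp suffix, the patterns could be combined:
find . \( -name "*~" -o -name "*.tmp" \) -exec rm {} \;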
If you want to use Perl to do it, use a specific module like File::Remove
This should do the job
find -type f -name "*~" -print0 | xargs -r -0 rm

Exclude list of files from find

If I have a list of filenames in a text file that I want to exclude when I run find, how can I do that? For example, I want to do something like:
find /dir -name "*.gz" -exclude_from skip_files
and get all the .gz files in /dir except for the files listed in skip_files. But find has no -exclude_from flag. How can I skip all the files in skip_files?
I don't think find has an option like this, but you could build the command using printf and your exclude list:
find /dir -name "*.gz" $(printf "! -name %s " $(cat skip_files))
Which is the same as doing:
find /dir -name "*.gz" ! -name first_skip ! -name second_skip .... etc
Alternatively you can pipe from find into grep:
find /dir -name "*.gz" | grep -vFf skip_files
This is what I usually do to remove some files from the result (in this case I looked for all text files but wasn't interested in a bunch of valgrind memcheck reports we have here and there):
find . -type f -name '*.txt' ! -name '*mem*.txt'
It seems to be working.
I think you can try something like:
find /dir \( -name "*.gz" ! -name skip_file1 ! -name skip_file2 ...so on \)
find /var/www/test/ -type f \( -iname "*.*" ! -iname "*.php" ! -iname "*.jpg" ! -iname "*.png" \)
The above command gives a list of all files, excluding files with the .php, .jpg and .png extensions. This command works for me in PuTTY.
Josh Jolly's grep solution works, but has O(N**2) complexity, making it too slow for long lists. If the lists are sorted first (O(N*log(N)) complexity), you can use comm, which has O(N) complexity:
find /dir -name '*.gz' |sort >everything_sorted
sort skip_files >skip_files_sorted
comm -23 everything_sorted skip_files_sorted | xargs . . . etc
See man comm on your system for details.
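As a tiny illustration, comm -23 keeps only the lines that appear in the first (sorted) file and not in the second:
$ printf '%s\n' a b c > everything_sorted
$ printf '%s\n' b > skip_files_sorted
$ comm -23 everything_sorted skip_files_sorted
a
c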
This solution will still visit all files (it doesn't exclude them from the find command itself), but its output skips the files from the exclusion list.
I found that useful while running a time-consuming command (find /dir -exec md5sum {} \;).
You can create a shell script to handle the skipping logic and run commands on the files found (make it executable with chmod, replace echo with other commands):
$ cat skip_file.sh
#!/bin/bash
found=$(grep "^$1$" files_to_skip.txt)
if [ -z "$found" ]; then
    # run your command
    echo "$1"
fi
Create a file named files_to_skip.txt with the list of files to skip (in the directory you are running from).
Then run find using it:
find /dir -name "*.gz" -exec ./skip_file.sh {} \;
This should work:
find * -name "*.gz" $(printf "! -path %s " $(<skip_files.txt))
Working out
Assuming skip_files has a filename on each line, you can get the list of filenames via $(<skip_files.txt). E.g. echo $(<skip_files.txt) should print them all out.
For each filename you want to have a ! -path filename expression. To build this, use $(printf "! -path %s " $(<skip_files.txt))
Then, put it together with a filter on -name "*.gz"
