How to remove all .svn directories from my application directories - linux

One of the tasks of an export tool I have in my application is to clean all .svn directories from my application directory tree. I am looking for a recursive command in the Linux shell that will traverse the entire tree and delete the .svn directories.
I am not using export, as this script will be used for some other file/directory names which are not related to SVN. I tried something like:
find . -name .svn | rm -fr
It didn't work...

Try this:
find . -name .svn -exec rm -rf '{}' \;
Before running a command like that, I often like to run this first:
find . -name .svn -exec ls '{}' \;

What you wrote sends a list of newline-separated file names (and paths) to rm, but rm doesn't know what to do with that input; it only expects command-line parameters.
xargs takes input, usually separated by newlines, and places them on the command line, so adding xargs makes what you had work:
find . -name .svn | xargs rm -fr
xargs is intelligent enough to pass rm only as many arguments as it can accept at once. Thus, if you had a million files, it might run rm roughly 16 times (1,000,000 / 65,000), assuming the system could fit about 65,002 arguments on one command line (65k file names, plus one for rm and one for -fr).
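If you are curious what the actual limit is on your system, GNU xargs can report it; a quick check (the --show-limits option is GNU-specific):
xargs --show-limits </dev/null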
As others have adeptly pointed out, the following also work:
find . -name .svn -exec rm -rf {} \;
find . -depth -name .svn -exec rm -fr {} \;
find . -type d -name .svn -print0|xargs -0 rm -rf
The first two -exec forms call rm once for each folder being deleted, so if you had 1,000,000 folders, rm would be invoked 1,000,000 times. This is certainly less than ideal. Newer implementations of find allow you to conclude the -exec clause with a + instead of \;, indicating that find should pass as many arguments as possible to each rm invocation:
find . -name .svn -exec rm -rf {} +
The last find/xargs version uses -print0, which makes find generate output terminated by \0 (NUL) rather than by newlines. Since filenames on POSIX systems may contain any character except \0 (and /), this is truly the safest way to make sure the arguments are correctly passed to rm or whatever application is being executed.
In addition, there is an -execdir option that runs rm from the directory in which the file was found, rather than from the base directory, and a -depth option that traverses the tree depth-first.
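For example, a sketch combining those two options (assuming GNU find):
find . -depth -name .svn -execdir rm -rf {} +
Note that GNU find refuses to use -execdir when the current directory is listed in your $PATH, as a safety measure.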

No need for pipes, xargs, exec, or anything:
find . -name .svn -delete
Edit: just kidding. Evidently -delete calls unlinkat() under the hood, so for directories it behaves like rmdir and will refuse to remove any directory that still contains files.
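If you still want to avoid -exec and xargs entirely, one hedged workaround (assuming GNU find) is to delete the contents of the .svn trees first, then the now-empty directories themselves:
find . -path '*/.svn/*' -delete
find . -type d -name .svn -empty -delete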

There are already many answers provided for deleting the .svn-directory. But I want to add, that you can avoid these directories from the beginning, if you do an export instead of a checkout:
svn export <url>

If you don't like to see a lot of
find: `./.svn': No such file or directory
warnings, then use the -depth switch:
find . -depth -name .svn -exec rm -fr {} \;

On Windows, you can use the following registry script to add "Delete SVN Folders" to your right-click context menu. Run it on any directory containing those pesky folders.
Windows Registry Editor Version 5.00
[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Folder\shell\DeleteSVN]
#="Delete SVN Folders"
[HKEY_LOCAL_MACHINE\SOFTWARE\Classes\Folder\shell\DeleteSVN\command]
#="cmd.exe /c \"TITLE Removing SVN Folders in %1 && COLOR 9A && FOR /r \"%1\" %%f IN (.svn) DO RD /s /q \"%%f\" \""

You almost had it. If you want to pass the output of a command as parameters to another one, you'll need to use xargs. Adding -print0 makes sure the script can handle paths with whitespace:
find . -type d -name .svn -print0|xargs -0 rm -rf

find . -name .svn | xargs rm -rf

Importantly, when you use the shell to delete .svn folders, you need the -depth argument to prevent find from trying to enter a directory that was just deleted, which produces error messages such as:
"find: ./.svn: No such file or directory"
So you can use the find command like this:
cd [dir_to_delete_svn_folders]
find . -depth -name .svn -exec rm -fr {} \;

Try this:
find . -name .svn -exec rm -rv {} \;
Read more about the find command at developerWorks.

Alternatively, if you want to export a copy without modifying the working copy, you can use rsync:
rsync -a --exclude .svn path/to/working/copy path/to/export
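If you want to preview what would be copied first, rsync's dry-run flag applies here too:
rsync -anv --exclude .svn path/to/working/copy path/to/export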

Related

How to "rm -rf" with excluding files and folders with the "find -o" command

I'm trying to use the find command, but still can't figure out how to pipe the find ... to rm -rf
Here is the directory tree for testing:
/path/to/directory
/path/to/directory/file1_or_dir1_to_exclude
/path/to/directory/file2_or_dir2_to_exclude
/path/to/directory/.hidden_file1_or_dir1_to_exclude
/path/to/directory/.hidden_file2_or_dir2_to_exclude
/path/to/directory/many_other_files
/path/to/directory/many_other_directories
Here is the command for removing the whole directory:
rm -rf /path/to/directory
But how to rm -rf while excluding files and folders?
Here is the man help for reference:
man find
-prune True; if the file is a directory, do not descend into it. If -depth is given, then -prune has no effect. Because -delete implies -depth, you cannot usefully use -prune and -delete together.
For example, to skip the directory `src/emacs' and all files and directories under it, and print the names of the other files found, do something like this:
find . -path ./src/emacs -prune -o -print
What's the -o in this find command? Does it mean "or"? I can't find the meaning of -o in the man page.
mkdir -p /path/to/directory
mkdir -p /path/to/directory/file1_or_dir1_to_exclude
mkdir -p /path/to/directory/file2_or_dir2_to_exclude
mkdir -p /path/to/directory/.hidden_file1_or_dir1_to_exclude
mkdir -p /path/to/directory/.hidden_file2_or_dir2_to_exclude
mkdir -p /path/to/directory/many_other_files
mkdir -p /path/to/directory/many_other_directories
I have tried to use this find command to exclude the .hidden_file1_or_dir1_to_exclude and then pipe it to rm, but this command does not work as expected.
cd /path/to/directory
find . -path ./.hidden_file1_or_dir1_to_exclude -prune -o -print | xargs -0 -I {} rm -rf {}
The meaning of rm -rf is to recursively remove everything in a directory tree.
The way to avoid recursively removing everything inside a directory is to get find to enumerate exactly the files you want to remove, and nothing else (and then of course you don't need rm at all; find knows how to remove files, too).
find . -depth -path './.hidden_file1_or_dir1_to_exclude/*' -o -delete
Using -delete turns on the -depth option, which makes -prune unavailable; instead, the logic is simply "delete if not in this tree." And indeed, as you seem to have discovered already, -o stands for "or".
The reason -delete enables -depth should be obvious; you can't traverse the files inside a directory after you have deleted it.
As an aside, you need to use -print0 in find if you use xargs -0. (This facility is a GNU extension and is not generally available on plain POSIX systems.)
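Since the exclusions in this example are all top-level entries and rm -rf handles recursion on its own, another hedged sketch is to enumerate only the depth-1 entries and skip the excluded names (add one ! -name test per exclusion):
find . -mindepth 1 -maxdepth 1 ! -name 'file1_or_dir1_to_exclude' ! -name '.hidden_file1_or_dir1_to_exclude' -print0 | xargs -0 rm -rf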
You need to separate files from directories to exclude:
find . -mindepth 1 \
    \( -path ./dir_to_exclude -o \
       -path ./.hidden_dir_to_exclude \) -type d -prune \
    -o \
    ! \( -path ./file_to_exclude -o \
         -path ./.hidden_file_to_exclude \) \
    -exec echo rm -rf {} \;
You can remove the echo once tested.

How to delete all subdirectories with a specific name

I'm working on Linux and there is a folder which contains lots of subdirectories. I need to delete all of the subdirectories which have the same name. For example,
dir
|---subdir1
|---subdir2
| |-----subdir1
|---file
I want to delete all of subdir1. Here is my script:
find dir -type d -name "subdir1" | while read directory ; do
rm -rf $directory
done
However, when I execute it, it seems that nothing happens.
I've also tried find dir -type d "subdir1" -delete, but still nothing happens.
If find finds the correct directories at all, these should work:
find dir -type d -name "subdir1" -exec echo rm -rf {} \;
or
find dir -type d -name "subdir1" -exec echo rm -rf {} +
(The echo is there to verify the command hits the files you wanted; remove it to actually run rm and remove the directories.)
Both piping to xargs and to while read have the downside that unusual file names will cause issues. Also, find -delete will only try to remove the directories themselves, not their contents. It will fail on any non-empty directories (but you should at least get errors).
With xargs, spaces separate words by default, so even file names with spaces will not work. read can deal with spaces, but in your command it's the unquoted expansion of $directory that splits the variable on spaces.
If your filenames don't have newlines or trailing spaces, this should work, too:
find ... | while read -r x ; do rm -rf "$x" ; done
With the globstar option (enable with shopt -s globstar, requires Bash 4.0 or newer):
rm -rf **/subdir1/
The drawback of this solution as compared to using find -exec or find | xargs is that the argument list might become too long, but that would require quite a lot of directories named subdir1. On my system, ARG_MAX is 2097152.
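To preview what the glob matches before deleting anything, a minimal dry run (printf is a builtin, so it isn't subject to the argument-length limit):
shopt -s globstar
printf '%s\n' **/subdir1/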
Using xargs:
find dir -type d -name "subdir1" -print0 |xargs -0 rm -rf
Some information not directly related to the question/problem:
find|xargs or find -exec
https://www.everythingcli.org/find-exec-vs-find-xargs/
From the question, it seems you've tried to use while with find. The following substitution may help you:
while IFS= read -rd '' dir; do rm -rf "$dir"; done < <(find dir -type d -name "subdir1" -print0)

cannot delete directory using shell command

I wanted to delete all directories whose names match the pattern RC_200. Here is what I did:
find -name "RC_200" -type d -delete
But it's complaining about this:
find: cannot delete './RC_200': Directory not empty
You should try:
find . -name RC_200 -type d -exec rm -r {} \;
You can see here what the command does.
Also, you can try what @anubhava recommended in a comment: end the -exec clause with + instead of \; (find . -name RC_200 -type d -exec rm -r {} +). That form is equivalent to this xargs solution:
find . -name RC_200 -type d -print0 | xargs -0 rm -r
xargs executes the command given as its parameter, using what it reads on stdin as the arguments. Here it uses rm -r to delete each directory and all of its children.
Maybe you need to run it with administrator privileges.
If your OS is Ubuntu, add sudo in front of your command, or use whatever elevation mechanism your OS provides.
Or
remove the file protection on the directory, if it has any.

find command in bash script resulting in "No such file or directory" error only for directories?

UPDATE 2014-03-21
So I realized I wasn't as efficient as I could be, as all the disks that I needed to "scrub" were under /media and named "disk1, disk2,disk3, etc." Here's the final script:
DIRTY_DIR="/media/disk*"
find $DIRTY_DIR -depth -type d -name .AppleDouble -exec rm -rf {} \;
find $DIRTY_DIR -depth -type d -name .AppleDB -exec rm -rf {} \;
find $DIRTY_DIR -depth -type d -name .AppleDesktop -exec rm -rf {} \;
find $DIRTY_DIR -type f -name ".*DS_Store" -exec rm -f {} \;
find $DIRTY_DIR -type f -name ".Thumbs.db" -exec rm -f {} \; # I know, I know, this is a Windows file.
Next will probably be to clean up the code even more and add features like logging and reporting results (through e-mail or otherwise), excluding system files and directories, and allowing people to customize the list of files/directories.
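As a rough illustration of that "customizable list" idea, here is a hedged sketch built on the same find commands used above (the pattern lists are placeholders):
DIRTY_DIR="/media/disk*"
DIR_PATTERNS=".AppleDouble .AppleDB .AppleDesktop"
FILE_PATTERNS=".*DS_Store .Thumbs.db"
for p in $DIR_PATTERNS; do
    find $DIRTY_DIR -depth -type d -name "$p" -exec rm -rf {} \;
done
for p in $FILE_PATTERNS; do
    find $DIRTY_DIR -type f -name "$p" -exec rm -f {} \;
done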
Thanks for all the help!
UPDATE
Before I incorporated the helpful suggestions provided by everyone, I performed some tests, the results of which were very interesting (see below).
As a test, I ran this command:
root@doi:~# find /media/disk3 -type d -name .AppleDouble -exec echo rm -rf {} \;
The results (which is what I expected):
rm -rf /media/disk3/Videos/Chorus/.AppleDouble
However, when I ran the actual command (without echo):
root@doi:~# find /media/disk3 -type d -name .AppleDouble -exec rm -rf {} \;
I received the same "error" output:
find: `/media/disk3/Videos/Chorus/.AppleDouble': No such file or directory
I put "error" in quotes because obviously the folder was removed, as verified by immediately running:
root@doi:~# find /media/disk3 -type d -name .AppleDouble -exec rm -rf {} \;
root@doi:~#
It seems as if the find command stored the original results, acted on them by deleting the directory, but then tried to visit it again? Or is the -f option of rm, which is supposed to ignore nonexistent files and arguments, being ignored? I note that when I ran tests with the rm command alone, without find, everything worked as expected. Running rm -rf non_existent_directory directly returned no error even though non_existent_directory was not there, and running rm -r non_existent_directory directly produced the expected:
rm: cannot remove 'non_existent_directory': No such file or directory
Should I use the -delete option instead of the -exec rm ... option? I had wanted to make the script as broadly applicable as possible for systems that didn't have -delete option for find.
Lastly, I don't presume it matters if /media/disk1, /media/disk2, ... are combined in an AUFS filesystem under /media/storage as the find command is operating on the individual disks themselves?
Thanks for all the help so far, guys. I'll publish the script when I'm done.
ORIGINAL POST
I'm writing a bash script to delete a few OS X remnants on my Lubuntu file shares. However, when executing this:
...
BASE_DIR="/media/disk" # I have 4 disks: disk1, disk2, ...
COUNTER=1
while [ $COUNTER -lt 5 ]; do # Iterate through disk1, disk2, ...
DIRTY_DIR=${BASE_DIR}$COUNTER # Look under the current disk counter /media/disk1, /media/disk2, ...
find $DIRTY_DIR -name \.AppleDouble -exec rm -rf {} \; # Delete all .AppleDouble directories
find $DIRTY_DIR -name ".*DS_Store" -exec rm -rf {} \; # Delete all .DS_Store and ._.DS_Store files
COUNTER=$(($COUNTER+1))
done
...
I see the following output:
find: /media/disk1/Pictures/.AppleDouble: No such file or directory
Before I added the -exec rm ... portion the script found the /media/disk1/Pictures/.AppleDouble directory. The script works properly for removing DS_Store files, but what am I missing for the find command for directories?
I'm afraid to screw too much with the -exec portion as I don't want to obliterate directories in error.
tl;dr - Pass -prune if you're deleting directories using find.
For anyone else who stumbles on this question. Running an example like this
find /media/disk3 -type d -name .AppleDouble -exec rm -rf {} \;
results in an error like
find: '/media/disk3/Videos/Chorus/.AppleDouble': No such file or directory
When finding and deleting directories with find, you'll often encounter this error because find records the directory so it can process its subdirectories, deletes it with -exec, and then tries to traverse into the directory, which no longer exists.
You can either pass -maxdepth 0 or -prune to prevent this issue. Like so:
find /media/disk3 -type d -name .AppleDouble -prune -exec rm -rf {} \;
Now it deletes the directories without any errors. Hurray! :)
You don't need to escape the dot in a shell glob pattern, since -name patterns are not regexes. So use .AppleDouble instead of \.AppleDouble:
find $DIRTY_DIR -name .AppleDouble -exec rm -rf '{}' \;
PS: I don't see $COUNTER being incremented anywhere in your script.

Argument list too long error for rm, cp, mv commands

I have several hundred PDFs under a directory in UNIX. The names of the PDFs are really long (approx. 60 chars).
When I try to delete all PDFs together using the following command:
rm -f *.pdf
I get the following error:
/bin/rm: cannot execute [Argument list too long]
What is the solution to this error?
Does this error occur for mv and cp commands as well? If yes, how to solve for these commands?
The reason this occurs is because bash actually expands the asterisk to every matching file, producing a very long command line.
Try this:
find . -name "*.pdf" -print0 | xargs -0 rm
Warning: this is a recursive search and will find (and delete) files in subdirectories as well. Tack on -f to the rm command only if you are sure you don't want confirmation.
You can do the following to make the command non-recursive:
find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm
Another option is to use find's -delete flag:
find . -name "*.pdf" -delete
tl;dr
It's a kernel limitation on the size of the command-line argument list. Use a for loop instead.
Origin of problem
This is a system issue, related to execve and the ARG_MAX constant. There is plenty of documentation about that (see man execve, Debian's wiki, ARG_MAX details).
Basically, the expansion produces a command (with its parameters) that exceeds the ARG_MAX limit.
On kernel 2.6.23, the limit was set at 128 kB. This constant has been increased and you can get its value by executing:
getconf ARG_MAX
# 2097152 # on 3.5.0-40-generic
Solution: Using for Loop
Use a for loop, as recommended in BashFAQ/095; there is no limit other than RAM/memory space:
Dry run to ascertain it will delete what you expect:
for f in *.pdf; do echo rm "$f"; done
And execute it:
for f in *.pdf; do rm "$f"; done
This is also a portable approach, since globbing has strong and consistent behavior across shells (it is part of the POSIX spec).
Note: As several comments point out, this is indeed slower but more maintainable, because it adapts to more complex scenarios, e.g. where one wants to perform more than a single action per file.
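For instance, a hedged sketch of doing two things per file (the /backup destination is purely illustrative):
for f in *.pdf; do
    cp -- "$f" /backup/ && rm -- "$f"
done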
Solution: Using find
If you insist, you can use find but really don't use xargs as it "is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input":
find . -maxdepth 1 -name '*.pdf' -delete
Using -maxdepth 1 ... -delete instead of -exec rm {} + allows find to simply execute the required system calls itself without using an external process, and is hence faster (thanks to @chepner's comment).
References
I'm getting "Argument list too long". How can I process a large list in chunks? (wooledge)
execve(2) - Linux man page (search for ARG_MAX)
Error: Argument list too long (Debian wiki)
Why do I get "/bin/sh: Argument list too long" when passing quoted arguments? (SuperUser)
find has a -delete action:
find . -maxdepth 1 -name '*.pdf' -delete
Another answer is to force xargs to process the commands in batches. For instance to delete the files 100 at a time, cd into the directory and run this:
echo *.pdf | xargs -n 100 rm
If you’re trying to delete a very large number of files at one time (I deleted a directory with 485,000+ today), you will probably run into this error:
/bin/rm: Argument list too long.
The problem is that when you type something like rm -rf *, the * is replaced with a list of every matching file, like “rm -rf file1 file2 file3 file4” and so on. There is a relatively small buffer of memory allocated to storing this list of arguments and if it is filled up, the shell will not execute the program.
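To get a rough feel for how large that expansion is in a given directory, you can measure it with the printf builtin (a rough check only; the kernel also counts the environment and per-argument overhead):
printf '%s ' * | wc -c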
To get around this problem, a lot of people will use the find command to find every file and pass them one-by-one to the “rm” command like this:
find . -type f -exec rm -v {} \;
My problem is that I needed to delete 500,000 files and it was taking way too long.
I stumbled upon a much faster way of deleting files – the “find” command has a “-delete” flag built right in! Here’s what I ended up using:
find . -type f -delete
Using this method, I was deleting files at a rate of about 2000 files/second – much faster!
You can also show the filenames as you’re deleting them:
find . -type f -print -delete
…or even show how many files will be deleted, then time how long it takes to delete them:
root@devel# ls -1 | wc -l && time find . -type f -delete
100000
real 0m3.660s
user 0m0.036s
sys 0m0.552s
Or you can try:
find . -name '*.pdf' -exec rm -f {} \;
You can try this:
for f in *.pdf
do
rm "$f"
done
EDIT:
ThiefMaster's comment suggested I not disclose such a dangerous practice to young shell jedis, so I'll add a "safer" version (for the sake of preserving things when someone has a "-rf . ..pdf" file):
echo "# Whooooo" > /tmp/dummy.sh
for f in *.pdf
do
echo "rm -i \"$f\""
done >> /tmp/dummy.sh
After running the above, just open the /tmp/dummy.sh file in your favorite editor and check every single line for dangerous filenames, commenting them out if found.
Then copy the dummy.sh script into your working dir and run it.
All this for security reasons.
For someone who doesn't have time:
Run the following command in a terminal.
ulimit -S -s unlimited
Then perform cp/mv/rm operation.
I'm surprised there are no ulimit answers here. Every time I have this problem I end up back at this question. I understand this solution has limitations, but ulimit -s 65536 seems to do the trick for me quite often.
You could use a bash array:
files=(*.pdf)
for ((I=0; I<${#files[@]}; I+=1000)); do
    rm -f "${files[@]:I:1000}"
done
This way it will erase in batches of 1000 files per step.
You can use this command:
find -name "*.pdf" -delete
There is a limit on how many files you can remove with a single rm invocation.
One possibility is to run the rm command multiple times, based on your file-name patterns, like:
rm -f A*.pdf
rm -f B*.pdf
rm -f C*.pdf
...
rm -f *.pdf
You can also remove them through the find command:
find . -name "*.pdf" -exec rm {} \;
If the filenames contain spaces or special characters, use:
find -name "*.pdf" -delete
For files in current directory only:
find -maxdepth 1 -name '*.pdf' -delete
This command searches only the current directory (-maxdepth 1) for files with the pdf extension (-name '*.pdf') and then deletes them.
I was facing the same problem while copying from a source directory to a destination directory.
The source directory had about 3 lakh (300,000) files.
I used cp with the -r option and it worked for me:
cp -r abc/ def/
It copies all files from abc to def without giving the "Argument list too long" warning.
You can also try this if you want to delete files/folders older than a given number of days (+) or newer than that (-). Some examples:
To delete files older than 90 days (i.e. 91, 92, ... days old):
find <path> -type f -mtime +90 -exec rm -rf {} \;
To delete only files from the last 30 days (-):
find <path> -type f -mtime -30 -exec rm -rf {} \;
To gzip files that are more than 2 days old:
find <path> -type f -mtime +2 -exec gzip {} \;
To list only the files/folders from the past month:
find <path> -type f -mtime -30 -exec ls -lrt {} \;
To list only files/folders older than 30 days:
find <path> -type f -mtime +30 -exec ls -lrt {} \;
find /opt/app/logs -type f -mtime +30 -exec ls -lrt {} \;
And another one:
cd /path/to/pdf
printf "%s\0" *.[Pp][Dd][Ff] | xargs -0 rm
printf is a shell builtin, and as far as I know it always has been. Because printf is a builtin and not an external command, it isn't subject to the "argument list too long ..." fatal error.
So we can safely use it with shell globbing patterns such as *.[Pp][Dd][Ff], then pipe its output to the remove (rm) command through xargs, which packs as many file names as fit onto each command line so that the rm command doesn't fail.
The \0 in printf serves as a NUL separator for the file names, which are then processed by xargs using -0 as the matching separator, so rm does not fail when there is whitespace or other special characters in the file names.
Argument list too long
The question title mentions cp, mv and rm, but the answers stand mostly for rm.
Un*x commands
Read the command's man page carefully!
For cp and mv, there is a -t switch, for target:
find . -type f -name '*.pdf' -exec cp -ait "/path to target" {} +
and
find . -type f -name '*.pdf' -exec mv -t "/path to target" {} +
Script way
There is a general workaround used in bash scripts:
#!/bin/bash
folder=( "/path to folder" "/path to another folder" )
if [ "$1" != "--run" ]; then
    exec find "${folder[@]}" -type f -name '*.pdf' -exec "$0" --run {} +
    exit 0
fi
shift
for file; do
    printf "Doing something with '%s'.\n" "$file"
done
What about a shorter and more reliable one? (In bash this needs shopt -s globstar for ** to recurse.)
for i in **/*.pdf; do rm "$i"; done
I had the same problem with a folder full of temporary images that was growing day by day and this command helped me to clear the folder
find . -name "*.png" -mtime +50 -exec rm {} \;
The difference from the other commands is the -mtime parameter, which selects only files older than X days (50 days in the example).
Running it multiple times, decreasing the day range on each execution, I was able to remove all the unnecessary files.
You can create a temp folder, move all the files and sub-folders you want to keep into it, then delete the old folder and rename the temp folder to the old name. Try this example until you are confident enough to do it live:
mkdir testit
cd testit
mkdir big_folder tmp_folder
touch big_folder/file1.pdf
touch big_folder/file2.pdf
mv big_folder/file1.pdf tmp_folder/
rm -r big_folder
mv tmp_folder big_folder
The rm -r big_folder will remove all files in big_folder, no matter how many. You just have to be super careful that you first moved out all the files/folders you want to keep; in this case it was file1.pdf.
To delete all *.pdf in a directory /path/to/dir_with_pdf_files/
mkdir empty_dir # Create temp empty dir
rsync -avh --delete --include '*.pdf' --exclude '*' empty_dir/ /path/to/dir_with_pdf_files/
Deleting specific files via rsync using a wildcard is probably the fastest solution if you have millions of files, and it takes care of the error you're getting.
(Optional step) DRY RUN: check what would be deleted without actually deleting anything.
rsync -avhn --delete --include '*.pdf' --exclude '*' empty_dir/ /path/to/dir_with_pdf_files/
See rsync tips and tricks for more rsync hacks.
I found that for extremely large lists of files (>1e6), these answers were too slow. Here is a solution using parallel processing in Python. I know, I know, this isn't Linux... but nothing else here worked.
(This saved me hours)
# delete files
import os
import glob
import multiprocessing as mp

directory = r'your/directory'
os.chdir(directory)
files_names = [i for i in glob.glob('*.{}'.format('pdf'))]

# report errors from pool
def callback_error(result):
    print('error', result)

# delete a file using a system command
def delete_files(file_name):
    os.system('rm -rf ' + file_name)

if __name__ == '__main__':
    pool = mp.Pool(12)   # or use pool = mp.Pool(mp.cpu_count())
    for file_name in files_names:
        print(file_name)
        pool.apply_async(delete_files, [file_name], error_callback=callback_error)
    pool.close()
    pool.join()          # wait for all deletions to finish
If you want to remove both files and directories, you can use something like:
echo /path/* | xargs rm -rf
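A variant hedged against whitespace in the names uses the printf builtin with NUL separators and xargs -0:
printf '%s\0' /path/* | xargs -0 rm -rf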
I only know one way around this.
The idea is to export the list of pdf files you have into a file, split that file into several parts, and then remove the pdf files listed in each part.
ls | grep .pdf > list.txt
wc -l list.txt
wc -l counts how many lines list.txt contains. Once you have an idea of how long it is, you can decide to split it in half, in quarters, or so, using the split -l command.
For example, split it in 600 lines each.
split -l 600 list.txt
This will create a few files named xaa, xab, xac and so on, depending on how you split it.
Now, to "import" each list in those files into the rm command, use this:
rm $(<xaa)
rm $(<xab)
rm $(<xac)
Sorry for my bad english.
I ran into this problem a few times. Many of the solutions will run the rm command for each individual file that needs to be deleted. This is very inefficient:
find . -name "*.pdf" -print0 | xargs -0 rm -rf
I ended up writing a python script to delete the files based on the first 4 characters in the file-name:
import os

filedir = '/tmp/'                  # the directory you wish to run rm on
filelist = os.listdir(filedir)     # gets a listing of all files in the specified dir
newlist = []                       # makes a blank list named newlist

for i in filelist:
    if str(i[:4]) not in newlist:  # this makes sure the elements of newlist are unique
        newlist.append(i[:4])      # take only the first 4 characters of the folder/filename and append to newlist

for i in newlist:
    if 'tmp' in i:                 # look for 'tmp' in the filename/dirname
        # print the command to be run and a total file count
        print('Running command rm -rf ' + str(filedir) + str(i) + '* : File Count: ' + str(len(os.listdir(filedir))))
        os.system('rm -rf ' + str(filedir) + str(i) + '*')   # actual shell command

print('DONE')
This worked very well for me. I was able to clear out over 2 million temp files in a folder in about 15 minutes. I commented the tar out of the little bit of code so anyone with minimal to no python knowledge can manipulate this code.
I faced a similar problem when an application created millions of useless log files that filled up all the inodes. I resorted to locate, dumped all the "located" files into a text file, and then removed them one by one. It took a while, but it did the job!
I solved it with a for loop.
I am on macOS with zsh.
I moved thousands of jpg files with a one-line mv command.
Make sure there are no spaces or special characters in the names of the files you are trying to move:
for i in $(find ~/old -type f -name "*.jpg"); do mv $i ~/new; done
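If the names might contain spaces after all, a find -exec form sidesteps the word-splitting problem; a portable sketch:
find ~/old -type f -name "*.jpg" -exec mv {} ~/new \;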
A bit safer version than using xargs, also not recursive:
ls -p | grep -v '/$' | grep '\.pdf$' | while read file; do rm "$file"; done
Filtering out directories here is a bit unnecessary, since plain rm won't delete them anyway and the filter could be removed for simplicity, but why run something that will definitely return an error?
Using GNU parallel (sudo apt install parallel) is super easy
It runs the commands in parallel, where {} stands for each argument passed.
E.g.
ls /tmp/myfiles* | parallel 'rm {}'
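If the file list itself is too long for the shell glob (or the names contain spaces), feed parallel from find with NUL separators instead; a sketch using parallel's -0 option:
find /tmp -maxdepth 1 -name 'myfiles*' -print0 | parallel -0 rm {}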
To remove just the first 100 files:
rm -rf $(ls | head -100)
