Linux delete file with size 0 [duplicate]

How do I delete a certain file in Linux if its size is 0? I want to execute this in a crontab without any extra script.
l filename.file | grep 5th-tab | not eq 0 | rm
Something like this?

This will delete all the files in a directory (and below) that are size zero.
find /tmp -size 0 -print -delete
If you just want a particular file:
if [ ! -s /tmp/foo ] ; then
rm /tmp/foo
fi
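Since the question mentions running this from cron, a minimal crontab sketch (assuming the target is a hypothetical file /tmp/foo and a nightly run at 03:00 is acceptable) could be:
0 3 * * * [ ! -s /tmp/foo ] && rm -f /tmp/foo
The test is the same as above: -s is true only for a non-empty file, so the rm runs when the file is empty (or missing, which rm -f tolerates).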

You would want to use find:
find . -size 0 -delete

To search and delete empty files in the current directory and subdirectories:
find . -type f -empty -delete
-type f is necessary because -empty also matches empty directories.
The dot . (current directory) is the starting search directory. If you have GNU find (e.g. not Mac OS), you can omit it in this case:
find -type f -empty -delete
From GNU find documentation:
If no files to search are specified, the current directory (.) is used.

You can use the command find to do this. We can match files with -type f, and match empty files using -size 0. Then we can delete the matches with -delete.
find . -type f -size 0 -delete

On Linux, the stat(1) command is useful when you don't need find(1):
(( $(stat -c %s "$filename") )) || rm "$filename"
The stat command here lets us get just the file size; that's the -c %s (see the man pages for other formats). We run the stat program and capture its output; that's the $( ). That output is then evaluated numerically; that's the outer (( )). If the size is zero, the arithmetic is FALSE, so the second part of the OR is executed. A non-zero size (non-empty file) is TRUE, so the rm will not be executed.
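A quick illustration of the pieces, assuming a hypothetical empty file /tmp/empty.file:
$ stat -c %s /tmp/empty.file
0
$ (( $(stat -c %s /tmp/empty.file) )) || echo "would be removed"
would be removed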

This works on plain BSD, so it should be universally compatible with all flavors. For example, in the current directory (.):
find . -size 0 | xargs rm

For a non-recursive delete (using du and awk):
rm `du * | awk '$1 == "0" {print $2}'`

find . -type f -empty -exec rm -f {} \;

Related

In shell, how to remove file if bigger than 100MB, move otherwise [closed]

What would be the easiest way to do the following with shell commands?
Pseudo code: rm abc if >100MB else mv abc /tmp
abc could either be a name of ONE file or ONE directory.
I want to have an alias that, if run on a file or directory, removes it if its size is greater than 100MB and, if not, moves it to another directory.
I know I could accomplish something similar with a whole function, but there must be a slick one-liner that could do the same.
To move a single regular file if its size is lower than 100MB and delete it otherwise, you can use the following command:
# 104857600 = 1024 * 1024 * 100 = 100M
[ $(stat --printf '%s' "$file") -gt 104857600 ] && rm "$file" || mv "$file" /tmp/
To move a single directory and its contents if their combined size is lower than 100MB and delete it otherwise, you can use the following command (du -sk reports kilobytes, so we compare against 102400 KB = 100 MB):
[ $(du -sk "$directory" | cut -f1) -gt 102400 ] && rm -rf "$directory" || mv "$directory" /tmp/
To do one or the other depending on whether the input parameter points to a file or a directory, you can use if [ -d "$path" ]; then <directory pipeline>; else <file pipeline>; fi.
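Putting the two together, a minimal sketch of such a helper (the function name rm_or_stash is hypothetical; it assumes GNU stat, du -sk, and /tmp as the destination):
rm_or_stash() {
    local path=$1 size
    if [ -d "$path" ]; then
        size=$(du -sk "$path" | cut -f1)        # kilobytes
        [ "$size" -gt 102400 ] && rm -rf "$path" || mv "$path" /tmp/
    else
        size=$(stat --printf '%s' "$path")      # bytes
        [ "$size" -gt 104857600 ] && rm "$path" || mv "$path" /tmp/
    fi
}
You could then alias or call it as rm_or_stash abc.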
To recursively move or delete all the files of a directory depending on their size, you can use the following:
find . -type f -a \( -size +100M -exec rm {} + -o -exec mv -t /tmp/ {} + \)
It first selects files in the current directory, then executes rm ... with the list of those files whose size is greater than 100M, and mv ... /tmp/ with the rest of the files.
This is possible with a combination of the find command, an xargs/rm pipeline, and rsync, together with the order in which you run them in the script.
For example:
find /foo/bar -type f -size +100M -print | xargs rm
The pipe through xargs is for efficiency, so that the rm command is not executed once for every file found by find.
Then comes an rsync statement to mirror the hierarchy of the remaining files to a different path (from your question it's not 100% clear to me whether there are subdirectories of other files or not, which is why I propose rsync: it also covers a subdirectory hierarchy), using the rsync command-line option
--remove-source-files
rsync -av --remove-source-files /foo/bar /tmp
Conclusion: this combination of find/rsync in the proper order works much more efficiently than the other proposed solutions using "find ... -exec", where the executed program is forked once for every file found. More experienced Unix admins avoid "find ... -exec" so as not to waste valuable system resources, and because this approach scales much better when there are a lot of files.

finding files and moving their folders [closed]

I have a huge number of text files, organized in a big folder tree, on Debian Linux. What I need is to find all text files having a specific name pattern and then move the containing folder to a destination.
Example:
/home/spenx/src/a12/a1a22.txt
/home/spenx/src/a12/a1a51.txt
/home/spenx/src/a12/a1b61.txt
/home/spenx/src/a12/a1x71.txt
/home/spenx/src/a167/a1a22.txt
/home/spenx/src/a167/a1a51.txt
/home/spenx/src/a167/a1b61.txt
/home/spenx/src/a167/a1x71.txt
The commands:
find /home/spenx/src -name "a1a2*txt"
mv /home/spenx/src/a12 /home/spenx/dst
mv /home/spenx/src/a167 /home/spenx/dst
The result:
/home/spenx/dst/a12/a1a22.txt
/home/spenx/dst/a167/a1a22.txt
Thank you for your help.
SK
A combination of find, dirname, and mv along with xargs should solve your problem:
find /home/spenx/src -name "a1a2*txt" | xargs -n 1 dirname | xargs -I list mv list /home/spenx/dst/
find will fetch the list of files
dirname will extract the path of each file. Note that it can only take one argument at a time
mv will move source directories to destination
xargs is the key to allow output of one command to be passed as arguments to next command
For details of the options used with xargs, refer to its man page, or just run man xargs in a terminal.
You can execute:
find /home/spenx/src -name "a1a2*txt" -exec mv {} /home/spenx/dst \;
Source: http://www.cyberciti.biz/tips/howto-linux-unix-find-move-all-mp3-file.html
Create this mv.sh script in the current directory that will contain this:
o=$1
d=$(dirname "$o")
mkdir -p "/home/spenx/dst/$d" 2>/dev/null
mv "$o" "/home/spenx/dst/$d"
Make sure it is executable by this command:
chmod +x mv.sh
Next call this command:
find /home/spenx/src -name "a1a2*txt" -exec ./mv.sh {} \;
find /home/spenx/src -name "a1a2*txt" -exec mv "{}" yourdest_folder \;
There are probably multiple ways to do this, but since it seems you might have multiple matches in a single directory, I would probably do something along these lines:
find /home/spenx/src -name "a1a2*txt" -print0 | xargs -0 -n 1 dirname | sort -u |
while read d
do
mv "${d}" /home/spenx/dst
done
It's kind of long, but the steps are:
Find the list of all matching files (the find part), using -print0 to compensate for any names that have spaces or other odd characters in them
extract the directory part of each file name (the xargs ... dirname part)
sort and uniquify the list to get rid of duplicates
Feed the resulting list into a loop that moves each directory in turn

List all graphic image files with find? [closed]

There are many types of graphic images in this huge archive such as .jpg, .gif, .png, etc. I don't know all the types. Is there a way with 'find' to be able to have it list all the graphic images regardless of their dot extension name? Thanks!
This should do the trick
find . -name '*' -exec file {} \; | grep -o -P '^.+: \w+ image'
example output:
./navigation/doc/Sphärische_Trigonometrie-Dateien/bfc9bd9372f650fd158992cf5948debe.png: PNG image
./navigation/doc/Sphärische_Trigonometrie-Dateien/6564ce3c5b95ded313b84fa918b32776.png: PNG image
./navigation/doc/subr_1.jpe: JPEG image
./navigation/doc/Astroanalytisch-Dateien/Gamma.gif: GIF image
./navigation/doc/Astroanalytisch-Dateien/deltaS.jpg: JPEG image
./navigation/doc/Astroanalytisch-Dateien/GammaBau.jpg: JPEG image
The following suits me better since in my case I wanted to pipe this list of files to another program.
find . -type f -exec file --mime-type {} \+ | awk -F: '{if ($2 ~/image\//) print $1}'
If you wanted to tar the images up (as someone in the comments asked):
find . -type f -exec file --mime-type {} \+ | awk -F: '{if ($2 ~/image\//) printf("%s%c", $1, 0)}' | tar -cvf /tmp/file.tar --null -T -
find . -type f -exec file {} \; | grep -o -P '^.+: \w+ image'
should be even better.
Grepping or awk-matching "image" alone will not do it: PSD files are identified as "Image" with a capital "I", so we need to improve the regexp to be either case-insensitive or to also include the capital I. EPS files will not contain the word "image" at all, so we also need to match "EPS" or "Postscript", depending on what you want. So here is my improved version:
find . -type f -exec file {} \; | awk -F: '{ if ($2 ~/[Ii]mage|EPS/) print $1}'
Update (2022-03-03)
This is a refined version with the following changes:
Remove xargs.
Support filenames which contain : based on 林果皞's comment.
find . -type f |
file --mime-type -f - |
grep -F image/ |
rev | cut -d : -f 2- | rev
Below is a more performant solution compared to the chosen answer:
find . -type f -print0 |
xargs -0 file --mime-type |
grep -F 'image/' |
cut -d ':' -f 1
Use -type f instead of -name '*', since the former matches only files while the latter matches both files and directories.
xargs executes file with as many arguments as possible, which is much faster than find -exec file {} \;, which executes file once for every file found.
grep -F is faster since we only want to match a fixed string.
cut is faster than awk (more than 5 times faster, as I recall).
Related to the same problem, I just published a tool called photofind (https://github.com/trimap/photofind). It behaves like the normal find-command but is specialized for image files and supports filtering of results also based on the EXIF-information stored within the image files. See the linked github-repo for more details.

Find all files with name containing string [closed]

I have been searching for a command that will return files from the current directory which contain a string in the filename. I have seen locate and find commands that can find files beginning with something first_word* or ending with something *.jpg.
How can I return a list of files which contain a string in the filename?
For example, if 2012-06-04-touch-multiple-files-in-linux.markdown was a file in the current directory.
How could I return this file and others containing the string touch? Using a command such as find '/touch/'
Use find:
find . -maxdepth 1 -name "*string*" -print
It will find all files in the current directory (remove -maxdepth 1 if you want it to be recursive) whose names contain "string" and print them on the screen.
If you want to exclude files whose names contain ':', you can type:
find . -maxdepth 1 -name "*string*" ! -name "*:*" -print
If you want to use grep (though I think it's not necessary as long as you don't need to check file contents), you can use:
ls | grep touch
But, I repeat, find is a better and cleaner solution for your task.
Use grep as follows:
grep -R "touch" .
-R means recurse. If you would rather not go into the subdirectories, then skip it.
-i means "ignore case". You might find this worth a try as well.
The -maxdepth option should come before the -name option, like below:
find . -maxdepth 1 -name "string" -print
find $HOME -name "hello.c" -print
This will search the whole $HOME (i.e. /home/username/) system for any files named “hello.c” and display their pathnames:
/Users/user/Downloads/hello.c
/Users/user/hello.c
However, it will not match HELLO.C or HellO.C. To make the match case insensitive, pass the -iname option as follows:
find $HOME -iname "hello.c" -print
Sample outputs:
/Users/user/Downloads/hello.c
/Users/user/Downloads/Y/Hello.C
/Users/user/Downloads/Z/HELLO.c
/Users/user/hello.c
Pass the -type f option to only search for files:
find /dir/to/search -type f -iname "fooBar.conf.sample" -print
find $HOME -type f -iname "fooBar.conf.sample" -print
The -iname option works with both the GNU and BSD (including OS X) versions of the find command. If your version of find does not support -iname, try the following syntax using the grep command:
find $HOME | grep -i "hello.c"
find $HOME -name "*" -print | grep -i "hello.c"
OR try
find $HOME -name '[hH][eE][lL][lL][oO].[cC]' -print
Sample outputs:
/Users/user/Downloads/Z/HELLO.C
/Users/user/Downloads/Z/HEllO.c
/Users/user/Downloads/hello.c
/Users/user/hello.c
If the string is at the beginning of the name, you can do this
$ compgen -f .bash
.bashrc
.bash_profile
.bash_prompt
An alternative to the many solutions already provided is making use of the glob **. When you use bash with the option globstar (shopt -s globstar) or you make use of zsh, you can just use the glob ** for this.
**/bar
does a recursive directory search for files named bar (potentially including the file bar in the current directory). Remark that this cannot be combined with other forms of globbing within the same path segment; in that case, the * operators revert to their usual effect.
Note that there is a subtle difference between zsh and bash here. While bash will traverse soft-links to directories, zsh will not. For this you have to use the glob ***/ in zsh.
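A minimal sketch of the globstar approach in bash (assuming bash 4 or later, where globstar exists), applied to the original "touch" example:
shopt -s globstar
printf '%s\n' **/*touch*    # every file or directory under the tree whose name contains "touch"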
find / -exec grep -lR "{test-string}" {} \;
grep -R "somestring" | cut -d ":" -f 1

Argument list too long error for rm, cp, mv commands

I have several hundred PDFs under a directory in UNIX. The names of the PDFs are really long (approx. 60 chars).
When I try to delete all PDFs together using the following command:
rm -f *.pdf
I get the following error:
/bin/rm: cannot execute [Argument list too long]
What is the solution to this error?
Does this error occur for mv and cp commands as well? If yes, how to solve for these commands?
This occurs because bash actually expands the asterisk to every matching file, producing a very long command line.
Try this:
find . -name "*.pdf" -print0 | xargs -0 rm
Warning: this is a recursive search and will find (and delete) files in subdirectories as well. Tack on -f to the rm command only if you are sure you don't want confirmation.
You can do the following to make the command non-recursive:
find . -maxdepth 1 -name "*.pdf" -print0 | xargs -0 rm
Another option is to use find's -delete flag:
find . -name "*.pdf" -delete
tl;dr
It's a kernel limitation on the size of the command line argument. Use a for loop instead.
Origin of problem
This is a system issue, related to execve and ARG_MAX constant. There is plenty of documentation about that (see man execve, debian's wiki, ARG_MAX details).
Basically, the expansion produces a command (with its parameters) that exceeds the ARG_MAX limit.
On kernel 2.6.23, the limit was set at 128 kB. This constant has been increased and you can get its value by executing:
getconf ARG_MAX
# 2097152 # on 3.5.0-40-generic
Solution: Using for Loop
Use a for loop as it's recommended on BashFAQ/095 and there is no limit except for RAM/memory space:
Dry run to ascertain it will delete what you expect:
for f in *.pdf; do echo rm "$f"; done
And execute it:
for f in *.pdf; do rm "$f"; done
This is also a portable approach, as globs have strong and consistent behavior among shells (they are part of the POSIX spec).
Note: As noted in several comments, this is indeed slower but more maintainable, as it can adapt to more complex scenarios, e.g. where one wants to do more than just one action, as in the sketch below.
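For example, a minimal sketch (assuming a hypothetical archive/ directory already exists) where the loop body does two actions, copying each file before removing it:
for f in *.pdf; do cp "$f" archive/ && rm "$f"; done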
Solution: Using find
If you insist, you can use find but really don't use xargs as it "is dangerous (broken, exploitable, etc.) when reading non-NUL-delimited input":
find . -maxdepth 1 -name '*.pdf' -delete
Using -maxdepth 1 ... -delete instead of -exec rm {} + allows find to simply execute the required system calls itself without using an external process, hence faster (thanks to #chepner comment).
References
I'm getting "Argument list too long". How can I process a large list in chunks? # wooledge
execve(2) - Linux man page (search for ARG_MAX) ;
Error: Argument list too long # Debian's wiki ;
Why do I get “/bin/sh: Argument list too long” when passing quoted arguments? # SuperUser
find has a -delete action:
find . -maxdepth 1 -name '*.pdf' -delete
Another answer is to force xargs to process the commands in batches. For instance to delete the files 100 at a time, cd into the directory and run this:
echo *.pdf | xargs -n 100 rm
If you’re trying to delete a very large number of files at one time (I deleted a directory with 485,000+ today), you will probably run into this error:
/bin/rm: Argument list too long.
The problem is that when you type something like rm -rf *, the * is replaced with a list of every matching file, like “rm -rf file1 file2 file3 file4” and so on. There is a relatively small buffer of memory allocated to storing this list of arguments and if it is filled up, the shell will not execute the program.
To get around this problem, a lot of people will use the find command to find every file and pass them one-by-one to the “rm” command like this:
find . -type f -exec rm -v {} \;
My problem is that I needed to delete 500,000 files and it was taking way too long.
I stumbled upon a much faster way of deleting files – the “find” command has a “-delete” flag built right in! Here’s what I ended up using:
find . -type f -delete
Using this method, I was deleting files at a rate of about 2000 files/second – much faster!
You can also show the filenames as you’re deleting them:
find . -type f -print -delete
…or even show how many files will be deleted, then time how long it takes to delete them:
root#devel# ls -1 | wc -l && time find . -type f -delete
100000
real 0m3.660s
user 0m0.036s
sys 0m0.552s
Or you can try:
find . -name '*.pdf' -exec rm -f {} \;
You can try this:
for f in *.pdf
do
rm "$f"
done
EDIT:
ThiefMaster's comment suggested that I not disclose such a dangerous practice to young shell jedis, so I'll add a "safer" version (for the sake of preserving things when someone has a "-rf . ..pdf" file)
echo "# Whooooo" > /tmp/dummy.sh
for f in *.pdf
do
echo "rm -i \"$f\""
done >> /tmp/dummy.sh
After running the above, just open the /tmp/dummy.sh file in your favorite editor and check every single line for dangerous filenames, commenting them out if found.
Then copy the dummy.sh script in your working dir and run it.
All this for security reasons.
For someone who doesn't have time.
Run the following command in a terminal.
ulimit -S -s unlimited
Then perform cp/mv/rm operation.
I'm surprised there are no ulimit answers here. Every time I have this problem I end up here or here. I understand this solution has limitations but ulimit -s 65536 seems to often do the trick for me.
You could use a bash array:
files=(*.pdf)
for ((I=0; I<${#files[@]}; I+=1000)); do
    rm -f "${files[@]:I:1000}"
done
This way it will erase in batches of 1000 files per step.
You can use this command:
find -name "*.pdf" -delete
There is a limit on how many files you can remove at once with rm.
One possibility is to run the rm command multiple times, based on your file patterns, like:
rm -f A*.pdf
rm -f B*.pdf
rm -f C*.pdf
...
rm -f *.pdf
You can also remove them through the find command:
find . -name "*.pdf" -exec rm {} \;
If they are filenames with spaces or special characters, use:
find -name "*.pdf" -delete
For files in current directory only:
find -maxdepth 1 -name '*.pdf' -delete
This searches all files in the current directory only (-maxdepth 1) with the pdf extension (-name '*.pdf'), and then deletes them.
I was facing the same problem while copying from a source directory to a destination directory.
The source directory had roughly 3 lakh (300,000) files.
I used cp with the -r option and it worked for me:
cp -r abc/ def/
It will copy all files from abc to def without giving the "Argument list too long" warning.
You can also try this if you want to delete files/folders older than (+) or newer than (-) 30 or 90 days; see the example commands below.
Ex: to delete files/folders older than 90 days (i.e. 91, 92, ..., 100 days old):
find <path> -type f -mtime +90 -exec rm -rf {} \;
Ex: to delete only files from the latest 30 days, use (-):
find <path> -type f -mtime -30 -exec rm -rf {} \;
If you want to gzip files older than 2 days:
find <path> -type f -mtime +2 -exec gzip {} \;
If you want to see only the files/folders from the past month:
Ex:
find <path> -type f -mtime -30 -exec ls -lrt {} \;
Or, to list only the files/folders older than 30 days:
Ex:
find <path> -type f -mtime +30 -exec ls -lrt {} \;
find /opt/app/logs -type f -mtime +30 -exec ls -lrt {} \;
And another one:
cd /path/to/pdf
printf "%s\0" *.[Pp][Dd][Ff] | xargs -0 rm
printf is a shell builtin and, as far as I know, it always has been. Since printf is a builtin rather than an external command, it is not subject to the "argument list too long ..." fatal error.
So we can safely use it with shell globbing patterns such as *.[Pp][Dd][Ff]; we then pipe its output to the remove (rm) command through xargs, which makes sure it fits enough file names on each command line so as not to fail the rm command, which is an external command.
The \0 in printf serves as a null separator for the file names, which are then processed by the xargs command, using it (-0) as a separator, so rm does not fail when there are spaces or other special characters in the file names.
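The same printf/xargs pattern should work for mv and cp as well; for example, a sketch assuming GNU mv with the -t option and a hypothetical destination directory /tmp/pdfs:
printf "%s\0" *.[Pp][Dd][Ff] | xargs -0 mv -t /tmp/pdfs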
Argument list too long
The question title mentions cp, mv, and rm, but the answers are mostly about rm.
Un*x commands
Read the command's man page carefully!
For cp and mv, there is a -t switch, for target:
find . -type f -name '*.pdf' -exec cp -ait "/path to target" {} +
and
find . -type f -name '*.pdf' -exec mv -t "/path to target" {} +
Script way
There is a general workaround used in bash scripts:
#!/bin/bash
folder=( "/path to folder" "/path to another folder" )
if [ "$1" != "--run" ]; then
    # re-invoke this script via find, which passes the matching files in batches
    exec find "${folder[@]}" -type f -name '*.pdf' -exec "$0" --run {} +
    exit 0
fi
shift
for file; do
    printf "Doing something with '%s'.\n" "$file"
done
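A hypothetical usage, assuming the script above is saved as batch-pdfs.sh: make it executable and run it with no arguments, and find re-invokes it with --run and batches of file names:
chmod +x batch-pdfs.sh
./batch-pdfs.sh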
What about a shorter and more reliable one? (In bash, enable the ** glob first with shopt -s globstar.)
for i in **/*.pdf; do rm "$i"; done
I had the same problem with a folder full of temporary images that was growing day by day and this command helped me to clear the folder
find . -name "*.png" -mtime +50 -exec rm {} \;
The difference from the other commands is the -mtime parameter, which selects only files older than X days (50 days in the example).
Running it multiple times, decreasing the day range on each execution, I was able to remove all the unnecessary files.
You can create a temp folder, move all the files and sub-folders you want to keep into the temp folder, then delete the old folder and rename the temp folder to the old folder. Try this example until you are confident enough to do it live:
mkdir testit
cd testit
mkdir big_folder tmp_folder
touch big_folder/file1.pdf
touch big_folder/file2.pdf
mv big_folder/file1.pdf tmp_folder/
rm -r big_folder
mv tmp_folder big_folder
The rm -r big_folder will remove all files in big_folder, no matter how many. You just have to be super careful that you first moved all the files/folders you want to keep; in this case it was file1.pdf.
To delete all *.pdf in a directory /path/to/dir_with_pdf_files/
mkdir empty_dir # Create temp empty dir
rsync -avh --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/
Deleting specific files via rsync using a wildcard is probably the fastest solution in case you have millions of files, and it will take care of the error you're getting.
(Optional step) DRY RUN: check what will be deleted without actually deleting anything:
rsync -avhn --delete --include '*.pdf' empty_dir/ /path/to/dir_with_pdf_files/
I found that for extremely large lists of files (>1e6), these answers were too slow. Here is a solution using parallel processing in python. I know, I know, this isn't linux... but nothing else here worked.
(This saved me hours)
# delete files
import os
import glob
import multiprocessing as mp

directory = r'your/directory'
os.chdir(directory)
files_names = [i for i in glob.glob('*.{}'.format('pdf'))]

# report errors from the pool
def callback_error(result):
    print('error', result)

# delete a file using a system command
def delete_files(file_name):
    os.system('rm -rf ' + file_name)

if __name__ == '__main__':
    pool = mp.Pool(12)
    # or use pool = mp.Pool(mp.cpu_count())
    for file_name in files_names:
        print(file_name)
        pool.apply_async(delete_files, [file_name], error_callback=callback_error)
    pool.close()
    pool.join()  # wait for all deletions to finish
If you want to remove both files and directories, you can use something like:
echo /path/* | xargs rm -rf
I only know a way around this.
The idea is to export that list of pdf files you have into a file. Then split that file into several parts. Then remove pdf files listed in each part.
ls | grep .pdf > list.txt
wc -l list.txt
wc -l counts how many lines list.txt contains. When you have an idea of how long it is, you can decide to split it in half, in quarters, or so, using the split -l command.
For example, split it in 600 lines each.
split -l 600 list.txt
This will create a few files named xaa, xab, xac, and so on, depending on how you split it.
Now to "import" each list in those file into command rm, use this:
rm $(<xaa)
rm $(<xab)
rm $(<xac)
Sorry for my bad english.
I ran into this problem a few times. Many of the solutions will run the rm command for each individual file that needs to be deleted. This is very inefficient:
find . -name "*.pdf" -print0 | xargs -0 rm -rf
I ended up writing a python script to delete the files based on the first 4 characters in the file-name:
import os

filedir = '/tmp/'  # the directory you wish to run rm on
filelist = os.listdir(filedir)  # gets a listing of all files in the specified dir
newlist = []  # makes a blank list named newlist
for i in filelist:
    if str(i[:4]) not in newlist:  # this makes sure that the elements of newlist are unique
        newlist.append(i[:4])  # this takes only the first 4 characters of the folder/filename and appends it to newlist
for i in newlist:
    if 'tmp' in i:  # if statement to look for 'tmp' in the filename/dirname
        print('Running command rm -rf ' + str(filedir) + str(i) + '* : File Count: ' + str(len(os.listdir(filedir))))  # prints the command to be run and a total file count
        os.system('rm -rf ' + str(filedir) + str(i) + '*')  # the actual shell command
print('DONE')
This worked very well for me. I was able to clear out over 2 million temp files in a folder in about 15 minutes. I commented the tar out of the little bit of code so anyone with minimal to no python knowledge can manipulate this code.
I faced a similar problem when an application created millions of useless log files that filled up all the inodes. I resorted to locate, got all the files into a text file with it, and then removed them one by one. It took a while, but it did the job!
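A rough sketch of that workflow (the log name pattern app-*.log and the list path are hypothetical, and it assumes the locate database is up to date):
locate 'app-*.log' > /tmp/to-delete.txt
while IFS= read -r f; do rm -f "$f"; done < /tmp/to-delete.txt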
I solved it with a for loop.
I am on macOS with zsh.
I moved thousands of jpg files with mv in a one-line command.
Be sure there are no spaces or special characters in the names of the files you are trying to move:
for i in $(find ~/old -type f -name "*.jpg"); do mv $i ~/new; done
A bit safer version than using xargs, also not recursive:
ls -p | grep -v '/$' | grep '\.pdf$' | while read file; do rm "$file"; done
Filtering out directories here is a bit unnecessary, as 'rm' won't delete them anyway, and it could be dropped for simplicity, but why run something that will definitely return an error?
Using GNU parallel (sudo apt install parallel) is super easy
It runs the commands in parallel, where '{}' is replaced by each argument passed in.
E.g.
ls /tmp/myfiles* | parallel 'rm {}'
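To avoid both parsing ls output and expanding the glob on the shell command line, a find-based variant may be safer (a sketch, assuming GNU parallel's -0 option for NUL-delimited input):
find /tmp -maxdepth 1 -name 'myfiles*' -print0 | parallel -0 rm {}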
To remove the first 100 files:
rm -rf `ls | head -100`
