"rm" (delete) 8 million files in a directory? [closed] - linux

I have 8 million files in my /tmp and I need to remove them. This server is also running a pretty important app and I cannot overload it.
$ ls | grep . | xargs rm
The above makes my app unresponsive.
Do you have any ideas how to remove these files? Thanks in advance!

Well yes: don't use ls (it may sort the files, and building the file list may use more memory than you would like), and don't add pointless indirections like a pipe or xargs.
find . -type f -delete
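Since the question stresses not overloading the server, one option (my own addition, not from the answer above) is to run the same deletion at the lowest CPU and I/O priority:
# idle I/O class (-c3) plus lowest CPU priority; requires Linux ionice
ionice -c3 nice -n19 find . -type f -delete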

grep . matches any line with at least one character, so it does nothing useful here.
Cut it out of your chain to remove a needless process from the pipeline. That should speed things up nicely.
ls | xargs rm -rf
Note that this will choke on whitespace, so an improvement is
ls | xargs -I{} rm -v {}
Of course, a much faster method is to remove the directory and recreate it. However, you do need to take care that your script doesn't get "lost" in the directory tree and remove stuff it shouldn't.
rm -rf dir
mkdir dir
Note that there are some subtle differences between removing all the files and removing and recreating the directory. Removing the files with a glob only removes visible files and directories, while removing the directory and recreating it gets rid of everything, visible and hidden.
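If you want to clear out hidden entries as well without removing and recreating the directory, a sketch (my own addition, assuming GNU find) would be:
# delete everything inside dir, dotfiles included, but keep dir itself
find dir -mindepth 1 -delete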

try this:
ls -1 | grep -v -e "ignoreFile" -e "ignoreFile2" | xargs rm -rf
ls -1 is a simpler way of writing ls | grep .
grep -v removes lines from the list. Just give it any files that should not be deleted, separating patterns with the -e flag.
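If the filenames may contain spaces or newlines, a find-based variant of the same exclusion (a sketch on my part, assuming the ignored names are literal) avoids parsing ls output entirely:
# delete regular files in the current directory except the two ignored names
find . -maxdepth 1 -type f ! -name 'ignoreFile' ! -name 'ignoreFile2' -delete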
And just for a complete explanation (I'm guessing this is already known), rm -rf:
-r recursive
-f force

Related

In shell, how to remove file if bigger than 100MB, move otherwise [closed]

What would be the easiest way to do the following with shell commands?
Pseudo code: rm abc if >100MB else mv abc /tmp
abc could either be a name of ONE file or ONE directory.
I want to have an alias that, if run on a file or directory, removes it if its size is greater than 100MB and otherwise moves it to another directory.
I know I could accomplish something similar with a whole function, but there must be a slick one-liner that could do the same.
To move a single regular file if its size is lower than 100MB and delete it otherwise, you can use the following command:
# 104857600 = 1024 * 1024 * 100 = 100M
[ $(stat --printf '%s' "$file") -gt 104857600 ] && rm "$file" || mv "$file" /tmp/
To move a single directory and its contents if its combined size is lower than 100MB and delete it otherwise, you can use the following command (du -sb reports the size in bytes, matching the threshold above):
[ $(du -sb "$directory" | cut -f1) -gt 104857600 ] && rm -rf "$directory" || mv "$directory" /tmp/
To do one or the other depending on whether the input parameter points to a file or a directory, you can use if [ -d "$path" ]; then <directory pipeline>; else <file pipeline>; fi.
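Putting the two together, a sketch of a small function (my own combination of the pieces above, assuming GNU stat and du; the name smart_rm is made up):
smart_rm() {
  # delete the argument if it is larger than 100MB, otherwise move it to /tmp
  local path=$1 size
  if [ -d "$path" ]; then
    size=$(du -sb "$path" | cut -f1)
  else
    size=$(stat --printf '%s' "$path")
  fi
  if [ "$size" -gt 104857600 ]; then
    rm -rf "$path"
  else
    mv "$path" /tmp/
  fi
}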
To recursively move or delete all the files of a directory depending on their size, you can use the following:
find . -type f -a \( -size +100M -exec rm {} + -o -exec mv -t /tmp/ {} + \)
It first selects regular files in the current directory, then executes rm ... on those whose size is greater than 100M, and mv ... -t /tmp/ on the rest.
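To preview which files would fall into each branch before running it (a dry-run sketch of my own, not part of the answer):
# files that would be removed (larger than 100M)
find . -type f -size +100M
# files that would be moved to /tmp
find . -type f ! -size +100M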
This is also possible with a combination of the find command, an xargs / rm statement, and rsync, run in the right order in a script.
For example:
find /foo/bar -type f -size +100M -print | xargs rm
The pipeline through xargs is there for efficiency, so that the rm command is not executed once for each file found by find.
Next, use an rsync statement to mirror the hierarchy of remaining files (from your question it's not 100% clear to me whether there are subdirectories or just other files, which is why I propose rsync, as it also covers a subdirectory hierarchy) to a different path, using the rsync command line option
--remove-source-files
rsync -av --remove-source-files /foo/bar /tmp
Conclusion: this combination of find / rsync in the proper order works much more efficiently than the other proposed solutions based on "find .. -exec", where the executed program is forked once for every file found. More experienced Unix admins avoid "find ... -exec ... \;" so as not to waste valuable system resources, and because the pipeline approach scales much better when there are a lot of files.

Can't SSH to keep latest 5 folders and delete the older folders [closed]

Currently I am using ls -1t | tail -n +6 | xargs rm -rf and it works fine on the server itself. But when I try it through ssh as root in a bash script, it doesn't run/work.
This is the line I am using: ssh -q -oStrictHostKeyChecking=no -oConnectTimeout=1 root@$host "sudo cd /path/to/folder && sudo ls -1t | tail -n +6 | xargs rm -rf"
May I know what's the issue here?
root@$host suggests that you're already logging in as root, so using sudo is redundant here.
cd /path/to/folder && ls -1t | tail -n +6 | xargs rm -rf
should do the trick.
But this is only safe if you know for certain that /path/to/folder cannot contain any files with potentially dangerous characters in their names. For example, a file named ..\n or similar would cause the whole directory to be deleted.
The reason your original example does not work is that sudo executes a program, not a series of shell commands. Also, cd is not a program but a shell builtin, so it can't be executed through sudo; that wouldn't really make sense anyway, since the directory change would be lost as soon as cd returned. Even if it worked, the first statement (sudo cd /path/to/folder) would execute successfully, and then the second one (sudo ls -1t | tail -n +6 | xargs rm -rf) would execute in the original current directory, with only the ls command running as root and the rest as the current user.
To execute the whole command line through sudo:
sudo sh -c "cd /path/to/folder && ls -1t | tail -n +6 | xargs rm -rf"
Or, if the current user has access rights for /path/to/folder, then actually only the last part needs to be executed as root:
cd /path/to/folder && ls -1t | tail -n +6 | sudo xargs rm -rf
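Putting it back into the question's ssh line (a sketch, assuming key-based login as root so sudo can be dropped entirely):
ssh -q -oStrictHostKeyChecking=no -oConnectTimeout=1 root@$host 'cd /path/to/folder && ls -1t | tail -n +6 | xargs rm -rf'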

How to show a 'grep' result with the complete path or file name [closed]

How can I get the complete file path when I use grep?
I use commands like
cat *.log | grep somethingtosearch
I need to show the result with the complete file path from which the matched results were taken.
How can I do it?
Assuming you have two log-files in:
C:/temp/my.log
C:/temp/alsoMy.log
'cd' to C: and use:
grep -r somethingtosearch temp/*.log
It will give you a list like:
temp/my.log:somethingtosearch
temp/alsoMy.log:somethingtosearch1
temp/alsoMy.log:somethingtosearch2
I think the real solution is:
cat *.log | grep -H somethingtosearch
Command:
grep -rl --include="*.js" "searchString" ${PWD}
Returned output:
/root/test/bas.js
If you want to see the full paths, I would recommend cd'ing to the top directory (of your drive, if you are using Windows):
cd C:\
grep -r somethingtosearch C:\Users\Ozzesh\temp
Or on Linux:
cd /
grep -r somethingtosearch ~/temp
If you really insist on filtering by file name (*.log) and you want recursion (files are not all in the same directory), combining find and grep is the most flexible way:
cd /
find ~/temp -iname '*.log' -type f -exec grep somethingtosearch '{}' \;
It is similar to BVB Media's answer.
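Note that when find passes grep a single file per invocation (as with \; above), grep omits the file name from its output; a variant of the same command (my own tweak) that batches the files and forces the path to be printed:
find ~/temp -iname '*.log' -type f -exec grep -H somethingtosearch {} +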
grep -rnw 'blablabla' `pwd`
It works fine on my Ubuntu 16.04 (Xenial Xerus) Bash.
For me
grep -b "searchsomething" *.log
worked as I wanted
This works when searching files in all directories.
sudo ls -R | grep -i something_bla_bla
The output shows all files and directories whose names include "something_bla_bla". Directories are shown with their paths, but files are not.
Then use locate on the wanted file.
The easiest way to print full paths is to replace the relative start path with the absolute path:
grep -r --include="*.sh" "pattern" ${PWD}
Use:
grep somethingtosearch *.log
and the filenames will be printed out along with the matches.

How can I run dos2unix on an entire directory? [closed]

I have to convert an entire directory using dos2unix. I am not able to figure out how to do this.
find . -type f -print0 | xargs -0 dos2unix
This will recursively find all files inside the current directory and run the dos2unix command on each of them.
If it's a large directory you may want to consider running with multiple processors:
find . -type f -print0 | xargs -0 -n 1 -P 4 dos2unix
This will pass 1 file at a time, and use 4 processors.
As I happened to be poorly satisfied with dos2unix, I rolled my own simple utility. Apart from a few advantages in speed and predictability, the syntax is also a bit simpler:
endlines unix *
And if you want it to go down into subdirectories (skipping hidden dirs and non-text files):
endlines unix -r .
endlines is available here https://github.com/mdolidon/endlines
A common use case appears to be to standardize line endings for all files committed to a Git repository:
git ls-files -z | xargs -0 dos2unix
Keep in mind that certain files (e.g. *.sln, *.bat) are only used on Windows operating systems and should keep the CRLF ending:
git ls-files -z '*.sln' '*.bat' | xargs -0 unix2dos
If necessary, use .gitattributes
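For that, a minimal .gitattributes sketch (my own example, not from the answer above):
# normalize text files to LF, but keep CRLF for Windows-only files
* text=auto
*.sln text eol=crlf
*.bat text eol=crlf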
It's probably best to skip hidden files and folders, such as .git. So instead of using find, if your bash version is recent enough or if you're using zsh, just do:
dos2unix **
Note that for Bash, this will require:
shopt -s globstar
....but this is a useful enough feature that you should honestly just put it in your .bashrc anyway.
If you don't want to skip hidden files and folders, but you still don't want to mess with find (and I wouldn't blame you), you can provide a second recursive-glob argument to match only hidden entries:
dos2unix ** **/.*
Note that in both cases, the glob will expand to include directories, so you will see the following warning (potentially many times over): Skipping <dir>, not a regular file.
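To avoid those warnings, a small sketch (my own, assuming bash with globstar enabled as above) that only converts regular files:
for f in **; do
  [ -f "$f" ] && dos2unix "$f"
done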
For any Solaris users (am using 5.10, may apply to newer versions too, as well as other unix systems):
dos2unix doesn't overwrite the file by default; it just prints the updated version to stdout, so you have to specify the source and target, i.e. the same name twice:
find . -type f -exec dos2unix {} {} \;
I think the simplest way is:
dos2unix $(find . -type f)
I've googled this like a million times, so my solution is to just put this bash function in your environment.
.bashrc or .profile or whatever
dos2unixd() {
  find "$1" -type f -print0 | xargs -0 dos2unix
}
Usage
$ dos2unixd ./somepath
This way you still have the original dos2unix command, and the new one, dos2unixd, is easy to remember.
I have had the same problem and, thanks to the posts here, I have solved it. I knew that I had around a hundred files and I needed to run it on *.js files only.
find . -type f -name '*.js' -print0 | xargs -0 dos2unix
Thank you all for your help.
for FILE in /var/www/html/files/*
do
/usr/bin/dos2unix "$FILE"
done
If there are no sub-directories, you can also use
ls | xargs -I {} dos2unix "{}"

Unix command deleted every directory even though not specified

I am very new to Unix. I ran the following command:
ls -l | xargs rm -rf bark.*
and above command removed every directory in the folder.
Can anyone explain to me why?
The -r argument means "delete recursively" (i.e. descend into subdirectories). The -f argument means "force" (in other words, don't ask for confirmation). -rf means "descend recursively into subdirectories without asking for confirmation".
ls -l lists all files in the directory. xargs takes the input from ls -l and appends it to the command you pass to xargs
The final command that got executed looked like this:
rm -rf bark.* <output of ls -l>
This essentially removed bark.* and all files in the current directory. Moral of the story: be very careful with rm -rf. (You can use rm -ri to ask before deleting files instead)
rm(1) deleted every file and directory in the current working directory because you asked it to.
To see roughly what happened, run this:
cd /etc ; ls -l | xargs echo
Pay careful attention to the output.
I strongly recommend using echo in place of rm -rf when constructing command lines. Only if the output looks fine should you then re-run the command with rm -rf. When in doubt, maybe just use rm -r so that you do not accidentally blow away too much. rm -ir if you are very skeptical of your command line. (I have been using Linux since 1994 and I still use this echo trick when constructing slightly complicated command lines to selectively delete a pile of files.)
Incidentally, I would avoid parsing ls(1) output in any fashion -- filenames can contain any character except ASCII NUL and / chars -- including newlines, tabs, and output that looks like ls -l output. Trying to parse this with tools such as xargs(1) can be dangerous.
Instead, use find(1) for these sorts of things. To delete all files in all directories named bark.*, I'd run a command like this:
find . -type d -name 'bark.*' -print0 | xargs -0 rm -r
Again, I'd use echo in place of rm -r for the first execution -- and if it looked fine, then I'd re-run with rm -r.
The ls -l command gave a listing of everything in your current working directory (PWD), not just the subdirectories.
The rm command can delete multiple files/directories if you pass them to it as a list.
eg: rm test1.txt test2.txt myApp will delete all three of the files with names:
test1.txt
test2.txt
myApp
Also, the flags you used for the rm command are a common source of accidents:
rm -f - Force deletion of files without asking or confirming
rm -r - Recurse into all subdirectories and delete all their contents and subdirectories
So, let's say you are in /home/user, and the directory structure looks like so:
/home/user
|->dir1
|->dir2
`->file1.txt
the ls -l command will provide a listing that includes "dir1 dir2 file1.txt" (along with permissions, sizes, and dates), and the result of the command ls -l | xargs rm -rf will look roughly like this:
rm -rf dir1 dir2 file1.txt
If we expand your original question with the example above, the final command that gets passed to the system becomes:
rm -rf dir1 dir2 file1.txt bark.*
So everything in the current directory gets wiped out, and the bark.* is redundant (you effectively told the machine to destroy everything in the current directory anyway).
I think what you meant to do was delete all files in the current directory and all subdirectories (recurse) that start with bark. To do that, you just have to do:
find -iname 'bark.*' | xargs rm
The command above means "find all files in this directory and subdirectories, ignoring UPPERCASE/lowercase/mIxEdCaSe, that start with the characters "bark.", and delete them". This could still be a bad command if you have a typo, so to be sure, you should always test before you do a batch-deletion like this.
In the future, first run the following to get a list of all the files you would be deleting, to confirm they are the ones you want deleted.
find -iname 'bark.*' | xargs echo
Then if you are sure, delete them via
find -iname 'bark.*' | xargs rm
Hope this helps.
As a humorous note, one of the most famous instances of "rm -rf" can be found here:
https://github.com/MrMEEE/bumblebee-Old-and-abbandoned/commit/a047be85247755cdbe0acce6f1dafc8beb84f2ac
An automated script ran something like rm -rf /usr/local/........., but due to an accidentally inserted space the command became rm -rf /usr /local/......, which means "delete all of /usr, plus /local/...", effectively destroying the system of anyone who ran it. I feel bad for that developer.
You can avoid these kinds of bugs by quoting your strings, ie:
rm -rf "/usr/ local/...." would have provided an error message and avoided this bug, because the quotes mean that everything between them is the full path, NOT a list of separate paths/files (ie: you are telling rm that the file/folder has a SPACE character in its name).
