Can I list the files and directories on a logical volume - linux

I have a logical volume, /dev/dm-0, 95% full. I want to delete some directories and files from it to free up space.
I have no idea how to list what files and dirs are on it. Google doesn't seem to have an answer. I have to believe this is possible, but it's crazy how hard it is to find the answer.

Go to the directory where the volume is mounted and execute
find . -type f
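If you are not sure where /dev/dm-0 is mounted, a short sketch like this should locate the mount point and show which directories use the most space (the /mountpoint path below is a placeholder for whatever findmnt reports):
# Find where /dev/dm-0 is mounted
findmnt /dev/dm-0
# List the largest directories on that filesystem only (-x stays on one filesystem)
du -xh --max-depth=2 /mountpoint | sort -rh | head -20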

Related

Linux - Recursively list all files from folder with performance

I want to list the absolute filenames of roughly 80 thousand files, recursively from a set of folders, into a text file.
Does anybody know which command will give me the best performance?
I know that I can use commands like ls, tree and ll, but which command will give me the best performance?
Also note that these files are on a NAS, and the NAS is mapped via a symlink.
Simply find -type f -print0 for zero-delimited output. I don't think you can get significantly faster than that.
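If you need absolute paths (and since the NAS is reached through a symlink), a variation like this should do it (the /path/to/folder is a placeholder; -L makes find follow symlinks):
# Starting from an absolute path makes every printed name absolute; -L follows the NAS symlink
find -L /path/to/folder -type f > filelist.txt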

Need ideas for running srm on a large number of files

For security, I need to use srm (secure delete) rather than rm to delete some files: http://en.wikipedia.org/wiki/Srm_%28Unix%29
I currently have srm set up to run 3 passes over any data that I need to delete. The problem I'm having is that srm runs extremely slowly on large numbers of files. For example, there is a 150 GB directory I tried to delete, and after one week it had only deleted 10 GB.
I know that srm will run slowly with multiple small files, but does directory depth matter as well? For most of the data I need to delete on a weekly basis, the actual files themselves are nested in various deep subdirectories. Would it help out if I flattened the directory structure before running srm?
Here are two workarounds I am looking at (maybe a combination of both), though I don't know how much they would help out:
Flatten all the directory structures before running srm. That way, all the files that need to be wiped are in the same target directory.
Archive the entire directory before running srm. That way, the target would be one big tar.gz file. Zipping up the data will likely take a while, but not as long as srm would have taken.
Does anybody have any other suggestions on what I could do? Some others have used shred as well, but the results were similar and we ended up switching over to srm.
I don't know much about srm, but it might be worth trying:
find "$mydir" -type f -exec srm {} \;
find "$mydir" -depth -type d -exec srm {} \;
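A possible speed-up, assuming srm accepts multiple file arguments the way rm does: -exec ... + batches many files into each srm invocation, so far fewer processes are started. The directory pass uses -depth and plain rmdir, since by then the directories are empty (note that rmdir removes them but does not wipe them):
# Wipe the files in batches rather than spawning one srm per file
find "$mydir" -type f -exec srm {} +
# Then remove the emptied directory tree, deepest entries first
find "$mydir" -depth -type d -exec rmdir {} \;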

How to traverse a disk of type ext3 with 500GB data

There are ~10 million files on a disk (not under the same directory).
I want to get [(file_name, file_size, file_atime)] of all files.
But the command
find /data -type f -printf "%p\t%A@\t%s\n"
is hopelessly slow and drives the IO %util to ~100%.
Any advice?
Not much you can do.
Check if you are using directory indexes (dir_index).
If you are desperate you can use debugfs and read the data raw, but I would not recommend it.
You can also buy an SSD - the slowness is probably from seeking; if you do this often, an SSD will speed things up quite a bit.
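To follow up on the dir_index suggestion above, a sketch like this should show and, if needed, enable it (replace /dev/sdXN with the actual ext3 device; e2fsck must be run on an unmounted filesystem):
# See whether dir_index is among the enabled features
tune2fs -l /dev/sdXN | grep -i features
# Enable it if it is missing, then rebuild the directory indexes
tune2fs -O dir_index /dev/sdXN
e2fsck -fD /dev/sdXN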

Can /tmp in Linux ever fill up?

I'm putting some files in /tmp on a web server that are being used by a web application for a limited amount of time. If the files get left in the server's /tmp after the user quits using the application, and this happens repeatedly, should I be concerned about the directory filling up? I read online that rebooting cleans out the /tmp directory, but this box doesn't get rebooted very much.
Tom
Yes, it will fill up. Consider implementing a cron job that will delete old files after a while.
Something like this should do the trick:
/usr/bin/find /tmp/mydata -type f -atime +1 -exec rm -f {} \;
This will delete files whose access time is more than a day old.
Or as a crontab entry:
# run five minutes after midnight, every day
5 0 * * * /usr/bin/find /tmp/mydata -type f -atime +1 -exec rm -f {} \;
where /tmp/mydata is a subdirectory where your application stores its temporary files. (Simply deleting old files under /tmp would be a very bad idea, as someone else pointed out here.)
Look at the crontab and find man pages for the details. Don't go running scripts that delete files on your filesystem without understanding all the details - that's how bad things happen to good servers. :)
Of course, if you can just modify your application to delete temporary files when it's done with them, that would be a far better solution, generally.
You can't just blindly delete everything that hasn't been modified for a certain amount of time. A lot of programs store sockets in there, which never get modified but are still an integral part of the program working. Take for instance mysql from one of my servers:
srwxrwxrwx 1 mysql mysql 0 Sep 11 04:01 mysql.sock=
That's a valid, working "file" in /tmp. It just looks old because mysql hasn't been restarted in a while. Either limit your find with '-type f' or '-atime', or use one of the distro-provided tools others have mentioned.
The only thing you can write to without worrying it will fill up is /dev/null. Everything else will eventually run out of space if you keep dumping things in it.
One simple approach would be to have a cron job clean up all your /tmp files that are older than, say, a few days.
Yep, it will be linked to one of your disks/partitions and can fill up.
It gets deleted on a reboot.
When the user quits the application you should clean the files up after them.
In which language is your web application written? A lot of languages provide temporary-file facilities:
C
python
php
...
Check whether your language has such a feature.
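The shell has the same facility; here is a minimal sketch (the name pattern is just an example) of a script that creates a temporary file and cleans it up on exit:
#!/bin/sh
# Create a temporary file safely and delete it automatically when the script exits
tmpfile=$(mktemp /tmp/myapp.XXXXXX) || exit 1
trap 'rm -f "$tmpfile"' EXIT
# ... use "$tmpfile" here ...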
Just a warning: not all Linux installations clean the /tmp directory after each reboot.
Some Linux distros have a package that will clean up old files in /tmp for you. It isn't hard to implement your own, as mentioned above. One thing to look out for is long-running processes, especially "zombies", which are ones that have died but haven't finished cleaning up after themselves. If a process has a file open, just deleting it from /tmp won't actually reclaim its space - you have to kill the process or somehow coerce it to close the file. Many programs that write log or temporary files are designed to catch a signal (often SIGUSR1) and close and re-open any log or temporary files for that reason.
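If you suspect space is being held by files that were deleted while still open, lsof can usually show them (assuming lsof is installed):
# Files that have been unlinked but are still held open (link count 0)
lsof +L1
# Quick filter for entries under /tmp
lsof +L1 | grep /tmp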
Many Linux distributions include something named 'tmpwatch', or similar, which runs via cron and deletes things on a pre-defined schedule. Some are smart enough to go by the owner of the file: stuff that is owned by daemon users gets cleaned out faster than stuff owned by regular users. Check the mailing lists for your distro of choice to find out.
Still, you should have SNMP or some other kind of monitor watching how much room is available; if /tmp fills up, services like Apache aren't going to be happy. For instance, eAccelerator for PHP will need plenty of room, some mail scanners don't clean up properly, etc.
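Even a very small cron job can provide that kind of warning; here is a sketch (the 90% threshold, the recipient address and the use of mail are assumptions for illustration):
#!/bin/sh
# Warn when the filesystem holding /tmp is more than 90% full
usage=$(df -P /tmp | awk 'NR==2 { sub("%", "", $5); print $5 }')
if [ "$usage" -gt 90 ]; then
    echo "/tmp is ${usage}% full on $(hostname)" | mail -s "/tmp filling up" admin@example.com
fi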

Maximum number of inodes in a directory? [closed]

Is there a maximum number of inodes in a single directory?
I have a directory with over 2 million files and can't get the ls command to work against it. So now I'm wondering if I've exceeded a limit on inodes in Linux. Is there a limit short of the 2^64 numerical limit?
df -i should tell you the number of inodes used and free on the file system.
Try ls -U or ls -f.
ls, by default, sorts the files alphabetically. If you have 2 million files, that sort can take a long time. With ls -U (or perhaps ls -f), the file names will be printed immediately.
No. Inode limits are per-filesystem, and decided at filesystem creation time. You could be hitting another limit, or maybe 'ls' just doesn't perform that well.
Try this:
tune2fs -l /dev/DEVICE | grep -i inode
It should tell you all sorts of inode related info.
What you hit is an internal limit of ls. Here is an article which explains it quite well:
http://www.olark.com/spw/2011/08/you-can-list-a-directory-with-8-million-files-but-not-with-ls/
Maximum directory size is filesystem-dependent, and thus the exact limit varies. However, having very large directories is a bad practice.
You should consider making your directories smaller by sorting files into subdirectories. One common scheme is to use the first two characters for a first-level subdirectory, as follows:
${topdir}/aa/aardvark
${topdir}/ai/airplane
This works particularly well if you use UUIDs, GUIDs or content hash values for naming.
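A sketch of that kind of sharding in shell (the directory path is an assumption, and filenames containing newlines are not handled):
#!/bin/sh
# Move every file in $topdir into a subdirectory named after its first two characters
topdir=/path/to/bigdir
for f in "$topdir"/*; do
    [ -f "$f" ] || continue
    name=$(basename "$f")
    prefix=$(printf '%s' "$name" | cut -c1-2)
    mkdir -p "$topdir/$prefix"
    mv "$f" "$topdir/$prefix/"
done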
As noted by Rob Adams, ls is sorting the files before displaying them. Note that if you are using NFS, the NFS server will be sorting the directory before sending it, and 2 million entries may well take longer than the NFS timeout. That makes the directory unlistable via NFS, even with the -f flag.
This may be true for other network file systems as well.
While there's no enforced limit to the number of entries in a directory, it's good practice to have some limit to the entries you anticipate.
Can you get a real count of the number of files? Does it fall very near a 2^n boundary? Could you simply be running out of RAM to hold all the file names?
I know that in Windows, at least, file system performance would drop dramatically as the number of files in the folder went up, but I thought that Linux didn't suffer from this issue, at least if you were using a command prompt. God help you if you try to get something like Nautilus to open a folder with that many files.
I'm also wondering where these files come from. Are you able to calculate file names programmatically? If that's the case, you might be able to write a small program to sort them into a number of sub-folders. Often, listing the name of a specific file will grant you access where trying to look up the name will fail. For example, I have a folder in Windows with about 85,000 files where this works.
If this technique is successful, you might try finding a way to make this sort permanent, even if it's just running this small program as a cron job. It'll work especially well if you can sort the files by date somewhere.
Unless you are getting an error message, ls is working but very slowly. You can try looking at just the first ten files like this:
ls -f | head -10
If you're going to need to look at the file details for a while, you can put them in a file first. You probably want to send the output to a different directory than the one you are listing at the moment!
ls > ~/lots-of-files.txt
If you want to do something to the files, you can use xargs. If you decide to write a script of some kind to do the work, make sure that your script will process the list of files as a stream rather than all at once. Here's an example of moving all the files.
ls | xargs -I thefilename mv thefilename ~/some/other/directory
You could combine that with head to move a smaller number of the files.
ls | head -10000 | xargs -I x mv x /first/ten/thousand/files/go/here
You can probably combine ls | head into a shell script that will split up the files into a bunch of directories with a manageable number of files in each.
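For example, here is a sketch of such a script (the batch size and directory names are arbitrary, and filenames containing newlines are not handled):
#!/bin/sh
# Split the files in the current directory into subdirectories of at most 10000 files each
batch=1
count=0
mkdir -p "batch-$batch"
for f in *; do
    [ -f "$f" ] || continue
    if [ "$count" -ge 10000 ]; then
        batch=$((batch + 1))
        count=0
        mkdir -p "batch-$batch"
    fi
    mv "$f" "batch-$batch/"
    count=$((count + 1))
done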
For NetBackup, the binaries that analyze the directories on clients perform some type of listing that times out because of the enormous number of files in every folder (about one million per folder, in an SAP work directory).
My solution was (as Charles Duffy wrote in this thread) to reorganize the folders into subfolders with fewer archives.
Another option is find:
find . -name '*' -exec somecommand {} \;
{} is replaced with the path of each file found (absolute if the search starts from an absolute path).
The advantage/disadvantage is that the files are processed one after another.
find . -name '*' > ls.txt
would print all filenames in ls.txt
find . -name '*' -exec ls -l {} \; > ls.txt
would print all the information from ls for each file in ls.txt
