How to get the size of a folder including the apparent size of sparse files? (du is too slow) - linux

I have a folder containing a lot of KVM qcow2 files; they are all sparse files.
Now I need to get the total size of the folder, where each qcow2 file is counted at its apparent size (not its real size).
For example:
image: c9f38caf104b4d338cc1bbdd640dca89.qcow2
file format: qcow2
virtual size: 100G (107374182400 bytes)
disk size: 3.3M
cluster_size: 65536
The image should be treated as 100G, not 3.3M.
Originally I used statvfs(), but it can only return the real size of the folder. Then I switched to 'du --apparent-size', but that is too slow: with 10,000+ files it takes almost 5 minutes to calculate.
Does anybody know a fast way to get the size of a folder that counts each qcow2's virtual size? Thank you.

There is no way to find out this information without stat()ing every file in the directory. It is slow if you have this many files in a single directory. stat() needs to retrieve the inode of every single file.
Adding more memory might help due to caching.

You could use something like this:
find images/ -name "*.qcow2" -exec qemu-img info {} \; | grep virtual | cut -d"(" -f2 | awk '{ SUM += $1} END { print SUM }'

Modern Unix-ish OSes provide a way to retrieve the stats of all entries of a directory in one step. This still has to look at every inode, but it can probably be optimized inside the filesystem driver itself and thus might be faster.
Apparently you are not looking for a way to do this using system calls from C, so I guess a feasible approach would be Python, where you have access to this feature via the function scandir() in the os module.
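As a rough sketch of that approach (assuming Python 3.6+; note that this sums apparent sizes as reported by st_size, i.e. what du --apparent-size reports, not the qcow2 virtual size):
import os

def apparent_total(path):
    # Sum the apparent size (st_size) of every regular file in the directory.
    total = 0
    with os.scandir(path) as entries:
        for entry in entries:
            if entry.is_file(follow_symlinks=False):
                # one stat() per file; entry.stat() caches its result on the DirEntry
                total += entry.stat(follow_symlinks=False).st_size
    return total

print(apparent_total("images/"))
If it's really the virtual size you need, only the qcow2 header knows it, so the qemu-img pipeline above remains the way to get that number.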

Related

Get directory size with xen image

I want to check the size of my directory.
The directory holds a Xen domU image; the directory name is xendisk.
du -sh ./xendisk
returns 5.4G, but the Xen domU image size is 10G.
(Screenshot of ls -alh and du -sh output.)
What happened?
You have created a sparse file for your image. If you used a command like truncate -s 10G domU.img to create the image then this would be the result.
The wiki article I have linked has more information, but basically a sparse file is one where the empty parts of the file take no space. This is useful when dealing with VMs because in most cases your VM will only use a fraction of the space available to it, so using a sparse file means it takes far less space on your filesystem (as you have observed). The article states that this is achieved using the following mechanism:
When reading sparse files, the file system transparently converts
metadata representing empty blocks into "real" blocks filled with zero
bytes at runtime. The application is unaware of this conversion.
If you need to check the size with du you may be interested in the --apparent-size option, which will include all of the unallocated blocks in the calculation. Therefore you could use this command if you need the output to match what ls is telling you:
du -sh --apparent-size ./xendisk
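If you want to see the same difference programmatically, here is a minimal Python sketch (the image path is just an example):
import os

st = os.stat("./xendisk/domU.img")   # example path
apparent = st.st_size                # what ls -l and du --apparent-size report
allocated = st.st_blocks * 512       # st_blocks is counted in 512-byte units on Linux
print("apparent:", apparent, "allocated on disk:", allocated)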

caching on ramdisk - finding stalest file to delete

I have a nice caching system in linux that uses a ramdisk to cache both image files and the HTML output of various pages of my website.
My website is rather large and the ramdisk space required to cache everything exceeds 15GB (excluding image output) and I only have 2GB available for the cache.
Writing to and reading from the cache is relatively fast, but the problem is figuring out how to quickly find the stalest file(s) when I run out of space, in order to make room for a new file. I believe using "ls -R" and scanning the large output is a slow process.
My only other option, which seems inefficient to me, is to flush the entire cache frequently so that I never run out of ramdisk space.
My cache allows my website to load many pages with a time to first byte (TTFB) of under 200ms, which is what Google likes, so I want to keep that 200ms as the maximum TTFB when loading a file from cache, even if files are being deleted because of lack of ramdisk space.
I thought of using direct memory access via pointers for the cache, but because the cached output varies in size, I feel that option would waste memory space at best or use a lot of CPU to find the next free memory location.
Anyone got an idea on how I can quickly seek and then remove the stalest file from my cache?
ls -latr should not be slow when working with a ramdisk, but this may be closer to what you are looking for:
find -type f -printf '%T+ %p\n' | sort | head -1
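If you would rather evict from code than from the shell, here is a rough Python sketch of the same idea: scan the cache, sort by modification time, and delete the stalest files until enough space has been freed (the path and the amount are made up for illustration):
import os

def evict_stalest(cache_dir, bytes_needed):
    # Collect (mtime, size, path) for every file in the cache.
    files = []
    for root, _, names in os.walk(cache_dir):
        for name in names:
            path = os.path.join(root, name)
            st = os.stat(path)
            files.append((st.st_mtime, st.st_size, path))
    files.sort()                  # oldest mtime first
    freed = 0
    for _, size, path in files:
        if freed >= bytes_needed:
            break
        os.remove(path)           # delete the stalest file
        freed += size
    return freed

evict_stalest("/mnt/ramcache", 100 * 1024 * 1024)   # free roughly 100 MB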

Maximum number of files/directories on Linux?

I'm developing a LAMP online store, which will allow admins to upload multiple images for each item.
My concern is that right off the bat there will be 20,000 items, meaning roughly 60,000 images.
Questions:
What is the maximum number of files and/or directories on Linux?
What is the usual way of handling this situation (best practice)?
My idea was to make a directory for each item, based on its unique ID, but then I'd still have 20000 directories in a main uploads directory, and it will grow indefinitely as old items won't be removed.
Thanks for any help.
ext[234] filesystems have a fixed maximum number of inodes; every file or directory requires one inode. You can see the current count and limits with df -i. For example, on a 15GB ext3 filesystem, created with the default settings:
Filesystem      Inodes  IUsed    IFree IUse% Mounted on
/dev/xvda      1933312 134815  1798497    7% /
There's no limit on directories in particular beyond this; keep in mind that every file or directory requires at least one filesystem block (typically 4KB), though, even if it's a directory with only a single item in it.
As you can see, though, 80,000 inodes is unlikely to be a problem. And with the dir_index option (which can be enabled with tune2fs), lookups in large directories aren't much of a problem. However, note that many administrative tools (such as ls or rm) can have a hard time dealing with directories that contain too many files. As such, it's recommended to split your files up so that you don't have more than a few hundred to a thousand items in any given directory. An easy way to do this is to hash whatever ID you're using and use the first few hex digits as intermediate directories.
For example, say you have item ID 12345, and it hashes to 'DEADBEEF02842.......'. You might store your files under /storage/root/d/e/12345. You've now cut the number of files in each directory to 1/256th of what it would otherwise be.
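A small Python sketch of that scheme (the storage root is illustrative; md5 is used here only as a cheap, well-distributed hash):
import hashlib
import os

def shard_path(storage_root, item_id):
    # Use the first two hex digits of the hash as intermediate directories.
    digest = hashlib.md5(str(item_id).encode()).hexdigest()
    return os.path.join(storage_root, digest[0], digest[1], str(item_id))

print(shard_path("/storage/root", 12345))   # -> /storage/root/<h1>/<h2>/12345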
If your server's filesystem has the dir_index feature turned on (see tune2fs(8) for details on checking and turning on the feature) then you can reasonably store upwards of 100,000 files in a directory before the performance degrades. (dir_index has been the default for new filesystems for most of the distributions for several years now, so it would only be an old filesystem that doesn't have the feature on by default.)
That said, adding another directory level to reduce the number of files in a directory by a factor of 16 or 256 would drastically improve the chances of things like ls * working without over-running the kernel's maximum argv size.
Typically, this is done by something like:
/a/a1111
/a/a1112
...
/b/b1111
...
/c/c6565
...
i.e., prepending a letter or digit to the path, based on some feature you can compute from the name. (The first two characters of the md5sum or sha1sum of the file name is one common approach, but if you have unique object IDs, then 'a' + id % 16 is an easy enough mechanism to determine which directory to use.)
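For numeric IDs, the 'a' + id % 16 variant mentioned above is tiny in Python (sketch, example ID only):
item_id = 1111                            # example ID
bucket = chr(ord('a') + item_id % 16)     # one of 16 buckets, 'a' .. 'p'
path = "/%s/%s%d" % (bucket, bucket, item_id)
print(path)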
60,000 is nothing, and so is 20,000. But you should group these 20,000 by some means in order to speed up access to them. Maybe in groups of 100 or 1000, by taking the number of the directory and dividing it by 100, 500, 1000, whatever.
E.g., I have a project where the files have numbers. I group them in 1000s, so I have
id/1/1332
id/3/3256
id/12/12334
id/350/350934
You actually might have a hard limit - some systems have 32-bit inodes, so you are limited to about 2^32 files per filesystem.
In addition to the general answers (basically "don't bother that much", "tune your filesystem", and "organize your directory with subdirectories containing a few thousand files each"):
If the individual images are small (e.g. less than a few kilobytes), instead of putting them in a folder you could also put them in a database (e.g. in MySQL as a BLOB) or perhaps inside a GDBM indexed file. Then each small item won't consume an inode (on many filesystems, each inode wants at least a few kilobytes). You could also do that up to some threshold (e.g. put images bigger than 4 kilobytes in individual files, and smaller ones in a database or GDBM file). Of course, don't forget to back up your data (and define a backup strategy).
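As a sketch of that threshold idea using Python's dbm module in place of MySQL/GDBM (the file names and the 4 KB threshold are just examples):
import dbm
import os

THRESHOLD = 4096   # store images smaller than ~4 KB in the database file

def store_image(db_path, big_dir, image_id, data):
    if len(data) < THRESHOLD:
        with dbm.open(db_path, "c") as db:
            db[str(image_id)] = data   # small image: one record, no extra inode
    else:
        with open(os.path.join(big_dir, "%s.jpg" % image_id), "wb") as f:
            f.write(data)              # big image: gets its own file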
The year is 2014. I come back in time to add this answer.
Lots of big/small files? You can use Amazon S3 and other alternatives based on Ceph like DreamObjects, where there are no directory limits to worry about.
I hope this helps someone decide from all the alternatives.
Another option is to hash the ID and split the digest into fixed-length directory levels:
md5($id) ==> 0123456789ABCDEF
$file_path = items/012/345/678/9AB/CDE/F.jpg
With 3 hex characters per level, each directory has at most 4096 subdirectories, which keeps lookups fast.

How to make file sparse?

If I have a big file containing many zeros, how can I efficiently make it a sparse file?
Is the only possibility to read the whole file (including all zeroes, which may partially be stored sparse) and to rewrite it to a new file, using seek to skip over the zero areas?
Or is there a possibility to make this in an existing file (e.g. File.setSparse(long start, long end))?
I'm looking for a solution in Java or some Linux commands, Filesystem will be ext3 or similar.
A lot's changed in 8 years.
Fallocate
fallocate -d filename can be used to punch holes in existing files. From the fallocate(1) man page:
-d, --dig-holes
Detect and dig holes. This makes the file sparse in-place,
without using extra disk space. The minimum size of the hole
depends on filesystem I/O block size (usually 4096 bytes).
Also, when using this option, --keep-size is implied. If no
range is specified by --offset and --length, then the entire
file is analyzed for holes.
You can think of this option as doing a "cp --sparse" and then
renaming the destination file to the original, without the
need for extra disk space.
See --punch-hole for a list of supported filesystems.
(That list:)
Supported for XFS (since Linux 2.6.38), ext4 (since Linux
3.0), Btrfs (since Linux 3.7) and tmpfs (since Linux 3.5).
tmpfs being on that list is the one I find most interesting. The filesystem itself is efficient enough to only consume as much RAM as it needs to store its contents, but making the contents sparse can potentially increase that efficiency even further.
GNU cp
Additionally, somewhere along the way GNU cp gained an understanding of sparse files. Quoting the cp(1) man page regarding its default mode, --sparse=auto:
sparse SOURCE files are detected by a crude heuristic and the corresponding DEST file is made sparse as well.
But there's also --sparse=always, which activates the file-copy equivalent of what fallocate -d does in-place:
Specify --sparse=always to create a sparse DEST file whenever the SOURCE file contains a long enough sequence of zero bytes.
I've finally been able to retire my tar cpSf - SOURCE | (cd DESTDIR && tar xpSf -) one-liner, which for 20 years was my graybeard way of copying sparse files with their sparseness preserved.
Some filesystems on Linux / UNIX have the ability to "punch holes" into an existing file. See:
LKML posting about the feature
UNIX file truncation FAQ (search for F_FREESP)
It's not very portable and not done the same way across the board; as of right now, I believe Java's IO libraries do not provide an interface for this.
If hole punching is available either via fcntl(F_FREESP) or via any other mechanism, it should be significantly faster than a copy/seek loop.
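For comparison, here is roughly what that copy/seek loop looks like in Python (block size is arbitrary; this is the slow fallback the answer is contrasting with hole punching):
BLOCK = 65536

def copy_sparse(src_path, dst_path):
    # Copy src to dst, seeking over all-zero blocks so that dst ends up sparse.
    zero = bytes(BLOCK)
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            chunk = src.read(BLOCK)
            if not chunk:
                break
            if chunk == zero[:len(chunk)]:
                dst.seek(len(chunk), 1)   # leave a hole instead of writing zeros
            else:
                dst.write(chunk)
        dst.truncate()                    # extend to the final offset so a trailing hole keeps the size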
I think you would be better off pre-allocating the whole file and maintaining a table/BitSet of the pages/sections which are occupied.
Making a file sparse would result in those sections being fragmented if they were ever re-used. Perhaps saving a few TB of disk space is not worth the performance hit of a highly fragmented file.
You can use truncate -s <size> <filename> in a Linux terminal to create a sparse file that has only metadata.
NOTE: the size is in bytes unless you append a suffix such as K, M, or G.
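A rough Python equivalent of that command, in case you need to do it from code (file name and size are examples):
# create a file with a 10 GiB apparent size; on filesystems with hole support no data blocks are allocated
with open("sparse.bin", "wb") as f:
    f.truncate(10 * 1024**3)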
According to this article, it seems there is currently no easy solution except using the FIEMAP ioctl. However, I don't know how you can turn "non-sparse" zero blocks into "sparse" ones.

What happens if there are too many files under a single directory in Linux?

If there are, say, 1,000,000 individual files (mostly around 100k in size) in a single directory, flatly (no other directories or files inside them), are there going to be any compromises in efficiency, or disadvantages in any other way?
ARG_MAX is going to take issue with that... for instance, rm -rf * (while in the directory) is going to fail with "Argument list too long". Utilities that want to do some kind of globbing (or a shell) will have some functionality break.
If that directory is available to the public (let's say via FTP or a web server) you may encounter additional problems.
The effect on any given file system depends entirely on that file system. How frequently are these files accessed, what is the file system? Remember, Linux (by default) prefers keeping recently accessed files in memory while putting processes into swap, depending on your settings. Is this directory served via http? Is Google going to see and crawl it? If so, you might need to adjust VFS cache pressure and swappiness.
Edit:
ARG_MAX is a system-wide limit on how much argument data can be passed to a program when it is executed. So let's take 'rm' and the example "rm -rf *": the shell is going to expand '*' into the list of matching file names, which in turn become the arguments to 'rm'.
The same thing is going to happen with ls and several other tools. For instance, ls foo* might break if too many files start with 'foo'.
I'd advise (no matter what fs is in use) to break it up into smaller directory chunks, just for that reason alone.
My experience with large directories on ext3 and dir_index enabled:
If you know the name of the file you want to access, there is almost no penalty
If you want to do operations that need to read in the whole directory entry (like a simple ls on that directory) it will take several minutes for the first time. Then the directory will stay in the kernel cache and there will be no penalty anymore
If the number of files gets too high, you run into ARG_MAX et al problems. That basically means that wildcarding (*) does not always work as expected anymore. This is only if you really want to perform an operation on all the files at once
Without dir_index however, you are really screwed :-D
Most distros use ext3 by default, which can use b-tree indexing for large directories.
Some distros have this dir_index feature enabled by default; in others you'd have to enable it yourself. If you enable it, there's no slowdown even for millions of files.
To see if dir_index feature is activated do (as root):
tune2fs -l /dev/sdaX | grep features
To activate dir_index feature (as root):
tune2fs -O dir_index /dev/sdaX
e2fsck -D /dev/sdaX
Replace /dev/sdaX with partition for which you want to activate it.
When you accidentally execute "ls" in that directory, or use tab completion, or want to execute "rm *", you'll be in big trouble. In addition, there may be performance issues depending on your filesystem.
It's considered good practice to group your files into directories which are named by the first 2 or 3 characters of the filenames, e.g.
aaa/
aaavnj78t93ufjw4390
aaavoj78trewrwrwrwenjk983
aaaz84390842092njk423
...
abc/
abckhr89032423
abcnjjkth29085242nw
...
...
The obvious answer is that the folder will become extremely difficult for humans to use long before any technical limit is hit (the time it takes to read the output of ls, for one; there are dozens of other reasons). Is there a good reason why you can't split it into subfolders?
Not every filesystem supports that many files.
On some of them (ext2, ext3, ext4) it's very easy to hit inode limit.
I've got a host with 10M files in a directory. (don't ask)
The filesystem is ext4.
It takes about 5 minutes just to run ls.
One limitation I've found is that my shell script to read the files (because AWS snapshot restore is a lie and files aren't present until first read) wasn't able to handle the argument list, so I needed to do two passes. First, construct a file list with find (-wholename in case you want to do partial matches):
find /path/to_dir/ -wholename '*.ldb'| tee filenames.txt
Then read the file containing the filenames and process every file (with limited parallelism):
while read -r line; do
  if test "$(jobs | wc -l)" -ge 10; then
    wait -n              # cap the number of concurrent background jobs at 10
  fi
  {
    # do something with "$line" (10x fanout)
  } &
done < filenames.txt
Posting here in case anyone finds the specific work-around useful when working with too many files.
