I have two same files in different directories with same inodes [closed] - linux

I have two identical files in different directories. When I looked at the inode numbers, they were the same for the files in both directories. Do those files in the different directories consume disk space individually?
They are not symlinks; they are hard links.

I had a similar query. The answer is long, but hopefully it will help you:
Two files can have the same inode number, but only if they are part of different partitions.
Inode numbers are only unique at the partition level, not across the whole system.
On each partition, there is a superblock. This superblock tells the system which inodes are used, which are free, and so on (I'll spare you the technical details).
Each item on the disk (so files, but also directories, FIFO pipes and special device files) has its own inode. All the inodes are stored on the disk, right beside the superblock (normally).
For instance, in the case of regular files, inodes simply contain information like the last access/modification times, the size of the file, the file permissions, the disk blocks it occupies, etc.
For directories, inodes tell the system where the blocks that contain the contents of the directory are stored, as well as the last access/modified dates and the permissions on the directory.
You can see this if you look at the size of a directory via "ls -ld dir". It is usually a multiple of the filesystem block size (typically 512 bytes to 4 kB). The contents of directory blocks are nothing more than a list of name-inode pairs (i.e. a filename plus its inode number). When you do an "ls", those contents get printed, without the need for the system to actually locate all files in the directory. If you access a file or subdirectory, the system simply looks up the inode number from the directory contents and then retrieves the inode in question, so that you can have fast access to the file/subdirectory.
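For example, on a typical ext4 system with 4 kB blocks (the path /etc is just an illustration; any directory works):
ls -ld /etc        # the size column is a multiple of the block size (4096, 8192, 12288, ...)
ls -ia /etc        # prints the inode-number / name pairs that make up the directory's contents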
So, you can immediately see that the inode information describes directly what's on a partition to the system. It is the core of the filesystem on your partition.
Unless you are using special software like LVM to make the system believe that multiple physical disks or multiple partitions act as one partition, each partition needs to know its own contents. Otherwise, you would never be able to share disks, for instance via NFS mounting (each computer that shares the disk must know its contents).
Continuing this line of thought, it is logical that inodes are unique on the (logical) partition level.
To answer your question on hard links, you first need to know what the difference is between a hard link and a soft link (or symbolic link). Let's say you want to link A to B, regardless of what A and B really are (files, directories, device files, etc.), thus creating two ways of accessing the same item on your filesystem.
Using a soft link is easy. A and B both have their own inodes and both are part of different directories.
However, A actually contains the full path and name of B. When your system tries to access A, it will see the reference to B (via the full path), locate B by following the path and then access it. Since the full filesystem path is used, soft links work across different partitions. If A is indeed a soft link to B on a different partition and B's partition is unmounted, then the link will continue to exist but will simply point to something unreachable, so you can't access A either. The same goes when B gets deleted. If A (the soft link) is deleted, B is still there, unaltered.
Hard links are a different story. As I explained, the contents of a directory are nothing more than pairs of inode numbers and names. The inodes are used to access the actual items in the directory. The names are just there for the ease-of-use, more or less. A hard link is nothing more than copying the inode number from one entry in some directory's contents into another entry. This second entry can be in a different directory's contents or even in the same directory (under a different name).
Since both directory entries have the same inode number, they point to the same item on the disk (ie the same physical file). Of course, inode numbers, as explained above, are partition-specific. So, duplicating an inode number on a different partition would not work as expected. That's why hard links cannot work across partitions.
Internally, the inodes of items such as files and directories also contain a link counter. This counter holds the number of (hard) links to the item. When you delete the item (using "rm" for instance), you internally "unlink" it (hence the term "unlink" instead of "delete" or "remove" that you see in some languages and tools, like Perl). Unlinking simply decreases the link counter in the inode and deletes the entry in the directory's contents list. If the link counter drops to 0 (the last link is deleted), then the disk blocks occupied by the item get freed. When new items are created on the disk later on, they may use the freed blocks and overwrite them. In other words, when the last link is deleted, the item becomes unreachable (as if it had been deleted from the disk). So, as long as the last link isn't deleted, the contents of the item (file/directory) are still accessible and usable via the remaining hard links.
Symbolic links are created by "ln -s", hard links via simple "ln". See ln's man pages for details.
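To make this concrete, here is a minimal shell session (file names are made up) that shows the shared inode number, the link counter, and what happens on unlink:
cd "$(mktemp -d)"                    # scratch directory
echo hello > original.txt
ln original.txt hard.txt             # hard link: same inode, link count goes to 2
ln -s original.txt soft.txt          # soft link: its own inode, stores the path "original.txt"
ls -li                               # the first column shows the inode numbers
stat -c '%n: inode %i, %h link(s)' original.txt hard.txt soft.txt
rm original.txt                      # unlink: the counter drops back to 1
cat hard.txt                         # still prints "hello"; the data is reachable via the hard link
cat soft.txt                         # fails; the path the symlink points to no longer exists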

Related

In a kernel module, how to know whether given inode belongs to a specific directory?

One possible way is to compare the given inode with the list of inodes in that directory. The list of inodes could be predetermined, or it could be computed at run time; both approaches have their own problems:
Predetermined list: the list can change during the operation, i.e. files could be added to or removed from that directory.
Run-time list: if that directory has too many files, building the list is too much overhead for each access of any file in the system.
Is there any efficient solution/way to do this? I have tried comparing the file by its path, which was really a bad idea.
Doing it in kernel mode gives you no advantage over doing it in user mode. To see whether an inode is indeed in some directory you have to read that directory, because directory entries are normally stored as a linear list. This can leave your process blocked waiting for directory blocks to be read if they are not cached, and in that time the directory contents can be modified. Keeping the directory inode locked while doing that operation would help, but it can add severe performance restrictions to your operating system. Another issue is that each filesystem is free to implement directory contents in its own format. In userland you get a uniform directory format, but in kernel mode you have to deal with the different approaches of different filesystem types. Why do you need to know that? I can't imagine a scenario where this is needed. Perhaps you can redesign your algorithm so that searching the directory contents becomes unnecessary.
By the way, dealing with complete paths or searching directories has obscure race conditions that can leave your system blocked in some way. What happens if, in the middle of your search, somebody tries to unlink the inode you are searching for, or the directory contents have to be modified, or some other process is using namei() to traverse your directory upwards or downwards? Have you thought about all these possibilities?

Can inode and crtime be used as a unique file identifier?

I have a file indexing database on Linux. Currently I use file path as an identifier.
But if a file is moved/renamed, its path is changed and I cannot match my DB record to the new file and have to delete/recreate the record. Even worse, if a directory is moved/renamed, then I have to delete/recreate records for all files and nested directories.
I would like to use the inode number as a unique file identifier, but an inode number can be reused if a file is deleted and another file is created.
So, I wonder whether I can use a pair of {inode,crtime} as a unique file identifier.
I hope to use i_crtime on ext4 and creation_time on NTFS.
In my limited testing (with ext4) inode and crtime do, indeed, remain unchanged when renaming or moving files or directories within the same file system.
So, the question is whether there are cases when inode or crtime of a file may change.
For example, can fsck, defragmentation or partition resizing change the inode or crtime of a file?
Interesting that
http://msdn.microsoft.com/en-us/library/aa363788%28VS.85%29.aspx says:
"In the NTFS file system, a file keeps the same file ID until it is deleted."
but also:
"In some cases, the file ID for a file can change over time."
So, what are those cases they mentioned?
Note that I studied similar questions:
How to determine the uniqueness of a file in linux?
Executing 'mv A B': Will the 'inode' be changed?
Best approach to detecting a move or rename to a file in Linux?
but they do not answer my question.
The pair {device_nr, inode_nr} is a unique identifier for an inode within a system.
Moving a file to a different directory does not change its inode_nr.
The Linux inotify interface enables you to monitor changes to inodes (either files or directories).
Extra notes:
Moving files across filesystems is handled differently (it is in fact copy + delete).
Networked filesystems (or a mounted NTFS) cannot always guarantee the stability of inode numbers.
Microsoft is not a Unix vendor, its documentation does not cover Unix or its filesystems, and should be ignored (except for NTFS's internals).
Extra text: the old Unix adage "everything is a file" should in fact be "everything is an inode". The inode carries all the meta-information about a file (or directory, or special file) except its name. The filename is in fact only a directory entry that happens to link to that particular inode. Moving a file means creating a new link to the same inode and deleting the old directory entry that linked to it.
The inode metadata can be obtained with the stat(), fstat() and lstat() system calls.
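For illustration, with GNU coreutils stat (the file name is a placeholder), the {device, inode} pair and the birth time asked about above can be read like this:
stat -c 'device=%d inode=%i links=%h birth=%w' somefile
# %d and %i together identify the inode within the system; %w is the creation (birth) time,
# printed as "-" on filesystems or kernels that do not expose it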
The allocation and management of i-nodes in Unix is dependent upon the filesystem. So, for each filesystem, the answer may vary.
For the Ext3 filesystem (the most popular), i-nodes are reused, and thus cannot be used as a unique file identifier, nor does reuse occur according to any predictable pattern.
In Ext3, i-nodes are tracked in a bit vector, each bit representing a single i-node number. When an i-node is freed, its bit is set to zero. When a new i-node is needed, the bit vector is searched for the first zero bit, and that i-node number (which may previously have been allocated to another file) is reused.
This may lead to the naive conclusion that the lowest-numbered available i-node will be the one reused. However, the Ext3 file system is complex and highly optimised, so no assumptions should be made about when and how i-node numbers are reused, even though they clearly will be.
From the source code for ialloc.c, where i-nodes are allocated:
There are two policies for allocating an inode. If the new inode is a
directory, then a forward search is made for a block group with both
free space and a low directory-to-inode ratio; if that fails, then of
the groups with above-average free space, that group with the fewest
directories already is chosen. For other inodes, search forward from
the parent directory's block group to find a free inode.
The source code that manages this for Ext3 is called ialloc and the definitive version is here: https://github.com/torvalds/linux/blob/master/fs/ext3/ialloc.c
I guess the DB application would need to consider the case where the file is subject to restoration from backup, which would preserve the file's crtime but not the inode number.

Is there a Linux filesystem, perhaps fuse, which gives the directory size as the size of its contents and its subdirs? [closed]

If there isn't, how feasible would it be to write one? A filesystem which, for each directory, keeps the size of its contents recursively, and which is kept up to date not by recalculating the size on every change on the filesystem, but by, for example, updating the directory size when a file is removed or grows.
I am not aware of such a file system. From the filesystem's point of view, a directory is a file.
You can use:
du -s -h <dir>
to display the total size of all the files in the directory.
From the filesystem's point of view, the size of a directory is the size of the information recording its existence, which has to be stored physically on the medium. Note that the "size" of a directory containing files that total 10 GB will be effectively the same as the "size" of an empty directory, because the information needed to record its existence takes the same storage space. That's why the total size of the files (and sockets, links and other items inside) isn't the same thing as the "directory size". Subdirectories can be mounted from various locations, including remote ones, and even recursively. In a sense, directory size is just a human notion: files are not physically "inside" directories; a directory is merely marked as a container, in exactly the same way a special file (e.g. a device file) is marked as special.

Recounting and updating a total directory size depends more on the NUMBER of items in it than on the sum of their sizes, and a modern filesystem can keep hundreds of thousands of files (if not more) "in" one directory, even without subdirectories, so summing their sizes could be quite a heavy task compared with the possible benefit of having that information. In short, when you execute e.g. the "du" (disk usage) command, or count a directory's size in Windows, having the kernel and filesystem driver do the counting wouldn't be any faster: counting is counting.
There are quota systems, which keep and update information about the total size of files owned by particular users or groups. They are, however, limited to monitoring partitions separately, as quota may be enabled or not for each particular partition. Moreover, quota usage gets updated, as you said, when a file grows or is removed, and that's why the information may become inaccurate; for this reason the quota records are rebuilt from time to time, e.g. with a cron job, by scanning all files in all directories "from scratch" on the partition on which quota is enabled.
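As a sketch (assuming the standard Linux quota tools and user/group quotas enabled on /home; the mount point is just an example), that periodic rebuild could look like:
quotacheck -vugm /home   # rescan every file on /home and rebuild the per-user/per-group usage figures
repquota -s /home        # report current usage and limits in human-readable units
# e.g. a weekly cron entry for the rebuild:
# 0 3 * * 0  /sbin/quotacheck -ugm /home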
Also note that the bottleneck of I/O operation speed (including reading information about files) is usually the speed of the medium itself, then the communication bus, and then the CPU, whereas you seem to be assuming every filesystem is as fast as a RAM FS. A RAM FS is probably the most trivial filesystem, kept virtually in RAM, which makes I/O operations go very fast. You could build one as a module and try to add the functionality you've described; you would learn many interesting things :)
FUSE stands for "Filesystem in Userspace"; filesystems implemented with FUSE are usually quite slow. They make sense when functionality in a particular case is more important than speed, e.g. you could create a pseudo-filesystem based on temperature readings from a newly bought e-thermometer you connected to your computer via USB, but they're not speed daemons, you know :)

Maximum number of files/directories on Linux?

I'm developing a LAMP online store, which will allow admins to upload multiple images for each item.
My concern is - right off the bat there will be 20000 items meaning roughly 60000 images.
Questions:
What is the maximum number of files and/or directories on Linux?
What is the usual way of handling this situation (best practice)?
My idea was to make a directory for each item, based on its unique ID, but then I'd still have 20000 directories in a main uploads directory, and it will grow indefinitely as old items won't be removed.
Thanks for any help.
ext[234] filesystems have a fixed maximum number of inodes; every file or directory requires one inode. You can see the current count and limits with df -i. For example, on a 15GB ext3 filesystem, created with the default settings:
Filesystem Inodes IUsed IFree IUse% Mounted on
/dev/xvda 1933312 134815 1798497 7% /
There's no limit on directories in particular beyond this; keep in mind that every file or directory requires at least one filesystem block (typically 4KB), though, even if it's a directory with only a single item in it.
As you can see, though, 80,000 inodes is unlikely to be a problem. And with the dir_index option (which can be enabled with tune2fs), lookups in large directories aren't much of a problem. However, note that many administrative tools (such as ls or rm) can have a hard time dealing with directories that contain too many files. As such, it's recommended to split your files up so that you don't have more than a few hundred to a thousand items in any given directory. An easy way to do this is to hash whatever ID you're using and use the first few hex digits as intermediate directories.
For example, say you have item ID 12345, and it hashes to 'DEADBEEF02842.......'. You might store your files under /storage/root/d/e/12345. You've now cut the number of files in each directory to 1/256th of what it would otherwise be.
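A small shell sketch of that layout (the ID, the choice of md5, and the /storage/root prefix are all just placeholders):
id=12345
hash=$(printf '%s' "$id" | md5sum | cut -c1-2)   # first two hex digits of the hash, e.g. "de"
dir="/storage/root/${hash:0:1}/${hash:1:1}"      # two single-character directory levels
mkdir -p "$dir"
cp item.jpg "$dir/$id.jpg"                       # at most 16*16 = 256 leaf directories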
If your server's filesystem has the dir_index feature turned on (see tune2fs(8) for details on checking and turning on the feature) then you can reasonably store upwards of 100,000 files in a directory before the performance degrades. (dir_index has been the default for new filesystems for most of the distributions for several years now, so it would only be an old filesystem that doesn't have the feature on by default.)
That said, adding another directory level to reduce the number of files in a directory by a factor of 16 or 256 would drastically improve the chances of things like ls * working without over-running the kernel's maximum argv size.
Typically, this is done by something like:
/a/a1111
/a/a1112
...
/b/b1111
...
/c/c6565
...
i.e., prepending a letter or digit to the path, based on some feature you can compute from the name. (The first two characters of the md5sum or sha1sum of the file name is one common approach, but if you have unique object IDs, then 'a' + id % 16 is an easy enough mechanism to determine which directory to use.)
60000 files is nothing, and 20000 is nothing as well. But you should group these 20000 somehow in order to speed up access to them, maybe in groups of 100 or 1000, by taking the item's number and dividing it by 100, 500, 1000, whatever (a small sketch of this follows the example paths below).
E.g., I have a project where the files have numbers. I group them in 1000s, so I have
id/1/1332
id/3/3256
id/12/12334
id/350/350934
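A minimal sketch of how such a path can be computed from a numeric ID (the file and directory names here are illustrative):
id=350934
bucket=$((id / 1000))                # integer division: 350934 -> 350
mkdir -p "id/$bucket"
mv "incoming/$id" "id/$bucket/$id"   # ends up as id/350/350934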
You actually might have a hard limit: some systems have 32-bit inode numbers, so you are limited to 2^32 inodes per file system.
In addition to the general answers (basically "don't bother that much", "tune your filesystem", and "organize your directory with subdirectories containing a few thousand files each"):
If the individual images are small (e.g. less than a few kilobytes), instead of putting them in a folder you could also put them in a database (e.g. in MySQL as a BLOB) or perhaps inside a GDBM indexed file. Then each small item won't consume an inode and at least one data block (which on many filesystems means a few kilobytes per file). You could also do that up to some threshold (e.g. put images bigger than 4 kilobytes in individual files, and smaller ones in a database or GDBM file). Of course, don't forget to back up your data (and define a backup strategy).
The year is 2014. I come back in time to add this answer.
Lots of big/small files? You can use Amazon S3 and other alternatives based on Ceph like DreamObjects, where there are no directory limits to worry about.
I hope this helps someone decide from all the alternatives.
md5($id) ==> 0123456789ABCDEF
$file_path = items/012/345/678/9AB/CDE/F.jpg
one directory level = at most 4096 (16^3) subdirectories, which keeps lookups fast

Disadvantages to creating/removing many hard links?

I need to create hundreds to thousands of temporary hard or symbolic links that will be deleted shortly after creation. For my purposes both types of links will work (i.e. the target is not a directory and it always exists on the same file system).
As I understand it, symbolic links create a small file that contains the path to the original file, whereas a hard link creates another directory entry pointing to the same inode. So if I am going to be creating/deleting thousands of these links, is it better to be creating and deleting thousands of tiny files (symlinks) or thousands of these references (hard links)? It seems like one taxes the hard drive (maybe fragmentation) while the other might tax the file system itself. Where are inode references stored? Do I risk corrupting the file system by making so many hard links? What about speed?
Thanks for your expertise!
This is a workaround to be able to use ffmpeg to encode a movie out of an arbitrary subset of images from a directory. Since ffmpeg requires that the files be named sequentially (e.g. frame%04d.jpg), I realized I can just create hard/sym links to the subset of files and name the links appropriately. This avoids renaming the original files and having to actually copy the data. It works great, but it requires creating and deleting many thousands of links, repeatedly.
Sort of addresses this problem too I believe:
convert image sequence using ffmpeg
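A sketch of that workaround (the directory names, the frame selection, and the ffmpeg options are placeholders to adapt):
mkdir -p links                      # must be on the same filesystem as the originals for hard links
i=0
for f in selected/*.jpg; do         # whatever subset of images you need, in the desired order
    ln "$f" "$(printf 'links/frame%04d.jpg' "$i")"
    i=$((i + 1))
done
ffmpeg -i 'links/frame%04d.jpg' out.mp4   # encode from the sequentially named links
rm links/frame*.jpg                       # deleting the links leaves the original images untouched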
If this activity breaks your file system, then your file system is at fault, not you. File systems are generally pretty reliable, so don't worry about that.
Both options require adding an entry in the directory. The symbolic link requires creating a file as well. When you access the file the hard link jumps directly to the content, while accessing a symlink requires finding the symlink file, reading it, finding the directory with the content, finding where the content is, and then accessing that. Therefore symlinks are more work for the filesystem all around.
But the difference is minute when compared to the work of actually reading the data in the files. Therefore I would not worry about it, and just go with whichever one best gives you the semantics you want.
Since you are not trying to create hundreds of thousands of links to the same file, hard links perform marginally better.
However, symbolic links in /tmp, if /tmp is tmpfs, perform even better still.
Oh, and symlinks are too small to cause fragmentation issues.
Both options require the addition of an entry in the directory, so the directory structure may grow by allocating new blocks.
But a symbolic link requires the allocation of an inode, and the filesystem has a limit on the number of inodes. Your hundreds of thousands of symlinks may hit that limit, and you may get the "Not enough space for file" error message even with gigabytes free.
By default, the file system creation tool chooses the maximum number of inodes according to the physical partition size. For instance, for Linux ext2/3/4, mkfs.ext3 uses a bytes-per-inode ratio that you can find in your /etc/mke2fs.conf.
For an existing filesystem, here is a command to get information about inodes:
# dumpe2fs /dev/sda1 | grep -i inode | less
Inode count: 979200
Free inodes: 742304
Inodes per group: 16320
Inode blocks per group: 510
First inode: 11
Inode size: 128
Journal inode: 8
First orphan inode: 441066
Journal backup: inode blocks
In conclusion, you should prefer hard links, mainly because of lower resource consumption on disk and in memory (VFS structures in caches).
Another piece of advice: do not create too many files in the same directory; 2,000 files is a reasonable limit to avoid performance issues.

Resources