Why is root directory always stored in inode two? - linux

I'm learning about Linux filesystems, with these sources:
http://linuxgazette.net/issue21/ext2.html
http://homepage.smc.edu/morgan_david/cs40/analyze-ext2.htm
But I have one question about the root directory: why is its inode number always two? Why not one, or another number?

The first inode number is 1. 0 is used as a NULL value, to indicate that there is no inode. Inode 1 is used to keep track of any bad blocks on the disk; it is essentially a hidden file containing the bad blocks, so that they will not be used by another file. The bad blocks can be recorded using e2fsck -c. The filesystem root directory is inode 2.
The meaning of particular inode numbers differs by filesystem. For ext4 you can find more information on the Ext4 Wiki Ext4 Disk Layout page; in particular see the "Special inodes" table.
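As a rough illustration, the reserved numbers from that "Special inodes" table can be written down as a lookup table and compared with what stat reports for /. This is only a sketch: the dictionary and the function name describe_root are made up for this example, and the check only says something meaningful when / lives on an ext2/3/4 filesystem (inside containers or on other filesystems the root inode is often some other number).

```python
import os

# Reserved inode numbers as listed in the ext4 "Special inodes" table.
EXT4_SPECIAL_INODES = {
    0: "does not exist (there is no inode 0)",
    1: "list of defective blocks",
    2: "root directory",
    3: "user quota",
    4: "group quota",
    5: "boot loader",
    6: "undelete directory",
    7: "reserved group descriptors",
    8: "journal",
    9: "exclude inode (snapshots)",
    10: "replica inode",
    11: "first non-reserved inode, usually lost+found",
}

def describe_root(path="/"):
    """Return the inode number of `path` and its meaning in the ext4 table, if any."""
    ino = os.stat(path).st_ino
    return ino, EXT4_SPECIAL_INODES.get(ino, "not an ext4 reserved number")

print(describe_root("/"))
```

On an ext4 root this would be expected to report inode 2; any other value usually just means the root is not an ext filesystem.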

Related

How are ext4 directory entries stored in the i-nodes?

I was experimenting with the internals of the ext4 file system when I ran into this issue while trying to implement reading a file by path.
The root directory i-node, number 2 as per the Kernel documentation's special i-node table, is easily found in the i-node table per the pointers in the block group descriptors and superblock.
As far as I understand it, the process of looking up a file by path is:
1. Find the root directory i-node
2. Traverse its directory entries until we find the name of the sub-directory we're looking for
3. Take the i-node number that the directory entry we have found points to
4. Go to (2.) and repeat until we have found the file
5. Read the file by parsing the extent tree
Is this correct?
If so, how are the struct ext4_dir_entrys stored/referenced from the i-node? I assume i_node.i_block[] has something to do with that, but I am not entirely clear on how to read the directory entries from there. Are they stored in the i-node? Or does the array contain pointers?
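For what it's worth, in the common case the entries are not stored in the i-node itself: i_block holds the root of the extent tree (or the legacy block map), and that leads to ordinary data blocks which contain the directory entries packed back to back. The sketch below parses one such block, assuming a linear (non-hashed) directory block that has already been read from disk and the struct ext4_dir_entry_2 layout. The helper make_entry and the sample data are invented for the demo, and inline-data and htree index blocks are not handled.

```python
import struct

# On-disk header of struct ext4_dir_entry_2 (little-endian):
#   __le32 inode; __le16 rec_len; __u8 name_len; __u8 file_type; char name[]
_HEADER = struct.Struct("<IHBB")

def parse_dir_block(block):
    """Yield (inode, file_type, name) for each used entry in one linear directory block."""
    offset = 0
    while offset + _HEADER.size <= len(block):
        inode, rec_len, name_len, file_type = _HEADER.unpack_from(block, offset)
        if rec_len < _HEADER.size:
            break  # corrupt or zeroed entry: stop instead of looping forever
        if inode != 0:  # inode 0 marks an unused entry
            start = offset + _HEADER.size
            name = block[start:start + name_len]
            yield inode, file_type, name.decode("utf-8", "replace")
        offset += rec_len  # rec_len is the distance to the next entry

def make_entry(inode, name, file_type, rec_len=None):
    """Hypothetical helper: build one raw entry for the demo below."""
    raw = name.encode()
    size = (_HEADER.size + len(raw) + 3) & ~3  # entries are padded to 4 bytes
    rec_len = rec_len or size
    return (_HEADER.pack(inode, rec_len, len(raw), file_type) + raw).ljust(rec_len, b"\0")

# A made-up 1024-byte block; file_type 2 means "directory".
demo = make_entry(2, ".", 2) + make_entry(2, "..", 2) + make_entry(12, "etc", 2, rec_len=1000)
print(list(parse_dir_block(demo)))
```

The last entry in a block normally has a rec_len that stretches to the end of the block, which is why the loop advances by rec_len rather than by the name length.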

How to prove that a directory is a file in Linux

"Everything is a file in Linux". How can I prove that directories are represented as files in Linux? Physical hardware devices are also represented as files. How can I demonstrate this concept to someone with supporting examples?
In other words: viewing directories and physical hardware as files in Linux (a proof of concept).
The "Everything is a file in Linux" statement is a bit of an oversimplification. There are many things in Linux that appear as files, but don't quite 'act' as you think they would in a conventional sense.
Block files (e.g. /dev/loop0) are a great example of this as they are used as a way of communicating with device drivers.
That said, directories are their own 'special' kind of file that contain inode ids pointing to a file's inode. I suppose a simple 'proof' of sorts would be to ls -l any directory and you will notice that most (if not all) of them will have a listed file size of 4096 bytes rather than listing the collective size of its contents.
4096 bytes is the default block size for most filesystems and is usually more than enough to hold all the information (names and inode ids) of a directory. So rather than giving direct access to its files, a directory holds metadata about them.
Alternatively, using stat on any directory will display its own inode number (as well as the number of links it has).
EDIT: Directory files contain the inode id (a pointer to a file's inode) not the inode itself. I have edited the answer.
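The stat suggestion can also be tried from a script. Here is a minimal sketch (the function name describe is made up for this example) that reports the file type character, inode number, link count and size for any path, using a freshly created temporary directory as the subject:

```python
import os
import stat
import tempfile

def describe(path):
    """Return the basic inode metadata that `stat` would show for `path`."""
    st = os.lstat(path)
    return {
        "type": stat.filemode(st.st_mode)[0],  # 'd' directory, '-' regular, 'c'/'b' device
        "inode": st.st_ino,
        "links": st.st_nlink,
        "size": st.st_size,
    }

with tempfile.TemporaryDirectory() as d:
    print(describe(d))
```

Running the same function on a device node such as /dev/null, where one exists, should show a different type character but the same kind of metadata, which is about as close to a "proof" as userspace gets.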

Getting the root device in a kernel module

I did some web searches for this, but could only find results about getting the kernel module associated with a device node. Is there any way I can get the major and minor numbers of the current system's root device and, if applicable, the root device's parent device (e.g., /dev/sda is the "parent" of /dev/sda2)? Does the kernel export some functions for getting this or would I need to get it indirectly?
There is no module associated with a device node. Note that the root directory is local to each process: the process structure stores an inode reference for its root directory (which can be changed with the privileged chroot(2) system call) and another for its current working directory (used to resolve paths that do not begin with /).
If you want to know the device responsible for the root directory, you have two options:
Your process has not made a chroot(2) syscall, so you can opendir("/") and then do an fstat(2) on it (or you can do a stat(2) syscall on the "/" directory). This gives the device on which the root directory resides in the st_dev field of the returned struct stat. It is formatted as a dev_t number, in which some of the bits represent the major number and some the minor number. In kernel code you can use the MKDEV(ma,mi), MAJOR(dev) and MINOR(dev) macros defined in <linux/kdev_t.h> to access the major and minor numbers; in userspace the equivalents are makedev(), major() and minor() from <sys/sysmacros.h>. To get the physical disk for classic sd (SCSI/SATA) devices, which reserve 16 minor numbers per disk, mask the minor number with 0xf0 and you will get the minor number of the whole disk. Other drivers lay out their minor numbers differently, so this is not a general rule.
Your process has made a chroot(2) syscall, so you are not allowed to access the real root directory of the system. If you have access to the /proc filesystem, you can probably call the mount(1) command to get the mount table. You can search that table for the / entry and then get the /dev/sd<disk> entry. Once you have the device, getting the parent device is easy: you can mask the number as in the previous point to get the minor number of the physical disk.
You can also read the /proc/diskstats file, which shows the statistics of each block device. You'll get the major number, minor number and device name in the first three fields of each line.
NOTE
There are some disk arrangements that don't allow partitioning, such as RAID devices or volume manager disks. In those cases, getting to the physical disk (or disks, as there can be more than one) is more difficult.
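The first option can be sketched from userspace as follows. This does not answer the kernel-module side of the question; it only illustrates the st_dev route and the /proc/diskstats lookup described above. The function names device_of and diskstats_name are invented for this example.

```python
import os

def device_of(path="/"):
    """Return (major, minor) of the device holding `path`, taken from st_dev."""
    dev = os.stat(path).st_dev
    return os.major(dev), os.minor(dev)

def diskstats_name(major, minor, table="/proc/diskstats"):
    """Look up the device name for (major, minor); None if absent or unreadable."""
    try:
        with open(table) as f:
            for line in f:
                fields = line.split()
                # The first three fields are: major, minor, device name.
                if len(fields) >= 3 and (int(fields[0]), int(fields[1])) == (major, minor):
                    return fields[2]
    except OSError:
        pass
    return None

major, minor = device_of("/")
print(major, minor, diskstats_name(major, minor))
```

If the root is on overlayfs, btrfs or a similar filesystem, st_dev is typically an anonymous device number that has no line in /proc/diskstats, in which case the lookup returns None.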

Can inode and crtime be used as a unique file identifier?

I have a file indexing database on Linux. Currently I use file path as an identifier.
But if a file is moved/renamed, its path is changed and I cannot match my DB record to the new file and have to delete/recreate the record. Even worse, if a directory is moved/renamed, then I have to delete/recreate records for all files and nested directories.
I would like to use inode number as a unique file identifier, but inode number can be reused if file is deleted and another file created.
So, I wonder whether I can use a pair of {inode,crtime} as a unique file identifier.
I hope to use i_crtime on ext4 and creation_time on NTFS.
In my limited testing (with ext4) inode and crtime do, indeed, remain unchanged when renaming or moving files or directories within the same file system.
So, the question is whether there are cases when inode or crtime of a file may change.
For example, can fsck or defragmentation or partition resizing change inode or crtime of a file?
Interesting that
http://msdn.microsoft.com/en-us/library/aa363788%28VS.85%29.aspx says:
"In the NTFS file system, a file keeps the same file ID until it is deleted."
but also:
"In some cases, the file ID for a file can change over time."
So, what are those cases they mentioned?
Note that I studied similar questions:
How to determine the uniqueness of a file in linux?
Executing 'mv A B': Will the 'inode' be changed?
Best approach to detecting a move or rename to a file in Linux?
but they do not answer my question.
{device_nr,inode_nr} are a unique identifier for an inode within a system
moving a file to a different directory does not change its inode_nr
the linux inotify interface enables you to monitor changes to inodes (either files or directories)
Extra notes:
moving files across filesystems is handled differently. (it is in fact copy+delete)
networked filesystems (or a mounted NTFS) can not always guarantee the stability of inode numbers
Microsoft is not a unix vendor, its documentation does not cover Unix or its filesystems, and should be ignored (except for NTFS's internals)
Extra text: the old Unix adage "everything is a file" should in fact be: "everything is an inode". The inode carries all the metainformation about a file (or directory, or a special file) except the name. The filename is in fact only a directory entry that happens to link to the particular inode. Moving a file implies: creating a new link to the same inode, and deleting the old directory entry that linked to it.
The inode metadata can be obtained by the stat(), fstat() and lstat() system calls.
The allocation and management of i-nodes in Unix is dependent upon the filesystem. So, for each filesystem, the answer may vary.
For the Ext3 filesystem (the most popular), i-nodes are reused, and thus cannot be used as a unique file identifier, nor does reuse occur according to any predictable pattern.
In Ext3, i-nodes are tracked in a bit vector, each bit representing a single i-node number. When an i-node is freed, its bit is set to zero. When a new i-node is needed, the bit vector is searched for the first zero-bit and the i-node number (which may have been previously allocated to another file) is reused.
This may lead to the naive conclusion that the lowest numbered available i-node will be the one reused. However, the Ext3 file system is complex and highly optimised, so no assumptions should be made about when and how i-node numbers can be reused, even though they clearly will.
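To make the reuse concrete, here is a deliberately naive toy model of a bit vector with first-fit allocation. It is not how Ext3 actually chooses an i-node (the real policy depends on block groups and directories), and the class InodeBitmap is invented for this illustration; it only shows why a freed number can come back attached to a different file.

```python
class InodeBitmap:
    """Toy model: entry i is True when inode number i + 1 is in use."""

    def __init__(self, count):
        self.bits = [False] * count

    def allocate(self):
        # First-fit: take the lowest-numbered free inode.
        for i, used in enumerate(self.bits):
            if not used:
                self.bits[i] = True
                return i + 1
        raise OSError("no free inodes")

    def free(self, ino):
        self.bits[ino - 1] = False

bm = InodeBitmap(8)
a, b, c = bm.allocate(), bm.allocate(), bm.allocate()
bm.free(b)             # the file that owned inode b is deleted
reused = bm.allocate()  # a brand-new file receives the same number
print(a, b, c, reused)  # prints: 1 2 3 2
```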
From the source code for ialloc.c, where i-nodes are allocated:
There are two policies for allocating an inode. If the new inode is a
directory, then a forward search is made for a block group with both
free space and a low directory-to-inode ratio; if that fails, then of
the groups with above-average free space, that group with the fewest
directories already is chosen. For other inodes, search forward from
the parent directory's block group to find a free inode.
The source code that manages this for Ext3 is called ialloc and the definitive version is here: https://github.com/torvalds/linux/blob/master/fs/ext3/ialloc.c
I guess the DB application would need to consider the case where the file is subject to restoration from backup, which would preserve the file crtime, but not the inode number.
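The part of this that is easy to check from a script is that a rename within one filesystem keeps the {device, inode} pair. Below is a small sketch using a temporary directory; file_id is a name made up for this example, and the check deliberately leaves crtime out, since it only demonstrates the device and inode half of the proposed identifier.

```python
import os
import tempfile

def file_id(path):
    """Return the (device, inode) pair that identifies the inode behind `path`."""
    st = os.stat(path)
    return st.st_dev, st.st_ino

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "a.txt")
    new = os.path.join(d, "sub", "b.txt")
    with open(old, "w") as f:
        f.write("data")
    os.mkdir(os.path.join(d, "sub"))

    before = file_id(old)
    os.rename(old, new)  # same filesystem: only the directory entries change
    after = file_id(new)
    print(before == after)
```

A move across filesystems, or a restore from backup as mentioned above, would be expected to produce a different pair.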

creating inode while creating pipe, fifo or socket

I have a general question about Linux. Will an inode be created if I create a FIFO? A pipe? A socket?
On Linux the answer can be obtained from /proc/<PID>/fd directory. To quote /proc documentation ( man 5 proc ):
For file descriptors for pipes and sockets, the entries will be
symbolic links whose content is the file type with the inode. A
readlink(2) call on this file returns a string in the format:
type:[inode]
For example, socket:[2248868] will be a socket and its inode is
2248868. For sockets, that inode can be used to find more information in one of the files under /proc/net/.
Let's verify that:
$ bash -c 'true | ls -l /proc/self/fd/0'
lr-x------ 1 user user 64 Sep 13 03:58 /proc/self/fd/0 -> 'pipe:[54741]'
So will pipes and sockets have an inode? Yes! What about FIFOs? We can guess that since they have a filename, they do have an inode (and I don't think directory entries without an inode can exist). But let's verify:
$ mkfifo foobar.fifo
$ ls -i foobar.fifo
1093642 foobar.fifo
The answer is "yes, FIFOs have inodes, too".
However, this raises an important question: inodes are properties of filesystems, and inode numbers aren't unique across filesystems, so which filesystem is being referenced when we see a pipe inode? It turns out there is a virtual filesystem, pipefs, which is mounted in kernel space rather than userspace. Anonymous pipes live there (sockets have an analogous sockfs), so the inode number you see in the /proc example is a property of those virtual filesystems rather than of the filesystem you have on disk. And yes, anonymous pipes and anonymous sockets won't have an inode on a disk filesystem, because there's no filename and no bytes on disk (although there may be caching of data, and in fact old Unixes cached pipes to disk). FIFOs and Unix-domain sockets, however, have a filename on the filesystem, so in the foobar.fifo example that inode belongs to the disk filesystem.
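The same check can be scripted. In this sketch (pipe_info is a name invented for the example) an anonymous pipe is created, fstat is asked for its inode number, and, where /proc is available, the result is compared with the symlink under /proc/self/fd:

```python
import os
import stat

def pipe_info():
    """Create an anonymous pipe and report (is_fifo, inode, /proc link target or None)."""
    r, w = os.pipe()
    try:
        st = os.fstat(r)
        link = f"/proc/self/fd/{r}"
        # On Linux with /proc mounted this link reads like 'pipe:[54741]'.
        target = os.readlink(link) if os.path.islink(link) else None
        return stat.S_ISFIFO(st.st_mode), st.st_ino, target
    finally:
        os.close(r)
        os.close(w)

print(pipe_info())
```

The number inside the brackets should match the st_ino value, even though no file with that inode exists on any disk filesystem.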
See also:
How pipes work in Linux
What is the difference between “Redirection” and “Pipe”?
No on-disk inode will be created for an anonymous pipe or an unnamed socket, because neither of these lives as an entity in a mounted disk filesystem (they don't have a file path). They are reached only through file descriptors, and any inode number reported for them belongs to an in-kernel virtual filesystem.
However, for named pipes (aka FIFOs) an inode is created on the filesystem, as a named pipe lives there as a filesystem entity.
