How to check in linux kernel at vfs layer whether the file object is for a directory or a file - linux

How to check in linux kernel at vfs layer whether the file object is for a directory or a file?
I have found that there is a function called is_dx(dir) which checks for this but it is present in namei.c in ext3 or ext4. I need to do this at vfs layer that is independent of the file system.

How about the S_ISDIR() macro defined in include/linux/stat.h? It takesinode->i_mode field to check if the inode in question belongs to a directory or a file.

Having in hand the inode of the initial directory, the code
examines the entry matching the first name to get the
corresponding inode.
q Then the directory file having that node is read from disk and
the entry matching the second name is examined to derive the
corresponding inode.
q This procedure is repeated for each name included in the path.
The dentry cache considerably speeds up the procedure
File system operations are mostly done at the dcache level , so
they are all under kernel lock.

Related

How are ext4 directory entries stored in the i-nodes?

I am doing some experimentation with the internals of the ext4 file system, when I stumbled upon this issue while trying to implement reading a file by path.
The root directory i-node, number 2 as per the Kernel documentation's special i-node table, is easily found in the i-node table per the pointers in the block group descriptors and superblock.
As far as I understand it, the process of looking up a file by path is
Find the root directory i-node
Traverse it's directory entries until we find the name of the sub-directory we're looking for
Take the i-node number the directory entry we have found points to
go to (2.), repeat until we have found the file.
Read the file by parsing the extent tree
Is this correct?
If so, how are the struct ext4_dir_entrys stored/referenced from the i-node? I assume i_node.i_block[] has something to do with that, but I am not entirely clear on how to read the directory entries from there. Are they stored in the i-node? Or does the array contain pointers?

what does the author mean by directory structure in operating system?

I'm reading Operating System Concepts by Avi Silberschatz(9thE), in section 11.4 File-System Mounting, the author explains the steps of filesystem mounting as follows:
The operating system is given the
name of the device and the mount point—the location within the file structure
where the file system is to be attached.
Next, the operating system verifies that the device contains a valid file
system.
Finally, the operating
system notes in its directory structure that a file system is mounted at the
specified mount point.
I'm confused with the final step, since to the best of my knowledge, the directory structure is stored somewhere on the disk, which records the files' information -- such as name, location, size, and type. Then what does the author mean by directory structure in operating system? Is it the same directory on disk?
Additionally, which part finishes the conversion from file name to physical address on disk? Is it the disk driver or the disk controller or done by processor with memory?
What you are reading is largely nonsense. To begin with, it is eunuchs specific. Eunuchs variants tend to have a single directory structure containing all disks and even things that are not really files.
Let us assume that you are on Windoze. If you mount a disk the drive gets a name, typically a single letter but larger names are possible in some cases. Let's say you mount a disk drive, and the system assigns it to "Q:".
So now Q: is available and you can access files, by specifying something like
"Q:\dir1\dir2\file.type"
You are just accessing the directory structure that exists on Q:.
Each drive has a separate, independent directory structure.
Many operating system operate this way and your sequence above is irrelevant to them.
Eunchs variants do not work this way. The system maintains a single directory starting at "/" which is the root directory for the system. This is a directory maintained by the operating system and does not exist at all on a disk drive.
On a Mac, for instance, there is a "/Volumes" directory that contains all the drives mounted. These too are directories maintained by the operating system and do not exist at all on a disk drive.
"/Volumes/Macintosh HD"
"/Volumes/Backup Drive"
These system directories then link to the directories that are stored on those disks. Thus, in Eunuchs, there are directories maintained by the operating system and directories maintained on the disk that are merged together.
So if you want to find "/Volumes/Backup Drive/dir/something.txt" the system goes to the root "/" finds "Volumes" and determines this is a system directory. Finds "Backup Drives" and determines this is a disk drive that has been mounted. Goes to the root directory of the drive and find that "dir" is a directory on the drive, and finds the file something.txt.
To add to the confusion, there are disk formats that have no directory structure at all. But this illustrates that your book is taking you on a confusing path.
Each disk drive has a format of some kind. E.g., NTFS, ODS-11, FAT, ....
What I am telling you from here on is generalization of what typically happens but there are large variations in how it works among systems.
Typically, each drive will have a header that includes a description of block clusters in use (often a bitmaps) and files on the disk. The file description will usually have a file name, date created, owner, etc. The file description will also have information about where the data is stored on the disk.
The drive often will have a directory structure in which there is some file it defines as the root directory. The directory structure exists by creating directory files within other directory files. A directory is normally just a file that has a list of file names and the address of their description in the the disk header. Other file attributes, such as the file size and date of creation, are not stored in the directory.You get that from the file description in the disk header.
The file structure in the disk header is separate from the directory structure. In fact, it is often possible to create a file that is not even in a directory at all. Or you can put a single file in multiple directories.
If your disk gets trashed and has to be recovered, this is usually done by looking at the disk header. You get back your files but lose your directory structure.
Additionally, which part finishes the conversion from file name to physical address on disk? Is it the disk driver or the disk controller or done by processor with memory?
The logical location on the disk is specified in the file description in the disk header. The format of that information is specific to the underlying disk format. Generally you have two paths to reach the file description:
You can go through the list of file headers maintained by the disk; or
You can navigate a directory structure until you find the file name you want with a link to the file description.

Resolving symbolic links algorithm

What should the algorithm for resolving symlinks on Linux look like?
Something like:
Split path to steps /usr/bin/hello -> ['usr', 'bin', 'hello']
First resolve /usr -> /something1
Add next step and resolve /something1/bin -> /something2
Add next step and resolve /something2/hello -> /something3
Will that work?
What you are actually looking for is readlink command, that relies on POSIX realpath. Its algorithm is available here
As written in one book, the idea is this:
All path type resolution (checking) processing uses the presence or absence of a leading slash (/) to indicate whether the path is an absolute or relative path. If the slash is present, the first qualifier after the slash is compared against the MVS prefix to determine if it matches the prefix. If so, then the path type will be considered to be explicitly resolved via the prefix. If no match is found, or no slash was present, the implicit path type resolution heuristic is used.
Some details are also available here
Basically when you request an I/O, the kernel has to go through a series of steps. The kernel needs to to search directories for the requested file, this isn't a problem because the kernel always knows from where to start because the root file has a constant inode number, it's inode 2 in ext family of filesystems. The kernel then converts the filename to an inode number once it locates the filename in a directory. Because each directory is just a special kind of file which holds entries each entry with (filename, inode) fields, by searching directories the kernel will be able to locate the file's inode.
Once the kernel finds the inode of a file, this inode holds the block addresses for a regular file and thus will be used to located the data stored in that file. Block addresses of a file hold the actual data that are stored in the file. *The difference between a regular file and symlink file is that, the symlink file is a file that points to another location and thus the kernel has to perform the same series of steps twice, that is, when the inode of a symlink file is found the kernel has to redo the same operation for the filepath that the symlink file points it, it has to search in directories and find a matching filename in a directory in order to get the inode number. This obviously adds an overhead.
A recursive (a.k.a cyclic) symlink is an invalid symlink.
Not sure if I've answered your question, but that's what generally happens, you also have the VFS layer on the top and below that is the physical filesystem. Some filesystems don't even support symlinks, like vfat.

Can inode and crtime be used as a unique file identifier?

I have a file indexing database on Linux. Currently I use file path as an identifier.
But if a file is moved/renamed, its path is changed and I cannot match my DB record to the new file and have to delete/recreate the record. Even worse, if a directory is moved/renamed, then I have to delete/recreate records for all files and nested directories.
I would like to use inode number as a unique file identifier, but inode number can be reused if file is deleted and another file created.
So, I wonder whether I can use a pair of {inode,crtime} as a unique file identifier.
I hope to use i_crtime on ext4 and creation_time on NTFS.
In my limited testing (with ext4) inode and crtime do, indeed, remain unchanged when renaming or moving files or directories within the same file system.
So, the question is whether there are cases when inode or crtime of a file may change.
For example, can fsck or defragmentation or partition resizing change inode or crtime or a file?
Interesting that
http://msdn.microsoft.com/en-us/library/aa363788%28VS.85%29.aspx says:
"In the NTFS file system, a file keeps the same file ID until it is deleted."
but also:
"In some cases, the file ID for a file can change over time."
So, what are those cases they mentioned?
Note that I studied similar questions:
How to determine the uniqueness of a file in linux?
Executing 'mv A B': Will the 'inode' be changed?
Best approach to detecting a move or rename to a file in Linux?
but they do not answer my question.
{device_nr,inode_nr} are a unique identifier for an inode within a system
moving a file to a different directory does not change its inode_nr
the linux inotify interface enables you to monitor changes to inodes (either files or directories)
Extra notes:
moving files across filesystems is handled differently. (it is infact copy+delete)
networked filesystems (or a mounted NTFS) can not always guarantee the stability of inodenumbers
Microsoft is not a unix vendor, its documentation does not cover Unix or its filesystems, and should be ignored (except for NTFS's internals)
Extra text: the old Unix adagium "everything is a file" should in fact be: "everything is an inode". The inode carries all the metainformation about a file (or directory, or a special file) except the name. The filename is in fact only a directory entry that happens to link to the particular inode. Moving a file implies: creating a new link to the same inode, end deleting the old directory entry that linked to it.
The inode metatata can be obtained by the stat() and fstat() ,and lstat() system calls.
The allocation and management of i-nodes in Unix is dependent upon the filesystem. So, for each filesystem, the answer may vary.
For the Ext3 filesystem (the most popular), i-nodes are reused, and thus cannot be used as a unique file identifier, nor is does reuse occur according to any predictable pattern.
In Ext3, i-nodes are tracked in a bit vector, each bit representing a single i-node number. When an i-node is freed, it's bit is set to zero. When a new i-node is needed, the bit vector is searched for the first zero-bit and the i-node number (which may have been previously allocated to another file) is reused.
This may lead to the naive conclusion that the lowest numbered available i-node will be the one reused. However, the Ext3 file system is complex and highly optimised, so no assumptions should be made about when and how i-node numbers can be reused, even though they clearly will.
From the source code for ialloc.c, where i-nodes are allocated:
There are two policies for allocating an inode. If the new inode is a
directory, then a forward search is made for a block group with both
free space and a low directory-to-inode ratio; if that fails, then of
he groups with above-average free space, that group with the fewest
directories already is chosen. For other inodes, search forward from
the parent directory's block group to find a free inode.
The source code that manages this for Ext3 is called ialloc and the definitive version is here: https://github.com/torvalds/linux/blob/master/fs/ext3/ialloc.c
I guess the dB application would need to consider the case where the file is subject to restoration from backup, which would preserve the file crtime, but not the inode number.

Why do we need directory structure for file system?

Jos of MIT OS lesson only uses File structure to describe regular file or dir.
But linux kernel uses dentry/inode/file structure to describe files.
Is it neccessary to use dentry for file system?
In Linux, dentry is a directory entry that associates inode and file object, but it is not necessary just a directory, could represent a file. Dentry enables the hard link which allows allow multiple hard links to be created for the same file. So you can create multiple names for the same file.
Dentry cache also does matter for performance of File system. The following picture is from "Understanding the Linux Kernel, 3rd Edition" which shows interactions between processes and VFS objects.
Jos does use directory entries. It just uses the File object to store directories (they use teh same object for storing directory data and file data)

Resources