Where are inodes stored at? - linux

I recently started learning about the Linux kernel and I just learned about inodes, which are data-structures containing meta-data of a file.
Now, how does the OS find the inode associated with a file (given, say, a path string)? Moreover, where are those inodes stored? Obviously they are stored on disk, but how is it all managed?
One naive solution I can come up with would be to allocate a region of the disk designated only for inodes. What is actually done?

It depends on the file system implementation. For example, ext2 and ext3 store the inodes ahead of the data blocks within each block group. See: The Second Extended File system (EXT2)
Remember that inodes are spread across all block groups. For example, with 32768 inodes per group, inodes 1 to 32768 are stored in block group 0, inodes 32769 to 65536 in block group 1, and so on.
So, the answer to your question is: inodes are stored in inode tables, and there is an inode table in every block group in the partition.
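The mapping from an inode number to its block group can be sketched as below. This is a minimal illustration, assuming 32768 inodes per group as in the example above; the function name `locate_inode` is made up for this sketch, and a real ext2/3/4 file system reads the per-group inode count from its superblock.

```python
# Assumption: 32768 inodes per block group, matching the example above.
# A real ext2/3/4 file system records this value in its superblock.
INODES_PER_GROUP = 32768


def locate_inode(ino, inodes_per_group=INODES_PER_GROUP):
    """Return (block_group, index_in_inode_table) for a 1-based inode number."""
    if ino < 1:
        raise ValueError("inode numbers start at 1")
    group = (ino - 1) // inodes_per_group   # which block group holds the inode
    index = (ino - 1) % inodes_per_group    # slot inside that group's inode table
    return group, index


for ino in (1, 32768, 32769, 65536, 65537):
    print(ino, locate_inode(ino))
```

Because inode numbers start at 1, the subtraction before dividing is what puts inode 32768 in group 0 and inode 32769 in group 1.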

Related

Where does Linux keep the record of free inodes?

I am curious how and where Linux (and any operating system that makes use of inodes for its file system) keeps track of the free inodes that can be used. When a new file is created, which inode does the operating system assign to it? Things get more complex as files are continuously created and deleted. How, in general, does an OS manage which inodes are free and which are used?
I would guess the inodes are structured like a free list, similar to a memory allocator. But when I look at descriptions of the inode structure, I cannot find a pointer field for the "next available inode". I think this is an important issue but, curiously, I am not able to find any literature with a definite answer.
First off, let's divorce the notion of inodes from Linux; inodes are a feature of file systems such as ext3, ext4, and UFS rather than of an OS. So where is inode information stored? The following link should answer that.
https://serverfault.com/questions/212766/where-is-the-inode-number-stored?newreg=ddf0ea8fd887447698c8f95
Regarding "free" inodes: inodes are not created until a new file or directory is created; there is no such thing as a "free" inode.
Exactly where inodes are stored is file system dependent, as is whether inodes are relevant at all.
Most traditional Unix/Linux file systems (e.g. ufs, ext2, ext3, ext4, gfs2, ocfs2, ...) do create a fixed size inode table stored on disk. This table is cached in RAM.
Exceptions are reiserfs, jfs, xfs, btrfs and zfs, which are able to dynamically allocate new inodes.
With the latter, the notion of free inodes doesn't make a lot of sense, but with the former, one can definitely run out of inodes if the sizing done at file system creation time wasn't appropriate. In addition to the full inode table, there is usually also a record of free inodes stored in the file system (in ext2/3/4, an inode bitmap in each block group).
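To see how many inodes a mounted file system has and how many are still free, one option is `os.statvfs`, which exposes the same counters that `df -i` reports. This is a small sketch; the helper name `inode_usage` is made up here, and file systems that allocate inodes dynamically may report values that are not meaningful limits.

```python
import os


def inode_usage(path="/"):
    """Return (total_inodes, free_inodes) for the file system containing path."""
    st = os.statvfs(path)
    # f_files: total number of inodes; f_ffree: number of free inodes.
    return st.f_files, st.f_ffree


total, free = inode_usage("/")
print(f"total inodes: {total}, free inodes: {free}")
```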
NTFS has something similar to inodes under the covers, and finally, some file systems, like FAT32 and ISO 9660, do not use inodes at all.
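For the fixed-table file systems described above, free-inode tracking is commonly done with a bitmap rather than a linked free list, which is why the inode structure has no "next free" pointer. The following is a toy model only: the class `InodeBitmap` and its first-fit policy are assumptions for illustration, and real allocators (for example in ext4) also try to place inodes near their parent directory.

```python
class InodeBitmap:
    """Toy free-inode tracker: one bit per inode, 1 = in use, 0 = free."""

    def __init__(self, count):
        self.count = count
        self.bits = bytearray((count + 7) // 8)

    def allocate(self):
        """Mark the lowest-numbered free inode as used and return its number."""
        for i in range(self.count):
            byte, bit = divmod(i, 8)
            if not self.bits[byte] & (1 << bit):
                self.bits[byte] |= 1 << bit
                return i + 1            # inode numbers are 1-based
        raise OSError("no free inodes")

    def free(self, ino):
        """Clear the bit for inode number ino so it can be reused."""
        byte, bit = divmod(ino - 1, 8)
        self.bits[byte] &= ~(1 << bit) & 0xFF


bitmap = InodeBitmap(4)
first_three = [bitmap.allocate() for _ in range(3)]
bitmap.free(2)                          # deleting a file releases its inode
reused = bitmap.allocate()              # the freed slot is handed out again
print(first_three, reused)
```

Creating and deleting files only flips bits, so no per-inode bookkeeping field is needed.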

What is the general sizing number for /dev partition on RHEL

Please check the below description:
Red Hat Enterprise Linux uses a naming scheme that is file-based, with file names in the form of /dev/xxyN.
Where,
xx:
The first two letters of the partition name indicate the type of device on which the partition resides, usually sd.
y:
This letter indicates which device the partition is on. For example, /dev/sda for the first hard disk, /dev/sdb for the second, and so on.
N:
The final number denotes the partition. The first four (primary or extended) partitions are numbered 1 through 4. Logical partitions start at 5. So, for example, /dev/sda3 is the third primary or extended partition on the first hard disk, and /dev/sdb6 is the second logical partition on the second hard disk.
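The /dev/xxyN scheme described above can be taken apart mechanically. This sketch handles only the simple form given here (two-letter type, one drive letter, a partition number); the function name `parse_partition` and the returned keys are made up for illustration, and other naming schemes are not covered.

```python
import re

# Matches only the /dev/xxyN form described above, e.g. /dev/sda3.
_PATTERN = re.compile(r"^/dev/([a-z]{2})([a-z])(\d+)$")


def parse_partition(name):
    """Split a /dev/xxyN name into device type, disk index and partition number."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"not in /dev/xxyN form: {name!r}")
    dev_type, letter, number = m.group(1), m.group(2), int(m.group(3))
    return {
        "type": dev_type,                        # xx: device type, usually "sd"
        "disk_index": ord(letter) - ord("a"),    # y: 0 for the first disk
        "partition": number,                     # N: partition number
        "logical": number >= 5,                  # logical partitions start at 5
    }


print(parse_partition("/dev/sda3"))
print(parse_partition("/dev/sdb6"))
```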
In Red Hat Enterprise Linux each partition is used to form part of the storage necessary to support a single set of files and directories. Mounting a partition makes its storage available starting at the specified directory (known as a mount point).
For example, if partition /dev/sda5 is mounted on /usr/, that would mean that all files and directories under /usr/ physically reside on /dev/sda5. So the file /usr/share/doc/FAQ/txt/Linux-FAQ would be stored on /dev/sda5, while the file /etc/gdm/custom.conf would not. It is also possible that one or more directories below /usr/ would be mount points for other partitions. For instance, a partition (say, /dev/sda7) could be mounted on /usr/local/, meaning that /usr/local/man/whatis would then reside on /dev/sda7 rather than /dev/sda5.
Generally speaking, the disk space for the /dev partition depends on the number and size of the partitions (both primary and logical) to be used by the operating system. However, there is no one right answer to this question. It depends on your needs and requirements.
My question is: is there any effect on the initial partition size (say, we gave 32 GB to the /dev partition while installing RHEL) if we add more hard disk capacity (say, hundreds of GB) to the /dev partition?
You don't create partitions for /dev. It lives in memory and is managed fully automatically by the kernel. /dev exists to expose kernel objects such as devices to userspace; it is transient and doesn't require backing storage on disk.
If you run ls -l /dev/sda1, you will see that the first letter in the permission block is b, meaning block device. This is a special file that, if stored on disk, would hold only two special numbers (called major and minor, usually stored together with the file permissions). When you try to open this special file, the kernel sees that it is a block device and looks up the major and minor numbers to find the matching driver that actually provides the data. Your read/write/ioctl calls are then redirected to this driver.
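The major and minor numbers mentioned above can be read from userspace with a stat call. This is a minimal sketch; the helper name `describe_device` is made up, and it uses /dev/null as the example because it is a character device that exists on any Linux system, whereas /dev/sda1 may not.

```python
import os
import stat


def describe_device(path):
    """Return (kind, major, minor) for a block or character device node."""
    st = os.stat(path)
    if stat.S_ISBLK(st.st_mode):
        kind = "block"      # shown as 'b' by ls -l
    elif stat.S_ISCHR(st.st_mode):
        kind = "char"       # shown as 'c' by ls -l
    else:
        raise ValueError(f"{path} is not a device node")
    # st_rdev packs the major and minor numbers of the device node.
    return kind, os.major(st.st_rdev), os.minor(st.st_rdev)


print(describe_device("/dev/null"))
```

On Linux, /dev/null is character device 1:3; calling the same function on a partition such as /dev/sda1, if present, reports "block" with that disk's numbers.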

How file system block size works?

As far as I know, Linux file systems use a 4 KB block size. Let's say I have 10 MB of hard disk storage. That means I have 2560 blocks available. Now let's say I copy 2560 files of 1 KB each. Each 1 KB file will occupy one block even though it does not fill the entire block.
So my entire disk is now full, yet I still have 2560 x 3 KB of unused space. If I want to store another file of, say, 1 MB, will the file system allow it? Will it write into the free space left in the individual blocks? Is there any concept that addresses this problem?
I would appreciate some clarification.
Thanks in advance.
It is true: you are in a way wasting disk space if you store a lot of files that are much smaller than the block size of the file system.
The block size is around 4 KB because of the metadata associated with each block. The smaller the block size, the more metadata about block locations there is relative to the actual data, and the more fragmented the worst case becomes.
However, there are file systems with different block sizes. Most file systems let you define the block size, and the minimum is typically 512 bytes. If you are storing a lot of very small files, a small block size might make sense.
http://www.tldp.org/LDP/sag/html/filesystems.html
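The arithmetic from the question can be checked with a few lines. This is a rough sketch that counts data blocks only and ignores metadata; the helper name `usage` is made up for illustration.

```python
KIB = 1024


def usage(file_sizes, block_size):
    """Return (blocks_used, wasted_bytes) when each file gets whole blocks."""
    blocks = 0
    wasted = 0
    for size in file_sizes:
        needed = -(-size // block_size)          # ceiling division
        blocks += needed
        wasted += needed * block_size - size     # unused tail of the last block
    return blocks, wasted


files = [1 * KIB] * 2560                         # 2560 files of 1 KB each
for block_size in (512, 1024, 4096):
    blocks, wasted = usage(files, block_size)
    print(f"block size {block_size}: {blocks} blocks, {wasted} bytes wasted")
```

With 4 KB blocks the 2560 files consume all 2560 blocks of a 10 MB disk and leave 2560 x 3 KB unused inside them; with 1 KB or 512-byte blocks nothing is wasted, at the cost of more blocks to track.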
The XFS documentation has some comments on how to select the file system block size. It is also possible to define the directory block size:
http://techpubs.sgi.com/library/tpl/cgi-bin/getdoc.cgi?coll=linux&db=bks&srch=&fname=/SGI_Admin/LX_XFS_AG/sgi_html/ch02.html
You should consider setting a logical block size for a filesystem
directory that is greater than the logical block size for the
filesystem if you are supporting an application that reads directories
(with the readdir(3C) or getdents(2) system calls) many times in
relation to how much it creates and removes files. Using a small
filesystem block size saves on disk space and on I/O throughput for
the small files.

What kernel level functions are called when we perform write in ext3 file system?

I have ext3 file system mounted and I am creating a file on it to understand how block groups are allocated.
I want to know which functions are called when I create or write a file. I know vfs_write is called, but after that I am confused about which functions are called. (do_sync_write is mentioned, but I am not sure if that's right.)
Specifically, I don't want my files to span more than 2 block groups (I am trying to limit the size; a block group contains 32768 blocks of 4 KB each, i.e. 128 MB). Also, I am new to systems programming, so any help or direction would be great.
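As a quick check of the block-group arithmetic in the question: in ext2/ext3 the block bitmap of a group occupies one block, so a group can cover at most 8 x block_size blocks. The sketch below assumes that default maximum (mke2fs can be told to use fewer blocks per group), and the helper name `block_group_geometry` is made up.

```python
def block_group_geometry(block_size):
    """Return (blocks_per_group, group_size_bytes) for the default layout."""
    # One bitmap block holds 8 bits per byte, one bit per block in the group.
    blocks_per_group = 8 * block_size
    return blocks_per_group, blocks_per_group * block_size


MIB = 1024 * 1024
GIB = 1024 * MIB

blocks, size = block_group_geometry(4096)
print(f"{blocks} blocks per group, {size // MIB} MiB per group")
print(f"1 GiB holds {GIB // 4096} blocks, i.e. {GIB // size} block groups")
```

So with 4 KB blocks, 32768 blocks correspond to one 128 MiB block group, and a 1 GiB file spans eight of them.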

VFS and FS i-node difference

What is the difference between VFS i-node and FS (e.g. EXT) i-node?
Is it the case that the EXT i-node is persistent (contains/points to data blocks), while the VFS i-node is created only in the i-node cache after the EXT i-node is read or used?
Or is the VFS i-node just an image of the FS i-node (i.e. the same thing), so that for file systems that do not work with i-nodes (e.g. FAT, NTFS) i-nodes have to be emulated (how?) to let VFS work with them as if they supported i-nodes?
You seem to have answered your questions yourself :)
Let's consider the case of EXT4:
The file system inode is stored on disk in exactly the format described by struct ext4_inode. The struct ext4_inode_info is the in-memory representation of the same inode. The VFS inode is also an in-memory object; it contains the inode information that is common to all file system types and can therefore be abstracted. It is allocated from the inode cache (a memory pool obtained from the slab allocator).
The VFS struct inode is embedded in the file-system-specific in-memory inode struct. For example, struct ext4_inode_info has a member struct inode vfs_inode. Given a VFS inode, you can get the FS-specific inode using the standard container_of macro found in the kernel code. Thus any FS can get to its own inode struct when VFS hands it the generic struct inode.
Check out what happens when a new inode is created using __ext4_new_inode()
FAT usually stores the metadata (i.e. the inode information) in the directory entry. So the Linux FAT driver just reads it and populates the necessary fields in memory. Since there is no concept of inodes in FAT, the inode number is not read from disk; it is an arbitrary unique number generated by a call to iunique(), to be precise.
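The embedding trick can be imitated outside the kernel with ctypes, purely as an illustration. The struct layouts below (`Inode`, `FsInodeInfo`) are invented stand-ins with a couple of fields each, not the real struct inode or struct ext4_inode_info, and `container_of` here is a Python function mimicking what the kernel macro does with pointer arithmetic.

```python
import ctypes


class Inode(ctypes.Structure):
    """Stand-in for the generic VFS inode (hypothetical, heavily simplified)."""
    _fields_ = [("i_ino", ctypes.c_ulong), ("i_size", ctypes.c_longlong)]


class FsInodeInfo(ctypes.Structure):
    """Stand-in for an FS-specific inode that embeds the generic one."""
    _fields_ = [("i_flags", ctypes.c_uint), ("vfs_inode", Inode)]


def container_of(member_addr, container_type, member_name):
    """Recover the enclosing struct from the address of one of its members."""
    offset = getattr(container_type, member_name).offset
    return container_type.from_address(member_addr - offset)


info = FsInodeInfo(i_flags=0x80000)
info.vfs_inode.i_ino = 12

# The generic layer only ever sees the address of the embedded inode ...
vfs_addr = ctypes.addressof(info.vfs_inode)

# ... and the FS-specific code steps back by the member offset to find its own struct.
recovered = container_of(vfs_addr, FsInodeInfo, "vfs_inode")
print(hex(recovered.i_flags), recovered.vfs_inode.i_ino)
```

No extra pointer is stored anywhere: the offset of the member inside the containing struct is a compile-time constant in C, which is what makes the lookup free.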
Some good resources on VFS:
http://www.win.tue.nl/~aeb/linux/lk/lk-8.html
http://lxr.free-electrons.com/source/Documentation/filesystems/vfs.txt
