Linux File System

I am searching for a way to get all the metadata of a Linux file system (ext2/3/4). The task is to find all the files (deleted or not) present on the Linux partition. The metadata for each file should include creation time, modification time, etc. (basically what you get from the istat command).
The problem I am facing concerns the deleted files: I cannot find a way to get the inodes of deleted files still present on the file system. Kindly suggest a way to solve this for the above-mentioned file systems.
Thanks in advance.

You may find The Coroner's Toolkit to be quite useful. It includes tools that let you view any element of the metadata, view inodes directly, dump out all of the disk sectors that an inode references, dump disk sectors directly, and so on. Since you are working with the inodes and sectors directly, it does not matter whether they have been deleted: they are all accessible.
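For illustration, here is a rough sketch of such a session using The Sleuth Kit (the successor to TCT); the device name and inode number are placeholders, and exact flags and defaults may differ between tool versions, so check the man pages:

# ils /dev/sda1                          # list inode entries (unallocated/removed ones by default on most versions)
# istat /dev/sda1 49153                  # full metadata for one inode, whether or not it is still allocated
# icat /dev/sda1 49153 > recovered.bin   # dump the data blocks that inode references
# fls -r -d /dev/sda1                    # recursively list deleted file names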

Related

Knowing a file has been archived on ext4 file system

I need to find a way to determine whether a file has been archived (mainly using logrotate).
On Btrfs, the inode changes when a new file is created with the same name, but on ext4 this does not seem to be the case.
The scenario is the following: a process creates and appends to a log file at a dedicated path on an ext4 filesystem. At some point in time it is rotated by logrotate, and later a new file is created at the same path.
It seems the (inode, dev) combination is not sufficient to determine beyond doubt whether the file has been rotated.
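For reference, the (inode, dev) check in question can be expressed with stat; the path is just an example:

old_id=$(stat -c '%d %i' /var/log/myapp.log)
# ... some time later, after a possible rotation ...
new_id=$(stat -c '%d %i' /var/log/myapp.log)
[ "$old_id" != "$new_id" ] && echo "file was replaced (different device or inode)"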
Thanks for any hint.

Files "disappearing" from initramfs

On an embedded platform running Linux 2.6.36, I occasionally run into a problem where files that ARE present in our initramfs cpio file do not appear in the root file system.
I am building the initramfs from a cpio listing file (see gen_init_cpio.c), but also ran into the problem before when just using a full directory.
When I say I know the files are present in the cpio file, I mean that if I extract usr/initramfs_data.cpio.gz, the files are there.
It seems to be loosely related to the amount of content in the initramfs, but I haven't found the magic number of files and/or total storage size that causes files to start disappearing.
Is there an option in make menuconfig I'm missing that would fix this? A boot argument? Something else?
Any suggestions?
Update: To clarify, this is with a built-in ramdisk using CONFIG_INITRAMFS_SOURCE and it's compressed with gzip via setting CONFIG_INITRAMFS_COMPRESSION_GZIP. Also, this is for a mipsel-linux platform.
Update #2: I've added a printk to init/initramfs.c:clean_path and mysteriously, the previously "disappearing" files are now all there. I think this sorta seems to point to a kernel bug if attempting to log the behavior altered the behavior. I'll compare initramfs.c against a newer kernel tomorrow to see if that sheds any light on the matter.
Probably your image size is bigger than the default ramdisk size (4 MB, AFAIK). Check whether adding ramdisk_size=<value bigger than your image size> as a kernel parameter (before the root=... parameter) solves your problem. You can also try changing the kernel config value CONFIG_BLK_DEV_RAM_SIZE.
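For example, the amended kernel command line might look like this (the other parameters are placeholders and the size is in KiB; pick a value larger than your image):

console=ttyS0 ramdisk_size=16384 root=/dev/ram0 init=/sbin/init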

Integrity of two image files of different size running same os issue

I have a requirement: I have two virtual image files running a lightweight Linux distribution (e.g. SliTaz) whose disk sizes are different. I want to check the integrity of the kernel in these image files, at a given point in time, at the block/sector level.
I have already accomplished the integrity check at the file-system level, by mounting each image on a loop device, accessing the relevant kernel files (vmlinuz and initrd), hashing them, and comparing those hashes with the known-good hashes for these files.
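For reference, that file-system-level check looks roughly like this (image name and kernel paths are examples; add an offset= option if the image contains a partition table):

# mount -o loop,ro slitaz.img /mnt
# sha256sum /mnt/boot/vmlinuz* /mnt/boot/initrd*
# umount /mnt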
Now I want to perform the same check at the block level. Here is what I did:
But is there a way to check the integrity in this case?
We know that the contents at the block/sector level match for the parts that belong to the kernel in the two image files, since they are running the same Linux distro.
I am unable to get block-level information about where the kernel resides in order to check its integrity. Assuming my kernel files span more than one block, how do I get that information?
Any tool or any guidance in this is greatly appreciated.
If I understand correctly, does your question not simply become:
I need to know on which disk blocks exactly (within these file systems) the kernel files are located.
If that is the case then (depending on the file system involved) you could probably use debugfs, as described in this post.
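For example, a minimal sketch with debugfs, assuming an ext2/3/4 file system attached to /dev/loop0 (the kernel path is a placeholder):

# debugfs -R "blocks /boot/vmlinuz" /dev/loop0   # print the block numbers used by the file
# debugfs -R "stat /boot/vmlinuz" /dev/loop0     # inode details, including the block/extent list

You could then hash exactly those blocks in both images and compare the results.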

How to estimate a file size from header's sector start address?

Suppose I have a deleted file in the unallocated space of a Linux partition and I want to retrieve it.
Suppose I can get the start address of the file by examining its header.
Is there a way to estimate the number of blocks that need to be analyzed from there (this depends on the size of the image)?
In general, Linux/Unix does not support recovering deleted files: if a file is deleted, it is supposed to be gone. This is also good for security: one user should not be able to recover data from a file that was deleted by another user by creating a huge empty file spanning almost all free space.
Some filesystems even support so-called secure delete, meaning they can automatically wipe file blocks on delete (but this is not common).
You can try to write a utility which opens the whole partition that your filesystem is mounted on (say, /dev/sda2) as one huge file, reads it, and scans for remnants of your original data; but if the file was fragmented (which is highly likely), the chances of recovering much of the data in a usable form are very small.
Having said all that, there are some utilities which try to be a bit smarter than a simple scan and can attempt to undelete your files on Linux, such as extundelete. It may work for you, but success is never guaranteed. Of course, you must be root to use it.
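A typical invocation, assuming the partition is /dev/sda2 and is unmounted (success is not guaranteed, and recovered files typically land in a RECOVERED_FILES directory under the current directory):

# extundelete /dev/sda2 --restore-all
# extundelete /dev/sda2 --restore-file path/relative/to/fs/root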
And finally, if you want to be able to recover anything from that filesystem, you should unmount it right now and take a backup of it using dd, or pipe the dd output through gzip to save space.
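For example (device and output paths are placeholders):

# umount /dev/sda2
# dd if=/dev/sda2 of=/backup/sda2.img bs=4M
# dd if=/dev/sda2 bs=4M | gzip > /backup/sda2.img.gz   # compressed variant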

Disadvantages to creating/removing many hard links?

I need to create hundreds to thousands of temporary hard or symbolic links that will be deleted shortly after creation. For my purposes both types of links will work (i.e. the target is not a directory and it always exists on the same file system).
As I understand it, a symbolic link creates a small file that contains the path to the original file, whereas a hard link creates an additional directory entry pointing to the same inode as the original. So if I am going to be creating and deleting thousands of these links, is it better to be creating and deleting thousands of tiny files (symlinks) or thousands of these references (hard links)? It seems like one taxes the hard drive (maybe fragmentation) while the other might tax the file system itself. Where are inode references stored? Do I risk corrupting the file system by making so many hard links? And what about speed?
Thanks for your expertise!
This is a workaround to be able to use ffmpeg to encode a movie out of an arbitrary subset of images from a directory. Since ffmpeg requires that the files be named in sequence (e.g. frame%04d.jpg), I realized I can just create hard/symbolic links to the subset of files and name the links appropriately. This avoids renaming the original files and having to actually copy the data. It works great, but it requires creating and deleting many thousands of links, repeatedly.
Sort of addresses this problem too I believe:
convert image sequence using ffmpeg
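For context, the link-based workaround described above might look roughly like this (the file list, directories, naming pattern and frame rate are examples; use ln -s instead of ln to create symlinks):

i=0
mkdir -p links
while read -r f; do
    ln "$f" "links/frame$(printf '%04d' "$i").jpg"
    i=$((i+1))
done < selected_frames.txt
ffmpeg -framerate 25 -i 'links/frame%04d.jpg' out.mp4
rm -f links/frame*.jpg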
If this activity breaks your file system, then your file system is at fault, not you. File systems are generally pretty reliable, so don't worry about that.
Both options require adding an entry in the directory; the symbolic link requires creating a file as well. When you access the file, the hard link resolves directly to the content, while accessing a symlink requires finding the symlink file, reading it, finding the directory that holds the target, finding where the content is, and then accessing that. Therefore symlinks are more work for the filesystem all around.
But the difference is minute when compared to the work of actually reading the data in the files. Therefore I would not worry about it, and just go with whichever one best gives you the semantics you want.
Since you are not trying to create hundreds of thousands of links to the same file, hard links are marginally better performing.
However, symbolic links in /tmp, if /tmp is tmpfs, perform even better still.
Oh, and symlinks are too small to cause fragmentation issues.
Both options require adding a file entry to the directory, so the directory structure may grow by allocating new blocks.
But a symbolic link also requires the allocation of an inode, and the filesystem has a limit on the number of inodes. Your hundreds of thousands of symlinks may hit that limit, and you may get a "No space left on device" error message even with gigabytes free.
By default, the file system creation tool chooses the maximum number of inodes according to the physical partition size. For instance, for Linux ext2/3/4, mkfs.ext3 uses a bytes-per-inode ratio that you can find in your /etc/mke2fs.conf.
For an existing filesystem, here is a command to get information about inodes:
# dumpe2fs /dev/sda1 | grep -i inode | less
Inode count: 979200
Free inodes: 742304
Inodes per group: 16320
Inode blocks per group: 510
First inode: 11
Inode size: 128
Journal inode: 8
First orphan inode: 441066
Journal backup: inode blocks
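A quicker per-filesystem overview of inode usage is also available with df -i; for the filesystem above the output would look something like:

# df -i /
Filesystem      Inodes  IUsed   IFree IUse% Mounted on
/dev/sda1       979200 236896  742304   25% /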
In conclusion, you should prefer hard links, mainly for lower resource consumption on disk and in memory (VFS structures in caches).
One more piece of advice: do not create too many files in the same directory; 2,000 files is a reasonable limit to avoid performance issues.
