How to estimate a file size from header's sector start address? - linux

Suppose I have a deleted file in my unallocated space on a linux partition and i want to retrieve it.
Suppose I can get the start address of the file by examining the header.
Is there a way by which I can estimate the number of blocks to be analyzed hence (this depends on the size of the image.)

In general, Linux/Unix does not support recovering deleted files - if it is deleted, it should be gone. This is also good for security - one user should not be able to recover data in a file that was deleted by another user by creating huge empty file spanning almost all free space.
Some filesystems even support so called secure delete - that is, they can automatically wipe file blocks on delete (but this is not common).
You can try to write a utility which will open whole partition that your filesystem is mounted on (say, /dev/sda2) as one huge file and will read it and scan for remnants of your original data, but if file was fragmented (which is highly likely), chances are very small that you will be able to recover much of the data in some usable form.
Having said all that, there are some utilities which are trying to be a bit smarter than simple scan and can try to be undelete your files on Linux, like extundelete. It may work for you, but success is never guaranteed. Of course, you must be root to be able to use it.
And finally, if you want to be able to recover anything from that filesystem, you should unmount it right now, and take a backup of it using dd or pipe dd compressed through gzip to save space required.

Related

Efficiently inserting blocks into the middle of a file

I'm looking for, essentially, the ext4 equivalent of mremap().
I have a big mmap()'d file that I'm allocating arrays in, and the arrays need to grow. So I want to make the first array larger at its current location, and budge all the other arrays along in the file and the address space to make room.
If this was just anonymous memory, I could use mremap() to budge over whole pages in constant time, as long as I'm inserting a whole number of memory pages. But this is a disk-backed file, so the data needs to move in the file as well as in memory.
I don't actually want to read and then rewrite whole blocks of data to and from the physical disk. I want the data to stay on disk in the physical sectors it is in, and to induce the filesystem to adjust the file metadata to insert new sectors where I need the extra space. If I have to keep my inserts to some multiple of a filesystem-dependent disk sector size, that's fine. If I end up having to copy O(N) sector or extent references around to make room for the inserted extent, that's fine. I just don't want to have 2 gigabytes move from and back to the disk in order to insert a block in the middle of a 4 gigabyte file.
How do I accomplish an efficient insert by manipulating file metadata? Is a general API for this actually exposed in Linux? Or one that works if the filesystem happens to be e.g. ext4? Will a write() call given a source address in the memory-mapped file reduce to the sort of efficient shift I want under the right circumstances?
Is there a C or C++ API function with the semantics "copy bytes from here to there and leave the source with an undefined value" that I should be calling in case this optimization gets added to the standard library and the kernel in the future?
I've considered just always allocating new pages at the end of the file, and mapping them at the right place in memory. But then I would need to work out some way to reconstruct that series of mappings when I reload the file. Also, shrinking the data structure would be a nontrivial problem. At that point, I would be writing a database page manager.
I think I actually may have figured it out.
I went looking for "linux make a file sparse", which led me to this answer on Unix & Linux Stack Exchange which mentioned the fallocate command line tool. The fallocate tool has a --dig-holes option, which turns parts of a file that could be represented by holes into holes.
I then went looking for "fallocate dig holes" to find out how that works, and I got the fallocate man page. I noticed it also offers a way to insert a hole of some size:
-i, --insert-range
Insert a hole of length bytes from offset, shifting existing
data.
If a command line tool can do it, Linux can do it, so I dug into the source code for fallocate, which you can find on Github:
case 'i':
mode |= FALLOC_FL_INSERT_RANGE;
break;
It looks like the fallocate tool accomplishes a cheap hole insert (and a move of all the other file data) by calling the fallocate() Linux-specific function with the FALLOC_FL_INSERT_RANGE flag, added in Linux 4.1. This flag won't work on all filesystems, but it does work on ext4 and it does exactly what I want: adjust the file metadata to efficiently free up some space in the file's offset space at a certain point.
It's not immediately clear to me how this interacts with currently memory-mapped pages, but I think I can work with this.

Force rsync to compare local files byte by byte instead of checksum

I have written a Bash script to backup a folder. At the core of the script is an rsync instruction
rsync -abh --checksum /path/to/source /path/to/target
I am using --checksum because I neither want to rely on file size nor modification time to determine if the file in the source path needs to be backed up. However, most -- if not all -- of the time I run this script locally, i.e., with an external USB drive attached which contains the backup destination folder; no backup over network. Thus, there is no need for a delta transfer since both files will be read and processed entirely by the same machine. Calculating the checksums even introduces a speed down in this case. It would be better if rsync would just diff the files if they are both on stored locally.
After reading the manpage I stumbled upon the --whole-file option which seems to avoid the costly checksum calculation. The manpage also states that this is the default if source and destination are local paths.
So I am thinking to change my rsync statement to
rsync -abh /path/to/source /path/to/target
Will rsync now check local source and target files byte by byte or will it use modification time and/or size to determine if the source file needs to be backed up? I definitely do not want to rely on file size or modification times to decide if a backup should take place.
UPDATE
Notice the -b option in the rsync instruction. It means that destination files will be backed up before they are replaced. So blindly rsync'ing all files in the source folder, e.g., by supplying --ignore-times as suggested in the comments, is not an option. It would create too many duplicate files and waste storage space. Keep also in mind that I am trying to reduce backup time and workload on a local machine. Just backing up everything would defeat that purpose.
So my question could be rephrased as, is rsync capable of doing a file comparison on a byte by byte basis?
Question: is rsync capable of doing a file comparison on a byte by byte basis?
Strictly speaking, Yes:
It's a block by block comparison, but you can change the block size.
You could use --block-size=1, (but it would be unreasonably inefficient and inappropriate for basically every)
The block based rolling checksum is the default behavior over a network.
Use the --no-whole-file option to force this behavior locally. (see below)
Statement 1. Calculating the checksums even introduces a speed down in this case.
This is why it's off by default for local transfers.
Using the --checksum option forces an entire file read, as opposed to the default block-by-block delta-transfer checksum checking
Statement 2. Will rsync now check local source and target files byte by byte or
       will it use modification time and/or size to determine if the source
file        needs to be backed up?
By default it will use size & modification time.
You can use a combination of --size-only, --(no-)ignore-times, --ignore-existing and
--checksum to modify this behavior.
Statement 3. I definitely do not want to rely on file size or modification times to decide if a        backup should take place.
Then you need to use --ignore-times and/or --checksum
Statement 4. supplying --ignore-times as suggested in the comments, is not an option
Perhaps using --no-whole-file and --ignore-times is what you want then ? This forces the use of the delta-transfer algorithm, but for every file regardless of timestamp or size.
You would (in my opinion) only ever use this combination of options if it was critical to avoid meaningless writes (though it's critical that it's specifically the meaningless writes that you're trying to avoid, not the efficiency of the system, since it wouldn't actually be more efficient to do a delta-transfer for local files), and had reason to believe that files with identical modification stamps and byte size could indeed be different.
I fail to see how modification stamp and size in bytes is anything but a logical first step in identifying changed files.
If you compared the following two files:
File 1 (local) : File.bin - 79776451 bytes and modified on the 15 May 07:51
File 2 (remote): File.bin - 79776451 bytes and modified on the 15 May 07:51
The default behaviour is to skip these files. If you're not satisfied that the files should be skipped, and want them compared, you can force a block-by-block comparison and differential update of these files using --no-whole-file and --ignore-times
So the summary on this point is:
Use the default method for the most efficient backup and archive
Use --ignore-times and --no-whole-file to force delta-change (block by block checksum, transferring only differential data) if for some reason this is necessary
Use --checksum and --ignore-times to be completely paranoid and wasteful.
Statement 5. Notice the -b option in the rsync instruction. It means that destination files will be backed up before they are replaced
Yes, but this can work however you want it to, it doesn't necessarily mean a full backup every time a file is updated, and it certainly doesn't mean that a full transfer will take place at all.
You can configure rsync to:
Keep 1 or more versions of a file
Configure it with a --backup-dir to be a full incremental backup system.
Doing it this way doesn't waste space other than what is required to retain differential data. I can verify that in practise as there would not be nearly enough space on my backup drives for all of my previous versions to be full copies.
Some Supplementary Information
Why is Delta-transfer not more efficient than copying the whole file locally?
Because you're not tracking the changes to each of your files. If you actually have a delta file, you can merge just the changed bytes, but you need to know what those changed bytes are first. The only way you can know this is by reading the entire file
For example:
I modify the first byte of a 10MB file.
I use rsync with delta-transfer to sync this file
rsync immediately sees that the first byte (or byte within the first block) has changed, and proceeds (by default --inplace) to change just that block
However, rsync doesn't know it was only the first byte that's changed. It will keep checksumming until the whole file is read
For all intents and purposes:
Consider rsync a tool that conditionally performs a --checksum based on whether or not the file timestamp or size has changed. Overriding this to --checksum is essentially equivalent to --no-whole-file and --ignore-times, since both will:
Operate on every file, regardless of time and size
Read every block of the file to determine which blocks to sync.
What's the benefit then?
The whole thing is a tradeoff between transfer bandwidth, and speed / overhead.
--checksum is a good way to only ever send differences over a network
--checksum while ignoring files with the same timestamp and size is a good way to both only send differences over a network, and also maximize the speed of the entire backup operation
Interestingly, it's probably much more efficient to use --checksum as a blanket option than it would be to force a delta-transfer for every file.
There is no way to do byte-by-byte comparison of files instead of checksum, the way you are expecting it.
The way rsync works is to create two processes, sender and receiver, that create a list of files and their metadata to decide with each other, which files need to be updated. This is done even in case of local files, but in this case processes can communicate over a pipe, not over a network socket. After the list of changed files is decided, changes are sent as a delta or as whole files.
Theoretically, one could send whole files in the file list to the other to make a diff, but in practice this would be rather inefficient in many cases. Receiver would need to keep these files in the memory in case it detects the need to update the file, or otherwise the changes in files need to be re-sent. Any of the possible solutions here doesn't sound very efficient.
There is a good overview about (theoretical) mechanics of rsync: https://rsync.samba.org/how-rsync-works.html

Can file size be used to detect a partial append?

I'm thinking about ways for my application to detect a partially-written record after a program or OS crash. Since records are only ever appended to a file (never overwritten), is a crash while writing guaranteed to yield a file size that is shorter than it should be? Is this guaranteed even if the file was opened in read-write mode instead of append mode, so long as writes are always at the end of the file? This would greatly simplify crash recovery, since comparing the last record's expected size and position with the actual file size would be enough to detect a partial write.
I understand that random-access writes can be reordered by the filesystem, but I'm having trouble finding information on whether this can happen when appending. I imagine an out-of-order append would require the filesystem to create a "hole" at the tail of the (sparse) file, write blocks beyond the hole, and then fill in the blocks in between, but I'm hoping that such an approach would be so inefficient that nobody would ever implement their filesystem that way.
I suppose another problem might be a filesystem updating the directory entry's file size field before appending the new blocks to to the file, and the OS crashing in between. Does this ever happen in practice? (ext4, perhaps?) Is there a quick way to detect it? (And what happens when trying to read the unwritten blocks that should exist according to the file's size?)
Is there anything else, such as write reordering performed by a disk/flash drive, that would get in the way of using file size as a way to detect a partial append? I don't expect to be able to compensate for this sort of drive trickery in my application, but it would be good to know about.
If you want to be SURE that you're never going to lose records, you need a consistent journaling or transactional system for your files.
There is absolutely no guarantee that a write will have been fulfilled unless you either set O_DIRECT [which you probably do not want to do], or you use markers to indicate aht "this has been fully committed", that are only written when the file is closed. You can either do that in the mainfile, or, for example, have a file that records, externally, "last written record". If you open & close that file, it should be safe as long as the APP is what is crashing - if the OS crashes [or is otherwise abruptly stopped - e.g. power cut, disk unplugged, etc], all bets are off.
Write reordering and write caching is/can be done at all levels - the C library, the OS, the filesystem module and the hard disk/controller itself are all ABLE to reorder writes.

Shred: Doesn't work on Journaled FS?

Shred documentation says shred is "not guaranteed to be effective" (See bottom). So if I shred a document on my Ext3 filesystem or on a Raid, what happens? Do I shred part of the file? Does it sometimes shred the whole thing and sometimes not? Can it shred other stuff? Does it only shred the file header?
CAUTION: Note that shred relies on a very important assumption:
that the file system overwrites data in place. This is the
traditional way to do things, but many modern file system designs
do not satisfy this assumption. The following are examples of file
systems on which shred is not effective, or is not guaranteed to be
effective in all file sys‐ tem modes:
log-structured or journaled file systems, such as those supplied with AIX and Solaris (and JFS, ReiserFS, XFS, Ext3, etc.)
file systems that write redundant data and carry on even if some writes fail, such as RAID-based file systems
file systems that make snapshots, such as Network Appliance’s NFS server
file systems that cache in temporary locations, such as NFS version 3 clients
compressed file systems
In the case of ext3 file systems, the above disclaimer applies
(and shred is thus of limited effectiveness) only in data=journal
mode, which journals file data in addition to just metadata. In
both the data=ordered (default) and data=writeback modes, shred
works as usual. Ext3 journaling modes can be changed by adding
the data=something option to the mount options for a
particular file system in the /etc/fstab file, as documented in the
mount man page (man mount).
All shred does is overwrite, flush, check success, and repeat. It does absolutely nothing to find out whether overwriting a file actually results in the blocks which contained the original data being overwritten. This is because without knowing non-standard things about the underlying filesystem, it can't.
So, journaling filesystems won't overwrite the original blocks in place, because that would stop them recovering cleanly from errors where the change is half-written. If data is journaled, then each pass of shred might be written to a new location on disk, in which case nothing is shredded.
RAID filesystems (depending on the RAID mode) might not overwrite all of the copies of the original blocks. If there's redundancy, you might shred one disk but not the other(s), or you might find that different passes have affected different disks such that each disk is partly shredded.
On any filesystem, the disk hardware itself might just so happen to detect an error (or, in the case of flash, apply wear-leveling even without an error) and remap the logical block to a different physical block, such that the original is marked faulty (or unused) but never overwritten.
Compressed filesystems might not overwrite the original blocks, because the data with which shred overwrites is either random or extremely compressible on each pass, and either one might cause the file to radically change its compressed size and hence be relocated. NTFS stores small files in the MFT, and when shred rounds up the filesize to a multiple of one block, its first "overwrite" will typically cause the file to be relocated out to a new location, which will then be pointlessly shredded leaving the little MFT slot untouched.
Shred can't detect any of these conditions (unless you have a special implementation which directly addresses your fs and block driver - I don't know whether any such things actually exist). That's why it's more reliable when used on a whole disk than on a filesystem.
Shred never shreds "other stuff" in the sense of other files. In some of the cases above it shreds previously-unallocated blocks instead of the blocks which contain your data. It also doesn't shred any metadata in the filesystem (which I guess is what you mean by "file header"). The -u option does attempt to overwrite the file name, by renaming to a new name of the same length and then shortening that one character at a time down to 1 char, prior to deleting the file. You can see this in action if you specify -v too.
The other answers have already done a good job of explaining why shred may not be able to do its job properly.
This can be summarised as:
shred only works on partitions, not individual files
As explained in the other answers, if you shred a single file:
there is no guarantee the actual data is really overwritten, because the filesystem may send writes to the same file to different locations on disk
there is no guarantee the fs did not create copies of the data elsewhere
the fs might even decide to "optimize away" your writes, because you are writing the same file repeatedly (syncing is supposed to prevent this, but again: no guarantee)
But even if you know that your filesystem does not do any of the nasty things above, you also have to consider that many applications will automatically create copies of file data:
crash recovery files which word processors, editors (such as vim) etc. will write periodically
thumbnail/preview files in file managers (sometimes even for non-imagefiles)
temporary files that many applications use
So, short of checking every single binary you use to work with your data, it might have been copied right, left & center without you knowing. The only realistic way is to always shred complete partitions (or disks).
The concern is that data might exist on more than one place on the disk. When the data exists in exactly one location, then shred can deterministically "erase" that information. However, file systems that journal or other advanced file systems may write your file's data in multiple locations, temporarily, on the disk. Shred -- after the fact -- has no way of knowing about this and has no way of knowing where the data may have been temporarily written to disk. Thus, it has no way of erasing or overwriting those disk sectors.
Imagine this: You write a file to disk on a journaled file system that journals not just metadata but also the file data. The file data is temporarily written to the journal, and then written to its final location. Now you use shred on the file. The final location where the data was written can be safely overwritten with shred. However, shred would have to have some way of guaranteeing that the sectors in the journal that temporarily contained your file's contents are also overwritten to be able to promise that your file is truly not recoverable. Imagine a file system where the journal is not even in a fixed location or of a fixed length.
If you are using shred, then you're trying to ensure that there is no possible way your data could be reconstructed. The authors of shred are being honest that there are some conditions beyond their control where they cannot make this guarantee.

Recovering formatted partition

I accidentally hit 'format' in qtparted on my ext3 partition (for Ubuntu), which I was analyzing via another computer. The list of four messages it gives during the format process include "writing inode table" and "writing filesystem", or something similar.
How can I view this data? The tools I've looked at appear to either require intact inode tables or, in the case of file carvers, don't preserve directory structure (which may be impossible) and operate on a very limited set of file types. Can this data be recovered? The format operation took so little time that I suspect the data may still be there.
Take a look at the TestDisk utilities. They are designed for exactly this purpose, recovering data when the allocation table is missing/overwritten.
If you haven't used the disk since and it wasn't thoroughly overwritten during the format then there's a high chance you'll get a lot of your data back. Just dont't place any trust in its validity (a recovered file may have data that is wrong or missing, at any location).

Resources