Determine the offset and size of another process's write - Linux

I'm working on a backup service. It tracks changes to the files in the directory to be backed up. It does that by setting a watch (using inotify on Linux) and comparing the modification time and size after a file has been changed. When a change is detected, the whole file is copied to the backup. I'm wondering: could this be done much more efficiently? If the backup service could determine the offset and the number of bytes written, it could copy just that range instead of the whole file. I've been looking at fanotify, which offers some interesting features, like an fd for the file modified (by the other process). But here it stops, I think: as far as I can see, there is no way for the process using fanotify to determine from the fd how the file was changed.
Am I overlooking something, or is it just not possible to get this information?

Related

When writing to a newly created file, can I create the directory entry only after writing is completed?

I'm writing a file that takes minutes to write. External software monitors for this file to appear, but unfortunately doesn't monitor for inotify IN_CLOSE_WRITE events, but rather checks periodically "the file is there" and then starts to process it, which will fail if the file is incomplete. I cannot fix the external software. A workaround I've been using so far is to write a temporary file and then rename it when it's finished, but this workaround complicates my workflow for reasons beyond the scope of this question¹.
Files are not directory entries. Using hardlinks, there can be multiple pointers to the same file. When I open a file for writing, both the inode and the directory entry are created immediately. Can I prevent this? Can I postpone the creation of the directory entry until the file is closed, rather than when the file is opened for writing?
Example Python-code, but the question is not specific to Python:
fp = open(dest, 'w') # currently both inode and directory entry are created here
fp.write(...)
fp.write(...)
fp.write(...)
fp.close() # I would like to create the directory entry only here
Reading everything into memory and then writing it all in one go is not a good solution, because writing will still take time and the file might not fit into memory.
I found the related question Is it possible to create an unlinked file on a selected filesystem?, but I would want to first create an anonymous/unnamed file, then naming it when I'm done writing (I agree with the answer there that creating an inode is unavoidable, but that's fine; I just want to postpone naming it).
Tagging this as linux, because I suspect the answer might be different between Linux and Windows and I only need a solution on Linux.
¹Many files are produced in parallel within dask graphs, and injecting a "move as soon as finished" task in our system would be complicated, so we're really renaming 50 files when 50 files have been written, which causes delays.
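For illustration, both approaches can be sketched in Python: the temp-file-plus-rename workaround already in use, and Linux's O_TMPFILE, which creates exactly the kind of anonymous inode asked about and gives it a name only via linkat() once writing is done. The O_TMPFILE path is an assumption based on the open(2) man page and only works on filesystems that support it (e.g. ext4, tmpfs), so this sketch falls back to rename on failure:

```python
import os
import tempfile

def write_then_name(dest: str, data: bytes) -> None:
    """Create dest's directory entry only after all data is written."""
    dirname = os.path.dirname(os.path.abspath(dest))
    try:
        _write_via_tmpfile(dirname, dest, data)  # invisible until linked
    except OSError:
        _write_via_rename(dirname, dest, data)   # portable fallback

def _write_via_tmpfile(dirname, dest, data):
    # O_TMPFILE creates an inode with no directory entry at all.
    fd = os.open(dirname, os.O_TMPFILE | os.O_WRONLY, 0o644)
    try:
        os.write(fd, data)
        # linkat() through /proc gives the anonymous file its name.
        # Passing src_dir_fd forces Python onto the linkat() code path,
        # so the /proc magic link is followed (see open(2), O_TMPFILE).
        proc = os.open("/proc/self/fd", os.O_RDONLY | os.O_DIRECTORY)
        try:
            os.link(str(fd), dest, src_dir_fd=proc)
        finally:
            os.close(proc)
    finally:
        os.close(fd)

def _write_via_rename(dirname, dest, data):
    # The workaround described above: temp name, then atomic rename.
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        os.write(fd, data)
    finally:
        os.close(fd)
    os.replace(tmp, dest)  # atomic on POSIX
```

With the O_TMPFILE branch, the external software can never observe a partially written file, because the directory entry only appears after the last write.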

Force rsync to compare local files byte by byte instead of checksum

I have written a Bash script to backup a folder. At the core of the script is an rsync instruction
rsync -abh --checksum /path/to/source /path/to/target
I am using --checksum because I want to rely on neither file size nor modification time to determine if the file in the source path needs to be backed up. However, most -- if not all -- of the time I run this script locally, i.e., with an external USB drive attached which contains the backup destination folder; no backup over the network. Thus, there is no need for a delta transfer, since both files will be read and processed entirely by the same machine. Calculating the checksums actually introduces a slowdown in this case. It would be better if rsync would just diff the files when they are both stored locally.
After reading the manpage I stumbled upon the --whole-file option which seems to avoid the costly checksum calculation. The manpage also states that this is the default if source and destination are local paths.
So I am thinking to change my rsync statement to
rsync -abh /path/to/source /path/to/target
Will rsync now check local source and target files byte by byte or will it use modification time and/or size to determine if the source file needs to be backed up? I definitely do not want to rely on file size or modification times to decide if a backup should take place.
UPDATE
Notice the -b option in the rsync instruction. It means that destination files will be backed up before they are replaced. So blindly rsync'ing all files in the source folder, e.g., by supplying --ignore-times as suggested in the comments, is not an option. It would create too many duplicate files and waste storage space. Keep also in mind that I am trying to reduce backup time and workload on a local machine. Just backing up everything would defeat that purpose.
So my question could be rephrased as, is rsync capable of doing a file comparison on a byte by byte basis?
Question: is rsync capable of doing a file comparison on a byte by byte basis?
Strictly speaking, Yes:
It's a block by block comparison, but you can change the block size.
You could use --block-size=1, but it would be unreasonably inefficient and inappropriate for basically every use case.
The block based rolling checksum is the default behavior over a network.
Use the --no-whole-file option to force this behavior locally. (see below)
Statement 1. Calculating the checksums actually introduces a slowdown in this case.
This is why it's off by default for local transfers.
Using the --checksum option forces an entire file read, as opposed to the default block-by-block delta-transfer checksum checking.
Statement 2. Will rsync now check local source and target files byte by byte, or will it use modification time and/or size to determine if the source file needs to be backed up?
By default it will use size & modification time.
You can use a combination of --size-only, --(no-)ignore-times, --ignore-existing and
--checksum to modify this behavior.
Statement 3. I definitely do not want to rely on file size or modification times to decide if a backup should take place.
Then you need to use --ignore-times and/or --checksum.
Statement 4. supplying --ignore-times as suggested in the comments, is not an option
Perhaps using --no-whole-file with --ignore-times is what you want, then? This forces the use of the delta-transfer algorithm, but for every file, regardless of timestamp or size.
In my opinion, you would only ever use this combination of options if you had reason to believe that files with identical modification stamps and byte sizes could indeed differ, and if it were critical to avoid meaningless writes. (Note that it must specifically be the meaningless writes you are trying to avoid, not overall efficiency, since a delta transfer is not actually more efficient for local files.)
I fail to see how modification stamp and size in bytes are anything but a logical first step in identifying changed files.
If you compared the following two files:
File 1 (local) : File.bin - 79776451 bytes and modified on the 15 May 07:51
File 2 (remote): File.bin - 79776451 bytes and modified on the 15 May 07:51
The default behaviour is to skip these files. If you're not satisfied that the files should be skipped, and want them compared, you can force a block-by-block comparison and differential update of these files using --no-whole-file and --ignore-times
So the summary on this point is:
Use the default method for the most efficient backup and archive
Use --ignore-times and --no-whole-file to force delta-change (block by block checksum, transferring only differential data) if for some reason this is necessary
Use --checksum and --ignore-times to be completely paranoid and wasteful.
Statement 5. Notice the -b option in the rsync instruction. It means that destination files will be backed up before they are replaced
Yes, but this can work however you want it to: it doesn't necessarily mean a full backup every time a file is updated, and it certainly doesn't mean that a full transfer has to take place.
You can configure rsync to:
Keep 1 or more versions of a file
Use --backup-dir to act as a full incremental backup system.
Doing it this way doesn't waste space beyond what is required to retain the differential data. I can verify that in practice, as there would not be nearly enough space on my backup drives for all of my previous versions to be full copies.
Some Supplementary Information
Why is Delta-transfer not more efficient than copying the whole file locally?
Because you're not tracking the changes to each of your files. If you actually had a delta file, you could merge just the changed bytes, but you need to know what those changed bytes are first. The only way to know that is by reading the entire file.
For example:
I modify the first byte of a 10MB file.
I use rsync with delta-transfer to sync this file
rsync immediately sees that the first byte (or a byte within the first block) has changed, and proceeds to update just that block (directly, if --inplace is given; otherwise via a temporary copy of the file)
However, rsync doesn't know that only the first byte changed. It will keep checksumming until the whole file has been read.
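To make the point concrete, here is a simplified Python illustration of block-by-block checksum comparison (not rsync's actual algorithm, which uses a rolling weak checksum plus a stronger hash so it can also match blocks that have shifted offset):

```python
import hashlib

BLOCK = 4096  # illustrative block size; rsync chooses its own

def block_sums(data: bytes) -> list:
    """Checksum every fixed-size block of a file's contents."""
    return [hashlib.md5(data[i:i + BLOCK]).digest()
            for i in range(0, len(data), BLOCK)]

def changed_blocks(old: bytes, new: bytes) -> list:
    """Indices of blocks whose checksums differ.

    Every block of both inputs is read and hashed, which is why a
    local delta transfer still reads the whole file.  (Assumes equal
    lengths for simplicity.)"""
    old_sums, new_sums = block_sums(old), block_sums(new)
    return [i for i, (a, b) in enumerate(zip(old_sums, new_sums))
            if a != b]
```

Flipping the first byte of a file marks only block 0 as changed, yet every block of both files still has to be hashed, which is exactly why a local delta transfer saves no read I/O.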
For all intents and purposes:
Consider rsync a tool that conditionally performs a --checksum based on whether or not the file timestamp or size has changed. Overriding this to --checksum is essentially equivalent to --no-whole-file and --ignore-times, since both will:
Operate on every file, regardless of time and size
Read every block of the file to determine which blocks to sync.
What's the benefit then?
The whole thing is a tradeoff between transfer bandwidth, and speed / overhead.
--checksum is a good way to only ever send differences over a network
--checksum while ignoring files with the same timestamp and size is a good way to both only send differences over a network, and also maximize the speed of the entire backup operation
Interestingly, it's probably much more efficient to use --checksum as a blanket option than it would be to force a delta-transfer for every file.
There is no way to do byte-by-byte comparison of files instead of checksum, the way you are expecting it.
The way rsync works is to create two processes, a sender and a receiver, which exchange a list of files and their metadata to decide which files need to be updated. This is done even in the case of local files, but then the two processes communicate over a pipe instead of a network socket. After the list of changed files has been decided, changes are sent as deltas or as whole files.
Theoretically, one process could send whole files to the other to diff, but in practice this would be rather inefficient in many cases: the receiver would need to keep these files in memory in case it detects the need to update a file, or otherwise the changed files would have to be re-sent. None of the possible solutions here sounds very efficient.
There is a good overview of the (theoretical) mechanics of rsync: https://rsync.samba.org/how-rsync-works.html

Create a cache file which is automatically deleted when partition is nearly full in libc (all file systems) or ext4?

When writing software which needs to cache data on disk, is there a way in libc, or a way which is specific to a certain file system (such as ext4), to create a file and flag it as suitable to be deleted automatically (by the kernel) if the partition becomes almost full?
There’s something similar for memory pages: madvise(…, MADV_FREE).
Some systems achieve this by writing a daemon which monitors the partition fullness, and which manually deletes certain pre-determined paths once it exceeds a certain fill level. I’d like to avoid this if possible, as it’s not very scalable: each application would have to notify the daemon of new cache paths as they are created, which may be frequently. If this were in-kernel, a single flag could be held on each inode indicating whether it’s a cache file.
Having a standardised daemon for this would be acceptable as well. At the moment it seems like different major systems integrators all invent their own.
You can use a crontab job that looks for a specific file extension and deletes matching files. You can even filter based on time and keep the files created in the last n minutes.
If you are OK with this, let me know and I will add more details here.
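The suggested cron cleanup could be sketched like this (the .cache suffix and the 30-minute threshold are placeholder values, not anything from the question):

```python
import os
import time

def prune_cache(root: str, suffix: str = ".cache",
                keep_minutes: int = 30) -> list:
    """Delete files under root with the given suffix that were not
    modified within the last keep_minutes; return the removed paths."""
    cutoff = time.time() - keep_minutes * 60
    removed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not name.endswith(suffix):
                continue
            path = os.path.join(dirpath, name)
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return removed
```

You would then schedule it from crontab, e.g. every 10 minutes. Note this is age-based, not fullness-based, so it only approximates the behavior the question asks for.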

How to detect a file has been deleted

I am writing a program to monitor the file system, but I'm not able to detect when a file is deleted. I tried monitoring with the FAN_MARK_ONLYDIR flag, hoping fanotify would raise an event when a file in a monitored dir is deleted; no results.
Is it even possible to do this using fanotify? Is there something that could help me do this?
According to a linuxquestions.org thread, fanotify doesn't detect file replacement or deletion, nor subdirectory creation, renaming, or deletion. Also see the baach.de discussion, which compares (or mentions) inotify, dnotify, fam, fanotify, tripwire, Python-fuse, and llfuse (python) among other file or directory change monitors.
inotify supports IN_DELETE and IN_DELETE_SELF events, and if you are working with a limited number of directories rather than an entire filesystem, it is practical to use.
Edit: Among inotify limitations or caveats mentioned in its webpage are the following:
inotify monitoring of directories is not recursive: to monitor subdirectories under a directory, additional watches must be created. This can take a significant amount of time for large directory trees. ... If monitoring an entire directory subtree, and a new subdirectory is created in that tree, be aware that by the time you create a watch for the new subdirectory, new files may already have been created in the subdirectory. Therefore, you may want to scan the contents of the subdirectory immediately after adding the watch.
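Since Python's standard library has no inotify bindings, a minimal sketch has to go through ctypes (Linux-only; the constants are copied from <sys/inotify.h>, and loading libc.so.6 assumes glibc):

```python
import ctypes
import os
import struct

IN_DELETE = 0x00000200  # file deleted inside a watched directory

libc = ctypes.CDLL("libc.so.6", use_errno=True)

def watch_deletions(directory: str, trigger) -> list:
    """Watch directory for deletions, call trigger() to perform them,
    then return the names of deleted entries from the queued events."""
    fd = libc.inotify_init()
    if fd < 0:
        raise OSError(ctypes.get_errno(), os.strerror(ctypes.get_errno()))
    try:
        wd = libc.inotify_add_watch(fd, directory.encode(), IN_DELETE)
        if wd < 0:
            raise OSError(ctypes.get_errno(), os.strerror(ctypes.get_errno()))
        trigger()  # deletions happen while the watch is active
        buf = os.read(fd, 4096)  # events are already queued; no blocking
        names = []
        offset = 0
        # struct inotify_event: int wd; uint32 mask, cookie, len; char name[]
        header = "iIII"
        while offset < len(buf):
            _wd, mask, _cookie, name_len = struct.unpack_from(header, buf, offset)
            offset += struct.calcsize(header)
            name = buf[offset:offset + name_len].rstrip(b"\0").decode()
            offset += name_len
            if mask & IN_DELETE:
                names.append(name)
        return names
    finally:
        os.close(fd)
```

Here the events are already queued by the time os.read() is called, so it does not block; a real monitor would loop with select() or poll() instead.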

How to estimate a file size from header's sector start address?

Suppose I have a deleted file in unallocated space on a Linux partition and I want to retrieve it.
Suppose I can get the start address of the file by examining the header.
Is there a way to estimate the number of blocks that need to be analyzed from there (this depends on the size of the image)?
In general, Linux/Unix does not support recovering deleted files - if a file is deleted, it is meant to be gone. This is also good for security - one user should not be able to recover data from a file deleted by another user by creating a huge empty file spanning almost all free space.
Some filesystems even support so-called secure delete - that is, they can automatically wipe file blocks on delete (but this is not common).
You can try to write a utility which opens the whole partition that your filesystem is mounted on (say, /dev/sda2) as one huge file, reads it, and scans for remnants of your original data. But if the file was fragmented (which is highly likely), chances are very small that you will be able to recover much of the data in a usable form.
Having said all that, there are some utilities which try to be a bit smarter than a simple scan and can attempt to undelete your files on Linux, like extundelete. It may work for you, but success is never guaranteed. Of course, you must be root to be able to use it.
And finally, if you want to be able to recover anything from that filesystem, you should unmount it right now and take a backup of it using dd, or pipe dd's output through gzip to save the space required.
