inotify delete_self when modifying and saving a file - linux

I am running a small inotify script that sets up a watch on a file. Each time that file is edited and saved, the script notices that a DELETE_SELF event is triggered. Is that normal and if it is why? Shouldn't the inotify subsystem notice that the file still exists?

It depends on what the application that is editing the file is doing with it. In this case, it sounds like the behavior of your editor when it saves a file is to delete the old file and write the new contents as a new file with the same name. From the perspective of inotify, this is exactly what happens, so it fires a deletion event and then a creation event. Inotify cannot know that the file that was deleted and the file that was created in its place are logically related.
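A minimal sketch of the difference between the two save styles, assuming a POSIX filesystem. The helper names (`save_in_place`, `save_by_replace`) and the `.tmp` suffix are illustrative, not taken from any particular editor. Since an inotify watch on a file follows the underlying inode rather than the name, a save that swaps in a new inode looks like a deletion of the watched file.

```python
import os
import tempfile


def inode_of(path):
    """Return the inode number the name currently points to."""
    return os.stat(path).st_ino


def save_in_place(path, text):
    """Overwrite the existing file; the name keeps pointing to the same inode."""
    with open(path, "w") as f:
        f.write(text)


def save_by_replace(path, text):
    """Write a new file and rename it over the old one (hypothetical temp name)."""
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write(text)
    os.replace(tmp, path)  # the old inode loses its name here


def demo():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "watched.txt")
    save_in_place(path, "v1\n")
    original = inode_of(path)
    save_in_place(path, "v2\n")
    after_in_place = inode_of(path)
    save_by_replace(path, "v3\n")
    after_replace = inode_of(path)
    return original, after_in_place, after_replace


if __name__ == "__main__":
    original, after_in_place, after_replace = demo()
    print("in-place save kept inode:", original == after_in_place)
    print("replace save kept inode:", original == after_replace)
```

If the editor saves this way, watching the containing directory instead of the file itself is the usual way to keep seeing events for the name.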

Related

File updates between open() and first read() are not read

External Updates that happened to that file content during the time between open() and first read() are not returned in the read() content.
How can I get the latest file content from the read()?
I've tried flush() and seek(0) but didn't help.
https://repl.it/repls/RealGreedyTransfer#main.py
import time
def myfoo(handle):
    print("myfoo started", flush=True)
    time.sleep(50)
    # External updates that happen during that time don't show up in read()
    # handle.flush()
    # handle.seek(0)
    # can't close and re-open file handle
    print(handle.read())  # <-- Not reading updates done after file open
# Upstream code base passing a file handle under an exclusive fcntl.lockf() lock
handle = open('temp.txt', 'r+')
myfoo(handle)
The issue is with the way files are written. Many text editors don’t just write to the file, they use a different method: they write to a temporary file, and then rename it to the original filename. Since renames are atomic in POSIX, in the event of a system crash during saving, the old version of the file will be available, and the new version might or might not be available in the temporary file.
For most purposes, this works as desired. The only exception is in this case, where you’re holding onto a file handle. Renames/moves/deletions do not affect the file handles, they are still open with the file they were opened with, even if that file is no longer accessible from the filesystem. You can experiment with this by opening a file, then removing it with rm, and then reading from the file — it will still show you the file contents from before you deleted it. You can also access the file in Linux inside /proc/XX/fd.
Your file handle won’t see changes, unless they are actually written (and flushed) to the same file (without the rename dance). If you’re working with something that writes by renaming, you would need to reopen the file to see the new contents.
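The behavior described above can be reproduced with a short sketch, assuming a POSIX system. The function names and the `.tmp` suffix are made up for illustration. It simulates an editor that saves by renaming, shows that the already-open handle still reads the old contents, and compares inode numbers to detect that the name now refers to a different file.

```python
import os
import tempfile


def handle_is_stale(handle, path):
    """True if the name no longer refers to the file behind the open handle."""
    opened = os.fstat(handle.fileno())
    try:
        current = os.stat(path)
    except FileNotFoundError:
        return True
    return (opened.st_dev, opened.st_ino) != (current.st_dev, current.st_ino)


def demo_stale_handle():
    workdir = tempfile.mkdtemp()
    path = os.path.join(workdir, "temp.txt")
    with open(path, "w") as f:
        f.write("old contents\n")

    handle = open(path, "r+")

    # Simulate an editor that writes a temporary file and renames it over the original.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        f.write("new contents\n")
    os.replace(tmp, path)

    stale = handle_is_stale(handle, path)
    seen_by_old_handle = handle.read()  # still the old file
    handle.close()

    with open(path) as fresh:  # reopening picks up the new file
        seen_after_reopen = fresh.read()
    return stale, seen_by_old_handle, seen_after_reopen


if __name__ == "__main__":
    print(demo_stale_handle())
```

If the handle cannot be reopened, as in the question, a check like `handle_is_stale` at least tells you that the contents you are about to read belong to a file that has since been replaced.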

Copy screenlog.n file and restart log?

I'm running an application using GNU screen and logging everything with the -L flag. The screenlog.n file is being created just fine. What I would like is to copy the contents of that file to something like log_<date>, and then clear the screenlog.n file to start logging the next day. So far I have only found solutions for appending, or leaving the screenlog.n file to keep all the information.
I found that screen will create a new screenlog.n file automatically if I delete the existing one while a screen session is detached. I simply scheduled a cronjob to copy and rename the existing file and then delete it. A new screenlog.n file is created as soon as there is something new to log.
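A rough sketch of what such a cron job could run, written in Python. The function name `rotate_log`, its parameters, and the ISO date format in the destination name are assumptions for illustration; only the copy-then-delete approach comes from the answer above.

```python
import datetime
import os
import shutil


def rotate_log(log_path, dest_dir, today=None):
    """Copy log_path to dest_dir/log_<date>, then delete the original.

    Returns the destination path, or None if there was nothing to rotate.
    """
    if not os.path.exists(log_path):
        return None
    if today is None:
        today = datetime.date.today()
    dest = os.path.join(dest_dir, "log_" + today.isoformat())
    shutil.copy2(log_path, dest)  # keep a dated copy
    os.remove(log_path)           # the logger is expected to create a fresh file
    return dest


if __name__ == "__main__":
    # Hypothetical paths; adjust to wherever screen writes its log.
    print(rotate_log("screenlog.0", "."))
```

Anything logged between the copy and the delete would be lost with this approach, so it is best run at a quiet time.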

Why is a file still accessible after deleting it in unix?

I was thinking about a concurrency issue (in Solaris): what happens if someone tries to delete a file while it is being read? My question is about file existence in Solaris/Linux. Suppose I have a file test.txt open in the vi editor, and then I open a second session and remove that file. Even after deleting the file, I am still able to read it. So here are my questions:
Do I need to think about a locking mechanism while reading, so that no one is able to delete the file while it is being read?
What is the reason for the different behavior from Windows (on Windows, if a file is open in some editor, we cannot delete that file)?
After removing the file, how am I still able to read it if I haven't closed it in the vi editor?
I am asking about files in general, but platform-specific, i.e. unix. What will happen if I am using a Java program (BufferedReader) to read a file and the file is deleted while reading? Is the BufferedReader still able to read the next chunk of the file or not?
You have basically 2 or 3 unrelated questions there. Text editors like to read the whole file into memory at the start of the editing session. Imagine every character you type being saved to disk immediately, with all characters after it in the file being rewritten one place further along to make room. That would be awful. Much better that the thing you're actually editing is a memory representation of the file (array of pointers to lines, probably with some metadata attached) which only gets converted back into a linear stream when you explicitly save.
Any relatively recent version of vim will notify you if the file you are editing is deleted from its original location with the message
E211: File "filename" no longer available
This warning is not just for unix. gvim on Windows will give it to you if you delete the file being edited. It serves as a reminder that you need to save the version you're working on before you exit, if you don't want the file to be gone.
(Note: the warning doesn't appear instantly - vim only checks for the original file's existence when you bring it back into the foreground after having switched away from it.)
So that's question 1, the behavior of text editors - there's no reason for them to keep the file open for the whole session because they aren't actually using it except at startup and during a save operation.
Question 2, why do some Windows editors keep the file open and locked - I don't know, Windows people are nuts.
Question 3, the one that's actually about unix, why do open files stay accessible after they're deleted - this is the most interesting one. The answer, guaranteed to shock you when presented directly:
There is no command, function, syscall, or any other method which actually requests deletion of a file.
Underlying rm and any other command that may appear to delete a file there is the system call unlink. And it's called unlink, not remove or deletefile or anything similar, because it doesn't remove a file. It removes a link (a.k.a. directory entry) which is an association between a file and a name in a directory. (Note: ANSI C added remove as a more generic function to appease non-unix people who had no intention of implementing unix filesystem semantics, but on unix, remove is just a rmdir if the target is a directory, and unlink for everything else.)
A file can have multiple links (see the ln command for how they are created), which means that the same file is known by multiple names. If you rm one of them, the others stick around and the file is not deleted. What happens when you remove the last link? Well, now you have a file with no name. But names are only one kind of reference to a file. There are at least 2 others: file descriptors and mmap regions. When the last reference to a file goes away, that's when the file is deleted.
Since references come in several forms, there are many kinds of events that can cause a file to be deleted. Here are some examples:
unlink (rm, etc.)
close file descriptor
dup2 (can implicitly close a file descriptor before replacing it with a copy of a different file descriptor)
exec (can cause file descriptors to be closed via close-on-exec flag)
munmap (unmap memory region)
mmap (if you create a new memory map at an address that's already mapped, the old mapping is unmapped)
process death (which closes all file descriptors and unmaps all memory mappings of the process)
    normal exit
    fatal signal generated by the kernel (^C, segfault)
    fatal signal sent from another process (kill)
I won't call that a complete list. And I don't encourage anyone to try to build a complete list. Just know that rm is "remove name", not "remove file", and files go away as soon as they're not in use.
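The "remove name, not remove file" point can be seen in a small sketch, assuming a POSIX system. The function name is made up for illustration. It unlinks a file while a descriptor is still open and then reads the data back through that descriptor.

```python
import os
import tempfile


def read_after_unlink():
    fd, path = tempfile.mkstemp()
    os.write(fd, b"still here\n")

    os.unlink(path)                       # removes the directory entry only
    name_exists = os.path.exists(path)    # the name is gone
    link_count = os.fstat(fd).st_nlink    # number of names left for this file

    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)               # the contents are still reachable

    os.close(fd)                          # last reference dropped; now the file goes away
    return name_exists, link_count, data


if __name__ == "__main__":
    print(read_after_unlink())
```

The file's storage is only released at the final `os.close`, which is why the list above includes closing a descriptor as one of the events that can delete a file.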
If you want to destroy the contents of a file immediately, truncate it. All processes already using it will find that its size has suddenly become 0. (This is destruction as far as the normal file access methods are concerned. To destroy it more thoroughly so that even someone with raw disk access can't read what used to be there, you need to overwrite it. There's a tool called shred for that.)
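A small sketch of the truncation effect, assuming a POSIX system and using unbuffered `os`-level reads so that no data is served from a user-space buffer. The function name is illustrative.

```python
import os
import tempfile


def truncate_under_reader():
    fd, path = tempfile.mkstemp()
    os.write(fd, b"secret data\n")
    os.close(fd)

    reader = os.open(path, os.O_RDONLY)
    first = os.read(reader, 6)               # read part of the file

    os.truncate(path, 0)                      # someone else destroys the contents

    size_seen = os.fstat(reader).st_size      # the open descriptor sees the new size
    rest = os.read(reader, 100)               # nothing left to read

    os.close(reader)
    os.remove(path)
    return first, size_seen, rest


if __name__ == "__main__":
    print(truncate_under_reader())
```

A reader that uses buffered I/O may still hold whatever it had already pulled into its own buffer before the truncation; the file itself is empty either way.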
I think your question has nothing to do with the difference between Windows and Linux. It's about how VI works.
When you use VI to edit a file, VI creates a .swp file, and the .swp file is what you are actually editing. So if another user deletes the original file in the meantime, it will not affect your editing.
And when you type :w in VI, VI will use the .swp file to overwrite the original file.

How does the system listen for file changes?

At the OS level, how does the system know that something has changed (like a file changing)?
e.g.:
In Node, we can monitor a file and perform some action when it changes:
fs.watch(file_path, function () {
    // do something when the file has changed
});
Can someone give me a brief intuition/idea/keyword about how it really works?
One idea I can come up with is that when I hit :w in vim, it somehow invokes some system *fake_save_file* function, and inside this *fake_save_file* function it dispatches some events to somewhere else.
You might know that the kernel indexes the files in a file system by inodes. File watching is achieved by listening for changes to those inodes. On Linux, inotify does that.
Whenever you open, read, write/modify or move a file, the kernel performs operations on the file's inode. Inotify hooks into the filesystem to track these operations and report them to you.
The example you gave is somewhat incorrect. The fake_save_file is created by your text editor, vim, to store temporary changes until you, the user, actually save. When you save with :w, the editor replaces your actual_save_file with a copy of fake_save_file.
As a user, you would be watching your actual_save_file. This gets changed when you enter :w in vim, and you will be notified because vim modified its contents.
pyinotify might be what you want; please check it.
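For intuition only, here is a sketch that uses polling with `os.stat` instead of inotify. This is a different technique: inotify is pushed events by the kernel, while this code compares two snapshots taken by the caller. The function names and the category labels are made up for illustration.

```python
import os


def snapshot(path):
    """Return identifying metadata for path, or None if the name does not exist."""
    try:
        st = os.stat(path)
    except FileNotFoundError:
        return None
    return (st.st_dev, st.st_ino, st.st_size, st.st_mtime_ns)


def classify_change(before, after):
    """Describe what happened to a name between two snapshots."""
    if before == after:
        return "unchanged"
    if before is None:
        return "created"
    if after is None:
        return "deleted"
    if before[:2] != after[:2]:
        return "replaced"   # same name, different underlying file
    return "modified"       # same file, different size or mtime
```

A watcher would call `snapshot` periodically and pass consecutive results to `classify_change`. An editor that saves by renaming a new file into place shows up as "replaced" rather than "modified".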

Why does syslog stop writing logs after a log file is edited?

I am using CentOS 5.8.
I wonder why syslog stops writing logs after a certain log file is edited.
For example, after cutting a line from /var/log/messages, syslog doesn't write new log entries.
Only the old logs remain.
But if I delete the messages file and reboot the system, syslog works fine.
Is there any way to make syslogd keep writing new logs continuously after a log file is edited?
It depends how exactly a file is edited.
Remember that syslogd keeps the file open for continuous writing. If your editor writes a new file with the old name after unlink()ing or rename()ing the old file, that old file remains in use by syslogd. The new file, however, remains untouched.
syslogd can be told to use the new file by sending it the HUP signal.
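A sketch of the reopen-on-HUP pattern in a toy logger, assuming a POSIX system. This is not syslogd's actual code; the class and function names are hypothetical. It shows why writes keep going to the old file until the process reopens the path.

```python
import os
import signal


class ReopeningLog:
    """Toy logger that keeps its file open and can reopen it on request."""

    def __init__(self, path):
        self.path = path
        self._file = open(path, "a")

    def write(self, line):
        self._file.write(line + "\n")
        self._file.flush()

    def reopen(self, signum=None, frame=None):
        # Drop the old file (whatever name it has now) and open the current path.
        self._file.close()
        self._file = open(self.path, "a")

    def close(self):
        self._file.close()


def install_hup_handler(log):
    """Reopen the log file whenever the process receives SIGHUP."""
    signal.signal(signal.SIGHUP, log.reopen)
```

Until `reopen` runs, every `write` lands in the file that was open at startup, even if that file has been renamed or unlinked and a new file now sits at the same path.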
Well, that would depend on how syslogd works. I should mention that it's probably not a good idea to edit system log files, by the way :-)
You may have been caught out by one peculiarity of the way UNIX file systems can work.
If process A has a file open and is writing to it, process B can go and just delete the file (either with something like rm or, as seems likely here, in an edit session which deletes the old file and rewrites a new one).
Deletion of that original file does not destroy the data, it simply breaks the link between the file name (the directory entry) and that data.
A process that has the original file open can continue to write to it as long as it wants and, since there's no entry in the file system for it, you can't (easily) see it.
A new file with that file name may be brought into existence but process A will not be writing to it, it will be writing to the original. There will be no connection between the old and the new file.
You'll sometimes see this effect when you're running low on disk space and decide to delete a lot of files to clear it up somewhat. If processes have those files open, deleting them will not help you at all since the space is still being used (and quite possibly the usage is increasing).
You have to both delete the files and somehow convince those processes holding them open to relinquish them.
If you edited /var/log/syslog you need to restart the syslog service afterwards, because syslogd needs to open the file handle again.
