What is the OS-level handle of tempfile.mkstemp good for? - python-3.x

I use tempfile.mkstemp when I need to create files in a directory which might stay around, but I don't care about the filename. It only needs to be a name that doesn't exist yet and that has a given prefix and suffix.
One part of the documentation that I have ignored so far is:
mkstemp() returns a tuple containing an OS-level handle to an open file (as would be returned by os.open()) and the absolute pathname of that file, in that order.
What is the OS-level handle and how should one use it?
Background
I always used it like this:
import os
from tempfile import mkstemp

_, path = mkstemp(prefix=prefix, suffix=suffix, dir=dir)
with open(path, "w") as f:
    f.write(data)
# do something
os.remove(path)
It worked fine so far. However, today I wrote a small script which generates huge files and deletes them. The script aborted with the message
OSError: [Errno 28] No space left on device
When I checked, there were 80 GB free.
My suspicion is that os.remove only "marked" the files for deletion, but didn't properly remove them. My next suspicion was that I might need to close the OS-level handle before the OS can actually free that disk space.

Your suspicion is correct. The os.remove only removes the directory entry that contains the name of the file. However, the file data remains intact and continues to consume space on the disk until the last open descriptor on the file is closed. During that time normal operations on the file through existing descriptors continue to work, which means you could still use the _ descriptor to seek in, read from, or write to the file after os.remove has returned.
In fact it's common practice to immediately os.remove the file before moving on to using the descriptor to operate on the file contents. This prevents the file from being opened by any other process, and also means that the file won't be left hanging around if this program dies unexpectedly before reaching a later os.remove.
Of course that only works if you're willing and able to use the low-level descriptor for all of your operations on the file, or if you use the os.fdopen method to construct a file object on top of the descriptor and use that new object for all operations. Obviously you only want to do one of those things; mixing descriptor access and file-object access to the same underlying file can produce unexpected results.
os.fdopen(_) should also execute slightly faster than open(path), since it skips the path lookup. And in Python 3 the file object it returns supports the context manager protocol just like the one returned by open, so it can be used directly in a with construct.
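Putting those pieces together, a minimal sketch of the unlink-early pattern described above (the prefix/suffix values are placeholders; the descriptor is bound to fd instead of being discarded as _):

import os
from tempfile import mkstemp

fd, path = mkstemp(prefix="demo-", suffix=".tmp")  # placeholder prefix/suffix
os.remove(path)                  # drop the name immediately; data lives on via fd
with os.fdopen(fd, "w+") as f:   # the file object takes ownership of fd
    f.write("lots of data")
    f.seek(0)
    print(f.read())              # still readable through the descriptor
# on close the last reference is gone and the disk space is actually freed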

Related

When writing to a newly created file, can I create the directory entry only after writing is completed?

I'm writing a file that takes minutes to write. External software monitors for this file to appear, but unfortunately it doesn't watch for inotify IN_CLOSE_WRITE events; instead it periodically checks whether the file is there and then starts to process it, which will fail if the file is incomplete. I cannot fix the external software. A workaround I've been using so far is to write a temporary file and then rename it when it's finished, but this workaround complicates my workflow for reasons beyond the scope of this question¹.
Files are not directory entries. Using hardlinks, there can be multiple pointers to the same file. When I open a file for writing, both the inode and the directory entry are created immediately. Can I prevent this? Can I postpone the creation of the directory entry until the file is closed, rather than when the file is opened for writing?
Example Python-code, but the question is not specific to Python:
fp = open(dest, 'w') # currently both inode and directory entry are created here
fp.write(...)
fp.write(...)
fp.write(...)
fp.close() # I would like to create the directory entry only here
Reading everything into memory and then writing it all in one go is not a good solution, because writing will still take time and the file might not fit into memory.
I found the related question Is it possible to create an unlinked file on a selected filesystem?, but I would want to first create an anonymous/unnamed file and then name it when I'm done writing (I agree with the answer there that creating an inode is unavoidable, but that's fine; I just want to postpone naming it).
Tagging this as linux, because I suspect the answer might be different between Linux and Windows and I only need a solution on Linux.
¹Many files are produced in parallel within dask graphs, and injecting a "move as soon as finished" task in our system would be complicated, so we're really renaming 50 files when 50 files have been written, which causes delays.
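On Linux specifically, the anonymous-file-then-name approach described above can be sketched with open(2)'s O_TMPFILE flag plus linkat(2) through /proc/self/fd. This is only a sketch: it assumes Linux 3.11+, a filesystem that supports O_TMPFILE, and Python 3; the paths are placeholders:

import os

dirpath = "/path/to/dir"    # placeholder directory on the target filesystem
# create an inode in dirpath with no directory entry at all
fd = os.open(dirpath, os.O_TMPFILE | os.O_WRONLY, 0o644)
try:
    os.write(fd, b"lots of data written over many minutes ...")
    # only now give the finished file its name; with follow_symlinks=True
    # this maps to linkat(2) with AT_SYMLINK_FOLLOW on the /proc path
    os.link(f"/proc/self/fd/{fd}", os.path.join(dirpath, "dest"),
            follow_symlinks=True)
finally:
    os.close(fd)

Until the os.link call, the monitoring software cannot see the file at all, which is exactly the postponed directory entry the question asks for.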

Node.js: manipulate file like a stack

I'm envisioning an implementation in Node.js that can manipulate a file on disk as if it were a stack data structure.
Suppose the file is UTF-8 encoded plain text, each element of the stack corresponds to a '\n'-delimited line in the file, and the top of the stack points to the first line of that file. I want something that can read and write the file simultaneously.
const file = new FileAsStack("/path/to/file");
// read the first line from the file,
// also remove that line from the file.
let line = await file.pop();
To implement such an interface naively, I could simply read the whole file into memory, read from memory when .pop() is called, and write the remainder back to disk. Obviously such an approach isn't ideal. Imagine dealing with a 10 GB file: it would be both memory intensive and I/O intensive.
With fs.read() I can read just a slice of the file, so the "read" part is solved. But I have no idea how to handle the "write" part. How can I efficiently take just one line and write the rest of the file back? I hope I don't have to read every byte of that file into memory and then write it back to disk...
I vaguely remember that a file in a filesystem is just a pointer to a position on disk. Is there any way I can simply move that pointer to the start of the next line?
I need some insight into what syscalls or other mechanisms can do this efficiently, but I'm quite ignorant of low-level system details. Any help is appreciated!
What you're asking for is not something that a standard file system can do. You can't insert data into the beginning of a file in any traditional OS file system without rewriting the entire file. That's just the way they work.
Systems that absolutely need to be able to do something like that without rewriting the entire file and still use a traditional OS file system will build their own mini file system on top of the regular file system so that one virtual file consists of many pieces written to separate files or to separate blocks of a file. Then, in a system like that, you can insert data at the beginning of a virtual file without rewriting any of the existing data by writing a new block of data to disk and then updating your virtual file index (stored in some other file) to indicate that the first block of your virtual file now comes from a specific location. This file index specifies the order of the blocks of data in the file and where they come from.
Most programs that need to do something like this will instead use a database for storing records, use indexes and queries for controlling order, and let the underlying database worry about where the individual bits get stored on disk. That way you can insert data anywhere you want very efficiently and still get records back in the order you ask for.
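If the requirement that the top of the stack is the first line can be relaxed, there is also a common workaround not mentioned in the answer above: keep the top of the stack at the end of the file, so a pop is just a short tail read plus a truncate and nothing is rewritten. A sketch (in Python, like the other examples on this page, though the question is about Node.js; it assumes every line is shorter than chunk):

import os

def pop_last_line(path, chunk=4096):
    # Pop the last '\n'-delimited line; assumes lines shorter than `chunk`.
    with open(path, "r+b") as f:
        f.seek(0, os.SEEK_END)
        end = f.tell()
        if end == 0:
            return None                    # empty stack
        start = max(0, end - chunk)
        f.seek(start)
        tail = f.read(end - start)
        if tail.endswith(b"\n"):
            tail = tail[:-1]               # ignore the file's final newline
        nl = tail.rfind(b"\n")
        line_start = start + nl + 1 if nl != -1 else 0
        f.seek(line_start)
        line = f.read().rstrip(b"\n")
        f.truncate(line_start)             # cheap: no data is rewritten
        return line.decode("utf-8")

The truncate only shrinks the file's length; unlike removing the first line, it never has to shift the remaining bytes.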

How safe is it reading / copying a file which is being appended to?

If a log file has events constantly being appended to it, how safe is it to read that file (or copy it) with another process?
Unix allows concurrent reading and writing. It is totally safe to read a file while others are appending to it.
Of course it can happen that an append is still in progress when a reader reaches the end of the file; that reader will then get an incomplete version (e.g. only part of a new log entry at the end of the file). But technically this is correct, because the file really was in this state while it was being read (e.g. copied).
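One way a reader can cope with that, sketched in Python (the function name and the offset bookkeeping are illustrative, not from the answer above): consume only lines that are already '\n'-terminated and remember the offset, so a half-written last entry is simply picked up on the next pass.

def read_complete_lines(path, offset=0):
    # Return the fully '\n'-terminated lines after `offset`, plus the
    # new offset; a partial trailing line is left for the next call.
    with open(path, "rb") as f:
        f.seek(offset)
        data = f.read()
    end = data.rfind(b"\n")
    if end == -1:
        return [], offset            # no complete new line yet
    return data[:end + 1].splitlines(), offset + end + 1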
EDIT
There's more to it.
If a writer process has an open file handle, the file will stay on disk as long as this process keeps the open file handle.
If you remove the file (rm(1), unlink(2)), it will be removed from its directory only. It will stay on disk, and that writer (and everybody else who happens to have an open file handle) will still be able to read the contents of the already removed file. Only after the last process closes its file handle will the file contents be freed on the disk.
This is sometimes an issue if a process writes a large log file which is filling up the disk. If it keeps an open file handle to the log file, the system administrator cannot free this disk capacity using rm.
A typical approach then is to kill the process as well. Hence it is a good idea for a process to close the file handle of the log file after writing to it (or at least to close and reopen it from time to time).
There's more:
If a process has an open file handle on a log file, this file handle contains a position. If the log file is now emptied (truncate(1), truncate(2), open(2) for writing without append flags, : > filepath), the file's contents are indeed removed from the disk. If the process holding the open file handle then writes to the file, it will write at its old position, e.g. at an offset of several megabytes. Writing at that offset in the now-empty file fills the gap with zeros.
This is not a real problem if a sparse file can be created (typically possible on Unix file systems); otherwise it will quickly fill the disk again. But in any case it can be very confusing.
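The stale-position effect is easy to demonstrate in Python (a throwaway demo on a Unix filesystem; "demo.log" is a placeholder name):

import os

log = open("demo.log", "w")          # the long-lived writer handle
log.write("x" * 1_000_000)
log.flush()                          # the handle's position is now ~1 MB

os.truncate("demo.log", 0)           # someone empties the log (': > demo.log')

log.write("new entry\n")             # lands at the old ~1 MB position
log.close()

st = os.stat("demo.log")
print(st.st_size)                    # apparent size: back to ~1 MB
print(st.st_blocks * 512)            # actual allocation: far less if sparse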

Can POSIX/Linux unlink file entries completely race free?

POSIX famously lets processes rename and unlink file entries with no regard for the effects on others using them, whilst Windows by default raises an error if you even try to touch the timestamps of a directory which has a file handle open somewhere deep inside.
However Windows doesn't need to be so conservative. If you open all your file handles with FILE_FLAG_BACKUP_SEMANTICS and FILE_SHARE_DELETE and take care to rename files to random names just before flagging deletion, you get POSIX semantics including lack of restriction on manipulating file paths containing open file handles.
One very nifty thing Windows can do is to perform renames and deletes and hard links only using an open file descriptor, and therefore you can delete a file without having to worry about whether another process has renamed it or any of the directories in the path preceding the file's location. This facility lets you perform completely race free file deletions - once you have an open file handle to the right file, you can stop caring about what other processes are doing to the filing system, at least for deletion (which is the most important as it implicitly involves destroying data).
This raises the question of what about POSIX? On POSIX unlink() takes a path, and between retrieving the current path of a file descriptor using /proc/self/fd/x or F_GETPATH and calling unlink() someone may have changed that path, thus potentially leading to the wrong file being unlinked and data lost.
A considerably safer solution is this:
Get one of the current paths of the open file descriptor using /proc/self/fd/x or F_GETPATH etc.
Open its containing directory.
Do an fstatat() on the containing directory for the leafname of the open file descriptor, checking that the device id and inode match those of the descriptor.
If they match, do an unlinkat() to remove the leafname.
This is race safe from the parent directory upwards, though the hard link you delete may not be the one expected. However, it is not race safe if within the containing directory a third party process were to rename your file to something else and rename another file to your leafname between you checking for inode equivalence and calling the unlinkat(). Here the wrong file could be deleted, and data lost.
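In Python these steps could look roughly like the following sketch (Linux-only because of /proc/self/fd; os.stat and os.unlink accept dir_fd, which maps to fstatat(2)/unlinkat(2); the function name is illustrative):

import os

def unlink_via_parent_dir(fd):
    # Steps 1-4 above; race safe from the parent directory upwards only.
    path = os.readlink(f"/proc/self/fd/{fd}")   # one current path of fd
    parent, leaf = os.path.split(path)
    dfd = os.open(parent, os.O_RDONLY | os.O_DIRECTORY)
    try:
        st_fd = os.fstat(fd)
        st_leaf = os.stat(leaf, dir_fd=dfd, follow_symlinks=False)  # fstatat(2)
        if (st_fd.st_dev, st_fd.st_ino) != (st_leaf.st_dev, st_leaf.st_ino):
            raise OSError("directory entry no longer refers to this file")
        os.unlink(leaf, dir_fd=dfd)             # unlinkat(2)
    finally:
        os.close(dfd)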
I therefore ask the question: can POSIX, or any specific POSIX implementation such as Linux, allow programs to unlink file entries completely race free? One solution could be to unlink a file entry by open file descriptor, another could be to unlink a file entry by inode, however google has not turned up solutions for either of those. Interestingly, NTFS does let you delete by a choice of inode or GUID (yes NTFS does provide inodes, you can fetch them from the NT kernel) in addition to deletion via open file handle, but that isn't much help here.
In case this seems like too esoteric a question, this problem affects proposed Boost.AFIO where I need to determine what filing system races I can mitigate and what I cannot as part of its documented hard behaviour guarantees.
Edit: Clarified that there is no canonical current path of an open file descriptor, and that in this use case we don't care - we just want to unlink some one of the links for the file.
No replies to this question, and I have spent several days trawling through Linux source code. I believe the answer is "currently you can't unlink a file race free", so I have opened a feature request at https://bugzilla.kernel.org/show_bug.cgi?id=93441 to have Linux extend unlinkat() with the AT_EMPTY_PATH Linux extension flag. If they accept that idea, I'll mark this answer as the correct one.

using files as IPC on linux

I have one writer which creates and sometimes updates a file with some status information. The readers are implemented in lua (so I got only io.open) and possibly bash (cat, grep, whatever). I am worried about what would happen if the status information is updated (which means a complete rewrite of the file) while a reader has an open handle to it: what can happen? I have also read that if the write/read operation is below 4 KB, it is atomic; that would be perfectly fine for me, as the status info fits easily within that size. Can I make this assumption?
A read or write is atomic under 4Kbytes only for pipes, not for disk files (for which the atomic granularity may be the file system block size, usually 512 bytes).
In practice you could avoid bothering about such issues (assuming your status file is, e.g., less than 512 bytes). I believe that if the writer opens and writes that file quickly, you don't need to worry; in particular, avoid open(2)-ing the file, keeping the opened file handle around for a long time (many seconds), and only then write(2)-ing a small string into it.
If you are paranoid, but can assume that readers (like grep) open the file and read it quickly, you could write to a temporary file and rename(2) it once it has been completely written (and close(2)-d).
As Duck suggested, locking the file in both readers and writers is also a solution.
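A sketch of that write-then-rename pattern in Python (write_status is an illustrative name; the key point is that the temporary file lives in the target directory, so the rename stays on one filesystem and is atomic on POSIX):

import os
import tempfile

def write_status(path, data):
    # Readers opening `path` only ever see a complete file.
    d = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=d)    # same filesystem as the target
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())         # make sure the contents hit the disk
        os.rename(tmp, path)             # atomically replaces the old version
    except BaseException:
        os.unlink(tmp)
        raise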
I may be mistaken, in which case someone will correct me, but I don't think the external readers are going to pay any attention to whether the file is being simultaneously updated. They are going to print whatever is there (or possibly hit EOF or error out).
In any case, why not avoid the whole mess and just use file locks? Have the writer flock (or similar) and the readers check the lock. If they get the lock, they know they are OK to read.
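Sketched with Python's fcntl module ("status" is a placeholder path; the lua and bash readers would need an equivalent flock binding or the flock(1) utility):

import fcntl

# writer: take an exclusive lock, and only truncate once it is held
with open("status", "a") as f:
    fcntl.flock(f, fcntl.LOCK_EX)
    f.seek(0)
    f.truncate()
    f.write("status info\n")
# the lock is released when the file is closed

# reader: take a shared lock so a rewrite cannot happen mid-read
with open("status") as f:
    fcntl.flock(f, fcntl.LOCK_SH)
    data = f.read()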
