File access methods in Linux

I read in textbooks that there are mainly two file access methods: sequential and direct. Which one are we using in Linux?
In the read call we specify how many bytes to read and into which buffer. So do we have sequential access in Linux?
But physically, files are stored in blocks? I couldn't relate the two.
Is direct access possible in Linux?
I read about these access methods in Operating System Concepts by Galvin.

Both are possible.
When you do a read on an ordinary file, it does read the file sequentially, advancing the file pointer each time by the right amount.
But you can also use seek to move to an arbitrary point in the file.
Not all files support random/direct access. Pipes for instance are typically only sequential access (you can't rewind or skip forward).
So pretty much everything is possible, but some file types have restrictions.
(File access with direct I/O (O_DIRECT flag) is a different concept altogether.)
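As a concrete illustration, here is a minimal sketch of the two styles on an ordinary file, using Python's os module as a thin wrapper over the underlying read/lseek system calls (the file path is just a placeholder):

```python
import os

fd = os.open("/tmp/example.dat", os.O_RDONLY)  # placeholder path

# Sequential access: each read starts where the previous one stopped,
# because the kernel advances the file offset by the number of bytes returned.
first = os.read(fd, 64)
second = os.read(fd, 64)

# Direct (random) access: move the offset explicitly, then read from there.
os.lseek(fd, 4096, os.SEEK_SET)
block = os.read(fd, 64)  # returns b"" if the file is shorter than 4096 bytes

os.close(fd)
```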

You can certainly read/write from an arbitrary position in an open (disc) file.
There are a number of methods of doing random IO, which are optimised for different kinds of usage.
The simplest method is seek() followed by read() or write(). The file pointer moves on by the number of bytes read/written, which allows sequential IO to follow a random jump. Think of seek() as logically spinning an old "reel-to-reel" tape drive (even though we don't have those any more).
The pread and pwrite system calls combine seek() and read()/write(), specifically for use in multithreaded programs (where two syscalls would result in a race condition). They don't change the file pointer, so you can think of them as just taking or putting a random bit of data.
mmap() maps a file into memory, where you can then do with it what you will, using conventional pointer/memory manipulation (for example, memset, memcpy, etc.).
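A short sketch of the last two methods, again through Python's os and mmap modules (the path is a placeholder, and the file is assumed to be non-empty so that it can be mapped):

```python
import mmap
import os

fd = os.open("/tmp/example.dat", os.O_RDONLY)  # placeholder path

# pread: read 64 bytes at offset 4096 without touching the file pointer,
# so several threads can share the descriptor without racing on the offset.
chunk = os.pread(fd, 64, 4096)

# mmap: map the whole file (length 0) and use ordinary slicing on it.
m = mmap.mmap(fd, 0, prot=mmap.PROT_READ)
header = m[:16]
m.close()
os.close(fd)
```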

Related

If I private-`mmap` a file and read it, then another process writes to the same file, will another read at the same location return the same value?

(Context: I'm trying to establish which sequences of mmap operations are safe from the "memory safety" point of view, i.e. what assumptions I can make about mmaped memory without risking security bugs as a consequence of undefined behaviour, or miscompiles due to compilers making incorrect assumptions about how memory could behave. I'm currently working on Linux but am hoping to port the program to other operating systems in the future, so although I'm primarily interested in Linux, answers about how other operating systems behave would also be appreciated.)
Suppose I map a portion of a file into memory using mmap with MAP_PRIVATE. Now, assuming that the file doesn't change while I have it mapped, if I access part of the returned memory, I'll be given information from the file at that offset; and (because I used MAP_PRIVATE) if I write to the returned memory, my writes will persist in my process's memory but will have no effect on the underlying file.
However, I'm interested in what will happen if the file does change while I have it mapped (because some other process also has the file open and is writing to it). There are several cases that I know the answers to already:
If I map the file with MAP_SHARED, then if any other process writes to the file via a shared mmap, my own process's memory will also be updated. (This is the intended behaviour of MAP_SHARED, as one of its intended purposes is for shared-memory concurrency.) It's less clear what will happen if another process writes to the file via other means, but I'm not interested in that case.
If the following sequence of events occurs:
I map the file with MAP_PRIVATE;
A portion of the file I haven't accessed yet is written by another process;
I read that portion of the file via my mapping;
then, at least on Linux, the read might return either the old value or the new value:
It is unspecified whether changes made to the file after the mmap() call are visible in the mapped region.
— man 2 mmap on Linux
(This case – which is not the case I'm asking about – is covered in this existing StackOverflow question.)
I also checked the POSIX definition of mmap, but (unless I missed it) it doesn't seem to cover this case at all, leaving it unclear whether all POSIX systems would act the same way.
Linux's behaviour makes sense here: at the time of the access, the kernel might have already mapped the requested part of the file into memory, in which case it doesn't want to change the portion that's already there, but it might need to load it from disk, in which case it will see any new value that may have been written to the file since it was opened. So there are performance reasons to use the new value in some cases and the old value in other cases.
If the following sequence of events occurs:
I map the file with MAP_PRIVATE;
I write to a memory address within the file mapping;
Another process changes that part of the file;
then although I don't know this for certain, I think it's very likely that the rule is that the memory address in question continues to reflect the old value that was written by my process. The reason is that the kernel needs to maintain two copies of that part of the file anyway: the values as seen by my process (which, because it used MAP_PRIVATE, can write to its view of the file without changing the underlying file), and the values that are actually in the file on disk. Writes by other processes obviously need to change the second copy, so it would be bizarre to also change the first; doing so would make the interface less usable, would come at a performance cost, and would have no advantages.
There is one sequence of events, though, where I don't know what happens (and for which the behaviour is hard to determine experimentally, given the number of possible factors that might be relevant):
I map the file with MAP_PRIVATE;
I read some portion of the file via the mapping, without writing;
Another process changes part of the file that I just read;
I read the same portion of the file via the mapping, again.
In this situation, am I guaranteed to read the same data twice? Or is it possible to read the old data the first time and the new data the second time?
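For reference, this is roughly the experiment I have in mind, as a Python sketch (the scratch path is made up, and a second descriptor in the same process stands in for the "other process", which may or may not be an adequate substitute):

```python
import mmap
import os

path = "/tmp/mmap_private_demo"  # made-up scratch file
with open(path, "wb") as f:
    f.write(b"A" * 4096)

fd = os.open(path, os.O_RDONLY)
m = mmap.mmap(fd, 4096, flags=mmap.MAP_PRIVATE, prot=mmap.PROT_READ)

first = m[:16]                    # step 2: read through the private mapping

wfd = os.open(path, os.O_WRONLY)  # step 3: "another process" overwrites that region
os.pwrite(wfd, b"B" * 16, 0)
os.close(wfd)

second = m[:16]                   # step 4: read the same region again
print(first == second)            # whether this is guaranteed True is the question

m.close()
os.close(fd)
```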

Python 3: working effectively with remote binary data

I have some remote devices that are accessed via sftp using paramiko over (usually) a cellular connection. I need to do two things to some binary files on them:
Read them byte-by-byte to enter the contents into a database
Write the entire file out to a networked drive.
I have working code for each of these things, but I need to do both in the most reasonably efficient way.
I could pass around the file_handle, read byte by byte to do the database entry, then do file_handle.seek(0) and pass that to other_file_handle.write. But I'm a little concerned about the flakiness of the cellular connection, since I'm reading remote files byte by byte and processing the results, and it means effectively iterating over the thing twice.
I could fix the double iteration part of that problem by both translating the bytes to meaningful data and simultaneously writing them to a buffer to later dump to disk, but that seems awfully...manual?
I could read the entire remote binary file, write it to disk, and open that for byte-by-byte processing, but that's really inefficient compared to doing all the work in memory.
I could read the remote data into an IO stream and then manually both convert the bytes and put them into a write stream. But then the file-writing code is totally coupled to the parsing code, and again it's a lot of lower-level manipulation.
The last is probably the "best" way, but I'm hoping there's a higher-level abstraction that lets me maintain a better separation of concerns. Is there an equivalent of the POSIX tee command or something here?
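One possible shape for that higher-level abstraction, as a sketch only (TeeReader and parse_records are made-up names, and the remote handle is assumed to be a paramiko SFTPFile or any other readable binary file object):

```python
import io

class TeeReader(io.RawIOBase):
    """Wraps a readable binary file object so that everything read through it
    is also copied to a sink; the remote file is only iterated once."""

    def __init__(self, source, sink):
        self._source = source  # e.g. the paramiko SFTPFile
        self._sink = sink      # e.g. a local file opened with mode "wb"

    def readable(self):
        return True

    def read(self, size=-1):
        data = self._source.read(size)
        if data:
            self._sink.write(data)
        return data

# Hypothetical usage: the parser pulls bytes for the database while the
# networked-drive copy is written as a side effect of those same reads.
# with sftp.open(remote_path, "rb") as remote, open(local_copy, "wb") as copy:
#     parse_records(TeeReader(remote, copy))
```

The parsing code only ever sees a readable file object, so it stays decoupled from the copying concern.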

How to safely use mmap() for reading?

I need to do a lot of random-access reads in a large file, so I use mmap(). This solution seems perfect as long as the mapped file is untouched. But this is not always the case. If the mapped file is tampered with, several problems arise:
If a change to the file reduces its length, or the file becomes inaccessible, then in order to survive, a process must handle the SIGBUS signal (at least in the Linux implementation). This adds complications, since I'm writing a library.
To make things even worse, the mmap() manpage says it is unspecified whether changes to the original file are propagated to the mapped memory. So they can very well be propagated.
This essentially means the contents of the file I work with can become white noise at any moment.
Does all of this mean that any program that maps a freely accessible file and does not handle these problems can be brought down by a DoS attack? Even though I do not expect evil hackers to go after my program, I can easily see a user modifying my mapped file, replacing it with another one, or making the file inaccessible by, for example, removing a USB drive. And while I can write a signal handler to solve the first problem (this is a bit messy, so I am looking for a better solution), I have no idea how to solve the second one.
The file cannot be copied, and it can be freely moved around if it's not in use by a program (just like any other media file). Linux file locks do not always work.
So, how to safely use mmap() for reading?
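For what it's worth, the first problem is easy to reproduce. The following Python sketch uses a made-up scratch path; note that the final access is expected to kill the process with SIGBUS on Linux, since no handler is installed:

```python
import mmap
import os

path = "/tmp/mmap_sigbus_demo"  # made-up scratch file
with open(path, "wb") as f:
    f.write(b"x" * 8192)

fd = os.open(path, os.O_RDONLY)
m = mmap.mmap(fd, 8192, prot=mmap.PROT_READ)

os.truncate(path, 0)  # stands in for another process shrinking the file

# The mapping still covers 8192 bytes, but those pages are now beyond the end
# of the file; on Linux, touching them delivers SIGBUS and kills the process.
print(m[0])
```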

Transactionally writing files in Node.js

I have a Node.js application that stores some configuration data in a file. If you change some settings, the configuration file is written to disk.
At the moment, I am using a simple fs.writeFile.
Now my question is: What happens when Node.js crashes while the file is being written? Is there the chance to have a corrupt file on disk? Or does Node.js guarantee that the file is written in an atomic way, so that either the old or the new version is valid?
If not, how could I implement such a guarantee? Are there any modules for this?
What happens when Node.js crashes while the file is being written? Is there the chance to have a corrupt file on disk? Or does Node.js guarantee that the file is written in an atomic way, so that either the old or the new version is valid?
Node implements only a (thin) async wrapper over system calls, so it does not provide any guarantees about the atomicity of writes. In fact, fs.writeAll repeatedly calls fs.write until all data is written. You are right that when Node.js crashes, you may end up with a corrupted file.
If not, how could I implement such a guarantee? Are there any modules for this?
The simplest solution I can come up with is the one used e.g. for FTP uploads:
Save the content to a temporary file with a different name.
When the content has been written to disk, rename the temporary file to the destination file.
The man page says that rename guarantees to leave an instance of newpath in place (on Unix systems like Linux or OSX).
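Sketched here in Python for concreteness (the same write-temp-then-rename pattern; in Node you would combine the fs equivalents or reach for a module, as mentioned below):

```python
import os
import tempfile

def atomic_write(path, data):
    # Create the temporary file in the destination directory, because
    # rename is only atomic within a single filesystem.
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    try:
        os.write(fd, data)
        os.fsync(fd)  # make sure the bytes are on disk before the swap
    finally:
        os.close(fd)
    # Atomic replacement: readers see either the old file or the new one.
    os.replace(tmp_path, path)
```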
fs.writeFile, just like all the other methods in the fs module, is implemented as a simple wrapper around standard POSIX functions (as stated in the docs).
Digging a bit into Node.js's code, one can see that fs.js, where all the wrappers are defined, uses fs.c for all its file system calls. More specifically, the write method is used to write the contents of the buffer. It turns out that the POSIX specification for write explicitly says:
Atomic/non-atomic: A write is atomic if the whole amount written in one operation is not interleaved with data from any other process. This is useful when there are multiple writers sending data to a single reader. Applications need to know how large a write request can be expected to be performed atomically. This maximum is called {PIPE_BUF}. This volume of IEEE Std 1003.1-2001 does not say whether write requests for more than {PIPE_BUF} bytes are atomic, but requires that writes of {PIPE_BUF} or fewer bytes shall be atomic.
So it seems it is pretty safe to write as long as the size of the buffer is smaller than PIPE_BUF. This is a constant that is system-dependent, though, so you might need to check it on your target system.
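If you do want to check {PIPE_BUF} at runtime, one way (shown here in Python; the paths are arbitrary) is to query pathconf:

```python
import os

# PIPE_BUF for a given path; the limit is what POSIX guarantees for
# atomic writes to pipes/FIFOs, and the value is system-dependent.
print(os.pathconf("/tmp", "PC_PIPE_BUF"))

# The same query against an already-open descriptor.
fd = os.open("/tmp", os.O_RDONLY)
print(os.fpathconf(fd, "PC_PIPE_BUF"))
os.close(fd)
```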
write-file-atomic will do what you need. It writes to a temporary file, then renames it. That's safe.

Does each Unix file description have its own read/write buffers?

In reference to this question about read() and write(), I'm wondering if each open file description has its own read and write buffers or if perhaps there's a single read and write buffer for a file when it has been opened multiple times at once. I'm curious because this would have an effect on what exactly happens with overlapping writes to the same file. Perhaps this is something that varies among Unixes?
(To my understanding, "file description" refers to the info/options about an open file, such as the current marker position. "File descriptor", in contrast, refers to just the number used in a process to refer to a description.)
This depends a bit on whether you are talking about sockets or actual files.
Strictly speaking, a descriptor never has its own buffers; it's just a handle to a deeper abstraction.
File system objects have their "own" buffers, at least when they are required. That is, if a program writes less than the file system block size, the kernel has no choice but to read a FS block and merge the write with the existing data.
This buffer is attached to the vnode and at a lower level, possibly an inode. It's owned by the file and not the descriptor. It may be kept around for a long time if memory is available.
In the case of a socket, a stream (but not specifically a single descriptor) does actually have buffers that it owns.
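A small way to see that the cached data belongs to the file rather than to a descriptor (a Python sketch with a made-up scratch path): two independent open file descriptions of the same file immediately observe each other's writes, without any explicit flush.

```python
import os

path = "/tmp/fd_buffer_demo"  # made-up scratch file
with open(path, "wb") as f:
    f.write(b"\0" * 16)

# Two independent open file descriptions of the same file.
fd_a = os.open(path, os.O_WRONLY)
fd_b = os.open(path, os.O_RDONLY)

os.pwrite(fd_a, b"hello", 0)  # write through the first description
print(os.pread(fd_b, 5, 0))   # b'hello': visible through the second at once,
                              # since the cached block is owned by the file

os.close(fd_a)
os.close(fd_b)
```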
If the files were opened in blocking mode, then yes, there should only be one buffer. I would bet the default is non-blocking for performance reasons.
