Linux I2C file handles - safe to cache?

I am just starting to look at the I2C support on (embedded) Linux (a BeagleBone Black, to be precise). Since it's Linux, everything is a file, so it's no surprise that I2C is too.
int file = open( "/dev/i2c-0", O_RDWR );
and then the actual address on that bus is selected through ioctl(). My question is this: is it safe, or even legal, to cache file for the duration of application execution? It would seem to my naive eyes that the overhead of opening a resource for a read every 250 ms would be an unnecessary strain on the kernel. So is it valid to open once, and then just use ioctl() to switch addresses whenever I need to, or must I close() the descriptor between reads and writes?

is it safe, or even legal, to cache file for the duration of application execution?
The file descriptor (that is returned from open()) is valid for as long as your program needs to keep executing.
Device nodes in /dev may resemble filenames, but they are treated differently from filesystem entries once you look past the syscall interface. An open() or read() on a file descriptor for a device will invoke the device driver, whereas for an actual file its filesystem is invoked, which may eventually call a storage device driver.
It would seem to my naive eyes that the overhead of opening a resource for a read every 250 ms would be an unnecessary strain on the kernel.
Yes, it would, since those open() and close() syscalls are unnecessary.
So is it valid to open once, and then just use ioctl() to switch addresses whenever I need to,
Yes, that is the proper usage.
or must I close() the descriptor between reads and writes?
That is neither necessary nor advisable.
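For illustration, here is a minimal sketch of that pattern (the bus path /dev/i2c-0 matches the question; the slave address 0x48 is a hypothetical sensor; I2C_SLAVE comes from <linux/i2c-dev.h>):

#include <fcntl.h>
#include <linux/i2c-dev.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>

int main(void)
{
    /* Open the bus once and keep the descriptor for the life of the process. */
    int file = open("/dev/i2c-0", O_RDWR);
    if (file < 0) {
        perror("open");
        return 1;
    }

    /* Select the slave; call ioctl() again whenever you need another chip. */
    if (ioctl(file, I2C_SLAVE, 0x48) < 0) {
        perror("ioctl(I2C_SLAVE)");
        return 1;
    }

    for (;;) {
        unsigned char buf[2];
        if (read(file, buf, sizeof buf) != (ssize_t)sizeof buf)
            perror("read");
        usleep(250 * 1000);  /* poll every 250 ms, no re-open needed */
    }
}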

Related

Do I need to flush named pipes?

I cannot find whether named pipes are buffered, hence the question.
The mkfifo(3) manpage (https://linux.die.net/man/3/mkfifo) says:
A FIFO special file is similar to a pipe ... any process can open it for reading or writing, in the same way as an ordinary file.
Pipes are not buffered, no need to flush. But with an ordinary file, I would fflush (or fsync) the file descriptor.
How about a named pipe?
Pipes are not buffered, no need to flush.
I'd actually put that the other way around: for most intents and purposes, pipes are nothing but buffer. It is not meaningful to flush them because there is no underlying device to receive the data.
Moreover, although POSIX does not explicitly forbid additional buffering of pipe I/O, it does place sufficient behavioral requirements that I don't think there's any way to determine from observation whether such buffering occurs, except possibly by whether fsync() succeeds. In other words, even if there were extra buffering, it should not be necessary to fsync() a pipe end.
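If you want to check this on your own system, here is a tiny sketch; on Linux, fsync() on a pipe end typically fails with EINVAL, precisely because there is no underlying device to synchronize:

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 1;

    /* No storage behind a pipe, so there is nothing for fsync() to do. */
    if (fsync(fds[1]) != 0)
        printf("fsync on a pipe: %s\n", strerror(errno));
    return 0;
}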
But with an ordinary file, I would fflush (or fsync) the file descriptor.
Well no, you would not fflush() a file descriptor. fflush() operates on streams, represented by FILE objects, not on file descriptors. This is a crucial distinction, because most streams are buffered at the C library level, independent of the nature of the file underneath. It is this library-level buffer that fflush() interacts with. You can control the library-level buffering mode of a stream via the setvbuf() function.
On those systems that provide it, fsync() operates at a different, lower level. It instructs the OS to ensure that all data previously written to the specified file descriptor has been delivered to the underlying storage device. In other words, it flushes OS-level buffers.
Note well that you can wrap a stream around a pipe-end file descriptor via the fdopen() function. That doesn't make the pipe require flushing any more than it did before, but the stream will be buffered by default, so flushing will be relevant to it.
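Here is a minimal sketch of that situation, using an anonymous pipe for brevity (the same applies to a FIFO opened via fopen() or fdopen()):

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 1;

    /* Wrap the write end in a stream; not a tty, so it is fully buffered. */
    FILE *out = fdopen(fds[1], "w");
    fprintf(out, "hello\n");  /* data sits in the stdio buffer... */
    fflush(out);              /* ...until flushed into the pipe via write(2) */

    char buf[16];
    ssize_t n = read(fds[0], buf, sizeof buf);  /* now readable */
    if (n > 0)
        write(STDOUT_FILENO, buf, n);
    return 0;
}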
Note, too, that some storage devices perform their own buffering, so that even after the data have been handed off to a storage device, it is not certain that they are immediately persistent.
How about a named pipe?
The discussion above about stream I/O vs. POSIX descriptor-based I/O applies here, too. If you access a named pipe via a stream, then its interaction with fflush() will depend on the buffering of that stream.
But I suppose your question is more about OS-level buffering and flushing. POSIX does not appear to say much concrete, but since you tag [linux] and refer to a Linux manual page in your question, I offer this in response:
The only difference between pipes and FIFOs is the manner in which
they are created and opened. Once these tasks have been accomplished,
I/O on pipes and FIFOs has exactly the same semantics.
(Linux pipe(7) manual page.)
I don't quite understand what you are trying to ask, but as you have already been told, pipes are no more than buffer.
Historically, fifos (or pipes) consumed the direct blocks of the inode used to maintain them, and they were tied to a file (whether or not it had a name) in some filesystem.
Today, I don't know the exact implementation details for a fifo, but basically the kernel buffers all the data the writers have already written but the readers haven't yet read. The fifo has a system-defined upper limit on the amount of data it can buffer; on modern Linux the default is 64 KiB.
The kernel buffers, but there's no delay between writers and readers, because as soon as a writer writes to a pipe, the kernel awakens all the readers waiting for it to have data. The reverse is also true: when the pipe is full of data, as soon as a reader consumes some, all the writers are awakened to allow filling it again.
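On Linux you can even query and resize that in-kernel buffer with fcntl(2); a small sketch using the Linux-specific F_GETPIPE_SZ/F_SETPIPE_SZ operations (available since kernel 2.6.35):

#define _GNU_SOURCE  /* for F_GETPIPE_SZ / F_SETPIPE_SZ */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return 1;

    /* Default capacity is typically 65536 bytes on modern kernels. */
    printf("pipe buffer: %d bytes\n", fcntl(fds[1], F_GETPIPE_SZ));

    if (fcntl(fds[1], F_SETPIPE_SZ, 1 << 20) < 0)  /* ask for 1 MiB */
        perror("F_SETPIPE_SZ");
    printf("now: %d bytes\n", fcntl(fds[1], F_GETPIPE_SZ));
    return 0;
}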
Anyway, your question about flushing has nothing to do with pipes themselves (well, not quite; let me explain) but with the <stdio.h> package. <stdio.h> does buffer, and it handles buffering on each FILE * individually, so there are calls for flushing buffers when you want their contents write(2)n out.
<stdio.h> chooses its buffering behaviour dynamically, to optimize buffering without forcing programmers to flush all the time. Which mode it picks depends on the type of file descriptor associated with the FILE * pointer.
When the FILE * pointer is associated with a tty (it checks this via the isatty(3) call, which internally makes an ioctl(2) call that lets <stdio.h> see whether it is talking to a serial device, a char device), <stdio.h> uses line buffering, which means that whenever a '\n' character is output, the buffer is automatically flushed.
This poses an optimization problem, because when you are using cat(1) to copy a file, for example, the larger the buffer, the more efficient the copy normally is. Well, <stdio.h> solves that too: when the output is not a tty device, it uses full buffering, and only flushes the internal buffer of the FILE * pointer when it is full of data.
So the question is: how does <stdio.h> behave with a fifo (or pipe) node? The answer is simple: it is not a char device (or a tty), so <stdio.h> does full buffering on it. If you are communicating data between two processes and you want the reader to receive the data as soon as you have printf(3)ed it, you had better fflush(3), because if you don't, you can be left waiting for a response that never comes: what you have written has not actually been written out (not by the kernel, but by the <stdio.h> library).
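To make that concrete, a minimal writer sketch, assuming a hypothetical FIFO at /tmp/myfifo (create it beforehand with mkfifo):

#include <stdio.h>
#include <sys/stat.h>

int main(void)
{
    /* /tmp/myfifo is a made-up path for this example. */
    mkfifo("/tmp/myfifo", 0666);

    FILE *f = fopen("/tmp/myfifo", "w");  /* blocks until a reader opens */
    if (!f)
        return 1;

    /* A FIFO is not a tty, so stdio fully buffers: without the fflush(3)
     * the reader could wait forever for data parked in this process. */
    fprintf(f, "request\n");
    fflush(f);

    fclose(f);
    return 0;
}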
As I said, I don't know if this is exactly the answer to your question, but it should at least give you a hint as to where the problem could be.

What's the difference between fwrite(), write(), pwrite(), fread(), read(), pread() and fsync() in Linux?

I suppose fwrite() passes data from the user application to a buffer in user mode, whereas write() passes data from the user-mode buffer to a buffer in kernel mode, and fsync() passes data from the kernel-mode buffer to the disk. Right? And read() passes data from a buffer in kernel mode to a buffer in user mode, and fread() passes data from the buffer in user mode to the user application, right? For pwrite(), besides the implied lseek, does it also call fsync()?
For pwrite(), besides the implied lseek, does it also call fsync()?
No, pwrite() does not call fsync(). See pwrite(3) - Linux man page:
The pwrite() function shall be equivalent to write(), except that it writes into a given position without changing the file pointer.
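A small sketch contrasting the two (demo.txt is just a scratch file for the example):

#include <fcntl.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("demo.txt", O_WRONLY | O_CREAT, 0644);
    if (fd < 0)
        return 1;

    const char *msg = "hello";

    /* pwrite() writes at offset 100 without moving the file offset,
     * and does so atomically with respect to other users of this fd. */
    pwrite(fd, msg, strlen(msg), 100);

    /* The offset-mutating two-step equivalent: */
    lseek(fd, 100, SEEK_SET);
    write(fd, msg, strlen(msg));

    /* Neither call implies fsync(); the data may still sit in kernel buffers. */
    close(fd);
    return 0;
}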
Also, fsync() writes data from kernel buffers to disk, so which system call reads data from disk into kernel buffers?
To write data from kernel buffers to disk, fsync() can be called, but it doesn't have to be, if it suffices that the buffers are flushed eventually; that will happen anyway sooner or later, unless the system crashes or is reset.
To read data from disk into kernel buffers, no dedicated system call is needed.[1] The system knows from the read() call which data are to be read, and the data must be read before the call returns (unless they are already buffered).
[1] What comes closest to such a system call is probably (as already mentioned by Tsyvarev) fadvise(2): Give advice about file access - Linux man page:
Allows an application to tell the kernel how it expects to use a file handle, so that the kernel can choose appropriate read-ahead and caching techniques for access to the corresponding file.
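For example, a minimal sketch asking the kernel to pre-populate its buffers for a file we are about to read (data.bin is a placeholder name):

#include <fcntl.h>
#include <unistd.h>

int main(void)
{
    int fd = open("data.bin", O_RDONLY);
    if (fd < 0)
        return 1;

    /* Start a non-blocking read of the whole file (len 0 = to EOF)
     * into the page cache, so later read() calls find it buffered. */
    posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);

    /* ... read() as usual ... */
    close(fd);
    return 0;
}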

Why are these special device file reads a minimum of PAGE SIZE bytes?

I am coding my 2nd kernel module ever. I am attempting to provide user-space access to a firmware core, as a demo. The demo is under PetaLinux (an embedded OS specifically tailored to Zynq or MicroBlaze). I added virtual file system hooks to go between user space and the kernel module, and it seems to work, both on read and write. The only hiccup is that, somewhere between my user application and my kernel module, the OS balloons the size of my request up to PAGE_SIZE (4096).
A co-worker commented that I might be mounting the module as a block device rather than a character device. This makes a lot of sense. Someone upstream of my module is certainly caching my results (which, if my understanding of block drivers is accurate, would make perfect sense for, say, the hard drive), but we're tied to a volatile device, so this isn't appropriate. But all the diagnostics I've been able to find suggest that it is mounted as a character device...
mknod /dev/myModule c (Dynamically specified Major Number) (Zero)
ls -la /dev/myModule
crw-r--r-- 1 root root 252, 0 Jan 1 01:05 myModule
Here is the module source I am using to register the virtual file I/O hooks:
/* allocate a dynamic major and register the region; note that a separate
 * register_chrdev_region() call on the same dev_t would be redundant */
alloc_chrdev_region(&moduleMajorNumber, 0, 1, "moduleLayerCDMA");
cdevP = cdev_alloc();
cdevP->ops = &moduleLayerCDMA_fileOperations;
cdevP->owner = THIS_MODULE;
cdev_add(cdevP, moduleMajorNumber, 1);
Any clues?
Your problem comes from the fact that the standard C library buffered I/O routines (fopen, fclose, fread, fgetc & their friends) keep a user-space buffer for every opened file/device, and when your program tries to read from that file/device, the library routines do read-ahead, to prepare for later read calls and increase I/O efficiency. Similarly, writes with fwrite go through a write buffer, and only get flushed to the system with a system call when the buffer gets full, when the file/device is closed, or when fflush is explicitly called.
There are two ways to solve the issue:
The easier might be to simply convert your user-space program to use unbuffered I/O (open, close, read, write & their friends); these map onto the corresponding system calls on a 1:1 basis.
Or handle the problem in your kernel module: disregard the number of bytes asked for in a read if it is more than what you'd like to return in a single system call. You can look at that value as the length of the buffer provided by the caller, and you don't necessarily have to fill it up completely. Of course, in the return value, you have to indicate how many bytes were actually read, as in the sketch below.
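A hypothetical sketch of that second approach in the module's .read handler (read_hw_registers() is a stand-in for whatever accesses your firmware core):

#include <linux/fs.h>
#include <linux/uaccess.h>

/* stand-in: fills dst from the device, returns the bytes available */
static size_t read_hw_registers(char *dst)
{
    dst[0] = 0;
    return 1;
}

static ssize_t moduleLayerCDMA_read(struct file *filp, char __user *buf,
                                    size_t count, loff_t *ppos)
{
    char data[16];
    size_t avail = read_hw_registers(data);

    if (count > avail)
        count = avail;              /* ignore the oversized request */
    if (copy_to_user(buf, data, count))
        return -EFAULT;
    return count;                   /* report what was actually read */
}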

How to ensure read() to read data from the real device each time?

I'm periodically reading from a file and checking the readout to decide subsequent action. As this file may be modified by some mechanism that bypasses the block I/O layer in the Linux kernel, I need to ensure that the read operation reads data from the real underlying device instead of from kernel buffers.
I know fsync() can make sure all I/O write operations have completed, with all data written to the real device, but there is no counterpart for I/O read operations.
The file has to be kept opened.
So could anyone please kindly tell me how to meet such a requirement on a Linux system? Is there an API, similar to fsync(), that can be called?
Really appreciate your help!
I believe that you want to use the O_DIRECT flag to open().
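A minimal sketch, assuming a hypothetical device node /dev/sdX; note that O_DIRECT imposes alignment requirements on the buffer, the file offset and the transfer length (typically the logical block size):

#define _GNU_SOURCE  /* for O_DIRECT */
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/dev/sdX", O_RDONLY | O_DIRECT);  /* hypothetical node */
    if (fd < 0)
        return 1;

    /* O_DIRECT transfers must use a suitably aligned buffer. */
    void *buf;
    if (posix_memalign(&buf, 4096, 4096) != 0)
        return 1;

    ssize_t n = read(fd, buf, 4096);  /* bypasses the page cache */
    (void)n;

    free(buf);
    close(fd);
    return 0;
}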
I think memory mapping in combination with madvise() and/or posix_fadvise() should satisfy your requirements... Linus contrasts this with O_DIRECT at http://kerneltrap.org/node/7563 ;-).
You are going to be in trouble if another device is writing to the block device at the same time as the kernel.
The kernel assumes that the block device won't be written by any other party than itself. This is true even if the filesystem is mounted readonly.
Even if you use direct I/O, the kernel may cache filesystem metadata, so a change in the location of the file's blocks may result in incorrect behaviour.
So in short - don't do that.
If you wanted, you could access the block device directly - which might be a more successful scheme, but it still potentially allows harmful race conditions (you cannot guarantee the order of the metadata and data updates made by the other device). These could cause you to end up reading junk from the device (if the metadata were updated before the data). You'd better have a mechanism for detecting junk reads in this case.
I am, of course, assuming some very simple braindead filesystem such as FAT. That might reasonably be implemented in userspace (mtools, for instance, does this).

May I open my own device driver twice simultaneously from a user program under Linux?

Somewhere I read that opening the same file twice has undefined semantics and should be avoided. In my situation I would like to open my own device multiple times, associating multiple file descriptors with it. The file operations of my device are all safe. Is there some part of Linux, between the open() syscall and the point where it calls the registered .open() file operation, that is unsafe?
It is perfectly fine to open the same device file twice, as long as your driver is OK with that.
There is no hidden part that would make it unsafe if it is safe in the kernel.
For example, some video applications use one process to do the display or capture, while another opens the device file to handle the controls.
If the driver does not support multiple opens, it should return an error when a second open happens.
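A hypothetical sketch of how a driver can enforce that (the names are made up; the pattern is an atomic test-and-set in .open):

#include <linux/atomic.h>
#include <linux/fs.h>

static atomic_t already_open = ATOMIC_INIT(0);

static int mydev_open(struct inode *inode, struct file *filp)
{
    /* atomically flip 0 -> 1; a concurrent second open sees 1 */
    if (atomic_cmpxchg(&already_open, 0, 1) != 0)
        return -EBUSY;
    return 0;
}

static int mydev_release(struct inode *inode, struct file *filp)
{
    atomic_set(&already_open, 0);  /* allow the next open */
    return 0;
}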
You may open a device twice in the same process, if the driver will let you do so. Synchronization is the responsibility of the driver.
However, if you are opening, say, a raw disk device as a privileged user, you will want to make sure you don't clobber your own data in your process.
Opening the same file twice has well-defined semantics in cases which make sense. Processes still need some form of synchronisation if they're all doing read/write, otherwise the file is likely to end up full of rubbish.
For a device driver, the semantics of multiple opens is entirely up to the driver - some drivers prohibit it, in others it works fine (think /dev/null for instance). In some drivers it has a very special meaning (e.g. sound cards may mix the sound output between multiple apps)
