POSIX: Pipe syscall in FreeBSD vs Linux

In Linux (2.6.35-22-generic), man pipe states that
"pipe() creates a pipe, a unidirectional data channel that can be used for interprocess communication."
In FreeBSD (6.3-RELEASE-p5), man pipe states that
"The pipe() system call creates a pipe, which is an object allowing bidirectional data flow, and allocates a pair of file descriptors."
One is unidirectional, the other is bidirectional. I hope this isn't a silly question, but which method is the standard way of doing this? Are they both POSIX compliant?
To make my intentions clear, I lost some points on an exam for believing pipe() was one way and am looking for some ammo to get any points back ;p

I started this as a comment on Greg's answer at first, but it occurs to me that it more closely answers your specific question:
pipe()'s documentation in the POSIX standard explicitly states that the behavior in question is "unspecified" -- that is, pipe() is not required to be bidirectional, though it's not forbidden. Linux's is unidirectional, FreeBSD's is bidirectional. Both are compliant; one just implements additional behavior that is not required (but doesn't break apps built to work on compliant systems).
"Data can be written to the file descriptor fildes[1] and read from the file descriptor fildes[0]. A read on the file descriptor fildes[0] shall access data written to the file descriptor fildes[1] on a first-in-first-out basis. It is unspecified whether fildes[0] is also open for writing and whether fildes[1] is also open for reading."
I wouldn't count on getting the points back (though you should). Professors have a tendency to ignore the real world in favor of whatever they've decided is correct.

The FreeBSD man page for pipe is pretty clear on this point:
"The bidirectional nature of this implementation of pipes is not portable to older systems, so it is recommended to use the convention for using the endpoints in the traditional manner when using a pipe in one direction."
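If you want to see which behavior your own system has, a small probe is enough. This is only a sketch in Python, whose os.pipe() wraps the pipe() call; the function name probe_pipe is made up for illustration, and treating EBADF as the "unidirectional" signal is an assumption about how such a system reports the failed write.

```python
import errno
import os


def probe_pipe():
    """Use a pipe in the portable direction, then probe the reverse direction."""
    read_end, write_end = os.pipe()  # fildes[0], fildes[1] in POSIX terms
    try:
        # Portable direction: write to fildes[1], read from fildes[0].
        os.write(write_end, b"ping")
        forward = os.read(read_end, 4)

        # Reverse direction: POSIX leaves this unspecified.
        try:
            os.write(read_end, b"pong")
            reverse_ok = True
        except OSError as exc:
            # Assumption: a unidirectional pipe rejects the write with EBADF.
            if exc.errno != errno.EBADF:
                raise
            reverse_ok = False
        return forward, reverse_ok
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    forward, reverse_ok = probe_pipe()
    print("forward data:", forward)
    print("reverse write accepted:", reverse_ok)
```

Whatever the second value turns out to be, portable code should only rely on the first direction.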

Related

POSIX replacement for inotify?

I need to tail an input file for commands, ignoring EOF. I've been using inotify(2) to block until changes have been made to the file after reaching EOF, which works fine. However, inotify(2) is a Linux-specific syscall. Are there any alternatives defined in POSIX?
"Are there any alternatives defined in POSIX?"
No.
Well, it's easy to prove that something exists - it's there. It's harder to prove something is not there.
It's not there. There is no POSIX interface with functionality similar to inotify or kqueue.
If you want to be portable, handle each system separately. Don't reinvent the wheel - libuv and libevent exist.
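If pulling in one of those libraries is too much, the lowest common denominator is plain polling: hit EOF, sleep, try again. The sketch below is in Python and is not what the answer recommends, just the portable fallback. The names follow, poll_interval and max_idle_polls are hypothetical, and it assumes that a later readline() on the same handle picks up newly appended data.

```python
import time


def follow(path, poll_interval=0.5, max_idle_polls=None):
    """Yield complete lines appended to `path`, polling instead of using inotify.

    `max_idle_polls` is a hypothetical knob so the loop can stop; None means
    keep waiting forever, which is the "ignore EOF" behaviour.
    """
    idle = 0
    pending = ""
    with open(path, "r") as handle:
        while True:
            chunk = handle.readline()
            if chunk:
                idle = 0
                pending += chunk
                # Only hand out a line once its newline has arrived.
                if pending.endswith("\n"):
                    yield pending.rstrip("\n")
                    pending = ""
                continue
            # EOF for now: sleep and try again rather than giving up.
            idle += 1
            if max_idle_polls is not None and idle > max_idle_polls:
                return
            time.sleep(poll_interval)
```

The trade-off is latency against wasted wakeups, which is exactly what inotify and kqueue exist to avoid.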

What's the difference between `flush` and `sync_all`?

I'm curious whether I should use Write::flush or File::sync_all when I finish writing a file.
TL;DR: If you want to "ensure" that the data has been written to the device, then use File::sync_all if you use a File. Note that this isn't necessary though.
The Write::flush implementation for File uses the operating system dependent flush operation, for example std::sys::unix::File::flush, or std::sys::windows::File::flush. Those flush operations do... nothing. Both implementations just return Ok(()).
Why? Because the write() already uses the underlying system call for write() in both cases; the handle-based write on Windows, and the file descriptor-based write on Unix-like systems. At that point, it's out of reach of the Rust environment, save for a system call that's specific to files.
So what is Write::flush useful for? It's useful if you have any kind of buffer before the actual file, for example a BufWriter. If you have a File wrapped by a BufWriter, then you need to use flush to ensure that the bytes get written to the file. While it's useful to keep in mind that BufWriter's Drop implementation also tries(!) to write those bytes, it may or may not work, so you're supposed to call Write::flush there (see BufWriter's documentation).
That being said, sync_all isn't necessary and instead will block your program. The operating system will handle the file system synchronisation. While you can certainly wait for that synchronisation to happen via sync_data or sync_all, you're usually better off not doing either.
Write::flush for an on-disk file is actually a no-op. It's useless for File and is only implemented for consistency. The interface is meant for streams that use an app-level in-memory buffer before writing to the destination, as stated in the doc:
"Flush this output stream, ensuring that all intermediately buffered contents reach their destination."
File::sync_data is roughly the useful version of flush for File. Under the hood, the kernel keeps its own intermediate buffer, and sync_data delegates to the POSIX fdatasync call, which does at the kernel level what flush does at the app level.
File::sync_all does what File::sync_data does and, on top of that, also ensures that metadata about the file is written to disk. It delegates to fsync on POSIX systems.
Sidenote: depending on the system (e.g. macOS, Android, etc.), the implementations of File::sync_data and File::sync_all could be exactly the same.
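The same two layers exist outside Rust, so here is a rough analogue in Python (not the Rust API): flush() drains the in-process buffer much as it does for a BufWriter, while os.fsync and os.fdatasync are thin wrappers over the calls that sync_all and sync_data are described as delegating to. The function name write_durably and its metadata_too flag are made up for illustration.

```python
import os


def write_durably(path, payload, metadata_too=True):
    """Write `payload`, drain the app-level buffer, then ask the kernel to sync."""
    with open(path, "wb") as handle:   # buffered writer, comparable to BufWriter<File>
        handle.write(payload)          # may only reach the in-process buffer
        handle.flush()                 # app buffer -> kernel, via write(2)
        fd = handle.fileno()
        if metadata_too or not hasattr(os, "fdatasync"):
            os.fsync(fd)               # data + metadata, the sync_all counterpart
        else:
            os.fdatasync(fd)           # data only, the sync_data counterpart
    return os.path.getsize(path)
```

Dropping the two sync calls leaves a program that is still correct in the usual case; it just no longer waits for the device.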

How can I create a userspace filesystem with FUSE without using libfuse?

I've found that the FUSE userspace library and kernel interface have been ported, since their inception on Linux, to many other systems, and present a relatively stable API with a supposedly small surface area. If I wanted to author a filesystem in userspace, and I were not on Plan 9 or Hurd, I would think that FUSE is my best choice.
However, I am not going to use libfuse. This is partially because of pragmatism; using C is hard in my language of choice (Monte). It's also because I am totally uninterested in writing C support code, and libfuse's recommended usage is incompatible with Monte philosophy. This shouldn't be a problem, since C is not magical and /dev/fuse can be opened with standard system calls.
Going to look for documentation, however, I've found none. There is no documentation that I can find for the /dev/fuse ABI/API, and no stories of others taking this same non-C-bound route. Frustrating.
Does any kind of documentation exist on how to interact in a language-agnostic way with /dev/fuse and the FUSE subsystem of the kernel? If so, could you point me to it? Thanks!
Update: There exists go-fuse, which is in Go, a slightly more readable language than C. However, it does not contain any ABI/API documentation either.
Update: I notice that people have voted to close this. Don't worry, there is no need for that. I have satisfied myself that the documentation that I desire does not yet exist. I will write the documentation myself, publish it, and then link to it in an accepted answer. Hopefully the next person to search for this documentation will not be disappointed.
(I'm not accepting this until it's complete. In the interim, edits are welcome!)
The basic outline of a FUSE session:
1) open() is called on /dev/fuse. I'll call the resulting FD the control FD.
2) mount() is called with the target mount point, filesystem type "fuse" for normal mode or "fuseblk" for block-device mode, and options including "fd=X" where X is the control FD.
3) FUSE-specific structs are transferred on the control FD repeatedly. Communication follows a request-response pattern, where the program read()s filesystem commands from the control FD and then write()s responses back.
4) umount() is called with the target mount point.
5) close() is called on the control FD.
With that all said, there's a handful of complications that one should be aware of. First, mount() is almost always a privileged syscall, so you'll have to be root to mount a FUSE filesystem. However, as one may have noticed, FUSE programs can generally be started as non-root! How?
There's a helper, /bin/fusermount, installed setuid. Usage is totally undocumented, but that's what I'm here for. Instead of open()ing /dev/fuse yourself, run fusermount as a subprocess, passing the target mount point as an argument, any extra mount options you like with -o, and (crucially) with the environment variable _FUSE_COMMFD exported and set to the ASCII string of an open FD, which I'll call the comm FD. You must create the comm FD yourself using e.g. socketpair(); a plain pipe() will not do, because the FD-passing trick only works over a Unix domain socket. fusermount will call open() and mount() for you, and share the control FD back to you along the comm FD, using the sendmsg() trick for sharing FDs. Use recvmsg() to read it back.
Editorial: I really don't understand why this is structured to be so difficult. FDs are inherited by subprocesses; it would have been so much easier to open() the control FD in the top process and pass it down into fusermount. True, there's some confused deputy dangers, but fusermount is already installed and setuid and dangerous.
Anyway! fusermount will crudely daemonize and take care of calling umount() and close() to clean up once your main process exits.
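The comm FD dance can be sketched as follows, in Python for brevity. Everything named here (recv_fd, send_fd, mount_via_fusermount, the helper argument) is hypothetical. The fusermount invocation follows only the description above and is untested, and the one-byte message accompanying the FD is an assumption. send_fd exists only so that the receiving side can be exercised without the real helper.

```python
import array
import os
import socket
import subprocess


def recv_fd(sock):
    """Receive one file descriptor sent as SCM_RIGHTS ancillary data."""
    fds = array.array("i")
    _msg, ancdata, _flags, _addr = sock.recvmsg(1, socket.CMSG_LEN(fds.itemsize))
    for level, kind, data in ancdata:
        if level == socket.SOL_SOCKET and kind == socket.SCM_RIGHTS:
            # Ignore any trailing partial integer in the ancillary payload.
            fds.frombytes(data[: len(data) - (len(data) % fds.itemsize)])
    if not fds:
        raise RuntimeError("no file descriptor received")
    return fds[0]


def send_fd(sock, fd):
    """Send one file descriptor; stands in for fusermount's side of the exchange."""
    rights = array.array("i", [fd])
    sock.sendmsg([b"\0"], [(socket.SOL_SOCKET, socket.SCM_RIGHTS, rights)])


def mount_via_fusermount(mountpoint, options=None, helper="fusermount"):
    """Untested sketch: run the setuid helper and return the control FD."""
    ours, theirs = socket.socketpair()  # a Unix domain socket pair, not a pipe
    with ours:
        with theirs:
            argv = [helper]
            if options:
                argv += ["-o", options]
            argv.append(mountpoint)
            env = dict(os.environ, _FUSE_COMMFD=str(theirs.fileno()))
            proc = subprocess.Popen(argv, env=env, pass_fds=[theirs.fileno()])
        # Our copy of the helper's end is closed now, so a helper that dies
        # without sending anything shows up as EOF instead of a hang.
        control_fd = recv_fd(ours)
    proc.wait()
    return control_fd
```

The FD returned by recv_fd is a new descriptor number in the receiving process that refers to the same open file as the one that was sent.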
Things not yet covered:
How is non-blocking access to FUSE handled? Can the control FD just be kicked into non-blocking mode? Does it actually not block, or does it behave like an ordinary file and secretly block on access?
The struct layouts. These can be more or less rediscovered from the C or Go source, but that's no excuse. I'll document them more seriously when I've worked up sufficient masochism.

What binary standards are there for sharing code in Linux (similar to COM)?

So I have finished reading an article here:
https://msdn.microsoft.com/en-us/library/ms809983.aspx
about why we have COM and how it lets us share code without worrying about name mangling of compilers or unicode/ascii issues or memory management in a language independent manner.
I have read elsewhere that COM isn't supported by Linux because COM basically uses the OS as the moderator for acquisition of these standardized objects. Shouldn't there be something similar in Linux? And if so, what is it?
On Linux you can run any program that accepts its input on standard input, and connect it, via a pipe, to any other program that generates its results on its standard output.
The simple file- and pipe-based input/output in POSIX predates MS-Windows by decades. And, as long as both sides of the pipe agree on the format of the data being interchanged, it doesn't matter which compiler was used to create each program (although, on Linux, there's pretty much one de facto compiler, so it's a moot point).
And by using a socket-pair, the pipe becomes bi-directional, so both processes can swap data with each other.
This is, generally, how processes interoperate on Linux:
1) A pipe, or a network socket, that connects the two processes together
2) An agreed, established standard for the format of the data exchanged between the two processes.
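As a toy illustration of those two ingredients, here is a Python sketch using a socket pair plus a made-up wire format (a 4-byte big-endian length followed by the payload). The format and all the function names are assumptions for the example only; in real use the two ends would live in separate processes that were each handed one socket.

```python
import socket
import struct


def send_msg(sock, payload):
    """Frame: 4-byte big-endian length, then the payload bytes."""
    sock.sendall(struct.pack("!I", len(payload)) + payload)


def recv_exact(sock, count):
    """Read exactly `count` bytes or raise if the peer closes early."""
    chunks = []
    while count:
        chunk = sock.recv(count)
        if not chunk:
            raise ConnectionError("peer closed mid-message")
        chunks.append(chunk)
        count -= len(chunk)
    return b"".join(chunks)


def recv_msg(sock):
    """Read one length-prefixed message."""
    (length,) = struct.unpack("!I", recv_exact(sock, 4))
    return recv_exact(sock, length)


def demo():
    """Exchange one request and one reply over a bidirectional socket pair."""
    left, right = socket.socketpair()
    with left, right:
        send_msg(left, b"request")
        got_request = recv_msg(right)
        send_msg(right, b"reply")
        got_reply = recv_msg(left)
    return got_request, got_reply
```

Neither side needs to know what language or compiler produced the other; only the framing has to match.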
It is important to understand that there is no practical standard for all processes to use the same exact format for exchanging messages. The closest that would come to such a standard, I suppose, would be the remote procedure call, or RPC, standard that's used in some low-level protocols, like NFS, but, mostly, individual applications define and use a particular format that's tailored for them.
For example, the X Window System Protocol: http://www.x.org/releases/X11R7.7/doc/xproto/x11protocol.html -- this is a format definition of a protocol for communicating between an X server and an X client. Applications that are written to use this protocol (they'll typically use an intermediate library or a toolkit, actually) can establish a connection and use any X server that talks the same protocol, over a network connection or a local pipe.

Can regular file reading benefit from nonblocking IO?

It seems not to me and I found a link that supports my opinion. What do you think?
The content of the link you posted is correct. A regular file, opened in non-blocking mode, will always be "ready" for reading; when you actually try to read it, blocking (or more accurately, as your source points out, sleeping) will occur until the operation can succeed.
In any case, I think your source needs some sedatives. One angry person, that is.
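The "always ready" part is easy to check for yourself. A minimal Python sketch, with a hypothetical function name: it opens a temporary regular file with O_NONBLOCK and asks select() with a zero timeout whether the descriptor is readable.

```python
import os
import select
import tempfile


def regular_file_always_ready():
    """Return (select says readable, bytes read) for an O_NONBLOCK regular file."""
    with tempfile.NamedTemporaryFile() as tmp:
        tmp.write(b"data")
        tmp.flush()
        fd = os.open(tmp.name, os.O_RDONLY | os.O_NONBLOCK)
        try:
            # A zero timeout makes select() a pure readiness query.
            readable, _, _ = select.select([fd], [], [], 0)
            ready = fd in readable
            data = os.read(fd, 4)
        finally:
            os.close(fd)
    return ready, data
```

The readiness answer carries no information for regular files, so O_NONBLOCK cannot prevent the read itself from sleeping on disk I/O.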
I've been digging into this quite heavily for the past few hours and can attest that the author of the link you cited is correct. However, there appears to be "better" (using that term very loosely) support for non-blocking IO against regular files in the native Linux kernel from v2.6 onward. The "libaio" package contains a library that exposes the functionality offered by the kernel, but it has some caveats about the different types of file systems which are supported, and it's not portable to anything outside of Linux 2.6+.
And here's another good article on the subject.
You're correct that nonblocking mode has no benefit for regular files, and is not allowed to. It would be nice if there were a secondary flag that could be set, along with O_NONBLOCK, to change this, but due to the way cache and virtual memory work, it's actually not an easy task to define what correct "non-blocking" behavior for ordinary files would mean. Certainly there would be race conditions unless you allowed programs to lock memory associated with the file. (In fact, one way to implement a sort of non-sleeping IO for ordinary files would be to mmap the file and mlock the map. After that, on any reasonable implementation, read and write would never sleep as long as the file offset and buffer size remained within the bounds of the mapped region.)

Resources