Let's say I know that a file descriptor fd is open for reading in my process. I would like to pipe data from this fd into a FIFO that is available for reading outside of my process, in a way that avoids calling poll or select on fd and manually reading/forwarding the data. Can this be done?
You mean ask the OS to do that behind the scenes on an ongoing basis from now on? Like an I/O redirection?
No, you can't do that.
You could spawn a thread that does nothing but read from the file fd and write to the pipe fd, though. To avoid the overhead of copying memory around in read(2) and write(2) system calls, you can use sendfile(out_fd, in_fd, NULL, 4096) to tell the kernel to copy a page from in_fd to out_fd. See the man page.
You might have better results with splice(2), since it's designed for use with files and pipes. sendfile(2) used to require out_fd to be a socket. (Designed for zero-copy sending static data on TCP sockets, e.g. from a web server.)
Linux does have asynchronous I/O, so you can queue up a read or write to happen in the background. That's not a good choice here, though, because you can't queue up a copy from one fd to another (there is no async splice(2) or sendfile(2)). Even if there were, each request would have a specific size; it wouldn't be a fire-and-forget "keep copying forever" operation. AFAIK, threads have become the preferred way to do async I/O, rather than the POSIX AIO facilities.
From man pages
O_NONBLOCK or O_NDELAY
    This flag has no effect for regular files and block devices; that is, I/O operations will (briefly) block when device activity is required, regardless of whether O_NONBLOCK is set. Since O_NONBLOCK semantics might eventually be implemented, applications should not depend upon blocking behavior when specifying this flag for regular files and block devices.
Based on this, I had the following understanding of the I/O system:
Device <-----> Kernel Buffers <-----> Process
So whenever the buffers are full (for a write) or empty (for a read), the corresponding call from the process can block or not, depending on the flag above. The kernel's interaction with the device does not block the process. The kernel may or may not use DMA to communicate with the device.
But it looks like my understanding is wrong, as I can't see why regular file descriptors can't be non-blocking. Could somebody help me here?
"Blocking" is defined as waiting for a file to become readable or writable.
Regular files are always readable and/or writable; in other words, it is always possible to try to start the read/write operation without having to wait for some external event:
when reading, the kernel already knows if there are more bytes in the file (if the end of the file has been reached, it is not possible to block to wait for some other process to append more bytes);
when writing, the kernel already knows if there is enough space on the disk to write something (if the disk is full, it is not possible to block to wait for some other process to delete some data to free up space).
I'm working on a library to create IPC based on UNIX sockets on Linux.
The goal is to hide the IPC logic in a library and I use a thread which handles the socket for external communication.
Since I want to mux/demux data coming from outside to multiple reader/writer internal threads, I'm using pipes to communicate between this management thread and the others.
Now, I would also like to manage a kind of QoS and thus would like to block some user threads in the write direction when I know there is no space left on the other side (for example, the other IPC process is reading data from me too slowly).
To notify an internal user thread that there is no space left for sending data (in its "virtual channel"), I would like to mark its sending pipe as non-writable; the thread can then, for example, use select on this sending pipe's file descriptor.
My question is then: is there a way to mark a pipe's write file descriptor as non-writable even if there is still free space in its internal buffer, and to mark it as writable again when my management thread decides? Keep in mind that the write-status event should be manageable by functions like select, poll, etc.
P.S.: I know there are a lot of libraries that could help me do the same job more easily, like ZeroMQ or nanomsg, but they are far too heavy for what I would like to achieve.
According to the POSIX standard, writes to a pipe are guaranteed to be atomic (if the data size is less than PIPE_BUF).
As far as I understand, this means that any thread trying to write to the pipe will never access the pipe in the middle of another thread's write. What's not clear to me is how this is achieved and whether this atomicity guarantee has other implications.
Does this simply mean that the writing thread acquires a lock somewhere inside the write function?
Is the thread that's writing to the pipe guaranteed to never be scheduled out of context during the write operation?
Pipe writes are atomic up to PIPE_BUF bytes. Let's assume PIPE_BUF is 4 KB; then writes of data_size < 4 KB are atomic. On POSIX systems, the kernel uses an internal mutex and locks the pipe for the duration of the write, then allows the requesting thread to write. If any other thread requests a write at this point, it has to wait for the first thread. After that the pipe is unlocked, so the other waiting threads can write to it. So yes, the kernel will not allow more than one thread to write to the pipe at the same time.
However, there is an edge case to consider. If data totalling close to 4 KB has already been written and not yet read, the next write may not fit in the remaining buffer space. Even then the kernel keeps a write smaller than PIPE_BUF atomic: the writer blocks until the whole write fits (or, with O_NONBLOCK set, fails with EAGAIN) rather than interleaving its bytes with another thread's.
I want to append data often to a file on the local filesystem. I want to do this without blocking for too long, and without making any worker threads. On Linux kernel 2.6.18.
It seems that the POSIX AIO implementation for glibc on Linux makes a userspace threadpool and blocks those threads. Which is cool, but I could just as easily spin off my own special dedicated file blocking thread.
http://www.kernel.org/doc/man-pages/online/pages/man7/aio.7.html
And it's my understanding that the Linux Kernel AIO implementation currently blocks on append. Appending is the only thing I want to do.
http://code.google.com/p/kernel/wiki/AIOUserGuide
https://groups.google.com/forum/#!msg/linux.kernel/rFOD6mR2n30/13VDXRTmBCgJ
I'm considering opening the file with O_NONBLOCK and then doing a kind of lazy writing where, if the write would block (EWOULDBLOCK), I try it again later. Something like this:
fd = open(pathname, O_WRONLY | O_CREAT | O_APPEND | O_NONBLOCK, 0644);
call write(), check for the error EAGAIN | EWOULDBLOCK
if EAGAIN | EWOULDBLOCK, then just save the data to be written and try the write() again later.
Is this a good idea? Is there any actual advantage to this? If I'm the only one with an open file descriptor to that file, and I try a write() and it EWOULDBLOCK, then is it any less likely to EWOULDBLOCK later? Will it ever EWOULDBLOCK? If I write() and it doesn't EWOULDBLOCK, does that mean write() will return swiftly?
In other words, under exactly what circumstances, if any, will write() to a local file fail with EWOULDBLOCK on Linux 2.6.18?
I'm not sure about the local file system, but I'm pretty sure you can get EWOULDBLOCK when trying to write to a file on a mounted file system (e.g. NFS). The problem is that normally you don't know whether it is really a "local" hard disk unless you specifically check for this every time you create/open the file. How to check this is, of course, system-dependent.
Even if the system creates some additional thread to do the actual write, that thread would have a buffer (which is not infinite), so if you write fast enough you could still get EWOULDBLOCK.
under ... what circumstances ... will write() to a local file fail with EWOULDBLOCK
Perhaps there is no circumstance for a file. The Linux man page for write(2) states that EWOULDBLOCK would only be returned for a file descriptor that refers to a socket.
EAGAIN or EWOULDBLOCK
The file descriptor fd refers to a socket and has been marked nonblocking (O_NONBLOCK), and the write would block. POSIX.1-2001 allows either error to be returned for this case, and does not require these constants to have the same value, so a portable application should check for both possibilities.
Apparently this behavior is related to the fact that a socket has a bounded send buffer that can fill up, whereas a write to a simple file always has somewhere to go.
This question is meant to be language and connection method independent. Actually finding methods is the question.
I know that I can directly pipe two processes through a call like prog1 | prog2 in the shell, and I've read something about RPC and sockets. But everything was a little too abstract to really get a grip on it. For example, it's not clear to me how sockets are created, whether each process needs to create its own socket or many processes can use the same socket to transfer messages to each other, or whether I can get rid of sockets completely.
Can someone explain how Interprocess-Communication in Linux really works and what options I have?
Pipe
In a producer-consumer scenario you can use a pipe; it's a form of IPC. A pipe is just what the name suggests: it connects a source and a sink together. In the shell the source is the standard output of one command and the sink the standard input of the next, so cmd1 | cmd2 connects the output of cmd1 to the input of cmd2.
Calling pipe(2) creates two file descriptors: one for the sink (the read end) and one for the source (the write end). Once the pipe is created, you fork; one process uses one of the file descriptors while the other process uses the other.
Other IPC
IPC mechanisms are various: pipes (in memory), named pipes (through a file), sockets, shared memory, semaphores, message queues, signals, etc. All have pros and cons. There is a lot of literature about them, online and in books; describing them all here would be difficult.
Basically, you have to understand that each process has its own memory, separate from other processes, so you need a shared resource through which to exchange data. A resource can be "physical" like a network (for sockets) or mass storage (for files), or "abstract" like a pipe or a signal.
If one of the processes is a producer and the other a consumer, you can go for shared-memory communication. You need a semaphore for this: one process locks the semaphore, writes to the shared memory, and releases it; the other locks the semaphore and reads the value. Since you use a semaphore, dirty reads/writes are prevented.