Can posix read() receive less than requested 4 bytes from a pipe? - linux

A program from the answer https://stackoverflow.com/a/1586277/6362199 uses the system call read() to receive exactly 4 bytes from a pipe. It assumes that the function read() returns -1, 0 or 4. Can the read() function return 1, 2 or 3 for example if it was interrupted by a signal?
In the man page read(2) there is:
On success, the number of bytes read is returned (zero indicates
end of file), and the file position is advanced by this number. It
is not an error if this number is smaller than the number of bytes
requested; this may happen for example because fewer bytes are
actually available right now (maybe because we were close to
end-of-file, or because we are reading from a pipe, or from a
terminal), or because read() was interrupted by a signal.
Does this mean that the read() function can be interrupted during receiving such a small amount of data as 4 bytes? Should the source code from this answer be corrected?
In the man page pipe(7) there is:
POSIX.1-2001 says that write(2)s of less than PIPE_BUF bytes must be atomic: the output data is written to the pipe as a contiguous sequence.
but there is nothing similar about read().

If the write is atomic, the entire content is already in the pipe buffer when the read happens, so the only way to get an incomplete read would be for the kernel to stop partway through copying the data out, which wouldn't happen here.
In general you can rely on small write()s on pipes on the same system mapping to identical read()s. POSIX requires PIPE_BUF to be at least 512 bytes (it is 4096 on Linux), so a 4-byte write is far below the limit and will be atomic.
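If you would rather not depend on that guarantee (for example because the same code may later read larger records, or from a socket instead of a pipe), a defensive loop is cheap. The following is only a sketch: read_full is a hypothetical helper name, not something from the linked answer, and it assumes a blocking file descriptor.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling read() until exactly `count` bytes
 * have arrived, end of file is reached, or a real error occurs.
 * Returns the number of bytes read (less than `count` only at EOF),
 * or -1 with errno set. Assumes `fd` is in blocking mode. */
ssize_t read_full(int fd, void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    unsigned char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = read(fd, p + done, count - done);
        if (n > 0) {
            done += (size_t)n;   /* possibly a short read: loop for the rest */
        } else if (n == 0) {
            break;               /* EOF: every writer has closed the pipe */
        } else if (errno != EINTR) {
            return -1;           /* genuine error */
        }
        /* n == -1 with EINTR: interrupted before any data arrived, retry */
    }
    return (ssize_t)done;
}
```

With this, the caller checks for a return value of 4 and treats anything smaller as EOF or an error, instead of assuming read() can only return -1, 0 or 4.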

Related

read(2) on Tun fd returned zero

In my application, a Tun interface is created and the process keeps reading the associated fd with read(2) in a select(2) loop. While debugging an issue in the application, I found that at some moments the read(2) operation on the Tun file descriptor returns zero. Is this possible, and under what conditions can it happen?
Thanks in advance.
woody
Here is the information from the man page for read(2):
Return Value
On success, the number of bytes read is returned (zero indicates end of file), and the file position is advanced by this number. It is not an error if this number is smaller than the number of bytes requested; this may happen for example because fewer bytes are actually available right now (maybe because we were close to end-of-file, or because we are reading from a pipe, or from a terminal), or because read() was interrupted by a signal. On error, -1 is returned, and errno is set appropriately. In this case it is left unspecified whether the file position (if any) changes.

Under what circumstances does the read() syscall return 0?

I'm looking at the read syscall in Unix, which (at least in Linux) has this signature: [1]
ssize_t read(int fd, void* buf, size_t count);
Let's assume that the call succeeds (i.e. no negative return values) and that count > 0 (i.e. the buffer actually can store a nonzero amount of bytes). Under which circumstances would read() return 0? I can think of the following:
When fd refers to a regular file and the end of the file has been reached.
When fd refers to the receiving end of a pipe, socket or FIFO, the sending end has been closed and the pipe's/socket's/FIFO's own buffer has been exhausted.
When fd refers to the slave side of a terminal device that is in ICANON and Ctrl-D has been sent into the master side while the line buffer was empty.
I'm curious if there are any other situations that I'm not aware of, where read() would return with a result of 0. I'm especially interested (because of reasons) in situations like the last one in the list above, where read() returns 0 once, but subsequent calls to read() on the same FD could return a nonzero result. If an answer only applies to a certain flavor of Unix, I'm still interested in hearing it.
[1] I know this signature is for the libc wrapper, not the actual syscall, but that's not important right now.
If the Physical File System does not support simple reads from directories, read() will return 0 if it is used for a directory.
If no process has the pipe open for writing, read() returns 0 to indicate the end of the file.
If the connection is broken on a stream socket, but no data is available, then the read() function returns 0 bytes as EOF.
Normally a return value of 0 means end-of-file. However, if you specify 0 as the number of bytes to read, it will always return 0 unless an error is detected.
Terminal devices are a special case. If the terminal is in cooked mode, typing Control-d tells the device driver to return from any pending read() immediately with whatever is in the input editing buffer, rather than waiting for the user to enter a newline. If the buffer is empty, this results in a zero-length read. This is how typing the EOF character at the beginning of a line is automatically treated as EOF by applications.
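Two of the cases above (a pipe with no writer, and a request for 0 bytes) are easy to reproduce. The sketch below uses two illustrative helper functions whose names are made up for this example; each returns whatever read() reported, and both assume Linux-like pipe behavior.

```c
#include <unistd.h>

/* Illustrative helper: read() on an empty pipe whose write end is closed. */
ssize_t read_after_writer_closed(void)
{
    int fds[2];
    char c;

    if (pipe(fds) == -1)
        return -1;
    close(fds[1]);                    /* nobody has the pipe open for writing */
    ssize_t n = read(fds[0], &c, 1);  /* reports end of file */
    close(fds[0]);
    return n;
}

/* Illustrative helper: read() with count == 0 while data is waiting. */
ssize_t read_with_zero_count(void)
{
    int fds[2];
    char c;
    ssize_t n = -1;

    if (pipe(fds) == -1)
        return -1;
    if (write(fds[1], "x", 1) == 1)
        n = read(fds[0], &c, 0);      /* nothing requested, nothing read */
    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Both helpers return 0, but only the first one signals end of file; in the second, the byte is still in the pipe.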

Linux tty flip buffer lock when reading part of available data

I have a driver that builds on the new serdev bus in the linux kernel.
In my driver I receive messages from an external device. All messages end with a null byte (0x00), and the protocol (COBS) ensures that there are no null bytes in my data. Now I try to have the TTY layer hand me full messages by scanning for zeros in my input; if there are none, I just return zero from the callback that the tty layer calls when bytes are available.
This kind of works. Or rather it works for some messages. After a while though it locks up and the tty layer keeps sending the same size of received bytes indefinitely. My guess is that this happens when one half of the tty flip buffer is full and the rest of my message is in the other half.
I have two questions:
Am I correct in that the tty layer can "hang" until I read out all data in one half of the flip buffer?
If that is so, is there some way to prevent this from happening? I'd rather not implement my own buffering scheme on top of the tty buffer already available.
Thanks
Judging from the function flush_to_ldisc in drivers/tty/tty_buffer.c, it is not possible to do what I attempted. When the tty buffer is about to flip over, the consumer has to do a read and buffer any half messages itself.
That is, returning zero from your callback and hoping for a larger chunk of data next time only works up to the end of the first part of the buffer; at that point the last bit of data must be read.
This is not a problem in userspace because a read call will have an argument that is the most bytes you want but read is free to return fewer bytes than requested.
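The "buffer any half messages" part can be quite small. Below is a generic C sketch of such an accumulator for 0x00-terminated messages. It is not tied to the serdev API: struct framer, framer_push and FRAME_MAX are hypothetical names, and the 256-byte limit is an assumption. The point is that every call consumes bytes, so the producer is never left waiting.

```c
#include <assert.h>
#include <stddef.h>

#define FRAME_MAX 256   /* assumed upper bound on one encoded message */

struct framer {
    unsigned char buf[FRAME_MAX];
    size_t len;     /* bytes collected for the current message */
    int ready;      /* set when buf holds one complete message */
    int overflow;   /* current message exceeded FRAME_MAX; drop it */
};

/* Consume bytes from `data` until a 0x00 delimiter or the end of input.
 * Stores the number of bytes consumed in *used. Returns 1 when a complete
 * message (without its delimiter) is in f->buf[0..f->len), else 0.
 * Call again with the remaining input to pick up further messages. */
int framer_push(struct framer *f, const unsigned char *data, size_t n,
                size_t *used)
{
    assert(f != NULL && used != NULL && (data != NULL || n == 0));
    size_t i = 0;

    if (f->ready) {              /* previous message was handed out already */
        f->len = 0;
        f->ready = 0;
    }
    while (i < n) {
        unsigned char c = data[i++];
        if (c != 0x00) {
            if (f->len < FRAME_MAX)
                f->buf[f->len++] = c;
            else
                f->overflow = 1; /* too long: keep consuming, deliver nothing */
            continue;
        }
        if (f->overflow) {       /* delimiter ends an oversized message */
            f->len = 0;
            f->overflow = 0;
            continue;
        }
        f->ready = 1;
        break;
    }
    *used = i;
    return f->ready;
}
```

A receive callback would call framer_push in a loop, advancing by *used each time, and report the whole chunk as consumed regardless of whether a message was completed.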

Linux serial port read() returns the maximum buffer size passed in

unsigned char buf[256];
ssize_t num = read(fd, buf, sizeof(buf));
I have a program that reads the serial port every 100 ms. The device can send a maximum of 120 bytes of data every 100 ms. At times I observe that read() returns 256 (the size of buf[] that I passed in to read()). Because of this, all the bytes are mixed up and I see checksum failures.
Is there a way to poll the file descriptor to see whether data is available, and read only the valid data?
Yes! You can use the functions poll() or select() to know when data is available, but you will not know exactly how much. For that you can either do a regular blocking read of 120 bytes if it's a fixed-size message each time, or you'll have to read smaller parts, like the header first then the body once you know the size from the header, or you can read big chunks into a ring buffer and process them however you want in user-space.
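As a rough sketch of the poll() approach: the helper below waits up to a timeout for input and then does a single read(). read_when_ready and READ_TIMEOUT are names invented for this example. Note that the return value, not the buffer size, tells you how many bytes are valid, and that several device messages may arrive in one call, so the message boundaries still have to be found by parsing.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <stddef.h>
#include <unistd.h>

#define READ_TIMEOUT (-2)   /* hypothetical code: no data within the timeout */

/* Wait up to `timeout_ms` for `fd` to become readable, then read once.
 * Returns the number of valid bytes in `buf` (> 0), 0 on end of file,
 * -1 on error (errno set), or READ_TIMEOUT if nothing arrived in time. */
ssize_t read_when_ready(int fd, void *buf, size_t cap, int timeout_ms)
{
    assert(buf != NULL && cap > 0);
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int rc;

    do {
        /* For simplicity the full timeout is restarted after a signal. */
        rc = poll(&pfd, 1, timeout_ms);
    } while (rc == -1 && errno == EINTR);

    if (rc == -1)
        return -1;
    if (rc == 0)
        return READ_TIMEOUT;
    return read(fd, buf, cap);   /* may return fewer than `cap` bytes */
}
```

The bytes returned can then be appended to a ring buffer or a parser, which is where header and checksum handling would live.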

Why a store function in the sysfs API of the Linux Kernel needs to return used bytes?

From the documentation:
store() should return the number of bytes used from the buffer. If the
entire buffer has been used, just return the count argument.
What does it do with this value? What's the difference if from a buffer of size FOO I read 4 and not 6 bytes?
You must realize that by implementing a sysfs file, you are trying to behave like a file.
Let's see this from the other side first. From the man page of fwrite(3):
RETURN VALUE
fread() and fwrite() return the number of items successfully read or written (i.e., not the number of characters). If an error occurs, or the end-of-file is
reached, the return value is a short item count (or zero).
And even better, from the man page of write(2):
The number of bytes written may be less than count if, for example, there is insufficient space on the underlying physical medium, or the RLIMIT_FSIZE resource
limit is encountered (see setrlimit(2)), or the call was interrupted by a signal handler after having written less than count bytes. (See also pipe(7).)
What this means is that store(), which implements the other end of the write(2) call for your particular file, should return the number of bytes written (i.e. read by you), at the very least so that write(2) can return that value to the user.
In most cases, if there is no error in the input, you would just want to return count to acknowledge that you have read everything and all is ok.
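To see why the value matters from the user's side, consider how a careful program calls write(2). The loop below is a user-space sketch (write_all is a hypothetical helper name, not a kernel or libc function): whenever fewer bytes are reported than were offered, it calls write() again with the remainder. A store() that reports fewer bytes than count would therefore be expected to see the rest of the data arrive in a further call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: keep calling write() until all `count` bytes are
 * accepted. Returns the number of bytes written (normally `count`),
 * or -1 with errno set on a real error. */
ssize_t write_all(int fd, const void *buf, size_t count)
{
    assert(buf != NULL || count == 0);
    const unsigned char *p = buf;
    size_t done = 0;

    while (done < count) {
        ssize_t n = write(fd, p + done, count - done);
        if (n > 0) {
            done += (size_t)n;   /* short write: offer the remainder again */
        } else if (n == 0) {
            break;               /* nothing accepted: stop rather than spin */
        } else if (errno != EINTR) {
            return -1;           /* genuine error */
        }
        /* n == -1 with EINTR: interrupted before writing anything, retry */
    }
    return (ssize_t)done;
}
```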
