How to handle the Linux socket revents POLLERR, POLLHUP and POLLNVAL? - linux

I'm wondering what should be done when poll set these bits? Close socket, ignore it or what?

A POLLHUP means the socket is no longer connected. In TCP, this means FIN has been received and sent.
A POLLERR means the socket got an asynchronous error. In TCP, this typically means a RST has been received or sent. If the file descriptor is not a socket, POLLERR might mean the device does not support polling.
For both of the conditions above, the socket file descriptor is still open, and has not yet been closed (but shutdown() may have already been called). A close() on the file descriptor will release resources that are still being reserved on behalf of the socket. In theory, it should be possible to reuse the socket immediately (e.g., with another connect() call).
A POLLNVAL means the socket file descriptor is not open. It would be an error to close() it.

It depend on the exact error nature. Use getsockopt() to see the problem:
int error = 0;
socklen_t errlen = sizeof(error);
getsockopt(fd, SOL_SOCKET, SO_ERROR, (void *)&error, &errlen);
Values: http://www.xinotes.net/notes/note/1793/
The easiest way is to assume that the socket is no longer usable in any case and close it.

POLLNVAL means that the file descriptor value is invalid. It usually indicates an error in your program, but you can rely on poll returning POLLNVAL if you've closed a file descriptor and you haven't opened any file since then that might have reused the descriptor.
POLLERR is similar to error events from select. It indicates that a read or write call would return an error condition (e.g. I/O error). This does not include out-of-band data which select signals via its errorfds mask but poll signals via POLLPRI.
POLLHUP basically means that what's at the other end of the connection has closed its end of the connection. POSIX describes it as
The device has been disconnected. This event and POLLOUT are mutually-exclusive; a stream can never be writable if a hangup has occurred.
This is clear enough for a terminal: the terminal has gone away (same event that generates a SIGHUP: the modem session has been terminated, the terminal emulator window has been closed, etc.). POLLHUP is never sent for a regular file. For pipes and sockets, it depends on the operating system. Linux sets POLLHUP when the program on the writing end of a pipe closes the pipe, and sets POLLIN|POLLHUP when the other end of a socket closed the socket, but POLLIN only for a socket shutdown. Recent *BSD set POLLIN|POLLUP when the writing end of a pipe closes the pipe, and the behavior for sockets is more variable.

Minimal FIFO example
Once you understand when those conditions happen, it should be easy to know what to do with them.
poll.c
#define _XOPEN_SOURCE 700
#include <fcntl.h> /* creat, O_CREAT */
#include <poll.h> /* poll */
#include <stdio.h> /* printf, puts, snprintf */
#include <stdlib.h> /* EXIT_FAILURE, EXIT_SUCCESS */
#include <unistd.h> /* read */
int main(void) {
char buf[1024];
int fd, n;
short revents;
struct pollfd pfd;
fd = open("poll0.tmp", O_RDONLY | O_NONBLOCK);
pfd.fd = fd;
pfd.events = POLLIN;
while (1) {
puts("loop");
poll(&pfd, 1, -1);
revents = pfd.revents;
if (revents & POLLIN) {
n = read(pfd.fd, buf, sizeof(buf));
printf("POLLIN n=%d buf=%.*s\n", n, n, buf);
}
if (revents & POLLHUP) {
printf("POLLHUP\n");
close(pfd.fd);
pfd.fd *= -1;
}
if (revents & POLLNVAL) {
printf("POLLNVAL\n");
}
if (revents & POLLERR) {
printf("POLLERR\n");
}
}
}
GitHub upstream.
Compile with:
gcc -o poll.out -std=c99 poll.c
Usage:
sudo mknod -m 666 poll0.tmp p
./poll.out
On another shell:
printf a >poll0.tmp
POLLHUP
If you don't modify the source: ./poll.out outputs:
loop
POLLIN n=1 buf=a
loop
POLLHUP
loop
So:
POLLIN happens when input becomes available
POLLHUP happens when the file is closed by the printf
close(pfd.fd); and pfd.fd *= -1; clean things up, and we stop receiving POLLHUP
poll hangs forever
This is the normal operation.
You could now repoen the FIFO to wait for the next open, or exit the loop if you are done.
POLLNAL
If you comment out pfd.fd *= -1;: ./poll.out prints:
POLLIN n=1 buf=a
loop
POLLHUP
loop
POLLNVAL
loop
POLLNVAL
...
and loops forever.
So:
POLLIN and POLLHUP and close happened as before
since we didn't set pfd.fd to a negative number, poll keeps trying to use the fd that we closed
this keeps returning POLLNVAL forever
So we see that this shouldn't have happened, and indicates a bug in your code.
POLLERR
I don't know how to generate a POLLERR with FIFOs. Let me know if there is way. But it should be possible with file_operations of a device driver.
Tested in Ubuntu 14.04.

Related

Pipe data to thread. Read stuck

I want to trigger a callback when data is written on a file descriptor. For this I have set up a pipe and a reader thread, which reads the pipe. When it has data, the callback is called with the data.
The problem is that the reader is stuck on the read syscall. Destruction order is as follows:
Close write end of pipe (I expected this to trigger a return from blocking read, but apparently it doesn't)
Wait for reader thread to exit
Restore old file descriptor context (If stdout was redirected to the pipe, it no longer is)
Close read end of pipe
When the write end of the pipe is closed, on the read end, the read() system call returns 0 if it is blocking.
Here is an example program creating a reader thread from a pipe. The main program gets the data from stdin thanks to fgets() and write those data into the pipe. On the other side, the thread reads the pipe and triggers the callback passed as parameter. The thread stops when it gets 0 from the read() of the pipe (meaning that the main thread closed the write side):
#include <stdio.h>
#include <errno.h>
#include <pthread.h>
#include <unistd.h>
#include <string.h>
int pfd[2];
void read_cbk(char *data, size_t size)
{
int rc;
printf("CBK triggered, %zu bytes: %s", size, data);
}
void *reader(void *p){
char data[128];
void (* cbk)(char *data, size_t size) = (void (*)(char *, size_t))p;
int rc;
do {
rc = read(pfd[0], data, sizeof(data));
switch(rc) {
case 0: fprintf(stderr, "Thread: rc=0\n");
break;
case -1: fprintf(stderr, "Thread: rc=-1, errno=%d\n", errno);
break;
default: cbk(data, (size_t)rc);
}
} while(rc > 0);
}
int main(){
pthread_t treader;
int rc;
char input[128];
char *p;
pipe(pfd);
pthread_create(&treader, NULL, reader , read_cbk);
do {
// fgets() insert terminating \n and \0 in the buffer
// If EOF (that is to say CTRL-D), fgets() returns NULL
p = fgets(input, sizeof(input), stdin);
if (p != NULL) {
// Send the terminating \0 to the reader to facilitate printf()
rc = write(pfd[1], input, strlen(p) + 1);
}
} while (p);
close(pfd[1]);
pthread_join(treader, NULL);
close(pfd[0]);
}
Example of execution:
$ gcc t.c -o t -lpthread
$ ./t
azerty is not qwerty
CBK triggered, 22 bytes: azerty is not qwerty
string
CBK triggered, 8 bytes: string
# Here I typed CTRL-D to generate an EOF on stdin
Thread: rc=0
I found the problem. For redirection, the following has to be done
Create a pipe. This creates two file descriptors. One for reading, and one for writing.
dup2 so the original file descriptor is an alias to the write end of the pipe. This increments the use count of the write end by one
Thus, before synchronizing, I have to restore the context. This means that the following order is correct:
Close write end of pipe
Restore old file descriptor context
Wait for reader thread to exit
Close read end of pipe
For reference to the question, step 2 and 3 must be reorder in order to avoid deadlock.

Need to close linux FIFO on the read side after each message

Here is what I did, and it works:
"server" side (reader) pseudocode
mkfifo()
while (true){
open()
select(...NULL)
while (select... timeval(0)) {
read()
}
close()
}
"client" side (writer) real C code
int fd;
char * myfifo = "/tmp/saveterm.fifo";
char *msg = "MESSAGE";
fd = open(myfifo, O_WRONLY);
write(fd, msg, strlen(msg));
close(fd);
Now you see I need to open/close the fifo on the server after each event read. Is this supposed to be so? At first I only opened it once before the loop, but after the first message, the 'select' call would never block again until closed and reopened, and was always returning 1 and ISSET for the descriptor.
Yes, it's supposed to be like this.
Named pipes behave the same way as anonymous pipes: they both represent a single stream that is terminated when the last writer closes it. Specifically, the reader is not meant to hang forever just in case some future program decides to open the pipe and continue writing.
If you want packet based communication through a file, how about using a Unix socket in datagram mode instead?

Question on Stdin

I have the following requirement ;
1) select blocks on the file descriptor associated to stdin
2) Now how do I write a code such that select gets unblocked . The code should make the stdin file descriptor read ready .In other words the code should make the select unblock without waiting for user to give input
If we are talking about the select UNIX system call, and your are using it to wait for data on stdin, you could use the timeout parameter to indicate select you want to block for at most timeout seconds.
From the select man on Linux:
#include <sys/select.h>
int select(int nfds, fd_set *readfds, fd_set *writefds,
fd_set *exceptfds, struct timeval *timeout);
timeout is an upper bound on the amount of time elapsed before select()
returns. It may be zero, causing select() to return immediately. (This
is useful for polling.) If timeout is NULL (no timeout), select() can
block indefinitely.
The time structures involved are defined in and look like
struct timeval {
long tv_sec; /* seconds */
long tv_usec; /* microseconds */
};

using EOF for signaling on unnamed pipes

I have a test program that uses unnamed pipes created with pipe() to communicate between parent and child processes created with fork() on a Linux system.
Normally, when the sending process closes the write fd of the pipe, the receiving process returns from read() with a value of 0, indicating EOF.
However, it seems that if I stuff the pipe with a fairly large amount of data (maybe 100K bytes0 before the receiver starts reading, the receiver blocks after reading all the data in the pipe - even though the sender has closed it.
I have verified that the sending process has closed the pipe with lsof, and it seems pretty clear that the receiver is blocked.
Which leads to the question: is closing one end of the pipe a reliable way to let the receiver know that there is no more data?
If it is, and there are no conditions that can lead to a read() blocking on an empty, closed FIFO, there's something wrong with my code. If not, it means I need to find an alternate method of signalling the end of the data stream.
Resolution
I was pretty sure that the original assumption was correct, that closing a pipe causes an EOF at the reader side, this question was just a shot in the dark - thinking maybe there was some subtle pipe behavior I was overlooking. Nearly every example you ever see with pipes is a toy that sends a few bytes and exits. Things often work differently when you are no longer doing atomic operations.
In any case, I tried to simplify my code to flush out the problem and was successful in finding my problem. In pseudocode, I ended up doing something like this:
create pipe1
if ( !fork() ) {
close pipe1 write fd
do some stuff reading pipe1 until EOF
}
create pipe2
if ( !fork() ) {
close pipe2 write fd
do some stuff reading pipe2 until EOF
}
close pipe1 read fd
close pipe2 read fd
write data to pipe1
get completion response from child 1
close pipe1 write fd
write data to pipe2
get completion response from child 2
close pipe2 write fd
wait for children to exit
The child process reading pipe1 was hanging, but only when the amount of data in the pipe became substantial. This was occurring even though I had closed the pipe that child1 was reading.
A look at the source shows the problem. When I forked the second child process, it grabbed its own copy of the pipe1 file descriptors, which were left open. Even though only one process should be writing to the pipe, having it open in the second process kept it from going into an EOF state.
The problem didn't show up with small data sets, because child2 was finishing its business quickly, and exiting. But with larger data sets, child2 wasn't returning quickly, and I ended up with a deadlock.
read should return EOF when the writers have closed the write end.
Since you do a pipe and then a fork, both processes will have the write fd open. It could be that in the reader process you have forgotten to close the write portion of the pipe.
Caveat: It has been a long time since I programmed on Unix. So it might be inaccurate.
Here is some code from: http://www.cs.uml.edu/~fredm/courses/91.308/files/pipes.html. Look at the "close unused" comments below.
#include <stdio.h>
/* The index of the "read" end of the pipe */
#define READ 0
/* The index of the "write" end of the pipe */
#define WRITE 1
char *phrase = "Stuff this in your pipe and smoke it";
main () {
int fd[2], bytesRead;
char message [100]; /* Parent process message buffer */
pipe ( fd ); /*Create an unnamed pipe*/
if ( fork ( ) == 0 ) {
/* Child Writer */
close (fd[READ]); /* Close unused end*/
write (fd[WRITE], phrase, strlen ( phrase) +1); /* include NULL*/
close (fd[WRITE]); /* Close used end*/
printf("Child: Wrote '%s' to pipe!\n", phrase);
} else {
/* Parent Reader */
close (fd[WRITE]); /* Close unused end*/
bytesRead = read ( fd[READ], message, 100);
printf ( "Parent: Read %d bytes from pipe: %s\n", bytesRead, message);
close ( fd[READ]); /* Close used end */
}
}

recv with MSG_NONBLOCK and MSG_WAITALL

I want to use recv syscall with nonblocking flags MSG_NONBLOCK. But with this flag syscall can return before full request is satisfied. So,
can I add MSG_WAITALL flag? Will it be nonblocking?
or how should I rewrite blocking recv into the loop with nonblocking recv
For IPv4 TCP receives on Linux at least, MSG_WAITALL is ignored if MSG_NONBLOCK is specified (or the file descriptor is set to non-blocking).
From tcp_recvmsg() in net/ipv4/tcp.c in the Linux kernel:
if (copied >= target && !sk->sk_backlog.tail)
break;
if (copied) {
if (sk->sk_err ||
sk->sk_state == TCP_CLOSE ||
(sk->sk_shutdown & RCV_SHUTDOWN) ||
!timeo ||
signal_pending(current))
break;
target in this cast is set to to the requested size if MSG_DONTWAIT is specified or some smaller value (at least 1) if not. The function will complete if:
Enough bytes have been copied
There's a socket error
The socket has been closed or shutdown
timeo is 0 (socket is set to non-blocking)
There's a signal pending for the process
To me this seems like it may be a bug in Linux, but either way it won't work the way you want. It looks like dec-vt100's solution will, but there is a race condition if you try to receive from the same socket in more than one process or thread.That is, another recv() call by another thread/process could occur after your thread has performed a peek, causing your thread to block on the second recv().
EDIT:
Plain recv() will return whatever is in the tcp buffer at the time of the call up to the requested number of bytes. MSG_DONTWAIT just avoids blocking if there is no data at all ready to be read on the socket. MSG_WAITALL requests blocking until the entire number of bytes requested can be read. So you won't get "all or none" behavior. At best you should get EAGAIN if no data is present and block until the full message is available otherwise.
You might be able to fashion something out of MSG_PEEK or ioctl() with a FIONREAD (if your system supports it) that effectively behaves like you want but I am unaware how you can accomplish your goal just using the recv() flags.
This is what I did for the same problem, but I'd like some confirmation that this works as expected...
ssize_t recv_allOrNothing(int socket_id, void *buffer, size_t buffer_len, bool block = false)
{
if(!block)
{
ssize_t bytes_received = recv(socket_id, buffer, buffer_len, MSG_DONTWAIT | MSG_PEEK);
if (bytes_received == -1)
return -1;
if ((size_t)bytes_received != buffer_len)
return 0;
}
return recv(socket_id, buffer, buffer_len, MSG_WAITALL);
}

Resources