poll() can't detect event when socket is closed locally? - linux

I'm working on a project to port a TCP/IP client program to an embedded ARM-Linux controller board. The client program was originally written using epoll(). However, the target platform is quite old: the only kernel available is 2.4.x, which does not support epoll(). So I decided to rewrite the I/O loop using poll().
But while testing the code, I found that poll() does not behave as I expected: it does not return when a TCP/IP client socket is closed locally by another thread. I've written a very simple program to test this:
#include <stdio.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <fcntl.h>
#include <pthread.h>
#include <poll.h>
#include <unistd.h>
struct pollfd fdList[1];
void *thread_runner(void *arg)
{
sleep(10);
close(fdList[0].fd);
printf("socket closed\n");
pthread_exit(NULL);
}
int main(void)
{
struct sockaddr_in hostAddr;
int sockFD;
char buf[32];
pthread_t handle;
sockFD = socket(AF_INET, SOCK_STREAM, 0);
fcntl(sockFD,F_SETFL,O_NONBLOCK|fcntl(sockFD,F_GETFL,0));
inet_aton("127.0.0.1",&(hostAddr.sin_addr));
hostAddr.sin_family = AF_INET;
hostAddr.sin_port = htons(12345);
connect(sockFD,(struct sockaddr *)&hostAddr,sizeof(struct sockaddr));
fdList[0].fd = sockFD;
fdList[0].events = POLLOUT;
pthread_create(&handle,NULL,thread_runner,NULL);
while(1) {
if(poll(fdList,1,-1) < 1) {
continue;
}
if(fdList[0].revents & POLLNVAL ) {
printf("POLLNVAL\n");
exit(-1);
}
if(fdList[0].revents & POLLOUT) {
printf("connected\n");
fdList[0].events = POLLIN;
}
if(fdList[0].revents & POLLHUP ) {
printf("closed by peer\n");
close(fdList[0].fd);
exit(-1);
}
if(fdList[0].revents & POLLIN) {
if( read(fdList[0].fd, buf, sizeof(buf)) < 0) {
printf("closed by peer\n");
close(fdList[0].fd);
exit(-1);
}
}
}
return 0;
}
In this code I first create a TCP client socket, set it to non-blocking mode, hand it to poll(), and then close() the socket from another thread. The result: "POLLNVAL" is never printed when the socket is closed.
Is this the expected behavior of poll()? Would it help if I used select() instead of poll()?

Yes, this is expected behavior. You solve this by using shutdown() on the socket instead of close().
See e.g. http://www.faqs.org/faqs/unix-faq/socket/ section 2.6
EDIT: The reason this is expected is that poll() and select() react to events happening on one of their fds. close() removes the fd, so it no longer exists at all and thus can't have any events associated with it.
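To illustrate the difference, here is a minimal sketch in which the second thread calls shutdown() rather than close(). It is an illustration under assumptions, not the asker's program: a socketpair() stands in for the TCP connection, and the helper names shutdown_later and poll_until_shutdown are made up for this example. Because the descriptor stays open, poll() has something to report, and on Linux it typically returns with POLLIN and POLLHUP set.

```c
#include <poll.h>
#include <pthread.h>
#include <sys/socket.h>
#include <unistd.h>

/* Worker thread: after a short delay, shut the socket down instead of
 * closing it. The descriptor stays valid, so poll() can report on it. */
static void *shutdown_later(void *arg)
{
    int fd = *(int *)arg;
    poll(NULL, 0, 100);            /* sleep ~100 ms so the poller blocks first */
    shutdown(fd, SHUT_RDWR);
    return NULL;
}

/* Returns the revents reported by poll(), or -1 on error or timeout. */
int poll_until_shutdown(void)
{
    int sv[2];
    pthread_t tid;
    struct pollfd pfd;
    int rc;

    /* Assumption: a socketpair stands in for the connected TCP socket. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) < 0)
        return -1;

    pfd.fd = sv[0];
    pfd.events = POLLIN;
    pfd.revents = 0;

    if (pthread_create(&tid, NULL, shutdown_later, &sv[0]) != 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    rc = poll(&pfd, 1, 5000);      /* 5 s safety timeout */
    pthread_join(tid, NULL);

    /* Only now, with no thread polling it, is the descriptor closed. */
    close(sv[0]);
    close(sv[1]);

    return rc == 1 ? pfd.revents : -1;
}
```

The caller closes the descriptor only after poll() has returned, so no thread is still waiting on it when the fd number becomes reusable.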

Related

Inconsistent behavior of select for socket between Linux and BSD

I am building a cross platform socket program under MacOS (FreeBSD) and Linux, like this
#include <stdio.h>
#include <stdlib.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <unistd.h>
int main(void) {
fd_set rfds;
fd_set wfds;
struct timeval tv;
int retval;
int fd = socket(AF_INET, SOCK_STREAM, 0);
FD_ZERO(&rfds);
FD_ZERO(&wfds);
FD_SET(fd, &rfds);
FD_SET(fd, &wfds);
tv.tv_sec = 1;
tv.tv_usec = 0;
retval = select(fd + 1, &rfds, &wfds, NULL, &tv);
printf("ready sockets = %d\n", retval);
exit(EXIT_SUCCESS);
}
It's very simple: it creates a socket and uses select to see whether it's readable or writable.
If I run it under MacOS, I get
ready sockets = 0
The program blocks for 1 second and then prints that line. But if I run it under Linux, it prints
ready sockets = 2
immediately. That seems very odd to me: the socket was just created and there is nothing to read or write, yet select tells me it's ready for both reading and writing. Why do they behave differently?

which signal should I use to come out of accept() API?

I have two threads: one is blocked in accept() waiting for a new connection, and the other talks to other processes. When my application is shutting down, I need to wake the first thread up from accept(). I have read the man page of accept() but did not find anything useful. My question is: which signal should I send from the second thread to the first so that it comes out of accept() without getting killed?
Thanks.
You can use select with a timeout, so that, for example, the thread executing accept wakes up every 1 or 2 seconds if nothing occurs and checks for shutdown. You can check this page to get an idea.
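A minimal sketch of that idea for Linux follows. The names stop_requested, make_listener and accept_until_stopped are hypothetical, and the loopback listener on a kernel-chosen port is only there to make the example self-contained. The accepting thread never blocks in accept() itself; it waits in select() and re-checks the flag after every timeout.

```c
#include <errno.h>
#include <netinet/in.h>
#include <stdatomic.h>
#include <string.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

/* Set to 1 by the other thread to request shutdown (hypothetical name). */
atomic_int stop_requested;

/* Create a TCP listener on 127.0.0.1; port 0 lets the kernel pick one. */
int make_listener(void)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(0);
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0 ||
        listen(fd, 5) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Wait for a connection, waking every timeout_ms to check the flag.
 * Returns the accepted fd, or -1 if shutdown was requested or on error. */
int accept_until_stopped(int listen_fd, int timeout_ms)
{
    while (!atomic_load(&stop_requested)) {
        fd_set rfds;
        struct timeval tv;
        int rc;

        FD_ZERO(&rfds);
        FD_SET(listen_fd, &rfds);
        /* select() may modify tv, so it is reinitialised every iteration. */
        tv.tv_sec = timeout_ms / 1000;
        tv.tv_usec = (timeout_ms % 1000) * 1000;

        rc = select(listen_fd + 1, &rfds, NULL, NULL, &tv);
        if (rc < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        if (rc > 0 && FD_ISSET(listen_fd, &rfds))
            return accept(listen_fd, NULL, NULL);
        /* rc == 0: timeout, loop around and re-check the flag */
    }
    return -1;
}
```

The second thread would call atomic_store(&stop_requested, 1) when the application shuts down; the accepting thread then returns within one timeout period, and no signal is needed.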
Without using "select":
This example code worked very well on Windows: it displayed "Exit" when SIGINT was raised. You can adapt the code for Linux. Almost every socket function is identical, except that you should use "close" instead of "closesocket", delete the first two lines of main (they start Winsock), and add the necessary Linux header files.
#include <stdio.h>
#include <winsock.h>
#include <signal.h>
#include <setjmp.h>  // jmp_buf
#include <tchar.h>   // _tmain, _TCHAR
#include <thread>
#pragma comment(lib,"wsock32.lib")
jmp_buf EXIT_POINT;
int sock,sockl=sizeof(struct sockaddr);
struct sockaddr_in xx,client;
int AcceptConnections = 1;
void _catchsignal(int signal)
{
closesocket(sock);
}
void thread_accept()
{
accept(sock,(struct sockaddr*)&client,&sockl);
}
void thread_sleep()
{
Sleep(1000);
raise(SIGINT);
}
int _tmain(int argc, _TCHAR* argv[])
{
WSADATA wsaData;
WSAStartup(MAKEWORD( 2, 2 ),&wsaData);
signal(SIGINT,_catchsignal);
xx.sin_addr.s_addr = INADDR_ANY;
xx.sin_family = AF_INET;
xx.sin_port = htons(9090);
sock = socket(AF_INET,SOCK_STREAM,0);
bind(sock,(struct sockaddr*)&xx,sizeof(struct sockaddr));
listen(sock,20);
std::thread th_accept(thread_accept);
std::thread th_sleep(thread_sleep);
th_accept.join();
th_sleep.join();
printf("Exit");
return 0;
}
First, you can use the "select" function to wait for incoming connections without blocking the thread indefinitely. You can learn more about select on MSDN and in Beej's guide; I recommend the latter. The MSDN socket programming resources are also useful, because Windows and most other operating systems implement BSD sockets almost identically. Once you accept connections without blocking, you can define a global variable that stops the loop.
Sorry for my English. Here is an example:
#include <stdio.h>
#include <stdlib.h>
#include <winsock.h>
#include <string.h>
#define DEFAULT_PORT 9090
#define QUEUE_LIMIT 20
int main()
{
WSADATA wsaData;
WSAStartup(MAKEWORD( 2, 2 ),&wsaData);
int ServerStream,SocketQueueMax=0,i,j,TMP_ClientStream;
int ClientAddrSize = sizeof(struct sockaddr),RecvBufferLength;
fd_set SocketQueue,SocketReadQueue,SocketWriteQueue;
struct sockaddr_in ServerAddr,TMP_ClientAddr;
struct timeval SocketTimeout;
char RecvBuffer[255];
const char *HelloMsg = "Connected.";
SocketTimeout.tv_sec = 1;
SocketTimeout.tv_usec = 0;
ServerAddr.sin_addr.s_addr = INADDR_ANY;
ServerAddr.sin_family = AF_INET;
ServerAddr.sin_port = htons(DEFAULT_PORT);
ServerStream = socket(AF_INET,SOCK_STREAM,0);
bind(ServerStream,(struct sockaddr*)&ServerAddr,sizeof(struct sockaddr));
listen(ServerStream,QUEUE_LIMIT);
FD_ZERO(&SocketQueue);
FD_ZERO(&SocketReadQueue);
FD_ZERO(&SocketWriteQueue);
FD_SET(ServerStream,&SocketQueue);
SocketQueueMax = ServerStream;
bool AcceptConnections = 1;
while(AcceptConnections)
{
SocketReadQueue = SocketQueue;
SocketWriteQueue = SocketQueue;
select(SocketQueueMax + 1,&SocketReadQueue,&SocketWriteQueue,NULL,&SocketTimeout);
for(i=0;i < SocketQueueMax + 1;i++)
{
if(FD_ISSET(i,&SocketReadQueue))
{
if(i == ServerStream)
{
TMP_ClientStream = accept(ServerStream,(struct sockaddr*)&TMP_ClientAddr,&ClientAddrSize);
send(TMP_ClientStream,HelloMsg,strlen(HelloMsg),0);
FD_SET(TMP_ClientStream,&SocketQueue);
if(TMP_ClientStream > SocketQueueMax)
{
SocketQueueMax = TMP_ClientStream;
}
continue;
}
while((RecvBufferLength = recv(i,RecvBuffer,254,0)) > 0)
{
RecvBuffer[RecvBufferLength] = '\0';
for(j=0;j<SocketQueueMax + 1;j++)
{
if(j == i || j == ServerStream || !FD_ISSET(j,&SocketQueue))
{
continue;
}
send(j,RecvBuffer,RecvBufferLength + 1,0);
}
printf("%s",RecvBuffer);
if(RecvBufferLength < 254)
{
break;
}
}
}
}
}
return EXIT_SUCCESS;
}

setsockopt on "accepted" fd on Linux

I have made a rather strange observation about the behavior of setsockopt on Linux for SO_REUSEADDR. In one line: if I apply the socket option to an fd returned by accept on a listening socket, the option is reflected on the port held by the listening socket.
OK, some code.
Server: Opens a socket and sets SO_REUSEADDR to true. Accepts a connection and then sets SO_REUSEADDR to false on the fd returned by accept.
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <string.h>
int main(void)
{
int s;
socklen_t len = sizeof(struct sockaddr_in); /* accept() reads this as the buffer size */
int sin_size;
int reuse = 1;
int ret;
struct sockaddr_in my_addr;
memset(&my_addr, 0, sizeof(my_addr));
my_addr.sin_family = AF_INET;
my_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
my_addr.sin_port = htons(33235);
if( (s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
printf("Socket Error\n");
return -1;
}
setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(int));
if( bind(s, (struct sockaddr*)&my_addr, sizeof(struct sockaddr)) < 0)
{
printf("Bind Error\n");
return -1;
}
listen(s, 6);
reuse = 0;
memset(&my_addr, 0, sizeof(my_addr));
while(1) {
ret = accept(s, (struct sockaddr*)&my_addr, &len);
if (ret<0) {
printf("Accept failed\n");
} else {
printf("Accepted a client setting reuse add to 0\n");
setsockopt(ret, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(int));
}
}
printf("Server exiting\n");
return 0;
}
Client: Connects to the server and does nothing after that, ensuring that the server socket stays in the TIME_WAIT state.
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
int main(void)
{
int s, len;
int sin_size;
struct sockaddr_in my_addr;
memset(&my_addr, 0, sizeof(my_addr));
my_addr.sin_family = AF_INET;
my_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
my_addr.sin_port = htons(33235);
if( (s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
printf("Socket Error\n");
return -1;
}
if (!connect(s,(struct sockaddr*)&my_addr, sizeof(struct sockaddr)))
{
printf("Client Connected successfully\n");
}
else
{
printf("%s\n",strerror(errno));
}
while(1) sleep(1);
return 0;
}
Steps I use to reproduce the issue:
1. Run the server.
2. Connect the client.
3. Kill and restart the server. The server fails with "Bind Error".
I tested this on Mac OS, and the bind didn't fail. I have dug through all the POSIX specifications, and none of them say that this code is undefined.
Question:
Can someone with more experience on this share their understanding of the issue?
One way to think about it is that SO_REUSEADDR determines whether you can have another socket bound to that same address. It's a property of any socket (listening or connected), but it is very commonly inherited from the listening socket via accept. In Linux it's mapped to the struct sock "sk_reuse" flag.
If you clear this flag on a FD you "accepted" then from that point on the IP/Port pair is considered busy-and-non-reusable. The SO_REUSEADDR flag on the listen socket does not change, but the flag on the accepted socket affects bind logic. You could probably check this with getsockopt.
If you want to know more you can try to read the inet_csk_get_port function: http://lxr.free-electrons.com/source/net/ipv4/inet_connection_sock.c#L100. This is where the actual "binding" takes place.
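The getsockopt check suggested above could look roughly like this. The helper names reuseaddr_enabled and set_reuseaddr are made up for this sketch; only the standard getsockopt() and setsockopt() calls are used.

```c
#include <sys/socket.h>

/* Returns 1 if SO_REUSEADDR is set on fd, 0 if it is clear, -1 on error. */
int reuseaddr_enabled(int fd)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &value, &len) < 0)
        return -1;
    return value != 0;
}

/* Sets (on != 0) or clears (on == 0) SO_REUSEADDR.
 * Returns 0 on success, -1 on error. */
int set_reuseaddr(int fd, int on)
{
    return setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on));
}
```

In the server from the question, printing reuseaddr_enabled(s) and reuseaddr_enabled(ret) before and after the setsockopt call in the accept loop would show whether the listening socket's flag really stays set while the accepted socket's flag is cleared.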

Closing a file descriptor that is being polled

If I have two threads (Linux, NPTL), where one thread is polling on one or more file descriptors and another is closing one of them, is that a reasonable thing to do? Am I doing something that I shouldn't be doing in a multithreaded environment?
The main reason I consider doing that, is that I don't necessarily want to communicate with the polling thread, interrupt it, etc., I instead would like to just close the descriptor for whatever reasons, and when the polling thread wakes up, I expect the revents to contain POLLNVAL, which would be the indication that the file descriptor should just be thrown away by the thread before the next poll.
I've put together a simple test, which shows that POLLNVAL is indeed what gets reported. However, POLLNVAL is only set when the timeout expires; closing the socket doesn't seem to make poll() return. If that's the case, I can signal the thread with pthread_kill() to interrupt poll() and make it return.
#define _GNU_SOURCE
#include <stdio.h>
#include <pthread.h>
#include <poll.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>
#include <string.h>
#include <signal.h>
static pthread_t main_thread;
void * close_some(void*a) {
printf("thread #2 (%d) is sleeping\n", getpid());
sleep(2);
close(0);
printf("socket closed\n");
// comment out the next line to not forcefully interrupt
pthread_kill(main_thread, SIGUSR1);
return 0;
}
void on_sig(int s) {
printf("signal received\n");
}
int main(int argc, char ** argv) {
pthread_t two;
struct pollfd pfd;
int rc;
struct sigaction act;
act.sa_handler = on_sig;
sigemptyset(&act.sa_mask);
act.sa_flags = 0;
sigaction(SIGUSR1, &act, 0);
main_thread = pthread_self();
pthread_create(&two, 0, close_some, 0);
pfd.fd = 0;
pfd.events = POLLIN | POLLRDHUP;
printf("thread 0 (%d) polling\n", getpid());
rc = poll(&pfd, 1, 7000);
if (rc < 0) {
printf("error : %s\n", strerror(errno));
} else if (!rc) {
printf("time out!\n");
} else {
printf("revents = %x\n", pfd.revents);
}
return 0;
}
For Linux at least, this seems risky. The manual page for close warns:
It is probably unwise to close file descriptors while they may be in
use by system calls in other threads in the same process. Since a
file descriptor may be reused, there are some obscure race conditions
that may cause unintended side effects.
Since you're on Linux, you could do the following:
1. Set up an eventfd and add it to the poll.
2. Signal the eventfd (write to it) when you want to close an fd.
3. In the polling thread, when you see activity on the eventfd, you can immediately close the fd and remove it from the poll set.
Alternatively you could simply establish a signal handler and check for errno == EINTR when poll returns. The signal handler would only need to set some global variable to the value of the fd you're closing.
Since you're on Linux you might want to consider epoll as a superior albeit non-standard alternative to poll.
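The eventfd steps above can be sketched as follows. This is a Linux-only illustration under assumptions: wake_fd, request_close and poll_with_wakeup are hypothetical names, and the requesting thread is started inside the function only to keep the example self-contained. The point is that the polling thread closes the descriptor itself, so it is never closed while poll() is still using it.

```c
#include <poll.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/eventfd.h>
#include <unistd.h>

static int wake_fd = -1;   /* eventfd used only to wake the poller */

/* Second thread: instead of closing the watched fd itself, it asks the
 * polling thread to do so by making the eventfd readable. */
static void *request_close(void *arg)
{
    uint64_t one = 1;
    (void)arg;
    poll(NULL, 0, 100);                    /* sleep ~100 ms */
    if (write(wake_fd, &one, sizeof(one)) != (ssize_t)sizeof(one))
        return (void *)1;
    return NULL;
}

/* Polls watched_fd together with the eventfd. Returns 1 if the poller was
 * woken through the eventfd and closed watched_fd itself, 0 if poll()
 * returned for another reason (timeout, data), -1 on error. */
int poll_with_wakeup(int watched_fd)
{
    struct pollfd pfds[2];
    pthread_t tid;
    int rc, closed = 0;

    wake_fd = eventfd(0, 0);
    if (wake_fd < 0)
        return -1;

    pfds[0].fd = wake_fd;
    pfds[0].events = POLLIN;
    pfds[0].revents = 0;
    pfds[1].fd = watched_fd;
    pfds[1].events = POLLIN;
    pfds[1].revents = 0;

    if (pthread_create(&tid, NULL, request_close, NULL) != 0) {
        close(wake_fd);
        return -1;
    }

    rc = poll(pfds, 2, 5000);              /* 5 s safety timeout */
    if (rc > 0 && (pfds[0].revents & POLLIN)) {
        uint64_t count;
        /* Reading resets the eventfd counter; then close in this thread. */
        if (read(wake_fd, &count, sizeof(count)) == (ssize_t)sizeof(count)) {
            close(watched_fd);
            closed = 1;
        }
    }

    pthread_join(tid, NULL);
    close(wake_fd);
    return rc < 0 ? -1 : closed;
}
```

In a real program the eventfd would be created once and the value written to it could identify which descriptor to drop; the single-shot form here only shows the wake-up mechanism.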

How can I trap a signal (`SIGPIPE`) for a socket that closes?

I've written a server that accepts a socket connection on a secondary port for the purpose of streaming debugging information that normally goes to stderr. This second port (an error-serving port) is only intended to have one connection at a time, which is convenient because it allows me to redirect stderr using a dup2(2) call. (See Can I redirect a parent process's stderr to a socket file descriptor on a forked process?).
The following code is nearly satisfactory in every regard. When a client logs into the port, the stderr stream is directed to the socket. When another client logs in, the stream is redirected again, and the first client stops receiving: entirely satisfactory.
Where it falls short in the design is when the client closes the connection, the server crashes because it is trying to write() to a socket that is closed.
I've got a rudimentary signal handler for the normal child processes, but I'm not sure how to handle the specific signal from the parent process when the error socket closes.
How can I trap the signal (in the parent) that the connection on the ERR_PORT_NUM has closed and have the signal handler reopen (or dup) stderr back to /dev/null for the next awaiting error client?
Also, what should I do with an original error client connection when a second connects? Currently the first client is left dangling. Even a non-graceful shut-down of the first connection is acceptable.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/socket.h>
#include <fcntl.h>
#include <errno.h>
#include <pwd.h>
#include <signal.h>
#include <netinet/in.h>
#include <sys/mman.h>
#define PORT_NUM 12345
#define ERR_PORT_NUM 54321
static void child_handler(int signum)
{
switch (signum) {
case SIGALRM:
exit(EXIT_FAILURE);
break;
case SIGUSR1:
exit(EXIT_SUCCESS);
break;
case SIGCHLD:
exit(EXIT_FAILURE);
break;
}
}
static void daemonize(void)
{
/* Trap signals that we expect to receive */
signal(SIGUSR1, child_handler);
signal(SIGALRM, child_handler);
signal(SIGCHLD, SIG_IGN); /* A child process dies */
signal(SIGTSTP, SIG_IGN); /* Various TTY signals */
signal(SIGTTOU, SIG_IGN);
signal(SIGTTIN, SIG_IGN);
signal(SIGHUP, SIG_IGN); /* Ignore hangup signal */
signal(SIGTERM, SIG_DFL); /* Die on SIGTERM */
freopen("/dev/null", "r", stdin);
freopen("/dev/null", "w", stdout);
freopen("/dev/null", "w", stderr);
}
static void server_work(void)
{
int sockfd, err_sockfd;
socklen_t clilen;
struct sockaddr_in serv_addr, cli_addr, err_serv_addr, err_cli_addr;
struct timeval tv = { 0 };
int new_stderr;
sockfd = socket(AF_INET, SOCK_STREAM, 0);
err_sockfd = socket(AF_INET, SOCK_STREAM, 0);
if (sockfd < 0 || err_sockfd < 0)
return;
memset((char *) &serv_addr, '\0', sizeof(serv_addr));
memset((char *) &err_serv_addr, '\0', sizeof(err_serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_addr.s_addr = INADDR_ANY;
serv_addr.sin_port = htons(PORT_NUM);
err_serv_addr.sin_family = AF_INET;
err_serv_addr.sin_addr.s_addr = INADDR_ANY;
err_serv_addr.sin_port = htons(ERR_PORT_NUM);
if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr))
< 0)
return;
if (bind
(err_sockfd, (struct sockaddr *) &err_serv_addr,
sizeof(err_serv_addr)) < 0)
return;
listen(sockfd, 5);
listen(err_sockfd, 5);
clilen = sizeof(cli_addr);
while (1) {
int maxfd;
fd_set read_sockets_set;
FD_ZERO(&read_sockets_set);
FD_SET(sockfd, &read_sockets_set);
FD_SET(err_sockfd, &read_sockets_set);
maxfd = (err_sockfd > sockfd) ? err_sockfd : sockfd;
if (select(maxfd + 1, &read_sockets_set, NULL, NULL, NULL) < 0) {
break;
}
if (FD_ISSET(sockfd, &read_sockets_set)) {
/* Typical process fork(2) and such ... not germane to the question. */
}
if (FD_ISSET(err_sockfd, &read_sockets_set)) {
new_stderr =
accept(err_sockfd, (struct sockaddr *) &err_cli_addr,
&clilen);
dup2(new_stderr, STDERR_FILENO);
}
}
close(sockfd);
close(err_sockfd);
return;
}
int main(int argc, char *argv[])
{
daemonize(); /* greatly abbreviated for question */
server_work();
return 0;
}
You could simply ignore SIGPIPE. It's a useless, annoying signal.
signal(SIGPIPE, SIG_IGN);
If you ignore it then your program will instead receive an EPIPE error code from the failed write() call. This lets you handle the I/O error at a sensible place in your code rather than in some global signal handler.
EPIPE
fd is connected to a pipe or socket whose reading end is closed. When this happens the writing process will also receive a SIGPIPE signal. (Thus, the write return value is seen only if the program catches, blocks or ignores this signal.)
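As a rough sketch of handling the error where the write happens, the following ignores SIGPIPE once and, when write() reports EPIPE, points the descriptor back at /dev/null, which is what the question wanted to do with stderr. The names ignore_sigpipe and write_or_detach are hypothetical, and a real server might also want to treat other errors such as ECONNRESET the same way.

```c
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <unistd.h>

/* Call once at startup so that a write() to a closed peer fails with
 * EPIPE instead of terminating the process. */
void ignore_sigpipe(void)
{
    signal(SIGPIPE, SIG_IGN);
}

/* Writes buf to fd. If the peer has gone away (EPIPE), fd is redirected
 * to /dev/null so later writes succeed. Returns the number of bytes
 * written, or -1 with errno set. */
ssize_t write_or_detach(int fd, const void *buf, size_t len)
{
    ssize_t n = write(fd, buf, len);

    if (n < 0 && errno == EPIPE) {
        int devnull = open("/dev/null", O_WRONLY);
        if (devnull >= 0) {
            dup2(devnull, fd);         /* replaces the dead socket on fd */
            if (devnull != fd)
                close(devnull);
        }
        errno = EPIPE;                 /* report the original error */
        return -1;
    }
    return n;
}
```

With this in place the server would call ignore_sigpipe() in daemonize() and route its debug output through something like write_or_detach(STDERR_FILENO, msg, len); the first failed write after the error client disconnects resets stderr for the next client.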

Resources