Suppose the following series of events occurs:
We set up a listening socket
Thread A blocks waiting for the listening socket to become readable, using EPOLLIN | EPOLLEXCLUSIVE
Thread B also blocks waiting for the listening socket to become readable, also using EPOLLIN | EPOLLEXCLUSIVE
An incoming connection arrives at the listening socket, making the socket readable, and the kernel elects to wake up thread A.
But, before the thread actually wakes up and calls accept, a second incoming connection arrives at the listening socket.
Here, the socket is already readable, so the second connection doesn't change that. This is level-triggered epoll, so according to the normal rules, the second connection can be treated as a no-op, and the second thread doesn't need to be awoken. ...Of course, not waking up the second thread would kind of defeat the whole purpose of EPOLLEXCLUSIVE? But my trust in API designers doing the right thing is not as strong as it once was, and I can't find anything in the documentation to rule this out.
Questions
a) Is the above scenario possible, where two connections arrive but only one thread is woken? Or is it guaranteed that every distinct incoming connection on a listening socket will wake another thread?
b) Is there a general rule to predict how EPOLLEXCLUSIVE and level-triggered epoll interact?
c) What about EPOLLIN | EPOLLEXCLUSIVE and EPOLLOUT | EPOLLEXCLUSIVE for byte-stream fds, like a connected TCP socket or a pipe? E.g. what happens if more data arrives while a pipe is already readable?
Edited (original answer is after the code used for testing)
To make sure things are clear, I'll go over EPOLLEXCLUSIVE as it relates to edge-triggered events (EPOLLET) as well as level-triggered events, to show how these affect the expected behavior.
As you well know:
Edge Triggered: Once you set EPOLLET, events are triggered only if they change the state of the fd - meaning that only the first event is triggered and no new events will get triggered until that event is fully handled.
This design is explicitly meant to prevent epoll_wait from returning due to an event that is in the process of being handled (i.e., when new data arrives while the EPOLLIN was already raised but read hadn't been called or not all of the data was read).
The edge-triggered event rule is simple: all same-type (i.e. EPOLLIN) events are merged until all available data has been processed.
In the case of a listening socket, the EPOLLIN event won't be triggered again until all existing listen "backlog" sockets have been accepted using accept.
In the case of a byte stream, new events won't be triggered until all the available bytes have been read from the stream (the buffer was emptied).
Level Triggered: On the other hand, level triggered events will behave closer to how legacy select (or poll) operates, allowing epoll to be used with older code.
The event-merger rule is more complex: events of the same type are only merged if no one is waiting for an event (no one is waiting for epoll_wait to return), or if multiple events happen before epoll_wait can return... otherwise any event causes epoll_wait to return.
In the case of a listening socket, the EPOLLIN event will be triggered every time a client connects... unless no one is waiting for epoll_wait to return, in which case the next call for epoll_wait will return immediately and all the EPOLLIN events that occurred during that time will have been merged into a single event.
In the case of a byte stream, new events will be triggered every time new data comes in... unless, of course, no one is waiting for epoll_wait to return, in which case the next call will return immediately for all the data that arrived until epoll_wait returned (even if it arrived in different chunks / events).
Exclusive return: The EPOLLEXCLUSIVE flag is used to prevent the "thundering herd" behavior, so only a single epoll_wait caller is woken up for each fd wake-up event.
As I pointed out before, for edge-triggered states, an fd wake-up event is a change in the fd state. So all EPOLLIN events will be merged until all data has been read (the listening socket's backlog has been emptied).
On the other hand, for level-triggered events, each EPOLLIN will invoke a wake-up event. If no one is waiting, these events will be merged.
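A minimal sketch of the kind of registration this flag applies to, assuming Linux 4.5 or later with headers that define EPOLLEXCLUSIVE. The helper name make_exclusive_waiter is made up for illustration; each waiting thread would call it once for the shared fd and then block in epoll_wait on the epoll fd it returns.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Hypothetical helper: create a private epoll instance that watches `fd` for
 * readability with EPOLLEXCLUSIVE set. Pass EPOLLET in `extra` for
 * edge-triggered mode, or 0 for level-triggered mode.
 * Returns the new epoll fd, or -1 on error (errno is preserved). */
int make_exclusive_waiter(int fd, unsigned int extra) {
  int epoll_fd = epoll_create1(0);
  if (epoll_fd < 0)
    return -1;

  struct epoll_event ev = {0};
  ev.data.fd = fd;
  ev.events = EPOLLIN | EPOLLEXCLUSIVE | extra;

  /* EPOLLEXCLUSIVE may only be given with EPOLL_CTL_ADD. */
  if (epoll_ctl(epoll_fd, EPOLL_CTL_ADD, fd, &ev) < 0) {
    int saved = errno;
    close(epoll_fd);
    errno = saved;
    return -1;
  }
  return epoll_fd;
}
```

The flag is set per registration, so an epoll instance that adds the same fd without EPOLLEXCLUSIVE still receives every event.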
Following the example in your question:
For level triggered events: every time a client connects, a single thread will return from epoll_wait... BUT, if two more clients were to connect while both threads were busy accepting the first two clients, these EPOLLIN events would merge into a single event and the next call to epoll_wait will return immediately with that merged event.
In the context of the example given in the question, thread B is expected to "wake up" due to epoll_wait returning.
In this case, both threads will "race" towards accept.
However, this doesn't defeat the EPOLLEXCLUSIVE directive or intent.
The EPOLLEXCLUSIVE directive is meant to prevent the "thundering herd" phenomenon. In this case, two threads are racing to accept two connections. Each thread can (presumably) call accept safely, with no errors. If three threads were used, the third would keep on sleeping.
If the EPOLLEXCLUSIVE weren't used, all the epoll_wait threads would have been woken up whenever a connection was available, meaning that as soon as the first connection arrived, both threads would have been racing to accept a single connection (resulting in a possible error for one of them).
For edge triggered events: only one thread is expected to receive the "wake up" call. That thread is expected to accept all waiting connections (empty the listen "backlog"). No more EPOLLIN events will be raised for that socket until the backlog is emptied.
The same applies to readable sockets and pipes. The thread that was woken up is expected to deal with all the readable data. This prevents two waiting threads from attempting to read the data concurrently and experiencing file lock race conditions.
I would recommend (and this is what I do) setting the listening socket to non-blocking mode and calling accept in a loop until an EAGAIN (or EWOULDBLOCK) error is raised, indicating that the backlog is empty. There is no way to avoid the risk of events being merged. The same is true for reading from a socket.
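The accept loop recommended above could look roughly like this. It is only a sketch: the helper name drain_accept_backlog is made up, the listening socket is assumed to already be in non-blocking mode, and the accepted connection is simply closed where a real server would hand it to a handler.

```c
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: accept every pending connection on a NON-BLOCKING
 * listening socket. Returns the number of connections accepted once the
 * backlog is empty, or -1 on an unexpected error. */
int drain_accept_backlog(int listen_fd) {
  int accepted = 0;
  for (;;) {
    int conn_fd = accept(listen_fd, NULL, NULL);
    if (conn_fd >= 0) {
      /* A real server would pass conn_fd to its connection handler. */
      close(conn_fd);
      accepted++;
      continue;
    }
    if (errno == EINTR || errno == ECONNABORTED)
      continue; /* transient condition: try the next pending connection */
    if (errno == EAGAIN || errno == EWOULDBLOCK)
      return accepted; /* backlog is empty */
    return -1; /* real error, see errno */
  }
}
```

Calling this after every epoll_wait wake-up means a merged or missed wake-up costs nothing, because whichever thread does wake up empties the backlog.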
Testing this with code:
I wrote a simple test, with some sleep commands and blocking sockets. Client sockets are initiated only after both threads start waiting for epoll.
Client thread initiation is delayed, so client 1 and client 2 start a second apart.
Once a server thread is woken up, it will sleep for a second (allowing the second client to do its thing) before calling accept. Maybe the servers should sleep a little more, but it seems close enough to manage the scheduler without resorting to condition variables.
Here are the results of my test code (which might be a mess, I'm not the best person for test design)...
On Ubuntu 16.10, which supports EPOLLEXCLUSIVE, the test results show that the listening threads are woken up one after the other, in response to the clients. In the example in the question, thread B is woken up.
Test address: <null>:8000
Server thread 2 woke up with 1 events
Server thread 2 will sleep for a second, to let things happen.
client number 1 connected
Server thread 1 woke up with 1 events
Server thread 1 will sleep for a second, to let things happen.
client number 2 connected
Server thread 2 accepted a connection and saying hello.
client 1: Hello World - from server thread 2.
Server thread 1 accepted a connection and saying hello.
client 2: Hello World - from server thread 1.
For comparison, on Ubuntu 16.04 (without EPOLLEXCLUSIVE support), both threads are woken up for the first connection. Since I use blocking sockets, the second thread hangs on accept until client # 2 connects.
main.c:178:2: warning: #warning EPOLLEXCLUSIVE undeclared, test is futile [-Wcpp]
#warning EPOLLEXCLUSIVE undeclared, test is futile
^
Test address: <null>:8000
Server thread 1 woke up with 1 events
Server thread 1 will sleep for a second, to let things happen.
Server thread 2 woke up with 1 events
Server thread 2 will sleep for a second, to let things happen.
client number 1 connected
Server thread 1 accepted a connection and saying hello.
client 1: Hello World - from server thread 1.
client number 2 connected
Server thread 2 accepted a connection and saying hello.
client 2: Hello World - from server thread 2.
For one more comparison, the results for level triggered kqueue show that both threads are awoken for the first connection. Since I use blocking sockets, the second thread hangs on accept until client # 2 connects.
Test address: <null>:8000
client number 1 connected
Server thread 2 woke up with 1 events
Server thread 1 woke up with 1 events
Server thread 2 will sleep for a second, to let things happen.
Server thread 1 will sleep for a second, to let things happen.
Server thread 2 accepted a connection and saying hello.
client 1: Hello World - from server thread 2.
client number 2 connected
Server thread 1 accepted a connection and saying hello.
client 2: Hello World - from server thread 1.
My test code was (sorry for the lack of comments and the messy code, I wasn't writing for future maintenance):
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#define ADD_EPOLL_OPTION 0 // define as EPOLLET or 0
#include <arpa/inet.h>
#include <errno.h>
#include <fcntl.h>
#include <limits.h>
#include <netdb.h>
#include <pthread.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>
#if !defined(__linux__) && !defined(__CYGWIN__)
#include <sys/event.h>
#define reactor_epoll 0
#else
#define reactor_epoll 1
#include <sys/epoll.h>
#include <sys/timerfd.h>
#endif
int sock_listen(const char *address, const char *port);
void *listen_threard(void *arg);
void *client_thread(void *arg);
int server_fd;
char const *address = NULL;
char const *port = "8000";
int main(int argc, char const *argv[]) {
if (argc == 2) {
port = argv[1];
} else if (argc == 3) {
port = argv[2];
address = argv[1];
}
fprintf(stderr, "Test address: %s:%s\n", address ? address : "<null>", port);
server_fd = sock_listen(address, port);
/* code */
pthread_t threads[4];
for (size_t i = 0; i < 2; i++) {
if (pthread_create(threads + i, NULL, listen_threard, (void *)i))
perror("couldn't initiate server thread"), exit(-1);
}
for (size_t i = 2; i < 4; i++) {
sleep(1);
if (pthread_create(threads + i, NULL, client_thread, (void *)i))
perror("couldn't initiate client thread"), exit(-1);
}
// join only server threads.
for (size_t i = 0; i < 2; i++) {
pthread_join(threads[i], NULL);
}
close(server_fd);
sleep(1);
return 0;
}
/**
Sets a socket to non blocking state.
*/
inline int sock_set_non_block(int fd) // Thanks to Bjorn Reese
{
/* If they have O_NONBLOCK, use the Posix way to do it */
#if defined(O_NONBLOCK)
/* Fixme: O_NONBLOCK is defined but broken on SunOS 4.1.x and AIX 3.2.5. */
int flags;
if (-1 == (flags = fcntl(fd, F_GETFL, 0)))
flags = 0;
// printf("flags initial value was %d\n", flags);
return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
#else
/* Otherwise, use the old way of doing it */
static int flags = 1;
return ioctl(fd, FIONBIO, &flags); /* needs <sys/ioctl.h> */
#endif
}
/* open a listenning socket */
int sock_listen(const char *address, const char *port) {
int srvfd;
// setup the address
struct addrinfo hints;
struct addrinfo *servinfo; // will point to the results
memset(&hints, 0, sizeof hints); // make sure the struct is empty
hints.ai_family = AF_UNSPEC; // don't care IPv4 or IPv6
hints.ai_socktype = SOCK_STREAM; // TCP stream sockets
hints.ai_flags = AI_PASSIVE; // fill in my IP for me
if (getaddrinfo(address, port, &hints, &servinfo)) {
perror("addr err");
return -1;
}
// get the file descriptor
srvfd =
socket(servinfo->ai_family, servinfo->ai_socktype, servinfo->ai_protocol);
if (srvfd <= 0) {
perror("socket err");
freeaddrinfo(servinfo);
return -1;
}
// // keep the server socket blocking for the test.
// // make sure the socket is non-blocking
// if (sock_set_non_block(srvfd) < 0) {
// perror("couldn't set socket as non blocking! ");
// freeaddrinfo(servinfo);
// close(srvfd);
// return -1;
// }
// avoid the "address taken"
{
int optval = 1;
setsockopt(srvfd, SOL_SOCKET, SO_REUSEADDR, &optval, sizeof(optval));
}
// bind the address to the socket
{
int bound = 0;
for (struct addrinfo *p = servinfo; p != NULL; p = p->ai_next) {
if (!bind(srvfd, p->ai_addr, p->ai_addrlen)) {
bound = 1;
break; // stop at the first address that binds
}
}
if (!bound) {
// perror("bind err");
freeaddrinfo(servinfo);
close(srvfd);
return -1;
}
}
freeaddrinfo(servinfo);
// listen in
if (listen(srvfd, SOMAXCONN) < 0) {
perror("couldn't start listening");
close(srvfd);
return -1;
}
return srvfd;
}
/* will start listenning, sleep for 5 seconds, then accept all the backlog and
* finish */
void *listen_threard(void *arg) {
int epoll_fd;
ssize_t event_count;
#if reactor_epoll
#ifndef EPOLLEXCLUSIVE
#warning EPOLLEXCLUSIVE undeclared, test is futile
#define EPOLLEXCLUSIVE 0
#endif
// create the epoll wait fd
epoll_fd = epoll_create1(0);
if (epoll_fd < 0)
perror("couldn't create epoll fd"), exit(1);
// add the server fd to the epoll watchlist
{
struct epoll_event chevent = {0};
chevent.data.ptr = (void *)((uintptr_t)server_fd);
chevent.events =
EPOLLOUT | EPOLLIN | EPOLLERR | EPOLLEXCLUSIVE | ADD_EPOLL_OPTION;
epoll_ctl(epoll_fd, EPOLL_CTL_ADD, server_fd, &chevent);
}
// wait with epoll
struct epoll_event events[10];
event_count = epoll_wait(epoll_fd, events, 10, 5000);
#else
// testing on BSD, use kqueue
epoll_fd = kqueue();
if (epoll_fd < 0)
perror("couldn't create kqueue fd"), exit(1);
// add the server fd to the kqueue watchlist
{
struct kevent chevent[2];
EV_SET(chevent, server_fd, EVFILT_READ, EV_ADD | EV_ENABLE, 0, 0,
(void *)((uintptr_t)server_fd));
EV_SET(chevent + 1, server_fd, EVFILT_WRITE, EV_ADD | EV_ENABLE, 0, 0,
(void *)((uintptr_t)server_fd));
kevent(epoll_fd, chevent, 2, NULL, 0, NULL);
}
// wait with kqueue
static struct timespec reactor_timeout = {.tv_sec = 5, .tv_nsec = 0};
struct kevent events[10];
event_count = kevent(epoll_fd, NULL, 0, events, 10, &reactor_timeout);
#endif
close(epoll_fd);
if (event_count <= 0) {
fprintf(stderr, "Server thread %lu wakeup no events / error\n",
(size_t)arg + 1);
perror("errno ");
return NULL;
}
fprintf(stderr, "Server thread %lu woke up with %lu events\n",
(size_t)arg + 1, event_count);
fprintf(stderr,
"Server thread %lu will sleep for a second, to let things happen.\n",
(size_t)arg + 1);
sleep(1);
int connfd;
struct sockaddr_storage client_addr;
socklen_t client_addrlen = sizeof client_addr;
/* accept a single connection; the listening socket is blocking in this test */
if ((connfd = accept(server_fd, (struct sockaddr *)&client_addr,
&client_addrlen)) >= 0) {
fprintf(stderr,
"Server thread %lu accepted a connection and saying hello.\n",
(size_t)arg + 1);
if (write(connfd, arg ? "Hello World - from server thread 2."
: "Hello World - from server thread 1.",
35) < 35)
perror("server write failed");
close(connfd);
} else {
fprintf(stderr, "Server thread %lu failed to accept a connection",
(size_t)arg + 1);
perror(": ");
}
return NULL;
}
void *client_thread(void *arg) {
int fd;
// setup the address
struct addrinfo hints;
struct addrinfo *addrinfo; // will point to the results
memset(&hints, 0, sizeof hints); // make sure the struct is empty
hints.ai_family = AF_UNSPEC; // don't care IPv4 or IPv6
hints.ai_socktype = SOCK_STREAM; // TCP stream sockets
hints.ai_flags = AI_PASSIVE; // fill in my IP for me
if (getaddrinfo(address, port, &hints, &addrinfo)) {
perror("client couldn't initiate address");
return NULL;
}
// get the file descriptor
fd =
socket(addrinfo->ai_family, addrinfo->ai_socktype, addrinfo->ai_protocol);
if (fd <= 0) {
perror("client couldn't create socket");
freeaddrinfo(addrinfo);
return NULL;
}
// // // Leave the socket blocking for the test.
// // make sure the socket is non-blocking
// if (sock_set_non_block(fd) < 0) {
// freeaddrinfo(addrinfo);
// close(fd);
// return -1;
// }
if (connect(fd, addrinfo->ai_addr, addrinfo->ai_addrlen) < 0 &&
errno != EINPROGRESS) {
fprintf(stderr, "client number %lu FAILED\n", (size_t)arg - 1);
perror("client connect failure");
close(fd);
freeaddrinfo(addrinfo);
return NULL;
}
freeaddrinfo(addrinfo);
fprintf(stderr, "client number %lu connected\n", (size_t)arg - 1);
char buffer[128];
if (read(fd, buffer, 35) < 35) {
perror("client: read error");
close(fd);
} else {
buffer[35] = 0;
fprintf(stderr, "client %lu: %s\n", (size_t)arg - 1, buffer);
close(fd);
}
return NULL;
}
P.S.
As a final recommendation, I would consider having no more than a single thread and a single epoll fd per process. This way the "thundering herd" is a non-issue and EPOLLEXCLUSIVE (which is still very new and isn't widely supported) can be disregarded... the only "thundering herd" this still exposes is for the limited number of shared sockets, where the race condition might be good for load balancing.
Original Answer
I'm not sure I understand the confusion, so I'll go over EPOLLET and EPOLLEXCLUSIVE to show their combined expected behavior.
As you well know:
Once you set EPOLLET (edge triggered), events are triggered on fd state changes rather than fd events.
This design is explicitly meant to prevent epoll_wait from returning due to an event that is in the process of being handled (i.e., when new data arrives while the EPOLLIN was already raised but read hadn't been called or not all of the data was read).
In the case of a listening socket, the EPOLLIN event won't be triggered again until all existing listen "backlog" sockets have been accepted using accept.
The EPOLLEXCLUSIVE flag is used to prevent the "thundering herd" behavior, so only a single epoll_wait caller is woken up for each fd wake-up event.
As I pointed out before, for edge-triggered states, an fd wake-up event is a change in the fd state. So all EPOLLIN events will be merged until all data has been read (the listening socket's backlog has been emptied).
When merging these behaviors, and following the example in your question, only one thread is expected to receive the "wake up" call. That thread is expected to accept all waiting connections (empty the listen "backlog") or no more EPOLLIN events will be raised for that socket.
The same applies to readable sockets and pipes. The thread that was woken up is expected to deal with all the readable data. This prevents two waiting threads from attempting to read the data concurrently and experiencing file lock race conditions.
I would recommend that you consider avoiding the edge triggered events if you mean to call accept only once for each epoll_wait wake-up event. Regardless of using EPOLLEXCLUSIVE, you run the risk of not emptying the existing "backlog", so that no new wake-up events will be raised.
Alternatively, I would recommend (and this is what I do) setting the listening socket to non-blocking mode and calling accept in a loop until an EAGAIN (or EWOULDBLOCK) error is raised, indicating that the backlog is empty.
EDIT 1: Level Triggered Events
It seems, as Nathaniel pointed out in the comment, that I totally misunderstood the question... I guess I'm used to EPOLLET being the misunderstood element.
So, what happens with normal, level-triggered, events (NOT EPOLLET)?
Well... the expected behavior is the exact mirror image (opposite) of edge triggered events.
For listening sockets, epoll_wait is expected to return whenever a new connection is available, whether or not accept was called after a previous event.
Events are only "merged" if no-one is waiting with epoll_wait... in which case the next call for epoll_wait will return immediately.
In the context of the example given in the question, thread B is expected to "wake up" due to epoll_wait returning.
In this case, both threads will "race" towards accept.
However, this doesn't defeat the EPOLLEXCLUSIVE directive or intent.
The EPOLLEXCLUSIVE directive is meant to prevent the "thundering herd" phenomenon. In this case, two threads are racing to accept two connections. Each thread can (presumably) call accept safely, with no errors. If three threads were used, the third would keep on sleeping.
If the EPOLLEXCLUSIVE weren't used, all the epoll_wait threads would have been woken up whenever a connection was available, meaning that as soon as the first connection arrived, both threads would have been racing to accept a single connection (resulting in a possible error for one of them).
This is only a partial answer, but Jason Baron (the author of the EPOLLEXCLUSIVE patch) just responded to an email I sent him to confirm that when using EPOLLEXCLUSIVE in level-triggered mode he does think it's possible that two connections will arrive but only one thread will be woken (thread B keeps sleeping). So when using EPOLLEXCLUSIVE you have to use the same kinds of defensive programming as you use for edge-triggered epoll, regardless of whether you set EPOLLET.
I have a server that collects TCP data from different clients on a certain port. Whenever a client creates a TCP connection and remains idle for more than, let's say, 30 minutes, I need to close the connection.
I have learned about TCP keep-alive for tracking whether the peer is dead, but mostly I found examples used on the client side. Can I similarly use it on the server side to poll whether the connection is active?
Further, in Linux there is a configuration file, sysctl.conf, for editing the values. It seems that these settings apply to every TCP connection after a certain period of inactivity. What I need is for specific connections from a device to be destroyed after a certain time of inactivity, without closing every connection on the TCP port.
I am using Ubuntu for the server that collects the TCP connections. Can I use TCP keep-alives in the server code to find an inactive client and close that particular client, or is there any other way on the server side to implement such a feature?
While going through the web, I found this mentioned:
(getsockopt(s, SOL_SOCKET, SO_KEEPALIVE, &optval, &optlen)
This getsockopt is for the main TCP connection, and setting it here seems to apply to every connection to the server.
However, what I need is for a specific client. My event server code is below.
Here client_fd is accepted, and I need to close this client_fd if the next data from this client is not received within a certain time.
void event_server(EV_P_ struct ev_io *w, int revents) {
int flags;
struct sockaddr_in6 addr;
socklen_t len = sizeof(addr);
int client_fd;
// since ev_io is the first member,
// watcher `w` has the address of the
// start of the _sock_ev_serv struct
struct _sock_ev_serv* server = (struct _sock_ev_serv*) w;
server->socket_len = len;
for (;;) {
if ((client_fd = accept(server->fd, (struct sockaddr*) &addr, &len)) < 0) {
switch (errno) {
case EINTR:
case EAGAIN:
break;
default:
zlog_info(_c, "Error accepting connection from client \n");
//perror("accept");
}
break;
}
char ip[INET6_ADDRSTRLEN];
inet_ntop(AF_INET6, &addr.sin6_addr, ip, INET6_ADDRSTRLEN);
char *dev_ip = get_ip(ip);
server->device_ip = dev_ip;
zlog_debug(_c,"The obtained ip is %s and dev_ip is %s", ip, dev_ip);
/** check for the cidr address for config_ip **/
char *config_ip;
config_ip = get_config_ip(dev_ip, _client_map);
zlog_debug(_c,"The _config ip for dev_ip:%s is :%s", dev_ip, config_ip);
if (config_ip == NULL) {
zlog_debug(_c,"Connection attempted from unregistered IP: %s", dev_ip);
zlog_info(_c, "Connection attempted from unregistered IP : %s", dev_ip);
AFREE(server->device_ip);
close(client_fd); // release the rejected connection's descriptor
continue;
}
json_t *dev_config;
dev_config = get_json_object_from_json(_client_map, config_ip);
if (dev_config==NULL) {
zlog_debug(_c,"Connection attempted from unregistered IP: %s", dev_ip);
zlog_info(_c, "Connection attempted from unregistered IP : %s", dev_ip);
AFREE(server->device_ip);
close(client_fd); // release the rejected connection's descriptor
continue;
}
if ((flags = fcntl(client_fd, F_GETFL, 0)) < 0 || fcntl(client_fd, F_SETFL, flags | O_NONBLOCK) < 0) {
zlog_error(_c, "fcntl(2)");
}
struct _sock_ev_client* client = malloc(sizeof(struct _sock_ev_client));
client->device_ip = dev_ip;
client->server = server;
client->fd = client_fd;
// ev_io *watcher = (ev_io*)calloc(1, sizeof(ev_io));
ev_io_init(&client->io, event_client, client_fd, EV_READ);
ev_io_start(EV_DEFAULT, &client->io);
}
}
TCP keep alives are not to detect idle clients but to detect dead connections, i.e. if a client crashed without closing the connection or if the line is dead etc. But if the client is only idle but not dead the connection is still open. Any attempts to send an empty packet (which keep-alive packets are) to the client will result in an ACK from the client and thus keep alive will not report a dead connection.
To detect idle clients instead use either timeouts for read (SO_RCVTIMEO) or use a timeout with select, poll or similar functions.
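As a rough sketch of the SO_RCVTIMEO approach: the helper names set_idle_timeout_ms and recv_or_idle are made up for illustration, and the return value -2 for "idle" is an arbitrary convention. Note that the receive timeout only affects blocking reads; with non-blocking sockets in an event loop, a per-connection timer would play the same role.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <sys/types.h>

/* Hypothetical helper: make blocking reads on `fd` give up after
 * `milliseconds` without data. Returns 0 on success, -1 on error. */
int set_idle_timeout_ms(int fd, long milliseconds) {
  struct timeval tv;
  tv.tv_sec = milliseconds / 1000;
  tv.tv_usec = (milliseconds % 1000) * 1000;
  return setsockopt(fd, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);
}

/* Hypothetical helper: returns the number of bytes read, 0 if the peer
 * closed the connection, -1 on error, or -2 if the timeout expired. */
ssize_t recv_or_idle(int fd, void *buf, size_t len) {
  ssize_t n;
  do {
    n = recv(fd, buf, len, 0);
  } while (n < 0 && errno == EINTR); /* retry if a signal interrupted us */
  if (n < 0 && (errno == EAGAIN || errno == EWOULDBLOCK))
    return -2; /* nothing arrived within the timeout: client is idle */
  return n;
}
```

For the 30-minute case in the question, the timeout would be 30 * 60 * 1000 milliseconds, and the server would close the client's fd when recv_or_idle returns -2.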
I have implemented below mechanism to detect idle status on Socket IO activity.
My Socket is wrapped in some class like UserConnection. This class has one more attribute, lastActivtyTime. Whenever I get a read or write on this Socket, I update this attribute.
I have one more background Reaper thread, which will iterate through all UserConnection objects and check for lastActivtyTime. If current time - lastActivtyTime is greater than configured threshold parameter like 15 seconds, I will close the idle connection.
In your case, when you are iterating through all UserConnections, you can check client_id and your threshold of 30 minutes inactivity to close idle connection.
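In C, the same bookkeeping might look roughly like the sketch below. The struct and function names are illustrative only (they are not from any library), and if the reaper runs in its own thread, access to the connection table would need a lock, which is left out here.

```c
#include <stddef.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical per-connection record. */
struct user_connection {
  int fd;               /* -1 once the connection has been closed */
  time_t last_activity; /* updated on every successful read or write */
};

/* Call after each successful read or write on the connection. */
void touch_connection(struct user_connection *c, time_t now) {
  c->last_activity = now;
}

/* Close every connection that has been idle for more than `max_idle`
 * seconds. Returns the number of connections closed. */
size_t reap_idle_connections(struct user_connection *conns, size_t count,
                             time_t now, time_t max_idle) {
  size_t closed = 0;
  for (size_t i = 0; i < count; i++) {
    if (conns[i].fd < 0)
      continue; /* slot already closed */
    if (now - conns[i].last_activity > max_idle) {
      close(conns[i].fd);
      conns[i].fd = -1;
      closed++;
    }
  }
  return closed;
}
```

A periodic timer or background thread would call reap_idle_connections(conns, count, time(NULL), 30 * 60) for a 30-minute threshold.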
My use case (webservice):
Multiple clients => Webserver => Message to C program through UNIX domain socket.
I've been using Apache + PHP for the webserver layer, but I'm currently in the process of replacing it with Node.js.
The webservice gets up to 100 requests/sec, so it's a very real scenario that the C program will be busy when a new request comes in. PHP handles this just fine, but Node.js often fails with the error:
{
"code": "EAGAIN",
"errno": "EAGAIN",
"syscall": "connect",
"address": "/tmp/service.sock"
}
I'm assuming this is because PHP performs some kind of message queue/retry that will ensure all messages are sent to the C program (which Node.js does not).
Is there a simple way to do the same in Node.js or will I have to implement a custom message queue?
C socket creation:
int listenSocket, newSocket;
struct sockaddr_un localAddress, remoteAddress;
// Create socket
if ((listenSocket = socket(AF_UNIX, SOCK_STREAM, 0)) == -1){
printf("Error opening listener socket");
exit(1);
}
// Configure socket
localAddress.sun_family = AF_UNIX; // Set UNIX domain socket type
strcpy(localAddress.sun_path, "/tmp/service.sock");
unlink(localAddress.sun_path); // Remove any previous instances of the socket
// Open listener socket
int len = strlen(localAddress.sun_path) + sizeof(localAddress.sun_family);
if (bind(listenSocket, (struct sockaddr *)&localAddress, len) == -1){
printf("Error binding socket at %s", localAddress.sun_path);
exit(1);
}
chmod(localAddress.sun_path, 0777);
// Listen for new connections on the listener socket
if (listen(listenSocket, 5) == -1){
printf("Error listening on listener socket");
exit(1);
}
// Handle new connections
while(!shutdown){
printf("Waiting for a connection...\n");
// Accept new connection
socklen_t sizeofRemoteAddress = sizeof(remoteAddress); // accept() expects a socklen_t *
if ((newSocket = accept(listenSocket, (struct sockaddr *)&remoteAddress, &sizeofRemoteAddress)) == -1){
printf("Error accepting new connection: %s\n", strerror(errno));
continue;
}
// Read and handle data from client...
}
Connecting to the socket in PHP:
$socket = @socket_create(AF_UNIX, SOCK_STREAM, 0);
if (!$socket) return false;
$connected = @socket_connect($socket, "/tmp/service.sock");
if (!$connected) return false;
// Send message to server and read response...
Connecting to the socket in Node.js:
new Promise(function(resolve, reject){
var socket = Net.connect("/tmp/service.sock");
socket.on("error", function(err){
reject(err);
});
socket.on("connect", function(){
socket.write(message);
});
socket.on("data", function(data){
socket.end();
resolve(data.toString("UTF-8"));
});
});
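EAGAIN is an expected condition: it means the call could not be completed right now and should be tried again. (An interrupted system call is reported as EINTR instead.) For a non-blocking UNIX domain socket, connect fails with EAGAIN when the connection cannot be completed immediately, for example when the listener's backlog is full because the C program is busy. You should simply repeat the same call while you get the EAGAIN error code. In typical C programs you'll see tons of while (returnCode == -1 && errno == EAGAIN) style loops.
If signal interruptions (EINTR) are the concern, you could block signals before the system call and unblock them afterwards (I'm not sure how this is done in Node.js).
I'm not sure whether this answer is any good for a Node.js application, but I thought I'd mention it anyway.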
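For reference, here is how such a retry loop might look for a C client of the same socket. This is only a sketch under assumptions: the helper name connect_with_retry, the attempt limit, and the 10 ms pause are all made up, and SOCK_NONBLOCK is Linux-specific. A Node.js client would need the equivalent retry logic around Net.connect.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical helper: connect a non-blocking UNIX domain stream socket to
 * `path`, retrying up to `max_attempts` times while connect() reports
 * EAGAIN. Returns the connected fd, or -1 on failure (errno is preserved). */
int connect_with_retry(const char *path, int max_attempts) {
  struct sockaddr_un addr;
  memset(&addr, 0, sizeof addr);
  addr.sun_family = AF_UNIX;
  if (strlen(path) >= sizeof addr.sun_path) {
    errno = ENAMETOOLONG;
    return -1;
  }
  strcpy(addr.sun_path, path);

  int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);
  if (fd < 0)
    return -1;

  for (int attempt = 0; attempt < max_attempts; attempt++) {
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) == 0)
      return fd; /* connected */
    if (errno != EAGAIN)
      break; /* not a "try again" condition: give up */
    struct timespec pause = {0, 10 * 1000 * 1000}; /* 10 ms between tries */
    nanosleep(&pause, NULL);
  }
  int saved = errno;
  close(fd);
  errno = saved;
  return -1;
}
```

Raising the backlog passed to listen in the server (5 in the question's code) would be a complementary change, since it reduces how often clients hit EAGAIN in the first place.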
Is it possible for accept() (on Red Hat Enterprise Linux 4 / Linux kernel 2.6) to return the same socket value for different TCP connections from the same process of the same application on the same machine?
I was so surprised when I checked the log file and saw that many connections have the same socket value on the server side! How is that possible?
By the way, I am using a blocking TCP socket to listen.
main(){
int fd, clientfd, len, clientlen;
sockaddr_in address, clientaddress;
fd = socket(PF_INET, SOCK_STREAM, 0);
....
memset(&address, 0, sizeof address);
address.sin_family = AF_INET;
address.sin_port = htons(port);
....
bind(fd, &address, sizeof address);
listen(fd, 100);
do {
clientlen = sizeof clientaddress; // accept() reads this as the buffer size
clientfd = accept(fd, &clientaddress, &clientlen);
if (clientfd < 0) {
....
}
printf("clientfd = %d", clientfd);
switch(fork()){
case 0:
//do something else
exit(0);
default:
...
}
} while(1);
}
My question is: why does printf("clientfd = %d"); print the same number for different connections?
If the server runs in multiple processes (like Apache with the MPM worker model), then every process has its own file descriptor numbering starting from 0.
In other words, it is quite possible that different processes will get the exact same socket file descriptor number. However, the fd number does not really mean anything. The descriptors still refer to different underlying objects, and different TCP connections.
The socket is just a number. It is a hook to a data structure for the kernel.
BTW TCP uses IP. Look up the RFC
That printf() doesn't print any FD at all. It's missing an FD parameter. What you are seeing could be a return address or any other arbitrary junk on the stack.
I am writing a simple socket daemon which listens on a port and reads the incoming data. It works fine until I disconnect a client from the server... then it enters an infinite loop: recv() keeps returning the last packet and never returns -1. My question is how can I detect that the client has been disconnected and close the thread/ socket el
My thread is as follows :
void * SocketHandler(void* lp){
int * csock = (int*)lp;
int test = 0;
char buffer[1024];
int buffer_len = 1024;
int bytecount,ierr;
memset(buffer,0,buffer_len);
while (test == 0)
{
if ((bytecount = recv(*csock, buffer, buffer_len, 0))== -1){
close(*csock); // close the descriptor itself, not the pointer to it
free(csock);
test++;
return 0;
}
else
{
syslog(LOG_NOTICE,"%s",buffer);
}
}
return 0;
};
A cleanly closed socket will end up in a ZERO read, while a broken connection is an error state returning -1. You need to catch the 0 return of your recv.
What happens here is that your end may not detect the fact the socket is dead (especially, if you are just reading from it).
What you can do is set keepalive on the socket. This will turn on periodic checks of the socket liveness. But don't expect fast reactions: the default idle time before the first probe is long (two hours on Linux, set by tcp_keepalive_time).
i = 1;
setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, (char *)&i, sizeof(i));
Another option is to do your own keep-alive communication.
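If the default keep-alive timing is too slow, Linux allows it to be overridden per socket. The sketch below assumes Linux (TCP_KEEPIDLE, TCP_KEEPINTVL and TCP_KEEPCNT are not available everywhere), and the helper name enable_keepalive is made up for illustration.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical helper (Linux-specific options): enable keep-alive on one TCP
 * socket and override the system-wide defaults for that socket only.
 *   idle_secs     - idle time before the first probe is sent
 *   interval_secs - time between unanswered probes
 *   probes        - unanswered probes before the connection is dropped
 * Returns 0 on success, -1 on error (see errno). */
int enable_keepalive(int fd, int idle_secs, int interval_secs, int probes) {
  int on = 1;
  if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof on) < 0)
    return -1;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &idle_secs,
                 sizeof idle_secs) < 0)
    return -1;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &interval_secs,
                 sizeof interval_secs) < 0)
    return -1;
  if (setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &probes, sizeof probes) < 0)
    return -1;
  return 0;
}
```

With enable_keepalive(sock, 60, 10, 5), a dead peer would be noticed after roughly 60 + 10 * 5 seconds, at which point a blocked recv fails with an error instead of waiting forever.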
recv() will indicate a proper shutdown of the socket by returning 0 (see the manpage for details). It will return -1 if and only if an error occurred. You should check errno for the exact error, since it may or may not indicate that the connection failed (EINTR, EAGAIN or EWOULDBLOCK [non-blocking sockets assumed] are all recoverable errors).
Side note: there's no need to pass the fd of the socket as pointer and since you're returning a void * you may want to change return 0 to return NULL (just for readability).
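Putting the return-value rules above together, a receive loop could be sketched like this. The helper name read_until_closed is made up, a blocking socket is assumed, and the place where the data would be processed is only marked by a comment.

```c
#include <errno.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: read from `fd` until the peer closes the connection.
 * Returns the total number of bytes received after an orderly shutdown
 * (recv() returned 0), or -1 if recv() failed with a real error. */
long read_until_closed(int fd) {
  char buffer[1024];
  long total = 0;
  for (;;) {
    ssize_t n = recv(fd, buffer, sizeof buffer, 0);
    if (n > 0) {
      total += n; /* got data: process buffer[0..n-1] here */
    } else if (n == 0) {
      return total; /* peer closed the connection cleanly */
    } else if (errno == EINTR) {
      continue; /* interrupted by a signal: just retry */
    } else {
      return -1; /* connection error (blocking socket assumed) */
    }
  }
}
```

In the thread function from the question, both the 0 and the -1 cases would be followed by closing the socket and ending the thread.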