epoll: difference between level triggered and edge triggered when EPOLLONESHOT specified

epoll: difference between level triggered and edge triggered when EPOLLONESHOT specified - linux

What's the difference between level triggered and edge triggered mode, when EPOLLONESHOT specified?
There's a similar question already here. The answer by "Crouching Kitten" doesn't seem to be right (and as I understand, the other answer doesn't answer my question).
I've tried the following:
server sends 2 bytes to a client, while client waits in epoll_wait
client returns from epoll_wait, then reads 1 byte.
client re-arms the event (because of EPOLLONESHOT)
client calls epoll_wait again. Here, for both cases (LT & ET), epoll_wait doesn't wait, but returns immediately (contrary to the answer by "Crouching Kitten")
client can read the second byte
Is there any difference between LT & ET, when EPOLLONESHOT specified?

I think the bottom line answer is "there is not difference".
Looking at the code, it seems that the fd remembers the last set bits before being disabled by the one-shot. It remembers it was one shot, and it remembers whether it was ET or not.
Which is futile, because the fd is disabled until modified, and the next call to EPOLL_CTL_MOD will erase all of that, and replace with whatever the new MOD says.
Having said that, I do not understand why anyone would want both EPOLLET and EPOLLONESHOT. To me, the whole point of EPOLLET is that, unders certain programming models (namely, microthreads), it follows the semantics perfcetly. This means that I can add the fd to the epoll at the very start, and then never have to perform another epoll related system call.
EPOLLONESHOT, on the other hand, is used by people who want to keep a very strict control over when the fd is watched and when it isn't. That, by definition, is the opposite of what EPOLLET is used for. I just don't think the two are conceptually compatible.

The other poster said "I do not understand why anyone would want both EPOLLET and EPOLLONESHOT." Actually, according to epoll(7), there is a use case for that:
Since even with edge-triggered epoll, multiple events can be generated upon receipt of multiple chunks of data, the caller has the option to specify the EPOLLONESHOT flag, to tell epoll to disable the associated file descriptor after the receipt of an event with epoll_wait(2).

The key point is that whether EPOLL will treat the combination of EPOLLET | EPOLLONESHOT and EPOLLLT | EPOLLONESHOT as special case. As I known, it is not. EPOLL just care them seperately. To EPOLLET and EPOLLLT, the different kindly only is in function ep_send_events, if the EPOLLET is set, then the function will call list_add_tail to add the epitem into the ready list in epoll_fd/eventepoll object.
To the EPOLLONESHOT, the role is to disable the fd. So I think the different between them is the different between ET and LT. You can check the result using below codes I think
// server.cc
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <assert.h>
#include <stdio.h>
#include <unistd.h>
#include <errno.h>
#include <string.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/epoll.h>
#include <pthread.h>
#define MAX_EVENT_NUMBER 1024
int setnonblocking(int fd)
{
int old_option = fcntl(fd, F_GETFL);
int new_option = old_option | O_NONBLOCK;
fcntl(fd, F_SETFL, new_option);
return old_option;
}
void addfd(int epollfd, int fd, bool oneshot)
{
epoll_event event;
event.data.fd = fd;
event.events = EPOLLIN | EPOLLET;
if(oneshot)
event.events |= EPOLLONESHOT;
epoll_ctl(epollfd, EPOLL_CTL_ADD, fd, &event);
setnonblocking(fd);
}
// reset the fd with EPOLLONESHOT
void reset_oneshot(int epollfd, int fd)
{
epoll_event event;
event.data.fd = fd;
event.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
epoll_ctl(epollfd, EPOLL_CTL_MOD, fd, &event);
}
int main(int argc, char** argv)
{
if(argc <= 2)
{
printf("usage: %s ip_address port_number\n", basename(argv[0]));
return 1;
}
const char* ip = argv[1];
int port = atoi(argv[2]);
int ret = 0;
struct sockaddr_in address;
bzero(&address, sizeof(address));
address.sin_family = AF_INET;
inet_pton(AF_INET, ip, &address.sin_addr);
address.sin_port = htons(port);
int listenfd = socket(PF_INET, SOCK_STREAM, 0);
assert(listenfd >= 0);
ret = bind(listenfd, (struct sockaddr*)&address, sizeof(address));
assert(ret != -1);
ret = listen(listenfd, 5);
assert(ret != -1);
epoll_event events[MAX_EVENT_NUMBER];
int epollfd = epoll_create(5);
addfd(epollfd, listenfd, false);
while(1)
{
printf("next loop: -----------------------------");
int ret = epoll_wait(epollfd, events, MAX_EVENT_NUMBER, -1);
if(ret < 0)
{
printf("epoll failure\n");
break;
}
for(int i = 0; i < ret; i++)
{
int sockfd = events[i].data.fd;
if(sockfd == listenfd)
{
printf("into listenfd part\n");
struct sockaddr_in client_address;
socklen_t client_addrlength = sizeof(client_address);
int connfd = accept(listenfd, (struct sockaddr*)&client_address,
&client_addrlength);
printf("receive connfd: %d\n", connfd);
addfd(epollfd, connfd, true);
// reset_oneshot(epollfd, listenfd);
}
else if(events[i].events & EPOLLIN)
{
printf("into linkedfd part\n");
printf("start new thread to receive data on fd: %d\n", sockfd);
char buf[2];
memset(buf, '\0', 2);
// just read one byte, and reset the fd with EPOLLONESHOT, check whether still EPOLLIN event
int ret = recv(sockfd, buf, 2 - 1, 0);
if(ret == 0)
{
close(sockfd);
printf("foreigner closed the connection\n");
break;
}
else if(ret < 0)
{
if(errno == EAGAIN)
{
printf("wait to the client send the new data, check the oneshot memchnism\n");
sleep(10);
reset_oneshot(epollfd, sockfd);
printf("read later\n");
break;
}
}
else {
printf("receive the content: %s\n", buf);
reset_oneshot(epollfd, sockfd);
printf("reset the oneshot successfully\n");
}
}
else
printf("something unknown happend\n");
}
sleep(1);
}
close(listenfd);
return 0;
}
the Client is
from socket import *
import sys
import time
long_string = b"this is a long content which need two time to fetch"
def sendOneTimeThenSleepAndClose(ip, port):
s = socket(AF_INET, SOCK_STREAM);
a = s.connect((ip, int(port)));
print("connect success: {}".format(a));
data = s.send(b"this is test");
print("send successfuly");
time.sleep(50);
s.close();
sendOneTimeThenSleepAndClose('127.0.0.1', 9999)

Related

How can i make sure that only a single instance of the process on Linux? [duplicate]

What would be your suggestion in order to create a single instance application, so that only one process is allowed to run at a time? File lock, mutex or what?

A good way is:
#include <sys/file.h>
#include <errno.h>
int pid_file = open("/var/run/whatever.pid", O_CREAT | O_RDWR, 0666);
int rc = flock(pid_file, LOCK_EX | LOCK_NB);
if(rc) {
if(EWOULDBLOCK == errno)
; // another instance is running
}
else {
// this is the first instance
}
Note that locking allows you to ignore stale pid files (i.e. you don't have to delete them). When the application terminates for any reason the OS releases the file lock for you.
Pid files are not terribly useful because they can be stale (the file exists but the process does not). Hence, the application executable itself can be locked instead of creating and locking a pid file.
A more advanced method is to create and bind a unix domain socket using a predefined socket name. Bind succeeds for the first instance of your application. Again, the OS unbinds the socket when the application terminates for any reason. When bind() fails another instance of the application can connect() and use this socket to pass its command line arguments to the first instance.

Here is a solution in C++. It uses the socket recommendation of Maxim. I like this solution better than the file based locking solution, because the file based one fails if the process crashes and does not delete the lock file. Another user will not be able to delete the file and lock it. The sockets are automatically deleted when the process exits.
Usage:
int main()
{
SingletonProcess singleton(5555); // pick a port number to use that is specific to this app
if (!singleton())
{
cerr << "process running already. See " << singleton.GetLockFileName() << endl;
return 1;
}
... rest of the app
}
Code:
#include <netinet/in.h>
class SingletonProcess
{
public:
SingletonProcess(uint16_t port0)
: socket_fd(-1)
, rc(1)
, port(port0)
{
}
~SingletonProcess()
{
if (socket_fd != -1)
{
close(socket_fd);
}
}
bool operator()()
{
if (socket_fd == -1 || rc)
{
socket_fd = -1;
rc = 1;
if ((socket_fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
{
throw std::runtime_error(std::string("Could not create socket: ") + strerror(errno));
}
else
{
struct sockaddr_in name;
name.sin_family = AF_INET;
name.sin_port = htons (port);
name.sin_addr.s_addr = htonl (INADDR_ANY);
rc = bind (socket_fd, (struct sockaddr *) &name, sizeof (name));
}
}
return (socket_fd != -1 && rc == 0);
}
std::string GetLockFileName()
{
return "port " + std::to_string(port);
}
private:
int socket_fd = -1;
int rc;
uint16_t port;
};

For windows, a named kernel object (e.g. CreateEvent, CreateMutex). For unix, a pid-file - create a file and write your process ID to it.

You can create an "anonymous namespace" AF_UNIX socket. This is completely Linux-specific, but has the advantage that no filesystem actually has to exist.
Read the man page for unix(7) for more info.

Avoid file-based locking
It is always good to avoid a file based locking mechanism to implement the singleton instance of an application. The user can always rename the lock file to a different name and run the application again as follows:
mv lockfile.pid lockfile1.pid
Where lockfile.pid is the lock file based on which is checked for existence before running the application.
So, it is always preferable to use a locking scheme on object directly visible to only the kernel. So, anything which has to do with a file system is not reliable.
So the best option would be to bind to a inet socket. Note that unix domain sockets reside in the filesystem and are not reliable.
Alternatively, you can also do it using DBUS.

It's seems to not be mentioned - it is possible to create a mutex in shared memory but it needs to be marked as shared by attributes (not tested):
pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
pthread_mutex_t *mutex = shmat(SHARED_MEMORY_ID, NULL, 0);
pthread_mutex_init(mutex, &attr);
There is also shared memory semaphores (but I failed to find out how to lock one):
int sem_id = semget(SHARED_MEMORY_KEY, 1, 0);

No one has mentioned it, but sem_open() creates a real named semaphore under modern POSIX-compliant OSes. If you give a semaphore an initial value of 1, it becomes a mutex (as long as it is strictly released only if a lock was successfully obtained).
With several sem_open()-based objects, you can create all of the common equivalent Windows named objects - named mutexes, named semaphores, and named events. Named events with "manual" set to true is a bit more difficult to emulate (it requires four semaphore objects to properly emulate CreateEvent(), SetEvent(), and ResetEvent()). Anyway, I digress.
Alternatively, there is named shared memory. You can initialize a pthread mutex with the "shared process" attribute in named shared memory and then all processes can safely access that mutex object after opening a handle to the shared memory with shm_open()/mmap(). sem_open() is easier if it is available for your platform (if it isn't, it should be for sanity's sake).
Regardless of the method you use, to test for a single instance of your application, use the trylock() variant of the wait function (e.g. sem_trywait()). If the process is the only one running, it will successfully lock the mutex. If it isn't, it will fail immediately.
Don't forget to unlock and close the mutex on application exit.

It will depend on which problem you want to avoid by forcing your application to have only one instance and the scope on which you consider instances.
For a daemon — the usual way is to have a /var/run/app.pid file.
For user application, I've had more problems with applications which prevented me to run them twice than with being able to run twice an application which shouldn't have been run so. So the answer on "why and on which scope" is very important and will probably bring answer specific on the why and the intended scope.

Here is a solution based on sem_open
/*
*compile with :
*gcc single.c -o single -pthread
*/
/*
* run multiple instance on 'single', and check the behavior
*/
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <semaphore.h>
#include <unistd.h>
#include <errno.h>
#define SEM_NAME "/mysem_911"
int main()
{
sem_t *sem;
int rc;
sem = sem_open(SEM_NAME, O_CREAT, S_IRWXU, 1);
if(sem==SEM_FAILED){
printf("sem_open: failed errno:%d\n", errno);
}
rc=sem_trywait(sem);
if(rc == 0){
printf("Obtained lock !!!\n");
sleep(10);
//sem_post(sem);
sem_unlink(SEM_NAME);
}else{
printf("Lock not obtained\n");
}
}
One of the comments on a different answer says "I found sem_open() rather lacking". I am not sure about the specifics of what's lacking

Based on the hints in maxim's answer here is my POSIX solution of a dual-role daemon (i.e. a single application that can act as daemon and as a client communicating with that daemon). This scheme has the advantage of providing an elegant solution of the problem when the instance started first should be the daemon and all following executions should just load off the work at that daemon. It is a complete example but lacks a lot of stuff a real daemon should do (e.g. using syslog for logging and fork to put itself into background correctly, dropping privileges etc.), but it is already quite long and is fully working as is. I have only tested this on Linux so far but IIRC it should be all POSIX-compatible.
In the example the clients can send integers passed to them as first command line argument and parsed by atoi via the socket to the daemon which prints it to stdout. With this kind of sockets it is also possible to transfer arrays, structs and even file descriptors (see man 7 unix).
#include <stdio.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/un.h>
#define SOCKET_NAME "/tmp/exampled"
static int socket_fd = -1;
static bool isdaemon = false;
static bool run = true;
/* returns
* -1 on errors
* 0 on successful server bindings
* 1 on successful client connects
*/
int singleton_connect(const char *name) {
int len, tmpd;
struct sockaddr_un addr = {0};
if ((tmpd = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0) {
printf("Could not create socket: '%s'.\n", strerror(errno));
return -1;
}
/* fill in socket address structure */
addr.sun_family = AF_UNIX;
strcpy(addr.sun_path, name);
len = offsetof(struct sockaddr_un, sun_path) + strlen(name);
int ret;
unsigned int retries = 1;
do {
/* bind the name to the descriptor */
ret = bind(tmpd, (struct sockaddr *)&addr, len);
/* if this succeeds there was no daemon before */
if (ret == 0) {
socket_fd = tmpd;
isdaemon = true;
return 0;
} else {
if (errno == EADDRINUSE) {
ret = connect(tmpd, (struct sockaddr *) &addr, sizeof(struct sockaddr_un));
if (ret != 0) {
if (errno == ECONNREFUSED) {
printf("Could not connect to socket - assuming daemon died.\n");
unlink(name);
continue;
}
printf("Could not connect to socket: '%s'.\n", strerror(errno));
continue;
}
printf("Daemon is already running.\n");
socket_fd = tmpd;
return 1;
}
printf("Could not bind to socket: '%s'.\n", strerror(errno));
continue;
}
} while (retries-- > 0);
printf("Could neither connect to an existing daemon nor become one.\n");
close(tmpd);
return -1;
}
static void cleanup(void) {
if (socket_fd >= 0) {
if (isdaemon) {
if (unlink(SOCKET_NAME) < 0)
printf("Could not remove FIFO.\n");
} else
close(socket_fd);
}
}
static void handler(int sig) {
run = false;
}
int main(int argc, char **argv) {
switch (singleton_connect(SOCKET_NAME)) {
case 0: { /* Daemon */
struct sigaction sa;
sa.sa_handler = &handler;
sigemptyset(&sa.sa_mask);
if (sigaction(SIGINT, &sa, NULL) != 0 || sigaction(SIGQUIT, &sa, NULL) != 0 || sigaction(SIGTERM, &sa, NULL) != 0) {
printf("Could not set up signal handlers!\n");
cleanup();
return EXIT_FAILURE;
}
struct msghdr msg = {0};
struct iovec iovec;
int client_arg;
iovec.iov_base = &client_arg;
iovec.iov_len = sizeof(client_arg);
msg.msg_iov = &iovec;
msg.msg_iovlen = 1;
while (run) {
int ret = recvmsg(socket_fd, &msg, MSG_DONTWAIT);
if (ret != sizeof(client_arg)) {
if (errno != EAGAIN && errno != EWOULDBLOCK) {
printf("Error while accessing socket: %s\n", strerror(errno));
exit(1);
}
printf("No further client_args in socket.\n");
} else {
printf("received client_arg=%d\n", client_arg);
}
/* do daemon stuff */
sleep(1);
}
printf("Dropped out of daemon loop. Shutting down.\n");
cleanup();
return EXIT_FAILURE;
}
case 1: { /* Client */
if (argc < 2) {
printf("Usage: %s <int>\n", argv[0]);
return EXIT_FAILURE;
}
struct iovec iovec;
struct msghdr msg = {0};
int client_arg = atoi(argv[1]);
iovec.iov_base = &client_arg;
iovec.iov_len = sizeof(client_arg);
msg.msg_iov = &iovec;
msg.msg_iovlen = 1;
int ret = sendmsg(socket_fd, &msg, 0);
if (ret != sizeof(client_arg)) {
if (ret < 0)
printf("Could not send device address to daemon: '%s'!\n", strerror(errno));
else
printf("Could not send device address to daemon completely!\n");
cleanup();
return EXIT_FAILURE;
}
printf("Sent client_arg (%d) to daemon.\n", client_arg);
break;
}
default:
cleanup();
return EXIT_FAILURE;
}
cleanup();
return EXIT_SUCCESS;
}

All credits go to Mark Lakata. I merely did some very minor touch up only.
main.cpp
#include "singleton.hpp"
#include <iostream>
using namespace std;
int main()
{
SingletonProcess singleton(5555); // pick a port number to use that is specific to this app
if (!singleton())
{
cerr << "process running already. See " << singleton.GetLockFileName() << endl;
return 1;
}
// ... rest of the app
}
singleton.hpp
#include <netinet/in.h>
#include <unistd.h>
#include <cerrno>
#include <string>
#include <cstring>
#include <stdexcept>
using namespace std;
class SingletonProcess
{
public:
SingletonProcess(uint16_t port0)
: socket_fd(-1)
, rc(1)
, port(port0)
{
}
~SingletonProcess()
{
if (socket_fd != -1)
{
close(socket_fd);
}
}
bool operator()()
{
if (socket_fd == -1 || rc)
{
socket_fd = -1;
rc = 1;
if ((socket_fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
{
throw std::runtime_error(std::string("Could not create socket: ") + strerror(errno));
}
else
{
struct sockaddr_in name;
name.sin_family = AF_INET;
name.sin_port = htons (port);
name.sin_addr.s_addr = htonl (INADDR_ANY);
rc = bind (socket_fd, (struct sockaddr *) &name, sizeof (name));
}
}
return (socket_fd != -1 && rc == 0);
}
std::string GetLockFileName()
{
return "port " + std::to_string(port);
}
private:
int socket_fd = -1;
int rc;
uint16_t port;
};

#include <windows.h>
int main(int argc, char *argv[])
{
// ensure only one running instance
HANDLE hMutexH`enter code here`andle = CreateMutex(NULL, TRUE, L"my.mutex.name");
if (GetLastError() == ERROR_ALREADY_EXISTS)
{
return 0;
}
// rest of the program
ReleaseMutex(hMutexHandle);
CloseHandle(hMutexHandle);
return 0;
}
FROM: HERE

On Windows you could also create a shared data segment and use an interlocked function to test for the first occurence, e.g.
#include <Windows.h>
#include <stdio.h>
#include <conio.h>
#pragma data_seg("Shared")
volatile LONG lock = 0;
#pragma data_seg()
#pragma comment(linker, "/SECTION:Shared,RWS")
void main()
{
if (InterlockedExchange(&lock, 1) == 0)
printf("first\n");
else
printf("other\n");
getch();
}

I have just written one, and tested.
#define PID_FILE "/tmp/pidfile"
static void create_pidfile(void) {
int fd = open(PID_FILE, O_RDWR | O_CREAT | O_EXCL, 0);
close(fd);
}
int main(void) {
int fd = open(PID_FILE, O_RDONLY);
if (fd > 0) {
close(fd);
return 0;
}
// make sure only one instance is running
create_pidfile();
}

Just run this code on a seperate thread:
void lock() {
while(1) {
ofstream closer("myapplock.locker", ios::trunc);
closer << "locked";
closer.close();
}
}
Run this as your main code:
int main() {
ifstream reader("myapplock.locker");
string s;
reader >> s;
if (s != "locked") {
//your code
}
return 0;
}

Linux select() and FIFO ordering of multiple sockets?

Is there any way for the Linux select() call relay event ordering?
A description of what I'm seeing:
On one machine, I wrote a simple program which sends three multicast packets, one to each of three different multicast groups. These packets are sent back-to-back, with no delay in between. I.e. sendto(mcast_group1); sendto(mcast_group2); sendto(mcast_group3).
On the other machine, I have a receiving program. The program uses one socket per multicast group. Each socket does a bind() and IP_ADD_MEMBERSHIP (i.e. join/subscribe) to the address to which it listens. The program then does a select() on the three sockets.
When select returns, all three sockets are available for reading. But which one came first? The ready-for-reading list of sockets is a set, and therefore has no order. What I would like is if select() returned exactly once per received packet, in order (the increased overhead is acceptable here). Or, is there some other kind of mechanism I can use to determine packet receive order?
Additional information:
OS is CentOS 5 (effectively Redhat Enterprise Linux) on x86_64
NIC hardware is an Intel 82571EB
I've tried e1000e driver versions 1.3.10-k2 and 2.1.4-NAPI
I've tried pinning the NIC's interrupt to an unloaded and isolated CPU core
I've disabled hardware IRQ coalescing via setting the driver option InterruptThrottleRate=0, and setting rx-usecs=0 via ethtool
I also tried using epoll, and it has the same behavior
A final remark: packet ordering is preserved if I only use one socket. In this case, I bind to INADDR_ANY (0.0.0.0) and do the IP_ADD_MEMBERSHIP multiple times on the same socket. But this does not work for our application, because we need the filtering provided by binding to the actual multicast address. Ultimately, there will be multiple multicast receiving programs on the same machine, with subscription sets that may intersect with each other. So maybe an alternate solution is to find another way to achieve the filtering effect of bind(), but without bind().

You can use IP_PKTINFO to get the address of the multicast group the packet was send to - even if the socket is subscribed for a bunch of multicast groups. Having this in place, you will get the packets in order and the ability to filter by group addresses. See the example below:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <sys/stat.h>
#include <ctype.h>
#include <errno.h>
#define PORT 1234
#define PPANIC(msg) perror(msg); exit(1);
#define STATS_PATCH 0
int main(int argc, char **argv)
{
fd_set master;
fd_set read_fds;
struct sockaddr_in serveraddr;
int sock;
int opt = 1;
size_t i;
int rc;
char *mcast_groups[] = {
"226.0.0.1",
"226.0.0.2",
NULL
};
#if STATS_PATCH
struct stat stat_buf;
#endif
struct ip_mreq imreq;
FD_ZERO(&master);
FD_ZERO(&read_fds);
rc = sock = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
if(rc == -1)
{
PPANIC("socket() failed");
}
rc = setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &opt, sizeof(opt));
if(rc == -1)
{
PPANIC("setsockopt(reuse) failed");
}
memset(&serveraddr, 0, sizeof(serveraddr));
serveraddr.sin_family = AF_INET;
serveraddr.sin_port = htons(PORT);
serveraddr.sin_addr.s_addr = htonl(INADDR_ANY);
rc = bind(sock, (struct sockaddr *)&serveraddr, sizeof(serveraddr));
if(rc == -1)
{
PPANIC("bind() failed");
}
rc = setsockopt(sock, IPPROTO_IP, IP_PKTINFO, &opt, sizeof(opt));
if(rc == -1)
{
PPANIC("setsockopt(IP_PKTINFO) failed");
}
for (i = 0; mcast_groups[i] != NULL; i++)
{
imreq.imr_multiaddr.s_addr = inet_addr(mcast_groups[i]);
imreq.imr_interface.s_addr = INADDR_ANY;
rc = setsockopt(sock, IPPROTO_IP, IP_ADD_MEMBERSHIP, (const void *)&imreq, sizeof(struct ip_mreq));
if (rc != 0)
{
PPANIC("joing mcast group failed");
}
}
FD_SET(sock, &master);
while(1)
{
read_fds = master;
rc = select(sock + 1, &read_fds, NULL, NULL, NULL);
if (rc == 0)
{
continue;
}
if(rc == -1)
{
PPANIC("select() failed");
}
if(FD_ISSET(sock, &read_fds))
{
char buf[1024];
int inb;
char ctrl_msg_buf[1024];
struct iovec iov[1];
iov[0].iov_base = buf;
iov[0].iov_len = 1024;
struct msghdr msg_hdr = {
.msg_iov = iov,
.msg_iovlen = 1,
.msg_name = NULL,
.msg_namelen = 0,
.msg_control = ctrl_msg_buf,
.msg_controllen = sizeof(ctrl_msg_buf),
};
struct cmsghdr *ctrl_msg_hdr;
inb = recvmsg(sock, &msg_hdr, 0);
if (inb < 0)
{
PPANIC("recvmsg() failed");
}
for (ctrl_msg_hdr = CMSG_FIRSTHDR(&msg_hdr); ctrl_msg_hdr != NULL; ctrl_msg_hdr = CMSG_NXTHDR(&msg_hdr, ctrl_msg_hdr))
{
if (ctrl_msg_hdr->cmsg_level == IPPROTO_IP && ctrl_msg_hdr->cmsg_type == IP_PKTINFO)
{
struct in_pktinfo *pckt_info = (struct in_pktinfo *)CMSG_DATA(ctrl_msg_hdr);
printf("got data for mcast group: %s\n", inet_ntoa(pckt_info->ipi_addr));
break;
}
}
printf("|");
for (i = 0; i < inb; i++)
printf("%c", isprint(buf[i])?buf[i]:'?');
printf("|\n");
#if STATS_PATCH
rc = fstat(sock, &stat_buf);
if (rc == -1)
{
perror("fstat() failed");
} else {
printf("st_atime: %d\n", stat_buf.st_atime);
printf("st_mtime: %d\n", stat_buf.st_mtime);
printf("st_ctime: %d\n", stat_buf.st_ctime);
}
#endif
}
}
return 0;
}
the code below won't solve OPs problem but may guide people dealing with similar requirements
(EDIT) One should not do such things late at night... even with that solution you will only get the order the fd was handled by select - and this will give you no indication about the time of the frame arrival.
As stated here, it is currently not possible to retrieve the order of the sockets or the timestamps they changed as the required callback is not set for socket inodes. But if you are able to patch your kernel, you may work around the problem by setting the time within the select system call.
The following patch may give you an idea:
diff --git a/fs/select.c b/fs/select.c
index 467bb1c..3f2927e 100644
--- a/fs/select.c
+++ b/fs/select.c
## -435,6 +435,9 ## int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
for (i = 0; i < n; ++rinp, ++routp, ++rexp) {
unsigned long in, out, ex, all_bits, bit = 1, mask, j;
unsigned long res_in = 0, res_out = 0, res_ex = 0;
+ struct timeval tv;
+
+ do_gettimeofday(&tv);
in = *inp++; out = *outp++; ex = *exp++;
all_bits = in | out | ex;
## -452,6 +455,16 ## int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
f = fdget(i);
if (f.file) {
const struct file_operations *f_op;
+ struct kstat stat;
+
+ int ret;
+ u8 is_sock = 0;
+
+ ret = vfs_getattr(&f.file->f_path, &stat);
+ if(ret == 0 && S_ISSOCK(stat.mode)) {
+ is_sock = 1;
+ }
+
f_op = f.file->f_op;
mask = DEFAULT_POLLMASK;
if (f_op->poll) {
## -464,16 +477,22 ## int do_select(int n, fd_set_bits *fds, struct timespec *end_time)
res_in |= bit;
retval++;
wait->_qproc = NULL;
+ if(is_sock && f.file->f_inode)
+ f.file->f_inode->i_ctime.tv_sec = tv.tv_sec;
}
if ((mask & POLLOUT_SET) && (out & bit)) {
res_out |= bit;
retval++;
wait->_qproc = NULL;
+ if(is_sock && f.file->f_inode)
+ f.file->f_inode->i_ctime.tv_sec = tv.tv_sec;
}
if ((mask & POLLEX_SET) && (ex & bit)) {
res_ex |= bit;
retval++;
wait->_qproc = NULL;
+ if(is_sock && f.file->f_inode)
+ f.file->f_inode->i_ctime.tv_sec = tv.tv_sec;
}
/* got something, stop busy polling */
if (retval) {
Notes:
this is... just for you :) - don't expect it in the mainline
do_gettimeofday() is called before each relevant fd is tested.
to get higher granularity this should be done in each iteration (and only if needed). since the stat-interface only offers a granularity of one second
you may (!UGLY!) use the remaining time attributes to map the fractions of a second to those fields.
this was done using kernel 3.16.0 and is not well tested. don't use it in a space ship or medical equipment. if you would like to try it, get a filesystem-image (eg. https://people.debian.org/~aurel32/qemu/amd64/debian_wheezy_amd64_standard.qcow2) and use qemu to test it:
sudo qemu-system-x86_64 -kernel arch/x86/boot/bzImage -hda debian_wheezy_amd64_standard.qcow2 -append "root=/dev/sda1"

If select() returns > 1 the events must have been so close together as to make the question of ordering meaningless.

You can obtain the timestamp at which a file descriptor became ready using fstat.
For more info read http://pubs.opengroup.org/onlinepubs/009695399/functions/fstat.html

Fail to wake up from epoll_wait when other process closes fifo

I'm seeing different epoll and select behavior in two different binaries and was hoping for some debugging help. In the following, epoll_wait and select will be used interchangeably.
I have two processes, one writer and one reader, that communicate over a fifo. The reader performs an epoll_wait to be notified of writes. I would also like to know when the writer closes the fifo, and it appears that epoll_wait should notify me of this as well. The following toy program, which behaves as expected, illustrates what I'm trying to accomplish:
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/stat.h>
#include <unistd.h>
int
main(int argc, char** argv)
{
const char* filename = "tempfile";
char buf[1024];
memset(buf, 0, sizeof(buf));
struct stat statbuf;
if (!stat(filename, &statbuf))
unlink(filename);
mkfifo(filename, S_IRUSR | S_IWUSR);
pid_t pid = fork();
if (!pid) {
int fd = open(filename, O_WRONLY);
printf("Opened %d for writing\n", fd);
sleep(3);
close(fd);
} else {
int fd = open(filename, O_RDONLY);
printf("Opened %d for reading\n", fd);
static const int MAX_LENGTH = 1;
struct epoll_event init;
struct epoll_event evs[MAX_LENGTH];
int efd = epoll_create(MAX_LENGTH);
int i;
for (i = 0; i < MAX_LENGTH; ++i) {
init.data.u64 = 0;
init.data.fd = fd;
init.events |= EPOLLIN | EPOLLPRI | EPOLLHUP;
epoll_ctl(efd, EPOLL_CTL_ADD, fd, &init);
}
while (1) {
int nfds = epoll_wait(efd, evs, MAX_LENGTH, -1);
printf("%d fds ready\n", nfds);
int nread = read(fd, buf, sizeof(buf));
if (nread < 0) {
perror("read");
exit(1);
} else if (!nread) {
printf("Child %d closed the pipe\n", pid);
break;
}
printf("Reading: %s\n", buf);
}
}
return 0;
}
However, when I do this with another reader (whose code I'm not privileged to post, but which makes the exact same calls--the toy program is modeled on it), the process does not wake when the writer closes the fifo. The toy reader also gives the desired semantics with select. The real reader configured to use select also fails.
What might account for the different behavior of the two? For any provided hypotheses, how can I verify them? I'm running Linux 2.6.38.8.

strace is a great tool to confirm that the system calls are invoked correctly (i.e. parameters are passed correctly and they don't return any unexpected errors).
In addition to that I would recommend using lsof to check that no other process has that FIFO still opened.

Passing struct through socket via recv C/C++

Helo, i am trying to pass it like this
typedef struct t_timeSliceRequest{
unsigned int processId;
unsigned int timeRequired;
int priority;
}timeSliceRequest;
struct t_timeSliceRequest request = { 1,2,1 };
sendFlag = send(socketID,(timeSliceRequest *) &request, sin_size ,0);
and on server side
recvFlag = recv(socketID,(timeSliceRequest *) &request,sin_size,0);
but its receiving garbage, even recv returning -1, please help
This is my full Conde
#include<sys/socket.h>
#include<sys/types.h>
#include<string.h>
#include<stdio.h>
#include<arpa/inet.h>
#include<time.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <netinet/in.h>
enum priority_e{ high, normal, low };
typedef struct t_timeSliceRequest{
unsigned int processId;
unsigned int timeRequired;
int priority;
}timeSliceRequest;
typedef struct t_TimeSliceResponse {
timeSliceRequest original_req;
// Unix time stamp of when process was started on server
unsigned int time_started;
// Waiting and running time till end of CPU bust
unsigned int ttl;
} TimeSliceResponse;
int main(int argc, char ** argv){
int socketID = 0, clientID = 0;
char sendBuffer[1024], recvBuffer[1024];
time_t time;
struct sockaddr_in servAddr, clientAddr;
struct t_timeSliceRequest request = {1,1,0};
memset(sendBuffer, '0', sizeof(sendBuffer));
memset(recvBuffer, '0', sizeof(recvBuffer));
fprintf(stdout,"\n\n --- Server starting up --- \n\n");
fflush(stdout);
socketID = socket(AF_INET, SOCK_STREAM, 0);
if(socketID == -1){
fprintf(stderr, " Can't create Socket");
fflush(stdout);
}
servAddr.sin_family = AF_INET;
servAddr.sin_port = htons(5000);
servAddr.sin_addr.s_addr = htonl(INADDR_ANY);
int bindID, sin_size, recvFlag;
bindID = bind(socketID, (struct sockaddr *)&servAddr, sizeof(servAddr)); // Casting sockaddr_in on sockaddr and binding it with socket id
if(bindID!=-1){
fprintf(stdout," Bind SucessFull");
fflush(stdout);
listen(socketID,5);
fprintf(stdout, " Server Waiting for connections\n");
fflush(stdout);
while(1){
sin_size = sizeof(struct sockaddr_in);
clientID = accept(socketID, (struct sockaddr *) &clientAddr, &sin_size);
fprintf(stdout,"\n I got a connection from (%s , %d)", inet_ntoa(clientAddr.sin_addr), ntohs(clientAddr.sin_port));
fflush(stdout);
sin_size = sizeof(request);
recvFlag = recv(socketID, &request,sin_size,0);
perror("\n Err: ");
fprintf(stdout, "\n recvFlag: %d", recvFlag);
fprintf(stdout, "\n Time Slice request received:\n\tPid: %d \n\tTime Required: %d ", ntohs(request.processId), ntohs(request.timeRequired));
fflush(stdout);
snprintf(sendBuffer, sizeof(sendBuffer), "%.24s\n", ctime(&time));
write(clientID, sendBuffer, strlen(sendBuffer));
close(clientID);
sleep(1);
}
}else{
fprintf(stdout, " Unable to Bind");
}
close(socketID);
return 0;
}
And Client Code is:
#include <sys/socket.h>
#include <sys/types.h>
#include <netinet/in.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <arpa/inet.h>
enum priority_e{ high = +1, normal = 0, low = -1};
typedef struct t_timeSliceRequest{
unsigned int processId;
unsigned int timeRequired;
int priority;
}timeSliceRequest;
int main(int argc, char *argv[])
{
int socketID = 0 /*Socket Descriptor*/, n = 0;
char recvBuffer[1024];
memset(recvBuffer, '0',sizeof(recvBuffer));
struct sockaddr_in servAddr;
struct t_timeSliceRequest request = { 1,2,high };
if(argc!=2){
fprintf(stderr,"\n Usage: %s <ip of server> \n",argv[0]);
return 1;
}
socketID = socket(AF_INET, SOCK_STREAM, 0);
if(socketID == -1){
fprintf(stderr, "\n Can't create socket \n");
return 1;
}
servAddr.sin_family = AF_INET;
servAddr.sin_port = htons(5000);
if(inet_pton(AF_INET, argv[1], &servAddr.sin_addr)==-1){
fprintf(stderr, "\n Unable to convert given IP to Network Form \n inet_pton Error");
return 1;
}
int connectFlag, sendFlag = 0;
connectFlag = connect(socketID, (struct sockaddr *)&servAddr, sizeof(servAddr));
if(connectFlag == -1){
fprintf(stderr, " Connection Failed\n");
return 1;
}
int sin_size = sizeof(struct t_timeSliceRequest);
fprintf(stdout, " \n %d \n %d \n %d", request.processId, request.timeRequired, request.priority);
sendFlag = send(socketID, &request, sin_size ,0);
fprintf(stdout, "\nSend Flag: %d\n", sendFlag);
n = read(socketID, recvBuffer, sizeof(recvBuffer)-1);
recvBuffer[n] = 0;
fprintf(stdout, "%s",recvBuffer);
if(n < 0){
fprintf(stderr, " Read error\n");
}
return 0;
}
This is the full Code, its giving 'Transport endpoint is not connected'

Keep in mind that sending structs like this over the network may lead to interoperability problems:
if source and destination have different endianess, you're going to receive wrong data (consider using functions like htonl to convert the data to network endianess)
you struct needs to be packed, otherwise different compilers can align differently the variables of the struct (see this to get an idea about aligning the variables)
In any case, ENOTCONN suggests an error establishing the connection between the two hosts.

Transport endpoint is not connected error is returned when your socket isn't bound to any (port,address) pair.
If it's a server side, you should use the socket descriptor that is returned by accept call. In case of a client - you should use a socket that is returned by the successful call to connect.
Btw, sending structure the way you are is quite dangerous. Compilers might insert padding bytes between structure members (invisible to you program, but they take space in the structure) to conform some alignment rules for the target platform. Besides, different platforms might have different endianness, which might screw your structure completely. If your client and server are compiled for different machines, the structure layout and endianness can be incompatible. To solve this problem, you can use packed structures. A way of declaring a structure as packed depends on a compiler. For GCC this can be done by means of adding a special attribute to a structure.
Another way to solve this problem is to put each individual field of a structure to a raw byte-buffer manually. The receiving side should take all this data out in exactly the same way as the data was originally put into that buffer. This approach can be tricky, since you need to take into account a network byte order when saving multi-byte values (like int, long etc). There is a special set of functions like htonl, htons, ntohs etc for that.
Updated
In your server:
recvFlag = recv(socketID, &request,sin_size,0);
Here it should be
recvFlag = recv(clientID, &request,sin_size,0);
socketID is a passive socket and can only accept connections (not send any data).
What is more, the result of accept isn't checked for -1.

setsockopt on "accepted" fd on Linux

I have had a rather strange observation about behavior of setsockopt on Linux for SO_REUSEADDR. In one line: if I apply the sockopt to an fd returned by accept on a "listening socket" the socketoption is reflected on the port held by the listening socket.
Ok some code.
Server : Opens a socket, applies SO_REUSEADDR to be true. Accepts a connection and then applies SO_REUSEADDR to be false on the fd on the fd returned by accept.
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <string.h>
int main(void)
{
int s, len;
int sin_size;
int reuse = 1;
int ret;
struct sockaddr_in my_addr;
memset(&my_addr, 0, sizeof(my_addr));
my_addr.sin_family = AF_INET;
my_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
my_addr.sin_port = htons(33235);
if( (s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
printf("Socket Error\n");
return -1;
}
setsockopt(s, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(int));
if( bind(s, (struct sockaddr*)&my_addr, sizeof(struct sockaddr)) < 0)
{
printf("Bind Error\n");
return -1;
}
listen(s, 6);
reuse = 0;
memset(&my_addr, 0, sizeof(my_addr));
while(1) {
ret = accept(s, (struct sockaddr*)&my_addr, &len);
if (ret<0) {
printf("Accept failed\n");
} else {
printf("Accepted a client setting reuse add to 0\n");
setsockopt(ret, SOL_SOCKET, SO_REUSEADDR, &reuse, sizeof(int));
}
}
printf("Server exiting\n");
return 0;
}
Client : Client connects to the server, and doesn't do anything after that ensuring that the server socket stays in TIME_WAIT state.
#include <stdio.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <arpa/inet.h>
#include <string.h>
#include <errno.h>
int main(void)
{
int s, len;
int sin_size;
struct sockaddr_in my_addr;
memset(&my_addr, 0, sizeof(my_addr));
my_addr.sin_family = AF_INET;
my_addr.sin_addr.s_addr = inet_addr("127.0.0.1");
my_addr.sin_port = htons(33235);
if( (s = socket(AF_INET, SOCK_STREAM, 0)) < 0)
{
printf("Socket Error\n");
return -1;
}
if (!connect(s,(struct sockaddr*)&my_addr, sizeof(struct sockaddr)))
{
printf("Client Connected successfully\n");
}
else
{
printf("%s\n",strerror(errno));
}
while(1) sleep(1);
return 0;
}
Steps that I do reproduce the issue.
Run server.
Connect client.
Kill and restart server. The server fails with Bind Failure
I tested this on mac os. And the bind didn't fail. I have digged up all Posix specifications and none of them say that this code is undefined.
Question:
Can someone with more experience on this share their understanding of the issue?

One way to think about it is that SO_REUSEADDR determines if you can have another socket bound to that same address. It's a property of any socket (listen or connection), but very commonly inherited from listen via accept. In linux it's mapped to the struct sock "sk_reuse" flag.
If you clear this flag on a FD you "accepted" then from that point on the IP/Port pair is considered busy-and-non-reusable. The SO_REUSEADDR flag on the listen socket does not change, but the flag on the accepted socket affects bind logic. You could probably check this with getsockopt.
If you want to know more you can try to read the inet_csk_get_port function: http://lxr.free-electrons.com/source/net/ipv4/inet_connection_sock.c#L100. This is where the actual "binding" takes place.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string