Select() doesnt behave properly with an eventfd - linux

I want to use eventfd as a way to signal simple events between kernelspace and userspace. eventfd will be used as a way to signal and the actual data will be transferred using ioctl.
Before going ahead with implementing this, I wrote a simple program to see how eventfd behaves with select(). It seems that if you use select to wait on an eventfd, it wont return when u write to it in a separate thread. In the code I wrote, the writing thread waits for 5 seconds beginning from program start before writing to the eventfd twice. I would expect the select() to return in the reading thread immediately following this write but this does not happen. The select() returns only after the timeout of 10 seconds and returns zero. Regardless of this return zero, when I try to read the eventfd after 10 seconds, I get the correct value.
I use Ubuntu 12.04.1 (3.2.0-29-generic-pae) i386
Any idea why this is so? It seems to me that select() is not working as it should.
PS: This question is similar to linux - Can't get eventfd to work with epoll together
Is anyone else facing similar issues?
#include <unistd.h>
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h> //Definition of uint64_t
#include <pthread.h> //One thread writes to fd, other waits on it and then reads it
#include <time.h> //Writing thread uses delay before writing
#include <sys/eventfd.h>
int efd; //Event file descriptor
void * writing_thread_func() {
uint64_t eftd_ctr = 34;
ssize_t s;
printf("\n%s: now running...",__func__);
printf("\n%s: now sleeping for 5 seconds...",__func__);
fflush(stdout); //must call fflush before sleeping to ensure previous printf() is executed
sleep(5);
printf("\n%s: Writing %lld to eventfd...",__func__,eftd_ctr);
s = write(efd, &eftd_ctr, sizeof(uint64_t));
if (s != sizeof(uint64_t)) {
printf("\n%s: eventfd writing error. Exiting...",__func__);
exit(EXIT_FAILURE);
}
eftd_ctr = 99;
printf("\n%s: Writing %lld to eventfd...",__func__,eftd_ctr);
s = write(efd, &eftd_ctr, sizeof(uint64_t));
if (s != sizeof(uint64_t)) {
printf("\n%s: eventfd writing error. Exiting...",__func__);
exit(EXIT_FAILURE);
}
printf("\n%s: thread exiting...",__func__);
pthread_exit(0);
}
void * reading_thread_func() {
ssize_t s;
uint64_t eftd_ctr;
int retval; //for select()
fd_set rfds; //for select()
struct timeval tv; //for select()
printf("\n%s: now running...",__func__);
printf("\n%s: now waiting on select()...",__func__);
//Watch efd
FD_ZERO(&rfds);
FD_SET(efd, &rfds);
//Wait up to 10 seconds
tv.tv_sec = 10;
tv.tv_usec = 0;
retval = select(1, &rfds, NULL, NULL, &tv);
if (retval == -1){
printf("\n%s: select() error. Exiting...",__func__);
exit(EXIT_FAILURE);
} else if (retval > 0) {
printf("\n%s: select() says data is available now. Exiting...",__func__);
printf("\n%s: returned from select(), now executing read()...",__func__);
s = read(efd, &eftd_ctr, sizeof(uint64_t));
if (s != sizeof(uint64_t)){
printf("\n%s: eventfd read error. Exiting...",__func__);
exit(EXIT_FAILURE);
}
printf("\n%s: Returned from read(), value read = %lld",__func__, eftd_ctr);
} else if (retval == 0) {
printf("\n%s: select() says that no data was available even after 10 seconds...",__func__);
printf("\n%s: but lets try reading efd count anyway...",__func__);
s = read(efd, &eftd_ctr, sizeof(uint64_t));
if (s != sizeof(uint64_t)){
printf("\n%s: eventfd read error. Exiting...",__func__);
exit(EXIT_FAILURE);
}
printf("\n%s: Returned from read(), value read = %lld",__func__, eftd_ctr);
exit(EXIT_FAILURE);
}
printf("\n%s: thread exiting...",__func__);
pthread_exit(0);
}
int main() {
pthread_t writing_thread_var, reading_thread_var;
//Create eventfd
efd = eventfd(0,0);
if (efd == -1){
printf("\n%s: Unable to create eventfd! Exiting...",__func__);
exit(EXIT_FAILURE);
}
printf("\n%s: eventfd created. value = %d. Spawning threads...",__func__,efd);
//Create threads
pthread_create(&writing_thread_var, NULL, writing_thread_func, NULL);
pthread_create(&reading_thread_var, NULL, reading_thread_func, NULL);
//Wait for threads to terminate
pthread_join(writing_thread_var, NULL);
pthread_join(reading_thread_var, NULL);
printf("\n%s: closing eventfd. Exiting...",__func__);
close(efd);
exit(EXIT_SUCCESS);
}

So it was a silly mistake:
I changed:
retval = select(1, &rfds, NULL, NULL, &tv);
to:
retval = select(efd+1, &rfds, NULL, NULL, &tv);
and it worked.
Thanks again #Steve-o

Related

How can i make sure that only a single instance of the process on Linux? [duplicate]

What would be your suggestion in order to create a single instance application, so that only one process is allowed to run at a time? File lock, mutex or what?
A good way is:
#include <sys/file.h>
#include <errno.h>
int pid_file = open("/var/run/whatever.pid", O_CREAT | O_RDWR, 0666);
int rc = flock(pid_file, LOCK_EX | LOCK_NB);
if(rc) {
if(EWOULDBLOCK == errno)
; // another instance is running
}
else {
// this is the first instance
}
Note that locking allows you to ignore stale pid files (i.e. you don't have to delete them). When the application terminates for any reason the OS releases the file lock for you.
Pid files are not terribly useful because they can be stale (the file exists but the process does not). Hence, the application executable itself can be locked instead of creating and locking a pid file.
A more advanced method is to create and bind a unix domain socket using a predefined socket name. Bind succeeds for the first instance of your application. Again, the OS unbinds the socket when the application terminates for any reason. When bind() fails another instance of the application can connect() and use this socket to pass its command line arguments to the first instance.
Here is a solution in C++. It uses the socket recommendation of Maxim. I like this solution better than the file based locking solution, because the file based one fails if the process crashes and does not delete the lock file. Another user will not be able to delete the file and lock it. The sockets are automatically deleted when the process exits.
Usage:
int main()
{
SingletonProcess singleton(5555); // pick a port number to use that is specific to this app
if (!singleton())
{
cerr << "process running already. See " << singleton.GetLockFileName() << endl;
return 1;
}
... rest of the app
}
Code:
#include <netinet/in.h>
class SingletonProcess
{
public:
SingletonProcess(uint16_t port0)
: socket_fd(-1)
, rc(1)
, port(port0)
{
}
~SingletonProcess()
{
if (socket_fd != -1)
{
close(socket_fd);
}
}
bool operator()()
{
if (socket_fd == -1 || rc)
{
socket_fd = -1;
rc = 1;
if ((socket_fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
{
throw std::runtime_error(std::string("Could not create socket: ") + strerror(errno));
}
else
{
struct sockaddr_in name;
name.sin_family = AF_INET;
name.sin_port = htons (port);
name.sin_addr.s_addr = htonl (INADDR_ANY);
rc = bind (socket_fd, (struct sockaddr *) &name, sizeof (name));
}
}
return (socket_fd != -1 && rc == 0);
}
std::string GetLockFileName()
{
return "port " + std::to_string(port);
}
private:
int socket_fd = -1;
int rc;
uint16_t port;
};
For windows, a named kernel object (e.g. CreateEvent, CreateMutex). For unix, a pid-file - create a file and write your process ID to it.
You can create an "anonymous namespace" AF_UNIX socket. This is completely Linux-specific, but has the advantage that no filesystem actually has to exist.
Read the man page for unix(7) for more info.
Avoid file-based locking
It is always good to avoid a file based locking mechanism to implement the singleton instance of an application. The user can always rename the lock file to a different name and run the application again as follows:
mv lockfile.pid lockfile1.pid
Where lockfile.pid is the lock file based on which is checked for existence before running the application.
So, it is always preferable to use a locking scheme on object directly visible to only the kernel. So, anything which has to do with a file system is not reliable.
So the best option would be to bind to a inet socket. Note that unix domain sockets reside in the filesystem and are not reliable.
Alternatively, you can also do it using DBUS.
It's seems to not be mentioned - it is possible to create a mutex in shared memory but it needs to be marked as shared by attributes (not tested):
pthread_mutexattr_t attr;
pthread_mutexattr_init(&attr);
pthread_mutexattr_setpshared(&attr, PTHREAD_PROCESS_SHARED);
pthread_mutex_t *mutex = shmat(SHARED_MEMORY_ID, NULL, 0);
pthread_mutex_init(mutex, &attr);
There is also shared memory semaphores (but I failed to find out how to lock one):
int sem_id = semget(SHARED_MEMORY_KEY, 1, 0);
No one has mentioned it, but sem_open() creates a real named semaphore under modern POSIX-compliant OSes. If you give a semaphore an initial value of 1, it becomes a mutex (as long as it is strictly released only if a lock was successfully obtained).
With several sem_open()-based objects, you can create all of the common equivalent Windows named objects - named mutexes, named semaphores, and named events. Named events with "manual" set to true is a bit more difficult to emulate (it requires four semaphore objects to properly emulate CreateEvent(), SetEvent(), and ResetEvent()). Anyway, I digress.
Alternatively, there is named shared memory. You can initialize a pthread mutex with the "shared process" attribute in named shared memory and then all processes can safely access that mutex object after opening a handle to the shared memory with shm_open()/mmap(). sem_open() is easier if it is available for your platform (if it isn't, it should be for sanity's sake).
Regardless of the method you use, to test for a single instance of your application, use the trylock() variant of the wait function (e.g. sem_trywait()). If the process is the only one running, it will successfully lock the mutex. If it isn't, it will fail immediately.
Don't forget to unlock and close the mutex on application exit.
It will depend on which problem you want to avoid by forcing your application to have only one instance and the scope on which you consider instances.
For a daemon — the usual way is to have a /var/run/app.pid file.
For user application, I've had more problems with applications which prevented me to run them twice than with being able to run twice an application which shouldn't have been run so. So the answer on "why and on which scope" is very important and will probably bring answer specific on the why and the intended scope.
Here is a solution based on sem_open
/*
*compile with :
*gcc single.c -o single -pthread
*/
/*
* run multiple instance on 'single', and check the behavior
*/
#include <stdio.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <semaphore.h>
#include <unistd.h>
#include <errno.h>
#define SEM_NAME "/mysem_911"
int main()
{
sem_t *sem;
int rc;
sem = sem_open(SEM_NAME, O_CREAT, S_IRWXU, 1);
if(sem==SEM_FAILED){
printf("sem_open: failed errno:%d\n", errno);
}
rc=sem_trywait(sem);
if(rc == 0){
printf("Obtained lock !!!\n");
sleep(10);
//sem_post(sem);
sem_unlink(SEM_NAME);
}else{
printf("Lock not obtained\n");
}
}
One of the comments on a different answer says "I found sem_open() rather lacking". I am not sure about the specifics of what's lacking
Based on the hints in maxim's answer here is my POSIX solution of a dual-role daemon (i.e. a single application that can act as daemon and as a client communicating with that daemon). This scheme has the advantage of providing an elegant solution of the problem when the instance started first should be the daemon and all following executions should just load off the work at that daemon. It is a complete example but lacks a lot of stuff a real daemon should do (e.g. using syslog for logging and fork to put itself into background correctly, dropping privileges etc.), but it is already quite long and is fully working as is. I have only tested this on Linux so far but IIRC it should be all POSIX-compatible.
In the example the clients can send integers passed to them as first command line argument and parsed by atoi via the socket to the daemon which prints it to stdout. With this kind of sockets it is also possible to transfer arrays, structs and even file descriptors (see man 7 unix).
#include <stdio.h>
#include <stddef.h>
#include <stdbool.h>
#include <stdlib.h>
#include <unistd.h>
#include <errno.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/un.h>
#define SOCKET_NAME "/tmp/exampled"
static int socket_fd = -1;
static bool isdaemon = false;
static bool run = true;
/* returns
* -1 on errors
* 0 on successful server bindings
* 1 on successful client connects
*/
int singleton_connect(const char *name) {
int len, tmpd;
struct sockaddr_un addr = {0};
if ((tmpd = socket(AF_UNIX, SOCK_DGRAM, 0)) < 0) {
printf("Could not create socket: '%s'.\n", strerror(errno));
return -1;
}
/* fill in socket address structure */
addr.sun_family = AF_UNIX;
strcpy(addr.sun_path, name);
len = offsetof(struct sockaddr_un, sun_path) + strlen(name);
int ret;
unsigned int retries = 1;
do {
/* bind the name to the descriptor */
ret = bind(tmpd, (struct sockaddr *)&addr, len);
/* if this succeeds there was no daemon before */
if (ret == 0) {
socket_fd = tmpd;
isdaemon = true;
return 0;
} else {
if (errno == EADDRINUSE) {
ret = connect(tmpd, (struct sockaddr *) &addr, sizeof(struct sockaddr_un));
if (ret != 0) {
if (errno == ECONNREFUSED) {
printf("Could not connect to socket - assuming daemon died.\n");
unlink(name);
continue;
}
printf("Could not connect to socket: '%s'.\n", strerror(errno));
continue;
}
printf("Daemon is already running.\n");
socket_fd = tmpd;
return 1;
}
printf("Could not bind to socket: '%s'.\n", strerror(errno));
continue;
}
} while (retries-- > 0);
printf("Could neither connect to an existing daemon nor become one.\n");
close(tmpd);
return -1;
}
static void cleanup(void) {
if (socket_fd >= 0) {
if (isdaemon) {
if (unlink(SOCKET_NAME) < 0)
printf("Could not remove FIFO.\n");
} else
close(socket_fd);
}
}
static void handler(int sig) {
run = false;
}
int main(int argc, char **argv) {
switch (singleton_connect(SOCKET_NAME)) {
case 0: { /* Daemon */
struct sigaction sa;
sa.sa_handler = &handler;
sigemptyset(&sa.sa_mask);
if (sigaction(SIGINT, &sa, NULL) != 0 || sigaction(SIGQUIT, &sa, NULL) != 0 || sigaction(SIGTERM, &sa, NULL) != 0) {
printf("Could not set up signal handlers!\n");
cleanup();
return EXIT_FAILURE;
}
struct msghdr msg = {0};
struct iovec iovec;
int client_arg;
iovec.iov_base = &client_arg;
iovec.iov_len = sizeof(client_arg);
msg.msg_iov = &iovec;
msg.msg_iovlen = 1;
while (run) {
int ret = recvmsg(socket_fd, &msg, MSG_DONTWAIT);
if (ret != sizeof(client_arg)) {
if (errno != EAGAIN && errno != EWOULDBLOCK) {
printf("Error while accessing socket: %s\n", strerror(errno));
exit(1);
}
printf("No further client_args in socket.\n");
} else {
printf("received client_arg=%d\n", client_arg);
}
/* do daemon stuff */
sleep(1);
}
printf("Dropped out of daemon loop. Shutting down.\n");
cleanup();
return EXIT_FAILURE;
}
case 1: { /* Client */
if (argc < 2) {
printf("Usage: %s <int>\n", argv[0]);
return EXIT_FAILURE;
}
struct iovec iovec;
struct msghdr msg = {0};
int client_arg = atoi(argv[1]);
iovec.iov_base = &client_arg;
iovec.iov_len = sizeof(client_arg);
msg.msg_iov = &iovec;
msg.msg_iovlen = 1;
int ret = sendmsg(socket_fd, &msg, 0);
if (ret != sizeof(client_arg)) {
if (ret < 0)
printf("Could not send device address to daemon: '%s'!\n", strerror(errno));
else
printf("Could not send device address to daemon completely!\n");
cleanup();
return EXIT_FAILURE;
}
printf("Sent client_arg (%d) to daemon.\n", client_arg);
break;
}
default:
cleanup();
return EXIT_FAILURE;
}
cleanup();
return EXIT_SUCCESS;
}
All credits go to Mark Lakata. I merely did some very minor touch up only.
main.cpp
#include "singleton.hpp"
#include <iostream>
using namespace std;
int main()
{
SingletonProcess singleton(5555); // pick a port number to use that is specific to this app
if (!singleton())
{
cerr << "process running already. See " << singleton.GetLockFileName() << endl;
return 1;
}
// ... rest of the app
}
singleton.hpp
#include <netinet/in.h>
#include <unistd.h>
#include <cerrno>
#include <string>
#include <cstring>
#include <stdexcept>
using namespace std;
class SingletonProcess
{
public:
SingletonProcess(uint16_t port0)
: socket_fd(-1)
, rc(1)
, port(port0)
{
}
~SingletonProcess()
{
if (socket_fd != -1)
{
close(socket_fd);
}
}
bool operator()()
{
if (socket_fd == -1 || rc)
{
socket_fd = -1;
rc = 1;
if ((socket_fd = socket(AF_INET, SOCK_DGRAM, 0)) < 0)
{
throw std::runtime_error(std::string("Could not create socket: ") + strerror(errno));
}
else
{
struct sockaddr_in name;
name.sin_family = AF_INET;
name.sin_port = htons (port);
name.sin_addr.s_addr = htonl (INADDR_ANY);
rc = bind (socket_fd, (struct sockaddr *) &name, sizeof (name));
}
}
return (socket_fd != -1 && rc == 0);
}
std::string GetLockFileName()
{
return "port " + std::to_string(port);
}
private:
int socket_fd = -1;
int rc;
uint16_t port;
};
#include <windows.h>
int main(int argc, char *argv[])
{
// ensure only one running instance
HANDLE hMutexH`enter code here`andle = CreateMutex(NULL, TRUE, L"my.mutex.name");
if (GetLastError() == ERROR_ALREADY_EXISTS)
{
return 0;
}
// rest of the program
ReleaseMutex(hMutexHandle);
CloseHandle(hMutexHandle);
return 0;
}
FROM: HERE
On Windows you could also create a shared data segment and use an interlocked function to test for the first occurence, e.g.
#include <Windows.h>
#include <stdio.h>
#include <conio.h>
#pragma data_seg("Shared")
volatile LONG lock = 0;
#pragma data_seg()
#pragma comment(linker, "/SECTION:Shared,RWS")
void main()
{
if (InterlockedExchange(&lock, 1) == 0)
printf("first\n");
else
printf("other\n");
getch();
}
I have just written one, and tested.
#define PID_FILE "/tmp/pidfile"
static void create_pidfile(void) {
int fd = open(PID_FILE, O_RDWR | O_CREAT | O_EXCL, 0);
close(fd);
}
int main(void) {
int fd = open(PID_FILE, O_RDONLY);
if (fd > 0) {
close(fd);
return 0;
}
// make sure only one instance is running
create_pidfile();
}
Just run this code on a seperate thread:
void lock() {
while(1) {
ofstream closer("myapplock.locker", ios::trunc);
closer << "locked";
closer.close();
}
}
Run this as your main code:
int main() {
ifstream reader("myapplock.locker");
string s;
reader >> s;
if (s != "locked") {
//your code
}
return 0;
}

Cygwin: interrupting blocking read

I've written the program which spawns a thread that reads in a loop from stdin in a blocking fashion. I want to make the thread return from blocked read immediately. I've registered my signal handler (with sigaction and without SA_RESTART flag) in the reading thread, send it a signal and expect read to exit with EINTR error. But it doesn't happen. Is it issue or limitation of Cygwin or am I doing something wrong?
Here is the code:
#include <stdio.h>
#include <errno.h>
#include <pthread.h>
pthread_t thread;
volatile int run = 0;
void root_handler(int signum)
{
printf("%s ENTER (thread is %x)\n", __func__, pthread_self());
run = 0;
}
void* thr_func(void*arg)
{ int res;
char buffer[256];
printf("%s ENTER (thread is %x)\n", __func__, pthread_self());
struct sigaction act;
memset (&act, 0, sizeof(act));
act.sa_sigaction = &root_handler;
//act.sa_flags = SA_RESTART;
if (sigaction(SIGUSR1, &act, NULL) < 0) {
perror ("sigaction error");
return 1;
}
while(run)
{
res = read(0,buffer, sizeof(buffer));
if(res == -1)
{
if(errno == EINTR)
{
puts("read was interrupted by signal");
}
}
else
{
printf("got: %s", buffer);
}
}
printf("%s LEAVE (thread is %x)\n", __func__, pthread_self());
}
int main() {
run = 1;
printf("root thread: %x\n", pthread_self());
pthread_create(&thread, NULL, &thr_func, NULL);
printf("thread %x started\n", thread);
sleep(4);
pthread_kill(thread, SIGUSR1 );
//raise(SIGUSR1);
pthread_join(thread, NULL);
return 0;
}
I'm using Cygwin (1.7.32(0.274/5/3)).
I've just tried to do the same on Ubuntu and it works (I needed to include signal.h, though, even though in Cygwin it compiled as it is). It seems to be peculiarity of Cygwin's implementation.

Can not get proper response from select() using writefds

Parent receives SIGPIPE sending chars to aborted child process through FIFO pipe.
I am trying to avoid this, using select() function. In the attached sample code,
select() retruns OK even after the child at the other end of pipe having been terminated.
Tested in
RedHat EL5 (Linux 2.6.18-194.32.1.el5)
GNU C Library stable release version 2.5
Any help appreciated. Thnak you.
#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <unistd.h>
static void sigpipe_fct();
main()
{
struct stat st;
int i, fd_out, fd_in, child;
char buf[1024];
#define p_out "/tmp/pout"
signal(SIGPIPE, sigpipe_fct);
if (stat(p_out, &st) != 0) {
mknod(p_out, S_IFIFO, 0);
chmod(p_out, 0666);
}
/* start receiving process */
if ((child = fork()) == 0) {
if ((fd_in = open(p_out, O_RDONLY)) < 0) {
perror(p_out);
exit(1);
}
while(1) {
i = read(fd_in, buf, sizeof(buf));
fprintf(stderr, "child %d read %.*s\n", getpid(), i, buf);
lseek(fd_in, 0, 0);
}
}
else {
fprintf(stderr,
"reading from %s - exec \"kill -9 %d\" to test\n", p_out, child);
if ((fd_out = open(p_out, O_WRONLY + O_NDELAY)) < 0) { /* output */
perror(p_out);
exit(1);
}
while(1) {
if (SelectChkWrite(fd_out) == fd_out) {
fprintf(stderr, "SelectChkWrite() success write abc\n");
write(fd_out, "abc", 3);
}
else
fprintf(stderr, "SelectChkWrite() failed\n");
sleep(3);
}
}
}
static void sigpipe_fct()
{
fprintf(stderr, "SIGPIPE received\n");
exit(-1);
}
SelectChkWrite(ch)
int ch;
{
#include <sys/select.h>
fd_set writefds;
int i;
FD_ZERO(&writefds);
FD_SET (ch, &writefds);
i = select(ch + 1, NULL, &writefds, NULL, NULL);
if (i == -1)
return(-1);
else if (FD_ISSET(ch, &writefds))
return(ch);
else
return(-1);
}
From the Linux select(3) man page:
A descriptor shall be considered ready for writing when a call to an
output function with O_NONBLOCK clear would not block, whether or not
the function would transfer data successfully.
When the pipe is closed, it won't block, so it is considered "ready" by select.
BTW, having #include <sys/select.h> inside your SelectChkWrite() function is extremely bad form.
Although select() and poll() are both in the POSIX standard, select() is much older and more limited than poll(). In general, I recommend people use poll() by default and only use select() if they have a good reason. (See here for one example.)

Simultaneous socket read/write ("full-duplex") in Linux (aio specifically)

I'm porting an application built on top of the ACE Proactor framework. The application runs perfectly for both VxWorks and Windows, but fails to do so on Linux (CentOS 5.5, WindRiver Linux 1.4 & 3.0) with kernel 2.6.X.X - using librt.
I've narrowed the problem down to a very basic issue:
The application begins an asynchronous (via aio_read) read operation on a socket and subsequently begins an asynchronous (via aio_write) write on the very same socket. The read operation cannot be fulfilled yet since the protocol is initialized from the application's end.
- When the socket is in blocking-mode, the write is never reached and the protocol "hangs".
- When using a O_NONBLOCK socket, the write succeeds but the read returns indefinitely with a "EWOULDBLOCK/EAGAIN" error, never to recover (even if the AIO operation is restarted).
I went through multiple forums and could not find a definitive answer to whether this should work (and I'm doing something wrong) or impossible with Linux AIO. Is it possible if I drop the AIO and seek a different implementation (via epoll/poll/select etc.)?
Attached is a sample code to quickly re-produce the problem on a non-blocking socket:
#include <aio.h>
#include <stdio.h>
#include <stdlib.h>
#include <netdb.h>
#include <string.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <assert.h>
#include <errno.h>
#define BUFSIZE (100)
// Global variables
struct aiocb *cblist[2];
int theSocket;
void InitializeAiocbData(struct aiocb* pAiocb, char* pBuffer)
{
bzero( (char *)pAiocb, sizeof(struct aiocb) );
pAiocb->aio_fildes = theSocket;
pAiocb->aio_nbytes = BUFSIZE;
pAiocb->aio_offset = 0;
pAiocb->aio_buf = pBuffer;
}
void IssueReadOperation(struct aiocb* pAiocb, char* pBuffer)
{
InitializeAiocbData(pAiocb, pBuffer);
int ret = aio_read( pAiocb );
assert (ret >= 0);
}
void IssueWriteOperation(struct aiocb* pAiocb, char* pBuffer)
{
InitializeAiocbData(pAiocb, pBuffer);
int ret = aio_write( pAiocb );
assert (ret >= 0);
}
int main()
{
int ret;
int nPort = 11111;
char* szServer = "10.10.9.123";
// Connect to the remote server
theSocket = socket(AF_INET, SOCK_STREAM, 0);
assert (theSocket >= 0);
struct hostent *pServer;
struct sockaddr_in serv_addr;
pServer = gethostbyname(szServer);
bzero((char *) &serv_addr, sizeof(serv_addr));
serv_addr.sin_family = AF_INET;
serv_addr.sin_port = htons(nPort);
bcopy((char *)pServer->h_addr, (char *)&serv_addr.sin_addr.s_addr, pServer->h_length);
assert (connect(theSocket, (const sockaddr*)(&serv_addr), sizeof(serv_addr)) >= 0);
// Set the socket to be non-blocking
int oldFlags = fcntl(theSocket, F_GETFL) ;
int newFlags = oldFlags | O_NONBLOCK;
fcntl(theSocket, F_SETFL, newFlags);
printf("Socket flags: before=%o, after=%o\n", oldFlags, newFlags);
// Construct the AIO callbacks array
struct aiocb my_aiocb1, my_aiocb2;
char* pBuffer = new char[BUFSIZE+1];
bzero( (char *)cblist, sizeof(cblist) );
cblist[0] = &my_aiocb1;
cblist[1] = &my_aiocb2;
// Start the read and write operations on the same socket
IssueReadOperation(&my_aiocb1, pBuffer);
IssueWriteOperation(&my_aiocb2, pBuffer);
// Wait for I/O completion on both operations
int nRound = 1;
printf("\naio_suspend round #%d:\n", nRound++);
ret = aio_suspend( cblist, 2, NULL );
assert (ret == 0);
// Check the error status for the read and write operations
ret = aio_error(&my_aiocb1);
assert (ret == EWOULDBLOCK);
// Get the return code for the read
{
ssize_t retcode = aio_return(&my_aiocb1);
printf("First read operation results: aio_error=%d, aio_return=%d - That's the first EWOULDBLOCK\n", ret, retcode);
}
ret = aio_error(&my_aiocb2);
assert (ret == EINPROGRESS);
printf("Write operation is still \"in progress\"\n");
// Re-issue the read operation
IssueReadOperation(&my_aiocb1, pBuffer);
// Wait for I/O completion on both operations
printf("\naio_suspend round #%d:\n", nRound++);
ret = aio_suspend( cblist, 2, NULL );
assert (ret == 0);
// Check the error status for the read and write operations for the second time
ret = aio_error(&my_aiocb1);
assert (ret == EINPROGRESS);
printf("Second read operation request is suddenly marked as \"in progress\"\n");
ret = aio_error(&my_aiocb2);
assert (ret == 0);
// Get the return code for the write
{
ssize_t retcode = aio_return(&my_aiocb2);
printf("Write operation has completed with results: aio_error=%d, aio_return=%d\n", ret, retcode);
}
// Now try waiting for the read operation to complete - it'll just busy-wait, receiving "EWOULDBLOCK" indefinitely
do
{
printf("\naio_suspend round #%d:\n", nRound++);
ret = aio_suspend( cblist, 1, NULL );
assert (ret == 0);
// Check the error of the read operation and re-issue if needed
ret = aio_error(&my_aiocb1);
if (ret == EWOULDBLOCK)
{
IssueReadOperation(&my_aiocb1, pBuffer);
printf("EWOULDBLOCK again on the read operation!\n");
}
}
while (ret == EWOULDBLOCK);
}
Thanks in advance,
Yotam.
Firstly, O_NONBLOCK and AIO don't mix. AIO will report the asynchronous operation complete when the corresponding read or write wouldn't have blocked - and with O_NONBLOCK, they would never block, so the aio request will always complete immediately (with aio_return() giving EWOULDBLOCK).
Secondly, don't use the same buffer for two simultaneous outstanding aio requests. The buffer should be considered completely offlimits between the time when the aio request was issued and when aio_error() tells you that it has completed.
Thirdly, AIO requests to the same file descriptor are queued, in order to give sensible results. This means that your write won't happen until the read completes - if you need to write the data first, you need to issue the AIOs in the opposite order. The following will work fine, without setting O_NONBLOCK:
struct aiocb my_aiocb1, my_aiocb2;
char pBuffer1[BUFSIZE+1], pBuffer2[BUFSIZE+1] = "Some test message";
const struct aiocb *cblist[2] = { &my_aiocb1, &my_aiocb2 };
// Start the read and write operations on the same socket
IssueWriteOperation(&my_aiocb2, pBuffer2);
IssueReadOperation(&my_aiocb1, pBuffer1);
// Wait for I/O completion on both operations
int nRound = 1;
int aio_status1, aio_status2;
do {
printf("\naio_suspend round #%d:\n", nRound++);
ret = aio_suspend( cblist, 2, NULL );
assert (ret == 0);
// Check the error status for the read and write operations
aio_status1 = aio_error(&my_aiocb1);
if (aio_status1 == EINPROGRESS)
puts("aio1 still in progress.");
else
puts("aio1 completed.");
aio_status2 = aio_error(&my_aiocb2);
if (aio_status2 == EINPROGRESS)
puts("aio2 still in progress.");
else
puts("aio2 completed.");
} while (aio_status1 == EINPROGRESS || aio_status2 == EINPROGRESS);
// Get the return code for the read
ssize_t retcode;
retcode = aio_return(&my_aiocb1);
printf("First operation results: aio_error=%d, aio_return=%d\n", aio_status1, retcode);
retcode = aio_return(&my_aiocb1);
printf("Second operation results: aio_error=%d, aio_return=%d\n", aio_status1, retcode);
Alternatively, if you don't care about reads and writes being ordered with respect to each other, you can use dup() to create two file descriptors for the socket, and use one for reading and the other for writing - each will have its AIO operations queued separately.

Debugging segmentation fault in a multi-threaded (using clone) program

I wrote a code to create some threads and whenever one of the threads finish a new thread is created to replace it. As I was not able to create very large number of threads (>450) using pthreads, I used clone system call instead. (Please note that I am aware of the implication of having such a huge number of threads, but this program is meant to only stress the system).
As clone() requires the stack space for the child thread to be specified as parameter, I malloc the required chunk of stack space for each thread and free it up when the thread finishes. When a thread finishes I send a signal to the parent to notify it of the same.
The code is given below:
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <signal.h>
#include <unistd.h>
#include <errno.h>
#define NUM_THREADS 5
unsigned long long total_count=0;
int num_threads = NUM_THREADS;
static int thread_pids[NUM_THREADS];
static void *thread_stacks[NUM_THREADS];
int ppid;
int worker() {
int i;
union sigval s={0};
for(i=0;i!=99999999;i++);
if(sigqueue(ppid, SIGUSR1, s)!=0)
fprintf(stderr, "ERROR sigqueue");
fprintf(stderr, "Child [%d] done\n", getpid());
return 0;
}
void sigint_handler(int signal) {
char fname[35]="";
FILE *fp;
int ch;
if(signal == SIGINT) {
fprintf(stderr, "Caught SIGINT\n");
sprintf(fname, "/proc/%d/status", getpid());
fp = fopen(fname,"r");
while((ch=fgetc(fp))!=EOF)
fprintf(stderr, "%c", (char)ch);
fclose(fp);
fprintf(stderr, "No. of threads created so far = %llu\n", total_count);
exit(0);
} else
fprintf(stderr, "Unhandled signal (%d) received\n", signal);
}
int main(int argc, char *argv[]) {
int rc, i; long t;
void *chld_stack, *chld_stack2;
siginfo_t siginfo;
sigset_t sigset, oldsigset;
if(argc>1) {
num_threads = atoi(argv[1]);
if(num_threads<1) {
fprintf(stderr, "Number of threads must be >0\n");
return -1;
}
}
signal(SIGINT, sigint_handler);
/* Block SIGUSR1 */
sigemptyset(&sigset);
sigaddset(&sigset, SIGUSR1);
if(sigprocmask(SIG_BLOCK, &sigset, &oldsigset)==-1)
fprintf(stderr, "ERROR: cannot block SIGUSR1 \"%s\"\n", strerror(errno));
printf("Number of threads = %d\n", num_threads);
ppid = getpid();
for(t=0,i=0;t<num_threads;t++,i++) {
chld_stack = (void *) malloc(148*512);
chld_stack2 = ((char *)chld_stack + 148*512 - 1);
if(chld_stack == NULL) {
fprintf(stderr, "ERROR[%ld]: malloc for stack-space failed\n", t);
break;
}
rc = clone(worker, chld_stack2, CLONE_VM|CLONE_FS|CLONE_FILES, NULL);
if(rc == -1) {
fprintf(stderr, "ERROR[%ld]: return code from pthread_create() is %d\n", t, errno);
break;
}
thread_pids[i]=rc;
thread_stacks[i]=chld_stack;
fprintf(stderr, " [index:%d] = [pid:%d] ; [stack:0x%p]\n", i, thread_pids[i], thread_stacks[i]);
total_count++;
}
sigemptyset(&sigset);
sigaddset(&sigset, SIGUSR1);
while(1) {
fprintf(stderr, "Waiting for signal from childs\n");
if(sigwaitinfo(&sigset, &siginfo) == -1)
fprintf(stderr, "- ERROR returned by sigwaitinfo : \"%s\"\n", strerror(errno));
fprintf(stderr, "Got some signal from pid:%d\n", siginfo.si_pid);
/* A child finished, free the stack area allocated for it */
for(i=0;i<NUM_THREADS;i++) {
fprintf(stderr, " [index:%d] = [pid:%d] ; [stack:%p]\n", i, thread_pids[i], thread_stacks[i]);
if(thread_pids[i]==siginfo.si_pid) {
free(thread_stacks[i]);
thread_stacks[i]=NULL;
break;
}
}
fprintf(stderr, "Search for child ended with i=%d\n",i);
if(i==NUM_THREADS)
continue;
/* Create a new thread in its place */
chld_stack = (void *) malloc(148*512);
chld_stack2 = ((char *)chld_stack + 148*512 - 1);
if(chld_stack == NULL) {
fprintf(stderr, "ERROR[%ld]: malloc for stack-space failed\n", t);
break;
}
rc = clone(worker, chld_stack2, CLONE_VM|CLONE_FS|CLONE_FILES, NULL);
if(rc == -1) {
fprintf(stderr, "ERROR[%ld]: return code from clone() is %d\n", t, errno);
break;
}
thread_pids[i]=rc;
thread_stacks[i]=chld_stack;
total_count++;
}
fprintf(stderr, "Broke out of infinite loop. [total_count=%llu] [i=%d]\n",total_count, i);
return 0;
}
I have used couple of arrays to keep track of the child processes' pid and the stack area base address (for freeing it).
When I run this program it terminates after sometime. Running with gdb tells me that one of the thread gets a SIGSEGV (segmentation fault). But it doesn't gives me any location, the output is similar to the following:
Program received signal SIGSEGV, Segmentation fault.
[Switching to LWP 15864]
0x00000000 in ?? ()
I tried running it under valgrind with the following commandline:
valgrind --tool=memcheck --leak-check=yes --show-reachable=yes -v --num-callers=20 --track-fds=yes ./a.out
But it keeps running without any issues under valgrind.
I am puzzled as to how to debug this program. I felt that this might be some stack overflow or something but increasing the stack size (upto 74KB) didn't solved the problem.
My only query is why and where is the segmentation fault or how to debug this program.
Found the actual issue.
When the worker thread signals the parent process using sigqueue(), the parent sometimes gets the control immediately and frees up the stack before the child executes the return statement. When the same child thread uses return statement, it causes segmentation fault as the stack got corrupted.
To solve this I replaced
exit(0)
instead of
return 0;
I think i found the answer
Step 1
Replace this:
static int thread_pids[NUM_THREADS];
static void *thread_stacks[NUM_THREADS];
By this:
static int *thread_pids;
static void **thread_stacks;
Step 2
Add this in the main function (after checking arguments):
thread_pids = malloc(sizeof(int) * num_threads);
thread_stacks = malloc(sizeof(void *) * num_threads);
Step 3
Replace this:
chld_stack2 = ((char *)chld_stack + 148*512 - 1);
By this:
chld_stack2 = ((char *)chld_stack + 148*512);
In both places you use it.
I dont know if its really your problem, but after testing it i didnt get any segmentation fault. Btw i did only get segmentation faults when using more than 5 threads.
Hope i helped!
edit: tested with 1000 threads and runs perfectly
edit2: Explanation why the static allocation of thread_pids and thread_stacks causes an error.
The best way to do this is with an example.
Assume num_threads = 10;
The problem occurs in the following code:
for(t=0,i=0;t<num_threads;t++,i++) {
...
thread_pids[i]=rc;
thread_stacks[i]=chld_stack;
...
}
Here you try to access memory which does not belong to you (0 <= i <= 9, but both arrays have a size of 5). That can cause either segmentation fault or data corruption. Data corruption may happen if both arrays are allocated one after the other, resulting in writing to the other array. Segmentation can happen if you write in memory you dont have allocated (statically or dynamically).
You may be lucky and have no errors at all, but the code is surely not safe.
About the non-aligned pointer: I think i dont have to explain more than in my comment.

Resources