I have a pool of threads all calling epoll_wait on an epoll instance that is watching a set of file descriptors. All of the descriptors have been added to the epoll instance with EPOLLONESHOT. Is it guaranteed that only one thread will wake up when a file descriptor is triggered? Or is it possible that multiple threads will wake up? To rephrase the question: is EPOLLONESHOT guaranteed to be thread safe and to wake only one thread?
My testing seems to indicate that only one thread will wake up, but I'm looking for either a specification that asserts this or citations to kernel code.
Related
On Windows there is the API WaitForMultipleObjects which, if one event is registered in many threads, wakes only one thread when the event occurs. I now have to port an application that uses this in its thread pool, and I am looking for the best practice for doing this on Linux.
I am aware of epoll, which can wait for fds (which I can create with pipe), but waiting on one fd in multiple threads may wake every thread on an event when only one is needed.
What would be the best practice to implement this behaviour on Linux? I really don't want to split up an event to have as many fds as there are worker threads, as this may hit the fd limit on some systems, since I have many events (which would all be split up).
What I thought about is create 1 master thread that will delegate work to an available worker (or queue the task if all workers are working), but that would mean that I have one additional context switch (and thus giving up computation time) as the master will wake up and then wake up another worker. I would do this if there is no other possibility to cleanly implement this. Unfortunately I cannot get rid of the current architecture so I need to get around this.
Is there any API that would be applicable for this kind of problem?
epoll() is the correct solution, although you could consider using eventfd() file descriptors rather than pipe() file descriptors for the event signalling. See this text from the epoll(7) man page:
If multiple threads (or processes, if child processes have inherited
the epoll file descriptor across fork(2)) are blocked in
epoll_wait(2) waiting on the same epoll file descriptor
and a file descriptor in the interest list that is marked for
edge-triggered (EPOLLET) notification becomes ready, just one of the
threads (or processes) is awoken from epoll_wait(2). This provides
a useful optimization for avoiding "thundering herd" wake-ups in some
scenarios.
So to get this single-wakeup behaviour, you have to be calling epoll_wait() in each thread on the same epoll descriptor, and you have to have registered your event-notifying file descriptors in the epoll set as edge-triggered.
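As a rough sketch of the setup described above (not a definitive implementation), the following starts a few threads that all block in epoll_wait() on the same epoll descriptor, with one eventfd() registered as edge-triggered. The names count_wakeups, waiter and NUM_WAITERS, the pool size of 4, and the 300 ms timeout are made up for illustration. The function signals the eventfd once and returns how many threads actually received the event.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

#define NUM_WAITERS 4              /* hypothetical pool size */

static int g_epfd;                 /* epoll instance shared by all waiters */
static int g_wakeups;              /* how many waiters received the event */
static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;

/* Each waiter blocks on the shared epoll instance for at most 300 ms. */
static void *waiter(void *arg)
{
    struct epoll_event ev;

    (void)arg;
    if (epoll_wait(g_epfd, &ev, 1, 300) == 1) {
        pthread_mutex_lock(&g_lock);
        g_wakeups++;
        pthread_mutex_unlock(&g_lock);
    }
    return NULL;
}

/* Signals one eventfd once; returns the number of waiters that saw it,
 * or -1 on a setup error. */
int count_wakeups(void)
{
    pthread_t tids[NUM_WAITERS];
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET };
    uint64_t one = 1;
    ssize_t written;
    int efd = eventfd(0, EFD_NONBLOCK);

    g_epfd = epoll_create1(0);
    g_wakeups = 0;
    assert(efd >= 0 && g_epfd >= 0);   /* demo only: abort on setup failure */

    ev.data.fd = efd;
    if (epoll_ctl(g_epfd, EPOLL_CTL_ADD, efd, &ev) != 0)
        return -1;

    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_create(&tids[i], NULL, waiter, NULL);

    usleep(50 * 1000);                 /* give the waiters time to block */
    written = write(efd, &one, sizeof one);

    for (int i = 0; i < NUM_WAITERS; i++)
        pthread_join(tids[i], NULL);

    close(efd);
    close(g_epfd);
    return written == (ssize_t)sizeof one ? g_wakeups : -1;
}
```

If the man page text applies to your kernel, count_wakeups() should return 1: one thread receives the event and the others time out. Treat this as a quick experiment rather than a proof.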
What I learned is if a process got blocked, it will be swapped out to the disk and wait for wake-up event. But, if a process can have multiple threads, what if a thread is blocked? For example, one of the threads waits for a keyboard event, the thread will be blocked. Then will the process also be blocked, or is it possible that only the thread is blocked and process is running?
What I learned is if a process got blocked, it will be swapped out to the disk and wait for wake-up event.
You're probably reading some very old documentation. Likely by "process" it means something scheduled by the kernel.
But, if a process can have multiple threads, what if a thread is blocked? For example, one of the threads waits for a keyboard event, the thread will be blocked. Then will the process also be blocked, or is it possible that only the thread is blocked and process is running?
If you define a "process" as a container that consists of an address space, file descriptor set and so on and that can contain more than one thread, then there is no such thing as a process being blocked. What would block a process exactly?
What if multiple threads epoll wait on the same socket?
In my own experiment, only one thread could invoke epoll_wait successfully; the other threads failed with an "Invalid argument" error. Could someone explain this?
You can call epoll_wait concurrently on multiple threads for the same epoll_fd.
event.events = EPOLLIN | EPOLLET | EPOLLONESHOT;
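To illustrate what that flag combination does, here is a minimal single-threaded sketch, assuming an eventfd() as a stand-in for a socket. The function name oneshot_rearm_demo is made up for this example. With EPOLLONESHOT the descriptor is disabled after it has been reported once, and whichever thread handled the event has to re-arm it with EPOLL_CTL_MOD before it can be reported again.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Returns 0 if the descriptor was reported once, stayed silent while
 * disarmed, and was reported again after EPOLL_CTL_MOD; -1 otherwise. */
int oneshot_rearm_demo(void)
{
    struct epoll_event ev = { .events = EPOLLIN | EPOLLET | EPOLLONESHOT };
    struct epoll_event out;
    uint64_t one = 1;
    int first, second, third;
    int efd = eventfd(0, EFD_NONBLOCK);
    int epfd = epoll_create1(0);

    assert(efd >= 0 && epfd >= 0);     /* demo only: abort on setup failure */
    ev.data.fd = efd;
    if (epoll_ctl(epfd, EPOLL_CTL_ADD, efd, &ev) != 0)
        return -1;

    /* First signal: reported once, after which the fd is disarmed. */
    if (write(efd, &one, sizeof one) < 0)
        return -1;
    first = epoll_wait(epfd, &out, 1, 0);

    /* Second signal while disarmed: nothing should be reported. */
    if (write(efd, &one, sizeof one) < 0)
        return -1;
    second = epoll_wait(epfd, &out, 1, 0);

    /* Re-arm with the same event mask; the fd is still readable,
     * so it should be reported again. */
    if (epoll_ctl(epfd, EPOLL_CTL_MOD, efd, &ev) != 0)
        return -1;
    third = epoll_wait(epfd, &out, 1, 0);

    close(efd);
    close(epfd);
    return (first == 1 && second == 0 && third == 1) ? 0 : -1;
}
```

In a thread pool, the EPOLL_CTL_MOD call would typically come from the worker once it has finished handling the descriptor, so that no other thread is handed the same descriptor in the meantime.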
http://www.csh.rit.edu/~rossdylan/presentations/EpollMT/
You can epoll_wait concurrently on multiple threads for the same fd. But epoll doesn't handle thread synchronization like IOCP. It is possible that all the threads come out of epoll_wait call when an event occurs on one of the sockets. Usually only one thread is enough to wait on epoll_wait. You can then give the task of receiving or sending data to other threads from the epoll_wait thread (polling thread).
I have several threads, one of them calls epoll_wait in a loop, others can open connections that need to be epoll'ed by first thread. Is it possible to just add new sockets with epoll_ctl while another thread waits in epoll_wait?
What will happen in the following scenario:
Thread 1 calls epoll_wait.
Thread 2 creates a socket(A) and adds it to epoll instance using epoll_ctl.
Someone sends some data, socket A becomes ready for read() call.
Will epoll_wait return socket A?
Yes, it will. The whole point of an epoll instance is that you don't have to duplicate effort. No snapshotting or use of multiple wait queues is involved.
Under the hood, the epoll instance has its own wait queue. When you block on the epoll file descriptor, you are added to that single wait queue. No state is saved or anything like that. The state is in the epoll instance itself.
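The scenario from the question can be tried with a small sketch like the one below. It assumes an eventfd() in place of socket A, and the names add_while_waiting, poller and struct poll_ctx are invented for illustration. One thread blocks in epoll_wait() on an initially empty epoll instance, and the calling thread then registers a new descriptor and makes it readable.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <pthread.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

struct poll_ctx {
    int epfd;      /* epoll instance the poller blocks on */
    int seen_fd;   /* descriptor reported by epoll_wait, or -1 */
};

/* "Thread 1": blocks in epoll_wait for up to 2 seconds. */
static void *poller(void *arg)
{
    struct poll_ctx *ctx = arg;
    struct epoll_event ev;

    if (epoll_wait(ctx->epfd, &ev, 1, 2000) == 1)
        ctx->seen_fd = ev.data.fd;
    return NULL;
}

/* Returns 0 if the waiting thread was told about a descriptor that was
 * added after it had started waiting, -1 otherwise. */
int add_while_waiting(void)
{
    struct poll_ctx ctx = { .epfd = epoll_create1(0), .seen_fd = -1 };
    struct epoll_event ev = { .events = EPOLLIN };
    uint64_t one = 1;
    pthread_t tid;
    int efd, ok;

    assert(ctx.epfd >= 0);             /* demo only: abort on setup failure */
    pthread_create(&tid, NULL, poller, &ctx);
    usleep(50 * 1000);                 /* let the poller block first */

    /* "Thread 2": create a descriptor and register it while the
     * poller is already inside epoll_wait. */
    efd = eventfd(0, EFD_NONBLOCK);
    assert(efd >= 0);
    ev.data.fd = efd;
    ok = epoll_ctl(ctx.epfd, EPOLL_CTL_ADD, efd, &ev) == 0;

    /* "Someone sends some data": make the descriptor readable. */
    ok = ok && write(efd, &one, sizeof one) == (ssize_t)sizeof one;

    pthread_join(tid, NULL);
    ok = ok && ctx.seen_fd == efd;

    close(efd);
    close(ctx.epfd);
    return ok ? 0 : -1;
}
```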
For example on windows there is MsgWaitForMultipleObjects that lets you asynchronously wait for windows messages, socket events, asynchronous io (IOCompletionRoutine), AND mutex handles.
On Unix you have select/poll that gives you everything except possibility to break out when some pthread_mutex is unlocked.
The story:
I have an application with a main thread that does something with multiple sockets, pipes or files. From time to time there is a side job (a db transaction) that might take a long time and, if done synchronously in the main thread, would disrupt normal servicing of the sockets. So I want to do the db operation in a separate thread. That thread would wait on some mutex when idle until the main thread decides to give it a job and unlocks the mutex so the db thread can grab it.
The problem is how the db thread can notify back the main thread that it has finished the job. Main thread has to process sockets, so it cannot afford sleeping in pthread_mutex_lock. Doing periodic pthread_mutex_trylock is the last I would want to do. Currently I consider using a pipe, but is this the better way?
Using a pipe is a good idea here. Make sure that no other process has the write end of the pipe open, and then select() or poll() in the main thread the read end for reading. Once your worker thread is done with the work, close() the write end. The select() in the main thread wakes up immediately.
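A minimal sketch of this pipe approach, using poll() in the main thread, could look like the following. The names db_worker and wait_for_worker are hypothetical, and the short sleep stands in for the real db transaction. In a real program the read end of the pipe would simply be one more entry in the descriptor set the main thread already polls.

```c
#define _GNU_SOURCE
#include <poll.h>
#include <pthread.h>
#include <unistd.h>

/* Worker: does its job, then closes the write end to signal completion. */
static void *db_worker(void *arg)
{
    int write_fd = *(int *)arg;

    usleep(20 * 1000);   /* stand-in for the db transaction */
    close(write_fd);     /* the read end now sees end-of-file */
    return NULL;
}

/* Returns 0 if the main thread was woken by the worker finishing. */
int wait_for_worker(void)
{
    int fds[2];
    pthread_t tid;
    struct pollfd pfd;
    char byte;
    int ready;
    ssize_t got = -1;

    if (pipe(fds) != 0)
        return -1;
    pthread_create(&tid, NULL, db_worker, &fds[1]);

    /* Main thread: wait on the read end, alongside any other fds. */
    pfd.fd = fds[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    ready = poll(&pfd, 1, 2000);

    /* A read() returning 0 means the write end has been closed. */
    if (ready == 1)
        got = read(fds[0], &byte, 1);

    pthread_join(tid, NULL);
    close(fds[0]);
    return (ready == 1 && got == 0) ? 0 : -1;
}
```

Closing the write end works for a single notification. If the worker is reused for several jobs, writing one byte per finished job and reading it in the main thread is a common variation.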
I don't think waiting on a mutex and something else would be possible, because on Linux, mutexes are implemented with the futex(2) system call, which doesn't support file descriptors.
I don't know how well it applies to your specific problem, but POSIX has message queues.