Can I call accept() for one socket from several threads simultaneously? - linux

I am using Linux 3.2.0, x86_64.
Can I call accept() for one socket from several threads simultaneously?

Yes, you can call accept() on the same listening socket from multiple threads and from multiple processes, though there might not be as much point to it as you think. The kernel hands each incoming connection to only one of the waiting callers. When this is done with processes it is known as pre-forking, and it saves the expense of a fork() for every new connection. But when you are dealing with threads, you can more easily have an existing thread pool that waits on a queue of new connections. One thread does the accept and writes to the queue, and the worker threads read from the queue and do their thing. It's cleaner, it's a well-understood pattern, and you lose almost nothing.
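To make the first point concrete, here is a minimal sketch (not taken from the answer) in which several threads block in accept() on one listening socket while the same number of clients connect over loopback. The names run_shared_accept_demo, accept_one and struct acceptor are made up for illustration, and the SO_RCVTIMEO call is only a safety net for the demo, relying on Linux applying that timeout to accept().

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

enum { MAX_ACCEPTORS = 16 };

/* Hypothetical per-thread record: the shared listener and a result flag. */
struct acceptor { int listener; int accepted; };

/* Thread body: block in accept() on the shared listening socket. */
void *accept_one(void *arg)
{
    struct acceptor *a = arg;
    int conn = accept(a->listener, NULL, NULL);
    if (conn >= 0) {
        a->accepted = 1;   /* this thread received one connection */
        close(conn);
    }
    return NULL;
}

/* Starts n threads that all call accept() on one listener, opens n client
   connections over loopback, and returns how many threads accepted one.
   Returns -1 on setup failure. */
int run_shared_accept_demo(int n)
{
    if (n < 1 || n > MAX_ACCEPTORS) return -1;

    int listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener < 0) return -1;

    struct sockaddr_in addr = {0};
    socklen_t len = sizeof addr;
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks one */
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(listener, MAX_ACCEPTORS) < 0 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) < 0) {
        close(listener);
        return -1;
    }

    /* Demo safety net (assumption: Linux): SO_RCVTIMEO also bounds accept(),
       so a thread cannot hang forever if a client connect() fails. */
    struct timeval tv = { .tv_sec = 5, .tv_usec = 0 };
    setsockopt(listener, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    pthread_t tids[MAX_ACCEPTORS];
    struct acceptor acc[MAX_ACCEPTORS];
    int clients[MAX_ACCEPTORS];
    int started = 0;
    for (; started < n; started++) {
        acc[started].listener = listener;
        acc[started].accepted = 0;
        if (pthread_create(&tids[started], NULL, accept_one, &acc[started]) != 0)
            break;
    }

    /* One client connection per acceptor thread. */
    for (int i = 0; i < started; i++) {
        clients[i] = socket(AF_INET, SOCK_STREAM, 0);
        connect(clients[i], (struct sockaddr *)&addr, sizeof addr);
    }

    int total = 0;
    for (int i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        total += acc[i].accepted;
    }
    for (int i = 0; i < started; i++)
        close(clients[i]);
    close(listener);
    return total;
}
```

Calling run_shared_accept_demo(4) should report that each of the four threads accepted one connection. No locking around accept() is involved; the kernel does the distribution.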

Related

Is a mutex needed on a listener socket shared between child processes?

I'm developing a server application using C++. I designed it in such a way that there will be a main process responsible for maintaining child processes (workers). Workers accept() new connections and create threads to handle them individually.
Suppose I create a listener socket in the main process and each worker monitors it (using kqueue, epoll, etc.) for new connections. After researching a bit, I found some claims that a mutex is needed on the listener socket to prevent concurrent accept()s, which would otherwise lead to workers accept()ing the same connection at the same time.
Being aware of that need, I'm not sure what the best way to distribute client connections among workers is, since the result would be the same as accept()ing them in the main process and somehow sending just the new socket FD to the workers (handling of new connections becomes serialized: one accept() at a time).
My question is: is a mutex on the listening socket really needed? Am I right about its side effect of serializing accept() (one new connection accept()ed at a time)?
I'm concerned about this single detail because this application must scale up to thousands of new connections per second (the exact number may vary, as this application is intended to be used on networks with anywhere from hundreds to thousands of clients).
A long time ago there were operating systems that had race conditions if multiple processes performed an accept concurrently on the same socket. Apache used to have an optional accept mutex to resolve this.
This problem has long since been solved on every operating system you're likely to use and it's perfectly reasonable to use a shared socket that workers call accept on. If you want each worker to handle only one connection at a time, an idle worker can block in accept on a shared socket.
"I'm concerned about this single detail because this application must scale up to hundreds of thousands or even millions of new connections per second. I want to avoid the work of writing two complex applications for the sole purpose of comparing the performance of both methods. Also, I have no way to simulate real-world simultaneous connections."
You can't have it both ways. Either you abandon such ambitious scaling plans or you accept that you will have numerous major efforts on your hands. Just simulating that kind of connection load for testing would be a major effort.
I can't answer the part of your question about how thread-safe the listen() and accept() calls are, because I would never even consider trying that. What I would do is have the main thread do the listen() and accept(), and start a new thread (or, as in the pseudocode below, fork a child process) when accept() returns, passing the socket off to it.
Similarly, you could have a bunch of running threads and use a mutex to protect a shared variable through which the socket is handed over. This is basically the same as above, but rather than creating a thread at accept time, you notify an already running thread of the socket descriptor. General pseudocode might be:
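#include <signal.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    /* make_listener() is a placeholder for socket() + bind() + listen() */
    int listener = make_listener();
    signal(SIGCHLD, SIG_IGN);          /* let the kernel reap finished children */
    while (1)
    {
        int conn = accept(listener, NULL, NULL);
        if (conn < 0)
            continue;                  /* e.g. interrupted by a signal; retry */
        if (fork() == 0)
        {
            close(listener);           /* the child only serves this client */
            DoMyThing(conn);
            _exit(0);                  /* never fall back into the accept loop */
        }
        close(conn);                   /* the parent keeps only the listener */
    }
}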
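The second variant described above (already running threads that are notified of the socket descriptor) is usually built around a small mutex-protected queue. The following is one possible sketch, not the answerer's code; struct fd_queue, the fd_queue_* functions and the capacity of 64 are all assumptions made for the example.

```c
#include <pthread.h>

#define FD_QUEUE_CAP 64

/* Bounded FIFO of socket descriptors shared by the acceptor and the workers. */
struct fd_queue {
    int fds[FD_QUEUE_CAP];
    int head;                      /* index of the oldest entry */
    int count;                     /* number of entries currently queued */
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
    pthread_cond_t not_full;
};

void fd_queue_init(struct fd_queue *q)
{
    q->head = 0;
    q->count = 0;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    pthread_cond_init(&q->not_full, NULL);
}

/* Called by the accepting thread; blocks while the queue is full. */
void fd_queue_push(struct fd_queue *q, int fd)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == FD_QUEUE_CAP)
        pthread_cond_wait(&q->not_full, &q->lock);
    q->fds[(q->head + q->count) % FD_QUEUE_CAP] = fd;
    q->count++;
    pthread_cond_signal(&q->not_empty);   /* wake one waiting worker */
    pthread_mutex_unlock(&q->lock);
}

/* Called by worker threads; blocks while the queue is empty. */
int fd_queue_pop(struct fd_queue *q)
{
    pthread_mutex_lock(&q->lock);
    while (q->count == 0)
        pthread_cond_wait(&q->not_empty, &q->lock);
    int fd = q->fds[q->head];
    q->head = (q->head + 1) % FD_QUEUE_CAP;
    q->count--;
    pthread_cond_signal(&q->not_full);    /* wake the acceptor if it was blocked */
    pthread_mutex_unlock(&q->lock);
    return fd;
}
```

The accepting thread would call fd_queue_push() with each descriptor returned by accept(), and every worker would loop on fd_queue_pop(). The bounded capacity gives some back-pressure: when all workers are busy and the queue is full, the acceptor simply stops accepting for a while.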

Multi threaded Linux Socket programming design

I am trying to write a server program which so far supports one client, and over the few days I have been developing it, I concluded I needed threads. The reason for this decision is that I take input from a wifi socket, process it, and finally write it to a file. The processing is slow, so I needed an input thread -> circular buffer -> output thread pattern with a producer-consumer model, which is quite common in network programming.
Now the situation becomes complicated, as I need to manage client disconnection and reconnection. I thought of using pthread_exit(), cleaning up all the semaphores, and then re-initializing them each time the single client reconnects.
My question is whether this is an efficient approach, i.e. killing the threads and semaphores and re-creating them every time. Are there any better solutions?
Thanks.
"My question is whether this is an efficient approach, i.e. killing the threads and semaphores and re-creating them every time. Are there any better solutions?"
Learn how to use non-blocking sockets and an event loop. Or use a library that provides TCP sessions for you using non-blocking sockets under the hood. Such as boost::asio.
Learn how to use multi-threading without polluting your code with any synchronization primitives by using message passing to communicate between threads, not shared state. The event loop library you use for non-blocking I/O should also provide means for cross-thread message passing.
Some comments and suggestions.
1-In TCP, detecting that the other side has silently disconnected is very difficult, if not impossible. A client could disconnect by sending a TCP RST or a FIN to the server; this is the good case. Sometimes the client disconnects without notice (crash, cable disconnection, etc.).
One suggestion here is that you consider the way client and server will communicate. For example, you can use the select function to set a timeout for receiving a message from the client and so detect a silent client.
Additionally, depending on the programming language and operating system, you may need to handle the broken pipe signal (SIGPIPE, on Linux with C/C++), which is raised when a server tries to send a message through a connection closed by the client.
2-Regarding semaphores, you shouldn't need to clean up semaphores in any special way when a client disconnects. Applying the common good practices of locking and unlocking mutexes should be enough. The same goes for resources like file descriptors: you need to release them before ending the thread, either by returning from the thread start function or with pthread_exit. Maybe I didn't understand this part of the question.
3-Regarding threads: if you work with multiple threads, the optimum is to have a pool of pre-created consumer/worker threads that check the circular buffer to consume the next available connection. Creating and destroying threads is costly for the operating system.
Threads are resource consuming and you may exhaust operating system resources if you need to create 1,000 threads for example.
Another alternative is to have only one consumer thread that manages all connections (sockets) asynchronously: a) Each connection has its own state. b) The main thread goes through all connections and uses the select function to detect when a connection is ready for reading or writing. c) Non-blocking sockets can be used, but this is not essential, because select tells you which sockets are ready and will not block.
You can use functions select, poll, epoll.
One link about select and non-blocking sockets: Using select() for non-blocking sockets
Other link with an example: http://linux.die.net/man/2/select
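The single-thread alternative from point 3 can be sketched with poll, one of the functions mentioned above. This is only an outline under assumptions: struct conn, service_once and the limit of 64 connections are hypothetical, and a real server would parse the data instead of just counting bytes.

```c
#include <poll.h>
#include <sys/types.h>
#include <unistd.h>

#define MAX_CONNS 64

/* Per-connection state kept by the single event-loop thread. */
struct conn {
    int fd;
    size_t bytes_seen;   /* stand-in for real protocol state */
    int closed;          /* set once the peer closed or a read failed */
};

/* One pass of the event loop: poll every open connection, read from the
   ones that are ready, and update their state. Returns the number of
   connections serviced in this pass, or -1 on error. */
int service_once(struct conn *conns, int nconns, int timeout_ms)
{
    struct pollfd pfds[MAX_CONNS];
    if (nconns < 0 || nconns > MAX_CONNS)
        return -1;

    for (int i = 0; i < nconns; i++) {
        /* poll() ignores entries whose fd is negative */
        pfds[i].fd = conns[i].closed ? -1 : conns[i].fd;
        pfds[i].events = POLLIN;
        pfds[i].revents = 0;
    }
    if (poll(pfds, (nfds_t)nconns, timeout_ms) < 0)
        return -1;

    int serviced = 0;
    for (int i = 0; i < nconns; i++) {
        if (!(pfds[i].revents & (POLLIN | POLLHUP | POLLERR)))
            continue;
        char buf[4096];
        ssize_t n = read(conns[i].fd, buf, sizeof buf);
        if (n > 0)
            conns[i].bytes_seen += (size_t)n;
        else
            conns[i].closed = 1;   /* 0: peer closed; -1: treated as closed in this sketch */
        serviced++;
    }
    return serviced;
}
```

The main thread would call service_once() in a loop and close the descriptors of connections marked closed. Because poll reports readiness before each read, the read does not block here even on blocking sockets, which matches the remark that non-blocking mode is not essential.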

Multithreaded socket server using libev

I'm implementing a socket server.
All clients (up to 10k) are supposed to stay connected.
Here's my current design:
The main thread creates an event loop (use epoll by default) and a watcher for accepting clients.
The accept callback
Accept fd and set it to non-blocking mode.
Add watcher for the fd to monitor read events.
The read callback
Read data and add a task to thread pool to send response.
Is it OK to move read part to thread pool, or any other better idea?
Thanks.
Hard to say. You don't want 10k threads running in the background. You should keep the read part in the main thread. This way, if all clients suddenly start asking for things, the requests pile up only in the thread pool queue (you don't end up with 10k threads running at the same time). You might also get better performance this way, because you avoid some unnecessary context switches (between your own threads).
On the other hand if your clients are unlikely to send requests at the same time, or if the replies are very simple, it might be simpler to just have one thread per client, and avoid the context switch between the main thread and the thread pool.
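For the "set it to non-blocking mode" step of the accept callback in the question, a common approach is fcntl with O_NONBLOCK. A minimal sketch, where set_nonblocking is a name chosen for this example:

```c
#include <fcntl.h>

/* Put fd into non-blocking mode while preserving its other status flags.
   Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);
    if (flags < 0)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0 ? -1 : 0;
}
```

The accept callback would call set_nonblocking() on the new descriptor before registering the read watcher. On Linux, accept4() with the SOCK_NONBLOCK flag should achieve the same thing in a single call.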

using zmq PUSH socket per producer thread vs dedicated zmq thread for all threaded producers

It's explicitly stated in the ZeroMQ guide that sockets must not be shared between threads. In case of multiple threaded producers who need to PUSH their output via zmq, I see two possible design patterns:
0mq socket per producer thread
single 0mq socket in a separate thread
In the first case, each thread handles its own affairs. In the latter, you need a thread-safe queue to which all producers write and from which the 0mq thread reads and then sends.
What are the factors for choosing between these two patterns? What are the pros and cons of each?
A lot depends on how many producers there are.
If there are only a handful, then having one socket per thread is manageable, and works well.
If there are many, then a producer-consumer queue with a single socket pushing (effectively being a consumer of the queue and the single producer for downstream sockets) is probably going to be faster. Having lots of sockets running is not without cost.
The main pro of the first case is that it is much more easily scaled out to separate processes for each producer, each one single-threaded with its own socket.
I've asked a similar question.
You can use a pool of worker threads where each worker has a dedicated 0mq socket via ThreadLocal, ensuring sockets are used and destroyed in the threads that created them.
You can also use a pool of sockets, perhaps backed by an ArrayBlockingQueue, and just take/replace sockets whenever you need them. This approach is less safe than the dedicated socket approach because it shares socket objects (synchronously) among different threads; you should be OK since Java handles the locking, but it's not the approach 0mq recommends.
Hope it helps...

Linux socket using multiple threads to send

I have a single non-blocking socket sending udp packets to multiple targets and receiving responses from all of them on the same socket. I'm reading in a dedicated thread but writes (sendto) can come from several different threads.
Is this safe without any additional synchronization? Do I need to hold a mutex while writing? Or do writes need to come from the same thread, so that I need a queue?
The kernel will synchronize access to the underlying file descriptor for you, so you don't need a separate mutex. There would be a problem with this approach if you were using TCP, but since we are talking about UDP this should be safe, though not necessarily the best way.
You can write to the socket from a single or multiple threads. If you write to a socket from multiple threads, they should be synchronized with a mutex. If instead your threads place their messages in a queue and a single thread pulls from the queue to do the writes, reads and writes to/from the queue should be protected by a mutex.
Reading and writing to the same socket from different threads won't interfere with each other.
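The first answer's claim about UDP can be tried out with a small loopback experiment. The sketch below is an illustration, not a proof: several threads call sendto() on one shared socket without any mutex, and the receiver counts the datagrams that arrive whole, each filled with a single letter. The names run_udp_demo, send_burst and struct sender, and the sizes used, are arbitrary choices for the example. UDP itself gives no delivery guarantee, so lost datagrams are possible in general even though none are expected over loopback at this volume.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <pthread.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

enum { SENDERS = 4, PER_SENDER = 8, MSG_LEN = 32 };

struct sender { int sock; struct sockaddr_in dest; char fill; };

/* Thread body: send PER_SENDER datagrams, each consisting of MSG_LEN copies
   of this sender's letter. There is no mutex around sendto(). */
void *send_burst(void *arg)
{
    struct sender *s = arg;
    char msg[MSG_LEN];
    memset(msg, s->fill, sizeof msg);
    for (int i = 0; i < PER_SENDER; i++)
        sendto(s->sock, msg, sizeof msg, 0,
               (struct sockaddr *)&s->dest, sizeof s->dest);
    return NULL;
}

/* Returns the number of datagrams received whole and unmixed,
   or -1 on setup failure. */
int run_udp_demo(void)
{
    int rx = socket(AF_INET, SOCK_DGRAM, 0);
    int tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx < 0 || tx < 0)
        return -1;

    struct sockaddr_in addr = {0};
    socklen_t len = sizeof addr;
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);   /* port 0: kernel picks one */
    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        getsockname(rx, (struct sockaddr *)&addr, &len) < 0)
        return -1;

    /* Do not wait forever if a datagram is lost. */
    struct timeval tv = { .tv_sec = 2, .tv_usec = 0 };
    setsockopt(rx, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof tv);

    pthread_t tids[SENDERS];
    struct sender s[SENDERS];
    for (int i = 0; i < SENDERS; i++) {
        s[i].sock = tx;                  /* every thread shares the same socket */
        s[i].dest = addr;
        s[i].fill = (char)('A' + i);
        pthread_create(&tids[i], NULL, send_burst, &s[i]);
    }
    for (int i = 0; i < SENDERS; i++)
        pthread_join(tids[i], NULL);

    int intact = 0;
    for (int n = 0; n < SENDERS * PER_SENDER; n++) {
        char buf[MSG_LEN + 1];
        ssize_t got = recv(rx, buf, sizeof buf, 0);
        if (got < 0)
            break;                       /* timed out: something was dropped */
        int ok = (got == MSG_LEN);
        for (ssize_t k = 1; ok && k < got; k++)
            ok = (buf[k] == buf[0]);     /* a mixed datagram would fail here */
        intact += ok;
    }
    close(rx);
    close(tx);
    return intact;
}
```

Each sendto() call produces exactly one datagram, which is why concurrent senders do not interleave within a message. Whether a queue with a single sending thread is still preferable for other reasons (ordering, rate limiting) is a separate design question.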

Resources