I have two processes: a producer which pushes messages via ZMQ to a consumer in a simple PULL-PUSH point-to-point pattern. The producer has several internal threads that send() via zmq. However, 0MQ's docs suggest not to share sockets between threads.
Must I use a single thread to send?
Assuming there is no strict requirement for keeping the sending order between the threads, doesn't the fact that the socket is a one-directional simplex allow multiple threads to use it without introducing locks?
The easiest thing to do is to create a separate PUSH socket on each of the producer's threads and connect all these sockets to a single PULL socket in the consumer.
It's explicitly stated in the guide that ZeroMQ sockets must be used on a single thread. I'd say that violating this requirement is not a good idea, even if it seems to work: things may break in the next version of the library or on some specific platform or in some specific load scenario. So, it's just too risky.
I am building an application that sends large amounts of data from a client to a server via UDP. I have some questions:
Should I use one thread or multiple threads to send data?
If I use multiple threads, should I use one socket for all threads or one socket per thread?
Thanks,
Should I use one thread or multiple threads to send data?
Either way can work, so it's mostly a matter of personal preference. If it was me, I would use a single thread rather than multiple threads, because multiple threads are a lot harder to implement correctly, and in this case they won't buy you any additional performance, since your throughput bottleneck is almost certainly going to be either your hard disk or your network card, not the speed of your CPU core(s).
If I use multiple threads, should I use one socket for all threads or one socket per thread?
Again, either way will work (for UDP), but if it was me, I would use one socket per thread, only because then you don't have to worry so much about race conditions during process-setup and process-shutdown (i.e. each thread simply creates and destroys its own separate/private socket, so there's no worrying about who does what to the socket when)
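To make the "one socket per thread" option concrete, here is a minimal Python sketch. It is an illustration only: the loopback receiver, the function names (`send_chunks`, `run_demo`) and the message format are assumptions made for the demo, not part of the original question.

```python
import socket
import threading


def send_chunks(dest, chunks):
    """Each sender thread creates, uses, and closes its own private UDP socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        for chunk in chunks:
            sock.sendto(chunk, dest)


def run_demo(n_threads=3, per_thread=5):
    # Receiver bound to an ephemeral loopback port (illustrative only).
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as receiver:
        receiver.bind(("127.0.0.1", 0))
        receiver.settimeout(2.0)
        dest = receiver.getsockname()

        threads = [
            threading.Thread(
                target=send_chunks,
                args=(dest, [f"t{t}-m{i}".encode() for i in range(per_thread)]),
            )
            for t in range(n_threads)
        ]
        for th in threads:
            th.start()
        for th in threads:
            th.join()

        received = []
        try:
            while len(received) < n_threads * per_thread:
                data, _ = receiver.recvfrom(2048)
                received.append(data)
        except socket.timeout:
            pass  # UDP may drop datagrams; the demo just stops waiting
        return received
```

Because no socket is shared, there is nothing to coordinate at startup or shutdown: each thread's socket lives exactly as long as the `with` block in that thread.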
I'm working on an application where I want to use ZeroMQ to connect nodes of different types which may be added and removed while the system is running. This means that I want to call zmq_connect() or zmq_disconnect() at any time as nodes come and go.
Some connections use sockets of type ZMQ_REQ, which block when no peers are available. Thus, it may happen that one node is blocked in a zmq_recv() without any node available to process the request. If a new node then becomes available, I would like to connect the socket using zmq_connect(). The only way I can see to do that is to call zmq_connect() from a different thread. But the documentation states pretty clearly that zmq_socket instances cannot be used from multiple threads simultaneously.
How can I solve this problem: sending messages on a ZMQ_REQ socket without any connections (or with connections that cannot be established), then adding connections later and having the waiting requests processed?
You should not call zmq_recv() when no messages are ready. That way you avoid blocking your thread. Instead, check that there is indeed a message to receive. The easiest way to achieve this is with a poller. Since you haven't stated which library or language you're using, I can't give you the right example, but I guess the C example from the ZeroMQ Guide's examples could be of use.
In my experience, ZeroMQ-based applications are built most effectively from single-threaded nodes that react to messages and, if necessary, run methods at timed intervals.
For building a system like you talk about I suggest you look at the Service Discovery chapter of the awesome ZeroMQ Guide.
I am trying to write a server program which so far supports one client, and over the few days I have been developing it, I concluded that I need threads. The reason is that I take input from a Wi-Fi socket, process it, and finally write it to a file. The processing is slow, so I need an input thread -> circular buffer -> output thread pattern, a producer-consumer model which is quite common in network programming.
Now the situation becomes complicated, as I need to manage client disconnection and reconnection. I thought of using pthread_exit() and cleaning up all the semaphores, then re-initializing them each time the single client reconnects.
My question is: is this an efficient approach, i.e. killing the threads and semaphores and re-creating them every time? Are there any better solutions?
Thanks.
My question is: is this an efficient approach, i.e. killing the threads and semaphores and re-creating them every time? Are there any better solutions?
Learn how to use non-blocking sockets and an event loop, or use a library such as boost::asio that provides TCP sessions for you using non-blocking sockets under the hood.
Learn how to use multi-threading without polluting your code with any synchronization primitives by using message passing to communicate between threads, not shared state. The event loop library you use for non-blocking I/O should also provide means for cross-thread message passing.
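As a rough sketch of the message-passing idea applied to the input thread -> buffer -> output thread pipeline from the question: the writer thread below stays alive across client sessions, and a disconnect is just another message. This is a Python illustration under stated assumptions; the `SESSION_END` and `SHUTDOWN` markers and the function names are hypothetical, a bounded `queue.Queue` stands in for the circular buffer, and appending to a list stands in for writing a file.

```python
import queue
import threading

SESSION_END = object()  # hypothetical marker: the client disconnected
SHUTDOWN = object()     # hypothetical marker: stop the writer thread


def writer_loop(inbox, sessions):
    """Consume chunks; the thread outlives individual client sessions."""
    current = []
    while True:
        item = inbox.get()
        if item is SHUTDOWN:
            break
        if item is SESSION_END:
            sessions.append(b"".join(current))  # stand-in for flushing a file
            current = []
        else:
            current.append(item)


def run_demo():
    inbox = queue.Queue(maxsize=8)  # bounded, like a circular buffer
    sessions = []
    writer = threading.Thread(target=writer_loop, args=(inbox, sessions))
    writer.start()

    # Two simulated client sessions; a real input thread would recv() here.
    for session in ([b"ab", b"cd"], [b"ef"]):
        for chunk in session:
            inbox.put(chunk)        # blocks when the buffer is full
        inbox.put(SESSION_END)      # reconnect handling: no thread teardown

    inbox.put(SHUTDOWN)
    writer.join()
    return sessions
```

The only synchronization lives inside the queue, so no semaphores have to be destroyed and re-created when the client reconnects.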
Some comments and suggestions.
1-In TCP, detecting that the other side has silently disconnected is very difficult, if not impossible. A client can disconnect by sending a TCP RST or FIN to the server; this is the good case. Sometimes the client disconnects without notice (crash, cable disconnection, etc.).
One suggestion here is to consider how the client and server will communicate. For example, you can use the “select” function to set a timeout for receiving a message from the client, and so detect a silent client.
Additionally, depending on the programming language and operating system, you may need to handle the broken pipe (SIGPIPE) signal (on Linux, with C/C++) when a server tries to send a message through a connection closed by the client.
2-Regarding semaphores, you shouldn’t need to clean them up in any special way when a client disconnects. Applying the common good practices of locking and unlocking mutexes should be enough. Resources like file descriptors also need to be released before the thread ends, whether it returns from the thread start function or calls pthread_exit. Maybe I didn’t understand this part of the question.
3-Regarding threads: if you work with multiple threads, the optimum is to have a pool of pre-created consumer/worker threads that check the circular buffer to consume the next available connection. Creating and destroying threads is costly for the operating system.
Threads consume resources, and you may exhaust operating system resources if you need to create, for example, 1,000 threads.
Another alternative is to have only one consumer thread that manages all connections (sockets) asynchronously: a) Each connection has its own state. b) The main thread goes through all connections and uses “select” to detect when a connection is ready for reading or writing. c) Non-blocking sockets can be used, but this is not essential, because select tells you which sockets are ready and will not block.
You can use functions select, poll, epoll.
One link about select and non-blocking sockets: Using select() for non-blocking sockets
Other link with an example: http://linux.die.net/man/2/select
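The select-with-timeout suggestion from point 1 can be sketched as follows. This is a minimal Python illustration, not the answerer's code: `recv_with_timeout` is a hypothetical helper name, and `socket.socketpair()` stands in for a real client/server connection so the example is self-contained.

```python
import select
import socket


def recv_with_timeout(conn, timeout_s):
    """Return received bytes, b"" if the peer closed, or None on timeout."""
    readable, _, _ = select.select([conn], [], [], timeout_s)
    if not readable:
        return None         # nothing arrived in time: possibly a silent client
    return conn.recv(4096)  # b"" here means the peer closed the connection


def run_demo():
    server_side, client_side = socket.socketpair()
    results = []
    with server_side, client_side:
        results.append(recv_with_timeout(server_side, 0.1))  # client is silent
        client_side.sendall(b"ping")
        results.append(recv_with_timeout(server_side, 1.0))  # data available
        client_side.close()
        results.append(recv_with_timeout(server_side, 1.0))  # orderly close
    return results
```

The three outcomes map onto the cases above: a timeout (silent client), normal data, and an orderly close. A peer that crashed without sending FIN or RST looks like the timeout case, which is why the timeout has to be part of the protocol design.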
It's explicitly stated in the ZeroMQ guide that sockets must not be shared between threads. In case of multiple threaded producers who need to PUSH their output via zmq, I see two possible design patterns:
0mq socket per producer thread
single 0mq socket in a separate thread
In the first case, each thread handles its own affairs. In the latter, you need a thread-safe queue to which all producers write and from which the 0mq thread reads and then sends.
What are the factors for choosing between these two patterns? What are the pros and cons of each?
A lot depends on how many producers there are.
If there are only a handful, then having one socket per thread is manageable, and works well.
If there are many, then a producer-consumer queue with a single socket pushing (effectively a consumer of the queue and a single producer for downstream sockets) is probably going to be faster. Having lots of sockets running is not without cost.
The main pro of the first case is that it is much more easily scaled out to separate processes for each producer, each one single-threaded with its own socket.
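The second pattern (a thread-safe queue drained by one dedicated sender thread) might be sketched like this in Python. To keep the example free of any ZeroMQ binding, the socket is replaced by a plain `send` callable; in a real producer that callable would be the send method of a PUSH socket created and used only inside the sender thread. The `STOP` sentinel and all function names are assumptions made for illustration.

```python
import queue
import threading

STOP = object()  # hypothetical sentinel telling the sender thread to exit


def sender_loop(outbox, send):
    """The only thread that ever touches the socket (here: the `send` callable)."""
    while True:
        msg = outbox.get()
        if msg is STOP:
            break
        send(msg)


def producer(outbox, name, count):
    for i in range(count):
        outbox.put(f"{name}:{i}".encode())


def run_demo(n_producers=4, per_producer=3):
    outbox = queue.Queue()
    sent = []  # stand-in for the wire; only the sender thread appends to it
    sender = threading.Thread(target=sender_loop, args=(outbox, sent.append))
    sender.start()

    producers = [
        threading.Thread(target=producer, args=(outbox, f"p{n}", per_producer))
        for n in range(n_producers)
    ]
    for p in producers:
        p.start()
    for p in producers:
        p.join()

    outbox.put(STOP)  # queued after every producer has finished
    sender.join()
    return sent
```

Messages from a single producer keep their relative order, while messages from different producers interleave arbitrarily, which matches the "no strict ordering between threads" assumption.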
I've asked a similar question.
You can use a pool of worker threads, like this, where each worker has a dedicated 0mq socket via ThreadLocal, ensuring sockets are used and destroyed in the threads that created them.
You can also use a pool of sockets, perhaps backed by an ArrayBlockingQueue, and just take/replace sockets whenever you need them. This approach is less safe than the dedicated-socket approach because it shares socket objects (one thread at a time) amongst different threads; you should be OK since the blocking queue handles the locking, but it's not the approach 0mq recommends.
Hope it helps...
I've read the C10K doc as well as many related papers on scaling up a socket server. All roads point to the following:
Avoid the classic mistake of "thread per connection".
Prefer epoll over select.
Likewise, the legacy async I/O mechanisms in Unix may be hard to use.
My simple TCP server just listens for client connections on a listen socket on a dedicated port. Upon receiving a new connection, it parses the request and sends a response back, then gracefully closes the socket.
I think I have a good handle on how to scale this up on a single thread using epoll. Just one loop that calls epoll_wait for the listen socket as well as for the existing client connections. Upon return, the code will handle creating new client connections as well as managing the state of existing connections, depending on which socket just got signaled. And perhaps some logic to manage connection timeouts, graceful closing of sockets, and efficient resource allocation for each connection. Seems straightforward enough.
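For reference, the single-threaded loop described above can be sketched in Python with the standard `selectors` module, which uses epoll on Linux. This is a toy under stated assumptions: the "request" is whatever one `recv` returns, the response is a simple echo, and the names `serve` and `run_demo` are made up. The demo runs the loop in a helper thread only so that a test client can live in the same process.

```python
import selectors
import socket
import threading


def serve(listener, stop):
    """Single-threaded event loop: accept, read one request, reply, close."""
    sel = selectors.DefaultSelector()  # epoll-backed on Linux
    listener.setblocking(False)
    sel.register(listener, selectors.EVENT_READ, data="listener")
    try:
        while not stop.is_set():
            for key, _ in sel.select(timeout=0.1):
                if key.data == "listener":
                    conn, _ = listener.accept()
                    conn.setblocking(False)
                    sel.register(conn, selectors.EVENT_READ, data="client")
                else:
                    conn = key.fileobj
                    request = conn.recv(4096)
                    if request:
                        # Fine for tiny replies; a real server would buffer
                        # output and wait for EVENT_WRITE instead.
                        conn.sendall(b"echo:" + request)
                    sel.unregister(conn)
                    conn.close()
    finally:
        sel.close()


def run_demo():
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    stop = threading.Event()
    loop = threading.Thread(target=serve, args=(listener, stop))
    loop.start()
    try:
        with socket.create_connection(listener.getsockname(), timeout=2) as c:
            c.sendall(b"hello")
            chunks = []
            while True:  # read until the server closes the connection
                data = c.recv(4096)
                if not data:
                    break
                chunks.append(data)
    finally:
        stop.set()
        loop.join()
        listener.close()
    return b"".join(chunks)
```

Connection timeouts and per-connection state would hang off the `data` field of each registered key in a fuller version.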
But what if I want to scale this to take advantage of multiple threads and multiple cpu cores? The core idea that springs to mind is this:
One dedicated thread for listening for incoming connections on the TCP listen socket. Then a set of N threads (or thread pool) to handle all the active concurrent client connections. Then invent some thread safe way in which the listen thread will "dispatch" the new connection (socket) to one of the available worker threads. (ala IOCP in Windows). The worker thread will use an epoll loop on all the connections it is handling to do what the single threaded approach would do.
Am I on the right track? Or is there a standard design pattern for doing a TCP server with epoll on multiple threads?
Suggestions on how the listen thread would dispatch a new connection to the thread pool?
Firstly, note that it's C*10K*. Don't concern yourself if you're less than about 100 (on a typical system). Even then it depends on what your sockets are doing.
Yes, but keep in mind that epoll manipulation requires system calls, which may or may not cost more than managing a few fd_sets yourself. The same goes for poll. At low counts it's cheaper to do the processing in user space each iteration.
Asynchronous IO is very painful when you're not constrained to just a few sockets that you can juggle as required. Most people cope by using event loops, but this fragments and inverts your program flow. It also usually requires making use of large, unwieldy frameworks for this purpose since a reliable and fast event loop is not easy to get right.
The first question is, do you need this? If you're handily coping with the existing traffic by spawning off threads to handle each incoming request, then keep doing it this way. The code will be simpler for it, and all your libraries will play nicely.
As I mentioned above, juggling simultaneous requests can be complex. If you want to do this in a single loop, you'll also need to make guarantees about CPU starvation when generating your responses.
The dispatch model you proposed is the typical first step solution if your responses are expensive to generate. You can either fork or use threads. The cost of forking or generating a thread should not be a consideration in selecting a pooling mechanism: rather you should use such a mechanism to limit or order the load placed on the system.
Batching sockets onto multiple epoll loops is excessive. Use multiple processes if you're this desperate. Note that it's possible to accept on a socket from multiple threads and processes.
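The last remark, that several threads can accept on the same listening socket, removes the need for a separate dispatch thread in simple cases. A minimal Python sketch, assuming a toy protocol in which each worker replies with its own name plus the request (the names `worker`, `recv_all` and `run_demo` are hypothetical):

```python
import socket
import threading


def worker(listener, stop, name):
    """Each worker blocks in accept() on the shared listening socket."""
    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue  # wake up periodically to check `stop`
        with conn:
            conn.settimeout(2.0)
            request = conn.recv(4096)
            conn.sendall(name.encode() + b":" + request)


def recv_all(sock):
    """Read until the peer closes the connection."""
    chunks = []
    while True:
        data = sock.recv(4096)
        if not data:
            return b"".join(chunks)
        chunks.append(data)


def run_demo(n_workers=3, n_requests=6):
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen()
    listener.settimeout(0.1)  # lets workers notice the stop flag
    stop = threading.Event()
    workers = [
        threading.Thread(target=worker, args=(listener, stop, f"w{i}"))
        for i in range(n_workers)
    ]
    for w in workers:
        w.start()
    replies = []
    try:
        for i in range(n_requests):
            with socket.create_connection(listener.getsockname(), timeout=2) as c:
                c.sendall(f"req{i}".encode())
                replies.append(recv_all(c))
    finally:
        stop.set()
        for w in workers:
            w.join()
        listener.close()
    return replies
```

The kernel hands each incoming connection to exactly one of the waiting threads, so which worker answers a given request is not deterministic.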
I would guess you are on the right track. But I also think the details depend on the particular situation (bandwidth, request patterns, individual request processing, etc.). I think you should try it and benchmark carefully.