Akka dispatcher and router - multithreading

After reading the Akka documentation and also some posts online, I still don't have a clear understanding of the relationship between a router and a dispatcher.
1) Does a router always use a dispatcher for dispatching to the routees? Can a router do its job without using a dispatcher?
2) If there are no additional dispatchers defined in the configuration, my understanding is that the default dispatcher will be used. In my actor system, I have a cluster with two producer actors that use the router actor and three consumer actors. The producers and consumers are all running in different JVMs--what does it mean for an actor system to have one default dispatcher?
My understanding is that a dispatcher is like a thread pool executor. In this case, in different JVMs, wouldn't each JVM have its own instance of a dispatcher and its own thread pool executor?
3) Related to the above question (https://doc.akka.io/docs/akka/current/dispatchers.html#problem-blocking-on-default-dispatcher):
Using context.dispatcher as the dispatcher on which the blocking Future executes can be a problem, since this dispatcher is by default used for all other actor processing unless you set up a separate dispatcher for the actor.
If the actors are running in different JVMs, is the above still applicable? If so, what does it mean?

(1a) Does a router always use a dispatcher for dispatching to a routee?
Yes.
(1b) Can a router do its job without using a dispatcher?
No. All actors, regardless of whether or not they are routers, run on a dispatcher.
(2) ...in different JVMs, wouldn't each JVM have its own instance of a dispatcher and its own thread pool executor?
Yes, essentially. If your system consists of multiple JVMs, then each JVM will have its own ActorSystem (for example, using Akka Cluster). Each ActorSystem configures its own dispatcher(s) independently of any other ActorSystem. [1] If you don't add a dispatcher, the default dispatcher will be used.
(3) "Using context.dispatcher as the dispatcher on which the blocking Future executes can be a problem, since this dispatcher is by default used for all other actor processing unless you set up a separate dispatcher for the actor."
If the actors are running in different JVMs, is the above still applicable? If so, what does it mean?
Yes, the guidelines about dealing with blocking operations would apply if you have actors running on multiple JVMs. Each JVM would have its own ActorSystem, and each ActorSystem would need to set up a dedicated dispatcher to deal with blocking operations, as the documentation you quoted recommends.
[1] In fact, you can have more than one ActorSystem on a JVM. From the documentation:
Several actor systems with different configurations may co-exist within the same JVM without problems, there is no global shared state within Akka itself.

Some small corrections to Jeffrey's otherwise great answer: it is possible to run a router that routes messages on the calling thread (see the first example in the docs), and that thread could be an arbitrary non-actor thread, so routing would not in itself require a dispatcher.
The actor the router routes to, however, like any other actor will always be running on a dispatcher.
It is also quite common to run a router as a separate actor, and in that case it will run on a dispatcher (described in the second section of the router docs).
The mailbox is the queue of messages for an actor, and putting a message in it leads to the actor being scheduled on the dispatcher to process that message (or a few in one batch). When a mailbox is empty, the actor is not scheduled to execute, which means that a large number of actors can share a dispatcher with a small number of threads.
If one of those actors takes "a few minutes" to execute, that can lead to starvation: no other actor gets to execute, including the actors that deal with cluster state and the internals of Akka. It is therefore important to isolate such actors on their own dispatcher. See the "Blocking Needs Careful Management" section of the docs.
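To make the starvation point concrete, here is a minimal sketch that uses plain java.util.concurrent thread pools as a stand-in for dispatchers. It is not Akka code: the class name DispatcherIsolationSketch, the pool size of 2, and the 500 ms wait are assumptions chosen only for illustration. It fills a pool with tasks that block and then checks whether a quick task still gets to run.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration: thread pools stand in for Akka dispatchers.
class DispatcherIsolationSketch {

    /**
     * Submits as many blocking tasks as the shared pool has threads, then one quick task.
     * Returns true if the quick task ran while the blocking tasks were still blocked.
     * When isolate is true, the blocking tasks run on their own dedicated pool.
     */
    static boolean quickTaskRunsWhileOthersBlock(boolean isolate) {
        int poolSize = 2;
        ExecutorService sharedPool = Executors.newFixedThreadPool(poolSize);
        ExecutorService blockingPool = isolate ? Executors.newFixedThreadPool(poolSize) : sharedPool;
        CountDownLatch release = new CountDownLatch(1);   // keeps the "blocking" tasks blocked
        CountDownLatch quickDone = new CountDownLatch(1); // signals that the quick task ran
        try {
            for (int i = 0; i < poolSize; i++) {
                blockingPool.execute(() -> {
                    try {
                        release.await(); // simulates a long blocking call
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            }
            sharedPool.execute(quickDone::countDown);
            // Give the quick task a short window to run.
            return quickDone.await(500, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        } finally {
            release.countDown();     // unblock everything so the pools can drain
            sharedPool.shutdown();
            blockingPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println("shared pool, quick task ran:    " + quickTaskRunsWhileOthersBlock(false));
        System.out.println("dedicated pool, quick task ran: " + quickTaskRunsWhileOthersBlock(true));
    }
}
```

With a shared pool the quick task sits in the queue behind the blocked threads; with a dedicated pool for the blocking work it runs right away. In Akka the equivalent step would be configuring a separate dispatcher for the blocking actors, as the docs describe.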

Related

Tuning gRPC thread pool

I'm dealing with a legacy synchronous server that has operations running for up to a minute and exposes 3 ports to overcome this problem. There is a "light-requests" port, a "heavy-but-important" requests port, and a "heavy" port.
They all expose the same service, but since they run on separate ports, they end up with dedicated thread pools.
Now this approach is running into a problem with load balancing, as Envoy can't handle a single service exposing the same proto on 3 different ports.
I'm trying to come up with a single threadpool configuration that would work (probably an extremely overprovisioned one), but I can't find any documentation on what the threadpool settings actually do.
NUM_CQS: Number of completion queues.
MIN_POLLERS: Minimum number of polling threads.
MAX_POLLERS: Maximum number of polling threads.
CQ_TIMEOUT_MSEC: Completion queue timeout in milliseconds.
Is there some reason why you need the requests split into three different thread pools? By default, there is no limit to the number of request handler threads. The sync server will spawn a new thread for each request, so the number of threads will be determined by the number of concurrent requests -- the only real limit is what your server machine can handle. (If you actually want to bound the number of threads, I think you can do so via ResourceQuota::SetMaxThreads(), although that's a global limit, not one per class of requests.)
Note that the request handler threads are independent from the number of polling threads set via MIN_POLLERS and MAX_POLLERS, so those settings probably aren't relevant here.
UPDATE: Actually, I just learned that my description above, while correct in a practical sense, got some of the internal details wrong. There is actually just one thread pool for both polling and request handlers. When a request comes in, an existing polling thread basically becomes a request handler thread, and when the request handler completes, that thread is available to become a polling thread again. The MIN_POLLERS and MAX_POLLERS options allow tuning the number of threads that are used for polling: when a polling thread becomes a request handler thread, if there are not enough polling threads remaining, a new one will be spawned, and when a request handler finishes, if there are too many polling threads, the thread will terminate. But none of this affects the number of threads used for request handlers -- that is still unbounded by default.

What is the name of multithreading design pattern that uses asynchronous requests instead of synchronization with mutexes?

I'm wondering what is an "official" name for the design pattern where you have a single thread that actually handles some resource (database, file, communication interface, network connection, log, ...) and other threads that wish to do something with that resource have to pass a message to this thread and - optionally - wait for a notification about completion?
I've found some articles that refer to this method as "Central Controller", but googling doesn't give much about that particular phrase.
On the other hand, this is not exactly a "message pump" or "event queue", because it's not related to a GUI or to the operating system passing messages to the application.
It's also not "work queue" or "thread pool", as this single thread is dedicated only to this single activity (managing single resource), not meant to be used to do just about anything that is thrown at it.
For example, assume that there's a special communication interface managed by one thread (say Modbus, but the specific interface really doesn't matter). This interface is completely hidden inside an object with its own thread and a message queue. This object has member functions that allow callers to "read" or "write" data using that communication interface, and these functions can be used by multiple threads without any special synchronization. That's because internally these functions convert their arguments to a message/request and pass it via the queue to the handler thread, which just serves these requests one at a time.
This design pattern may be used instead of explicit synchronization with a mutex protecting the shared resource, which would have to be locked/unlocked by each thread that wishes to interact with that resource.
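To make the description concrete, here is a minimal Java sketch of such an object. Everything in it is hypothetical: the class name SingleOwnerResource, the in-memory map standing in for the real communication interface, and the read/write signatures are assumptions for illustration, not any particular library's API.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.LinkedBlockingQueue;

// Hypothetical sketch: one handler thread owns the resource; callers only enqueue requests.
class SingleOwnerResource implements AutoCloseable {
    private static final Runnable STOP = () -> { };  // sentinel request that ends the handler loop

    // Stand-in for the real interface. Only the handler thread touches it, so it needs no lock.
    private final Map<Integer, Integer> registers = new HashMap<>();
    private final BlockingQueue<Runnable> requests = new LinkedBlockingQueue<>();
    private final Thread handler;

    SingleOwnerResource() {
        handler = new Thread(() -> {
            try {
                while (true) {
                    Runnable request = requests.take(); // serve requests one at a time, in FIFO order
                    if (request == STOP) {
                        break;
                    }
                    request.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "resource-handler");
        handler.start();
    }

    /** Callable from any thread; the returned future completes once the write has been applied. */
    CompletableFuture<Void> write(int address, int value) {
        CompletableFuture<Void> done = new CompletableFuture<>();
        requests.add(() -> {
            registers.put(address, value);
            done.complete(null);
        });
        return done;
    }

    /** Callable from any thread; unwritten addresses read as 0 in this sketch. */
    CompletableFuture<Integer> read(int address) {
        CompletableFuture<Integer> result = new CompletableFuture<>();
        requests.add(() -> result.complete(registers.getOrDefault(address, 0)));
        return result;
    }

    @Override
    public void close() {
        requests.add(STOP); // requests queued before this point are still served
    }

    public static void main(String[] args) {
        try (SingleOwnerResource bus = new SingleOwnerResource()) {
            bus.write(7, 123);                      // caller does not wait for completion
            System.out.println(bus.read(7).join()); // caller waits for the result
        }
    }
}
```

Callers never lock anything: ordering comes from the queue, and waiting for completion is optional because each call returns a future.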
The best pattern that fits here may be the Broker pattern:
The Broker architectural pattern can be used to structure distributed
software systems with decoupled components that interact by remote
service invocations. A broker component is responsible for
coordinating communication, such as forwarding requests, as well as
for transmitting results and exceptions.
I would simply call it asynchronous IO, or more catchy: non-blocking IO.
After all, does it really matter what that "single thread side" is doing in detail? Does it make a difference whether you deal "async" with a database or with some other remote server?
The key attribute is that your code is not waiting for answers, but expecting information to flow in "later".

Only single netty thread is running

I am using Netty camel-netty:jar:2.10.0.redhat-60024.
Below is my configuration of Netty listener
netty:tcp://10.1.33.204:9001?textline=true&autoAppendDelimiter=true&delimiter=LINE&keepAlive=true&synchronous=false&orderedThreadPoolExecutor=false&sendBufferSize=2000&receiveBufferSize=2000&decoderMaxLineLength=2000&workerCount=20
Based on the debug log, I see that Netty is creating only one worker thread, so incoming messages are blocked until the existing message is processed.
Like:
2014-08-23 12:36:48,394 | DEBUG | w I/O worker #5 | NettyConsumer
| ty.handlers.ServerChannelHandler 85 | 126 -
org.apache.camel.camel-netty - 2.10.0.redhat-60024
While the 5-minute process is running, I see only this thread active. Only when this thread sends the response does it accept the next request.
For TCP, Netty creates a number of worker threads, and assigns each connection to a specific worker thread. All events for that channel are handled by that single thread (note it can be more complex, but that's sufficient for this answer).
It sounds like you're processing your message in the Netty worker thread. Therefore you're blocking processing of any further events on that connection, and all other connections assigned to the worker thread, until your process returns.
Netty is actually creating multiple worker threads. You can see in the debug message that your channel is being handled by I/O worker #5. By default Netty creates 2 * Runtime.getRuntime().availableProcessors() worker threads, but each connection is handled by a single thread unless you intervene.
It's not clear whether you can process requests concurrently and out of order, or whether ordering is important. If ordering is important, you can tell Camel to use the ordered thread pool executor. This will process the request in a separate thread pool, but subsequent requests on the same connection will still be blocked by the first request.
If ordering is not important, you have a few options. Given that Camel appears to be using Netty 3 and allows you to create a custom pipeline, you could use Netty's MemoryAwareThreadPoolExecutor to process requests concurrently. If you do this, perhaps take a look at "What happens when shared MemoryAwareThreadPoolExecutor's threshold is reached?".
Camel may offer other mechanisms to help but I'm not overly familiar with Camel. The SEDA component might be a good place to start.

How is an event based model(Node.js) more efficient than thread based model(Apache) for serving http requests?

In Apache, we have a single thread for each incoming request. Each thread consumes its own memory space. The memory spaces don't collide with each other, so each request serves its purpose.
How does this work in Node.js, which has single-threaded execution? A single memory space is used by all incoming requests. Why don't the requests collide with each other? What differentiates them?
As you noticed yourself, an event-based model shares the available memory more efficiently, because it avoids the overhead of setting up a new stack for every request.
However, to make an event-based or single-threaded model non-blocking, you have to fall back on threads somewhere, and this is where Node's "I/O engine", libuv, comes in.
libuv supplies an API which, underneath, manages I/O tasks in a thread pool when an I/O task is done asynchronously. Using a thread pool means the main process is not blocked by I/O; however, long-running JavaScript computations can still block it (this is why there is the cluster module, which allows spawning multiple worker processes).
I hope this answers your question; if not, feel free to comment!

using zmq PUSH socket per producer thread vs dedicated zmq thread for all threaded producers

It's explicitly stated in the ZeroMQ guide that sockets must not be shared between threads. In case of multiple threaded producers who need to PUSH their output via zmq, I see two possible design patterns:
0mq socket per producer thread
single 0mq socket in a separate thread
In the first case, each thread handles its own affairs. In the latter, you need a thread-safe queue to which all producers write and from which the 0mq thread reads and then sends.
What are the factors for choosing between these two patterns? What are the pros\cons of each?
A lot depends on how many producers there are.
If there are only a handful, then having one socket per thread is manageable, and works well.
If there are many, then a producer-consumer queue with a single socket pushing (effectively a consumer of the queue and the single producer for downstream sockets) is probably going to be faster. Having lots of sockets running is not without cost.
The main pro of the first case is that it is much more easily scaled out to separate processes for each producer, each one single-threaded with its own socket.
I've asked a similar question.
You can use a pool of worker threads where each worker has a dedicated 0mq socket via ThreadLocal, ensuring sockets are used and destroyed in the threads that created them.
You can also use a pool of sockets, perhaps backed by an ArrayBlockingQueue, and just take/return sockets whenever you need them. This approach is less safe than the dedicated-socket approach because it shares socket objects (synchronously) among different threads; you should be OK since Java handles locking, but it's not the approach 0mq recommends.
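Here is a rough sketch of the first option, a worker pool with one socket per thread via ThreadLocal. To keep it self-contained it does not use a real 0mq binding: FakePushSocket is a hypothetical stand-in that only checks it is used by the thread that created it, and the class and method names are assumptions for illustration. Socket cleanup on thread exit is left out.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical sketch: each worker thread lazily gets its own socket through a ThreadLocal.
class PerThreadSocketSketch {

    /** Stand-in for a 0mq PUSH socket; it refuses to be used from a thread other than its creator. */
    static final class FakePushSocket {
        private final Thread owner = Thread.currentThread();
        private final AtomicInteger sentCounter;

        FakePushSocket(AtomicInteger sentCounter) {
            this.sentCounter = sentCounter;
        }

        void send(String message) {
            if (Thread.currentThread() != owner) {
                throw new IllegalStateException("socket used outside its owning thread");
            }
            sentCounter.incrementAndGet(); // a real socket would push the message here
        }
    }

    /** Runs the workers and returns { number of sockets created, number of messages sent }. */
    static int[] run(int workers, int messages) {
        AtomicInteger sent = new AtomicInteger();
        Set<FakePushSocket> created = ConcurrentHashMap.newKeySet();
        // The initializer runs at most once per thread, on that thread's first get().
        ThreadLocal<FakePushSocket> socket = ThreadLocal.withInitial(() -> {
            FakePushSocket s = new FakePushSocket(sent);
            created.add(s);
            return s;
        });

        ExecutorService pool = Executors.newFixedThreadPool(workers);
        for (int i = 0; i < messages; i++) {
            String message = "msg-" + i;
            pool.execute(() -> socket.get().send(message));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new int[] { created.size(), sent.get() };
    }

    public static void main(String[] args) {
        int[] result = run(4, 100);
        System.out.println("sockets created: " + result[0] + ", messages sent: " + result[1]);
    }
}
```

The number of sockets is bounded by the number of worker threads rather than by the number of messages, and no socket ever crosses a thread boundary.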
Hope it helps...

Resources