Akka: how to configure the router and dispatcher for better performance?

I have been playing around with Akka for two weeks and am still very confused about some basic concepts.
I have a very simple pattern which contains three kinds of actors:
master
worker
reporter
I configured these actors as follows:
Master
The master uses the following dispatcher with RoundRobinRouter(10):
mailbox-capacity = 10000
executor = "fork-join-executor"
fork-join-executor {
  parallelism-min = 0
  parallelism-max = 600
  parallelism-factor = 3.0
}
Worker
I have several workers (refs) in this system; they all receive messages from the master, and each of them uses a router of RoundRobinRouter(10).
type = Dispatcher
executor = "fork-join-executor"
fork-join-executor {
  parallelism-min = 0
  parallelism-max = 600
  parallelism-factor = 3.0
}
mailbox-capacity = 100000
Notifier
is just an actor used to receive results from the workers and count them up.
It uses the same dispatcher as the workers.
I have made some adjustments to the parallelism parameters and the router, but the performance does not seem to change. It takes 80 seconds to consume 10 million tasks, each of which takes at least 500 ms to finish.
So this is what puzzles me: if a dispatcher acts like a thread pool, what happens if an actor uses this dispatcher without a router, so that there is only one instance? Will the code in the receive block be executed in parallel?
Just in case something else in my code messed things up:
Gist
And this is the virtual machine that runs this program:
32-bit Ubuntu 12.04 LTS
Memory: 5.0 GiB
Intel® Core™ i5-2500 CPU @ 3.30GHz × 4
Sorry for putting this question so unclearly.
If there is anything that can improve this performance, please tell me.
Any advice is welcome. Thanks in advance.
Update
Sorry!
It was 10 million tasks, not 100 million!
My bad, sorry!

Related

Why ExecutorService is much faster than Coroutines in this example? [Solved]

Update:
I made 2 silly mistakes!
I submitted only 1 task in the executor-service example.
I forgot to await the tasks to finish.
Fixing the test led to all 3 examples having around 190-200 ms/op latency.
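With both mistakes fixed, the executor-service body submits all 1000 tasks and waits for them. Here is a stand-alone sketch of the corrected test, written in plain Java rather than the Kotlin benchmark harness (FixedBenchmark is a hypothetical name):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class FixedBenchmark {
    // Submit all 1000 sleeping tasks and block until every one has finished
    static long run() throws Exception {
        ExecutorService executor = Executors.newFixedThreadPool(60);
        List<Callable<Void>> tasks = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            tasks.add(() -> { Thread.sleep(10); return null; });
        }
        long start = System.nanoTime();
        executor.invokeAll(tasks);  // fix #1: all 1000 tasks; fix #2: invokeAll waits for completion
        executor.shutdown();
        return (System.nanoTime() - start) / 1_000_000;
    }

    public static void main(String[] args) throws Exception {
        // roughly ceil(1000 / 60) batches of 10 ms each, i.e. about 170 ms plus overhead
        System.out.println("elapsed ms: " + run());
    }
}
```

With 60 threads and 1000 tasks of 10 ms each, this lands in the same 190-200 ms range the fixed benchmark reports, which is consistent with all three variants converging once the mistakes are removed.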
I created a benchmark comparison using kotlinx-benchmark (which uses JMH) to compare coroutines and a thread pool when making a blocking call.
My rationale behind this benchmark:
Coroutines will block the underlying thread when making a blocking call.
A network call is generally blocking.
In an average service, I need to make a million network calls.
In such a scenario, will I get any benefit if I use coroutines?
The benchmark I created simulates the blocking call using Thread.sleep(10) // 10 ms block, and I need to create 1000 of them. I created 3 examples with the following results:
Dispatchers.IO
Used Dispatchers.IO, which is the recommended way to handle IO operations.
@Benchmark
fun withCoroutines() {
    runBlocking {
        val coroutines = (0 until 1000).map {
            CoroutineScope(Dispatchers.IO).async {
                sleep(10)
            }
        }
        coroutines.joinAll()
    }
}
Avg time: 188.418 ms/op
Fixed Threadpool
Dispatchers.IO created 64 threads (the exact number cannot be determined statically), so I kept 60 threads for a comparable scenario.
@Benchmark
fun withExecutorService() {
    val executors = Executors.newFixedThreadPool(60)
    executors.submit { sleep(10) }
    executors.shutdown()
}
Avg time: 0.054 ms/op
Threadpool Dispatcher
Since the results were shocking, I decided to use the same thread pool above as the dispatcher, as
Executors.newFixedThreadPool(60).asCoroutineDispatcher()
Avg time: 206.260 ms/op
Questions
Why are coroutines performing exceptionally badly here?
With limitedParallelism(10), coroutines performed much better, at 30 ms/op. The default number of threads used by IO is 64. Does that mean the coroutine scheduler is causing too many context switches, leading to poor performance? Still, the performance is not close to that of thread pools.
Am I correct to assume that network calls are always blocking? Both the executor service and coroutines schedule execution over underlying threads without blocking the main thread, so they are direct competitors.
Notes:
I am running jmh with
@State(Scope.Benchmark)
@Fork(1)
@Warmup(iterations = 50)
@Measurement(iterations = 5, time = 1000, timeUnit = TimeUnit.MILLISECONDS)
@OutputTimeUnit(TimeUnit.MILLISECONDS)
@BenchmarkMode(Mode.AverageTime)
The code can be found here

Too many threads issue

I have some questions about thread names in the Play Framework.
I've been developing a REST API service on Play for about 5 months.
The app simply accesses MySQL and sends back JSON-formatted data to clients.
I already understand the pitfalls of blocking I/O, so I created a thread pool for blocking I/O and use it in all the Future blocks that block thread execution.
The definition of the thread pool is as follows.
akka {
  actor-system = "myActorSystem"
  blocking-io-dispatcher {
    type = Dispatcher
    executor = "thread-pool-executor"
    thread-pool-executor {
      fixed-pool-size = 64
    }
    throughput = 10
  }
}
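For reference, the pattern this dispatcher implements — routing blocking work to its own fixed-size pool so the default dispatcher's threads stay free — can be sketched with plain JDK types (BlockingIoIsolation is a hypothetical stand-in for the Play/Akka wiring, not the framework API):

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class BlockingIoIsolation {
    // Dedicated pool for blocking calls, mirroring fixed-pool-size = 64 in the config
    static final ExecutorService blockingIoPool = Executors.newFixedThreadPool(64);

    static String blockingQuery() {
        // Stand-in for a blocking JDBC/MySQL call
        try { Thread.sleep(50); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        return "row";
    }

    public static void main(String[] args) throws Exception {
        // The blocking call runs on the dedicated pool, not on the shared default pool
        CompletableFuture<String> result =
            CompletableFuture.supplyAsync(BlockingIoIsolation::blockingQuery, blockingIoPool);
        System.out.println(result.get());  // prints "row"
        blockingIoPool.shutdown();
    }
}
```

The design point is the same as in the config above: blocking work is bounded by a fixed pool instead of starving the threads that serve non-blocking requests.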
I checked the log file and made sure that all non-blocking logic runs on threads named 'application-akka.actor.default-dispatcher-#', where # is an integer value, and that all blocking logic runs on threads named 'application-blocking-io-dispatcher-#'.
Then I checked all the thread names and counts using JConsole.
The number of threads named 'application-akka.actor.default-dispatcher-#' is always under 13, and the count of threads named 'application-blocking-io-dispatcher-#' is always under 30.
However, the total thread count of the JVM under which my app runs increases constantly. The total number of threads is more than 10,000.
There are many threads whose names start with 'default-scheduler-' or 'default-akka.actor.default-dispatcher'.
My questions are
a. What's the difference between 'application-akka.actor.default-dispatcher'
and 'default-akka.actor.default-dispatcher-' ?
b. Is there any reason thread count increases?
I want to solve this issue.
Here's my environment.
OS : Windows 10 Pro. 64bit
CPU : Intel(R) Core i7 @ 3.5GHz
RAM : 64GB
JVM : 1.8.0_162 64bit
PlayFramework : 2.6
RDBMS : MySQL 5.7.21
Any suggestions will be greatly appreciated. Thanks in advance.
Finally, I solved the problem. There was a bug that did not shut down instances of
Akka's Materializer. After modifying the code, the thread count in the VM keeps a stable value.
Thanks.

Is there any restriction on maximum number of threads created using ExecutorService

I am creating an application which requires multiple processes to run in parallel. The number of processes to run is dynamic; it depends on the input received.
E.g., if the user wants information about three different things [car, bike, auto], then I need three separate threads to run them in parallel.
numberOfThreadsNeeded = getNumberOfThingsFromInput();
ExecutorService executor = Executors.newFixedThreadPool(numberOfThreadsNeeded);
Code Snippet:
public class ConsoleController {
    private static final Log LOG = LogFactory.getLog(ConsoleController.class);
    @Autowired
    ConsoleCache consoleCache;
    Metrics metrics;
    public List<Feature> getConsoleData(List<String> featureIds, Map<String, Object> input, Metrics metrics) {
        this.metrics = metrics;
        List<FeatureModel> featureModels =
            featureIds
                .stream()
                .map(consoleCache::getFeature)
                .collect(toList());
        Integer numberOfThreadsNeeded = getThreadCount(featureModels);
        ExecutorService executor = Executors.newFixedThreadPool(numberOfThreadsNeeded);
        featureModels.stream()
            .map(f -> (Callable<Result>) () -> f.fetchData(input, metrics))
            .map(executor::submit)
            .collect(toList());
The number of threads to be created varies from 1 to 100. Is it safe to define the thread pool size during initialization?
And I also want to know whether it is safe to run 100 threads in parallel.
There is no hard limit in Java itself, but there might be a limit in, for example, the JVM implementation or the operating system. So, practically speaking, there is a limit. And there is a point where adding more threads makes performance worse, not better. There is also a possibility of running out of memory.
The way you use ExecutorService is not the way it was intended to be used. Normally you would create a single ExecutorService with the thread limit best suited to your environment.
Keep in mind that even if you really want all your tasks to be executed in parallel, you won't be able to achieve that anyway, given the hardware/software concurrency limitations.
BTW, if you still want to create an ExecutorService per request, don't forget to call its shutdown() method; otherwise the JVM won't be able to exit gracefully, as there will be threads still hanging around.
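A minimal sketch of that recommended shape — one application-wide pool that every request reuses, plus a shutdown() so the JVM can exit (SharedPoolDemo and the task shapes are illustrative, not from the question):

```java
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class SharedPoolDemo {
    // One pool for the whole application, sized to the environment rather than per request
    static final ExecutorService pool =
        Executors.newFixedThreadPool(Runtime.getRuntime().availableProcessors());

    // A "request" wanting up to 100 things still reuses the same pool;
    // tasks queue up if the pool is smaller than the request
    static int handleRequest(int things) throws Exception {
        List<Future<Integer>> futures = IntStream.range(0, things)
            .mapToObj(i -> pool.submit(() -> i * i))
            .collect(Collectors.toList());
        int sum = 0;
        for (Future<Integer> f : futures) sum += f.get();
        return sum;
    }

    public static void main(String[] args) throws Exception {
        System.out.println(handleRequest(100));  // sum of squares 0..99 = 328350
        pool.shutdown();  // without this, non-daemon worker threads keep the JVM alive
    }
}
```

Note the pool is sized to the machine, not to the request: 100 submitted tasks on an 8-core pool simply queue, which is usually what you want for CPU-bound work.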

How to check what dispatcher is configured in akka application

I have the following entry in my conf file, but I'm not sure whether this dispatcher setting is being picked up, or what the ultimate parallelism value being used is.
akka {
  actor {
    default-dispatcher {
      type = Dispatcher
      executor = "fork-join-executor"
      throughput = 3
      fork-join-executor {
        parallelism-min = 40
        parallelism-factor = 10
        parallelism-max = 100
      }
    }
  }
}
I have an 8-core machine, so I expect 80 parallel threads to be in the ready state:
40 (min) < 80 (8 × factor 10) < 100 (max). I'd like to see what value Akka is using for the maximum parallel threads.
I created 45 child actors, and in my logs I'm printing the thread id [application-akka.actor.default-dispatcher-xx], and I don't see more than 20 threads running in parallel.
In order to max out the parallelism factor, all the actors need to be processing messages at the same time. Are you sure this is the case in your application?
Take for example the following code
object Test extends App {
  val system = ActorSystem()
  (1 to 80).foreach { _ =>
    val ref = system.actorOf(Props[Sleeper])
    ref ! "hello"
  }
}

class Sleeper extends Actor {
  override def receive: Receive = {
    case msg =>
      //Thread.sleep(60000)
      println(msg)
  }
}
If you consider your config and 8 cores, you will see a small number of threads being spawned (4, 5?), as the processing of the messages is too quick for any real parallelism to build up.
On the contrary, if you keep your actors CPU-busy by uncommenting the nasty Thread.sleep, you will see the number of threads bump up to 80. However, this will only last 1 minute, after which the threads will gradually be retired from the pool.
I guess the main trick is: don't think of each actor as being run on a separate thread. It's whenever one or more messages appear in an actor's mailbox that the dispatcher awakes and, indeed, dispatches the message-processing task to a designated pool.
Assuming you have an ActorSystem instance, you can check the values set in its configuration. This is how you could get your hands on the values you've set in the config file:
val system = ActorSystem()
val config = system.settings.config.getConfig("akka.actor.default-dispatcher")
config.getString("type")
config.getString("executor")
config.getString("throughput")
config.getInt("fork-join-executor.parallelism-min")
config.getInt("fork-join-executor.parallelism-max")
config.getDouble("fork-join-executor.parallelism-factor")
I hope this helps. You can also consult this page for more details on specific configuration settings.
Update
I've dug up a bit more in Akka to find out exactly what it uses for your settings. As you might already expect it uses a ForkJoinPool. The parallelism used to build it is given by:
object ThreadPoolConfig {
...
def scaledPoolSize(floor: Int, multiplier: Double, ceiling: Int): Int =
math.min(math.max((Runtime.getRuntime.availableProcessors * multiplier).ceil.toInt, floor), ceiling)
...
}
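Plugging the question's settings into that formula shows where the 80 comes from. A quick stand-alone check (a Java port of the Scala above, with the core count passed explicitly instead of read from Runtime):

```java
public class ScaledPoolSizeCheck {
    // Mirrors Akka's ThreadPoolConfig.scaledPoolSize: clamp ceil(cores * factor) to [floor, ceiling]
    static int scaledPoolSize(int floor, double multiplier, int ceiling, int cores) {
        return Math.min(Math.max((int) Math.ceil(cores * multiplier), floor), ceiling);
    }

    public static void main(String[] args) {
        // parallelism-min = 40, parallelism-factor = 10, parallelism-max = 100
        System.out.println(scaledPoolSize(40, 10.0, 100, 8));   // 80: the factor wins on 8 cores
        System.out.println(scaledPoolSize(40, 10.0, 100, 2));   // 40: clamped up to the floor
        System.out.println(scaledPoolSize(40, 10.0, 100, 16));  // 100: clamped down to the ceiling
    }
}
```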
This function is used at some point to build a ForkJoinExecutorServiceFactory:
new ForkJoinExecutorServiceFactory(
validate(tf),
ThreadPoolConfig.scaledPoolSize(
config.getInt("parallelism-min"),
config.getDouble("parallelism-factor"),
config.getInt("parallelism-max")),
asyncMode)
Anyway, this is the parallelism that will be used to create the ForkJoinPool, which is actually an instance of java.util.concurrent.ForkJoinPool. Now we have to ask: how many threads does this pool use? The short answer is that it will use its whole capacity (80 threads in our case) only if it needs it.
To illustrate this scenario, I ran a couple of tests with various uses of Thread.sleep inside the actor. What I found out is that it can use from somewhere around 10 threads (if no sleep call is made) to around the max of 80 threads (if I call sleep for 1 second). The tests were made on a machine with 8 cores.
Summing it up, you will need to check the implementation used by Akka to see exactly how that parallelism is used, which is why I looked into ForkJoinPool. Other than looking at the config file and then inspecting that particular implementation, I don't think there is much you can do, unfortunately :(
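The capacity-versus-usage distinction is easy to see on a plain java.util.concurrent.ForkJoinPool as well (a small stand-alone illustration, not Akka code):

```java
import java.util.concurrent.ForkJoinPool;
import java.util.concurrent.TimeUnit;

public class ForkJoinDemo {
    public static void main(String[] args) {
        // Parallelism as computed from the config: a capacity, not a guaranteed thread count
        ForkJoinPool pool = new ForkJoinPool(80);
        System.out.println(pool.getParallelism());  // 80

        // With only quick CPU-bound tasks, far fewer worker threads are ever started
        for (int i = 0; i < 1000; i++) {
            pool.submit(() -> Math.sqrt(42.0));
        }
        pool.awaitQuiescence(5, TimeUnit.SECONDS);
        System.out.println(pool.getPoolSize() <= 80);  // prints true
        pool.shutdown();
    }
}
```

getParallelism() reports the configured capacity, while getPoolSize() reports the workers actually started — for fast tasks the latter typically stays well below the former.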
I hope this clarifies the answer; initially I thought you wanted to see how the actor system's dispatcher is configured.

Number of max simultaneous threads is less than max-thread-pool-size

I don't understand the behavior of GlassFish v3.1.2.
I run my Java web application with these GlassFish thread-pool parameters:
Class Name: com.sun.grizzly.http.StatsThreadPool
Max Queue Size: 4096
Max Thread Pool Size: 10
Min Thread Pool Size: 10
Idle Thread Timeout: 900
Then I send many requests to my servlet. The logic of my servlet is like this:
//do some action
Thread.sleep(5000);
The NetBeans profiler shows these results in the threads window:
http://s8.postimage.org/5hupqk4ad/profiler.png
It seems that all 10 threads were created, but only 5 can run simultaneously.
Of course, I want to use the maximum number of threads simultaneously.
Can somebody explain this behavior and suggest how to fix it?
Tell me if you need more information.
Thanks
Try checking your client side; you may have restrictions there, for example an HTTP client or browser limit on concurrent connections per host.
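One way to check whether requests are being throttled before they reach the pool is to count in-flight executions inside the handler itself. A JDK-only sketch (ConcurrencyProbe stands in for the servlet, and the client pool simulates concurrent callers):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ConcurrencyProbe {
    static final AtomicInteger inFlight = new AtomicInteger();
    static final AtomicInteger maxSeen = new AtomicInteger();

    // Stand-in for the servlet body: track concurrent entries around the sleep
    static void handleRequest() {
        int now = inFlight.incrementAndGet();
        maxSeen.accumulateAndGet(now, Math::max);
        try { Thread.sleep(200); } catch (InterruptedException e) { Thread.currentThread().interrupt(); }
        inFlight.decrementAndGet();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService clients = Executors.newFixedThreadPool(10);  // 10 concurrent "clients"
        for (int i = 0; i < 30; i++) {
            clients.submit(ConcurrencyProbe::handleRequest);
        }
        clients.shutdown();
        clients.awaitTermination(10, TimeUnit.SECONDS);
        // If the pool really allows 10 concurrent requests, this should report 10;
        // a smaller number points at a bottleneck before the handler (e.g. the client)
        System.out.println(maxSeen.get());
    }
}
```

If the same counter inside the real servlet peaks at 5 while a load tool with 10 truly independent connections is driving it, the cap is on the server side; if the counter matches whatever the client offers, the client is the limit.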
