Kotlin coroutines thread pools and Java equivalent - multithreading

I see from the docs for val IO: CoroutineDispatcher that the dispatcher it creates is limited to 64 threads unless we set a specific limit.
This seems to me to be equivalent to a newFixedThreadPool.
Is there something in Kotlin coroutines equivalent to a newCachedThreadPool?

I believe there is no direct equivalent of newCachedThreadPool in Kotlin coroutines, but you can convert one to a CoroutineDispatcher by applying the asCoroutineDispatcher() extension function:
Executors.newCachedThreadPool().asCoroutineDispatcher()

Related

How do I limit the Tokio threadpool to a certain number of native threads?

What's the correct way of limiting the Tokio (v 0.1.11) threadpool to n OS native threads, where n is an arbitrary number, preferably configurable at runtime?
As far as I can tell, it's possible to use Tokio in single-threaded mode by using tokio_current_thread::block_on_all instead of tokio::run and tokio_current_thread::spawn instead of tokio::spawn.
I'd like a similar solution but for n >= 1.
You can build a Tokio Runtime object using tokio::runtime::Builder. The builder offers a core_threads() method that can be used to configure the number of threads, e.g.
let mut rt = runtime::Builder::new()
    .core_threads(4)
    .build()
    .unwrap();
You can then use rt.spawn(some_future) to run a future on this runtime.

Multi-Threading in PyQt 5

I've been learning about multi-threading, specifically in the context of a PyQt 5 application.
Initially I implemented a version using 'threading', but have since learnt that I should be using 'QThread' to allow use of signals / slots, e.g:
workerThread = QThread()
workerObject = Worker(cmdlist)
workerObject.moveToThread(workerThread)
workerThread.started.connect(workerObject.run)
workerObject.finished.connect(workerThread.quit)
However, is it possible to design a system in which:
* Each class is associated with a thread created at run-time.
* The 'main' component of the program can then call functions within those classes, which are executed within the separate thread for the given class.
An example of the behaviour would be this:
thread = threading.Thread(target=self.run, args=())
But how would I implement similar behaviour with QThread?
Or is my understanding of threads in Python incorrect?
Martin Fitzpatrick has an amazing guide on how to do this using QThreadPools. I think this is what you're looking for.
Multithreading PyQt applications with QThreadPool

Scala Parallel Collections: Change default Pool

SHORT VERSION
I'm looking for a way to set once and for all what Pool to use globally when I call the .par function of a collection...
Up to now I found only how to set the number of threads in the global ExecutionContext but not how to change the actual Pool used by default.
I merely want to explicitly specify the ForkJoinPool to make the parallel collections ExecutionContext independent from the Scala version I use.
LONG VERSION
This requirement came up after we ran into issues because Scala 2.10 doesn't support JDK 1.8.
Scala simply didn't recognize the Java version and thought we were still on 1.5, so the pool was of a different type and the number of threads wasn't limited to the number of processors.
The problem is caused by this code:
if (scala.util.Properties.isJavaAtLeast("1.6")) new ForkJoinTaskSupport
else new ThreadPoolTaskSupport
def isJavaAtLeast(version: String) = {
  val okVersions = version match {
    case "1.5" => List("1.5", "1.6", "1.7")
    case "1.6" => List("1.6", "1.7")
    case "1.7" => List("1.7")
    case _ => Nil
  }
  okVersions exists (javaVersion startsWith _)
}
Because how we manage threads is quite critical in our application, and we don't want surprises just from changing a version, I wondered if it was possible to force Scala to use a ForkJoinPool with a preset number of threads decided by us GLOBALLY (I don't want the single-instance solution described in "Scala Parallel Collections: How to know and configure the number of threads").
Hope it's clear enough!
From my point of view, your question contains two different requirements:
One is "I merely want to explicitly specify the ForkJoinPool to make the parallel collections ExecutionContext independent from the Scala version I use."
I'm not aware this is possible. Above all things, I'm made skeptical by the constructor class ForkJoinTaskSupport(val environment: ForkJoinPool). This constructor is being called with the ForkJoinPool backing the current execution context used by .par, which is the Global one if I'm not mistaken. A few layers later, we realize that this pool is defined here in ExecutionContextImpl :
def createExecutorService: ExecutorService = {
  [...]
  val desiredParallelism = range(
    getInt("scala.concurrent.context.minThreads", "1"),
    getInt("scala.concurrent.context.numThreads", "x1"),
    getInt("scala.concurrent.context.maxThreads", "x1"))
  val threadFactory = new DefaultThreadFactory(daemonic = true)
  try {
    new ForkJoinPool(
      desiredParallelism,
      threadFactory,
      uncaughtExceptionHandler,
      true) // Async all the way baby
  } catch {
    [...]
  }
}
So it's not exactly a pool you can change, but it's still a pool you can definitely configure, which would address the reformulation of your requirement, namely "I wondered if it was possible to force Scala to use ForkJoinPool with a preset number of threads decided by us GLOBALLY".
Full disclaimer : I never tried to do so, since I have not needed it so far, but your question made me wanna investigate a bit!
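A minimal sketch of what that configuration might look like, assuming the system properties quoted above are read once, when the global execution context is first initialised. The object and helper names (GlobalPoolConfig, workerThreads) are made up for illustration, and whether .par actually picks up this pool depends on the Scala version, so treat it as something to try rather than a confirmed fix:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object GlobalPoolConfig {
  // Assumption: these properties are only consulted when ExecutionContext.global
  // is first initialised, so they are set here before anything touches it.
  // Passing -Dscala.concurrent.context.numThreads=4 on the command line
  // should have the same effect.
  System.setProperty("scala.concurrent.context.minThreads", "1")
  System.setProperty("scala.concurrent.context.numThreads", "4")
  System.setProperty("scala.concurrent.context.maxThreads", "4")

  // Hypothetical helper: run n small tasks on the global context and
  // collect the names of the worker threads that executed them.
  def workerThreads(n: Int): Set[String] = {
    implicit val ec: ExecutionContext = ExecutionContext.global
    val names = Future.traverse((1 to n).toList) { _ =>
      Future { Thread.sleep(5); Thread.currentThread.getName }
    }
    Await.result(names, 30.seconds).toSet
  }

  def main(args: Array[String]): Unit =
    println("distinct worker threads: " + workerThreads(200).size)
}
```

Counting distinct worker thread names is only a rough way to check that the limit was applied; it says nothing about which pool the parallel collections end up using.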

Asynchronous IO in Scala with futures

Let's say I'm getting a (potentially big) list of images to download from some URLs. I'm using Scala, so what I would do is :
import scala.actors.Futures._
// Retrieve URLs from somewhere
val urls: List[String] = ...
// Download image (blocking operation)
val fimages: List[Future[...]] = urls.map (url => future { download url })
// Do something (display) when complete
fimages.foreach (_.foreach (display _))
I'm a bit new to Scala, so this still looks a little like magic to me :
1. Is this the right way to do it? Any alternatives if it is not?
2. If I have 100 images to download, will this create 100 threads at once, or will it use a thread pool?
3. Will the last instruction (display _) be executed on the main thread, and if not, how can I make sure it is?
Thanks for your advice!
Use Futures in Scala 2.10. They were joint work between the Scala team, the Akka team, and Twitter to reach a more standardized future API and implementation for use across frameworks. We just published a guide at: http://docs.scala-lang.org/overviews/core/futures.html
Beyond being completely non-blocking (by default, though we provide the ability to do managed blocking operations) and composable, Scala 2.10's futures come with an implicit thread pool to execute your tasks on, as well as some utilities to manage timeouts.
import scala.concurrent.{future, blocking, Future, Await}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._
// Retrieve URLs from somewhere
val urls: List[String] = ...
// Download image (blocking operation)
val imagesFuts: List[Future[...]] = urls.map {
  url => future { blocking { download(url) } }
}
// Do something (display) when complete
val futImages: Future[List[...]] = Future.sequence(imagesFuts)
Await.result(futImages, 10.seconds).foreach(display)
Above, we first import a number of things:
future: API for creating a future.
blocking: API for managed blocking.
Future: Future companion object which contains a number of useful methods for collections of futures.
Await: singleton object used for blocking on a future (transferring its result to the current thread).
ExecutionContext.Implicits.global: the default global thread pool, a ForkJoin pool.
duration._: utilities for managing durations for timeouts.
imagesFuts remains largely the same as what you originally did. The only difference is that we wrap the download in blocking, the managed-blocking construct. It notifies the thread pool that the block of code you pass to it contains long-running or blocking operations. This allows the pool to temporarily spawn new workers so that the workers can never all be blocked at once, which prevents starvation (locking up the thread pool) in blocking applications. The thread pool also knows when the code in a managed blocking block has completed, so it removes the spare worker thread at that point and the pool shrinks back down to its expected size.
(If you want to absolutely prevent additional threads from ever being created, then you ought to use an AsyncIO library, such as Java's NIO library.)
Then we use the collection methods of the Future companion object to convert imagesFuts from List[Future[...]] to a Future[List[...]].
The Await object is how we ensure that display is executed on the calling thread: Await.result simply forces the current thread to wait until the future it is passed has completed. (This uses managed blocking internally.)
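To make the List[Future[...]] to Future[List[...]] step concrete, here is a small self-contained sketch. The download function is a hypothetical stand-in (it just sleeps and returns the URL length), and it uses the capitalised Future { ... } constructor rather than the lowercase future method shown above:

```scala
import scala.concurrent.{Await, Future, blocking}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SequenceDemo {
  // Hypothetical stand-in for a slow, blocking download.
  def download(url: String): Int = {
    Thread.sleep(50)
    url.length
  }

  def downloadAll(urls: List[String]): List[Int] = {
    // One future per URL; `blocking` marks the body as a blocking call.
    val futures: List[Future[Int]] =
      urls.map(url => Future(blocking(download(url))))
    // List[Future[Int]] => Future[List[Int]], preserving the input order.
    val all: Future[List[Int]] = Future.sequence(futures)
    // Block the calling thread until every download has finished.
    Await.result(all, 10.seconds)
  }

  def main(args: Array[String]): Unit =
    println(downloadAll(List("http://a.example/1.png", "http://b.example/22.png")))
}
```

The results come back in the same order as the input list, regardless of which download finishes first.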
val all = Future.traverse(urls) { url =>
  val f = future(download(url)) /*(downloadContext)*/
  f.onComplete(display)(displayContext)
  f
}
Await.result(all, ...)
Use scala.concurrent.Future in 2.10 (a release candidate at the time of writing), which uses an implicit ExecutionContext.
The new Future doc is explicit that onComplete (and foreach) may evaluate immediately if the value is available. The old actors Future does the same thing. Depending on what your requirement is for display, you can supply a suitable ExecutionContext (for instance, a single thread executor). If you just want the main thread to wait for loading to complete, traverse gives you a future to await on.
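As a rough illustration of supplying a dedicated context for the display step, the sketch below uses a single-thread executor in the role of displayContext. Everything named here (DisplayContextDemo, download, displayThreads) is hypothetical, and it uses map rather than onComplete so the caller can wait for the display step itself:

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object DisplayContextDemo {
  // Hypothetical stand-in for the blocking download in the question.
  def download(url: String): String = "image:" + url

  // Returns the names of the threads on which the "display" step ran.
  def displayThreads(urls: List[String]): Set[String] = {
    // A single-thread executor plays the role of displayContext here.
    val displayContext =
      ExecutionContext.fromExecutorService(Executors.newSingleThreadExecutor())
    try {
      val all = Future.traverse(urls) { url =>
        Future(download(url)) // runs on the implicit global pool
          .map { image =>
            // the real display call would go here
            println("displaying " + image)
            Thread.currentThread.getName
          }(displayContext) // runs on the single display thread
      }
      Await.result(all, 10.seconds).toSet
    } finally {
      displayContext.shutdown() // release the display thread when done
    }
  }

  def main(args: Array[String]): Unit =
    println(displayThreads(List("a.png", "b.png", "c.png")))
}
```

For a real GUI you would still hand the work to the toolkit's own event thread; this only shows how an explicit ExecutionContext pins a stage of the pipeline to one thread.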
1. Yes, seems fine to me, but you may want to investigate the more powerful twitter-util or Akka Future APIs (Scala 2.10 will have a new Future library in this style).
2. It uses a thread pool.
3. No, it won't. You need to use the standard mechanism of your GUI toolkit for this (SwingUtilities.invokeLater for Swing or Display.asyncExec for SWT). E.g.
fimages.foreach (_.foreach(im => SwingUtilities.invokeLater(new Runnable { def run(): Unit = display(im) })))

How can I execute multiple tasks in Scala?

I have 50,000 tasks and want to execute them with 10 threads.
In Java I would create a pool with Executors.newFixedThreadPool(10), pass runnables to it, and then wait for all of them to be processed. As I understand it, Scala is especially well suited to this kind of task, but I can't find a solution in the docs.
Scala 2.9.3 and later
The simplest approach is to use the scala.concurrent.Future class and associated infrastructure. The scala.concurrent.future method asynchronously evaluates the block passed to it and immediately returns a Future[A] representing the asynchronous computation. Futures can be manipulated in a number of non-blocking ways, including mapping, flatMapping, filtering, recovering errors, etc.
For example, here's a sample that creates 10 tasks, where each task sleeps an arbitrary amount of time and then returns the square of the value passed to it.
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global
val tasks: Seq[Future[Int]] = for (i <- 1 to 10) yield future {
  println("Executing task " + i)
  Thread.sleep(i * 1000L)
  i * i
}
val aggregated: Future[Seq[Int]] = Future.sequence(tasks)
val squares: Seq[Int] = Await.result(aggregated, 15.seconds)
println("Squares: " + squares)
In this example, we first create a sequence of individual asynchronous tasks that, when complete, provide an int. We then use Future.sequence to combine those async tasks into a single async task, swapping the position of the Future and the Seq in the type. Finally, we block the current thread for up to 15 seconds while waiting for the result. In the example, we use the global execution context, which is backed by a fork/join thread pool. For non-trivial examples, you probably would want to use an application-specific ExecutionContext.
Generally, blocking should be avoided when at all possible. There are other combinators available on the Future class that can help program in an asynchronous style, including onSuccess, onFailure, and onComplete.
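As a sketch of that asynchronous style, the snippet below composes the result with map and reacts to it with onComplete instead of calling Await.result. It sticks to onComplete because onSuccess and onFailure were deprecated in later Scala releases. The names (CallbackDemo, sumOfSquares) are made up for the example, and the latch exists only to keep a short-lived demo JVM alive until the callback has run:

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global
import scala.util.{Failure, Success}

object CallbackDemo {
  // Compose without blocking: the sum is computed once all squares are ready.
  def sumOfSquares(n: Int): Future[Int] =
    Future.sequence((1 to n).map(i => Future(i * i))).map(_.sum)

  def main(args: Array[String]): Unit = {
    val done = new CountDownLatch(1) // demo only: wait for the callback before exiting
    sumOfSquares(10).onComplete { outcome =>
      outcome match {
        case Success(total) => println("Sum of squares: " + total)
        case Failure(error) => println("Failed: " + error)
      }
      done.countDown()
    }
    done.await(5, TimeUnit.SECONDS)
  }
}
```

In a long-running application the latch would not be needed; the callback simply fires whenever the combined future completes.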
Also, consider investigating the Akka library, which provides actor-based concurrency for Scala and Java, and interoperates with scala.concurrent.
Scala 2.9.2 and prior
The simplest approach is to use Scala's Future class, which is a sub-component of the Actors framework. The scala.actors.Futures.future method creates a Future for the block passed to it. You can then use scala.actors.Futures.awaitAll to wait for all tasks to complete.
For example, here's a sample that creates 10 tasks, where each task sleeps an arbitrary amount of time and then returns the square of the value passed to it.
import scala.actors.Futures._
val tasks = for (i <- 1 to 10) yield future {
  println("Executing task " + i)
  Thread.sleep(i * 1000L)
  i * i
}
val squares = awaitAll(20000L, tasks: _*)
println("Squares: " + squares)
You want to look at either the Scala actors library or Akka. Akka has cleaner syntax, but either will do the trick.
So it sounds like you need to create a pool of actors that know how to process your tasks. An Actor can basically be any class with a receive method - from the Akka tutorial (http://doc.akkasource.org/tutorial-chat-server-scala):
class MyActor extends Actor {
  def receive = {
    case "test" => println("received test")
    case _ => println("received unknown message")
  }
}
val myActor = Actor.actorOf[MyActor]
myActor.start
You'll want to create a pool of actor instances and fire your tasks to them as messages. Here's a post on Akka actor pooling that may be helpful: http://vasilrem.com/blog/software-development/flexible-load-balancing-with-akka-in-scala/
In your case, one actor per task may be appropriate (actors are extremely lightweight compared to threads so you can have a LOT of them in a single VM), or you might need some more sophisticated load balancing between them.
EDIT:
Using the example actor above, sending it a message is as easy as this:
myActor ! "test"
The actor will then output "received test" to standard output.
Messages can be of any type, and when combined with Scala's pattern matching, you have a powerful pattern for building flexible concurrent applications.
In general Akka actors will "do the right thing" in terms of thread sharing, and for the OP's needs, I imagine the defaults are fine. But if you need to, you can set the dispatcher the actor should use to one of several types:
* Thread-based
* Event-based
* Work-stealing
* HawtDispatch-based event-driven
It's trivial to set a dispatcher for an actor:
class MyActor extends Actor {
  self.dispatcher = Dispatchers.newExecutorBasedEventDrivenDispatcher("thread-pool-dispatch")
    .withNewThreadPoolWithBoundedBlockingQueue(100)
    .setCorePoolSize(10)
    .setMaxPoolSize(10)
    .setKeepAliveTimeInMillis(10000)
    .build
}
See http://doc.akkasource.org/dispatchers-scala
In this way, you could limit the thread pool size, but again, the original use case could probably be satisfied with 50K Akka actor instances using default dispatchers and it would parallelize nicely.
This really only scratches the surface of what Akka can do. It brings a lot of what Erlang offers to the Scala language. Actors can monitor other actors and restart them, creating self-healing applications. Akka also provides Software Transactional Memory and many other features. It's arguably the "killer app" or "killer framework" for Scala.
If you want to "execute them with 10 threads", then use threads. Scala's actor model, which is usually what people are talking about when they say Scala is good for concurrency, hides such details so you won't see them.
Using actors, or futures if all you have are simple computations, you'd just create 50,000 of them and let them run. You might run into issues, but they would be of a different nature.
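If you do go the plain-threads route, a minimal sketch using java.util.concurrent directly from Scala could look like this. The task body is a placeholder that only bumps a counter, and the object and method names are invented for the example:

```scala
import java.util.concurrent.{Executors, TimeUnit}
import java.util.concurrent.atomic.AtomicLong

object FixedPoolDemo {
  // Runs numTasks placeholder tasks on numThreads threads and
  // returns how many of them completed.
  def runAll(numTasks: Int, numThreads: Int): Long = {
    val pool = Executors.newFixedThreadPool(numThreads)
    val processed = new AtomicLong(0)
    try {
      for (i <- 1 to numTasks) {
        pool.execute(new Runnable {
          // Placeholder work: replace with the real task body.
          def run(): Unit = { processed.incrementAndGet(); () }
        })
      }
    } finally {
      pool.shutdown() // stop accepting new tasks; queued ones still run
    }
    // Wait for the queued tasks to drain.
    pool.awaitTermination(1, TimeUnit.MINUTES)
    processed.get
  }

  def main(args: Array[String]): Unit =
    println("processed: " + runAll(50000, 10))
}
```

The executor's queue is unbounded here, so all 50,000 runnables are accepted up front and the 10 threads work through them.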
Here's another answer similar to mpilquist's response but without deprecated API and including the thread settings via a custom ExecutionContext:
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Await, Future}
import scala.concurrent.duration._
val numJobs = 50000
val numThreads = 10
// customize the execution context to use the specified number of threads
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(numThreads))
// define the tasks
val tasks = for (i <- 1 to numJobs) yield Future {
  // do something more fancy here
  i
}
// aggregate and wait for final result
val aggregated = Future.sequence(tasks)
val oneToNSum = Await.result(aggregated, 15.seconds).sum
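One thing the snippet above leaves out is releasing the pool: threads created by Executors.newFixedThreadPool are non-daemon, so the JVM will typically not exit until the executor is shut down. A possible variation, assuming you are happy to own the executor's lifecycle yourself (BoundedContextDemo and sumOneTo are illustrative names):

```scala
import java.util.concurrent.Executors
import scala.concurrent.{Await, ExecutionContext, ExecutionContextExecutorService, Future}
import scala.concurrent.duration._

object BoundedContextDemo {
  def sumOneTo(numJobs: Int, numThreads: Int): Long = {
    // fromExecutorService keeps the shutdown methods accessible on the context.
    implicit val ec: ExecutionContextExecutorService =
      ExecutionContext.fromExecutorService(Executors.newFixedThreadPool(numThreads))
    try {
      val tasks = (1 to numJobs).map(i => Future(i.toLong))
      Await.result(Future.sequence(tasks), 1.minute).sum
    } finally {
      ec.shutdown() // let the pool's worker threads terminate
    }
  }

  def main(args: Array[String]): Unit =
    println(sumOneTo(50000, 10))
}
```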

Resources