How do I set up a Scala actor to run exclusively on a separate thread? - multithreading

As far as I understand, Scala manages a thread pool to run actors, sharing threads among them. Can I set up a particular actor to run in a separate thread exclusively, never sharing it with another actor?

It sounds like you are using Scala (not Akka) actors. In that case if you use the receive or receiveWithin style of message handling then each actor will get its own thread. Using the react style of message handling shares a thread pool among actors.
When I say the receive "style", I mean in a loop, for example:
val timerActor = actor {
  while (true) {
    receiveWithin(60 * 1000) {
      case Stop => self.exit()
      case TIMEOUT =>
        destination ! Tick
    }
  }
}
In this case timerActor does not share its thread with any other actor. receiveWithin will block until either the actor receives a Stop message or 60 seconds passes. If 60 seconds passes then the TIMEOUT case is executed.
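For contrast, here is a minimal sketch (reusing the names from the example above) of the same timer written in the react style, using loop and reactWithin from the same scala.actors API; an actor written this way hands its thread back to the pool between messages and so shares threads with other event-based actors:
val pooledTimerActor = actor {
  loop {
    reactWithin(60 * 1000) {
      case Stop    => self.exit()
      case TIMEOUT => destination ! Tick
    }
  }
}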
If you want to learn the gritty details about Scala actors, check out the paper Actors That Unify Threads and Events.
Akka also supports thread-based actors in addition to event-based actors.

Related

Play Framework: thread-pool-executor vs fork-join-executor

Let's say we have an action like the one below in our controller. performLogin will be called by many users, once per request.
def performLogin() = {
  Async {
    // API call to datasource1
    val id = databaseService1.getIdForUser();
    // API call to another data source, different from the one above.
    // This step depends on the id returned by the call above.
    val user = databaseService2.getUserGivenId(id);
    // Very CPU-intensive task
    val token = performProcess(user)
    // Very CPU-intensive calculations
    val hash = encrypt(user)
    Future.successful(hash)
  }
}
I roughly know what the fork-join-executor does: from the main thread that receives a request, it spawns multiple worker threads, which in turn divide the work into chunks. Eventually the main thread joins those results and returns from the function.
On the other hand, if I were to choose the thread-pool-executor, my understanding is that a thread is picked from the thread pool, that thread does the work, and then it goes back to the pool to wait for more work. So no subdividing of the task happens here.
In the code above, parallelism via the fork-join-executor is not possible, in my opinion: each call to the different methods/functions requires something from the previous step. If I were to choose the fork-join-executor, how would that benefit me? How would the execution of the code above differ between the fork-join and thread-pool executors?
Thanks
This isn't parallel code; everything inside your Async call will run in one thread. In fact, Play! never spawns new threads in response to requests: it is event-based, and an underlying thread pool handles whatever work needs to be done.
The executor handles scheduling the work from Akka actors and from most Futures (not those created with Future.successful or Future.failed). In this case, each request will be a separate task that the executor has to schedule onto a thread.
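A small sketch of that distinction (expensiveComputation is a hypothetical placeholder): Future.successful merely wraps a value that has already been computed on the current thread, while Future.apply hands its block to the executor as a task to be scheduled onto a thread:
import scala.concurrent.Future
import scala.concurrent.ExecutionContext.Implicits.global

// nothing is submitted to the executor; the value is already there
val alreadyDone: Future[Int] = Future.successful(42)

// the block becomes a task the executor schedules onto one of its threads
val scheduled: Future[Int] = Future {
  expensiveComputation() // hypothetical CPU-bound work
}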
The fork-join-executor replaced the thread-pool-executor because it allows work stealing, which improves efficiency. There is no difference in what can be parallelized with the two executors.

Scala actors, futures and system calls resulting in Thread leaks

I am running a complex piece of software with different actors (Scala actors). Some of them run certain operations inside Scala futures to avoid blocking and to keep processing newly received messages (simplified code):
def act {
  while (true) {
    receive {
      case (code: String) =>
        val codeMatch = future { match_code(code) }
        for (c <- codeMatch)
          yield callback(code)(JSON.parseJSON(c))
    }
  }
}

def match_code(code: String) {
  val result = s"my_script.sh $code" !!
}
I noticed, looking at jvisualvm and the Eclipse debugger, that the number of active threads keeps increasing while this system is running. I am afraid I have some kind of thread leak, but I can't detect where the problem is.
Here are some screenshots of both finished and live threads (I hid some live threads that are not related to this problem):
Finished Threads
Living threads
Edit 1:
For the graphs above, I ran the system with only 3 actors of different classes: Actor1 sends messages to Actor2, which sends messages to Actor3.
You are using receive, so each actor will use its own thread, and (at least in this example) you don't provide any way for actors to terminate. So you would expect one new thread per actor that was ever started. If that is what you see, then everything is working as expected. If you want actors to cease running, you will have to let them eventually fall out of the while loop, or call exit on them, or some such.
(Also, old-style Scala actors are deprecated in favor of Akka actors in 2.11.)
You also don't (in the code above) have any indication whether the future actually completed. If the futures don't finish, they'll keep tying up threads.
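To illustrate both points, here is a hedged sketch of the question's actor with an explicit exit path and with match_code returning the script output so the future can complete with a value (Stop is a hypothetical message type; the other names come from the question):
def act() {
  var running = true
  while (running) {
    receive {
      case Stop => running = false // fall out of the loop so the actor's thread can finish
      case (code: String) =>
        val codeMatch = future { match_code(code) }
        for (c <- codeMatch)
          callback(code)(JSON.parseJSON(c))
    }
  }
}

// return the script output so the future completes with something useful
def match_code(code: String): String =
  s"my_script.sh $code".!!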

Pin/Run Akka Actor to Main Thread

I'm currently poking at using Scala and Akka in an application that uses LWJGL. As is commonly known, you can't really issue OpenGL calls outside of the main thread of the application. This poses a problem if I want to use any actor for rendering (either a single, main actor that, for example, drains a rendering command queue, or having multiple actors that might issue arbitrary OpenGL commands at any time) as I have not seen a way to run any actor on a specific thread. Either by pinning a specific actor to a thread, or by instructing an actor to run on a specific thread at some point. (a la Objective-C's performSelectorOnMainThread)
Is there a way to pin a "rendering" actor to the main thread, or have any actor run on the main thread at some point in the future, at which point it will be able to issue OpenGL calls? (or even some other solution, I'm open to ideas)
To pin the execution thread of an Akka actor, you can use a custom executor service configuration:
akka {
  ...
  actor {
    ...
    my-dispatcher {
      executor = "com.github.plokhotnyuk.actors.CustomExecutorServiceConfigurator"
    }
  }
}

class CustomExecutorServiceConfigurator(config: Config, prerequisites: DispatcherPrerequisites)
  extends ExecutorServiceConfigurator(config, prerequisites) {
  def createExecutorServiceFactory(id: String, threadFactory: ThreadFactory): ExecutorServiceFactory =
    new ExecutorServiceFactory {
      def createExecutorService: ExecutorService = myExecutorService()
    }
}
Full example is here
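The myExecutorService() call above is left undefined; a minimal sketch of a configurator that returns a single dedicated thread (note this pins the dispatcher to its own new thread, not to the process's already-running main thread) might look like this, with the class name being an arbitrary choice that the executor line in the config would then reference:
import java.util.concurrent.{ExecutorService, Executors, ThreadFactory}
import akka.dispatch.{DispatcherPrerequisites, ExecutorServiceConfigurator, ExecutorServiceFactory}
import com.typesafe.config.Config

class SingleThreadExecutorServiceConfigurator(config: Config, prerequisites: DispatcherPrerequisites)
  extends ExecutorServiceConfigurator(config, prerequisites) {

  def createExecutorServiceFactory(id: String, threadFactory: ThreadFactory): ExecutorServiceFactory =
    new ExecutorServiceFactory {
      // one dedicated thread; every message handled on this dispatcher runs on it
      def createExecutorService: ExecutorService = Executors.newSingleThreadExecutor(threadFactory)
    }
}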

How can I execute multiple tasks in Scala?

I have 50,000 tasks and want to execute them with 10 threads.
In Java I would create Executors.newFixedThreadPool(10), pass Runnables to it, and then wait for everything to be processed. Scala, as I understand, is especially suited to this kind of task, but I can't find a solution in the docs.
Scala 2.9.3 and later
The simplest approach is to use the scala.concurrent.Future class and its associated infrastructure. The scala.concurrent.future method asynchronously evaluates the block passed to it and immediately returns a Future[A] representing the asynchronous computation. Futures can be manipulated in a number of non-blocking ways, including mapping, flatMapping, filtering, and recovering from errors.
For example, here's a sample that creates 10 tasks, where each task sleeps for an arbitrary amount of time and then returns the square of the value passed to it.
import scala.concurrent._
import scala.concurrent.duration._
import scala.concurrent.ExecutionContext.Implicits.global

val tasks: Seq[Future[Int]] = for (i <- 1 to 10) yield future {
  println("Executing task " + i)
  Thread.sleep(i * 1000L)
  i * i
}

val aggregated: Future[Seq[Int]] = Future.sequence(tasks)

val squares: Seq[Int] = Await.result(aggregated, 15.seconds)
println("Squares: " + squares)
In this example, we first create a sequence of individual asynchronous tasks that, when complete, provide an Int. We then use Future.sequence to combine those async tasks into a single async task, swapping the position of the Future and the Seq in the type. Finally, we block the current thread for up to 15 seconds while waiting for the result. In the example, we use the global execution context, which is backed by a fork/join thread pool. For non-trivial examples, you would probably want to use an application-specific ExecutionContext.
Generally, blocking should be avoided whenever possible. There are other combinators available on the Future class that help you program in an asynchronous style, including onSuccess, onFailure, and onComplete.
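For instance, a non-blocking alternative to the Await.result call above is to register a callback with onComplete, which receives the outcome as a Try:
import scala.util.{Failure, Success}

aggregated.onComplete {
  case Success(squares) => println("Squares: " + squares)
  case Failure(error)   => println("A task failed: " + error.getMessage)
}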
Also, consider investigating the Akka library, which provides actor-based concurrency for Scala and Java, and interoperates with scala.concurrent.
Scala 2.9.2 and prior
The simplest approach is to use Scala's Future class, which is a sub-component of the actors framework. The scala.actors.Futures.future method creates a Future for the block passed to it. You can then use scala.actors.Futures.awaitAll to wait for all tasks to complete.
For example, here's a sample that creates 10 tasks, where each task sleeps for an arbitrary amount of time and then returns the square of the value passed to it.
import scala.actors.Futures._

val tasks = for (i <- 1 to 10) yield future {
  println("Executing task " + i)
  Thread.sleep(i * 1000L)
  i * i
}

val squares = awaitAll(20000L, tasks: _*)
println("Squares: " + squares)
You want to look at either the Scala actors library or Akka. Akka has cleaner syntax, but either will do the trick.
So it sounds like you need to create a pool of actors that know how to process your tasks. An Actor can basically be any class with a receive method; here is one from the Akka tutorial (http://doc.akkasource.org/tutorial-chat-server-scala):
class MyActor extends Actor {
  def receive = {
    case "test" => println("received test")
    case _      => println("received unknown message")
  }
}

val myActor = Actor.actorOf[MyActor]
myActor.start
You'll want to create a pool of actor instances and fire your tasks to them as messages. Here's a post on Akka actor pooling that may be helpful: http://vasilrem.com/blog/software-development/flexible-load-balancing-with-akka-in-scala/
In your case, one actor per task may be appropriate (actors are extremely lightweight compared to threads so you can have a LOT of them in a single VM), or you might need some more sophisticated load balancing between them.
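As a rough sketch of the pool idea (the linked post covers proper load-balancing routers; here 50,000 placeholder messages are simply spread round-robin over 10 instances of the MyActor class shown above):
// start 10 worker actors
val workers = Vector.fill(10) {
  val worker = Actor.actorOf[MyActor]
  worker.start
  worker
}

// fire the tasks at them as messages, round-robin
for (i <- 1 to 50000)
  workers(i % workers.size) ! ("task " + i)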
EDIT:
Using the example actor above, sending it a message is as easy as this:
myActor ! "test"
The actor will then output "received test" to standard output.
Messages can be of any type, and when combined with Scala's pattern matching, you have a powerful pattern for building flexible concurrent applications.
In general Akka actors will "do the right thing" in terms of thread sharing, and for the OP's needs, I imagine the defaults are fine. But if you need to, you can set the dispatcher the actor should use to one of several types:
* Thread-based
* Event-based
* Work-stealing
* HawtDispatch-based event-driven
It's trivial to set a dispatcher for an actor:
class MyActor extends Actor {
  self.dispatcher = Dispatchers.newExecutorBasedEventDrivenDispatcher("thread-pool-dispatch")
    .withNewThreadPoolWithBoundedBlockingQueue(100)
    .setCorePoolSize(10)
    .setMaxPoolSize(10)
    .setKeepAliveTimeInMillis(10000)
    .build
}
See http://doc.akkasource.org/dispatchers-scala
In this way, you could limit the thread pool size, but again, the original use case could probably be satisfied with 50K Akka actor instances using default dispatchers and it would parallelize nicely.
This really only scratches the surface of what Akka can do. It brings a lot of what Erlang offers to the Scala language. Actors can monitor other actors and restart them, creating self-healing applications. Akka also provides Software Transactional Memory and many other features. It's arguably the "killer app" or "killer framework" for Scala.
If you want to "execute them with 10 threads", then use threads. Scala's actor model, which is usually what people are talking about when they say Scala is good for concurrency, hides such details so you won't see them.
With actors, or with futures when all you have are simple computations, you would just create 50,000 of them and let them run. You might run into issues, but they are of a different nature.
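If you literally want a pool of 10 threads, a minimal sketch using the plain Java executor API from Scala (the body of each Runnable is a placeholder) would be:
import java.util.concurrent.{Executors, TimeUnit}

val pool = Executors.newFixedThreadPool(10)

for (i <- 1 to 50000) {
  pool.execute(new Runnable {
    def run() { println("task " + i) } // placeholder for the real work
  })
}

pool.shutdown()                          // stop accepting new tasks
pool.awaitTermination(1, TimeUnit.HOURS) // wait for the queued tasks to drain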
Here's another answer similar to mpilquist's response but without deprecated API and including the thread settings via a custom ExecutionContext:
import java.util.concurrent.Executors
import scala.concurrent.{ExecutionContext, Await, Future}
import scala.concurrent.duration._

val numJobs = 50000
val numThreads = 10

// customize the execution context to use the specified number of threads
implicit val ec = ExecutionContext.fromExecutor(Executors.newFixedThreadPool(numThreads))

// define the tasks
val tasks = for (i <- 1 to numJobs) yield Future {
  // do something more fancy here
  i
}

// aggregate and wait for the final result
val aggregated = Future.sequence(tasks)
val oneToNSum = Await.result(aggregated, 15.seconds).sum
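One caveat worth adding: Executors.newFixedThreadPool creates non-daemon threads, so a standalone program will not exit until the pool is shut down. Keeping a reference to the pool (a small variation on the setup above) lets you do that once the result is in:
val pool = Executors.newFixedThreadPool(numThreads)
implicit val ec = ExecutionContext.fromExecutor(pool)

// ... run the jobs as above ...

pool.shutdown() // lets the JVM exit once all tasks have finished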

Threading 101: What is a Dispatcher?

Once upon a time, I remembered this stuff by heart. Over time, my understanding has diluted and I mean to refresh it.
As I recall, any so-called single-threaded application has two threads:
a) the primary thread that has a pointer to the main or DllMain entry points; and
b) For applications that have some UI, a UI thread, a.k.a. the secondary thread, on which the WndProc runs, i.e. the thread that executes the WndProc that receives messages that Windows posts to it. In short, the thread that executes the Windows message loop.
For UI apps, the primary thread is in a blocking state waiting for messages from Windows. When it receives them, it queues them up and dispatches them to the message loop (WndProc), and the UI thread gets kick-started.
As per my understanding, the primary thread, which is in a blocking state, is this:
C++
while (GetMessage(&msg, NULL, 0, 0))
{
    TranslateMessage(&msg);
    DispatchMessage(&msg);
}
C# or VB.NET WinForms apps:
Application.Run(new System.Windows.Forms.Form());
Is this what they call the Dispatcher?
My questions are:
a) Is my above understanding correct?
b) What in the name of hell is the Dispatcher?
c) Point me to a resource where I can get a better understanding of threads from a Windows/Win32 perspective and then tie it up with high level languages like C#. Petzold is sparing in his discussion on the subject in his epic work.
Although I believe I have it somewhat right, a confirmation will be relieving.
It depends on what you consider the primary thread. Most UI frameworks will have an event handler thread that sits mostly idle, waiting for low level events. When an event occurs this thread gets a lock on the event queue, and adds the events there. This is hardly what I'd consider the primary thread, though.
In general, a dispatcher takes some events and, based on their content or type, sends (dispatches, if you will) them to another chunk of code (often in another thread, but not always). In this sense the event handler thread itself is a simple dispatcher. On the other end of the queue, the framework typically provides another dispatcher that takes events off of the queue, for instance sending mouse events to mouse listeners, keyboard events to keyboard listeners, etc.
Edit:
A simple dispatcher may look like this:
class Event {
public:
    EventType type;   // probably an enum
    String data;      // event data
};

class Dispatcher {
public:
    ...
    void dispatch(Event event)
    {
        switch (event.type)
        {
            case FooEvent:
                foo(event.data);
                break;
            ...
        }
    }
};
Most people I've met use "dispatcher" to describe something that's more than just a simple passthrough. In this case, it performs different actions based on a type variable, which is consistent with most of the dispatchers I've seen. Often the switch is replaced with polymorphism, but a switch makes it clearer what's going on in an example.
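To stay with the Scala used elsewhere on this page, a minimal sketch of the polymorphic variant might look like this (the event types and the foo handler are hypothetical):
trait Event {
  def dispatch(): Unit // each event type knows how to handle itself
}

class FooEvent(data: String) extends Event {
  def dispatch(): Unit = foo(data) // hypothetical handler
}

class Dispatcher {
  def dispatch(event: Event): Unit = event.dispatch()
}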
