How can akka actor interact between threads - multithreading

I've read akka documentation and can't produce clean understanding of thread interaction while using akka. Docs may omit this thing as obvious but it is not so obvious for me.
All akka actors seemed to be run in same thread they are called. I see actors as co-procedures that just had own stack reset each time receive called.
You may perform a huge chain of actors switching in straight line. Each receive perform small non-blocking operation and force another receive to work further. There is no event loop, that can handle messages outside of the actor system.
I'd like to catch a request from other thread, perform control operations, and wait for another message.
There are some use cases that outline my needs.
There is thread that constantly polling data from some sources. Once data matches pattern it invokes event-driven handler based on actors. Logical controller makes a decision and passes it workers. There should be two persistent thread. One threads works constantly on polling and another works asynchronously to control it work. You should not let akka actors to first thread since they broke polling periods and first thread should not block actors so they need another thread.
There is some kind of two-side board game. One side has a controller thread that schedules calculation time works interacts with board server and etcetera. Other thread is a heavy calculating thread that loops over different variants and could not be written in akka since it has blocking nature
I aware of existing akka futures, but they represent a working task that run once fired and shutting down after performing their goal. The futures are well combined with akka actors, but can not express looped working threads.
Akka actor system incorporates different kinds of network event loops. You may use its built-in remote actor system or well known 0mq protocol. But using network for thread interactions seems like overdoing for me.
What is the supposed way to glue non-akka thread with akka one? Should I wrote a couple of special procedures to perform message passing in thread-safe way?

If you need polling, then the polling thread should just turn whatever is polled into a message and fire it off to an actor.
I find it more useful to use an Actor with a receiveTimeout to do non-blocking polling at an interval, and when there's something that gets polled, it will publish it to some other actor, or perhaps even its ActorSystems' EventStream, for true pub-sub action.

Related

Async and scheduling - how do libraries avoid blocking at the lowest level?

I've been using various concurrency constructs for a while now without much consideration for how all the magic happens, which has recently made me increasingly uneasy.
In an attempt to remedy this ... feeling, I have been reading up on how async works under the hood. When I say async, in this case I'm referring to userland / greenthread / cooperative multitasking, although I assume some of the concepts will also apply to traditional OS managed threads insofar as a scheduler and workers are involved.
I see how a worker can suspend itself and let other workers execute, but at the lowest level in non-blocking library code, how does the scheduler know when a previously suspended worker's job is done and to wake up that worker?
For example if you fire up a worker in some sort of async block and perform an operation that would normally block (e.g. HTTP request, SQL query, other I/O), then even though your calling code is async, that operation (library code) better play nice with your async framework or you've effectively defeated the purpose of using it and blocked your scheduler from calling other waiting operations (the, What Color is Your Function problem) while it waits for your blocking call, which was executed inside your non-blocking calling code, to complete.
So now we've got async code calling other async library code, and now I'm asking myself the question all over again - how does the async library code know when to suspend and resume operation?
The idea of firing off a HTTP request, moving on, and returning later to check for results is weird to think about for me - not conceptually but from an implementation standpoint.
How do you perform a partial operation, e.g. sending TCP packets and then continuing with the rest of the program execution, only to come back later and check if results have been delivered. Delivered to what? A socket?
Now we're another layer deep and you are using socket selects to avoid creating threads and blocking, but, again...
how do those sockets start an operation, move on before completion, and then how does select know when data is available?
Are you continually checking some buffer to see if bytes have been delivered in an infinite loop and moving on if not?
Anyhow - I think you see where I'm going here....
I focused mainly on HTTP as a motivating example, but the same question applies for any normally blocking operations - how does it all work at the bottom?
Here are some of the resources I found helpful while researching the topic and which informed this question:
David Beazley's excellent video Build Your Own Async where he walks you through a simple implementation of a scheduler which fire callbacks and suspend execution by sleeping on a waiting queue. I found this video tremendously instructive, but it stops a bit short in that it shows you how using an async sleep frees up the scheduler to execute other workers, but doesn't really go into what would happen when you call code in those workers that itself must be non-blocking so it plays nice with the scheduler.
How does non-blocking IO work under the hood - This got me further along in my understanding, but still left with a few uncertainties.

What really is asynchronous computing?

I've been reading (and working) quite a bit with massively multi-threaded applications, and with IO, and I've found that the term asynchronous has become some sort of catch-all for multiple vague ideas. I'm wondering if I understand it correctly. The way I see it is that there are two main branches of "asynchronicity".
Asynchronous I/O. Such as network read/write. What this really boils down to is efficient parallel processing between multiple CPUs, such as your main CPU and your NIC CPU. The idea is to have multiple processors running in parallel, exchanging data, without blocking waiting for the other to finish and return the results of it's job.
Minimizing context-switching penalties by minimizing use of threads. This seems to be what the .NET framework is focusing on with it's async/await features. Instead of spawning/closing/blocking threads, break parallel jobs into tasks, and use a software task scheduler to keep a pool of threads as busy as possible without resorting to spawning new threads.
These seem like two entirely separate concepts with no similarities that could tie them together, but are both referred to by the same "asynchronous computing" vocabulary.
Am I understanding all of this correctly?
Asynchronous basically means not blocking, i.e. not having to wait for an operation to complete.
Threads are just one way of accomplishing that. There are many ways of doing this, from hardware level, SO level, software level.
Someone with more experience than me can give examples of asyncronicity not related to threads.
What this really boils down to is efficient parallel processing between multiple CPUs, such as your main CPU and your NIC CPU. The idea is to have multiple processors running in parallel...
Asynchronous programming is not all about multi-core CPU's and parallelism: consider a single core CPU, with just one thread creating email messages and sends them. In a synchronous fashion, it would spend a few micro seconds to create the message, and a lot more time to send it through network, and only then create the next message. But in asynchronous program, the thread could create a new message while the previous one is being sent through the network. One implementation for that kind of program can be using .NET async/await feature, where you can have just one thread. But even a blocking IO program could be considered asynchronous: If the main thread creates the messages and queues them in a buffer, which another thread pulls them from and sends them in a blocking IO way. From the main thread's point of view - it's completely async.
.NET async/await just uses the OS api's which are already async - reading /writing a file, send /receive data through network, they are all async anyway - the OS doesn't block on them (the drivers themselves are async).
Asynchronous is a general term, which does not have widely accepted meaning. Different domains have different meanings to it.
For instance, async IO means that instead of blocking on IO call, something else happens. Something else can be really different things, but it usually involves some sort of notification of call completion. Details might differ. For instance, a notification might be built into the call itself - like in MS Completeion Ports (if memory serves). Or, it can be something verify do before you make a call so that the call can not block - this is what poll() and friends do.
Async might also well mean simply parallel execution. For instance, one might say that 'database is updated asynchronously' meaning that there is a dedicated thread which handles database connectivity, and that thread does not slow down the main processing thread.

multithreading and user interface

Ok, here we go.
recently I got fond of HCI topics on interface design.
I found out that there could be some way to implement a multithread interface in case for reducing the delay of system respons.
Morover. this may also be possible to say that designing a user interface has tight relationship with STD.
therefore, I wonder if there is any method or techniques to find independant part of ,say,a given STD of a UI that can be seen as threads?
A multi threaded interface in most cases is not fundamentally different to it's single threaded counterpart. There is still a single thread listening on interface events and it will still run handlers as events happen. However the difference comes down to what is contained in these handlers. A simple single threaded event loop would look as below:
A multi-threaded UI is a little different but the principal is the same:
Effectively long processes which are initiated in worker threads which can then report back to them main UI thread so it can report completion.
In relation to a State Transition Diagram, multi-threading complicates things somewhat however there a number of ways to still accomplish this. The first is to simply map each (potential) thread's path separately, this requires decisions for if any threads are finished at the points the main thread checks. It is also possible to use a thread state transition diagram which can demonstrate many threads in a single diagram but is sometimes harder to parse.
Now regarding using a state transition diagram to help implement threading in a user interface program you simply have to locate tasks between the event handler and returning to listening which are time consuming and likely to block. You then need to dispatch these tasks as a thread, optionally adding a completion callback in the main thread.
If I have missed anything please comment below, otherwise I hope this is helpful.

scala actors vs threads and blocking IO

As I understand it, actors are basically lightweight threads implemented on top of threads, running many actors on a small pool of shared threads.
Given that's the case, using blocking operations in an actor blocks the underlying thread. This is not a correctness problem because the actor library will spawn more threads as necessary (is that right?) but then you end up with lots and lots of threads, negating the benefit of using actors in the first place.
Given that, how do actors work when you need to do such IO operations? Are there operations which "actor-block", suspending the actor while letting the thread go on to other operations (much as blocking operations suspend the thread while letting the CPU go on to other operations), or is everything written in CPS, with chained actors? Or are actors simply not a good fit for this sort of long-running operation?
Background: I have experience writing multithreaded stuff the classic way, and understand prettywell how CPS/event loops work, but have absolutely no experience working with actors, and just want to understand, on a high level, how they fit in, before I dive into the code.
This is not a correctness problem because the actor library will spawn more threads as necessary (is that right?)
So far as I understand, that is not right. The actor is blocked, and sending another message to it causes that message to sit in the actors mailbox until that actor can receive it or react to the message.
In Programming in Scala (1), it explicitly states that actors should not block. If an actor needs to do something long running it should pass the work to a second actor, so that the main actor can free itself up and go read more messages from its mailbox. Once the worker has completed the work, it can signal that fact back to the main actor, which can finish doing whatever it has to do.
Since workers too have mailboxes, you will end up with several workers busily working their way through the work. If you don't have enough CPU to handle that, their queues will just get bigger and bigger. Eventually you can scale out by using remote actors. Akka might be more useful in such cases.
(1) Chapter 32.5 of Programming in Scala (Odersky, Second edition, 2010)
EDIT: I found this:
The scheduler method of the Actor trait can be overridden to return a ResizableThreadPoolScheduler, which resizes its thread pool to avoid starvation caused by actors that invoke arbitrary blocking methods.
Found it at: http://www.scala-lang.org/api/current/scala/actors/Actor.html
So, that means depending on the scheduler impl you set, perhaps the pool used to run the actors will be increased. I was wrong when I said you were wrong :-) The rest of the answer still holds true.
How thing work in erlang is that all blocking operation should be don by send message because when your actor is blocked waiting for a message it's yielding the thread to other actor.
So if you want to do some blocking operation like reading from a file you should do a FileReader actor that use Non-bloking api to read and write from file. And have your other actor use this actor (send and receive message to it) as an api to read and write to file.

Alternative to Observer Pattern

Does anyone know of an alternative to the Observer a.k.a. Listener pattern?
I'm interested in something that would work well in an asynchronous
environment.
The problem I'm facing is that I have an application which uses this
pattern a lot, which is not a bad thing per se, but it becomes a bottleneck as the number of listeners increases. Combined with threading primitives (mutexes, critical sections - of course in my specific environment) the hit on performance is really bad.
How about Message Queue?
If there are too many observers, so the thread being observed is not making any progress, then it might be wise to reverse the relationship. Rather than have the observed thread call out to each and every observer, it may be better to have the observers wait on something like a condition variable or event associated with the observed thread. The observer code can then block, waiting for the condition variable to be signalled. The observed thread can then just signal the condition variable rather than calling into the observers; the observers can notice the signal and process the consequences in their own time.
Please take a look at it if reducing listeners in your code is your primary objective Jeffrey Richter and his AsyncEnumerator. This technique makes asynchronous programmin look more like synchronous.
With this technique your single method could issue an Asynch call and resto fo the method act as an event handler, therefore whole of the invokation and event listener code could be clubbed as one fucntion.
Difficult to say without a more concrete description but the Mediator pattern is related and can be used when the number of communicating objects starts to proliferate. You could implement some policy to co-ordinate the activities a more structured way within these.
Two alternatives from me: using actor model (like akka framework) or using executor to limit the parallelization. Executor is basically just a thread pool which will limit the number of thread and reuse finished threads.

Resources