When should one use the Actor model? - multithreading

When should the Actor Model be used?
It certainly doesn't guarantee a deadlock-free environment.
Actor A can wait for a message from B while B waits for A.
Also, if an actor has to make sure its message was processed before moving on to its next task, it will have to send a message and wait for a "your message was processed" message instead of the straightforward blocking.
What's the power of the model?

Given some concurrency problem, what would you look for to decide whether to use actors or not?
First I would look to define the problem... is the primary motivation a speedup of a nested for loop or recursion? If so, a simple task-based or parallel-loop approach will likely work well for you (rather than actors).
However if you have a more complex system that involves dependencies and coordinating shared state, then an actor approach can help. Specifically through use of actors and message passing semantics you can often avoid using explicit locks to protect shared state by actually making copies of that state (messages) and reacting to them.
You can do this quite easily with classic synchronization problems like the dining philosophers and the sleeping barber problem. But you can also use actors to help with more modern patterns, e.g. your facade could be an actor, and your model, view and controller could also be actors that communicate with each other.
Another thing that I've observed is that actor semantics are learnable by most developers and 'safer' than their locked counterparts. This is because they raise the abstraction level and allow you to focus on coordinating access to the data rather than protecting all accesses to the data with locks. As an example, imagine that you have a simple class with a data member. If you choose to place a lock in that class to protect access to that data member, then every method on that class needs to ensure that it accesses the data member under the lock. This becomes particularly problematic when others (or you) modify the class at a later date: they have to remember to use that lock.
On the other hand if that class becomes an actor and the data member becomes a buffer or port you communicate with via messages, you don't have to remember to take the lock because the semantics are built into the buffer and you will very explicitly know whether you are going to block on that based on the type of the buffer.
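As a rough illustration of the point above, here is a minimal Python sketch of a class whose data member is owned by an actor instead of being guarded by a lock. The names `CounterActor`, `add`, `get` and `demo` are made up for this example, and a plain thread plus `queue.Queue` stands in for a real actor runtime.

```python
import queue
import threading


class CounterActor:
    """Owns `_value`; only the actor's own thread ever touches it."""

    def __init__(self):
        self._mailbox = queue.Queue()  # thread-safe buffer of incoming messages
        self._value = 0                # private state, never accessed from outside
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def _run(self):
        # Messages are handled strictly one at a time, so no lock is needed.
        while True:
            kind, payload, reply = self._mailbox.get()
            if kind == "add":
                self._value += payload
            elif kind == "get":
                reply.put(self._value)  # answer on the caller's private queue
            elif kind == "stop":
                break

    def add(self, amount):
        # Fire-and-forget: the caller does not block.
        self._mailbox.put(("add", amount, None))

    def get(self):
        # Request/reply: the caller blocks, visibly, on its own reply queue.
        reply = queue.Queue(maxsize=1)
        self._mailbox.put(("get", None, reply))
        return reply.get()

    def stop(self):
        self._mailbox.put(("stop", None, None))
        self._thread.join()


def demo(senders=4, per_sender=1000):
    actor = CounterActor()

    def send_many():
        for _ in range(per_sender):
            actor.add(1)

    threads = [threading.Thread(target=send_many) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    total = actor.get()  # queued behind every "add", so it sees all of them
    actor.stop()
    return total
```

Whether a call blocks is visible from the method itself (`add` never waits, `get` always does), which is the property described above: nobody modifying the class later has to remember a lock.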
-Rick

The usage of Actor is "natural" in at least two cases:
When you can decompose your problem into a set of independent tasks.
When you can decompose your problem into a set of tasks linked by a clear workflow (i.e. dataflow programming).
For instance, if you process complex data using a series of filters, it is easy to use a pipeline of actors where each actor receives data from an upstream actor and sends data to a downstream actor.
Of course, this data flow need not be linear, and if a step in your pipeline is slow, you can instead use a pool of actors doing the same job. Another way of solving load-balancing problems would be to use a demand-driven approach organized with a kind of virtual Kanban system.
Of course, you will need synchronization between actors in almost all interesting cases, but contrary to the classic multi-thread approach, this synchronization is really "concrete". You can imagine guys in a factory and imagine the possible problems (workers run out of work to do, upstream operations are too fast and intermediate products need a huge storage space, etc.). By analogy, you can then find a solution more easily.
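The pipeline-with-a-pool idea can be sketched in a few lines of Python. This is only an illustration under assumptions: threads and `queue.Queue` play the role of actors and mailboxes, the two stages (add one, then square) are placeholders for real filters, and `run_pipeline` is a hypothetical name. The bounded middle queue models the limited "storage space" between stations: a fast upstream stage blocks when it is full.

```python
import queue
import threading

STOP = object()  # sentinel telling a worker to shut down


def stage(inbox, outbox, work):
    """One pipeline worker: read from inbox, apply work, write to outbox."""
    while True:
        item = inbox.get()
        if item is STOP:
            break
        outbox.put(work(item))


def run_pipeline(numbers, pool_size=3):
    raw = queue.Queue()
    parsed = queue.Queue(maxsize=8)  # bounded: upstream blocks when storage is full
    done = queue.Queue()

    # Items travel as (index, value) pairs so the original order can be restored.
    fast = threading.Thread(
        target=stage, args=(raw, parsed, lambda p: (p[0], p[1] + 1)))
    # The slow step is served by a pool of identical workers.
    pool = [
        threading.Thread(
            target=stage, args=(parsed, done, lambda p: (p[0], p[1] * p[1])))
        for _ in range(pool_size)
    ]
    fast.start()
    for worker in pool:
        worker.start()

    for pair in enumerate(numbers):
        raw.put(pair)
    raw.put(STOP)
    fast.join()                # the first stage has forwarded everything
    for _ in pool:
        parsed.put(STOP)       # one sentinel per pool worker
    for worker in pool:
        worker.join()

    results = [done.get() for _ in range(len(numbers))]
    return [value for _, value in sorted(results)]
```

Because the pool finishes items in no particular order, each item carries its index and the results are sorted at the end.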

I am not an actor expert, but here are my two cents on when to use the actor model:
The actor model is not suited to every concurrent application. For instance, if you are building an application that is simply multithreaded and highly concurrent, the actor model is not, by itself, designed to solve that concurrency problem.
Where actors really come into play is when you are creating an event-driven application. For instance, suppose you are tracking what users click in your application in real time. Because actors are stateful, you can use them to do real-time work segregated by user, device, or whatever your business requires. So, for example, if some users fall under the actors that recorded clicks on shirts, you can send them a coupon notification.
Other applications where actors come in handy: finance (pricing, fraud detection) and multiplayer gaming.

Actors are asynchronous and concurrent, but they do not guarantee message order or any time limit on when a message will be acted upon. Hence an atomic transaction cannot be split across actors.
If the application/task involves no mutable state then Actors are overkill as Actor frameworks go to great lengths to avoid race conditions.

Related

One aggregate per transaction, with "one" or "multiple" bounded contexts

Following the Vaughn Vernon recommendation, to achieve a high level of decoupling and single responsibility, just one aggregate should be changed per transaction.
In chapter 8 of the Red Book, Vaughn Vernon demonstrates how two aggregates can "talk" to each other with domain events. In chapter 13, he shows how different aggregates in two different bounded contexts can "talk" to each other with notifications.
My question is: why should I deal with these situations differently, given that both of them happen in different transactions? Whether there is one bounded context or several, wouldn't the possible problems be the same?
For example, if the application crashes between two domain events in the same bounded context, I'll end up with an inconsistency, just as with two bounded contexts.
It seems that the safest way to deal with two aggregates "talking" to each other asynchronously is to have a transitional status in the aggregate, persist the events before sending them (to avoid losing events), make operations idempotent when possible, and deduplicate events on the receiving side when an operation cannot be made idempotent.
I see two aspects to consider in your question:
The DDD aspect: Event types and what you do with them
A technical aspect: how to implement it reliably
Regarding the types of Events what I would say is that events that stay within the boundaries of a bounded context (often called Domain Events) normally carry a lot of information. Potentially a big part of the state of the Aggregate. If you use CQRS, they are used to create the Read Model. Events that cross the BC boundaries are sometimes called Integration Events and they should carry as little data as possible (potentially, only global IDs, like CustomerId, OrderId). The reason is that every extra property that you add is extra coupling between the publisher BC and the subscriber BCs, which is what you want to minimize.
I would say that it's this distinction between the types of Events which might lead to different technical solutions, but I agree with you that it doesn't have to be this way if you find a solution that works well for both cases.
The solution you propose is correct. It looks very similar to the Outbox feature of NServiceBus, which basically takes care of all this for you.
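To make the proposed approach concrete, here is a small Python sketch of the general idea (persist the event together with the state change, publish afterwards, deduplicate on the receiving side). It is not NServiceBus code and does not reflect its API. Two in-memory SQLite databases stand in for the two bounded contexts, and every table and function name (`place_order`, `relay`, `handle`, and so on) is hypothetical.

```python
import sqlite3
import uuid


def open_sender():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
    db.execute("CREATE TABLE outbox (event_id TEXT PRIMARY KEY, "
               "order_id TEXT, sent INTEGER DEFAULT 0)")
    return db


def open_receiver():
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (event_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE invoices (order_id TEXT PRIMARY KEY)")
    return db


def place_order(db, order_id):
    # The state change (with a transitional status) and the event are
    # committed in ONE local transaction, so neither can exist alone.
    with db:
        db.execute("INSERT INTO orders VALUES (?, 'pending')", (order_id,))
        db.execute("INSERT INTO outbox (event_id, order_id) VALUES (?, ?)",
                   (str(uuid.uuid4()), order_id))


def relay(db, publish):
    # Publish every unsent event. A crash between publish() and the UPDATE
    # means the event is published again later: at-least-once delivery.
    rows = db.execute(
        "SELECT event_id, order_id FROM outbox WHERE sent = 0").fetchall()
    for event in rows:
        publish(event)
        with db:
            db.execute("UPDATE outbox SET sent = 1 WHERE event_id = ?",
                       (event[0],))


def handle(db, event):
    # Receiver side: record the event id and apply the effect in one
    # transaction; a duplicate id means the effect was already applied.
    event_id, order_id = event
    with db:
        try:
            db.execute("INSERT INTO processed VALUES (?)", (event_id,))
        except sqlite3.IntegrityError:
            return False  # duplicate delivery, ignored
        db.execute("INSERT INTO invoices VALUES (?)", (order_id,))
    return True


def demo():
    sender, receiver = open_sender(), open_receiver()
    place_order(sender, "order-1")
    delivered = []
    relay(sender, delivered.append)
    # Simulate a redelivery by handling the same event twice.
    outcomes = [handle(receiver, e) for e in delivered + delivered]
    invoices = receiver.execute("SELECT COUNT(*) FROM invoices").fetchone()[0]
    return outcomes, invoices
```

In `demo`, the second delivery of the same event is rejected by the primary key on `processed`, so only one invoice is created.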
Another approach that I've used, if your message broker supports it, is what Azure Service Bus calls Send Via. With this feature, you publish events via your own queue, and the send is committed transactionally with the removal of the incoming message from the queue. This means that if, for some reason, the message you are processing is not deleted from the queue successfully (DB update exception, broker unavailable, etc.) and is therefore retried, you know for sure that the events were not sent, and you can safely publish them again during the retry. This makes idempotent operations simpler to implement and avoids publishing ghost messages.

Alternative to Akka for handling tasks sequentially

I have a system which may generate certain events in the lifecycle of a transaction. On every event I need to update a row in a DB and also send out a UI event over websocket.
One option I have is to implement the event processing (DB and UI) in actors thus avoiding any locking issues - also I can afford minor delays so handling this sequentially will greatly simplify matters.
What are alternative ways of handling this in Scala as I feel maybe Actors might be overkill in this case?
There are those blogs stating that actors should be used for "concurrency with state" - though I would like to see a more appropriate mechanism in order to eliminate this option.
Ultimately the main unique benefit for using actors is that they are great for encapsulating mutable variables so that you avoid race conditions.
To do what you describe, you could just use classic threads. In your (possibly simplified) description, I don't see the potential for deadlocks. If you want something a bit more composable, e.g. a sequence of asynchronous tasks, you can use Scala's Futures.
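For illustration only, here is the same idea in Python rather than Scala: a single-worker executor runs the submitted tasks one after another, so the DB update and the UI notification for each event never overlap with those of another event. The dictionary and list below are stand-ins for the real database and websocket, and `process_events` is a hypothetical name.

```python
from concurrent.futures import ThreadPoolExecutor


def process_events(events):
    db_rows = {}       # stand-in for the database table
    ui_messages = []   # stand-in for the websocket channel

    def handle(event):
        tx_id, status = event
        db_rows[tx_id] = status                  # "update a row in the DB"
        ui_messages.append(f"{tx_id}:{status}")  # "send out a UI event"

    # One worker thread: tasks run sequentially, in submission order,
    # so the handler needs no locks of its own.
    with ThreadPoolExecutor(max_workers=1) as executor:
        futures = [executor.submit(handle, event) for event in events]
        for future in futures:
            future.result()  # re-raises any exception from the handler

    return db_rows, ui_messages
```

The caller can submit from any thread without waiting, which matches the "I can afford minor delays" constraint in the question.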
Not at all sure if this is applicable for Scala, but there's a great lib for Groovy and Java with several concurrency models. I've myself used the Dataflow Concurrency with great success and can recommend it as a light-weight yet manageable model.
Dataflow Concurrency offers an alternative concurrency model, which is
inherently safe and robust. It puts an emphasis on the data and their
flow though your processes instead of the actual processes that
manipulate the data. Dataflow algorithms relieve developers from
dealing with live-locks, race-conditions and make dead-locks
deterministic and thus 100% reproducible. If you don’t get dead-locks
in tests you won’t get them in production.
There are other models available in the linked GPars library as well.
I wouldn't suggest making the threading yourself, unless you have no other choices.
Addendum
After posting I got interested in the topic and made a few searches. Seems like Akka has direct support for Dataflow model also. Or at least has had in some version.
Actors avoid locking issues because they use queues to interact. You can use threads with (blocking) queues and get the same level of safety. The only advantage of actors over threads is that an actor does not spend memory on a call stack, so we can have many more actors than threads in the same amount of memory. The downside of the actor model is that a complex algorithm which could be implemented in a single thread may require several actors, so the actor implementation may look obscure.

Basic Explanation of Actors in Erlang

I am trying to put together a very basic explanation of actors in Erlang. It is supposed to be as bare-bones as possible, but without leaving out key features of the theory or the Erlang implementation of it. This is my explanation:
The actor model is a mathematical model of concurrent computation that treats actors as the universal primitives of concurrent computation. The actor is a computational entity that, in response to a message it receives, can concurrently (1) send a finite number of messages to other actors, (2) create a finite number of new actors, and (3) designate the behavior to be used for the next message it receives.
In Erlang, each actor is a separate process in the virtual machine, implemented by a function. Processes communicate by sending messages to each other. Every message is explicit, traceable and safe. The messages are received in a mailbox and stored in the order in which they are received. They are stored there until the receiving process takes them out to be read. This is called asynchronous message passing.
What do you guys think? Is it OK? Should I add or change anything? Thanks.
I think you would help yourself if you didn't confuse actors with Erlang processes. You started with Wikipedia's description of the Actor model only to seamlessly start writing about Erlang processes, as if they were one and the same. The Actor model is a mathematical model which can be implemented in many different ways, including a pure C or C++ low-level implementation. Erlang processes, on the other hand, are a lightweight, preemptively scheduled language feature that lets you run a vast number of processes in parallel much more efficiently than would be possible with native system processes or even threads. It happens that they are modelled after the mathematical model, but that was a design decision based on specific requirements.
I think it would all fit better together if you discussed briefly the Actor model as a mathematical model on its own and only then how it has been implemented in Erlang, pointing out any differences and features specific to Erlang.
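To keep the model separate from any one implementation, the mailbox and the "behavior for the next message" parts of the definition quoted in the question can be sketched without Erlang at all. The Python below is a deliberately tiny, single-threaded toy. `Actor`, `locked`, `unlocked` and `run` are invented names, and nothing here mirrors how the Erlang VM schedules processes.

```python
from collections import deque


class Actor:
    def __init__(self, behavior):
        self.mailbox = deque()    # messages wait here in arrival order
        self.behavior = behavior  # function used for the next message

    def send(self, message):
        self.mailbox.append(message)  # asynchronous: the sender does not wait

    def step(self):
        message = self.mailbox.popleft()
        # The current behavior handles one message and returns the
        # behavior to use for the following one.
        self.behavior = self.behavior(message)


def run(messages):
    log = []

    def locked(message):
        if message == "unlock":
            log.append("unlocked")
            return unlocked       # designate a new behavior
        log.append(f"ignored {message}")
        return locked

    def unlocked(message):
        if message == "lock":
            log.append("locked")
            return locked
        log.append(f"handled {message}")
        return unlocked

    door = Actor(locked)
    for message in messages:
        door.send(message)        # everything is queued first...
    while door.mailbox:
        door.step()               # ...and processed one message at a time
    return log
```

The same message (`"open"`, say) is ignored or handled depending on which behavior the actor designated while processing the previous message.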

Diagnosing Azure stateful actors

I'm still trying to get my mind around Azure Service Fabric Stateful Actors. So, my (current) problem is best put into an example like this:
I've got a helpdesk system, where each ticket is a stateful actor. The actor knows about the state it's in (posted, dealt with, rejected, ...), can access the associated data and all that.
I find I have made a mistake and a bunch of those 50.000 tickets are in the wrong state. So, I need to
fix the code
publish the solution
fix the data content of a subset of those 50.000 actors.
Now, how can I query the state of those actors, like "give me each actor that is in "rejected" and belongs to a user whose name starts with a german ümlaut"? How can I then patch the state data of those actors?
Do I really have to add a query method to each actor and wake up each single actor? Or is there a way to query those state dictionaries outside of the actors sitting on top of them?
The short answer is yes, in a situation like that you'd have to wake up each single actor (eventually).
If you are already in that state, I think JoshL's suggestion makes sense.
To avoid this sort of situation, you could keep an index dictionary in a stateful service, holding the information you'll want to query on, e.g. the actor id and the status (posted, dealt with, etc.). You then only have to wake up those actors that are relevant.
There are two approaches you can take for that:
Have the stateful service direct the flow of information - be responsible for updating the index dictionary and telling actors what to do (e.g. change status).
Make the actors responsible for notifying the stateful service of state updates (this could be done periodically through reminders, for example).
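The shape of such an index can be sketched independently of Service Fabric. The Python class below is a plain in-memory illustration, not Service Fabric code: in a real stateful service the two dictionaries would be reliable collections, and `TicketIndex`, `update` and `actors_in` are hypothetical names.

```python
from collections import defaultdict


class TicketIndex:
    """Maps actor ids to their last reported status, and back."""

    def __init__(self):
        self._status_by_actor = {}
        self._actors_by_status = defaultdict(set)

    def update(self, actor_id, status):
        # Called whenever a ticket changes state, either by the service
        # itself or by the actor reporting its new status.
        old = self._status_by_actor.get(actor_id)
        if old is not None:
            self._actors_by_status[old].discard(actor_id)
        self._status_by_actor[actor_id] = status
        self._actors_by_status[status].add(actor_id)

    def actors_in(self, status):
        # Only these actors need to be woken up for a repair job.
        return sorted(self._actors_by_status.get(status, ()))


index = TicketIndex()
index.update("ticket-1", "posted")
index.update("ticket-2", "rejected")
index.update("ticket-1", "rejected")
index.update("ticket-3", "dealt with")
to_patch = index.actors_in("rejected")
```

Any extra attribute you expect to query on (such as the owning user's name) would have to be stored in the index as well, since the index is the only thing that can be queried without activating actors.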
Perhaps you could consider overriding OnActivateAsync in your actor class(es) and implementing the cleanup logic there, then upgrading your SF application?
This would prevent the need to iterate every single instance externally (as the SF runtime will call OnActivateAsync for you), and would ensure that the logic runs for each instance only if/when needed (only upon next activation for a given instance).
more on Actor activate/deactivate/etc.
Best of luck!

Actors at a very deep level, what makes them unique?

I'm not strong in multithreaded programming. I've been working with Akka quite a lot, but I still don't understand what makes actors and Akka so neat, convenient, safe and so forth. I know that they receive messages and that an actor can receive only one message at a time. But so what? What makes them thread-safe?
After all, actors are just a library built on system threads, which involves shared mutable state, and they need to deal with it somehow.
So the question is, how do actors work at a very deep level? I'd also appreciate any link about it.
Björn's answer hits the important point: The actor model encapsulates state and any logic that operates on that state in an actor. The only way to change state from the outside is to send the actor a message.
Because only the actor can modify the state, and because it processes messages serially, there's no possibility of concurrent modification. No race conditions.
Ryan Tanner (disclosure: Ryan works at my company) has a great blog post about what makes actors special: http://blog.goconspire.com/post/64274254800/akka-at-conspire-part-2-why-we-like-actors.
You seem to be mixing up the Actor Model with one concrete implementation of it in Akka.
The code inside a single actor runs on only one thread at any given time, processing a single message at a time. If your actors don't share mutable objects with each other and only communicate via immutable messages, then the code is free of the kind of race where you inadvertently change the same object/variable from multiple threads concurrently.
How the implementation runs your actors on top of multiple threads should be irrelevant. But you are of course free to look at the Akka source code.
