Actors at a very deep level, what makes them unique? - multithreading

I'm not strong in multithreaded programming. I've worked with Akka a fair amount, but I still don't understand what makes actors and Akka so neat, convenient, safe and so forth. I know that actors receive messages and that an actor can process only one message at a time. But what of it? What makes them thread-safe?
After all, actors are just a library built on system threads, which means shared mutable state is involved somewhere and the library has to deal with it somehow.
So the question is, how do actors work at a very deep level? I'd also appreciate any link about it.

Björn's answer hits the important point: The actor model encapsulates state and any logic that operates on that state in an actor. The only way to change state from the outside is to send the actor a message.
Because only the actor can modify the state, and because it processes messages serially, there's no possibility of concurrent modification. No race conditions.
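To make the "serial processing, no concurrent modification" point concrete, here is a minimal sketch in Python using only the standard library. It is not how Akka is implemented; `CounterActor`, `send`, `stop_and_get` and `run_demo` are hypothetical names chosen for illustration. The idea it tries to show is that the counter is only ever touched by the actor's own thread, so no lock is needed around it, even with many sender threads.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state plus a mailbox drained by one thread."""

    _STOP = object()  # sentinel message that ends the processing loop

    def __init__(self):
        self._count = 0                # state touched only by the actor's own thread
        self._mailbox = queue.Queue()  # unbounded, thread-safe FIFO
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def send(self, message):
        """The only way for outside code to influence the actor's state."""
        self._mailbox.put(message)

    def _run(self):
        # Messages are taken out and handled strictly one at a time.
        while True:
            message = self._mailbox.get()
            if message is self._STOP:
                break
            if message == "increment":
                self._count += 1  # no lock: only this thread ever runs this line

    def stop_and_get(self):
        """Stop after all queued messages are handled, then return the final count."""
        self.send(self._STOP)
        self._thread.join()
        return self._count


def run_demo(senders=8, messages_per_sender=1000):
    actor = CounterActor()

    def sender():
        for _ in range(messages_per_sender):
            actor.send("increment")

    threads = [threading.Thread(target=sender) for _ in range(senders)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return actor.stop_and_get()
```

Calling `run_demo()` starts eight sender threads, and the count comes out as the number of messages sent. The synchronization has not disappeared; it has moved into the mailbox, which is the one shared structure.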
Ryan Tanner (disclosure: Ryan works at my company) has a great blog post about what makes actors special: http://blog.goconspire.com/post/64274254800/akka-at-conspire-part-2-why-we-like-actors.

You seem to be mixing up the Actor Model with one concrete implementation of it in Akka.
The code inside a single actor is only run on one thread at any given time processing one single message at any given time. If your actors don't share mutable objects between each other and only communicate via immutable messages then the code is free of the kind of races where you inadvertently change the same object/variable from multiple threads concurrently.
How the implementation runs your actors on top of multiple threads should be irrelevant. But you are of course free to look at the Akka source code.

Related

Alternative to Akka for handling tasks sequentially

I have a system which may generate certain events in the lifecycle of a transaction. On every event I need to update a row in a DB and also send out a UI event over a websocket.
One option I have is to implement the event processing (DB and UI) in actors thus avoiding any locking issues - also I can afford minor delays so handling this sequentially will greatly simplify matters.
What are alternative ways of handling this in Scala as I feel maybe Actors might be overkill in this case?
There are those blogs stating that actors should be used for "concurrency with state" - though I would like to see a more appropriate mechanism in order to eliminate this option.
Ultimately, the main unique benefit of using actors is that they are great for encapsulating mutable variables so that you avoid race conditions.
To do what you describe, you could just use classic threads. In your (possibly simplified) description, I don't see the potential for deadlocks. If you want something a bit more composable, e.g. a sequence of asynchronous tasks, you can use Scala's Futures.
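As a rough sketch of the "just handle it sequentially without actors" idea, the following Python example pushes every event through a single-worker executor from the standard library. This is only an analogue of the Scala Futures suggestion, not Scala code, and `handle_events`, `update_db` and `notify_ui` are hypothetical stand-ins for the real DB update and websocket push.

```python
from concurrent.futures import ThreadPoolExecutor


def handle_events(events):
    """Process each event's DB update and UI notification strictly in order."""
    log = []  # only the single worker thread appends to this list

    def update_db(event):
        log.append(("db", event))  # placeholder for the real row update

    def notify_ui(event):
        log.append(("ui", event))  # placeholder for the real websocket send

    def handle(event):
        update_db(event)
        notify_ui(event)

    # One worker thread means tasks run one after another, in submission order.
    with ThreadPoolExecutor(max_workers=1) as executor:
        futures = [executor.submit(handle, event) for event in events]
        for future in futures:
            future.result()  # re-raises any exception from the handler
    return log
```

With one worker there is never more than one handler running, so the handlers need no locks among themselves. The cost is that a slow DB call delays every later event, which matches the "I can afford minor delays" assumption in the question.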
Not at all sure if this is applicable for Scala, but there's a great lib for Groovy and Java with several concurrency models. I've myself used the Dataflow Concurrency with great success and can recommend it as a light-weight yet manageable model.
Dataflow Concurrency offers an alternative concurrency model, which is inherently safe and robust. It puts an emphasis on the data and their flow through your processes instead of the actual processes that manipulate the data. Dataflow algorithms relieve developers from dealing with live-locks, race-conditions and make dead-locks deterministic and thus 100% reproducible. If you don’t get dead-locks in tests you won’t get them in production.
There are other models available in the linked GPars library as well.
I wouldn't suggest making the threading yourself, unless you have no other choices.
Addendum
After posting I got interested in the topic and made a few searches. Seems like Akka has direct support for Dataflow model also. Or at least has had in some version.
Actors avoid locking issues because they use queues to interact. You can use threads with (blocking) queues and get the same level of safety. The only advantage of actors over threads is that an actor does not reserve memory for a call stack, so you can have many more actors than threads in the same amount of memory. The downside of the actor model is that a complex algorithm that could be implemented in a single thread may require several actors, which can make the actor implementation harder to follow.

Basic Explanation of Actors in Erlang

I am trying to put together a very basic explanation of actors in Erlang. It is supposed to be as bare-bones as possible, but without leaving out key features of the theory or the Erlang implementation of it. This is my explanation:
The actor model is a mathematical model of concurrent computation that treats actors as the universal primitives of concurrent computation. The actor is a computational entity that, in response to a message it receives, can concurrently (1) send a finite number of messages to other actors, (2) create a finite number of new actors, and (3) designate the behavior to be used for the next message it receives.
In Erlang, each actor is a separate process in the virtual machine, implemented by a function. Processes communicate by sending messages to each other. Every message is explicit, traceable and safe. The messages are received in a mailbox and stored in the order in which they are received. They are stored there until the receiving process takes them out to be read. This is called asynchronous message passing.
What do you guys think? Is it OK? Should I add or change anything? Thanks.
I think you would help yourself if you didn't confuse actors with Erlang processes. You started with Wikipedia's description of the Actor model, only to move seamlessly into writing about Erlang processes as if they were one and the same. The Actor model is a mathematical model which can be implemented in many different ways, including a pure C or C++ low-level implementation. Erlang processes, on the other hand, are a lightweight, preemptively scheduled language feature that lets you run a vast number of processes in parallel much more efficiently than would be possible with native system processes or even threads. It happens that they are modelled after the mathematical model, but that was a design decision based on specific requirements.
I think it would all fit better together if you discussed briefly the Actor model as a mathematical model on its own and only then how it has been implemented in Erlang, pointing out any differences and features specific to Erlang.

When should one use the Actor model?

When should the Actor Model be used?
It certainly doesn't guarantee deadlock-free environment.
Actor A can wait for a message from B while B waits for A.
Also, if an actor has to make sure its message was processed before moving on to its next task, it will have to send a message and wait for a "your message was processed" message instead of the straightforward blocking.
What's the power of the model?
Given some concurrency problem, what would you look for to decide whether to use actors or not?
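The waiting cycle described in the question (A waits for B while B waits for A) can be reproduced in a few lines. The sketch below is Python with plain threads and queues standing in for actors. `waiting_cycle` and the inbox names are hypothetical, and the timeout is there only so that the example terminates instead of hanging.

```python
import queue
import threading


def waiting_cycle(timeout=0.2):
    """Two 'actors' that each wait for the other to speak first."""
    inbox_a = queue.Queue()
    inbox_b = queue.Queue()
    outcome = {}

    def actor(name, own_inbox, peer_inbox):
        try:
            own_inbox.get(timeout=timeout)  # wait for the peer's message
            peer_inbox.put("reply")         # never reached: nobody sends first
            outcome[name] = "received"
        except queue.Empty:
            outcome[name] = "timed out"     # without the timeout this would hang

    a = threading.Thread(target=actor, args=("A", inbox_a, inbox_b))
    b = threading.Thread(target=actor, args=("B", inbox_b, inbox_a))
    a.start()
    b.start()
    a.join()
    b.join()
    return outcome
```

Both sides report a timeout, because neither ever sends. No lock is involved, so message passing removes data races but not this kind of protocol-level deadlock.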
First I would look to define the problem... is the primary motivation a speedup of a nested for loop or recursion? If so a simple task based approach or parallel loop approach will likely work well for you (rather than actors).
However if you have a more complex system that involves dependencies and coordinating shared state, then an actor approach can help. Specifically through use of actors and message passing semantics you can often avoid using explicit locks to protect shared state by actually making copies of that state (messages) and reacting to them.
You can do this quite easily with the classic synchronization problems like the dining philosophers and the sleeping barber problem. But you can also use the 'actor' to help with more modern patterns, e.g. your facade could be an actor, and your model, view and controller could also be actors that communicate with each other.
Another thing that I've observed is that actor semantics are learnable by most developers and 'safer' than their locked counterparts. This is because they raise the abstraction level and allow you to focus on coordinating access to that data rather than protecting all accesses to the data with locks. As an example, imagine that you have a simple class with a data member. If you choose to place a lock in that class to protect access to that data member then any methods on that class will need to ensure that they are accessing that data member under the lock. This becomes particularly problematic when others (or you) modify the class at a later date, they have to remember to use that lock.
On the other hand if that class becomes an actor and the data member becomes a buffer or port you communicate with via messages, you don't have to remember to take the lock because the semantics are built into the buffer and you will very explicitly know whether you are going to block on that based on the type of the buffer.
-Rick
The usage of Actor is "natural" in at least two cases:
When you can decompose your problem in a set of independent tasks.
When you can decompose your problem in a set of tasks linked by a clear workflow (ie. dataflow programming).
For instance, if you process complex data using a series of filters, it is easy to use a pipeline of actors where each actor receives data from an upstream actor and sends data to a downstream actor.
Of course, this data flow need not be linear, and if a step in your pipeline is slow, you can use a pool of actors doing the same job instead. Another way of solving load-balancing problems would be a demand-driven approach organized as a kind of virtual Kanban system.
Of course, you will need synchronization between actors in almost all interesting cases, but contrary to the classic multi-thread approach, this synchronization is really "concrete". You can imagine people in a factory and the problems they might face (workers run out of work to do, an upstream operation is too fast and intermediate products need a huge storage area, etc.). By analogy, you can then find a solution more easily.
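The filter pipeline described above can be sketched with one thread and one inbox per stage. This is a minimal illustration in Python, not an actor framework. `stage`, `run_pipeline` and the `_DONE` sentinel are hypothetical names, and the two filters (`str.strip`, `str.upper`) are arbitrary placeholders for real processing steps.

```python
import queue
import threading

_DONE = object()  # sentinel that tells a stage the stream has ended


def stage(transform, inbox, outbox):
    """Start a worker that applies `transform` to each item and forwards the result."""

    def run():
        while True:
            item = inbox.get()
            if item is _DONE:
                outbox.put(_DONE)  # propagate shutdown downstream
                return
            outbox.put(transform(item))

    worker = threading.Thread(target=run)
    worker.start()
    return worker


def run_pipeline(items):
    raw, stripped, results = queue.Queue(), queue.Queue(), queue.Queue()
    workers = [
        stage(str.strip, raw, stripped),      # first filter: trim whitespace
        stage(str.upper, stripped, results),  # second filter: upper-case
    ]
    for item in items:
        raw.put(item)
    raw.put(_DONE)
    for worker in workers:
        worker.join()

    output = []
    while True:
        item = results.get()
        if item is _DONE:
            break
        output.append(item)
    return output
```

Each stage only knows its own inbox and the next stage's inbox, which is what makes the factory analogy work. The queues here are unbounded; giving them a maximum size would be one simple way to model the "intermediate products need storage" problem as backpressure.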
I am not an actor expert, but here are my 2 cents on when to use the actor model:
The actor model is not suited to every concurrent application. For instance, if you are building a multithreaded application that has to cope with high concurrency, the actor model was not designed to solve that concurrency problem by itself.
Where actors really come into play is in event-driven applications. For instance, suppose you track what users click in your application in real time. Because actors are stateful, you can use them to process that activity in real time, segregated by user, device or whatever your business requires. So, for example, if some users fall under the actors that recorded clicks on shirts, you can send them a coupon notification.
Other applications where actors come in handy include finance (pricing, fraud detection) and multiplayer gaming.
Actors are asynchronous and concurrent, but the model itself does not guarantee message order or any time limit on when a message will be acted upon (some implementations, such as Akka, do preserve order between a given sender and receiver). Hence atomic transactions cannot be split across actors.
If the application/task involves no mutable state then Actors are overkill as Actor frameworks go to great lengths to avoid race conditions.

Actor based development - implementation questions

It's my first message here and I'm glad to join this community.
It looks like everything is now moving towards multithreaded development. The big players say it won't take long to reach hundreds of cores.
I've recently read about actor-based development and how wonderful message passing is for handling concurrent programming. I also read that messages can be implemented as a form of method call. In that case, a given object is also an actor.
In other words, we no longer call methods arbitrarily. Calls are posted to a queue for later processing. The queue then ensures that an object's state (its variables) isn't modified concurrently, because all messages are serialized.
I understand that this model is quite straightforward to implement (at least an experimental version), and perhaps that's why it is so difficult to find any technical detail.
My question concerns the queues. This is a typical case of multiple producers and one consumer, and I suspect they require some sort of synchronization. Is that true? Is there another solution? I heard they can be implemented as lock-free structures.
I'm not really sure about that. Any comment will be greatly appreciated.
Have a nice day pals
Multiple producers and a single consumer is a great scenario for using Actors, and it doesn't require any synchronization in your own code, because the actor's mailbox takes care of it. In Scala, you generally don't use any shared mutable state when working with Actors. You just send over a copy of whatever data needs processing.
You can read more about Actors in Scala in "Programming Scala", available online for free.
If I understood correctly, agents receive their messages in a mailbox, which behaves like a concurrent queue, so you do not have to care about it. If you want to play with the mailbox directly, you can have a look at this nice article from the great "The busy Java developer's guide to Scala" series.
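To answer the "do the queues need synchronization?" part directly: yes, but it lives inside the mailbox rather than in the code that uses it. The sketch below is a deliberately simple lock-based mailbox in Python. `Mailbox`, `put`, `take` and `demo` are hypothetical names, and real actor runtimes may use lock-free queues instead, which this example does not attempt to show.

```python
import threading
from collections import deque


class Mailbox:
    """Hypothetical multi-producer, single-consumer mailbox guarded by one condition."""

    def __init__(self):
        self._items = deque()
        self._cond = threading.Condition()  # a lock plus wait/notify

    def put(self, message):
        with self._cond:
            self._items.append(message)
            self._cond.notify()  # wake the consumer if it is waiting

    def take(self):
        with self._cond:
            while not self._items:  # loop guards against spurious wake-ups
                self._cond.wait()
            return self._items.popleft()


def demo(producers=4, per_producer=250):
    mailbox = Mailbox()

    def produce(producer_id):
        for i in range(per_producer):
            mailbox.put((producer_id, i))

    threads = [threading.Thread(target=produce, args=(p,)) for p in range(producers)]
    for t in threads:
        t.start()
    # The single consumer: takes exactly as many messages as were sent.
    received = [mailbox.take() for _ in range(producers * per_producer)]
    for t in threads:
        t.join()
    return received
```

Messages from different producers interleave in an unpredictable order, but each producer's own messages arrive in the order it sent them, and none are lost or duplicated.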

What is actor model in context of a programming language?

I've seen it mentioned in several places in contexts like Erlang actor model, Groovy actors, Scala actor model etc. What does this refer to?
I think Wikipedia sums it up best:
The Actor model adopts the philosophy that everything is an actor. This is similar to the everything is an object philosophy used by some object-oriented programming languages, but differs in that object-oriented software is typically executed sequentially, while the Actor model is inherently concurrent. [snip] The Actor model is about the semantics of message passing.
Some time ago I wrote this blog post that explains the basic concepts of the model and builds a basic implementation with JavaScript. From the post:
In the Actor Model, an actor is the foundation on which you build the structure of your application, it has internal state invisible to the outer world and interacts with other actors through asynchronous messages.
If this sounds to you a lot like Object-Oriented Programming (OOP), you are right. The Actor Model can be thought of as OOP with special treatment of messages: they are delivered asynchronously and executed synchronously by the receiver.
Every actor is identified with a unique address by which you send messages to it. When a message is processed, it is matched against the current behavior of the actor; which is nothing more than a function that defines the actions to be taken in reaction to the message. In response to a message, an actor may:
Create more actors.
Send messages to other actors.
Designate internal state to handle the next message.
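The three reactions listed above (create actors, send messages, designate what handles the next message) can be sketched without any threads at all. The Python example below uses a tiny single-threaded scheduler. `System`, `spawn`, `send`, `run` and `switch_demo` are hypothetical names, and a behavior here is assumed to be a function that returns the behavior to use for the next message.

```python
from collections import deque


class System:
    """Hypothetical single-threaded scheduler; real runtimes spread actors over threads."""

    def __init__(self):
        self._behaviors = {}  # address -> current behavior function
        self._mailboxes = {}  # address -> queued messages

    def spawn(self, behavior):
        address = len(self._behaviors)  # unique address for the new actor
        self._behaviors[address] = behavior
        self._mailboxes[address] = deque()
        return address

    def send(self, address, message):
        self._mailboxes[address].append(message)  # asynchronous: just enqueue

    def run(self):
        """Deliver queued messages one at a time until every mailbox is empty."""
        progressed = True
        while progressed:
            progressed = False
            for address in list(self._mailboxes):
                mailbox = self._mailboxes[address]
                if mailbox:
                    message = mailbox.popleft()
                    behavior = self._behaviors[address]
                    # The return value is the behavior for the next message.
                    self._behaviors[address] = behavior(self, address, message)
                    progressed = True


def switch_demo():
    log = []

    def recorder(system, me, message):
        log.append(message)
        return recorder  # keep the same behavior

    def off(out):
        def behavior(system, me, message):
            if message == "flip":
                system.send(out, "on")  # send a message to another actor
                return on(out)          # designate the next behavior
            return behavior
        return behavior

    def on(out):
        def behavior(system, me, message):
            if message == "flip":
                system.send(out, "off")
                return off(out)
            return behavior
        return behavior

    def boot(system, me, message):
        out = system.spawn(recorder)  # create a new actor
        return off(out)

    system = System()
    switch = system.spawn(boot)
    for message in ["start", "flip", "flip", "flip"]:
        system.send(switch, message)
    system.run()
    return log
```

The switch never stores an `is_on` flag. Its state is simply which behavior is currently installed, which is one way to read "designate internal state to handle the next message".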
The main idea of the Actor model is to treat actors as the primitives of concurrent computation. An actor can send messages to other actors, receive and react to messages, and spawn new actors.
The key idea here is to communicate through messages instead of sharing memory between different threads.
It's important to add that actors are asynchronous and concurrent, but they do not guarantee message order or any time limit on when a message will be acted upon (hence atomic transactions cannot be split across actors).
The Actor model is suitable in two main cases:
When you can decompose your solution into a set of independent tasks.
When you can decompose your solution into a set of tasks linked by a clear workflow.
Resources