I am trying to find a way to adequately model the interaction between objects and the threads they are running on. I have tried a sequence diagram but the interactions are not clear enough.
What is the best way to model object and thread interaction?
Related
I am trying to put together a very basic explanation of actors in Erlang. It is supposed to be as bare-bones as possible, but without leaving out key features of the theory or the Erlang implementation of it. This is my explanation:
The actor model is a mathematical model of concurrent computation that treats actors as the universal primitives of concurrent computation. The actor is a computational entity that, in response to a message it receives, can concurrently (1) send a finite number of messages to other actors, (2) create a finite number of new actors, and (3) designate the behavior to be used for the next message it receives.
In Erlang, each actor is a separate process in the virtual machine, implemented by a function. Processes communicate by sending messages to each other. Every message is explicit, traceable and safe. The messages are received in a mailbox and stored in the order in which they are received. They are stored there until the receiving process takes them out to be read. This is called asynchronous message passing.
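To make the mailbox description concrete, here is a minimal sketch in Python rather than Erlang. It assumes one OS thread per "process" and a FIFO queue as the mailbox; the names MailboxActor and demo_mailbox are made up for illustration. It does not model Erlang's selective receive or its lightweight process scheduling.

```python
import queue
import threading


class MailboxActor:
    """Hypothetical stand-in for an Erlang process: one thread plus one FIFO mailbox."""

    _STOP = object()  # private sentinel used to end the receive loop

    def __init__(self, handler):
        self._mailbox = queue.Queue()  # messages wait here in arrival order
        self._handler = handler
        self._thread = threading.Thread(target=self._loop)
        self._thread.start()

    def send(self, message):
        # Asynchronous: the sender enqueues the message and returns immediately.
        self._mailbox.put(message)

    def stop(self):
        # The sentinel is queued behind all earlier messages, so they are handled first.
        self._mailbox.put(self._STOP)
        self._thread.join()

    def _loop(self):
        while True:
            message = self._mailbox.get()  # blocks until a message is available
            if message is self._STOP:
                break
            self._handler(message)


def demo_mailbox():
    received = []
    actor = MailboxActor(received.append)
    for n in range(5):
        actor.send(n)
    actor.stop()
    return received
```

Calling demo_mailbox() returns the messages in the order they were sent, since there is a single sender and the mailbox is FIFO.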
What do you guys think? Is it OK? Should I add or change anything? Thanks.
I think you would help yourself if you didn't confuse actors with Erlang processes. You started with Wikipedia's description of the Actor model, only to seamlessly switch to writing about Erlang processes, as if they were one and the same. The Actor model is a mathematical model which can be implemented in many different ways, including a pure C or C++ low-level implementation. Erlang processes, on the other hand, are a lightweight, preemptively scheduled language feature that lets you run a vast number of processes in parallel much more efficiently than would be possible using native system processes or even threads. It happens that they are modelled after the mathematical model, but that was a design decision based on specific requirements.
I think it would all fit better together if you discussed briefly the Actor model as a mathematical model on its own and only then how it has been implemented in Erlang, pointing out any differences and features specific to Erlang.
I'm not strong in multi-threaded programming. I've worked with Akka quite a bit, but I still don't understand what makes actors and Akka so neat, convenient, safe and so forth. I know that they receive messages, and that an actor can receive only one message at a time. But what of it? What makes them thread-safe?
First of all, actors are just a library built on system threads, which means shared mutable state is involved, and they need to deal with it somehow.
So the question is, how do actors work at a very deep level? I'd also appreciate any link about it.
Björn's answer hits the important point: The actor model encapsulates state and any logic that operates on that state in an actor. The only way to change state from the outside is to send the actor a message.
Because only the actor can modify the state, and because it processes messages serially, there's no possibility of concurrent modification. No race conditions.
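A rough illustration of that point, written in Python instead of Akka and using plain threads and a queue (run_counter_actor and its parameters are hypothetical names): several threads send "increment" messages, but only the actor thread ever touches the counter, so no lock is needed.

```python
import queue
import threading


def run_counter_actor(senders=4, messages_per_sender=1000):
    mailbox = queue.Queue()
    state = {"count": 0}  # written only by the actor thread

    def actor():
        # Messages are handled strictly one at a time, in mailbox order.
        while True:
            message = mailbox.get()
            if message == "stop":
                return
            if message == "increment":
                state["count"] += 1  # no lock: this thread is the only writer

    def sender():
        for _ in range(messages_per_sender):
            mailbox.put("increment")

    actor_thread = threading.Thread(target=actor)
    actor_thread.start()
    sender_threads = [threading.Thread(target=sender) for _ in range(senders)]
    for thread in sender_threads:
        thread.start()
    for thread in sender_threads:
        thread.join()
    mailbox.put("stop")  # queued after every increment has been queued
    actor_thread.join()
    return state["count"]
```

The senders run concurrently, yet the final count always equals senders * messages_per_sender, because the updates themselves are serialized through the mailbox.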
Ryan Tanner (disclosure: Ryan works at my company) has a great blog post about what makes actors special: http://blog.goconspire.com/post/64274254800/akka-at-conspire-part-2-why-we-like-actors.
You seem to be mixing up the Actor Model with one concrete implementation of it in Akka.
The code inside a single actor is only run on one thread at any given time processing one single message at any given time. If your actors don't share mutable objects between each other and only communicate via immutable messages then the code is free of the kind of races where you inadvertently change the same object/variable from multiple threads concurrently.
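As a small sketch of what "immutable messages" can look like (in Python here, not Scala; the Deposit message type is invented for the example), a frozen dataclass rejects any attempt by the receiver to modify a message that the sender may still hold a reference to:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Deposit:
    """Hypothetical message type; frozen=True makes instances read-only."""
    account: str
    amount: int


def try_to_mutate(message):
    # Returns True if the message could be changed, False if it is immutable.
    try:
        message.amount = 0
    except FrozenInstanceError:
        return False
    return True
```

With messages like this, try_to_mutate(Deposit("acc-1", 10)) returns False, so two actors can never race on the contents of a message.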
How the implementation runs your actors on top of multiple threads should be irrelevant. But you are of course free to look at the Akka source code.
I want to draw uml-correct activity diagram representing process of my raytracer.
I know I should use black rectangles to model fork/join. But in my application I spawn N threads doing the same thing (which is not simple and will be modeled via multiple activity elements). How can I draw such an activity diagram without repeating the same thing N times and without knowing the number of threads?
My explanation is poor; an image may help to understand what I want to model with the activity diagram.
You can use the expansion region element.
There is no way I know of to model a fork of N control flows, and I found none in three UML2 books or in the UML 2.4.1 formal specification (http://www.omg.org/spec/UML/2.4.1/Superstructure).
That said, using an expansion region with the 'parallel' keyword, you can fork N object flows, processing N objects in parallel.
I am, however, not fully satisfied with this solution because I suspect that you don't create N threads because you have N objects to process but because you have N processor cores and that each thread processes a lot of frames (or whatever objects that need processing).
You can, of course, work around this by using the processor cores as objects.
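In code, the situation being modeled probably looks something like the sketch below: a collection of work items (image rows here) handed to a pool whose size defaults to the number of cores. This is only an assumption about the raytracer's structure; render_row is a placeholder for the real per-row work, and a thread pool is used for brevity (CPU-bound work in CPython would more typically use a process pool).

```python
import os
from concurrent.futures import ThreadPoolExecutor


def render_row(y, width=4):
    # Placeholder for the real per-row ray tracing work.
    return [(x + y) % 256 for x in range(width)]


def render_image(height=8, workers=None):
    # One worker per core by default; the rows are the "N objects" of the region.
    workers = workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands rows to whichever worker is free and keeps the input order.
        return list(pool.map(render_row, range(height)))
```

The expansion region corresponds to the pool.map call: its input collection is the rows, and the number of worker threads is a separate deployment detail.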
When should the Actor Model be used?
It certainly doesn't guarantee a deadlock-free environment.
Actor A can wait for a message from B while B waits for A.
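A minimal sketch of that mutual wait, using Python threads and queues as stand-ins for actors (all names are hypothetical). A timeout is added only so that the demonstration terminates; without it both would block forever.

```python
import queue
import threading


def wait_then_reply(name, own_inbox, peer_inbox, outcome, timeout):
    # Each actor insists on receiving before it sends anything.
    try:
        own_inbox.get(timeout=timeout)
    except queue.Empty:
        outcome[name] = "timed out"
        return
    peer_inbox.put("reply from " + name)
    outcome[name] = "replied"


def demo_mutual_wait(timeout=0.2):
    inbox_a, inbox_b = queue.Queue(), queue.Queue()
    outcome = {}
    actor_a = threading.Thread(
        target=wait_then_reply, args=("A", inbox_a, inbox_b, outcome, timeout))
    actor_b = threading.Thread(
        target=wait_then_reply, args=("B", inbox_b, inbox_a, outcome, timeout))
    actor_a.start()
    actor_b.start()
    actor_a.join()
    actor_b.join()
    return outcome
```

Neither actor ever sends, so both report "timed out": no locks are involved, but the system still makes no progress.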
Also, if an actor has to make sure its message was processed before moving on to its next task, it will have to send a message and wait for a "your message was processed" message instead of the straightforward blocking.
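The acknowledgement round trip described above could be sketched like this in Python (worker, send_and_wait and the doubling "work" are all invented for illustration): the sender attaches a private reply queue to its message and then waits on it.

```python
import queue
import threading


def worker(inbox):
    while True:
        message = inbox.get()
        if message is None:  # None is used here as a shutdown signal
            return
        payload, reply_to = message
        result = payload * 2  # stand-in for the real processing
        reply_to.put(("processed", result))  # the "your message was processed" reply


def send_and_wait(inbox, payload, timeout=1.0):
    reply_box = queue.Queue(maxsize=1)  # private reply channel for this request
    inbox.put((payload, reply_box))
    # Blocks until the acknowledgement arrives; raises queue.Empty on timeout.
    return reply_box.get(timeout=timeout)


def demo_ack():
    inbox = queue.Queue()
    thread = threading.Thread(target=worker, args=(inbox,))
    thread.start()
    reply = send_and_wait(inbox, 21)
    inbox.put(None)
    thread.join()
    return reply
```

This is more ceremony than a plain blocking call, which is exactly the cost the question is pointing at.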
What's the power of the model?
Given some concurrency problem, what would you look for to decide whether to use actors or not?
First I would look to define the problem... is the primary motivation a speedup of a nested for loop or recursion? If so a simple task based approach or parallel loop approach will likely work well for you (rather than actors).
However if you have a more complex system that involves dependencies and coordinating shared state, then an actor approach can help. Specifically through use of actors and message passing semantics you can often avoid using explicit locks to protect shared state by actually making copies of that state (messages) and reacting to them.
You can do this quite easily with the classic synchronization problems like the dining philosophers and the sleeping barber problem. But you can also use the 'actor' to help with more modern patterns, i.e. your facade could be an actor, and your model, view and controller could also be actors that communicate with each other.
Another thing that I've observed is that actor semantics are learnable by most developers and 'safer' than their locked counterparts. This is because they raise the abstraction level and allow you to focus on coordinating access to the data rather than protecting all accesses to the data with locks. As an example, imagine that you have a simple class with a data member. If you choose to place a lock in that class to protect access to that data member, then every method on that class needs to ensure that it accesses the data member under the lock. This becomes particularly problematic when others (or you) modify the class at a later date: they have to remember to use that lock.
On the other hand if that class becomes an actor and the data member becomes a buffer or port you communicate with via messages, you don't have to remember to take the lock because the semantics are built into the buffer and you will very explicitly know whether you are going to block on that based on the type of the buffer.
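Here is a rough side-by-side of the two designs, sketched in Python with invented class names (LockedBalance, BalanceActor). In the first, every method has to remember the lock; in the second, the data member is only reachable through a message buffer, so there is nothing to forget.

```python
import queue
import threading


class LockedBalance:
    """Lock-based version: every method must take the lock itself."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def deposit(self, amount):
        with self._lock:  # easy to forget when a new method is added later
            self._value += amount

    def value(self):
        with self._lock:
            return self._value


class BalanceActor:
    """Actor-style version: the data is only touched by the actor's own thread."""

    def __init__(self):
        self._inbox = queue.Queue()
        self._value = 0
        self._thread = threading.Thread(target=self._run)
        self._thread.start()

    def deposit(self, amount):
        self._inbox.put(("deposit", amount, None))  # non-blocking send

    def value(self):
        reply = queue.Queue(maxsize=1)
        self._inbox.put(("value", None, reply))
        return reply.get()  # visibly blocking: the caller waits for the answer

    def close(self):
        self._inbox.put(("stop", None, None))
        self._thread.join()

    def _run(self):
        while True:
            kind, amount, reply = self._inbox.get()
            if kind == "stop":
                return
            if kind == "deposit":
                self._value += amount
            elif kind == "value":
                reply.put(self._value)


def hammer(account, threads=4, deposits=500):
    # Deposit 1 from several threads at once, then read the final value.
    def work():
        for _ in range(deposits):
            account.deposit(1)

    workers = [threading.Thread(target=work) for _ in range(threads)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return account.value()
```

Both versions give the same total under hammer(); the difference is in where the synchronization lives and whether a caller can tell which operations block.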
-Rick
The usage of Actor is "natural" in at least two cases:
When you can decompose your problem into a set of independent tasks.
When you can decompose your problem into a set of tasks linked by a clear workflow (i.e. dataflow programming).
For instance, if you process complex data using a series of filters, it is easy to use a pipeline of actors where each actor receives data from an upstream actor and sends data to a downstream actor.
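A linear pipeline of that kind might be sketched as follows in Python, with one thread per filter and queues between them (stage, run_pipeline and the sentinel are hypothetical names, not part of any actor library):

```python
import queue
import threading

_DONE = object()  # sentinel that flows through the pipeline to shut it down


def stage(transform, inbox, outbox):
    # One pipeline actor: receive from upstream, filter, send downstream.
    while True:
        item = inbox.get()
        if item is _DONE:
            outbox.put(_DONE)  # propagate shutdown to the next stage
            return
        outbox.put(transform(item))


def run_pipeline(items, transforms):
    # queues[i] feeds stage i; the last queue collects the final results.
    queues = [queue.Queue() for _ in range(len(transforms) + 1)]
    threads = [
        threading.Thread(target=stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(transforms)
    ]
    for thread in threads:
        thread.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_DONE)

    results = []
    while True:
        item = queues[-1].get()
        if item is _DONE:
            break
        results.append(item)
    for thread in threads:
        thread.join()
    return results
```

For example, run_pipeline([1, 2, 3], [lambda x: x + 1, lambda x: x * 10]) yields the items in their original order, each passed through both filters.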
Of course, this data flow need not be linear, and if a step in your pipeline is slow, you can use a pool of actors doing the same job instead. Another way of solving load balancing problems would be to use a demand-driven approach organized as a kind of virtual Kanban system.
Of course, you will need synchronization between actors in almost all interesting cases, but contrary to the classic multi-thread approach, this synchronization is really "concrete". You can imagine guys in a factory and imagine the possible problems (workers run out of jobs to do, upstream operations are too fast and intermediate products need a huge storage space, etc.). By analogy, you can then find a solution more easily.
I am not an actor expert, but here are my 2 cents on when to use the actor model:
The actor model is not suited to every concurrent application. For instance, if you are creating a multi-threaded application that runs under high concurrency, the actor model is not made to solve that concurrency issue.
Where actors really come into play is when you are creating an event-driven application. For instance, say you are tracking what users click in your application in real time. Because actors are stateful, you can use them to process that activity in real time, segregated by user, device or whatever your business requires. So, for example, if some users fall under an actor that tracks clicks on shirts, you can send them a notification with a coupon.
Other applications where actors come in handy are finance (pricing, fraud detection) and multiplayer gaming.
Actors are asynchronous and concurrent but do not guarantee message order or a time limit on when a message will be acted upon. Hence atomic transactions cannot be split across actors.
If the application/task involves no mutable state then Actors are overkill as Actor frameworks go to great lengths to avoid race conditions.
I'm fond of using UML diagrams to describe my software. In the majority of cases the diagrams are for my own use and I use them for more involved pieces of code, interactions etc. where I'll benefit from being able to look back over them in the future.
One thing I've found myself doing a few different ways is diagramming threads. Threads by their nature tend to pop up in the more involved pieces of code and keeping track of them is often a primary purpose of my design documents.
In the past I've used a symbol in a sequence diagram to show the creation of a new thread, but looking back at some diagrams that do this, it's sometimes ambiguous whether the symbol shows an object's lifetime - which is what sequence diagrams are for - or a thread's lifetime. Is there a better approach for incorporating threads into UML?
I managed to produce a diagram that makes sense to me at the time of drawing it. The basic premise is that I've overlaid grey boxes representing class instances with blue boxes representing thread lifetimes. The main thing it lets me keep track of is knowing which thread I will be executing on when I call certain methods.
No doubt there are better and more intuitive ways to do thread and class modeling. The measure of success for me is whether my own diagram still gives me the same level of understanding 6 months down the track.
Activity, Sequence, and State Diagrams are all correct ways of showing thread behavior.
1st: (To vs's comments) There are two sets of diagrams or modeling elements in UML: static structure, as you put it, and behavioral. Any book will help you understand the split, typically in the table of contents; it can also be seen on page 11 of Martin Fowler's UML Distilled, a near de facto standard for beginning UML in my opinion.
2nd: (To sipwiz's question and comment) Activity diagrams are not meant only for modeling business processes; they can be used for that, however, and most examples or simple tutorials approach them from a business standpoint.
Discussion on your options to model threads:
Activity diagrams - Allow for forking and specifying concurrency by using a bar and usage lines. Note that the example at the bottom is not a business process example. Most people can read these (business, management, and developers), though sometimes they can lack detail or get messy.
Sequence/interaction diagrams - In the same post's example, you will see that sequence diagrams allow you to specify parallel behavior within a sequence by boxing the parallelizable behavior with the label "par". This is useful to show the reader which methods can or should be called in parallel, i.e. by different threads. This is the method I would use for detailed, developer-level discussions around building an object.
State diagram - The state chart, just like the activity diagram, allows for concurrency by using a bar and usage lines.
NOTE: These will not model a specific thread and its exact life cycle, as that is part of the instance/run-time level of modeling. If this is what you want, clarify your question and I will respond. I would just model it using one of the above, as no one other than an MDA/UML expert will call you out, and you are not generating a running system.
Also: Please note that further details can be found in most UML books.
Also leveraged: http://www.jguru.com/faq/view.jsp?EID=56322
Traditionally threading has been depicted diagrammatically using Petri Nets. Rob Martin has an article on multithreading in UML which you may find useful.
Update- just remembered you can represent threads with forks in activity diagrams- I've managed to find something that explains this.
It is very hard to find any free tutorials for Petri Nets, however I know Petri Nets are good for modeling concurrency, so I Google'd "producer-consumer Petri Nets" (my favourite threading thing) and found this.
I've also found some slides that show Petri Nets modeling a Semaphore.
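For readers who have not met Petri nets, the semaphore model is small enough to simulate directly. The following is my own minimal sketch in Python, not taken from the slides: a marking maps each place to its token count, and a transition is a pair of (input places, output places). The place and transition names are invented.

```python
def enabled(marking, transition):
    # A transition may fire only if every input place holds enough tokens.
    inputs, _ = transition
    return all(marking[place] >= count for place, count in inputs.items())


def fire(marking, transition):
    # Firing consumes tokens from the inputs and produces tokens on the outputs.
    inputs, outputs = transition
    if not enabled(marking, transition):
        raise ValueError("transition is not enabled")
    new_marking = dict(marking)
    for place, count in inputs.items():
        new_marking[place] -= count
    for place, count in outputs.items():
        new_marking[place] = new_marking.get(place, 0) + count
    return new_marking


# Two threads (a and b) share one semaphore token.
INITIAL = {"sem": 1, "idle_a": 1, "crit_a": 0, "idle_b": 1, "crit_b": 0}
ENTER_A = ({"sem": 1, "idle_a": 1}, {"crit_a": 1})
LEAVE_A = ({"crit_a": 1}, {"sem": 1, "idle_a": 1})
ENTER_B = ({"sem": 1, "idle_b": 1}, {"crit_b": 1})
LEAVE_B = ({"crit_b": 1}, {"sem": 1, "idle_b": 1})
```

Once ENTER_A has fired, the "sem" place is empty, so ENTER_B is not enabled until LEAVE_A puts the token back. That is the mutual exclusion property, read straight off the net.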
UML activity diagrams have fork and join elements to show parallel flow of logic.
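In code, a fork bar followed by a join bar corresponds roughly to starting a set of threads and then waiting for all of them. A small Python sketch (fork_join is a made-up helper name):

```python
import threading


def fork_join(branches):
    # branches: a list of zero-argument callables, one per parallel flow.
    results = [None] * len(branches)

    def run(index, branch):
        results[index] = branch()  # each thread writes only its own slot

    threads = [
        threading.Thread(target=run, args=(index, branch))
        for index, branch in enumerate(branches)
    ]
    for thread in threads:
        thread.start()  # fork: all flows become active
    for thread in threads:
        thread.join()  # join: continue only when every flow has finished
    return results
```

Whatever follows the call to fork_join corresponds to the single flow leaving the join bar.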
I don't know of a way, but using a sequence diagram does not seem entirely inappropriate, considering that a thread is in many languages implemented as a Thread (or similar) class.
The most UML-compatible way would probably be to add an annotation of some sort indicating that the 'object' represents a thread.
The UML is defined by the UML Superstructure, you can find it here http://www.omg.org/spec/UML.
If you read the specification you find that a UML class can be active. An Active Class is a class with the meta-attribute isActive set to true. It is also depicted differently.
An instance of an active class automatically executes a "classifier behavior". As for any behavior, you can define it by means of an activity in which you wait for asynchronous signals (AcceptEventActions) and invoke methods (CallOperationAction) or other behaviors (CallBehaviorActions). That is how active objects are modeled in UML. You just have to read the UML specification.
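As a loose analogy in code (my own mapping, not something the specification prescribes), an active class can be thought of as a class whose instances own a thread, with run() playing the role of the classifier behavior. ActiveLogger and its signal names are hypothetical.

```python
import queue
import threading


class ActiveLogger(threading.Thread):
    """Hypothetical active class: each instance owns its own thread of control."""

    def __init__(self):
        super().__init__()
        self._signals = queue.Queue()
        self.lines = []

    def signal(self, name, payload=None):
        # Asynchronous signal send; callable from any other object.
        self._signals.put((name, payload))

    def run(self):
        # Plays the role of the classifier behavior.
        while True:
            name, payload = self._signals.get()  # comparable to an AcceptEventAction
            if name == "shutdown":
                return
            if name == "log":
                self.write(payload)  # comparable to a CallOperationAction

    def write(self, text):
        self.lines.append(text)


def demo_active_object():
    logger = ActiveLogger()
    logger.start()  # the behavior begins executing once the instance exists
    logger.signal("log", "first")
    logger.signal("log", "second")
    logger.signal("shutdown")
    logger.join()
    return logger.lines
```

demo_active_object() returns the logged lines in the order the signals were sent.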
Activity diagrams will model the internal workings of your software with forks and joins to represent threads. To find out exactly how to model this properly, please see Conrad Bock's excellent series of articles. Here is the article that covers forks and joins, but you should follow the links back to the first article in the series to learn how to properly model using "Colored Petri Nets". It's not how you think (and it's pretty easy)!
There is a new, in-process standard at the OMG for a language called Alf that provides a more convenient surface notation for activity diagrams and is intended for representing code. From the spec:
A primary goal of an action language is to act as the surface notation for specifying executable
behaviors within a wider model that is primarily represented using the usual graphical notations of
UML. For example, this might include methods on the operations of classes or transition effect
behaviors on state machines.
For a programmer, you probably can't get more intuitive than Alf. And it will convert perfectly into UML activity diagrams.
UML's strongest point is depicting static structure. If you use short-lived threads, I also don't see any easy way of diagramming them. Maybe you can find a solution by turning things around a bit: why do you use/need threads? What's the functionality they provide? If they interact with each other and follow some (message passing) API, drawing them as components might make sense.