How to use expansion regions for loops in an activity diagram?

I am having problems designing a proper UML activity diagram.
I've seen similar questions and possible answers:
How to present a loop in activity diagram?
Even with these answers I am having doubts and my own answer doesn't correspond with the UML definitions.
Summary of the problem: loop over the folders and the files in each folder, and act upon each folder and each file depending on its name. The main problem I am having is whether I am using the expansion region correctly.
Many sources say that an expansion region must have an input collection and an output collection. But I don't necessarily have an output collection.
Is it automatically assumed that the Region will iterate over all items until no items are left before it goes into ActivityFinal?
Here is an Enterprise Architect screenshot of what I've done:

First of all, your Expansion Nodes are connected with Control Flows, or your Actions are connected with Object Flows; either of those is impossible (too bad that EA doesn't enforce this rule). That means you need an Action after the Initial Node that provides you with a collection. Also, you are using Activities in an Activity Diagram. Contrary to popular belief (and to EA), this is not allowed. You should use Actions (possibly CallBehaviorActions calling Activities, but that's up to you).
I don't know exactly what you are trying to model. However, here is my suggestion for a valid use of Expansion Regions:
The first Expansion Node creates an object token for each folder in the directory. The second Expansion Node creates an object token for each file in the folder. If you need to access the folder name, you can simply draw an Object Flow into the Region. This will then provide a separate folder token for each execution of the inner Expansion Region.
It is not necessary to model an output Expansion Node if you don't need one. Simply end each execution with a Flow Final Node. After the last execution finishes, the Region will produce a token for the outgoing Control Flow.

Based on your reply, it's just an object you are acting on.
You can just put that object in a global context outside of your expansion region. The input/output parameters stay the same; they are the analogue of procedure parameters. If you only fiddle with the external (global) object, your return value would be some empty collection (and possibly some information that you dealt with the external object).

Related

Axon Framework: send command on aggregate load

We're building a microservices system with Axon Framework 4.1. In our domain, we have a label concept where we can attach labels to other entities. While labels are normally created and managed by the user, some of these labels are "special" and need to be hard-coded, but they need to be present in the event stream as well.
We have a bunch of aggregates that represent entities that can be labeled with these labels. Some of these aggregates will be used frequently, while others might be used infrequently or are even abandoned by the user.
Sometimes we come up with new special labels. We add them to the code, and then we also need to add them to the event stream. What is a good way to do that?
We can create a special command that we need to send when the updated service is started for the first time. It goes through all the labels and adds the ones that aren't in the event stream yet. This has two disadvantages. First, we need to actually send that command, which either requires us to not forget it, or to add some infrastructure for it outside of the code (e.g., in our build pipeline). Also, other services could have booted up faster with the new labels and started sending commands before we fired our special command. The other disadvantage is that this command will target all aggregates, including the abandoned ones, which could be wasteful of resources and be confusing to end users who might see activity in a document they thought was abandoned.
Ideally, we would like to be able to send the command when Axon has just loaded the aggregate. That way we would be certain that the labels are only introduced in aggregates that are actually used. Also, we could wire this up in code and it wouldn't require us to add infrastructure outside of the application and/or remember to do it manually.
Unfortunately, this feature doesn't seem to exist in Axon (yet) 😉.
Are there other (better) ways to achieve this?
I've got an idea which might help you out on this.
If I understand the use case correctly, the "Label" in your system, which users can introduce themselves but for which a couple of hard-coded versions also exist, is an Aggregate.
Based on that assumption, I suggest being smart with the Aggregate Identifier you are using.
The sole thing Axon expects from you is that the Aggregate Identifier is (or can be made into) a String. Typically a UUID is used for Aggregate Identifiers, which is a reasonable start.
You can, however, wrap this UUID in a typed-id object. For your "Label" Aggregate, that would be a LabelId.
That said, let's first go back to verifying whether a given "Label" Aggregate exists within the Event Stream.
The concern you have is rather valid, I think; reading the entire Event Stream to figure out whether a given Aggregate instance exists is too big of a hassle.
However, the EventStore can be queried through two mechanisms:
The Event Stream from a given point in time (e.g. what the TrackingToken mechanism does).
The Event Stream for a given Aggregate instance, based on the Aggregate Identifier.
It's the second option that is far better suited to your scenario.
Just query the EventStore for a given "Label" Aggregate's Identifier. If you receive a non-empty Event Stream, you know it already exists.
Vice versa, if no Events are found, you are certain it's a new "Label" that needs to be introduced.
The crux here is in knowing the "Label's" Aggregate Identifier up front, which circles back to the String storage approach for Aggregate Identifiers using a typed LabelId. What you could do is differentiate in the LabelId object between a custom "Label" (I'd opt for a UUID here) and a hard-coded "Label".
For the latter, you could, for example, have the label name, plus a UUID/counter if desired.
Doing so will ensure that all the Events published from a hard-coded "Label" will have an Aggregate Identifier you can anticipate during start-up.
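To make that concrete, here is a rough sketch of what this could look like against Axon 4's EventStore and CommandGateway APIs; LabelId, CreateLabelCommand and the "label:" prefix are made-up names for the sake of the example, not something Axon prescribes:

    import org.axonframework.commandhandling.gateway.CommandGateway;
    import org.axonframework.eventsourcing.eventstore.EventStore;
    import org.axonframework.modelling.command.TargetAggregateIdentifier;

    import java.util.UUID;

    public class HardCodedLabelInitializer {

        // Hypothetical typed identifier: user-created Labels get a random UUID,
        // hard-coded Labels get a predictable value derived from their name.
        public static final class LabelId {
            private final String value;

            private LabelId(String value) { this.value = value; }

            public static LabelId forUserLabel() {
                return new LabelId(UUID.randomUUID().toString());
            }

            public static LabelId forHardCodedLabel(String labelName) {
                return new LabelId("label:" + labelName);
            }

            @Override
            public String toString() { return value; } // Axon only needs a String representation
        }

        // Hypothetical command from your own command model.
        public static final class CreateLabelCommand {
            @TargetAggregateIdentifier
            private final String labelId;
            private final String labelName;

            public CreateLabelCommand(String labelId, String labelName) {
                this.labelId = labelId;
                this.labelName = labelName;
            }
        }

        private final EventStore eventStore;
        private final CommandGateway commandGateway;

        public HardCodedLabelInitializer(EventStore eventStore, CommandGateway commandGateway) {
            this.eventStore = eventStore;
            this.commandGateway = commandGateway;
        }

        // Run once at start-up for every hard-coded Label.
        public void ensureLabelExists(String labelName) {
            LabelId labelId = LabelId.forHardCodedLabel(labelName);
            // readEvents(aggregateIdentifier) streams only this aggregate's events,
            // so we never scan the entire Event Store.
            boolean exists = eventStore.readEvents(labelId.toString()).hasNext();
            if (!exists) {
                commandGateway.sendAndWait(new CreateLabelCommand(labelId.toString(), labelName));
            }
        }
    }

Because only Labels whose identifier is predictable are checked this way, user-created Labels (with their random UUIDs) are never touched at start-up.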
Hope this is clear; if not, please comment on my response below.

How do I represent a loop in an activity diagram?

I'd like to represent a loop in a UML activity diagram. Here's my situation:
For each folder, I check each document within that folder
For each document I check its content:
If it's invalid (based on keyword searching), do action X and pass to the next document.
When all documents are verified, continue to the next folder.
Can anyone show me what this should look like?
There are 3 different notations that you can use.
As your loop is based on some elements (folders, documents), the most convenient way is to use an Expansion Region (of iterative type).
The second option, which is the preferred choice when you have a guard-based loop, is a Loop Node.
The last possibility is to simply build a correctly structured decision/merge structure.
The benefits of the first two are that they are compact and clear. It is also easy to have nested loops. Neither of these is true of the last option. Yet if you present your diagram to someone who is not familiar with UML (especially if you have no chance to explain the meaning of a particular structure), the last approach is usually the most widely recognized and understood.

Showing data on the UI in the Hexagonal architecture

I'm learning DDD and Hexagonal architecture, and I think I've got the basics. However, there's one thing I'm not sure how to solve: how do I show data to the user?
So, for example, I've got a simple domain with a Worker entity with some functionality (some methods cause the entity to change) and a WorkerRepository so I can persist Workers. I've got an application layer with some commands and a command bus to manipulate the domain (like creating Workers and updating their work hours, persisting the changes), and an infrastructure layer which has the implementation of the WorkerRepository and a GUI application.
In this application I want to show all workers with some of their data, and be able to modify them. How do I show the data?
I could give it a reference to the implementation of WorkerRepository.
I think it's not a good solution because this way I could insert new Workers in the repository skipping the command bus. I want all changes going through the command bus.
Okay then, I'd split the WorkerRepository into WorkerQueryRepository and WorkerCommandRepository (as per CQRS), and give a reference only to the WorkerQueryRepository. It's still not a good solution, because the repo gives back Worker entities, which have methods that change them; and how will these changes be persisted?
Should I create two types of Repositories? One would be used in the domain and application layers, and the other would be used only for providing data to the outside world. The second one wouldn't return full-fledged Worker entities, only WorkerDTOs containing only the data the GUI needs. This way, the GUI has no way to change Workers other than through the command bus.
Is the third approach the right way? Or am I wrong to force all changes to go through the command bus?
Should I create two types of Repositories? One would be used in the domain and application layers, and the other would be used only for providing data to the outside world. The second one wouldn't return full-fledged Worker entities, only WorkerDTOs containing only the data the GUI needs.
That's the CQRS approach; it works pretty well.
Greg Young (2010)
CQRS is simply the creation of two objects where there was previously only one. The separation occurs based upon whether the methods are a command or a query (the same definition that is used by Meyer in Command and Query Separation, a command is any method that mutates state and a query is any method that returns a value).
The current term for the WorkerDTO you propose is "Projection". You'll often have more than one; that is to say, you can have a separate projection for each view of a worker in the GUI. (That has the neat side effect of making the view easier -- it doesn't need to think about the data that it is given, because the data is already formatted usefully).
Another way of thinking of this, is that you have a "write-only" representation (the aggregate) and "read-only" representations (the projections). In both cases, you are reading the current state from the book of record (via the repository), and then using that state to construct the representation you need.
As the read models don't need to be saved, you are probably better off thinking factory, rather than repository, on the read side. (In 2009, Greg Young used "provider", for this same reason.)
Once you've taken the first step of separating the two objects, you can start to address their different use cases independently.
For instance, if you need to scale out read performance, you have the option to replicate the book of record to a bunch of slave copies, and have your projection factory load from the slaves, instead of the master. Or to start exploring whether a different persistence store (key value store, graph database, full text indexer) is more appropriate. Udi Dahan reviews a number of these ideas in CQRS - but different (2015).
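As a rough sketch of the read side (WorkerSummary and WorkerQueryService are invented names, and the shape is deliberately minimal): the GUI receives flat DTOs from a query service, while every change still has to travel through the command bus.

    import java.util.List;

    // Read-side projection: a flat, immutable view of a Worker for the GUI.
    // It exposes no behaviour, so the GUI cannot mutate the domain through it.
    public final class WorkerSummary {
        public final String workerId;
        public final String name;
        public final int weeklyHours;

        public WorkerSummary(String workerId, String name, int weeklyHours) {
            this.workerId = workerId;
            this.name = name;
            this.weeklyHours = weeklyHours;
        }
    }

    // Read-side port the GUI depends on; its implementation can query the book of
    // record (or a cache) directly, bypassing the Worker aggregate entirely.
    interface WorkerQueryService {
        List<WorkerSummary> findAllWorkers();
        WorkerSummary findWorker(String workerId);
    }

    // The write side stays as it is: the GUI sends commands (e.g. a hypothetical
    // UpdateWorkHoursCommand) on the command bus, and only the command handlers
    // load and modify Worker aggregates via the WorkerRepository.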
"read models don't need to be saved" Is not correct.
It is correct; but it isn't perhaps as clear and specific as it could be.
We don't need to create a durable representation of a read model, because all of the information that describes the variance between instances of the read model has already been captured by our writes.
We will often want to cache the read model (or a representation of it), so that we can amortize the work of creating the read model across many queries. And various trade-offs may indicate that the cached representations should be stored durably.
But if a meteor comes along and destroys our cache of read models, we lose a work investment, but we don't lose information.

Convention-based object-graph synchronization

I'm planning my first architecture that uses DTOs. I'm now exploring how to map the modified client-side domain objects back to the DTOs that were originally retrieved from the data service. I must map back to the original object graph, instead of instantiating a new one, in order to use WCF Data Services Client Library's change tracking feature.
To put it in general terms, I need a tool that maps instances and (recursively) their sub-instances (collectively called the "source graph") to existing instances and (recursively) sub-instances (collectively called the "target graph") in a manner that is (nearly) 100% convention, rather than configuration, based.
The specific required functionality that I can think of is:
Replace single-valued properties within the target graph with their corresponding values from the source graph.
Synchronize collection pairs: elements that were added to a collection within the source graph should then be added to the corresponding collection within the target graph; elements removed from a collection within the source graph should then be removed from the corresponding collection within the target graph.
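To make these two rules concrete, here is a hand-rolled sketch of what I mean, written in Java with made-up Order/OrderLine types (the real graphs are WCF Data Services client entities, and a convention-based tool would do the same thing via reflection over matching properties rather than hand-written assignments):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map;
    import java.util.function.Function;
    import java.util.stream.Collectors;

    // Illustration only: scalar properties are copied onto the existing target
    // instances, and collections are reconciled by key instead of being replaced,
    // so the target graph (and its change tracking) stays intact.
    final class GraphSyncSketch {

        static final class OrderLine {
            String id;
            int quantity;
        }

        static final class Order {
            String customerName;
            List<OrderLine> lines = new ArrayList<>();
        }

        static void sync(Order source, Order target) {
            // Rule 1: replace single-valued properties on the existing target instance.
            target.customerName = source.customerName;

            // Rule 2: reconcile the collection pair by key.
            Map<String, OrderLine> targetByKey = target.lines.stream()
                    .collect(Collectors.toMap(l -> l.id, Function.identity()));

            for (OrderLine sourceLine : source.lines) {
                OrderLine existing = targetByKey.remove(sourceLine.id);
                if (existing == null) {
                    target.lines.add(sourceLine);            // added in source -> add to target
                } else {
                    existing.quantity = sourceLine.quantity; // update in place; recurse for deeper graphs
                }
            }
            target.lines.removeAll(targetByKey.values());    // removed in source -> remove from target
        }
    }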
When it comes to mapping DTOs, it seems many people use AutoMapper, so I had assumed this task would be easy using that tool. Upon looking at the details, though, I have doubts it will fit my requirements: what I've found suggests AutoMapper won't handle #1 so well, and likewise that it won't help much with #2 either.
I don't want to try bending AutoMapper to my purposes if it will lead to a lot of configuration code. That would defeat the purpose of using a convention-based tool in the first place. So I'm wondering: what's a better tool for the job?

Workorder management with DDD and ORM

The central tenet of the software I am building is the "workorder".
WorkOrder, as I see it, would be an "aggregate root" that contains basic information about the work order, such as creation date, model/manufacturer, serial number, and purchase order.
In addition to these "value" objects, there are also sub-"entities" or "aggregates" such as:
Sequences
Reworks
Dimensions
QuoteItems
Consumables
None of the above can/should exist without an associated work order. In the existing system they actually occasionally do, but that is because of a lack of transactions or checks in code to ensure integrity. They are orphaned records and are deleted via scheduled clean-up, which is one of the many reasons I am learning more about DDD and ORM to bring our development practices up to speed.
NOTE: This is probably off topic and can likely be skipped in your reply.

Because we are primarily a web-based interface using extJS, with each of the list controls displaying one of the above, I have been reluctant to switch to ORM and DDD. Each list is populated via a controller:action that queries the DB (i.e., the sequences list is populated when the JS control calls a sequence REST URI with a GET command). This GET command invokes a controller that instantiates a sequence object and calls the selectAllForWorkorderID method.

My understanding of ORM is that I would use a repository to query these items. Fine; however, if this sequence object (in DDD parlance) is considered an aggregate of the WorkOrder root, then I must find the workorder first and traverse the sequences through the WorkOrder.

In an AJAX web-based context this feels funny to me, but in a desktop environment or even a standard web-based context this is acceptable, as I would only query the WorkOrder object once each time a WorkOrder item is selected in the master list, not 6 or 8 times for each individual list to be populated.
I can see now that our system actually has several aggregate root objects, work order is just the more complicated of the few:
WorkOrder
Warranties
Repair Orders
These are the primary roots. Warranties are dependent on work order IDs, and Repair Orders can be, but not always.
Ignoring the latter roots, allow me to focus solely on WorkOrder.
When I begin to examine the existing models and try to determine what is business logic and/or application logic, I am slightly confused about what goes into a "service" versus an "aggregate root".
Consider one such method in the current model:
createWorkOrderFromRpi.
RPIs are approved documents that act as templates for WorkOrders; they dictate what sequences and order of execution "can" be performed, the dimensions, the list of consumables, etc. This is a separate system altogether, and I believe it would best be described as a "module" in DDD nomenclature.
This method has to query the RPI system and obtain the work order header details, sequence list, consumables, etc.
Once it has this data it calls the associated objects and methods:
WorkOrder.Create(Header Details)
Sequence.Create(Sequence Details) - Done in loop (1:m)
Consumable.Create(Consumable Details) - Done in loop (1:m)
In following DDD, I am tempted to have the WorkOrder "aggregate root" provide a method with an identical signature; however, I am reluctant to do so.
I believe each of the "entities" that are aggregates of WorkOrder fits the description and should never be exposed to anything outside of the "root" unless traversed through the root itself. There may be cases where this is not the case. On second thought, the interface only ever exposes consumables, sequences and such when a work order is selected, which would imply a work order must be loaded anyway?!
There are some essential business rules which this method must perform:
A work order with an identical serial number must not already be active in the system (unless archived), unless it is on sub-contract, in which case do not create a new work order but receive a repair order for this work order instead.
There are a few more "rules" but I will exclude them for the sake of brevity.
The individual entities perform micro business validations; for example, some fields, such as the serial number, have a specific format, as do part numbers and purchase order numbers.
My primary question or concern, is given the above description, would this method best be implemented in an "aggregate root" or "service"?
UPDATE | One final question: if an aggregate root is the proper concept and I need access to the sequences so that I may update a field, I would access them conceptually (ignore the syntax) like:
WorkOrder.Sequences(0).moveToNext()
This method would be implemented in the Sequence "entity", which makes sense. But where does the division between technical details and business logic lie? For example, to move a work order from one sequence to the next, we update three timestamps per sequence:
date_entered
date_started
date_finished
When the last timestamp is set, the next sequence's date_entered is set to the same time as the previous sequence's date_finished, and the system knows this is now the active sequence. That's a technical matter.
But a business rule or constraint would be:
Don't move work order if moved into history
Don't move work order if in rework
Don't move work order if in subcon
These are rules which I would love to keep separate and distinct, so as to make it easy for me to translate them into English in the form of a specs document which I could present to management as a living document and proof of functionality. I was kind of hoping that is what DDD would enforce/promote in a clean manner. Is this a requirement handled independently of DDD? Is this where CQS comes in, separating business rules from technical matters which are of zero relevance to stakeholders?
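To make the distinction concrete, here is roughly how I picture it (hypothetical names and simplified state, sketched only to show where I imagine the split between the guard rules and the timestamp mechanics):

    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.List;

    public class WorkOrder {

        enum Status { ACTIVE, HISTORY, REWORK, SUBCON }

        static class Sequence {
            Instant dateEntered;
            Instant dateStarted;
            Instant dateFinished;

            void enter(Instant when)  { this.dateEntered = when; }
            void finish(Instant when) { this.dateFinished = when; }
        }

        private Status status = Status.ACTIVE;
        private final List<Sequence> sequences = new ArrayList<>();
        private int activeSequenceIndex = 0;

        // Business rules first, stated in domain language...
        public void moveToNextSequence(Instant now) {
            if (status == Status.HISTORY) throw new IllegalStateException("Work order has been moved into history");
            if (status == Status.REWORK)  throw new IllegalStateException("Work order is in rework");
            if (status == Status.SUBCON)  throw new IllegalStateException("Work order is in subcon");

            // ...then the technical timestamp mechanics.
            Sequence current = sequences.get(activeSequenceIndex);
            current.finish(now);
            if (activeSequenceIndex + 1 < sequences.size()) {
                // next sequence becomes active: its date_entered equals the previous date_finished
                sequences.get(++activeSequenceIndex).enter(now);
            }
        }
    }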
Alex
I think your createWorkOrderFromRpi() method should be on a "Service" rather than the WorkOrder aggregate root. This service method would then call methods on your Repositories or DAOs to create the work order. An Aggregate Root typically combines entities, but in your model I think RPI is a template or specification outside of the work order aggregate root. If RPI is part of the aggregate, then you should put the method on the repository directly and call it wherever needed, as a repository is a business object in DDD also.
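A rough sketch of what I mean (all names and signatures are invented for the example, not taken from your code base):

    import java.util.List;

    // Sketch only: an application service orchestrates the RPI lookup and the
    // creation of the WorkOrder aggregate; the duplicate-serial-number rule from
    // the question stays a domain/repository concern.
    public class WorkOrderService {

        private final RpiRepository rpiRepository;             // gateway to the separate RPI module
        private final WorkOrderRepository workOrderRepository;

        public WorkOrderService(RpiRepository rpiRepository, WorkOrderRepository workOrderRepository) {
            this.rpiRepository = rpiRepository;
            this.workOrderRepository = workOrderRepository;
        }

        public WorkOrder createWorkOrderFromRpi(String rpiNumber) {
            Rpi rpi = rpiRepository.findByNumber(rpiNumber);

            // Business rule: no active work order may already exist for this serial
            // number (the archived / sub-contract cases are omitted for brevity).
            if (workOrderRepository.existsActiveWithSerialNumber(rpi.serialNumber())) {
                throw new IllegalStateException("An active work order already exists for this serial number");
            }

            // The aggregate root creates itself and its dependent entities.
            WorkOrder workOrder = WorkOrder.create(rpi.headerDetails());
            rpi.sequenceDetails().forEach(workOrder::addSequence);     // 1:m
            rpi.consumableDetails().forEach(workOrder::addConsumable); // 1:m

            workOrderRepository.save(workOrder);
            return workOrder;
        }

        // Collaborators reduced to the bare minimum needed for the sketch.
        public interface RpiRepository { Rpi findByNumber(String rpiNumber); }

        public interface WorkOrderRepository {
            boolean existsActiveWithSerialNumber(String serialNumber);
            void save(WorkOrder workOrder);
        }

        public interface Rpi {
            String serialNumber();
            String headerDetails();
            List<String> sequenceDetails();
            List<String> consumableDetails();
        }

        public static class WorkOrder {
            public static WorkOrder create(String headerDetails) { return new WorkOrder(); }
            public void addSequence(String details)   { /* creates a Sequence entity under the root */ }
            public void addConsumable(String details) { /* creates a Consumable entity under the root */ }
        }
    }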
On the second question, I believe a WorkOrder Aggregate Root is totally correct for the other "dependent" entities you listed, namely:
Sequences
Reworks
Dimensions
QuoteItems
Consumables
I'm interested to know how you implemented this.
