DDD modeling, interaction between aggregate roots - domain-driven-design

Marked my aggregate roots with 1;2;3. Looks quite nice - almost like grapes.
Thing I dislike is an entity that's marked with red arrow.
Let's imagine that:
AR #1 is company
AR #2 is office
AR #3 is employee
Entity marked with red arrow is named Country
Company sets the rules from which countries it hires employees (on hiring, company.Countries.Contains(employee.Country) must be true)
I somehow see this quite unimportant part of domain (maybe it does not sound like that in this example one), and I would like to avoid promoting Country to aggregate root.
Glossary about aggregate roots says:
Transient references to the internal members can be passed out for use within a single operation only.
So - does introducing something like 'EmployeeCountry', removing reference to company Country and checking if Employee country matches any company country on hiring operation sounds reasonable?
Any other ideas?
How can I get my grapes look like they should?

In this context Country is just a value object, not an entity - much less an aggregate root - so there's no reason to change anything about your design (without more information).
Additionally, note that the warning you cite pertains to internal members of aggregate roots, not aggregates themselves. There's nothing wrong with maintaining references to aggregates in multiple places. Aggregate roots are supposed to encapsulate child objects so that there's a single place to enforce business rules for related objects.
You can see this clearly in several places in Evans' "Domain-Driven Design" (a.k.a., "The Blue Book"). For example, see the diagram on page 127 (in the introduction to aggregate roots), which shows a Car aggregate that has a reference to an Engine aggregate.

Related

Inter-aggregate references must use primary keys?

When I was reading Microservice Patterns, one of the paragraph says that Domain-Driven Design requires aggregate to follow some rules. One of the rule is "inter-aggregate references must use primary keys".
For example, it basically means that a class Book may only have getOwnerUserId() and shouldn't have getOwnerUser().
However, in Eric Evans's Domain-Driven Design, it clearly says:
Objects within the AGGREGATE can hold references to other AGGREGATE roots.
I guess it means that Book can have getOwnerUser().
If my above understandings of these 2 books are correct, is the book "Microservice Patterns" wrong about aggregates? Or is there some variant of Domain-Driven Design that "Microservice Patterns" is referring to? Or, did I miss something?
Both books are saying roughly the same thing using different words. I'll add mine.
An aggregate can hold a reference to other aggregates in the same bounded context. This reference is through an identifier. In many cases an identifier is a primary key (relational artifact) or a document ID (e.g. from a document database like MongoDB). Regardless, in the domain, it's just an "identifier".
It is also possible for aggregates to refer to aggregates in another bounded context. In this case the reference is not just an identifier, but a projection of the "foreign" aggregate into the current bounded context.
Think of a library system. One bounded context could be the checkout system, and another could be about books themselves. A Library Patron aggregate could have references to books within its aggregate; these references would be small objects containing just a few of the books' properties: ID, title, and author perhaps, but not the number of pages, publisher, location in the library, etc.
"Aggregate root" is essentially the DDD way of saying "primary key" (I suspect the reason for not saying "primary key" is that to do so would be bringing something that's more of an infrastructure concern into the domain).
If User is a separate aggregate from Book, Book can only hold a User's ID (assuming that that's the aggregate root for User), not a User.
Since anything outside of the User class can only access a user by ID, however, it's probably better naming to say getUser() vs. getUserId() and have getUser() return a user ID.
"inter-aggregate references must use primary keys"
"primary key" is very RDBMS-specific so identity would be more appropriate.
"Objects within the AGGREGATE can hold references to other AGGREGATE roots."
Can, but generally shouldn't.
Why reference through identity?
An Aggregate Root (AR) is a strong consistency boundary. The natural way for an AR to protect it's invariants (including from violations through concurrency) is to encapsulate it's data in a way that allows it to oversee/detect every change.
When you reference other ARs by object reference rather than identity the consistency boundary becomes blurry which makes the design much harder to reason about.
Here's a (rather silly) example:
We can see that it's not enough anymore to look at the AR's structure to know what's truly part of it's boundary and surely that could lead to issues.
Furthermore, would you know if persons will get deleted if you delete InviteList or if changes made to persons from within InviteList would get persisted when calling save(inviteList)? You'd have to inspect the persistence mappings (assuming an ORM) and the cascade options to know for sure.
Why have direct references?
I'd say the primary reason to allow a direct reference to another AR would be to be pragmatic about queries that are constructed from domain objects. It's generally harder to query without such relationships (e.g. find all InviteList that have an invitee named "Foo") or construct DTOs that must aggregate data from multiple ARs (e.g. InviteListDto with all the invitee names).
However, that's also one of the many reasons CQRS have become so popular these days. If you bypass the domain model for queries entirely (e.g. plain SQL) then you do not have to make concessions in your domain for querying needs.
References
Here's a sample from the IDDD book by Vaugh Vernon where he talks about that very quote from Evans.

What is an Aggregate Root?

No, it is not a duplication question.
I have red many sources on the subject, but still I feel like I don't fully understand it.
This is the information I have so far (from multiple sources, be it articles, videos, etc...) about what is an Aggregate and Aggregate Root:
Aggregate is a collection of multiple Value Objects\Entity references and rules.
An Aggregate is always a command model (meant to change business state).
An Aggregate represents a single unit of (database - because essentialy the changes will be persisted) work, meaning it has to be consistent.
The Aggregate Root is the interface to the external world.
An Aggregate Root must have a globally unique identifier within the system
DDD suggests to have a Repository per Aggregate Root
A simple object from an aggregate can't be changed without its AR(Aggregate Root) knowing it
So with all that in mind, lets get to the part where I get confused:
in this site it says
The Aggregate Root is the interface to the external world. All interaction with an Aggregate is via the Aggregate Root. As such, an Aggregate Root MUST have a globally unique identifier within the system. Other Entites that are present in the Aggregate but are not Aggregate Roots require only a locally unique identifier, that is, an Id that is unique within the Aggregate.
But then, in this example I can see that an Aggregate Root is implemented by a static class called Transfer that acts as an Aggregate and a static function inside called TransferedRegistered that acts as an AR.
So the questions are:
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that its a function. what does have a globaly unique identifier is the Domain Event that this function produces.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function of the Aggregate class itself?
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier), then how can we interact with this Aggregate? the first article clearly stated that all interaction with an Aggregate is by the AR, if the AR is an event, then we can do nothing but react on it.
Is it right to say that the aggregate has two main jobs:
Apply the needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be raised in a Domain Event from the AR
Please correct me on any of the bullet points in the beginning if some/all of them are wrong is some way or another and feel free to add more of them if I have missed any!
Thanks for clarifying things out!
I feel like I don't fully understand it.
That's not your fault. The literature sucks.
As best I can tell, the core ideas of implementing solutions using domain driven design came out of the world of Java circa 2003. So the patterns described by Evans in chapters 5 and six of the blue book were understood to be object oriented (in the Java sense) domain modeling done right.
Chapter 6, which discusses the aggregate pattern, is specifically about life cycle management; how do you create new entities in the domain model, how does the application find the right entity to interact with, and so on.
And so we have Factories, that allow you to create instances of domain entities, and Repositories, that provide an abstraction for retrieving a reference to a domain entity.
But there's a third riddle, which is this: what happens when you have some rule in your domain that requires synchronization between two entities in the domain? If you allow applications to talk to the entities in an uncoordinated fashion, then you may end up with inconsistencies in the data.
So the aggregate pattern is an answer to that; we organize the coordinated entities into graphs. With respect to change (and storage), the graph of entities becomes a single unit that the application is allowed to interact with.
The notion of the aggregate root is that the interface between the application and the graph should be one of the members of the graph. So the application shares information with the root entity, and then the root entity shares that information with the other members of the aggregate.
The aggregate root, being the entry point into the aggregate, plays the role of a coarse grained lock, ensuring that all of the changes to the aggregate members happen together.
It's not entirely wrong to think of this as a form of encapsulation -- to the application, the aggregate looks like a single entity (the root), with the rest of the complexity of the aggregate being hidden from view.
Now, over the past 15 years, there's been some semantic drift; people trying to adapt the pattern in ways that it better fits their problems, or better fits their preferred designs. So you have to exercise some care in designing how to translate the labels that they are using.
In simple terms an aggregate root (AR) is an entity that has a life-cycle of its own. To me this is the most important point. One AR cannot contain another AR but can reference it by Id or some value object (VO) containing at least the Id of the referenced AR. I tend to prefer to have an AR contain only other VOs instead of entities (YMMV). To this end the AR is responsible for consistency and variants w.r.t. the AR. Each VO can have its own invariants such as an EMailAddress requiring a valid e-mail format. Even if one were to call contained classes entities I will call that semantics since one could get the same thing done with a VO. A repository is responsible for AR persistence.
The example implementation you linked to is not something I would do or recommend. I followed some of the comments and I too, as one commenter alluded to, would rather use a domain service to perform something like a Transfer between two accounts. The registration of the transfer is not something that may necessarily be permitted and, as such, the domain service would be required to ensure the validity of the transfer. In fact, the registration of a transfer request would probably be a Journal in an accounting sense as that is my experience. Once the journal is approved it may attempt the actual transfer.
At some point in my DDD journey I thought that there has to be something wrong since it shouldn't be so difficult to understand aggregates. There are many opinions and interpretations w.r.t. to DDD and aggregates which is why it can get confusing. The other aspect is, in IMHO, that there is a fair amount of design involved that requires some creativity and which is based on an understanding of the domain itself. Creativity cannot be taught and design falls into the realm of tacit knowledge. The popular example of tacit knowledge is learning to ride a bike. Now, we can read all we want about how to ride a bike and it may or may not help much. Once we are on the bike and we teach ourselves to balance then we can make progress. Then there are people who end up doing absolutely crazy things on a bike and even if I read how to I don't think that I'll try :)
Keep practicing and modelling until it starts to make sense or until you feel comfortable with the model. If I recall correctly Eric Evans mentions in the Blue Book that it may take a couple of designs to get the model closer to what we need.
Keep in mind that Mike Mogosanu is using a event sourcing approach but in any case (without ES) his approach is very good to avoid unwanted artifacts in mainstream OOP languages.
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that
its a function. what does have a globaly unique identifier is the
Domain Event that this function produces.
TransferNumber acts as natural unique ID; there is also a GUID to avoid the need a full Value Object in some cases.
There is no unique ID state in the computer memory because it is an argument but think about it; why you want a globaly unique ID? It is just to locate the root element and its (non unique ID) childrens for persistence purposes (find, modify or delete it).
Order A has 2 order lines (1 and 2) while Order B has 4 order lines (1,2,3,4); the unique identifier of order lines is a composition of its ID and the Order ID: A1, B3, etc. It is just like relational schemas in relational databases.
So you need that ID just for persistence and the element that goes to persistence is a domain event expressing the changes; all the changes needed to keep consistency, so if you persist the domain event using the global unique ID to find in persistence what you have to modify the system will be in a consistent state.
You could do
var newTransfer = New Transfer(TransferNumber); //newTransfer is now an AG with a global unique ID
var changes = t.RegisterTransfer(Debit debit, Credit credit)
persistence.applyChanges(changes);
but what is the point of instantiate a object to create state in the computer memory if you are not going to do more than one thing with this object? It is pointless and most of OOP detractors use this kind of bad OOP design to criticize OOP and lean to functional programming.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function
of the Aggregate class itself?
It is the function itself. You can read in the post:
AR is a role , and the function is the implementation.
An Aggregate represents a single unit of work, meaning it has to be consistent. You can see how the function honors this. It is a single unit of work that keeps the system in a consistent state.
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier),
then how can we interact with this Aggregate? the first article
clearly stated that all interaction with an Aggregate is by the AR, if
the AR is an event, then we can do nothing but react on it.
Answered above because the domain event is not the AR.
4 Is it right to say that the aggregate has two main jobs: Apply the
needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be
raised in a Domain Event from the AR
Yes; again, you can see how the static function honors this.
You could try to contat Mike Mogosanu. I am sure he could explain his approach better than me.

DDD - Aggregates for read-only

If we are working on a sub-domain where we're only dealing with a read-only scenario, meaning that our entities and value objects will not be changed, does it make sense to create aggregates composed by roots and its children or should each entity of this context map to a single aggregate?
Imagine that we've entity A and entity B.
In a context where modifications are made, we create an aggregate composed by entity A and entity B, where A is the aggregate root (let's say that B can't live without A and there are some invariants involved).
If we move the same entities to a different context where no modifications are made, does it make sense to keep this aggregate or should we create an aggregate for entity A and a different one for entity B?
In 2019, there's fairly large support for the idea that in a read only scenario, you don't bother with the domain model at all.
Just load the data directly into whatever read only data structure makes sense to support the use case.
See also: cqrs.
The first thing is if B cant live without A and there are some invariants involved, to me A is an Aggregate root, with B being an entity that belongs to it.
Aggregate roots represent a real world concept and dont just exist for the convenience of modification. In many of our applications, we don't modify state of our aggregate roots once created - i.e. we in effect have immutable aggregate roots. These would have some logic for design by contract checks/invariant checks etc but they are in effect anaemic as there is no "Update" methods due to its immutability. Since the "blue book" was written by Eric Evans, alot of things have changed, e.g. the concept of NoSql database have become very popular, functional programming concepts have become very influential rising to more advanced DDD style architectures being recommended such as CQRS. So for example, rather than doing updates to a database I can append (i.e. insert) instead. This leads to aggregates no longer having to be "updated". This leads to leaner anaemic types but this is what we want in this context. The issue before with anaemic types was that "update logic" for a given type was put elsewhere in the codebase instead of being put into the type itself. However if you do not require "update logic" in the first place then you dont have that problem!
If for example there is an Order with many OrderItems, we would create an Order aggregate root and an OrderItem entity. Its a very important concept to distill your domain to properly identify what are aggregates, entities and value types.
Then creation of domain services, repositories etc just flows naturally. For example, aggregate roots and repositories are 1 to 1 i.e. in the example above we would have an Order repository and not have an OrderItem repository. That way your main domain concepts are spread throughout your code in a predictable and easy to understand way.
Finally, in your specific question I would not treat them as the same entities. In one context, you seem to need modification logic - in the other they you dont - they are separate domain concepts to me.
In context where modifications are made: A=agg root, B=entity.
In context without modifications: A=agg root (immutable), B=entity(immutable)

DDD: University as an aggregate root

For some time I am dealing with Domain-Driven Design. Unfortunately I have some problems regarding the Aggregate.
Say, I like to model the structure of an university. The university has some departments (faculties) and every department has some classes. There is a rule that every department needs to be unique and so every class in it. For instance the names of the classes needs to be unique. If I understand it right, then "University" seems to be my aggregate root and "department" and "class" are entities within this aggregate.
There is another aggregate root "Professor", because they are globally accessible. They will be assigned to a class. I´m unsure if it is allowed because an aggregate root should only point to another aggregate root and not to its content.
How to handle this?
Appreciate your help,
thanks in advance!
Say, I like to model the structure of an university. The university has some departments (faculties) and every department has some classes. There is a rule that every department needs to be unique and so every class in it. For instance the names of the classes needs to be unique.
Really? why? What's the business value of that rule? What does it cost the business (the university) if there happen to be two classes with the same name. Does that mean the same name across all time, or just during a given semester?
Part of the point of DDD is that the design of the solution requires exploration of the "ubiquitous language" to get a full understanding of the requirement.
In other words, you may be having trouble finding a good fit for this requirement in the design because you haven't yet discovered all of the entities that you need to make it work the way the business experts expect.
Udi Dahan points out that the uniqueness rule may not belong in the domain at all:
Rules that are not part of genuine domain logic do not have to be implemented in the domain model, suggested he, because they do not model the domain.
So if you have a constraint like this, but the constraint isn't a consequence of the domain itself, then the constraint can be correctly implemented elsewhere.
Greg Young has also written about set validation, specifically addressing concerns about eventual consistency.
But broadly, yes -- if you really have a collection of entities, and a domain rules that span multiple elements in the collection, then you need some aggregate that maintains the integrity of the boundary that the collection lives in.
The entities aren't necessarily what you think. For instance, if you need names to be unique, and the rest of the class entity is just along for the ride, then you may be able to simplify the rules by creating a name registry aggregate; Professors reserve names for their classes, and if the reservation is available, then the reserved name can be applied to the class entity.
If your core business really were naming things, with lots of special invariants to consider, you might build out a big model around this. But that's not particularly likely; perhaps you can just slap a table or two into a relational database -- that's a good solution for a set validation problem -- and get on with the valuable part of the project.
There is another aggregate root "Professor", because they are globally accessible. They will be assigned to a class. I´m unsure if it is allowed because an aggregate root should only point to another aggregate root and not to its content.
class.assign(professorId);
is the usual sort of answer here -- you pass around the surrogate key that identifies the aggregate root. Every entity in your domain should have one.
A couple of cautions here: I have found that real world entities (people, in particular) aren't a useful starting point for figuring out what aggregates are for. Primarily, because they end up being representations, primarily, of data where the invariant is enforced outside the domain model.
Also, I've found that starting from the nouns - class, department, professor - tends to put the focus on CRUD, which generally isn't a very interesting problem.
Instead, I recommend thinking about doing something useful -- a use case where there are business rules to enforce, when the business model gets to say "no, the business won't let you do that right now".
Ask yourself these questions:
How many universities will be in your system? If this is only one, it is not your aggregate root.
If you have multiple universities in your system, would be someone working across universities? May be universities are your system tenants?
What happens with a class if some department is dissolved? Will it immediately disappear? I doubt it.
The same as above with university to department relationship
It is not a problem with a Department to hold reference to its classes as a list of value objects that will contain the Class aggregate root id and the class name. The same is valid for departments dealing with their classes.
Vernon's Effective Aggregate Design might help too.
I'm not very experienced in DDD either but here some tips I use to use:
Is it possible to have a Class without a Department assigned? If that is the case then the Department is the aggregate root and Class is another aggregate with a reference to the root, the Department. You can even define a factory method "addClass()" within your Department with the info that a Class needs to be created, so nobody should be allowed to create a Class without a Department.
Why defining a Class a an Aggregate instead of a Value Object? Because Value Objects are distinguished by their properties' value rather than an ID. I would say that even having two Classes with the same name, same students, same info, etc, etc. the business would still want to differentiate each one. It is not the same with a 1 cent coin which with you only care about the value (given by the color, size, weight,...) but you can always replace it with another one with same attributes' value, that is 1 cent. Also assigning another Professor to the class, the class remains the same, it is not immutable as a Value Object should be.
I guess a Professor must be uniquely identified, and he can maybe be assigned to different Classes or even Departments. So to me it is another Aggregate root separated from the department.

Designing a class diagram for a domain model

First, don't think i'm trying to get the job done by someone else, but i'm trying to design a class diagram for a domain model and something I do is probably wrong because I'm stuck, so I just want to get hints about what i'm not doing correctly to continue...
For example, the user needs to search products by categories from a product list. Each category may have subcategories which may have subcategories, etc.
The first diagram I made was this (simplified):
The user also needs to get a tree list of categories which have at least one product.
For example, if this is all the categories tree:
Music instruments
Wind
String
Guitars
Violins
Percussion
Books
Comics
Fiction
Romance
I can't return a tree of Category which have at least one product because I would also get all subCategories, but not each sub category has a product associated to it.
I also can't remove items from the Category.subCategories collection to keep only items which have associated products because it would alter the Category entity, which may be shared elsewhere, this is not what I want.
I thought of doing a copy, but than I would get 2 different instances of the same entity in the same context, isn't it a bad thing ?
So I redesigned to this:
Now I don't get a collection of child categories I don't want with each Category, I only know about its parent category, which is ok.
However, this creates a tree of categories which is navigable only from the bottom to the top, it makes no sense for the client of ProductList who will always need a top -> bottom navigation of categories.
As a solution I think of the diagram below, but i'm not sure it is very good because it kinda dupplicates things, also the CategoryTreeItem does not seems very meaningful in the domain language.
What am I doing wrong ?
This is rather an algorithmic question than a model question. Your first approach is totally ok, unless you were silent about constraints. So you can assign a category or a sub-category to any product. If you assign a sub-category, this means as per this model, the product will also have the parent category. To make it clear I would attach a constraint that tells that a product needs to be assigned to the most finest know category grain. E.g. the guitar products would be assigned to the Guitar category. As more strange instrument like the Stick would get the Strings category (which not would mean its a guitar and a violin but just in the higher category.
Now when you will implement Category you might think of a method to return a collection of assignedInstruments() which for Guitar would return Fender, Alhambra, etc. You might augment this assignedInstruments(levelUp:BOOL) to get also those instruments of the category above.
Generally you must be clear about what the category assignment basically means. If you change the assignment the product will end up in another list.
It depends on the purpose of the diagram. Do you apply a certain software development method that defines the purpose of this diagram in a certain context and the intended readers audience?
Because you talk about a 'domain model', I guess your goal is to provide a kind of conceptual model, i.e. a model of the concepts needed to communicate the application's functionality to end users, testers etc. In that case, the first and the second diagram are both valid, but without the operations (FilterByCategory and GetCategories), because these are not relevant for that audience. The fact that the GUI only displays a subset of the full category tree is usually not expressed in a UML diagram, but in plain text.
On the other hand, if your intention is to provide a technical design for developers, then the third diagram is valid. The developers probably need a class to persist categories in the database ('Category') and a separate class to supply categories to the GUI ('CategoryTreeItem'). You are right that this distinction is not meaningful in the domain language, but in a technical design, it is common to have such additional classes. Please check with the developers if your model is compatible with the programming language and libraries/frameworks they use.
One final remark:
In the first diagram, you specified multiplicity=1 on the parent side. This would mean that every Category has a parent, which is obviously not true. The second diagram has the correct multiplicity: 0..1. The third diagram has an incorrect multiplicity=1 on the composition of CategoryTreeItem.
From my perspective your design is overly complex.
Crafting a domain model around querying needs is usually the wrong approach. Domain models are most useful to express domain behaviors. In other words, to process commands and protect invariants within the correct boundaries.
If your Product Aggregate Root (AR) references a Category AR by id and this relationship is stored in a relationnal DB then you can easily fulfill any of the mentionned querying use cases with a simple DB query. You'd start by gathering a flat representation of the tree which could then be used to construct an in-memory tree.
These queries could be exposed through a ProductQueryService that is part of the application layer, not the domain as those aren't used to enforce domain rules or invariants: I assumed they are used to fullfil reporting or UI display needs. It is there you could have a concept such as ProductCategoryTreeItemDTO for the in-memory representation.
You are also using the wrong terms according to DDD tactical patterns in your diagrams which is very misleading. An AR is an Entity, but an Entity is not necessarily an AR. The Entity term is mostly used to refer to a concept that is uniquely identified within the boundary of it's AR only, but not globally.

Resources