DDD modeling aggregate with few invariants and many fields

DDD modeling aggregate with few invariants and many fields - domain-driven-design

I thinking about modeling aggregates, invariants, data etc. There is common advice to design aggregates to be small. I have problem a with correct splitting domain and simple CRUD.
Let's assume that we have application where we are able to create project and join to it collaborators. There are a lot of informations related with project at the stage of creating (name, description, project_aims, notes, creation date, modified date, collaborators). How to correct design aggregate where there is a rule which check that we can only add 5 collaborators. Taking into consideration that fields name, description, project_aims, notes doesn't really take part in any business rule and there is only requirements that this fields should'nt be empty (those are really invariants?) should those fields really be a part of aggregate?
Is'nt that our real Domain (aggregates, entities, value objects, policies) should hold only data which take part with protecting invariants or help making business decisions?
If so, how to (create) project described above? Should class with all that nonsignificant (from a business point of view) fields be implemented as anemic model outside the Domain and Aggregate root should just have method addCollaborator which protect quantity of collaborators? Is it good idea to save anemic class object using Dao (operates on db table) and for Domain implementation of aggregate, create Repository?
How to add first collaborator during creating project as at the beggining we create anemic class object outside Domain?
Thank you for any help and advice
Papub

"How to correct design aggregate where there is a rule which check that we can only add 5 collaborators"
Project AR most likely maintains a set of collaborators and throws whenever it's size would exceed 5.
"there is only requirements that this fields should'nt be empty (those are really invariants?)"
Yes, these may be simple rules, but are still are invariants.
"should hold only data which take part with protecting invariants or help making business decisions"
This can be done when modeling ARs with EventSourcing, where you'd only materialize the properties needed to check invariants on the AR, while having all data in the full set of events.
"implemented as anemic model outside the Domain and Aggregate root should just have method addCollaborator which protect quantity of collaborators".
You could always attempt to mix CRUD-driven anemia with rich always-valid models, but the anemic/rich model decision is usually consistent for a given Bounded Context (BC), meaning you may have CRUDDy BCs and rich domain model BCs, but rarely both strategies in the same BC.
For instance, perhaps "Project Definition" is a CRUD-driven BC while "Collaboration" isin't. Those BCs may not make any sense, but it's just to give an example.
Furthermore, don't forget DDD tactical patterns are there to allow manage the complexity, they aren't hard rules. If handling a part of your AR through services and another (where there's more meat) with rich behaviors then perhaps that's acceptable. Still, I'd probably prefer CRUDDy behaviors on the ARs themselves like an update method rather than giving up in the anemic direction.

Related

Are titles, labels and other UI-related things should be included in domain model?

I am confused about how to treat strictly UI-related things, that won't be used in the business logic in the domain model: how to properly store them in the database?
If for example I have an aggregate which is an entity and the main purpose of this model is to do something with an important thing, should I include a title in the model even though it does not contribute to the business logic in any way? Does it matter if I want to store the title in the same table I store other data for my entity (e.g. important things)?
#Entity
MyAggregate:
id: ID
title: str
importantThing: ImportantThing
def doSomethingWithImportantThing():
...
And if I don't include a title in the model, then how to properly store it using Repository pattern? If I keep the title within my model my Repository could look like so:
#Repository
MyAggregateRepository:
def create(myAggregate: MyAggregate):
...
What would happen to repository if I remove title from the model? Should it transform like so:
#Repository
MyAggregateRepository:
def create(myAggregate: MyAggregate, title: str):
...

The rule of thumb is to keep only things that are necessary for making decisions and protecting invariants inside the aggregate state. Otherwise, aggregates get polluted by alien concerns and convert to a messy one-to-one representation of an over-growing database table or document.
As any rule, it has exceptions. I don't think it's a good idea going overboard from the start and splitting the entity prematurely.
However, if you feel that things get messy and you can see patterns that a group of fields are used in a group of functions, while another group of fields is solely used in a different set of functions, you might get an idea that your aggregate deserves splitting.
The repository pattern is largely relevant for executing commands. Its main purpose is to handle the aggregate persistence. When implementing queries, consider using CQRS and write queries that you need to write, it doesn't have to be the repository that handles queries. Queries are also idempotent and have no side effect (except the performance), so it is rather safe not to think about the domain model as such when writing queries. It's better to name your queries using the Ubiquitous Language though.

Things that are purely UI-related typically don't belong in the domain unless the domain is related to managing UI-related items, such as in the case of a localisation domain.
Data that belongs in the domain would stay in the domain. For instance, if there is a comment on an AccountTransaction, or some such, then that would be in the language used by the users of the system and not something that one could localize. However, if that transaction has a Type indicator that is either Debit or Credit then you wouldn't want to necessarily use a string representation but rather codify that; even if the "code" is Debit and Credit or Dr/Cr. However, the front-end would use some l10n or i18n mechanism to display the text for the Type in the relevant language.
Hopefully I understood your question correctly.

Keep title within the bounds of your model. There are a few reasons for this.
The utility of title is kept within the bounds of the model itself, since it does not serve any other purpose in the domain layer. It serves as "identity" that is merely local to the model itself, and then gets surfaced in the UI.
The title is not necessary for creating the aggregate, since it has no business logic intent. If it did, it would represent a tighter coupling between the model and the creation of the aggregate, which is typically undesirable.
title seems to be an aggregate invariant that you'd only want the aggregate root to be concerned with, and not a concern from a perspective of external access or creation.
Ultimately, this keeps your design cleaner.

What is an Aggregate Root?

No, it is not a duplication question.
I have red many sources on the subject, but still I feel like I don't fully understand it.
This is the information I have so far (from multiple sources, be it articles, videos, etc...) about what is an Aggregate and Aggregate Root:
Aggregate is a collection of multiple Value Objects\Entity references and rules.
An Aggregate is always a command model (meant to change business state).
An Aggregate represents a single unit of (database - because essentialy the changes will be persisted) work, meaning it has to be consistent.
The Aggregate Root is the interface to the external world.
An Aggregate Root must have a globally unique identifier within the system
DDD suggests to have a Repository per Aggregate Root
A simple object from an aggregate can't be changed without its AR(Aggregate Root) knowing it
So with all that in mind, lets get to the part where I get confused:
in this site it says
The Aggregate Root is the interface to the external world. All interaction with an Aggregate is via the Aggregate Root. As such, an Aggregate Root MUST have a globally unique identifier within the system. Other Entites that are present in the Aggregate but are not Aggregate Roots require only a locally unique identifier, that is, an Id that is unique within the Aggregate.
But then, in this example I can see that an Aggregate Root is implemented by a static class called Transfer that acts as an Aggregate and a static function inside called TransferedRegistered that acts as an AR.
So the questions are:
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that its a function. what does have a globaly unique identifier is the Domain Event that this function produces.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function of the Aggregate class itself?
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier), then how can we interact with this Aggregate? the first article clearly stated that all interaction with an Aggregate is by the AR, if the AR is an event, then we can do nothing but react on it.
Is it right to say that the aggregate has two main jobs:
Apply the needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be raised in a Domain Event from the AR
Please correct me on any of the bullet points in the beginning if some/all of them are wrong is some way or another and feel free to add more of them if I have missed any!
Thanks for clarifying things out!

I feel like I don't fully understand it.
That's not your fault. The literature sucks.
As best I can tell, the core ideas of implementing solutions using domain driven design came out of the world of Java circa 2003. So the patterns described by Evans in chapters 5 and six of the blue book were understood to be object oriented (in the Java sense) domain modeling done right.
Chapter 6, which discusses the aggregate pattern, is specifically about life cycle management; how do you create new entities in the domain model, how does the application find the right entity to interact with, and so on.
And so we have Factories, that allow you to create instances of domain entities, and Repositories, that provide an abstraction for retrieving a reference to a domain entity.
But there's a third riddle, which is this: what happens when you have some rule in your domain that requires synchronization between two entities in the domain? If you allow applications to talk to the entities in an uncoordinated fashion, then you may end up with inconsistencies in the data.
So the aggregate pattern is an answer to that; we organize the coordinated entities into graphs. With respect to change (and storage), the graph of entities becomes a single unit that the application is allowed to interact with.
The notion of the aggregate root is that the interface between the application and the graph should be one of the members of the graph. So the application shares information with the root entity, and then the root entity shares that information with the other members of the aggregate.
The aggregate root, being the entry point into the aggregate, plays the role of a coarse grained lock, ensuring that all of the changes to the aggregate members happen together.
It's not entirely wrong to think of this as a form of encapsulation -- to the application, the aggregate looks like a single entity (the root), with the rest of the complexity of the aggregate being hidden from view.
Now, over the past 15 years, there's been some semantic drift; people trying to adapt the pattern in ways that it better fits their problems, or better fits their preferred designs. So you have to exercise some care in designing how to translate the labels that they are using.

In simple terms an aggregate root (AR) is an entity that has a life-cycle of its own. To me this is the most important point. One AR cannot contain another AR but can reference it by Id or some value object (VO) containing at least the Id of the referenced AR. I tend to prefer to have an AR contain only other VOs instead of entities (YMMV). To this end the AR is responsible for consistency and variants w.r.t. the AR. Each VO can have its own invariants such as an EMailAddress requiring a valid e-mail format. Even if one were to call contained classes entities I will call that semantics since one could get the same thing done with a VO. A repository is responsible for AR persistence.
The example implementation you linked to is not something I would do or recommend. I followed some of the comments and I too, as one commenter alluded to, would rather use a domain service to perform something like a Transfer between two accounts. The registration of the transfer is not something that may necessarily be permitted and, as such, the domain service would be required to ensure the validity of the transfer. In fact, the registration of a transfer request would probably be a Journal in an accounting sense as that is my experience. Once the journal is approved it may attempt the actual transfer.
At some point in my DDD journey I thought that there has to be something wrong since it shouldn't be so difficult to understand aggregates. There are many opinions and interpretations w.r.t. to DDD and aggregates which is why it can get confusing. The other aspect is, in IMHO, that there is a fair amount of design involved that requires some creativity and which is based on an understanding of the domain itself. Creativity cannot be taught and design falls into the realm of tacit knowledge. The popular example of tacit knowledge is learning to ride a bike. Now, we can read all we want about how to ride a bike and it may or may not help much. Once we are on the bike and we teach ourselves to balance then we can make progress. Then there are people who end up doing absolutely crazy things on a bike and even if I read how to I don't think that I'll try :)
Keep practicing and modelling until it starts to make sense or until you feel comfortable with the model. If I recall correctly Eric Evans mentions in the Blue Book that it may take a couple of designs to get the model closer to what we need.

Keep in mind that Mike Mogosanu is using a event sourcing approach but in any case (without ES) his approach is very good to avoid unwanted artifacts in mainstream OOP languages.
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that
its a function. what does have a globaly unique identifier is the
Domain Event that this function produces.
TransferNumber acts as natural unique ID; there is also a GUID to avoid the need a full Value Object in some cases.
There is no unique ID state in the computer memory because it is an argument but think about it; why you want a globaly unique ID? It is just to locate the root element and its (non unique ID) childrens for persistence purposes (find, modify or delete it).
Order A has 2 order lines (1 and 2) while Order B has 4 order lines (1,2,3,4); the unique identifier of order lines is a composition of its ID and the Order ID: A1, B3, etc. It is just like relational schemas in relational databases.
So you need that ID just for persistence and the element that goes to persistence is a domain event expressing the changes; all the changes needed to keep consistency, so if you persist the domain event using the global unique ID to find in persistence what you have to modify the system will be in a consistent state.
You could do
var newTransfer = New Transfer(TransferNumber); //newTransfer is now an AG with a global unique ID
var changes = t.RegisterTransfer(Debit debit, Credit credit)
persistence.applyChanges(changes);
but what is the point of instantiate a object to create state in the computer memory if you are not going to do more than one thing with this object? It is pointless and most of OOP detractors use this kind of bad OOP design to criticize OOP and lean to functional programming.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function
of the Aggregate class itself?
It is the function itself. You can read in the post:
AR is a role , and the function is the implementation.
An Aggregate represents a single unit of work, meaning it has to be consistent. You can see how the function honors this. It is a single unit of work that keeps the system in a consistent state.
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier),
then how can we interact with this Aggregate? the first article
clearly stated that all interaction with an Aggregate is by the AR, if
the AR is an event, then we can do nothing but react on it.
Answered above because the domain event is not the AR.
4 Is it right to say that the aggregate has two main jobs: Apply the
needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be
raised in a Domain Event from the AR
Yes; again, you can see how the static function honors this.
You could try to contat Mike Mogosanu. I am sure he could explain his approach better than me.

DDD - Fictitious Aggregate Root

Sometimes I come to this case when I have a bunch of entity domain models which should be transactionally persisted but there is no logical domain model which could become an aggregate root of all these entity domain models.
Is it a good idea in these cases to have a fictitious aggregate root domain model which will have NO analogical database entity and will not be persisted in the database but will store in itself only logic for transactionally persisting entity domain models ?
P.S. I tought about that because having a database table storing only a single column of aggregate root ids seems wrong to me.

Is it a good idea in these cases to have a fictitious aggregate root domain model which will have NO analogical database entity and will not be persisted in the database but will store in itself only logic for transactionally persisting entity domain models ?
Sort of.
It's perfectly fine to have a PurpleMonkeyDishwasher that joins together composes together the entities that make up your aggregate, so that you can be sure that your data remains consistent and satisfies your domain invariant.
But it's really suspicious that it doesn't have a name. That suggests that you don't really understand the problem that you are modeling.
It's the modeling equivalent of a code smell. There's probably a theme that arranges these entities to be modeled together, exclusive of the others, rather than in some other arrangement. There's probably a noun that your domain experts use when talking about these entities together. Go find it. That's part of the job.

An "aggregate root domain model which will have NO analogical database entity and will not be persisted in the database" is not a "fictitious aggregate"; it is a standard aggregate just like another aggregate that it needs to be persisted. The purpose of an aggregate is to control the changes following domain rules to ensure consistency and invariants.
Sometimes the aggregate is the change (and need to be persisted) but sometimes it is not and the things to be persisted after the change are parts/full entities and/or VOs that changed inside the aggregate and are mapped in persistence at its own without the needed of composing a persistence concept (table/s, document, etc). This is a implementation detail about how you decided to persist your domain data.
First premise of DDD: There is no DataBase. This helps you to not think too biased about trying to mapping persistence concepts in your domain.
Mike in his blog explain it better than me.
The purpose of our aggregate is to control change, not be the change.
Yes, we have data there organized as Value Objects or Entity
references but that’s because it’s the easiest and most maintainable
way to enforce the business rules. We’re not interested in the state
itself, we’re interested in ensuring that the intended changes respect
the rules and for that we’re ‘borrowing’ the domain mindset i.e we
look at things as if WE were part of the business.
An aggregate instance communicates that everything is ok for a
specific business state change to happen. And, yes, we need to persist
the busines state changes. But that doesn’t mean the aggregate itself
needs to be persisted (a possible implementation detail). Remember
that the aggregate is just a construct to organize business rules,
it’s not a meant to be a representation of state.
So, if the aggregate is not the change itself, what is it? The change
is expressed as one or more relevant Domain Events that are generated
by the aggregate. And those need to be recorded (persisted) and
applied (interpreted). When we apply an event we “process” the
business implications of it. This means some value has changed or a
business scenario can be triggered.

Do we need another repo for each entity?

For example take an order entity. It's obvious that order lines don't exist without order. So we have to get them with the help of OrderRepository(throw an order entity). Ok. But what about other things that are loosely coupled with order? Should the customer info be available only from CustomerRepo and bank requisites of the seller available from BankRequisitesRepo, etc.? If it is correct, we should pass all these repositories to our Create Factory method I think.

Yes. In general, each major entity (aggregate root in domain driven design terminology) should have their own repositories. Child entities *like order lines) will generally not need one.
And yes. Define each repository as a service then inject them where needed.
You can even design things such that there is no direct coupling between Order and Customer in terms of an actual database link. This in turn allows customers and orders to live in completely independent databases. Which may or may not be useful for your applications.

You correctly understood that aggregate roots's (AR) child entities shall not have their own repository, unless they are themselves AR's. Having repositories for non-ARs would leave your invariants unprotected.
However you must also understand that entities should usually not be clustered together for convenience or just because the business states that some entity has one or many some other entity.
I strongly recommend that you read Effective Aggregate Design by Vaughn Vernon and this other blog post that Vaughn kindly wrote for a question I asked.
One of the aggregate design rule of thumb stated in Effective Aggregate Design is that you should usually reference other aggregates by identity only.
Therefore, you greatly reduce the number of AR instances needed in other AR's creationnal process since new Order(customer, ...) could become new Order(customerId, ...).
If you still find the need to query other AR's in one AR's creationnal process, then there's nothing wrong in injecting repositories as dependencies, but you should not depend on more than you need (e.g. let the client resolve the real dependencies and pass them directly rather than passing in a strategy allowing to resolve a dependency).

DDD Repository and Entity

I have some big Entity. Entity has propertis "Id", "Status" and others.
I have Repository for this Entity.
I want to change status in one entity.
Should I get whole Entity, change property Status and use save method in Repository or should I use method ChangeStatus(id, newStatus) in Repository?

Probably you don't need a domain model. You could try a transaction script that directly use SQL to update the database.
You need a domain model if and only if you need to hire an expert to understand the business.
Otherwise, it's just expensive buzzwords driven development.
And btw, if you have big entity classes containing data that you don't need during most of operations, then you know that you haven't properly defined context boundaries.
The best definition of bounded context is the one of Evans:
The delimited applicability of a particular model. BOUNDING CONTEXTS gives team members a clear and shared understanding of what has to be consistent and what can develop independently.
That is: you have to split the domain expert knowledge in contexts where each term has an unambiguous meaning and a restricted set of responsibility. If you do so, you'll obtain small types and modules with high cohesion an

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string