DDD: using aggregates inside another aggregates - domain-driven-design

DDD: Can aggregates get other aggregates as parameters?
According to this, its OK to use aggregates inside another aggregates. But its requires to change multiple aggregates at one transaction. So is it truth that this rule can be easily skipped and I can change multiple aggregates at one time (especially in case of Microservice). The only problem that I need to lock whole aggregates? Thx
I have a simple situation: User, Friendship and Friendship request entities. User can be aggregate root.
DDD and Homogeneous Many-to-Many Relationship
But I would not like to use eventual consistency (especially inside on micro service) cause anyways when I handle that event (FriendshipRequestSent) I need to lock another dependant aggregate. And need to handle and write event on error.

So is it truth that this rule can be easily skipped and I can change multiple aggregates at one time (especially in case of Microservice).
Yes, maybe.
The only problem that I need to lock whole aggregates?
No - there is the additional problem that, because you are modifying multiple aggregates (or more precisely, domain entities that belong to multiple aggregates) in the same transaction, you also need to be careful to design your persistent storage so that updates to all of the entities can be committed in the same "transaction".
That is simple enough when, for example, the entities are all stored in a single relational database, and you can use general purpose operations in the relational database to control your writes.
But if you are working with a different kind of data storage, where you cannot easily control the writes to all entities at the same time, then it gets a bit spooky.
In an "ideal" world, we could pretend that all information is local, and storing it is just an implementation detail. In practice, the actual implementations we get to use only approximate this idea, and we have to be mindful of the differences.

Related

DDD: creating multiple aggregates with a shared life-cycle in a single transaction

I'm aware of the general rule that only a single aggregate should be modified per transaction, mostly for concurrency and transactional consistency issues, as far as I'm aware.
I have a use case where I want to create multiple aggregates in a single transaction: a RestaurantManager, a Restaurant, and a Menu. They seem like a single aggregate because their life-cycles begin and end together: it doesn't make sense within the domain to create a RestaurantManager without a Restaurant, or vice versa; the same goes for a Restaurant and a Menu. Further, if the Restaurant or the RestaurantManager is deleted (unregistered), they should all be deleted together.
However, I've split them into separate aggregates because, once created, they are updated separately, maintain their own invariants, and I don't want to load them all into memory just to update one property on the Restaurant, for example.
The only thing that ties them together is their life-cycle.
My question is whether this represents a case where it is okay to go against the "rule" that each transaction should only operate on a single aggregate.
I'd also like to know if I should enforce their shared life-cycle in the domain model by having each aggregate root hold the identifier of the aggregate root it depends on, i.e. by having Restaurant require a MenuId as a constructor parameter, and likewise for Menu and RestaurantId, so that neither can be created without the other. However, this still wouldn't enforce that they should be saved together by the application service anyway, since it could create them all in memory, then only save the Menu, for example.
Your requirement is a pretty normal use case in DDD, IMHO. There are always multiple aggregates working in tandem to support the application, and they are interlinked in their lifecycles. But the modeling concepts still stand true. Let me attempt to explain what your model would look like with the help of a few DDD rules:
Aggregates are transaction boundaries
Aggregates ensure that no business invariants are broken at any point. This means that if you have multiple aggregates strung together as part of one transaction, you have to load all of them into memory for the validation.
This is especially a problem when your application is data-rich and stores data in a database cluster - partitioned, distributed (think Mongo or Elasticsearch). You will have the problem of loaded up data from potentially different clusters as part of a single transaction.
Aggregates are loaded in entirety
Aggregates and their associated data objects are loaded in entirety into memory. This means that unnecessary objects (say the restaurant's schedule for the upcoming month, for example) for the transaction may be loaded into memory. By itself, this is not a problem. But when multiple aggregates get together, the amount of data loaded into memory needs to be considered.
Aggregates refer to each other by their unique identifiers
This one is straightforward and means that each aggregate stores its referenced aggregates by their identifiers instead of enclosing the other aggregate's data within it.
State changes across Aggregates are handled through Domain Events
In cases where you want a state change in one aggregate to have side-effects on other aggregates, you publish a domain event, and a subscriber handles the change on other aggregates in the background. This is how you would want to handle your requirement for cascade deletes.
By following these rules, you are essentially zooming in one single aggregate at a time and ensuring that the complexity remains low. When you string up multiple aggregates, though it is clear and understandable on day 1, eventually, the application tends towards becoming a big ball of mud, as dependencies and invariants start crisscrossing each other.
"only a single aggregate should be modified per transaction"
Contention at creation doesn't matter as much. You can create many ARs in a single transaction without problem because the only other operation that could conflict is another duplicate creation process.
Another reason to avoid involving many ARs in a single transaction is coupling between modules though, but you could always keep things loosely coupled using synchronously dispatched domain events.
As for the deletion, it's probably less problematic to make it eventually consistent. Does it really matter that Restaurant is closed while RestaurantManager remains registered for a short period of time?
The fact you are asking this question tells me your system is not distributed? If your system is running with a single DB server and used by a few people it may be that eventual consistency make things more complex for scalability you don't actually need.
Start simple and refactor as needed, but crossing AR boundaries is not something that should be done consistently or else your boundaries are clearly wrong.
Furthermore, if you want to communicate that a RestaurantManager can't be spawned from nowhere and associated with an invalid RestaurantId by mistake you may want to look at your ubiquitous language for guidance.
e.g.
"A RestaurantManager is registered for a given Restaurant": not sure it truly aligns with your UL, but it's just for the sake of the example.
RestaurantManager manager = restaurant.registerManager(...);
This obviously increases coupling and could affect performance, but it aligns well with the UL and makes it more difficult to misuse the model. Also note that with a single DB, you could enforce referential integrity which takes cares of these uninteresting referential constraints.
As pointed out by #plalx, contention doesn't matter as much when creating aggregates in terms of transactions, since they don't yet exist so can't be involved in contention.
As for enforcing the mutual life cycle of multiple aggregates in the domain, I've come to think that this is the responsibility of the application layer (i.e. an application service, or use case).
Maybe my thinking is closer to Clean or Hexagonal architecture, but I don't think it's possible or even sensible to try and push every single business rule down into the "domain model". The point of the domain model for me is to partition the problem domain into small chunks (aggregates), which encapsulate common business data/operations that change together, but it's the application layer's responsibility to use these aggregates properly in order to achieve the business' end goal (which is the application as a whole), including mediating operations between the aggregates and controlling their life cycles.
As such, I think this stuff belongs in an application service. That being said, frequently updating multiple aggregates in each use case could be a sign of incorrect domain boundaries.

Agregate root vs child methods

I saw many different approaches and I am fairly new to domain-driven design approach. What I am struggling with is to understand one complex (at least for me) thing. I know the whole DDD is complex to understand on first but I am trying to find any resources I can on it.
Example: I have an order and order can have operations. Operations can not be accessed without order and they make no sense without an order. So order entity will be my aggregate root. Operations will be entity too because each operation will have an id (am I right on this one?). Each operation can have subitems (array of strings for example and these can be added or removed from any operation).
Now what I am struggling to understand and what I found everywhere is that every modification should be called and set only through aggregate root... But is it okay to have private methods like setters and getters on the Operation entity itself but these would be called only through the aggregate root (order entity)?
Sorry if I missed something basic, as the whole DDD concept for me is new and I am trying to explore it.
Thanks.
A couple of DDD concepts to arrive at the answer:
Aggregates are Transaction Boundaries.
Aggregates act as gatekeepers for all changes to domain elements enclosed within itself.
Data changes to an Aggregate and its enclosed domain elements are committed atomically. Either everything within the Aggregate stays in sync, or the whole state change operation fails.
The rule also means that one should not access Domain Elements within the Aggregate directly. It would be best if you did not manipulate the domain objects outside the context of the Aggregate.
If Operation is an entity under Order aggregate, then Order is responsible for ensuring operations satisfy the business invariants (a.k.a validations).
Aggregates are loaded in entirety.
Since an Aggregate represents the transaction and consistency boundary of a domain concept, its data is loaded in entirety to guarantee that all Business Invariants are satisfied. Data here means data of all underlying entities and value objects.
If you cannot load the entire data, you cannot guarantee that the change satisfies all business invariants. It may also mean that a data-intensive entity within the Aggregate may need to become an Aggregate itself.
You are protecting the data sanctity and operational consistency of the system if you adhere to these rules. Within the Aggregate itself, how you organize state changes is wholly left to you.
IMHO, I would go with your approach of enclosing all Operation related behaviors, data attributes, and invariants within the Operation entity. Order is responsible for protecting the data within its boundary, but it need not own the methods/logic of doing everything.
You can create state change methods within the Operation entity too, just like you would have done in the Order aggregate, but invoke them from the order object.

DDD : one aggregate root , multiple persistent datasources

In the Guide/eBook: .NET Microservices: Architecture for Containerized .NET Applications (related to the eShopOnContainers) in the chapter "Designing the infrastructure persistence layer" (page 213) is explained in general how an aggregate root can perform CUD operations against a persistent data source.
Two important starting points are mentioned :
An aggregate is ignorant of methods of persistency and infrastructure following the Persistence Ignorance and the Infrastructure Ignorance principles (page 218). An aggregate is determined by the business and not by the infrastructure.
One should only define one repository per aggregate root to maintain transactional consistency between the objects within the aggregate (page 213)
Unfortunately, in all further examples that are mentioned the aggregate root and all underlying objects that fall under it are within one and the same persistent data source.
The pattern then is as follows:
A repository is created containing that aggregate
In this repository a Unit of Work is injected during creation. This Unit of Work contains methods such as SaveChangesAsync, SaveEntitiesAsync, Update
and so on.
In a command, the Unit of Work manages the transactions to
this one data source such as a database or similar.
I want to expand this pattern that the aggregate can write its data over 2 or more physical data sources depending on the underlying object type.
Starting from starting point 1, it is perfectly justified to have a root aggregate and its underlying object to be updated to different data sources depending on the type of underlying object. Examples mentioned are : a Database and an XML file, a database and a NOSQL 'database',a database and a service, a database and an IoT device. Because an aggregate must be ignorant to methods of persistence and infrastructure, to my opinion there is no need to argue about the design of the aggregate. I think nowhere in the book it is written that a aggregate root should persist within one data source.
At the same time, starting point 2 also seems perfectly justified. Because the complete set of objects within the aggregate root is edited, and the successful persistence of the entire package is coordinated from one repository and (preferably) from one Unit of Work.
The question is:
How deals Domain Driven Design if within the aggregate - depending on the type of the underlying object - it is hydrated over different data sources?
Should I use one custom Unit of Work and make the decision where to write to within this UoW ?
I'm aware of the next question , but having studied the code I think it only deals with inheritance of repositories that deal with different data sources, but still serving one data source at the time and that is not what I'm after.
I want to expand this pattern that the aggregate can write its data over 2 or more physical data sources depending on the underlying object type.
Why do you want to do that on purpose?
In most cases, the persistence implementation is chosen to serve the domain, rather than the other way around. So the happy path typically involves choosing a persistence solution that can record the state of the entire aggregate, and storing the entire thing within a single transaction.
So if you find yourself trying to store an aggregate in two different places, you should take a hard careful look at why.
One common answer is that you want to be able to query the aggregate state efficiently. cqrs is a common solution here - rather than persisting the aggregate in two different data stores, you persist it to one and replicate it to another. The queries can run very efficiently against the replica (although there is of course some additional latency between a change to the aggregate and the reflection of that change in the query results).
Another common answer is that you really have two aggregates that reference each other. Nothing wrong with storing two aggregates in different places. You may be better served by making the distinction between the two explicit in your code.
Dan Pritchett
Jimmy Bogard
How deals Domain Driven Design if within the aggregate - depending on the type of the underlying object - it is hydrated over different data sources?
Badly, just like everybody else.

How do you handle an aggregate root with a collection of child entities whose update frequency is different than the root?

We have an aggregate root in our system and is has child entities in a collection. The problem is that the container needs to be updated very frequently, on a transaction basis, and the children entities don't, they in fact hardly ever change, they are more configuration like in nature.
My first reflex was to separate them into two different aggregate roots because our of application requirements. But I was reminded of the cascade delete rule, if we delete the one then the delete should cascade, so their lifetimes are linked.
We stumbled over this problem when we discovered that we have a caching problem. Changes to the children entities (configuration) were not being reflected in the system at runtime because the parent was unaware of the changes (we had them as one aggregate root but someone had created a repository for its children).
The main driver for aggregate boundaries are the invariants of your domain - or in other terms, aggregate boundaries should be consistency boundaries. Things that must change together atomically must be in the same aggregate.
The cascading delete is (with regards to aggregate boundaries) rather a nice-to-have than a rule. You can always enforce the fact that a Parent still lives by requiring one at the place where you load Child entities. With this design, you can make Parent and Child different aggregates, while still enforcing the rule that no "free floating" Child aggregates can be requested. And deleting Child aggregates in response to a deleted Parent is easy if you have domain events in place.
Note: All this is under the assumption that your domain invariants allow separating the aggregates in the first place.
This might be better in a discussion format, rather than a Q&A format. I'd recommend trying the audience at DomainDrivenDesign or DDDCQRS
Are you sure that you have a business requirement to delete data in your domain model? That's really unusual -- in most domain models I've seen, an aggregate will reach an "end of life" state, (example: AccountClosed), but doesn't actually get removed from the system.
A common trap in aggregate design is to think about the structure of the entities. "A has a B" does not necessarily mean that they are part of the same aggregate; the key idea is "A needs to keep B and C consistent". You can think about it like a graph; state B and state C are nodes in the graph, the consistency rules are the edges. If you can't traverse the graph from B to C, then they don't need to be part of the same aggregate, and probably shouldn't be.
My instinct is that caching should be the right answer here. If you are processing millions of transactions per day, and the collection only changes once per month, then simply using a cached value of the collection should produce the right answer most of the time.
In this, I'm influenced by Udi Dahan's essay Race Conditions Don't Exist; by coupling this configuration collection with the rest of the aggregate, you are essentially asserting that changes to the configuration (which are rare) are understood by the business to be happening precisely between two other changes to the aggregate. 3M transactions per day averages 1 per 30ms; are you really scheduling your configuration changes that precisely?
The usual pattern here would be that the consistency rule is removed from the domain model; instead, you monitor for changes that introduce an inconsistency, and mitigate them. That depends upon there being a reasonable way to detect the errors, an efficient way to mitigate them, and a mechanic for keeping the rate under control.
The latter of these would normally be done by having the clients/the application check their local copy of the collection, and making sure the command sent is consistent with that before dispatching the command to the domain model. (Possible questions for your domain experts: how quickly do the configuration changes need to be applied? Do the configuration changes happen when the aggregate is changing frequently or when it is quiet?)
Another possibility might be to change your persistence strategy; if the collection doesn't change often, then there are not a lot of change events related to it. So maybe instead of persisting the aggregate, you look into persisting its history - in other words, using event-sourcing here. Maybe if this aggregate lived in a micro service, you could limit the risk of the change? Hard to say, at a million transactions per day, this aggregate sounds pretty important.

Retrieval of child objects of aggregates in DDD

In DDD root of an aggregate is the only reference to retrieve its child objects. Repository of root of an aggregate is responsible for giving the root object reference only. If I need child objects then need to call a getter method of the aggregate to retrieve the child objects which results in a DB query.
Consider a case where I am retrieving multiple aggregates from DB. So in my case this situation results in multiple DB queries which leads a very slow request. How to avoid this in terms of DDD. For persisting I came across a pattern called Unit Of Work. Is there any pattern for the search which resolves my problem or any other way to do this.
First of all, 95% of problems are solved by your ORM (if you happen to use relational database).
Aggregate root repository should (in most cases) return a fully loaded object with all child objects (entities). Lazy loading children should be an exception, not a rule.
Another thing is, you should avoid loading and persisting multiple aggregates at a time. Try repartitioning you domain so that each user interaction deals with only one aggregate.
And consider a document database solution. It really makes sanes to store whole aggregates as documents in a doc database.
Okey it seems like you have a scenario where you in a single use case want to read from several AR and also savee their state into DB. Is the read operation taking to long? or is it both read and write that takes time?
Your domain model and Aggregate roots should be partly defined through interation from use cases. What I'm saying is that, the model should be designed so it suits your clients needs. This scenario seems not like one that fits your model well.
Reports or other operations that uses a large data view should be bypasses the domain model. Don't use DDD for reports etc. Just do a fast data access.
Second. Unit of work is one way to go, if you want all aggregates to participate in a transaction.
Third. I would say, Use Lazy loading, but some use cases that need performance boost you can do a loading strategy which means you let the root load some child collections without having sql-sub-selects firing...
look at this article http://weblogs.asp.net/fredriknormen/archive/2010/07/25/loading-strategy-for-entity-framework-4-0.aspx (even its for EF pattern works well for NH ORM)
Then at last you can always provide db indexes, caching etc to boost perfomance, but given the scenario info, you have takensome kind of wrong design desicion. I don't havee all the facts but maybe some use cases aren't suitable for
I find DDD excellent when it comes to any kind of write operation. For Querying data instead, it only poses unnecessary restrictions.
I would strongly recommend using CQRS as general architecture pattern. This would allow you to create specific Query Models for your Views and leave DDD for input validation and Command execution.

Resources