In our domain, AggregateA needs to access data from multiple other aggregates. As far as I know, I cannot directly reach the root of another aggregate.
In this case, what is the practical way of getting the data for AggregateA?
Should I build a middle-layer repository and persist the data AggregateA needs?
Or should I use domain events to ask other aggregates for data?
Business logic in one aggregate should never depend on state from another aggregate, so if yours does, you may need to redesign your aggregate boundaries. But depending on your project, it may be easier to "lift" your transaction boundary from what you call an "aggregate" and assign it to a larger scope.
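In practice, when AggregateA only needs a read-only snapshot of other aggregates' data (not a consistency guarantee), a common approach is to let the application service fetch that data and pass plain values into AggregateA. A minimal sketch, with all type and member names invented for illustration:

using System;

// All types here are hypothetical, invented for this sketch.
public record PricingInfo(decimal BasePrice, decimal DiscountRate);

public interface IAggregateARepository
{
    AggregateA GetById(Guid id);
    void Save(AggregateA aggregate);
}

public interface IPricingRepository
{
    PricingInfo GetById(Guid id);
}

public class AggregateA
{
    public Guid Id { get; init; }
    public decimal Total { get; private set; }

    // The aggregate receives plain values, never another aggregate root.
    public void Reprice(decimal basePrice, decimal discountRate)
    {
        if (discountRate < 0 || discountRate > 1)
            throw new ArgumentOutOfRangeException(nameof(discountRate));
        Total = basePrice * (1 - discountRate);
    }
}

public class RepriceService // application service
{
    private readonly IAggregateARepository _aggregates;
    private readonly IPricingRepository _pricing;

    public RepriceService(IAggregateARepository aggregates, IPricingRepository pricing)
    {
        _aggregates = aggregates;
        _pricing = pricing;
    }

    public void Reprice(Guid aggregateAId, Guid pricingId)
    {
        var aggregate = _aggregates.GetById(aggregateAId);
        var pricing = _pricing.GetById(pricingId); // read-only snapshot of the other aggregate's data

        aggregate.Reprice(pricing.BasePrice, pricing.DiscountRate);
        _aggregates.Save(aggregate); // only AggregateA is modified in this transaction
    }
}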
Related
DDD: Can aggregates get other aggregates as parameters?
According to this, it's OK to use aggregates inside other aggregates, but that requires changing multiple aggregates in one transaction. So is it true that this rule can easily be skipped and I can change multiple aggregates at one time (especially in the case of microservices)? Is the only problem that I need to lock whole aggregates? Thanks
I have a simple situation: User, Friendship, and FriendshipRequest entities. User can be the aggregate root.
DDD and Homogeneous Many-to-Many Relationship
But I would not like to use eventual consistency (especially inside one microservice), because when I handle that event (FriendshipRequestSent) I still need to lock the other, dependent aggregate, and I need to handle errors and write an event when one occurs.
So is it true that this rule can easily be skipped and I can change multiple aggregates at one time (especially in the case of microservices)?
Yes, maybe.
Is the only problem that I need to lock whole aggregates?
No - there is the additional problem that, because you are modifying multiple aggregates (or more precisely, domain entities that belong to multiple aggregates) in the same transaction, you also need to be careful to design your persistent storage so that updates to all of the entities can be committed in the same "transaction".
That is simple enough when, for example, the entities are all stored in a single relational database, and you can use general purpose operations in the relational database to control your writes.
But if you are working with a different kind of data storage, where you cannot easily control the writes to all entities at the same time, then it gets a bit spooky.
In an "ideal" world, we could pretend that all information is local, and storing it is just an implementation detail. In practice, the actual implementations we get to use only approximate this idea, and we have to be mindful of the differences.
I'm aware of the general rule that only a single aggregate should be modified per transaction, mostly because of concurrency and transactional-consistency concerns.
I have a use case where I want to create multiple aggregates in a single transaction: a RestaurantManager, a Restaurant, and a Menu. They seem like a single aggregate because their life-cycles begin and end together: it doesn't make sense within the domain to create a RestaurantManager without a Restaurant, or vice versa; the same goes for a Restaurant and a Menu. Further, if the Restaurant or the RestaurantManager is deleted (unregistered), they should all be deleted together.
However, I've split them into separate aggregates because, once created, they are updated separately, maintain their own invariants, and I don't want to load them all into memory just to update one property on the Restaurant, for example.
The only thing that ties them together is their life-cycle.
My question is whether this represents a case where it is okay to go against the "rule" that each transaction should only operate on a single aggregate.
I'd also like to know whether I should enforce their shared life-cycle in the domain model by having each aggregate root hold the identifier of the aggregate root it depends on, i.e. by having Restaurant require a MenuId as a constructor parameter, and likewise Menu a RestaurantId, so that neither can be created without the other. However, this still wouldn't force them to be saved together: the application service could create them all in memory and then persist only the Menu, for example.
Your requirement is a pretty normal use case in DDD, IMHO. There are always multiple aggregates working in tandem to support the application, and they are interlinked in their life-cycles. But the modeling concepts still hold true. Let me attempt to explain what your model would look like with the help of a few DDD rules:
Aggregates are transaction boundaries
Aggregates ensure that no business invariants are broken at any point. This means that if you have multiple aggregates strung together as part of one transaction, you have to load all of them into memory for the validation.
This is especially a problem when your application is data-rich and stores data in a partitioned, distributed database cluster (think Mongo or Elasticsearch). You would then have to load data from potentially different clusters as part of a single transaction.
Aggregates are loaded in entirety
Aggregates and their associated data objects are loaded into memory in their entirety. This means that objects unnecessary for the transaction (the restaurant's schedule for the upcoming month, for example) may be loaded into memory. By itself, this is not a problem. But when multiple aggregates get together, the amount of data loaded into memory needs to be considered.
Aggregates refer to each other by their unique identifiers
This one is straightforward and means that each aggregate stores its referenced aggregates by their identifiers instead of enclosing the other aggregate's data within it.
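A minimal sketch of this rule using the question's types (the constructor shape is an assumption):

using System;

public class Restaurant
{
    public Guid Id { get; }
    public Guid MenuId { get; }      // reference by identity, not an embedded Menu
    public string Name { get; private set; }

    public Restaurant(Guid id, Guid menuId, string name)
    {
        Id = id;
        MenuId = menuId;
        Name = name;
    }

    // Updating the restaurant never requires loading the Menu aggregate.
    public void Rename(string name) => Name = name;
}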
State changes across Aggregates are handled through Domain Events
In cases where you want a state change in one aggregate to have side-effects on other aggregates, you publish a domain event, and a subscriber handles the change on other aggregates in the background. This is how you would want to handle your requirement for cascade deletes.
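For the cascade delete, a sketch might look like this; the event, handler, and repository names are invented, and the dispatcher is assumed to deliver the event after the originating transaction commits:

using System;

public record RestaurantUnregistered(Guid RestaurantId, Guid MenuId, Guid ManagerId);

public interface IMenuRepository { void Delete(Guid menuId); }
public interface IRestaurantManagerRepository { void Delete(Guid managerId); }

public class RestaurantUnregisteredHandler
{
    private readonly IMenuRepository _menus;
    private readonly IRestaurantManagerRepository _managers;

    public RestaurantUnregisteredHandler(IMenuRepository menus,
                                         IRestaurantManagerRepository managers)
    {
        _menus = menus;
        _managers = managers;
    }

    // Runs in the background; each delete is its own single-aggregate transaction.
    public void Handle(RestaurantUnregistered e)
    {
        _menus.Delete(e.MenuId);
        _managers.Delete(e.ManagerId);
    }
}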
By following these rules, you are essentially zooming in on one single aggregate at a time and keeping the complexity low. When you string multiple aggregates together, it may be clear and understandable on day 1, but the application eventually tends toward becoming a big ball of mud, as dependencies and invariants start crisscrossing each other.
"only a single aggregate should be modified per transaction"
Contention at creation doesn't matter as much. You can create many ARs in a single transaction without problem because the only other operation that could conflict is another duplicate creation process.
Another reason to avoid involving many ARs in a single transaction is coupling between modules, though you can keep things loosely coupled by using synchronously dispatched domain events.
As for the deletion, it's probably less problematic to make it eventually consistent. Does it really matter that Restaurant is closed while RestaurantManager remains registered for a short period of time?
The fact that you are asking this question tells me your system is probably not distributed. If your system runs on a single DB server and is used by a few people, eventual consistency may add complexity for scalability you don't actually need.
Start simple and refactor as needed, but crossing AR boundaries is not something that should be done routinely; if it is, your boundaries are probably wrong.
Furthermore, if you want to communicate that a RestaurantManager can't be spawned from nowhere and associated with an invalid RestaurantId by mistake, you may want to look at your ubiquitous language for guidance.
e.g.
"A RestaurantManager is registered for a given Restaurant": not sure it truly aligns with your UL, but it's just for the sake of the example.
RestaurantManager manager = restaurant.registerManager(...);
This obviously increases coupling and could affect performance, but it aligns well with the UL and makes it more difficult to misuse the model. Also note that with a single DB, you could enforce referential integrity, which takes care of these uninteresting referential constraints.
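A C# rendition of the same idea, as a sketch (the internal constructor is just one way to make the association hard to misuse):

using System;

public class Restaurant
{
    public Guid Id { get; } = Guid.NewGuid();

    // The only way to obtain a manager is through a Restaurant, so a
    // RestaurantManager can never hold an invalid RestaurantId.
    public RestaurantManager RegisterManager(string name)
        => new RestaurantManager(Guid.NewGuid(), Id, name);
}

public class RestaurantManager
{
    public Guid Id { get; }
    public Guid RestaurantId { get; }
    public string Name { get; }

    internal RestaurantManager(Guid id, Guid restaurantId, string name)
    {
        Id = id;
        RestaurantId = restaurantId;
        Name = name;
    }
}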
As pointed out by @plalx, contention doesn't matter as much when creating aggregates, since they don't yet exist and therefore can't be involved in contention.
As for enforcing the mutual life cycle of multiple aggregates in the domain, I've come to think that this is the responsibility of the application layer (i.e. an application service, or use case).
Maybe my thinking is closer to Clean or Hexagonal architecture, but I don't think it's possible, or even sensible, to try to push every single business rule down into the "domain model". For me, the point of the domain model is to partition the problem domain into small chunks (aggregates) that encapsulate common business data and operations that change together. It's the application layer's responsibility to use these aggregates properly in order to achieve the business's end goal (which is the application as a whole), including mediating operations between the aggregates and controlling their life cycles.
As such, I think this stuff belongs in an application service. That being said, frequently updating multiple aggregates in each use case could be a sign of incorrect domain boundaries.
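As a hedged sketch of that application-service responsibility (the aggregate types follow the question; the repositories and Unit of Work are invented):

using System;

public interface IUnitOfWork { void Commit(); }
public interface IRestaurantRepository { void Add(Restaurant restaurant); }
public interface IRestaurantManagerRepository { void Add(RestaurantManager manager); }
public interface IMenuRepository { void Add(Menu menu); }

public class Menu { public Guid Id { get; } = Guid.NewGuid(); }

public class Restaurant
{
    public Guid Id { get; } = Guid.NewGuid();
    public Guid MenuId { get; }
    public string Name { get; }

    public Restaurant(Guid menuId, string name) { MenuId = menuId; Name = name; }

    public RestaurantManager RegisterManager(string name) => new RestaurantManager(Id, name);
}

public class RestaurantManager
{
    public Guid Id { get; } = Guid.NewGuid();
    public Guid RestaurantId { get; }
    public string Name { get; }

    internal RestaurantManager(Guid restaurantId, string name)
    {
        RestaurantId = restaurantId;
        Name = name;
    }
}

public class RegisterRestaurantService
{
    private readonly IRestaurantRepository _restaurants;
    private readonly IRestaurantManagerRepository _managers;
    private readonly IMenuRepository _menus;
    private readonly IUnitOfWork _uow;

    public RegisterRestaurantService(IRestaurantRepository restaurants,
        IRestaurantManagerRepository managers, IMenuRepository menus, IUnitOfWork uow)
    {
        _restaurants = restaurants;
        _managers = managers;
        _menus = menus;
        _uow = uow;
    }

    // The use case owns the shared life-cycle: all three aggregates are
    // created here and committed together, or not at all.
    public Guid Register(string restaurantName, string managerName)
    {
        var menu = new Menu();
        var restaurant = new Restaurant(menu.Id, restaurantName);
        var manager = restaurant.RegisterManager(managerName);

        _menus.Add(menu);
        _restaurants.Add(restaurant);
        _managers.Add(manager);

        _uow.Commit(); // one transaction; creation-time contention is minimal
        return restaurant.Id;
    }
}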
In the guide/eBook .NET Microservices: Architecture for Containerized .NET Applications (related to eShopOnContainers), the chapter "Designing the infrastructure persistence layer" (page 213) explains in general terms how an aggregate root can perform CUD operations against a persistent data source.
Two important starting points are mentioned:

1. An aggregate is ignorant of persistence and infrastructure mechanisms, following the Persistence Ignorance and Infrastructure Ignorance principles (page 218). An aggregate is shaped by the business, not by the infrastructure.
2. One should define only one repository per aggregate root, to maintain transactional consistency between the objects within the aggregate (page 213).
Unfortunately, in all the further examples mentioned, the aggregate root and all the objects underneath it live in one and the same persistent data source.
The pattern then is as follows:

- A repository is created for that aggregate.
- A Unit of Work is injected into the repository when it is created. This Unit of Work contains methods such as SaveChangesAsync, SaveEntitiesAsync, Update, and so on.
- In a command, the Unit of Work manages the transactions against this one data source, such as a database.
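In code, that pattern looks roughly like this; a condensed sketch in the style of eShopOnContainers, not the book's exact types:

using System.Threading;
using System.Threading.Tasks;

public class Order { /* aggregate root */ }

public interface IUnitOfWork
{
    Task<bool> SaveEntitiesAsync(CancellationToken ct = default);
}

public interface IOrderRepository
{
    IUnitOfWork UnitOfWork { get; }   // the repository exposes its Unit of Work
    Order Add(Order order);
}

public class PlaceOrderCommandHandler
{
    private readonly IOrderRepository _orders;

    public PlaceOrderCommandHandler(IOrderRepository orders) => _orders = orders;

    public async Task Handle(Order order)
    {
        _orders.Add(order);
        // One Unit of Work, one data source, one transaction for the whole aggregate.
        await _orders.UnitOfWork.SaveEntitiesAsync();
    }
}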
I want to expand this pattern so that the aggregate can write its data to two or more physical data sources, depending on the underlying object type.
Starting from starting point 1, it seems perfectly justified to have an aggregate root whose underlying objects are persisted to different data sources, depending on the type of the underlying object. Examples would be: a database and an XML file, a database and a NoSQL 'database', a database and a service, a database and an IoT device. Because an aggregate must be ignorant of persistence and infrastructure mechanisms, in my opinion there is nothing in the design of the aggregate itself to argue about. Nowhere in the book, I think, is it written that an aggregate root should persist within a single data source.
At the same time, starting point 2 also seems perfectly justified: the complete set of objects within the aggregate root is edited together, and the successful persistence of the entire package is coordinated by one repository and (preferably) one Unit of Work.
The question is:
How does Domain-Driven Design deal with an aggregate that is hydrated from different data sources, depending on the type of the underlying object?
Should I use one custom Unit of Work and make the decision where to write inside this UoW?
I'm aware of this question, but having studied the code, I think it only deals with inheritance among repositories that target different data sources while still serving one data source at a time, and that is not what I'm after.
I want to expand this pattern so that the aggregate can write its data to two or more physical data sources, depending on the underlying object type.
Why do you want to do that on purpose?
In most cases, the persistence implementation is chosen to serve the domain, rather than the other way around. So the happy path typically involves choosing a persistence solution that can record the state of the entire aggregate, and storing the entire thing within a single transaction.
So if you find yourself trying to store an aggregate in two different places, you should take a hard, careful look at why.
One common answer is that you want to be able to query the aggregate state efficiently. CQRS is a common solution here: rather than persisting the aggregate in two different data stores, you persist it to one and replicate it to another. Queries can then run very efficiently against the replica (although there is of course some latency between a change to the aggregate and the reflection of that change in the query results).
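A rough sketch of that replication idea (all names invented; the event delivery mechanism is assumed to run after the write-side transaction commits):

using System;

public record OrderPlaced(Guid OrderId, decimal Total);
public record OrderSummary(Guid OrderId, decimal Total);

public interface IReadModelStore { void Upsert(OrderSummary summary); }

public class OrderProjector
{
    private readonly IReadModelStore _readStore; // e.g. a document DB or search index

    public OrderProjector(IReadModelStore readStore) => _readStore = readStore;

    // Invoked asynchronously after the write-side commit, so the read model
    // lags the aggregate by some replication latency.
    public void When(OrderPlaced e)
        => _readStore.Upsert(new OrderSummary(e.OrderId, e.Total));
}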
Another common answer is that you really have two aggregates that reference each other. There is nothing wrong with storing two aggregates in different places, but you may be better served by making the distinction between the two explicit in your code.
Related reading: Dan Pritchett, Jimmy Bogard.
How does Domain-Driven Design deal with an aggregate that is hydrated from different data sources, depending on the type of the underlying object?
Badly, just like everybody else.
Sometimes I run into a case where I have a bunch of entity domain models that should be persisted transactionally, but there is no logical domain model that could become the aggregate root of all of these entity domain models.
Is it a good idea, in these cases, to have a fictitious aggregate-root domain model that has NO analogous database entity and is not persisted to the database, but holds only the logic for transactionally persisting the entity domain models?
P.S. I thought about this because a database table storing only a single column of aggregate-root ids seems wrong to me.
Is it a good idea, in these cases, to have a fictitious aggregate-root domain model that has NO analogous database entity and is not persisted to the database, but holds only the logic for transactionally persisting the entity domain models?
Sort of.
It's perfectly fine to have a PurpleMonkeyDishwasher that composes the entities that make up your aggregate, so that you can be sure your data remains consistent and satisfies your domain invariant.
But it's really suspicious that it doesn't have a name. That suggests that you don't really understand the problem that you are modeling.
It's the modeling equivalent of a code smell. There's probably a theme that arranges these entities to be modeled together, exclusive of the others, rather than in some other arrangement. There's probably a noun that your domain experts use when talking about these entities together. Go find it. That's part of the job.
An "aggregate root domain model which will have NO analogical database entity and will not be persisted in the database" is not a "fictitious aggregate"; it is a standard aggregate just like another aggregate that it needs to be persisted. The purpose of an aggregate is to control the changes following domain rules to ensure consistency and invariants.
Sometimes the aggregate is the change (and needs to be persisted), but sometimes it is not, and the things to be persisted after the change are the parts (entities and/or VOs) that changed inside the aggregate, each mapped to persistence on its own without composing a dedicated persistence concept (a table or tables, a document, etc.). This is an implementation detail of how you decided to persist your domain data.
First premise of DDD: there is no database. This helps you avoid the bias of trying to map persistence concepts into your domain.
Mike explains it better than I can in his blog:
The purpose of our aggregate is to control change, not be the change. Yes, we have data there organized as Value Objects or Entity references, but that's because it's the easiest and most maintainable way to enforce the business rules. We're not interested in the state itself; we're interested in ensuring that the intended changes respect the rules, and for that we're 'borrowing' the domain mindset, i.e. we look at things as if WE were part of the business.

An aggregate instance communicates that everything is OK for a specific business state change to happen. And, yes, we need to persist the business state changes. But that doesn't mean the aggregate itself needs to be persisted (a possible implementation detail). Remember that the aggregate is just a construct to organize business rules; it's not meant to be a representation of state.

So, if the aggregate is not the change itself, what is it? The change is expressed as one or more relevant Domain Events that are generated by the aggregate. And those need to be recorded (persisted) and applied (interpreted). When we apply an event we "process" the business implications of it. This means some value has changed or a business scenario can be triggered.
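As a hedged sketch of that idea (all names invented), an aggregate that validates a change and expresses it as a domain event, while never persisting its own state, might look like this:

using System;
using System.Collections.Generic;

public record FriendshipRequestSent(Guid FromUserId, Guid ToUserId);

public class User
{
    private readonly List<object> _pendingEvents = new();
    private readonly HashSet<Guid> _blockedUsers = new();

    public Guid Id { get; }
    public IReadOnlyList<object> PendingEvents => _pendingEvents;

    public User(Guid id) => Id = id;

    // The aggregate enforces the rule; the change itself is the event,
    // which is what gets recorded and applied.
    public void SendFriendshipRequest(Guid toUserId)
    {
        if (_blockedUsers.Contains(toUserId))
            throw new InvalidOperationException("Cannot send a request to a blocked user.");
        _pendingEvents.Add(new FriendshipRequestSent(Id, toUserId));
    }
}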
Which layer should be responsible for checking the existence of an entity in the database?
Let's say I have an Order aggregate, and that order can contain multiple items. The business logic implies that I can add only existing items to an order.
Should I write it like this in the application service:
var item = itemRepository.GetByID(id);
// throws an exception if the item is null (i.e. it does not exist)
order.AddItem(item);
or
// validate item existence inside the aggregate method
order.AddItem(item, itemRepository);
Neither, really.
Entities don't cross aggregate boundaries. Either the entity is part of the aggregate, in which case the aggregate manages its own life cycle, or the item is part of some other aggregate, in which case you don't share the entity, you share a reference.
order.AddItem(id)
Part of the definition of aggregate is that changes in different aggregates can happen independently of each other. In other words, there's no way that this aggregate can know what is happening in that aggregate "now".
In other words, you can't ensure transactional consistency across an aggregate boundary.
The right answer, if you are willing to accept a data race, is to use a domain service to query the state outside the boundary.
interface InventoryService {
    boolean currentlyInStock(Item id);
}

// ...
order.addItem(id, inventoryService);
A few points:
- Use a domain service rather than passing in the other repository, because it better communicates what is going on. The domain service serves as a description of the contract that the aggregate actually needs. Furthermore, by refusing to pass the repository, you exclude any possibility that the order aggregate attempts to write to the item repository. (The trivial implementation of this domain service just forwards the call to the repository, but the order aggregate doesn't need to know that; see the sketch after these points.)
- The domain service, in this case, should not be "helping" by choosing an action based on the availability of the inventory. Maybe the aggregate should throw; maybe it should throw only for low-volume orders or low-priority purchasers but apply different rules when the order is over a million dollars. It's the order's job to figure that out; the domain service just provides the data.
- Given the data race, some false positives can slip through; detection and mitigation is a good idea.
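That trivial forwarding implementation might look like this in C# (the interface is translated from the Java-style snippet above; names are illustrative):

using System;

public class Item { public Guid Id { get; init; } }
public interface IItemRepository { Item GetById(Guid itemId); }

public interface IInventoryService
{
    bool CurrentlyInStock(Guid itemId);
}

public class RepositoryBackedInventoryService : IInventoryService
{
    private readonly IItemRepository _items;

    public RepositoryBackedInventoryService(IItemRepository items) => _items = items;

    // Read-only: the order aggregate can ask about stock, but it can never
    // write to the item repository through this interface.
    public bool CurrentlyInStock(Guid itemId) => _items.GetById(itemId) != null;
}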
If you aren't willing to accept the data race (are you sure? Amazon accepts orders for out-of-stock items all the time...), then you need to rethink the design of your model and where you have set your aggregate boundaries.
The null design of a model is to capture all of the business state into a single aggregate; its own state is internally consistent, but it might not be consistent with external state. When you start carving up the model into separate aggregates, you are making the same assertion -- the aggregate needs to be internally consistent, but it might not be consistent with state outside the aggregate.
If that's not acceptable, then you turn Mother's picture to the wall, and implement your business rules directly in the book of record (ie, constraints in your RDBMS).