DDD - Relaxing the rule of eventual consistency between aggregates

I'm reading the book Patterns, Principles, and Practices of Domain-Driven Design, written by Scott Millett with Nick Tune. In Chapter 19, Aggregates, he states:
Sometimes it is actually good practice to modify multiple aggregates within a transaction. But it's important to understand why the guidelines exist in the first place so that you can be aware of the consequences of ignoring them.
When the cost of eventual consistency is too high, it's acceptable to consider modifying two objects in the same transaction. Exceptional circumstances will usually be when the business tells you that the customer experience will be too unsatisfactory.
To summarize, saving one aggregate per transaction is the default approach. But you should collaborate with the business, assess the technical complexity of each use case, and consciously ignore the guideline if there is a worthwhile advantage, such as a better user experience.
I'm facing a case in my project where the user requests an operation that affects two aggregates, and there are rules that must be verified by both aggregates for the operation to take place successfully.
It is something like "Allocating a cell for a detainee":
1. the user makes the request
2. the Detainee (AR1) is fetched from the database and receives a command: detainee.AllocateTo(cellId);
3. the Cell (AR2) is fetched and receives a command: cell.Allocate(detaineeId);
Both steps 2 and 3 could throw an exception, depending on the detainee's status and the cell's capacity, but let's abstract that away.
Using eventual consistency, if step 2 executes successfully, emitting the event DetaineeAllocated, but step 3 fails (it will run in another transaction, inside an event handler), the state of the aggregates will be inconsistent, and worse, the operation will have appeared to succeed from the user's point of view.
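To make the scenario concrete, here is roughly what the single-transaction version would look like (TransactionRunner and the repositories are placeholder names for this sketch, not from the book):

public final class AllocateCellHandler {
    private final TransactionRunner tx;          // placeholder: runs a block in one DB transaction
    private final DetaineeRepository detainees;  // placeholder repository
    private final CellRepository cells;          // placeholder repository

    public AllocateCellHandler(TransactionRunner tx, DetaineeRepository detainees, CellRepository cells) {
        this.tx = tx;
        this.detainees = detainees;
        this.cells = cells;
    }

    public void handle(String detaineeId, String cellId) {
        tx.inTransaction(() -> {
            Detainee detainee = detainees.byId(detaineeId);
            Cell cell = cells.byId(cellId);
            detainee.allocateTo(cellId);  // throws if the detainee's status forbids allocation
            cell.allocate(detaineeId);    // throws if the cell is at capacity
            detainees.save(detainee);
            cells.save(cell);
            // Any exception rolls everything back, so the user never sees a false success.
        });
    }
}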
I know there are cases like "when a customer makes a purchase over $100, their type must be changed to VIP" that can be implemented using eventual consistency, but the case I mentioned does not seem to be one of them.
Do you think that this is a special case that the book mentions?

Each aggregate must not have an invalid state (internal state), but that does not imply aggregates have to be consistent with one another (external, or system state).
Given the context of your question, the answer could be either yes or no.
The Case for No
The external state can become eventually consistent, which may be acceptable to your product owner. In this case you design ways to detect the inconsistency and deal with it (e.g. by retrying operations, issuing compensating transactions, etc.)
The Case for Yes
In your orchestration layer, go ahead and update the aggregates in a transaction. You might choose to do this because it's "easy" and "right", or you might choose to do this because your product owner says the inconsistency can't be tolerated for whatever reason.
Another Case for No
There's another way to argue that this is not a special case and not a reason for more than one transaction, but it requires a change to your model. Consider removing the mutual dependency between the detainee and the cell and instead introducing another aggregate, CellAssignment, which represents a moment-interval (a temporal relationship) that can be constructed and saved in a single transaction. In this case, the detainee and the cell don't change at all.
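A minimal sketch of that idea, with illustrative names only (the CellAssignment is constructed and saved in one transaction, while Detainee and Cell stay untouched):

import java.time.Clock;
import java.time.Instant;

public final class CellAssignment {
    private final String assignmentId;
    private final String detaineeId;
    private final String cellId;
    private final Instant assignedAt;
    private Instant releasedAt; // null while the assignment is active

    private CellAssignment(String assignmentId, String detaineeId, String cellId, Instant assignedAt) {
        this.assignmentId = assignmentId;
        this.detaineeId = detaineeId;
        this.cellId = cellId;
        this.assignedAt = assignedAt;
    }

    // The rules that used to be split across two aggregates (detainee status,
    // cell capacity) would be checked here, against the set of active assignments.
    public static CellAssignment open(String assignmentId, String detaineeId, String cellId, Clock clock) {
        return new CellAssignment(assignmentId, detaineeId, cellId, clock.instant());
    }

    // Closing the moment-interval ends the temporal relationship.
    public void release(Clock clock) {
        if (releasedAt != null) throw new IllegalStateException("assignment already released");
        releasedAt = clock.instant();
    }
}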

"the state of aggregates will be inconsistent"
Well, it shouldn't be inconsistent forever or that wouldn't be eventual consistency. You would normally discuss with business experts to establish an acceptable consistency timeframe.
Should something go wrong, an event will be raised, which should trigger compensating actions and perhaps a notification to a human stating that something went wrong after all.
Another approach could be to introduce a process manager which is responsible for carrying out the business process by triggering commands and listening to events, until completion or timeout. The ARs are often designed to allow small incremental steps towards consistency. For instance, there could be a command to reserve cell space first rather than directly allocating the detainee. The UI could always poll the state of the process to know when it's complete, if necessary.
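A rough sketch of such a process manager, assuming hypothetical command and event types (CommandBus, ReserveCellSpace, and the rest are made-up names):

public final class CellAllocationProcess {
    public enum State { STARTED, SPACE_RESERVED, COMPLETED, FAILED }

    private State state = State.STARTED;
    private final CommandBus commands; // hypothetical command dispatcher

    public CellAllocationProcess(CommandBus commands) {
        this.commands = commands;
    }

    public void on(AllocationRequested event) {
        // Small incremental step: reserve space before allocating the detainee.
        commands.send(new ReserveCellSpace(event.cellId(), event.detaineeId()));
    }

    public void on(CellSpaceReserved event) {
        state = State.SPACE_RESERVED;
        commands.send(new AllocateDetainee(event.detaineeId(), event.cellId()));
    }

    public void on(DetaineeAllocated event) {
        state = State.COMPLETED;
    }

    public void on(DetaineeAllocationFailed event) {
        state = State.FAILED;
        commands.send(new ReleaseCellSpace(event.cellId(), event.detaineeId())); // compensating action
    }

    public State state() {
        return state; // the UI can poll this to know when the process is complete
    }
}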
Eventual consistency obviously comes at a cost. If you have a single DB in a monolith that doesn't need extreme scalability, you could very well favor modifying both ARs in a single transaction until that becomes a problem.
Eventual consistency is often sold as less costly than strong consistency, but I believe that's mostly true for distributed systems where you'd otherwise have to deal with XA transactions.

Do you think that this is a special case that the book mentions?
No.
What I suspect you have here is a modeling error.
From your description, it sounds like you are dealing with something like a CellAssignment, and the invariant that you are trying to maintain is to ensure that there are no conflicts among active cell assignments.
That suggests to me that you are missing some sort of aggregate - something like a seating chart? - that keeps track of all of the active assignments and conflicts.
How can you know? One way is to graph your aggregates; create a node for each piece of information you need to save, and join nodes with lines if there is a rule that requires locking both nodes. If you find yourself with disconnected graphs, or two graphs that only connect at the root id, then it's a good bet that separating some information into a new graph will improve your modeling.
All Our Aggregates Are Wrong, by Mauro Servienti, would be a good talk to review.

Related

Update two aggregate instances in a single transaction

Let's say we have an Account aggregate for a banking service. Someone wants to transfer money from their Account to another person's Account. There are a number of rules: the payer needs to have enough money in their Account, and the payee's Account must be active. If these rules pass, then the balance on both Accounts is updated. In a traditional system this can easily be done in a single ACID DB transaction.
In DDD this would not be allowed, as we can't update two aggregate instances in a single transaction. Firstly, why? Secondly, does that mean using eventual consistency to handle the two Accounts? If so, I can see how that can be done, but it adds a lot of complexity.
In DDD this would not be allowed
Not really true - there's a lot going on here.
What Evans (2003) and Vernon (2013) actually wrote is that transaction management is not a domain model concern; rather, transaction control belongs in the application code.
There is, however, a real concern with changing multiple aggregates at the same time: to do so assumes that you can acquire locks on those entities at the same time and also commit all of those changes together.
That's relatively straightforward when all of the aggregates that you are changing are stored in a single relational database; but it becomes very difficult when the aggregates are stored in different places.
If you design your system such that it assumes that all aggregates are stored together, then you greatly restrict your scaling options.
Be careful not to overuse the ability to commit modifications to multiple Aggregates in a single transaction just because it works in a unit test environment -- Vernon (2013).
does that mean using eventual consistency to handle the two Accounts
That, or changing how you model your aggregates. Sometimes both.
For instance, it's somewhat common to have aggregates that handle (short lived) processes, which are different from the long lived aggregates.
When I look at my credit card statement, a charge will normally fall in one of three states: it's not yet posted to my statement (not visible), or it's pending (visible), or it is actually posted as a charge (visible). Clearly, there is stuff going on "somewhere else", and that information is eventually copied to my statement where I can see it.
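As a sketch of that style (all names here are illustrative): the transfer itself becomes a short-lived aggregate with its own states, and each Account is updated in its own transaction as the process advances.

public final class MoneyTransfer {
    public enum State { REQUESTED, DEBITED, COMPLETED, FAILED }

    private final String transferId;
    private final String payerAccountId;
    private final String payeeAccountId;
    private final long amountCents;
    private State state = State.REQUESTED;

    public MoneyTransfer(String transferId, String payerAccountId, String payeeAccountId, long amountCents) {
        this.transferId = transferId;
        this.payerAccountId = payerAccountId;
        this.payeeAccountId = payeeAccountId;
        this.amountCents = amountCents;
    }

    public void markDebited()   { state = State.DEBITED; }   // payer's Account committed the debit
    public void markCompleted() { state = State.COMPLETED; } // payee's Account committed the credit
    public void markFailed()    { state = State.FAILED; }    // triggers a compensating credit to the payer

    public State state() { return state; }
}

Like the pending charge on a statement, the transfer is visible in an intermediate state until both accounts have caught up.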
I can see how that can be done, but it adds a lot of complexity.
Yup. If it wasn't complicated/complex, we wouldn't be creating our own model; we'd instead be buying some general-purpose solution off the shelf.
Greg Young talked about this in a 2011 presentation: domain driven design makes sense in places where we can derive a competitive advantage from the work we are doing. In other words, we are using it in places where giving the business control over that complexity improves the bottom line.
Making sure you are working on the correct side of the build versus buy line is an important step. Don't skip it.

DDD: creating multiple aggregates with a shared life-cycle in a single transaction

I'm aware of the general rule that only a single aggregate should be modified per transaction, mostly for concurrency and transactional consistency issues, as far as I'm aware.
I have a use case where I want to create multiple aggregates in a single transaction: a RestaurantManager, a Restaurant, and a Menu. They seem like a single aggregate because their life-cycles begin and end together: it doesn't make sense within the domain to create a RestaurantManager without a Restaurant, or vice versa; the same goes for a Restaurant and a Menu. Further, if the Restaurant or the RestaurantManager is deleted (unregistered), they should all be deleted together.
However, I've split them into separate aggregates because, once created, they are updated separately, maintain their own invariants, and I don't want to load them all into memory just to update one property on the Restaurant, for example.
The only thing that ties them together is their life-cycle.
My question is whether this represents a case where it is okay to go against the "rule" that each transaction should only operate on a single aggregate.
I'd also like to know if I should enforce their shared life-cycle in the domain model by having each aggregate root hold the identifier of the aggregate root it depends on, i.e. by having Restaurant require a MenuId as a constructor parameter, and likewise for Menu and RestaurantId, so that neither can be created without the other. However, this still wouldn't enforce that they should be saved together by the application service anyway, since it could create them all in memory, then only save the Menu, for example.
Your requirement is a pretty normal use case in DDD, IMHO. There are always multiple aggregates working in tandem to support the application, and they are interlinked in their lifecycles. But the modeling concepts still stand true. Let me attempt to explain what your model would look like with the help of a few DDD rules:
Aggregates are transaction boundaries
Aggregates ensure that no business invariants are broken at any point. This means that if you have multiple aggregates strung together as part of one transaction, you have to load all of them into memory for the validation.
This is especially a problem when your application is data-rich and stores data in a database cluster - partitioned, distributed (think Mongo or Elasticsearch). You will have the problem of loading up data from potentially different clusters as part of a single transaction.
Aggregates are loaded in entirety
Aggregates and their associated data objects are loaded in entirety into memory. This means that unnecessary objects (say the restaurant's schedule for the upcoming month, for example) for the transaction may be loaded into memory. By itself, this is not a problem. But when multiple aggregates get together, the amount of data loaded into memory needs to be considered.
Aggregates refer to each other by their unique identifiers
This one is straightforward and means that each aggregate stores its referenced aggregates by their identifiers instead of enclosing the other aggregate's data within it.
State changes across Aggregates are handled through Domain Events
In cases where you want a state change in one aggregate to have side-effects on other aggregates, you publish a domain event, and a subscriber handles the change on other aggregates in the background. This is how you would want to handle your requirement for cascade deletes.
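For example, the cascade delete could look roughly like this (MenuRepository and the event name are assumptions made for the sketch):

public final class RemoveMenuOnRestaurantUnregistered {
    private final MenuRepository menus; // hypothetical repository

    public RemoveMenuOnRestaurantUnregistered(MenuRepository menus) {
        this.menus = menus;
    }

    // Subscribed to the domain event; runs in the background, in its own
    // transaction, after the Restaurant aggregate has been deleted.
    public void on(RestaurantUnregistered event) {
        menus.findByRestaurantId(event.restaurantId())
             .ifPresent(menus::delete);
    }
}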
By following these rules, you are essentially zooming in on one single aggregate at a time and ensuring that the complexity remains low. When you string up multiple aggregates, though it is clear and understandable on day 1, the application eventually tends towards becoming a big ball of mud, as dependencies and invariants start crisscrossing each other.
"only a single aggregate should be modified per transaction"
Contention at creation doesn't matter as much. You can create many ARs in a single transaction without problem because the only other operation that could conflict is another duplicate creation process.
Another reason to avoid involving many ARs in a single transaction is coupling between modules, though you could always keep things loosely coupled by using synchronously dispatched domain events.
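A minimal sketch of such a synchronous dispatcher (illustrative only, and deliberately simplistic - not thread-safe): handlers run on the caller's thread, inside the caller's transaction, so modules stay decoupled while their changes still commit together.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

public final class DomainEvents {
    private static final List<Consumer<Object>> handlers = new ArrayList<>();

    public static <T> void subscribe(Class<T> type, Consumer<T> handler) {
        handlers.add(event -> {
            if (type.isInstance(event)) {
                handler.accept(type.cast(event));
            }
        });
    }

    public static void raise(Object event) {
        for (Consumer<Object> handler : handlers) {
            handler.accept(event); // synchronous: completes before raise() returns
        }
    }
}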
As for the deletion, it's probably less problematic to make it eventually consistent. Does it really matter that Restaurant is closed while RestaurantManager remains registered for a short period of time?
The fact that you are asking this question tells me your system is not distributed. If your system runs on a single DB server and is used by a few people, it may be that eventual consistency makes things more complex for scalability you don't actually need.
Start simple and refactor as needed, but crossing AR boundaries is not something that should be done consistently or else your boundaries are clearly wrong.
Furthermore, if you want to communicate that a RestaurantManager can't be spawned from nowhere and associated with an invalid RestaurantId by mistake you may want to look at your ubiquitous language for guidance.
e.g.
"A RestaurantManager is registered for a given Restaurant": not sure it truly aligns with your UL, but it's just for the sake of the example.
RestaurantManager manager = restaurant.registerManager(...);
This obviously increases coupling and could affect performance, but it aligns well with the UL and makes it more difficult to misuse the model. Also note that with a single DB, you could enforce referential integrity, which takes care of these uninteresting referential constraints.
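A small sketch of that factory-method style, with names again chosen just for illustration:

public final class Restaurant {
    private final String restaurantId;

    public Restaurant(String restaurantId) {
        this.restaurantId = restaurantId;
    }

    // "A RestaurantManager is registered for a given Restaurant":
    // the manager can only be created through a Restaurant, so it can
    // never end up holding an invalid RestaurantId.
    public RestaurantManager registerManager(String managerId, String name) {
        return new RestaurantManager(managerId, name, this.restaurantId);
    }
}

final class RestaurantManager {
    private final String managerId;
    private final String name;
    private final String restaurantId; // reference by identity, not containment

    // Package-private: a manager can't be spawned from nowhere.
    RestaurantManager(String managerId, String name, String restaurantId) {
        this.managerId = managerId;
        this.name = name;
        this.restaurantId = restaurantId;
    }
}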
As pointed out by @plalx, contention doesn't matter as much when creating aggregates in terms of transactions, since they don't yet exist and so can't be involved in contention.
As for enforcing the mutual life cycle of multiple aggregates in the domain, I've come to think that this is the responsibility of the application layer (i.e. an application service, or use case).
Maybe my thinking is closer to Clean or Hexagonal architecture, but I don't think it's possible or even sensible to try and push every single business rule down into the "domain model". The point of the domain model for me is to partition the problem domain into small chunks (aggregates), which encapsulate common business data/operations that change together, but it's the application layer's responsibility to use these aggregates properly in order to achieve the business' end goal (which is the application as a whole), including mediating operations between the aggregates and controlling their life cycles.
As such, I think this stuff belongs in an application service. That being said, frequently updating multiple aggregates in each use case could be a sign of incorrect domain boundaries.
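To make that concrete, here is a rough sketch of such an application service enforcing the shared life-cycle (TransactionRunner, the repositories, and the command shape are assumptions):

public final class RegisterRestaurantService {
    private final TransactionRunner tx;             // assumed transaction helper
    private final RestaurantRepository restaurants; // assumed repositories
    private final RestaurantManagerRepository managers;
    private final MenuRepository menus;

    public RegisterRestaurantService(TransactionRunner tx,
                                     RestaurantRepository restaurants,
                                     RestaurantManagerRepository managers,
                                     MenuRepository menus) {
        this.tx = tx;
        this.restaurants = restaurants;
        this.managers = managers;
        this.menus = menus;
    }

    public void register(RegisterRestaurantCommand cmd) {
        tx.inTransaction(() -> {
            Restaurant restaurant = new Restaurant(cmd.restaurantId());
            RestaurantManager manager = restaurant.registerManager(cmd.managerId(), cmd.managerName());
            Menu menu = new Menu(cmd.menuId(), cmd.restaurantId());
            restaurants.save(restaurant);
            managers.save(manager);
            menus.save(menu);
            // Creation-only: none of these aggregates exists yet, so there is
            // nothing to contend with, and all three begin life together.
        });
    }
}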

Should a single command address multiple aggregates?

In CQRS and DDD, an aggregate is a transactional boundary. Hence I have been modeling commands always in such a way that each command always only ever addresses a single aggregate. Of course, technically, it would be possible to write a command handler that addresses multiple aggregates, but that would not be within a single transaction and hence would not be consistent.
If you actually have to address multiple aggregates, I usually go with a process manager, but this sometimes feels like a lot of overhead. In addition, from my understanding a process manager only ever reacts to domain events; it is not directly addressed by commands. So you need to decide in which aggregate to put the starting point.
I have seen that some people solve this using so-called domain or application services, which can receive commands as well, and then work on multiple aggregates – but in this case the transactional nature of the process gets lost.
To give a simple example, to better illustrate the scenario:
A user shall join a group.
A user has a max number of groups.
A group has a max number of users.
Where to put the command that triggers the initial joining process, and what to call it? user.join(group) feels as right or wrong as group.welcome(user). I'd probably go for the first one, because this is closer to the ubiquitous language, but anyway…
If I had something above the aggregates, like the aforementioned services, then I could run something such as:
userManagement.addUserToGroup(user, group);
However, this addUserToGroup function would then need to call both commands, which in turn means it has to take care of both commands being processed – which is somewhat counterintuitive to having separate aggregates at all, and having aggregates as transactional boundaries.
What would be the correct way to model this?
It may be worth reviewing Greg Young on Eventual Consistency and Set Validation.
What is the business impact of having a failure? This is the key question we need to ask and it will drive our solution in how to handle this issue as we have many choices of varying degrees of difficulty.
And certainly Pat Helland on Memories, Guesses, and Apologies.
Short version: the Two Generals Problem tells us that, if two pieces of information must be consistent, then we need to write both pieces of information in the same place. The "invariant" constrains our data model.
The invariant you describe is effectively a couple of set validation problems: the "membership" collection allows only so many members with user A, and only so many members with group B. And if you really are in a "we go out of business if those rules are violated" situation, then you cannot distribute the members of that set -- you have to lock the entire set when you modify it to ensure that the rule is not broken and that the first writer wins.
An element that requires some care in your modeling: is the domain model the authority for membership? or is the "real world" responsible for membership and the domain model is just caching that information for later use? You want to be very careful about trying to enforce an invariant on the real world.
There's a risk that you end up over constraining the order in which information is accepted by the model.
Essentially what you have is a many-to-many relationship between users and groups, with restrictions on both sides:
Restriction on the number of groups a user can join
Restriction on the number of users a group can have
VoiceOfUnreason already gave a great answer, so I'll share one way I've solved similar problems and go straight to the model and implementation in case you have to ensure that these constraints are enforced at all costs. If you don't have to, do not make the model and implementation that complex.
Ensuring consistency with such constraints on both Group and User entities will be difficult in a single operation because of the concurrency of the operations.
You can model this by adding a collection of RegisteredUsers to a Group or vice versa, adding a collection of JoinedGroups to a User, and enforce the constraint on one side, but enforcing it on the other side is still an issue.
What you can do is introduce another concept in your domain: the concept of a "Slot" in a Group. Slots are limited by the max number of Slots for a Group.
Then a User will issue a JoinGroupRequest that can be Accepted or Rejected.
A Slot can be either Taken or Reserved. Then you can introduce the concept of SlotReservation. The process of joining a User to a Group will be:
1. Issue a JoinGroupRequest from a User.
2. Try to Reserve a Slot, enforcing the MaxUsersPerGroup constraint.
3. Acquire the Slot or Reject the SlotReservation of the User, enforcing the MaxGroupsPerUser constraint.
4. Accept or Reject the JoinGroupRequest depending on the outcome of the SlotReservation.
If the SlotReservation is Rejected, another User will be able to use this Slot later.
For the implementation, you can add a SlotReservation queue per Group to ensure that once a Slot is freed after a Rejected SlotReservation, the next User that wants to join the Group will be able to.
For the implementation, you can add a collection of Slots to a Group, or you can make Slot an aggregate in its own right.
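A sketch of the collection-of-Slots variant, with illustrative names (the process described above would call tryReserveSlot, then confirm or release):

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;
import java.util.UUID;

public final class Group {
    public enum SlotState { RESERVED, TAKEN }

    private final int maxUsers;
    private final Map<UUID, SlotState> slots = new HashMap<>();

    public Group(int maxUsers) {
        this.maxUsers = maxUsers;
    }

    // Enforces the MaxUsersPerGroup constraint inside a single aggregate.
    public Optional<UUID> tryReserveSlot() {
        if (slots.size() >= maxUsers) {
            return Optional.empty(); // no free Slot: the JoinGroupRequest will be Rejected
        }
        UUID slotId = UUID.randomUUID();
        slots.put(slotId, SlotState.RESERVED);
        return Optional.of(slotId);
    }

    public void confirm(UUID slotId) {
        slots.put(slotId, SlotState.TAKEN); // the Slot is now Taken
    }

    public void release(UUID slotId) {
        slots.remove(slotId); // another User can use this Slot later
    }
}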
You can use a Saga for this process. The Saga will be triggered when a JoinGroupRequest is made by a User.
Essentially, this operation becomes a Tentative Operation.
For more details, take a look at the Accountability pattern, "Life Beyond Distributed Transactions: An Apostate's Opinion", and "Life Beyond Distributed Transactions: An Apostate's Implementation".

Best way to enforce invariants between aggregates?

What is the best way to handle consistency between aggregates? Taking an example from Vaughn Vernon's book: you have a BacklogItem aggregate and a Sprint aggregate. When a BacklogItem event is raised, the event handler catches it and tries to update the Sprint aggregate. What if this operation fails? How do I find the best way of handling this situation? As far as I understand there are 3 options:
1) Update all aggregates in one transaction. We lose scalability, but we gain consistency.
2) Do nothing. Just log an error and wait for manual intervention.
3) Use a saga. This complicates the design and forces us to implement each use case that has to enforce invariants between aggregates in a separate object (saga). If the Sprint update fails, the saga will try to uncommit the BacklogItem (compensate).
Which of these options would you choose, and what criteria do you base the decision on?
What is the best way to handle consistency between aggregates?
If your aggregates are correctly designed, then you handle "consistency" between aggregates over time (aka: eventual consistency).
What if this operation fails?
Take a careful read through Race Conditions Don't Exist; Udi Dahan makes an argument that operations in collaborative domains should not fail.
Update all aggregates in one transaction.
You can do that; but what that effectively means is that the two entities are really part of a single implicit aggregate. In other words, it strongly suggests that you haven't got your aggregate boundaries in the right place.
Trying to modify multiple aggregates in a single transaction is effectively a two-phase commit, with all of the additional complications that arise from that.
Do nothing. Just log and Error and wait for manual intervention.
Yup; see, for instance; what Greg Young has to say about warehouse systems and exception reports.
Use saga. This complicates the design and forces us to implement each use case which has to enforce invariants between aggregates in a separate object(saga).
These days, you'll normally see "process manager" rather than "saga", which has a more specific meaning. But yes, if the domain model needs orchestration between aggregates, then you are going to need to describe the orchestration logic somewhere.
You might want to review Rinat Abdullin's discussion of Evolving Business Processes; he makes a pretty good argument that the automation is just replicating the actions the human operator would take.
Which of these options would you choose, and what criteria do you base the decision on?
I strongly prefer simple to easy. So I would aim for exception reporting, on the argument that (a) these failures should be rare anyway, so we don't want to be investing a lot of design capital in work far off the happy path, and (b) if we have failing commands in the system, then we ought to have a mechanic for reporting failed commands anyway, so I'm just leveraging what's already present.
If I were squeezed for time, if the project hadn't yet become successful enough to need to scale, if I didn't have the reporting pieces needed at hand, I might prefer instead to sneak the changes into a single transaction, and then raise an exception report in the development process itself to call attention to the fact that more work needed to be done later.
Which of these options would you choose, and what criteria do you base the decision on?
Domain expert input. If they demand extremely strict correctness at all times, chances are eventual consistency won't make the cut. There are other times when compensating actions require manual interventions that are hardly feasible in a given domain. Or, it could be extremely simple and beneficial to include a human in the loop. Talking with a business person will teach you about the broader domain process and uncover or rule out some options.
Transactional analysis. If the aggregates are not under strong concurrent access, maybe updating 2 aggregates in a single transaction is not that problematic. In contrast, identifying "hot" aggregates allows you to leverage looser consistency where it matters.
Use case complexity. Not all eventual consistency scenarios require a Saga. If the operation is as simple as updating an aggregate as a consequence of an event and rolling the original change back in the unlikely event that the update fails, chances are you don't need such a complex, long-lived pattern.
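As an illustration of that last point, a simple handler with a compensating action might look like this (the repository, command bus, and event/command names are assumptions, loosely modeled on the BacklogItem/Sprint example above):

public final class CommitToSprintHandler {
    private final SprintRepository sprints; // assumed repository
    private final CommandBus commands;      // assumed command dispatcher

    public CommitToSprintHandler(SprintRepository sprints, CommandBus commands) {
        this.sprints = sprints;
        this.commands = commands;
    }

    public void on(BacklogItemCommitted event) {
        try {
            Sprint sprint = sprints.byId(event.sprintId());
            sprint.commit(event.backlogItemId());
            sprints.save(sprint);
        } catch (Exception e) {
            // Unlikely failure path: roll the original change back with a
            // compensating command instead of a long-lived saga.
            commands.send(new UncommitBacklogItem(event.backlogItemId()));
        }
    }
}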

What does the choosing consistency type "Ask Whose Job It Is" guidance mean?

When discussing how to decide whether transactional or eventual consistency should be used in Part II of Vaughn Vernon's Effective Aggregate Design, he states
When examining the use case (or story), ask whether it's the job of the user executing the use case to make the data consistent. If it is, try to make it transactionally consistent, but only by adhering to the other rules of aggregates. If it is another user's job, or the job of the system, allow it to be eventually consistent.
I don't follow. Does anyone have a good example of applying this rule of thumb?
Here's how I get it:
MovePiece() on the ChessBoard aggregate => the user's responsibility. The action should all take place in one transaction contained within the ChessBoard boundary.
DecideGameOver() on the ChessGame aggregate => the system's responsibility. Some handler subscribes to the ChessBoard's PieceMoved events and delegates to the ChessGame aggregate for it to decide if the game is over. We can tolerate a delay between the final move being made on the Board and when the Game aggregate is updated - it's eventual consistency.
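A sketch of that handler, with assumed names:

public final class DecideGameOverOnPieceMoved {
    private final ChessGameRepository games; // assumed repository

    public DecideGameOverOnPieceMoved(ChessGameRepository games) {
        this.games = games;
    }

    // Runs after the move has committed on the ChessBoard aggregate;
    // the delay before the ChessGame is updated is acceptable here.
    public void on(PieceMoved event) {
        ChessGame game = games.byBoardId(event.boardId());
        game.decideGameOver(event);
        games.save(game);
    }
}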
It's not a hard and fast rule though, more of an indicator generalized from observation of dozens of systems.
