How to delete multiple aggregates in DDD? - domain-driven-design

I know in DDD that deleting the Aggregate Root must remove everything within the Aggregate boundary all at once.
But in the Agile example that Vaughn Vernon gives here https://vaughnvernon.co/?p=838, BackLogItem and Product exist as separate Aggregates, and the BackLogItem Aggregate Root references the Product Aggregate Root by Id. So, if I want to delete the Product Aggregate Root, wouldn't that mean I should also delete its BackLogItems?
So my question is: how do you delete multiple Aggregates in DDD, and is that possible using Domain Services, Domain Events, or something else?
P.S.
According to Vaughn Vernon, we should not modify more than one Aggregate in the same transaction (in some cases we are forced to use eventual consistency).

The usual mechanism for behavior distributed across multiple aggregates is to use a process manager.
I recommend starting from Rinat's write-up, because it really does get to the core of the matter; a process manager is just a stand-in for a human being that reacts to events by sending commands to other aggregates.
Oh look, the Product was removed
I should load a list of the BackLogItems that reference that Product
And remove each of them in turn
If your modeling of the back log items as belonging to a distinct aggregate from the product is correct, then it follows that the changes to the back log items can be separated in time from the changes to the product.
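As an illustration only, here is a minimal sketch of such a process manager in C#, assuming a hypothetical ProductRemoved event, a query for the affected backlog items, and a command bus (none of these names come from Vernon's example):

using System;
using System.Collections.Generic;

// Hypothetical event, command, and ports; illustrative names only.
public record ProductRemoved(Guid ProductId);
public record RemoveBackLogItem(Guid BackLogItemId);

public interface IBackLogItemQuery
{
    // Ids of all backlog items that reference the given product.
    IReadOnlyList<Guid> FindIdsByProduct(Guid productId);
}

public interface ICommandBus
{
    void Send(object command);
}

// The process manager reacts to the event by sending one command per backlog item.
// Each command is handled in its own transaction, so consistency is eventual.
public class ProductRemovalProcessManager
{
    private readonly IBackLogItemQuery _backLogItems;
    private readonly ICommandBus _commands;

    public ProductRemovalProcessManager(IBackLogItemQuery backLogItems, ICommandBus commands)
    {
        _backLogItems = backLogItems;
        _commands = commands;
    }

    public void When(ProductRemoved @event)
    {
        foreach (var id in _backLogItems.FindIdsByProduct(@event.ProductId))
            _commands.Send(new RemoveBackLogItem(id));
    }
}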
Also, see Udi Dahan: Don't Delete -- Just Don't.

Use the Saga pattern. It executes a scenario across transactions and imitates a client's work: a saga sends multiple commands, just as a client would.

I would argue that your problem is not only how to delete multiple aggregates, but also which aggregates you should delete. In your question, you mention that when you delete the Product Aggregate you should delete its Backlog Items, but is that all? What about the Kanban Board and its custom columns and workflow, and N other features associated with that Product Aggregate?
The point of having the BacklogItems know about the Product, but not the other way around, is to avoid a God object that knows about everything, because dozens of features can be attached to the Product Aggregate over time.
When deleting the Product Aggregate, there is no single place that knows everything that needs to be deleted, but everything that needs to be deleted knows about the Product. Therefore, leave the responsibility to delete (or to do something else) to each of the other aggregates: from the Product Aggregate, publish a ProductDeletedEvent and subscribe to it from all the other aggregates that care about it.
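A rough sketch of that idea, with invented event, handler, and repository names; a real implementation would wire these handlers into whatever event-dispatching mechanism you already use:

using System;
using System.Collections.Generic;

// Hypothetical domain event published by the Product Aggregate when it is deleted.
public record ProductDeletedEvent(Guid ProductId);

public class BacklogItem { public Guid Id { get; init; } }

public interface IBacklogItemRepository
{
    IEnumerable<BacklogItem> FindByProduct(Guid productId);
    void Remove(BacklogItem item);
}

// Each feature that cares about a Product subscribes on its own; the Product
// Aggregate never needs to know who is listening.
public class RemoveBacklogItemsOnProductDeleted
{
    private readonly IBacklogItemRepository _backlogItems;

    public RemoveBacklogItemsOnProductDeleted(IBacklogItemRepository backlogItems) =>
        _backlogItems = backlogItems;

    public void Handle(ProductDeletedEvent e)
    {
        foreach (var item in _backlogItems.FindByProduct(e.ProductId))
            _backlogItems.Remove(item);
    }
}

The Kanban Board and any other feature attached to the Product would each get a similar subscriber, so the Product Aggregate never accumulates knowledge of them.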

Related

How to use sagas in a CQRS architecture using DDD?

I am designing a CQRS application using DDD, and am wondering how to implement the following scenario:
a Participant aggregate can be referenced by multiple ParticipantEntry aggregates
an AddParticipantInfoCommand is issued to the Command side, which contains all info of the Participant and one ParticipantEntry (similar to an Order and one OrderLineItem)
Where should the logic be implemented that checks whether the Participant already exists and if it doesn't exist, creates the Participant?
Should it be done in a Saga that first checks the domain model for the existence of the Participant, and if it doesn't find it, issues an AddParticipantCommand and afterwards an AddParticipantEntry command containing the Participant ID?
Should this be done entirely by the aggregate roots in the domain model itself?
You don't necessarily need sagas in order to deal with this situation. Take a look at my blog post on why not to create aggregate roots, and what to do instead:
http://udidahan.com/2009/06/29/dont-create-aggregate-roots/
Where should the logic be implemented that checks whether the Participant already exists and if it doesn't exist, creates the Participant?
In most instances, this behavior should be under the control of the Participant aggregate itself.
Processes are useful when you need to coordinate changes across multiple transaction boundaries. Two changes to the same aggregate, however, can be managed within the same transaction.
You can implement this as two distinct transactions operating on the same aggregate, with coordination; but the extra complexity of a process doesn't offer any gains. It's much simpler to send the single command to the aggregate, and allow it to decide what actions to take to maintain the correct invariant.
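A sketch of that simpler path, with invented names (AddParticipantInfo, IParticipantRepository) standing in for whatever the real model uses; the existence check and the creation both happen inside one transaction against one aggregate:

using System;

public record AddParticipantInfo(Guid ParticipantId, string Name, Guid EntryId, decimal Amount);

public class Participant
{
    public Guid Id { get; }
    public string Name { get; }
    private Participant(Guid id, string name) { Id = id; Name = name; }

    public static Participant Register(Guid id, string name) => new(id, name);

    // The aggregate decides how to incorporate the new entry and which
    // invariants to enforce while doing so.
    public void AddEntry(Guid entryId, decimal amount) { /* enforce invariants, add entry */ }
}

public interface IParticipantRepository
{
    Participant? FindById(Guid id);
    void Add(Participant participant);
    void Save(Participant participant);
}

public class AddParticipantInfoHandler
{
    private readonly IParticipantRepository _participants;
    public AddParticipantInfoHandler(IParticipantRepository participants) => _participants = participants;

    // One command, one aggregate, one transaction: no saga required.
    public void Handle(AddParticipantInfo command)
    {
        var participant = _participants.FindById(command.ParticipantId)
                          ?? RegisterNew(command);
        participant.AddEntry(command.EntryId, command.Amount);
        _participants.Save(participant);
    }

    private Participant RegisterNew(AddParticipantInfo command)
    {
        var participant = Participant.Register(command.ParticipantId, command.Name);
        _participants.Add(participant);
        return participant;
    }
}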
Sagas, in particular, are a pattern for reverting multiple transactions. Yan Cui's How the Saga Pattern manages failures with AWS Lambda and Step Functions includes a good illustration of a travel booking saga.
(Note: there is considerable confusion about the definition of "saga"; the NServiceBus community tends to understand the term in a slightly different way than originally described by Garcia-Molina and Salem. kellabyte's Clarifying the Saga Pattern surveys the confusion.)

Repository within domain objects

I have seen a lot of discussions regarding this topic, but I couldn't get a convincing answer. The general advice is not to have a repository inside a domain object. What about an aggregate root? Isn't it right to give the root the responsibility of manipulating the composed objects?
For example, I have a microservice which takes care of invoices. Invoice is an aggregate root which holds the different products. There is no requirement for this service to give details about individual products. I have two tables, one to store invoice details and the other to store the products of those invoices, and two repositories corresponding to the tables. I have injected the product repository into the invoice domain object. Is it wrong to do so?
I see some mistakes with respect to DDD principles in your question. Let me try to clarify some concepts to give you a hand.
First, you mentioned you have an Aggregate Root, which is Invoice, and then two different repositories. Having an Aggregate Root means that any change to the Entities the Aggregate consists of should be performed via the Aggregate Root. Why? Because you need to satisfy some business rule (invariant) that applies to the relation between those Entities. For instance, given the following business rule:
Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.
Here we have an aggregate consisting of Auction and Bids, where Auction is the Aggregate Root.
If you have a BidsRepository, you could easily do:
var newBid = new Bid(money);
bidsRepository.save(newBid);
and you would be saving a Bid without going through the defined business rule. However, by having a repository only for the Aggregate Root, you enforce your design, because you need to do something like:
var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);
Therefore, you can check your invariant within the placeBid method, and nobody can skip it when placing a new Bid. Afterwards you can save the info into as many tables as you want; that is an implementation detail.
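For illustration, a minimal sketch of such a root, with illustrative names and an invented "auction has ended" rule standing in for whatever the real invariant is:

using System;
using System.Collections.Generic;

public class Bid
{
    public decimal Money { get; }
    public Bid(decimal money) => Money = money;
}

public class Auction
{
    private readonly List<Bid> _bids = new();
    public DateTime EndsAtUtc { get; }

    public Auction(DateTime endsAtUtc) => EndsAtUtc = endsAtUtc;

    public IReadOnlyList<Bid> Bids => _bids;

    // The only way to attach a Bid is through the root, so the invariant
    // "bids must be placed before the auction ends" cannot be bypassed.
    public void PlaceBid(Bid bid, DateTime nowUtc)
    {
        if (nowUtc >= EndsAtUtc)
            throw new InvalidOperationException("The auction has already ended.");
        _bids.Add(bid);
    }
}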
Second, you asked whether it's wrong to inject the repository into a Domain class. Here is a quick explanation:
The repository should depend on the object it returns, not the other way around. The reason for this is that your "domain object" (more on that later) can exist (and should be testable) without being loaded or saved (that is, having a dependency on a repository).
Basically, your design says that in order to have an Invoice you need to provide a MySQL/Mongo/XXX connection, which is an infrastructure detail. Your domain should not know anything about how it is persisted. Your domain knows about behavior, as in the scenario of the Auction and Bids.
These concepts just help you create code that is easier to maintain, as well as helping you apply best practices such as the SRP (Single Responsibility Principle).
Yes, I think it is wrong.
The domain should match the real business model and should not care how data is persisted. Even if the data is internally stored in multiple tables, this should not affect the domain objects in any way.
When you are loading the aggregate root, you should load the related entities as well in one go. For example, this can easily be achieved with the Include method in Entity Framework if you are on .NET. By loading all the data, you ensure that you have a full representation of the business entity at any given time and you don't have to query the database anymore.
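For instance, with Entity Framework Core the repository implementation might look roughly like this (the Invoice/Product shapes are assumptions based on the question above):

using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.EntityFrameworkCore;

public class Invoice
{
    public Guid Id { get; set; }
    public List<Product> Products { get; set; } = new();
}

public class Product
{
    public Guid Id { get; set; }
    public string Name { get; set; } = "";
}

public class InvoiceDbContext : DbContext
{
    public InvoiceDbContext(DbContextOptions<InvoiceDbContext> options) : base(options) { }
    public DbSet<Invoice> Invoices => Set<Invoice>();
}

public class InvoiceRepository
{
    private readonly InvoiceDbContext _db;
    public InvoiceRepository(InvoiceDbContext db) => _db = db;

    // Load the root and its children in one query, so the returned aggregate is
    // complete and never has to reach back into the database for its products.
    public Invoice? FindById(Guid id) =>
        _db.Invoices
           .Include(invoice => invoice.Products)
           .SingleOrDefault(invoice => invoice.Id == id);
}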
Any changes to related entities should be persisted together with the aggregate root in one atomic operation (usually using a transaction).

How do you handle an aggregate root with a collection of child entities whose update frequency is different than the root?

We have an aggregate root in our system and it has child entities in a collection. The problem is that the containing root needs to be updated very frequently, on a per-transaction basis, while the child entities don't; in fact they hardly ever change, being more configuration-like in nature.
My first reflex was to separate them into two different aggregate roots because of our application requirements. But I was reminded of the cascade-delete rule: if we delete one, then the delete should cascade, so their lifetimes are linked.
We stumbled on this problem when we discovered we had a caching problem. Changes to the child entities (configuration) were not being reflected in the system at runtime because the parent was unaware of the changes (we had them as one aggregate root, but someone had created a repository for its children).
The main driver for aggregate boundaries is the invariants of your domain - in other words, aggregate boundaries should be consistency boundaries. Things that must change together atomically must be in the same aggregate.
The cascading delete is (with regard to aggregate boundaries) a nice-to-have rather than a rule. You can always enforce the fact that a Parent still lives by requiring one at the place where you load Child entities. With this design, you can make Parent and Child different aggregates, while still enforcing the rule that no "free floating" Child aggregates can be requested. And deleting Child aggregates in response to a deleted Parent is easy if you have domain events in place.
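One way to express that rule in code is to make the Child repository demand a loaded Parent rather than a bare id; all names here are illustrative:

using System;
using System.Collections.Generic;

public class Parent
{
    public Guid Id { get; }
    public Parent(Guid id) => Id = id;
}

public class Child
{
    public Guid Id { get; }
    public Guid ParentId { get; }
    public Child(Guid id, Guid parentId) { Id = id; ParentId = parentId; }
}

public interface IChildRepository
{
    // Requiring a Parent instance (not just an id) forces callers to load a
    // Parent that still exists before they can get at its children, so no
    // "free floating" Child aggregates can be requested.
    IReadOnlyList<Child> FindFor(Parent parent);
}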
Note: All this is under the assumption that your domain invariants allow separating the aggregates in the first place.
This might be better in a discussion format rather than a Q&A format. I'd recommend trying the audience at DomainDrivenDesign or DDDCQRS.
Are you sure that you have a business requirement to delete data in your domain model? That's really unusual -- in most domain models I've seen, an aggregate will reach an "end of life" state, (example: AccountClosed), but doesn't actually get removed from the system.
A common trap in aggregate design is to think about the structure of the entities. "A has a B" does not necessarily mean that they are part of the same aggregate; the key idea is "A needs to keep B and C consistent". You can think about it like a graph; state B and state C are nodes in the graph, the consistency rules are the edges. If you can't traverse the graph from B to C, then they don't need to be part of the same aggregate, and probably shouldn't be.
My instinct is that caching should be the right answer here. If you are processing millions of transactions per day, and the collection only changes once per month, then simply using a cached value of the collection should produce the right answer most of the time.
In this, I'm influenced by Udi Dahan's essay Race Conditions Don't Exist; by coupling this configuration collection with the rest of the aggregate, you are essentially asserting that changes to the configuration (which are rare) are understood by the business to be happening precisely between two other changes to the aggregate. 3M transactions per day averages 1 per 30ms; are you really scheduling your configuration changes that precisely?
The usual pattern here would be that the consistency rule is removed from the domain model; instead, you monitor for changes that introduce an inconsistency, and mitigate them. That depends upon there being a reasonable way to detect the errors, an efficient way to mitigate them, and a mechanism for keeping the rate under control.
The latter of these would normally be done by having the clients/the application check their local copy of the collection, and making sure the command sent is consistent with that before dispatching the command to the domain model. (Possible questions for your domain experts: how quickly do the configuration changes need to be applied? Do the configuration changes happen when the aggregate is changing frequently or when it is quiet?)
Another possibility might be to change your persistence strategy; if the collection doesn't change often, then there are not a lot of change events related to it. So maybe instead of persisting the aggregate, you look into persisting its history - in other words, using event sourcing here. Maybe if this aggregate lived in a microservice, you could limit the risk of the change? Hard to say; at a million transactions per day, this aggregate sounds pretty important.

Should the implementation of repositories be isolated like their corresponding aggregates?

The benefit of having repositories when using DDD is that they allow one to design a domain model without worrying about how objects will be persisted. It also allows the final product to be more flexible, as different implementations of repositories can be swapped in and out easily. So it's possible for the implementation of repositories to be based on SQL databases, REST web services, XML files, or any other method of storing and retrieving data. From the model's perspective, the expectation is that there are just these magic collections that can be used to store and retrieve aggregate root objects.
Now if I have two normal in-memory collections, say an IList<Order> and an IList<Customer>, I would never expect that modifying one collection would affect the other. So should the same logic apply to repositories? Should the actual implementation of repositories be totally isolated from one another, even if they in reality access the same database?
For example, a cascade-on-delete relationship may be set up in a SQL database between a Customers table and an Orders table so that corresponding orders are deleted when a customer is deleted. Yet this functionality would break if the SQLCustomerRepository were later replaced by a RESTCustomerRepository.
So am I correct in thinking that the model should always be under the assumption that repositories are totally isolated from one another, and correspondingly the actual implementation of repositories should be isolated as well?
So if Orders should be deleted when a Customer is deleted, should this be defined explicitly in the domain model, rather than relying on the database? Say, through a CustomerService.DeleteCustomer() method which accesses the current ICustomerRepository and IOrderRepository.
I think I am just having a hard time getting my head out of the relational world and into the DDD world. I keep wanting to think of things in terms of tables and PK/FK relationships, where I should just ignore that a database is involved at all.
I believe the point you are missing is that aggregate roots draw context boundaries.
In simple words: the stuff underneath makes sense only together with the aggregate root itself.
As I see it, Order is not an aggregate root but an entity which lives in the Customer aggregate root's context. That means there is no need for an Order repository, because repositories are supposed to be per aggregate root. So there should be only a CustomerRepository, which is supposed to know how to persist Customer.Orders too.
I myself don't worry that much; I omit the repository pattern altogether and just rely on the NHibernate ORM. A rich domain model that correctly tracks and monitors state changes is much more important than the way you actually send update/select SQL statements.
Also, think twice before deleting stuff.
Never delete a customer; a customer is not deleted, it is made inactive or something similar. Also, please don't cascade-delete orders; it will get you into strange places. Orders should always be preserved once they are processed. Think of the reports for your application: 1.1 million in revenue just went away because you decided to cascade delete.
You have a repository per aggregate root, not per entity, so even cascading deletion of an aggregate root's children is fine inside the aggregate root's repository, since it is still isolated.
Don't cascade deletion (or any other side effect) across aggregate roots; coordinate that logic in the application layer.
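A rough sketch of that application-layer coordination, along the lines of the CustomerService.DeleteCustomer() idea from the question (the repository method names are assumptions):

using System;
using System.Collections.Generic;

public class Customer { public Guid Id { get; init; } }
public class Order { public Guid Id { get; init; } public Guid CustomerId { get; init; } }

public interface ICustomerRepository
{
    Customer? FindById(Guid id);
    void Remove(Customer customer);
}

public interface IOrderRepository
{
    IEnumerable<Order> FindByCustomer(Guid customerId);
    void Remove(Order order);
}

public class CustomerService
{
    private readonly ICustomerRepository _customers;
    private readonly IOrderRepository _orders;

    public CustomerService(ICustomerRepository customers, IOrderRepository orders)
    {
        _customers = customers;
        _orders = orders;
    }

    // The application layer decides what happens to the customer's orders;
    // neither repository knows about the other, and nothing relies on a
    // database-level cascade.
    public void DeleteCustomer(Guid customerId)
    {
        foreach (var order in _orders.FindByCustomer(customerId))
            _orders.Remove(order);

        var customer = _customers.FindById(customerId);
        if (customer is not null)
            _customers.Remove(customer);
    }
}

(Whether you should delete at all is a separate question; as noted above, marking the customer inactive is usually the safer move.)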
Your domain model should model the transactional operations of your domain. By putting Orders on Customer, in your Customer entity, you are saying that when a Customer is deleted, so are his Orders.
If you have OrderIds on your Customer, that's different. Then you have an association between Customer and Orders. In that case, you are saying that by adding or removing from the list of OrderIds on the Customer, you are adding or removing associations, not adding or deleting Orders.
Should the actual implementation of repositories be totally isolated from one another, even if they in reality access the same database?
Yes, for the most part. If you decide to make both Order and Customer Aggregate Roots, you are saying they are independent of one another and should be allowed to change independently and simultaneously. That is, you don't need the changes to be transactional between the two. If you only make Customer an Aggregate Root and have it hold a list of Orders, you are saying that the Customer entity dictates what happens to the Orders, and changing a Customer will cascade changes to its Orders.
Now in your example, it seems you'd have Customer as an aggregate root and Order as an aggregate root, each with its own repo. Customer would have a list of OrderIds to model the one-to-many association. If you deleted a Customer, you could publish a customer-deleted event and have everything related to that customer clean itself up.

Aggregate roots depend on the use case, so does that mean we might end up with a lot of repositories?

I've heard a lot that aggregate roots depend on the use case. But what does that mean in a coding context?
You have a service class which of course holds methods (use cases) that are going to accomplish something via a repository. Great, so you use a repository, which corresponds to an aggregate root, to perform your querying.
Now you need to perform some other kind of operation which serves a totally different use case than the first service class but uses the same entities.
Here is the representation:
Entities: Customer, Orders, LineOrder
Service 1: Add new customers, Delete some customers, retrieve customer orders
Here the aggregate root seems to be Customer, because you need this repository to perform those use cases.
Service 2: Retrieve customer from an actual order
Here the aggregate root seems to be Order, because you need this repository to perform this use case.
If I am wrong, please correct me. Now, that means you have two aggregate roots.
Now my question is: since aggregate roots depend on the use case, does that mean we might end up with a lot of repositories if we end up having lots of use cases?
The above example was probably not the best... so let's say we have a Journal which holds JournalEntries, each of which holds Tasks, Problems and Notes. (This is in the context of telling a system what has been done on a project.)
Does that mean I'm going to end up with two repositories (Journal, JournalEntry)
in the use cases where I need to add new tasks, problems and notes from a journal entry?
(Can be seen as a service)
Or might I end up with four repositories (Journal, Task, Problems, Notes)
in the use cases where I need to access tasks, problems and notes directly?
(Can be seen as another service)
But would that mean that, if I need both of these services (which actually hold the use cases), I actually need five repositories to be able to perform the use cases in both of them?
Thanks.
Hi, I saw your post and thought I might give you my opinion. First I must say I've been doing DDD in projects for three years now, so I'm not an expert. But I'm currently working on a project as an architect, coaching developers in DDD, and I must say it isn't a walk in the park... I don't know how many times I've refactored the model and Entity relationships.
But my experience is that you end up with some repositories (more than a few, but not many). My Aggregates usually contain a few classes, and the Aggregate object graph isn't that deep (if you know what I mean).
But let me try to be concrete:
1) Aggregate roots are defined by your needs. I mean, if you feel that you need to reach that Tasks object through Journal too often, then maybe that's a sign it should be promoted to an aggregate root.
2) But not everything can be an aggregate root, so try to encapsulate objects that are tightly related. Notes seems like a candidate for being owned by a root object. You'd probably always relate Notes to the root, or it loses its context. Notes cannot live by itself.
3) Remember that Aggregates are used for splitting up large, complex domains into smaller "islands" that take care of their inhabitants. It's important not to make your domain more complex than it is.
4) You don't know what your model looks like until you've gotten far into the project's implementation phase. If you realize that some repositories aren't used that much, they may be candidates for merging into another root object (if they have that kind of relationship). You can also break out objects that are used a lot through a root object but without needing its context. For example, say Journal is an aggregate root and contains Notes and Tasks. After a while your model grows, and maybe Tasks gets associations to Action and ActionHistory and User and Rule and Permission. Now I've just thrown in a bunch of common objects for rule/action/user-permission functionality. Maybe this results in use cases that approach Tasks from another angle, "View all Tasks performed by this User", etc. Tasks gets more involved in some kind of state/workflow engine and is therefore a candidate for being an aggregate root itself.
Okay, not the best example, but maybe it gives you the idea. A root object can contain children, and some of those children can also be root objects because we need them in another context (other than Journal).
But I have myself banged my head against the wall every time I start with a fresh model. Just go with the flow and let the model evolve through its clients/subscribers. You refine the model through its usage. The Services (application services, not domain services) are of course extended with methods that respond to the UI and use cases (often one-to-one).
I hope I helped you in some way... or not :D
Yes, you would most likely end up with 5 repositories (Journal, JournalEntry, Task, Problems, Notes). Your services would then use these repositories to perform CRUD for each type of entity.
Your reaction of "wow so many repositories" is not uncommon for developers new to DDD.
However, your repositories are usually lightweight, assuming your model and DB schema are fairly evenly matched, which is often the case. If you use an ORM such as NHibernate or a tool such as CodeSmith Generator, then it gets even easier to create your repositories.
First, you need to define what an aggregate is. I don't know about "use case" aggregates.
What I know about aggregates is the following...
An aggregate is a union of several entities. One of the entities is the aggregate root; the remaining entities (or value types) make sense only in the context of the selected aggregate root. For example, you can define Order and OrderLine as an aggregate if you don't need to do any independent actions with OrderLine entities. That means OrderLine makes sense only in the context of an Order.
Why define aggregates at all? They are needed to reduce references between objects, which simplifies your domain model.
And of course you don't need an OrderLineRepository if OrderLine is part of the Order aggregate.
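A small sketch of that shape; OrderLine is reachable only through Order, so the only repository is the one for the root (names are illustrative):

using System;
using System.Collections.Generic;

public class OrderLine
{
    public string Product { get; }
    public int Quantity { get; }
    internal OrderLine(string product, int quantity) { Product = product; Quantity = quantity; }
}

public class Order
{
    private readonly List<OrderLine> _lines = new();
    public Guid Id { get; }

    public Order(Guid id) => Id = id;

    // Lines can only be created and attached through the root.
    public void AddLine(string product, int quantity) =>
        _lines.Add(new OrderLine(product, quantity));

    public IReadOnlyList<OrderLine> Lines => _lines;
}

public interface IOrderRepository
{
    Order? FindById(Guid id);
    void Save(Order order); // persists the Order and its lines together
}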
Here is a link with more information. You can also read Eric Evans's DDD book; he explains aggregates very well.
