How do you handle associations between aggregates in DDD? - domain-driven-design

I'm still wrapping my head around DDD, and one of the stumbling blocks I've encountered is in how to handle associations between separate aggregates. Say I've got one aggregate encapsulating Customers and another encapsulating Shipments.
For business reasons Shipments are their own aggregates, and yet they need to be explicitly tied to Customers. Should my Customer domain entity have a list of Shipments? If so, how do I populate this list at the repository level - given I'll have a CustomerRepository and a ShipmentRepository (one repo per aggregate)?
I'm saying 'association' rather than 'relationship' because I want to stress that this is a domain decision, not an infrastructure one - I'm designing the system from the model first.
Edit: I know I don't need to model tables directly to objects - that's the reason I'm designing the model first. At this point I don't care about the database at all - just the associations between these two aggregates.

There's no reason your ShipmentRepository can't aggregate customer data into your shipment models. Repositories do not have to have a 1-to-1 mapping with tables.
I have several repositories which combine multiple tables into a single domain model.

I think there's two levels of answering this question. At one level, the question is how do I populate the relationship between customer and shipment. I really like the "fill" semantics where your shipment repository can have a fillOrders( List customers, ....).
The other level is "how do I handle the denormalized domain models that are a part of DDD". And "Customer" is probably the best example of them all, because it simply shows up in such a lot of different contexts; almost all your processes have customer in them and the context of the customer is usually extremely varied. At max half the time you are interested in the "orders". If my understanding of the domain was perfect when starting, I'd never make a customer domain concept. But it's not, so I always end up making the Customer object. I still remember the project where I after 3 years felt that I was able to make the proper "Customer" domain model. I would be looking for the alternate and more detailed concepts that also represent the customer; PotentialCustomer, OrderingCustomer, CustomerWithOrders and probably a few others; sorry the names aren't better. I'll need some more time for that ;)

Shipment has relation many-to-one relationship with Customer.
If your are looking for the shipments of a client, add a query to your shipment repository that takes a client parameter.
In general, I don't create one-to-mane associations between entities when the many side is not limited.

Related

Can DDD repositories return data from other aggregate roots?

I'm having trouble getting my head around how to use the repository pattern with a more complex object model. Say I have two aggregate roots Student and Class. Each student may be enrolled in any number of classes. Access to this data would therefore be through the respective repositories StudentRepository and ClassRepository.
Now on my front end say I want to create a student details page that shows the information about the student, and a list of classes they are enrolled in. I would first have to get the Student from StudentRepository and then their Classes from ClassRepository. This makes sense.
Where I get lost is when the domain model becomes more realistic/complex. Say students have a major that is associated with a department, and classes are associated with a course, room, and instructors. Rooms are associated with a building. Course are associated with a department etc.. etc..
I could easily see wanting to show information from all these entities on the student details page. But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
I understand the ClassRepository should only be responsible for updating classes, and not anything in other aggregate roots. But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots? In most cases this would only need to be a partial summary of those related entities (building name, course name, course number, instructor name, instructor email etc..).
But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
Yup.
But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots?
Nobody cares about "violate DDD". What we care about is: do you still get the benefits of the repository pattern if you start pulling in data from other aggregates?
Probably not - part of the point of "aggregates" is that when writing the business code you don't have to worry to much about how storage is implemented... but if you start mixing locked data and unlocked data, your abstraction starts leaking into the domain code.
However: if you are trying to support reporting, or some other effectively read only function, you don't necessarily need the domain model at all -- it might make sense to just query your data store and present a representation of the answer.
This substitution isn't necessarily "free" -- the accuracy of the information will depend in part on how closely your stored information matches your in memory information (ie, how often are you writing information into your storage).
This is basically the core idea of CQRS: reads and writes are different, so maybe we should separate the two, so that they each can be optimized without interfering with the correctness of the other.
Can DDD repositories return data from other aggregate roots?
Short answer: No. If that happened, that would not be a DDD repository for a DDD aggregate (that said, nobody will go after you if you do it).
Long answer: Your problem is that you are trying to use tools made to safely modify data (aggregates and repositories) to solve a problem reading data for presentation purposes. An aggregate is a consistency boundary. Its goal is to implement a process and encapsulate the data required for that process. The repository's goal is to read and atomically update a single aggregate. It is not meant to implement queries needed for data presentation to users.
Also, note that the model you present is not a model based on aggregates. If you break that model into aggregates you'll have multiple clusters of entities without "lines" between them. For example, a Student aggregate might have a collection of ClassEnrollments and a Class aggregate a collection of Atendees (that's just an example, note that modeling many to many relationships with aggregates can be a bit tricky). You'll have one repository for each aggregate, which will fully load the aggregate when executing an operation and transactionally update the full aggregate.
Now to your actual question: how do you implement queries for data presentation that require data from multiple aggregates? well, you have multiple options:
As you say, do multiple round trips using your existing repositories. Load a student and from the list of ClassEnrollments, load the classes that you need.
Use CQRS "lite". Aggregates and respositories will only be used for update operations and for query operations implement Queries, which won't use repositories, but access the DB directly, therefore you can join tables from multiple aggregates (Student->Enrollments->Atendees->Classes)
Use "full" CQRS. Create read models optimised for your queries based on the data from your aggregates.
My preferred approach is to use CQRS lite and only create a dedicated read model when it's really needed.

Relationships between multiple aggregate roots

In many applications, I deal with users and finance companies (as an example) and I have long been struggling to model the relationship between the two according to Domain Driven Design principles.
In my system I can do the following:
Add a user to an existing finance company.
Add a finance company to an existing user.
I believe both are aggregate roots... Finance Company and User.
How do I model the relationship between the 2? Is it FinanceCompany.Users? or User.FinanceCompanies? Is it neither? Or am I missing knowledge of some key DDD concept(s)? The problem is if I choose one way over the other, the code is more understandable / clear from one aggregate root entry point, but not the other. Sometimes there are cases where it makes more sense to navigate to a Finance Company and add users to it, and other times there are cases where it makes more sense to navigate to a specific user and add finance companies to the user.
Is there some better way to approach this, maybe through repository methods? Is there some key concept I am not getting or understanding here? It doesn't feel right to assume the relationship between Finance Company and User belongs under either of the 2 ARs. When I store the relationship I have to store it in a table named FinanceCompanyUsers or UserFinanceCompanies, but it still doesn't seem clear.
Would I have code such as FinanceCompany.AddUser() and User.AddFinanceCompany()? or is there some completely different approach for relationships such as this?
You have already determined that both User and FinanceCompany are aggregates so each has its own life-cycle.
The problem with many domains is that we don't have a complete understanding of the relationships. As another example we can take an Order and a Product. Both are aggregates but would we have Order.AddProduct() or Product.AddOrder()? In this case it seems pretty obvious in that an Order contains a limited subset of Product entries whereas a Product may very well contain many orders and we are not really too interested in that relationship since it is a rather weak relationship. A Product can exist and be valid without any orders attached but an Order is pretty useless without at least one product entry. This means that the Order has an invariant imposed in terms of its OrderItem entries. In addition to this we have enough knowledge about this hackneyed example that we know we are going to need an associative entity (in relational theory speak) since we need additional information regarding the relationship and entering the fray would be our OrderItem table. Curiously I have not seen it called OrderProduct.
The guidance I would suggest is to pick the most appropriate side.
However, if no side is a true winner and both aggregates can exist without a relationship to the other in terms of an invariant perhaps the relationship itself is an aggregate as you have certainly alluded to. Perhaps it isn't only a UserFinanceCompany aggregate but perhaps there is a concept that is missing from the ubiquitous language that the domain experts refer to. Perhaps something like Auditor or some such that represents that relationship. This is akin to the OrderItem or OrderLine concept as opposed to OrderProduct.

DDD: do I really need to load all objects in an aggregate? (Performance concerns)

In DDD, a repository loads an entire aggregate - we either load all of it or none of it. This also means that should avoid lazy loading.
My concern is performance-wise. What if this results in loading into memory thousands of objects? For example, an aggregate for Customer comes back with ten thousand Orders.
In this sort of cases, could it mean that I need to redesign and re-think my aggregates? Does DDD offer suggestions regarding this issue?
Take a look at this Effective Aggregate Design series of three articles from Vernon. I found them quite useful to understand when and how you can design smaller aggregates rather than a large-cluster aggregate.
EDIT
I would like to give a couple of examples to improve my previous answer, feel free to share your thoughts about them.
First, a quick definition about an Aggregate (took from Patterns, Principles and Practices of Domain Driven Design book by Scott Millet)
Entities and Value Objects collaborate to form complex relationships that meet invariants within the domain model. When dealing with large interconnected associations of objects, it is often difficult to ensure consistency and concurrency when performing actions against domain objects. Domain-Driven Design has the Aggregate pattern to ensure consistency and to define transactional concurrency boundaries for object graphs. Large models are split by invariants and grouped into aggregates of entities and value objects that are treated as conceptual whole.
Let's go with an example to see the definition in practice.
Simple Example
The first example shows how defining an Aggregate Root helps to ensure consistency when performing actions against domain objects.
Given the next business rule:
Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.
Here there is an aggregate consisting of Auction and Bids where the Auction is the Aggregate Root.
If we say that Bid is also a separated Aggregate Root you would have have a BidsRepository, and you could easily do:
var newBid = new Bid(money);
BidsRepository->save(auctionId, newBid);
And you were saving a Bid without passing the defined business rule. However, having the Auction as the only Aggregate Root you are enforcing your design because you need to do something like:
var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);
Therefore, you can check your invariant within the method placeBid and nobody can skip it if they want to place a new Bid.
Here it is pretty clear that the state of a Bid depends on the state of an Auction.
Complex Example
Back to your example of Orders being associated to a Customer, looks like there are not invariants that make us define a huge aggregate consisting of a Customer and all her Orders, we can just keep the relation between both entities thru an identifier reference. By doing this, we avoid loading all the Orders when fetching a Customer as well as we mitigate concurrency problems.
But, say that now business defines the next invariant:
We want to provide Customers with a pocket so they can charge it with money to buy products. Therefore, if a Customer now wants to buy a product, it needs to have enough money to do it.
Said so, pocket is a VO inside the Customer Aggregate Root. It seems now that having two separated Aggregate Roots, one for Customer and another one for Order is not the best to satisfy the new invariant because we could save a new order without checking the rule. Looks like we are forced to consider Customer as the root. That is going to affect our performance, scalaibility and concurrency issues, etc.
Solution? Eventual Consistency. What if we allow the customer to buy the product? that is, having an Aggregate Root for Orders so we create the order and save it:
var newOrder = new Order(customerId, ...);
orderRepository.save(newOrder);
we publish an event when the order is created and then we check asynchronously if the customer has enough funds:
class OrderWasCreatedListener:
var customer = customerRepository.findOfId(event.customerId);
var order = orderRepository.findOfId(event.orderId);
customer.placeOrder(order); //Check business rules
customerRepository.save(customer);
If everything was good, we have satisfied our invariants while keeping our design as we wanted at the beginning modifying just one Aggregate Root per request. Otherwise, we will send an email to the customer telling her about the insufficient funds issue. We can take advance of it by adding to the email alternatives options she can purchase with her current budget as well as encourage her to charge the pocket.
Take into account that the UI can help us to avoid having customers paying without enough money, but we cannot blindly trust on the UI.
Hope you find both examples useful, and let me know if you find better solutions for the exposed scenarios :-)
In this sort of cases, could it mean that I need to redesign and re-think my aggregates?
Almost certainly.
The driver for aggregate design isn't structure, but behavior. We don't care that "a user has thousands of orders". What we care about are what pieces of state need to be checked when you try to process a change - what data do you need to load to know if a change is valid.
Typically, you'll come to realize that changing an order doesn't (or shouldn't) depend on the state of other orders in the system, which is a good indication that two different orders should not be part of the same aggregate.

DDD: Confusion about repository/domain boundaries

My domain consists of Products, Departments, Classes, Manufacturers, DailySales, HourlySales.
I have a ProductRepository which facilitates storing/retrieving products from storage.
I have a DepartmentAndClass repository which facilitates storing/retrieving of departments and classes, as well as adding and removing products from those departments and classes.
I also have a DailySales repository which I use to retrieve statistics about daily sales from multiple groupings. ie..
DailySales.GetSalesByDepartment(dateTime)
DailySales.GetSalesByClass(dateTime)
DailySales.GetSalesByHour(dateTime)
Is it correct to have these sales tracking methods in their own repository like this? Am I on the right track?
Since domains are so dependent on context some answers are harder than others. I would, however, place statistics on the Query side of things. You probably do not want to be calculating those stats on the fly as you will be placing some heavy processing on your database. Typically the stats should be denormalized for quick access where only filtering is required.
You may want to take a look at CQRS if you haven't done so.
Although most queries return an object or a collection of objects, it also fits within the concept to return some types of summary calculations, such as an object count, or a sum of a numerical attribute that was intended by the model to be tallied.
Eric Evans - Domain-Driven Design
This might be considered a read model. Are these daily sales objects being used in any domain model behaviour? Does any business logic depend on them? If not, it might be a good idea to separate this out into a distinct read model - at which point you're taking your first steps into CQRS.

In domain driven design, can entities have their own repositories?

I'm working a pretty standard e-commerce web site where there are Products and Categories. Each product has an associated category, which is a simple name-value pair object used to categorise a product (e.g. item 1234 may have a category "ballon").
I modelled the product as a root aggregate, which owns and knows how to modify it's category, which is an entity.
However, I ran into a problem where a user needs to be able to search a category. How am I supposed to implement this in DDD? I'm new to DDD but I believe that only root aggregates should be given it's own repository. So that leaves me with 2 options:
Add "SearchCategory" method to the ProductRepository
Implement the search logic as service (i.e. CategoryFinderService)
I personally think option 2 is more logical but it feels weird to have a service that touches database. Somehow I feel that only repository should be allowed to interact with database.
Can someone please tell me what's the best way to implement this?
IMHO, in your Domain Model, Category should not be child of the Product Aggregation. The Product has a Category, but it does not know how to create or edit a Category.
Take this another example. Imagine the ShoppingCart class, it's an aggregate root and contains a list of Items. The ShoppingCart is responsible for adding/editing/removing the Items, in this case you won't need a Repository for the Item class.
Not sure by the way, I'm new to this just like you.
Placing something You don't know where to put into artificial services usually leads to anemic domain model.
I would go with first option. But need for entities without context of root is a sign that You might lack another root.
Don't try to implement everything with your domain model. The domain model is powerful for changing the state of the system, but unnecessary complex for querying. So separate the two. It's called Command Query Responsibility Segregation, or CQRS. And no, it has nothing to do with Event Sourcing, even though they do work nicely together.
I implement scenarios such as this so that I have a domain logic side with the domain objects and repositories (if needed), which do the state changing when something happens, i.e. new order is placed or order is shipped. But when I just need to show something in the UI, for instance the list of the products filtered by the category, it is a simple query and does not involve the domain objects at all. It simply returns Data Transfer Objects (DTO) that do not contain any domain logic at all.

Resources