Domain Driven Design - Aggregate Roots

Domain Driven Design - Aggregate Roots - domain-driven-design

I'm struggling with aggregates and aggregate roots. I have a natural aggregate root which works for about 60% of the user requests. I.e. those requests naturally apply to the aggregate root.
Within my aggregate I have another entity which can only exist as a member of the aggregate root. Users, however, will be told about this other entity object. It will sometimes make sense, conceptually, for users to operate on this non-aggregate root object directly.
So, I think I have a couple of options:
They can they both be aggregate roots depending on which operation is being requested by the user.
All operations have to go through the top level aggregate root.
Note that the top level aggregate root will hold a collection of this other entity.
Example:
Main aggregate root: Car
Second entity: Seat (a Car has either 2 or 4 seats depending on type). In my domain seats can only exist as part of a car.
Most operations in the domain are at the Car level. So that will be a good candidate for aggregate root. However, (and I'm struggling for examples here), some operations will be at the seat level, e.g. SpillCoffee, ChangeFabric, Clean....
Can Seat and Car both be aggregate roots? Or should I always start with Car?
Thanks

The idea of an aggregate is to guarantee consistency, being the root responsible for data integrity and forcing invariants.
Suppose there's a rule like "The fabric of all seats must be the same", or ""you can only spill coffee on the seat if there's someone inside the car". It will be much harder to enforce these, once the clients will be able to change the fabric separately, or these invariants will need to be forced outside (danger zone).
IMHO, if integrity or forcing invariants is not an issue, then aggregates are not really needed. But, if it is necessary, my advice is to start everything with the car. But always think of the model. If there are invariants like these, then who enforces these invariants? Then try passing this idea to the code, and everything should be fine.

Probably you need some deeper knowledge of some aspect of the domain model. This question shows that you are about to invent a way to organize the entities to supply the system, when, ideally, this kind of questions are already answered before implementation.
When this pops out only on the system implementation, you whether go back to review the domain or you discovered some fragility whose feedback could - and should - aggregate changes on related details of the business to make the domain richer and better modeled.
In the car example, I used the approach of two aggregates who correlate different contexts. The first would be the "car has a seat" approach, and in this aggregate the possible actions for "seat" would be just the ones that make sense to "seat as part of a car". Example: Clean.
The second aggregate would be in the context of "seat", and there would be the possible actions and configurations for seat as a standalone. Example: ChangeFabric, ColorList. This way, the aggregate "car" has "seat", but the clients can know seat on the context that makes sense. Which is dangerous, like said by samuelcarrijo on previous post. If the modifications between contexts affects the domain integrity, you lost all the aggregate concept.

In the case of a shopping cart with an cart and line items I have both of those as aggregate roots since I often modify them independently.
public class Cart : IAggregateRoot
{
public List<LineItem> LineItems {get;}
}
public class LineItems : IAggregateRoot
{
public List<LineItem> LineItems {get;}
}
However, I have a separate bounded context for orders and in this case I only need to have one aggregate root since I no longer need to modify the line items independently.
public class Order : IAggregateRoot
{
public List<LineItem> LineItems {get;}
}
The other option is to have a way of looking up the aggregate root from a child ID.
Car GetCarFromSeatID(guid seatID)

Related

DDD: do I really need to load all objects in an aggregate? (Performance concerns)

In DDD, a repository loads an entire aggregate - we either load all of it or none of it. This also means that should avoid lazy loading.
My concern is performance-wise. What if this results in loading into memory thousands of objects? For example, an aggregate for Customer comes back with ten thousand Orders.
In this sort of cases, could it mean that I need to redesign and re-think my aggregates? Does DDD offer suggestions regarding this issue?

Take a look at this Effective Aggregate Design series of three articles from Vernon. I found them quite useful to understand when and how you can design smaller aggregates rather than a large-cluster aggregate.
EDIT
I would like to give a couple of examples to improve my previous answer, feel free to share your thoughts about them.
First, a quick definition about an Aggregate (took from Patterns, Principles and Practices of Domain Driven Design book by Scott Millet)
Entities and Value Objects collaborate to form complex relationships that meet invariants within the domain model. When dealing with large interconnected associations of objects, it is often difficult to ensure consistency and concurrency when performing actions against domain objects. Domain-Driven Design has the Aggregate pattern to ensure consistency and to define transactional concurrency boundaries for object graphs. Large models are split by invariants and grouped into aggregates of entities and value objects that are treated as conceptual whole.
Let's go with an example to see the definition in practice.
Simple Example
The first example shows how defining an Aggregate Root helps to ensure consistency when performing actions against domain objects.
Given the next business rule:
Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.
Here there is an aggregate consisting of Auction and Bids where the Auction is the Aggregate Root.
If we say that Bid is also a separated Aggregate Root you would have have a BidsRepository, and you could easily do:
var newBid = new Bid(money);
BidsRepository->save(auctionId, newBid);
And you were saving a Bid without passing the defined business rule. However, having the Auction as the only Aggregate Root you are enforcing your design because you need to do something like:
var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);
Therefore, you can check your invariant within the method placeBid and nobody can skip it if they want to place a new Bid.
Here it is pretty clear that the state of a Bid depends on the state of an Auction.
Complex Example
Back to your example of Orders being associated to a Customer, looks like there are not invariants that make us define a huge aggregate consisting of a Customer and all her Orders, we can just keep the relation between both entities thru an identifier reference. By doing this, we avoid loading all the Orders when fetching a Customer as well as we mitigate concurrency problems.
But, say that now business defines the next invariant:
We want to provide Customers with a pocket so they can charge it with money to buy products. Therefore, if a Customer now wants to buy a product, it needs to have enough money to do it.
Said so, pocket is a VO inside the Customer Aggregate Root. It seems now that having two separated Aggregate Roots, one for Customer and another one for Order is not the best to satisfy the new invariant because we could save a new order without checking the rule. Looks like we are forced to consider Customer as the root. That is going to affect our performance, scalaibility and concurrency issues, etc.
Solution? Eventual Consistency. What if we allow the customer to buy the product? that is, having an Aggregate Root for Orders so we create the order and save it:
var newOrder = new Order(customerId, ...);
orderRepository.save(newOrder);
we publish an event when the order is created and then we check asynchronously if the customer has enough funds:
class OrderWasCreatedListener:
var customer = customerRepository.findOfId(event.customerId);
var order = orderRepository.findOfId(event.orderId);
customer.placeOrder(order); //Check business rules
customerRepository.save(customer);
If everything was good, we have satisfied our invariants while keeping our design as we wanted at the beginning modifying just one Aggregate Root per request. Otherwise, we will send an email to the customer telling her about the insufficient funds issue. We can take advance of it by adding to the email alternatives options she can purchase with her current budget as well as encourage her to charge the pocket.
Take into account that the UI can help us to avoid having customers paying without enough money, but we cannot blindly trust on the UI.
Hope you find both examples useful, and let me know if you find better solutions for the exposed scenarios :-)

In this sort of cases, could it mean that I need to redesign and re-think my aggregates?
Almost certainly.
The driver for aggregate design isn't structure, but behavior. We don't care that "a user has thousands of orders". What we care about are what pieces of state need to be checked when you try to process a change - what data do you need to load to know if a change is valid.
Typically, you'll come to realize that changing an order doesn't (or shouldn't) depend on the state of other orders in the system, which is a good indication that two different orders should not be part of the same aggregate.

Creating child instances on an aggregate root in DDD

I have been reading Eric Evan's book on DDD and on page 139 he states:
"if you needed to add elements inside a preexisting AGGREGATE, you might create a FACTORY METHOD on the root of the AGGREGATE"
I would assume that could be implemented something like this where the method NewLineItem is used to create and add a new line item to the order.
class Order
{
public IEnumerable<LineItem> LineItems { get; }
public void NewLineItem(Product product, int quantity);
}
Another way I could think of doing this is to move the factory method into the collection itself. Something like this below. I could then add a new item by calling LineItems.New(...).
class Order
{
public LineItems LineItems { get; }
public class LineItems : IEnumerable<LineItem>
{
public void New(Product product, int quantity);
}
}
What are the pros/cons to each approach? Are there any gotchas with moving the factory method into a collection? We are currently trying to figure out the best way to implement a large domain model. We are concerned that some of these root aggregate models will get bloated with numerous factory methods and deletion methods such as RemoveLineItem(LineItem). Our thinking is that moving these factory methods to their collections helps organize the design and keeps the root aggregate less cluttered with methods. Any advice?
Thanks

One advantage of having the factory method on the AR directly is that it makes the AR aware of the changes and allows it to enforce it's invariants. Not only that, but because the method is aware of the internal state of the AR you may be able to reduce the number of arguments passed to the factory method (most useful when creating other related ARs).
E.g. registration = course.register(registrant) vs registration = new Registration(registrant, courseId)
Also, LineItem becomes an implementation detail so the client doesn't need to be aware of that class.
The fact that you are asking this question and are actually worried of having too many methods on your ARs is perhaps an indicator that you may be clustering together objects that do not belong together.
Do not lose sight of the AR main purpose: it's a transactionnal boundary allowing to protect invariants. If there's no invariant to protect then clustering may be unecessary or even undesirable.
I would strongly advise you to read Effective Aggregate Design by Vauhgn Vernon.

There is always that "law" of Demeter business :)
The aggregate root (AR) is going to be responsible for the integrity and invariants. It may be possible that you will have an invariant along the lines of "maximum order total of $50 and no more than 6 line items at any time". The collection will not have access to any of this information (well, perhaps the count). So the idea is that the AR handles these interactions.
If you are concerned with bloat or find yourself with ARs that are unwieldy it may indicate a problem with your design. Vaughn Vernon covers these scenarios quite nicely in his book. You really do want highly cohesive ARs and it can be tricky to identify them correctly. A couple of iterations may be required to get the most comfortable design.
So I would try and stick with Eric's advice and handle the interactions on the AR itself as far as is practically possible.

Aggregates and aggregation roots confusion

i've been assigned a quite simple project as an exam and i had the idea to develop it using the Domain Driven Design.
Many of you might say that the application is so simple that going with repositories and UoW is just a waste of time, and you probably be correct but i think of it as an opportunity to learn something more.
The application is a "Flight tickets" system and from the following image you could probably well guess it's functionality.
The thing is that i am not sure if i am correctly seperating the aggregates and their roots.
EDIT:
I presented the data model so anyone can spot the whole functionality easily.
The thing is that from an employe perspective the flight as "Rad" said encapsulates the whole functionality and is the aggregate root.
However from an admin perspective, flights are none his bussiness.
He just want to update or add new planes-companies, etc..
So then there is a new aggregate root which is the Airplane which encapsulates the Airplane seats(Entity), the seatType(value object) and the company(Entity) as a new aggregate.
This tends to confuses me as i have an aggregate root(Airplane) inside another aggregate(Flight Aggregate).
Since the aggregate root is consider to be the "CORE" entity which without it the other entities inside it will not make any sense without it, i am thinking about Company. And i conclude that company makes sense without the airplane.
To explain more i think of the scenario where the admin want to just insert a new Company, or want to first load a company and then its airplanes.
DDD principles say that any entities inside the aggregate may only be loaded from the root itself.
So here is the confusion.

Mmm, where is the Aggregate and Aggregate roots here ? This is only Data Model... Not Domain Model.
Aggregate is a cluster of items (Domain Object) that are gathered together, and Aggregate Root are the entity root... (If you consider the Flight Aggregate encapsulates Seats, Location... The Aggregate Root should be Flight entity).
[Edit]
You have to ignore the persistent. In your app you can have many aggregate it depends in your Domain, maybe Flight is an Aggregate and Company another one ;), don't confuse entity and Aggregate...

An aggregate is a group of entities (objects with identity) and maybe value objects (objects without identity, immutable). There is exactly one entity in an aggregate that is the aggregate root. You can easily identify it by checking if the other objects in the aggregate depend on it, for example, if you delete an object of the aggregate root type, the remaining objects don't make sense anymore (in database terms, you'd cascade delete the dependent objects).
The aggregate root is the sole object in the aggregate that gives access to the other types in the aggregate, hence you'll have one repository per aggregate and it returns instances of aggregate root type.

DDD: Large Aggregate Root - Person

I am building a system to manage person information. I have an ever growing aggregate root called Person. It now has hundreds of related objects, name, addresses, skills, absences, etc. My concern is that the Person AR is both breaking SRP and will create performance problems as more and more things (esp collections) get added to it.
I cannot see how with DDD to break this down into smaller objects. Taking the example of Absences. The Person has a collection of absence records (startdate, enddate, reason). These are currently managed through the Person (BookAbsence, ChangeAbsence, CancelAbsence). When adding absences I need to validate against all other absences, so I need an object which has access to the other absences in order to do this validation.
Am I missing something here? Is there another AR I have not identified? In the past I would have done this via an "AbsenceManager" service, but would like to do it using DDD.
I am fairly new to DDD, so maybe I am missing something.
Many Thanks....

The Absence chould be modeled as an aggregate. An AbsenceFactory is reposible for validating against other Absence s when you want to add a new Absence.
Code example:
public class AbsenceFactory {
private AbsenceRepository absenceRepository;
public Absence newAbsenceOf(Person person) {
List<Absence> current =
absenceRepository.findAll(person.getIdentifier());
//validate and return
}
}
You can find this pattern in the blue book (section 6.2 Factory if I'm not mistaken)
In other "modify" cases, you could introduce a Specification
public class SomeAbsenceSpecification {
private AbsenceRepository absenceRepository;
public SomeAbsenceSpecification(AbsenceRepository absenceRepository) {
this.absenceRepository=absenceRepository;
}
public boolean isSatisfiedBy(Absence absence) {
List<Absence> current =
absenceRepository.findAll(absence.getPersonIdentifier());
//validate and return
}
}
You can find this pattern in the blue book(section 9.2.3 Specification)

This is indeed what makes aggregate design so tricky. Ownership does not necessarily mean aggregation. One needs to understand the domain to be able to give a proper answer so we'll go with the good ol' Order example. A Customer would not have a collection of Order objects. The simplest rule is to think about deleting an AR. Those objects that could make sense in the absence of the AR probably do not belong on the AR. A Customer may very well have a collection of ActiveOrder objects, though. Of course there would be an invariant stating that a customer cannot be deleted if it has active orders.
Another thing to look out for is a bloated bounded context. It is conceivable that you could have one or more bounded contexts that have not been identified leading to a situation where you have an AR doing too much.
So in your case you may very well still be interested in the Absence should the Customer be deleted. In the case of an OrderLine it has no meaning without its Order. So no lifecycle of its own.
Hope that helps ever so slightly.

I am building a system to manage person information.
Are you sure that a simple CRUD application that edit/query RDBMS's tables via SQL, wouldn't be a cheaper approach?
If you can express the most of the business rules in term of data relations and table operations, you shouln't use DDD at all.
I have an ever growing aggregate root called Person.
If you actually have complex business rules, an ever growing aggregate is often a syntom of undefined (or wrongly defined) context boundaries.

What Belongs to the Aggregate Root

This is a practical Domain Driven Design question:
Conceptually, I think I get Aggregate roots until I go to define one.
I have an Employee entity, which has surfaced as an Aggregate root. In the Business, some employees can have work-related Violations logged against them:
Employee-----*Violations
Since not all Employees are subject to this, I would think that Violations would not be a part of the Employee Aggregate, correct?
So when I want to work with Employees and their related violations, is this two separate Repository interactions by some Service?
Lastly, when I add a Violation, is that method on the Employee Entity?
Thanks for the help!

After doing even MORE research, I think I have the answer to my question.
Paul Stovell had this slightly edited response to a similar question on the DDD messageboard. Substitute "Customer" for "Employee", and "Order" for "Violation" and you get the idea.
Just because Customer references Order
doesn't necessarily mean Order falls
within the Customer aggregate root.
The customer's addresses might, but
the orders can be independent (for
example, you might have a service that
processes all new orders no matter who
the customer is. Having to go
Customer->Orders makes no sense in
this scenario).
From a domain point of view, you can
even question the validity of those
references (Customer has reference to
a list of Orders). How often will you
actually need all orders for a
customer? In some systems it makes
sense, but in others, one customer
might make many orders. Chances are
you want orders for a customer between
a date range, or orders for a customer
that aren't processed yet, or orders
which have not been paid, and so on.
The scenario in which you'll need all
of them might be relatively uncommon.
However, it's much more likely that
when dealing with an Order, you will
want the customer information. So in
code, Order.Customer.Name is useful,
but Customer.Orders[0].LineItem.SKU -
probably not so useful. Of course,
that totally depends on your business
domain.
In other words, Updating Customer has nothing to do with updating Orders. And orders, or violations in my case, could conceivable be dealt with independently of Customers/Employees.
If Violations had detail lines, then Violation and Violation line would then be a part of the same aggregate because changing a violation line would likely affect a Violation.
EDIT**
The wrinkle here in my Domain is that Violations have no behavior. They are basically records of an event that happened. Not sure yet about the implications that has.

Eric Evan states in his book, Domain-Driven Design: Tackling the Complexity in the Heart of Software,
An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes.
There are 2 important points here:
These objects should be treated as a "unit".
For the purpose of "data change".
I believe in your scenario, Employee and Violation are not necessarily a unit together, whereas in the example of Order and OrderItem, they are part of a single unit.
Another thing that is important when modeling the agggregate boundaries is whether you have any invariants in your aggregate. Invariants are business rules that should be valid within the "whole" aggregate. For example, as for the Order and OrderItem example, you might have an invariant that states the total cost of the order should be less than a predefined amount. In this case, anytime you want to add an OrderItem to the Order, this invariant should be enforced to make sure that your Order is valid. However, in your problem, I don't see any invariants between your entities: Employee and Violation.
So short answer:
I believe Employee and Violation each belong to 2 separate aggregates. Each of these entities are also their own aggregate roots. So you need 2 repositories: EmployeeRepository and ViolationRepository.
I also believe you should have an unidirectional association from Violation to Employee. This way, each Violation object knows who it belongs to. But if you want to get the list of all Violations for a particular Employee, then you can ask the ViolationRepository:
var list = repository.FindAllViolationsByEmployee(someEmployee);

You say that you have employee entity and violations and each violation does not have any behavior itself. From what I can read above, it seems to me that you may have two aggregate roots:
Employee
EmployeeViolations (call it EmployeeViolationCard or EmployeeViolationRecords)
EmployeeViolations is identified by the same employee ID and it holds a collection of violation objects. You get behavior for employee and violations separated this way and you don't get Violation entity without behavior.
Whether violation is entity or value object you should decide based on its properties.

I generally agree with Mosh on this one. However, keep in mind the notion of transactions in the business point of view. So I actually take "for the purpose of data changes" to mean "for the purpose of transaction(s)".
Repositories are views of the domain model. In a domain environment, these "views" really support or represent a business function or capability - a transaction. Case in point, the Employee may have one or more violations, and if so, are aspects of a transaction(s) in a point in time. Consider your use cases.
Scenario: "An employee commits an act that is a violation of the workplace." This is a type of business event (i.e. transaction, or part of a larger, perhaps distributed transaction) that occurred. The root affected domain object actually can be seen from more than one perspective, which is why it is confusing. But the thing to remember is behavior as it pertains to a business transaction, since you want your business processes to model the real-world as accurate as possible. In terms of relationships, just like in a relational database, your conceptual domain model should actually indicate this already (i.e. the associativity), which often can be read in either direction:
Employee <----commits a -------committed by ----> Violation
So for this use case, it would be fair that to say that it is a transaction dealing with violations, and that the root - or "primary" entity - is a Violation. That, then would be your aggregate root you would reference for that particular business activity or business process. But that is not to say that, for a different activity or process, that you cannot have an Employee aggregate root, such as the "new employee process". If you take care, there should be no negative impact of cyclic references, or being able to traverse your domain model multiple ways. I will warn, however, that governing of this should be thought about and handled by your controller piece of your business domain, or whatever equivalent you have.
Aside: Thinking in terms of patterns (i.e. MVC), the repository is a view, the domain objects are the model, and thus one should also employ some form of controller pattern. Typically, the controller declares the concrete implementation of and access to the repositories (collections of aggregate roots).
In the data access world...
Using LINQ-To-SQL as an example, the DataContext would be the controller exposing a view of Customer and Order entities. The view is a non-declarative, framework-oriented Table type (rough equivalent to Repository). Note that the view keeps a reference to its parent controller, and often goes through the controller to control how/when the view gets materialized. Thus, the controller is your provider, taking care of mapping, translation, object hydration, etc. The model is then your data POCOs. Pretty much a typical MVC pattern.
Using N/Hibernate as an example, the ISession would be the controller exposing a view of Customer and Order entities by way of the session.Enumerable(string query) or session.Get(object id) or session.CreateCriteria(typeof(Customer)).List()
In the business logic world...
Customer { /*...*/ }
Employee { /*...*/ }
Repository<T> : IRepository<T>
, IEnumerable<T>
//, IQueryable<T>, IQueryProvider //optional
{ /**/ }
BusinessController {
Repository<Customer> Customers { get{ /*...*/ }} //aggregate root
Repository<Order> Orders { get{ /*...*/ }} // aggregate root
}
In a nutshell, let your business processes and transactions be the guide, and let your business infrastructure naturally evolve as processes/activities are implemented or refactored. Moreover, prefer composability over traditional black box design. When you get to service-oriented or cloud computing, you will be glad you did. :)

I was wondering what the conclusion would be?
'Violations' become a root entity. And 'violations' would be referenced by 'employee' root entity. ie violations repository <-> employee repository
But you are consfused about making violations a root entity becuase it has no behavior.
But is 'behaviour' a criteria to qualify as a root entity? I dont think so.

a slightly orthogonal question to test understanding here, going back to Order...OrderItem example, there might be an analytics module in the system that wants to look into OrderItems directly i.e get all orderItems for a particular product, or all order items greater than some given value etc, does having a lot of usecases like that and driving "aggregate root" to extreme could we argue that OrderItem is a different aggregate root in itself ??

It depends. Does any change/add/delete of a vioation change any part of employee - e.g. are you storing violation count, or violation count within past 3 years against employee?

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string