How to properly define an aggregate in DDD?


What would be a rule of thumb when designing an aggregate in DDD?
According to Martin Fowler, an aggregate is a cluster of domain objects that can be treated as a single unit. An aggregate will have one of its component objects be the aggregate root.
https://martinfowler.com/bliki/DDD_Aggregate.html
After designing approximately 20 DDD projects, I am still confused about the rule of thumb for choosing the domain objects that make up an aggregate.
Martin Fowler uses an order and line-items analogy, and I don't think it is a good example, because an order and its line items are really tightly bound objects. There is not much to think about in that example.
Let's try a car analogy, where CarContent is a subdomain of a car dealer domain.
CarContent would consist of one or more aggregates.
For example, we have this aggregate root (example A; I am keeping it as simple as possible):
class CarStructureAggregate
{
    public int Id {get; private set;}
    public ModelType ModelType {get; private set;}
    public int Year {get; private set;}
    public List<EquipmentType> EquipmentTypes {get; private set;}
}
An alternative could be this (example B):
class CarStructureAggregate
{
    public int Id {get; private set;}
    public ModelType ModelType {get; private set;}
    public int Year {get; private set;}
}
class CarEquipmentAggregate
{
    public int Id {get; private set;}
    public List<EquipmentType> EquipmentTypes {get; private set;}
}
A car can be created without equipment, but it cannot be activated/published without the equipment (i.e. this can be populated over two different transactions).
Equipment can be referenced through CarStructureAggregate in example A or using CarEquipmentAggregate in example B.
EquipmentType could be an enum, or could be a complex class with many more classes, properties.
What is a rule of thumb when choosing between examples A and B?
Now imagine that the car could have more information, such as:
- photos
- description
- maybe more data about the engine
and CarStructureAggregate could become an extremely large class.
So what is it that makes us split an aggregate into new aggregates? Size? Atomicity of a transaction (although that would not be an issue, since aggregates of the same subdomain are usually located on the same server)?

Be careful about having too strong an OO mindset. The blue book and Martin Fowler's post are a little bit old, and the vision they provide is too narrow.
An aggregate does not need to be a class. It does not need to be persisted. These are implementation details. Sometimes an aggregate even does things that do not imply a change, only an "OK, this action may be done".
iTollu's post gives you a good start: what matters is the transactional boundary. An aggregate has just one job: to ensure invariants and domain rules in an action that, in most cases (but not always), changes data that must be persisted. The transactional boundary means that once the aggregate says that something may be done, and has been done, nothing in the world should contradict it. If a contradiction occurs, your aggregate is badly designed, and the rule that contradicts it should be part of the aggregate.
So, to design aggregates, I usually start very simple and keep evolving. Think of a static function that receives all the VOs, entities and command data (almost all of them DTOs, except for the unique IDs of the entities) needed to check the domain rules for the action, and returns a domain event saying that something has been done. The event must contain all the data your system needs to persist the changes, if needed, and to act accordingly when the event reaches other aggregates (in the same or a different bounded context).
Now start refactoring and OO designing. Suppress the primitive obsession antipattern. Add constraints to avoid incorrect states of entities and VOs. A piece of code that checks or calculates something related to an entity is better placed in the entity. Put your events on a diet. Put static functions that need almost the same VOs and entities to check domain rules together, creating a class as the aggregate root. Use repositories to create the aggregates in an always-valid state. And a long etc. You know: just good OOP design, going towards no DTOs, the "tell, don't ask" premise, responsibility segregation and so on.
When you finish all that work you will find your aggregates, VOs and entities perfectly designed from a domain (bounded context related) and technical view.
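As a rough illustration of that starting point, here is a minimal Java sketch based on the car example from the question. CarRules, publish and CarPublished are hypothetical names, and equipment is reduced to plain strings; the only rule checked is the one stated in the question (a car cannot be published without equipment). Treat it as one possible shape under those assumptions, not a definitive implementation.

```java
import java.util.List;

// Hypothetical domain event: carries the data needed to persist the change.
final class CarPublished {
    final int carId;
    final List<String> equipment;

    CarPublished(int carId, List<String> equipment) {
        this.carId = carId;
        this.equipment = List.copyOf(equipment); // defensive, immutable copy
    }
}

// Hypothetical holder for the "static function" stage, before any OO refactoring.
final class CarRules {
    private CarRules() {}

    // Checks the domain rule for the action and returns an event saying it was done.
    static CarPublished publish(int carId, List<String> equipment) {
        if (equipment.isEmpty()) {
            throw new IllegalStateException("A car cannot be published without equipment");
        }
        return new CarPublished(carId, equipment);
    }
}
```

From here, the refactoring described above would gradually move the check into an entity or an aggregate root once several such functions share the same inputs.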

Something to keep in mind when designing aggregates is that the same entity can be an aggregate in one use case and a normal entity in another. So you can have a CarStructureAggregate that owns a list of EquipmentTypes, but you can also have an EquipmentTypeAggregate that owns other things and has its own business rules.
Remember, though, that aggregates can update their own properties but not update the properties of owned objects. For example if your CarStructureAggregate owns the list of EquipmentType, you cannot change properties of one of the equipment types in the context of updating the CarStructureAggregate. You must query the EquipmentType in its aggregate role to make changes to it. CarStructureAggregate can only add EquipmentTypes to its internal list or remove them.
Another rule of thumb is only populate aggregates one level deep unless there is an overriding reason to go deeper. In your example you would instantiate the CarStructureAggregate and fill the list of EquipmentTypes, but you would not populate any lists that each EquipmentType might own.
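One way to make the "add or remove, but do not modify" rule hard to break is sketched below in Java. It assumes the car aggregate keeps only the identities of its equipment types, which is a design choice of this sketch rather than something stated above; CarStructure and its method names are hypothetical.

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical sketch: the car aggregate holds only the identities of its equipment types.
final class CarStructure {
    private final int id;
    private final Set<Integer> equipmentTypeIds = new LinkedHashSet<>();

    CarStructure(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }

    // The aggregate may add to or remove from its own list...
    void addEquipment(int equipmentTypeId) {
        equipmentTypeIds.add(equipmentTypeId);
    }

    void removeEquipment(int equipmentTypeId) {
        equipmentTypeIds.remove(equipmentTypeId);
    }

    // ...but callers only get a read-only view, so every change goes through the methods above.
    Set<Integer> equipmentTypeIds() {
        return Collections.unmodifiableSet(equipmentTypeIds);
    }
}
```

Changing an equipment type itself would then mean loading that equipment type in its own aggregate role, as described above.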

I believe what matters here is the transactional boundary.
On the one hand, you can't make it narrower than what is needed to preserve the aggregate's consistency.
On the other hand, you don't want to make it so large that it locks your users out of concurrent modifications.
In your example, if users should be able to modify CarStructure and CarEquipment concurrently, then I'd stick to implementation B. If not, it would be simpler to use A.

In one very simple sentence, I can say:
Basically, an aggregate in domain-driven design is a business use case that changes state and consists of one or more related entities, value objects, and business invariants. The fact that it models a command is important, because if you only need to read, you don't need an aggregate.

Related

Initializing Domain Objects - observing SOLID, Tell, Don't Ask

I'm trying to follow some of the more current design principles including SOLID and Domain Driven Design. My question is around how people handle "Initializing" Domain Objects.
Here's a simple example:
Based on SOLID, I should not depend on concretions, so I create an interface and a class. Since I'm taking advantage of Domain Driven Design, I create an object with relevant methods. (i.e. not anemic).
interface IBookstoreBook
{
    string Isbn {get; set;}
    int Inventory {get;}
    void AddToInventory(int numBooks);
    void RemoveFromInventory(int numBooks);
}
public class BookstoreBook : IBookstoreBook
{
    public string Isbn {get; set;}
    public int Inventory {get; private set;}
    public void AddToInventory(int numBooks) { Inventory += numBooks; }
    public void RemoveFromInventory(int numBooks) { Inventory -= numBooks; }
}
To help with testing and be more loosely coupled, I also use an IoC container to create this book. So when the book is created it is always created empty. But, if a book doesn't have an ISBN and Inventory it is invalid.
BookstoreBook(string bookISBN, int bookInventory) {..} // Does not exist
I could have 4 or 5 different classes that use a BookstoreBook. For one,
public class Bookstore : IBookstore
{
    ...
    public bool NeedToIncreaseInventory(BookstoreBook book) { ...}
    ...
}
How does any method know it is getting a valid book? My solutions below seem to violate the "Tell Don't Ask" principle.
a) Should each method that uses a BookstoreBook test for validity? (i.e. should NeedToIncreaseInventory test for a book's validity? I'm not sure it should have to know what makes a BookstoreBook valid.)
b) Should I have a "CreateBook" on the IBookstoreBook object and just "assume" that clients know they have to call this anytime they want to initialize a BookstoreBook? That way, NeedToIncreaseInventory would just trust that "CreateBook" was already called on BookstoreBook.
I'm interested in what the recommended approach is here.

First off, I think your BookstoreBook doesn't have any really relevant methods, which means it doesn't have any relevant behavior, no business rules at all. And since it doesn't contain any business rules it actually is anemic. It just has a bunch of Getters and Setters. I would argue that having a method like AddToInventory that ends up just adding +1 to a property is no meaningful behavior.
Also, why would your BookstoreBook know how many of its type are in your Bookstore? I feel like this is probably something the Bookstore itself should keep track of.
As for point a): no, if you're creating books from user input you should check the provided data before you even create a new book. That prevents you from ever having invalid books in your system.
As for the creation of the object, the question is will you ever have more than one book type? If the answer is no you can drop the interface and just instantiate a book in a class that is responsible for creating new books from user input for example. If you need more book types an abstract factory may be useful.

First of all, a great way to make sure that entity state can only be set by behavior (methods) is to make all property setters private. It also allows you to make sure that all related properties are set when the state changes.
But, if a book doesn't have an ISBN and Inventory it is invalid.
There you have two business rules. Let's start with the ISBN. If a book is not valid without it, it HAS to be specified in the constructor. Otherwise it's fully possible to create a book which is invalid. An ISBN also has a specified format (at least the length), so that format has to be validated too.
Regarding the inventory I believe that it's not true. You might have books that are sold out or books that can be booked before their release. Right? So a book CAN exist without an inventory, it's just not likely.
If you look at the relation between inventory and books from the domain perspective they are two separate entities with different responsibilities.
A book is representing something that the user can read about and use that information to decide whether it should be rented or purchased.
An inventory is used to make sure that your application can fulfill your customers' requests. Typically that can be done by a direct delivery (decrease the inventory) or by a backorder (order more copies from your supplier and then deliver the book).
Thus the inventory part of the application does not really need to know everything there is to know about the book. I would therefore recommend that the inventory only knows about the book's identity (that's typically how root aggregates can reference each other according to Martin Fowler's book).
An inversion of control container is typically used to manage services (in DDD, the application services and the domain services). Its job is not to act as a factory for domain entities. That would only complicate things without any benefit.
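To make the points above concrete, here is a small Java sketch (the question uses C#, but the idea carries over). All class names are hypothetical, and the length check is a deliberately simplified stand-in for real ISBN validation. It shows an ISBN that must be supplied and checked at construction, and an inventory entry that knows the book only by its identity.

```java
// Hypothetical value object; only the length is checked, as a simplified stand-in.
final class Isbn {
    private final String value;

    Isbn(String raw) {
        if (raw == null) {
            throw new IllegalArgumentException("ISBN is required");
        }
        String compact = raw.replace("-", "");
        if (compact.length() != 10 && compact.length() != 13) {
            throw new IllegalArgumentException("ISBN must have 10 or 13 characters");
        }
        this.value = compact;
    }

    String value() {
        return value;
    }
}

final class Book {
    private final Isbn isbn;
    private final String title;

    // No parameterless constructor: a Book without an ISBN cannot be built.
    Book(Isbn isbn, String title) {
        if (isbn == null) {
            throw new IllegalArgumentException("ISBN is required");
        }
        this.isbn = isbn;
        this.title = title;
    }

    Isbn isbn() {
        return isbn;
    }

    String title() {
        return title;
    }
}

// The inventory side references the book by identity only; zero copies (sold out) is valid.
final class InventoryItem {
    private final String bookIsbn;
    private int copies;

    InventoryItem(Isbn isbn, int copies) {
        if (copies < 0) {
            throw new IllegalArgumentException("copies cannot be negative");
        }
        this.bookIsbn = isbn.value();
        this.copies = copies;
    }

    // Returns true if delivered from stock, false if a backorder would be needed.
    boolean fulfil(int quantity) {
        if (quantity > copies) {
            return false;
        }
        copies -= quantity;
        return true;
    }

    int copies() {
        return copies;
    }

    String bookIsbn() {
        return bookIsbn;
    }
}
```

Nothing here needs an IoC container: the objects are created with plain constructors, and invalid input is rejected before an instance exists.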

To help with testing and be more loosely coupled, I also use an IoC container to create this book.
Why is your IoC container creating books? That's a bit strange. Your domain model should be container-agnostic (wiring together the interfaces and the implementation is the concern of your composition root).
How does any method know it is getting a valid book?
The domain model knows it is getting a valid book, because it says so right there in the interface.
The data model knows it is producing a valid book, because the constructor/factory method accepted its arguments without throwing an exception.
Should each method that uses a Bookstore book test for validity?
No, once you have a Book, it is going to stay valid (there shouldn't be any verbs defined in your domain model that would create an invalid data model).
Should I have a "CreateBook" on the IBookstoreBook object and just "assume" that clients know they have to call this anytime they want to initialize a BookstoreBook? That way, NeedToIncreaseInventory would just trust that "CreateBook" was already called on BookstoreBook.
It's normal to have a factory for creating objects. See Evans, chapter 6.
Books can be created from a database and many other places. I'm assuming others have had to solve this issue if they are using DDD and I am wondering about their approach. Should we all be using factories - as you suggest that take the needed data as input?
There are really only two sources for data -- your own book of record (in which case, you load the data via a repository), and everywhere else (where you need to make sure that the data conforms to the assumptions of your model).

Based on SOLID, I should not depend on concretions
If you're referring to the Dependency Inversion principle, it does not exactly say that.
- High-level modules should not depend on low-level modules. Both should depend on abstractions.
- Abstractions should not depend on details. Details should depend on abstractions.
No domain entity is of a higher level than another and normally no object in the Domain layer is a "detail", so DIP usually doesn't apply to domain entities.
I also use an IoC container to create this book
Considering that BookstoreBook has no dependency, I'm not sure why you would do that.
How does any method know it is getting a valid book?
By assuming that the book is Always Valid, always consistent. This usually requires having a single Book constructor that checks all relevant rules at creation time, and state-changing methods that enforce invariants about the Book.
a) ...
b) ...
You're mixing up two concerns here - making sure that Book is in a consistent state wherever it is used, and initializing a Book. I'm not sure what your question is really about in the end, but if you apply the "always valid" approach and forget about Book being an interface/higher level abstraction, you should be good to go.

Creating child instances on an aggregate root in DDD

I have been reading Eric Evans's book on DDD and on page 139 he states:
"if you needed to add elements inside a preexisting AGGREGATE, you might create a FACTORY METHOD on the root of the AGGREGATE"
I would assume that could be implemented something like this where the method NewLineItem is used to create and add a new line item to the order.
class Order
{
    public IEnumerable<LineItem> LineItems { get; }
    public void NewLineItem(Product product, int quantity);
}
Another way I could think of doing this is to move the factory method into the collection itself. Something like this below. I could then add a new item by calling LineItems.New(...).
class Order
{
    public LineItems LineItems { get; }
    public class LineItems : IEnumerable<LineItem>
    {
        public void New(Product product, int quantity);
    }
}
What are the pros/cons to each approach? Are there any gotchas with moving the factory method into a collection? We are currently trying to figure out the best way to implement a large domain model. We are concerned that some of these root aggregate models will get bloated with numerous factory methods and deletion methods such as RemoveLineItem(LineItem). Our thinking is that moving these factory methods to their collections helps organize the design and keeps the root aggregate less cluttered with methods. Any advice?
Thanks

One advantage of having the factory method on the AR directly is that it makes the AR aware of the changes and allows it to enforce its invariants. Not only that, but because the method is aware of the internal state of the AR you may be able to reduce the number of arguments passed to the factory method (most useful when creating other related ARs).
E.g. registration = course.register(registrant) vs registration = new Registration(registrant, courseId)
Also, LineItem becomes an implementation detail so the client doesn't need to be aware of that class.
The fact that you are asking this question and are actually worried about having too many methods on your ARs is perhaps an indicator that you may be clustering together objects that do not belong together.
Do not lose sight of the AR's main purpose: it is a transactional boundary that protects invariants. If there's no invariant to protect, then clustering may be unnecessary or even undesirable.
I would strongly advise you to read Effective Aggregate Design by Vaughn Vernon.

There is always that "law" of Demeter business :)
The aggregate root (AR) is going to be responsible for the integrity and invariants. It may be possible that you will have an invariant along the lines of "maximum order total of $50 and no more than 6 line items at any time". The collection will not have access to any of this information (well, perhaps the count). So the idea is that the AR handles these interactions.
If you are concerned with bloat or find yourself with ARs that are unwieldy it may indicate a problem with your design. Vaughn Vernon covers these scenarios quite nicely in his book. You really do want highly cohesive ARs and it can be tricky to identify them correctly. A couple of iterations may be required to get the most comfortable design.
So I would try and stick with Eric's advice and handle the interactions on the AR itself as far as is practically possible.
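A minimal Java sketch of that idea follows (the question's code is C#, but the shape is the same). It assumes the example invariant quoted above, a maximum order total of $50 and no more than 6 line items; the parameter list, the use of cents, and the method names are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; amounts are in cents to avoid floating-point money.
final class Order {
    private static final int MAX_LINE_ITEMS = 6;
    private static final long MAX_TOTAL_CENTS = 5_000; // $50

    // LineItem stays an implementation detail of the aggregate.
    private static final class LineItem {
        final String productId;
        final int quantity;
        final long unitPriceCents;

        LineItem(String productId, int quantity, long unitPriceCents) {
            this.productId = productId;
            this.quantity = quantity;
            this.unitPriceCents = unitPriceCents;
        }

        long totalCents() {
            return quantity * unitPriceCents;
        }
    }

    private final List<LineItem> lineItems = new ArrayList<>();

    // Factory method on the root: checks both invariants before adding the child.
    void newLineItem(String productId, int quantity, long unitPriceCents) {
        if (quantity <= 0) {
            throw new IllegalArgumentException("quantity must be positive");
        }
        LineItem candidate = new LineItem(productId, quantity, unitPriceCents);
        if (lineItems.size() + 1 > MAX_LINE_ITEMS) {
            throw new IllegalStateException("An order may hold at most 6 line items");
        }
        if (totalCents() + candidate.totalCents() > MAX_TOTAL_CENTS) {
            throw new IllegalStateException("Order total may not exceed $50");
        }
        lineItems.add(candidate);
    }

    long totalCents() {
        long total = 0;
        for (LineItem item : lineItems) {
            total += item.totalCents();
        }
        return total;
    }

    int lineItemCount() {
        return lineItems.size();
    }
}
```

A collection with its own New(...) method would see the item count but not the order total, so the second check would have nowhere natural to live.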

Domain driven design: How to deal with complex models with a lot of data fields?

Well, I am trying to apply domain driven design principles to my application, with a rich domain model that contains both data fields and business logic. I've read many DDD books, but it seems that their domain models (called entities) are very simple. It becomes a problem when I have a domain model with 10-15 data fields, such as the one below:
class Job extends DomainModel {
    protected int id;
    protected User employer;
    protected string position;
    protected string industry;
    protected string requirements;
    protected string responsibilities;
    protected string benefits;
    protected int vacancy;
    protected Money salary;
    protected DateTime datePosted;
    protected DateTime dateStarting;
    protected Interval duration;
    protected String status;
    protected float rating;
    //business logic below
}
As you see, this domain model contains a lot of data fields, and all of them are important and cannot be stripped away. I know that a good rich domain model should not contain setter methods, but rather pass its data to the constructor, and mutate state using business logic. However, for the above domain model, I cannot pass everything to the constructor, as it will lead to 15+ parameters in the constructor method. A method should not contain more than 6-7 parameters, don't you think?
So what can I do to deal with a domain model with a lot of data fields? Should I try to decompose it? If so, how? Or maybe I should just use a Builder class or reflection to initialize its properties upon instantiation, so I won't pollute the constructor with so many arguments? Can anyone give some advice? Thanks.

What you've missed is the concept of a Value Object. Value objects are small, immutable objects with meaning in the respective domain.
I don't know the specifics of your domain, but looking at your Job entity, there could be a value object JobDescription that looks like this:
class JobDescription {
    public JobDescription(string position, string requirements, string responsibilities) {
        Position = position;
        Requirements = requirements;
        Responsibilities = responsibilities;
    }
    public string Position {get;}
    public string Requirements {get;}
    public string Responsibilities {get;}
}
This is C# code, but I think the idea should be clear regardless of the language you are using.
The basic idea is to group values in a way that makes sense in the respective domain. This means of course that value objects can also contain other value objects.
You should also ensure that value objects are compared by value instead of by reference, e.g. by implementing IEquatable<T> in C#.
If you refactor your code with this approach, you will get fewer fields on your entity, so using constructor injection (which is highly recommended) becomes feasible again.
Further notes regarding your example code that are not directly connected to the question:
- The domain model is the whole thing, an entity is part of it. So your base class should be called Entity and not DomainModel.
- You should make the fields of your class private and provide protected accessors where required to maintain encapsulation.
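For readers working in Java rather than C#, a rough counterpart of the JobDescription value object might look like the sketch below. The equals/hashCode pair plays the role that IEquatable<T> plays in C#; rejecting null values in the constructor is an extra assumption of this sketch, not something stated above.

```java
import java.util.Objects;

// Hypothetical Java counterpart of the JobDescription value object: immutable, compared by value.
final class JobDescription {
    private final String position;
    private final String requirements;
    private final String responsibilities;

    JobDescription(String position, String requirements, String responsibilities) {
        // Assumption of this sketch: none of the parts may be null.
        this.position = Objects.requireNonNull(position);
        this.requirements = Objects.requireNonNull(requirements);
        this.responsibilities = Objects.requireNonNull(responsibilities);
    }

    String position() {
        return position;
    }

    String requirements() {
        return requirements;
    }

    String responsibilities() {
        return responsibilities;
    }

    // Value equality: two descriptions with the same content are interchangeable.
    @Override
    public boolean equals(Object other) {
        if (this == other) {
            return true;
        }
        if (!(other instanceof JobDescription)) {
            return false;
        }
        JobDescription that = (JobDescription) other;
        return position.equals(that.position)
                && requirements.equals(that.requirements)
                && responsibilities.equals(that.responsibilities);
    }

    @Override
    public int hashCode() {
        return Objects.hash(position, requirements, responsibilities);
    }
}
```

With a few such value objects, the Job constructor would take a handful of meaningful arguments instead of 15 primitives.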

There's an awful lot going on in your Job domain model object - it seems to mix a huge number of concerns, and (to me at least) suggests a number of bounded contexts, some of which are easy to discern for the sake of making an example.
- Remuneration (pay, benefits)
- Organisational position (reporting line)
- Person spec (skills)
- Job specification (responsibilities)
- etc.
When you consider the things that interact with your 'Job' model, are there any that need to inspect or mutate BOTH the Salary property and the Industry property, for example?
Without knowing the full nuances of the domain, the Salary you get for holding a position and the Industry you work in are not really connected, are they? Not a rhetorical point; these are the questions you NEED to ask the domain experts.
If they DON'T have any interaction then you have identified that these two things exist in two different BOUNDED CONTEXTS. The Salary side has no need of any interaction with the Industry side and vice versa, and even if they did, do they need to be held as state in the same process at the same time?
Think about the lifecycle of how a person becomes an employee; a person applies for a job. The job has a specification, salary range. The person attends an interview. The hirers offer the person the position. The person accepts. The person is now an employee, not a candidate any longer. The new employee is now accruing holiday and benefits and has a start date etc.
DDD teaches us that a single, unified view of the world rarely serves ANY of the concerns correctly. Please explore BOUNDED CONTEXTS - your software will be much more pliable and flexible as a result.

DDD and getting additional information in a domain class

I think I've read 16,154 questions, blog posts, tweets, etc about DDD and best practices. Apologies for yet another question of that type. Let's say I have three tables in my database, User, Department, and UserDepartment. All very simple. I need to build a hierarchy showing what departments a user has access to. The issue is that I also need to show the parent departments of those that they have access to.
Is it best to have a GetDepartments() method on my user class? Right now I have a user service with GetDepartments(string userName), but I don't feel like that is the optimal solution. If user.GetDepartments() is preferred, then how do I get access to the repository to get the parent departments for those that the user has access to?
Don't think it matters, but I'm using the Entity Framework.
public class User
{
    [Key]
    public int UserId { get; private set; }
    [Display(Name = "User Name")]
    public string UserName { get; private set; }
    [Display(Name = "Email")]
    public string Email { get; private set; }
    [Display(Name = "UserDepartments")]
    public virtual ICollection<UserDepartment> UserDepartments { get; private set; }
    public List<Department> GetDepartments()
    {
        // Should this be here? and if so, what's the preferred method for accessing the repository?
    }
}

DDD is more about behavior, which also means it is TDA (tell, don't ask) oriented.
Normally you structure your aggregates in a way that you tell them what to do, not ask for information.
Even more, if some extra information is required by the aggregate in order to perform its behavior, it is typically not their job to figure out where to get this information from.
Now, when you say that your User aggregate has a GetDepartments method, it raises a red flag. Does the aggregate need this information in order to perform any kind of behavior? I don't think so, it is just you wanting some data to display.
So what I see here is that you are trying to structure your aggregates against your data tables, not against the behavior.
This is actually #2 error when applying DDD (#1 is not thinking about bounded contexts).
Again, aggregates represent business logic and behavior of your system. Which means that you don't have to read from aggregates. Your read side can be done much easier - just make a damn query to the DB.
But once you need to ask your system to do something - now you do it through aggregates: AppService would load one from the repository and call its behavior method.
That's why normally you don't have properties in your aggregates, just methods that represent behavior.
Also, you don't want your aggregates to be mapped to the data tables in any way; that is not their job, but the job of repositories. Actually, you don't want your domain to have dependencies on anything, especially infrastructure.
So if you want to go for DDD direction then consider the following:
1. Structure your aggregates to encapsulate behaviors, not represent data tables.
2. Don't make your domain dependent on infrastructure, etc.
3. Make repositories responsible for loading/saving aggregates. Aggregates themselves should know nothing about persistence, data structure, etc.
4. You don't have to read data through aggregates.
Think of #4 as your system having two sides: the "read" side, when you just read the data and show it in the UI, and the "command" side, when you perform actions.
The first one (read) is very simple: stupid queries to read the data in a way you want it. It doesn't affect anything because it is just reading, no side effects here.
The second one is when you make changes and that is going through your domain.
Again, remember the first rule of DDD: if you don't have business logic and behavior to model then don't do DDD.
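The command side described above could be sketched roughly as follows, in Java rather than the question's C#. Everything here is hypothetical: the "grant access" behavior, the rule that access cannot be granted twice, and all type names are assumptions chosen only to show the load, act, save shape.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical aggregate: it is told what to do and guards its own rule.
final class User {
    private final int id;
    private final Set<Integer> departmentIds = new HashSet<>();

    User(int id) {
        this.id = id;
    }

    int id() {
        return id;
    }

    // Behavior, not a getter; the "no duplicate access" rule is an assumed example.
    void grantAccessTo(int departmentId) {
        if (!departmentIds.add(departmentId)) {
            throw new IllegalStateException("User already has access to department " + departmentId);
        }
    }

    // Kept only so the outcome can be observed in a test or by a repository.
    boolean hasAccessTo(int departmentId) {
        return departmentIds.contains(departmentId);
    }
}

// The repository hides persistence; the domain never sees tables or an ORM.
interface UserRepository {
    User load(int userId);

    void save(User user);
}

// Application service on the command side: load the aggregate, call one behavior, save.
final class GrantDepartmentAccess {
    private final UserRepository users;

    GrantDepartmentAccess(UserRepository users) {
        this.users = users;
    }

    void handle(int userId, int departmentId) {
        User user = users.load(userId);
        user.grantAccessTo(departmentId);
        users.save(user);
    }
}
```

The department hierarchy for display would not go through any of this; it would be a plain query on the read side.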

DDD: Large Aggregate Root - Person

I am building a system to manage person information. I have an ever growing aggregate root called Person. It now has hundreds of related objects, name, addresses, skills, absences, etc. My concern is that the Person AR is both breaking SRP and will create performance problems as more and more things (esp collections) get added to it.
I cannot see how with DDD to break this down into smaller objects. Taking the example of Absences. The Person has a collection of absence records (startdate, enddate, reason). These are currently managed through the Person (BookAbsence, ChangeAbsence, CancelAbsence). When adding absences I need to validate against all other absences, so I need an object which has access to the other absences in order to do this validation.
Am I missing something here? Is there another AR I have not identified? In the past I would have done this via an "AbsenceManager" service, but would like to do it using DDD.
I am fairly new to DDD, so maybe I am missing something.
Many Thanks....

The Absence could be modeled as an aggregate. An AbsenceFactory is responsible for validating against other Absences when you want to add a new Absence.
Code example:
public class AbsenceFactory {
    private AbsenceRepository absenceRepository;
    public Absence newAbsenceOf(Person person) {
        List<Absence> current =
            absenceRepository.findAll(person.getIdentifier());
        //validate and return
    }
}
You can find this pattern in the blue book (section 6.2 Factory if I'm not mistaken)
In other "modify" cases, you could introduce a Specification:
public class SomeAbsenceSpecification {
    private AbsenceRepository absenceRepository;
    public SomeAbsenceSpecification(AbsenceRepository absenceRepository) {
        this.absenceRepository = absenceRepository;
    }
    public boolean isSatisfiedBy(Absence absence) {
        List<Absence> current =
            absenceRepository.findAll(absence.getPersonIdentifier());
        //validate and return
    }
}
You can find this pattern in the blue book (section 9.2.3 Specification)
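To give the "validate and return" step some shape, here is a self-contained Java sketch of the factory variant. It assumes the rule is "a new absence must not overlap an existing one", which the question does not spell out, and it simplifies the signatures (a plain int person identifier, explicit dates), so the names and parameters are illustrative only.

```java
import java.time.LocalDate;
import java.util.List;

// Hypothetical Absence aggregate with inclusive start and end dates.
final class Absence {
    private final int personId;
    private final LocalDate start;
    private final LocalDate end;

    Absence(int personId, LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end date is before start date");
        }
        this.personId = personId;
        this.start = start;
        this.end = end;
    }

    int personId() {
        return personId;
    }

    // Two inclusive ranges overlap unless one ends before the other starts.
    boolean overlaps(Absence other) {
        return !end.isBefore(other.start) && !other.end.isBefore(start);
    }
}

interface AbsenceRepository {
    List<Absence> findAll(int personId);
}

final class AbsenceFactory {
    private final AbsenceRepository absenceRepository;

    AbsenceFactory(AbsenceRepository absenceRepository) {
        this.absenceRepository = absenceRepository;
    }

    // Assumed rule: reject the new absence if it overlaps any existing one for that person.
    Absence newAbsenceOf(int personId, LocalDate start, LocalDate end) {
        Absence candidate = new Absence(personId, start, end);
        for (Absence existing : absenceRepository.findAll(personId)) {
            if (candidate.overlaps(existing)) {
                throw new IllegalStateException("Absence overlaps an existing absence");
            }
        }
        return candidate;
    }
}
```

The same overlaps check could sit behind isSatisfiedBy in a Specification when an existing absence is changed; in that case the absence being changed would have to be excluded from the comparison.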

This is indeed what makes aggregate design so tricky. Ownership does not necessarily mean aggregation. One needs to understand the domain to be able to give a proper answer so we'll go with the good ol' Order example. A Customer would not have a collection of Order objects. The simplest rule is to think about deleting an AR. Those objects that could make sense in the absence of the AR probably do not belong on the AR. A Customer may very well have a collection of ActiveOrder objects, though. Of course there would be an invariant stating that a customer cannot be deleted if it has active orders.
Another thing to look out for is a bloated bounded context. It is conceivable that you could have one or more bounded contexts that have not been identified leading to a situation where you have an AR doing too much.
So in your case you may very well still be interested in the Absence should the Person be deleted. An OrderLine, by contrast, has no meaning without its Order, so it has no lifecycle of its own.
Hope that helps ever so slightly.

I am building a system to manage person information.
Are you sure that a simple CRUD application that edits and queries RDBMS tables via SQL wouldn't be a cheaper approach?
If you can express most of the business rules in terms of data relations and table operations, you shouldn't use DDD at all.
I have an ever growing aggregate root called Person.
If you actually have complex business rules, an ever growing aggregate is often a symptom of undefined (or wrongly defined) context boundaries.
