aggregate root design and size - domain-driven-design

I know there are a million questions like this. I'm sorry. I think mine is different, but it may not seem so. I am new to DDD and trying to get a grip.
Part of my domain is like this.
Location 1-* Field
Field 1-* Event
Field 1-* Task
Task - Employee
Now it would seem that the AR is the Location, and if I wanted to get a particular task I would have to traverse down to it through the collection of fields and into the collection of tasks.
This sounds pretty laborious since I am dealing with tasks and events a lot and almost never with a location per se. The location serves to segregate a group of fields and their corresponding entities. So in the UI, I may pick a location and get a list of fields. I then would pick a field. From there I might edit one of its tasks. So I have a collection of tasks and I pick one, so I have the Id of the task. I then need to traverse up to the location and get its Id so I can get the AR and traverse back down to the task. Or rather, I would be keeping the Id of the AR around so that I could get it. So should I be keeping the Id of the Field around too, so that what I return to the server would be the AR.Id, the Field.Id and the Task.Id that I want to look at?
Secondly, an employee of course could not be an Entity; it would most likely be an AR. Is it OK for an Entity on an AR to have a collection of ARs?
So perhaps the way it should be structured is like this?
public class Location // is an Aggregate Root
{
    public IEnumerable<Field> Fields { get; set; } // in real code encapsulated; not here for brevity
}

public class Field // is an Aggregate Root
{
    public Location Location { get; set; } // reference to AR
    public IEnumerable<Task> Tasks { get; set; }
    public IEnumerable<Event> Events { get; set; }
}

public class Task // is an Aggregate Root
{
    public Field Field { get; set; } // reference to AR
    public IEnumerable<Employee> Employees { get; set; }
    public TaskType TaskType { get; set; } // probably Value Object
    public IEnumerable<Equipment> Equipment { get; set; } // maybe Entity or AR
}
This makes it much easier to deal with the objects that are modified the most and to traverse their relationships, but it also feels sort of like plain old OOP and that AR doesn't really mean anything.
Again, I'm new to DDD and don't have anyone to run this by for a sanity check. Please help me get a grip on how these boundaries are drawn. And if it is the first way, is there an easier way to deal with the Entities than carrying around the AR.Id, ParentParent.Id, Parent.Id and finally the Id of the entity of interest?
Thanks for any thoughts
R

Ok, upon some more googling I found this great series of articles.
https://dddcommunity.org/wp-content/uploads/files/pdf_articles/Vernon_2011_1.pdf
To get to part 2 and so on, just change the last digit in the URL.
Here I discovered that, much like Yves points out, I was misunderstanding the purpose of Aggregates and Aggregate Roots. It turns out they are about maintaining consistency between related entities rather than just bundling up a bunch of entities that have relations to each other.
So if a Field could only have 3 Tasks on any given day, then a Field would be a good candidate for an AR: if you were just adding Tasks willy-nilly you could easily create an invalid state in the system, whereas if you had to add a Task via a method on Field, it could easily check whether that is acceptable.
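The "3 Tasks per day" rule above can be sketched as follows. This is only an illustration in Java, not code from the articles: the names `FieldTask`, `addTask` and `MAX_TASKS_PER_DAY` are made up for the example, and a task is reduced to a description and a day.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical entity that lives inside the Field aggregate.
final class FieldTask {
    private final String description;
    private final LocalDate day;

    FieldTask(String description, LocalDate day) {
        this.description = description;
        this.day = day;
    }

    String description() { return description; }
    LocalDate day() { return day; }
}

// Field is the aggregate root: tasks are only added through addTask,
// so the per-day limit is checked in exactly one place.
final class Field {
    static final int MAX_TASKS_PER_DAY = 3; // limit taken from the example above

    private final List<FieldTask> tasks = new ArrayList<>();

    FieldTask addTask(String description, LocalDate day) {
        long plannedThatDay = tasks.stream()
                .filter(task -> task.day().equals(day))
                .count();
        if (plannedThatDay >= MAX_TASKS_PER_DAY) {
            throw new IllegalStateException(
                    "Field already has " + MAX_TASKS_PER_DAY + " tasks on " + day);
        }
        FieldTask task = new FieldTask(description, day);
        tasks.add(task);
        return task;
    }

    // Read-only view, so callers cannot bypass addTask.
    List<FieldTask> tasks() {
        return Collections.unmodifiableList(tasks);
    }
}
```

Because the list is only exposed as a read-only view, the only way to add a task is through the root, which is what lets it guarantee the rule.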
Further, one wants to avoid giant aggregate roots because they take a lot of resources to load and can cause concurrency problems. Read the articles; they address my question above beautifully.

Related

How to properly define an aggregate in DDD?

What would be a rule of thumb when designing an aggregate in DDD?
According to Martin Fowler, aggregate is a cluster of domain objects that can be treated as a single unit. An aggregate will have one of its component objects be the aggregate root.
https://martinfowler.com/bliki/DDD_Aggregate.html
After designing approximately 20 DDD projects I am still confused about the rule of thumb for choosing the domain objects that make up an aggregate.
Martin Fowler uses order and line-items analogy and I don't think it is a good example, because order+line-items are really tightly bounded objects. Not much to think about in that example.
Let's try a car analogy, where CarContent is a subdomain of a car dealer domain.
CarContent would consist of one or more aggregates.
For example, we have this aggregate root, example A (I am keeping it as simple as possible)
class CarStructureAggregate
{
    public int Id { get; private set; }
    public ModelType ModelType { get; private set; }
    public int Year { get; private set; }
    public List<EquipmentType> EquipmentTypes { get; private set; } // property name was missing in the original
}
Alternative could be this (example B)
class CarStructureAggregate
{
    public int Id { get; private set; }
    public ModelType ModelType { get; private set; }
    public int Year { get; private set; }
}

class CarEquipmentAggregate
{
    public int Id { get; private set; }
    public List<EquipmentType> EquipmentTypes { get; private set; } // property name was missing in the original
}
A car can be created without equipment, but it cannot be activated/published without equipment (i.e. this can be populated over two different transactions).
Equipment can be referenced through CarStructureAggregate in example A or through CarEquipmentAggregate in example B.
EquipmentType could be an enum, or could be a complex class with many more classes, properties.
What is a rule of thumb when choosing between examples A and B?
Now imagine that car could have more information such as
photos
description
maybe more data about the engine
and CarStructureAggregate could be an extremely large class
So what is it that makes us split an aggregate into new aggregates? Size? Atomicity of a transaction? (Although that would not be an issue, since aggregates of the same subdomain are usually located on the same server.)
Be careful about having too strong an OO mindset. The blue book and the Martin Fowler post are a little old, and the vision they provide is too narrow.
An aggregate does not need to be a class. It does not need to be persisted. These are implementation details. Sometimes an aggregate even does things that do not imply a change, only an "OK, this action may be done".
iTollu's post gives you a good start: what matters is the transactional boundary. An aggregate has just one job: to ensure invariants and domain rules in an action that, in most cases (but not always), changes data that must be persisted. The transactional boundary means that once the aggregate says that something may be done, and has been done, nothing in the world should contradict it. If a contradiction occurs, your aggregate is badly designed, and the rule that contradicts the aggregate should be part of the aggregate.
So, to design aggregates, I usually start very simple and keep evolving. Think of a static function that receives all the VOs, entities and command data (almost all of them DTOs, except the unique IDs of the entities) needed to check the domain rules for the action, and returns a domain event saying that something has been done. The event must contain all the data your system needs to persist the changes, if needed, and to act accordingly when the event reaches other aggregates (in the same or a different bounded context).
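As a rough illustration of that "static function" starting point, here is a minimal Java sketch built on the question's rule that a car cannot be published without equipment. Everything in it is hypothetical: the names `CarRules`, `publish` and `CarPublished` are invented, and equipment is deliberately left as plain strings, which is exactly the kind of primitive obsession a later refactoring would remove.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical domain event: carries the data needed to persist the change
// and to inform other aggregates.
final class CarPublished {
    final int carId;
    final List<String> equipmentTypes;

    CarPublished(int carId, List<String> equipmentTypes) {
        this.carId = carId;
        this.equipmentTypes = new ArrayList<>(equipmentTypes); // defensive copy
    }
}

final class CarRules {
    private CarRules() {}

    // Pure function: receives the state it needs, checks the domain rules,
    // and returns an event describing what happened. No persistence here.
    static CarPublished publish(int carId, boolean alreadyPublished, List<String> equipmentTypes) {
        if (alreadyPublished) {
            throw new IllegalStateException("Car " + carId + " is already published");
        }
        if (equipmentTypes.isEmpty()) {
            throw new IllegalStateException("Car " + carId + " cannot be published without equipment");
        }
        return new CarPublished(carId, equipmentTypes);
    }
}
```

Functions like this that keep needing the same inputs are natural candidates to be grouped into one class later, which is how the aggregate root emerges in this approach.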
Now start refactoring and designing in an OO way. Eliminate the primitive obsession antipattern. Add constraints to avoid incorrect states of entities and VOs. A piece of code that checks or calculates something related to an entity is better placed in that entity. Put your events on a diet. Group static functions that need almost the same VOs and entities to check domain rules into a class that becomes the aggregate root. Use repositories to create the aggregates in an always-valid state. And a long etc. You know, just good OOP design: moving towards no DTOs, the "tell, don't ask" premise, responsibility segregation and so on.
When you finish all that work, you will find your aggregates, VOs and entities perfectly designed from both a domain (bounded context related) and a technical point of view.
Something to keep in mind when designing aggregates is that the same entity can be an aggregate in one use case and a normal entity in another. So you can have a CarStructureAggregate that owns a list of EquipmentTypes, but you can also have an EquipmentTypeAggregate that owns other things and has its own business rules.
Remember, though, that aggregates can update their own properties but not update the properties of owned objects. For example if your CarStructureAggregate owns the list of EquipmentType, you cannot change properties of one of the equipment types in the context of updating the CarStructureAggregate. You must query the EquipmentType in its aggregate role to make changes to it. CarStructureAggregate can only add EquipmentTypes to its internal list or remove them.
Another rule of thumb is only populate aggregates one level deep unless there is an overriding reason to go deeper. In your example you would instantiate the CarStructureAggregate and fill the list of EquipmentTypes, but you would not populate any lists that each EquipmentType might own.
I believe what matters here is the transactional boundary.
On one hand, you can't make it narrower than is needed to preserve the aggregate's consistency.
On the other hand, you don't want to make it so large that it blocks your users from making concurrent modifications.
In your example, if users should be able to modify CarStructure and CarEquipment concurrently, then I'd stick to implementation B. If not, it would be simpler to use A.
In a very simple sentence, I can say:
In domain-driven design, an aggregate is essentially a business use case that aims to change state and consists of one or more relevant entities, value objects, and business invariants. The command (write) aspect matters: if you only need to read, you don’t need an aggregate.

DDD Effective modelling of aggregates and root aggregation creation

We are starting a new project and we are keen to apply DDD principles. The project is using dotnet core, with EF core providing the persistence to SQL Server.
Initial view of the domain
I will use an example of a task tracker to illustrate our issues and challenges as this would follow a similar structure.
In the beginning we understand the following: -
We have a Project
Users can be associated to Projects
A Project has Workstreams
A Workstream has Tasks
Users can post Comments against a Task
A User is able to change the status of a Task (in progress, complete etc)
A Project, with associated Workstreams and Tasks, is initially created from a Template
The initial design was a large cluster aggregate with the Project being the root aggregate holding a collection of ProjectUsers and Workstreams, Workstreams has a collection of Tasks etc etc
This approach was obviously going to lead to a number of contention and performance issues due to having to load the whole Project aggregate for any changes within that aggregate.
Rightly or wrongly our next revision was to break the Comments out of the aggregate and to form a new aggregate using Comment as a root. The motivation for this was that the business envisaged there being a significant number of Comments raised against each Task.
As each Comment is related to a Task a Comment needs to hold a foreign key back to the Task. However this isn't possible following the principle that you can only reference another aggregate via its root. To overcome this we broke the Task out to another aggregate. This also seemed to satisfy the need that the Tasks could be Completed by different people and again would reduce contention.
We then faced the same problem with the reference from the Task to the Workstream the Task belongs to leading to us creating a new Workstream aggregate with the foreign key in the Task back to the Workstream.
The result is: -
A Project aggregate which only contains a list of Users assigned to the project
A Workstream aggregate which contains a foreign key to the Project
A Task aggregate which contains a foreign key to the Project
A Comments aggregate which contains a foreign key back to the Task
The Project has a method to create a new instance of a Workstream, allowing us to set the foreign key. Here is a slightly simplified version:
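public class Project
{
    private string _name;

    // Id is used below but was not declared in the original snippet;
    // it is assumed to be populated when the Project is persisted.
    public int Id { get; private set; }

    public Project(string name)
    {
        _name = name;
    }

    public Workstream CreateWorkstream(string name)
    {
        return new Workstream(name, Id);
    }

    // ... plus methods for managing user assignment to the project
}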
In a similar way Workstream has a method to create a Task
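public class Workstream
{
    private string _name;
    private readonly List<Task> _activities = new List<Task>();

    // Id is used by CreateTask but was not declared in the original snippet;
    // it is assumed to be populated when the Workstream is persisted.
    public int Id { get; private set; }
    public int ProjectId { get; private set; }
    public IEnumerable<Task> Activities => _activities.AsReadOnly();

    public Workstream(string name, int projectId)
    {
        _name = name;
        ProjectId = projectId;
    }

    public Task CreateTask(string name)
    {
        return new Task(name, Id);
    }
}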
The Activities property has been added purely to support navigation when using the entities to build the read models.
The team are not comfortable with this approach; something doesn't feel right. The main concerns are:
it is felt that creating a project logically should be create project, add one or more workstreams to the project, add task to the workstreams, then let EF deal with persisting that object structure.
there is discomfort that the Project has to be created first and that the developer needs to ensure it is persisted so it gets an Id, ready for when the method to Create the template is called which is dependent on that Id for the foreign key. Is it okay to push the responsibility for this to a method in a domain service CreateProjectFromTemplate() to orchestrate the creation and persistence of the separate objects to each repository?
is the method to create the new Workstream even in the correct place?
the entities are used to form the queries (supported by the navigation properties) which are used to create the read models. Maybe the concern is that the object structure is being influenced by how we need to present data in a read only
We are now at the point where we are just going around in circles and could really use some advice to give us some direction.
The team are not comfortable with this approach; something doesn't feel right.
That's a very good sign.
However this isn't possible following the principle that you can only reference another aggregate via its root.
You'll want to let go of this idea, it's getting in your way.
Short answer is that identifiers aren't references. Holding a copy of an identifier for another entity is fine.
Longer answer: DDD is based on the work of Eric Evans, who was describing a style that had worked for him on Java projects at the beginning of the millennium.
The pain that he was struggling with is this: if the application is allowed object references to arbitrary data entities, then the behaviors of the domain end up getting scattered all over the code base. This increases the amount of work that you need to do to understand the domain, and it increases the cost of making (and testing!) change.
The reaction was to introduce a discipline; isolate the data from the application, by restricting the application's access to a few carefully constrained gate keepers (the "aggregate root" objects). The application can hold object references to the root objects, and can send messages to those root objects, but the application cannot hold a reference to, or send a message directly to, the objects hidden behind the api of the aggregate.
Instead, the application sends a message to the root object, and the root object can then forward the message to other entities within its own aggregate.
Thus, if we want to send a message to a Task inside of some Project, we need some mechanism to know which project to load, so that we can send the message to the project to send a message to the Task.
Effectively, this means you need a function somewhere that can take a TaskId, and return the corresponding ProjectId.
The simplest way to do this is to simply store the two fields together
{
taskId: 67890,
projectId: 12345
}
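A minimal Java sketch of such a TaskId-to-ProjectId function, assuming an in-memory map stands in for wherever the pair is really stored (in practice this would more likely be a column or a small lookup table). The class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup that answers "which Project aggregate owns this Task?"
// so the application knows which root to load before sending it a message.
final class TaskProjectIndex {
    private final Map<Long, Long> projectIdByTaskId = new HashMap<>();

    // Store the two identifiers together, as in the record above.
    void record(long taskId, long projectId) {
        projectIdByTaskId.put(taskId, projectId);
    }

    // Empty when the task is unknown.
    Optional<Long> projectIdFor(long taskId) {
        return Optional.ofNullable(projectIdByTaskId.get(taskId));
    }
}
```

The index only holds identifiers, not object references, so it does not give the application a way around the aggregate root.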
it is felt that creating a project logically should be create project, add one or more workstreams to the project, add task to the workstreams, then let EF deal with persisting that object structure.
Maybe the concern is that the object structure is being influenced by how we need to present data in a read only
There's a sort of smell here, which is that you are describing the relations of a data structure. Aggregates aren't defined by relations as much as they are by changes.
Is it okay to push the responsibility for this to a method in a domain service CreateProjectFromTemplate
It's actually fairly normal to have a draft aggregate (which understands editing) that is separate from a Published aggregate (which understands use). Part of the point of domain driven design is to improve the business by noticing implicit boundaries between use cases and making them explicit.
You could use a domain service to create a project from a template, but in the common case, my guess is that you should do it "by hand" -- copy the state from the draft, and then use that state to create the project; it avoids confusion when a publish and an edit are happening concurrently.
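One way to picture the "copy the state from the draft" idea is the Java sketch below. It is an assumption-laden illustration, not the answerer's design: `ProjectDraft`, `ProjectSnapshot` and `PublishedProject` are invented names, and a project is reduced to a name plus a list of workstream names.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Immutable copy of the draft's state, taken at the moment of publishing.
final class ProjectSnapshot {
    final String name;
    final List<String> workstreamNames;

    ProjectSnapshot(String name, List<String> workstreamNames) {
        this.name = name;
        this.workstreamNames = Collections.unmodifiableList(new ArrayList<>(workstreamNames));
    }
}

// Draft aggregate: understands editing.
final class ProjectDraft {
    private String name;
    private final List<String> workstreamNames = new ArrayList<>();

    ProjectDraft(String name) { this.name = name; }

    void rename(String newName) { this.name = newName; }
    void addWorkstream(String workstreamName) { workstreamNames.add(workstreamName); }

    ProjectSnapshot snapshot() { return new ProjectSnapshot(name, workstreamNames); }
}

// Published aggregate: understands use. It is built from a snapshot,
// so later edits to the draft cannot leak into it.
final class PublishedProject {
    private final String name;
    private final List<String> workstreamNames;

    private PublishedProject(String name, List<String> workstreamNames) {
        this.name = name;
        this.workstreamNames = workstreamNames;
    }

    static PublishedProject from(ProjectSnapshot snapshot) {
        return new PublishedProject(snapshot.name, snapshot.workstreamNames);
    }

    String name() { return name; }
    List<String> workstreamNames() { return workstreamNames; }
}
```

Since the published project only ever sees the snapshot, an edit that arrives while publishing is in progress changes the draft and nothing else.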
Here is a different perspective that might nudge you out of your deadlock.
I feel you are doing data modeling instead of real domain modeling. You are concerned with a relational model that will be directly persisted using ORM (EF) and less concerned with the actual problem domain. That is why you are concerned that the project will load too many things, or which objects will hold foreign keys to what.
An alternative approach would be to forget persistence for a moment and concentrate on what things might need what responsibilities. With responsibilities I don't mean technical things like save/load/search, but things that the domain defines. Like creating a task, completing a task, adding a comment, etc. This should give you an outline of things, like:
interface Task {
...
void CompleteBy(User user);
...
}
interface Project {
...
Workstream CreateWorkstreamFrom(Template template);
...
}
Also, don't concentrate too much on what is an Entity, Value Object, Aggregate Root. First, represent your business correctly in a way you and your colleagues are happy with. That is the important part. Try to talk to non-technical people about your model, see if the language you are using fits, whether you can have a conversation with it. You can decide later what objects are Entities or Value Objects, that part is purely technical and less important.
One other point: don't bind your model directly to an ORM. ORMs are blunt instruments that will probably force you into bad decisions. You can use an ORM inside your domain objects, but don't make them be a part of the ORM. This way you can do your domain the right way, and don't have to be afraid to load too much for a specific function. You can do exactly the right things for all the business functions.

DDD: Large Aggregate Root - Person

I am building a system to manage person information. I have an ever growing aggregate root called Person. It now has hundreds of related objects, name, addresses, skills, absences, etc. My concern is that the Person AR is both breaking SRP and will create performance problems as more and more things (esp collections) get added to it.
I cannot see how with DDD to break this down into smaller objects. Taking the example of Absences. The Person has a collection of absence records (startdate, enddate, reason). These are currently managed through the Person (BookAbsence, ChangeAbsence, CancelAbsence). When adding absences I need to validate against all other absences, so I need an object which has access to the other absences in order to do this validation.
Am I missing something here? Is there another AR I have not identified? In the past I would have done this via an "AbsenceManager" service, but would like to do it using DDD.
I am fairly new to DDD, so maybe I am missing something.
Many Thanks....
The Absence could be modeled as an aggregate. An AbsenceFactory is responsible for validating against other Absences when you want to add a new Absence.
Code example:
public class AbsenceFactory {
    private AbsenceRepository absenceRepository;

    public Absence newAbsenceOf(Person person) {
        List<Absence> current =
            absenceRepository.findAll(person.getIdentifier());
        //validate and return
    }
}
You can find this pattern in the blue book (section 6.2 Factory if I'm not mistaken)
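The `//validate and return` step is left open in the factory above. A runnable sketch of one possible reading follows, assuming "validate against other absences" means rejecting overlapping date ranges; that rule, the in-memory repository and the simplified signatures (a person id string instead of a `Person`) are assumptions made for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class Absence {
    private final String personId;
    private final LocalDate start;
    private final LocalDate end;

    Absence(String personId, LocalDate start, LocalDate end) {
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("End date is before start date");
        }
        this.personId = personId;
        this.start = start;
        this.end = end;
    }

    String personId() { return personId; }

    // Inclusive ranges: sharing a single day counts as an overlap.
    boolean overlaps(LocalDate otherStart, LocalDate otherEnd) {
        return !start.isAfter(otherEnd) && !end.isBefore(otherStart);
    }
}

// Hypothetical in-memory stand-in for the AbsenceRepository.
final class InMemoryAbsenceRepository {
    private final List<Absence> absences = new ArrayList<>();

    void add(Absence absence) { absences.add(absence); }

    List<Absence> findAll(String personId) {
        List<Absence> result = new ArrayList<>();
        for (Absence absence : absences) {
            if (absence.personId().equals(personId)) {
                result.add(absence);
            }
        }
        return result;
    }
}

final class AbsenceFactory {
    private final InMemoryAbsenceRepository absenceRepository;

    AbsenceFactory(InMemoryAbsenceRepository absenceRepository) {
        this.absenceRepository = absenceRepository;
    }

    // Validates the new absence against the person's existing ones;
    // saving the result is left to the caller.
    Absence newAbsenceOf(String personId, LocalDate start, LocalDate end) {
        for (Absence existing : absenceRepository.findAll(personId)) {
            if (existing.overlaps(start, end)) {
                throw new IllegalStateException("Overlaps an existing absence");
            }
        }
        return new Absence(personId, start, end);
    }
}
```

One thing to keep in mind with this design: the check and the later save are separate steps, so two concurrent bookings for the same person could both pass validation unless something else (for example a database constraint) guards against it.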
In other "modify" cases, you could introduce a Specification
public class SomeAbsenceSpecification {
    private AbsenceRepository absenceRepository;

    public SomeAbsenceSpecification(AbsenceRepository absenceRepository) {
        this.absenceRepository = absenceRepository;
    }

    public boolean isSatisfiedBy(Absence absence) {
        List<Absence> current =
            absenceRepository.findAll(absence.getPersonIdentifier());
        //validate and return
    }
}
You can find this pattern in the blue book(section 9.2.3 Specification)
This is indeed what makes aggregate design so tricky. Ownership does not necessarily mean aggregation. One needs to understand the domain to be able to give a proper answer so we'll go with the good ol' Order example. A Customer would not have a collection of Order objects. The simplest rule is to think about deleting an AR. Those objects that could make sense in the absence of the AR probably do not belong on the AR. A Customer may very well have a collection of ActiveOrder objects, though. Of course there would be an invariant stating that a customer cannot be deleted if it has active orders.
Another thing to look out for is a bloated bounded context. It is conceivable that you could have one or more bounded contexts that have not been identified leading to a situation where you have an AR doing too much.
So in your case you may very well still be interested in the Absence should the Customer be deleted. In the case of an OrderLine it has no meaning without its Order. So no lifecycle of its own.
Hope that helps ever so slightly.
I am building a system to manage person information.
Are you sure that a simple CRUD application that edits and queries RDBMS tables via SQL wouldn't be a cheaper approach?
If you can express most of the business rules in terms of data relations and table operations, you shouldn't use DDD at all.
I have an ever growing aggregate root called Person.
If you actually have complex business rules, an ever growing aggregate is often a symptom of undefined (or wrongly defined) context boundaries.

Can I call operations on Entities within an Aggregate Root?

As per the title, I have the following classes:
public class Company : AggregateRoot {
public AddressBook AddressBook { get; set; }
}
public class AddressBook {
public List<Address> Addresses { get; set; }
public Address GetPrimaryAddress() {
return Addresses.FirstOrDefault();
}
}
Is it acceptable for me to call:
company.AddressBook.GetPrimaryAddress();
Or should I expose a GetPrimaryAddress() method on Company that in turn calls the AddressBook method?
I know I shouldn't have references to entities within an AggregateRoot but I wasn't sure what the rulings are on calling operations.
Update
For what it's worth, below is a diagram of my actual model. ContactList contains rules for how all types of contact (Person/Business Location) should be managed, such as what happens when a primary contact is removed. It also works around some caveats of how RavenDB stores nested entities (essentially we need to provide our own Id strategy - hence the LastContactId property).
First of all, it all depends on the context, and I assume that Company really is the AR for that specific context. The same Company can be a simple object in other contexts. Now, I'm not a fan of dogmatic use of rules and patterns, so it is not important what the 'rule' says.
In this case I wouldn't expose the AddressBook, as it seems to be an internal of the Company. As a consumer of the Company, I want its primary address; I don't care that you're using the AddressBook to organize them.
To give a not so common example: the AR Person has two Eye objects. Would you ask the Person to give you one of his eyes so you can check its color, or would you ask the Person directly what color his eyes are?
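In code, the "ask the Company directly" option could look like the Java sketch below. It mirrors the question's classes but is only an illustration: `Address` is reduced to a single hypothetical `line` field, and "primary" is taken to mean the first address, as in the question's `FirstOrDefault()`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

final class Address {
    private final String line;

    Address(String line) { this.line = line; }

    String line() { return line; }
}

// Internal to the Company aggregate; consumers never see it.
final class AddressBook {
    private final List<Address> addresses = new ArrayList<>();

    void add(Address address) { addresses.add(address); }

    // As in the question, the first address is treated as the primary one.
    Optional<Address> primaryAddress() {
        return addresses.isEmpty() ? Optional.empty() : Optional.of(addresses.get(0));
    }
}

// Aggregate root: consumers ask the Company, which delegates internally.
final class Company {
    private final AddressBook addressBook = new AddressBook();

    void addAddress(String line) { addressBook.add(new Address(line)); }

    Optional<Address> primaryAddress() { return addressBook.primaryAddress(); }
}
```

A caller writes `company.primaryAddress()` and never learns that an `AddressBook` exists, so the Company stays free to reorganize how it stores addresses.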
According to the Aggregate pattern:
Transient references to the internal members can be passed out for use within a single operation only.
Meaning - a Company can pass a reference to its Address object to other objects outside the aggregate, but Address cannot be a member of any other object outside the aggregate.
For example, an object User can ask a reference to an Address from a Company, but User cannot have Address as one of its members.
And why is that so important?
Because the root controls access, it cannot be blindsided by changes to the internals.
If an object User would have Address as one of its members, it might pull it out of the database without its Company and thus, Company would be blindsided by changes to its internals.
Please see a post I've written in which I demonstrate why this principle is so important.
Good question, this is one of the things I've always found hard to get right in DDD. Do you always access entities through their aggregate root and probably violate the Law of Demeter at some point (AggregateRoot.EntityX.EntityY.DoStuff())? Do you short-circuit the aggregate root? Do you add, at the aggregate root level, one direct accessor for each sub-sub-entity you want to access, muddling the aggregate root?
One way to solve that could be: try to make every object talk only to its immediate or nearby neighbors and not to some distant stranger. Use multiple objects that each know a small part of the path from the aggregate root to the final entity you want to access.
The first object knows only the aggregate root,
It injects AggregateRoot.SubEntity1 into a second object,
Second object in turn injects SubEntity1.SubEntity2 into a third object
and so on.
Interestingly enough, one thing this reveals is the (ir)relevance of some of your domain entities. In the Address example, ask yourself if it feels right for every object that wants to access the primary Address of a Company to be injected an AddressBook. If it seems too convoluted, maybe you should not have an AddressBook in the first place. Maybe it isn't such a strong notion that it deserves to be part of the ubiquitous language after all.
Or, maybe you'll find out an AddressBook is precisely the right object to be used by your client object, and that this client object tries to do too many things at a time in manipulating both a Company and an Address.

Please clarify how create/update happens against child entities of an aggregate root

After much reading and thinking as I begin to get my head wrapped around DDD, I am a bit confused about the best practices for dealing with complex hierarchies under an aggregate root. I think this is a FAQ but after reading countless examples and discussions, no one is quite talking about the issue I'm seeing.
If I am aligned with the DDD thinking, entities below the aggregate root should be immutable. This is the crux of my trouble, so if that isn't correct, that is why I'm lost.
Here is a fabricated example...hope it holds enough water to discuss.
Consider an automobile insurance policy (I'm not in insurance, but this matches the language I hear when on the phone w/ my insurance company).
Policy is clearly an entity. Within the policy, let's say we have Auto. Auto, for the sake of this example, only exists within a policy (maybe you could transfer an Auto to another policy, so this is potential for an aggregate as well, which changes Policy...but assume it simpler than that for now). Since an Auto cannot exist without a Policy, I think it should be an Entity but not a root. So Policy in this case is an aggregate root.
Now, to create a Policy, let's assume it has to have at least one auto. This is where I get frustrated. Assume Auto is fairly complex, including many fields and maybe a child for where it is garaged (a Location). If I understand correctly, a "create Policy" constructor/factory would have to take as input an Auto or be restricted via a builder to not be created without this Auto. And the Auto's creation, since it is an entity, can't be done beforehand (because it is immutable? maybe this is just an incorrect interpretation). So you don't get to say new Auto and then setX, setY, add(Z).
If Auto is more than somewhat trivial, you end up having to build a huge hierarchy of builders and such to try to manage creating an Auto within the context of the Policy.
One more twist to this is later, after the Policy is created and one wishes to add another Auto...or update an existing Auto. Clearly, the Policy controls this...fine...but Policy.addAuto() won't quite fly because one can't just pass in a new Auto (right!?). Examples say things like Policy.addAuto(VIN, make, model, etc.) but are all so simple that that looks reasonable. But if this factory method approach falls apart with too many parameters (the entire Auto interface, conceivably) I need a solution.
From that point in my thinking, I'm realizing that having a transient reference to an entity is OK. So maybe it is fine to have an entity created outside of its parent within the aggregate in a transient environment, and maybe it is OK to say something like:
auto = AutoFactory.createAuto();
auto.setX
auto.setY
or if sticking to immutability, AutoBuilder.new().setX().setY().build()
and then have it get sorted out when you say Policy.addAuto(auto)
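Spelled out, the builder variant described above might look like this Java sketch. It only illustrates the shape of the idea: the fields (`vin`, `make`, `model`) come from the `Policy.addAuto(VIN, make, model, etc.)` example, while the "VIN is required" and "no duplicate VIN on a policy" checks are assumed rules added so that `build()` and `addAuto` have something to enforce.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;

final class Auto {
    private final String vin;
    private final String make;
    private final String model;

    Auto(String vin, String make, String model) {
        this.vin = vin;
        this.make = make;
        this.model = model;
    }

    String vin() { return vin; }
    String make() { return make; }
    String model() { return model; }
}

// Collects the Auto's data step by step, outside of any Policy.
final class AutoBuilder {
    private String vin;
    private String make;
    private String model;

    AutoBuilder vin(String vin) { this.vin = vin; return this; }
    AutoBuilder make(String make) { this.make = make; return this; }
    AutoBuilder model(String model) { this.model = model; return this; }

    Auto build() {
        if (vin == null || vin.isEmpty()) {
            throw new IllegalStateException("VIN is required"); // assumed rule
        }
        return new Auto(vin, make, model);
    }
}

// Aggregate root: a Policy cannot exist without at least one Auto,
// and every further Auto is checked on the way in.
final class Policy {
    private final List<Auto> autos = new ArrayList<>();

    Policy(Auto firstAuto) { addAuto(firstAuto); }

    void addAuto(Auto auto) {
        Objects.requireNonNull(auto, "auto");
        for (Auto existing : autos) {
            if (existing.vin().equals(auto.vin())) { // assumed rule
                throw new IllegalArgumentException("Auto " + auto.vin() + " is already on this policy");
            }
        }
        autos.add(auto);
    }

    int autoCount() { return autos.size(); }
}
```

The Auto is built transiently, and the Policy decides whether to accept it at the moment it is handed over.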
This insurance example gets more interesting if you add Events, such as an Accident with its PolicyReports or RepairEstimates...some value objects but most entities that are all really meaningless outside the policy...at least for my simple example.
The lifecycle of Policy with its growing hierarchy over time seems the fundamental picture I must draw before really starting to dig in...and it is more the factory concept or how the child entities get built/attached to an aggregate root that I haven't seen a solid example of.
I think I'm close. Hope this is clear and not just a repeat FAQ that has answers all over the place.
Aggregate Roots exist for the purpose of transactional consistency.
Technically, all you have are Value Objects and Entities.
The difference between the two is immutability and identity.
A Value Object should be immutable and its identity is the sum of its data.
Money // A value object
{
string Currency;
long Value;
}
Two Money objects are equal if they have equal Currency and equal Value. Therefore, you could swap one for the other and conceptually, it would be as if you had the same Money.
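As a concrete sketch of that equality rule, here is the same Money written as an immutable Java class whose `equals` and `hashCode` depend only on its data. The `plus` method is an extra, added only to show that "changing" a value object means creating a new one.

```java
import java.util.Objects;

// Value object: immutable, and equal to any other Money with the same data.
final class Money {
    private final String currency;
    private final long value;

    Money(String currency, long value) {
        this.currency = Objects.requireNonNull(currency, "currency");
        this.value = value;
    }

    // Returns a new Money instead of modifying this one.
    Money plus(Money other) {
        if (!currency.equals(other.currency)) {
            throw new IllegalArgumentException("Cannot add different currencies");
        }
        return new Money(currency, value + other.value);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Money)) {
            return false;
        }
        Money other = (Money) obj;
        return value == other.value && currency.equals(other.currency);
    }

    @Override
    public int hashCode() {
        return Objects.hash(currency, value);
    }
}
```

With this in place, `new Money("EUR", 5).equals(new Money("EUR", 5))` holds even though the two are distinct objects in memory.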
An Entity is an object with mutability over time, but whose identity is immutable throughout its lifetime.
Person // An entity
{
PersonId Id; // An immutable Value Object storing the Person's unique identity
string Name;
string Email;
int Age;
}
So when and why do you have Aggregate Roots?
Aggregate Roots are specialized Entities whose job is to group a set of domain concepts under one transactional scope for purpose of data change only. That is, say a Person has Legs. You would need to ask yourself, should changes on Legs and changes on Person be grouped together under a single transaction? Or can I change one separately from the other?
Person // An entity
{
PersonId Id;
string Name;
string Ethnicity;
int Age;
Pair<Leg> Legs;
}
Leg // An entity
{
LegId Id;
string Color;
HairAmount HairAmount; // none, low, medium, high, chewbacca
int Length;
int Strength;
}
If Leg can be changed by itself, and Person can be changed by itself, then they are both Aggregate Roots. If Leg cannot be changed alone, and Person must always be involved in the transaction, then Leg should be composed inside the Person entity. At that point, you would have to go through Person to change Leg.
This decision will depend on the domain you are modeling:
Maybe the Person is the sole authority on his legs: they grow longer and stronger based on his age, the color changes according to his ethnicity, etc. These are invariants, and Person will be responsible for making sure they are maintained. If someone else wants to change this Person's legs, say you want to shave his legs, you'd have to ask him to either shave them himself, or hand them to you temporarily for you to shave.
Or you might be in the domain of archaeology. Here you find Legs, and you can manipulate the Legs independently. At some point, you might find a complete body and guess who this person was historically. Now you have a Person, but the Person has no say in what you'll do with the Legs you found, even if they are shown to be his Legs. The color of a Leg changes based on how much restoration you've applied to it, or other things. These invariants would be maintained by another Entity. This time it won't be Person, but maybe Archaeologist instead.
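The first case (Person is the sole authority on his Legs) could be sketched in Java roughly like this. The two rules, "both legs are shaved together" and "legs grow one unit per birthday up to age 18", are invented here purely to have some invariant to protect, and LegView is a hypothetical read-only interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

enum HairAmount { NONE, LOW, MEDIUM, HIGH, CHEWBACCA }

// Read-only view of a Leg, handed to code outside the aggregate.
interface LegView {
    HairAmount hairAmount();
    int length();
}

// Aggregate root: every change to a Leg goes through Person.
final class Person {
    // Leg is private to Person, so nothing outside can mutate it directly.
    private static final class Leg implements LegView {
        private HairAmount hairAmount = HairAmount.MEDIUM;
        private int length;

        Leg(int length) { this.length = length; }

        public HairAmount hairAmount() { return hairAmount; }
        public int length() { return length; }
    }

    private int age;
    private final List<Leg> legs = new ArrayList<>();

    Person(int age, int legLength) {
        this.age = age;
        legs.add(new Leg(legLength));
        legs.add(new Leg(legLength));
    }

    // Assumed invariant: both legs are always shaved together.
    void shaveLegs() {
        for (Leg leg : legs) {
            leg.hairAmount = HairAmount.NONE;
        }
    }

    // Assumed invariant: legs grow one unit per birthday, up to age 18.
    void haveBirthday() {
        age++;
        if (age <= 18) {
            for (Leg leg : legs) {
                leg.length += 1;
            }
        }
    }

    // Callers can look at the legs but cannot modify them or the list.
    List<LegView> legs() {
        return Collections.unmodifiableList(new ArrayList<LegView>(legs));
    }
}
```

In the archaeology case you would instead make Leg a top-level class with its own identity and its own public operations, and Person would at most hold a reference to it.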
TO ANSWER YOUR QUESTION:
I keep hearing you talk about Auto, so that's obviously an important concept of your domain. Is it an entity or a value object? Does it matter if the Auto is the one with serial #XYZ, or are you only interested in brand, colour, year, model, make, etc.? If you care about the exact identity of the Auto and not just its features, then it needs to be an Entity of your domain. Now, you talk about Policy. A policy dictates what is and isn't covered on an Auto. This depends on the Auto itself, and probably on the Customer too, since his Policy might be different based on his driving history and on the type, year and so on of the Auto he has.
So I can already conceive having:
Auto : Entity, IAggregateRoot
{
AutoId Id;
string Serial;
int Year;
Colour Colour;
string Model;
bool IsAtGarage;
Garage Garage;
}
Customer : Entity, IAggregateRoot
{
CustomerId Id;
string Name;
DateTime DateOfBirth;
}
Policy : Entity, IAggregateRoot
{
string Id;
CustomerId customer;
AutoId[] autos;
}
Garage : IValueObject
{
string Name;
string Address;
string PhoneNumber;
}
Now, the way you make it sound, you can change a Policy without having to change an Auto and a Customer along with it. You say things like "what if the Auto is at the garage" or "we transfer an Auto from one Policy to another". This makes me feel that Auto is its own Aggregate Root, and so are Policy and Customer. Why is that? Because it sounds like, in your domain, you would change an Auto's garage without caring whether the Policy changes with it. That is, if someone changes an Auto's Garage and IsAtGarage state, it doesn't matter that the Policy is left unchanged. In case that isn't clear, compare it with Customer: you wouldn't want to change the Customer's Name and DateOfBirth in a non-transactional way, because if the name change succeeds but the date change fails, you now have a corrupt Customer whose DateOfBirth doesn't match his Name. On the other hand, it's fine to change the Auto without changing the Policy. Because of this, Auto should not be in the aggregate of Policy. Effectively, Auto is not a part of Policy, but only something that the Policy keeps track of and might use.
Now we see that it totally makes sense to be able to create an Auto on its own, as it is an Aggregate Root. Similarly, you can create Customers by themselves. And when you create a Policy, you simply must link it to a corresponding Customer and his Autos.
aCustomer = Customer.Make(...);
anAuto = Auto.Make(...);
anotherAuto = Auto.Make(...);
aPolicy = Policy.Make(aCustomer, { anAuto, anotherAuto }, ...);
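In Java, that linking could be sketched as below. Plain String ids and the sendToGarage method are simplifications I'm assuming for the example. The point is only that Policy stores the identities of the other aggregates, not the objects themselves.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Each of these is its own aggregate root with its own identity.
final class Customer {
    private final String id;
    private final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }

    String id() { return id; }
    String name() { return name; }
}

final class Auto {
    private final String id;
    private boolean atGarage;

    Auto(String id) { this.id = id; }

    String id() { return id; }
    boolean isAtGarage() { return atGarage; }

    // Changing an Auto never touches any Policy.
    void sendToGarage() { atGarage = true; }
}

final class Policy {
    private final String id;
    private final String customerId;
    private final List<String> autoIds = new ArrayList<>();

    private Policy(String id, String customerId) {
        this.id = id;
        this.customerId = customerId;
    }

    // Takes the other aggregates as arguments, but keeps only their ids.
    static Policy make(String id, Customer customer, List<Auto> autos) {
        Policy policy = new Policy(id, customer.id());
        for (Auto auto : autos) {
            policy.autoIds.add(auto.id());
        }
        return policy;
    }

    String id() { return id; }
    String customerId() { return customerId; }
    List<String> autoIds() { return Collections.unmodifiableList(autoIds); }
}
```

Because Policy only holds ids, a transaction that sends an Auto to the garage loads and saves the Auto alone.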
Now, in my example, Garage isn't an Aggregate Root. This is because it doesn't seem to be something that the domain directly works with. It is always used through an Auto. This makes sense: insurance companies don't own garages, and they aren't in the business of garages. You wouldn't ever need to create a Garage that existed on its own. It's easy then to have an anAuto.SentToGarage(name, address, phoneNumber) method on Auto which creates a Garage and assigns it to the Auto. You wouldn't delete a Garage on its own. You would do anAuto.LeftGarage() instead.
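A rough Java version of those two methods might look like this. One deliberate deviation from the listing above, and it is my own choice: instead of storing a separate IsAtGarage flag, the sketch derives it from whether a Garage is currently assigned, so the two can never disagree.

```java
import java.util.Objects;
import java.util.Optional;

// Value object: only ever created and discarded through an Auto.
final class Garage {
    private final String name;
    private final String address;
    private final String phoneNumber;

    Garage(String name, String address, String phoneNumber) {
        this.name = Objects.requireNonNull(name);
        this.address = Objects.requireNonNull(address);
        this.phoneNumber = Objects.requireNonNull(phoneNumber);
    }

    String name() { return name; }
    String address() { return address; }
    String phoneNumber() { return phoneNumber; }
}

// Aggregate root: the only way to create or drop a Garage.
final class Auto {
    private final String id;
    private Garage garage; // null while the Auto is not at a garage

    Auto(String id) {
        this.id = Objects.requireNonNull(id);
    }

    String id() { return id; }

    // Creates the Garage value object and assigns it in one step.
    void sentToGarage(String name, String address, String phoneNumber) {
        this.garage = new Garage(name, address, phoneNumber);
    }

    // "Deleting" the Garage is just the Auto leaving it.
    void leftGarage() {
        this.garage = null;
    }

    boolean isAtGarage() {
        return garage != null;
    }

    Optional<Garage> garage() {
        return Optional.ofNullable(garage);
    }
}
```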
"entities below the aggregate root should be immutable."
No. Value objects are supposed to be immutable. Entities can change their state.
You just need to make sure you do proper encapsulation:
entities modify themselves
entities are modified through the aggregate root only
"but Policy.addAuto() won't quite fly because one can't just pass in a new Auto (right!?)"
Usually it's supposed to be so. The problem is that the auto creation task might become way too large. If you are lucky and, knowing that entities can be modified, you are able to divide it smoothly into smaller tasks like SpecifyEngine, the problem is resolved.
However, the "real world" does not work that way, and I feel your pain.
I have a case where a user uploads a crap load of data spread over 18 Excel sheets (with an additional fancy rule: it should be "imported" no matter how invalid the data is, which, as I say, is like saying true==false). This upload process is considered one atomic operation.
What I do in this case...
First of all, I have an Excel document object model, mappings (e.g. Customer.Name==1st sheet, "C24") and readers that fill the DOM. Those things live in infrastructure, far far away.
Next, I have entities and value objects in my domain that look similar to the DOM DTOs, but contain only the projection that I'm interested in, with proper data types and the corresponding validation. I also have a 1:1 association in my domain model that isolates the dirty mess (luckily enough, it kind of fits with the ubiquitous language).
Armed with that, there's still one tricky part left: mapping from the Excel DOM DTOs to domain objects. This is where I sacrifice encapsulation: I construct the entity with its value objects from outside. My thought process is kind of simple: this overexposed entity can't be persisted on its own anyway, and validity can still be enforced (through constructors). It lives underneath an aggregate root.
Basically, this is the part where you can't run away from CRUDyness.
Sometimes an application is just editing a bunch of data.
P.S. I'm not sure that I'm doing the right thing. It's likely I've missed something important on this issue. Hopefully there will be some insight from other answerers.
Part of my answer seems to be captured in these posts:
Domain Driven Design - Parent child relation pattern - Specification pattern
Best practice for Handling NHibernate parent-child collections
how should i add an object into a collection maintained by aggregate root
To summarize:
It is OK to create an entity outside its aggregate if it can manage its own consistency (you may still use a factory for it). So having a transient reference to Auto is OK and then a new Policy(Auto) is how to get it into the aggregate. This would mean building up "temporary" graphs to get the details spread out a bit (not all piled into one factory method or constructor).
I'm seeing my alternatives as either:
(a) Build a DTO or other anemic graph first and then pass it to a factory to get the aggregate built.
Something like:
autoDto = new AutoDto();
autoDto.setVin(..);
autoDto.setEtc...
autoDto.setGaragedLocation(new Location(..));
autoDto.addDriver(...);
Policy policy = PolicyFactory.getInstance().createPolicy(x, y, autoDto);
auto1Dto...
policy.addAuto(auto1Dto);
(b) Use builders (potentially compound):
builder = PolicyBuilder.newInstance();
builder = builder.setX(..).setY(..);
builder = builder.addAuto(vin, new Driver()).setGaragedLocation(new Location());
policy = builder.build();
// and how would update work if we have to protect the creation of Auto instances?
auto1 = AutoBuilder.newInstance(policy, vin, new Driver()).build();
policy.addAuto(auto1);
As this thing twists around and around, a couple of things seem clear.
In the spirit of ubiquitous language, it makes sense to be able to say:
policy.addAuto
and
policy.updateAuto
The arguments to these, and how the aggregate and entity creation semantics are managed, are not quite clear, but having to look at a factory to understand the domain seems a bit forced.
Even if Policy is an aggregate and manages how things are put together beneath it, the rules about how an Auto looks seem to belong to Auto or its factory (with some exceptions for where Policy is involved).
Since Policy is invalid without a minimally constructed set of children, those children need to be created prior to or within its creation.
And that last statement is the crux. It looks like for the most part these posts handle the creation of children as separate affairs and then glue them. The pure DDD approach would seem to argue that Policy has to create Autos but the details of that spin wildly out of control in non-trivial cases.
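One way to picture the "create the children separately, then glue them" approach is the Java sketch below. It is only an illustration under invented rules: a non-blank VIN is the only thing Auto checks about itself, and Policy insists on at least one Auto with no duplicate VINs. None of these rules come from a real insurance domain.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Auto guards its own rules, so it can safely be built outside the Policy.
final class Auto {
    private final String vin;

    Auto(String vin) {
        if (vin == null || vin.trim().isEmpty()) {
            throw new IllegalArgumentException("VIN is required");
        }
        this.vin = vin;
    }

    String vin() { return vin; }
}

// Policy guards the rules that involve the aggregate as a whole.
final class Policy {
    private final List<Auto> autos = new ArrayList<>();

    private Policy() {}

    // A Policy cannot exist without a minimal set of children.
    static Policy create(List<Auto> initialAutos) {
        if (initialAutos == null || initialAutos.isEmpty()) {
            throw new IllegalArgumentException("A Policy needs at least one Auto");
        }
        Policy policy = new Policy();
        for (Auto auto : initialAutos) {
            policy.addAuto(auto);
        }
        return policy;
    }

    // Reads like the ubiquitous language: policy.addAuto(auto).
    void addAuto(Auto auto) {
        for (Auto existing : autos) {
            if (existing.vin().equals(auto.vin())) {
                throw new IllegalStateException("Auto already on policy: " + auto.vin());
            }
        }
        autos.add(auto);
    }

    List<Auto> autos() {
        return Collections.unmodifiableList(autos);
    }
}
```

The details of how an Auto is assembled stay out of Policy, so they can grow in a builder or factory of their own without Policy's methods growing with them.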

Resources