DDD: Large Aggregate Root - Person - domain-driven-design

DDD: Large Aggregate Root - Person - domain-driven-design

I am building a system to manage person information. I have an ever growing aggregate root called Person. It now has hundreds of related objects, name, addresses, skills, absences, etc. My concern is that the Person AR is both breaking SRP and will create performance problems as more and more things (esp collections) get added to it.
I cannot see how with DDD to break this down into smaller objects. Taking the example of Absences. The Person has a collection of absence records (startdate, enddate, reason). These are currently managed through the Person (BookAbsence, ChangeAbsence, CancelAbsence). When adding absences I need to validate against all other absences, so I need an object which has access to the other absences in order to do this validation.
Am I missing something here? Is there another AR I have not identified? In the past I would have done this via an "AbsenceManager" service, but would like to do it using DDD.
I am fairly new to DDD, so maybe I am missing something.
Many Thanks....

The Absence chould be modeled as an aggregate. An AbsenceFactory is reposible for validating against other Absence s when you want to add a new Absence.
Code example:
public class AbsenceFactory {
private AbsenceRepository absenceRepository;
public Absence newAbsenceOf(Person person) {
List<Absence> current =
absenceRepository.findAll(person.getIdentifier());
//validate and return
}
}
You can find this pattern in the blue book (section 6.2 Factory if I'm not mistaken)
In other "modify" cases, you could introduce a Specification
public class SomeAbsenceSpecification {
private AbsenceRepository absenceRepository;
public SomeAbsenceSpecification(AbsenceRepository absenceRepository) {
this.absenceRepository=absenceRepository;
}
public boolean isSatisfiedBy(Absence absence) {
List<Absence> current =
absenceRepository.findAll(absence.getPersonIdentifier());
//validate and return
}
}
You can find this pattern in the blue book(section 9.2.3 Specification)

This is indeed what makes aggregate design so tricky. Ownership does not necessarily mean aggregation. One needs to understand the domain to be able to give a proper answer so we'll go with the good ol' Order example. A Customer would not have a collection of Order objects. The simplest rule is to think about deleting an AR. Those objects that could make sense in the absence of the AR probably do not belong on the AR. A Customer may very well have a collection of ActiveOrder objects, though. Of course there would be an invariant stating that a customer cannot be deleted if it has active orders.
Another thing to look out for is a bloated bounded context. It is conceivable that you could have one or more bounded contexts that have not been identified leading to a situation where you have an AR doing too much.
So in your case you may very well still be interested in the Absence should the Customer be deleted. In the case of an OrderLine it has no meaning without its Order. So no lifecycle of its own.
Hope that helps ever so slightly.

I am building a system to manage person information.
Are you sure that a simple CRUD application that edit/query RDBMS's tables via SQL, wouldn't be a cheaper approach?
If you can express the most of the business rules in term of data relations and table operations, you shouln't use DDD at all.
I have an ever growing aggregate root called Person.
If you actually have complex business rules, an ever growing aggregate is often a syntom of undefined (or wrongly defined) context boundaries.

Related

DDD Factory Responsibility

If have the following Code.
public class CountryFactory : IEntityFactory
{
private readonly IRepository<Country> countryRepository;
public CountryFactory(IRepository<Country> countryRepository)
{
this.countryRepository = countryRepository;
}
public Country CreateCountry(string name)
{
if (countryRepository.FindAll().Any(c => c.Name == name))
{
throw new ArgumentException("There is already a country with that name!");
}
return new Country(name);
}
}
From a DDD approach, is the the correct way to create a Country. Or is it better to have a CountryService which checks whether or not a country exists, then if it does not, just call the factory to return a new entity. This will then mean that the service will be responsible of persisting the Entity rather than the Factory.
I'm a bit confused as to where the responsibility should lay. Especially if more complex entities needs to be created which is not as simple as creating a country.

In DDD factories are used to encapsulate complex objects and aggregates creation. Usually, factories are not implemented as separate classes but rather static methods on the aggregate root class that returns the new aggregate.
Factory methods are better suited than constructors since you might need to have technical constructors for serialization purposes and var x = new Country(name) has very little meaning inside your Ubiquitous Language. What does it mean? Why do you need a name when you create a country? Do you really create countries, how often new countries appear, do you even need to model this process? All these questions arise if you start thinking about your model and ubiquitous language besides tactical pattern.
Factories must return valid objects (i.e. aggregates), checking all invariants inside it, but not outside. Factory might receive services and repositories as parameters but this is also not very common. Normally, you have an application service or command handler that does some validations and then creates a new aggregate using the factory method and adds it to the repository.
There is also a good answer by Lev Gorodinski here Factory Pattern where should this live in DDD?
Besides, implementation of Factories is extensively described in Chapter 11 of the Red Book.

Injecting a Repository into a Factory is OK, but it shouldn't be your first concern. The starting point should be : what kind of consistency does your business domain require ?
By checking Country name uniqueness in CountryFactory which is part of your Domain layer, you give yourself the impression that the countries will always be consistent. But the only aggregate is Country and since there is no AllCountries aggregate to act as a consistency boundary, respect of this invariant will not be guaranteed. Somebody could always sneak in a new Country that has exactly the same name as the one being added, just after you checked it. What you could do is wrap the CreateCountry operation into a transaction that would lock the entire set of Countries (and thus the entire table if you use an RDBMS) but this would hurt concurrency.
There are other options to consider.
Why not leverage a database unique constraint to enforce the Country name invariant ? As a complement, you could also have another checkpoint at the UI level to warn the user that the country name they typed in is already taken. This would necessitate another "query" service that just calls CountryRepository.GetByName() but where the returned Countries are not expected to be modified.
Soon you'll be realizing that there are really two kinds of models - ones that can give you some domain data at a given moment in time so that you can display it on a user interface, and ones that expose operations (AddCountry) and will guarantee that domain invariants always hold. This is a first step towards CQRS.
What is the frequency of Countries being added or modified ? If it is that high, do we really need a Country name to be unique at all times ? Wouldn't it solve a lot of problems if we loosened up the constraints and allowed a user to temporarily create a duplicate Country name ? A mechanism could detect the duplicates later on and take a compensating action, putting the newly added Country on hold and reaching out to the user to ask them to change the name. A.k.a eventual consistency instead of immediate consistency.
Does Country need to be an Aggregate ? What would be the cost if it was a Value Object and duplicated in each entity where it is used ?

Creating child instances on an aggregate root in DDD

I have been reading Eric Evan's book on DDD and on page 139 he states:
"if you needed to add elements inside a preexisting AGGREGATE, you might create a FACTORY METHOD on the root of the AGGREGATE"
I would assume that could be implemented something like this where the method NewLineItem is used to create and add a new line item to the order.
class Order
{
public IEnumerable<LineItem> LineItems { get; }
public void NewLineItem(Product product, int quantity);
}
Another way I could think of doing this is to move the factory method into the collection itself. Something like this below. I could then add a new item by calling LineItems.New(...).
class Order
{
public LineItems LineItems { get; }
public class LineItems : IEnumerable<LineItem>
{
public void New(Product product, int quantity);
}
}
What are the pros/cons to each approach? Are there any gotchas with moving the factory method into a collection? We are currently trying to figure out the best way to implement a large domain model. We are concerned that some of these root aggregate models will get bloated with numerous factory methods and deletion methods such as RemoveLineItem(LineItem). Our thinking is that moving these factory methods to their collections helps organize the design and keeps the root aggregate less cluttered with methods. Any advice?
Thanks

One advantage of having the factory method on the AR directly is that it makes the AR aware of the changes and allows it to enforce it's invariants. Not only that, but because the method is aware of the internal state of the AR you may be able to reduce the number of arguments passed to the factory method (most useful when creating other related ARs).
E.g. registration = course.register(registrant) vs registration = new Registration(registrant, courseId)
Also, LineItem becomes an implementation detail so the client doesn't need to be aware of that class.
The fact that you are asking this question and are actually worried of having too many methods on your ARs is perhaps an indicator that you may be clustering together objects that do not belong together.
Do not lose sight of the AR main purpose: it's a transactionnal boundary allowing to protect invariants. If there's no invariant to protect then clustering may be unecessary or even undesirable.
I would strongly advise you to read Effective Aggregate Design by Vauhgn Vernon.

There is always that "law" of Demeter business :)
The aggregate root (AR) is going to be responsible for the integrity and invariants. It may be possible that you will have an invariant along the lines of "maximum order total of $50 and no more than 6 line items at any time". The collection will not have access to any of this information (well, perhaps the count). So the idea is that the AR handles these interactions.
If you are concerned with bloat or find yourself with ARs that are unwieldy it may indicate a problem with your design. Vaughn Vernon covers these scenarios quite nicely in his book. You really do want highly cohesive ARs and it can be tricky to identify them correctly. A couple of iterations may be required to get the most comfortable design.
So I would try and stick with Eric's advice and handle the interactions on the AR itself as far as is practically possible.

Simple aggregate root and repository

I'm one of many trying to understand the concept of aggregate roots, and I think that I've got it!
However, when I started modeling this sample project, I quickly ran into a dilemma.
I have the two entities ProcessType and Process. A Process cannot exist without a ProcessType, and a ProcessType has many Processes. So a process holds a reference to a type, and cannot exist without it.
So should ProcessType be an aggregate root? New processes would be created by calling processType.AddProcess(new Process());
However, I have other entities that only holds a reference to the Process, and accesses its type through Process.Type. In this case it makes no sense going through ProcessType first.
But AFAIK entities outside the aggregate are only allowed to hold references to the root of the aggregate, and not entities inside the aggregate. So do I have two aggregates here, each with their own repository?

I largely agree with what Sisyphus has said, particularly the bit about not constricting yourself to the 'rules' of DDD that may lead to a pretty illogical solution.
In terms of your problem, I have come across the situation many times, and I would term 'ProcessType' as a lookup. Lookups are objects that 'define', and have no references to other entities; in DDD terminology, they are value objects. Other examples of what I would term a lookup may be a team member's 'RoleType', which could be a tester, developer, project manager for example. Even a person's 'Title' I would define as a lookup - Mr, Miss, Mrs, Dr.
I would model your process aggregate as:
public class Process
{
public ProcessType { get; }
}
As you say, these type of objects typically need to populate dropdowns in the UI and therefore need their own data access mechanism. However, I have personally NOT created 'repositories' as such for them, but rather a 'LookupService'. This for me retains the elegance of DDD by keeping 'repositories' strictly for aggregate roots.
Here is an example of a command handler on my app server and how I have implemented this:
Team Member Aggregate:
public class TeamMember : Person
{
public Guid TeamMemberID
{
get { return _teamMemberID; }
}
public TeamMemberRoleType RoleType
{
get { return _roleType; }
}
public IEnumerable<AvailabilityPeriod> Availability
{
get { return _availability.AsReadOnly(); }
}
}
Command Handler:
public void CreateTeamMember(CreateTeamMemberCommand command)
{
TeamMemberRoleType role = _lookupService.GetLookupItem<TeamMemberRoleType>(command.RoleTypeID);
TeamMember member = TeamMemberFactory.CreateTeamMember(command.TeamMemberID,
role,
command.DateOfBirth,
command.FirstName,
command.Surname);
using (IUnitOfWork unitOfWork = UnitOfWorkFactory.CreateUnitOfWork())
_teamMemberRepository.Save(member);
}
The client can also make use of the LookupService to populate dropdown's etc:
ILookup<TeamMemberRoleType> roles = _lookupService.GetLookup<TeamMemberRoleType>();

Not so simple. ProcessType is most likley a knowledge layer object - it defines a certain process. Process on the other hand is an instance of a process that is ProcessType. You probably really don't need or want the bidirectional relationship. Process is probably not a logical child of a ProcessType. They typically belong to something else, like a Product, or Factory or Sequence.
Also by definition when you delete an aggregate root you delete all members of the aggregate. When you delete a Process I seriously doubt you really want to delete ProcessType. If you deleted ProcessType you might want to delete all Processes of that type, but that relationship is already not ideal and chances are you will not be deleting definition objects ever as soon as you have a historical Process that is defined by ProcessType.
I would remove the Processes collection from ProcessType and find a more suitable parent if one exists. I would keep the ProcessType as a member of Process since it probably defines Process. Operational layer (Process) and Knowledge Layer(ProcessType) objects rarely work as a single aggregate so I would have either Process be an aggregate root or possibly find an aggregate root that is a parent for process. Then ProcessType would be a external class. Process.Type is most likely redundant since you already have Process.ProcessType. Just get rid of that.
I have a similar model for healthcare. There is Procedure (Operational layer) and ProcedureType (knowledge layer). ProcedureType is a standalone class. Procedure is a child of a third object Encounter. Encounter is the aggregate root for Procedure. Procedure has a reference to ProcedureType but it is one way. ProcedureType is a definition object it does not contain a Procedures collection.
EDIT (because comments are so limited)
One thing to keep in mind through all of this. Many are DDD purists and adamant about rules. However if you read Evans carefully he constantly raises the possibility that tradeoffs are often required. He also goes to pretty great lengths to characterize logical and carefully thought out design decisions versus things like teams that do not understand the objectives or circumvent things like aggregates for the sake of convenience.
The important things is to understand and apply the concepts as opposed to the rules. I see many DDD that shoehorn an application into illogical and confusing aggregates etc for no other reason than because a literal rule about repositories or traversal is being applied, That is not the intent of DDD but it is often the product of the overly dogmatic approach many take.
So what are the key concepts here:
Aggregates provide a means to make a complex system more manageable by reducing the behaviors of many objects into higher level behaviors of the key players.
Aggregates provide a means to ensure that objects are created in a logical and always valid condition that also preserves a logical unit of work across updates and deletes.
Let's consider the last point. In many conventional applications someone creates a set of objects that are not fully populated because they only need to update or use a few properties. The next developer comes along and he needs these objects too, and someone has already made a set somewhere in the neighborhood fora different purpose. Now this developer decides to just use those, but he then discovers they don't have all the properties he needs. So he adds another query and fills out a few more properties. Eventually because the team does not adhere to OOP because they take the common attitude that OOP is "inefficient and impractical for the real world and causes performance issues such as creating full objects to update a single property". What they end up with is an application full of embedded SQL code and objects that essentially randomly materialize anywhere. Even worse these objects are bastardized invalid proxies. A Process appears to be a Process but it is not, it is partially populated in different ways any given point depending on what was needed. You end up with a ball mud of numerous queries to continuously partially populate objects to varying degrees and often a lot of extraneous crap like null checks that should not exist but are required because the object is never truly valid etc.
Aggregate rules prevent this by ensuring objects are created only at certain logical points and always with a full set of valid relationships and conditions. So now that we fully understand exactly what aggregate rules are for and what they protect us from, we also want to understand that we also do not want to misuse these rules and create strange aggregates that do not reflect what our application is really about simply because these aggregate rules exists and must be followed at all times.
So when Evans says create Repositories only for aggregates he is saying create aggregates in a valid state and keep them that way instead of bypassing the aggregate for internal objects directly. You have a Process as a root aggregate so you create a repository. ProcessType is not part of that aggregate. What do you do? Well if an object is by itself and it is an entity, it is an aggregate of 1. You create a repository for it.
Now the purist will come along and say you should not have that repository because ProcessType is a value object, not an entity. Therefore ProcessType is not an aggregate at all, and therefore you do not create a repository for it. So what do you do? What you don't do is shoehorn ProcessType into some kind of artificial model for no other reason than you need to get it so you need a repository but to have a repository you have to have an entity as an aggregate root. What you do is carefully consider the concepts. If someone tells you that repository is wrong, but you know that you need it and whatever they may say it is, your repository system is valid and preserves the key concepts, you keep the repository as is instead of warping your model to satisfy dogma.
Now in this case assuming I am correct about what ProcessType is, as the other commentor noted it is in fact a Value Object. You say it cannot be a Value Object. That could be for several reasons. Maybe you say that because you use NHibernate for example, but the NHibernate model for implementing value objects in the same table as another object does not work. So your ProcessType requires an identity column and field. Often because of database considerations the only practical implementation is to have value objects with ids in their own table. Or maybe you say that because each Process points to a single ProcessType by reference.
It does not matter. It is a value Object because of the concept. If you have 10 Process objects that are of the same ProcessType you have 10 Process.ProcessType members and values. Whether each Process.ProcessType points to a single reference, or each got a copy, they should still by definition all be exactly the same things and all be completely interchangeable with any of the other 10. THAT is what makes it a value Object. The person who says "It has an Id therefore is cannot be a value Object you have an entity" is making a dogmatic error. Don't make the same error, if you need an ID field give it one, but don't say "it can't be a Value Object" when it in fact is albeit one that for other reason you had to give an Id to.
So how do you get this one right and wrong? ProcessType is a Value Object, but for some reason you need it to have an Id. The Id per se does not violate the rules. You get it right by having 10 processes that all have a ProcessType that is exactly the same. Maybe each has a local deeep copy, maybe they all point to one object. but each is identical either way, ergo each has an Id = 2, for example. You get is wrong when you do this: 10 Processes each have a ProcessType, and this ProcessType is identical and completely interchangeable EXCEPT now each also has it's own unique Id as well. Now you have 10 instances of the same thing but they vary only in Id, and will always vary only in Id. Now you no longer have a Value Object, not because you gave it an Id, but because you gave it an Id with an implementation that reflects the nature of an entity - each instance is unique and different
Make sense?

Look i think you have to restructure your model. Use ProcessType like a Value Object and Process Agg Root.
This way Every Process has a processType
Public class Process
{
Public Process()
{
}
public ProcessType { get; }
}
for this u just need 1 agg root not 2.

Please clarify how create/update happens against child entities of an aggregate root

After much reading and thinking as I begin to get my head wrapped around DDD, I am a bit confused about the best practices for dealing with complex hierarchies under an aggregate root. I think this is a FAQ but after reading countless examples and discussions, no one is quite talking about the issue I'm seeing.
If I am aligned with the DDD thinking, entities below the aggregate root should be immutable. This is the crux of my trouble, so if that isn't correct, that is why I'm lost.
Here is a fabricated example...hope it holds enough water to discuss.
Consider an automobile insurance policy (I'm not in insurance, but this matches the language I hear when on the phone w/ my insurance company).
Policy is clearly an entity. Within the policy, let's say we have Auto. Auto, for the sake of this example, only exists within a policy (maybe you could transfer an Auto to another policy, so this is potential for an aggregate as well, which changes Policy...but assume it simpler than that for now). Since an Auto cannot exist without a Policy, I think it should be an Entity but not a root. So Policy in this case is an aggregate root.
Now, to create a Policy, let's assume it has to have at least one auto. This is where I get frustrated. Assume Auto is fairly complex, including many fields and maybe a child for where it is garaged (a Location). If I understand correctly, a "create Policy" constructor/factory would have to take as input an Auto or be restricted via a builder to not be created without this Auto. And the Auto's creation, since it is an entity, can't be done beforehand (because it is immutable? maybe this is just an incorrect interpretation). So you don't get to say new Auto and then setX, setY, add(Z).
If Auto is more than somewhat trivial, you end up having to build a huge hierarchy of builders and such to try to manage creating an Auto within the context of the Policy.
One more twist to this is later, after the Policy is created and one wishes to add another Auto...or update an existing Auto. Clearly, the Policy controls this...fine...but Policy.addAuto() won't quite fly because one can't just pass in a new Auto (right!?). Examples say things like Policy.addAuto(VIN, make, model, etc.) but are all so simple that that looks reasonable. But if this factory method approach falls apart with too many parameters (the entire Auto interface, conceivably) I need a solution.
From that point in my thinking, I'm realizing that having a transient reference to an entity is OK. So, maybe it is fine to have a entity created outside of its parent within the aggregate in a transient environment, so maybe it is OK to say something like:
auto = AutoFactory.createAuto();
auto.setX
auto.setY
or if sticking to immutability, AutoBuilder.new().setX().setY().build()
and then have it get sorted out when you say Policy.addAuto(auto)
This insurance example gets more interesting if you add Events, such as an Accident with its PolicyReports or RepairEstimates...some value objects but most entities that are all really meaningless outside the policy...at least for my simple example.
The lifecycle of Policy with its growing hierarchy over time seems the fundamental picture I must draw before really starting to dig in...and it is more the factory concept or how the child entities get built/attached to an aggregate root that I haven't seen a solid example of.
I think I'm close. Hope this is clear and not just a repeat FAQ that has answers all over the place.

Aggregate Roots exist for the purpose of transactional consistency.
Technically, all you have are Value Objects and Entities.
The difference between the two is immutability and identity.
A Value Object should be immutable and it's identity is the sum of it's data.
Money // A value object
{
string Currency;
long Value;
}
Two Money objects are equal if they have equal Currency and equal Value. Therefore, you could swap one for the other and conceptually, it would be as if you had the same Money.
An Entity is an object with mutability over time, but whose identity is immutable throughout it's lifetime.
Person // An entity
{
PersonId Id; // An immutable Value Object storing the Person's unique identity
string Name;
string Email;
int Age;
}
So when and why do you have Aggregate Roots?
Aggregate Roots are specialized Entities whose job is to group a set of domain concepts under one transactional scope for purpose of data change only. That is, say a Person has Legs. You would need to ask yourself, should changes on Legs and changes on Person be grouped together under a single transaction? Or can I change one separately from the other?
Person // An entity
{
PersonId Id;
string Name;
string Ethnicity;
int Age;
Pair<Leg> Legs;
}
Leg // An entity
{
LegId Id;
string Color;
HairAmount HairAmount; // none, low, medium, high, chewbacca
int Length;
int Strength;
}
If Leg can be changed by itself, and Person can be changed by itself, then they both are Aggregate Roots. If Leg can not be changed alone, and Person must always be involved in the transaction, than Leg should be composed inside the Person entity. At which point, you would have to go through Person to change Leg.
This decision will depend on the domain you are modeling:
Maybe the Person is the sole authority on his legs, they grow longer and stronger based on his age, the color changes according to his ethnicity, etc. These are invariants, and Person will be responsible for making sure they are maintained. If someone else wants to change this Person's legs, say you want to shave his legs, you'd have to ask him to either shaves them himself, or hand them to you temporarily for you to shave.
Or you might be in the domain of archeology. Here you find Legs, and you can manipulate the Legs independently. At some point, you might find a complete body and guess who this person was historically, now you have a Person, but the Person has no say in what you'll do with the Legs you found, even if it was shown to be his Legs. The color of the Leg changes based on how much restoration you've applied to it, or other things. These invariants would be maintained by another Entity, this time it won't be Person, but maybe Archaeologist instead.
TO ANSWER YOUR QUESTION:
I keep hearing you talk about Auto, so that's obviously an important concept of your domain. Is it an entity or a value object? Does it matter if the Auto is the one with serial #XYZ, or are you only interested in brand, colour, year, model, make, etc.? Say you care about the exact identity of the Auto and not just it's features, than it would need to be an Entity of your domain. Now, you talk about Policy, a policy dictates what is covered and not covered on an Auto, this depends on the Auto itself, and probably the Customer too, since based on his driving history, the type and year and what not of Auto he has, his Policy might be different.
So I can already conceive having:
Auto : Entity, IAggregateRoot
{
AutoId Id;
string Serial;
int Year
colour Colour;
string Model
bool IsAtGarage
Garage Garage;
}
Customer : Entity, IAggregateRoot
{
CustomerId Id;
string Name;
DateTime DateOfBirth;
}
Policy : Entity, IAggregateRoot
{
string Id;
CustomerId customer;
AutoId[] autos;
}
Garage : IValueObject
{
string Name;
string Address;
string PhoneNumber;
}
Now the way you make it sound, you can change a Policy without having to change an Auto and a Customer together. You say things like, what if the Auto is at the garage, or we transfer an Auto from one Policy to another. This makes me feel like Auto is it's own Aggregate Root, and so is Policy and so is Customer. Why is that? Because it sounds like it is the usage of your domain that you would change an Auto's garage without caring that the Policy be changed with it. That is, if someone changes an Auto's Garage and IsAtGarage state, you don't care not to change the Policy. I'm not sure if I'm being clear, you wouldn't want to change the Customer's Name and DateOfBirth in a non transactional way, because maybe you change his name, but it fails to change the Date and now you have a corrupt customer whose Date of Birth doesn't match his name. On the other hand, it's fine to change the Auto without changing the Policy. Because of this, Auto should not be in the aggregate of Policy. Effectively, Auto is not a part of Policy, but only something that the Policy keeps track of and might use.
Now we see that it then totally make sense that you are able to create an Auto on it's own, as it is an Aggregate Root. Similarly, you can create Customers by themselves. And when you create a Policy, you simply must link it to a corresponding Customer and his Autos.
aCustomer = Customer.Make(...);
anAuto = Auto.Make(...);
anotherAuto = Auto.Make(...);
aPolicy = Policy.Make(aCustomer, { anAuto, anotherAuto }, ...);
Now, in my example, Garage isn't an Aggregate Root. This is because, it doesn't seem to be something that the domain directly works with. It is always used through an Auto. This makes sense, Insurance companies don't own garages, they don't work in the business of garages. You wouldn't ever need to create a Garage that existed on it's own. It's easy then to have an anAuto.SentToGarage(name, address, phoneNumber) method on Auto which creates a Garage and assign it to the Auto. You wouldn't delete a Garage on it's own. You would do anAuto.LeftGarage() instead.

entities below the aggregate root should be immutable.
No. Value objects are supposed to be immutable. Entities can change their state.
Just need to make sure You do proper encapsulation:
entities modifies themselves
entities are modified through aggregate root only
but Policy.addAuto() won't quite fly because one can't just pass in a new Auto (right!?)
Usually it's supposed to be so. Problem is that auto creation task might become way too large. If You are lucky and, knowing that entities can be modified, are able to divide smoothly it into smaller tasks like SpecifyEngine, problem is resolved.
However, "real world" does not work that way and I feel Your pain.
I got case when user uploads 18 excel sheets long crap load of data (with additional fancy rule - it should be "imported" whatever how invalid data are (as I say - that's like saying true==false)). This upload process is considered as one atomic operation.
What I do in this case...
First of all - I have excel document object model, mappings (e.g. Customer.Name==1st sheet, "C24") and readers that fill DOM. Those things live in infrastructure far far away.
Next thing - entity and value objects in my domain that looks similar to DOM dto`s, but only projection that I'm interested in, with proper data types and according validation. + I Have 1:1 association in my domain model that isolates dirty mess out (luckily enough, it kind a fits with ubiquitous language).
Armed with that - there's still one tricky part left - mapping between excel DOM dtos to domain objects. This is where I sacrifice encapsulation - I construct entity with its value objects from outside. My thought process is kind a simple - this overexposed entity can't be persisted anyway and validness still can be forced (through constructors). It lives underneath aggregate root.
Basically - this is the part where You can't runaway from CRUDyness.
Sometimes application is just editing bunch of data.
P.s. I'm not sure that I'm doing right thing. It's likely I've missed something important on this issue. Hopefully there will be some insight from other answerers.

Part of my answer seems to be captured in these posts:
Domain Driven Design - Parent child relation pattern - Specification pattern
Best practice for Handling NHibernate parent-child collections
how should i add an object into a collection maintained by aggregate root
To summarize:
It is OK to create an entity outside its aggregate if it can manage its own consistency (you may still use a factory for it). So having a transient reference to Auto is OK and then a new Policy(Auto) is how to get it into the aggregate. This would mean building up "temporary" graphs to get the details spread out a bit (not all piled into one factory method or constructor).
I'm seeing my alternatives as either:
(a) Build a DTO or other anemic graph first and then pass it to a factory to get the aggregate built.
Something like:
autoDto = new AutoDto();
autoDto.setVin(..);
autoDto.setEtc...
autoDto.setGaragedLocation(new Location(..));
autoDto.addDriver(...);
Policy policy = PolicyFactory.getInstance().createPolicy(x, y, autoDto);
auto1Dto...
policy.addAuto(auto1Dto);
(b) Use builders (potentially compound):
builder = PolicyBuilder.newInstance();
builder = builder.setX(..).setY(..);
builder = builder.addAuto(vin, new Driver()).setGaragedLocation(new Location());
Policy = builder.build();
// and how would update work if have to protect the creation of Auto instances?
auto1 = AutoBuilder.newInstance(policy, vin, new Driver()).build();
policy.addAuto(auto1);
As this thing twists around and around a couple things seem clear.
In the spirit of ubiquitous language, it makes sense to be able to say:
policy.addAuto
and
policy.updateAuto
The arguments to these and how the aggregate and the entity creation semantics are managed is not quite clear, but having to look at a factory to understand the domain seems a bit forced.
Even if Policy is an aggregate and manages how things are put together beneath it, the rules about how an Auto looks seem to belong to Auto or its factory (with some exceptions for where Policy is involved).
Since Policy is invalid without a minimally constructed set of children, those children need to be created prior or within its creation.
And that last statement is the crux. It looks like for the most part these posts handle the creation of children as separate affairs and then glue them. The pure DDD approach would seem to argue that Policy has to create Autos but the details of that spin wildly out of control in non-trivial cases.

What Belongs to the Aggregate Root

This is a practical Domain Driven Design question:
Conceptually, I think I get Aggregate roots until I go to define one.
I have an Employee entity, which has surfaced as an Aggregate root. In the Business, some employees can have work-related Violations logged against them:
Employee-----*Violations
Since not all Employees are subject to this, I would think that Violations would not be a part of the Employee Aggregate, correct?
So when I want to work with Employees and their related violations, is this two separate Repository interactions by some Service?
Lastly, when I add a Violation, is that method on the Employee Entity?
Thanks for the help!

After doing even MORE research, I think I have the answer to my question.
Paul Stovell had this slightly edited response to a similar question on the DDD messageboard. Substitute "Customer" for "Employee", and "Order" for "Violation" and you get the idea.
Just because Customer references Order
doesn't necessarily mean Order falls
within the Customer aggregate root.
The customer's addresses might, but
the orders can be independent (for
example, you might have a service that
processes all new orders no matter who
the customer is. Having to go
Customer->Orders makes no sense in
this scenario).
From a domain point of view, you can
even question the validity of those
references (Customer has reference to
a list of Orders). How often will you
actually need all orders for a
customer? In some systems it makes
sense, but in others, one customer
might make many orders. Chances are
you want orders for a customer between
a date range, or orders for a customer
that aren't processed yet, or orders
which have not been paid, and so on.
The scenario in which you'll need all
of them might be relatively uncommon.
However, it's much more likely that
when dealing with an Order, you will
want the customer information. So in
code, Order.Customer.Name is useful,
but Customer.Orders[0].LineItem.SKU -
probably not so useful. Of course,
that totally depends on your business
domain.
In other words, Updating Customer has nothing to do with updating Orders. And orders, or violations in my case, could conceivable be dealt with independently of Customers/Employees.
If Violations had detail lines, then Violation and Violation line would then be a part of the same aggregate because changing a violation line would likely affect a Violation.
EDIT**
The wrinkle here in my Domain is that Violations have no behavior. They are basically records of an event that happened. Not sure yet about the implications that has.

Eric Evan states in his book, Domain-Driven Design: Tackling the Complexity in the Heart of Software,
An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes.
There are 2 important points here:
These objects should be treated as a "unit".
For the purpose of "data change".
I believe in your scenario, Employee and Violation are not necessarily a unit together, whereas in the example of Order and OrderItem, they are part of a single unit.
Another thing that is important when modeling the agggregate boundaries is whether you have any invariants in your aggregate. Invariants are business rules that should be valid within the "whole" aggregate. For example, as for the Order and OrderItem example, you might have an invariant that states the total cost of the order should be less than a predefined amount. In this case, anytime you want to add an OrderItem to the Order, this invariant should be enforced to make sure that your Order is valid. However, in your problem, I don't see any invariants between your entities: Employee and Violation.
So short answer:
I believe Employee and Violation each belong to 2 separate aggregates. Each of these entities are also their own aggregate roots. So you need 2 repositories: EmployeeRepository and ViolationRepository.
I also believe you should have an unidirectional association from Violation to Employee. This way, each Violation object knows who it belongs to. But if you want to get the list of all Violations for a particular Employee, then you can ask the ViolationRepository:
var list = repository.FindAllViolationsByEmployee(someEmployee);

You say that you have employee entity and violations and each violation does not have any behavior itself. From what I can read above, it seems to me that you may have two aggregate roots:
Employee
EmployeeViolations (call it EmployeeViolationCard or EmployeeViolationRecords)
EmployeeViolations is identified by the same employee ID and it holds a collection of violation objects. You get behavior for employee and violations separated this way and you don't get Violation entity without behavior.
Whether violation is entity or value object you should decide based on its properties.

I generally agree with Mosh on this one. However, keep in mind the notion of transactions in the business point of view. So I actually take "for the purpose of data changes" to mean "for the purpose of transaction(s)".
Repositories are views of the domain model. In a domain environment, these "views" really support or represent a business function or capability - a transaction. Case in point, the Employee may have one or more violations, and if so, are aspects of a transaction(s) in a point in time. Consider your use cases.
Scenario: "An employee commits an act that is a violation of the workplace." This is a type of business event (i.e. transaction, or part of a larger, perhaps distributed transaction) that occurred. The root affected domain object actually can be seen from more than one perspective, which is why it is confusing. But the thing to remember is behavior as it pertains to a business transaction, since you want your business processes to model the real-world as accurate as possible. In terms of relationships, just like in a relational database, your conceptual domain model should actually indicate this already (i.e. the associativity), which often can be read in either direction:
Employee <----commits a -------committed by ----> Violation
So for this use case, it would be fair that to say that it is a transaction dealing with violations, and that the root - or "primary" entity - is a Violation. That, then would be your aggregate root you would reference for that particular business activity or business process. But that is not to say that, for a different activity or process, that you cannot have an Employee aggregate root, such as the "new employee process". If you take care, there should be no negative impact of cyclic references, or being able to traverse your domain model multiple ways. I will warn, however, that governing of this should be thought about and handled by your controller piece of your business domain, or whatever equivalent you have.
Aside: Thinking in terms of patterns (i.e. MVC), the repository is a view, the domain objects are the model, and thus one should also employ some form of controller pattern. Typically, the controller declares the concrete implementation of and access to the repositories (collections of aggregate roots).
In the data access world...
Using LINQ-To-SQL as an example, the DataContext would be the controller exposing a view of Customer and Order entities. The view is a non-declarative, framework-oriented Table type (rough equivalent to Repository). Note that the view keeps a reference to its parent controller, and often goes through the controller to control how/when the view gets materialized. Thus, the controller is your provider, taking care of mapping, translation, object hydration, etc. The model is then your data POCOs. Pretty much a typical MVC pattern.
Using N/Hibernate as an example, the ISession would be the controller exposing a view of Customer and Order entities by way of the session.Enumerable(string query) or session.Get(object id) or session.CreateCriteria(typeof(Customer)).List()
In the business logic world...
Customer { /*...*/ }
Employee { /*...*/ }
Repository<T> : IRepository<T>
, IEnumerable<T>
//, IQueryable<T>, IQueryProvider //optional
{ /**/ }
BusinessController {
Repository<Customer> Customers { get{ /*...*/ }} //aggregate root
Repository<Order> Orders { get{ /*...*/ }} // aggregate root
}
In a nutshell, let your business processes and transactions be the guide, and let your business infrastructure naturally evolve as processes/activities are implemented or refactored. Moreover, prefer composability over traditional black box design. When you get to service-oriented or cloud computing, you will be glad you did. :)

I was wondering what the conclusion would be?
'Violations' become a root entity. And 'violations' would be referenced by 'employee' root entity. ie violations repository <-> employee repository
But you are consfused about making violations a root entity becuase it has no behavior.
But is 'behaviour' a criteria to qualify as a root entity? I dont think so.

a slightly orthogonal question to test understanding here, going back to Order...OrderItem example, there might be an analytics module in the system that wants to look into OrderItems directly i.e get all orderItems for a particular product, or all order items greater than some given value etc, does having a lot of usecases like that and driving "aggregate root" to extreme could we argue that OrderItem is a different aggregate root in itself ??

It depends. Does any change/add/delete of a vioation change any part of employee - e.g. are you storing violation count, or violation count within past 3 years against employee?

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string