DDD: Effective modelling of aggregates and aggregate root creation

We are starting a new project and we are keen to apply DDD principles. The project is using dotnet core, with EF core providing the persistence to SQL Server.
Initial view of the domain
I will use an example of a task tracker to illustrate our issues and challenges as this would follow a similar structure.
In the beginning we understand the following:
We have a Project
Users can be associated to Projects
A Project has Workstreams
A Workstream has Tasks
Users can post Comments against a Task
A User is able to change the status of a Task (in progress, complete etc)
A Project, with associated Workstreams and Tasks, is initially created from a Template
The initial design was a large cluster aggregate with the Project as the aggregate root, holding a collection of ProjectUsers and Workstreams, each Workstream holding a collection of Tasks, and so on.
This approach was obviously going to lead to a number of contention and performance issues due to having to load the whole Project aggregate for any changes within that aggregate.
Rightly or wrongly our next revision was to break the Comments out of the aggregate and to form a new aggregate using Comment as a root. The motivation for this was that the business envisaged there being a significant number of Comments raised against each Task.
As each Comment is related to a Task, a Comment needs to hold a foreign key back to the Task. However this isn't possible following the principle that you can only reference another aggregate via its root. To overcome this we broke the Task out into another aggregate. This also seemed to satisfy the need that Tasks could be completed by different people and again would reduce contention.
We then faced the same problem with the reference from the Task to the Workstream the Task belongs to leading to us creating a new Workstream aggregate with the foreign key in the Task back to the Workstream.
The result is:
A Project aggregate which only contains a list of Users assigned to the project
A Workstream aggregate which contains a foreign key to the Project
A Task aggregate which contains a foreign key to the Workstream
A Comment aggregate which contains a foreign key back to the Task
The Project has a method to create a new instance of a Workstream, allowing us to set the foreign key. A slightly simplified version:
public class Project
{
    public int Id { get; private set; }
    public string Name { get; private set; }

    public Project(string name)
    {
        Name = name;
    }

    public Workstream CreateWorkstream(string name)
    {
        return new Workstream(name, Id);
    }

    // ... plus methods for managing user assignment to the project
}
In a similar way, Workstream has a method to create a Task:
public class Workstream
{
    public int Id { get; private set; }
    public string Name { get; private set; }
    public int ProjectId { get; private set; }

    private readonly List<Task> _activities = new List<Task>();
    public IEnumerable<Task> Activities => _activities.AsReadOnly();

    public Workstream(string name, int projectId)
    {
        Name = name;
        ProjectId = projectId;
    }

    public Task CreateTask(string name)
    {
        return new Task(name, Id);
    }
}
The Activities property has been added purely to support navigation when using the entities to build the read models.
The team are not comfortable with this approach; something doesn't feel right. The main concerns are:
it is felt that creating a project logically should be: create the project, add one or more workstreams to the project, add tasks to the workstreams, then let EF deal with persisting that object structure.
there is discomfort that the Project has to be created first, and that the developer needs to ensure it is persisted so it gets an Id, ready for when the method to create from the template is called, which depends on that Id for the foreign key. Is it okay to push the responsibility for this to a method in a domain service, CreateProjectFromTemplate(), to orchestrate the creation and persistence of the separate objects to each repository?
is the method to create the new Workstream even in the correct place?
the entities are used to form the queries (supported by the navigation properties) which are used to create the read models. Maybe the concern is that the object structure is being influenced by how we need to present data in a read-only view.
We are now at the point where we are just going around in circles and could really use some advice to give us some direction.

The team are not comfortable with this approach; something doesn't feel right.
That's a very good sign.
However this isn't possible following the principle that you can only reference another aggregate via its root.
You'll want to let go of this idea, it's getting in your way.
Short answer is that identifiers aren't references. Holding a copy of an identifier for another entity is fine.
Longer answer: DDD is based on the work of Eric Evans, who was describing a style that had worked for him on Java projects at the beginning of the millennium.
The pain he was struggling with is this: if the application is allowed object references to arbitrary data entities, then the behaviors of the domain end up getting scattered all over the code base. This increases the amount of work that you need to do to understand the domain, and it increases the cost of making (and testing!) changes.
The reaction was to introduce a discipline; isolate the data from the application, by restricting the application's access to a few carefully constrained gate keepers (the "aggregate root" objects). The application can hold object references to the root objects, and can send messages to those root objects, but the application cannot hold a reference to, or send a message directly to, the objects hidden behind the api of the aggregate.
Instead, the application sends a message to the root object, and the root object can then forward the message to other entities within its own aggregate.
Thus, if we want to send a message to a Task inside of some Project, we need some mechanism to know which project to load, so that we can send the message to the project to send a message to the Task.
Effectively, this means you need a function somewhere that can take a TaskId, and return the corresponding ProjectId.
The simplest way to do this is to simply store the two fields together:
{
taskId: 67890,
projectId: 12345
}
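Rendered as a C# sketch (the type and member names here are illustrative assumptions, not anything prescribed by the answer), that "function somewhere" can be as small as a lookup keyed by the pair shown above:
// Hypothetical sketch: hold identifiers, not object references, across aggregate boundaries.
public class TaskDirectoryEntry
{
    public int TaskId { get; set; }
    public int ProjectId { get; set; }   // copy of the owning Project's identifier
}

public interface ITaskDirectory
{
    // Given a TaskId, return the identifier of the Project aggregate that owns it.
    int GetProjectIdFor(int taskId);
}

// Usage (projectRepository and CompleteTask are assumed for illustration):
// var projectId = taskDirectory.GetProjectIdFor(taskId);
// var project = projectRepository.Get(projectId);
// project.CompleteTask(taskId);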
it is felt that creating a project logically should be: create the project, add one or more workstreams to the project, add tasks to the workstreams, then let EF deal with persisting that object structure.
Maybe the concern is that the object structure is being influenced by how we need to present data in a read-only view.
There's a sort of smell here, which is that you are describing the relations of a data structure. Aggregates aren't defined by their relations so much as by their changes.
Is it okay to push the responsibility for this to a method in a domain service CreateProjectFromTemplate
It's actually fairly normal to have a draft aggregate (which understands editing) that is separate from a Published aggregate (which understands use). Part of the point of domain driven design is to improve the business by noticing implicit boundaries between use cases and making them explicit.
You could use a domain service to create a project from a template, but in the common case, my guess is that you should do it "by hand" -- copy the state from the draft, and then use that state to create the project; it avoids confusion when a publish and an edit are happening concurrently.
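As an illustration only (the snapshot type, the repositories, and the id strategy below are my assumptions, not something prescribed by the answer), a publish step along those lines might look roughly like this:
// Hypothetical sketch: publishing copies state out of the editable template, so
// concurrent edits to the template cannot affect the project being created.
// Assumes Project.Id is available before persistence (e.g. a client-generated GUID
// rather than a database identity), so CreateWorkstream can set the foreign key
// before anything is saved.
var snapshot = template.TakeSnapshot();            // plain copy of workstream/task names
var project = new Project(snapshot.ProjectName);
projectRepository.Add(project);

foreach (var workstreamName in snapshot.WorkstreamNames)
{
    var workstream = project.CreateWorkstream(workstreamName);   // factory method from the question
    workstreamRepository.Add(workstream);                        // each aggregate has its own repository
}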

Here is a different perspective that might nudge you out of your deadlock.
I feel you are doing data modeling instead of real domain modeling. You are concerned with a relational model that will be directly persisted using ORM (EF) and less concerned with the actual problem domain. That is why you are concerned that the project will load too many things, or which objects will hold foreign keys to what.
An alternative approach would be to forget persistence for a moment and concentrate on what things might need what responsibilities. With responsibilities I don't mean technical things like save/load/search, but things that the domain defines. Like creating a task, completing a task, adding a comment, etc. This should give you an outline of things, like:
interface Task
{
    // ...
    void CompleteBy(User user);
    // ...
}

interface Project
{
    // ...
    Workstream CreateWorkstreamFrom(Template template);
    // ...
}
Also, don't concentrate too much on what is an Entity, Value Object, Aggregate Root. First, represent your business correctly in a way you and your colleagues are happy with. That is the important part. Try to talk to non-technical people about your model, see if the language you are using fits, whether you can have a conversation with it. You can decide later what objects are Entities or Value Objects, that part is purely technical and less important.
One other point: don't bind your model directly to an ORM. ORMs are blunt instruments that will probably force you into bad decisions. You can use an ORM inside your domain objects, but don't make them part of the ORM. This way you can do your domain the right way, and don't have to be afraid of loading too much for a specific function. You can do exactly the right things for all the business functions.
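One way to read that advice, sketched with invented types (this is only one possible shape, assuming EF Core as in the question): keep the domain class free of persistence concerns and let the repository translate to and from a separate persistence model.
// Hypothetical sketch: the repository maps between the domain object and a
// persistence-only record, so EF never dictates the shape of the domain model.
public class ProjectRecord                    // persistence model, shaped for EF
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public class EfProjectRepository              // infrastructure; the domain never sees EF types
{
    private readonly AppDbContext _db;        // assumed DbContext exposing DbSet<ProjectRecord> Projects

    public EfProjectRepository(AppDbContext db)
    {
        _db = db;
    }

    public Project Get(int id)
    {
        var record = _db.Projects.Find(id);
        return Project.Rehydrate(record.Id, record.Name);   // hypothetical factory on the domain class
    }

    public void Add(Project project)
    {
        _db.Projects.Add(new ProjectRecord { Name = project.Name });
    }
}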

Related

Repository Add and Create methods

Why is a repository's .Add method usually implemented as accepting the instance of the entity to add, with the .Id already "set" (although it can be set again via reflection), when setting it should be the repo's responsibility?
Wouldn't it be better to implement it as .CreateAndAdd?
For example, given a Person entity:
public class Person
{
    public Person(uint id, string name)
    {
        this.Id = id;
        this.Name = name;
    }

    public uint Id { get; }
    public string Name { get; }
}
why are repositories usually implemented as:
public interface IRepository<T>
{
    Task<T> AddAsync(T entity);
}
and not as:
public interface IPersonsRepository
{
    Task<Person> CreateAndAddAsync(string name);
}
why are repositories usually implemented as...?
A few reasons.
Historically, domain-driven-design is heavily influenced by the Eric Evans book that introduced the term. There, Evans proposed that repositories provide collection semantics, providing "the illusion of an in memory collection".
Adding a String, or even a Name, to a collection of Person doesn't make very much sense.
More broadly, figuring out how to reconstitute an entity from a set of a parameters is a separate responsibility from storage, so perhaps it doesn't make sense to go there (note: a repository often ends up with the responsibility of reconstituting an entity from some stored memento, so it isn't completely foreign, but there's usually an extra abstraction, the "Factory", that really does the work.)
Using a generic repository interface often makes sense, as interacting with individual elements of the collection via retrieve/store operations shouldn't require a lot of custom crafting. Repositories can support custom queries for different kinds of entities, so it can be useful to call that out specifically
public interface IPersonRepository : IRepository<Person> {
// Person specific queries go here
}
Finally, the id... and the truth of it is that identity, as a concept, has a whole lot of "it depends" baked into it. In some cases, it may make sense for the repository to assign an id to an entity -- for instance, using a unique key generated by the database. Often, you'll instead want to have control of the identifier outside of the repository. Horses for courses.
There already is a great answer on the question, I just want to add some of my thoughts. (It will contain some duplication from the previous answer, so if this is a bad thing just let me know and I'll remove it :) ).
The Responsibility of ID generation can belong to different parts of an organization or a system.
Sometimes the ID will be generated by some special rules, like a Social Security Number. This number can be used as the ID of a Person in a system, so before creating a Person entity this number will have to be generated by a specific SSNGenerator Service.
We can use a random generated ID like a UUID. UUIDs can be generated outside of the Repository and assigned to the entity during creation and the Repository will only store it (add, save) it to the DB.
IDs generated by databases are very interesting. You can have sequential IDs like in an RDBMS, UUID-ish ones like in MongoDB, or some hash. In this case the responsibility of ID generation is assigned to the DB, so it can happen only after the Entity is stored, not when it's created. (I'm allowing myself some freedom here as you can generate it before saving a transaction or read the last one etc., but I like to generalize and avoid discussing cases with race conditions and collisions.) This means that your Entity doesn't have an identity before the save completes. Is this a good thing? Of course it depends :)
This problem is a great example of leaky abstractions.
When you implement a solution, sometimes the technology used will affect it. You will have to deal with the fact that, for example, the ID is generated by your database, which is part of your Infrastructure (if you have defined such a layer in your code). You can also avoid this by using a UUID even if you use an RDBMS, but then you have to join (again, technology-specific stuff :) ) on these IDs, so sometimes people like to use the default.
Instead of having Add or CreateAndAdd you can have a Save method that does the same thing; it's just a different term that some people prefer. The repository is indeed often defined as an "in-memory collection", but that doesn't mean we have to stick to it strictly (it can be a good thing to do that most of the time, but still...).
As mentioned, if your database generates IDs, the Repository seems like a good candidate to assign them (before or after storing) because it is the one talking to the DB.
If you are using events, the way you generate IDs can affect things. For example, let's say you want to have a UserRegisteredEvent with the UserID as a property. If you are using the DB to generate IDs you will have to store the User first and then create and store/dispatch the event, or do something of the sort. On the other hand, if you generate the ID beforehand you can save the event and the entity together (in a transaction or in the same document, it doesn't matter). Sometimes this can get tricky.
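A small sketch of that trade-off (the type names are invented for illustration): when the identifier is generated up front, the entity and its event can be built together and persisted in one transaction.
// Hypothetical sketch: a client-generated id means the event can be created
// at the same time as the entity, before anything touches the database.
var userId = Guid.NewGuid();                         // id exists before the save
var user = new User(userId, "alice@example.com");
var registered = new UserRegisteredEvent(userId);    // store/dispatch together with the user

// With a database-generated id the order is forced instead:
// 1. save the User, 2. read back the generated id, 3. only then build and publish the event.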
Background, experience with technologies and framework, exposure to terminology in literature, school and work affects how we think about things and what terminology sounds better to us. Also we (most of the time) work in teams and this can affect how we name things and how implement them.
Using Martin Fowler's definition:
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection. Client objects construct query specifications declaratively and submit them to Repository for satisfaction. Objects can be added to and removed from the Repository, as they can from a simple collection of objects, and the mapping code encapsulated by the Repository will carry out the appropriate operations behind the scenes. Conceptually, a Repository encapsulates the set of objects persisted in a data store and the operations performed over them, providing a more object-oriented view of the persistence layer.
A Repository gives an Object Oriented view of the underlying Data (which may be otherwise stored in a relational DB). It's responsible for mapping your Table to your Entity.
Generating an ID for an object is whole different responsibility, which is not trivial and can get quite complex. You may decide to generate the ID in the DB or a separate service. Regardless of where the ID is generated, a Repository should seamlessly map it between your Entity and Table.
ID generation is a responsibility of its own, and if you add it to the Repository, then you are moving away from Single Responsibility Principle.
A side note here: using a GUID for an ID is a terrible idea, because GUIDs are not sequential. They only meet the uniqueness requirement of an ID, but they are not helpful for searching through the database index.

DDD Factory Responsibility

I have the following code:
public class CountryFactory : IEntityFactory
{
    private readonly IRepository<Country> countryRepository;

    public CountryFactory(IRepository<Country> countryRepository)
    {
        this.countryRepository = countryRepository;
    }

    public Country CreateCountry(string name)
    {
        if (countryRepository.FindAll().Any(c => c.Name == name))
        {
            throw new ArgumentException("There is already a country with that name!");
        }
        return new Country(name);
    }
}
From a DDD approach, is this the correct way to create a Country? Or is it better to have a CountryService which checks whether or not a country exists and, if it does not, just calls the factory to return a new entity? This would then mean that the service is responsible for persisting the Entity rather than the Factory.
I'm a bit confused as to where the responsibility should lie, especially if more complex entities need to be created which are not as simple as creating a country.
In DDD, factories are used to encapsulate complex object and aggregate creation. Usually, factories are not implemented as separate classes but rather as static methods on the aggregate root class that return the new aggregate.
Factory methods are better suited than constructors, since you might need technical constructors for serialization purposes, and var x = new Country(name) has very little meaning inside your Ubiquitous Language. What does it mean? Why do you need a name when you create a country? Do you really create countries, how often do new countries appear, do you even need to model this process? All these questions arise if you start thinking about your model and ubiquitous language beyond the tactical patterns.
Factories must return valid objects (i.e. aggregates), checking all invariants inside the factory, but not outside of it. A factory might receive services and repositories as parameters, but this is also not very common. Normally, you have an application service or command handler that does some validations and then creates a new aggregate using the factory method and adds it to the repository.
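A rough sketch of that shape, with invented names and reusing the IRepository<Country> from the question (this is one possible layering, not the definitive one):
// Hypothetical sketch: the application layer validates, the aggregate's factory
// method creates, and the repository stores.
public class CreateCountryHandler
{
    private readonly IRepository<Country> countryRepository;

    public CreateCountryHandler(IRepository<Country> countryRepository)
    {
        this.countryRepository = countryRepository;
    }

    public void Handle(CreateCountryCommand command)
    {
        if (countryRepository.FindAll().Any(c => c.Name == command.Name))
        {
            throw new ArgumentException("There is already a country with that name!");
        }

        var country = Country.Create(command.Name);   // hypothetical factory method on the aggregate root
        countryRepository.Add(country);               // assumed Add method on the repository
    }
}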
There is also a good answer by Lev Gorodinski here Factory Pattern where should this live in DDD?
Besides, implementation of Factories is extensively described in Chapter 11 of the Red Book.
Injecting a Repository into a Factory is OK, but it shouldn't be your first concern. The starting point should be: what kind of consistency does your business domain require?
By checking Country name uniqueness in CountryFactory which is part of your Domain layer, you give yourself the impression that the countries will always be consistent. But the only aggregate is Country and since there is no AllCountries aggregate to act as a consistency boundary, respect of this invariant will not be guaranteed. Somebody could always sneak in a new Country that has exactly the same name as the one being added, just after you checked it. What you could do is wrap the CreateCountry operation into a transaction that would lock the entire set of Countries (and thus the entire table if you use an RDBMS) but this would hurt concurrency.
There are other options to consider.
Why not leverage a database unique constraint to enforce the Country name invariant? As a complement, you could also have another checkpoint at the UI level to warn the user that the country name they typed in is already taken. This would necessitate another "query" service that just calls CountryRepository.GetByName(), but where the returned Countries are not expected to be modified.
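For example, assuming EF Core and SQL Server as in the first question (the mapping below is illustrative, not from the answer), the database carries the real guarantee while the UI check is only a courtesy:
// Hypothetical sketch, configured inside the DbContext:
protected override void OnModelCreating(ModelBuilder modelBuilder)
{
    modelBuilder.Entity<Country>()
        .HasIndex(c => c.Name)
        .IsUnique();              // the invariant is enforced here, even under concurrency
}

// Read-side courtesy check before submitting the form (not authoritative):
// bool nameTaken = countryRepository.GetByName(name) != null;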
Soon you'll be realizing that there are really two kinds of models - ones that can give you some domain data at a given moment in time so that you can display it on a user interface, and ones that expose operations (AddCountry) and will guarantee that domain invariants always hold. This is a first step towards CQRS.
What is the frequency of Countries being added or modified? If it is that high, do we really need a Country name to be unique at all times? Wouldn't it solve a lot of problems if we loosened up the constraints and allowed a user to temporarily create a duplicate Country name? A mechanism could detect the duplicates later on and take a compensating action, putting the newly added Country on hold and reaching out to the user to ask them to change the name. A.k.a. eventual consistency instead of immediate consistency.
Does Country need to be an Aggregate? What would be the cost if it was a Value Object and duplicated in each entity where it is used?

Should a domain model keep itself consistent using events?

I am working on an application where we try to use a Domain Model. The idea is to keep the business logic inside the objects in the Domain Model. Now a lot is done by objects subscribing to related objects to react to changes in them. This is done through PropertyChanged and CollectionChanged. This works OK, except in the following cases:
Complex actions: where a lot of changes should be handled as a group (and not as individual property/collection changes). Should I / how can I 'build' transactions?
Persistence: I use NHibernate for persistence and this also uses the public property setters of classes. When NHibernate hits the property, a lot of business logic is run (which seems unnecessary). Should I use custom setters for NHibernate?
Overall it seems that pushing all the logic into the domain model makes the domain model rather complex. Any ideas?
Here's a 'sample' problem (sorry for the crappy tooling I use):
You can see that the Project is my container, and the objects below it react to each other by subscribing. Now changes to Network are done via NetworkEditor, but this editor has no knowledge of NetworkData. This data might even be defined in another assembly sometimes. The flow goes from user -> NetworkEditor -> Network -> NetworkData and on to all the other objects interested. This does not seem to scale.
I fear that the combination of DDD and PropertyChanged/CollectionChanged events might not be the best idea. The problem is that if you base your logic around these events it is extremely hard to manage the complexity, as one PropertyChanged leads to another and another and soon enough you lose control.
Another reason why PropertyChanged events and DDD don't exactly fit is that in DDD every business operation should be as explicit as possible. Keep in mind that DDD is supposed to bring technical stuff into the world of business, not the other way around. And basing things on PropertyChanged/CollectionChanged doesn't seem very explicit.
In DDD the main goal is to keep consistency inside the aggregate; in other words, you need to model the aggregate in such a way that whatever operation you invoke, the aggregate is valid and consistent (if the operation succeeds, of course).
If you build your model right, there's no need to worry about 'building' transactions - an operation on an aggregate should be a transaction itself.
I don't know what your model looks like, but you might consider moving the responsibilities one level 'up' in the aggregate tree, quite possibly adding additional logical entities in the process, instead of relying on the PropertyChanged events.
Example:
Let's assume you have a collection of payments with statuses, and whenever a payment changes you want to recalculate the total balance of customer orders. Instead of subscribing to changes on the payments collection and calling a method on the customer when the collection changes, you might do something like this:
public class CustomerOrder
{
    public List<Payment> Payments { get; }
    public Balance BalanceForOrder { get; }

    public void SetPaymentAsReceived(Guid paymentId)
    {
        Payments.First(p => p.PaymentId == paymentId).Status = PaymentStatus.Received;
        RecalculateBalance();
    }
}
You might have noticed, that we recalculate the balance of single order and not the balance of entire customer - and in most cases that's ok as customer belongs to another aggregate and its balance can be simply queried when needed. That is exactly the part that shows this 'consistency only within aggregate' thingy - we don't care about any other aggregate at this point, we only deal with single order. If that's not ok for requirements, then the domain is modeled incorrectly.
My point is, that in DDD there's no single good model for every scenario - you have to understand how the business works to be successful.
If you take a look at the example above, you'll see that there is no need to 'build' the transaction - the entire transaction is located in the SetPaymentAsReceived method. In most cases, one user action should lead to one particular method on an entity within the aggregate - this method explicitly relates to a business operation (of course this method may call other methods).
As for events in DDD, there is a concept of Domain Events, however these are not directly related with PropertyChanged/CollectionChanged technical events. Domain Events indicate the business operations (transactions) that have been completed by an aggregate.
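To contrast with PropertyChanged, here is a minimal sketch of what such a Domain Event could look like (the event type and the recording mechanism are illustrative assumptions, not part of the answer):
// Hypothetical sketch: the aggregate records an explicit business fact instead
// of leaking fine-grained property change notifications.
public class PaymentReceived
{
    public PaymentReceived(Guid orderId, Guid paymentId)
    {
        OrderId = orderId;
        PaymentId = paymentId;
    }

    public Guid OrderId { get; }
    public Guid PaymentId { get; }
}

public class CustomerOrder
{
    public Guid Id { get; private set; }

    private readonly List<object> _domainEvents = new List<object>();
    public IReadOnlyList<object> DomainEvents => _domainEvents;

    public void SetPaymentAsReceived(Guid paymentId)
    {
        // ... mutate state as in the example above ...
        _domainEvents.Add(new PaymentReceived(Id, paymentId));   // published after the transaction commits
    }
}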
Overall it seems that pushing all the logic into the domain model makes the domain model rather complex
Of course it does as it is supposed to be used for scenarios with complex business logic. However if the domain is modeled correctly then it is easy to manage and control this complexity and that's one of the advantages of DDD.
Added after providing example:
OK, and what about creating an aggregate root called Project - when you build the aggregate root from the Repository, you can fill it with NetworkData, and the operation might look like this:
public class Project
{
    protected List<Network> networks;
    protected List<NetworkData> networkDatas;

    public void Mutate(string someKindOfNetworkId, object someParam)
    {
        var network = networks.First(n => n.Id == someKindOfNetworkId);
        var someResult = network.DoSomething(someParam);
        networkDatas.Where(d => d.NetworkId == someKindOfNetworkId)
            .ToList()
            .ForEach(d => d.DoSomething(someResult, someParam));
    }
}
NetworkEditor would not operate on Network directly, rather through Project using NetworkId.

Simple aggregate root and repository

I'm one of many trying to understand the concept of aggregate roots, and I think that I've got it!
However, when I started modeling this sample project, I quickly ran into a dilemma.
I have the two entities ProcessType and Process. A Process cannot exist without a ProcessType, and a ProcessType has many Processes. So a process holds a reference to a type, and cannot exist without it.
So should ProcessType be an aggregate root? New processes would be created by calling processType.AddProcess(new Process());
However, I have other entities that only hold a reference to the Process, and access its type through Process.Type. In this case it makes no sense going through ProcessType first.
But AFAIK entities outside the aggregate are only allowed to hold references to the root of the aggregate, and not entities inside the aggregate. So do I have two aggregates here, each with their own repository?
I largely agree with what Sisyphus has said, particularly the bit about not constricting yourself to the 'rules' of DDD that may lead to a pretty illogical solution.
In terms of your problem, I have come across the situation many times, and I would term 'ProcessType' as a lookup. Lookups are objects that 'define', and have no references to other entities; in DDD terminology, they are value objects. Other examples of what I would term a lookup may be a team member's 'RoleType', which could be a tester, developer, project manager for example. Even a person's 'Title' I would define as a lookup - Mr, Miss, Mrs, Dr.
I would model your process aggregate as:
public class Process
{
    public ProcessType ProcessType { get; }
}
As you say, these types of objects typically need to populate dropdowns in the UI and therefore need their own data access mechanism. However, I have personally NOT created 'repositories' as such for them, but rather a 'LookupService'. For me this retains the elegance of DDD by keeping 'repositories' strictly for aggregate roots.
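The contract for such a lookup service might be no more than the following sketch (my own guess at its shape; note that ILookup<T> here is the answerer's own single-parameter abstraction, not the two-parameter System.Linq.ILookup, and the Guid id type is assumed):
// Hypothetical sketch of the lookup contract used in the examples below.
public interface ILookupService
{
    // Fetch a single lookup item by its identifier (matches the command handler usage).
    T GetLookupItem<T>(Guid id) where T : class;

    // Fetch the full list of a lookup type, e.g. to populate a dropdown (matches the client usage).
    ILookup<T> GetLookup<T>() where T : class;
}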
Here is an example of a command handler on my app server and how I have implemented this:
Team Member Aggregate:
public class TeamMember : Person
{
    // backing fields (populated by the factory/constructor, omitted here)
    private Guid _teamMemberID;
    private TeamMemberRoleType _roleType;
    private List<AvailabilityPeriod> _availability;

    public Guid TeamMemberID
    {
        get { return _teamMemberID; }
    }

    public TeamMemberRoleType RoleType
    {
        get { return _roleType; }
    }

    public IEnumerable<AvailabilityPeriod> Availability
    {
        get { return _availability.AsReadOnly(); }
    }
}
Command Handler:
public void CreateTeamMember(CreateTeamMemberCommand command)
{
    TeamMemberRoleType role = _lookupService.GetLookupItem<TeamMemberRoleType>(command.RoleTypeID);

    TeamMember member = TeamMemberFactory.CreateTeamMember(command.TeamMemberID,
                                                           role,
                                                           command.DateOfBirth,
                                                           command.FirstName,
                                                           command.Surname);

    using (IUnitOfWork unitOfWork = UnitOfWorkFactory.CreateUnitOfWork())
        _teamMemberRepository.Save(member);
}
The client can also make use of the LookupService to populate dropdowns etc.:
ILookup<TeamMemberRoleType> roles = _lookupService.GetLookup<TeamMemberRoleType>();
Not so simple. ProcessType is most likely a knowledge-layer object - it defines a certain process. Process, on the other hand, is an instance of a process defined by that ProcessType. You probably really don't need or want the bidirectional relationship. Process is probably not a logical child of a ProcessType. They typically belong to something else, like a Product, or Factory, or Sequence.
Also by definition when you delete an aggregate root you delete all members of the aggregate. When you delete a Process I seriously doubt you really want to delete ProcessType. If you deleted ProcessType you might want to delete all Processes of that type, but that relationship is already not ideal and chances are you will not be deleting definition objects ever as soon as you have a historical Process that is defined by ProcessType.
I would remove the Processes collection from ProcessType and find a more suitable parent if one exists. I would keep the ProcessType as a member of Process since it probably defines the Process. Operational-layer (Process) and knowledge-layer (ProcessType) objects rarely work as a single aggregate, so I would have either Process be an aggregate root or possibly find an aggregate root that is a parent for Process. Then ProcessType would be an external class. Process.Type is most likely redundant since you already have Process.ProcessType. Just get rid of that.
I have a similar model for healthcare. There is Procedure (Operational layer) and ProcedureType (knowledge layer). ProcedureType is a standalone class. Procedure is a child of a third object Encounter. Encounter is the aggregate root for Procedure. Procedure has a reference to ProcedureType but it is one way. ProcedureType is a definition object it does not contain a Procedures collection.
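Sketched in C# (my own rendering of that description, not the answerer's actual code):
// Hypothetical sketch of the healthcare example: Encounter is the aggregate root,
// Procedure lives inside it, and ProcedureType is a standalone definition object.
public class ProcedureType               // knowledge layer; no collection of Procedures
{
    public int Id { get; private set; }
    public string Name { get; private set; }
}

public class Procedure                   // operational layer; one-way reference to its definition
{
    public int Id { get; private set; }
    public ProcedureType Type { get; private set; }
}

public class Encounter                   // aggregate root for Procedure
{
    private readonly List<Procedure> _procedures = new List<Procedure>();
    public IEnumerable<Procedure> Procedures => _procedures.AsReadOnly();
}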
EDIT (because comments are so limited)
One thing to keep in mind through all of this. Many are DDD purists and adamant about rules. However if you read Evans carefully he constantly raises the possibility that tradeoffs are often required. He also goes to pretty great lengths to characterize logical and carefully thought out design decisions versus things like teams that do not understand the objectives or circumvent things like aggregates for the sake of convenience.
The important thing is to understand and apply the concepts as opposed to the rules. I see many DDD practitioners shoehorn an application into illogical and confusing aggregates etc. for no other reason than that a literal rule about repositories or traversal is being applied. That is not the intent of DDD, but it is often the product of the overly dogmatic approach many take.
So what are the key concepts here:
Aggregates provide a means to make a complex system more manageable by reducing the behaviors of many objects into higher level behaviors of the key players.
Aggregates provide a means to ensure that objects are created in a logical and always valid condition that also preserves a logical unit of work across updates and deletes.
Let's consider the last point. In many conventional applications someone creates a set of objects that are not fully populated because they only need to update or use a few properties. The next developer comes along and he needs these objects too, and someone has already made a set somewhere in the neighborhood for a different purpose. Now this developer decides to just use those, but he then discovers they don't have all the properties he needs. So he adds another query and fills out a few more properties. Eventually the team stops adhering to OOP, taking the common attitude that OOP is "inefficient and impractical for the real world and causes performance issues such as creating full objects to update a single property". What they end up with is an application full of embedded SQL code and objects that essentially randomly materialize anywhere. Even worse, these objects are bastardized, invalid proxies. A Process appears to be a Process but it is not; it is partially populated in different ways at any given point depending on what was needed. You end up with a ball of mud of numerous queries that continuously partially populate objects to varying degrees, and often a lot of extraneous crap like null checks that should not exist but are required because the object is never truly valid.
Aggregate rules prevent this by ensuring objects are created only at certain logical points and always with a full set of valid relationships and conditions. So now that we fully understand what aggregate rules are for and what they protect us from, we also want to make sure we do not misuse these rules and create strange aggregates that do not reflect what our application is really about, simply because these aggregate rules exist and must be followed at all times.
So when Evans says create Repositories only for aggregates he is saying create aggregates in a valid state and keep them that way instead of bypassing the aggregate for internal objects directly. You have a Process as a root aggregate so you create a repository. ProcessType is not part of that aggregate. What do you do? Well if an object is by itself and it is an entity, it is an aggregate of 1. You create a repository for it.
Now the purist will come along and say you should not have that repository because ProcessType is a value object, not an entity. Therefore ProcessType is not an aggregate at all, and therefore you do not create a repository for it. So what do you do? What you don't do is shoehorn ProcessType into some kind of artificial model for no other reason than that you need to get it, so you need a repository, but to have a repository you have to have an entity as an aggregate root. What you do is carefully consider the concepts. If someone tells you that the repository is wrong, but you know that you need it and that, whatever they may call it, your repository is valid and preserves the key concepts, you keep the repository as is instead of warping your model to satisfy dogma.
Now in this case, assuming I am correct about what ProcessType is, as the other commenter noted it is in fact a Value Object. You say it cannot be a Value Object. That could be for several reasons. Maybe you say that because you use NHibernate, for example, but the NHibernate model for implementing value objects in the same table as another object does not work. So your ProcessType requires an identity column and field. Often, because of database considerations, the only practical implementation is to have value objects with ids in their own table. Or maybe you say that because each Process points to a single ProcessType by reference.
It does not matter. It is a Value Object because of the concept. If you have 10 Process objects that are of the same ProcessType, you have 10 Process.ProcessType members and values. Whether each Process.ProcessType points to a single reference, or each got a copy, they should still by definition all be exactly the same thing and all be completely interchangeable with any of the other 10. THAT is what makes it a Value Object. The person who says "It has an Id, therefore it cannot be a Value Object, you have an entity" is making a dogmatic error. Don't make the same error: if you need an ID field give it one, but don't say "it can't be a Value Object" when it in fact is, albeit one that for other reasons you had to give an Id to.
So how do you get this one right and wrong? ProcessType is a Value Object, but for some reason you need it to have an Id. The Id per se does not violate the rules. You get it right by having 10 Processes that all have a ProcessType that is exactly the same. Maybe each has a local deep copy, maybe they all point to one object, but each is identical either way, ergo each has an Id = 2, for example. You get it wrong when you do this: 10 Processes each have a ProcessType, and this ProcessType is identical and completely interchangeable EXCEPT now each also has its own unique Id as well. Now you have 10 instances of the same thing but they vary only in Id, and will always vary only in Id. Now you no longer have a Value Object, not because you gave it an Id, but because you gave it an Id with an implementation that reflects the nature of an entity - each instance is unique and different.
Make sense?
Look, I think you have to restructure your model. Use ProcessType as a Value Object and Process as the aggregate root.
This way every Process has a ProcessType:
public class Process
{
    public Process(ProcessType processType)
    {
        ProcessType = processType;
    }

    public ProcessType ProcessType { get; }
}
For this you just need one aggregate root, not two.

Please clarify how create/update happens against child entities of an aggregate root

After much reading and thinking as I begin to get my head wrapped around DDD, I am a bit confused about the best practices for dealing with complex hierarchies under an aggregate root. I think this is a FAQ but after reading countless examples and discussions, no one is quite talking about the issue I'm seeing.
If I am aligned with the DDD thinking, entities below the aggregate root should be immutable. This is the crux of my trouble, so if that isn't correct, that is why I'm lost.
Here is a fabricated example...hope it holds enough water to discuss.
Consider an automobile insurance policy (I'm not in insurance, but this matches the language I hear when on the phone w/ my insurance company).
Policy is clearly an entity. Within the policy, let's say we have Auto. Auto, for the sake of this example, only exists within a policy (maybe you could transfer an Auto to another policy, so there is potential for it to be an aggregate as well, which changes Policy...but assume it's simpler than that for now). Since an Auto cannot exist without a Policy, I think it should be an Entity but not a root. So Policy in this case is an aggregate root.
Now, to create a Policy, let's assume it has to have at least one auto. This is where I get frustrated. Assume Auto is fairly complex, including many fields and maybe a child for where it is garaged (a Location). If I understand correctly, a "create Policy" constructor/factory would have to take as input an Auto or be restricted via a builder to not be created without this Auto. And the Auto's creation, since it is an entity, can't be done beforehand (because it is immutable? maybe this is just an incorrect interpretation). So you don't get to say new Auto and then setX, setY, add(Z).
If Auto is more than somewhat trivial, you end up having to build a huge hierarchy of builders and such to try to manage creating an Auto within the context of the Policy.
One more twist to this is later, after the Policy is created and one wishes to add another Auto...or update an existing Auto. Clearly, the Policy controls this...fine...but Policy.addAuto() won't quite fly because one can't just pass in a new Auto (right!?). Examples say things like Policy.addAuto(VIN, make, model, etc.) but are all so simple that that looks reasonable. But if this factory method approach falls apart with too many parameters (the entire Auto interface, conceivably) I need a solution.
From that point in my thinking, I'm realizing that having a transient reference to an entity is OK. So, maybe it is fine to have an entity created outside of its parent within the aggregate in a transient environment, so maybe it is OK to say something like:
auto = AutoFactory.createAuto();
auto.setX
auto.setY
or if sticking to immutability, AutoBuilder.new().setX().setY().build()
and then have it get sorted out when you say Policy.addAuto(auto)
This insurance example gets more interesting if you add Events, such as an Accident with its PolicyReports or RepairEstimates...some value objects but most entities that are all really meaningless outside the policy...at least for my simple example.
The lifecycle of Policy with its growing hierarchy over time seems the fundamental picture I must draw before really starting to dig in...and it is more the factory concept or how the child entities get built/attached to an aggregate root that I haven't seen a solid example of.
I think I'm close. Hope this is clear and not just a repeat FAQ that has answers all over the place.
Aggregate Roots exist for the purpose of transactional consistency.
Technically, all you have are Value Objects and Entities.
The difference between the two is immutability and identity.
A Value Object should be immutable and its identity is the sum of its data.
class Money // A value object
{
    string Currency;
    long Value;
}
Two Money objects are equal if they have equal Currency and equal Value. Therefore, you could swap one for the other and conceptually, it would be as if you had the same Money.
An Entity is an object with mutability over time, but whose identity is immutable throughout its lifetime.
class Person // An entity
{
    PersonId Id; // An immutable Value Object storing the Person's unique identity
    string Name;
    string Email;
    int Age;
}
So when and why do you have Aggregate Roots?
Aggregate Roots are specialized Entities whose job is to group a set of domain concepts under one transactional scope for purpose of data change only. That is, say a Person has Legs. You would need to ask yourself, should changes on Legs and changes on Person be grouped together under a single transaction? Or can I change one separately from the other?
class Person // An entity
{
    PersonId Id;
    string Name;
    string Ethnicity;
    int Age;
    Pair<Leg> Legs;
}

class Leg // An entity
{
    LegId Id;
    string Color;
    HairAmount HairAmount; // none, low, medium, high, chewbacca
    int Length;
    int Strength;
}
If Leg can be changed by itself, and Person can be changed by itself, then they both are Aggregate Roots. If Leg cannot be changed alone, and Person must always be involved in the transaction, then Leg should be composed inside the Person entity. At which point, you would have to go through Person to change Leg.
This decision will depend on the domain you are modeling:
Maybe the Person is the sole authority on his legs; they grow longer and stronger based on his age, the color changes according to his ethnicity, etc. These are invariants, and Person will be responsible for making sure they are maintained. If someone else wants to change this Person's legs, say you want to shave his legs, you'd have to ask him to either shave them himself, or hand them to you temporarily for you to shave (a sketch of this style follows after the two scenarios).
Or you might be in the domain of archeology. Here you find Legs, and you can manipulate the Legs independently. At some point, you might find a complete body and guess who this person was historically, now you have a Person, but the Person has no say in what you'll do with the Legs you found, even if it was shown to be his Legs. The color of the Leg changes based on how much restoration you've applied to it, or other things. These invariants would be maintained by another Entity, this time it won't be Person, but maybe Archaeologist instead.
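In the first scenario, where Person owns the invariants, the shape might be sketched like this (the method names, and the Left/Right members on the Pair<T> type from the example above, are my assumptions):
// Hypothetical sketch: Legs are only ever changed by going through the Person aggregate root.
public class Person
{
    private Pair<Leg> _legs;            // Pair<T> as in the example above

    public void Shave()
    {
        // Person decides how (and whether) his legs change; callers never touch Leg directly.
        _legs.Left.RemoveHair();        // RemoveHair is an assumed behaviour on Leg
        _legs.Right.RemoveHair();
    }
}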
TO ANSWER YOUR QUESTION:
I keep hearing you talk about Auto, so that's obviously an important concept of your domain. Is it an entity or a value object? Does it matter if the Auto is the one with serial #XYZ, or are you only interested in brand, colour, year, model, make, etc.? Say you care about the exact identity of the Auto and not just its features; then it would need to be an Entity of your domain. Now, you talk about Policy; a policy dictates what is covered and not covered on an Auto. This depends on the Auto itself, and probably the Customer too, since based on his driving history, the type and year and what not of Auto he has, his Policy might be different.
So I can already conceive having:
class Auto : Entity, IAggregateRoot
{
    AutoId Id;
    string Serial;
    int Year;
    Colour Colour;
    string Model;
    bool IsAtGarage;
    Garage Garage;
}

class Customer : Entity, IAggregateRoot
{
    CustomerId Id;
    string Name;
    DateTime DateOfBirth;
}

class Policy : Entity, IAggregateRoot
{
    string Id;
    CustomerId customer;
    AutoId[] autos;
}

class Garage : IValueObject
{
    string Name;
    string Address;
    string PhoneNumber;
}
Now, the way you make it sound, you can change a Policy without having to change an Auto and a Customer together. You say things like: what if the Auto is at the garage, or we transfer an Auto from one Policy to another. This makes me feel like Auto is its own Aggregate Root, and so is Policy and so is Customer. Why is that? Because it sounds like it is the usage of your domain that you would change an Auto's garage without caring that the Policy be changed with it. That is, if someone changes an Auto's Garage and IsAtGarage state, you don't care not to change the Policy. I'm not sure if I'm being clear: you wouldn't want to change the Customer's Name and DateOfBirth in a non-transactional way, because maybe you change his name, but it fails to change the Date and now you have a corrupt customer whose Date of Birth doesn't match his name. On the other hand, it's fine to change the Auto without changing the Policy. Because of this, Auto should not be in the aggregate of Policy. Effectively, Auto is not a part of Policy, but only something that the Policy keeps track of and might use.
Now we see that it then totally makes sense that you are able to create an Auto on its own, as it is an Aggregate Root. Similarly, you can create Customers by themselves. And when you create a Policy, you simply must link it to a corresponding Customer and his Autos.
aCustomer = Customer.Make(...);
anAuto = Auto.Make(...);
anotherAuto = Auto.Make(...);
aPolicy = Policy.Make(aCustomer, { anAuto, anotherAuto }, ...);
Now, in my example, Garage isn't an Aggregate Root. This is because it doesn't seem to be something that the domain directly works with. It is always used through an Auto. This makes sense: insurance companies don't own garages, they don't work in the business of garages. You wouldn't ever need to create a Garage that existed on its own. It's easy then to have an anAuto.SentToGarage(name, address, phoneNumber) method on Auto which creates a Garage and assigns it to the Auto. You wouldn't delete a Garage on its own. You would do anAuto.LeftGarage() instead.
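Those two operations might be sketched like this as methods on the Auto aggregate (my rendering of the description above, assuming a suitable constructor on the Garage value object):
// Hypothetical sketch: the Garage value object only appears and disappears
// through behaviour on the Auto aggregate root.
public void SentToGarage(string name, string address, string phoneNumber)
{
    Garage = new Garage(name, address, phoneNumber);
    IsAtGarage = true;
}

public void LeftGarage()
{
    Garage = null;
    IsAtGarage = false;
}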
entities below the aggregate root should be immutable.
No. Value objects are supposed to be immutable. Entities can change their state.
Just need to make sure You do proper encapsulation:
entities modifies themselves
entities are modified through aggregate root only
but Policy.addAuto() won't quite fly because one can't just pass in a new Auto (right!?)
Usually it's supposed to be so. The problem is that the auto creation task might become way too large. If You are lucky and, knowing that entities can be modified, are able to divide it smoothly into smaller tasks like SpecifyEngine, the problem is resolved.
However, "real world" does not work that way and I feel Your pain.
I got a case where the user uploads 18 Excel sheets' worth of data (with an additional fancy rule - it should be "imported" no matter how invalid the data are (as I say - that's like saying true==false)). This upload process is considered one atomic operation.
What I do in this case...
First of all - I have an Excel document object model, mappings (e.g. Customer.Name == 1st sheet, "C24") and readers that fill the DOM. Those things live in the infrastructure, far far away.
Next thing - entities and value objects in my domain that look similar to the DOM DTOs, but only the projection that I'm interested in, with proper data types and corresponding validation. I also have a 1:1 association in my domain model that isolates the dirty mess (luckily enough, it kind of fits with the ubiquitous language).
Armed with that - there's still one tricky part left - mapping from the Excel DOM DTOs to domain objects. This is where I sacrifice encapsulation - I construct the entity with its value objects from the outside. My thought process is kind of simple - this overexposed entity can't be persisted anyway and validity can still be enforced (through constructors). It lives underneath the aggregate root.
Basically - this is the part where You can't run away from CRUDiness.
Sometimes the application is just editing a bunch of data.
P.s. I'm not sure that I'm doing right thing. It's likely I've missed something important on this issue. Hopefully there will be some insight from other answerers.
Part of my answer seems to be captured in these posts:
Domain Driven Design - Parent child relation pattern - Specification pattern
Best practice for Handling NHibernate parent-child collections
how should i add an object into a collection maintained by aggregate root
To summarize:
It is OK to create an entity outside its aggregate if it can manage its own consistency (you may still use a factory for it). So having a transient reference to Auto is OK and then a new Policy(Auto) is how to get it into the aggregate. This would mean building up "temporary" graphs to get the details spread out a bit (not all piled into one factory method or constructor).
I'm seeing my alternatives as either:
(a) Build a DTO or other anemic graph first and then pass it to a factory to get the aggregate built.
Something like:
autoDto = new AutoDto();
autoDto.setVin(..);
autoDto.setEtc...
autoDto.setGaragedLocation(new Location(..));
autoDto.addDriver(...);
Policy policy = PolicyFactory.getInstance().createPolicy(x, y, autoDto);
auto1Dto...
policy.addAuto(auto1Dto);
(b) Use builders (potentially compound):
builder = PolicyBuilder.newInstance();
builder = builder.setX(..).setY(..);
builder = builder.addAuto(vin, new Driver()).setGaragedLocation(new Location());
Policy = builder.build();
// and how would update work if have to protect the creation of Auto instances?
auto1 = AutoBuilder.newInstance(policy, vin, new Driver()).build();
policy.addAuto(auto1);
As this thing twists around and around a couple things seem clear.
In the spirit of ubiquitous language, it makes sense to be able to say:
policy.addAuto
and
policy.updateAuto
The arguments to these and how the aggregate and the entity creation semantics are managed is not quite clear, but having to look at a factory to understand the domain seems a bit forced.
Even if Policy is an aggregate and manages how things are put together beneath it, the rules about how an Auto looks seem to belong to Auto or its factory (with some exceptions for where Policy is involved).
Since Policy is invalid without a minimally constructed set of children, those children need to be created prior or within its creation.
And that last statement is the crux. It looks like for the most part these posts handle the creation of children as separate affairs and then glue them. The pure DDD approach would seem to argue that Policy has to create Autos but the details of that spin wildly out of control in non-trivial cases.
