Aggregate Root references other aggregate roots


I'm currently working a lot with DDD, and I'm facing a problem when loading/operating on aggregate roots from other aggregate roots.
For each aggregate root in my model, I also have a repository. The repository is responsible for handling persistence operations for the root.
Let's say that I have two aggregate roots, with some members (entities and value objects).
AggregateRoot1 and AggregateRoot2.
AggregateRoot1 has an entity member which references AggregateRoot2.
When I load AggregateRoot1, should I load AggregateRoot2 as well?
Should the repository for AggregateRoot2 be responsible for this?
If so, is it okay for the entity in AggregateRoot1 to call the repository of AggregateRoot2 for loading?
Also, when I create an association between the entity in AggregateRoot1 and AggregateRoot2, should that be done through the entity, or through the repository for AggregateRoot2?
Hope my question makes sense.
[EDIT]
CURRENT SOLUTION
With help from Twith2Sugars I've come up with the following solution:
As described in the question, an aggregate root can have children that have references to other roots. When assigning root2 to one of the members of root1, the repository for root1 will be responsible for detecting this change, and delegating this to the repository for root2.
public void SomeMethod()
{
    AggregateRoot1 root1 = AggregateRoot1Repository.GetById("someIdentification");
    root1.EntityMember1.AggregateRoot2 = new AggregateRoot2();
    AggregateRoot1Repository.Update(root1);
}

public class AggregateRoot1Repository
{
    public static void Update(AggregateRoot1 root1)
    {
        // Implement some mechanism to detect changes to referenced roots
        AggregateRoot2Repository.HandleReference(root1.EntityMember1, root1.EntityMember1.AggregateRoot2);
    }
}
This is just a simple example, no Law of Demeter or other best principles/practices included :-)
Further comments appreciated.

I've been in this situation myself and came to the conclusion that it's too much of a headache to make child aggregates work in an elegant way. Instead, I'd consider whether you actually need to reference the second aggregate as a child of the first. It makes life much easier if you just keep a reference to the aggregate's ID rather than the actual aggregate itself. Then, if there is domain logic that involves both aggregates, it can be extracted to a domain service that looks something like this:
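public class DomainService
{
    private readonly IAggregate1Repository _aggregate1Repository;
    private readonly IAggregate2Repository _aggregate2Repository;

    // Repositories are injected so the readonly fields are actually assigned.
    public DomainService(IAggregate1Repository aggregate1Repository, IAggregate2Repository aggregate2Repository)
    {
        _aggregate1Repository = aggregate1Repository;
        _aggregate2Repository = aggregate2Repository;
    }

    public void DoSomething(Guid aggregateID)
    {
        // Load the first aggregate, then follow its stored ID to the second one.
        Aggregate1 agg1 = _aggregate1Repository.Get(aggregateID);
        Aggregate2 agg2 = _aggregate2Repository.Get(agg1.Aggregate2ID);
        agg1.DoSomething(agg2);
    }
}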
EDIT:
I REALLY recommend these articles on the subject: https://vaughnvernon.co/?p=838

This approach has some issues. First, you should have one repository per aggregate, and that's it; having one repository call another breaks this rule. Second, a good practice for relationships between aggregates is that one aggregate root should refer to another aggregate root by its ID rather than by holding an object reference. Doing so keeps each aggregate independent of the others. Keep object references in an aggregate root only to the classes that make up the same aggregate.

Perhaps the AggregateRoot1 repository could call the AggregateRoot2 repository when it's constructing the AggregateRoot1 entity.
I don't think this invalidates DDD, since the repositories are still in charge of getting/creating their own entities.

Related

DDD Effective modelling of aggregates and root aggregation creation

We are starting a new project and we are keen to apply DDD principles. The project is using dotnet core, with EF core providing the persistence to SQL Server.
Initial view of the domain
I will use an example of a task tracker to illustrate our issues and challenges as this would follow a similar structure.
In the beginning we understand the following: -
We have a Project
Users can be associated to Projects
A Project has Workstreams
A Workstream has Tasks
Users can post Comments against a Task
A User is able to change the status of a Task (in progress, complete etc)
A Project, with associated Workstreams and Tasks, is initially created from a Template
The initial design was a large cluster aggregate with the Project being the root aggregate holding a collection of ProjectUsers and Workstreams, Workstreams has a collection of Tasks etc etc
This approach was obviously going to lead to a number of contention and performance issues due to having to load the whole Project aggregate for any changes within that aggregate.
Rightly or wrongly our next revision was to break the Comments out of the aggregate and to form a new aggregate using Comment as a root. The motivation for this was that the business envisaged there being a significant number of Comments raised against each Task.
As each Comment is related to a Task a Comment needs to hold a foreign key back to the Task. However this isn't possible following the principle that you can only reference another aggregate via its root. To overcome this we broke the Task out to another aggregate. This also seemed to satisfy the need that the Tasks could be Completed by different people and again would reduce contention.
We then faced the same problem with the reference from the Task to the Workstream the Task belongs to leading to us creating a new Workstream aggregate with the foreign key in the Task back to the Workstream.
The result is: -
A Project aggregate which only contains a list of Users assigned to the project
A Workstream aggregate which contains a foreign key to the Project
A Task aggregate which contains a foreign key to the Workstream
A Comments aggregate which contains a foreign key back to the Task
The Project has a method to create a new instance of a Workstream, allowing us to set the foreign key. A slightly simplified version:
public class Project
{
    public int Id { get; private set; } // assumed: assigned when the project is persisted
    private string _name;

    public Project(string name)
    {
        _name = name;
    }

    public Workstream CreateWorkstream(string name)
    {
        // The new workstream refers to this project by its Id only.
        return new Workstream(name, Id);
    }

    // ... plus methods for managing user assignment to the project
}
In a similar way Workstream has a method to create a Task
public class Workstream
{
    public int Id { get; private set; } // assumed: assigned when the workstream is persisted
    public int ProjectId { get; private set; }
    private string _name;
    private readonly List<Task> _activities = new List<Task>();

    public Workstream(string name, int projectId)
    {
        _name = name;
        ProjectId = projectId;
    }

    public Task CreateTask(string name)
    {
        // The new task refers to this workstream by its Id only.
        return new Task(name, Id);
    }

    public IEnumerable<Task> Activities => _activities.AsReadOnly();
}
The Activities property has been added purely to support navigation when using the entities to build the read models.
The team are not comfortable with this approach; something doesn't feel right. The main concerns are:-
it is felt that creating a project logically should be create project, add one or more workstreams to the project, add task to the workstreams, then let EF deal with persisting that object structure.
there is discomfort that the Project has to be created first and that the developer needs to ensure it is persisted so it gets an Id, ready for when the method to Create the template is called which is dependent on that Id for the foreign key. Is it okay to push the responsibility for this to a method in a domain service CreateProjectFromTemplate() to orchestrate the creation and persistence of the separate objects to each repository?
is the method to create the new Workstream even in the correct place?
the entities are used to form the queries (supported by the navigation properties) which are used to create the read models. Maybe the concern is that the object structure is being influenced by how we need to present read-only data
We are now at the point where we are just going around in circles and could really use some advice to give us some direction.

The team are not comfortable with this approach; something doesn't feel right.
That's a very good sign.
However this isn't possible following the principle that you can only reference another aggregate via its root.
You'll want to let go of this idea, it's getting in your way.
Short answer is that identifiers aren't references. Holding a copy of an identifier for another entity is fine.
Longer answer: DDD is based on the work of Eric Evans, who was describing a style that had worked for him on Java projects at the beginning of the millennium.
The pain that he was struggling with is this: if the application is allowed to hold object references to arbitrary data entities, then the behaviors of the domain end up scattered all over the code base. This increases the amount of work that you need to do to understand the domain, and it increases the cost of making (and testing!) changes.
The reaction was to introduce a discipline; isolate the data from the application, by restricting the application's access to a few carefully constrained gate keepers (the "aggregate root" objects). The application can hold object references to the root objects, and can send messages to those root objects, but the application cannot hold a reference to, or send a message directly to, the objects hidden behind the api of the aggregate.
Instead, the application sends a message to the root object, and the root object can then forward the message to other entities within its own aggregate.
Thus, if we want to send a message to a Task inside of some Project, we need some mechanism to know which project to load, so that we can send the message to the project to send a message to the Task.
Effectively, this means you need a function somewhere that can take a TaskId, and return the corresponding ProjectId.
The simplest way to do this is to simply store the two fields together
{
taskId: 67890,
projectId: 12345
}
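As a rough illustration of that lookup function, here is a minimal Java sketch. The class name TaskProjectIndex and its methods are hypothetical, and the in-memory map merely stands in for whatever storage holds the two fields together.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical lookup that records which project owns each task.
class TaskProjectIndex {
    private final Map<Long, Long> projectIdByTaskId = new HashMap<>();

    // Store the two identifiers together when a task is created.
    void record(long taskId, long projectId) {
        projectIdByTaskId.put(taskId, projectId);
    }

    // Answers "which project must be loaded to reach this task?"
    Optional<Long> projectIdFor(long taskId) {
        return Optional.ofNullable(projectIdByTaskId.get(taskId));
    }
}
```

The application would resolve the project ID first, load that Project through its repository, and then send the message to the root.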
it is felt that creating a project logically should be create project, add one or more workstreams to the project, add task to the workstreams, then let EF deal with persisting that object structure.
Maybe the concern is that the object structure is being influenced by how we need to present read-only data
There's a sort of smell here, which is that you are describing the relations of a data structure. Aggregates aren't defined by relations as much as they are by changes.
Is it okay to push the responsibility for this to a method in a domain service CreateProjectFromTemplate
It's actually fairly normal to have a draft aggregate (which understands editing) that is separate from a Published aggregate (which understands use). Part of the point of domain driven design is to improve the business by noticing implicit boundaries between use cases and making them explicit.
You could use a domain service to create a project from a template, but in the common case, my guess is that you should do it "by hand" -- copy the state from the draft, and then use that state to create the project; it avoids confusion when a publish and an edit are happening concurrently.

Here is a different perspective that might nudge you out of your deadlock.
I feel you are doing data modeling instead of real domain modeling. You are concerned with a relational model that will be directly persisted using ORM (EF) and less concerned with the actual problem domain. That is why you are concerned that the project will load too many things, or which objects will hold foreign keys to what.
An alternative approach would be to forget persistence for a moment and concentrate on what things might need what responsibilities. With responsibilities I don't mean technical things like save/load/search, but things that the domain defines. Like creating a task, completing a task, adding a comment, etc. This should give you an outline of things, like:
interface Task {
...
void CompleteBy(User user);
...
}
interface Project {
...
Workstream CreateWorkstreamFrom(Template template);
...
}
Also, don't concentrate too much on what is an Entity, Value Object, Aggregate Root. First, represent your business correctly in a way you and your colleagues are happy with. That is the important part. Try to talk to non-technical people about your model, see if the language you are using fits, whether you can have a conversation with it. You can decide later what objects are Entities or Value Objects, that part is purely technical and less important.
One other point: don't bind your model directly to an ORM. ORMs are blunt instruments that will probably force you into bad decisions. You can use an ORM inside your domain objects, but don't make them be a part of the ORM. This way you can do your domain the right way, and don't have to be afraid to load too much for a specific function. You can do exactly the right things for all the business functions.

Creating child instances on an aggregate root in DDD

I have been reading Eric Evans's book on DDD and on page 139 he states:
"if you needed to add elements inside a preexisting AGGREGATE, you might create a FACTORY METHOD on the root of the AGGREGATE"
I would assume that could be implemented something like this where the method NewLineItem is used to create and add a new line item to the order.
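class Order
{
    public IEnumerable<LineItem> LineItems { get; }

    // Factory method on the root: creates a line item and adds it to this order (body elided).
    public void NewLineItem(Product product, int quantity) { /* ... */ }
}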
Another way I could think of doing this is to move the factory method into the collection itself. Something like this below. I could then add a new item by calling LineItems.New(...).
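class Order
{
    public LineItems LineItems { get; }
}

// Declared outside Order: a nested type cannot share its name with the LineItems property.
public class LineItems : IEnumerable<LineItem>
{
    // Factory method on the collection (body and GetEnumerator() elided).
    public void New(Product product, int quantity) { /* ... */ }
}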
What are the pros/cons to each approach? Are there any gotchas with moving the factory method into a collection? We are currently trying to figure out the best way to implement a large domain model. We are concerned that some of these root aggregate models will get bloated with numerous factory methods and deletion methods such as RemoveLineItem(LineItem). Our thinking is that moving these factory methods to their collections helps organize the design and keeps the root aggregate less cluttered with methods. Any advice?
Thanks

One advantage of having the factory method on the AR directly is that it makes the AR aware of the changes and allows it to enforce its invariants. Not only that, but because the method is aware of the internal state of the AR you may be able to reduce the number of arguments passed to the factory method (most useful when creating other related ARs).
E.g. registration = course.register(registrant) vs registration = new Registration(registrant, courseId)
Also, LineItem becomes an implementation detail so the client doesn't need to be aware of that class.
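To make the course.register(registrant) form concrete, here is a small Java sketch. The capacity rule, the field names and the String identifiers are assumptions made for illustration; only the shape of the factory method comes from the example above.

```java
// Hypothetical aggregate created through the Course root; it keeps only the course's id.
class Registration {
    final String registrant;
    final String courseId;

    Registration(String registrant, String courseId) {
        this.registrant = registrant;
        this.courseId = courseId;
    }
}

class Course {
    private final String id;
    private final int capacity;      // assumed invariant: never more registrations than seats
    private int registeredCount = 0;

    Course(String id, int capacity) {
        this.id = id;
        this.capacity = capacity;
    }

    // Factory method on the root: it checks the invariant and supplies the
    // course id itself, so the caller only passes the registrant.
    Registration register(String registrant) {
        if (registeredCount >= capacity) {
            throw new IllegalStateException("Course " + id + " is full");
        }
        registeredCount++;
        return new Registration(registrant, id);
    }
}
```

A caller writes course.register("alice") and never has to know the course id or the capacity rule.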
The fact that you are asking this question and are actually worried about having too many methods on your ARs is perhaps an indicator that you may be clustering together objects that do not belong together.
Do not lose sight of the AR's main purpose: it's a transactional boundary that protects invariants. If there's no invariant to protect, then clustering may be unnecessary or even undesirable.
I would strongly advise you to read Effective Aggregate Design by Vaughn Vernon.

There is always that "law" of Demeter business :)
The aggregate root (AR) is going to be responsible for the integrity and invariants. It may be possible that you will have an invariant along the lines of "maximum order total of $50 and no more than 6 line items at any time". The collection will not have access to any of this information (well, perhaps the count). So the idea is that the AR handles these interactions.
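A minimal Java sketch of that invariant, assuming line items carry a unit price and that the limits are fixed constants (both are illustrative choices, not something stated above):

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class LineItem {
    final String product;
    final BigDecimal unitPrice;
    final int quantity;

    LineItem(String product, BigDecimal unitPrice, int quantity) {
        this.product = product;
        this.unitPrice = unitPrice;
        this.quantity = quantity;
    }

    BigDecimal subtotal() {
        return unitPrice.multiply(BigDecimal.valueOf(quantity));
    }
}

class Order {
    private static final BigDecimal MAX_TOTAL = new BigDecimal("50.00");
    private static final int MAX_LINE_ITEMS = 6;
    private final List<LineItem> lineItems = new ArrayList<>();

    // The root sees both the item count and the running total,
    // so it can reject a change that would break either rule.
    void newLineItem(String product, BigDecimal unitPrice, int quantity) {
        LineItem candidate = new LineItem(product, unitPrice, quantity);
        if (lineItems.size() >= MAX_LINE_ITEMS) {
            throw new IllegalStateException("An order may hold at most " + MAX_LINE_ITEMS + " line items");
        }
        if (total().add(candidate.subtotal()).compareTo(MAX_TOTAL) > 0) {
            throw new IllegalStateException("Order total may not exceed " + MAX_TOTAL);
        }
        lineItems.add(candidate);
    }

    BigDecimal total() {
        BigDecimal sum = BigDecimal.ZERO;
        for (LineItem item : lineItems) {
            sum = sum.add(item.subtotal());
        }
        return sum;
    }

    // Read-only view: callers cannot bypass newLineItem.
    List<LineItem> lineItems() {
        return Collections.unmodifiableList(lineItems);
    }
}
```

A factory method living on the collection would need the order total passed in somehow, which is the information it does not naturally have.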
If you are concerned with bloat or find yourself with ARs that are unwieldy it may indicate a problem with your design. Vaughn Vernon covers these scenarios quite nicely in his book. You really do want highly cohesive ARs and it can be tricky to identify them correctly. A couple of iterations may be required to get the most comfortable design.
So I would try and stick with Eric's advice and handle the interactions on the AR itself as far as is practically possible.

DDD: Large Aggregate Root - Person

I am building a system to manage person information. I have an ever growing aggregate root called Person. It now has hundreds of related objects, name, addresses, skills, absences, etc. My concern is that the Person AR is both breaking SRP and will create performance problems as more and more things (esp collections) get added to it.
I cannot see how with DDD to break this down into smaller objects. Taking the example of Absences. The Person has a collection of absence records (startdate, enddate, reason). These are currently managed through the Person (BookAbsence, ChangeAbsence, CancelAbsence). When adding absences I need to validate against all other absences, so I need an object which has access to the other absences in order to do this validation.
Am I missing something here? Is there another AR I have not identified? In the past I would have done this via an "AbsenceManager" service, but would like to do it using DDD.
I am fairly new to DDD, so maybe I am missing something.
Many Thanks....

The Absence could be modeled as an aggregate. An AbsenceFactory is responsible for validating against other Absences when you want to add a new Absence.
Code example:
public class AbsenceFactory {
    private AbsenceRepository absenceRepository;

    public Absence newAbsenceOf(Person person) {
        List<Absence> current =
            absenceRepository.findAll(person.getIdentifier());
        //validate and return
    }
}
You can find this pattern in the blue book (section 6.2 Factory if I'm not mistaken)
In other "modify" cases, you could introduce a Specification
public class SomeAbsenceSpecification {
    private AbsenceRepository absenceRepository;

    public SomeAbsenceSpecification(AbsenceRepository absenceRepository) {
        this.absenceRepository = absenceRepository;
    }

    public boolean isSatisfiedBy(Absence absence) {
        List<Absence> current =
            absenceRepository.findAll(absence.getPersonIdentifier());
        //validate and return
    }
}
You can find this pattern in the blue book(section 9.2.3 Specification)
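The "validate" step is left open in both snippets. One plausible rule, assumed here purely for illustration, is that a person's absences must not overlap. The sketch below uses simplified stand-in classes (public fields, and a plain list in place of the repository call) rather than the types from the answer.

```java
import java.time.LocalDate;
import java.util.List;

// Simplified stand-in for the Absence aggregate; dates are inclusive.
class Absence {
    final String personId;
    final LocalDate start;
    final LocalDate end;

    Absence(String personId, LocalDate start, LocalDate end) {
        this.personId = personId;
        this.start = start;
        this.end = end;
    }

    // Two inclusive ranges overlap when neither starts after the other ends.
    boolean overlaps(Absence other) {
        return !start.isAfter(other.end) && !other.start.isAfter(end);
    }
}

// Hypothetical specification: satisfied when the candidate overlaps none of the
// absences already recorded (the list would normally come from the repository).
class NoOverlapSpecification {
    private final List<Absence> existing;

    NoOverlapSpecification(List<Absence> existing) {
        this.existing = existing;
    }

    boolean isSatisfiedBy(Absence candidate) {
        for (Absence absence : existing) {
            if (absence.overlaps(candidate)) {
                return false;
            }
        }
        return true;
    }
}
```

The factory or an application service can then refuse to create or change an Absence whenever isSatisfiedBy returns false.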

This is indeed what makes aggregate design so tricky. Ownership does not necessarily mean aggregation. One needs to understand the domain to be able to give a proper answer so we'll go with the good ol' Order example. A Customer would not have a collection of Order objects. The simplest rule is to think about deleting an AR. Those objects that could make sense in the absence of the AR probably do not belong on the AR. A Customer may very well have a collection of ActiveOrder objects, though. Of course there would be an invariant stating that a customer cannot be deleted if it has active orders.
Another thing to look out for is a bloated bounded context. It is conceivable that you could have one or more bounded contexts that have not been identified leading to a situation where you have an AR doing too much.
So in your case you may very well still be interested in the Absence should the Person be deleted. An OrderLine, by contrast, has no meaning without its Order, so it has no lifecycle of its own.
Hope that helps ever so slightly.

I am building a system to manage person information.
Are you sure that a simple CRUD application that edits and queries RDBMS tables via SQL wouldn't be a cheaper approach?
If you can express most of the business rules in terms of data relations and table operations, you shouldn't use DDD at all.
I have an ever growing aggregate root called Person.
If you actually have complex business rules, an ever-growing aggregate is often a symptom of undefined (or wrongly defined) context boundaries.

Generic Vs Individual Repository for Aggregate Root

As I understand, the Bounded Context can have modules, the modules can have many aggregate roots, the aggregate root can have entities. For the persistence, each aggregate root should have a repository.
With the numerous aggregate roots in a large project, is it okay to use a Generic Repository, one for read-only access and one for updates? Or should I have a separate repository for each aggregate root, which can provide better control?

In a large complex project, I wouldn't recommend using a generic repository since there will most likely be many specific cases beyond your basic GetById(), GetAll()... operations.
Greg Young has a great article on generic repositories : http://codebetter.com/gregyoung/2009/01/16/ddd-the-generic-repository/
is it okay to use a Generic Repository, one for read-only access and one for updates?
Repositories generally don't handle saving updates to your entities, i.e. they don't have an Update(EntityType entity) method. This is usually taken care of by your ORM's change tracker/Unit of Work implementation. However, if you're looking for an architecture that separates reads from writes, you should definitely have a look at CQRS.

Pure DDD is about making the implicit explicit, i.e. not using List(), but rather ListTheCustomerThatHaveNotBeSeenForALongTime().
What is at stake here is a technical implementation. From what I know, domain-driven design does not prescribe technical choices.
A generic repository fits well. Your use of this generic repository might not fit the spirit of DDD, though. It depends on your domain.

On some of the sample DDD applications that are published on the web, I have seen them have a base repository interface that each aggregate root repository inherits from. I, personally, do things a bit differently. Because repositories are supposed to look like collections to the application code, my base repository interface inherits from IEnumerable so I have:
public interface IRepository<T> : IEnumerable<T> where T : IAggregateRoot
{
}
I do have some base methods I put in there, but only ones that allow reading the collection because some of my aggregate root objects are encapsulated to the point that changes can ONLY be made through method calls.
To answer your question, yes it is fine to have a generic repository, but try not to define any functionality that shouldn't be inherited by ALL repositories. And, if you do accidentally define something that one repository doesn't need, refactor it out into all of the repository interfaces that do need it.
EDIT: Added example of how to make repositories behave just like any other ICollection object.
On the repositories that require CRUD operations, I add this:
void Add(T item); //Add
void Remove(T item); //Remove
T this[int index] { set; } //or T this[object id] { set; } //Update

Thanks for the comments. The approach that I took was to separate the base repository interface into ReadOnly and Updatable. Every aggregate root entity will have its own repository, derived from either the Updatable or the ReadOnly repository. The repository at the aggregate root level will have its own additional methods. I'm not planning to use a generic repository.
There is a reason I chose to have IReadOnlyRepository. In the future, I will convert the query part of the app to CQRS. So the segregation into a ReadOnly interface supertype will help me at that point.
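A rough Java sketch of that split, for illustration only: the interface names mirror the ReadOnly/Updatable idea, while Customer, the in-memory map and findByName are hypothetical stand-ins for a real aggregate and its persistence.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Query-side supertype: nothing here can change stored aggregates.
interface ReadOnlyRepository<T, ID> {
    Optional<T> findById(ID id);
    List<T> findAll();
}

// Write-side supertype adds collection-style mutation on top of the reads.
interface UpdatableRepository<T, ID> extends ReadOnlyRepository<T, ID> {
    void add(T aggregate);
    void remove(T aggregate);
}

// Hypothetical aggregate root.
class Customer {
    final String id;
    final String name;

    Customer(String id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Aggregate-specific repository: inherits the shared base methods and adds its own query.
class InMemoryCustomerRepository implements UpdatableRepository<Customer, String> {
    private final Map<String, Customer> store = new HashMap<>();

    @Override public Optional<Customer> findById(String id) { return Optional.ofNullable(store.get(id)); }
    @Override public List<Customer> findAll() { return new ArrayList<>(store.values()); }
    @Override public void add(Customer customer) { store.put(customer.id, customer); }
    @Override public void remove(Customer customer) { store.remove(customer.id); }

    // Method that only this aggregate's repository needs.
    List<Customer> findByName(String name) {
        List<Customer> matches = new ArrayList<>();
        for (Customer customer : store.values()) {
            if (customer.name.equals(name)) {
                matches.add(customer);
            }
        }
        return matches;
    }
}
```

Code that only queries can depend on ReadOnlyRepository<Customer, String>, which should make a later move of the read side to CQRS less invasive.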

Does DDD allow for a List to be an Aggregate Root?

I am trying to understand the fundamentals of Domain-driven design. Yesterday I found some code in a project I am working with where a Repository returned a list of Entities, i.e. List<Message> getMessages() where Message is an entity (has its own id and is modifiable). Now, when reading about Repositories in DDD, the sources are pretty specific that the Repository should return the Aggregate Root, and that any actions on the aggregate should be done by invoking methods in the Aggregate Root.
I would like to place the List<Message> in its own class and then just return that class. But, in my project there is basically no need to do that except for compliance with DDD, since we only show messages, add new ones or remove an existing message. We will never have to remove all messages, so the only methods we would have are addMessage(...), getMessages(), updateMessage(...) and removeMessage(...), which is basically what our Domain Service is doing.
Any ideas anyone? What is the best practice in DDD when it comes to describe Aggregates and Repositories?

One of the confusing aspects for those new to DDD is the repository concept.
Repository:
Mediates between the domain and data mapping layers using a collection-like interface for accessing domain objects.
A Repository provides the ability to obtain a reference to an Aggregate root. Not an Entity or a Value Object, but an Aggregate root (I don't agree with "Repository should return the Aggregate Root").
Suggestions:
- One repository per aggregate root
- Repository interfaces (e.g. IMessageRepository) reside in the domain model
public interface IMessageRepository
{
    void saveMessage(Message msg);
    void removeMessage(Message msg);
    IList<Message> getMessages();
}
- Repository implementations (e.g. NHibernateMessageRepository if using NHibernate) reside outside the domain
Hope this helps!
