DDD, Aggregates and Repos - domain-driven-design

DDD, Aggregates and Repos - domain-driven-design

I have the following entities (example):
Book
Author
The Book entity is also an aggregate since it has related one or many Authors. Now I have a problem in how to fetch this aggregate from the repo. I have the following cases - and I also do need to take care of performance:
List all the Books. No need to fetch Authors.
List all the Bookss with Authors names. Obviously, we need to fetch the aggregate of Books and related Authors.
List all the Books with authors count. Similar to (2), except I do not want to fetch the Authors from the repo, just the count.
So how my repository should look like? Specific questions:
Should we have method like findBooks and findBooksWithAuthors and findBooksWithAuthorsCount? But this would lead to crazy amount of methods, since our entities have many relationships between each other.
Should we just have findBooks and then loadAuthors in AuthorsRepo, i.e. not doing the join, until we hit some performance issue, and then to refactor.
Should I create some aggregate-value-objects, like: BookAndAuthors that describes relationships?
Please note that this example is trivial - and you have to know that our models are more rich and have more relationships.

Do you need to fetch this kind of information to display on the UI ?,
I would encourage you to separate your read and write concerns, and keep your repository interface simple (similar to that of a collections interface).
Have a look at CQRS, it works very well with DDD and will help simplify your design to a great deal.
Once you get into CQRS, just keep in mind that CQRS does not necessarily involve Event Sourcing.
In your case I would recommend the simplest approach show in this article, basically have a read service (can call it Finder), which fires SQL and gets you a DTO/Map of whatever you need for the UI.

Related

Can DDD repositories return data from other aggregate roots?

I'm having trouble getting my head around how to use the repository pattern with a more complex object model. Say I have two aggregate roots Student and Class. Each student may be enrolled in any number of classes. Access to this data would therefore be through the respective repositories StudentRepository and ClassRepository.
Now on my front end say I want to create a student details page that shows the information about the student, and a list of classes they are enrolled in. I would first have to get the Student from StudentRepository and then their Classes from ClassRepository. This makes sense.
Where I get lost is when the domain model becomes more realistic/complex. Say students have a major that is associated with a department, and classes are associated with a course, room, and instructors. Rooms are associated with a building. Course are associated with a department etc.. etc..
I could easily see wanting to show information from all these entities on the student details page. But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
I understand the ClassRepository should only be responsible for updating classes, and not anything in other aggregate roots. But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots? In most cases this would only need to be a partial summary of those related entities (building name, course name, course number, instructor name, instructor email etc..).

But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
Yup.
But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots?
Nobody cares about "violate DDD". What we care about is: do you still get the benefits of the repository pattern if you start pulling in data from other aggregates?
Probably not - part of the point of "aggregates" is that when writing the business code you don't have to worry to much about how storage is implemented... but if you start mixing locked data and unlocked data, your abstraction starts leaking into the domain code.
However: if you are trying to support reporting, or some other effectively read only function, you don't necessarily need the domain model at all -- it might make sense to just query your data store and present a representation of the answer.
This substitution isn't necessarily "free" -- the accuracy of the information will depend in part on how closely your stored information matches your in memory information (ie, how often are you writing information into your storage).
This is basically the core idea of CQRS: reads and writes are different, so maybe we should separate the two, so that they each can be optimized without interfering with the correctness of the other.

Can DDD repositories return data from other aggregate roots?
Short answer: No. If that happened, that would not be a DDD repository for a DDD aggregate (that said, nobody will go after you if you do it).
Long answer: Your problem is that you are trying to use tools made to safely modify data (aggregates and repositories) to solve a problem reading data for presentation purposes. An aggregate is a consistency boundary. Its goal is to implement a process and encapsulate the data required for that process. The repository's goal is to read and atomically update a single aggregate. It is not meant to implement queries needed for data presentation to users.
Also, note that the model you present is not a model based on aggregates. If you break that model into aggregates you'll have multiple clusters of entities without "lines" between them. For example, a Student aggregate might have a collection of ClassEnrollments and a Class aggregate a collection of Atendees (that's just an example, note that modeling many to many relationships with aggregates can be a bit tricky). You'll have one repository for each aggregate, which will fully load the aggregate when executing an operation and transactionally update the full aggregate.
Now to your actual question: how do you implement queries for data presentation that require data from multiple aggregates? well, you have multiple options:
As you say, do multiple round trips using your existing repositories. Load a student and from the list of ClassEnrollments, load the classes that you need.
Use CQRS "lite". Aggregates and respositories will only be used for update operations and for query operations implement Queries, which won't use repositories, but access the DB directly, therefore you can join tables from multiple aggregates (Student->Enrollments->Atendees->Classes)
Use "full" CQRS. Create read models optimised for your queries based on the data from your aggregates.
My preferred approach is to use CQRS lite and only create a dedicated read model when it's really needed.

Should I have same inheritance of ebook and printedbooks in this case?

I am developing Library Management System which have two sorts of books (Ebook and PrintedBook).
I intends to make search capacity with both ebook and printedbook in the same page.
The only problem is that I see that ebook and printedbook are book. And should I make an Book entity, and PrintedBook and Ebook inherits Book entity. If I do this, the search capacity is easier by using IBookRepository. If not I have to join two tables (Ebooks and PrintedBooks).
Please help me.

Dealing with inheritance at persistance level, esspecialy when talking about relation databases, can be a headache. First of all you should ask yourself why is this a problem for you.
If the problem is a performance due to using JOIN in you database query you might look at technique called single table inheritance. Basically you have one table containing all the columns of all your book types (i.e. PrintedBook and Ebook). This way you don't have to use JOIN, but you sacrifice some storage.
Other then the concrete table inheritanec technique (as described by yourself) there is no other way how to deal with the inheritance problem in relation databases.
If your application becomes too complex or the domain model isn't compatible with your read use cases, you might look at read-model. Read-model helps you to focus on your problem domain without modifying it while having easy access to the data. This is very complex topic so if you want to read something about read-models (or about DDD implementation problems/techniques) I recommend you to read Implementing Domain-Driven Design by Vaugh Vernon.

Aggregate roots depend on the use case so does that mean that we might end up with really a lots of repositories?

Ive heard a lots that aggregate roots depend on the use case. But what does that mean in coding context ?
You have a service class which offcourse hold methods (use cases) that gonna accomplish something in a repository. Great, so you use a repository which is equal to an aggregate root to perform your querying.
Now you need to perform some other kind of operation which use totally different use case than the first service class but use the same entities.
Here the representation :
Entities: Customer, Orders, LineOrder
Service 1: Add new customers, Delete some customers, retrieve customer orders
Here the aggregate root seem to be Customer because you need this repository to perform thoses use cases.
Service 2: Retrieve customer from an actual order
Here the aggregate root seem to be Order because you need this repository to perform this use case.
If i am wrong please correct me. Now that mean you have 2 aggregates roots.
Now my question is, since aggregate roots depend on the use case does that mean that we might end up with really a lots of repositories if you end up having lots of use cases ?
The above example was probably not the best example... so lets say we have a Journal which hold JournalEntries which each entries hold Tasks, Problems and Notes. (This is in the context of telling to a system what have been done to a project)
Does that mean that im gonna end up with 2 repository ? (Journal, JournalEntry)
In the use cases where i need to add new tasks, problems and notes from an journal entry ?
(Can be seen as a service)
Or might end up with 4 repository. (Journal, Task, Problems, Notes)
In the use cases where i need to access directment task, problems and notes ?
(Can be seen as another service)
But that would mean if i need both of theses services (that actually hold the use cases) that i actually need 5 repository to be able to perform use cases in both of them ?
Thanks.

Hi I saw your post and thought I may give you my opion. First I must say I've been doing DDD in project for three years now, so I'm not an expert. But I'm currently working in a project as an architect an coaching developers in DDD, and I must say it isn't a walk in the park... I don't know how many times I've refactored the model and Entity relationships.
But my experience is that you endup with some repositories (more than few but not many). My Aggregates usually contains a few classes and the Aggregate object graph isn't that deep (if you know what I mean).
But I try to be concrete:
1) Aggregate roots are defined by your needs. I mean if you feel that you need that Tasks object through Journal to often, then maybe thats a sign for it to be upgraded as a aggregate root.
2) But everything cannot be aggregate roots, so try to capsulate object that are tight related. Notes seems like a candidate for being own by a root object. You'd probably always relate Notes to the root or it loses its context. Notes cannot live by itself.
3) Remember that Aggregates are used for splitting up large complex domains into smaller "islands" that take care of thier inhabbitants. Its important to not make your domain more complex than it is.
4) You don't know how your model look likes before you've reached far into the project implementation phase. If you realize that some repositories aren't used that much, they may be candidates for merging into other root object (if they have that kind of relationship). You can break out objects that are used so much through root object without its context. I mean for example if Journal are aggregate root and contains Notes and Tasks. After a while you model grows and maybe Tasks
have assoications to Action and ActionHistory and User and Rule and Permission. Now I just throw out a bunch om common objects in a rule/action/user permission functionality. Maybe this result in usecases that approach Tasks from another angle, "View all Tasks performed by this User" etc. Tasks get more involved in some kind of State/Workflow engine and therefor candidates for being an aggregate root itself.
Okey. Not the best example but it maybe gives you the idea. A root object can contain children where some of its children can also be root object because we need it in another context (than journal).
But I have myself banged my head against the wall everytime you startup with a fresh model. Just go with the flow and let the model evolve itself through its clients/subsribers. You refine the model through its usage. The Services (application services and not domain services) are of course extended with methods that respond to UI and usecases (often one-to-one).
I hope I helped you in someway...or not :D

Yes, you would most likely end up with 5 repositories (Journal, JournalEntry, Task, Problems, Notes). Your services would then use these repositories to perform CRUD for each type of entity.
Your reaction of "wow so many repositories" is not uncommon for developers new to DDD.
However, your repositories are usually light weight assuming your model and DB schema are fairly evenly matched which is often the case. If you use an ORM such as nHibernate or a tool such as codesmith generator then it gets even easier to create your repositories.

At first you need to define what is aggregate. I don't know about use case aggregates.
I know about aggregates following...
Aggregates are union of several entities. One of the entities is the aggregate root, the rest entities (or value types) have sense only in selected aggregate root context. For example you can define Order and OrderLine as an aggregate if you don't need to do any independent actions with OrderLine entities. It means that OrderLine makes sense in Order context only.
Why to define aggregates at all? It is required to reduce references between objects. That will simplify you domain model.
And of course you don't need to have OrderLineRepository if OrderLine is a part of Order aggregate.
Here is a link with more information. You can read Eric Evans DDD book. He explains aggregates very well.

In DDD, are collection properties of entities allowed to have partial values?

In Domain Driven Design are collection properties of entities allowed to have partial values?
For example, should properties such as Customer.Orders, Post.Comments, Graph.Vertices always contain all orders, comments, vertices or it is allowed to have today's orders, recent comments, orphaned vertices?
Correspondingly, should Repositories provide methods like
GetCustomerWithOrdersBySpecification
GetPostWithCommentsBefore
etc.?

I don't think that DDD tells you to do or not to do this. It strongly depends on the system you are building and the specific problems you need to solve.
I not even heard about patterns about this.
From a subjective point of view I would say that entities should be complete by definitions (considering lazy loading), and could completely or partially be loaded to DTO's, to optimized the amount of data sent to clients. But I wouldn't mind to load partial entities from the database if it would solve some problem.

Remember that Domain-Driven Design also has a concept of services. For performing certain database queries, it's better to model the problem as a service than as a collection of child objects attached to a parent object.
A good example of this might be creating a report by accepting several user-entered parameters. It be easier to model this as:
CustomerReportService.GetOrdersByOrderDate(Customer theCustomer, Date cutoff);
Than like this:
myCustomer.OrdersCollection.SelectMatching(Date cutoff);
Or to put it another way, the DDD model you use for data entry does not have to be the same as the DDD model you use for reporting.
In highly scalable systems, it's common to separate these two concerns.

data access in DDD?

After reading Evan's and Nilsson's books I am still not sure how to manage Data access in a domain driven project. Should the CRUD methods be part of the repositories, i.e. OrderRepository.GetOrdersByCustomer(customer) or should they be part of the entities: Customer.GetOrders(). The latter approach seems more OO, but it will distribute Data Access for a single entity type among multiple objects, i.e. Customer.GetOrders(), Invoice.GetOrders(), ShipmentBatch.GetOrders() ,etc. What about Inserting and updating?

CRUD-ish methods should be part of the Repository...ish. But I think you should ask why you have a bunch of CRUD methods. What do they really do? What are they really for? If you actually call out the data access patterns your application uses I think it makes the repository a lot more useful and keeps you from having to do shotgun surgery when certain types of changes happen to your domain.
CustomerRepo.GetThoseWhoHaventPaidTheirBill()
// or
GetCustomer(new HaventPaidBillSpecification())
// is better than
foreach (var customer in GetCustomer()) {
/* logic leaking all over the floor */
}
"Save" type methods should also be part of the repository.
If you have aggregate roots, this keeps you from having a Repository explosion, or having logic spread out all over: You don't have 4 x # of entities data access patterns, just the ones you actually use on the aggregate roots.
That's my $.02.

DDD usually prefers the repository pattern over the active record pattern you hint at with Customer.Save.
One downside in the Active Record model is that it pretty much presumes a single persistence model, barring some particularly intrusive code (in most languages).
The repository interface is defined in the domain layer, but doesn't know whether your data is stored in a database or not. With the repository pattern, I can create an InMemoryRepository so that I can test domain logic in isolation, and use dependency injection in the application to have the service layer instantiate a SqlRepository, for example.
To many people, having a special repository just for testing sounds goofy, but if you use the repository model, you may find that you don't really need a database for your particular application; sometimes a simple FileRepository will do the trick. Wedding to yourself to a database before you know you need it is potentially limiting. Even if a database is necessary, it's a lot faster to run tests against an InMemoryRepository.
If you don't have much in the way of domain logic, you probably don't need DDD. ActiveRecord is quite suitable for a lot of problems, especially if you have mostly data and just a little bit of logic.

Let's step back for a second. Evans recommends that repositories return aggregate roots and not just entities. So assuming that your Customer is an aggregate root that includes Orders, then when you fetched the customer from its repository, the orders came along with it. You would access the orders by navigating the relationship from Customer to Orders.
customer.Orders;
So to answer your question, CRUD operations are present on aggregate root repositories.
CustomerRepository.Add(customer);
CustomerRepository.Get(customerID);
CustomerRepository.Save(customer);
CustomerRepository.Delete(customer);

I've done it both ways you are talking about, My preferred approach now is the persistent ignorant (or PONO -- Plain Ole' .Net Object) method where your domain classes are only worried about being domain classes. They do not know anything about how they are persisted or even if they are persisted. Of course you have to be pragmatic about this at times and allow for things such as an Id (but even then I just use a layer super type which has the Id so I can have a single point where things like default value live)
The main reason for this is that I strive to follow the principle of Single Responsibility. By following this principle I've found my code much more testable and maintainable. It's also much easier to make changes when they are needed since I only have one thing to think about.
One thing to be watchful of is the method bloat that repositories can suffer from. GetOrderbyCustomer.. GetAllOrders.. GetOrders30DaysOld.. etc etc. One good solution to this problem is to look at the Query Object pattern. And then your repositories can just take in a query object to execute.
I'd also strongly recommend looking into something like NHibernate. It includes a lot of the concepts that make Repositories so useful (Identity Map, Cache, Query objects..)

Even in a DDD, I would keep Data Access classes and routines separate from Entities.
Reasons are,
Testability improves
Separation of concerns and Modular design
More maintainable in the long run, as you add entities, routines
I am no expert, just my opinion.

The annoying thing with Nilsson's Applying DDD&P is that he always starts with "I wouldn't do that in a real-world-application but..." and then his example follows. Back to the topic: I think OrderRepository.GetOrdersByCustomer(customer) is the way to go, but there is also a discussion on the ALT.Net Mailing list (http://tech.groups.yahoo.com/group/altdotnet/) about DDD.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string