How to retrieve Aggregate Roots that don't have repositories? - domain-driven-design

Eric Evan's DDD book, pg. 152:
Provide Repositories only for AGGREGATE roots that actually need
direct access.
1.
Should Aggregate Roots that don't need direct access be retrieved and saved via repositories of those Aggregate Roots that do need direct access?
For example, if we have Customer and Order Aggregate roots and if for whatever reason we don't need direct access to Order AR, then I assume only way orders can be obtained is by traversing Customer.Orders property?
2.
When should ICustomerRepository retrieve orders? When Customer AR is retrieved ( via ICustomerRepository.GetCustomer ) or when we traverse Customer.GetOrders property?
3.
Should ICustomerRepository itself retrieve orders or should it delegate this responsibility to a IOrderRepository? If the latter, then one option would be to inject IOrderRepository into ICustomerRepository. But since outside code shouldn't know that IOrderRepository even exists ( if outside code was aware of its existence, then it may also use IOrderRepository directly ), how then should ICustomerRepository get a reference to IOrderREpository?
UPDATE:
1
With regards to implementation, if done with an ORM like NHibernate,
there is no need for an IOrderRepository.
a) Are you saying that when using ORM, we usually don't need to implement repositories, since ORMs implicitly provide them?
b) I do plan on learning one of ORM technologies ( probably EF ), but from little I did read on ORMs, it seems that if you want to completely decouple Domain or Application layers from Persistence layer, then these two layers shouldn't use ORM expressions, which also implies that ORM expressions and POCOs should exist only within Repository implementations?
c) If there is a scenario where for some reason AR root doesn't have a direct access ( and project doesn't use ORM ), what would your answer to 3. be?
thanks

I'm hard-pressed to think of an example where an aggregate does not require direct access. However, I think at the time of writing (circa 2003), the emphasis on limiting or eliminating traversable object references between aggregates wasn't as prevalent as it is today. Therefore, it could have been the case that a Customer aggregate would reference a collection of Order aggregates. In this scenario, there may be no need to reference an Order directly because traversal from Customer is acceptable.
With regards to implementation, if done with an ORM like NHibernate, there is no need for an IOrderRepository. The Order aggregate would simply have a mapping. Additionally, the mapping for Customer would specify that changes should cascade down to corresponding Order aggregates.
When should ICustomerRepository retrieve orders?
This is the question which raises concern over traversable object references between aggregates. A solution provided by ORM is lazy loading, but lazy loading can be problematic. Ideally, a customer's orders would only be retrieved when needed and this depends on context. My suggestion, therefore, is to avoid traversable references between aggregates and use a repository search instead.
UPDATE
a) You would still need something that implements ICustomerRepository, but the implementation would be largely trivial if the mappings are configured - you'd delegate to the ORM's API to implement each repository method. No need for a IOrderRepository however.
b) For full encapsulation, the repository interface would not contain anything ORM-specific. The repository implementation would adapt the repository contract to ORM specifics.
c) Hard to make a judgement on a scenario I can't picture, but it would seem there is no need for Order repository interface, you can still have an Order Repository to better separate responsibilities. No need for injection either, just have the Customer repo implementation create an instance of Order repo.

Related

Do we need another repo for each entity?

For example take an order entity. It's obvious that order lines don't exist without order. So we have to get them with the help of OrderRepository(throw an order entity). Ok. But what about other things that are loosely coupled with order? Should the customer info be available only from CustomerRepo and bank requisites of the seller available from BankRequisitesRepo, etc.? If it is correct, we should pass all these repositories to our Create Factory method I think.
Yes. In general, each major entity (aggregate root in domain driven design terminology) should have their own repositories. Child entities *like order lines) will generally not need one.
And yes. Define each repository as a service then inject them where needed.
You can even design things such that there is no direct coupling between Order and Customer in terms of an actual database link. This in turn allows customers and orders to live in completely independent databases. Which may or may not be useful for your applications.
You correctly understood that aggregate roots's (AR) child entities shall not have their own repository, unless they are themselves AR's. Having repositories for non-ARs would leave your invariants unprotected.
However you must also understand that entities should usually not be clustered together for convenience or just because the business states that some entity has one or many some other entity.
I strongly recommend that you read Effective Aggregate Design by Vaughn Vernon and this other blog post that Vaughn kindly wrote for a question I asked.
One of the aggregate design rule of thumb stated in Effective Aggregate Design is that you should usually reference other aggregates by identity only.
Therefore, you greatly reduce the number of AR instances needed in other AR's creationnal process since new Order(customer, ...) could become new Order(customerId, ...).
If you still find the need to query other AR's in one AR's creationnal process, then there's nothing wrong in injecting repositories as dependencies, but you should not depend on more than you need (e.g. let the client resolve the real dependencies and pass them directly rather than passing in a strategy allowing to resolve a dependency).

Few confusing things about globally accessible Value Objects

Quotes are from DDD: Tackling Complexity in the Heart of Software ( pg. 150 )
a)
global search access to a VALUE is often meaningles, because finding a
VALUE by its properties would be equivalent to creating a new instance
with those properties. There are exceptions. For example, when I am
planning travel online, I sometimes save a few prospective itineraries
and return later to select one to book. Those itineraries are VALUES
(if there were two made up of the same flights, I would not care which
was which), but they have been associated with my user name and
retrieved for me intact.
I don't understand author's reasoning as for why it would be more appropriate to make Itinierary Value Object globally accessible instead of clients having to globally search for Customer root entity and then traverse from it to this Itinierary object?
b)
A subset of persistent objects must be globaly accessible through a
search based on object attributes ... They are usualy ENTITIES,
sometimes VALUE OBJECTS with complex internal structure ...
Why is it more common for Values Objects with complex internal structure to be globally accesible rather than simpler Value Objects?
c) Anyways, are there some general guidelines on how to determine whether a particular Value Object should be made globally accessible?
UPDATE:
a)
There is no domain reason to make an itinerary traverse-able through
the customer entity. Why load the customer entity if it isn't needed
for any behavior? Queries are usually best handled without
complicating the behavioral domain.
I'm probably wrong about this, but isn't it common that when user ( Ie Customer root entity ) logs in, domain model retrieves user's Customer Aggregate?
And if users have an option to book flights, then it would also be common for them to check from time to time the Itineraries ( though English isn't my first language so the term Itinerary may actually mean something a bit different than I think it means ) they have selected or booked.
And since Customer Aggregate is already retrieved from the DB, why issue another global search for Itinerary ( which will probably search for it in DB ) when it was already retrieved together with Customer Aggregate?
c)
The rule is quite simple IMO - if there is a need for it. It doesn't
depend on the structure of the VO itself but on whether an instance of
a particular VO is needed for a use case.
But this VO instance has to be related to some entity ( ie Itinerary is related to particular Customer ), else as the author pointed out, instead of searching for VO by its properties, we could simply create a new VO instance with those properties?
SECOND UPDATE:
a) From your link:
Another method for expressing relationships is with a repository.
When relationship is expressed via repository, do you implement a SalesOrder.LineItems property ( which I doubt, since you advise against entities calling repositories directly ), which in turns calls a repository, or do you implement something like SalesOrder.MyLineItems(IOrderRepository repo)? If the latter, then I assume there is no need for SalesOrder.LineItems property?
b)
The important thing to remember is that aggregates aren't meant to be
used for displaying data.
True that domain model doesn't care what upper layers will do with the data, but if not using DTO's between Application and UI layers, then I'd assume UI will extract the data to display from an aggregate ( assuming we sent to UI whole aggregate and not just some entity residing within it )?
Thank you
a) There is no domain reason to make an itinerary traverse-able through the customer entity. Why load the customer entity if it isn't needed for any behavior? Queries are usually best handled without complicating the behavioral domain.
b) I assume that his reasoning is that complex value objects are those that you want to query since you can't easily recreate them. This issue and all query related issues can be addressed with the read-model pattern.
c) The rule is quite simple IMO - if there is a need for it. It doesn't depend on the structure of the VO itself but on whether an instance of a particular VO is needed for a use case.
UPDATE
a) It is unlikely that a customer aggregate would have references to the customer's itineraries. The reason is that I don't see how an itinerary would be related to behaviors that would exist in the customer aggregate. It is also unnecessary to load the customer aggregate at all if all that is needed is some data to display. However, if you do load the aggregate and it does contain reference data that you need you may as well display it. The important thing to remember is that aggregates aren't meant to be used for displaying data.
c) The relationship between customer and itinerary could be expressed by a shared ID - each itinerary would have a customerId. This would allow lookup as required. However, just because these two things are related it does not mean that you need to traverse customer to get to the related entities or value objects for viewing purposes. More generally, associations can be implemented either as direct references or via repository search. There are trade-offs either way.
UPDATE 2
a) If implemented with a repository, there is no LineItems property - no direct references. Instead, to obtain a list of line items a repository is called.
b) Or you can create a DTO-like object, a read-model, which would be returned directly from the repository. The repository can in turn execute a simple SQL query to get all required data. This allows you to get to data that isn't part of the aggregate but is related. If an aggregate does have all the data needed for a view, then use that aggregate. But as soon as you have a need for more data that doesn't concern the aggregate, switch to a read-model.

How should I enforce relationships and constraints between aggregate roots?

I have a couple questions regarding the relationship between references between two aggregate roots in a DDD model. Refer to the typical Customer/Order model diagrammed below.
First, should references between the actual object implementation of aggregates always be done through ID values and not object references? For example if I want details on the customer of an Order I would need to take the CustomerId and pass it to a ICustomerRepository to get a Customer rather then setting up the Order object to return a Customer directly correct? I'm confused because returning a Customer directly seems like it would make writing code against the model easier, and is not much harder to setup if I am using an ORM like NHibernate. Yet I'm fairly certain this would be violating the boundaries between aggregate roots/repositories.
Second, where and how should a cascade on delete relationship be enforced for two aggregate roots? For example say I want all the associated orders to be deleted when a customer is deleted. The ICustomerRepository.DeleteCustomer() method should not be referencing the IOrderRepostiory should it? That seems like that would be breaking the boundaries between the aggregates/repositories? Should I instead have a CustomerManagment service which handles deleting Customers and their associated Orders which would references both a IOrderRepository and ICustomerRepository? In that case how can I be sure that people know to use the Service and not the repository to delete Customers. Is that just down to educating them on how to use the model correctly?
First, should references between aggregates always be done through ID values and not actual object references?
Not really - though some would make that change for performance reasons.
For example if I want details on the customer of an Order I would need to take the CustomerId and pass it to a ICustomerRepository to get a Customer rather then setting up the Order object to return a Customer directly correct?
Generally, you'd model 1 side of the relationship (eg., Customer.Orders or Order.Customer) for traversal. The other can be fetched from the appropriate Repository (eg., CustomerRepository.GetCustomerFor(Order) or OrderRepository.GetOrdersFor(Customer)).
Wouldn't that mean that the OrderRepository would have to know something about how to create a Customer? Wouldn't that be beyond what OrderRepository should be responsible for...
The OrderRepository would know how to use an ICustomerRepository.FindById(int). You can inject the ICustomerRepository. Some may be uncomfortable with that, and choose to put it into a service layer - but I think that's overkill. There's no particular reason repositories can't know about and use each other.
I'm confused because returning a Customer directly seems like it would make writing code against the model easier, and is not much harder to setup if I am using an ORM like NHibernate. Yet I'm fairly certain this would be violating the boundaries between aggregate roots/repositories.
Aggregate roots are allowed to hold references to other aggregate roots. In fact, anything is allowed to hold a reference to an aggregate root. An aggregate root cannot hold a reference to a non-aggregate root entity that doesn't belong to it, though.
Eg., Customer cannot hold a reference to OrderLines - since OrderLines properly belongs as an entity on the Order aggregate root.
Second, where and how should a cascade on delete relationship be enforced for two aggregate roots?
If (and I stress if, because it's a peculiar requirement) that's actually a use case, it's an indication that Customer should be your sole aggregate root. In most real-world systems, however, we wouldn't actually delete a Customer that has associated Orders - we may deactivate them, move their Orders to a merged Customer, etc. - but not out and out delete the Orders.
That being said, while I don't think it's pure-DDD, most folks will allow some leniency in following a unit of work pattern where you delete the Orders and then the Customer (which would fail if Orders still existed). You could even have the CustomerRepository do the work, if you like (though I'd prefer to make it more explicit myself). It's also acceptable to allow the orphaned Orders to be cleaned up later (or not). The use case makes all the difference here.
Should I instead have a CustomerManagment service which handles deleting Customers and their associated Orders which would references both a IOrderRepository and ICustomerRepository? In that case how can I be sure that people know to use the Service and not the repository to delete Customers. Is that just down to educating them on how to use the model correctly?
I probably wouldn't go a service route for something so intimately tied to the repository. As for how to make sure a service is used...you just don't put a public Delete on the CustomerRepository. Or, you throw an error if deleting a Customer would leave orphaned Orders.
Another option would be to have a ValueObject describing the association between the Order and the Customer ARs, VO which will contain the CustomerId and additional information you might need - name,address etc (something like ClientInfo or CustomerData).
This has several advantages:
Your ARs are decoupled - and now can be partitioned, stored as event streams etc.
In the Order ARs you usually need to keep the information you had about the customer at the time of the order creation and not reflect on it any future changes made to the customer.
In almost all the cases the information in the value object will be enough to perform the read operations ( display customer info with the order ).
To handle the Deletion/deactivation of a Customer you have the freedom to chose any behavior you like. You can use DomainEvents and publish a CustomerDeleted event for which you can have a handler that moves the Orders to an archive, or deletes them or whatever you need. You can also perform more than one operation on that event.
If for whatever reason DomainEvents are not your choice you can have the Delete operation implemented as a service operation and not as a repository operation and use a UOW to perform the operations on both ARs.
I have seen a lot of problems like this when trying to do DDD and i think that the source of the problems is that developers/modelers have a tendency to think in DB terms. You ( we :) ) have a natural tendency to remove redundancy and normalize the domain model. Once you get over it and allow your model to evolve and implicate the domain expert(s) in it's evolution you will see that it's not that complicated and it's quite natural.
UPDATE: and a similar VO - OrderInfo can be placed inside the Customer AR if needed, with only the needed information - order total, order items count etc.

Aggregate roots depend on the use case so does that mean that we might end up with really a lots of repositories?

Ive heard a lots that aggregate roots depend on the use case. But what does that mean in coding context ?
You have a service class which offcourse hold methods (use cases) that gonna accomplish something in a repository. Great, so you use a repository which is equal to an aggregate root to perform your querying.
Now you need to perform some other kind of operation which use totally different use case than the first service class but use the same entities.
Here the representation :
Entities: Customer, Orders, LineOrder
Service 1: Add new customers, Delete some customers, retrieve customer orders
Here the aggregate root seem to be Customer because you need this repository to perform thoses use cases.
Service 2: Retrieve customer from an actual order
Here the aggregate root seem to be Order because you need this repository to perform this use case.
If i am wrong please correct me. Now that mean you have 2 aggregates roots.
Now my question is, since aggregate roots depend on the use case does that mean that we might end up with really a lots of repositories if you end up having lots of use cases ?
The above example was probably not the best example... so lets say we have a Journal which hold JournalEntries which each entries hold Tasks, Problems and Notes. (This is in the context of telling to a system what have been done to a project)
Does that mean that im gonna end up with 2 repository ? (Journal, JournalEntry)
In the use cases where i need to add new tasks, problems and notes from an journal entry ?
(Can be seen as a service)
Or might end up with 4 repository. (Journal, Task, Problems, Notes)
In the use cases where i need to access directment task, problems and notes ?
(Can be seen as another service)
But that would mean if i need both of theses services (that actually hold the use cases) that i actually need 5 repository to be able to perform use cases in both of them ?
Thanks.
Hi I saw your post and thought I may give you my opion. First I must say I've been doing DDD in project for three years now, so I'm not an expert. But I'm currently working in a project as an architect an coaching developers in DDD, and I must say it isn't a walk in the park... I don't know how many times I've refactored the model and Entity relationships.
But my experience is that you endup with some repositories (more than few but not many). My Aggregates usually contains a few classes and the Aggregate object graph isn't that deep (if you know what I mean).
But I try to be concrete:
1) Aggregate roots are defined by your needs. I mean if you feel that you need that Tasks object through Journal to often, then maybe thats a sign for it to be upgraded as a aggregate root.
2) But everything cannot be aggregate roots, so try to capsulate object that are tight related. Notes seems like a candidate for being own by a root object. You'd probably always relate Notes to the root or it loses its context. Notes cannot live by itself.
3) Remember that Aggregates are used for splitting up large complex domains into smaller "islands" that take care of thier inhabbitants. Its important to not make your domain more complex than it is.
4) You don't know how your model look likes before you've reached far into the project implementation phase. If you realize that some repositories aren't used that much, they may be candidates for merging into other root object (if they have that kind of relationship). You can break out objects that are used so much through root object without its context. I mean for example if Journal are aggregate root and contains Notes and Tasks. After a while you model grows and maybe Tasks
have assoications to Action and ActionHistory and User and Rule and Permission. Now I just throw out a bunch om common objects in a rule/action/user permission functionality. Maybe this result in usecases that approach Tasks from another angle, "View all Tasks performed by this User" etc. Tasks get more involved in some kind of State/Workflow engine and therefor candidates for being an aggregate root itself.
Okey. Not the best example but it maybe gives you the idea. A root object can contain children where some of its children can also be root object because we need it in another context (than journal).
But I have myself banged my head against the wall everytime you startup with a fresh model. Just go with the flow and let the model evolve itself through its clients/subsribers. You refine the model through its usage. The Services (application services and not domain services) are of course extended with methods that respond to UI and usecases (often one-to-one).
I hope I helped you in someway...or not :D
Yes, you would most likely end up with 5 repositories (Journal, JournalEntry, Task, Problems, Notes). Your services would then use these repositories to perform CRUD for each type of entity.
Your reaction of "wow so many repositories" is not uncommon for developers new to DDD.
However, your repositories are usually light weight assuming your model and DB schema are fairly evenly matched which is often the case. If you use an ORM such as nHibernate or a tool such as codesmith generator then it gets even easier to create your repositories.
At first you need to define what is aggregate. I don't know about use case aggregates.
I know about aggregates following...
Aggregates are union of several entities. One of the entities is the aggregate root, the rest entities (or value types) have sense only in selected aggregate root context. For example you can define Order and OrderLine as an aggregate if you don't need to do any independent actions with OrderLine entities. It means that OrderLine makes sense in Order context only.
Why to define aggregates at all? It is required to reduce references between objects. That will simplify you domain model.
And of course you don't need to have OrderLineRepository if OrderLine is a part of Order aggregate.
Here is a link with more information. You can read Eric Evans DDD book. He explains aggregates very well.

Pros and cons of DDD Repositories

Pros:
Repositories hide complex queries.
Repository methods can be used as transaction boundaries.
ORM can easily be mocked
Cons:
ORM frameworks offer already a collection like interface to persistent objects, what is the intention of repositories. So repositories add extra complexity to the system.
combinatorial explosion when using findBy methods. These methods can be avoided with Criteria objects, queries or example objects. But to do that no repository is needed because a ORM already supports these ways to find objects.
Since repositories are a collection of aggregate roots (in the sense of DDD), one have to create and pass around aggregate roots even if only a child object is modified.
Questions:
What pros and cons do you know?
Would you recommend to use repositories? (Why or why not?)
The main point of a repository (as in Single Responsibility Principle) is to abstract the concept of getting objects that have identity. As I've become more comfortable with DDD, I haven't found it useful to think about repositories as being mainly focused on data persistence but instead as factories that instantiate objects and persist their identity.
When you're using an ORM you should be using their API in as limited a way as possible, giving yourself a facade perhaps that is domain specific. So regardless your domain would still just see a repository. The fact that it has an ORM on the other side is an "implementation detail".
Repository brings domain model into focus by hiding data access details behind an interface that is based on ubiquitous language. When designing repository you concentrate on domain concepts, not on data access. From the DDD perspective, using ORM API directly is equivalent to using SQL directly.
This is how repository may look like in the order processing application:
List<Order> myOrders = Orders.FindPending()
Note that there are no data access terms like 'Criteria' or 'Query'. Internally 'FindPending' method may be implemented using Hibernate Criteria or HQL but this has nothing to do with DDD.
Method explosion is a valid concern. For example you may end up with multiple methods like:
Orders.FindPending()
Orders.FindPendingByDate(DateTime from, DateTime to)
Orders.FindPendingByAmount(Money amount)
Orders.FindShipped()
Orders.FindShippedOn(DateTime shippedDate)
etc
This can improved by using Specification pattern. For example you can have a class
class PendingOrderSpecification{
PendingOrderSpecification WithAmount(Money amount);
PendingOrderSpecification WithDate(DateTime from, DateTime to)
...
}
So that repository will look like this:
Orders.FindSatisfying(PendingOrderSpecification pendingSpec)
Orders.FindSatisfying(ShippedOrderSpecification shippedSpec)
Another option is to have separate repository for Pending and Shipped orders.
A repository is really just a layer of abstraction, like an interface. You use it when you want to decouple your data persistence implementation (i.e. your database).
I suppose if you don't want to decouple your DAL, then you don't need a repository. But there are many benefits to doing so, such as testability.
Regarding the combinatorial explosion of "Find" methods: in .NET you can return an IQueryable instead of an IEnumerable, and allow the calling client to run a Linq query on it, instead of using a Find method. This provides flexibility for the client, but sacrifices the ability to provide a well-defined, testable interface. Essentially, you trade off one set of benefits for another.

Resources