I am going to implement repository pattern to my project using Node.js, I am new to DDD, but I have read a lot of it. I understand repository pattern should be keep simple and deals with aggregate.
But I am wondering how should I load entity relations? in ORM we usually have eager loading and lazy loading. But since Nodejs retrieve data asynchronously, I think lazy loading is not possible.
Should I encapsulate eager inside the repository? or is it better if I make a parameter to define what relations to be included?
For example:
class GenericRepository {
find({ select, where, includes, orderBy }) {
// Code
}
}
If i define a method like that, isn't it like reinventing ORM function?
Please give me your opinions.
Thanks.
In DDD there is not such thing as lazy loading an Aggregate or parts of it. Before executing a command, the Aggregate must be fully loaded from the Repository. If you feel that in most cases it does not need to be fully loaded maybe your Aggregate boundaries are wrong and you have an Aggregate that is too big.
Regarding the Aggregate repositories in JavaScript, the simplest solution is to use a document database like MongoDB that persist an JavaScript object as it is, with minimum number of transformations.
For read (or display) scenarios, you can load only some attributes of the Aggregate but be careful to not break the Aggregate encapsulation as you start to depend on Aggregate internal properties.
The things are simpler (from this perspective) in CQRS architectures because the Aggregate corresponds only to the write-model and you can have any number of perfect-fitted/optimized read-models.
Related
I'm new to DDD. Now I've an aggregate Team and entities TeamMember:
class Team {
members: Map<TeamMemberId, TeamMember>;
add(member) {
assert(!members.has(member.id), "Team Member is already exists");
this.members.set(member.id, member);
}
}
When I execute AddTeamMemberCommand, the repository will load the entire aggregate from MongoDB.
when the team is large, this may seem unacceptable.
I found the following from google and stackoverflow:
Use Id references instead of entities
Lazy load
Redesign aggregation
....
I'm not sure which solution is right for me, or is there a better, more generic solution for this scenario? Is there a GitHub sample project i can look at?
Thank you very much.
is there a better, more generic solution for this scenario?
For reads/queries, where you aren't going to be changing the aggregate at all, lazy loading is fine. In the world of CQRS, we might even avoid loading the aggregate altogether, instead just fetching a read-only copy of the information that we need.
An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes.
If we're trying to make a change to an aggregate, and are tempted to leave a bunch of unnecessary information unloaded, that may imply that our aggregate boundaries are in the wrong places, and that we should be re-designing our domain model so that the loaded information better fits with what we need.
For example, if you are trying to update just Bob, not the entire team, then that may be a hint that Bob isn't an entity inside of the Team aggregate, but is instead belongs in a different, smaller aggregate, which has some relationship with team.
Mauro Servienti's talk on aggregate boundaries may be a good starting point.
In DDD root of an aggregate is the only reference to retrieve its child objects. Repository of root of an aggregate is responsible for giving the root object reference only. If I need child objects then need to call a getter method of the aggregate to retrieve the child objects which results in a DB query.
Consider a case where I am retrieving multiple aggregates from DB. So in my case this situation results in multiple DB queries which leads a very slow request. How to avoid this in terms of DDD. For persisting I came across a pattern called Unit Of Work. Is there any pattern for the search which resolves my problem or any other way to do this.
First of all, 95% of problems are solved by your ORM (if you happen to use relational database).
Aggregate root repository should (in most cases) return a fully loaded object with all child objects (entities). Lazy loading children should be an exception, not a rule.
Another thing is, you should avoid loading and persisting multiple aggregates at a time. Try repartitioning you domain so that each user interaction deals with only one aggregate.
And consider a document database solution. It really makes sanes to store whole aggregates as documents in a doc database.
Okey it seems like you have a scenario where you in a single use case want to read from several AR and also savee their state into DB. Is the read operation taking to long? or is it both read and write that takes time?
Your domain model and Aggregate roots should be partly defined through interation from use cases. What I'm saying is that, the model should be designed so it suits your clients needs. This scenario seems not like one that fits your model well.
Reports or other operations that uses a large data view should be bypasses the domain model. Don't use DDD for reports etc. Just do a fast data access.
Second. Unit of work is one way to go, if you want all aggregates to participate in a transaction.
Third. I would say, Use Lazy loading, but some use cases that need performance boost you can do a loading strategy which means you let the root load some child collections without having sql-sub-selects firing...
look at this article http://weblogs.asp.net/fredriknormen/archive/2010/07/25/loading-strategy-for-entity-framework-4-0.aspx (even its for EF pattern works well for NH ORM)
Then at last you can always provide db indexes, caching etc to boost perfomance, but given the scenario info, you have takensome kind of wrong design desicion. I don't havee all the facts but maybe some use cases aren't suitable for
I find DDD excellent when it comes to any kind of write operation. For Querying data instead, it only poses unnecessary restrictions.
I would strongly recommend using CQRS as general architecture pattern. This would allow you to create specific Query Models for your Views and leave DDD for input validation and Command execution.
Pros:
Repositories hide complex queries.
Repository methods can be used as transaction boundaries.
ORM can easily be mocked
Cons:
ORM frameworks offer already a collection like interface to persistent objects, what is the intention of repositories. So repositories add extra complexity to the system.
combinatorial explosion when using findBy methods. These methods can be avoided with Criteria objects, queries or example objects. But to do that no repository is needed because a ORM already supports these ways to find objects.
Since repositories are a collection of aggregate roots (in the sense of DDD), one have to create and pass around aggregate roots even if only a child object is modified.
Questions:
What pros and cons do you know?
Would you recommend to use repositories? (Why or why not?)
The main point of a repository (as in Single Responsibility Principle) is to abstract the concept of getting objects that have identity. As I've become more comfortable with DDD, I haven't found it useful to think about repositories as being mainly focused on data persistence but instead as factories that instantiate objects and persist their identity.
When you're using an ORM you should be using their API in as limited a way as possible, giving yourself a facade perhaps that is domain specific. So regardless your domain would still just see a repository. The fact that it has an ORM on the other side is an "implementation detail".
Repository brings domain model into focus by hiding data access details behind an interface that is based on ubiquitous language. When designing repository you concentrate on domain concepts, not on data access. From the DDD perspective, using ORM API directly is equivalent to using SQL directly.
This is how repository may look like in the order processing application:
List<Order> myOrders = Orders.FindPending()
Note that there are no data access terms like 'Criteria' or 'Query'. Internally 'FindPending' method may be implemented using Hibernate Criteria or HQL but this has nothing to do with DDD.
Method explosion is a valid concern. For example you may end up with multiple methods like:
Orders.FindPending()
Orders.FindPendingByDate(DateTime from, DateTime to)
Orders.FindPendingByAmount(Money amount)
Orders.FindShipped()
Orders.FindShippedOn(DateTime shippedDate)
etc
This can improved by using Specification pattern. For example you can have a class
class PendingOrderSpecification{
PendingOrderSpecification WithAmount(Money amount);
PendingOrderSpecification WithDate(DateTime from, DateTime to)
...
}
So that repository will look like this:
Orders.FindSatisfying(PendingOrderSpecification pendingSpec)
Orders.FindSatisfying(ShippedOrderSpecification shippedSpec)
Another option is to have separate repository for Pending and Shipped orders.
A repository is really just a layer of abstraction, like an interface. You use it when you want to decouple your data persistence implementation (i.e. your database).
I suppose if you don't want to decouple your DAL, then you don't need a repository. But there are many benefits to doing so, such as testability.
Regarding the combinatorial explosion of "Find" methods: in .NET you can return an IQueryable instead of an IEnumerable, and allow the calling client to run a Linq query on it, instead of using a Find method. This provides flexibility for the client, but sacrifices the ability to provide a well-defined, testable interface. Essentially, you trade off one set of benefits for another.
If my understanding of Aggregate Roots is correct, the root should be responsible also for deleting one of its "children". That would seemingly translate into something like this:
order.removeOrderLine(23);
Which would effectively remove it from the collection. However, how is this persisted? Is my ORM's UnitOfWork supposed to detect that something went missing in that collection, and then delete it from the database?
Should I have removeOrderLine a method of the OrderRepository instead?
Your Unit of Work should usually take care of this, but it depends on its implementation, specifically on the way it detects changes. Some unit of work implementations (i.e. Hibernate) keeps a copy of the aggregate before you changed it, so at the end of business transaction (when you call something like unitOfWork.PersistAll()), it tries to match current version of all objects (and collections) against the original version.
Other way is to have your domain entities more coupled with your unit of work so that the entity notified the unit of work when something changes (i.e. the order.removeOrderLine method would notify the unit of work about the change).
There are multiple ways how to implement UoW change detection. Have a look at several implementations for hibernat for inspiration.
After reading Evan's and Nilsson's books I am still not sure how to manage Data access in a domain driven project. Should the CRUD methods be part of the repositories, i.e. OrderRepository.GetOrdersByCustomer(customer) or should they be part of the entities: Customer.GetOrders(). The latter approach seems more OO, but it will distribute Data Access for a single entity type among multiple objects, i.e. Customer.GetOrders(), Invoice.GetOrders(), ShipmentBatch.GetOrders() ,etc. What about Inserting and updating?
CRUD-ish methods should be part of the Repository...ish. But I think you should ask why you have a bunch of CRUD methods. What do they really do? What are they really for? If you actually call out the data access patterns your application uses I think it makes the repository a lot more useful and keeps you from having to do shotgun surgery when certain types of changes happen to your domain.
CustomerRepo.GetThoseWhoHaventPaidTheirBill()
// or
GetCustomer(new HaventPaidBillSpecification())
// is better than
foreach (var customer in GetCustomer()) {
/* logic leaking all over the floor */
}
"Save" type methods should also be part of the repository.
If you have aggregate roots, this keeps you from having a Repository explosion, or having logic spread out all over: You don't have 4 x # of entities data access patterns, just the ones you actually use on the aggregate roots.
That's my $.02.
DDD usually prefers the repository pattern over the active record pattern you hint at with Customer.Save.
One downside in the Active Record model is that it pretty much presumes a single persistence model, barring some particularly intrusive code (in most languages).
The repository interface is defined in the domain layer, but doesn't know whether your data is stored in a database or not. With the repository pattern, I can create an InMemoryRepository so that I can test domain logic in isolation, and use dependency injection in the application to have the service layer instantiate a SqlRepository, for example.
To many people, having a special repository just for testing sounds goofy, but if you use the repository model, you may find that you don't really need a database for your particular application; sometimes a simple FileRepository will do the trick. Wedding to yourself to a database before you know you need it is potentially limiting. Even if a database is necessary, it's a lot faster to run tests against an InMemoryRepository.
If you don't have much in the way of domain logic, you probably don't need DDD. ActiveRecord is quite suitable for a lot of problems, especially if you have mostly data and just a little bit of logic.
Let's step back for a second. Evans recommends that repositories return aggregate roots and not just entities. So assuming that your Customer is an aggregate root that includes Orders, then when you fetched the customer from its repository, the orders came along with it. You would access the orders by navigating the relationship from Customer to Orders.
customer.Orders;
So to answer your question, CRUD operations are present on aggregate root repositories.
CustomerRepository.Add(customer);
CustomerRepository.Get(customerID);
CustomerRepository.Save(customer);
CustomerRepository.Delete(customer);
I've done it both ways you are talking about, My preferred approach now is the persistent ignorant (or PONO -- Plain Ole' .Net Object) method where your domain classes are only worried about being domain classes. They do not know anything about how they are persisted or even if they are persisted. Of course you have to be pragmatic about this at times and allow for things such as an Id (but even then I just use a layer super type which has the Id so I can have a single point where things like default value live)
The main reason for this is that I strive to follow the principle of Single Responsibility. By following this principle I've found my code much more testable and maintainable. It's also much easier to make changes when they are needed since I only have one thing to think about.
One thing to be watchful of is the method bloat that repositories can suffer from. GetOrderbyCustomer.. GetAllOrders.. GetOrders30DaysOld.. etc etc. One good solution to this problem is to look at the Query Object pattern. And then your repositories can just take in a query object to execute.
I'd also strongly recommend looking into something like NHibernate. It includes a lot of the concepts that make Repositories so useful (Identity Map, Cache, Query objects..)
Even in a DDD, I would keep Data Access classes and routines separate from Entities.
Reasons are,
Testability improves
Separation of concerns and Modular design
More maintainable in the long run, as you add entities, routines
I am no expert, just my opinion.
The annoying thing with Nilsson's Applying DDD&P is that he always starts with "I wouldn't do that in a real-world-application but..." and then his example follows. Back to the topic: I think OrderRepository.GetOrdersByCustomer(customer) is the way to go, but there is also a discussion on the ALT.Net Mailing list (http://tech.groups.yahoo.com/group/altdotnet/) about DDD.