I have some middleware code that fetches a list of products from an external API. I model the response and return it to clients of my code.
Clients of my code do not care about the specifics of the individual products returned: they simply want the collection of products.
How would that be modelled using DDD?
Should each product property be a value object, each product an entity, and a repository contain all of the products?
Why not use CQRS (https://learn.microsoft.com/en-us/azure/architecture/patterns/cqrs)?
Separate your models into read and write models. In your case, read models will do. Make them POCOs. On the read side we do not need the DDD tactical modeling tools.
For more info, visit the link I provided.
I think you are almost there. Your middleware (the part that talks to the external API) could be a repository, with find methods that return Product models.
It is recommended that a repository be an interface (e.g. ProductRepository) to make the code more testable. You could have a simple implementation for tests (e.g. ProductRepositoryTestImpl) and a main implementation for the middleware communication (e.g. ProductRepositoryImpl).
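As a rough illustration of the interface-plus-two-implementations idea, here is a minimal Python sketch. The Product fields (sku, name), the find_all method and the InMemoryProductRepository class are assumptions made up for the example; the real shape depends on your external API.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Product:
    # Hypothetical fields; the real ones depend on the external API.
    sku: str
    name: str


class ProductRepository(Protocol):
    """The contract clients depend on; it says nothing about HTTP or JSON."""

    def find_all(self) -> list[Product]: ...


class InMemoryProductRepository:
    """Test implementation (plays the ProductRepositoryTestImpl role)."""

    def __init__(self, products: list[Product]) -> None:
        self._products = list(products)

    def find_all(self) -> list[Product]:
        return list(self._products)


def product_names(repository: ProductRepository) -> list[str]:
    # Client code sees only the interface, so any implementation fits.
    return [product.name for product in repository.find_all()]


repo = InMemoryProductRepository([Product("sku-1", "Kettle"), Product("sku-2", "Toaster")])
names = product_names(repo)
```

The main implementation would satisfy the same find_all contract but call the external API internally, so client code and tests never need to change.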
For packaging, I prefer this:
domain
  \model
    \product
      |Product
      |ProductRepository
infrastructure
  \persistence
    \YOUR_EXTERNAL_API_NAME
      |ProductRepositoryImpl
    \eclipselink
      ...
    \test
      \...
        |ProductRepositoryTestImpl
You should see the external API as an external bounded context. Your local bounded context will use an anti-corruption layer that translates terms from the remote to the local bounded context. So your code is in fact an anti-corruption layer.
Now, should you persist those products as entities or as value objects? This depends on your local usage: do you modify those products or not? If you don't modify them, then they are value objects.
In any case, you will probably have to use a repository to persist/retrieve the products.
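A sketch of what the translation step of such an anti-corruption layer could look like in Python. The remote field names (itemCode, title, price) and the local Product shape are invented for the example; only the idea of translating remote terms into an immutable local value object comes from the answer above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Product:
    """Local value object: immutable and compared by value."""
    sku: str
    name: str
    price_cents: int


def translate_product(remote: dict) -> Product:
    # Hypothetical remote field names; adjust to the real API response.
    return Product(
        sku=str(remote["itemCode"]),
        name=remote["title"].strip(),
        price_cents=int(Decimal(str(remote["price"])) * 100),
    )


def translate_products(payload: list[dict]) -> list[Product]:
    # The rest of the local context only ever sees Product objects.
    return [translate_product(item) for item in payload]


payload = [{"itemCode": 42, "title": " Kettle ", "price": "19.99"}]
products = translate_products(payload)
```

Because Product is frozen, nothing downstream can modify it, which matches the "if you don't modify them, they are value objects" rule of thumb.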
While practicing DDD in my software projects, I have always faced the question: "Why should I implement my business rules in the entities? Aren't they supposed to be pure data models?"
Note that, from my understanding of DDD, domain models can consist of persistent models as well as value objects.
I have come up with a solution in which I separate my persistent models from my domain models. On the other hand, we have data transfer objects (DTOs), so we have 3 layers of data mapping: database to persistence model, persistence model to domain model, and domain model to DTO. In my opinion, my solution is not an efficient one, as too much effort must be put into it.
Is there a better practice to achieve this goal?
Disclaimer: this answer is a little larger than the question, but that is needed to understand the problem; it is also 100% based on my experience.
What you are feeling is normal; I had the same feeling some time ago. It is caused by a combination of architecture, programming language and framework. You should try to choose these tools so that they give you the code that is easiest to change. If you have to change 3 classes for each field added to an entity, then this would be a nightmare in a large project (i.e. 50+ entity types).
The problem is that you have multiple DTOs per entity/concept.
The heaviest architecture that I used was the Classic layered architecture; the strict version was the hardest (in the strict version a layer may access only the layer that is just before it; i.e. the User interface may access only the Application). It involved a lot of DTOs and translations as the data moved from the Infrastructure to the UI. The testing was also hard as I had to use a lot of mocking.
Then I inverted the dependency, so that the Domain would not depend on the Infrastructure. For this I defined interfaces in the Domain layer that were implemented in the Infrastructure. But I still needed to use mocking for them. Also, the Aggregates were not pure and they had side effects (because they called the Infrastructure, even if it was abstracted by interfaces).
Then I moved the Domain to the very bottom. This made my Aggregates pure. I no longer needed to use mocking. But I still needed DTOs (returned by the Application layer to the UI and those used by the ORM).
Then I made the first leap: CQRS. This splits the models in two: the write model and the read model. The important thing is that you don't need to use DTOs for models anymore. The Aggregate (the write model) can be serialized as it is or converted to JSON and stored in almost any database. Vaughn Vernon has a blog post about this.
But the nicest are the read models. You can create a read model for each use case. Being a model used only for reads/queries, it can be as simple/dumb as possible. The read entities contain only query-related behavior. With the right persistence they can be persisted as they are. For example, if you use MongoDB (or any document database) with a simple reflection-based serializer, you can have a very thin architecture. Thanks to the domain events, you won't need to use JOINs; you can have full data denormalization (the read entities include all the data they need).
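To make the read-model idea concrete, here is a small Python sketch of a projection that builds one denormalized document per order from domain events. The event types (OrderPlaced, OrderShipped) and the document fields are assumptions for illustration, and a plain dict stands in for a document database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OrderPlaced:
    order_id: str
    customer_name: str
    total_cents: int


@dataclass(frozen=True)
class OrderShipped:
    order_id: str


class OrderSummaryProjection:
    """Read model: one denormalized document per order, no joins on read."""

    def __init__(self) -> None:
        self.documents: dict[str, dict] = {}

    def apply(self, event: object) -> None:
        if isinstance(event, OrderPlaced):
            # The customer name is copied in, so queries need no join.
            self.documents[event.order_id] = {
                "order_id": event.order_id,
                "customer_name": event.customer_name,
                "total_cents": event.total_cents,
                "status": "pending",
            }
        elif isinstance(event, OrderShipped):
            self.documents[event.order_id]["status"] = "shipped"

    def find(self, order_id: str) -> dict:
        return self.documents[order_id]


projection = OrderSummaryProjection()
for event in [OrderPlaced("o-1", "Ada", 2500), OrderShipped("o-1")]:
    projection.apply(event)
summary = projection.find("o-1")
```

The document is already in the shape the query needs, so it could be serialized and returned as it is.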
The second leap is Event sourcing. With this you don't need a flat persistence for the Aggregates. They are rehydrated from the Event store each time they handle a command.
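Rehydration from an event store can be sketched in a few lines of Python. The Account aggregate, its events and the list used as an event store are all hypothetical; the point is only that state is rebuilt by replaying events before a command is handled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class Withdrawn:
    amount: int


class Account:
    """Hypothetical aggregate whose state exists only as a fold over events."""

    def __init__(self) -> None:
        self.balance = 0

    @classmethod
    def rehydrate(cls, history) -> "Account":
        account = cls()
        for event in history:
            account._apply(event)
        return account

    def withdraw(self, amount: int) -> list:
        # Handle a command: enforce the rule, then return the new events.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        event = Withdrawn(amount)
        self._apply(event)
        return [event]

    def _apply(self, event) -> None:
        if isinstance(event, Deposited):
            self.balance += event.amount
        elif isinstance(event, Withdrawn):
            self.balance -= event.amount


event_store = [Deposited(100), Withdrawn(30)]     # stand-in for a real event store
account = Account.rehydrate(event_store)          # replay history
event_store.extend(account.withdraw(50))          # persist only the new events
```

No flat table for the aggregate is involved: replaying event_store again yields the same balance.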
You still have DTOs (commands, events, read models) but there is only one DTO per entity/concept.
Regarding the elimination of DTOs used by the Presentation: you can use something like GraphQL.
All of the above can be made worse by the programming language and framework. Strongly typed programming languages force you to create a type for each custom returned value. Some frameworks force you to return a custom serializable type from REST over HTTP endpoints (in this way you can have self-describing REST endpoints using reflection). In PHP you can simply use arrays with string keys as the value returned by a REST controller.
P.S.
By DTO I mean a class with data and no behavior.
I'm not saying that we all should use CQRS, just that you should know that it exists.
Why should I implement my business rules in the entities? Aren't they supposed to be pure data models?
Your persistence entities should be pure data models. Your domain entities describe behaviors. They aren't the same thing; it is a common pattern to have a bit of logic within the repository to change one into the other.
The cleanest way I know of to manage things is to treat the persistent entity as a value object to be managed by the domain entity, and to use something like a data mapper for transitions between domain and persistence.
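A minimal Python sketch of the split between a persistence model, a domain entity and a data mapper. UserRecord, User and UserMapper are hypothetical names, and the 0/1 storage of the active flag is just an example of a persistence detail the domain should not see.

```python
from dataclasses import dataclass


@dataclass
class UserRecord:
    """Persistence model: plain data shaped like the table row."""
    id: int
    email: str
    is_active: int  # stored as 0/1


class User:
    """Domain entity: owns the behavior and the rules."""

    def __init__(self, user_id: int, email: str, active: bool) -> None:
        self.id = user_id
        self.email = email
        self.active = active

    def deactivate(self) -> None:
        if not self.active:
            raise ValueError("user is already inactive")
        self.active = False


class UserMapper:
    """Data mapper: the only place that knows both shapes."""

    @staticmethod
    def to_domain(record: UserRecord) -> User:
        return User(record.id, record.email, bool(record.is_active))

    @staticmethod
    def to_record(user: User) -> UserRecord:
        return UserRecord(user.id, user.email, int(user.active))


user = UserMapper.to_domain(UserRecord(1, "ada@example.com", 1))
user.deactivate()
record = UserMapper.to_record(user)
```

A repository implementation would typically call the mapper on the way in and on the way out, so the rest of the domain never touches UserRecord.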
On the other hand, we have data transfer objects (DTOs), so we have 3 layers of data mapping: database to persistence model, persistence model to domain model, and domain model to DTO. In my opinion, my solution is not an efficient one, as too much effort must be put into it.
CQRS offers some simplification here, based on the idea that if you are implementing a query, you don't really need the "domain model" because you aren't actually going to change the supporting data. In which case, you can take the "domain model" out of the loop altogether.
DDD and data are very different things. The aggregate's data (an outcome) will be persisted somehow, depending on what you're using. Personally, I think in terms of domain events, so the resulting Domain Event is the DTO (technically it is one) that can be stored directly in an Event Store (if you're using Event Sourcing) or act as a data source for your persistence model.
A domain model represents relevant domain behaviour, with the domain state being the 'result'. An entity is a concept which has an id, as opposed to a Value Object, which represents a business semantic value only. An entity usually groups related value objects and consistency rules. Not all business rules live here; some of them make more sense as a service.
Now, there is the case of a CRUD domain or CRUD modelling where basically all you have is some data structures plus some validation rules. No need to complicate your life here if the modeling is correct. Implement things as simple as possible.
Always think of DDD as a methodology to gather requirements and to structure information. Implementation as in code (design) is something different.
In many cases, I need to write a lot of classes that perform CRUD operations for some class, for example CRUD on plain objects such as User, Book and Tag.
I usually make a directory named models and put all the CRUD classes into it.
But I feel that the word "model" does not capture the essence. Is the word "model" well defined in computer science? Does it mean the plain User object, or the means of performing CRUD on User?
I also use another name, services, for more complex logic. For example, UserService may require models other than UserModel. But the word "service" often conflicts with other uses, such as an online service or a backend service.
Are there any good names for the model and the service in my case? BTW, I mostly use Node.js, so the names should not conflict with the general conventions used in Node.js.
Ultimately, it will come down to what makes the code the most understandable both to you and to someone down the road who may have occasion to work on your code. If 'model' and 'services' convey the thought of what lies within in an obvious way to anyone coming in to the code, then they are probably fine. As far as standards, I don't know if there is a 'defined' set of names you have to use. In MVC, for example, you will use 'Models' as one of your folders in order to store all of the actual models you will be feeding your views, and this is understood in the MVC architecture that those names (Models, Views, Controllers) are the standard.
I agree with you that Model is a little ambiguous. Sometimes it is used to indicate domain objects such as User/Book/Tag, but sometimes it is used to indicate objects that deal with business logic, such as "Buying a book" or "Authenticating a user".
What's common to both uses is that "Model" is clearly separated from the UI, which is handled entirely by the Views and the Controllers.
Another useful name is Entities. In Robert Martin's work on Object Oriented Design, he speaks of use-case-driven design, and distinguishes between three kinds of objects: Entity Objects, Interactor objects and Boundary objects.
Entity objects are useful in multiple use-cases. For example, in a book selling system, entities can be Book/User/Recommendation/Review.
Interactor objects implement use-cases, and they typically use multiple entity objects. For example, Purchase_Book/Login/Search_Books can be such objects.
Boundary objects are used for transferring data across module boundaries, and are used for building interfaces between parts of the system, which should be decoupled from one-another. For example, a controller may need to create a Purchase_Book object, and in order to create it, it needs to pass data about what book ID needs to be purchased, by what user ID, etc... and this data can be packed in a boundary object called Purchase_Request.
While Interactor and Boundary require more explanation, I find that the word Entities is meaningful and can be grasped intuitively without reading any explanation.
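The three kinds of objects described above could be sketched like this in Python. The fields, the dict-based lookups and the validation rules are assumptions made up for the example; only the roles (entity, interactor, boundary object) come from the description.

```python
from dataclasses import dataclass


@dataclass
class Book:  # entity
    book_id: str
    price_cents: int
    stock: int


@dataclass
class User:  # entity
    user_id: str
    balance_cents: int


@dataclass(frozen=True)
class PurchaseRequest:  # boundary object: plain data crossing a module boundary
    book_id: str
    user_id: str


class PurchaseBook:  # interactor: implements one use case using several entities
    def __init__(self, books: dict[str, Book], users: dict[str, User]) -> None:
        self._books = books
        self._users = users

    def execute(self, request: PurchaseRequest) -> None:
        book = self._books[request.book_id]
        user = self._users[request.user_id]
        if book.stock == 0:
            raise ValueError("book is out of stock")
        if user.balance_cents < book.price_cents:
            raise ValueError("insufficient balance")
        book.stock -= 1
        user.balance_cents -= book.price_cents


books = {"b-1": Book("b-1", 1500, 2)}
users = {"u-1": User("u-1", 2000)}
PurchaseBook(books, users).execute(PurchaseRequest("b-1", "u-1"))
```

A controller would only build the PurchaseRequest and hand it over; it never touches Book or User directly.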
I have a big Entity. The Entity has the properties "Id", "Status" and others.
I have a Repository for this Entity.
I want to change the status of one entity.
Should I get the whole Entity, change the Status property and use the save method of the Repository, or should I use a ChangeStatus(id, newStatus) method in the Repository?
You probably don't need a domain model. You could try a transaction script that directly uses SQL to update the database.
You need a domain model if and only if you need to hire an expert to understand the business.
Otherwise, it's just expensive buzzword-driven development.
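For reference, a transaction script of the kind suggested above might look like the following Python sketch using the standard sqlite3 module. The table name (entity) and its columns are assumptions; the in-memory database is only there to make the example self-contained.

```python
import sqlite3


def change_status(conn: sqlite3.Connection, entity_id: int, new_status: str) -> bool:
    """Transaction script: one UPDATE, no entity loaded into memory."""
    with conn:  # commits on success, rolls back if an exception is raised
        cursor = conn.execute(
            "UPDATE entity SET status = ? WHERE id = ?",
            (new_status, entity_id),
        )
    # True only if exactly one row was changed.
    return cursor.rowcount == 1


# Hypothetical schema, set up in memory for the example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE entity (id INTEGER PRIMARY KEY, status TEXT NOT NULL)")
conn.execute("INSERT INTO entity (id, status) VALUES (1, 'draft')")
conn.commit()

updated = change_status(conn, 1, "published")
status = conn.execute("SELECT status FROM entity WHERE id = 1").fetchone()[0]
```

This is the ChangeStatus(id, newStatus) variant from the question: cheap and direct, but any business rule about which status changes are allowed has to live in the script itself.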
And by the way, if you have big entity classes containing data that you don't need during most operations, then you know that you haven't properly defined the context boundaries.
The best definition of bounded context is the one of Evans:
The delimited applicability of a particular model. BOUNDING CONTEXTS gives team members a clear and shared understanding of what has to be consistent and what can develop independently.
That is: you have to split the domain expert knowledge in contexts where each term has an unambiguous meaning and a restricted set of responsibility. If you do so, you'll obtain small types and modules with high cohesion an
Eric Evans's DDD book, p. 152:
Provide Repositories only for AGGREGATE roots that actually need direct access.
1. Should Aggregate Roots that don't need direct access be retrieved and saved via the repositories of those Aggregate Roots that do need direct access?
For example, if we have Customer and Order Aggregate Roots and, for whatever reason, we don't need direct access to the Order AR, then I assume the only way orders can be obtained is by traversing the Customer.Orders property?
2. When should ICustomerRepository retrieve orders? When the Customer AR is retrieved (via ICustomerRepository.GetCustomer) or when we traverse the Customer.GetOrders property?
3. Should ICustomerRepository itself retrieve orders, or should it delegate this responsibility to an IOrderRepository? If the latter, then one option would be to inject IOrderRepository into ICustomerRepository. But since outside code shouldn't know that IOrderRepository even exists (if outside code were aware of its existence, it might also use IOrderRepository directly), how then should ICustomerRepository get a reference to IOrderRepository?
UPDATE:
1.
With regards to implementation, if done with an ORM like NHibernate, there is no need for an IOrderRepository.
a) Are you saying that when using an ORM, we usually don't need to implement repositories, since ORMs implicitly provide them?
b) I do plan on learning one of the ORM technologies (probably EF), but from the little I have read on ORMs, it seems that if you want to completely decouple the Domain or Application layers from the Persistence layer, then these two layers shouldn't use ORM expressions, which also implies that ORM expressions and POCOs should exist only within Repository implementations?
c) If there is a scenario where for some reason an AR doesn't have direct access (and the project doesn't use an ORM), what would your answer to 3. be?
Thanks
I'm hard-pressed to think of an example where an aggregate does not require direct access. However, I think at the time of writing (circa 2003), the emphasis on limiting or eliminating traversable object references between aggregates wasn't as prevalent as it is today. Therefore, it could have been the case that a Customer aggregate would reference a collection of Order aggregates. In this scenario, there may be no need to reference an Order directly because traversal from Customer is acceptable.
With regards to implementation, if done with an ORM like NHibernate, there is no need for an IOrderRepository. The Order aggregate would simply have a mapping. Additionally, the mapping for Customer would specify that changes should cascade down to corresponding Order aggregates.
When should ICustomerRepository retrieve orders?
This is the question which raises concern over traversable object references between aggregates. A solution provided by ORM is lazy loading, but lazy loading can be problematic. Ideally, a customer's orders would only be retrieved when needed and this depends on context. My suggestion, therefore, is to avoid traversable references between aggregates and use a repository search instead.
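The suggestion to replace traversable references with a repository search can be illustrated with a short Python sketch. The class and method names (InMemoryOrderRepository, find_by_customer) are hypothetical, and the in-memory list stands in for whatever the ORM-backed implementation would query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str  # reference by identity instead of an object reference
    total_cents: int


@dataclass
class Customer:
    customer_id: str
    name: str
    # Deliberately no "orders" attribute: nothing to lazy-load or cascade.


class InMemoryOrderRepository:
    """Hypothetical stand-in for an ORM-backed implementation."""

    def __init__(self, orders: list[Order]) -> None:
        self._orders = list(orders)

    def find_by_customer(self, customer_id: str) -> list[Order]:
        return [order for order in self._orders if order.customer_id == customer_id]


customer = Customer("c-1", "Ada")
order_repository = InMemoryOrderRepository(
    [Order("o-1", "c-1", 2500), Order("o-2", "c-2", 900), Order("o-3", "c-1", 400)]
)
orders_of_ada = order_repository.find_by_customer(customer.customer_id)
```

Orders are fetched only in the use cases that actually need them, which is the context-dependent loading the answer argues for.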
UPDATE
a) You would still need something that implements ICustomerRepository, but the implementation would be largely trivial if the mappings are configured - you'd delegate to the ORM's API to implement each repository method. No need for a IOrderRepository however.
b) For full encapsulation, the repository interface would not contain anything ORM-specific. The repository implementation would adapt the repository contract to ORM specifics.
c) It is hard to make a judgement on a scenario I can't picture, but it would seem there is no need for an Order repository interface. You can still have an Order repository to better separate responsibilities. There is no need for injection either; just have the Customer repository implementation create an instance of the Order repository.
Pros:
Repositories hide complex queries.
Repository methods can be used as transaction boundaries.
ORM can easily be mocked
Cons:
ORM frameworks already offer a collection-like interface to persistent objects, which is exactly the intention of repositories. So repositories add extra complexity to the system.
Combinatorial explosion when using findBy methods. These methods can be avoided with Criteria objects, queries or example objects. But to do that no repository is needed, because an ORM already supports these ways of finding objects.
Since repositories are collections of aggregate roots (in the sense of DDD), one has to create and pass around aggregate roots even if only a child object is modified.
Questions:
What pros and cons do you know?
Would you recommend to use repositories? (Why or why not?)
The main point of a repository (as in Single Responsibility Principle) is to abstract the concept of getting objects that have identity. As I've become more comfortable with DDD, I haven't found it useful to think about repositories as being mainly focused on data persistence but instead as factories that instantiate objects and persist their identity.
When you're using an ORM you should be using their API in as limited a way as possible, giving yourself a facade perhaps that is domain specific. So regardless your domain would still just see a repository. The fact that it has an ORM on the other side is an "implementation detail".
A repository brings the domain model into focus by hiding data access details behind an interface that is based on the ubiquitous language. When designing a repository you concentrate on domain concepts, not on data access. From the DDD perspective, using an ORM API directly is equivalent to using SQL directly.
This is how a repository might look in an order-processing application:
List<Order> myOrders = Orders.FindPending()
Note that there are no data access terms like 'Criteria' or 'Query'. Internally 'FindPending' method may be implemented using Hibernate Criteria or HQL but this has nothing to do with DDD.
Method explosion is a valid concern. For example you may end up with multiple methods like:
Orders.FindPending()
Orders.FindPendingByDate(DateTime from, DateTime to)
Orders.FindPendingByAmount(Money amount)
Orders.FindShipped()
Orders.FindShippedOn(DateTime shippedDate)
etc
This can be improved by using the Specification pattern. For example, you can have a class
class PendingOrderSpecification {
    PendingOrderSpecification WithAmount(Money amount);
    PendingOrderSpecification WithDate(DateTime from, DateTime to);
    ...
}
so that the repository looks like this:
Orders.FindSatisfying(PendingOrderSpecification pendingSpec)
Orders.FindSatisfying(ShippedOrderSpecification shippedSpec)
Another option is to have a separate repository for Pending and for Shipped orders.
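For anyone who wants to see the Specification idea run, here is a rough Python sketch. It mirrors the names used above (PendingOrderSpecification, FindSatisfying) in Python style; the Order fields, the is_satisfied_by method and the in-memory list are assumptions. A real implementation would probably translate the specification into a query rather than filter in memory.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Order:
    order_id: str
    status: str
    amount: int
    placed_on: date


class PendingOrderSpecification:
    """Collects optional criteria; each with_* call returns self for chaining."""

    def __init__(self) -> None:
        self._amount = None
        self._date_range = None

    def with_amount(self, amount: int) -> "PendingOrderSpecification":
        self._amount = amount
        return self

    def with_date(self, start: date, end: date) -> "PendingOrderSpecification":
        self._date_range = (start, end)
        return self

    def is_satisfied_by(self, order: Order) -> bool:
        if order.status != "pending":
            return False
        if self._amount is not None and order.amount != self._amount:
            return False
        if self._date_range is not None:
            start, end = self._date_range
            if not (start <= order.placed_on <= end):
                return False
        return True


class Orders:
    """Repository with a single finder instead of one method per combination."""

    def __init__(self, orders: list[Order]) -> None:
        self._orders = list(orders)

    def find_satisfying(self, spec) -> list[Order]:
        return [order for order in self._orders if spec.is_satisfied_by(order)]


orders = Orders([
    Order("o-1", "pending", 100, date(2024, 1, 5)),
    Order("o-2", "pending", 250, date(2024, 2, 1)),
    Order("o-3", "shipped", 100, date(2024, 1, 6)),
])
spec = PendingOrderSpecification().with_amount(100).with_date(date(2024, 1, 1), date(2024, 1, 31))
matching = orders.find_satisfying(spec)
```

New criteria become new with_* methods on the specification, so the repository interface itself stops growing.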
A repository is really just a layer of abstraction, like an interface. You use it when you want to decouple your data persistence implementation (i.e. your database).
I suppose if you don't want to decouple your DAL, then you don't need a repository. But there are many benefits to doing so, such as testability.
Regarding the combinatorial explosion of "Find" methods: in .NET you can return an IQueryable instead of an IEnumerable, and allow the calling client to run a LINQ query on it instead of using a Find method. This provides flexibility for the client, but sacrifices the ability to provide a well-defined, testable interface. Essentially, you trade off one set of benefits for another.