I've been reading the book Domain-Driven Design Quickly.
Now I've reached the Repository Pattern.
I am not sure what they are referring to by the "Client".
What does "Client" mean here?
Databases are part of the infrastructure. A poor solution is for
the client to be aware of the details needed to access a database.
For example, the client has to create SQL queries to retrieve the
desired data. The database query may return a set of records,
exposing even more of its internal details. When many clients
have to create objects directly from the database, it turns out that such code is scattered throughout the entire domain.
The client of a repository is a piece of code (another class), usually the application layer in the context of DDD/Onion Architecture.

The rule of thumb is: one repository per aggregate root. If your aggregate root is Order, which has a collection of OrderItem inside, you create only an OrderRepository and return the whole Order with ALL OrderItems; no lazy loading.

Now, your client (application layer code) should have no idea what is inside the repository (whether it is file based, SQL based, or HTTP based); you treat it as an in-memory collection: repository.GetById(orderId), where repository is an IOrderRepository. That means you can easily change your repository from in-memory to SQL and back at any time, and your client code (the application layer, or whatever class uses the repository) will not be affected, so the Liskov Substitution Principle is preserved.
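A minimal sketch of that arrangement in C# (the class and method names below are illustrative, not from the book):

```csharp
using System;
using System.Collections.Generic;

// The aggregate root and its children, always loaded as a whole.
public class Order
{
    public Guid Id { get; } = Guid.NewGuid();
    public List<OrderItem> Items { get; } = new();   // no lazy loading
    public bool Shipped { get; private set; }
    public void Ship() => Shipped = true;
}

public class OrderItem { public string Sku { get; set; } = ""; public int Quantity { get; set; } }

// The abstraction the client sees; it behaves like an in-memory collection.
public interface IOrderRepository
{
    Order GetById(Guid orderId);
    void Add(Order order);
}

// One possible implementation; it could just as well be SQL, file or HTTP based.
public class InMemoryOrderRepository : IOrderRepository
{
    private readonly Dictionary<Guid, Order> _orders = new();
    public Order GetById(Guid orderId) => _orders[orderId];
    public void Add(Order order) => _orders[order.Id] = order;
}

// The "client": application layer code that never learns how storage works.
public class ShipOrderHandler
{
    private readonly IOrderRepository _repository;
    public ShipOrderHandler(IOrderRepository repository) => _repository = repository;

    public void Handle(Guid orderId)
    {
        var order = _repository.GetById(orderId);
        order.Ship();   // invoke domain behavior on the fully loaded aggregate
    }
}
```

Swapping InMemoryOrderRepository for a SQL-backed implementation changes nothing in ShipOrderHandler, which is exactly the substitutability described above.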
I have two questions related to CQRS and Domain Driven Design (DDD).
As far as I understand the segregation idea behind CQRS, one would have two separate models, a read model and a write model. Only the write model would have access to and use the business domain model by means of commands. The read model, however, directly translates database content to DTOs by means of queries and has no access to the business domain at all.
For context: I am writing a web application back-end that provides calculation services for particle physics. Now my questions:
1.) My business domain logic contains some functions that calculate mathematical values as output from given measurement system configurations as input. So, technically, these are read-only queries that calculate the values on-the-fly and do not change any state in any model. Thus, they should be part of the read model. However, because the functions are heavily domain related, they have to be part of the domain model which again is part of the write model.
How should I make these calculation functions available for the front-end via my API when the read model, that should contain all the queries, does not have any access to the domain model?
Do I really have to trigger a command to save all the calculations to the database such that the read model can access the calculation results? These "throw-away" calculations will only be used short-term by the front-end and nobody will ever have to access the persistent calculation results later on.
It's the measurement configuration that has to be persistent, not the calculation results. Those will be re-calculated many times, whenever the user hits the "calculate" button on the front-end.
2.) I also feel like I duplicate quite a bit of data validation code, because both the read model and the write model have to deserialize and validate the same or very similar request parameters in the processing chain HTTP request body -> JSON -> unvalidated DTO -> validated value -> command/query.
How should I deal with that? Can I share validation code between read model and write model? This seems to dissolve the segregation.
Thanks in advance for all your help and ideas.
I think what you have is a set of domain services that, given an input, return an output.
As you said, these services are located in the domain. But nothing prevents you from using them in the read model. As long as you don't change the domain inside those functions, you can use them in any layer above the domain.
If, for any reason, this solution is not viable, for example because the services require domain objects that you cannot or don't want to build on the query side, you can always wrap the domain services inside application services. There you take a plain object as input, do all the transformations into the domain one, call the domain service, and return the resulting value.
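A rough sketch of that wrapping, with entirely made-up names and a placeholder formula (none of this comes from the question; it only shows the shape of a query-side application service):

```csharp
using System;

// Domain service: a pure calculation, no state is changed anywhere.
public class FieldCalculator
{
    public double CalculateFieldStrength(MeasurementConfiguration config)
        => config.Current * config.Turns / config.Length;   // placeholder formula
}

// Query-side application service: maps the incoming DTO to the domain input,
// calls the domain service, and returns a plain result DTO to the API layer.
public class CalculationQueryService
{
    private readonly FieldCalculator _calculator = new();

    public CalculationResultDto Calculate(CalculationRequestDto request)
    {
        var config = new MeasurementConfiguration(request.Current, request.Turns, request.Length);
        var value = _calculator.CalculateFieldStrength(config);
        return new CalculationResultDto(value);
    }
}

public record MeasurementConfiguration(double Current, double Turns, double Length);
public record CalculationRequestDto(double Current, double Turns, double Length);
public record CalculationResultDto(double Value);
```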
For the second question, you can build a validation service in the domain layer, as a set of services or simple functions. Again, nothing prevents you from using them in the validation steps.
I did the same in my last web app: the form data validation step calls a set of domain services that are also used when I build the domain objects while handling a command. Changing the validation in the domain means the web-related validation changes as well. You validate twice (before building the command and while building the domain object), but that's OK.
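For instance, using a simplified version of the illustrative MeasurementConfiguration from the sketch above, the same domain rule can back both the web-side check and the constructor of the domain object (the rule itself is made up):

```csharp
using System;

// Domain-level rule, written once.
public static class MeasurementRules
{
    public static bool IsValidLength(double length) => length > 0;
}

// Call site 1: the web/application boundary, before building the command or query.
public class ConfigurationRequestValidator
{
    public bool Validate(double length) => MeasurementRules.IsValidLength(length);
}

// Call site 2: the domain object itself, so it can never be constructed in an invalid state.
public class MeasurementConfiguration
{
    public double Length { get; }

    public MeasurementConfiguration(double length)
    {
        if (!MeasurementRules.IsValidLength(length))
            throw new ArgumentException("Length must be positive.", nameof(length));
        Length = length;
    }
}
```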
Take a look at ports/adapters or the onion architecture: it helps a lot in understanding what should stay inside a layer and what can be used by the layers above.
Just to add my two pence worth.
We have microservices that use Mongo as a backend. We use .NET Core with MediatR (https://github.com/jbogard/MediatR) to implement a CQRS pattern in the microservices. We have .Query and .Command features.
Now, in our world we never went to event sourcing. So, at the Mongo level, our read and write models (entities) are the same. What worked best for us was to have a single entity model which could be transformed into different models (if needed) by each command/query handler.
If you are doing pure CQRS via event sourcing (so different read and write models) or something else then please ignore me! But this worked best for us.
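As a hedged sketch of what that looks like with MediatR and the Mongo driver (the Order/OrderDto names are illustrative, not from the original post):

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using MediatR;
using MongoDB.Driver;

// The single entity model stored in Mongo, shared by reads and writes.
public class Order
{
    public string Id { get; set; } = "";
    public string CustomerId { get; set; } = "";
    public decimal Total { get; set; }
}

// A .Query feature: request plus handler.
public record GetOrderById(string Id) : IRequest<OrderDto>;

public class GetOrderByIdHandler : IRequestHandler<GetOrderById, OrderDto>
{
    private readonly IMongoCollection<Order> _orders;
    public GetOrderByIdHandler(IMongoCollection<Order> orders) => _orders = orders;

    public async Task<OrderDto> Handle(GetOrderById query, CancellationToken ct)
    {
        var order = await _orders.Find(o => o.Id == query.Id).FirstOrDefaultAsync(ct);
        // The shared entity is reshaped into whatever model this particular handler needs.
        return new OrderDto(order.Id, order.Total);
    }
}

public record OrderDto(string Id, decimal Total);
```

A .Command feature follows the same pattern, with its handler writing back to the same collection.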
A colleague of mine told me: "We don't have business logic, we only have CRUD like GetById, GetBySearchTerm, GetByParentId..." and I started wondering about those words.
After reading about DDD, I'd say those methods are CRUD: they are a mechanism to fetch data (and also store, update, delete...) based on some specific code (usually SQL).
If a business analyst tells me: "We need to show data about a specific customer",
in my opinion this (GetById) IS a business process. GetById should be placed inside the business logic part of the application, and it contacts the repository to fetch the data. The repository with its CRUD methods is responsible for persisting data based on some criteria.
I know this question can lead to a debate about having a repository with atomic methods (GetById, GetBySearchTerm, GetByParentId...), but my question is simple: are those methods CRUD or business logic methods?
The short answer is that you should not be querying your domain model for any reason other than domain operations that are part of the write / transactional side of things. That side is more interested in commands issued against your domain in order to perform operations.
Anything related to displaying data should come from as simple a query / read model as is possible.
If you find that your queries require domain interaction you probably have a scenario where you may need to tell your domain to do something and, once completed, you can request the data through the query side.
Not every application is a DDD application. Some applications are just simple CRUD.
The business logic would be the part of the application where you validate inputs (e.g. for GetById, that the id is a number between 1 and 99999). The request is then passed on to the repository for the actual query.
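A minimal sketch of that split, with illustrative names:

```csharp
using System;

public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; } = "";
}

// The repository only knows how to run the actual query.
public interface ICustomerRepository
{
    Customer GetById(int id);
}

// The "business logic" part: validate the input before touching persistence.
public class CustomerService
{
    private readonly ICustomerRepository _repository;
    public CustomerService(ICustomerRepository repository) => _repository = repository;

    public Customer GetById(int id)
    {
        if (id < 1 || id > 99999)
            throw new ArgumentOutOfRangeException(nameof(id), "Id must be between 1 and 99999.");

        return _repository.GetById(id);
    }
}
```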
But if your application is really a crud application then trying to apply DDD isn't going to help you.
Those methods can't be business methods at all. As a CQRS practitioner I would suggest having different models for the command and query sides. Maybe you create a different bounded context that serves the whole reading (query) process (you can use anemic models/DTOs here) and another domain model that serves the pure business logic.
You can take a look at my blog for command and query separation.
https://aspxsushil.wordpress.com/2015/10/18/command-and-query-object-pattern/
I am building an application that will expose part of its features through RESTful services, and my application packages are organized as below:
Application --> This package contains the RESTful services
Model --> Contains the domain model: the aggregates, Value Objects, ...
Infrastructure --> Contains the set of classes required to access the database
Mongo DB --> My DB
The application package exposes the endpoint
CastReview(UUID reviewedEntityId, string review)
The review is retrieved from the body of the request and it is mandatory.
Now my question is where the validation should occur:
Should I keep the validation logic inside the aggregate, and in the application package just construct an instance of the aggregate and check whether it is valid?
Or should I have the validation inside the application package as well as inside the aggregate?
For Aggregates, I wouldn't call it validation but invariant enforcement, since they are supposed to be always valid. You don't just modify an aggregate and then have it checked by an external validator, aggregates enforce their own invariants.
Some rules are clearly domain invariants, since you have to have deep knowledge of aggregate data to enforce them, and some are definitely applicative rules (e.g. email confirmation == email). But sometimes the lines are blurred. I would definitely check at the client-side and applicative level that the review is not null or empty, and at the same time I wouldn't consider a Review aggregate OK if it has a null review, so I would do both. But this might be domain-dependent and YMMV.
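A small sketch of "doing both", with illustrative names built around the CastReview endpoint from the question:

```csharp
using System;

// Invariant enforcement: the aggregate refuses to exist in an invalid state.
public class Review
{
    public Guid ReviewedEntityId { get; }
    public string Text { get; }

    public Review(Guid reviewedEntityId, string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            throw new ArgumentException("A review requires non-empty text.", nameof(text));

        ReviewedEntityId = reviewedEntityId;
        Text = text;
    }
}

// Applicative validation at the edge, before the aggregate is ever built,
// so the API can return a friendly 400 instead of surfacing an exception.
public class CastReviewRequestValidator
{
    public bool IsValid(string review) => !string.IsNullOrWhiteSpace(review);
}
```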
Integrity constraints (or "invariants", if you prefer that term) should be defined in the (domain/design/data) Model. Then they should be checked multiple times:
In the front-end User Interface (on input/change and on submit) for getting responsive validation.
In the back-end Application or Infrastructure before save.
And in the DBMS (before commit), if your DB is shared with other applications.
See also my article Integrity Constraints and Data Validation.
Consider a typical Breeze controller that limits the results of a query to entities that the logged in user has access to. When the browser calls SaveChanges, does Breeze verify on the server that the entities reported as modified are from the original set?
To put it another way, does the EFContextProvider (in the case of Entity Framework) keep track of entities that have been handed out, so it can check against malicious data passed to SaveChanges? Or does BeforeSaveEntity need to validate that the user has access to the changed entities?
You must guard against malicious data in your BeforeSaveEntity or BeforeSaveEntities methods.
The idea that the EFContextProvider would keep track of entities that have already been handed out is probably something that we would NOT want to do because:
The EFContextProvider would no longer be stateless, which was a design goal to facilitate scaling.
You would still need to guard against malicious data for "Added" entities in the BeforeXXX methods.
It is actually a valid use case for some of our users to "modify" entities without having first queried them.
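As a hedged sketch of such a guard (it assumes the standard Breeze .NET EFContextProvider base class; MyDbContext, Order, UserId and the CurrentUserId helper are made up for illustration):

```csharp
using System;
using Breeze.ContextProvider;
using Breeze.ContextProvider.EF6;

public class MyContextProvider : EFContextProvider<MyDbContext>
{
    protected override bool BeforeSaveEntity(EntityInfo entityInfo)
    {
        // Re-check ownership on the server; never trust what the browser reported.
        if (entityInfo.Entity is Order order && order.UserId != CurrentUserId())
            throw new InvalidOperationException("Not authorized to save this entity.");

        return true; // include the entity in the save
    }

    private int CurrentUserId()
    {
        // Resolve the user from the authenticated request (placeholder).
        return 0;
    }
}
```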
I model a User as an aggregate root and a User is composed of an Identifier value object as well as an Email value object. Both value objects can uniquely identify a User, however the email is allowed to change and the identifier cannot.
In most examples of DDD I have seen, a repository for an aggregate root only fetches by identifier. Would it be correct to add another method that fetches by email to the repository? Am I modeling this poorly?
I would say yes, it is appropriate for a repository to have methods for retrieving aggregates by something other than the identity. However, there are some subtleties to be aware of.
The reason that many repository examples only retrieve by ID is based on the observation that repositories coupled with the structure of aggregates cannot fulfill all query requirements. For instance, if you have a query which calls for some fields from an aggregate as well as some fields from a referenced aggregate and some summary data, the corresponding aggregate classes cannot be used to represent this data. Instead, a dedicated read-model is needed. Therefore, querying responsibilities are decoupled from the repository. This has several advantages (queries can be served by a dedicated de-normalized store) and it is the principal paradigm of CQRS. In this type of architecture, domain classes are only retrieved by the repository when some behavior needs to execute. All read-only use cases are served by read-models.
The reason that I think it appropriate for a repository to have a GetByEmail method is based on YAGNI and battling complexity. You can allow your application to evolve as requirements change and grow. You don't need to jump to CQRS and separate read/write stores right away. You can start with a repository that also happens to have a query method. The only thing to keep in mind is that you should try to retrieve entities by ID when you need to invoke some behavior on those entities.
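A minimal sketch of such a repository, with illustrative types matching the question (an immutable identifier plus a changeable email):

```csharp
using System;

public record UserId(Guid Value);
public record Email(string Value);

public class User
{
    public UserId Id { get; }
    public Email Email { get; private set; }

    public User(UserId id, Email email) { Id = id; Email = email; }

    public void ChangeEmail(Email newEmail) => Email = newEmail;  // the email may change, the id cannot
}

public interface IUserRepository
{
    User GetById(UserId id);        // preferred when you need to invoke behavior on the aggregate
    User GetByEmail(Email email);   // pragmatic query method until a separate read model is warranted
    void Add(User user);
}
```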
I would put this functionality into a service / business layer that is specific to your User object. Not every object is going to have an email identifier. This seems more like business logic than the responsibility of the repository. I am sure you already know this, but here is a good explanation of what I am talking about.
I would not recommend this, but you could have a specific implementation of your repository for a User that exposes a GetByEmail(string emailAddress) method, but I still like the service idea.
I agree with what eulerfx has answered:
You need to ask yourself why you need to get the AR using something other than the ID.
I think it would be rather obvious that you do not have the ID but you do have some other unique identifier such as the e-mail address.
If you go with CQRS you need to first determine whether the data is important to the domain or only to the query store. If you require the data to be 100% consistent then it changes things slightly. You would, for instance, need 100% consistency if you are checking whether an e-mail address exists in order to satisfy the unique constraint. If the queried data is at any time stale you will probably run into problems.
Remember that a repository represents a collection of sorts. So if you do not need to actually operate on the AR (command side), but you have decided that where you are using your domain is appropriate, then you could always go for a ContainsEMailAddress method on the repository. Alternatively, you could also have a query side for your domain data store, since your domain data store (an OLTP-type store) is 100% consistent, whereas your query store (an OLAP-type store) may only be eventually consistent, as is typical of CQRS with a separate query store.
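A tiny sketch of that consistency-sensitive check, reusing the illustrative User/UserId/Email types from the earlier sketch:

```csharp
public interface IUserRepository
{
    User GetById(UserId id);

    // Answered from the fully consistent domain (OLTP) store, so it is safe to use
    // when enforcing the "email must be unique" rule during registration.
    bool ContainsEMailAddress(Email email);
}
```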
In most examples of DDD I have seen, a repository for an aggregate root only fetches by identifier.
I'd be curious to know what examples you've looked at. According to the DDD definition, a Repository is
A mechanism for encapsulating storage, retrieval, and search behavior which emulates a collection of objects.
Search obviously includes getting a root or a collection of roots by all sorts of criteria, not only their IDs.
Repository is a perfect place for GetCustomerByEmail(), GetCustomersOver18(), GetCustomersByCountry(...) and so on.
Would it be correct to add another method that fetches by email to the repository? - I would not do that. In my opinion a repository should have only methods for getting by id, save and delete.
I'd rather ask why you don't have the user id in the command handler in which you want to fetch the user and call a domain method on it. I don't know exactly what you are doing, but for the login/register scenario I would do the following. When a user logs in, they pass an email address and a password, and you run a query to authenticate the user. This would not use the domain or the repository (those are just for commands), but some query implementation which returns a UserDto containing the user id; from this point on you have the user id. The next scenario is registration: the command handler that creates a new user would create a new user entity, and then the user needs to log in.
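A minimal sketch of that query side, with illustrative names and no particular data-access technology assumed:

```csharp
using System;

public record UserDto(Guid Id, string Email);

// Read-only query contract used by the login flow; it bypasses the domain
// model and the repository entirely and can be backed by plain SQL.
public interface IUserQueries
{
    // Returns the matching user, or null when the credentials do not match.
    UserDto? Authenticate(string email, string password);
}
```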