DDD - can a repository fetch an aggregate by something other than its identifier? - domain-driven-design

I model a User as an aggregate root and a User is composed of an Identifier value object as well as an Email value object. Both value objects can uniquely identify a User, however the email is allowed to change and the identifier cannot.
In most examples of DDD I have seen, a repository for an aggregate root only fetches by identifier. Would it be correct to add another method that fetches by email to the repository? Am I modeling this poorly?

I would say yes, it is appropriate for a repository to have methods for retrieving aggregates by something other than the identity. However, there are some subtleties to be aware of.
The reason that many repository examples only retrieve by ID is the observation that repositories, being coupled to the structure of aggregates, cannot fulfill all query requirements. For instance, if a query calls for some fields from an aggregate, some fields from a referenced aggregate, and some summary data, the corresponding aggregate classes cannot represent this data. Instead, a dedicated read model is needed. Querying responsibilities are therefore decoupled from the repository. This has several advantages (for example, queries can be served by a dedicated denormalized store), and it is the principal paradigm of CQRS. In this type of architecture, domain classes are only retrieved by the repository when some behavior needs to execute. All read-only use cases are served by read models.
The reason that I think it is appropriate for a repository to have a GetByEmail method is based on YAGNI and battling complexity. You can allow your application to evolve as requirements change and grow. You don't need to jump to CQRS and separate read/write stores right away. You can start with a repository that also happens to have a query method. The only thing to keep in mind is that you should try to retrieve entities by ID when you need to invoke some behavior on those entities.
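A minimal sketch of that starting point, written in Python for brevity. The class names, the in-memory dictionary, and the snake_case method names (get_by_email standing in for GetByEmail) are illustrative assumptions, not something from the question:

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserId:
    value: uuid.UUID


@dataclass(frozen=True)
class Email:
    value: str


class User:
    """Aggregate root: the identifier is fixed, the email may change."""

    def __init__(self, user_id: UserId, email: Email) -> None:
        self.id = user_id
        self.email = email

    def change_email(self, new_email: Email) -> None:
        self.email = new_email


class InMemoryUserRepository:
    """Hypothetical in-memory repository; a real one would wrap a database."""

    def __init__(self) -> None:
        self._users: Dict[UserId, User] = {}

    def save(self, user: User) -> None:
        self._users[user.id] = user

    def get_by_id(self, user_id: UserId) -> Optional[User]:
        return self._users.get(user_id)

    def get_by_email(self, email: Email) -> Optional[User]:
        # Linear scan keeps the sketch short; a real store would use an index.
        for user in self._users.values():
            if user.email == email:
                return user
        return None


repo = InMemoryUserRepository()
alice = User(UserId(uuid.uuid4()), Email("alice@example.com"))
repo.save(alice)

found = repo.get_by_email(Email("alice@example.com"))
```

Once the user has been found by email, later commands can carry found.id and go through get_by_id, which stays valid even if the email changes.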

I would put this functionality into a service / business layer that is specific to your User object. Not every object is going to have an Email identifier. This seems more like business logic than the responsibility of the repository. I am sure you already know this, but here is a good explanation of what I am talking about.
I would not recommend this, but you could have a specific implementation of your repository for a User that exposes a GetByEmail(string emailAddress) method, but I still like the service idea.

I agree with what eulerfx has answered:
You need to ask yourself why you need to get the AR using something
other than the ID.
I think it would be rather obvious that you do not have the ID but you do have some other unique identifier such as the e-mail address.
If you go with CQRS you need to first determine whether the data is important to the domain or only to the query store. If you require the data to be 100% consistent then it changes things slightly. You would, for instance, need 100% consistency if you are checking whether an e-mail address exists in order to satisfy the unique constraint. If the queried data is at any time stale you will probably run into problems.
Remember that a repository represents a collection of sorts. So if you do not need to actually operate on the AR (command side), but you have decided that using your domain here is appropriate, then you could always go for a ContainsEMailAddress method on the repository. Otherwise, you could have a query side for your domain data store as well: your domain data store (OLTP type store) is 100% consistent, whereas your query store (OLAP type store) may only be eventually consistent, as is typical of CQRS with a separate query store.
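As a rough illustration of the ContainsEMailAddress idea, here is a small Python sketch. The repository class, the exception, and the register_user function are hypothetical names, and the set stands in for the consistent write-side store:

```python
from typing import Set


class EmailAlreadyRegistered(Exception):
    """Raised when the unique e-mail rule would be violated."""


class InMemoryUserRepository:
    """Hypothetical repository backed by the consistent (write-side) store."""

    def __init__(self) -> None:
        self._emails: Set[str] = set()

    def contains_email_address(self, email: str) -> bool:
        return email in self._emails

    def add(self, email: str) -> None:
        self._emails.add(email)


def register_user(repository: InMemoryUserRepository, email: str) -> None:
    # The check runs against the write-side store, not a possibly stale read model.
    if repository.contains_email_address(email):
        raise EmailAlreadyRegistered(email)
    repository.add(email)


repository = InMemoryUserRepository()
register_user(repository, "bob@example.com")
```

With a real database, a check followed by an insert can still race between two concurrent registrations, so a unique constraint or a transaction would probably be needed as well.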

In most examples of DDD I have seen, a repository for an aggregate
root only fetches by identifier.
I'd be curious to know what examples you've looked at. According to the DDD definition, a Repository is
A mechanism for encapsulating storage, retrieval, and search behavior
which emulates a collection of objects.
Search obviously includes getting a root or a collection of roots by all sorts of criteria, not only their IDs.
Repository is a perfect place for GetCustomerByEmail(), GetCustomersOver18(), GetCustomersByCountry(...) and so on.

Would it be correct to add another method that fetches by email to the repository? - I would not do that. In my opinion a repository should have only methods for getting by id, save and delete.
I'd rather ask why you don't have the user ID in the command handler in which you want to fetch the user and call a domain method on it. I don't know exactly what you are doing, but for the login/register scenario, I would do the following. When a user logs in, they pass an email address and a password, and you run a query to authenticate the user. This would not use the domain or the repository (that is just for commands), but some query implementation that returns a UserDto containing the user ID; from this point on you have the user ID. The next scenario is registration. The command handler to create a new user would create a new user entity, and then the user needs to log in.
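The login flow described above might look roughly like this in Python. UserQueries, handle_rename_user, and the rename command are made-up names for illustration, and credential checking is reduced to comparing a precomputed hash string, which a real system would not do this naively:

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class UserDto:
    user_id: int
    email: str


class UserQueries:
    """Read side (hypothetical): answers lookups with plain DTOs, no domain objects."""

    def __init__(self, rows: Dict[str, Tuple[int, str]]) -> None:
        # Maps email -> (user_id, password_hash); stands in for a query store.
        self._rows = rows

    def authenticate(self, email: str, password_hash: str) -> Optional[UserDto]:
        row = self._rows.get(email)
        if row is None or row[1] != password_hash:
            return None
        return UserDto(user_id=row[0], email=email)


class User:
    def __init__(self, user_id: int, display_name: str) -> None:
        self.id = user_id
        self.display_name = display_name

    def rename(self, new_name: str) -> None:
        self.display_name = new_name


class UserRepository:
    """Write side: only get by id and save, as this answer prefers."""

    def __init__(self) -> None:
        self._users: Dict[int, User] = {}

    def get_by_id(self, user_id: int) -> User:
        return self._users[user_id]

    def save(self, user: User) -> None:
        self._users[user.id] = user


def handle_rename_user(repository: UserRepository, user_id: int, new_name: str) -> None:
    # Command handler: it receives the id, so no lookup by email is needed.
    user = repository.get_by_id(user_id)
    user.rename(new_name)
    repository.save(user)


queries = UserQueries({"dave@example.com": (7, "hash-of-secret")})
repository = UserRepository()
repository.save(User(7, "Dave"))

dto = queries.authenticate("dave@example.com", "hash-of-secret")
if dto is not None:
    handle_rename_user(repository, dto.user_id, "David")
```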

Related

How to handle relationships between aggregate roots in resolvejs

I'm having trouble figuring out how to handle some basic stuff around relationships between aggregate roots in resolvejs. The basic question is how do I handle the integrity of relationships? To do so, it seems like you need knowledge of both sides at the same time, but that doesn't seem to be allowed in the write side.
Here's the setup: I am trying to build a user management tool and I have two aggregate roots, User and Organisation. I need to allow both to exist independently of each other and define an access relationship between them (i.e. a user can have access to any number of organisations).
If the relationship belongs to the User, I can create a command like grantAccessToOrganisation on the User aggregate which takes an organisationId, but this raises a couple of questions. How do I make sure that the organisationId provided is a real one? It seems like that needs to happen in the command handler, but since the command belongs to the User aggregate, I don't have access to the Organisation aggregate. Also, how should I handle an organisation being removed? It seems like that should have a side effect on all users who have access to it, but I don't seem to have a good way of making that query on the write side.
Try not to think of an aggregate as an "entity" in the sense of traditional systems. Choose the aggregate root as a transactional and consistency boundary.
This means that all commands to a given aggregate are sequential and its state is consistent, so you can be sure that your command is applied to the expected aggregate state and that no changes are being made by another user or process.
As an extreme example, you can even have a single aggregate "System", and the whole system state will be accessible to you. But this means the size of the state will be enormous, and each command will lock the whole system.
So choose your aggregate large enough to control its transactions and small enough not to block other transactions.
In your example, I would guess that User is more about identity, login, profile, avatar, things like that. It can live without knowledge of Organisation and access rights to it. Organisation is the aggregate that deals with access rights, and changing access rights is a transaction that affects a single organisation.
So in your example I would send the grantAccess command to the Organisation, not to the User. But of course it depends on other requirements, and I could be wrong here.
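To make the boundary concrete, here is a plain Python sketch of an Organisation aggregate that owns its access list. It does not use the resolve API; the class and method names are assumptions for illustration only:

```python
from typing import Set


class Organisation:
    """Aggregate that owns the access rights, so granting access is one transaction."""

    def __init__(self, organisation_id: str) -> None:
        self.id = organisation_id
        self._user_ids_with_access: Set[str] = set()

    def grant_access(self, user_id: str) -> None:
        # Only the user's id is stored; the User aggregate is not loaded here.
        self._user_ids_with_access.add(user_id)

    def revoke_access(self, user_id: str) -> None:
        self._user_ids_with_access.discard(user_id)

    def has_access(self, user_id: str) -> bool:
        return user_id in self._user_ids_with_access


acme = Organisation("org-acme")
acme.grant_access("user-1")
acme.grant_access("user-2")
acme.revoke_access("user-2")
```

With this shape, removing an organisation takes its access list with it, so no query across users is needed on the write side. Checking that a user id refers to a real user is not addressed by this sketch.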
Also, there will always be some inter-aggregate business rules that can be implemented with a saga. An example would be a rule that if a User's login is disabled, its access rights are removed after 30 days. A saga is a long-running business transaction that can affect several aggregates.
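The 30-day rule could be sketched as a small process manager like the one below. This is plain Python rather than resolve's saga API, and the event handler names, the re-enable handler, and the polling method due_revocations are all assumptions made for the example:

```python
from datetime import datetime, timedelta
from typing import Dict, List

GRACE_PERIOD = timedelta(days=30)


class AccessRevocationSaga:
    """Tracks disabled logins and reports whose access rights are due for removal."""

    def __init__(self) -> None:
        self._disabled_at: Dict[str, datetime] = {}

    def on_login_disabled(self, user_id: str, at: datetime) -> None:
        self._disabled_at[user_id] = at

    def on_login_enabled(self, user_id: str) -> None:
        # Assumption: re-enabling the login cancels the pending revocation.
        self._disabled_at.pop(user_id, None)

    def due_revocations(self, now: datetime) -> List[str]:
        # Collect the due ids first, then forget them, so each is reported once.
        due = sorted(
            user_id
            for user_id, at in self._disabled_at.items()
            if now - at >= GRACE_PERIOD
        )
        for user_id in due:
            del self._disabled_at[user_id]
        return due


saga = AccessRevocationSaga()
saga.on_login_disabled("user-1", datetime(2024, 1, 1))
saga.on_login_disabled("user-2", datetime(2024, 1, 20))

early = saga.due_revocations(datetime(2024, 1, 15))
due = saga.due_revocations(datetime(2024, 1, 31))
```

For each id returned, the saga would then send a revoke command to every Organisation aggregate concerned.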

Client in Repository Pattern of DDD

I've been reading a book Domain-Driven Design Quickly.
Now I've reached the Repository Pattern.
I am not sure what they are referring to by mentioning the "Client".
What does "Client" mean here?
Databases are part of the infrastructure. A poor solution is for
the client to be aware of the details needed to access a database.
For example, the client has to create SQL queries to retrieve the
desired data. The database query may return a set of records,
exposing even more of its internal details. When many clients
have to create objects directly from the database, it turns out that such code is scattered throughout the entire domain.
The client of a repository is a piece of code (another class), usually in the application layer in the context of DDD/Onion Architecture. The rule of thumb says: one repository per Aggregate Root. If your Aggregate Root is Order, which has a collection of OrderItem inside, you create only an OrderRepository and return the whole Order with ALL its OrderItems, with no lazy loading. Your client (application layer code) should have no idea what is inside the repository (whether it is file based, SQL based, or HTTP based); you treat it as an in-memory collection: repository.GetById(orderId), where repository is an IOrderRepository. That means you can change your repository from in-memory to SQL and back at any time, and your client code (the application layer, or whatever class uses the repository) will not be affected, hence the Liskov Substitution Principle is preserved.
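A small Python sketch of what "client" means here. OrderRepository plays the role of IOrderRepository, and order_total_cents is a hypothetical piece of application-layer code; the field names and the in-memory implementation are assumptions for illustration:

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class OrderItem:
    sku: str
    quantity: int
    unit_price_cents: int


@dataclass
class Order:
    order_id: int
    items: List[OrderItem] = field(default_factory=list)


class OrderRepository(ABC):
    """The abstraction the client depends on (IOrderRepository in the answer)."""

    @abstractmethod
    def get_by_id(self, order_id: int) -> Order:
        ...

    @abstractmethod
    def save(self, order: Order) -> None:
        ...


class InMemoryOrderRepository(OrderRepository):
    """One possible implementation; a SQL-backed one would expose the same methods."""

    def __init__(self) -> None:
        self._orders: Dict[int, Order] = {}

    def get_by_id(self, order_id: int) -> Order:
        # Returns the whole aggregate, items included; raises KeyError if unknown.
        return self._orders[order_id]

    def save(self, order: Order) -> None:
        self._orders[order.order_id] = order


def order_total_cents(repository: OrderRepository, order_id: int) -> int:
    """The client: application-layer code that only sees the abstraction."""
    order = repository.get_by_id(order_id)
    return sum(item.quantity * item.unit_price_cents for item in order.items)


repository = InMemoryOrderRepository()
repository.save(Order(1, [OrderItem("pen", 2, 500), OrderItem("pad", 1, 250)]))
total = order_total_cents(repository, 1)
```

Because order_total_cents only mentions OrderRepository, swapping in another implementation would not require touching it.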

DDD repository input parameters

Which is the suggested way to implement _customRoleRepository in the following examples?
The following code is executed in the application service.
var user = _userRepository.GetById(1);
var customRole = _customRoleRepository.GetById(user.CustomRoleId);
or
var user = _userRepository.GetById(1);
var customRole = _customRoleRepository.GetForUser(user);
Given the two options I would probably go for the first one, which keeps consistency of accessing by an ID.
If possible, it might be preferable to load the custom role when you load the user to avoid another round trip to the database, especially if this is a common operation. This could be implemented as a read model.
This is presuming you have modelled your aggregates correctly... :)
Hate to say it but in a DDD environment, my answer would be neither.
In your first example, the role repository can be ignorant of the user domain which is good but it means the application needs to know that to get the role it needs to pull an id out of the user and then query another repository. In other words, the application is acting as a mapper between user and role.
In the second example, the roles repository now needs to know about the user domain. Not great but on the other hand the application no longer needs to know about roleId. So that is good. Classic sort of trade off between the two approaches.
But in both cases the application still needs two repositories to get its information. What happens when more relations are needed? The number of repositories can quickly grow and things become a mess.
In Domain Driven Design you should try to think in terms of aggregate roots (AR) and domain contexts. For your example context, the user is an AR and the role becomes a child. So you might have:
var user = _userFinder.GetById(1);
var customRole = user.CustomRole;
That hides most of the implementation details from your application and allows you to focus on what your domain entities actually need to do.
Both are equally valid, depending on your needs. GetForUser would be good if you want to ensure the calling code has a valid User aggregate before you try and retrieve the roles - while it does couple the customRoleRepository to knowledge of the User aggregate, if you want to require the calling code to have a valid User aggregate, then that coupling has a purpose.
GetByUserId is more consistent with GetById and has less coupling, so if in your context it doesn't matter to call GetByUserId even if the client doesn't have a valid User aggregate, then that's fine too.
If you are loading by ID, I've also found that using typed identity value objects can be quite helpful in providing an extra level of type safety. There is some conversation about the pros and cons here https://stackoverflow.com/a/5377460/6720449 and here https://groups.google.com/forum/#!topic/dddcqrs/WQ9zRtW3Gbg
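A sketch of typed identity value objects in Python, under the assumption that roles are stored as plain strings and ids wrap integers. Python does not enforce type hints at runtime, so the isinstance check below is added only to make the mix-up visible; a static checker such as mypy would flag it earlier:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UserId:
    value: int


@dataclass(frozen=True)
class CustomRoleId:
    value: int


class CustomRoleRepository:
    """Hypothetical repository keyed by CustomRoleId instead of a bare integer."""

    def __init__(self, roles: Dict[CustomRoleId, str]) -> None:
        self._roles = roles

    def get_by_id(self, role_id: CustomRoleId) -> str:
        if not isinstance(role_id, CustomRoleId):
            raise TypeError("expected a CustomRoleId, got " + type(role_id).__name__)
        return self._roles[role_id]


repo = CustomRoleRepository({CustomRoleId(1): "Auditor"})
role = repo.get_by_id(CustomRoleId(1))
```

Passing UserId(1) where a CustomRoleId is expected is rejected here, whereas two bare integers with the same value would be indistinguishable.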

Ensuring query restrictions are honored during SaveChanges - Breeze security

Consider a typical Breeze controller that limits the results of a query to entities that the logged in user has access to. When the browser calls SaveChanges, does Breeze verify on the server that the entities reported as modified are from the original set?
To put it another way, does the EFContextProvider (in the case of Entity Framework) keep track of entities that have been handed out, so it can check against malicious data passed to SaveChanges? Or does BeforeSaveEntity need to validate that the user has access to the changed entities?
You must guard against malicious data in your BeforeSaveEntity or BeforeSaveEntities methods.
The idea that the EFContextProvider would keep track of entities that have already been handed out is probably something that we would NOT want to do, because:
The EFContextProvider would no longer be stateless, which was a design goal to facilitate scaling.
You would still need to guard against malicious data for "Added" entities in the BeforeXXX methods.
It is actually a valid use case for some of our users to "modify" entities without having first queried them.
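The kind of check such a hook has to perform can be sketched in Python as follows. This is not Breeze's API: EntityChange, the state strings, and the owner lookup are hypothetical stand-ins, and the idea of re-reading the owner from the server-side store for existing entities is an assumption about how one might avoid trusting the client payload:

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class EntityChange:
    entity_id: int
    state: str  # "Added" or "Modified" in this sketch
    owner_id: int  # owner as claimed by the client payload


def before_save_entity(
    change: EntityChange,
    current_user_id: int,
    stored_owner_by_id: Dict[int, int],
) -> bool:
    """Return True only if the logged-in user may save this change."""
    if change.state == "Added":
        # New entities have no stored row yet, so validate the payload itself.
        return change.owner_id == current_user_id
    # For existing entities, ignore the claimed owner and use the stored one.
    stored_owner = stored_owner_by_id.get(change.entity_id)
    return stored_owner == current_user_id


stored_owners = {10: 1, 11: 2}  # entity id -> owner id, as held on the server

own_edit = before_save_entity(EntityChange(10, "Modified", 1), 1, stored_owners)
forged_edit = before_save_entity(EntityChange(11, "Modified", 1), 1, stored_owners)
own_add = before_save_entity(EntityChange(12, "Added", 1), 1, stored_owners)
```

The check needs no memory of earlier queries, which is consistent with keeping the provider stateless.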

CQRS/EventStore - Passing GUIDS to UI?

I am currently using J Oliver's EventStore which as I understand uses Guids for the Stream ID and this is what is used to build my aggregate root.
From a CQRS point of view and a DDD perspective, I should be thinking about the domain and not GUIDs.
So, if I do a GET (Mvc client), am I right to say that my URL should have the identity of my domain object (aggregate root) and from that I get the GUID from the read store which is then used to build my aggregate root from the event store?
Or should I pass the GUID around to my forms and pass them back as hidden form variables? At least this way I know the aggregate root id and do not have to query the read store?
I suppose the first way is the correct way (not using GUIDs in forms), as then all my GETs and POSTs deal with identities of domain objects rather than GUIDs, which the client will not know.
I suppose this also allows me to build a REST-based API which focuses on resources and their identities rather than system-generated GUIDs. Please correct me if I am wrong.
In my opinion you are on the right track here. The UI should rely solely on the read model and not really care about the aggregates. Then, when you need to send a command, you can use the read model to get the ID of the aggregate you are interested in. The read model should be very fast to read from anyway (that's the whole reason behind using different models for reads and writes) and very easy to cache if you need to. This will also give you much nicer URLs.
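A rough Python sketch of that lookup, assuming the URL carries a human-readable slug. ProductReadModel, RenameProduct, and handle_post are invented names, and nothing here is specific to the EventStore library:

```python
import uuid
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class RenameProduct:
    """Command addressed to the aggregate by its stream GUID."""

    aggregate_id: uuid.UUID
    new_name: str


class ProductReadModel:
    """Hypothetical read model mapping the URL identity to the aggregate's GUID."""

    def __init__(self) -> None:
        self._id_by_slug: Dict[str, uuid.UUID] = {}

    def project(self, slug: str, aggregate_id: uuid.UUID) -> None:
        self._id_by_slug[slug] = aggregate_id

    def aggregate_id_for(self, slug: str) -> uuid.UUID:
        return self._id_by_slug[slug]


def handle_post(read_model: ProductReadModel, slug: str, new_name: str) -> RenameProduct:
    # The URL and form only carry the slug; the GUID is resolved on the server.
    return RenameProduct(read_model.aggregate_id_for(slug), new_name)


stream_id = uuid.uuid4()
read_model = ProductReadModel()
read_model.project("blue-widget", stream_id)

command = handle_post(read_model, "blue-widget", "Blue Widget XL")
```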

Resources