Should repositories be both loading and saving entities? - domain-driven-design

In my design, I have Repository classes that get Entities from the database (How it does that it does not matter). But to save Entities back to the database, does it make sense to make the repository do this too? Or does it make more sense to create another class, (such as a UnitOfWork), and give it the responsibility to save stuff by having it accept Entities, and by calling save() on it to tell it to go ahead and do its magic?

In DDD, Repositories are definitely where ALL persistence-related stuff is expected to reside.
If you had saving to and loading from the database encapsulated in more than one class, database-related code will be spread over too many places in your codebase - thus making maintenance significantly harder. Moreover, there will be a high chance that later readers of this code might not understand it at first sight, because such a design does not adhere to the quasi-standards that most developers are expecting to find.
Of course, you can have separate Reader/Writer-helper classes, if that's appropriate in your project. But seen from the Business Layer, the only gateway to persistence should be the repository...
HTH!
Thomas

I would give the repository the overall responsibility for encapsulating all aspects of load and save. This ensures that tricky issues such as managing contention between readers and writes has a place to be managed.
The repository might well use your UnitOfWork class, and might need to expose a BeginUow and Commit methods.

Fowler says that repository api should mimic collection:
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection.

Related

Different persistence repositories for an aggregate in DDD

I have an aggregate with a root entity (Documentation) and a VO (Document). Documents are associated with files (pdfs, images, office documents, etc), so I have to persist the aggregate in a database and files in a ftp server (files cannot be saved in the database because space files is too large).
My db repository class implements an interface with methods like FindXXX, AddDocument, RemoveDocument and others. How could I implement ftp persistence? Should my db repository connect to ftp setver in AddDocument and RemoveDocument? Or I should create a ftp repository class that implements the interface. If so, methods like FindXXX not make sense.
As far as I know about DDD, each aggregate have only one interface repository that represents how can be persisted. It can have multiple "persistence modes" (in a db, ftp, file, etc) but the interface should the same.
As far as I know about DDD, each aggregate have only one interface repository that represents how can be persisted.
That's mostly true; people generally assume that an entire aggregate is going to be stored in a single place. When you distribute the state of the aggregate across multiple storage units, your failure modes need very careful attention.
So one thing to consider is whether the separately stored documents are something that are part of the aggregate, or something that is referenced by the aggregate.
If they are referenced by the aggregate, then you treat them like any other reference to another aggregate. The documentation aggregate stores a identifier/reference/hint for the document, and takes advantage of a domain service to access the document if it needs it.
If they are part of the aggregate, then the usual answer is that "the repository" will be a facade in front of a complicated infrastructure thing that masks the fact that the documentation and the document(s) are stored separately.
In other words, the infrastructure layer will be trying to orchestrate the load and store operations, and the rest of the system doesn't need to know the details.
Late response. But, simply put, you should have two services. In my reading of DDD, repositories are often considered as infrastructure services. In this case, you have two:
A repository / interface for storage, and basic retrieval of document IDs, metadata, and references
A repository / interface for storage, and basic retrieval of blobs of data
Sometimes it makes sense to have multiple aggregates, and repositories. In fact, some of Vaughn Vernon examples on bounded contexts (https://github.com/VaughnVernon/IDDD_Samples) do include aggregates holding references to other aggregates. I would argue that you should do what makes sense, and what feels appropriate.
Indeed, if you were running a post office collection centre, chances are you will have a way of 1. storing the small to large parcels, and 2. curating an index of where every small to large parcels are located in the centre so that you can retrieve it.
My db repository class implements an interface with methods like FindXXX, AddDocument, RemoveDocument and others. How could I implement ftp persistence? Should my db repository connect to ftp setver in AddDocument and RemoveDocument? Or I should create a ftp repository class that implements the interface.
If your database repository connects to FTP in addition to some other data store, you may arguably be putting too much logic, and responsibility in one area. That said, there is nothing wrong with doing this too.
If so, methods like FindXXX not make sense. As far as I know about DDD, each aggregate have only one interface repository that represents how can be persisted.
For this specific problem, most DDD practitioners will recommend you have a separate view service / model. It can produce a materialised / view DTO across repositories or services.
Fundamentally, it should be easy to test individual parts, and to replace underlying implementations. If you decided to switch (or even include support) from FTP to Google Cloud Storage / AWS S3 one day, then there might be more work involved, and changes to test cases.

DDD. Should I modify a entity inside a repository?

I have a question about implementing DDD and repository pattern.
Should I modify a entity inside a repository?
Let's say I have an Order and want to mark that order as finished.
As I see this I have two choices.
1.
var order _orderRepository.GetById(1);
order.Finish();
_orderRepository.Update(order);
...where the change is persisted to the database in the Update call.
2.
var order _orderRepository.GetById(1);
var finishedOrder = _orderRepository.Finish(order);
...where the change is persisted to the database in the Finish call.
Is there a advantage of using one method over the other? What is the DDD-way of doing this?
You should not modify it in the repository.
The reason is that the repository is responsible of abstracting away the persistence (i.e. reading/writing to the data storage).
If you also make it responsible of some business logic you are violating the Single Responsibility Principle.
If you are doing automated testing, it also means that you have to do integration tests to be sure that the database communication/mapping works and then unit tests to verify the business logic that you introduced in it.
It can seem trivial. But it's only trivial the first time you violate the principle. But one violation usually leads to another and another and finally an application that isn't as easy to maintain :)
An application where classes have mixed responsibilities are also harder to navigate. Each time you want to update a feature you have to go through all layers to find where the actual logic is done.
Use the application layer to coordinate behaviors for one or more domain objects, the domain objects should execute all state changes and lastly the repo should persist those changes to the database or wherever you are storing the domain's state.

DDD repository and factory

In my application a few layers.
In this topic will focus on Domain and Infrastructure layers.
I have repository interface ClientRepositoryInterface in Domain layer.
And I have implementation of this interface ClientRepositoryImpl in Infrastructure layer.
But to reconstitute the object in the middle of the cycle of its existence I need factory(ReconstitutionClientFactory).
Call the factory will be in the repository.
The book by Eric Evans is described as a normal practice.
But where this factory(ReconstitutionClientFactory) should be located? In Domain or in Infrastructure layer?
I think in Domain...
BUT! But then the lower layer will directly call a higher layer!
This is wrong, but how to do right?
Factory & Repository Concepts
To answer your question, I think it's important to focus on responsibilities of the concepts defined by DDD.
In the blue book, there is a section that deals with the problem that you describe:
A FACTORY handles the beginning of an object’s life; a REPOSITORY helps manage the middle and the end.
and specifically for your question:
Because the REPOSITORY is, in this case, creating objects based on data, many people consider the REPOSITORY to be a FACTORY—indeed it is, from a technical point of view.
(both quotes from Evans, chapter 6, section "The relationship with factories")
To keep the concepts pure, it is important that the interface of your factories and repositories are clean. So don't allow creating new business objects through the repository interface, and don't allow querying existing ones through the factory interface.
Keeping the interfaces clean does however not mean that you should not use a factory from the repository implementation, because, after all, the repository creates an instance at some point, and if that instance creation is complex, a factory is the appropriate solution.
To quote Evans again:
Whenever there is exposed complexity in reconstituting an object from another medium, the FACTORY is a good option.
Note, however, that the repository will most likely call a different method on the factory than the clients that really want to create a new domain object (as opposed to reconstitution).
There is even an example in Evans' book that illustrates approach:
Answer to your question
Now that it is clear that this is allowed, lets focus on your question of where to put the factory:
The DDD factory interface belongs in the domain, because your domain logic uses this to create domain objects.
The DDD reconstitution factory interface does not belong in the domain, since this is only relevant for your repository. It does not exist in the real world of your domain.
Now if you're using an architecture that prohibits dependencies from the domain to the infrastructure (which you probably should when applying DDD), it's clear that the factory implementation belongs in the infrastructure. Note that it does not matter whether you call your layers layers, rings, realms or whatever, the dependencies are the important part.
First of all, the layers approach is kinda obsolete. When talking layers think 'context', who's on top of who isn't important.
The repository is in charge of restoring an object. A factory just creates a new object. Note the different semantics. The repository knows how saving/restoring to/from persistence is done and that depends on the storage and the method of access.
So, everything is done inside the repository i.e in the Infrastructure. If you serialize things, then you need just to deserialize back (this is how a document db does things anyway). If you're using an ORM or store things in tables then you'll do all the query required to get the data and repopulate the object. An ORM is the easiest way since it can use reflection to populate private properties. In this case the ORM itself is the factory.
One more thing, restoring, while technically can be done by a domain factory, it isn't the factory's purpose to do that because it breaks the layer boundaries. We want to keep everything persistence related in the Infrastructure.

DDD & Factories - Intensive CRUD Operations

We have recently decided to adopt DDD in my team for our new projects because of the so many obvious benefits (coming from the Active-Record pattern school) and there are a couple of things that are yet unclear.
Say I have an entity Transaction that depends on the following entities (that each in turn depends on other so many entities):
1. Customer
2. Account
3. Currency
When I make use of factories to instantiate a Transaction entity to pass to a Domain Service for some fancy business rules, do I make so many queries to setup all these dependent instances?
If I have overloads in my factory that skip such dependencies then those will be null in some cases and it will become too complicated to differentiate when I can access those properties and when I cannot. With Active-Record pattern I just use lazy loading and have them load only on demand. Any ideas with DDD?
EDIT:
In my scenario “Transaction” seems to be the best candidate for an Aggregate root. I have defined a method in my Application Service “InitiateTransaction” (also have a “FinalizeTransaction” as it involves a redirect to PayPal) and takes as parameters the DTOs needed to carry AccountId, CurrencyId, LanguageId and various other foreign keys as well as Transaction attributes.
When calling my Domain Services (Transaction Processor and Fraud Rule Evaluator), I need to specify the “Transaction” Aggregate with all dependencies loaded (“Transaction.Customer”, “Transaction.Currency”, etc.).
So if I am correct the steps required are:
1. Call some repository(ies) to retrieve Customer, Currency etc.
2. Call TransactionFactory with dependencies specified above to get a Transaction object
3. Call Domain Services with fully loaded Transaction object for business rules to take place
Correct? Additionally, my concern was about steps 1 and 2.
If “Customer”, “Currency” and other Entities/Value Objects “Transaction” depends on, have in turn other dependencies. Do I try to set up those as well? Because it seems to me that if I do I will end up with very bloated code in my Application Service and not very reusable to place in a separate method. However, if I don’t and just retrieve those from a repository with a “GetById(id)”as you suggested, my code could end up buggy as say I need property “Transaction.Customer.CreatedByUser” which returns a “User” instance, it will be null because repositories only load flat instances.
EDIT:
I ended up using GetById(id) to load only the dependencies I knew they were needed in my Services. Not a big fun of accidentally accessing null instances due to flat loading but I have my unit tests to protect me from taking it to production!!
I highly doubt it that Currency is an entity, however it's important to model things like how they defined and use by the real Domain. Forget factories or other implementation details like the db, you need to make sure you have defined the concepts right.
Once you've done that, you'd already identified the aggregate root as well. Btw, the entities should encapsulate the relevant business rules. Use Services to implement use-cases i.e to manage the interaction between the domain objects and other parts such as the repository.
You should keep EVERYTHING related to db and CRUD in the repository, and have the repo work only with the aggregate roots. Also, for querying purposes, you should use CQRS so that all the queries would be done on a read model. For Domain purposes, a Get(id) is 99% enough and that method returns an aggregate root.
Be aware that DDD is very tricky, the most difficult part is modeling the Domain correctly, all the buzzwords are useless if the model is wrong.

could domain models be aware of repositories?

May be for some domain logic implementation entities need access to repo for update/delete of self or any related entity. Does this sound right ??
No, it doesn't, at least for the question tagged with "domain-driven-design" tag.
Definitely, Active Record pattern has a right to live in some systems and some people find strong coupling useful, but in DDD the proposed way is to use repositories explicitly:
Evans DDD, p.152: For each type of object that needs global access, create an object that can provide the illusion of an in-memory collection of all objects of that type. «...» Provide REPOSITORIES only for AGGREGATE roots that actually need direct access. Keep the client focused on the model, delegating all object storage and access to the REPOSITORIES.
So, in DDD, repository not only encapsulates the infrastructure code, required to access the database, but the whole idea that the objects must be stored and loaded.
If you are doing some compound actions which involve saving and loading from the database, then the services that have references to the repositories are the best candidates.
While it definitely sounds dangerous for an entity to be able to access its own repository to store or delete itself (see persistence ignorance), in some particular cases I could tolerate that an entity exceptionnally requests from a Repository another aggregate root that it doesn't already hold a reference to.
However, note that domain entities should only know about abstractions of repositories (i.e. interfaces that reside in the domain layer) and not their concrete implementations. Therefore, don't have the Domain layer reference the Infrastructure layer, but instead inject instances of concrete repositories at runtime where you need them.
And this shouldn't be the norm, anyway.

Resources