Different persistence repositories for an aggregate in DDD - domain-driven-design

I have an aggregate with a root entity (Documentation) and a VO (Document). Documents are associated with files (PDFs, images, office documents, etc.), so I have to persist the aggregate in a database and the files on an FTP server (the files cannot be saved in the database because they are too large).
My db repository class implements an interface with methods like FindXXX, AddDocument, RemoveDocument and others. How could I implement FTP persistence? Should my db repository connect to the FTP server in AddDocument and RemoveDocument? Or should I create an FTP repository class that implements the interface? If so, methods like FindXXX do not make sense.
As far as I know about DDD, each aggregate has only one repository interface that represents how it can be persisted. It can have multiple "persistence modes" (in a db, FTP, file, etc.) but the interface should be the same.

As far as I know about DDD, each aggregate has only one repository interface that represents how it can be persisted.
That's mostly true; people generally assume that an entire aggregate is going to be stored in a single place. When you distribute the state of the aggregate across multiple storage units, your failure modes need very careful attention.
So one thing to consider is whether the separately stored documents are part of the aggregate, or are merely referenced by the aggregate.
If they are referenced by the aggregate, then you treat them like any other reference to another aggregate. The documentation aggregate stores an identifier/reference/hint for the document, and takes advantage of a domain service to access the document if it needs it.
If they are part of the aggregate, then the usual answer is that "the repository" will be a facade in front of a complicated infrastructure thing that masks the fact that the documentation and the document(s) are stored separately.
In other words, the infrastructure layer will be trying to orchestrate the load and store operations, and the rest of the system doesn't need to know the details.
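The facade idea can be sketched roughly as follows. This is a minimal illustration in Python, not the asker's actual code: `MetadataStore`, `FileStore`, the in-memory stand-ins and the `find_by_id` method are all hypothetical names, and the in-memory classes replace the real database and FTP clients. Failure handling between the two stores is deliberately left out.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Protocol


@dataclass(frozen=True)
class Document:
    """Value object: a file name plus its binary content."""
    name: str
    content: bytes


@dataclass
class Documentation:
    """Aggregate root that owns its documents."""
    id: str
    documents: List[Document] = field(default_factory=list)


class MetadataStore(Protocol):
    """Hypothetical port for the database side."""
    def save(self, doc_id: str, file_names: List[str]) -> None: ...
    def load(self, doc_id: str) -> Optional[List[str]]: ...


class FileStore(Protocol):
    """Hypothetical port for the FTP side."""
    def put(self, path: str, content: bytes) -> None: ...
    def get(self, path: str) -> bytes: ...


class InMemoryMetadataStore:
    """Stand-in for the database, for illustration only."""
    def __init__(self) -> None:
        self._rows: Dict[str, List[str]] = {}

    def save(self, doc_id: str, file_names: List[str]) -> None:
        self._rows[doc_id] = list(file_names)

    def load(self, doc_id: str) -> Optional[List[str]]:
        return self._rows.get(doc_id)


class InMemoryFileStore:
    """Stand-in for the FTP server, for illustration only."""
    def __init__(self) -> None:
        self._files: Dict[str, bytes] = {}

    def put(self, path: str, content: bytes) -> None:
        self._files[path] = content

    def get(self, path: str) -> bytes:
        return self._files[path]


class DocumentationRepository:
    """Single repository that hides the two storage locations."""
    def __init__(self, metadata: MetadataStore, files: FileStore) -> None:
        self._metadata = metadata
        self._files = files

    def add(self, documentation: Documentation) -> None:
        # Write the files first so the metadata never points at a missing file.
        for doc in documentation.documents:
            self._files.put(f"{documentation.id}/{doc.name}", doc.content)
        names = [doc.name for doc in documentation.documents]
        self._metadata.save(documentation.id, names)

    def find_by_id(self, doc_id: str) -> Optional[Documentation]:
        names = self._metadata.load(doc_id)
        if names is None:
            return None
        docs = [Document(n, self._files.get(f"{doc_id}/{n}")) for n in names]
        return Documentation(doc_id, docs)
```

Callers only see `DocumentationRepository`; whether the files live on FTP or elsewhere is decided by whichever `FileStore` implementation is passed in.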

Late response. But, simply put, you should have two services. In my reading of DDD, repositories are often considered as infrastructure services. In this case, you have two:
A repository / interface for storage, and basic retrieval of document IDs, metadata, and references
A repository / interface for storage, and basic retrieval of blobs of data
Sometimes it makes sense to have multiple aggregates and repositories. In fact, some of Vaughn Vernon's examples on bounded contexts (https://github.com/VaughnVernon/IDDD_Samples) do include aggregates holding references to other aggregates. I would argue that you should do what makes sense, and what feels appropriate.
Indeed, if you were running a post office collection centre, chances are you would have a way of 1. storing the small to large parcels, and 2. curating an index of where each parcel is located in the centre so that you can retrieve it.
My db repository class implements an interface with methods like FindXXX, AddDocument, RemoveDocument and others. How could I implement FTP persistence? Should my db repository connect to the FTP server in AddDocument and RemoveDocument? Or should I create an FTP repository class that implements the interface?
If your database repository connects to FTP in addition to some other data store, you may arguably be putting too much logic, and responsibility in one area. That said, there is nothing wrong with doing this too.
If so, methods like FindXXX do not make sense. As far as I know about DDD, each aggregate has only one repository interface that represents how it can be persisted.
For this specific problem, most DDD practitioners will recommend you have a separate view service / model. It can produce a materialised / view DTO across repositories or services.
Fundamentally, it should be easy to test individual parts, and to replace underlying implementations. If you decide one day to switch from FTP to Google Cloud Storage / AWS S3 (or to support both), a repository that mixes database and FTP logic might mean more work, and more changes to test cases.
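A rough sketch of the view service idea, in Python. All names here (`DocumentIndex`, `BlobStore`, `DocumentView`, `DocumentViewService` and the dict-backed stand-ins) are assumptions made for illustration; the point is only that the view is assembled from two independent services, each of which can be replaced or tested on its own.

```python
from dataclasses import dataclass
from typing import Dict, Protocol


class DocumentIndex(Protocol):
    """Hypothetical metadata repository (IDs, titles, references)."""
    def title_of(self, document_id: str) -> str: ...


class BlobStore(Protocol):
    """Hypothetical blob repository (FTP today, something else tomorrow)."""
    def size_of(self, document_id: str) -> int: ...


@dataclass(frozen=True)
class DocumentView:
    """Read-only DTO assembled across both services."""
    document_id: str
    title: str
    size_in_bytes: int


class DocumentViewService:
    def __init__(self, index: DocumentIndex, blobs: BlobStore) -> None:
        self._index = index
        self._blobs = blobs

    def view(self, document_id: str) -> DocumentView:
        return DocumentView(
            document_id=document_id,
            title=self._index.title_of(document_id),
            size_in_bytes=self._blobs.size_of(document_id),
        )


class DictIndex:
    """In-memory stand-in for the metadata repository."""
    def __init__(self, titles: Dict[str, str]) -> None:
        self._titles = titles

    def title_of(self, document_id: str) -> str:
        return self._titles[document_id]


class DictBlobStore:
    """In-memory stand-in for the blob repository."""
    def __init__(self, blobs: Dict[str, bytes]) -> None:
        self._blobs = blobs

    def size_of(self, document_id: str) -> int:
        return len(self._blobs[document_id])
```

Swapping FTP for another storage backend would then mean writing one new `BlobStore` implementation, while the index and the view service stay untouched.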

Related

DDD repository and factory

My application has a few layers.
This question focuses on the Domain and Infrastructure layers.
I have a repository interface, ClientRepositoryInterface, in the Domain layer.
And I have an implementation of this interface, ClientRepositoryImpl, in the Infrastructure layer.
But to reconstitute an object in the middle of its life cycle I need a factory (ReconstitutionClientFactory).
The factory will be called from the repository.
Eric Evans's book describes this as normal practice.
But where should this factory (ReconstitutionClientFactory) be located? In the Domain or in the Infrastructure layer?
I think in the Domain...
But then the lower layer will directly call a higher layer!
This is wrong, but how do I do it right?
Factory & Repository Concepts
To answer your question, I think it's important to focus on responsibilities of the concepts defined by DDD.
In the blue book, there is a section that deals with the problem that you describe:
A FACTORY handles the beginning of an object’s life; a REPOSITORY helps manage the middle and the end.
and specifically for your question:
Because the REPOSITORY is, in this case, creating objects based on data, many people consider the REPOSITORY to be a FACTORY—indeed it is, from a technical point of view.
(both quotes from Evans, chapter 6, section "The relationship with factories")
To keep the concepts pure, it is important that the interfaces of your factories and repositories are clean. So don't allow creating new business objects through the repository interface, and don't allow querying existing ones through the factory interface.
Keeping the interfaces clean does however not mean that you should not use a factory from the repository implementation, because, after all, the repository creates an instance at some point, and if that instance creation is complex, a factory is the appropriate solution.
To quote Evans again:
Whenever there is exposed complexity in reconstituting an object from another medium, the FACTORY is a good option.
Note, however, that the repository will most likely call a different method on the factory than the clients that really want to create a new domain object (as opposed to reconstitution).
There is even an example in Evans' book that illustrates this approach.
Answer to your question
Now that it is clear that this is allowed, let's focus on your question of where to put the factory:
The DDD factory interface belongs in the domain, because your domain logic uses this to create domain objects.
The DDD reconstitution factory interface does not belong in the domain, since this is only relevant for your repository. It does not exist in the real world of your domain.
Now if you're using an architecture that prohibits dependencies from the domain to the infrastructure (which you probably should when applying DDD), it's clear that the factory implementation belongs in the infrastructure. Note that it does not matter whether you call your layers layers, rings, realms or whatever, the dependencies are the important part.
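The split between the two kinds of factory might look like the following Python sketch. It is only an illustration under assumptions: the `Client` fields, the `create` and `from_row` method names, the dict-shaped "row" and `InMemoryClientRepository` (standing in for ClientRepositoryImpl) are all hypothetical.

```python
import uuid
from dataclasses import dataclass
from typing import Dict, Optional


# --- Domain layer -------------------------------------------------------

@dataclass
class Client:
    id: str
    name: str


class ClientFactory:
    """Domain factory: begins the life of a brand-new Client."""
    def create(self, name: str) -> Client:
        if not name.strip():
            raise ValueError("a client needs a name")
        return Client(id=str(uuid.uuid4()), name=name.strip())


# --- Infrastructure layer -----------------------------------------------

class ReconstitutionClientFactory:
    """Rebuilds an existing Client from stored data; it never assigns a
    new identity and is only used by the repository implementation."""
    def from_row(self, row: Dict[str, str]) -> Client:
        return Client(id=row["id"], name=row["name"])


class InMemoryClientRepository:
    """Stand-in for ClientRepositoryImpl; a dict replaces the database."""
    def __init__(self, factory: ReconstitutionClientFactory) -> None:
        self._factory = factory
        self._rows: Dict[str, Dict[str, str]] = {}

    def add(self, client: Client) -> None:
        self._rows[client.id] = {"id": client.id, "name": client.name}

    def find_by_id(self, client_id: str) -> Optional[Client]:
        row = self._rows.get(client_id)
        return None if row is None else self._factory.from_row(row)
```

The infrastructure code depends on the domain's `Client`, never the other way round, so the dependency direction stays intact.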
First of all, the layers approach is kinda obsolete. When talking layers think 'context', who's on top of who isn't important.
The repository is in charge of restoring an object. A factory just creates a new object. Note the different semantics. The repository knows how saving/restoring to/from persistence is done and that depends on the storage and the method of access.
So, everything is done inside the repository, i.e. in the Infrastructure. If you serialize things, then you just need to deserialize them back (this is how a document db does things anyway). If you're using an ORM or storing things in tables, then you'll run all the queries required to get the data and repopulate the object. An ORM is the easiest way since it can use reflection to populate private properties. In this case the ORM itself is the factory.
One more thing: while restoring can technically be done by a domain factory, it isn't the factory's purpose to do that, because it breaks the layer boundaries. We want to keep everything persistence related in the Infrastructure.

DDD & Factories - Intensive CRUD Operations

We have recently decided to adopt DDD in my team for our new projects because of its many obvious benefits (we come from the Active-Record pattern school), and a couple of things are still unclear.
Say I have an entity Transaction that depends on the following entities (each of which in turn depends on many other entities):
1. Customer
2. Account
3. Currency
When I make use of factories to instantiate a Transaction entity to pass to a Domain Service for some fancy business rules, do I have to run that many queries to set up all these dependent instances?
If I have overloads in my factory that skip such dependencies, then those will be null in some cases, and it will become too complicated to tell when I can access those properties and when I cannot. With the Active-Record pattern I just use lazy loading and have them load only on demand. Any ideas with DDD?
EDIT:
In my scenario “Transaction” seems to be the best candidate for an Aggregate root. I have defined a method in my Application Service, “InitiateTransaction” (there is also a “FinalizeTransaction”, as it involves a redirect to PayPal), which takes as parameters the DTOs needed to carry AccountId, CurrencyId, LanguageId and various other foreign keys as well as Transaction attributes.
When calling my Domain Services (Transaction Processor and Fraud Rule Evaluator), I need to specify the “Transaction” Aggregate with all dependencies loaded (“Transaction.Customer”, “Transaction.Currency”, etc.).
So if I am correct the steps required are:
1. Call some repository(ies) to retrieve Customer, Currency etc.
2. Call TransactionFactory with dependencies specified above to get a Transaction object
3. Call Domain Services with fully loaded Transaction object for business rules to take place
Correct? Additionally, my concern was about steps 1 and 2.
If “Customer”, “Currency” and the other Entities/Value Objects that “Transaction” depends on have dependencies of their own, do I try to set those up as well? It seems to me that if I do, I will end up with very bloated code in my Application Service that is not reusable enough to move into a separate method. However, if I don’t, and just retrieve them from a repository with a “GetById(id)” as you suggested, my code could end up buggy: say I need the property “Transaction.Customer.CreatedByUser”, which returns a “User” instance; it will be null because repositories only load flat instances.
EDIT:
I ended up using GetById(id) to load only the dependencies I knew were needed in my Services. I'm not a big fan of accidentally accessing null instances due to flat loading, but I have my unit tests to protect me from taking it to production!
I highly doubt that Currency is an entity; however, it's important to model things the way they are defined and used in the real Domain. Forget factories or other implementation details like the db; you need to make sure you have defined the concepts right.
Once you've done that, you'll have identified the aggregate root as well. Btw, the entities should encapsulate the relevant business rules. Use Services to implement use-cases, i.e. to manage the interaction between the domain objects and other parts such as the repository.
You should keep EVERYTHING related to db and CRUD in the repository, and have the repo work only with the aggregate roots. Also, for querying purposes, you should use CQRS so that all the queries would be done on a read model. For Domain purposes, a Get(id) is 99% enough and that method returns an aggregate root.
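As a small Python sketch of "the repo works only with aggregate roots": the repository below exposes a `get` by id that returns the whole `Transaction` aggregate. The field names, modelling `Currency` as a value object, holding the customer as an id only, and the `TransactionNotFound` exception are assumptions for illustration, not something stated in the question.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Dict


@dataclass(frozen=True)
class Currency:
    """Modelled as a value object here (an assumption), not an entity."""
    code: str


@dataclass
class Transaction:
    """Aggregate root; the customer is referenced by id only (assumption)."""
    id: str
    customer_id: str
    amount: Decimal
    currency: Currency


class TransactionNotFound(Exception):
    pass


class InMemoryTransactionRepository:
    """Dict-backed stand-in; a real one would hide all db/CRUD details."""
    def __init__(self) -> None:
        self._items: Dict[str, Transaction] = {}

    def add(self, transaction: Transaction) -> None:
        self._items[transaction.id] = transaction

    def get(self, transaction_id: str) -> Transaction:
        # Returns the complete aggregate root or fails loudly; callers
        # never receive a half-loaded object with null parts.
        try:
            return self._items[transaction_id]
        except KeyError:
            raise TransactionNotFound(transaction_id) from None
```

Queries that need richer, joined data would go to a separate read model rather than through this repository.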
Be aware that DDD is very tricky, the most difficult part is modeling the Domain correctly, all the buzzwords are useless if the model is wrong.

could domain models be aware of repositories?

Maybe, for some domain logic, entities need access to a repo to update/delete themselves or a related entity. Does this sound right?
No, it doesn't, at least for the question tagged with "domain-driven-design" tag.
Definitely, Active Record pattern has a right to live in some systems and some people find strong coupling useful, but in DDD the proposed way is to use repositories explicitly:
Evans DDD, p.152: For each type of object that needs global access, create an object that can provide the illusion of an in-memory collection of all objects of that type. «...» Provide REPOSITORIES only for AGGREGATE roots that actually need direct access. Keep the client focused on the model, delegating all object storage and access to the REPOSITORIES.
So, in DDD, a repository encapsulates not only the infrastructure code required to access the database, but also the whole idea that objects must be stored and loaded.
If you are doing some compound actions which involve saving and loading from the database, then the services that have references to the repositories are the best candidates.
While it definitely sounds dangerous for an entity to be able to access its own repository to store or delete itself (see persistence ignorance), in some particular cases I could tolerate an entity exceptionally requesting from a Repository another aggregate root that it doesn't already hold a reference to.
However, note that domain entities should only know about abstractions of repositories (i.e. interfaces that reside in the domain layer) and not their concrete implementations. Therefore, don't have the Domain layer reference the Infrastructure layer, but instead inject instances of concrete repositories at runtime where you need them.
And this shouldn't be the norm, anyway.
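A minimal Python sketch of that exceptional case, with hypothetical names throughout (`Order`, `Customer`, `CustomerRepository`, `confirm`): the entity only knows the repository abstraction, and the concrete implementation is handed in at runtime by the caller.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


@dataclass
class Customer:
    id: str
    blocked: bool = False


class CustomerRepository(Protocol):
    """Abstraction that lives in the domain layer."""
    def find_by_id(self, customer_id: str) -> Optional[Customer]: ...


@dataclass
class Order:
    id: str
    customer_id: str
    confirmed: bool = False

    def confirm(self, customers: CustomerRepository) -> None:
        # The entity asks for another aggregate root it holds no reference
        # to. It never stores or deletes itself through a repository.
        customer = customers.find_by_id(self.customer_id)
        if customer is None or customer.blocked:
            raise ValueError("order cannot be confirmed for this customer")
        self.confirmed = True


class InMemoryCustomerRepository:
    """Concrete implementation; would live in the infrastructure layer."""
    def __init__(self, customers: Dict[str, Customer]) -> None:
        self._customers = customers

    def find_by_id(self, customer_id: str) -> Optional[Customer]:
        return self._customers.get(customer_id)
```

In most cases the same check would sit in a service that loads both aggregates and passes the `Customer` to the `Order` directly.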

Repository or ServiceAgent in DDD

I have a system that talks to the database using repositories. What is the correct term when the data source is a remote service? Or, put another way:
Repository is for databases as [...] is for external Web-Services.
I have found mentions of ServiceAgents in many places, but I don't know if this is the correct term.
In DDD, Repositories represent a virtual collection of entities (aggregate roots in particular). So, if you have 10M persisted customers, you would work with a repository as if it were a collection of all 10M in memory. Repositories generally only deal with the same types of operations you would find on a collection: Add something, Remove something, Find something in the collection.
If the actual persistence of the data occurs via a Web Service, the implementation of the repository might interact with a Web Service proxy rather than with a database. The fact that the persistence involves a web service doesn't drive how it should be expressed by your domain however. That is to say, whether the data is persisted via direct database calls, an ORM, a Web Service, or a courier pigeon is an implementation detail.
Now, if your domain model has dependencies to be facilitated by external services (e.g. credit card validation, address verification, etc.), this should be expressed as a Domain Service in the form of an interface which defines the required operations in terms of the domain model. To be clear, Domain Services are operations which are logically part of your domain, but don't fit cleanly for some reason or another on a given entity or value object. Behavior facilitated by an external service is just one example of when you might use a Domain Service, so don't think of Domain Services as "the repository pattern for Web Services" or some such thing.
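As an illustration of a Domain Service expressed as an interface, here is a small Python sketch. The names `CreditCard`, `CreditCardValidator`, `RemoteCreditCardValidator` and the shape of the response dictionary are all assumptions; `call_service` stands in for whatever web service client would really be used.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Protocol


@dataclass(frozen=True)
class CreditCard:
    number: str


class CreditCardValidator(Protocol):
    """Domain Service interface, phrased in terms of the domain model."""
    def is_valid(self, card: CreditCard) -> bool: ...


class RemoteCreditCardValidator:
    """Infrastructure adapter that delegates to an external service."""
    def __init__(self, call_service: Callable[[str], Dict[str, str]]) -> None:
        self._call_service = call_service

    def is_valid(self, card: CreditCard) -> bool:
        # The response format is hypothetical; the domain only sees a bool.
        response = self._call_service(card.number)
        return response.get("status") == "ok"


def fake_service(number: str) -> Dict[str, str]:
    """Test double replacing the real remote call."""
    return {"status": "ok" if number.startswith("4") else "rejected"}
```

The domain code depends only on `CreditCardValidator`, so the fact that validation happens remotely stays an implementation detail.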
Not sure if this is the official term but I'd refer to this as a "Service Proxy".
I'm just beginning to wrap my head around most of the DDD principles, but my understanding is that Repositories are generally used to abstract databases, though they could be used to wrap anything that behaves as a persistent and/or queryable store. So in short, I guess what I'm suggesting is that if your service behaves enough like a database, then I don't see any problem with using a repository to abstract access to the entities that it deals with.
Otherwise, if the web service doesn't directly deal with data or behavior that is in your domain, then perhaps your case may be a candidate for an infrastructure level service in your application layer.
If anybody has any objections then please comment as I would like to know if I myself am misunderstanding the intended use of repositories.

Should repositories be both loading and saving entities?

In my design, I have Repository classes that get Entities from the database (how they do that does not matter). But to save Entities back to the database, does it make sense to make the repository do this too? Or does it make more sense to create another class (such as a UnitOfWork) and give it the responsibility to save stuff by having it accept Entities, and by calling save() on it to tell it to go ahead and do its magic?
In DDD, Repositories are definitely where ALL persistence-related stuff is expected to reside.
If you had saving to and loading from the database encapsulated in more than one class, database-related code will be spread over too many places in your codebase - thus making maintenance significantly harder. Moreover, there will be a high chance that later readers of this code might not understand it at first sight, because such a design does not adhere to the quasi-standards that most developers are expecting to find.
Of course, you can have separate Reader/Writer-helper classes, if that's appropriate in your project. But seen from the Business Layer, the only gateway to persistence should be the repository...
HTH!
Thomas
I would give the repository the overall responsibility for encapsulating all aspects of load and save. This ensures that tricky issues such as managing contention between readers and writers have a place to be managed.
The repository might well use your UnitOfWork class, and might need to expose BeginUow and Commit methods.
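One possible shape for this, sketched in Python under assumptions: the repository keeps pending changes itself and only writes them on `commit`, which is a simplified stand-in for delegating to a real UnitOfWork class. The method names, the `rollback` method, and the string-valued "entities" are hypothetical; a dict stands in for the database.

```python
from typing import Dict, Optional, Set


class CustomerRepository:
    """Loads and saves in one place; writes are deferred until commit()."""
    def __init__(self, storage: Dict[str, str]) -> None:
        self._storage = storage                   # stands in for the database
        self._pending_saves: Dict[str, str] = {}
        self._pending_removals: Set[str] = set()

    def find(self, customer_id: str) -> Optional[str]:
        # Pending changes are visible to the caller before they are stored.
        if customer_id in self._pending_removals:
            return None
        if customer_id in self._pending_saves:
            return self._pending_saves[customer_id]
        return self._storage.get(customer_id)

    def add(self, customer_id: str, name: str) -> None:
        self._pending_removals.discard(customer_id)
        self._pending_saves[customer_id] = name

    def remove(self, customer_id: str) -> None:
        self._pending_saves.pop(customer_id, None)
        self._pending_removals.add(customer_id)

    def commit(self) -> None:
        # Apply all pending changes in one step.
        self._storage.update(self._pending_saves)
        for customer_id in self._pending_removals:
            self._storage.pop(customer_id, None)
        self.rollback()

    def rollback(self) -> None:
        # Forget pending changes without touching the storage.
        self._pending_saves.clear()
        self._pending_removals.clear()
```

From the business layer's point of view the repository remains the only gateway to persistence; how the batching is implemented behind it can change freely.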
Fowler says that a repository API should mimic a collection:
A Repository mediates between the domain and data mapping layers, acting like an in-memory domain object collection.
