How do read-only database views fit into the repository pattern? - domain-driven-design

Example: Your database has a SQL view named "CustomerOrdersOnHold". This view returns a filtered mix of specific customer and order data fields. You need to fetch data from this view in your application. How does access to such a view fit into the repository pattern? Would you create a "CustomerOrdersOnHoldRepository"? Is a read-only view such as this considered an aggregate root?

I would prefer to separate the read repository, preferably even renaming it to Finder or Reader. A repository is meant for domain usage, not for querying read-only data; you can refer to this article and this one, which explain the use of a Finder separated from the repository.
I would also recommend separating the read model from the write model, i.e. the CQRS architecture.
This architecture allows you to separate the read model from the write model even in terms of data storage, and to use event sourcing.
For a middle-ground solution, you can apply some CQRS concepts without the complexity of separate databases by just separating repositories from finders; read this post.
For a sample of this type of solution (the same database, but with finders separated from repositories), check this sample.
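The Finder idea above can be sketched roughly as follows. This is a minimal Python illustration, not the code from the linked sample: the table layout, the column names and the `CustomerOrdersOnHoldFinder` class are assumptions made up for the example, and an in-memory SQLite database stands in for the real one. The point is only that the finder exposes query methods and nothing else.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerOrderOnHold:
    """Read-only row shape returned by the view; it carries no behaviour."""
    customer_name: str
    order_id: int
    total: float


class CustomerOrdersOnHoldFinder:
    """Query-only access to the view: no add/save/delete methods exist."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def find_all(self) -> list[CustomerOrderOnHold]:
        rows = self._conn.execute(
            "SELECT customer_name, order_id, total "
            "FROM CustomerOrdersOnHold ORDER BY order_id"
        )
        return [CustomerOrderOnHold(*row) for row in rows]


def build_demo_db() -> sqlite3.Connection:
    """Hypothetical schema and data, present only so the sketch runs."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY, customer_id INTEGER,
            total REAL, status TEXT
        );
        CREATE VIEW CustomerOrdersOnHold AS
            SELECT c.name AS customer_name, o.id AS order_id, o.total
            FROM orders o JOIN customers c ON c.id = o.customer_id
            WHERE o.status = 'on_hold';
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bob');
        INSERT INTO orders VALUES
            (10, 1, 99.5, 'on_hold'),
            (11, 2, 20.0, 'shipped'),
            (12, 2, 5.0, 'on_hold');
        """
    )
    return conn


finder = CustomerOrdersOnHoldFinder(build_demo_db())
on_hold = finder.find_all()
```

The write-side repositories for Customer and Order would live next to this class and use the same connection, which is the "same database, separate finders" variant described above.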

Your read-only data would be considered Value Objects in the DDD world.
I typically place access methods for value objects in existing repositories until such time that it makes sense to create a separate repository. It's similar to a method that might return a static list of states to be used on an address form:
public interface IAddressRepository
{
    Address GetAddress(string addressID);
    List<string> GetStates(string country);
}

I think that it is fine to have a separate repository like "CustomerOrdersOnHoldRepository". The interface of the repository will reflect the fact that the objects are read-only (by not defining Save/Add/MakePersistent methods).
From How to write a repository:
... But there is another strategy that I quite like: multiple
Repositories. In our ordering example there is no reason we can't have
two Repositories: AllOrders and SurchargedOrders. AllOrders represent
a list containing every single order in the system, SurchargedOrders
represents a subset of it.
I would not call the returned object an Aggregate Root. Aggregates are for consistency, data exchange and life cycles. Your objects don't have any of these. It seems that they also cannot be classified as Value Objects ('characteristic or attribute'). They are just standalone classes.

Related

In DDD/CQRS, should ReadModel act as ViewModel, if not then where belongs responsibility for mapping?

Assume read model ProductCatalogueItem is built from aggregates/write-models, stored separately from write-models, and contains each product available for selling, and has following properties:
basics: product_code, name, price, number_of_available_stock,
documentation: short_description, description,...
product characteristics: weight, length, depth, width, color,...
And, there are two views:
product list containing list/table/grid of available product offers, and the view needs only following basic properties: product_code, name, price, number_of_available_stock,
product details showing all the properties - basics, documentation, product characteristics.
Naturally, there come two ViewModels in mind:
ProductCatalogueListItem containing only basic properties,
ProductCatalogueItemDetails containing all the properties.
Now, there are two options (that I can see).
ViewModels are 1:1 representation of ReadModels
Therefore there are two read models, not one: ProductCatalogueListItem and ProductCatalogueItemDetails. And the read service will have two methods:
List<ProductCatalogueListItem> searchProducts(FilteringOptions),
ProductCatalogueItemDetails getProductDetails(product_code).
And, controllers return these models directly (or, mapped to dto for transport layer).
The issue here is filtering: should the read service perform the search query on a different read model than the one returned from the method call? ProductCatalogueListItem doesn't have enough information to perform filtering.
ViewModels are another projection of ReadModels
The read service will have two methods:
List<ProductCatalogueItem> searchProducts(FilteringOptions),
ProductCatalogueItem getProduct(product_code).
And, the mapping from ReadModels to ViewModels is done by upper layer (probably controller).
There is no issue with filtering. But there is another issue: more data leaves the domain layer than is actually needed, and controllers would grow with more logic. As there might be different controllers for different transport technologies, the mapping code would probably get duplicated in those controllers.
Which approach to organizing responsibilities is correct according to DDD/CQRS, or is it something else completely?
The point is:
should I build two read models, and search using one, then return other?
should I build single read model, which is used, and then mapped to limited view to contain only base information for view?
First of all, you make a wrong assertion:
...read model ProductCatalogueItem is built from aggregates/write-models...
The read model doesn't know about aggregates or anything else in the write model; you build the read model directly from the database, returning the data needed by the UI.
So the view model is the read model, and it doesn't touch the write model. That's the reason CQRS exists: to have a different model, the read model, that optimizes the queries for returning the data needed by the client.
Update
I will try to explain myself better:
CQRS is simply splitting one object into two, based on the method types. There are two method types: command (any method that mutates state) and query (any method that returns a value). That's all.
When you apply this pattern to the service boundary of an application, you have a write service and a read service, so you can scale command handling and query handling differently, and you can also have two models.
But CQRS is not having two databases, is not messaging, is not eventual consistency, is not updating the read model from the write model, is not event sourcing. You can do CQRS without them. I say this because I've seen some misconceptions in your assertions.
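To make the "one object split into two by method type" idea concrete, here is a small Python sketch. The class and method names are invented for illustration. Both sides deliberately share a single in-memory dictionary, to show that the split itself needs no second database, messaging or event sourcing.

```python
from dataclasses import dataclass


@dataclass
class Product:
    code: str
    name: str
    stock: int


class ProductCommands:
    """Write side: every method mutates state and returns nothing."""

    def __init__(self, store: dict[str, Product]) -> None:
        self._store = store

    def add_product(self, code: str, name: str, stock: int) -> None:
        self._store[code] = Product(code, name, stock)

    def remove_stock(self, code: str, quantity: int) -> None:
        product = self._store[code]
        if quantity > product.stock:
            raise ValueError("not enough stock")
        product.stock -= quantity


class ProductQueries:
    """Read side: every method returns a value and never mutates state."""

    def __init__(self, store: dict[str, Product]) -> None:
        self._store = store

    def available_codes(self) -> list[str]:
        return sorted(c for c, p in self._store.items() if p.stock > 0)

    def stock_of(self, code: str) -> int:
        return self._store[code].stock


store: dict[str, Product] = {}  # one shared store, no second database
commands = ProductCommands(store)
queries = ProductQueries(store)

commands.add_product("P1", "Lamp", 3)
commands.add_product("P2", "Desk", 0)
commands.remove_stock("P1", 2)
```

Once the two classes exist, they can later be deployed, scaled or backed by different storage independently, but none of that is required for the pattern to apply.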
That said, the read model is designed according to what information the user wants to see in the UI, i.e., the read model is the view model; there is no mapping between them, they are the same model. You can read about it in references (3) and (6) below. I think this answers your whole question. What I don't understand is the filtering issue.
Some good references
(1) http://codebetter.com/gregyoung/2010/02/16/cqrs-task-based-uis-event-sourcing-agh/
(2) http://www.cqrs.nu/Faq/command-query-responsibility-segregation
(3) "Implementing Domain Driven Design" book, by Vaughn Vernon. Chapter 4: Architecture, "Command-Query Responsibility Segregation, or CQRS" section
(4) https://kalele.io/really-simple-cqrs/
(5) https://martinfowler.com/bliki/CQRS.html
(6) http://udidahan.com/2009/12/09/clarified-cqrs/
As you have already built your read model using data that arrived from one or more services, your problem now lies in another space (perhaps MVC) rather than in CQRS.
Now assume your read model is a db object, and ProductCatalogueListItem and ProductCatalogueItemDetails are two view models. When you have a request to serve a list of products, you query your read db from the read model (the ProductCatalog table). Maybe you apply additional filters using additional where clauses. Now, where do you put your mapping code after fetching the db objects? It's a personal choice. You don't have to do it in the upper layer at all. When I use Dapper, I fetch db objects using the view model as the generic type argument, so I can directly return the result from my service method, whose return type would be IEnumerable.
For a detail view I would use the same db object. I know CQRS suggests having different read models for different views. But ask yourself: do you really need another db object for the detail view? You only need an id to get all columns, whereas in the first case you needed some selected columns. So I would design your case as a mixture of your two methods above: have two service methods returning two different objects, but instead of a 1:1 mapping of read model to view model, have a single read db object and build two different view models from it.
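The "single read db object, two view models" design could look roughly like this in Python. It is a sketch under assumptions: the list of rows stands in for the read database, filtering by `color` is an invented stand-in for the question's FilteringOptions, and only a few of the question's properties are included.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class ProductCatalogueItem:
    """The single read-side record (one row of the read table)."""
    product_code: str
    name: str
    price: float
    number_of_available_stock: int
    description: str
    color: str


@dataclass(frozen=True)
class ProductCatalogueListItem:
    """Compact view model for the product list."""
    product_code: str
    name: str
    price: float
    number_of_available_stock: int


@dataclass(frozen=True)
class ProductCatalogueItemDetails:
    """Full view model for the product details page."""
    product_code: str
    name: str
    price: float
    number_of_available_stock: int
    description: str
    color: str


class ProductReadService:
    def __init__(self, rows: list[ProductCatalogueItem]) -> None:
        self._rows = rows  # stands in for the read database

    def search_products(self, color: str) -> list[ProductCatalogueListItem]:
        # Filter on the full record, then project to the compact view model.
        return [
            ProductCatalogueListItem(
                r.product_code, r.name, r.price, r.number_of_available_stock
            )
            for r in self._rows
            if r.color == color
        ]

    def get_product_details(self, product_code: str) -> ProductCatalogueItemDetails:
        row = next(r for r in self._rows if r.product_code == product_code)
        return ProductCatalogueItemDetails(**asdict(row))


service = ProductReadService([
    ProductCatalogueItem("P1", "Lamp", 19.9, 3, "A desk lamp", "red"),
    ProductCatalogueItem("P2", "Chair", 49.0, 0, "A chair", "blue"),
    ProductCatalogueItem("P3", "Vase", 9.5, 7, "A vase", "red"),
])
hits = service.search_products(color="red")
details = service.get_product_details("P2")
```

Because the projection happens inside the read service, the filtering can use columns that the compact view model does not carry, and the controllers receive exactly the shape they need.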

Repository Add and Create methods

Why are repositories' .Add method usually implemented as accepting the instance of entity to add, with the .Id already "set" (although it can be set again via reflection), which should be repo's responsibility?
Wouldn't it be better to implement it as .CreateAndAdd?
For example, given a Person entity:
public class Person
{
    public Person(uint id, string name)
    {
        this.Id = id;
        this.Name = name;
    }

    public uint Id { get; }
    public string Name { get; }
}
why are repositories usually implemented as:
public interface IRepository<T>
{
    Task<T> AddAsync(T entity);
}
and not as:
public interface IPersonsRepository
{
    Task<Person> CreateAndAddAsync(string name);
}
why are repositories usually implemented as...?
A few reasons.
Historically, domain-driven-design is heavily influenced by the Eric Evans book that introduced the term. There, Evans proposed that repositories provide collection semantics, providing "the illusion of an in memory collection".
Adding a String, or even a Name, to a collection of Person doesn't make very much sense.
More broadly, figuring out how to reconstitute an entity from a set of parameters is a separate responsibility from storage, so perhaps it doesn't make sense to go there (note: a repository often ends up with the responsibility of reconstituting an entity from some stored memento, so it isn't completely foreign, but there's usually an extra abstraction, the "Factory", that really does the work).
Using a generic repository interface often makes sense, as interacting with individual elements of the collection via retrieve/store operations shouldn't require a lot of custom crafting. Repositories can support custom queries for different kinds of entities, so it can be useful to call that out specifically:
public interface IPersonRepository : IRepository<Person> {
    // Person specific queries go here
}
Finally, the id... and the truth of it is that identity, as a concept, has a whole lot of "it depends" baked into it. In some cases, it may make sense for the repository to assign an id to an entity -- for instance, using a unique key generated by the database. Often, you'll instead want to have control of the identifier outside of the repository. Horses for courses.
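As a rough illustration of the collection semantics described in this answer, here is a Python sketch of a generic repository plus a Person-specific one. All names are hypothetical, the storage is a plain dictionary, and the sketch assumes entities expose an `id` attribute that the caller has already set.

```python
from dataclasses import dataclass
from typing import Generic, Optional, TypeVar

T = TypeVar("T")


class InMemoryRepository(Generic[T]):
    """Collection-like storage; assumes every entity has an `id` attribute."""

    def __init__(self) -> None:
        self._items: dict[int, T] = {}

    def add(self, entity: T) -> T:
        # The caller passes a fully constructed entity, as with a collection.
        self._items[entity.id] = entity
        return entity

    def get(self, entity_id: int) -> Optional[T]:
        return self._items.get(entity_id)


@dataclass(frozen=True)
class Person:
    id: int
    name: str


class PersonRepository(InMemoryRepository[Person]):
    """Person-specific queries go here."""

    def find_by_name(self, name: str) -> list[Person]:
        return [p for p in self._items.values() if p.name == name]


people = PersonRepository()
people.add(Person(1, "Ann"))
people.add(Person(2, "Bob"))
people.add(Person(3, "Ann"))
```

Here the identity is decided outside the repository; a variant in which the repository (or the database behind it) assigns the id is equally possible, which is the "it depends" mentioned above.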
There already is a great answer on the question, I just want to add some of my thoughts. (It will contain some duplication from the previous answer, so if this is a bad thing just let me know and I'll remove it :) ).
The responsibility of ID generation can belong to different parts of an organization or a system.
Sometimes the ID will be generated by some special rules, like a Social Security Number. This number can be used as the ID of a Person in a system, so before creating a Person entity this code will have to be generated by a specific SSNGenerator service.
We can use a randomly generated ID like a UUID. UUIDs can be generated outside of the Repository and assigned to the entity during creation, and the Repository will only store it (add, save) in the DB.
IDs generated by databases are very interesting. You can have sequential IDs like in an RDBMS, UUID-ish ones like in MongoDB, or some hash. In this case the responsibility of ID generation is assigned to the DB, so it can happen only after the Entity is stored, not when it's created. (I'm allowing myself some freedom here, as you can generate it before saving a transaction or read the last one etc., but I like to generalize here and avoid discussing cases with race conditions and collisions.) This means that your Entity doesn't have an identity before the save completes. Is this a good thing? Of course, it depends :)
This problem is a great example of leaky abstractions.
When you implement a solution, sometimes the technology used will affect it. You will have to deal with the fact that, for example, the ID is generated by your database, which is part of your infrastructure (if you have defined such a layer in your code). You can also avoid this by using a UUID even if you use an RDBMS, but then you have to join (again, technology-specific stuff :) ) on these IDs, so sometimes people like to use the default.
Instead of having Add or CreateAndAdd, you can have a Save method that does the same thing; it's just a different term that some people prefer. The repository is indeed often defined as an "in-memory collection", but that doesn't mean that we have to stick to it strictly (it can be a good thing to do that most of the time, but still...).
As mentioned, if your database generates IDs, the Repository seems like a good candidate to assign them (before or after storing), because it is the one talking to the DB.
If you are using events, the way you generate IDs can affect things. For example, let's say you want to have a UserRegisteredEvent with the UserID as a property. If you are using the DB to generate IDs, you will have to store the User first and then create and store/dispatch the event, or do something of the sort. On the other hand, if you generate the ID beforehand, you can save the event and the entity together (in a transaction or in the same document, it doesn't matter). Sometimes this can get tricky.
Background, experience with technologies and frameworks, and exposure to terminology in literature, school and work affect how we think about things and what terminology sounds better to us. Also, we (most of the time) work in teams, and this can affect how we name things and how we implement them.
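The two ID strategies and their effect on events can be compared in a small Python sketch. Everything here is illustrative: the table names, the `register_*` functions and the event table are assumptions, and an in-memory SQLite database plays the role of the RDBMS. The first function creates the ID before storing; the second has to wait for the database to assign it.

```python
import sqlite3
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class UserRegisteredEvent:
    user_id: str


def register_with_uuid(conn: sqlite3.Connection, name: str) -> UserRegisteredEvent:
    """The ID exists before storage, so the event can be built up front."""
    user_id = str(uuid.uuid4())
    event = UserRegisteredEvent(user_id)
    with conn:  # commits both inserts together, or rolls both back
        conn.execute(
            "INSERT INTO users_uuid (id, name) VALUES (?, ?)", (user_id, name)
        )
        conn.execute("INSERT INTO events (user_id) VALUES (?)", (event.user_id,))
    return event


def register_with_db_id(conn: sqlite3.Connection, name: str) -> UserRegisteredEvent:
    """The ID is only known after the INSERT, so the event is built afterwards."""
    with conn:
        cursor = conn.execute("INSERT INTO users_seq (name) VALUES (?)", (name,))
        user_id = str(cursor.lastrowid)  # assigned by the database
        conn.execute("INSERT INTO events (user_id) VALUES (?)", (user_id,))
    return UserRegisteredEvent(user_id)


conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE users_uuid (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE users_seq (id INTEGER PRIMARY KEY AUTOINCREMENT, name TEXT);
    CREATE TABLE events (user_id TEXT);
    """
)
first = register_with_uuid(conn, "Ann")
second = register_with_db_id(conn, "Bob")
```

With a relational database both variants can still share one transaction, as shown; the ordering constraint becomes more painful when the event has to be constructed or dispatched outside that transaction.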
Using Martin Fowler's definition:
A Repository mediates between the domain and data mapping layers,
acting like an in-memory domain object collection. Client objects
construct query specifications declaratively and submit them to
Repository for satisfaction. Objects can be added to and removed from
the Repository, as they can from a simple collection of objects, and
the mapping code encapsulated by the Repository will carry out the
appropriate operations behind the scenes. Conceptually, a Repository
encapsulates the set of objects persisted in a data store and the
operations performed over them, providing a more object-oriented view
of the persistence layer
A Repository gives an Object Oriented view of the underlying Data (which may be otherwise stored in a relational DB). It's responsible for mapping your Table to your Entity.
Generating an ID for an object is a whole different responsibility, which is not trivial and can get quite complex. You may decide to generate the ID in the DB or in a separate service. Regardless of where the ID is generated, a Repository should seamlessly map it between your Entity and Table.
ID generation is a responsibility of its own, and if you add it to the Repository, then you are moving away from Single Responsibility Principle.
As a side note, using a GUID as an ID is a terrible idea, because GUIDs are not sequential. They meet the uniqueness requirement of an ID, but they are not helpful for searching through the database index.

Multiple Data Transfer Objects for same domain model

How do you solve a situation when you have multiple representations of same object, depending on a view?
For example, let's say you have a book store. Within a book store, you have 2 main representations of Books:
In Lists (search results, browse by category, author, etc.): This is a compact representation that might have some aggregate values, for example NumberOfAuthors and NumberOfReviews. Each Author and Review is an entity itself, saved in the db.
DetailsView: here you wouldn't have aggregate values but the real values for each Author, as Book has a property AuthorsList.
Case 2 is clear: you get everything from the DB and show it. But how do you solve case 1 if you want to reduce the number of connections and the payload to/from the DB, i.e. if you don't want to get all the actual Authors and Reviews from the DB but just 2 ints with the count of each?
A fully normalized solution would be 2, but 1 seems to require either some denormalization or creating 2 different entities, BookDetails and BookCompact, within the Business Layer.
Important: I am not talking about View DTOs, but actually getting data from DB which doesn't fit into Business Layer Book class.
For me it sounds like multiple Query Models (QM).
I have used DDD in a CQRS/ES style, so aggregate roots produce events based on the commands being passed in. Multiple QMs are subscribed to those events, so I create multiple "views" based on requirements.
ES (event sourcing) has huge power: I can introduce other QMs later by replaying the stored events.
It sounds like managing a lot of similar, or even duplicate, data, but it makes sense to me.
QMs can be, and are, optimized to contain just enough data/structure/indexes for a given purpose. This is the way out of the "shared data model". I see a huge evil in the "one RDBMS model for everything" approach. You will always get lost in the complexity of managing a shared model, like you do now.
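A minimal Python sketch of this idea, using the book store from the question: two query models consume the same event stream, one keeping only counts for list views and one keeping the actual author names for the details view. The event classes, the `apply` method and the `replay` helper are invented for the example and are not taken from any particular event-sourcing library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AuthorAdded:
    book_id: str
    author: str


@dataclass(frozen=True)
class ReviewAdded:
    book_id: str
    rating: int


class BookListQueryModel:
    """Compact view: only the counts needed by list pages."""

    def __init__(self) -> None:
        self.rows: dict[str, dict[str, int]] = {}

    def apply(self, event) -> None:
        row = self.rows.setdefault(event.book_id, {"authors": 0, "reviews": 0})
        if isinstance(event, AuthorAdded):
            row["authors"] += 1
        elif isinstance(event, ReviewAdded):
            row["reviews"] += 1


class BookDetailsQueryModel:
    """Detailed view: the actual author names per book."""

    def __init__(self) -> None:
        self.authors: dict[str, list[str]] = {}

    def apply(self, event) -> None:
        if isinstance(event, AuthorAdded):
            self.authors.setdefault(event.book_id, []).append(event.author)


def replay(events, query_model):
    """Build (or rebuild) a query model from the stored events."""
    for event in events:
        query_model.apply(event)
    return query_model


event_log = [
    AuthorAdded("b1", "Evans"),
    AuthorAdded("b1", "Vernon"),
    ReviewAdded("b1", 5),
]
list_view = replay(event_log, BookListQueryModel())
details_view = replay(event_log, BookDetailsQueryModel())
```

A third query model introduced later would be populated the same way, by calling `replay` over the existing log, which is the "introduce other QMs later" benefit mentioned above.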
I had a very good result with the following design:
a domain package, which contains the @Entity classes holding all the necessary data stored in the database
a dto package, which contains the view/views of an entity that will be returned from a service
A DTO should have a constructor that takes the entity as a parameter. To copy the data more easily you can use BeanUtils.copyProperties(domainClass, dtoClass);
By doing this you share only the minimal amount of information, and it is returned in an object that does not have any functionality.

PHP How should Repositories handle adding/removing/saving/deleting entities?

I am having a bit of trouble implementing the Repository pattern, due to some confusion.
As far as I can tell now, a Repository should behave like an in-memory collection of objects, so if I do say:
$users = new UserRepository(new UserMapper);
$users->findAll();
The Users repository will load and return an array of User entities. Now I can either use them for just reading data, or can update the data on any particular entity, and invoke a save() method on the Repository that will utilize the Mapper to save the loaded entities back to the data source, with the updates that have been applied.
What I am wondering is if that is a correct understanding.
Should the add() method add an entity directly to the data source, or only to the collection within the Repository?
Likewise for remove(); should this method remove an entity from the data source, or only from the Repository.
The confusion stems from the fact that some implementations I have seen in tutorials have both add()/remove() methods, alongside save()/delete() methods. Is that the correct approach?
I've been developing using DDD techniques for around 6 months now and always use the save and delete methods: save should persist the data to your persistence layer, and delete should remove it from your persistence layer.
That said, there is no reason why it shouldn't also add to your collection.
P.S. Check out the dddinphp Google Group; there's an active community purely for these questions.

PHP Repository Pattern Implementation Questions

What should Repositories return from service calls?
An Entity (or Collection of Entities), or instead a reference to itself, which could then be used to access a property that holds a collection of Entities, for example?
Take this sample code:
$user = $userRepository->findById(1);
or
$users = $userRepository->findAll();
I think in most code, a User Entity object, or Users Collection Entity would be returned from a call like this.
It seems a bit strange to me that from one direction, a Repository will return objects directly, yet from the other end, it will hold them in state before acting upon them. Take this sample code as an example:
$user = $factory->make('user');
$user->setName($array_data['name']);
$repo->add($user);
$repo->save();
Is this just how it's done?
I think I am expecting to see something a bit more like this, in terms of retrieval:
$users = $userRepository->findAll(); // Returns $userRepository reference
foreach($users->collection() as $user) {
// Do some operations, or whatever
}
$users->save();
or perhaps, for read only needs:
$users = $userRepository->findAll();
$users = $users->collection(); // Returns User Entities held in state
Clarification as to why it's done one way or another would be much appreciated.
Where does the Factory belong inside the Domain?
Should it be injected as a dependency of the Mapper object? It seems like there must also be Factory access from the controlling code/service layer as well, for creating Entities to submit to the Repository.
That leads into my next question...
What is the preferred way to create new Entities from the controlling class/service layer?
I have seen Factory objects being used, like this:
$user = $factory->make('user');
$user->setName($array_data['name']);
$repo->add($user);
As well as built-in Repository methods, like this:
$repo->saveFromArray($array_data);
In the second example, the $array_data would be forwarded through the repository, to the Mapper, which will then perform the save. Of course the data source would be checked for overlapping records beforehand, in either example.
I assume the first method is preferred? It seems to be a more object-oriented approach.
You have many questions...
What should Repositories return from service calls?
Always aggregate roots (AR). AR design is very important, but it's none of the repository's concern. The repository methods return one or many objects as needed by the Domain. There is no Users Collection Entity, there's a list of Users (which in PHP is probably an array); don't complicate things.
The Domain repositories should be used only for Domain needs (read or write). The whole object is returned, the repository doesn't return pieces of an AR, but the whole AR. Once again I mention that AR design is very important.
Where does the Factory belong inside the Domain?
Where it's needed. I don't use a factory, at most I have a factory method, but even that is for restoring purposes (if I'm using a memento). You don't have to use a factory to create the domain objects.
What is the preferred way to create new Entities from the controlling class/service layer?
The simplest way possible. For probably 99% of cases, you'll be using the "new" operator. Use Factories only if it gives you a concrete benefit for specific entities.
The Mapper never performs saves, because it's a mapper. Only repositories do persistence work. Mappers 'convert'/copy data from one model to another. You can use mappers to map domain objects to some data model to be persisted, and back.
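The division of labour between mapper and repository described here might be sketched as follows. The sketch is in Python rather than PHP, the class and method names are illustrative, and a plain dictionary stands in for the data source. The mapper only converts between shapes; the repository is the only object that touches storage, and it returns whole entities.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    id: int
    name: str


class UserMapper:
    """Only converts between the domain object and a storage row."""

    def to_row(self, user: User) -> dict:
        return {"id": user.id, "name": user.name}

    def to_entity(self, row: dict) -> User:
        return User(row["id"], row["name"])


class UserRepository:
    """Does the persistence work and delegates conversion to the mapper."""

    def __init__(self, mapper: UserMapper, storage: dict) -> None:
        self._mapper = mapper
        self._storage = storage  # dict standing in for the data source

    def save(self, user: User) -> None:
        self._storage[user.id] = self._mapper.to_row(user)

    def delete(self, user: User) -> None:
        self._storage.pop(user.id, None)

    def find_by_id(self, user_id: int) -> Optional[User]:
        row = self._storage.get(user_id)
        return self._mapper.to_entity(row) if row is not None else None

    def find_all(self) -> list[User]:
        return [self._mapper.to_entity(row) for row in self._storage.values()]


storage: dict = {}
users = UserRepository(UserMapper(), storage)
users.save(User(1, "Ann"))   # created with the plain constructor, no factory
users.save(User(2, "Bob"))
users.delete(User(2, "Bob"))
```

Note that the entities are created with the ordinary constructor, in line with the advice above to reach for a factory only when it brings a concrete benefit.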

Resources