PHP How should Repositories handle adding/removing/saving/deleting entities? - domain-driven-design

I am having a bit of trouble implementing the Repository pattern, due to some confusion.
As far as I can tell now, a Repository should behave like an in-memory collection of objects, so if I do say:
$users = new UserRepository(new UserMapper);
$users->findAll();
The Users repository will load and return an array of User entities. Now I can either use them for just reading data, or can update the data on any particular entity, and invoke a save() method on the Repository that will utilize the Mapper to save the loaded entities back to the data source, with the updates that have been applied.
What I am wondering is if that is a correct understanding.
Should the add() method add an entity directly to the data source, or only to the collection within the Repository?
Likewise for remove(): should this method remove an entity from the data source, or only from the Repository?
The confusion stems from the fact that some implementations I have seen in tutorials have both add()/remove() methods, alongside save()/delete() methods. Is that the correct approach?

I've been developing with DDD techniques for around 6 months now and always use save and delete methods: save should persist the data to your persistence layer, and delete should remove it from your persistence layer.
That said, there is no reason why it shouldn't also add to your collection.
P.S. Check out the dddinphp Google Group; there's an active community purely for these questions.
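A minimal sketch of that behaviour, written in Python for brevity (the shape is the same in PHP). The names InMemoryStorage, upsert and find_by_id are assumptions made for illustration, not part of any library: save() writes to the persistence layer straight away and also keeps the entity in the repository's in-memory collection, and delete() removes it from both.

```python
from dataclasses import dataclass


@dataclass
class User:
    id: int
    name: str


class InMemoryStorage:
    """Hypothetical stand-in for the persistence layer (e.g. a database table)."""

    def __init__(self):
        self.rows = {}

    def upsert(self, row_id, row):
        self.rows[row_id] = dict(row)

    def delete(self, row_id):
        self.rows.pop(row_id, None)


class UserRepository:
    def __init__(self, storage):
        self._storage = storage
        self._loaded = {}  # in-memory collection of entities seen so far

    def save(self, user):
        # Persist immediately and keep the entity in the collection.
        self._storage.upsert(user.id, {"id": user.id, "name": user.name})
        self._loaded[user.id] = user

    def delete(self, user):
        # Remove from the persistence layer and from the collection.
        self._storage.delete(user.id)
        self._loaded.pop(user.id, None)

    def find_by_id(self, user_id):
        # Load from storage only if the entity is not in the collection yet.
        if user_id not in self._loaded:
            row = self._storage.rows.get(user_id)
            if row is None:
                return None
            self._loaded[user_id] = User(**row)
        return self._loaded[user_id]
```

Note that changing a loaded entity does not touch storage until save() is called again; whether that is the behaviour you want is a design choice.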

Related

How to design to save a field of an aggregate root in DDD

I learned DDD recently. We used to encapsulate creation, update, and deletion in the repository to persist changes to the DB.
With ORM tools, we can ignore the details of persistence: usually the argument to the repository is an aggregate root object, and the ORM handles the conversion for persistence (for example, it will update a single field if there is just one change).
But without an ORM, if just one field of the aggregate root object has changed and needs to be saved to the DB, how should the repository be designed? Should it provide a method to save just this field? There is a method called update that saves all properties, but using it for a single changed field causes a performance issue.
To persist changes only you need to know what changed, obviously. There are two common ways to achieve this:
Track changes as they occur. This strategy is easier to implement when the entity explicitly participates in the change tracking mechanism. For instance, with Event Sourcing the Aggregate Root would record uncommitted change event(s) in a collection for all commands it processed.
Dirty checking: compare the new state to the old state. Note that the old state may be cached for performance optimizations.
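The dirty-checking option can be sketched roughly as follows, in Python for brevity. Product, track, changed_fields and the executed list are hypothetical names for illustration; the list only records which fields would be written and stands in for real UPDATE statements.

```python
import copy


class Product:
    def __init__(self, product_id, name, price):
        self.id = product_id
        self.name = name
        self.price = price


class ProductRepository:
    """Keeps a snapshot of each loaded aggregate and diffs against it on save."""

    def __init__(self):
        self._snapshots = {}  # id -> dict of field values at load time
        self.executed = []    # records (id, changed fields); stands in for SQL

    def track(self, product):
        # Called after loading: remember the old state.
        self._snapshots[product.id] = copy.deepcopy(vars(product))

    def changed_fields(self, product):
        # Compare the new state to the cached old state.
        old = self._snapshots.get(product.id, {})
        return {k: v for k, v in vars(product).items() if old.get(k) != v}

    def save(self, product):
        changes = self.changed_fields(product)
        if changes:
            self.executed.append((product.id, changes))
            self.track(product)  # the new state becomes the baseline
        return changes
```

An aggregate that was never tracked has no snapshot, so every field counts as changed; a real implementation would probably treat that case as an insert.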
Generally you need another Repository. How it is implemented is up to you.
You can write the code so it is able to save/update just single fields when they change.
If you want to update single fields as they change, one way to do this is to use an Observer to "observe mutations" in your objects. This approach can have two "operation modes":
Ad-hoc: When a field gets updated, persist just this field's value right away.
Aggregate update: Gather the information about all updated fields (just the fact that they were updated, not the data). Then update them all at once when the time comes.
This approach can have other performance implications in a large system. You have to see if it suits you or not.
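A rough sketch of the "aggregate update" mode, in Python for brevity. Here the entity observes its own attribute assignments instead of using a separate Observer object, which is a simplification; TrackedEntity, take_dirty_fields and flush are hypothetical names, and write_field stands in for whatever persists a single column.

```python
class TrackedEntity:
    """Records which attributes were changed after construction."""

    def __init__(self, **fields):
        # Bypass __setattr__ so that initial values do not count as changes.
        object.__setattr__(self, "_dirty", set())
        for name, value in fields.items():
            object.__setattr__(self, name, value)

    def __setattr__(self, name, value):
        # Observe the mutation: remember the field name, not the data.
        if getattr(self, name, None) != value:
            self._dirty.add(name)
        object.__setattr__(self, name, value)

    def take_dirty_fields(self):
        # Return the changed field names and reset the tracking state.
        dirty = set(self._dirty)
        self._dirty.clear()
        return dirty


def flush(entity, write_field):
    """Aggregate update: persist all changed fields in one pass."""
    for name in sorted(entity.take_dirty_fields()):
        write_field(name, getattr(entity, name))
```

The ad-hoc mode would call write_field directly inside __setattr__ instead of collecting names, at the cost of one write per assignment.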
Another option would be to have your ORM recognize the changed fields at the time of the update via a comparison. This again has its own performance implications since you will have to fetch the DB object (aggregate) once more and compare it against the runtime changes.
How you actually implement any of these heavily depends on the language you're using and its utilities. Performance issues also heavily depend on the language/runtime platform/3rd party software and lots of other things.

DDD: Referencing non aggregate roots

I'm trying to improve my design using some DDD concepts. Currently I have 4 simple EF entities as shown in the following image:
There are multiple TaskTemplates, each of them storing multiple TaskItemTemplates. The TaskItemTemplates contain various information (description, images, default processing times).
Users can create new concrete Tasks based on a TaskTemplate. In the current implementation, this will also create a TaskItem for every TaskItemTemplate, but in the future it might be possible to select only some relevant TaskItemTemplates.
I wonder how to model this requirement in DDD. The reference from TaskItem to TaskTemplateItem is not allowed, because TaskTemplateItem is not an aggregate root. But without this reference it is not possible to get the properties of the TaskTemplateItem.
Of course I could just drop the reference and copy all properties from TaskTemplateItem to TaskItem, but actually I like the possibility to update TaskItems by updating the TaskTemplateItems.
Update: Expected behaviour on Task(Item)Template updates
It should be possible to edit TaskTemplate and TaskItemTemplate and e.g. fix typos in Name or Description. I expect these changes to be reflected in the Task/TaskItem.
On the other hand, if the DefaultProcessingTime is modified, this should not change the persisted DueDate of a TaskItem.
In my current implementation it is not possible to add/remove TaskItemTemplates to a persisted TaskTemplate, but this would be a nice improvement. How would I implement something like this? Add another entity TaskTemplateVersion between TaskTemplate and TaskItemTemplate?
Update2: TaskItemTemplateId as ValueObject
After reading Vaughn's slides again, I think with a simple modification, my model is correct according to DDD:
Unfortunately I do not really understand why this design is better (is it better?). Okay, there won't be unnecessary DB queries for TaskItemTemplates. But on the other hand, I almost always need a TaskItemTemplate when working with a TaskItem, and therefore everything gets more complicated. I can no longer do something like
public string Description
{
    get { return this.taskItemTemplate.Description; }
}
Based on the properties that you list beneath TaskItem and TaskItemTemplate I'd say that they should be value objects instead of entities. So if there isn't a reason (based on the information in your question there isn't) to make them entities, change them to immutable value objects.
With that solution, you just create a TaskItem from a TaskItemTemplate by copying its data.
Regarding the update scenario that you describe, I see the following solution:
TaskItems are created from a specific version of the TaskItemTemplate. Record that version with a TaskItem.
The TaskTemplate is responsible for updating its items and keeping track of their versions.
If a template changes, notify all Tasks that are derived from the template if immediate action is required. If you just want to be able to "pull in" the template changes at a later time (instead of acting when the template changes), you just compare the versions.
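The version comparison above could look roughly like this, sketched in Python for brevity with immutable value objects. The field names and the revise, is_outdated and pull_changes helpers are assumptions for illustration; in the model described above, the TaskTemplate would own the version bump, which is simplified here to a free function.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TaskItemTemplate:
    template_item_id: int
    version: int
    description: str


@dataclass(frozen=True)
class TaskItem:
    template_item_id: int
    template_version: int  # version of the template the item was created from
    description: str       # copied from the template, not referenced

    @staticmethod
    def from_template(template):
        return TaskItem(template.template_item_id, template.version,
                        template.description)

    def is_outdated(self, template):
        return template.version > self.template_version

    def pull_changes(self, template):
        # Value objects are immutable, so "updating" yields a new value.
        if self.is_outdated(template):
            return TaskItem.from_template(template)
        return self


def revise(template, description):
    """Template side: every change produces a new version."""
    return replace(template, description=description,
                   version=template.version + 1)
```

Fields that must not follow the template, such as a persisted DueDate, would simply be left out of what pull_changes copies.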
To make informed decisions, it is very important that you fully understand the pros and cons of immutability. Only then will you see a benefit in modelling things as value objects. One source on the topic that I find very valuable is Eric Lippert's series on immutability.
Also, the book Implementing DDD by Vaughn Vernon explains the concepts of value objects and entities very well.

PHP Repository Pattern Implementation Questions

What should Repositories return from service calls?
An Entity (or Collection of Entities), or instead a reference to itself, which could then be used to access a property that holds a collection of Entities, for example?
Take this sample code:
$user = $userRepository->findById(1);
or
$users = $userRepository->findAll();
I think in most code, a User Entity object, or Users Collection Entity would be returned from a call like this.
It seems a bit strange to me that from one direction, a Repository will return objects directly, yet from the other end, it will hold them in state before acting upon them. Take this sample code as an example:
$user = $factory->make('user');
$user->setName($array_data['name']);
$repo->add($user);
$repo->save();
Is this just how it's done?
I think I am expecting to see something a bit more like this, in terms of retrieval:
$users = $userRepository->findAll(); // Returns $userRepository reference
foreach ($users->collection() as $user) {
    // Do some operations, or whatever
}
$users->save();
or perhaps, for read only needs:
$users = $userRepository->findAll();
$users = $users->collection(); // Returns User Entities held in state
Clarification as to why it's done one way or another would be much appreciated.
Where does the Factory belong inside the Domain?
Should it be injected as a dependency of the Mapper object? It seems like there must also be Factory access from the controlling code/service layer as well, for creating Entities to submit to the Repository.
That leads into my next question...
What is the preferred way to create new Entities from the controlling class/service layer?
I have seen Factory objects being used, like this:
$user = $factory->make('user');
$user->setName($array_data['name']);
$repo->add($user);
As well as built-in Repository methods, like this:
$repo->saveFromArray($array_data);
In the second example, the $array_data would be forwarded through the repository, to the Mapper, which will then perform the save. Of course the data source would be checked for overlapping records beforehand, in either example.
I assume the first method is preferred? It seems to be a more object-oriented approach.
You have many questions...
What should Repositories return from service calls?
Always aggregate roots (AR). AR design is very important, but it's none of the repository's concern. The repository methods return one or many objects as needed by the Domain. There is no Users Collection Entity, there's a list of Users (which in PHP is probably an array); don't complicate things.
The Domain repositories should be used only for Domain needs (read or write). The whole object is returned, the repository doesn't return pieces of an AR, but the whole AR. Once again I mention that AR design is very important.
Where does the Factory belong inside the Domain?
Where it's needed. I don't use a factory, at most I have a factory method, but even that is for restoring purposes (if I'm using a memento). You don't have to use a factory to create the domain objects.
What is the preferred way to create new Entities from the controlling class/service layer?
The simplest way possible. For probably 99% of cases, you'll be using the "new" operator. Use Factories only if it gives you a concrete benefit for specific entities.
The Mapper never performs saves, because it's a mapper. Only repositories do persistence work. Mappers 'convert'/copy data from one model to another. You can use mappers to map domain objects to some data model to be persisted, and back.
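To illustrate that split, here is a minimal sketch in Python (the shape carries over to PHP). All names are assumptions for illustration, and a dict stands in for the database: the mapper only converts between the domain object and a data row, the repository does the persistence work and returns whole User objects, and new users are created with the plain constructor rather than a factory.

```python
class User:
    def __init__(self, user_id, name):
        self.id = user_id
        self.name = name


class UserMapper:
    """Only converts between the domain object and a data row."""

    def to_row(self, user):
        return {"id": user.id, "name": user.name}

    def to_entity(self, row):
        return User(row["id"], row["name"])


class UserRepository:
    """Does the persistence work; a dict stands in for the database."""

    def __init__(self, mapper):
        self._mapper = mapper
        self._table = {}

    def save(self, user):
        self._table[user.id] = self._mapper.to_row(user)

    def find_by_id(self, user_id):
        row = self._table.get(user_id)
        return self._mapper.to_entity(row) if row else None

    def find_all(self):
        # A plain list of whole User objects, not a special collection type.
        return [self._mapper.to_entity(r) for r in self._table.values()]
```

Usage would then be as simple as repo.save(User(1, "Ann")) followed by repo.find_by_id(1).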

Core data mapping model with new (non-optional) relationship

My original data model has an entity "Game". I have now updated the model to include an entity, "Match", which can refer to multiple games. I wish to add a Match to all of my old Games, and ideally this would be a non-optional relationship.
Currently I am setting Match to be optional, and simply adding a Match to every old Game in application:didFinishLaunching after the model has been updated. This works, but I'm wondering if this is really the best way to do it.
I have tried to follow the tutorial here, but I am getting stuck on the part with "StepOneEntityMigrationPolicy.m". I have created an NSEntityMigrationPolicy subclass and set it in the mapping model. I've tried overriding both createDestinationInstancesForSourceInstance and createRelationshipsForDestinationInstance:, but neither gets called.
Is this perhaps because my Source and Destination are both the same (GameToGame)? Also, is there any benefit to doing this via the mapping model rather than as I am doing it now?
I think the simplest and most pragmatic way is what you are doing now, i.e. inserting the necessary new entities "manually" after an update. This is a common way to populate orphaned entities after a model version upgrade and perfectly fine.

How do read-only database views fit into the repository pattern?

Example: Your database has a SQL view named "CustomerOrdersOnHold". This view returns a filtered mix of specific customer and order data fields. You need to fetch data from this view in your application. How does access to such a view fit into the repository pattern? Would you create a "CustomerOrdersOnHoldRepository"? Is a read-only view such as this considered an aggregate root?
I would prefer separating the read repository, preferably even changing its name to Finder or Reader. The repository is meant for Domain usage, not for querying read-only data. You can refer to this article and this, which explain the usage of a Finder separated from the repository.
I would also recommend the CQRS architecture, which separates the read model from the write model.
This architecture allows you to separate the read model from the write model even in terms of data storage, and to use event sourcing.
As a middle ground, you can apply some CQRS concepts without the complexity of separate databases by just separating repositories from finders; read this post.
For a sample of this type of solution (using the same database but separating finders from repositories), check this sample.
Your read-only data would be considered Value Objects in the DDD world.
I typically place access methods for value objects in existing repositories until such time that it makes sense to create a separate repository. It's similar to a method that might return a static list of states to be used on an address form:
public interface IAddressRepository
{
    Address GetAddress(string addressID);
    List<string> GetStates(string country);
}
I think that it is fine to have a separate repository like "CustomerOrdersOnHoldRepository". The interface of the repository will reflect the fact that the objects are read-only (by not defining Save/Add/MakePersistent methods).
From How to write a repository:
... But there is another strategy that I quite like: multiple
Repositories. In our ordering example there is no reason we can have
two Repositories: AllOrders and SurchargedOrders. AllOrders represent
a list containing every single order in the system, SurchargedOrders
represents a subset of it.
I would not call the returned object an Aggregate Root. Aggregates are for consistency, data exchange and life cycles. Your objects don't have any of these. It seems that they also cannot be classified as Value Objects ('characteristic or attribute'). They are just standalone classes.
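A separate read-only repository or finder, as suggested in the answers above, might be sketched like this in Python using the standard sqlite3 module. The class names and the view's column names (customer_name, order_id) are assumptions, since the question does not list the fields; the point is only that the class exposes queries and no save/add/delete methods, and that it returns plain immutable objects.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class CustomerOrderOnHold:
    """Plain read-only row object; neither an entity nor an aggregate root."""
    customer_name: str
    order_id: int


class CustomerOrdersOnHoldFinder:
    """Read-only access to the view: queries only, no save/add/delete."""

    def __init__(self, connection: sqlite3.Connection):
        self._conn = connection

    def find_all(self):
        # Column names are assumed for this sketch.
        rows = self._conn.execute(
            "SELECT customer_name, order_id FROM CustomerOrdersOnHold "
            "ORDER BY order_id"
        )
        return [CustomerOrderOnHold(name, order_id) for name, order_id in rows]
```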
