Where should a factory for creating an Aggregate be placed? - domain-driven-design

We can place a factory on an Aggregate root, on an object closely involved in spawning another object or we can implement it as a Service ( which usually creates an entire Aggregate ).
a) In most situations, where should a factory for creating an Aggregate root be placed ( assuming it makes sense to create just the root and not the entire Aggregate )? On the root itself?
b) Similarly, where should in most situations a factory for creating a complete Aggregate be placed?
Thank you

a) In most situations, where should a factory for creating an Aggregate root be placed ( assuming it makes sense to create just the root and not the entire Aggregate )? On the root itself?
When it makes sense to create just the Aggregate root object itself, offer a factory method. To my ears, most of the time this is merely a static convenience method for allocating memory and calling an initialization routine ([Foo fooWithBar:...] instead of [[Foo alloc] initWithBar:...] in Objective-C) or using a constructor with parameters (new Foo(...) in Ruby, for example).
b) Similarly, where should in most situations a factory for creating a complete Aggregate be placed?
Is testing an issue?
I found verifying static (factory) method behavior a pain to test. Essentially, you'd have to treat your code like legacy code and refactor until it's testable. If the tests dictate the code, though, you wouldn't run into problems like these. Instead of using the Factory Method Pattern, you'd use an Abstract Factory instead and rely on its instance methods.
So, if you put the factory inside a designated factory object: to create complex aggregate objects, I tend to use multiple factory objects. There the one Aggregate Root Factory everyone else will talk to. It utilizes factories for the components the aggregate is made of. This way, you don't shovel complex object creations into one single factory which is prone to change often as your code base evolves. Instead, the Aggregate Root Factory depends on other factories to do the initialization of the parts.
The factories should be placed in the domain layer, as commenters to your question already pointed out.
Edit: Especially when you reconstitute an aggregate from a repository will using multiple factories be useful. If you put in a deeply nested JSON, for example, you can only test for the object returned by the factory. If you delegate creating sub-objects to other factories, though, then you'll be able to (1) mock them in tests, and (2) verify that the Aggregate Root Factory (i) called its collaborators with (ii) the data split up already.

Related

What is the purpose of child entity in Aggregate root?

[ Follow up from this question & comments: Should entity have methods and if so how to prevent them from being called outside aggregate ]
As the title says: i am not clear about what is the actual/precise purpose of entity as a child in aggregate?
According to what i've read on many places, these are the properties of entity that is a child of aggregate:
It has identity local to aggregate
It cannot be accessed directly but through aggregate root only
It should have methods
It should not be exposed from aggregate
In my mind, that translates to several problems:
Entity should be private to aggregate
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
So, why do we have an entity at all instead of Value Objects only? It seams much more convenient to have only value objects, all methods on aggregate and expose value objects (which we already do copying entity infos).
PS.
I would like to focus to child entity on aggregate, not collections of entities.
[UPDATE in response to Constantin Galbenu answer & comments]
So, effectively, you would have something like this?
public class Aggregate {
...
private _someNestedEntity;
public SomeNestedEntityImmutableState EntityState {
get {
return this._someNestedEntity.getState();
}
}
public ChangeSomethingOnNestedEntity(params) {
this._someNestedEntity.someCommandMethod(params);
}
}
You are thinking about data. Stop that. :) Entities and value objects are not data. They are objects that you can use to model your problem domain. Entities and Value Objects are just a classification of things that naturally arise if you just model a problem.
Entity should be private to aggregate
Yes. Furthermore all state in an object should be private and inaccessible from the outside.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
No. We don't expose information that is already available. If the information is already available, that means somebody is already responsible for it. So contact that object to do things for you, you don't need the data! This is essentially what the Law of Demeter tells us.
"Repositories" as often implemented do need access to the data, you're right. They are a bad pattern. They are often coupled with ORM, which is even worse in this context, because you lose all control over your data.
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
The trick is, you don't have to. Every object (class) you create is there for a reason. As described previously to create an additional abstraction, model a part of the domain. If you do that, an "aggregate" object, that exist on a higher level of abstraction will never want to offer the same methods as objects below. That would mean that there is no abstraction whatsoever.
This use-case only arises when creating data-oriented objects that do little else than holding data. Obviously you would wonder how you could do anything with these if you can't get the data out. It is however a good indicator that your design is not yet complete.
Entity should be private to aggregate
Yes. And I do not think it is a problem. Continue reading to understand why.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to
save to db, for example)
No. Make your aggregates return the data that needs to be persisted and/or need to be raised in a event on every method of the aggregate.
Raw example. Real world would need more finegrained response and maybe performMove function need to use the output of game.performMove to build propper structures for persistence and eventPublisher:
public void performMove(String gameId, String playerId, Move move) {
Game game = this.gameRepository.load(gameId); //Game is the AR
List<event> events = game.performMove(playerId, move); //Do something
persistence.apply(events) //events contains ID's of entities so the persistence is able to apply the event and save changes usign the ID's and changed data wich comes in the event too.
this.eventPublisher.publish(events); //notify that something happens to the rest of the system
}
Do the same with inner entities. Let the entity return the data that changed because its method call, including its ID, capture this data in the AR and build propper output for persistence and eventPublisher. This way you do not need even to expose public readonly property with entity ID to the AR and the AR neither about its internal data to the application service. This is the way to get rid of Getter/Setters bag objects.
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity
are duplicated on entity)
Sometimes the business rules, to check and apply, belongs exclusively to one entity and its internal state and AR just act as gateway. It is Ok but if you find this patter too much then it is a sign about wrong AR design. Maybe the inner entity should be the AR instead a inner entity, maybe you need to split the AR into serveral AR's (inand one the them is the old ner entity), etc... Do not be affraid about having classes that just have one or two methods.
In response of dee zg comments:
What does persistance.apply(events) precisely do? does it save whole
aggregate or entities only?
Neither. Aggregates and entities are domain concepts, not persistence concepts; you can have document store, column store, relational, etc that does not need to match 1 to 1 your domain concepts. You do not read Aggregates and entities from persitence; you build aggregates and entities in memory with data readed from persistence. The aggregate itself does not need to be persisted, this is just a possible implementation detail. Remember that the aggregate is just a construct to organize business rules, it's not a meant to be a representation of state.
Your events have context (user intents) and the data that have been changed (along with the ID's needed to identify things in persistence) so it is incredible easy to write an apply function in the persistence layer that knows, i.e. what sql instruction in case of relational DB, what to execute in order to apply the event and persist the changes.
Could you please provide example when&why its better (or even
inevitable?) to use child entity instead of separate AR referenced by
its Id as value object?
Why do you design and model a class with state and behaviour?
To abstract, encapsulate, reuse, etc. Basic SOLID design. If the entity has everything needed to ensure domain rules and invariants for a operation then the entity is the AR for that operation. If you need extra domain rules checkings that can not be done by the entity (i.e. the entity does not have enough inner state to accomplish the check or does not naturaly fit into the entity and what represents) then you have to redesign; some times could be to model an aggregate that does the extra domain rules checkings and delegate the other domain rules checking to the inner entity, some times could be change the entity to include the new things. It is too domain context dependant so I can not say that there is a fixed redesign strategy.
Keep in mind that you do not model aggregates and entities in your code. You model just classes with behaviour to check domain rules and the state needed to do that checkings and response whith the changes. These classes can act as aggregates or entities for different operations. These terms are used just to help to comunicate and understand the role of the class on each operation context. Of course, you can be in the situation that the operation does not fit into a entity and you could model an aggregate with a V.O. persistence ID and it is OK (sadly, in DDD, without knowing domain context almost everything is OK by default).
Do you wanna some more enlightment from someone that explains things much better than me? (not being native english speaker is a handicap for theese complex issues) Take a look here:
https://blog.sapiensworks.com/post/2016/07/14/DDD-Aggregate-Decoded-1
http://blog.sapiensworks.com/post/2016/07/14/DDD-Aggregate-Decoded-2
http://blog.sapiensworks.com/post/2016/07/14/DDD-Aggregate-Decoded-3
It has identity local to aggregate
In a logical sense, probably, but concretely implementing this with the persistence means we have is often unnecessarily complex.
We need a read only copy Value-Object to expose information from an
entity (at least for a repository to be able to read it in order to
save to db, for example)
Not necessarily, you could have read-only entities for instance.
The repository part of the problem was already addressed in another question. Reads aren't an issue, and there are multiple techniques to prevent write access from the outside world but still allow the persistence layer to populate an entity directly or indirectly.
So, why do we have an entity at all instead of Value Objects only?
You might be somewhat hastily putting concerns in the same basket which really are slightly different
Encapsulation of operations
Aggregate level invariant enforcement
Read access
Write access
Entity or VO data integrity
Just because Value Objects are best made immutable and don't enforce aggregate-level invariants (they do enforce their own data integrity though) doesn't mean Entities can't have a fine-tuned combination of some of the same characteristics.
These questions that you have do not exist in a CQRS architecture, where the Write model (the Aggregate) is different from a Read model. In a flat architecture, the Aggregate must expose read/query methods, otherwise it would be pointless.
Entity should be private to aggregate
Yes, in this way you are clearly expressing the fact that they are not for external use.
We need a read only copy Value-Object to expose information from an entity (at least for a repository to be able to read it in order to save to db, for example)
The Repositories are a special case and should not be see in the same way as Application/Presentation code. They could be part of the same package/module, in other words they should be able to access the nested entities.
The entities can be viewed/implemented as object with an immutable ID and a Value object representing its state, something like this (in pseudocode):
class SomeNestedEntity
{
private readonly ID;
private SomeNestedEntityImmutableState state;
public getState(){ return state; }
public someCommandMethod(){ state = state.mutateSomehow(); }
}
So you see? You could safely return the state of the nested entity, as it is immutable. There would be some problem with the Law of Demeter but this is a decision that you would have to make; if you break it by returning the state you make the code simpler to write for the first time but the coupling increases.
Methods that we have on entity are duplicated on Aggregate (or, vice versa, methods we have to have on Aggregate that handle entity are duplicated on entity)
Yes, this protect the Aggregate's encapsulation and also permits the Aggregate to protect it's invariants.
I won't write too much. Just an example. A car and a gear. The car is the aggregate root. The gear is a child entity

DDD : one aggregate root , multiple persistent datasources

In the Guide/eBook: .NET Microservices: Architecture for Containerized .NET Applications (related to the eShopOnContainers) in the chapter "Designing the infrastructure persistence layer" (page 213) is explained in general how an aggregate root can perform CUD operations against a persistent data source.
Two important starting points are mentioned :
An aggregate is ignorant of methods of persistency and infrastructure following the Persistence Ignorance and the Infrastructure Ignorance principles (page 218). An aggregate is determined by the business and not by the infrastructure.
One should only define one repository per aggregate root to maintain transactional consistency between the objects within the aggregate (page 213)
Unfortunately, in all further examples that are mentioned the aggregate root and all underlying objects that fall under it are within one and the same persistent data source.
The pattern then is as follows:
A repository is created containing that aggregate
In this repository a Unit of Work is injected during creation. This Unit of Work contains methods such as SaveChangesAsync, SaveEntitiesAsync, Update
and so on.
In a command, the Unit of Work manages the transactions to
this one data source such as a database or similar.
I want to expand this pattern that the aggregate can write its data over 2 or more physical data sources depending on the underlying object type.
Starting from starting point 1, it is perfectly justified to have a root aggregate and its underlying object to be updated to different data sources depending on the type of underlying object. Examples mentioned are : a Database and an XML file, a database and a NOSQL 'database',a database and a service, a database and an IoT device. Because an aggregate must be ignorant to methods of persistence and infrastructure, to my opinion there is no need to argue about the design of the aggregate. I think nowhere in the book it is written that a aggregate root should persist within one data source.
At the same time, starting point 2 also seems perfectly justified. Because the complete set of objects within the aggregate root is edited, and the successful persistence of the entire package is coordinated from one repository and (preferably) from one Unit of Work.
The question is:
How deals Domain Driven Design if within the aggregate - depending on the type of the underlying object - it is hydrated over different data sources?
Should I use one custom Unit of Work and make the decision where to write to within this UoW ?
I'm aware of the next question , but having studied the code I think it only deals with inheritance of repositories that deal with different data sources, but still serving one data source at the time and that is not what I'm after.
I want to expand this pattern that the aggregate can write its data over 2 or more physical data sources depending on the underlying object type.
Why do you want to do that on purpose?
In most cases, the persistence implementation is chosen to serve the domain, rather than the other way around. So the happy path typically involves choosing a persistence solution that can record the state of the entire aggregate, and storing the entire thing within a single transaction.
So if you find yourself trying to store an aggregate in two different places, you should take a hard careful look at why.
One common answer is that you want to be able to query the aggregate state efficiently. cqrs is a common solution here - rather than persisting the aggregate in two different data stores, you persist it to one and replicate it to another. The queries can run very efficiently against the replica (although there is of course some additional latency between a change to the aggregate and the reflection of that change in the query results).
Another common answer is that you really have two aggregates that reference each other. Nothing wrong with storing two aggregates in different places. You may be better served by making the distinction between the two explicit in your code.
Dan Pritchett
Jimmy Bogard
How deals Domain Driven Design if within the aggregate - depending on the type of the underlying object - it is hydrated over different data sources?
Badly, just like everybody else.

PHP Repository Pattern Implementation Questions

What should Repositories return from service calls?
An Entity (or Collection of Entities), or instead a reference to itself, which could then be used to access an property that holds a collection of Entities, for example?
Take this sample code:
$user = $userRepository->findById(1);
or
$users = $userRepository->findAll();
I think in most code, a User Entity object, or Users Collection Entity would be returned from a call like this.
It seems a bit strange to me that from one direction, a Repository will return objects directly, yet from the other end, it will hold them in state before acting upon them. Take this sample code as an example:
$user = $factory->make('user');
$user->setName($array_data['name']);
$repo->add($user);
$repo->save();
Is this just how it's done?
I think I am expecting to see something a bit more like this, in terms of retrieval:
$users = $userRepository->findAll(); // Returns $userRepository reference
foreach($users->collection() as $user) {
// Do some operations, or whatever
}
$users->save();
or perhaps, for read only needs:
$users = $userRepository->findAll();
$users = $users->collection(); // Returns User Entities held in state
Clarification as to why it's done one way or another would be much appreciated.
Where does the Factory belong inside the Domain?
Should it be injected as a dependency of the Mapper object? It seems like there must also be Factory access from the controlling code/service layer as well, for creating Entities to submit to the Repository.
That leads into my next question...
What is the preferred way to create new Entities from the controlling class/service layer?
I have seen Factory objects being used, like this:
$user = $factory->make('user');
$user->setName($array_data['name']);
$repo->add($user);
As well as built-in Repository methods, like this:
$repo->saveFromArray($array_data);
In the second example, the $array_data would be forwarded through the repository, to the Mapper, which will then perform the save. Of course the data source would be checked for overlapping records beforehand, in either example.
I assume the first method is preferred? It seems to be a more object-oriented approach.
You have many questions...
What should Repositories return from service calls?
Always aggregate roots (AR). AR design is very important, but it's none of the repository's concern. The repository methods return one or many objects as needed by the Domain. There is no Users Collection Enitity, there's a list of Users (which in php probably is an array), don't complicate things.
The Domain repositories should be used only for Domain needs (read or write). The whole object is returned, the repository doesn't return pieces of an AR, but the whole AR. Once again I mention that AR design is very important.
Where does the Factory belong inside the Domain?
Where it's needed. I don't use a factory, at most I have a factory method, but even that is for restoring purposes (if I'm using a memento). You don't have to use a factory to create the domain objects.
What is the preferred way to create new Entities from the controlling class/service layer?
The simplest way possible. For probably 99% of cases, you'll be using the "new" operator. Use Factories only if it gives you a concrete benefit for specific entities.
The Mapper never performs saves, because it's a mapper. Only repositories do persistence work. Mappers 'convert'/copy data from one model to another. You can use mappers to map a domain objects to some data model to be persisted and back.

Should external objects perform operations on inner entities by calling ...?

1)
a) Entities within an Aggregate should only be accessed via Aggregate root. While it is possible for the root to pass transient references to internal entities to external objects ( for the duration of a single operation ), I assume in most cases if external object needs to performs some operation on internal entity, it should call method(s) defined on the Aggregate root ( contrived example - Order.SetOrderLineTitle(...) )?
2) Only AGGREGATE roots can be obtained directly. All other objects must be found by traversal of associations.
a) When we say that external objects should access non-root entities by traversal of associations, do we mean they should call methods on Aggregate root ( e.g. Order.SetOrderLineTitle(...)), which in turn would perform operations on internal objects or do we mean that Aggregate root should pass a reference to internal entity to an external object or both?
Thank you
1) Yes, this is the best way for the aggregate to maintain its integrity. Some say that this can result in aggregates with very large number of methods, however in that case there may be multiple aggregates at play.
2) Ideally, the aggregate would perform the required operation without passing references. There may be a case where passing a reference makes sense, but this should be implemented with care as it makes reasoning about integrity more difficult.
I assume in most cases if external object needs to performs some
operation on internal entity, it should call method(s) defined on the
Aggregate root
Just to add a slightly different take on this, the reverse approach might also be used. Adding methods to the Aggregate Root in most cases forces you to divide your domain in very small Aggregates lest the roots become bloated, violating SRP. This slicing might come at the cost of sacrificing the natural business cohesion of your Aggregates.
Instead, you could decide that in most cases you will let external objects get transient references to internal entities and manipulate them as they wish. In rarer cases, especially ones that imply enforcing invariants that span across multiple entities, it would be a better idea to implement these operations directly on the Root.
That approach is discussed here : https://groups.google.com/forum/#!topic/dddcqrs/mtGanS39XYo
the way I see it is although an aggregate root is responsible for the
life cycle of entities within, that doesn't mean that it should be the
exclusive interface ( other than returning a specific entity) to all
methods called on any item within the aggregate.
Overall, the final decision will depend on whether you want to design your aggregates primarily with domain/functional cohesiveness in mind or you first want to think of them as transactional safeguards.

How do you persist/restore aggregate roots with entities in DDD?

Based on the following definitions from Domain-Driven Design: Tackling Complexity in the Heart of Software,
An aggregate is:
A cluster of associated objects that are treated as a unit for the purpose of data changes. External references are restricted to one member of the AGGREGATE, designated as the root. A set of consistency rules applies within the AGGREGATE'S boundaries.
I don't think the Aggregate root should hold a reference to the repository. Since the Aggregate root is the only one that should be holding references to its entities and aggregates, they should be private.
How can my repository persist and restore this private data ?
Edit:
Let's take the classic Order, OrderLines example.
An order is the Aggregate root.
It's lines are Entities.
Since the Aggregate root(order) is the only object allowed to hold references to its entities (order lines), I do not understand how would I persist order lines from the repository.
As far as I understand the aggregate root, it must be the place to access all the entities inside it's scope. That means, as long as traditional ORM is used, that you can access the OrderLines throug the Order.
Further it is not forbidden for anyone to grab a reference to the entitiy inside the root, but these references must be volatile (i.e. short lived) and you must obtain the rerefence via the aggregate root.
In terms of DDD you will use a repository to hide data access, the factory might in turn use a factory to assemble the object. The facotry knows well about the internal structure of the object and must be able to build up a new object or restore one from the data the repository hands over.
Perhaps you might also look into CQRS + Event Sourcing which provides a different approach to persisting entities.
Well, most folks consider the repository to be a logical feature of hte aggregate root (since there's only one per aggregate, in traditional DDD), so it does & should have access to the orderlines.
If you really want them to be private, though, you would need to resort to reflection, or else have the aggregate root entity return them in some persistable fashion (perhaps w/ an internal call of some kind).

Resources