Find aggregate root usages across the domain

Find aggregate root usages across the domain - domain-driven-design

As a matter of fact, Entities and even Value Objects may contain references to Aggregate Roots.
Additionally as per definition, Aggregate Roots stand on their own, the have intrinsically a Repository where I not at last could delete that Aggregate Root.
As a requirement of my GUI / Workflow, the customer wants to see where a particular Aggregate Root is referenced, not least because he wants / should be able to check whether he can delete that AR "safely".
My current design only has the navigation from the Entity in question towards the other AggregateRoot, so there's, at the moment, no simple way to find the opposite direction.
As this is surely not a single case, I wonder how this is done usually?
Addendum:
Consider the following example; we have an Address as Entity, and a Value Object HomeVisit containing date and Address address (just for the simplicity).
Until now, there is no modelling need to be able to navigate from Address to HomeVisit, even more since bidirectional associations are discouraged in general.
But you should see the use case now: For any reason I might need to be able to find out where an Address is currently used prior to delete or even modify it (maybe a service technician is currently on its way to that address and I need to be aware of that).
You can argue that for these cases there must be a Service or similar to find that out, but imagine there's a third party module which brings the HomeVisit VO and makes use of the Address somehow anonymously; at least that's the way I would like it to implement.

It seems like the answer is: via Domain or Application Services, involving respective Repositories whereas necessary.
So, the check for references has to be kinda hard-coded with an AddressService, having a method like deleteAddressByIdentifier. This method then needs to check or invoke a method isTechnicianOnWayToAddress() which again queries Repositories accordingly; or whatever is needed to fulfil the goal.

Related

DDD - How to get another aggregate root without talking with repository?

In this question https://softwareengineering.stackexchange.com/questions/396151/which-layer-do-ddd-repositories-belong-to?newreg=f257b90b65e94f9ead5df5096267ef9a, I know that we should avoid talking with the repository.
But now we have one aggregate root and we need to get another aggregate root to check if the aggregate root we have is correct.
For example:
every resource has its scheme. So the resource will hold the name of its scheme.
If the resource wants to update with the new resource data, it wants to get the scheme entity and ask scheme check if the new resource data is matched by scheme validator or not.
To get the scheme entity, we need to ask the scheme repository. But they tell me we should avoid talking with the repository. If we really need to avoid, how can we get the scheme entity by its name?

Two ideas that you should keep in mind:
these are patterns; the patterns aren't expressed exactly the same way everywhere, but are adapted to each context where we use tham.
we choose designs that make our lives easier -- if the pattern is getting in the way, we don't use it.
To get the scheme entity, we need to ask the scheme repository. But they tell me we should avoid talking with the repository. If we really need to avoid, how can we get the scheme entity by its name?
Basic idea: entities can be found via traversal (you start from some aggregate root, then keep asking for things until you get where you want), root entities can be reached via the repository.
In the "blue book", Evans assumed that you could reach one aggregate root from another; people tend not to do that as much (when you can no longer assume that all aggregates will be stored in the same database, you suddenly need to be careful about which aggregates can change at the same time, and this in turn suggests additional constraints on your design).
In the present day, we would notice that if we aren't intending to change the scheme, that we don't need its aggregate root at all - we just need a validator, which we can create from a copy of the information.
So we treat the scheme information as reference data (see Helland, 2005).
Great - so how do we get "reference data" to the resource aggregate?
Typically in one of two ways - the simplest by far is to just pass it as an argument. The application code (usually the same place that we are pulling the resource aggregate out of its repository) is also responsible for looking up the resource data.
resource = resourceRepository.get(id)
validator = validatorRepository.get(resource.scheme)
resource.update(validator, ....)
Alternatively, we can pass the capability to lookup the validator to the resource. Naively, that might look like:
resource = resourceRepository.get(id)
resource.update(validatorRepository::get, ....)
But that violates our "rule" about where we use the repository. So now what? Two possible answers: we decide the rule doesn't apply here, OR we use a similar pattern to get what we need: a "domain service")
resource = resourceRepository.get(id)
resource.update(domainService, ....)
Domain service is a pattern that can be used for all sorts of things; here, we are using it as a convenient mechanism to access the reference data that we need.
Superficially, it looks like a repository - the significant difference here is that this domain services doesn't have affordances to change the scheme entities; it can only read them. The information is the same, but the expression of that information as an object is different (because this object only has read methods).
It's "just" design; the machine really doesn't care how we tell it what work to do. We're merely trying to arrange the code so that it clearly communicates intent to the next programmer who comes along.

Access aggregate root child directly

I am modeling a course app, trying to play with DDD and Clean Architecture. So I have Course, which has one or more modules, and each of them has one or more lessons
I created a ModuleLessons aggregate root which is a list of lessons that belongs to a module.
I have the use case where user can access the whole list of lessons within a module, so he access an url like myapp/lessons/{module-id} and this it will endup calling something like moduleLessonsRepository.getById({module-id}) and will render to user a list of lessons which compose that module
As I understand, repository should only deal with the whole aggregate root, not child entities directly. In other words, if Lesson is not an AR, I must not have a LessonRepository.getById()
But I have another use case where user can access something like myapp/lesson/{lesson-id}
But how could I implement if I cant have a repository which returns a lesson by it's id?
I could load the ModuleLessons aggregate and then find lesson within it, but I don't have it's id to query.
I could put module id and lesson id (or maybe just a 'lesson position within it's module) on the url and use that to find the ModuleLessons AR, but I'm puting extra data on the url just to fulfill architectural constraints, is that right?
Finaly, the lesson position within it's module does mater, but this piece of data dont belong to the lesson nor to the module, that's why I created the list itself as the AR, maybe it wasn't the right decision?

Your model sounds very structural, e.g. a course consists of modules, modules consist of one or more lessons being taught as part of it, etc. It's not really solving a problem (or at least you've not described one). Could be booking a course, could be attending the lessons of a course, etc ... The other observation is that you seem to be describing what are essentially queries. You will find that most models have a conflict of interest when it comes to reading and writing, one of the main reasons CQRS came about in the first place (not suggesting you adopt that, merely pointing out the obvious). Writing happens to align with use cases and rules that must be upheld at all times (or else). Reading, on the other hand, seems to happen far more liberally, without much consideration for the past use cases that brought the queryable data about. One easy step could be to undo yourself of the shackles that say you can't return lessons by id - simply add whatever code you need to make that happen and don't feel compelled to put that in a box like a repository. Consistency is to be considered, but if the writing imposes the proper transactional boundaries, the reading won't inadvertently observe something it shouldn't. Secondary indexes can help too - they're the sort of thing that can help you find the module id based on the lesson id if you choose to continue to go down the current path.

If it is just about reading data (e.g. showing data to a user), you can always bypass the whole aggregate repository and use whatever whatever appropriate read queries you need. Only, if your use case needs to manipulate data go through the aggregate repository to retrieve a full aggregate in order to make sure transactional consistency inside this aggregate as well as business rules are applied when changing said aggregate.
Also, it should be considered that if you do you have valid use cases where you would directly change (not read) an entity inside an aggregate without the need of considering business logic that needs to be owned by the parent aggregate root, you might have missed to discover this entity being modeled as an aggregate on it's own. See also, https://stackoverflow.com/a/67250062/7730554

What is an Aggregate Root?

No, it is not a duplication question.
I have red many sources on the subject, but still I feel like I don't fully understand it.
This is the information I have so far (from multiple sources, be it articles, videos, etc...) about what is an Aggregate and Aggregate Root:
Aggregate is a collection of multiple Value Objects\Entity references and rules.
An Aggregate is always a command model (meant to change business state).
An Aggregate represents a single unit of (database - because essentialy the changes will be persisted) work, meaning it has to be consistent.
The Aggregate Root is the interface to the external world.
An Aggregate Root must have a globally unique identifier within the system
DDD suggests to have a Repository per Aggregate Root
A simple object from an aggregate can't be changed without its AR(Aggregate Root) knowing it
So with all that in mind, lets get to the part where I get confused:
in this site it says
The Aggregate Root is the interface to the external world. All interaction with an Aggregate is via the Aggregate Root. As such, an Aggregate Root MUST have a globally unique identifier within the system. Other Entites that are present in the Aggregate but are not Aggregate Roots require only a locally unique identifier, that is, an Id that is unique within the Aggregate.
But then, in this example I can see that an Aggregate Root is implemented by a static class called Transfer that acts as an Aggregate and a static function inside called TransferedRegistered that acts as an AR.
So the questions are:
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that its a function. what does have a globaly unique identifier is the Domain Event that this function produces.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function of the Aggregate class itself?
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier), then how can we interact with this Aggregate? the first article clearly stated that all interaction with an Aggregate is by the AR, if the AR is an event, then we can do nothing but react on it.
Is it right to say that the aggregate has two main jobs:
Apply the needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be raised in a Domain Event from the AR
Please correct me on any of the bullet points in the beginning if some/all of them are wrong is some way or another and feel free to add more of them if I have missed any!
Thanks for clarifying things out!

I feel like I don't fully understand it.
That's not your fault. The literature sucks.
As best I can tell, the core ideas of implementing solutions using domain driven design came out of the world of Java circa 2003. So the patterns described by Evans in chapters 5 and six of the blue book were understood to be object oriented (in the Java sense) domain modeling done right.
Chapter 6, which discusses the aggregate pattern, is specifically about life cycle management; how do you create new entities in the domain model, how does the application find the right entity to interact with, and so on.
And so we have Factories, that allow you to create instances of domain entities, and Repositories, that provide an abstraction for retrieving a reference to a domain entity.
But there's a third riddle, which is this: what happens when you have some rule in your domain that requires synchronization between two entities in the domain? If you allow applications to talk to the entities in an uncoordinated fashion, then you may end up with inconsistencies in the data.
So the aggregate pattern is an answer to that; we organize the coordinated entities into graphs. With respect to change (and storage), the graph of entities becomes a single unit that the application is allowed to interact with.
The notion of the aggregate root is that the interface between the application and the graph should be one of the members of the graph. So the application shares information with the root entity, and then the root entity shares that information with the other members of the aggregate.
The aggregate root, being the entry point into the aggregate, plays the role of a coarse grained lock, ensuring that all of the changes to the aggregate members happen together.
It's not entirely wrong to think of this as a form of encapsulation -- to the application, the aggregate looks like a single entity (the root), with the rest of the complexity of the aggregate being hidden from view.
Now, over the past 15 years, there's been some semantic drift; people trying to adapt the pattern in ways that it better fits their problems, or better fits their preferred designs. So you have to exercise some care in designing how to translate the labels that they are using.

In simple terms an aggregate root (AR) is an entity that has a life-cycle of its own. To me this is the most important point. One AR cannot contain another AR but can reference it by Id or some value object (VO) containing at least the Id of the referenced AR. I tend to prefer to have an AR contain only other VOs instead of entities (YMMV). To this end the AR is responsible for consistency and variants w.r.t. the AR. Each VO can have its own invariants such as an EMailAddress requiring a valid e-mail format. Even if one were to call contained classes entities I will call that semantics since one could get the same thing done with a VO. A repository is responsible for AR persistence.
The example implementation you linked to is not something I would do or recommend. I followed some of the comments and I too, as one commenter alluded to, would rather use a domain service to perform something like a Transfer between two accounts. The registration of the transfer is not something that may necessarily be permitted and, as such, the domain service would be required to ensure the validity of the transfer. In fact, the registration of a transfer request would probably be a Journal in an accounting sense as that is my experience. Once the journal is approved it may attempt the actual transfer.
At some point in my DDD journey I thought that there has to be something wrong since it shouldn't be so difficult to understand aggregates. There are many opinions and interpretations w.r.t. to DDD and aggregates which is why it can get confusing. The other aspect is, in IMHO, that there is a fair amount of design involved that requires some creativity and which is based on an understanding of the domain itself. Creativity cannot be taught and design falls into the realm of tacit knowledge. The popular example of tacit knowledge is learning to ride a bike. Now, we can read all we want about how to ride a bike and it may or may not help much. Once we are on the bike and we teach ourselves to balance then we can make progress. Then there are people who end up doing absolutely crazy things on a bike and even if I read how to I don't think that I'll try :)
Keep practicing and modelling until it starts to make sense or until you feel comfortable with the model. If I recall correctly Eric Evans mentions in the Blue Book that it may take a couple of designs to get the model closer to what we need.

Keep in mind that Mike Mogosanu is using a event sourcing approach but in any case (without ES) his approach is very good to avoid unwanted artifacts in mainstream OOP languages.
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that
its a function. what does have a globaly unique identifier is the
Domain Event that this function produces.
TransferNumber acts as natural unique ID; there is also a GUID to avoid the need a full Value Object in some cases.
There is no unique ID state in the computer memory because it is an argument but think about it; why you want a globaly unique ID? It is just to locate the root element and its (non unique ID) childrens for persistence purposes (find, modify or delete it).
Order A has 2 order lines (1 and 2) while Order B has 4 order lines (1,2,3,4); the unique identifier of order lines is a composition of its ID and the Order ID: A1, B3, etc. It is just like relational schemas in relational databases.
So you need that ID just for persistence and the element that goes to persistence is a domain event expressing the changes; all the changes needed to keep consistency, so if you persist the domain event using the global unique ID to find in persistence what you have to modify the system will be in a consistent state.
You could do
var newTransfer = New Transfer(TransferNumber); //newTransfer is now an AG with a global unique ID
var changes = t.RegisterTransfer(Debit debit, Credit credit)
persistence.applyChanges(changes);
but what is the point of instantiate a object to create state in the computer memory if you are not going to do more than one thing with this object? It is pointless and most of OOP detractors use this kind of bad OOP design to criticize OOP and lean to functional programming.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function
of the Aggregate class itself?
It is the function itself. You can read in the post:
AR is a role , and the function is the implementation.
An Aggregate represents a single unit of work, meaning it has to be consistent. You can see how the function honors this. It is a single unit of work that keeps the system in a consistent state.
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier),
then how can we interact with this Aggregate? the first article
clearly stated that all interaction with an Aggregate is by the AR, if
the AR is an event, then we can do nothing but react on it.
Answered above because the domain event is not the AR.
4 Is it right to say that the aggregate has two main jobs: Apply the
needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be
raised in a Domain Event from the AR
Yes; again, you can see how the static function honors this.
You could try to contat Mike Mogosanu. I am sure he could explain his approach better than me.

Injecting repositories to domain objects or repositories should be aware of business logic?

Let's say we have a class Order (related to user) and it has property state. I want to prevent having more than one confirmed order in the same time so before I confirm any order I have to check if there is already some confirmed in it's time period.
I can make 2 approaches (I know of):
OrderRepository has a function changeState which search for conflicting confirmed orders before changing it and allows it only when nothing is found - the problem here is repository knows about logic of changing state.
OrderRespository is injected into Order and Order has function changeState which will use that repository to check for conflicts - here problem is the domain object know about persistence.
What is a right way to do?

Repositories are not in charge of domain invariants. Aggregates are. If no Aggregate has the needed info inside itself to check the invariant, try to question your aggregate design and maybe come up with a new one.
You can use a Domain Service, alternatively. As a weaker option, you could also degrade the domain invariant down to a simple use case precondition and have it checked by the Application Service/Command Handler. Note that the latter 2 options don't provide as strong a guarantee that domain entities will be in a consistent state at all times regarding that rule.

Another way to think about this would be from the point of view of a repository's responsibility. At the end of the day, a Repository is an abstraction to represent how to deal with a collection of objects, in your case, Orders.
Hence, it makes sense to keep rules around consistency and right state for the collection to be represented at the repository layer. Injecting repositories into entities is probably a code smell that something is not being modelled correctly. I'd go for your first version, but maybe you don't need to be specific about your state change, but simply embed those rules into the save() method. You can save the changes to an order iif there's no other confirmed at the same time.

What are consequences of using repository inside of aggregate vs inside of domain service

We all heard that injecting repository into aggregate is a bad idea, but almost no one tells why.
I will try to write here all disadvantages of doing this, so we can measure rightness of this statement.
First thing that comes into my head is Single Responsibility Principle.
It's true that by injecting repository into AR we are violating SRP, because retrieving and persisting of aggregate is not responsibility of aggregate itself. But it says only about "aggregate itself", not about other aggregates. So does it apply for retrieving from repository aggregates referenced by id? And what about storing them?
I used to think that aggregate shouldn't even know that there is some sort of persistence in system, because it doesn't have to exist. Aggregates can be created just for one procedure call and then get rid of.
Now when I think of it, it's not right, because aggregate root is an entity, and entity has sense only if it has some unique identity. So why would we need unique identity if not for persisting? Even if it's just a persistence in a memory. Maybe for comparing, but in my opinion it's not a main reason behind the identity.
Ok, let's assume that we retrieve and store OTHER aggregates from inside of our aggregate using injected repositories. What are other consequences beside SRP violation?
For sure there is a problem with having no control over persisting of aggregates and retrieving is some kind of lazy loading, which is bad for the same reason (no control).
Because of no control we can come into situation when we persist the same aggregate few times, where it could be persisted only once, or the same aggregate is loaded one hundred times where it could be loaded once, hence performance is worse. Also there might be problem with stale data.
These reasons practically disqualifies ability to inject repository into aggregate.
Here comes my main question - why can we inject repositories into domain service then?
Not the same reasons applies here? It's just like moving logic out of aggregate into separate function and pretend it to be something different.
To be honest, when I stared to write this SO question, I had no good answer for that. But after hours of investigating this problem and writing of this question I came to solution. Rubber duck debugging.
I'll post this question anyway for others having the same problems. Of course with my answer below.

Here are the places where I'd recommend to fetch aggregates (i.e. call Repository.Get...()), in preference order :
Application Service
Domain Service
Aggregate
We don't want Aggregates to fetch other Aggregates most of the time, because this blurs the lines, giving them orchestration powers which normally belong to the Application layer. You also raise the risk of the Aggregate trespassing its jurisdiction by modifying other Aggregates, which can result in contention and performance problems, not to mention that transactions become more difficult to analyze and the code base to reason about.
Domain Services are IMO a good place to fetch Aggregates when determining which aggregates to modify is domain logic per se. In your game example (which might not be the ideal context for DDD by the way), which units are affected by another unit's attack might be considered domain logic, thus you may not want to place it at the Application Service level. This rarely happens in my experience though.
Finally, Application Services are the default place where I call Repository.Get(...) for uniformity's sake and because this is the natural place to get a hold of the actors of the use case (usually only one Aggregate per transaction) and orchestrate calls to them.
That doesn't mean Aggregates should never be injected Repositories, there are exceptions, but other alternatives are almost always better.

So as I wrote in a question, I've found my answer already in the process of writing that question.
The best way to show this is by example:
When we have a simple (superficially) behavior like unit attacking other unit, we can write something like that.
unit.attack_unit(other_unit)
Problem is that, to attack an unit, we have to calculate damage and to do that we need another aggregates, like weapon and armor, which are referenced by id inside of unit. Since we cannot inject repository inside of aggregate, then we have to move that attack_unit logic into domain service, because we can inject repository there. Now where is the difference between injecting it into domain service, and not into unit aggregate.
Answer is - there is no difference. All consequences I described in question won't bite us. In both cases we will load both units once, attacking unit weapon once and armor of unit being attacked once. Also there won't be stale data, even if we mutate weapon object during process and store it, because that weapon is retrieved and stored in one place.
Problem shows up in different example.
Lets create an use case where unit can attack all other units in game in one process.
Problem lies in how we implement it. If we will use already defined unit.attack_unit and we will call it on all units in game (iterating over them), then weapon that is used to compute damage will be retrieved from unit aggregate, number of times equal to count of units in game! But it could be retrieved only once!
It doesn't matter if unit.attack_unit will be method of unit aggregate, or if it will be domain service unit_attack_unit. It will be still the same, weapon will be loaded too many times. To fix that we simply have to change implementation and with that probably interface too.
Now at least we have an answer to question "does moving logic from aggregate method to domain service (because we want to access repository there) fixes problem?". No, it does not change a thing.
Injecting repositories into domain service can be as dangerous as injecting it into aggregate if used wrong.
This answers my SO question, but we still don't have solution to real problem.
What can we do if we have two use cases: one where unit attacks one other unit, and second where unit attacks all other units, without duplicating domain logic.
One way is to put all needed aggregates as parameters to our aggregate method.
unit.attack_unit(unit, weapon, armor)
But what if we will need like five or more aggregates there? It's not a good way. Also application logic will have to know that all these aggregates are needed for an attack, which is knowledge leak. When attack_unit implementation will change we would also might to update interface of that method. What is the purpose of encapsulation then?
So, if we can't access repository to get needed aggregate, how can we smuggle it then?
We can get rid of idea with referencing aggregates by ids, or pass all needed aggregates from application layer (which means knowledge leak).
Or maybe reason of these problems is bad modelling?
Attacking of other unit is indeed an unit responsibility, but is damage calculation its responsibility? Of course not.
Maybe we need another object, like value object MeleeAttack(weapon, armor), yet when we add more properties that can change result of an attack, like enchantments on unit, it gets more complicated.
Also I think that we are now creating objects based on performance, not our on domain.
So from domain driven design, we get performance driven design. Is that what we want? I don't think so.

"So why would we need unique identity if not for persisting?" - think of an account scenario, where several John Smiths exist in your system. Imagine John Smith and John Smith Jr (who didn't enter the Jr in signup) both live at the same address. How do you tell them apart? Imagine I'm trying to write a recommendation engine based upon their past purchases . . . .
Identity is a quality of equality in DDD. If you don't have an identity unique from your fields, then you're a ValueObject.

What are consequences of using repository inside of aggregate vs inside of domain service?
There's a reasonably strong argument that you shouldn't do either.
Riddle: when does an aggregate need to see the state of another aggregate?
The responsibility of an aggregate is to control change. Any command that would change the state of the domain model is dispatched to the aggregate root responsible for the integrity of the state in question. By definition, all of the state required to ensure that the command is currently permitted is contained within the aggregate boundary.
So there is never any need to peek at the data outside of the aggregate when making a change to the model.
In which case, you don't ever need to load another aggregate, which makes the "where" question moot.
Two clarifications:
Queries will often combine the state of multiple aggregates, and will often need to follow a reference from one aggregate to another. The principle above is satisfied because queries treat the domain model as read-only. You need the state to answer the query, but you don't need the invariant enforcement because you aren't changing anything.
Another case is when you need state from another aggregate to process a command properly, but small latency in the data is an acceptable risk to the data. In that case, you query the "other" aggregate to get state. If you were to run that query within the domain model itself, the right way to do so would be via a domain service.
In most cases, though, you'll be equally well served to run the query when generating the command (ie, in the client), or when handling the command (in the application, outside the domain). It would be very unusual for a business to consider domain service latency to be acceptable but client latency to be unacceptable.
(Disconnected clients are one case where that can be especially problematic; when the command is generated and then queued for a long period of time before being dispatched to the server).

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string