Hierarchy with event sourcing - domain-driven-design

Hierarchy with event sourcing - domain-driven-design

Could anyone offer any advice on how they would organise entities, aggregate roots, etc in a hierarchical domain model when using event sourcing?
A project has assets. Assets are a hierarchy. Each asset has a set of data (level contains a set of category, category contains a set of item, etc). There could be 100,000s of assets and they in turn could have 10,000s of sub items.
project (id)
assets (id, parentId)
level
category
item
case
revenues
other...
From my understanding of DDD there would be one aggregate with project being the aggregate root because level cant exist without assets and assets cant exist without projects, etc.
This would lead to there being an enormous amount of objects in the project. It would also mean that to add a case I would have to have a method on the root such as CreateNewCase(asset, level, category, item, case) or CreateNewCase(itemId, case) if each item had a unique key. Project would end up enormous and contain methods for handling events for almost all the entities in the application.
Any help would be appreciated.

You pretty much answered your own question -- this model is impractical, and will be slow as you need to load tens of thousands of items each time you rehydrate an aggregate root from the database (unless resorting to complex lazy loading).
The "can't exist without" rule is rarely applicable. I suggest you read the aggregate design recommendations here : http://dddcommunity.org/library/vernon_2011/ http://www.amazon.com/Implementing-Domain-Driven-Design-Vaughn-Vernon/dp/0321834577

DDD is all about mindset and in this case you're using the wrong mindset. Project, asset, item, category etc are concepts known and used by the Domain. An aggregate usually defines a concept or an use case of one. Event sourcing is just an implementation detail: you mutate state by applying events and those events are also how an aggregate is persisted.
But, before you get to that you need to properly identify the aggregate and its root. This is a matter of design unrelated to programming language, library, framework or actual implementation (like event sourcing).
So, you have to really understand the concepts used by the domain and their use cases. Every time you read/hear: 'X has Y', check to see if the meaning is in fact 'X is defined by Y' or 'X acts as a grouping criteria for Y'. Nobody here can tell you the design for your app, we can only tell you some principles and some gotchas but that's it. There's no formula, recipe or 'best practices', it's only understanding the Domain and then modeling it using code.
You will never get a generic DDD solution for your specific problem, because in DDD each solution is highly dependent on your problem. And the sooner you understand that DDD is a paradigm and not a 'how to' magic recipe, the better you'll be. Else, you'll waste years doing everything BUT DDD, regardless of how many classes named aggregate root, repository and entity you'll have.

In my project I have the concept of the model(actor) and since I use event sourcing in json format all the models inside this model are considered components. If you want to see a practical example look the examples of this project. https://github.com/politrons/Scalaydrated

Related

Can I say Axon Commands and Events are considered as anemic models?

My question here is quite straight as mentioned in the subject.
However, please allow me to give some brief explanation here about my innocent thoughts.
I've been using Axon for approximately 10 months now. I used to design my project structure based on the Hexagonal architecture with two top level packages respectively for domain and infrastructure.
Furthermore, domain package will contain different domain objects (as explained in the DDD concept) such as follow:
Aggregate (this will be an Axon aggregate class).
Repository (in my case, this will be a Spring Data Repository interface).
Entity (in my case, this contains any lookup entity that i used for set-based consistency validation as written here).
Service Port (collection of Input and Ouput port interfaces).
Commands (representing Axon Command object).
As for Events, I used to put them on a different module that I compiled as a jar file, so I can share it to other developers whom going to use the same event in their project.
I've noticed recently that all of my commands and events were basically anemic models (an anti pattern that we should avoid).
Is there any good practice on this ? Or, Is it something that intentionally used by design ?
I've been thinking to put my Command classes within my Aggregate class (as an inner classes). At least by using this approach I won't end-up with having so many anemic models scattered outside. Any thoughts ?

Commands are designed to be behavior and input structures mirroring the external world. They don't necessarily mirror an aggregate's structure.
They are not even connected clearly to one single aggregate, at times. Enclosing them within aggregates can be a code smell because you are then thinking in terms of resources and UI organization, instead of transaction boundaries and entity groups.
You are also violating the open-closed principle. Changes in volatile layers like user interface and request structures will make you edit the Aggregate class, and that is not good design.
On a more general note...
At times, this debate of anemic vs. non-anemic (or dry vs. non-dry) can push you in the direction of premature - and incorrect - optimization. Try avoiding this trap because you will end up optimising at the code level, but your domain will suffer.
DDD and CQRS guidelines align with principles that help you keep complexity at bay over the long term. Things kept distinct and separate help you achieve this.

First of all, in DDD, your domain had to be free of any frameworks, just use pure language library.
Then, mixing Commands and Aggregates cannot be a good solution. I think Commands belongs to Port while Aggregates belongs to the Hexagone.
Finally, DDD highlights the discovery of the domain thanks to the experts. Did you do that ? If not, if you're only using the Tacticts pattern, you'll miss one of the most important part of DDD.

DDD modelling settings as Value Objects

I am attempting to build an application in a DDD way. Imagine the Aggregate root is a 'Page' which has other aggregates such as Author, Commenter, Comments, Status etc...
A page can also have various settings such as:
Private page
Allow comments
Allow anonymous comments
Lock after x Date
Slug
I am trying to consider the best way to model these 'settings'. Currently I am looking to have a Settings collection that has each individual setting as a value object. As you can see some of these are essentially boolean values and others might contain specific values such as a date. Given there might be dozens of settings, some with defaults, do I model the 'settings' collection with each specific setting, or just as a collection with the applicable settings?
Is there a 'better' or standard DDD way to approach this problem? I was considering using the specification pattern here but have concluded it doesn't really apply
Sorry a cheeky second question....
A page can have one of many statuses (e.g. draft, published, scheduled, archived) but only one status at a given time. Similar question in terms of the best way to model this. As a status represents something that has an identity I have implemented the status as an entity for the moment. I was wondering would a better approach be to model this as a workflow or a status history even?

First let me say that the aggregate you are planning to create could be too big, double check that since creating an aggregate implies some trade offs like that you can only access the child entities from the aggregate root, and that all changes in the aggregate are atomic.
To avoid having a too big aggregate you could have AuthorId as a VO inside Page instead of the full Author entity.
About the settings thing, as far I know DDD doesn’t have some directives to design this. But I’d try to have Settings collection of some interface, the I’d have an implementation for each setting type: Boolean, date...

What is an Aggregate Root?

No, it is not a duplication question.
I have red many sources on the subject, but still I feel like I don't fully understand it.
This is the information I have so far (from multiple sources, be it articles, videos, etc...) about what is an Aggregate and Aggregate Root:
Aggregate is a collection of multiple Value Objects\Entity references and rules.
An Aggregate is always a command model (meant to change business state).
An Aggregate represents a single unit of (database - because essentialy the changes will be persisted) work, meaning it has to be consistent.
The Aggregate Root is the interface to the external world.
An Aggregate Root must have a globally unique identifier within the system
DDD suggests to have a Repository per Aggregate Root
A simple object from an aggregate can't be changed without its AR(Aggregate Root) knowing it
So with all that in mind, lets get to the part where I get confused:
in this site it says
The Aggregate Root is the interface to the external world. All interaction with an Aggregate is via the Aggregate Root. As such, an Aggregate Root MUST have a globally unique identifier within the system. Other Entites that are present in the Aggregate but are not Aggregate Roots require only a locally unique identifier, that is, an Id that is unique within the Aggregate.
But then, in this example I can see that an Aggregate Root is implemented by a static class called Transfer that acts as an Aggregate and a static function inside called TransferedRegistered that acts as an AR.
So the questions are:
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that its a function. what does have a globaly unique identifier is the Domain Event that this function produces.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function of the Aggregate class itself?
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier), then how can we interact with this Aggregate? the first article clearly stated that all interaction with an Aggregate is by the AR, if the AR is an event, then we can do nothing but react on it.
Is it right to say that the aggregate has two main jobs:
Apply the needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be raised in a Domain Event from the AR
Please correct me on any of the bullet points in the beginning if some/all of them are wrong is some way or another and feel free to add more of them if I have missed any!
Thanks for clarifying things out!

I feel like I don't fully understand it.
That's not your fault. The literature sucks.
As best I can tell, the core ideas of implementing solutions using domain driven design came out of the world of Java circa 2003. So the patterns described by Evans in chapters 5 and six of the blue book were understood to be object oriented (in the Java sense) domain modeling done right.
Chapter 6, which discusses the aggregate pattern, is specifically about life cycle management; how do you create new entities in the domain model, how does the application find the right entity to interact with, and so on.
And so we have Factories, that allow you to create instances of domain entities, and Repositories, that provide an abstraction for retrieving a reference to a domain entity.
But there's a third riddle, which is this: what happens when you have some rule in your domain that requires synchronization between two entities in the domain? If you allow applications to talk to the entities in an uncoordinated fashion, then you may end up with inconsistencies in the data.
So the aggregate pattern is an answer to that; we organize the coordinated entities into graphs. With respect to change (and storage), the graph of entities becomes a single unit that the application is allowed to interact with.
The notion of the aggregate root is that the interface between the application and the graph should be one of the members of the graph. So the application shares information with the root entity, and then the root entity shares that information with the other members of the aggregate.
The aggregate root, being the entry point into the aggregate, plays the role of a coarse grained lock, ensuring that all of the changes to the aggregate members happen together.
It's not entirely wrong to think of this as a form of encapsulation -- to the application, the aggregate looks like a single entity (the root), with the rest of the complexity of the aggregate being hidden from view.
Now, over the past 15 years, there's been some semantic drift; people trying to adapt the pattern in ways that it better fits their problems, or better fits their preferred designs. So you have to exercise some care in designing how to translate the labels that they are using.

In simple terms an aggregate root (AR) is an entity that has a life-cycle of its own. To me this is the most important point. One AR cannot contain another AR but can reference it by Id or some value object (VO) containing at least the Id of the referenced AR. I tend to prefer to have an AR contain only other VOs instead of entities (YMMV). To this end the AR is responsible for consistency and variants w.r.t. the AR. Each VO can have its own invariants such as an EMailAddress requiring a valid e-mail format. Even if one were to call contained classes entities I will call that semantics since one could get the same thing done with a VO. A repository is responsible for AR persistence.
The example implementation you linked to is not something I would do or recommend. I followed some of the comments and I too, as one commenter alluded to, would rather use a domain service to perform something like a Transfer between two accounts. The registration of the transfer is not something that may necessarily be permitted and, as such, the domain service would be required to ensure the validity of the transfer. In fact, the registration of a transfer request would probably be a Journal in an accounting sense as that is my experience. Once the journal is approved it may attempt the actual transfer.
At some point in my DDD journey I thought that there has to be something wrong since it shouldn't be so difficult to understand aggregates. There are many opinions and interpretations w.r.t. to DDD and aggregates which is why it can get confusing. The other aspect is, in IMHO, that there is a fair amount of design involved that requires some creativity and which is based on an understanding of the domain itself. Creativity cannot be taught and design falls into the realm of tacit knowledge. The popular example of tacit knowledge is learning to ride a bike. Now, we can read all we want about how to ride a bike and it may or may not help much. Once we are on the bike and we teach ourselves to balance then we can make progress. Then there are people who end up doing absolutely crazy things on a bike and even if I read how to I don't think that I'll try :)
Keep practicing and modelling until it starts to make sense or until you feel comfortable with the model. If I recall correctly Eric Evans mentions in the Blue Book that it may take a couple of designs to get the model closer to what we need.

Keep in mind that Mike Mogosanu is using a event sourcing approach but in any case (without ES) his approach is very good to avoid unwanted artifacts in mainstream OOP languages.
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that
its a function. what does have a globaly unique identifier is the
Domain Event that this function produces.
TransferNumber acts as natural unique ID; there is also a GUID to avoid the need a full Value Object in some cases.
There is no unique ID state in the computer memory because it is an argument but think about it; why you want a globaly unique ID? It is just to locate the root element and its (non unique ID) childrens for persistence purposes (find, modify or delete it).
Order A has 2 order lines (1 and 2) while Order B has 4 order lines (1,2,3,4); the unique identifier of order lines is a composition of its ID and the Order ID: A1, B3, etc. It is just like relational schemas in relational databases.
So you need that ID just for persistence and the element that goes to persistence is a domain event expressing the changes; all the changes needed to keep consistency, so if you persist the domain event using the global unique ID to find in persistence what you have to modify the system will be in a consistent state.
You could do
var newTransfer = New Transfer(TransferNumber); //newTransfer is now an AG with a global unique ID
var changes = t.RegisterTransfer(Debit debit, Credit credit)
persistence.applyChanges(changes);
but what is the point of instantiate a object to create state in the computer memory if you are not going to do more than one thing with this object? It is pointless and most of OOP detractors use this kind of bad OOP design to criticize OOP and lean to functional programming.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function
of the Aggregate class itself?
It is the function itself. You can read in the post:
AR is a role , and the function is the implementation.
An Aggregate represents a single unit of work, meaning it has to be consistent. You can see how the function honors this. It is a single unit of work that keeps the system in a consistent state.
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier),
then how can we interact with this Aggregate? the first article
clearly stated that all interaction with an Aggregate is by the AR, if
the AR is an event, then we can do nothing but react on it.
Answered above because the domain event is not the AR.
4 Is it right to say that the aggregate has two main jobs: Apply the
needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be
raised in a Domain Event from the AR
Yes; again, you can see how the static function honors this.
You could try to contat Mike Mogosanu. I am sure he could explain his approach better than me.

Aggregate roots depend on the use case so does that mean that we might end up with really a lots of repositories?

Ive heard a lots that aggregate roots depend on the use case. But what does that mean in coding context ?
You have a service class which offcourse hold methods (use cases) that gonna accomplish something in a repository. Great, so you use a repository which is equal to an aggregate root to perform your querying.
Now you need to perform some other kind of operation which use totally different use case than the first service class but use the same entities.
Here the representation :
Entities: Customer, Orders, LineOrder
Service 1: Add new customers, Delete some customers, retrieve customer orders
Here the aggregate root seem to be Customer because you need this repository to perform thoses use cases.
Service 2: Retrieve customer from an actual order
Here the aggregate root seem to be Order because you need this repository to perform this use case.
If i am wrong please correct me. Now that mean you have 2 aggregates roots.
Now my question is, since aggregate roots depend on the use case does that mean that we might end up with really a lots of repositories if you end up having lots of use cases ?
The above example was probably not the best example... so lets say we have a Journal which hold JournalEntries which each entries hold Tasks, Problems and Notes. (This is in the context of telling to a system what have been done to a project)
Does that mean that im gonna end up with 2 repository ? (Journal, JournalEntry)
In the use cases where i need to add new tasks, problems and notes from an journal entry ?
(Can be seen as a service)
Or might end up with 4 repository. (Journal, Task, Problems, Notes)
In the use cases where i need to access directment task, problems and notes ?
(Can be seen as another service)
But that would mean if i need both of theses services (that actually hold the use cases) that i actually need 5 repository to be able to perform use cases in both of them ?
Thanks.

Hi I saw your post and thought I may give you my opion. First I must say I've been doing DDD in project for three years now, so I'm not an expert. But I'm currently working in a project as an architect an coaching developers in DDD, and I must say it isn't a walk in the park... I don't know how many times I've refactored the model and Entity relationships.
But my experience is that you endup with some repositories (more than few but not many). My Aggregates usually contains a few classes and the Aggregate object graph isn't that deep (if you know what I mean).
But I try to be concrete:
1) Aggregate roots are defined by your needs. I mean if you feel that you need that Tasks object through Journal to often, then maybe thats a sign for it to be upgraded as a aggregate root.
2) But everything cannot be aggregate roots, so try to capsulate object that are tight related. Notes seems like a candidate for being own by a root object. You'd probably always relate Notes to the root or it loses its context. Notes cannot live by itself.
3) Remember that Aggregates are used for splitting up large complex domains into smaller "islands" that take care of thier inhabbitants. Its important to not make your domain more complex than it is.
4) You don't know how your model look likes before you've reached far into the project implementation phase. If you realize that some repositories aren't used that much, they may be candidates for merging into other root object (if they have that kind of relationship). You can break out objects that are used so much through root object without its context. I mean for example if Journal are aggregate root and contains Notes and Tasks. After a while you model grows and maybe Tasks
have assoications to Action and ActionHistory and User and Rule and Permission. Now I just throw out a bunch om common objects in a rule/action/user permission functionality. Maybe this result in usecases that approach Tasks from another angle, "View all Tasks performed by this User" etc. Tasks get more involved in some kind of State/Workflow engine and therefor candidates for being an aggregate root itself.
Okey. Not the best example but it maybe gives you the idea. A root object can contain children where some of its children can also be root object because we need it in another context (than journal).
But I have myself banged my head against the wall everytime you startup with a fresh model. Just go with the flow and let the model evolve itself through its clients/subsribers. You refine the model through its usage. The Services (application services and not domain services) are of course extended with methods that respond to UI and usecases (often one-to-one).
I hope I helped you in someway...or not :D

Yes, you would most likely end up with 5 repositories (Journal, JournalEntry, Task, Problems, Notes). Your services would then use these repositories to perform CRUD for each type of entity.
Your reaction of "wow so many repositories" is not uncommon for developers new to DDD.
However, your repositories are usually light weight assuming your model and DB schema are fairly evenly matched which is often the case. If you use an ORM such as nHibernate or a tool such as codesmith generator then it gets even easier to create your repositories.

At first you need to define what is aggregate. I don't know about use case aggregates.
I know about aggregates following...
Aggregates are union of several entities. One of the entities is the aggregate root, the rest entities (or value types) have sense only in selected aggregate root context. For example you can define Order and OrderLine as an aggregate if you don't need to do any independent actions with OrderLine entities. It means that OrderLine makes sense in Order context only.
Why to define aggregates at all? It is required to reduce references between objects. That will simplify you domain model.
And of course you don't need to have OrderLineRepository if OrderLine is a part of Order aggregate.
Here is a link with more information. You can read Eric Evans DDD book. He explains aggregates very well.

Am I allowed to have "incomplete" aggregates in DDD?

DDD states that you should only ever access entities through their aggregate root. So say for instance that you have an aggregate root X which potentially has a lot of child Y entities. Now, for some scenario, you only really care about a subset of these Y entities at a time (maybe you're displaying them in a paged list or whatever).
Is it OK to implement a repository then, so that in such scenarios it returns an incomplete aggregate? Ie. an X object who'se Ys collection only contains the Y instances we're interested in and not all of them? This could for instance cause methods on X which perform some calculation involving the Ys to not behave as expected.
Is this perhaps an indication that the Y entity in question should be considered promoted to an aggregate root?
My current idea (in C#) is to leverage the delayed execution of LINQ, so that my X object has an IQueryable to represent its relationship with Y. This way, I can have transparent lazy loading with filtering... But getting this to work with an ORM (Linq to Sql in my case) might be a bit tricky.
Any other clever ideas?

I consider an aggregate root with a lot of child entities to be a code smell, or a DDD smell if you will. :-) Generally I look at two options.
Split your aggregate into many smaller aggregates. This means that my original design was not optimal and I need to identify some new entities.
Split your domain into multiple bounded contexts. This means that there are specific sets of scenarios that use a common subset of the entities in the aggregate, while there are other sets of scenarios that use a different subset.

Jimmy Nilsson hints in his book that instead of reading a complete aggregate you can read a snapshot of parts of it. But you are not supposed to be able to save changes in the snapshot classes to the database.
Jimmy Nilsson's book Chapter 6: Preparing for infrastructure - Querying. Page 226.
Snapshot pattern

You're really asking two overlapping questions.
The title and first half of your question are philosophical/theoretical. I think the reason for accessing entities only through their "aggregate root" is to abstract away the kinds of implementation details you're describing. Access through the aggregate root is a way to reduce complexity by having a trusted point of access. You're eliminating friction/ambiguity/uncertainty by adhering to a convention. It doesn't matter how it's implemented within the root, you just know that when you ask for an entity it will be there. I don't think this perspective rules out a "filtered repository" as you describe. But to provide a pit of success for devs to fall into, it should be impossible instantiate the repository without being explicit about its "filteredness;" likewise, if shared access to a repository instance is possible, the "filteredness" should be explicit when coding in the caller.
The second half of your question is about implementation on a specific platform. Not sure why you mention delayed execution, I think that's really orthogonal to the filtering question. The filtering itself could be a bit tricky to implement with LINQ. Maybe rather than inlining the Where lambdas, you set up a collection of them and select one depending on the filter you need.

You are allowed since the code will compile anyway, but if you're going for a pure DDD design you should not have incomplete instances of objects.
You should look into LazyLoading if you're afraid to load a huge object of which you will only use a small portion of its child entities.
LazyLoading delays the loading of whatever you decide to lazy-load until the moment they are accessed. They make use of callbacks to call the loading method once the code calls for them.

Is it OK to implement a repository then, so that in such scenarios it
returns an incomplete aggregate?
Not at all. Aggregate is a transnational boundary to change the state of your system. Never use aggregates for querying data. Split the system into Write and Read sides. (read about CQR & CQRS). When we think "CRUD" based, we implement our system, based on some resource. Lets say you have "Appointment" aggregate. Thinking "Crudish" means we should implement usecases Create, Update, Delete, GetAll appointments. That means Appointment[] should be returned for GetAll. When you think usecase based, (HexagonalArchitecture) your usecases would be ScheduleAppointment, RescheduleAppointment, CancelAppointment. But for query side it can be: /myCalendar. We return back all appointments for a specific user in a ClientCalendar object. Create separate DTO's for Query sides. Never use aggregates for this purpose.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string