Flexible entity models with CQRS and EventSourcing - domain-driven-design

Quoting from Rinat Abdullin's article:
CQRS and Event Sourcing also simplify implementation of the flexible
entity models with various custom fields and properties that are often
defined at the run-time and used in layout and drag-n-drop designers
by the end-users.
I fail to understand how runtime custom field definition is possible with event sourcing and CQRS.

Just add an IDictionary field to your Commands and Events, and then add its contents to your Projections when handling the Events. Just make sure that whatever you put in the Dictionary is actually serializable.
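A minimal Python sketch of the idea (the event and projection names here are illustrative, not from any particular framework; the thread's examples use C#/PHP, but the mechanism is language-agnostic):

```python
# Sketch: carrying runtime-defined custom fields in an event payload
# and folding them into a read-model projection. All names are illustrative.

class CustomFieldsSet:
    """Event recording arbitrary, runtime-defined fields for an entity."""
    def __init__(self, entity_id, fields):
        self.entity_id = entity_id
        self.fields = dict(fields)  # must contain only serializable values

class Projection:
    """Read model: one dict of properties per entity."""
    def __init__(self):
        self.entities = {}

    def handle(self, event):
        if isinstance(event, CustomFieldsSet):
            props = self.entities.setdefault(event.entity_id, {})
            props.update(event.fields)  # copy the dictionary contents over

projection = Projection()
projection.handle(CustomFieldsSet("order-1", {"color": "red", "priority": 3}))
projection.handle(CustomFieldsSet("order-1", {"priority": 5}))
print(projection.entities["order-1"])  # {'color': 'red', 'priority': 5}
```

The projection never needs to know the field names in advance, which is what makes end-user-defined fields possible.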

Related

Best practice for naming Event Types in Event Sourcing

When building an event store, the typical approach is to serialize the event and then persist the type of the event, the body of the event (the serialized event itself), an identifier and the time it occurred.
When it comes to the event type, are there any best practices as to how these should be stored and referenced? Examples I see store the fully qualified path of the class, i.e.
com.company.project.package.XXXXEvent
What effort is then required though if you decide to refactor your project structure?
After years running event-sourced applications in production, we avoid using fully qualified class names or any other platform-specific identifiers for event types.
An event type is just a string that should allow any kind of reader to understand how the event should be deserialized. You are also absolutely right about the issue with refactoring the application structure that might lead to changes in the class name.
Therefore, we use a pre-configured map that allows resolving the object type to a string and reversing the string back to an event type. By doing so, we detach the event type metadata from the actual class and get the freedom to read and write events using different languages and stacks, and to freely move classes around if needed.
What effort is then required though if you decide to refactor your project structure?
Not a lot of effort, but some discipline.
Events are messages, and long term viability of messages depend on having a schema, where the schema is deliberately designed to support forward and backward compatibility.
So something like "event type" would be a field name that can be any of an open set of identifiers which would each have an official spelling and semantics.
The spelling conventions that you use don't matter - you can use something that looks like a name in a hierarchical namespace, or you can use a URI, or even just a number like a surrogate key.
The identifiers, whatever convention you use, are coupled to the specification -- not to the class hierarchy that implements them.
In other words, there's no particular reason that org.example.events.Stopped necessarily implies the existence of a type org.example.events.Stopped.
Your "factories" are supposed to create instances of the correct classes/data structures from the messages. The naive mapping from schema identifier to class identifier works, so yes, they can take that shortcut. But when you decide to refactor your packages, you have to change the implementation of the factory so that the old identifiers from the message schema map to the new class implementations.
In other words, using something like Class.forName is a shortcut, which you abandon in favor of doing the translation explicitly when the shortcut no longer works.
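A sketch of such a pre-configured, two-way map in Python (the stable string identifiers and event classes are made up for illustration):

```python
# Two-way map between stable event-type strings and the classes that
# currently implement them. Refactoring or moving a class only changes
# this map, never the identifiers already stored in the event store.

class OrderPlaced: ...
class OrderShipped: ...

EVENT_TYPES = {
    "ordering.order-placed": OrderPlaced,
    "ordering.order-shipped": OrderShipped,
}
CLASS_TO_TYPE = {cls: name for name, cls in EVENT_TYPES.items()}

def type_name_for(event) -> str:
    """Resolve the stable identifier to persist with an event."""
    return CLASS_TO_TYPE[type(event)]

def class_for(type_name: str):
    """Resolve a persisted identifier back to the current class."""
    return EVENT_TYPES[type_name]

assert type_name_for(OrderPlaced()) == "ordering.order-placed"
assert class_for("ordering.order-shipped") is OrderShipped
```

The identifiers live in the specification, not in the class hierarchy, so renaming `OrderShipped` or moving it to another package only requires updating the map.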
Since event sourcing is about storing Domain Events, we prefer to avoid package-names or other technical properties in the events. Especially when it comes to naming them, since the name should be part of ubiquitous language. Domain Experts and other people don't lean on package names when making conversation about the domain. Package names are a language construct that also ties the storage of the Domain Events with the use of them within your software, which is another reason to avoid this solution.
We sometimes use the short class name (such as Class.forName in Java) to make mapping to code simpler and more automatic, but the class names should in that case be carefully chosen to match the ubiquitous language so that it still is not too implementation-specific.
Additionally, adding a prefix opens up the possibility of having multiple Event Types with the same name but different prefixes. Domain Events are part of the context of the Aggregate they are emitted from, and therefore the Aggregate type can be useful to embed in the event. It will scope your events so you don't have to make up synthetic prefixes.
If you store events from multiple bounded contexts in one store, use BoundedContext.EventThatHappened. Use past tense for events, and keep event names unique within a bounded context. As your events' implementations will change, there is no direct connection to a class name.

Event Sourcing and total aggregates encapsulation

I came across the claim that Event Sourcing assumes total encapsulation. Aggregates don't allow access to their internal state; state is kept internally only to impose valid transitions. As far as I grasp it, aggregates (in terms of the outside world) just emit events. And I can't get my head around that, actually. I refine my models to reflect my business needs, which leads to objects that publish some API. For example, I have two aggregate roots: cart and order. I would like to build my order using ActiveItems from the cart:
$order->addItems($cart->getActiveItems)
But this violates the ES assumption about total encapsulation of aggregate state. How should an order be fulfilled with ActiveItems according to ES good practices? Should I use a read model? I think this leads to knowledge leaking out of the model (aggregate). Thank you in advance!
Alexey is right in that the Event Sourcing is just a persistence mechanism. I think the confusion comes when thinking about Aggregates. Encapsulation is an important concept when thinking about Aggregates. The implication here is that they are not used for query or the UI. Hence the reason CQRS fits in so well.
But most applications need to query the data or display things on the UI. And that's where Read Models come in handy. Assuming you are using CQRS and Event Sourcing (which you don't have to when using Aggregates) it's a fairly easy thing to do. The idea is to subscribe to the events and update the Read Model as you go. This doesn't 'leak' anything because the functionality is in the Aggregate domain objects.
Why is this a good thing?
Having no or extremely limited dependencies makes aggregates much simpler to work with.
Read models can be highly optimised for reading from and therefore very fast.
Read models don't require complex queries and joins.
There is a clear separation of concerns
This approach offers huge scaling potential
It's easy to test
I'm sure there are more. If it helps I have a blog post outlining a typical CQRS and ES architecture. You may find it helpful. You can find it here: CQRS + Event Sourcing – A Step by Step Overview
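A minimal Python sketch of the subscribe-and-update idea described above (the event name and the in-memory read model are invented for illustration; a real system would subscribe to an event store or bus):

```python
# Sketch: a read model kept up to date by handling the events the
# aggregate emits. Event and model names are illustrative.

class ItemAdded:
    def __init__(self, order_id, item):
        self.order_id, self.item = order_id, item

class OrderItemsReadModel:
    """Denormalized, query-optimized view built purely from events."""
    def __init__(self):
        self.items_by_order = {}

    def when(self, event):
        if isinstance(event, ItemAdded):
            self.items_by_order.setdefault(event.order_id, []).append(event.item)

read_model = OrderItemsReadModel()
for event in [ItemAdded("o1", "book"), ItemAdded("o1", "pen")]:
    read_model.when(event)
print(read_model.items_by_order["o1"])  # ['book', 'pen']
```

Nothing leaks from the aggregate: the read model only ever sees published events, never internal state.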
Event Sourcing does not assume anything beyond the fact that you save the state of your object as a series of events. There is not even a requirement to have an "aggregate" when doing Event Sourcing.
If you are talking about the DDD terms Aggregate and Aggregate Root, again, Event Sourcing is just a way to save the object as a stream of events instead of the last actual state. There are no additionally imposed "requirements" like "total encapsulation" and inaccessibility of the internal state. Of course aggregates (and other objects) have state.
What could be confusing is that if you also use CQRS, you can have your aggregate state not being used since all its data is transient to the read model. But this is something else and does not need to be blindly applied.
Again, Event Sourcing is just a persistence method, nothing more, and nothing less.
$order->addItems($cart->getActiveItems)
In addition to the comprehensive coverage of CQRS in the other answers, I'd like to point out that in message-driven systems, commands (messages) should be self-contained and encapsulate all the information necessary to perform the action.
In the above example, the Order aggregate receives an AddItems command with a list of item IDs as its payload. The fact that the AddItems command handler needs to fetch additional information to handle the command points to a problem: the AddItems command has an insufficient payload, so Cart and Order are semantically coupled. You would want to avoid that.
Message-passing reasoning is the key here. Here are some abstractions:
class AddItems : Command
{
    List<Guid> ItemIds { get; set; }
}
class Order
{
    void AddItems(AddItems command) { }
}

Event Sourcing organization of streams and Aggregates

What would be the best way to organize my event streams in ES? By "event stream" I mean all events for an aggregate.
Given I have a project with some data and a list of tasks.
Right now I use a Guid AggregateID as my stream ID.
So far I can
-> recreate the state for a given project with that ID
-> I can assemble a list of projects with a custom projection
The question would be how to handle todos.
Should these also be handled under the project stream ID, or should each have its own todo stream ID?
If a todo has its own separate stream, how would one link it to the owning project? How is the project aware of all the todo streams for a given project?
Meaning all changes to the todo list should also be recognized as Commands and Events (more Events) in the project.
And if I also want to allow free todos without a relation to a project, does that require a separate type and stream to handle freeTodo on top? And would the list of all todos, whether project-related or not, be a projection of all todo- and freeTodo-related streams?
So I guess the main question is how do I handle nested aggregates and how would one define the event store streams and the linking for that?
Any tips, tricks, best practices or resources will be highly appreciated.
// EDIT Update
First of all, thank you @VoiceOfUnreason for taking the time to answer this question in great detail. I added the tag DDD because I had that strange feeling that this correlates with the bounded-context question, which is rarely a black-or-white decision. Obviously the domain has more depth and detail; I simplified the example. Below I share some more details which got me questioning.
In my first attempt I defined an aggregate for the todo as well, with a property for the project ID. I defined this project property as an option type (nullable) to cover the difference between project-related and free todos. But the following use cases/business rules got me rethinking.
The system should also contain free todos, which allow the user to schedule personal tasks not related to a project (do HR training, etc.). All todos should appear either in their projects or in a complete todo list (both project-related and free).
A project can only be finished/closed if all todos are completed.
This would somehow mix information from the project aggregate with information from the todo aggregate, so there are no clear bounds here. My thoughts would be: a) I could leverage the todo read model in the project aggregate for validation; b) define some sort of list structure for todos within the project aggregate scope (if so, how?), which would handle a todo within the context of the project and define clear bounds;
c) have some sort of service which provides todo info for project validation, which somehow comes back to point a).
And it all feels really coupled =-/
It would be great if you or someone finds the time to share some more details and opinions here. Thanks a million.
Reminder: the tactical patterns in ddd are primarily an enumeration of OO best practices. If it's a bad idea in OO, it's probably a bad idea in DDD.
the main question is how do I handle nested aggregates
You redesign your model.
Nested aggregates are an indication that you've completely lost the plot; aggregate boundaries should never overlap. Overlapping boundaries are analogous to an encapsulation violation.
If a todo has its own separate stream, how would one link it to the owning project?
The most likely answer is that the Todo would have a projectId property, the value of which usually points to a project elsewhere in the system.
How is the project aware of all the todo streams for a given project?
It isn't. You can build read models that compose the history of a project and the history of the todos to produce a single read-only structure, but the project aggregate -- which is responsible for ensuring the integrity of the state within the boundary -- doesn't get to look inside the todo objects.
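For example, such a composing read model could fold events from both stream types into one view (the event names and the projectId field are hypothetical, chosen to match the question):

```python
# Sketch: compose the history of projects and todos into one read-only
# structure. ProjectCreated/TodoAdded and their fields are hypothetical.

def project_overview(events):
    view = {}
    for event in events:
        kind = event["type"]
        if kind == "ProjectCreated":
            view[event["projectId"]] = {"name": event["name"], "todos": []}
        elif kind == "TodoAdded" and event.get("projectId") in view:
            view[event["projectId"]]["todos"].append(event["title"])
    return view

events = [
    {"type": "ProjectCreated", "projectId": "p1", "name": "Website"},
    {"type": "TodoAdded", "projectId": "p1", "title": "Design mockup"},
    {"type": "TodoAdded", "projectId": None, "title": "HR training"},  # free todo
]
print(project_overview(events))
```

The linkage lives only in the todo's projectId value and in this projection; neither aggregate reaches into the other.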
Meaning all changes to the todo list should be also recognized as Commands and Events (more Events) in the project.
No, if they are separate aggregates, then the events are completely separate.
Under some circumstances, you might use the values written in an event produced by the todo as arguments in a command dispatched to the project, or vice versa, but you need to think of them as separate things having a conversation that may, or may not, ever come to agreement.
Possibilities: it might be that free standing todo items are really a different thing from the todo items associated with a project. Check with your domain experts -- they may have separate terms in the ubiquitous language, or in discussing the details you may discover that they should have different terms in the UL.
Alternatively, todos can be separate aggregates, and the business adapts to accept the fact that sometimes the state of the project and the state of the todo don't agree. Instead of trying to prevent the model from entering a state where the aggregates disagree, you detect the discrepancy and mitigate the problem as necessary.

Domain-driven design, event sourcing and evolving models

Eric Evans talks a lot about evolving models in DDD, so refactoring seems to be essential to DDD. When you have a relationally persisted state of the world, you can handle model changes with migrations that change the database schema.
How can I cope with model changes when using event sourcing? If there are incompatible changes to an aggregate that would prevent replay of events is there some sort of best practice? Or is it a just-don't?
If there are incompatible changes to an aggregate that would prevent replay of events
You have essentially two options in this scenario:
Patch the older events in such a way that they are made compatible and events can be replayed from the beginning. The benefit here is that you don't lose the history but the downside is that you have to expend some effort to patch the old events.
Take a snapshot/memento of the aggregate at the point of the schema change and "re-base" the event stream from this point onward. The benefit here is that you don't have to spend any effort (with event sourcing you most likely have a snapshot mechanism in place). The downside being that you lose the ability to replay events from before the snapshot.
As a general rule of thumb I'd say default to the second option unless you know for sure that you need to be able to go back and edit history before the schema change.
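A rough sketch of the second option in Python (the snapshot format, event shapes, and helper names are all illustrative):

```python
# Sketch: "re-base" a stream at a schema change. Replay the old events
# once with the old model, then start a new stream whose first event is
# a snapshot of the resulting state. All names are illustrative.

def rebase_stream(old_events, rebuild_state, to_snapshot_event):
    state = rebuild_state(old_events)   # replay using the *old* model
    return [to_snapshot_event(state)]   # the new stream begins here

old_events = [{"type": "Deposited", "amount": 50},
              {"type": "Deposited", "amount": 25}]
new_stream = rebase_stream(
    old_events,
    rebuild_state=lambda evs: sum(e["amount"] for e in evs),
    to_snapshot_event=lambda balance: {"type": "Snapshot", "balance": balance},
)
print(new_stream)  # [{'type': 'Snapshot', 'balance': 75}]
```

New-model events are then appended after the snapshot; history before it is no longer replayable, which is exactly the trade-off described above.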
I don't have much experience myself, but I have seen a concept called Upcasting:
Originally a concept of object-oriented programming, where "a subclass gets cast to its superclass automatically when needed", the concept of upcasting can also be applied to event sourcing. To upcast an event means to transform it from its original structure to its new structure. Unlike OOP upcasting, event upcasting cannot be done fully automatically, because the structure of the new event is unknown to the old event. Manually written upcasters have to be provided to specify how to upcast the old structure to the new structure.
You can refer to Axon's doc for more detail
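A hand-written upcaster can be as simple as a function per schema version. Here is a generic sketch (the event and field names are made up; Axon's own upcaster API differs, see its documentation):

```python
# Sketch: upcast v1 of a hypothetical CustomerRegistered event (single
# "name" field) to v2 (separate first/last name). Written by hand
# because the old event cannot know the new structure.

def upcast_customer_registered_v1_to_v2(old):
    first, _, last = old["name"].partition(" ")
    return {"version": 2, "firstName": first, "lastName": last}

def upcast(event):
    """Apply upcasters until the event is at the current version."""
    if event.get("version", 1) == 1:
        event = upcast_customer_registered_v1_to_v2(event)
    return event

print(upcast({"version": 1, "name": "Ada Lovelace"}))
# {'version': 2, 'firstName': 'Ada', 'lastName': 'Lovelace'}
```

Upcasters are typically applied on read, so the stored events stay untouched and the rest of the application only ever sees the latest schema.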
Events are just DTOs. It doesn't matter how the model changes, as long as the event itself doesn't change. If you need to change the event, you can 'upgrade' it with the required properties; the Apply method will know what to do with it. I can't come up with anything specific without knowing the details.
If the model changes so much that basically now you have 2 Aggregate Roots(AR) instead of a previous one, this means you have new different aggregates which won't be using the old events. Basically you start from the old AR, create the new ones and generate the corresponding events which will be specific to those ARs. So you don't really have a compatibility problem in this case.
Working with events is not as straightforward as 'classic' OOP and RDBMS schema, but they are more flexible if you think in business terms and treat your objects as domain concepts. Changing the model means the business concept definition or usage has changed as well, so now you're dealing with a different (new as far as the persistence is concerned) concept.

Do we really need a separate event store with Event Sourcing and CQRS patterns?

Suppose we have a situation when we need to implement some domain rules that requires examination of object history (event store). For example we have an Order object with CurrentStatus property, and we need to examine Order.CurrentStatus changes history.
Most likely you will answer that I need to move this knowledge into the domain and introduce an Order.StatusHistory property that contains a collection of status records, and that I should not query the event store. And I will agree with you.
What I question is the need of Event Store.
We write to the event store events that have business meaning (domain value); we do not record UserMovedMouse events (in most cases). And as with the OrderStatusChanged event, there is a high chance that most events from the EventStore will be needed at some point for domain logic, so we end up with a domain object that has an EventHistory property containing the collection of events.
I can see the value in a separate event store for patterns such as CQRS, where you have a single write-only event store and multiple read-only query stores, which gives you some scalability. However, the need to introduce such a thing in code is in question for me too. All decent databases support single-write-server, multiple-read-server scalability (master-slave replication). Why should I introduce such a thing at the source-code level? Why not forget about web services and message buses and write your own wrappers around sockets?
I have great respect for "old school" DDD as it was described by Eric Evans, and I see some fresh and good ideas in the new wave of DDD+CQRS+EventSourcing patterns. However, the main idea of CQRS is under big question for me. Am I missing something?
In short: if event sourcing is not needed (for its added benefits or as workarounds for some quirks), then you definitely shouldn't bring it into your system just for the sake of it.
ES is just one of many ways to augment the CQRS architectural style within a bounded context. It is not a requirement.