DDD - Using a Process Manager or a Domain Service - domain-driven-design

I am new to DDD and am implementing it on part of my application, because some of the application's requirements lead me to CQRS with Event Sourcing (we need a history of the events that occurred in the system, plus the ability to view the state of the system in the past).
One question I have after reading Vaughn Vernon's book and his Effective Aggregate Design series is: what is the difference between a Process Manager (long-running process) and a Domain Service, especially when one Aggregate holds navigation properties towards another Aggregate?
Here is what I have understood:
- Domain Services hold logic that does not belong in any Aggregate. According to Vaughn, they can also be used to pass entity references to the aggregate that contains them. They may also be used to manage transactions, since transactions cannot be handled inside a Domain Object.
- Process Managers orchestrate modifications made on a system that span several aggregates. Some people say that a good Process Manager is actually an Aggregate Root. From my understanding it does not manage transactions, as events are dispatched after changes are committed; it relies on eventual consistency: eventually, all the changes will have occurred.
Now, to put everything in context: the core of the application I am building handles a tree of Nodes, each containing its own logic. We need to be able to add Nodes to the tree and, of course, to create those Nodes.
We need to know what happened to those Nodes, i.e. we need to be able to retrieve the events linked to a node.
Also, a modification made to one of the leaves (depending on the kind of modification) has to be replicated to the other Nodes that are ancestors of that node.
Here are my aggregates:
- Nodes: this is what my tree contains. In my opinion this is an aggregate, for several reasons: a Node is not an invariant, therefore not a Value Object; Nodes have their own domain logic that assigns their properties and value objects; and we need to be able to access them by Id.
- A representation of a non-binary tree made of Nodes. Right now I have designed this as my Aggregate Root, and it is actually a Process Manager. It contains a logical representation of the tree, starting from its root. This root is an object (I am not sure it can be called a Value Object because it holds references towards other aggregates, the child Nodes, but it certainly sounds like one). The Node object in the Tree contains basic information, like the Node name, and a reference towards the actual Aggregate (this almost sounds like two bounded contexts?).
Using that approach, this is what happens:
- After executing the command to create a Node, the Node is created and committed. The NodeCreated event is dispatched and caught by the corresponding handler, which retrieves the Tree (the process manager) associated to this node and adds the node in the correct place (using the parent id property of the Node).
- After executing the command to modify a Node, the node is modified and committed. The NodeModified event is dispatched and caught by the handler. The handler then retrieves the Tree (my process manager), finds all the parent Nodes of the modified Node, and asks those Nodes to modify their own properties based on what was modified on the child Node. This all makes perfect sense, and looks almost beautiful to me, showing the power of events and the separation of domain logic.
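The two handler flows above can be sketched minimally. This is an illustrative sketch only: the class and event names, and the tree-as-parent-links representation, are assumptions, not taken from the question:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class NodeCreated:
    node_id: str
    parent_id: Optional[str]

@dataclass
class Tree:
    # parent links and child lists; a minimal stand-in for the Tree's state
    parents: dict = field(default_factory=dict)
    children: dict = field(default_factory=dict)

    def add_node(self, node_id, parent_id):
        self.parents[node_id] = parent_id
        if parent_id is not None:
            self.children.setdefault(parent_id, []).append(node_id)

    def ancestors(self, node_id):
        # walk parent links up to the root; these are the nodes a
        # NodeModified handler would ask to update themselves
        result = []
        current = self.parents.get(node_id)
        while current is not None:
            result.append(current)
            current = self.parents.get(current)
        return result

def on_node_created(tree, event):
    # the handler places the new node using the parent id carried by the event
    tree.add_node(event.node_id, event.parent_id)
```

A NodeModified handler would call `ancestors()` and issue an update for each returned node; whether that is one transaction or many is exactly the question asked below.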
But my principal issue here is with transactions. What happens if an error occurs while updating the Tree and the node that has to be modified or added? The event for the Node is already saved in the Event Store, because it was committed. So would I have to create a new event to revert the modifications? I know that commands have to be valid when entering the system, so it would not be a validation issue, and the chances of something like this happening are one in a million. Does that mean we should not take that possibility into account?
The transaction issue is why I feel like I should use a Service, either an Application Service (here a command handler) or a Domain Service, to orchestrate the modifications and perform them in a single transaction. If something fails during this transaction, nothing is created or modified, but that breaks the DDD rule saying that I should not modify several Aggregates in the same transaction. This somehow looks like a less elegant solution.
I really feel like I am missing something here but I am not quite sure what it is.

Some people are saying that a good Process Manager is actually an Aggregate Root
From my point of view this is not correct. A Process Manager or Saga coordinates a long-running business process that spans multiple Aggregate instances. It eventually brings the system into a valid final state. It does not emit events but responds to events and creates commands that arrive at the Aggregates (possibly through a command handler, depending on your exact architecture). The architects who say that have failed to correctly identify their Aggregate boundaries.
A Process Manager/Saga could be stateful, but only to remember the progress that it has made; it can have a Process ID; it can even be event-sourced.
Process Managers orchestrate modifications that are made on a system and span different aggregates.
Yes, this is correct.
After executing the command to modify a Node, the node is modified and committed.
When you design your Aggregates you must take into consideration only the protection of invariants, the business rules that exist on the write/command side of the architecture; this is the side that produces the state transitions, i.e. the emitting of the events in event-driven architectures.
The single business rule, if any, that I have identified in your specific case is that when a node is created (this seems like a CRUD operation!) the NodeCreated event is emitted; similarly for NodeModified. So, these operations exist on the write/command side.
The NodeModified event is dispatched and caught by the handler. The handler then retrieves the Tree (my process manager), finds all the parent Nodes of the modified Node, and asks those Nodes to modify their own properties based on what was modified on the child Node.
Are there any business rules on the write side regarding the updating of the parent nodes? I don't see any. Of course, something is updated after a Node is created, but it is not an Aggregate; it is a read model. The handler that is called is in fact a read model: it projects the NodeXXX events onto a tree of Nodes.
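To make the distinction concrete, a projection of this kind might look roughly like the sketch below. The event shapes and `when_*` dispatch convention are assumptions for illustration, not something from the question:

```python
class TreeReadModel:
    """A projection, not an aggregate: it folds NodeXXX events into a
    navigable tree. A projection enforces no business rules; it only records."""
    def __init__(self):
        self.nodes = {}      # node_id -> denormalized fields
        self.children = {}   # parent_id -> [node_id, ...]

    def apply(self, event):
        # dispatch on the event's type name; a real projector would be
        # subscribed to the event store's stream
        handler = getattr(self, "when_" + event["type"], None)
        if handler is not None:
            handler(event)

    def when_NodeCreated(self, e):
        self.nodes[e["node_id"]] = {"name": e["name"], "parent": e["parent_id"]}
        self.children.setdefault(e["parent_id"], []).append(e["node_id"])

    def when_NodeModified(self, e):
        self.nodes[e["node_id"]].update(e["changes"])
```

Because the tree is just a fold over the committed events, it can always be dropped and rebuilt, which dissolves the transaction worry for this part of the model.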

I really feel like I am missing something here but I am not quite sure what it is.
You may have over-complicated your domain model.
Domain Services are typically service providers that give the domain model access to (cached) state or capabilities it wouldn't normally have. For instance, we might use a domain service to give the model access to a cached tax table, so that it can compute the tax on an order; or we might use a domain service to give the model access to a notifyCustomer capability that is delegated to the email infrastructure.
Process Managers are usually used for orchestration - they are basically state machines that look at what has happened (events) and suggest additional commands to run. See Rinat Abdullin's description.
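A process manager in that sense can be sketched as a small state machine: events in, commands out. Everything below (the shipping scenario, the event and command names) is a made-up example of the pattern, not anything from the question:

```python
class ShipmentProcess:
    """Toy process manager: it reacts to events, remembers minimal progress,
    and suggests the next commands to run."""
    def __init__(self, order_id):
        self.order_id = order_id
        self.paid = False
        self.packed = False
        self.shipped = False

    def handle(self, event):
        # fold the event into local state ...
        if event["type"] == "OrderPaid":
            self.paid = True
        elif event["type"] == "OrderPacked":
            self.packed = True
        # ... then decide which commands, if any, to issue next
        if self.paid and self.packed and not self.shipped:
            self.shipped = True
            return [{"type": "ShipOrder", "order_id": self.order_id}]
        return []
```

Note that `handle` never mutates any aggregate itself; it only returns commands, which the surrounding infrastructure dispatches.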
What happens if an error occurs while updating the Tree and the node that has to be modified or added? The event for the Node is already saved in the Event Store, because it was committed. So would I have to create a new event to revert the modifications?
Maybe - compensating events are a common pattern.
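The essence of the pattern is that the original event stays in the store and a new event records the reversal. A minimal sketch, with an invented event shape:

```python
def compensating_event(store, failed_event):
    """Append a compensating event rather than rewriting history.
    The store here is just a list; the field names are made up."""
    compensation = {
        "type": failed_event["type"] + "Reverted",
        "node_id": failed_event["node_id"],
        "reverts": failed_event["event_id"],
    }
    store.append(compensation)   # the original event is never touched
    return compensation
```

Projections then apply both events in order, so the read side ends up as if the failed change had been undone, while the history still shows that it happened.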
The main point is this: there is no magic in orchestrating a change across multiple transactions. Think about how you would arrange a UI that displays to a human operator what's going on, what should happen next, and what the failure modes would be.
the chances of something like this happening are one in a million. Does that mean we should not take that possibility into account?
Depends on the risk to the business. But as Greg Young points out in his talk Stop Over Engineering, if you can just escalate that 1 in a million problem to the human beings to sort out, you may have done enough.

Related

Migrate legacy database to cqrs/event sourcing view

We have an old legacy application with complex business logic which we need to rewrite. We are considering using CQRS and event sourcing, but it's not clear how to migrate data from the old database. Probably we need to migrate it to the read database only, as we can't reproduce all the events to populate the event store. But do we at least need to create some initial records in the event store for each aggregate, like AggregateCreated? Or do we need to write scripts and use all the commands one by one to recreate the aggregates in the same way we normally would with event sourcing?
Using the existing database, or a transformed version of it, as the start of your read-side persistence is never a good idea. Your event-sourced system needs to have its own start, so that you get one of the main benefits of event sourcing: being able to create projections on demand, using polyglot persistence.
Using commands for migration is also not a good idea, for the simple reason that commands, by definition, can fail due to pre- or post-condition checks or invariant control. It also does not convey the meaning of migration, which is to represent the current system state as it is right now. Remember that the current system state is not something you can accept or deny; it is given to you, and your job is to capture it.
The best practice for such a migration is to emit so-called migration events, like EntityXMigratedFromLegacy. Of course, the work might be substantial, mainly because the legacy system model will most probably not match the new model; otherwise the reason for such a migration wouldn't be entirely clear.
By using migration events you explicitly state the fact that a piece of state was moved from another place, as-is. You will always know how the migrated entity started its lifecycle in the new system - either by being migrated from legacy or by being initialised in the new system.
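A migration script following this advice reduces to a mapping from legacy rows to explicit migration events. The row schema and event name below are invented for the sketch:

```python
def migration_events(legacy_rows):
    """One explicit migration event per legacy row; the legacy state is
    captured as-is, since it is given, not validated."""
    events = []
    for row in legacy_rows:
        events.append({
            "type": "EntityMigratedFromLegacy",
            "entity_id": row["id"],
            # everything except the id becomes the migrated state payload
            "state": {k: v for k, v in row.items() if k != "id"},
        })
    return events
```

Each resulting event would be appended as the first event of the corresponding aggregate's stream, after which normal domain events follow.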
Probably we need to migrate it to the read database only
No: your read model database can be dropped and recreated at any time from the write side; only the write side is your source of truth.
But do we at least need to create some initial records in the event store for each aggregate, like AggregateCreated?
Of course, and having ONLY the initial event might not be enough. If your current OrderAggregate has reservations, you must create an ItemReservedEvent for each reservation it has.
Or do we need to write scripts and use all the commands one by one to recreate the aggregates in the same way we normally would with event sourcing?
Feels like that's the way you should go: read each old aggregate/entity from the database and try to map it to a new one.

DDD, CQRS/ES & MicroServices Should Decisions be taken on Microservice's views or aggregates?

So I'll explain the problem through the use of an example as it makes everything more concrete and hopefully will reduce ambiguity.
The Architecture is pretty simple
1 MicroService <=> 1 Aggregate <=> Transactional Boundary
Each microservice will be using the CQRS/ES design pattern, which implies:
Each microservice will have its own Aggregate mapping the domain of a real-world problem
The state of the aggregate will be rebuilt from an event store
Each event will signify a state change within the aggregate and will be transmitted to any service interested in the change via a message broker
Each microservice will be transactional within its own domain
Each microservice will be eventually consistent with other domains
Each microservice will build its own view models from events emitted by other microservices
So, for the example, let's say we have a banking system:
current-account microservice is responsible for mapping the Customer Current Account ... Withdrawal, Deposits
rewards microservice will be responsible for inventory and stock take of any rewards being served by the bank
air-miles microservice will be responsible for monitoring all the transactions coming from the current-account and, in doing so, awarding the Customer with rewards from our rewards microservice
So the problem is this: should the air-miles microservice make decisions based on its own view model, which is being updated from events coming from current-account, and similarly on picking which reward it should give out to the Customer?
Drawbacks of making decisions on local view models:
- Replicating domain logic on how to maintain these views
- Bugs within the view might cause the wrong rewards to be given out
- State changes (i.e. events emitted) based on corrupted view models could have consequences in other services that are making their own decisions on these events
Advantages of making decisions on local view models:
- The system doesn't need to constantly query the microservice owning the domain
- The system should be faster and less resource-intensive
Or should it use the events coming from the service to trigger queries to the Aggregate owning the domain, accepting the fact that view models might get corrupted, but ensuring the final decision is always checked against the aggregate owning the domain?
Please note that the above problem is simply my understanding of the architecture, and the aim of this post is to get different views on how one might use this architecture effectively in a microservice environment, to keep each service decoupled yet avoid cascading corruption scenarios without too much chatter between the services.
So the problem is this: should the air-miles microservice make decisions based on its own view model, which is being updated from events coming from current-account, and similarly on picking which reward it should give out to the Customer?
Yes. In fact, you should revise your architecture and even create more microservices. What I mean is that, in an event-driven architecture (also an event-sourced one), your microservices have two responsibilities: they need to keep two different models, the write model and the read model.
So, for each Aggregate there should be a microservice that keeps only the write model, that is, one that only processes commands, without also building a read model.
Then, for each read/query use case you should have a microservice that builds the perfect read model. This is required if you want to keep the Aggregate microservice clean (as you should), because in general the read models need data from multiple Aggregate types/bounded contexts. Read models may cross bounded-context boundaries; Aggregates may not. So you see, you don't really have a choice if you need to fully respect DDD.
Some say that domain events should be hidden, local only to the owning microservice. I disagree. In an event-driven architecture the domain events are first-class citizens: they are allowed to reach other microservices. This gives the other microservices the chance to build their own interpretation of the system state. Otherwise, the emitting microservice would have the impossible additional responsibility of building a state that matches every possible need that all the other microservices would ever have(!). For example, a microservice might want to look up a deleted remote entity's title; how could it do that if the emitting microservice keeps only the list of not-yet-deleted entities? You may say: but then it will keep all the entities, deleted or not. But maybe someone needs the date an entity was deleted; you may say: but then I keep the deletedDate as well. You see what you are doing? You break the Open/Closed Principle: every time you create a microservice, you need to modify the emitting microservice.
There is also the resilience of the microservices. In The Art of Scalability, the authors speak about swimming lanes. They are a strategy to separate the components of a system into lanes of failure: a failure in a lane does not propagate to other lanes. Our microservices are lanes. Components in a lane are not allowed to access any component from another lane. One downed microservice should not bring the others down. It's not a matter of speed/optimisation, it's a matter of resilience. Domain events are the perfect modality for keeping two remote systems synchronized. They also emphasize the fact that the data is eventually consistent; the events travel at a limited speed (from nanoseconds to even days). When a system is designed with that in mind, no other microservice can bring it down.
Yes, there will be some code duplication. And yes, although I said that you don't have a choice, you do. In order to reduce code duplication, at the cost of lower resilience, you can have some canonical read models that build a normal flat state which other microservices can query. This is dangerous in most cases, as it breaks the swimming-lanes concept: should a canonical microservice go down, all dependent microservices go down with it. Canonical microservices work best for CRUD-like bounded contexts.
There are however valid cases when you may have some internal events that you don't want to expose. In other words, you are not required to publish all domain events.
So the problem is this: should the air-miles microservice make decisions based on its own view model, which is being updated from events coming from current-account, and similarly on picking which reward it should give out to the Customer?
Each consumer uses a local replica of a representation computed by the producer.
So if air-miles needs information from current-account it should be looking at a local replica of a view calculated by the current-account service.
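The "local replica of a producer-computed representation" idea can be sketched as follows. The message shape (`account_id`, `version`, plus whatever the producer publishes) is an assumption for illustration:

```python
class CurrentAccountReplica:
    """air-miles' local copy of representations published by current-account.
    The consumer never computes the view; it only caches what the producer sends."""
    def __init__(self):
        self.views = {}   # account_id -> latest representation seen

    def on_message(self, message):
        # keep only the newest version; stale or out-of-order copies are ignored
        current = self.views.get(message["account_id"])
        if current is None or message["version"] > current["version"]:
            self.views[message["account_id"]] = message
```

Because the replica stores the producer's representation verbatim, the producer can switch its internal implementation (snapshots, event sourcing, anything) without the consumer noticing, which is the isolation argument made below.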
The key idea is this: microservices are supposed to be isolated from one another; you should be able to redesign and deploy one without impacting the others.
So try this thought experiment: suppose we had these three microservices, all saving snapshots of current state rather than events. Everything works; then imagine that the current-account maintainer discovers that an event-sourced implementation would better serve the business.
Should the change to the current-account require a matching change in the air-miles service? If so, can we really claim that these services are isolated from one another?
Advantages of making decisions on local view models
I don't particularly like these "advantages". First, they are dominated by the performance axis (please recall that the second rule of performance optimization is "not yet"). And second, they assume that the service boundaries are correctly drawn; maybe the performance issue is evidence that the separation of responsibilities needs review.

DDD: Applying Event Store in a legacy system

Our current system is a legacy system which doesn't use domain events. We are going to start publishing domain events.
Other bounded contexts are going to listen to these domain events, but only from the time we start publishing, losing all the past information.
Then, how to deal with this legacy system which didn't record these events, but somehow we want to have a past history before the implementation of this event store system?
Is it a good approach to try to figure out what happened and create the domain events (reverse engineering) from the data we have in our DB?
I wouldn't go down the route of trying to reverse engineer events for a legacy system, unless there is a business reason to do so - is your use case just that you want to fit into the new way you'll be modelling things using events? If there's no business case for it, it sounds like a waste of effort.
How about having a single starting event that represents the current state of each of your 'things' (i.e. Aggregates if you're using DDD concepts) as they exist now in the legacy system? Then add new events on top of this.
I.e.
LegacySystemStateCaptured
NewDomainEvent
AnotherNewDomainEvent
...then when you rebuild your state, apply the LegacySystemStateCaptured event as well as the others.
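Rebuilding from such a stream is then an ordinary fold whose first step may be the legacy capture. In this sketch, `NameChanged` stands in for whatever new domain events follow; the event shapes are invented:

```python
def rebuild(events):
    """Fold a stream whose first event may be the captured legacy state."""
    state = {}
    for e in events:
        if e["type"] == "LegacySystemStateCaptured":
            state = dict(e["state"])        # start from the legacy snapshot
        elif e["type"] == "NameChanged":
            state["name"] = e["name"]       # then apply new domain events on top
    return state
```

The legacy capture behaves just like any other event during replay, so no special-case loading path is needed.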

How to handle domain model updates and immutability of stored events?

I understand that events in event sourcing should never be allowed to change. But what about the in-memory state? If the domain model needs to be updated in some way, shouldn't old events still be replayed into the old models? I mean, shouldn't it be possible to always replay events and get the exact same state as before, or is it acceptable if this state evolves too, as long as the stored events remain the same? Ideally I think I'd like to be able to get a state as it was, with its old models, rules and what not. But other than that, I of course also want to replay old events into new models. What does the theory say about this?
Anticipate event structure changes
Your event application mechanism (i.e. where you read events and apply them to the model) should always account for the fact that an event may have had a different structure in the past. After all, the earlier structure of an event was a valid structure at that time.
This means that you need to be prepared for this situation. Design the event application mechanism flexible enough so that you can support this case.
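One common way to build in that flexibility is an upcasting step that upgrades older stored event versions to the current shape before the model applies them. The v1-to-v2 change below (a single `fullname` field split into two) is a made-up example:

```python
def upcast(event):
    """Upgrade older event versions to the current shape before applying them."""
    if event.get("version", 1) == 1:
        # v1 stored a single 'fullname'; v2 splits it into first/last
        first, _, last = event["fullname"].partition(" ")
        return {"type": event["type"], "version": 2, "first": first, "last": last}
    return event   # already current; pass through unchanged
```

The model then only ever sees the latest shape, while the stored events remain untouched.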
Migrating stored events
Only as a very last resort should you migrate the stored events. If you do it, make sure you understand the consequences:
Which other systems consumed the legacy events?
Do we have a problem with them if we change a stored event?
Does the migration work for our system (verify in a QA environment with a full data set)?

Why can't sagas query the read side?

In a CQRS Domain Driven Design system, the FAQ says that a saga should not query the read side (http://cqrs.nu). However, a saga listens to events in order to execute commands, and because it executes commands, it is essentially a "client", so why can't a saga query the read models?
Sagas should not query the read side (projections) for information it needs to fulfill its task. The reason is that you cannot be sure that the read side is up to date. In an eventual consistent system, you do not know when the projection will be updated so you cannot rely on its state.
That does not mean that sagas should not hold state. Sagas do in many cases need to keep track of state, but then the saga should be responsible for creating that state. As I see it, this can be done in two ways.
It can build up its state by reading the events from the event store. When it receives an event that it should trigger on, it reads all the events it needs from the store and builds up its state in a similar manner to an aggregate. This can be made performant in Event Store by creating new streams.
The other way is for it to continuously listen to events from the event store, build up state, and store it in some data storage, like projections do. Just be careful with this approach: you cannot replay sagas the same way you replay projections. If you need to change the way you store state and want to rebuild it, make sure that you do not re-execute the commands that you have already executed.
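That replay caveat can be made mechanical by deriving a deterministic command id from each source event and persisting the set of ids already dispatched. A minimal sketch, with invented field names:

```python
class Saga:
    """Rebuilds its state from events but never re-dispatches a command it
    has already sent; the deterministic id makes replays side-effect free."""
    def __init__(self, dispatched_ids=()):
        # this set would be persisted alongside the saga's state
        self.dispatched = set(dispatched_ids)

    def handle(self, event, dispatch):
        command_id = event["event_id"] + ":reaction"   # one command per source event
        if command_id in self.dispatched:
            return                                     # replay: update state only
        self.dispatched.add(command_id)
        dispatch({"command_id": command_id, "type": "DoSomething"})
```

On a rebuild, the saga is constructed with the persisted ids, so replayed events refresh its state without firing the commands again.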
Sagas use the command model to update the state of the system. The command model contains business rules and is able to ensure that changes are valid within a given domain. To do that, the command model has all the information available that it needs.
The read model, on the other hand, has an entirely different purpose: It structures data so that it is suitable to provide information, e.g. to display on a web page.
Since the saga has all the information it needs through the command model, it doesn't need the read model. Worse, using the read model from a saga would introduce additional coupling and increase the overall complexity of the system considerably.
This does not mean that you absolutely cannot use the read model. But if you do, be sure you understand the consequences. For me, that bar is quite high, and so far I have always found a different solution.
It's primarily about separation of concerns. Process managers (sagas) are state machines responsible for coordinating activities. If the process manager wants to effect change, it dispatches commands (asynchronously).
Also: what is the read model? It's a projection of a bunch of events that already happened. So if the processor cared about those events... shouldn't it have been subscribing to them all along? So there's a modeling smell here.
Possible issues:
The process manager should have been listening to earlier messages in the stream, so that it would be in the right state when this message arrived.
The current event should be richer (so that the data the process manager "needs" is already present).
... variation - the command handler should instead be listening for a different event, and THAT one should be richer.
The query that you want should really be a command to an aggregate that already knows the answer
and failing all else
Send a command to a service, which runs the query and dispatches events in response. This sounds weird, but it's already common practice to have a process manager dispatch a message to a scheduling service, to be "woken up" when some fixed amount of time passes.
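A toy version of that "send a command, receive an event later" service might look like this; the names and the timer-free `tick` driver are invented for the sketch:

```python
class SchedulerService:
    """Toy 'wake me later' service: it accepts a command now and publishes
    an event later, instead of being queried synchronously."""
    def __init__(self):
        self.pending = []

    def handle(self, command):
        self.pending.append(command)   # remember who asked to be woken up

    def tick(self, publish):
        # driven by a timer in a real system; here we flush everything at once
        for cmd in self.pending:
            publish({"type": "TimeoutElapsed", "process_id": cmd["process_id"]})
        self.pending.clear()
```

The process manager stays purely event-driven: it sends the command, forgets about it, and later reacts to the published event like any other.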
