Event Sourcing with Side-Effects - domain-driven-design

I'm building a service using the familiar event sourcing pattern:
A request is received.
The aggregate's history is loaded.
The aggregate is rebuilt (from its history).
New events are prepared and the aggregate is updated in response to the incoming request from Step 1.
These events are written to the log, and are made available (published) to any subscribers.
In my case, Step 5 is accomplished in two parts. The events are written to the event log. A background process reads from the event log and publishes all events starting from an offset.
In some cases, I need to publish side effects in addition to events related to the aggregate. As far as the system is concerned, these are events too because they are consumed by and affect the state of other services. However, they don't affect the history of the aggregate in this service and are not needed to rebuild it.
How should I handle these in the code?
Option 1-
Don't write side-effecting events to the event log. Publish these in the main process prior to Step 5.
Option 2-
Write everything to the event log and ignore side-effecting events when the history is loaded. (These aren't part of the history!)
Option 3-
Write side-effecting events to a dummy aggregate so they are published, but never loaded.
Option 4-
?
In the first option, there may be trouble if there is a concurrency violation. If the write fails in Step 5, the side effect cannot be easily rolled back. The second option write events that are not part of the aggregate's history. When loading in Step 2, these side-effecting events would have to be ignored. The 3rd option feels like a hack.
Which of these seems right to you?

Name events correctly
Events are "things that happened". So if you are able to name the events that only trigger side effects in a "X happened" fashion, they become a natural part of the event history.
In my experience, this is always possible, because side-effects don't happen out of thin air. Sometimes the name becomes a bit artificial, but it is still better to name events that way than to call them e.g. "send email to that client event".
In terms of your list of alternatives, this would be option 2.
Example
Instead of calling an event "send status email to customer event", call it "status email triggered event". Of course, if there is a better name for the actual trigger, use that one :-)

Option 4 - Have some other service subscribe to the events and produce the side effects, and any additional events related to them.
Events should be fine-grained.

Option 1- Don't write side-effecting events to the event log. Publish
these in the main process prior to Step 5.
What if you later need this part of the history by building a new bounded context?
Option 2- Write everything to the event log and ignore side-effecting
events when the history is loaded. (These aren't part of the history!)
How to ignore the effect of something which does not have any effect? :D
Option 3- Write side-effecting events to a dummy aggregate so they are
published, but never loaded.
Why do you need consistency boundary around something which you will never change?
What you are talking about is the most common form of domain events, which you use to communicate with other BC-s. Ofc. you need to save them.

Related

How to update bunch of data with Event Sourcing

I wondering how to update bunch of data in Event Sourcing concept for any aggregate.
In traditional application I would take some data such as name, date of birth etc. and put them into existing object; as I understand, in ES concept this approach is wrong, so that should I perform different Events to update different parts of aggregate root? If so, that how to build REST API? How to handle with validation?
In traditional application I would take some data such as name, date of birth etc. and put them into existing object; as I understand, in ES concept this approach is wrong,
Short answer: that approach is fine -- what changes in event sourcing is how you keep track of the changes in your service.
A way to think of a stream of events is a sequence of patch-documents. There's nothing wrong with changing multiple fields in a single patch document, and that is fine in events as well.
This question is really too broad for SO. You should google “event sourcing basics in azure” to find detailed articles, github projects, videos, and other responses to these questions.
In general, in Event Sourcing there two main ideas you need – Messages and Events. A typical process (not the only option, but a common one) is as follows. A message is created by your UI which makes a request for a change to be made to an AR. Validation for that message is done on the message creation source.
The message is then sent to an API, where it is validated again since you can't trust all possible senders. The request is processed, resulting in changes made to an AR. An event is then created describing the changes made, and that event is placed on an event source (Azure Event Hub, Kafka, Kinesis, a DB, whatever). This list of events is kept forever and describes each and every change made to that AR throughout time, including the initial creation request. To get the current state of the AR, just add up all the events.
The key idea that is confusing when learning Event Sourcing is the two different types of “events”. Messages ask for a change to be made, Events record that a change has been made.
As already answered, the batch update approach is fine.
I suggest to focus on the event consumption code. If all you have in your ReadSide is a complete aggregate representation, then generic *_UPDATED event is ok.
But if you do have parts of you system interested only in particular part of your aggregate, you might want to update that part separately, so that system doesn't have to analyze all events and dig for particular data.
For example, some demographic analysis system is only interested in the birthdate. It would be much easier for this system to have a BURTHDATE_SET event that it would listen to, and ignore all others.
Fine grained events like this also reduces coupling, because require less knowledge of the internal event data structure.
It feels like you still have an active record way of looking at things.
You should model the things that happen to your entity as events rather than the impact of things happening.
So to my mind all of that data might be gathered in a "Person was registered" event but an "Address added" event might also exist - in which case your single command might end up appending two events to the event stream.

How are the missing events replayed?

I am trying to learn more about CQRS and Event Sourcing (Event Store).
My understanding is that a message queue/bus is not normally used in this scenario - a message bus can be used to facilitate communication between Microservices, however it is not typically used specifically for CQRS. However, the way I see it at the moment - a message bus would be very useful guaranteeing that the read model is eventually in sync hence eventual consistency e.g. when the server hosting the read model database is brought back online.
I understand that eventual consistency is often acceptable with CQRS. My question is; how does the read side know it is out of sync with the write side? For example, lets say there are 2,000,000 events created in Event Store on a typical day and 1,999,050 are also written to the read store. The remaining 950 events are not written because of a software bug somewhere or because the server hosting the read model is offline for a few secondsetc. How does eventual consistency work here? How does the application know to replay the 950 events that are missing at the end of the day or the x events that were missed because of the downtime ten minutes ago?
I have read questions on here over the last week or so, which talk about messages being replayed from event store e.g. this one: CQRS - Event replay for read side, however none talk about how this is done. Do I need to setup a scheduled task that runs once per day and replays all events that were created since the date the scheduled task last succeeded? Is there a more elegant approach?
I've used two approaches in my projects, depending on the requirements:
Synchronous, in-process Readmodels. After the events are persisted, in the same request lifetime, in the same process, the Readmodels are fed with those events. In case of a Readmodel's failure (bug or catchable error/exception) the error is logged and that Readmodel is just skipped and the next Readmodel is fed with the events and so on. Then follow the Sagas, that may generate commands that generate more events and the cycle is repeated.
I use this approach when the impact of a Readmodel's failure is acceptable by the business, when the readiness of a Readmodel's data is more important than the risk of failure. For example, they wanted the data immediately available in the UI.
The error log should be easily accessible on some admin panel so someone would look at it in case a client reports inconsistency between write/commands and read/query.
This also works if you have your Readmodels coupled to each other, i.e. one Readmodel needs data from another canonical Readmodel. Although this seems bad, it's not, it always depends. There are cases when you trade updater code/logic duplication with resilience.
Asynchronous, in-another-process readmodel updater. This is used when I use total separation of the Readmodel from the other Readmodels, when a Readmodel's failure would not bring the whole read-side down; or when a Readmodel needs another language, different from the monolith. Basically this is a microservice. When something bad happens inside a Readmodel it necessary that some authoritative higher level component is notified, i.e. an Admin is notified by email or SMS or whatever.
The Readmodel should also have a status panel, with all kinds of metrics about the events that it has processed, if there are gaps, if there are errors or warnings; it also should have a command panel where an Admin could rebuild it at any time, preferable without a system downtime.
In any approach, the Readmodels should be easily rebuildable.
How would you choose between a pull approach and a push approach? Would you use a message queue with a push (events)
I prefer the pull based approach because:
it does not use another stateful component like a message queue, another thing that must be managed, that consume resources and that can (so it will) fail
every Readmodel consumes the events at the rate it wants
every Readmodel can easily change at any moment what event types it consumes
every Readmodel can easily at any time be rebuild by requesting all the events from the beginning
there order of events is exactly the same as the source of truth because you pull from the source of truth
There are cases when I would choose a message queue:
you need the events to be available even if the Event store is not
you need competitive/paralel consumers
you don't want to track what messages you consume; as they are consumed they are removed automatically from the queue
This talk from Greg Young may help.
How does the application know to replay the 950 events that are missing at the end of the day or the x events that were missed because of the downtime ten minutes ago?
So there are two different approaches here.
One is perhaps simpler than you expect - each time you need to rebuild a read model, just start from event 0 in the stream.
Yeah, the scale on that will eventually suck, so you won't want that to be your first strategy. But notice that it does work.
For updates with not-so-embarassing scaling properties, the usual idea is that the read model tracks meta data about stream position used to construct the previous model. Thus, the query from the read model becomes "What has happened since event #1,999,050"?
In the case of event store, the call might look something like
EventStore.ReadStreamEventsForwardAsync(stream, 1999050, 100, false)
Application doesn't know it hasn't processed some events due to a bug.
First of all, I don't understand why you assume that the number of events written on the write side must equal number of events processed by read side. Some projections may subscribe to the same event and some events may have no subscriptions on the read side.
In case of a bug in projection / infrastructure that resulted in a certain projection being invalid you might need to rebuild this projection. In most cases this would be a manual intervention that would reset the checkpoint of projection to 0 (begining of time) so the projection will pick up all events from event store from scratch and reprocess all of them again.
The event store should have a global sequence number across all events starting, say, at 1.
Each projection has a position tracking where it is along the sequence number. The projections are like logical queues.
You can clear a projection's data and reset the position back to 0 and it should be rebuilt.
In your case the projection fails for some reason, like the server going offline, at position 1,999,050 but when the server starts up again it will continue from this point.

Event Sourcing Refactoring

I've been studying DDD for a while, and stumbled into design patterns like CQRS, and Event sourcing (ES). These patterns can be used to help achieving some concepts of DDD with less effort.
In the architecture exemplified below, the aggregates know how to handle the commands and events related to itself. In other words, the Event Handlers and Command Handlers are the Aggregates.
Then, I’ve started modeling one sample Domain just to understand how the implementation would follow the business logic. For this question here is my domain (It’s based on this):
I know this is a bad modeled example, but I’m using it just as an example.
So, using ES, at the end of the operation, we would save all the events (Green arrows) into the event store (if there were no Exceptions), each event into its given Event Stream (Aggregate Type + Aggregate Id):
Everything seems right until now. So If we want to Rebuild the internal state of an instance of any of this Aggregate, we only have to new it up (new()) and apply all the events saved in its respective Event Stream in the correct order.
My question is related to changes in the model. Because, software development is a process where we never stop learning about our domain, and we always come with new ideas. So, let’s analyze some change scenarios:
Change Scenario 1:
Let´s pretend that now, if the Reservation Aggregate check’s that the seat is not available, it should send an event (Seat not reserved) and this event should be handled by one new Aggregate that will store all people that got their seat not reserved:
In the hypothesis where the old system already handled the initial command (Place order) correctly, and saved all the events to its respective event streams:
When we want to Rebuild the internal state of an instance of any of this Aggregate, we only have to new it up (new()) and apply all the events saved in its respective Event Stream in the correct order. (Nothing changed). The only thing, is that the new Use case didn’t exist back in the old model.
Change Scenario 2:
Let’s pretend that now, when the payment is accepted we handle this event (Payment Accepted) in a new Aggregate (Finance Aggregate) and not in the Order Aggregate anymore. And It send a new Event (Payment Received) to the Order Aggregate. I know this scenario is not well structured, but something like this could happen.
In the hypothesis where the old system already handled the initial command (Place order) correctly, and saved all the events to its respective event streams:
When we want to Rebuild the internal state of an instance of any of this Aggregate, we have a problem when applying the events from the Aggregate Event Stream to itself:
Now, the order doesn’t know anymore how to handle Payment Accepted Event.
Problems
So as the examples showed, whenever a system change reflects in an event being handled by a different event handler (Aggregate), there are some major problems. Because, we cannot rebuild the internal state anymore.
So, this problem can have some solutions:
Possible Solution
When an event is not handled by the aggregate in which Event Stream it is stored, we can find the new handler and create a new instance and send the event to it. But to maintain the internal state correct, we need the last event (Payment Received) to be handled by the Order Aggregate. So, we let it dispatch the event (and possible commands):
This solution can have some problems. Let’s imagine that a new command (Place Order) arrives and it has to create this order instance and save the new state. Now we would have:
In gray are the events that were already saved in the last call when the system hadn’t already gone through model changes.
We can see that a new Event Stream is created for the new aggregate (Finance W). And we can see that Event Streams are append-only, so the Payment Accepted event in the Order Y Event Stream is still there.
The first Payment Accepted event in Finance W Event Stream is the one that was supposed to be handled by the Order but had to find a new handler.
The Yellow payment received event in Order’s Event Stream is the event that was generated by the new handler of the Payment Accepted when the Payment Accepted event from the Order’s Event Stream was handled by the Finance.
All the other Green Events are new events that were generated by handling the Place Order Command in the new model.
Problem With the Solution
The next time the aggregate needs to be rebuild, there will be a Payment Accepted event in the stream (because it is append-only), and it will again call the new handler, but this have already been done and the Payment Received event have already been saved to the stream. So, it is not necessary to go through this again, we could ignore this event and continue.
Question
So, my question is how can we handle with model changes that impact who handle each event? How can we rebuild the internal state of an Aggregate after a change like this?
Will we need to build some event Stream migration that changes the events from one stream to the new schema (one or more streams)? Just like we would need in a Relational database?
Will we never be allowed to remove one handler, so we can only add new handlers? This would lead to unmanageable system…
You got almost all right, except one thing: Aggregates should not handle events from other Aggregates. It's like a non-event-sourced Aggregate shares a table with another Aggregate: they should not.
In event-driven DDD, Aggregates are the system's building blocks that receive Commands (things that express the intent) and return Events (things that had happened). For every Command type must exist one and only one Aggregate type that handle it. Before executing a Command, the Aggregate is fed with all its own previously emitted Events, that is, every Event that was emitted in the past by this Aggregate instance is applied to this Aggregate instance, in the chronological order.
So, if you want to correctly model your system, you are not allowed to send events from one Aggregate as events to another Aggregate (a different type or instance).
If you need to model business processes that involve multiple Aggregates, the correct way of doing it is by using a Saga/Process manager. This is a different component. It is the opposite of an Aggregate.
It receive Events emitted by Aggregates and sends Commands to other Aggregates.
In simplest cases, a Saga manager simply takes properties from one Event and creates+populates a Command with those properties. Then it sends the Command to the destination Aggregate.
In more complicated cases, the Saga waits for multiple Events and when all are received only then it creates and sends a Command.
The Saga may also deduplicate or reorder events.
In your case, a Saga could be Sale, whose purpose would be to coordinate the entire sales process, from ordering to product dispatching.
In conclusion, you have that problem because you have not modeled correctly your system. If your Aggregates would have handled only their specific Commands (and not somebody else's Events) then even if you must create a new Saga when a new Business process emerges, it would send the same Command to the Same Aggregate.
Answering briefly
my question is how can we handle with model changes that impact who handle each event?
Handling events is generally an easy thing to change, because the handling part is ephemeral. Events have a single writer, but they can have many readers. You just need to arrange for the plumbing to notify each subscriber of the event.
So in scenario #1, its the PaymentAggregate that writes down the PaymentAccepted event (in its own stream), and then your plumbing notifies the OrderAggregate that the PaymentAccepted event happened, and it does the next thing in its own logic.
To change to scenario #2, we'd leave the Payment Aggregate unchanged, but we'd arrange the plumbing so that it tells the FinanceAggregate about PaymentAccepted, and that it tells the OrderAggregate about PaymentReceived.
Your pictures make it hard to see this; I think you aren't being careful to track that each change of state is stored in the stream of the aggregate that changed. Not your fault - the Microsoft picture is really awful.
In other words, your arrow #3 "Seats Reserved" isn't a SeatsReserved event, it's a Handle(SeatsReserved) command.

Loading events from inside event handlers?

We have an event sourced system using GetEventStore where the command-side and denormalisers are running in two separate processes.
I have an event handler which sends emails as the result of a user saving an application (an ApplicationSaved event), and I need to change this so that the email is sent only once for a given application.
I can see a few ways of doing this but I'm not really sure which is the right way to proceed.
1) I could look in the read store to see if theres a matching application however there's no guarantee that the data will be there when my email handler is processing the event.
2) I could attach something to my ApplicationSaved event, maybe Revision which gets incremented on each subsequent save. I then only send the email if Revision is 1.
3) In my event handler I could load in the events from my event store for the matching customer using a repository, and kind of build up an aggregate separate from the one in my domain. It could contain a list of applications with which I can use to make my decision.
My thoughts:
1) This seems a no-go as the data may or may not be in the read store
2) If the data can be derived from a stream of events then it doesn't need to be on the event itself.
3) I'm leaning towards this, but there's meant to be a clear separation between read and write sides which this feels like it violates. Is this permitted?
I can see a few ways of doing this but I'm not really sure which is the right way to proceed.
There's no perfect answer - in most cases, externally observable side effects are independent of your book of record; you're always likely to have some failure mode where an email is sent but the system doesn't know, or where the system records that an email was sent but there was actually a failure.
For a pretty good answer: you're normally going to start with a facility that sends and email and reports as an event that the email was sent successfully, or not. That's fundamentally an event stream - your model doesn't get to veto whether or not the email was sent.
With that piece in place, you effectively have a query to run, which asks "what emails do I need to send now?" You fold the ApplicationSaved events with the EmailSent events, compute from that what new work needs to be done.
Rinat Abdullin, writing Evolving Business Processes a la Lokad, suggested using a human operator to drive the process. Imagine building a screen, that shows what emails need to be sent, and then having buttons where the human says to actually do "it", and the work of sending an email happens when the human clicks the button.
What the human is looking at is a view, or projection, which is to say a read model of the state of the system computed from the recorded events. The button click sends a message to the "write model" (the button clicked event tells the system to try to send the email and write down the outcome).
When all of the information you need to act is included in the representation of the event you are reacting to, it is normal to think in terms of "pushing" data to the subscribers. But when the subscriber needs information about prior state, a "pull" based approach is often easier to reason about. The delivered event signals the project to wake up (reducing latency).
Greg Young covers push vs pull in some detail in his Polyglot Data talk.

CQRS - When to send confirmation message?

Example: Business rules states that the customer should get a confirmation message (email or similar) when an order has been placed.
Lets say that a NewOrderRegisteredEvent is dispatched from the domain and is picked up by an event listener that sends of the confirmation message. When that is done some other event handler throws an exception or something else goes wrong and the unit of work is rolled back. We've now sent the user a confirmation message for something that was rolled back.
What is the "cqrs" way of solving problems like this where you want to do something after a unit of work has been committed? Another complicating factor is replaying of events. I don't want old confirmation messages to be re-sent whenever I replay recorded events in order to build a new view / projection.
My best theory so far: I've just started to look into the fascinating world of cqrs and was wondering whether this is something that would be implemented as a saga? If a saga is like a state machine where each transition only can take place a single time then I guess that would solve this problem? I just have a hard time visualizing how this will fit together with the command bus and domain events..
An Event should only occur after the transaction has been completed. If anything goes wrong and there's a rollback, then the event didn't occur from an external point of view. Therefore it shouldn't be published at all. Though an OrderRegistrationFailed event could be published if necessary.
You wouldn't want the mail to be sent unless the command has sucessfully been executed.
First a few reasons why the command handler -- as proposed in another answer -- would be the wrong place: Under some circumstances the command handler wouldn't be able to tell if the command will eventually succeed or not. Having the command handler invoke the mail sending would also put process knowledge inside the command handler, which would break the SRM and too tightly couple business rules with the application layer.
The mail should be sent after the fact, i.e. from an event handler.
To prevent this handler from firing during replay, you can just not register it. This works similar to how you test your application. You only register the handlers that you actually need.
Production system -> register all event handlers
Tests -> register only the tested event handlers
Replay -> register only the projection/denormalization handlers
Another - even more loosely coupled, though a bit more complex - possibility would be to have a Saga handle the NewOrderRegisteredEvent and issue a SendMail command to the appropriate bounded context (thanks, Yves Reynhout, for pointing this out in the question's comments).
There are two likely solutions
1) The publishing of the event and the handling of the event (i.e. the email) are part of a single transaction. In this case, your transaction framework takes care of it for you. If the email fails, then the event is rolled back. You'll likely retry the command. This is conceptually clean and easy to think about. No event is finished publishing until everyone that has something to say about it has had their say. However practically speaking, this can be painful, as it typically involves distributed transactions. These are hard to come by. Can your email client enroll in the same transaction as the database which is holding your events?
2) The publishing of the event is transactional, but the event handlers each deal with transactions in their own way. The event handler which sends emails could keep track of which events it had seen. If it crashed, it would request old events and process them. You could make a business decision as to how big a deal it would be if people had missing or duplicate emails. (For money-related transactions, the answer is probably you shouldn't allow it.)
Solution (2) is typically what you see promoted in DDD/CQRS circles as it's the more loosely coupled solution. Solution (1) is quite practical in a small system where the event store and the projections are in a single database and the projections don't change often. Solution (2) allows a diversity of event handlers to work in their own way. Solution (1) can cause lots of non-overlapping concerns to become entagled. In this case your order business rules don't complete until the many bizarre things that happen in emailing are taken care of. For one thing, it may slow you down quite a bit.
If the sending of the email were more interesting than "saw the event, sent the email", then you're right, you might have a saga or workflow on your hands. Email in large operations is often a complex system in its own right which you're unlikely to have to implement much of. You just need to be sure you put your email into a request queue of some sort (using approach (2)), and the email system is likely to do retries/batching/spam avoidance/working overnight/etc.

Resources