In our organization, we are implementing the API-led connectivity pattern together with an event-driven architecture. Basically, we want source systems to publish events that subscribers can react to, mostly by updating their own system.
There is one major issue with our design for which we currently have no solution, so we'd like to ask for your help!
Our problem is that events will trigger an update loop. The picture below describes, in an extremely simplified way, what we want and what will actually happen:
An update in System A will update System B, which will create an event and update System A, which will create an event and update System B, and so on.
How can we stop this loop, considering the following:
We cannot prevent System B from sending events. Due to the complexity of many source systems, we have no influence on this. So System B will send events.
We cannot ignore events from System B to System A, because there could be data in such an event that really does need to be updated in System A
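For illustration, here is a minimal sketch (all names hypothetical, Python used as pseudocode) of one common mitigation: rather than ignoring B's events wholesale, System A only publishes a new change event when an inbound update actually changes its data, so the A -> B -> A echo is absorbed after one round trip.

```python
# Hypothetical sketch: System A only publishes a change event when an
# inbound update actually changes its data, so the A -> B -> A echo
# dies out after one round trip instead of looping forever.

def apply_update(record: dict, incoming: dict, publish) -> None:
    changed = {k: v for k, v in incoming.items() if record.get(k) != v}
    if not changed:
        return  # the event carried nothing new (likely our own echo): no new event
    record.update(changed)
    publish({"type": "RECORD_UPDATED", "changes": changed})

# System A receives B's event; since it carries nothing new, the loop stops here.
system_a = {"name": "Alice", "email": "a@example.com"}
apply_update(system_a, {"email": "a@example.com"}, publish=print)
```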
Any help is appreciated, thanks!
Related
I am wondering how to update a bunch of data for an aggregate in the Event Sourcing concept.
In a traditional application I would take some data such as name, date of birth, etc. and put it into an existing object; as I understand it, in the ES concept this approach is wrong. Should I instead perform different events to update different parts of the aggregate root? If so, how do I build the REST API? How do I handle validation?
In a traditional application I would take some data such as name, date of birth, etc. and put it into an existing object; as I understand it, in the ES concept this approach is wrong,
Short answer: that approach is fine -- what changes in event sourcing is how you keep track of the changes in your service.
One way to think of a stream of events is as a sequence of patch documents. There's nothing wrong with changing multiple fields in a single patch document, and that is fine in events as well.
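For example, a small sketch (hypothetical event shape) of an event acting as a patch document that touches several fields at once:

```python
# Hypothetical event shape: one event, several fields, applied like a patch.
person = {"name": "J. Doe", "date_of_birth": None, "city": "Oslo"}
event = {"type": "PERSON_UPDATED",
         "patch": {"name": "Jane Doe", "date_of_birth": "1990-04-01"}}

person.update(event["patch"])  # applying the event is just applying the patch
```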
This question is really too broad for SO. You should google “event sourcing basics in azure” to find detailed articles, GitHub projects, videos, and other responses to these questions.
In general, in Event Sourcing there are two main ideas you need – Messages and Events. A typical process (not the only option, but a common one) is as follows. A message is created by your UI, which makes a request for a change to be made to an AR. Validation for that message is done at the message creation source.
The message is then sent to an API, where it is validated again since you can't trust all possible senders. The request is processed, resulting in changes made to an AR. An event is then created describing the changes made, and that event is placed on an event source (Azure Event Hub, Kafka, Kinesis, a DB, whatever). This list of events is kept forever and describes each and every change made to that AR throughout time, including the initial creation request. To get the current state of the AR, just add up all the events.
The key idea that is confusing when learning Event Sourcing is the two different types of “events”. Messages ask for a change to be made, Events record that a change has been made.
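A rough sketch of that split (all names are assumptions, not from the answer): a message asks for a change, is validated again at the API, an event records the change, and the current state of the AR is obtained by adding up the events.

```python
# All names are assumptions. A message asks for a change; the handler
# validates it and appends an event recording the change; current state
# is the fold of all events ("just add up all the events").

def handle_change_name(message: dict, event_log: list) -> None:
    if not message["new_name"].strip():      # validate again at the API boundary
        raise ValueError("name must not be empty")
    event_log.append({"type": "NAME_CHANGED", "name": message["new_name"]})

def current_state(event_log: list) -> dict:
    state = {}
    for event in event_log:                  # replay every change ever made
        if event["type"] == "NAME_CHANGED":
            state["name"] = event["name"]
    return state

log: list = []
handle_change_name({"new_name": "Jane"}, log)
print(current_state(log))                    # {'name': 'Jane'}
```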
As already answered, the batch update approach is fine.
I suggest focusing on the event consumption code. If all you have in your ReadSide is a complete aggregate representation, then a generic *_UPDATED event is OK.
But if you do have parts of your system interested only in a particular part of your aggregate, you might want to update that part separately, so that the system doesn't have to analyze all events and dig for the particular data.
For example, some demographic analysis system is only interested in the birthdate. It would be much easier for this system to have a BIRTHDATE_SET event that it can listen to, ignoring all others.
Fine-grained events like this also reduce coupling, because they require less knowledge of the internal event data structure.
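As an illustration, a small sketch (hypothetical event shape) of such a fine-grained consumer that listens only to BIRTHDATE_SET and needs no knowledge of any other event's structure:

```python
# Hypothetical event shape: this consumer cares about one fine-grained
# event type and never inspects anything else.

def record_birthdate(aggregate_id: str, birthdate: str) -> None:
    print(f"analyzing {aggregate_id}: born {birthdate}")

def on_event(event: dict) -> None:
    if event["type"] != "BIRTHDATE_SET":
        return                               # ignore all other events entirely
    record_birthdate(event["aggregate_id"], event["birthdate"])

on_event({"type": "NAME_CHANGED"})           # ignored
on_event({"type": "BIRTHDATE_SET", "aggregate_id": "p1", "birthdate": "1990-04-01"})
```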
It feels like you still have an active record way of looking at things.
You should model the things that happen to your entity as events rather than the impact of things happening.
So to my mind all of that data might be gathered in a "Person was registered" event, but an "Address added" event might also exist - in which case your single command might end up appending two events to the event stream.
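A minimal sketch (command and event names are assumptions) of one command appending two events to the stream:

```python
# Command and event names are assumptions. One command, two appended events:
def register_person(command: dict, stream: list) -> None:
    stream.append({"type": "PERSON_REGISTERED",
                   "name": command["name"],
                   "date_of_birth": command["date_of_birth"]})
    if command.get("address"):               # the second fact is optional
        stream.append({"type": "ADDRESS_ADDED", "address": command["address"]})

stream: list = []
register_person({"name": "Jane", "date_of_birth": "1990-04-01",
                 "address": "1 Main St"}, stream)
print(len(stream))                           # 2
```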
I would like to lay out a little scenario, still at the paper stage, which seems a bit tedious to accomplish with regard to DDD principles.
Let's say I have an application for hosting account management. Basically, the application comprises several bounded contexts such as web account management, FTP account management, mail account management... each of them represented by its own AR (they can live standalone).
Now, let's imagine I want to provide a UI with an HTML form composed of one fieldset per bounded context, for instance to update limits and/or features. How exactly should I proceed to update all the ARs without breaking the single-transaction-per-request principle? Can I create a kind of "outer" AR, say a ClientHostingProperties AR, which would hold references to the other ARs and update them as part of a single transaction, using its own repository? Or would it be better to create an AR that emits messages which listeners provided by the bounded contexts can react to, in which case I should probably think about ES?
Thanks.
How exactly should I proceed to update all the ARs without breaking the single-transaction-per-request principle?
You are probably looking for a process manager.
Basic sketch: persisting the details from the submitted form is a transaction unto itself (you are offered an opportunity to accrue business value; step 1 is to capture that opportunity).
That gives you a way to keep track of whether or not this task is "done": you compare the changes in the task to the state of the system, and fire off commands (to run in isolated transactions) to make changes.
Processes, in my mind, end up looking a lot like state machines: these commands are done, these commands are not done, these commands have failed - now what? Eventually the process reaches a state where there are no additional changes to be made, and this instance of the process is "done".
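A rough sketch (hypothetical shape, not a full implementation) of such a process-manager-as-state-machine, tracking which commands are done, pending, or failed:

```python
# Hypothetical shape of such a process: it remembers which commands are
# still pending or have failed, and is "done" only when neither set has
# members, i.e. no additional changes remain to be made.

class FormSubmissionProcess:
    def __init__(self, commands: list):
        self.pending = set(commands)         # commands fired but not yet confirmed
        self.failed = set()

    def on_succeeded(self, command: str) -> None:
        self.pending.discard(command)

    def on_failed(self, command: str) -> None:
        self.pending.discard(command)
        self.failed.add(command)             # "now what?" - retry or compensate

    @property
    def done(self) -> bool:
        return not self.pending and not self.failed

process = FormSubmissionProcess(["UpdateWebLimits", "UpdateFtpLimits"])
process.on_succeeded("UpdateWebLimits")
process.on_succeeded("UpdateFtpLimits")
print(process.done)                          # True
```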
Short answer: You don't.
An aggregate is a transactional boundary, which means that if you would update multiple aggregates in one "action", you'd have to use multiple transactions. The reason for an aggregate to be equivalent to one transaction is that this allows you to guarantee consistency.
This means that you have two options:
You can make your aggregate larger. Then you can actually guarantee consistency, but your ability to handle concurrent requests gets worse. So this is usually what you want to avoid.
You can live with the fact that it's two transactions, which means you are eventually consistent. If so, you usually use something such as a process manager or a flow to handle updating multiple aggregates. In its simplest form, a flow is nothing but a simple "if this event happens, run that command" rule. In its more complex form, it has its own state.
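A minimal sketch (names invented for illustration) of a flow in its simplest form, a plain event-to-command rule:

```python
# Names invented for illustration. The simplest flow: one rule mapping
# an observed event to a command dispatched in its own transaction.

def send_command(name: str, payload) -> None:
    print(f"dispatching {name} with {payload}")   # stand-in for a command bus

FLOW_RULES = {
    "WEB_LIMITS_CHANGED": lambda e: send_command("UpdateFtpLimits", e["limits"]),
}

def on_event(event: dict) -> None:
    rule = FLOW_RULES.get(event["type"])          # if this event happens...
    if rule:
        rule(event)                               # ...run that command

on_event({"type": "WEB_LIMITS_CHANGED", "limits": {"disk_mb": 500}})
```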
Hope this helps 😊
This is a theoretical question about the introduction of new BCs into a system where we use ES and CQRS with DDD, so there won't be concrete examples.
Interesting problems can arise when introducing new BCs which communicate with the old ones by receiving and publishing domain events. The root of these problems is that we already have domain events in the event storage. When the new BC reacts to those old domain events, it will do so in a way that is out of sync and/or out of sequence.
For example, we have an old BC A and we introduce a new BC B. Both publish domain events, which we call a and b. In the new system the order matters: for example, b1 must always come after a1, but before a2. What can we do when we already have the a1, a2, a3 sequence in the event storage? Should we inject b1 after a1, and so on? Is this a viable solution with a huge event store? It will certainly take a long time to replay all the old events one by one and react to them. How can we prevent sending an email to the customer when handling the newly created b1 event, which reacts to a 5-year-old topic? Is there a pattern to prevent these kinds of problems?
Problem Analysis
The root of these problems is that we already have domain events in the event storage.
If you introduce a new BC B to an existing system, that means the system was functional without B. This is clear from the above statement and has the following consequences:
Events that B would have produced in response to events from A do not need to be published. No other system should take action based on these events, because they are artificial.
You can go live with B at any time you choose. The only thing that you need to do beforehand is getting B in sync with the current state of the system.
Getting B in Sync
This is not difficult if you design B accordingly.
First, you need a replay mode mechanism to import all domain events into B without publishing events from B in response. You need to keep B's events internally, of course, if you use event sourcing, but do not publish them. Also, make sure B does not modify the state of the world by other means while in replay mode, e.g. don't send emails.
Then, switch B over to live mode. Now B consumes the new events from the system and also publishes its own.
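A minimal sketch (assumed shape, for illustration only) of that replay/live switch: in replay mode B records its own events but neither publishes them nor touches the outside world, and going live flips one switch.

```python
# Assumed shape. In replay mode, B records its own events (it is event
# sourced) but publishes nothing and causes no side effects.

def publish(event: dict) -> None:
    print("published", event)

def send_email(event: dict) -> None:
    print("emailed about", event["type"])

class BoundedContextB:
    def __init__(self):
        self.replay_mode = True
        self.internal_events: list = []

    def handle(self, domain_event: dict) -> None:
        own_event = {"type": "B_REACTED", "cause": domain_event["type"]}
        self.internal_events.append(own_event)   # always keep B's own history
        if not self.replay_mode:
            publish(own_event)                   # only once live
            send_email(own_event)                # side effects only once live

    def go_live(self) -> None:
        self.replay_mode = False

b = BoundedContextB()
b.handle({"type": "a1"})   # replay: recorded, not published
b.go_live()
b.handle({"type": "a4"})   # live: recorded and published
```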
The problem you mention with event ordering is only a problem when you use a unified event store for all domain events, and also use that store to publish events from. If this is the case, then you need to mark B's events as "internal" during the replay phase and hide them from the publishing mechanism.
Note: If B is a purely reactive BC (this could be the case for a very simple BC), then you don't even need the replay stuff. But most BCs probably do need it.
First of all, DDD does not require Event Sourcing.
we have an old BC A and we introduce a new BC B. Both publish domain events, which we call a and b. In the new system the order matters: for example, b1 must always come after a1, but before a2.
Events can be out of order, even in the same component (bounded context). Transactional integrity is only guaranteed within aggregates.
when we already have the a1, a2, a3 sequence in the event storage?
Doesn't matter. By the way, you don't have this guarantee with SQL databases either, unless you work in SERIALIZABLE isolation (or its vendor-specific equivalent). Pro tip: it's so taxing on performance that it's never enabled by default; therefore you are not using it.
Pay special attention to this part in the above link:
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
Furthermore, though an event store shouldn't have multiple copies of an event, events (and other messages such as commands) may arrive multiple times between components.
Should we inject b1 after a1 and so on?
Since your components should be able to handle out-of-order (and duplicate) events: no.
What can we do,
Depending on the technology used to integrate components, and the semantics of the messages:
If you are reading events from a web service, feed, or DB table, such that an event never goes away, you might be able to ignore an event until it becomes relevant.
Equivalently, you might be able to put an event back on the message queue it came from until it becomes relevant, as sketched after this list.
You may use the pattern known as Saga/Process Manager.
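For the requeue option above, a toy sketch (assumptions throughout): an event that is not yet relevant is put back instead of processed, so ordering problems resolve themselves once the prerequisite event has been handled.

```python
# Toy sketch, assumptions throughout: requeue an event until it is relevant.
from collections import deque

queue = deque([{"type": "b1"}, {"type": "a2"}])
ready_for = {"a2"}                   # what we can currently act on

while queue:
    event = queue.popleft()
    if event["type"] in ready_for:
        print("processing", event)
        ready_for.add("b1")          # in this toy example, a2 unblocks b1
    else:
        queue.append(event)          # not relevant yet: put it back
```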
Is there a real race condition, at all?
Example: Business rules states that the customer should get a confirmation message (email or similar) when an order has been placed.
Let's say that a NewOrderRegisteredEvent is dispatched from the domain and is picked up by an event listener that sends off the confirmation message. When that is done, some other event handler throws an exception or something else goes wrong and the unit of work is rolled back. We've now sent the user a confirmation message for something that was rolled back.
What is the "cqrs" way of solving problems like this where you want to do something after a unit of work has been committed? Another complicating factor is replaying of events. I don't want old confirmation messages to be re-sent whenever I replay recorded events in order to build a new view / projection.
My best theory so far: I've just started to look into the fascinating world of CQRS and was wondering whether this is something that would be implemented as a saga. If a saga is like a state machine where each transition can only take place a single time, then I guess that would solve this problem? I just have a hard time visualizing how this will fit together with the command bus and domain events...
An Event should only occur after the transaction has been completed. If anything goes wrong and there's a rollback, then the event didn't occur from an external point of view. Therefore it shouldn't be published at all. Though an OrderRegistrationFailed event could be published if necessary.
You wouldn't want the mail to be sent unless the command has successfully been executed.
First, a few reasons why the command handler -- as proposed in another answer -- would be the wrong place: Under some circumstances the command handler wouldn't be able to tell if the command will eventually succeed or not. Having the command handler invoke the mail sending would also put process knowledge inside the command handler, which would break the SRP and couple business rules too tightly with the application layer.
The mail should be sent after the fact, i.e. from an event handler.
To prevent this handler from firing during replay, you can just not register it. This works similar to how you test your application. You only register the handlers that you actually need.
Production system -> register all event handlers
Tests -> register only the tested event handlers
Replay -> register only the projection/denormalization handlers
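A small sketch (hypothetical registry, not a real framework API) of that mode-dependent registration: projections are always registered, while side-effect handlers such as mail are only registered in production, never on replay.

```python
# Hypothetical registry. Projections are always registered; side-effect
# handlers (mail) only in production, never on replay.

PROJECTION_HANDLERS = {
    "NewOrderRegisteredEvent": [lambda e: print("updating read model")],
}
SIDE_EFFECT_HANDLERS = {
    "NewOrderRegisteredEvent": [lambda e: print("sending confirmation mail")],
}

def build_registry(mode: str) -> dict:
    registry = {k: list(v) for k, v in PROJECTION_HANDLERS.items()}
    if mode == "production":                       # replay gets projections only
        for k, v in SIDE_EFFECT_HANDLERS.items():
            registry.setdefault(k, []).extend(v)
    return registry

def dispatch(registry: dict, event: dict) -> None:
    for handler in registry.get(event["type"], []):
        handler(event)

dispatch(build_registry("replay"), {"type": "NewOrderRegisteredEvent"})      # no mail
dispatch(build_registry("production"), {"type": "NewOrderRegisteredEvent"})  # mail
```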
Another - even more loosely coupled, though a bit more complex - possibility would be to have a Saga handle the NewOrderRegisteredEvent and issue a SendMail command to the appropriate bounded context (thanks, Yves Reynhout, for pointing this out in the question's comments).
There are two likely solutions:
1) The publishing of the event and the handling of the event (i.e. the email) are part of a single transaction. In this case, your transaction framework takes care of it for you. If the email fails, then the event is rolled back. You'll likely retry the command. This is conceptually clean and easy to think about. No event is finished publishing until everyone that has something to say about it has had their say. However practically speaking, this can be painful, as it typically involves distributed transactions. These are hard to come by. Can your email client enroll in the same transaction as the database which is holding your events?
2) The publishing of the event is transactional, but the event handlers each deal with transactions in their own way. The event handler which sends emails could keep track of which events it had seen. If it crashed, it would request old events and process them. You could make a business decision as to how big a deal it would be if people had missing or duplicate emails. (For money-related transactions, the answer is probably you shouldn't allow it.)
Solution (2) is typically what you see promoted in DDD/CQRS circles as it's the more loosely coupled solution. Solution (1) is quite practical in a small system where the event store and the projections are in a single database and the projections don't change often. Solution (2) allows a diversity of event handlers to work in their own way. Solution (1) can cause lots of non-overlapping concerns to become entangled. In this case your order business rules don't complete until the many bizarre things that happen in emailing are taken care of. For one thing, it may slow you down quite a bit.
If the sending of the email were more interesting than "saw the event, sent the email", then you're right, you might have a saga or workflow on your hands. Email in large operations is often a complex system in its own right which you're unlikely to have to implement much of. You just need to be sure you put your email into a request queue of some sort (using approach (2)), and the email system is likely to do retries/batching/spam avoidance/working overnight/etc.
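To make solution (2) concrete, a rough sketch (hypothetical storage; an in-memory set stands in for something durable) of an email handler that remembers which events it has already processed, so crashes and redeliveries cannot produce duplicate confirmation mails:

```python
# Hypothetical storage: an in-memory set stands in for something durable.
processed_ids: set = set()

def send_confirmation_mail(order_id: str) -> None:
    print(f"confirmation sent for order {order_id}")

def on_new_order_registered(event: dict) -> None:
    if event["id"] in processed_ids:
        return                               # duplicate delivery: ignore
    send_confirmation_mail(event["order_id"])
    processed_ids.add(event["id"])           # record only after the mail went out

on_new_order_registered({"id": "evt-1", "order_id": "42"})
on_new_order_registered({"id": "evt-1", "order_id": "42"})   # ignored
```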
I have been using state-machine-based design tools for some time, and have seen UML modeling tools that allow you to execute your logic (call functions, do other stuff) inside a state. However, after spending a couple of days with IAR VisualState, it appears that you cannot execute your logic inside a state without a trigger. I am confused, as it does not make sense to have a trigger for every single action inside a state!
Here is what I expect from a state chart tool:
If I enter StateA, upon entering the state I set my values in the entry section, then I would like to call a function (I just want to call it, NO TRIGGER), and inside that function I want to trigger an event based on some logic, and that event would trigger a state transition from StateA to StateB or StateC.
Is there something wrong with this expectation? Is it possible in VisualSTATE?
Help is greatly appreciated.
VisualSTATE imposes the event-driven paradigm, just like any Graphical User Interface program. Anything and everything that happens in such systems is triggered by an event. The system then responds by performing actions (computation) and possibly by changing the state (state transition).
Probably the most difficult aspect of event-driven systems is the inversion of control, that is, your (state machine) code is called only when there is an event to process. Otherwise, your code is not even active. This means that you are not in control, the events are. Your job is to respond to events.
Perhaps before you play with visualSTATE, you could pick up any book on GUI programming for Windows (Visual Basic is a good starting point) and build a couple of event-driven applications. After you do this, the philosophy behind visualSTATE will become much clearer.
Create 3 states: A, B, and C, where state A is the default state.
On entering state A, call the action function [that sets your variables a and b following some algorithm], followed by ^Signal1.
Entry/ action()^Signal1
Make a transition driven by Signal1 [which will serve as your event] from state A with 2 guards:
a <= b, transition to state C
a > b, transition to state B
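For comparison, a plain Python sketch of the same pattern (not VisualSTATE output): state A's entry action computes a and b and immediately raises Signal1, and the guards on that signal select state B or C.

```python
# Plain Python sketch: entering A runs the entry action, which
# immediately raises Signal1; guards on Signal1 choose between B and C.

class Machine:
    def __init__(self):
        self.state = None
        self.a = self.b = 0
        self.enter("A")

    def enter(self, state: str) -> None:
        self.state = state
        if state == "A":
            self.a, self.b = self.action()   # Entry/ action() ...
            self.dispatch("Signal1")         # ... ^Signal1

    def action(self):
        return 3, 7                          # some algorithm setting a and b

    def dispatch(self, signal: str) -> None:
        if self.state == "A" and signal == "Signal1":
            self.enter("C" if self.a <= self.b else "B")   # guarded transition

machine = Machine()
print(machine.state)                         # 'C', since a <= b
```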