I have a requirement where I need to group two events into one transaction based on certain criteria. Below are some thoughts on the requirement.
Event ::
We will receive events continuously into our system.
Each event has a buffer time within which it can be grouped with another event.
If the buffer time elapses, the event must be discarded.
We need to group two events into one group depending on the information carried by those two events.
If an event's information is not sufficient, we send it to another component, which responds with corrected data.
When grouping events, we sometimes want to hold one event while its related event is at the data-correcting component, even though we are not 100% sure about the matching criteria. We do this because we want to match as many events as possible.
I want to model this requirement using domain-driven design; any suggestions will be appreciated.
Without knowing your business requirements, it's kind of hard to answer. But we can start with assumptions and definitions first:
I refer to an event in DDD as something that is important for your domain, has happened (in the past), is an undeniable fact and cannot be undone.
In my definition either aggregates or domain services are responsible for emitting events.
So your group of events looks like a concept that says a group of related events is something important to your domain, too.
I guess you can go two ways to think about that concept:
A group is a special view on your already-happened events. Then a group is just a component whose state is derived from a list of related events.
A group is an aggregate, a kind of process with a life cycle, that emits a single group event when the criteria for finishing a group are met.
In the first case you can implement a group query that listens to published events and projects them onto your group concept.
In the second case you have an aggregate that reacts to business requests (you can call this a command) and manages some persistent state. When you request your aggregate to create a group and your aggregate is in the right state to do this, then your aggregate emits a group event.
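A minimal sketch of that second option (the group as an aggregate with a life cycle), ignoring the buffer-time rule; all names here (EventGroup, addEvent, GroupCompleted, IncomingEvent) are assumptions for illustration, not anything prescribed by the requirement:

import java.util.ArrayList;
import java.util.List;

class IncomingEvent {
    final String correlationKey;
    IncomingEvent(String correlationKey) { this.correlationKey = correlationKey; }
}

class GroupCompleted {
    final List<IncomingEvent> members;
    GroupCompleted(List<IncomingEvent> members) { this.members = List.copyOf(members); }
}

class EventGroup {
    private final List<IncomingEvent> pending = new ArrayList<>();

    // Command handler: a request to add an event to the group.
    // Returns the emitted domain event once the group is complete, else null.
    GroupCompleted addEvent(IncomingEvent e) {
        pending.add(e);
        if (pending.size() == 2) {          // "two events form one transaction"
            return new GroupCompleted(pending);
        }
        return null;
    }
}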
Related
I am wondering how to update a bunch of data on an aggregate in an Event Sourcing setup.
In a traditional application I would take some data such as name, date of birth, etc. and put it into an existing object; as I understand it, in ES this approach is wrong, so should I perform different events to update different parts of the aggregate root? If so, how do I build the REST API? How do I handle validation?
In a traditional application I would take some data such as name, date of birth, etc. and put it into an existing object; as I understand it, in ES this approach is wrong,
Short answer: that approach is fine -- what changes in event sourcing is how you keep track of the changes in your service.
A way to think of a stream of events is a sequence of patch-documents. There's nothing wrong with changing multiple fields in a single patch document, and that is fine in events as well.
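As a small illustration of that point, one event can carry several changed fields, just like a patch document; the event type and field names below are assumptions, not anything from the question:

import java.time.LocalDate;

class ContactDetailsChanged {
    final String personId;
    final String name;           // new value, or null if unchanged
    final LocalDate dateOfBirth; // new value, or null if unchanged

    ContactDetailsChanged(String personId, String name, LocalDate dateOfBirth) {
        this.personId = personId;
        this.name = name;
        this.dateOfBirth = dateOfBirth;
    }
}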
This question is really too broad for SO. You should google “event sourcing basics in azure” to find detailed articles, github projects, videos, and other responses to these questions.
In general, in Event Sourcing there are two main ideas you need – Messages and Events. A typical process (not the only option, but a common one) is as follows. A message is created by your UI, which makes a request for a change to be made to an AR. Validation for that message is done at the message's creation source.
The message is then sent to an API, where it is validated again since you can't trust all possible senders. The request is processed, resulting in changes made to an AR. An event is then created describing the changes made, and that event is placed on an event source (Azure Event Hub, Kafka, Kinesis, a DB, whatever). This list of events is kept forever and describes each and every change made to that AR throughout time, including the initial creation request. To get the current state of the AR, just add up all the events.
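A hedged sketch of that last sentence ("just add up all the events"): a fold over an aggregate's event stream. The OrderEvent/OrderState types are hypothetical stand-ins for a real AR.

import java.util.List;

interface OrderEvent {}
class OrderPlaced implements OrderEvent {}
class OrderShipped implements OrderEvent {}

class OrderState {
    boolean placed;
    boolean shipped;

    static OrderState replay(List<OrderEvent> history) {
        OrderState state = new OrderState();
        for (OrderEvent e : history) {   // apply in chronological order
            state.apply(e);
        }
        return state;
    }

    private void apply(OrderEvent e) {
        if (e instanceof OrderPlaced)  placed = true;
        if (e instanceof OrderShipped) shipped = true;
    }
}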
The key idea that is confusing when learning Event Sourcing is the two different types of “events”. Messages ask for a change to be made, Events record that a change has been made.
As already answered, the batch update approach is fine.
I suggest focusing on the event consumption code. If all you have on your read side is a complete aggregate representation, then a generic *_UPDATED event is OK.
But if parts of your system are interested only in a particular part of your aggregate, you might want to update that part separately, so those systems don't have to analyze all events and dig for particular data.
For example, some demographic analysis system is only interested in the birthdate. It would be much easier for this system to have a BIRTHDATE_SET event that it would listen to, and ignore all others.
Fine-grained events like this also reduce coupling, because they require less knowledge of the internal event data structure.
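To make the demographics example concrete, here is a hedged sketch of a read model that subscribes only to the fine-grained event it cares about; BirthdateSet, DemographicsProjection and the handler shape are assumptions:

import java.time.LocalDate;
import java.util.HashMap;
import java.util.Map;

class BirthdateSet {
    final String personId;
    final LocalDate birthdate;
    BirthdateSet(String personId, LocalDate birthdate) {
        this.personId = personId;
        this.birthdate = birthdate;
    }
}

class DemographicsProjection {
    private final Map<String, LocalDate> birthdates = new HashMap<>();

    // Subscribed only to BirthdateSet; all other events are simply ignored.
    void on(BirthdateSet event) {
        birthdates.put(event.personId, event.birthdate);
    }
}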
It feels like you still have an active record way of looking at things.
You should model the things that happen to your entity as events rather than the impact of things happening.
So to my mind all of that data might be gathered in a "Person was registered" event but an "Address added" event might also exist - in which case your single command might end up appending two events to the event stream.
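As a rough sketch of that last point, assuming hypothetical PersonWasRegistered/AddressAdded event types, a single registration command can end up appending two events:

import java.util.ArrayList;
import java.util.List;

class PersonWasRegistered {
    final String name;
    PersonWasRegistered(String name) { this.name = name; }
}

class AddressAdded {
    final String address;
    AddressAdded(String address) { this.address = address; }
}

class Person {
    // Command handler: returns the new events to append to the stream.
    List<Object> register(String name, String address) {
        List<Object> newEvents = new ArrayList<>();
        newEvents.add(new PersonWasRegistered(name));
        if (address != null) {
            newEvents.add(new AddressAdded(address)); // second event from the same command
        }
        return newEvents;
    }
}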
I am new to event sourcing and DDD and trying to create a simple app to learn more, but I'm struggling with how to model a relationship between two aggregates.
The idea is to allow companies to create activities that can then be searched for by users.
I want to be able to enforce the rule that a company can only have so many active activities depending on their membership level.
My first approach would be to have the Company be the aggregate root which contains the list of Activities and easily controls this. However, this means I would have to go through the Company aggregate to access every Activity, which isn't ideal as most actions against an Activity do not depend on the Company.
My second approach was to have separate Company and Activity aggregates. This means that I would have to first raise an ActivityCreated event, then an ActivityAddedToCompany event which would throw an exception if the company is already full of Activities. This approach seems better, but I'm not sure if needing the ActivityAddedToCompany event is a sign that I have not separated the aggregates correctly, as on the happy path the ActivityCreated and ActivityAddedToCompany events would always be stored one after the other.
Is the second approach better or am I missing something basic in Domain Driven Design?
As per your clarifications:
an Activity does not have to be created by a Company
This suggests that Activity should be an aggregate of its own. It has a lifetime separate from any other aggregate.
An Activity can only be registered to one Company
The Activity would have a reference back to the Company via an ID. Effectively, a foreign key. When it is assigned to a Company, it raises an event indicating that the assignment was made.
a Company can only have 5 Activities at any one time
If you were using a standard RDBMS system to manage these rules, you would have a transaction that checks the number of Activities and either approves or rejects the addition of a new Activity. Similarly, in your domain, you can model this through a two-phase commit.
When you assign an Activity to a Company (AssignToCompany command), you raise an AssignedToCompany event. A ProcessManager (PM) will receive that event and send a command to Company (AssignToActivity) and the Company can either accept (AssignedToActivity) or reject that based on the count (RejectedAssignToActivity).
If the latter, the PM will receive the RejectedAssignToActivity event and send a command back to Activity to remove the company (UnassignCompany) which will raise the CompanyUnassigned event.
Optional:
The PM will receive the CompanyUnassigned event and send an UnassignFromActivity command to the Company. This way, you can unassign an activity if needed and have the Company be aware of the change.
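A rough sketch of that process manager, with the command/event types and the CommandBus shaped from the names above as assumptions, not a prescribed API:

class AssignedToCompany {
    final String activityId, companyId;
    AssignedToCompany(String activityId, String companyId) { this.activityId = activityId; this.companyId = companyId; }
}
class RejectedAssignToActivity {
    final String activityId, companyId;
    RejectedAssignToActivity(String activityId, String companyId) { this.activityId = activityId; this.companyId = companyId; }
}
class AssignToActivity {
    final String companyId, activityId;
    AssignToActivity(String companyId, String activityId) { this.companyId = companyId; this.activityId = activityId; }
}
class UnassignCompany {
    final String activityId;
    UnassignCompany(String activityId) { this.activityId = activityId; }
}

interface CommandBus { void send(Object command); }

class ActivityAssignmentProcessManager {
    private final CommandBus commandBus;
    ActivityAssignmentProcessManager(CommandBus commandBus) { this.commandBus = commandBus; }

    // The Activity recorded the assignment; ask the Company to confirm it.
    void on(AssignedToCompany event) {
        commandBus.send(new AssignToActivity(event.companyId, event.activityId));
    }

    // The Company rejected it (already at its Activity limit); roll the Activity back.
    void on(RejectedAssignToActivity event) {
        commandBus.send(new UnassignCompany(event.activityId));
    }
}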
We want to implement CQRS in our new design. We have some doubts about processing in the command handler and the read model. We understand that while processing commands we should take an optimistic lock on the aggregateId. But what approach should we take while processing read models? Should we take a lock on the entire read model, a lock per aggregateId, or never take a lock while processing the read model?
Case 1 - take a lock on the entire read model: this is the safest, but not good in terms of speed.
Case 2 - take a lock per aggregateId. Here two issues may arise: if we lock per aggregateId, what happens if the read model server restarts? It does not know where to start from again.
Case 3 - never take a lock. In this approach, I think data may end up in a corrupted state. For example, say an order inserted event is generated and, through some workflow/saga, an order updated event takes place as well. What if the order updated event comes first and the order inserted event has not yet been processed?
I hope I have been able to explain my issue.
If you do not process events concurrently in the Readmodel then there is no need for a lock. This is the case when you have a single instance of the Readmodel, possibly in a microservice, that polls for events and processes them sequentially.
If you have a synchronous Readmodel (i.e. in the same process as the Writemodel/Aggregate) then most probably you will need locking.
An important thing to keep in mind is that a Readmodel most probably differs from the Writemodel. There could be a lot of Writemodel types whose events are projected into the same Readmodel. For example, in an ecommerce shop you could have a ListOfProducts that projects events from both Vendor and Product Aggregates. This means that, when we speak about a Readmodel, we cannot simply refer to "the Aggregate" because there is no single Aggregate involved. In the case of ecommerce, when we say "the Aggregate" we might refer to the Product Aggregate or the Vendor Aggregate.
But what to lock? That depends on the database technology. You should lock the smallest affected read entity or collection that can be locked. In a Readmodel that consists of a list of products (read entities, not aggregates!), when an event affects only one product you should lock only that product (e.g. ProductTitleRenamed).
If an event affects more products then you should lock the entire collection. For example, VendorWasBlocked affects all the products (it should remove all the products from that vendor).
You need the locking for events that have non-idempotent side effects, for the case where the Readmodel's updater fails during the processing of an event and you want to retry/resume from where it left off. If the event has idempotent side effects then it can be retried safely.
In order to know from where to resume in case of a failed Readmodel, you could store inside the Readmodel the sequence of the last processed event. In this case, if the entity update succeeds then the last processed event's sequence is also saved. If it fails then you know that the event was not processed.
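A sketch of that bookmark idea follows; the ProductRow/ProductListReadModel names and the in-memory map are assumptions, and in practice the row and the sequence number would be persisted together in one write:

import java.util.HashMap;
import java.util.Map;

class ProductRow {
    String title;
    long lastProcessedSequence = -1;
}

class ProductListReadModel {
    private final Map<String, ProductRow> rows = new HashMap<>();

    void onProductTitleRenamed(String productId, String newTitle, long eventSequence) {
        ProductRow row = rows.computeIfAbsent(productId, id -> new ProductRow());
        if (eventSequence <= row.lastProcessedSequence) {
            return;                                   // already applied; safe to skip on retry
        }
        row.title = newTitle;
        row.lastProcessedSequence = eventSequence;    // saved together with the updated row
    }
}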
For example, say an order inserted event is generated and, through some workflow/saga, an order updated event takes place as well. What if the order updated event comes first and the order inserted event has not yet been processed?
Read models are usually easier to reason about if you think about them polling for ordered sequences of events, rather than reacting to unordered notifications.
A single read model might depend on events from more than one aggregate, so aggregate locking is unlikely to be your most general answer.
That also means, if we are polling, that we need to keep track of the position of multiple streams of data. In other words, our read model probably includes meta data that tells us what version of each source was used.
The locking is likely to depend on the nature of your backing store / cache. But an optimistic approach
read the current representation
compute the new representation
compare and swap
is, again, usually easy to reason about.
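A minimal sketch of that read / compute / compare-and-swap loop, using an AtomicReference as a stand-in for whatever backing store or cache you actually use:

import java.util.concurrent.atomic.AtomicReference;
import java.util.function.UnaryOperator;

class OptimisticUpdater<T> {
    private final AtomicReference<T> store;
    OptimisticUpdater(T initial) { this.store = new AtomicReference<>(initial); }

    void update(UnaryOperator<T> computeNewRepresentation) {
        while (true) {
            T current = store.get();                          // read the current representation
            T next = computeNewRepresentation.apply(current); // compute the new representation
            if (store.compareAndSet(current, next)) {         // compare and swap
                return;
            }
            // another writer won the race; read again and retry
        }
    }
}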
I've been studying DDD for a while, and stumbled upon design patterns like CQRS and Event Sourcing (ES). These patterns can be used to help achieve some concepts of DDD with less effort.
In the architecture exemplified below, the aggregates know how to handle the commands and events related to themselves. In other words, the Event Handlers and Command Handlers are the Aggregates.
Then I started modeling a sample domain just to understand how the implementation would follow the business logic. For this question, here is my domain (it's based on this):
I know this is a badly modeled example, but I'm using it just as an example.
So, using ES, at the end of the operation, we would save all the events (Green arrows) into the event store (if there were no Exceptions), each event into its given Event Stream (Aggregate Type + Aggregate Id):
Everything seems right until now. So If we want to Rebuild the internal state of an instance of any of this Aggregate, we only have to new it up (new()) and apply all the events saved in its respective Event Stream in the correct order.
My question is related to changes in the model, because software development is a process where we never stop learning about our domain, and we always come up with new ideas. So, let's analyze some change scenarios:
Change Scenario 1:
Let's pretend that now, if the Reservation Aggregate checks that the seat is not available, it should send an event (Seat Not Reserved) and this event should be handled by one new Aggregate that will store all the people whose seats were not reserved:
In the hypothesis where the old system already handled the initial command (Place order) correctly, and saved all the events to its respective event streams:
When we want to Rebuild the internal state of an instance of any of this Aggregate, we only have to new it up (new()) and apply all the events saved in its respective Event Stream in the correct order. (Nothing changed). The only thing, is that the new Use case didn’t exist back in the old model.
Change Scenario 2:
Let's pretend that now, when the payment is accepted, we handle this event (Payment Accepted) in a new Aggregate (Finance Aggregate) and not in the Order Aggregate anymore, and it sends a new event (Payment Received) to the Order Aggregate. I know this scenario is not well structured, but something like this could happen.
In the hypothesis where the old system already handled the initial command (Place order) correctly, and saved all the events to its respective event streams:
When we want to Rebuild the internal state of an instance of any of this Aggregate, we have a problem when applying the events from the Aggregate Event Stream to itself:
Now, the Order no longer knows how to handle the Payment Accepted event.
Problems
So as the examples showed, whenever a system change results in an event being handled by a different event handler (Aggregate), there are major problems, because we cannot rebuild the internal state anymore.
So, this problem can have some solutions:
Possible Solution
When an event is not handled by the aggregate in whose Event Stream it is stored, we can find the new handler, create a new instance and send the event to it. But to keep the internal state correct, we need the last event (Payment Received) to be handled by the Order Aggregate. So, we let it dispatch the event (and possibly commands):
This solution can have some problems. Let’s imagine that a new command (Place Order) arrives and it has to create this order instance and save the new state. Now we would have:
In gray are the events that were already saved in the last call, when the system had not yet gone through the model changes.
We can see that a new Event Stream is created for the new aggregate (Finance W). And we can see that Event Streams are append-only, so the Payment Accepted event in the Order Y Event Stream is still there.
The first Payment Accepted event in Finance W Event Stream is the one that was supposed to be handled by the Order but had to find a new handler.
The yellow Payment Received event in the Order's Event Stream is the event that was generated when the Payment Accepted event from the Order's Event Stream was handled by its new handler, the Finance aggregate.
All the other Green Events are new events that were generated by handling the Place Order Command in the new model.
Problem With the Solution
The next time the aggregate needs to be rebuilt, there will be a Payment Accepted event in the stream (because it is append-only), and it will again call the new handler, but this has already been done and the Payment Received event has already been saved to the stream. So, it is not necessary to go through this again; we could ignore this event and continue.
Question
So, my question is: how can we handle model changes that impact who handles each event? How can we rebuild the internal state of an Aggregate after a change like this?
Will we need to build some event Stream migration that changes the events from one stream to the new schema (one or more streams)? Just like we would need in a Relational database?
Will we never be allowed to remove a handler, so we can only add new handlers? This would lead to an unmanageable system…
You got almost everything right, except one thing: Aggregates should not handle events from other Aggregates. It's like a non-event-sourced Aggregate sharing a table with another Aggregate: they should not.
In event-driven DDD, Aggregates are the system's building blocks that receive Commands (things that express intent) and return Events (things that have happened). For every Command type there must exist one and only one Aggregate type that handles it. Before executing a Command, the Aggregate is fed with all its own previously emitted Events; that is, every Event that was emitted in the past by this Aggregate instance is applied to this Aggregate instance, in chronological order.
So, if you want to correctly model your system, you are not allowed to send events from one Aggregate as events to another Aggregate (a different type or instance).
If you need to model business processes that involve multiple Aggregates, the correct way of doing it is by using a Saga/Process manager. This is a different component. It is the opposite of an Aggregate.
It receives Events emitted by Aggregates and sends Commands to other Aggregates.
In the simplest cases, a Saga manager simply takes properties from one Event and creates and populates a Command with those properties. Then it sends the Command to the destination Aggregate.
In more complicated cases, the Saga waits for multiple Events, and only when all of them have been received does it create and send a Command.
The Saga may also deduplicate or reorder events.
In your case, a Saga could be Sale, whose purpose would be to coordinate the entire sales process, from ordering to product dispatching.
In conclusion, you have that problem because you have not modeled your system correctly. If your Aggregates handled only their specific Commands (and not somebody else's Events), then even if you had to create a new Saga when a new business process emerged, it would still send the same Commands to the same Aggregates.
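A rough sketch of the simplest case described above: a Sale saga that copies properties from an Event into a Command and sends it to the destination Aggregate. The event/command types and the CommandSender are assumptions based on the question's example.

class PaymentAccepted {
    final String orderId;
    final long amountInCents;
    PaymentAccepted(String orderId, long amountInCents) {
        this.orderId = orderId; this.amountInCents = amountInCents;
    }
}

class MarkOrderAsPaid {
    final String orderId;
    final long amountInCents;
    MarkOrderAsPaid(String orderId, long amountInCents) {
        this.orderId = orderId; this.amountInCents = amountInCents;
    }
}

interface CommandSender { void send(Object command); }

class SaleSaga {
    private final CommandSender commandSender;
    SaleSaga(CommandSender commandSender) { this.commandSender = commandSender; }

    // Event in, Command out: the saga never mutates the Order itself.
    void on(PaymentAccepted event) {
        commandSender.send(new MarkOrderAsPaid(event.orderId, event.amountInCents));
    }
}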
Answering briefly
my question is how can we handle model changes that impact who handles each event?
Handling events is generally an easy thing to change, because the handling part is ephemeral. Events have a single writer, but they can have many readers. You just need to arrange for the plumbing to notify each subscriber of the event.
So in scenario #1, it's the PaymentAggregate that writes down the PaymentAccepted event (in its own stream), and then your plumbing notifies the OrderAggregate that the PaymentAccepted event happened, and it does the next thing in its own logic.
To change to scenario #2, we'd leave the Payment Aggregate unchanged, but we'd arrange the plumbing so that it tells the FinanceAggregate about PaymentAccepted, and that it tells the OrderAggregate about PaymentReceived.
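To make that plumbing concrete, a minimal in-memory dispatcher might look like the sketch below; the EventDispatcher name and shape are my own assumption, not anything from the question. The event has a single writer, and every interested subscriber is simply notified.

import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

class EventDispatcher<E> {
    private final List<Consumer<E>> subscribers = new ArrayList<>();

    void subscribe(Consumer<E> subscriber) { subscribers.add(subscriber); }

    void publish(E event) {
        for (Consumer<E> subscriber : subscribers) {
            subscriber.accept(event);   // e.g. the OrderAggregate handler, the FinanceAggregate handler
        }
    }
}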
Your pictures make it hard to see this; I think you aren't being careful to track that each change of state is stored in the stream of the aggregate that changed. Not your fault - the Microsoft picture is really awful.
In other words, your arrow #3 "Seats Reserved" isn't a SeatsReserved event, it's a Handle(SeatsReserved) command.
So, I trigger a command on an aggregate root and some 10 events happen as a result of the command. These events are internal ones, and since outside systems need an aggregation of these events, I decided to make a projection (a read projection, basically). In order to turn these 10 (internal) events INTO 1 (external) event, I have to apply some business rules (business rules concerning the merging of events). Where should I put these rules, since they seem like part of the domain, but I'm creating projections of internal events?
Basically, since the projection logic is part of the domain, should I keep it inside the aggregate and call it from the code where the projection is made?
UPDATE
So, inside one aggregate root, I have e.g. 3 (internal) events as a response to one Command (aggregate.createPaintandwashatsametime(id, red)) that is sent to the aggregate root and that spread through all the aggregate root's entities, like: CarCreated(Id), CarSeatColored(Red), CarWashed(), etc. (all 3 of these events happen because of a single command). The external system expects to receive one external event such as CarMaintainenceDone(Id, repainted=true, washed=true, somevalue=22);
Now, if I have some complex logic to build this CarMaintainenceDone event (like: if color == red then somevalue == 22 in the projection, otherwise 44) - should this go in the projection code or be part of the domain?
UPDATE 2
Let me try to give you a new example. Just ignore how the domain is modeled, since this is just an example:
As you can see, we have an AggregateRoot that contains a Multiplier, which is there just to call things by the right name. When we do a multiplication we first send the integer 1 to ObjectA, which has some logic to set internal state and emit an ObjectAHasSetParam event. The same goes for ObjectB. Finally, ObjectC listens to all of these events, and on paramsHaveBeenSet it does the actual multiplication.
In the event store, in this case, I would preserve a list of events:
[ObjectAHasSetParam , ObjectBHasSetParam , ObjectCHasMultiplied ]
My point here was: if I emit all of these events one by one out of process, the state that somebody else updates may be inconsistent, since these 3 events only make sense together. That is why I wanted to make something like a projection, but I think in this case I just need to publish the list of these events together instead of event by event.
// Runnable version of the example; publish() here just records the emitted
// events, and for brevity the listeners on ObjectC are invoked directly
// instead of going through a message bus.
import java.util.ArrayList;
import java.util.List;

class EventLog {
    static final List<Object> published = new ArrayList<>();
    static void publish(Object event) { published.add(event); }
}

class MultiplyCommand {
    final int x, y;
    MultiplyCommand(int x, int y) { this.x = x; this.y = y; }
}

class ObjectAHasSetParam { final int par; ObjectAHasSetParam(int par) { this.par = par; } }
class ObjectBHasSetParam { final int par; ObjectBHasSetParam(int par) { this.par = par; } }
class ParamsHaveBeenSet {}
class ObjectCHasMultiplied { final int result; ObjectCHasMultiplied(int result) { this.result = result; } }

class AggregateRoot {
    Multiplier ml = new Multiplier();
    void handle(MultiplyCommand cmd) { ml.multiply(cmd.x, cmd.y); }
}

class Multiplier {
    ObjectA a = new ObjectA();
    ObjectB b = new ObjectB();
    ObjectC res = new ObjectC();

    void multiply(int x, int y) {
        res.listen(a.setParam(x));
        res.listen(b.setParam(y));
        ParamsHaveBeenSet done = new ParamsHaveBeenSet();
        EventLog.publish(done);
        res.listen(done);
    }
}

class ObjectA {
    int p;
    ObjectAHasSetParam setParam(int x) {
        p = x + 11;                              // e.g. 1 + 11 = 12
        ObjectAHasSetParam e = new ObjectAHasSetParam(p);
        EventLog.publish(e);
        return e;
    }
}

class ObjectB {
    int p;
    ObjectBHasSetParam setParam(int y) {
        p = y + 22;                              // e.g. 2 + 22 = 24
        ObjectBHasSetParam e = new ObjectBHasSetParam(p);
        EventLog.publish(e);
        return e;
    }
}

class ObjectC {
    int p1, p2, res;
    void listen(ObjectAHasSetParam e1) { p1 = e1.par; }
    void listen(ObjectBHasSetParam e2) { p2 = e2.par; }
    void listen(ParamsHaveBeenSet e3) {
        res = p1 * p2;                           // e.g. 12 * 24 = 288
        EventLog.publish(new ObjectCHasMultiplied(res));
    }
}
The external system expects to receive one external event such as CarMaintainenceDone(Id, repainted=true, washed=true, somevalue=22);
Aha! The short answer is: process manager.
The longer answer is that you (should) have two aggregates right now. One of them is tracking the state of the car. The other is tracking the process of maintaining the car.
The big hint that there is another aggregate hidden somewhere: you've got this CarMaintenanceDone event, with no aggregate responsible for generating it. All events have an "aggregate" somewhere that produces them. The aggregate might be the real world, or a proxy for the real world (HttpRequestReceived), or a digital thing in some other bounded context; but the event is telling you that something, somewhere, changed state.
That is to say, you have some aggregate that knows the rule of when the maintenance is done. It's an information resource, a log of work. When CarWashed is published (by the Car, or the washing machine, or whatever), an event handler subscribed to the CarWashed event sends a command to the Maintenance aggregate to inform it. The Maintenance aggregate updates its own state, runs its logic, and publishes a MaintenanceCompleted event when all of the individual steps have been accounted for.
Most things that are process-like can be implemented as Aggregates; the weird bit is that the "commands" tend to look like event handlers. But they have their own history (based on what they have observed), which describes how the state machine changed in response to each event observed.
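A rough sketch of such a Maintenance aggregate, under the assumption that washing and repainting are the only two steps; the method and event names are illustrative, not from the question:

import java.util.ArrayList;
import java.util.List;

class CarWashRecorded {}
class RepaintRecorded {}
class MaintenanceCompleted {}

class Maintenance {
    private boolean washed;
    private boolean repainted;
    private final List<Object> newEvents = new ArrayList<>();

    // "Command" that looks like an event handler: informs the aggregate
    // that a CarWashed event was observed.
    void recordCarWashed() {
        washed = true;
        newEvents.add(new CarWashRecorded());
        completeIfDone();
    }

    void recordCarRepainted() {
        repainted = true;
        newEvents.add(new RepaintRecorded());
        completeIfDone();
    }

    private void completeIfDone() {
        if (washed && repainted) {
            newEvents.add(new MaintenanceCompleted());
        }
    }

    List<Object> uncommittedEvents() { return newEvents; }
}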
It might be more than two, depending on the complexity of the processes.
Rinat Abdullin wrote a good introduction to process managers, that I reference frequently.
Isn't there a clear distinction between an aggregate and a process manager though? I thought process managers would only coordinate and live in the application service world, sending appropriate commands to aggregates based on the events received.
From what I've seen -- no, there isn't. The literature doesn't make that very clear.
For example, Udi Dahan wrote
Here’s the strongest indication I can give you to know that you’re doing CQRS correctly: Your aggregate roots are sagas.
Saga, here, being equivalent to a process.
There are often two event models: internal events (only visible within a BC) and external events (published to the outside world). You could decide to make everything external, but then you have to version everything.
You can read more about internal vs external events in the Patterns, Principles, and Practices of Domain-Driven Design book p.408 (scroll up a bit in the link).
Projections shouldn't be responsible for publishing external events. One common practice would be to register an internal event handler from the application service layer which is responsible for publishing external events on a messaging infrastructure. You could leverage that process to aggregate these events together and publish a single external event from them.
How the aggregation is performed is up to you, but since internal events can be raised synchronously and handlers are usually single-threaded, you can just set up a state machine in the handler that kicks in when it receives the first event of the batch and aggregates them until it receives the last, then publishes on the message bus.
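A sketch of that aggregation handler follows, assuming (purely for illustration) that CarCreated is the first internal event of the batch and CarWashed is the last; the internal/external event types and the MessageBus are placeholders, not a prescribed API:

class CarCreated { final String id; CarCreated(String id) { this.id = id; } }
class CarSeatColored {}
class CarWashed {}

class CarMaintenanceDone {
    final String carId; final boolean repainted; final boolean washed;
    CarMaintenanceDone(String carId, boolean repainted, boolean washed) {
        this.carId = carId; this.repainted = repainted; this.washed = washed;
    }
}

interface MessageBus { void publish(Object externalEvent); }

class InternalEventAggregator {
    private final MessageBus bus;
    private String carId;
    private boolean repainted;
    private boolean washed;

    InternalEventAggregator(MessageBus bus) { this.bus = bus; }

    // First internal event of the batch: start collecting.
    void on(CarCreated e)     { carId = e.id; repainted = false; washed = false; }
    void on(CarSeatColored e) { repainted = true; }

    // Assumed to be the last internal event of the batch: publish one
    // external event and reset the state machine.
    void on(CarWashed e) {
        washed = true;
        bus.publish(new CarMaintenanceDone(carId, repainted, washed));
        carId = null;
    }
}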
If your messaging infrastructure cannot participate in the same transaction as your event store you could just have an additional process that reads the committed events in order and does the same thing as above.
An alternative would be to let the consumer deal with the aggregation. That could be the right approach if the consumer should be able to veto what "CarMaintenanceDone" means.
Finally, you could also publish an extra event from the aggregate itself. The event may not be leveraged by the AR itself, but sometimes it's better to just do what's more practical (just like enriching events with data only consumed by the read model). This approach would also have the advantage of not having to change the logic if more events are added.
There should not be a notion of an external event. Events are generated by the Aggregates and consumed by synchronous read models, sagas, or published to the outside world, where other systems and microservices use them however they want.
So, in your case, the consumer (implemented as a saga, for example) should aggregate those events according to its business rules and then do something (a saga can create a new command, for example), not the Aggregate.
UPDATE (in response to question being updated)
If you think that car maintenance is a responsibility of the Car Aggregate, then Car aggregate should raise the event. It depends on how the future behavior of the Car Aggregate is influenced by that CarMaintainenceDone event. In this particular context, I would generate the event from the Car aggregate, to make code simpler.