Aggregate Root including tremendous number of children - domain-driven-design

I wonder how to model Calendar using DDD and CQRS. My problem consist in increasing number of events. I consider Calendar as Aggregate Root which contains Events (Calendar Events). I dont want to use ReadSide in my Commands but I need way to check events collisions at domain level.

I wonder how to model Calendar using DDD and CQRS. My problem consist in increasing number of events.
The most common answer to "long lived" aggregates is to break that lifetime into episodes. An example of this would be the temporary accounts that an accountant will close at the end of the fiscal year.
In your specific case, probably not "the Calendar" so much as "the February calendar", the "the March calendar", and so on, at whatever grain is appropriate in your domain.
Im not sure if Im right about DDD aproach in terms of validation. I believe the point is not to allow the model to enter into invalid state
Yes, but invalid state is a tricky thing to define. Udi Dahan offered this observation
A microsecond difference in timing shouldn’t make a difference to core business behaviors.
More succinctly, processing command A followed by processing command B produces a valid state, then it should also be true that you end up processing command B first, and then A.
Let's choose your "event collisions" example. Suppose we handle two commands scheduleMeeting(A) and scheduleMeeting(B), and the domain model understands that A and B collide. Riddle: how do we make sure the calendar stays in a valid state?
Without loss of generality, we can flip a coin to decide which command arrives first. My coin came up tails, so command B arrives first.
on scheduleMeeting(B):
publish MeetingScheduled(B)
Now the command for meeting A arrives. If your valid calendars do not permit conflicts, then your implementation needs to look something like
on scheduleMeeting(A):
throw DomainException(A conflicts with B)
On the other hand, if you embrace the idea that the commands arrive shouldn't influence the outcome, then you need to consider another approach. Perhaps
on scheduleMeeting(A)
publish MeetingScheduled(A)
publish ConflictDetected(A,B)
That is, the Calendar aggregate is modeled to track not only the scheduled events, but also the conflicts that have arisen.
See also: aggregates and RFC 2119

Event could also an be an Aggregate root. I don't know your business constraint but I think that if two Events colide you could notify the user somehow to take manual actions. Otherwise, if you really really need them not to colide you could use snapshots to speed up the enormous Calendar AR.
I dont want to use ReadSide in my Commands but I need way to check events collisions at domain level.
You cannot query the read model inside the aggregate command handler. For the colision detection I whould create a special DetectColisionSaga that subscribes to the EventScheduled event and that whould check (possible async if are many Events) if a colision had occurred and notify the user somehow.

Related

How to implement Commands and Events for complex form using Event Sourcing?

I would like to implement CQRS and ES using Axon framework
I've got a pretty complex HTML form which represents recruitment process with six steps.
ES would be helpful to generate historical statistics for selected dates and track changes in form.
Admin can always perform several operations:
assign person responsible for each step
provide notes for each step
accept or reject candidate on every step
turn on/off SMS or email notifications
assign tags
Form update (difference only) is sent from UI application to backend.
Assuming I want to make changes only for servers side application, question is what should be a Command and what should be an Event, I consider three options:
Form patch is a Command which generates Form Update Event
Drawback of this solution is that each event handler needs to check if changes in form refers to this handler ex. if email about rejection should be sent
Form patch is a Command which generates several Events ex:. Interviewer Assigned, Notifications Turned Off, Rejected on technical interview
Drawback of this solution is that some events could be generated and other will not because of breaking constraints ex: Notifications Turned Off will succeed but Interviewer Assigned will fail due to assigning unauthorized user. Maybe I should check all constraints before commands generation ?
Form patch is converted to several Commands ex: Assign Interviewer, Turn Off Notifications and each command generates event ex: Interviewer Assigned, Notifications Turned Off
Drawback of this solution is that some commands can fail ex: Assign Interviewer can fail due to assigning unauthorized user. This will end up with inconsistent state because some events would be stored in repository, some will not. Maybe I should check all constraints before commands generation ?
The question I would call your attention to: are you creating an authority for the information you store, or are you just tracking information from the outside world?
Udi Dahan wrote Race Conditions Don't Exist; raising this interesting point
A microsecond difference in timing shouldn’t make a difference to core business behaviors.
If you have an unauthorized user in your system, is it really critical to the business that they be authorized before they are assigned responsibility for a particular step? Can the system really tell that the "fault" is that the responsibility was assigned to the wrong user, rather than that the user is wrongly not authorized?
Greg Young talks about exception reports in warehouse systems, noting that the responsibility of the model in that case is not to prevent data changes, but to report when a data change has produced an inconsistent state.
What's the cost to the business if you update the data anyway?
If the semantics of the message is that a Decision Has Been Made, or that Something In The Real World Has Changed, then your model shouldn't be trying to block that information from being recorded.
FormUpdated isn't a particularly satisfactory event, for the reason you mention; you have to do a bunch of extra work to cast it in domain specific terms. Given a choice, you'd prefer to do that once. It's reasonable to think in terms of translating events from domain agnostic forms to domain specific forms as you go along.
HttpRequestReceived ->
FormSubmitted ->
InterviewerAssigned
where the intermediate representations are short lived.
I can see one big drawback of the first option. One of the biggest advantage of CQRS/ES with Axon is scalability. We can add new features without worring about regression bugs. Adding new feature is the result of defining new commands, event and handlers for both of them. None of them should not iterfere with ones existing in our system.
FormUpdate as a command require adding extra logic in one of the handler. Adding new attribute to patch and in consequence to command will cause changes in current logic. Scalability is no longer advantage in that case.
VoiceOfUnreason is giving a very good explanation what you should think about when starting with such a system, so definitely take a look at his answer.
The only thing I'd like to add, is that I'd suggest you take the third option.
With the examples you gave, the more generic commands/events don't tell that much about what's happening in your domain. The more granular events far better explain what exactly has happened, as the event message its name already points it out.
Pulling Axon Framework in to the loop, I can also add a couple of pointers.
From a command message perspective, it's safe to just take a route and not over think it to much. The framework quite easily allows you to adjust the command structure later on. In Axon Framework trainings it is typically suggested to let a command message take the form of a specific action you're performing. So 'assigning a person to a step would typically be a AssignPersonToStepCommand, as that is the exact action you'd like the system to perform.
From events it's typically a bit nastier to decide later on that you want fine grained or generic events. This follows from doing Event Sourcing. Since the events are your source of truth, you'll thus be required to deal with all forms of events you've got in your system.
Due to this I'd argue that the weight of your decision should lie with how fine grained your events become. To loop back to your question: in the example you give, I'd say option 3 would fit best.

Modeling one-to-many relations using Domain Driven Design

This question is more of a general question about how to model simple one-to-many relations using collections: should a change in a list item be reflected in the version of the aggregate containing it?
The domain is about meeting scheduling (like in Outlook).
I have a Meeting entity, which can have multiple Participants.
A participant can accept/decline meeting requests.
Rescheduling a meeting nullifies all of the participants confirmations.
I thought of two ways to model this.
Option 1
The Meeting aggregate will contain a list of Participants where each Participant has a ParticipantId and a Status (accepted/denied).
The problem here is that every Accept or Deny command, for a specific participant, increments the Meeting's version, which means two participants will enter a race condition if trying to Accept the meeting request based on the same original version.
Although this could be solved by re-reading the Meeting's document and retrying the Accept command, it's quite annoying considering how often this could happen.
Another approach is to ignore the meeting's version when executing the Accept command, but this introduces a new problem: what happens if, after sending the meeting requests, the meeting has been rescheduled? In this case we can't afford to ignore the Meeting's version, because this time the version DOES represent a real version that should be considered.
BTW, is it at all a good practice to ignore the version in some of the commands and not in others?
Option 2
Extract a Participation aggregate out of Meeting.
Participation will have MeetingId, ParticipantId, and Status.
It will also have its own version.
This way, when participant X Accepts the meeting request, only the relevant Participation will be modified, and the rest will be left intact.
And, when rescheduling the meeting, a "Meeting Rescheduled" event will be published and an event handler will respond to it by resetting all of the Participations' statuses to "NotAccepted" regardless of their current version.
On the one hand this sounds logical in the sense that a meeting's version shouldn't be incremented just because someone accepted/denied its request.
On the other hand, modeling Participation as a standalone aggregate doesn't sound quite right to me, because it is has no meaning outside of the context of the meeting.
Anyway, would love to get feedback on this and see the various approaches to this problem.
Although this could be solved by re-reading the Meeting's document and retrying the Accept command, it's quite annoying considering how often this could happen.
This looks like a modeling error. You should keep in mind that the meeting aggregate is not the book of record for the participants availability - the real world is. So the message shouldn't be AcceptInvitation, but instead InvitationAccepted. There shouldn't be a conflict about this, because the domain model doesn't get to veto events outside of its authority boundary.
You might, depending on your implementation, end up with a concurrent modification exception in your plumbing, but that's something that you should be handling automatically (ie: expected version any, or a retry).
Another approach is to ignore the meeting's version when executing the Accept command, but this introduces a new problem: what happens if, after sending the meeting requests, the meeting has been rescheduled?
The solution here is to model more carefully. Yes, sometimes you will get a message that accepts or declines an invitation that has expired.
Put another way: race conditions don't exist.
A microsecond difference in timing shouldn’t make a difference to core business behaviors.
What happens to Alice, who replied instantly to the invitation, when the meeting is rescheduled? Why wouldn't the same thing happen to Bob, when his reply arrives just after the meeting is rescheduled?
Participation as a standalone aggregate doesn't sound quite right to me, because it is has no meaning outside of the context of the meeting.
I find that heuristic isn't particularly effective. It's much more important to understand whether entities can change state independently, or if their changes need to be coordinated.
Actually, the Meeting aggregate is used to track the participants availability. That's what it purpose is. Unless I didn't fully understand you...
It's a bit subtle, and I didn't spell it out very well.
Suppose the model says that I'm available, but an emergency in the real world calls me away. What happens? Am I blocked from going to the hospital because the model says I have to go to a meeting? Can somebody cancel my emergency by changing the invitation I've submitted?
Furthermore, if I'm away on an emergency, are you available for a meeting that is scheduled for the same time as the meeting you and I were going to have?
In this space, the real world is the authority for whether or not somebody is available. The model is just looking at a cached copy of a message describing whether or not somebody was available in the past.
The cached information being used by the model is not guaranteed to be complete. See Greg Young on warehouse systems and exception reports.
which makes me think that perhaps the Meeting aggregate should have two version fields: one will be a strong version which, when incremented, represents a breaking change, and another soft version for non-breaking changes. Does this make any sense?
Not really. Version is not, as far as I know, a term taken from the ubiquitous language of scheduling meetings. It's meta data, if it exists at all, and the business rules in your model should not depend upon meta data.
I agree, but a Meeting ID (or any ID for that matter) is also not part of the ubiquitous language, yet I might pass it back and forth between my domain world and external worlds.

CQRS aggregates

I'm new to the CQRS/ES world and I have a question. I'm working on an invoicing web application which uses event sourcing and CQRS.
My question is this - to my understanding, a new command coming into the system (let's say ChangeLineItemPrice) should pass through the domain model so it can be validated as a legal command (for example, to check if this line item actually exists, the price doesn't violate any business rules, etc). If all goes well (the command is not rejected) - then the appropriate event is created and stored (for example LineItemPriceChanged)
The thing I didn't quite get is how do I keep this aggregate in memory to begin with, before trying to apply the command. If I have a million invoices in the system, should I playback the whole history every time I want to apply a command? Do I always save the event without any validations and do the validations when constructing the view models / projections?
If I misunderstood any part of the process I would appreciate your feedback.
Thanks for your help!
You are not alone, this is a common misunderstanding. Let me answer the validation part first:
There are 2 types of validation which take place in this kind of system. The first is the kind where you look for valid email addresses, numeric only or required fields. This type is done before the command is even issued. A command which contains these sorts of problems should not be raised as commands (for belt and braces you can check at the domain side but this is not a domain concern and you are better off just preventing this scenario).
The next type of validation is when it is a domain concern. It could be the kind of thing you mention where you check prices are within a set of specified parameters. This is a domain concept the business people would understand, do and be able to articulate.
The next phase is for the domain to apply the state change and raise the associated events. These are then persisted and on success, published for the rest of the app.
All of this is can be done with the aggregate in memory. The actions are coordinated with a domain service which handles the command. It loads the aggregate, apply's all it's past events (or loads a snapshot) then issues the command. On success of the command it requests all the new uncommitted events and tries to persist them. On success it publishes the new events.
As you see it only loads the events for that specific aggregate. Even with a lot of events this process is lightning fast. If performance is a problem there are strategies such as keeping aggregates in memory or snapshotting which you can apply.
To your last point about validating events. As they can only be generated by your aggregate they are trustworthy.
If you want more detail check out my overview of CQRS and ES here. And take a look at my post about how to build aggregate roots here.
Good luck - I hope they help!
It is right that you have to replay the event to 'rehydrate' the domain aggregate. But you don't have to replay all events for all invoices. If you store the entity id of the root aggregate in the events, you can just select and replay the events that with the relevant id.
Then, how do you find the relevant aggregate root id? One of the read repositories should contain the relevant information to get the id, based on a set of search criteria.

Event sourcing chaining events

I'm implementing a bounded context using event sourcing but have come across a problem. Say I'm modelling a game of soccer, and I'm interested in both the individual goals scored (who scored them etc) and the overall score. So if I have a Match aggregate root I ideally want events raised called GoalScored and ScoreChanged. The reason I want the score explicitly stated from the domain like this is that I don't want lots of different listeners and possibly other bounded contexts all computing the same thing.
This seems simple, but: the Match object has a Goal() method that adds a new goal. In the spirit of event sourcing this doesn't directly mutate Match state, but raises a GoalScored event that is handled within the Match which then mutates the state (as well as the event being pushed out to denormalizers). So in terms of raising a ScoreChanged, the score hasn't actually changed until the GoalScored event has been handled, so do I raise another event in response to that event (ScoreChanged), effectively chaining the events? I don't think so, for a start when an aggregate root is reloaded from the event store lots of extra events are going to be created each time in response to each GoalScored.
I also thought about working out what the score would in the command handler that raises GoalScored, sort of a 'what if' situation. Then I could raise both events in the command handler. I'd really rather not do that though - it just doesn't seem 'right'. Working out the score is simple enough for soccer, but other games (cricket for example) requires more work.
I could put both the goal and the score in the GoalScored event, which is fair enough, but again it doesn't seem right - the score has got nothing to do with a GoalScored event per se.
All the examples used when discussing event sourcing seem to use the eCommerce Customer/Order domain, and I've never seen a similar case as this.
Does anyone have any experience dealing with situations like this?
Thanks
The choosing of clean domain events, just like other modeling, should result in concepts that are mirrored in the domain. You say "the score has got nothing to do with a GoalScored event per se". But it does. In soccer, the only way a score can change is if a goal is scored. Though, goals can be taken back, e.g. via offsides calls or other penalties. It's not clear whether you want to model this or not.
It is common for a domain method to emit multiple events at once. A good framework will make it easy to consider them as a bunch, e.g. as a single commit. Why not emit both a GoalScored and ScoreChanged event?
You may also want to consider whether this domain has any commands at all. The soccer match itself is the system of record. The events that come from the match are a record of history already. Are you perhaps just writing a system which processes streams of events into streams of more events?
One thing that often helps when thinking about event sourcing is to notice your tense - you say StartMatch, but in an event sense it's actually the event MatchStarted.
As for the ScoreChanged, are you consuming this event outside of the Match? If not, the Score should be the only part publically accessible, and the GoalScored event simply mutates this. This holds CQRS in a minor way (how I get the score doesn't depend on how I change it). Score can then internally hold a state, or can replay all the GoalScored events to arrive at a number each time it's called. The 'feel' of properly designed event-sourced solutions can always regenerate any state from the events.
Now, if the ScoreChanged needs to alert other parts of your system (a player ranking, for instance), you can either multi-cast the event to different roots, or you might need to refactor your design. In this case, do you want to update players in real time, or only after a match is complete?
I have this open source project for event sourcing in scala which contains examples of how a basket works using it. Just in case is useful for you https://github.com/politrons/Scalaydrated

CQRS + EventSourcing. Change Aggregate Root history

I have following problem.
Given. CQRS + EventSourcing application.
How is that possible to change the state of the Aggregate root in history?
For example, accounting application, accounter wants to apply transation but with past date. The event which will be stored in Event Store will have the older date than recent events, but the sequense number of this event will be bigger.
Repository will restore the state of aggregate root by ordering events by sequence number. And if we will take the snapshot for this past date - we will have aggregate root without this event.
We can surely change the logic of repository to order events by date, but we use external framework for CQRS, and this is not desirable.
Are there some elegant solutions for this case?
What you're looking for is a bi-temporal implementation.
e.g. On dec 3rd we thought X == 12 (as-at), but on dec 5th we corrected the mistake and now know X == 14 on dec 3rd (as-of)
There are two ways to implement this
1) The event store holds as-at data and a projection holds as-of data (a possible variation is both an as-of and as-at projection as well)
2) The aggregate has an overloaded method indicating the desire for as-of vs as-at values from the event store. This will most likely involve using a custom secondary snapshot stream for as-of data values.
Your solution could very likely use both implementations as one is command focused and the other is query focused.
The as-of snapshots for the aggregate root in the second option would need to be rebuilt as corrective events are recieved.
Martin Folwler talks about this in this article
Note: The event store is still append only.
In accounting you'll probably end up in jail if you change past bookings. Don't change the past. Use compensating commands instead.
Sorry, but you brought up the accounting example, which is probably a domain that's very strict about fiddling with past data without making the changes explicit.
If the above doesn't apply to your domain you can easily apply new events on top of older ones that change the state (and possibly the history) of your domain objects.
Take a booking to an account for example. The event might have occurred today, but it can set the actual booking date to some time in the past.
You have stated that your business logic allows you to add a back-dated transaction; now I don't know why you'd want that, but there's nothing constraining your aggregate not to accept it. Of course the event will get a later event sequence number/version, but that's expected.
You don't need to fiddle with the infrastructure, repository or anything else to do this.
Accounting doesn't let you change history. It only lets you add entries. It's up to your business logic to interpret the dates on these events as you will. In this case, the sequence of events is not just a persistence trick as with event sourcing, but the actual content of the domain!
One solution to this is to think of the event as an explicit compensating action. For example, when your bank reverses a charge, they don't delete an existing transaction, they add a compensating transaction. This transaction may reference they transaction it wishes to compensate with respective dating. In this way, the events are a proper representation of reality.

Resources