How to avoid masses of data loads in CQRS/DDD Commands/Events?

We have a DDD AppDomain containing Seminars, Participants and their Meals. We have up to 1000 Participants per Seminar with up to 50 Meals per Participant. We decided that Seminars, Participants and Meals are separate aggregates to keep these aggregates small.
The user can reschedule a whole seminar with all participants or reschedule a single participant. So we have the commands "RescheduleSeminarCommand" and "RescheduleParticipantCommand".
The Problem arises when you reschedule a Seminar: The "RescheduleSeminarCommand" leads to a "SeminarRescheduledEvent" which leads to a "RescheduleParticipantCommand" per Participant. That would mean loading each single Participant from the repository - so 1000 database requests. Each "RescheduleParticipantCommand" leads to a "ParticipantRescheduledEvent" which fires "RescheduleMealsCommand" which loads the Meals for each single Participant - so another 1000 database requests.
How can we reduce the number of database requests?
1) We thought about extending the "RescheduleParticipantCommand" and the "RescheduleMealsCommand" with the SeminarId so we can not only load one Participant/Meal but all Participants/Meals for a whole Seminar.
2) Another way would be to create additional Events/Commands like for "RescheduleParticipantsForSeminarCommand", "ParticipantsForSeminarRescheduleEvent" and "RescheduleMealsForSeminarCommand" etc.
What do you think is better? 1), 2) or something different we didn't think of?
OK, I'll give some details which I missed in my first description:
We have the following classes:
class Seminar
{
UUID SeminarId,
DateTime Begin,
DateTime End
}
// Arrival/Departure of a participant may differ
// from Begin/End of the seminar
class Participant
{
UUID ParticipantId,
UUID SeminarId,
DateTime Arrival,
DateTime Departure
}
// We have one Meal-Object for breakfast, one for lunch and
// one for dinner (and additional for other meals) per day
// of the stay of the participant
class Meal
{
UUID MealId,
UUID ParticipantId,
DateTime Date,
MealType MealType
}
The users can
change Arrival/Departure of a single participant with the "RescheduleParticipantCommand", which would also change their Meals to the new dates.
change Begin/End of a seminar with the "RescheduleSeminarCommand", which would change the Arrival/Departure of all participants to the new Begin/End and change their meals accordingly.

You may be missing the concept of a SeminarSchedule. First, let's ask a couple of questions that will affect the model:
If you have a Seminar, is it divided into some sort of Lecture, Presentation etc., or is it a seminar about the same thing, just for different people at different times?
Can you sign a person to the seminar first and then decide what time this person will attend?
I'll give an example in pseudo code.
NOTE: I'll skip the meals as the question is about the scheduling but they do fit in this model. I'll discuss logic related to them too, just skip them in code
First let's say what our requirements are.
It's a seminar for one thing (lecture, training session, whatever) divided into time slots. The same lecture will be given to different people starting at different times.
Participants can sign without being scheduled to a time slot.
When a participant signs, we need to make a meal for him/her based on preferences (for example he/she may be a vegetarian or vegan).
Scheduling will be done at a specific time from users of the system. They will take participants info when doing the schedule. For example we may want to have people with the same age in one time slot or by some other criteria.
Here's the code:
class Seminar {
UUID ID;
// other info for seminar like description, name etc.
}
class Participant {
UUID ID;
UUID SeminarID;
// other data for participant, name, age, meal preferences etc.
}
class TimeSlot {
Time StartTime;
TimeInterval Duration;
ReadonlyCollection<UUID> ParticipantIDs;
void AddParticipant(UUID participantID) { }
}
class SeminarSchedule {
UUID SeminarID;
Date Date;
Time StartTime;
TimeInterval Duration;
ReadOnlyCollection<TimeSlot> TimeSlots;
void ChangeDate(Date newDate) { }
void ChangeStartTime(Time startTime) { }
void ChangeDuration(TimeInterval duration) { }
void ScheduleParticipant(Participant p, Time timeSlotStartTime) { }
void RemoveParticipantFromSchedule(Participant p) { }
void RescheduleParticipant(Participant p, Time newTimeSlotStartTime) { }
}
Here we have 3 aggregates: Seminar, Participant and SeminarSchedule.
If you need to change any information related to the Seminar or Participant you only target these aggregates.
On the other hand, if you need to do anything related to the schedule, the SeminarSchedule aggregate (being a transactional boundary around scheduling) will handle these commands, ensuring consistency. You can also enforce concurrency control over the schedule: you may not want multiple people changing the schedule at the same time, for example one changing the StartTime while another changes the Duration, or two users adding the same participant to the schedule. You can use an Optimistic Offline Lock on the SeminarSchedule aggregate.
For instance, changing the Duration or StartTime of a SeminarSchedule will affect all TimeSlots.
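As a rough illustration of the optimistic lock mentioned above, here is a minimal Java sketch. The names `VersionedSchedule`, `StaleScheduleException` and `InMemoryScheduleRepository` are hypothetical and not part of the model above; the in-memory map only stands in for a real store, where the same check is typically done with an `UPDATE ... WHERE version = ?` statement.

```java
import java.time.Duration;
import java.time.LocalTime;
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Trimmed-down, hypothetical schedule: only what is needed to show versioning.
final class VersionedSchedule {
    final UUID seminarId;
    final LocalTime startTime;
    final Duration duration;
    final long version; // the version this copy was loaded with

    VersionedSchedule(UUID seminarId, LocalTime startTime, Duration duration, long version) {
        this.seminarId = seminarId;
        this.startTime = startTime;
        this.duration = duration;
        this.version = version;
    }

    VersionedSchedule changeStartTime(LocalTime newStart) {
        return new VersionedSchedule(seminarId, newStart, duration, version);
    }

    VersionedSchedule changeDuration(Duration newDuration) {
        return new VersionedSchedule(seminarId, startTime, newDuration, version);
    }
}

final class StaleScheduleException extends RuntimeException {
    StaleScheduleException(String message) { super(message); }
}

// In-memory stand-in for a repository (assumption: one schedule per seminar).
final class InMemoryScheduleRepository {
    private final Map<UUID, VersionedSchedule> store = new HashMap<>();

    synchronized void add(VersionedSchedule schedule) { store.put(schedule.seminarId, schedule); }

    synchronized VersionedSchedule load(UUID seminarId) { return store.get(seminarId); }

    // Saves only if nobody else has saved since this copy was loaded.
    synchronized void save(VersionedSchedule changed) {
        VersionedSchedule current = store.get(changed.seminarId);
        if (current == null || current.version != changed.version) {
            throw new StaleScheduleException("schedule was modified by someone else; reload and retry");
        }
        store.put(changed.seminarId, new VersionedSchedule(
                changed.seminarId, changed.startTime, changed.duration, changed.version + 1));
    }
}
```

If two users load the same version and both try to save, the second `save` fails and that user has to reload the schedule and reapply the change.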
If you remove a Participant from the Seminar then you will have to remove it from the schedule too. This can be implemented with eventual consistency and handling ParticipantRemoved event or you can use a Saga.
Another thing we need to take into account when modelling aggregates is also how the logic of signing to a seminar works.
Let's say that a participant should sign to the Seminar first before scheduling them. Maybe the scheduling will be performed later by defining groups of people by some criteria. The above model will work fine. It will allow for users to sign a Participant to the Seminar. Later when the schedule is assigned, other users will be able to make the schedule by looking at what kind of participants have signed.
Let's take the opposite case and say that unscheduled participants cannot be present at the seminar.
In this case we can add the Participant entity to the SeminarSchedule aggregate but this will cause you to load this whole aggregate even when you need to change some information for a single participant. This isn't very practical.
So in order to keep the nice separation that we have, we may use a Saga or a ProcessManager to ensure consistency. We may also add the concept of a ReservedPlace in the SeminarSchedule aggregate. This way you can reserve a place, then add a participant to the schedule and then remove the ReservedPlace by assigning the participant to the time slot. As this is a complex process that spans multiple aggregates, a Saga is definitely appropriate.
Another way to do this is to define the concept of a SeminarSignRequest that a person can make. Later this request may be approved if meals and/or a place are available. We may have reached the maximum number of people, or not have enough meals, etc. This will probably also be a process, so you may need a Saga here too.
For more information, check this article and this video.

Commands are things that can be rejected by your domain rules. If you raise a command because of an event (something that has already happened and cannot be rejected, because it passed all domain rules), keep in mind that your system has to be in a consistent state even if the new command does nothing because it is rejected. Basic rule: if you raise an event, the system must already be in a consistent state, even if that event triggers further commands that may be rejected or change nothing in the system.
So, according to your comments, once the Seminar aggregate accepts the new dates according to its rules, you change the Participants' dates without needing to check any more rules.
Then the solution is to just change everything in persistence and not spam fine-grained commands for every change you want.
Relational Database example:
UPDATE Seminar SET Begin = '06/02/2019', End = '06/06/2019' WHERE SeminarId = #SeminarID;
UPDATE Participant SET Arrival = '06/02/2019', Departure = '06/06/2019' WHERE SeminarId = #SeminarID;
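On the application side, the same idea could look roughly like the following Java sketch: the handler for the seminar-level event makes one set-based repository call instead of loading each Participant. `ParticipantRow`, `InMemoryParticipantRepository`, `rescheduleAllForSeminar` and `SeminarRescheduledHandler` are hypothetical names; the in-memory list stands in for the table, and `roundTrips` only exists to make the number of calls visible.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;

// Hypothetical persistence row for a participant.
final class ParticipantRow {
    final UUID participantId;
    final UUID seminarId;
    LocalDate arrival;
    LocalDate departure;

    ParticipantRow(UUID participantId, UUID seminarId, LocalDate arrival, LocalDate departure) {
        this.participantId = participantId;
        this.seminarId = seminarId;
        this.arrival = arrival;
        this.departure = departure;
    }
}

// In-memory stand-in; against a database the method body would be a single
// UPDATE ... WHERE SeminarId = ? statement.
final class InMemoryParticipantRepository {
    private final List<ParticipantRow> rows = new ArrayList<>();
    int roundTrips = 0; // counts simulated database calls

    void add(ParticipantRow row) { rows.add(row); }

    // Returns the number of participants that were updated.
    int rescheduleAllForSeminar(UUID seminarId, LocalDate begin, LocalDate end) {
        roundTrips++;
        int updated = 0;
        for (ParticipantRow row : rows) {
            if (row.seminarId.equals(seminarId)) {
                row.arrival = begin;
                row.departure = end;
                updated++;
            }
        }
        return updated;
    }
}

// Reacts to the seminar-level event with one set-based call.
final class SeminarRescheduledHandler {
    private final InMemoryParticipantRepository participants;

    SeminarRescheduledHandler(InMemoryParticipantRepository participants) {
        this.participants = participants;
    }

    int on(UUID seminarId, LocalDate newBegin, LocalDate newEnd) {
        return participants.rescheduleAllForSeminar(seminarId, newBegin, newEnd);
    }
}
```

This only applies when, as stated above, no per-participant rule has to be checked once the Seminar has accepted the new dates.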
PS: Why not keep just the Seminar Begin/End in persistence and bring this data in during the hydration of the Participant aggregate (Arrival/Departure)? This way you always have a consistent state in your system without worrying about changing several things.
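A minimal Java sketch of that hydration idea follows. It assumes that a participant stores only an optional override and otherwise follows the seminar's dates; `SeminarDates`, `ParticipantStay` and the override fields are hypothetical names, not taken from the question.

```java
import java.time.LocalDate;

// The only place where the seminar's Begin/End are stored.
final class SeminarDates {
    final LocalDate begin;
    final LocalDate end;

    SeminarDates(LocalDate begin, LocalDate end) {
        this.begin = begin;
        this.end = end;
    }
}

// A participant's stay; dates are resolved against the seminar when hydrated.
final class ParticipantStay {
    private final LocalDate arrivalOverride;   // null means "same as seminar begin"
    private final LocalDate departureOverride; // null means "same as seminar end"

    ParticipantStay(LocalDate arrivalOverride, LocalDate departureOverride) {
        this.arrivalOverride = arrivalOverride;
        this.departureOverride = departureOverride;
    }

    static ParticipantStay followingSeminar() {
        return new ParticipantStay(null, null);
    }

    LocalDate arrival(SeminarDates seminar) {
        return arrivalOverride != null ? arrivalOverride : seminar.begin;
    }

    LocalDate departure(SeminarDates seminar) {
        return departureOverride != null ? departureOverride : seminar.end;
    }
}
```

With this shape, rescheduling the seminar changes one record, and every participant without an override sees the new dates the next time it is loaded. In this sketch an explicit override survives a reschedule; if the business rule is that a reschedule resets everyone, the overrides would have to be cleared as well.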

Related

Modeling (apparent?) dependency between DDD Bounded Contexts

The simplified scenario is the following: there is a BC (Bounded Context) called "tasks" which contains the Task Aggregate, and a BC called "meetings" which contains the Meeting Aggregate.
// in BC "tasks"
class Task extends AggregateRoot {
private TaskId taskId
private string name
private string description
...
static func register(TaskId taskId, ...): Task { ... }
func rename(string newName) { ... }
...
}
// in BC "meetings"
class Meeting extends AggregateRoot {
private MeetingId meetingId
private DateTime meetingDate
...
static func plan(MeetingId meetingId, ...): Meeting { ... }
func postpone(DateTime newMeetingDate): void { ... }
func scheduleTask(TaskId taskId): void { ... }
...
}
You can schedule Tasks for a Meeting, which will be discussed when the meeting happens, but there are a few rules:
the person who created the Task must explicitly mark it as "ready for meeting", because the creation process can be long and the Task can be "incomplete" for a while (e.g. documents must be added but were not sent, the description is not clear or incomplete...)
a Task can only be scheduled for a single Meeting, at the end of which an Opinion must be expressed on the Task (something along the line of "is valid", "is invalid", "ok but this needs to be changed")
there must exist an API to fetch all Tasks eligible to be scheduled for the next Meeting (i.e. not draft but not already added to another Meeting)
I am not sure how and where to model the state relative to the status of the Task ("draft", "ready for meeting", ...) and about the Opinion.
What I've tried so far was to add a status property to Task which starts at "draft" and can be changed to "ready for meeting" via a specific operation:
class Task extends AggregateRoot {
...
private Status status = Status.draft
...
func markAsReadyForMeeting(): void {
// let's ignore other checks, Domain Event publishing etc.
this.status = Status.readyForMeeting
}
...
}
But at this point I don't know:
how to create the fetch API, and in which BC it should live, since part of the information about the Task availability is on the "tasks" BC (is Task draft?) and another part is in the "meetings" BC (is this Task already scheduled in a Meeting?)
how to not create a two-way link between Task and Meeting, since a Meeting must hold a list of TaskIds, but if I were to add to Task's Status the case scheduled(MeetingId) it would feel like a duplication of information which must be kept in sync
the Opinions are expressed in the context of a Meeting, but should be saved on a Task... so what?
The other thing I have thought of was to have a "simplified" Task model in the "meetings" BC and manage the status in there and not in the "tasks" BC. At this point there will be no Status or Opinion in the "tasks" BC, and the act of "making a Task ready for meeting" will be implemented on the "meetings" BC and not in the "tasks" one.
I have the feeling that this can be a better approach since it appears to me that the "meetings" BC could operate in autonomy, but it also feels that in this way there is a lot of duplication of data between the two BCs (both have a complete list of all Tasks, albeit the contained information is different).
Is my modeling wrong, or is there something I'm missing? Or should more integration effectively exist between the two BCs?
As a final note: the two BCs are more complex than this simplified example and are composed of more parts, and I believe that they should remain separated, but I still remain open to explore a "refactoring" approach.

Bounded contexts should be designed around use cases and not around object structures, as persistence models are. You are partly right in the approach of putting the ready-for-meeting (RFM) state and the Opinion concepts in the Meeting context. The justification behind that is that these concepts do not exist outside of the meeting context, i.e. there would not be a ready-for-meeting status, nor Opinions, if there were no meetings in your system.
What you are missing, in my opinion, is that you should not confuse draft and RFM states. Draft status should be handled in the Task context as you already do, as it controls the state of the Task outside the meeting concept. The Meeting context would subscribe to Task "undrafted" events. This would allow the Meeting BC to maintain a list of non-draft tasks, and associate them with meetings. The Meeting context is then able to provide a list of undrafted tasks, not associated with a meeting, which is your definition of RFM tasks.
The Task context doesn't need to know whether the Task is associated with a meeting or not, or whether the meeting is planned or has already happened. If you want to prevent the Task context from altering tasks once they are associated with a meeting, you could maintain a readonly state in the Task context. The Task context would subscribe to a "task associated with meeting" event in the Meeting context and would update the readonly state of the Task.
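A minimal Java sketch of the list the Meeting context could maintain from those events is shown below. `EligibleTasks` and its method names are hypothetical, task ids are plain strings for brevity, and event delivery and persistence are left out.

```java
import java.util.LinkedHashSet;
import java.util.Set;

// Lives in the "meetings" BC and is fed by events; it never queries the "tasks" BC.
final class EligibleTasks {
    private final Set<String> undrafted = new LinkedHashSet<>();
    private final Set<String> scheduled = new LinkedHashSet<>();

    // Handler for the Task context's "undrafted" event.
    void onTaskUndrafted(String taskId) {
        undrafted.add(taskId);
    }

    // Called when a Meeting schedules a task; enforces "one meeting per task".
    void onTaskScheduled(String taskId) {
        if (!undrafted.contains(taskId)) {
            throw new IllegalStateException("task is still a draft or unknown: " + taskId);
        }
        if (!scheduled.add(taskId)) {
            throw new IllegalStateException("task is already scheduled: " + taskId);
        }
    }

    // Backs the "tasks eligible for the next meeting" API.
    Set<String> readyForMeeting() {
        Set<String> result = new LinkedHashSet<>(undrafted);
        result.removeAll(scheduled);
        return result;
    }
}
```

The duplication is limited to task ids (plus whatever the Meeting context wants to display), which is usually an acceptable price for letting the Meeting context answer the query on its own.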

Java MultiThreading banking application

I have a doubt about Java multithreading. Suppose I have a banking application. Let us say I have one controller like below.
public class BankAccount {
private String bankaccount;
private long balance;
public long getBalance(String bankaccount) {
//code to get balance based on bankaccount number
this.balance = value; //value is the balance i get from database
return this.balance;
}
public void updateAccount(long value) {
balance = balance - value;
//code to store balance in database
}
}
Let us say I have used the above code in a Spring application.
I have a scenario where, for one particular account number, the balance is 10000. A husband and wife are both trying to withdraw an amount from the same account from 2 different ATMs. Since servers internally use multithreading, synchronization is needed for the above scenario. I have the following doubts:
1) Will the above 2 requests create 2 different objects of the BankAccount class or only one object?
2) If it creates only 1 object, how can the server identify a different account number and create another object for it, as updating one account number should not block updating some other account number?

It makes sense to implement a solution where only a single BankAccount instance is created for each account number.
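You can use the synchronized keyword to synchronize access to each BankAccount instance individually. For example:
// "accounts" is a hypothetical registry that returns the one shared instance per account number;
// locking on an object freshly created for each request would not block any other thread.
BankAccount account = accounts.get("1234567890");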
synchronized (account) {
//perform a transaction here
account.updateAccount(100);
}
This way, only a single thread can enter the synchronized block, while other threads will block until the first thread exits the block.
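One way to get "a single instance per account number" is a registry keyed by the account number. The following Java sketch is only an illustration under that assumption: `SharedAccount` and `AccountRegistry` are hypothetical names, and `balanceLoader` stands in for the database lookup.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.function.ToLongFunction;

// One instance per account number; its monitor guards the balance.
final class SharedAccount {
    private final String number;
    private long balance;

    SharedAccount(String number, long balance) {
        this.number = number;
        this.balance = balance;
    }

    // Returns false instead of overdrawing the account.
    synchronized boolean withdraw(long amount) {
        if (amount <= 0 || amount > balance) {
            return false;
        }
        balance -= amount;
        return true;
    }

    synchronized long balance() { return balance; }

    String number() { return number; }
}

final class AccountRegistry {
    private final ConcurrentMap<String, SharedAccount> accounts = new ConcurrentHashMap<>();
    private final ToLongFunction<String> balanceLoader; // stand-in for the database read

    AccountRegistry(ToLongFunction<String> balanceLoader) {
        this.balanceLoader = balanceLoader;
    }

    // computeIfAbsent creates at most one instance per account number, even under contention.
    SharedAccount forNumber(String number) {
        return accounts.computeIfAbsent(number,
                n -> new SharedAccount(n, balanceLoader.applyAsLong(n)));
    }
}
```

Two requests for the same account number then contend for the same lock, while requests for different account numbers do not block each other. Note that this only protects a single JVM; with several server instances, the database (a transaction or a version check) has to enforce the rule.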

Why do sagas (aka, process managers) contain an internal state and why are they persisted to the event store?

A lot of articles on CQRS imply that sagas have an internal state and must be saved to the event store. I don't see why this is necessary.
For example, say I have three aggregates: Order, Invoice and Shipment. When a customer places an order, the order process starts. However, the shipment cannot be sent until the invoice has been paid and the shipment has first been prepared.
1. A customer places an order with the PlaceOrder command.
2. The OrderCommandHandler calls OrderRepository::placeOrder().
3. The OrderRepository::placeOrder() method returns an OrderPlaced event, which is stored in the EventStore and sent along the EventBus.
4. The OrderPlaced event contains the orderId and pre-allocates an invoiceId and shipmentId.
5. The OrderProcess ("saga") receives the OrderPlaced event, creating the invoice and preparing the shipment if necessary (achieving idempotence in the event handler).
6a. At some point in time, the OrderProcess receives the InvoicePaid event. It checks to see whether the shipment has been prepared by looking up the shipment in the ShipmentRepository, and if so, sends the shipment.
6b. At some point in time, the OrderProcess receives the ShipmentPrepared event. It checks to see whether the invoice has been paid by looking up the invoice in the InvoiceRepository, and if so, sends the shipment.
To all the experienced DDD/CQRS/ES gurus out there, can you please tell me what concept I'm missing and why this design of a "stateless saga" will not work?
class OrderCommandHandler {
public function handle(PlaceOrder $command) {
$event = $this->orderRepository->placeOrder($command->orderId, $command->customerId, ...);
$this->eventStore->store($event);
$this->eventBus->emit($event);
}
}
class OrderRepository {
public function placeOrder($orderId, $customerId, ...) {
$invoiceId = randomString();
$shipmentId = randomString();
return new OrderPlaced($orderId, $customerId, $invoiceId, $shipmentId);
}
}
class InvoiceRepository {
public function createInvoice($invoiceId, $customerId, ...) {
// Etc.
return new InvoiceCreated($invoiceId, $customerId, ...);
}
}
class ShipmentRepository {
public function prepareShipment($shipmentId, $customerId, ...) {
// Etc.
return new ShipmentPrepared($shipmentId, $customerId, ...);
}
}
class OrderProcess {
public function onOrderPlaced(OrderPlaced $event) {
if (!$this->invoiceRepository->hasInvoice($event->invoiceId)) {
$invoiceEvent = $this->invoiceRepository->createInvoice($event->invoiceId, $event->customerId, $event->invoiceId, ...);
$this->eventStore->store($invoiceEvent);
$this->eventBus->emit($invoiceEvent);
}
if (!$this->shipmentRepository->hasShipment($event->shipmentId)) {
$shipmentEvent = $this->shipmentRepository->prepareShipment($event->shipmentId, $event->customerId, ...);
$this->eventStore->store($shipmentEvent);
$this->eventBus->emit($shipmentEvent);
}
}
public function onInvoicePaid(InvoicePaid $event) {
$order = $this->orderRepository->getOrders($event->orderId);
$shipment = $this->shipmentRepository->getShipment($order->shipmentId);
if ($shipment && $shipment->isPrepared()) {
$this->sendShipment($shipment);
}
}
public function onShipmentPrepared(ShipmentPrepared $event) {
$order = $this->orderRepository->getOrders($event->orderId);
$invoice = $this->invoiceRepository->getInvoice($order->invoiceId);
if ($invoice && $invoice->isPaid()) {
$this->sendShipment($this->shipmentRepository->getShipment($order->shipmentId));
}
}
private function sendShipment(Shipment $shipment) {
$shipmentEvent = $shipment->send();
$this->eventStore->store($shipmentEvent);
$this->eventBus->emit($shipmentEvent);
}
}

Commands can fail.
That's the primary problem; the entire reason we have aggregates in the first place, is so that they can protect the business from invalid state changes. So what happens in onOrderPlaced() if the createInvoice command fails?
Furthermore (though somewhat related) you are lost in time. Process managers handle events; events are things that have already happened in the past. Ergo -- process managers are running in the past. In a very real sense, they can't even talk to anyone that has seen a more recent event than the one that they are processing right now (in fact, they might be the first handler to see this event, meaning everybody else is a step in the past).
This is why you can't run commands synchronously; your event handler is in the past, and the aggregate can't protect its invariant unless it is running in the present. You need the asynchronous dispatch to get the command running against the correct version of the aggregate.
Next problem: when you dispatch the command asynchronously, you can't directly observe the result. It might fail, or get lost en route, and the event handler won't know. The only way that it can determine that the command succeeded is by observing a generated event.
A consequence is that the process manager cannot distinguish a command that failed from a command that succeeded (but whose event hasn't become visible yet). To support a finite SLA, you need a timing service that wakes up the process manager from time to time to check on things.
When the process manager wakes up, it needs state to know if it has already finished the work.
With state, everything is so much simpler to manage. The process manager can re-issue possibly lost commands to be sure that they get through, without also flooding the domain with commands that have already succeeded. You can model the clock without throwing clock events into the domain itself.
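To make that concrete, here is a minimal Java sketch of a process manager that keeps its own state. `StatefulOrderProcess`, the boolean flags and the string command name are hypothetical simplifications; persistence of the state and the actual asynchronous dispatch are left out.

```java
import java.util.ArrayList;
import java.util.List;

final class StatefulOrderProcess {
    // State of one process instance; in a real system this is what gets persisted.
    private boolean invoicePaid;
    private boolean shipmentPrepared;
    private boolean sendRequested;
    private boolean shipmentSent;

    // Commands handed to the (asynchronous) dispatcher, in order.
    private final List<String> dispatched = new ArrayList<>();

    void onInvoicePaid() {
        invoicePaid = true;
        requestSendIfReady();
    }

    void onShipmentPrepared() {
        shipmentPrepared = true;
        requestSendIfReady();
    }

    // The only proof that the command succeeded is the resulting event.
    void onShipmentSent() {
        shipmentSent = true;
    }

    // Called by the timing service: re-issue the command only if it is still unconfirmed.
    void onTimeout() {
        if (sendRequested && !shipmentSent) {
            dispatched.add("SendShipment");
        }
    }

    private void requestSendIfReady() {
        if (invoicePaid && shipmentPrepared && !sendRequested) {
            sendRequested = true;
            dispatched.add("SendShipment");
        }
    }

    List<String> dispatchedCommands() {
        return new ArrayList<>(dispatched);
    }
}
```

Duplicate events do not trigger duplicate commands, a timeout re-issues only what is still unconfirmed, and nothing is re-issued after the confirming event has been seen. Because a command can be sent more than once, the receiving aggregate has to treat it idempotently.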

What you are referring to seems to be along the lines of orchestration (with a process manager) vs choreography.
Choreography works absolutely fine, but you will not have a process manager as a first-class citizen. Each command handler will determine what to do. Even my current project (December 2015) uses choreography quite a bit with a webMethods integration broker. Messages may even carry some of the state along with them. However, when anything needs to take place in parallel you are rather shafted.
A relevant service orchestration vs choreography question demonstrates these concepts quite nicely. One of the answers contains a nice pictorial representation and, as stated in the answer, more complex interactions typically require state for the process.
I find that you typically will require state when interacting with services and endpoints beyond your control. Human interaction, such as authorizations, also require this type of state.
If you can get away with not having state specifically for a process manager it may be OK. However, later on you may run into issues. For example, some low-level/core/infrastructure service may span across various processes. This may cause issues in a choreography scenario.

How to model bank transfer in CQRS

I'm reading Accounting Pattern and quite curious about implementing it in CQRS.
I think AccountingTransaction is an aggregate root as it protects the invariant:
No money leaks; it should be transferred from one account to another.
public class AccountingTransaction {
private String sequence;
private AccountId from;
private AccountId to;
private MonetaryAmount quantity;
private DateTime whenCharged;
public AccountingTransaction(...) {
raise(new AccountingEntryBookedEvent(sequence, from, quantity.negate(),...));
raise(new AccountingEntryBookedEvent(sequence, to, quantity,...));
}
}
When the AccountingTransaction is added to its repository, it publishes several AccountingEntryBookedEvents, which are used to update the balance of the corresponding accounts on the query side.
One aggregate root updated per db transaction, eventual consistency, so far so good.
But what if some accounts apply transfer constraints, such as not being able to transfer a quantity greater than the current balance? I can use the query side to get the account's balance, but I'm worried that data from the query side is stale.
public class TransferApplication {
public void transfer(...) {
AccountReadModel from = accountQuery.findBy(fromId);
AccountReadModel to = accountQuery.findBy(toId);
if (from.balance() > quantity) {
//create txn
}
}
}
Should I model the account in the command side? I have to update at least three aggregate roots per db transaction(from/to account and account txn).
public class TransferApplication {
public void transfer(...) {
Account from = accountRepository.findBy(fromId);
Account to = accountRepository.findBy(toId);
Transaction txn = new Transaction(from, to, quantity);
//unit of work locks and updates all three aggregates
}
}
public class AccountingTransaction {
public AccountingTransaction(...) {
if (from.permit(quantity)) {
from.debit(quantity);
to.credit(quantity);
raise(new TransactionCreatedEvent(sequence, from, to, quantity,...));
}
}
}

There are some use cases that will not allow for eventual consistency. CQRS is fine but the data may need to be 100% consistent. CQRS does not imply/require eventual consistency.
However, the transactional/domain model store will be consistent and the balance will be consistent in that store as it represents the current state. In this case the transaction should fail anyway, irrespective of an inconsistent query side. This will be a somewhat weird user experience though so a 100% consistent approach may be better.

I remember bits of this, however M Fowler uses a different meaning of event compared to a domain event. He uses the 'wrong' term, as we can recognize a command in his 'event' definition. So basically he is speaking about commands, while a domain event is something that happened and it can never change.
It is possible that I didn't fully understand what Fowler was referring to, but I would model things differently, more precisely as close to the Domain as possible. We can't simply extract a pattern that can always be applied to any financial app; minor details may change a concept's meaning.
In OP's example, I'd say that we can have a non-explicit 'transaction': we need one account debited with an amount and another credited with the same amount. The easiest way, I think, is to implement it via a saga.
Debit_Account_A ->Account_A_Debited -> Credit_Account_B-> Account_B_Credited = transaction completed.
This should happen in a few ms, at most seconds, and this would be enough to update a read model. Humans and browsers are slower than a few seconds. And a user knows to hit F5 or to wait a few minutes/hours. I wouldn't worry much about the read model accuracy.
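The flow above could be sketched in Java roughly as follows. `LedgerAccount` and `TransferSaga` are hypothetical names, messaging is replaced by direct method calls, and failure of the credit step (which would need a compensating re-credit of account A) is not modeled.

```java
// Command-side account: the "no overdraft" check runs against its own state,
// not against a possibly stale read model.
final class LedgerAccount {
    private long balance;

    LedgerAccount(long openingBalance) {
        this.balance = openingBalance;
    }

    // A command: it may be rejected.
    boolean debit(long amount) {
        if (amount <= 0 || amount > balance) {
            return false;
        }
        balance -= amount;
        return true;
    }

    void credit(long amount) {
        balance += amount;
    }

    long balance() { return balance; }
}

final class TransferSaga {
    enum Step { STARTED, DEBITED, COMPLETED, REJECTED }

    private final LedgerAccount from;
    private final LedgerAccount to;
    private final long amount;
    private Step step = Step.STARTED;

    TransferSaga(LedgerAccount from, LedgerAccount to, long amount) {
        this.from = from;
        this.to = to;
        this.amount = amount;
    }

    // Debit_Account_A: ends the saga early if account A rejects the debit.
    void start() {
        if (step != Step.STARTED) {
            return;
        }
        step = from.debit(amount) ? Step.DEBITED : Step.REJECTED;
    }

    // Account_A_Debited -> Credit_Account_B -> Account_B_Credited
    void onAccountDebited() {
        if (step != Step.DEBITED) {
            return; // ignore duplicates and out-of-order events
        }
        to.credit(amount);
        step = Step.COMPLETED;
    }

    Step step() { return step; }
}
```

Each step touches one account, so only one aggregate is updated per database transaction, and the balance constraint is checked where the balance is authoritative.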
If the transaction is explicit, i.e. the Domain has a Transaction notion and the business really stores transactions, that's a whole different story. But even in that case, the Transaction would probably be defined by a number of account ids, some amounts and maybe a completed flag. However, at this point it is pointless to continue, because it really depends on the Domain's definition and use cases.

Fixed the answer
In the end, my solution is to have Transaction as a domain model.
Transactions are projected to AccountBalance, but I implemented a special projection which makes sure all the data is consistent before the actual event is published.

Just two words: "Event Sourcing" with the Reservation Pattern.
And maybe, but not always, you may need the "Sagas" pattern also.

CQRS/Event Sourcing, how to get consistent data to apply business rules?

At the moment I'm developing a small project using the CQRS pattern and Event Sourcing.
I have a structural issue and I'm not aware of which solution to take to resolve it.
Imagine the following example:
A command is sent with information that a client of the bank had deposited some amount of money (DepositCommand).
In the command handler/Entity/Aggregate (which is not important for the discussion) a business rule has to be applied;
If the client is among the top 10% of clients by account balance, they win a prize.
The question is how can I get up-to-date, consistent, data to know if the client after its deposit is in the top 10%?
I can't use the event store because it is not possible to make such a query;
I'm not sure if I can use the read model because it is not 100% certain that it is up to date.
How do you handle cases where you need data from a database to apply a business rule? If I don't pay attention to up-to-date data, I run the risk of giving the prize to two different clients.
Looking forward to hearing your opinion.

Any information that an aggregate requires to make business decisions should be stored as a part of the aggregate's state. As such, when a command is received to deposit money into a client's account, you should already have the current/updated state for that client, which can contain the current balance for each of their accounts.
I would also suggest that an aggregate should never go to the read model to pull information. Depending on what you are trying to achieve, you may enrich the command with additional details from the read model (where state is not critical), but the aggregate itself should be pulling from its own known state.
EDIT
After re-reading the question, I realize you are talking about tracking state across multiple aggregates. This falls in the realm of a saga. You can create a saga that tracks the threshold required to be in the top 10%. Thus, whenever a client makes a deposit, the saga can track where this places them in the ranking. If that client crosses over the threshold, you can then publish a command from the saga to indicate that they meet the criteria required.
In your case, your saga might track the total amount of all deposits, so when a deposit is made, a decision can be made as to whether or not the client is now in the top 10%. Other questions you may want to ask yourself: if the client deposits $X and immediately withdraws $Y to drop back under the threshold, what should happen? Etc.
Very crude aggregate/saga handle methods...
public class Client : Aggregate
{
public void Handle(DepositMoney command)
{
// What if the account is not known? Has insufficient funds? Is locked? etc...
// Track the minimum amount of state required to make whatever choice is required.
var account = State.Accounts[command.AccountId];
// Balance here would reflect a point in time, and should not be directly persisted to the read model;
// use an atomic update to increment the balance for the read-model in your denormalizer.
Raise(new MoneyDeposited { Amount = command.Amount, Balance = account.Balance + command.Amount });
}
public void Handle(ElevateClientStatus command)
{
// you are now a VIP... raise event to update state accordingly...
}
}
public class TopClientSaga : Saga
{
public void Handle(MoneyDeposited e)
{
// Increment the total deposits... sagas need to be thread-safe (i.e., locked while state is changing).
State.TotalDeposits += e.Amount;
//TODO: Check if client is already a VIP; if yes, nothing needs to happen...
// Depositing money itself changes the 10% threshold; what happens to clients that are no longer in the top 10%?
if (e.Balance > State.TotalDeposits * 0.10)
{
// you are a top 10% client... publish some command to do whatever needs to be done.
Publish(new ElevateClientStatus { ClientId = e.ClientId, ... });
}
}
// handle withdrawals, money transfers etc?
}
