How to enforce invariants in aggregate relationships - domain-driven-design

I am new to event sourcing and DDD and am trying to create a simple app to learn more, but I'm struggling with how to model a relationship between two aggregates.
The idea is to allow companies to create activities that can then be searched for by users.
I want to be able to enforce the rule that a company can only have so many active activities depending on their membership level.
My first approach would be to have the Company be the aggregate root, which would contain the list of Activities and easily control this. However, this means I would have to go through the Company aggregate to access every Activity, which isn't ideal, as most actions against an activity do not depend on the Company.
My second approach was to have separate Company and Activity aggregates. This means that I would have to first raise an ActivityCreated event, then an ActivityAddedToCompany event, which would throw an exception if the company is already full of Activities. This approach seems better, but I'm not sure whether needing ActivityAddedToCompany is a sign that I have not separated the aggregates correctly, since in the happy path ActivityCreated and ActivityAddedToCompany would always be stored one after the other.
Is the second approach better or am I missing something basic in Domain Driven Design?

As per your clarifications:
an Activity does not have to be created by a Company
This suggests that Activity should be an aggregate of its own. It has a lifetime separate from any other aggregate.
An Activity can only be registered to one Company
The Activity would have a reference back to the Company via an ID. Effectively, a foreign key. When it is assigned to a Company, it raises an event indicating that the assignment was made.
a Company can only have 5 Activities at any one time
If you were using a standard RDBMS to manage these rules, you would have a transaction that checks the number of Activities and either approves or rejects the addition of a new Activity. Similarly, in your domain, you can model this as a two-step process with a compensating action, much like a two-phase commit.
When you assign an Activity to a Company (AssignToCompany command), you raise an AssignedToCompany event. A ProcessManager (PM) will receive that event and send a command to Company (AssignToActivity) and the Company can either accept (AssignedToActivity) or reject that based on the count (RejectedAssignToActivity).
If the latter, the PM will receive the RejectedAssignToActivity event and send a command back to Activity to remove the company (UnassignCompany) which will raise the CompanyUnassigned event.
Optional:
The PM will receive the CompanyUnassigned event and send an UnassignFromActivity command to the Company. This way, you can unassign an activity if needed and have the Company be aware of the change.
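The flow above can be sketched in a few lines of Python. This is only an in-memory illustration under assumptions: events are plain dicts, the process manager is called synchronously instead of subscribing to a message bus, and the limit of 5 is hard-coded. The class and method names are hypothetical, chosen to mirror the command and event names used in the answer. The optional UnassignFromActivity step is left out because, in the rejection case, the Company never recorded the Activity.

```python
from dataclasses import dataclass, field
from typing import Optional

MAX_ACTIVITIES = 5  # limit taken from the question; could depend on membership level


@dataclass
class Activity:
    activity_id: str
    company_id: Optional[str] = None  # reference by ID only, like a foreign key

    def assign_to_company(self, company_id: str) -> dict:
        self.company_id = company_id
        return {"type": "AssignedToCompany",
                "activity_id": self.activity_id, "company_id": company_id}

    def unassign_company(self) -> dict:
        company_id, self.company_id = self.company_id, None
        return {"type": "CompanyUnassigned",
                "activity_id": self.activity_id, "company_id": company_id}


@dataclass
class Company:
    company_id: str
    activity_ids: set = field(default_factory=set)

    def assign_to_activity(self, activity_id: str) -> dict:
        # The invariant lives here: the Company alone decides whether it is full.
        accepted = len(self.activity_ids) < MAX_ACTIVITIES
        if accepted:
            self.activity_ids.add(activity_id)
        return {"type": "AssignedToActivity" if accepted else "RejectedAssignToActivity",
                "activity_id": activity_id, "company_id": self.company_id}


class AssignmentProcessManager:
    """Turns events into the next command; holds no business rules itself."""

    def __init__(self, activities: dict, companies: dict):
        self.activities = activities
        self.companies = companies

    def handle(self, event: dict) -> Optional[dict]:
        if event["type"] == "AssignedToCompany":
            return self.companies[event["company_id"]].assign_to_activity(event["activity_id"])
        if event["type"] == "RejectedAssignToActivity":
            # Compensating action: undo the assignment on the Activity side.
            return self.activities[event["activity_id"]].unassign_company()
        return None  # AssignedToActivity and CompanyUnassigned end the process

    def run(self, first_event: dict) -> list:
        history = [first_event]
        while (follow_up := self.handle(history[-1])) is not None:
            history.append(follow_up)
        return [e["type"] for e in history]


companies = {"c1": Company("c1")}
activities = {f"a{i}": Activity(f"a{i}") for i in range(6)}
pm = AssignmentProcessManager(activities, companies)
results = [pm.run(activity.assign_to_company("c1")) for activity in activities.values()]
```

The first five assignments end with AssignedToActivity; the sixth is rejected by the Company and rolled back on the Activity. In a real system each step would be its own transaction, so the Activity is briefly assigned before the rejection arrives.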

Related

CQRS Aggregate and Projection consistency

An aggregate can use a view; this is described in Vaughn Vernon's book:
Such Read Model Projections are frequently used to expose information to various clients (such as desktop and Web user interfaces), but they are also quite useful for sharing information between Bounded Contexts and their Aggregates. Consider the scenario where an Invoice Aggregate needs some Customer information (for example, name, billing address, and tax ID) in order to calculate and prepare a proper Invoice. We can capture this information in an easy-to-consume form via CustomerBillingProjection, which will create and maintain an exclusive instance of CustomerBillingView. This Read Model is available to the Invoice Aggregate through the Domain Service named IProvideCustomerBillingInformation. Under the covers this Domain Service just queries the document store for the appropriate instance of the CustomerBillingView.
Let's imagine our application should allow creating many users, but with unique names. The flow of commands and events:
CreateUser{Alice} command sent
UserAggregate checks UsersListView, since there are no users with name Alice, aggregate decides to create user and publish event.
UserCreated{Alice} event published // By UserAggregate
UsersListProjection processed UserCreated{Alice} // for simplicity let's think UsersListProjection just accumulates users names if receives UserCreated event.
CreateUser{Bob} command sent
UserAggregate checks UsersListView, since there are no users with name Bob, aggregate decides to create user and publish event.
UserCreated{Bob} event published // By UserAggregate
CreateUser{Bob} command sent
UserAggregate checks UsersListView, since there are no users with name Bob, aggregate decides to create user and publish event.
UsersListProjection processed UserCreated{Bob} .
UsersListProjection processed UserCreated{Bob} .
The problem is that UsersListProjection did not have time to process the first UserCreated{Bob} event, so it contained stale data, and the aggregate used this stale data. As a result, 2 users with the same name were created.
How to avoid such situations?
How to make aggregates and projections consistent?
"how to make aggregates and projections consistent?"
In the common case, we don't. Projections are consistent with the aggregate at some time in the past, but do not necessarily have all of the latest updates. That's part of the point: we give up "immediate consistency" in exchange for other (higher leverage) benefits.
The duplication that you refer to is usually solved a different way: by using conditional writes to the book of record.
In your example, we would normally design the system so that the second attempt to write Bob to our data store would fail because of a conflict. Also, we prevent duplicates from propagating by ensuring that the write to the data store happens-before any events are made visible.
What this gives us, in effect, is a "first writer wins" write strategy. The writer that loses the data race has to retry/fail/etc.
(As a rule, this depends on the idea that both attempts to create Bob write that information to the same place, using the same locks.)
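A minimal sketch of such a conditional write, assuming an in-memory event store where an append only succeeds if the stream is at the version the writer expects. The names InMemoryEventStore, ConflictError and create_user are hypothetical, and deriving the stream ID from the user name is an assumption made so that both attempts to create Bob target the same place.

```python
import threading


class ConflictError(Exception):
    """Raised when the stream is not at the version the writer expected."""


class InMemoryEventStore:
    def __init__(self):
        self._streams = {}
        self._lock = threading.Lock()  # both writers contend on the same lock

    def append(self, stream_id: str, expected_version: int, events: list) -> int:
        with self._lock:
            stream = self._streams.get(stream_id, [])
            if len(stream) != expected_version:
                raise ConflictError(
                    f"{stream_id}: expected version {expected_version}, found {len(stream)}")
            self._streams[stream_id] = stream + list(events)
            return len(self._streams[stream_id])

    def read(self, stream_id: str) -> list:
        with self._lock:
            return list(self._streams.get(stream_id, []))


def create_user(store: InMemoryEventStore, name: str) -> bool:
    """First writer wins: expected_version=0 means 'this stream must not exist yet'."""
    try:
        store.append(f"user-{name}", 0, [{"type": "UserCreated", "name": name}])
        return True
    except ConflictError:
        return False  # the loser of the race retries, fails, or reports back


store = InMemoryEventStore()
outcomes = [create_user(store, "Alice"), create_user(store, "Bob"), create_user(store, "Bob")]
```

Here the second CreateUser{Bob} is rejected by the store itself, regardless of how far behind any projection is. Events would only be published to projections after a successful append.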
A common design to reduce the probability of conflict is for the aggregate NOT to use the "read model", but instead to use its own data in the data store. That doesn't necessarily eliminate all data races, but it narrows the window.
Finally, we fall back on Memories, Guesses and Apologies.
It's important to remember in CQRS that every write model is also a read model for the reads that are required to validate a command. Those reads are:
checking for the existence of an aggregate with a particular ID
loading the latest version of an entire aggregate
In general a CQRS/ES implementation will provide that read model for you. The particulars of how that's implemented will depend on the implementation.
Those are the only reads a command-handler ever needs to perform, and if a query can be answered with no more than those reads, the query can be expressed as a command (e.g. GetUserByName{Alice}) which when handled does not emit events. The benefit of such read-only commands is that they can be strongly consistent because they are limited to a single aggregate. Not all queries, of course, can be expressed this way, and if the query can tolerate eventual consistency, it may not be worth paying the coordination tax for strong consistency that you typically pay by making it a read-only command. (Command handling limited to a single aggregate is generally strongly consistent, but there are cases, e.g. when the events form a CRDT and an aggregate can live in multiple datacenters where even that consistency is loosened).
So with that in mind:
CreateUser{Alice} received
user Alice does not exist
persist UserCreated{Alice}
CreateUser{Alice} acknowledged (e.g. HTTP 200, ack to *MQ, Kafka offset commit)
UserListProjection updated from UserCreated{Alice}
CreateUser{Bob} received
user Bob does not exist
persist UserCreated{Bob}
CreateUser{Bob} acknowledged
CreateUser{Bob} received
user Bob already exists
command-handler for an existing user rejects the command and persists no events (it may log that an attempt to create a duplicate user was made)
CreateUser{Bob} ack'd with failure (e.g. HTTP 409, ack to *MQ, Kafka offset commit)
UserListProjection updated from UserCreated{Bob}
Note that while the UserListProjection can answer the question "does this user exist?", the fact that the write-side can also (and more consistently) answer that question does not in and of itself make that projection superfluous. UserListProjection can also answer questions like "who are all of the users?" or "which users have two consecutive vowels in their name?" which the write-side cannot answer.

DDD Relate Aggregates in a long process running

I am working on a project in which we define two aggregates: "Project" and "Task". The Project, in addition to other attributes, has the points attribute. These points are distributed to the tasks as they are defined by users. In a use case, the user assigns points for some task, but the project must have these points available.
We currently model this as follows:
1. “task.RequestPoints(points)”: this method creates a PointsAssignment aggregate with the attributes points and taskId, which in its constructor issues a PointsAssignmentRequested domain event.
2. The handler of that event fetches the project related to the task and the PointsAssignment aggregate and calls “project.assignPoints(pointsAssignment, service)”; that is, it passes the PointsAssignment aggregate as a parameter, along with a service that calculates the difference between the task's current points and the desired points.
3. If points are available, the project modifies its points attribute and issues a “ProjectPointsAssigned” domain event that contains the pointsAssignmentId attribute (in addition to others).
4. The handler of this event fetches the PointsAssignment and confirms it with “pointsAssignment.Confirm()”; this aggregate then issues a PointsAssignmentConfirmed domain event.
5. The handler of this last event loads the associated task and calls “task.AssignPoints(pointsAssignment.points)”.
My question is: is it correct to pass in step 2 the aggregate PointsAssignment in the project method? That was the only way I found to be able to relate the aggregates.
Note: We have created the PointsAssignment aggregate so that in case of failure I could save the error “pointsAssignment.Reject(reasonText)” and display it to the user, since I am using eventual consistency (1 aggregate per transaction).
We thought about using a Process Manager (PointsAssignmentProcess), but either way we would still need the third aggregate, PointsAssignment, to correlate the process.
I would do it a little bit differently (that doesn't mean more correctly).
Your project doesn't need to know anything about the PointsAssignment.
If your project is the one that has the available points for use, it can have simple methods of removing or adding points.
RemovePointsCommand -> project->removePoints(points)
AddPointsCommand -> project->addPoints(points)
Then, you would have an eventHandler that reacts to PointsAssignmentRequested (I imagine this event carries the ID of the project, the number of points, and maybe a status field, from what you said).
This eventHandler would only do:
on(PointsAssignmentRequested) -> dispatch command (RemovePointsCommand)
// Note that here it would be wise for the client to send an ID for this operation, so that it can be tracked asynchronously.
That command can either succeed or fail, and both outcomes can dispatch events:
RemovePointsSucceeded
RemovePointsFailed
// Remember that you have a correlation id from earlier persisted
Then, you would have a final eventHandler that would do:
on(RemovePointsSucceeded) -> PointsAssignment.succeed() // Dispatches PointsAssignmentSucceeded
on(PointsAssignmentSucceeded) -> task.AssignPoints(pointsAssignment.points)
On the fail side
on(RemovePointsFailed) -> PointsAssignment.fail() // Dispatches PointsAssignmentFailed
This way you don't have to mix aggregates together. All they know are each other's IDs, and they can work without knowing anything about the schema of other aggregates, avoiding undesired coupling.
I see the semantics of this problem as exactly those of a bank transfer.
You have the bank account (project)
You have money in this bank account(points)
You are transferring money through a transfer process (pointsAssignment)
You are transferring money to an account (task)
The bank account should only have minimal operations, withdrawing and depositing; it does not need to know anything about the transfer process.
The transfer process needs to know which account it is withdrawing from and which account it is depositing to.
I imagine your PointsAssignment being like
{
"projectId":"X",
"taskId":"Y",
"points" : 10,
"status" : ["issued", "succeeded", "failed"]
}
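As a rough illustration of the handlers described above, here is a Python sketch in which the PointsAssignment plays the role of the transfer. Everything runs in memory and synchronously; in a real system each arrow would be a separate handler and transaction. The snake_case names are hypothetical equivalents of the commands and events in the answer, and the sketch assumes that assigning points to a task adds to its current total.

```python
from dataclasses import dataclass


@dataclass
class Project:
    project_id: str
    points: int

    def remove_points(self, points: int, correlation_id: str) -> dict:
        # The project only knows about its own balance, not about assignments.
        if points > self.points:
            return {"type": "RemovePointsFailed", "correlation_id": correlation_id,
                    "reason": "not enough points available"}
        self.points -= points
        return {"type": "RemovePointsSucceeded", "correlation_id": correlation_id}


@dataclass
class Task:
    task_id: str
    points: int = 0

    def assign_points(self, points: int) -> None:
        self.points += points  # assumption: points are added to the task's total


@dataclass
class PointsAssignment:
    assignment_id: str  # doubles as the correlation ID of the whole operation
    project_id: str
    task_id: str
    points: int
    status: str = "issued"

    def succeed(self) -> None:
        self.status = "succeeded"

    def fail(self) -> None:
        self.status = "failed"


def on_points_assignment_requested(assignment, projects, tasks) -> str:
    """Collapses the chain of event handlers into one function for readability."""
    project = projects[assignment.project_id]
    result = project.remove_points(assignment.points, assignment.assignment_id)
    if result["type"] == "RemovePointsSucceeded":
        assignment.succeed()
        tasks[assignment.task_id].assign_points(assignment.points)
    else:
        assignment.fail()
    return assignment.status


projects = {"X": Project("X", points=10)}
tasks = {"Y": Task("Y")}
first = on_points_assignment_requested(PointsAssignment("pa-1", "X", "Y", 6), projects, tasks)
second = on_points_assignment_requested(PointsAssignment("pa-2", "X", "Y", 6), projects, tasks)
```

The first request succeeds and leaves the project with 4 points; the second fails because only 4 points remain, and neither the project nor the task changes.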

How to perform validation across services in microservices

Suppose there are two microservices: Order and Inventory. There is an API in the Order service that takes ProductId, Qty, etc. and places the order.
Ideally, an order should only be placed if inventory exists in the Inventory service. People recommend the Saga pattern or some other form of distributed transaction. That is fine, and eventual consistency will be used.
But what if somebody wants to abuse the system? They can push orders with products (ProductIds) that are either invalid or out of inventory. The system will accept all these orders and place them in a queue, and the Inventory service will end up handling these invalid orders.
Shouldn't this be handled upfront (in the Order service) rather than pushing these invalid orders to the next level (especially where the ProductId is invalid)?
What are the recommendations to handle these scenarios?
"What are the recommendations to handle these scenarios?"
Give your order service access to the data that it needs to filter out undesirable orders.
The basic plot would be that, while the Inventory service is the authority for the state of inventory, your Orders service can work with a cached copy of the inventory to determine which orders to accept.
Changes to the Inventory are eventually replicated into the cache of the Orders service -- that's your "eventual consistency". If Inventory drops offline for a time, Orders can continue providing business value based on the information in its cache.
You may want to pay attention to the age of the data in the cache as well -- if too much time has passed since the cache was last updated, then you may want to change strategies.
Your "aggregates" won't usually know that they are dealing with a cache; you'll pass along with the order data a domain service that supports the queries that the aggregate needs to do its work; the implementation of the domain service accesses the cache to provide answers.
So long as you don't allow the abuser to provide his own instance of the domain service, or to directly manipulate the cache, then the integrity of the cached data is ensured.
(For example: when you are testing the aggregate, you will likely be providing cached data tuned to your specific test scenario; that sort of hijacking is not something you want the abuser to be able to achieve in your production environment).
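A small sketch of that idea, assuming the Orders service keeps an in-memory cache that is updated from replicated Inventory events. CachedInventoryService, place_order, the 60-second maximum age and the "needs_live_check" fallback are all hypothetical choices for illustration; what to do with stale data is a business decision.

```python
import time
from dataclasses import dataclass


@dataclass
class CachedStock:
    quantity: int
    updated_at: float  # timestamp of the last replicated change, in seconds


class CachedInventoryService:
    """Domain service handed to the order logic; it hides the fact that it reads a cache."""

    def __init__(self, max_age_seconds: float, clock=time.time):
        self._cache = {}
        self._max_age = max_age_seconds
        self._clock = clock  # injectable so that tests can control time

    def apply_inventory_changed(self, product_id: str, quantity: int) -> None:
        # Called when a change replicated from the Inventory service arrives.
        self._cache[product_id] = CachedStock(quantity, self._clock())

    def check(self, product_id: str, quantity: int) -> str:
        entry = self._cache.get(product_id)
        if entry is None:
            return "unknown_product"
        if self._clock() - entry.updated_at > self._max_age:
            return "stale"
        return "available" if entry.quantity >= quantity else "insufficient"


def place_order(product_id: str, quantity: int, inventory: CachedInventoryService) -> str:
    decision = inventory.check(product_id, quantity)
    if decision == "available":
        return "accepted"  # the saga with Inventory still confirms this later
    if decision == "stale":
        return "needs_live_check"  # assumption: switch strategy when the cache is too old
    return "rejected"  # invalid ProductId or not enough stock: filtered out upfront


now = [1000.0]
inventory = CachedInventoryService(max_age_seconds=60, clock=lambda: now[0])
inventory.apply_inventory_changed("p1", 3)
fresh = [place_order("p1", 2, inventory), place_order("p1", 5, inventory),
         place_order("no-such-product", 1, inventory)]
now[0] += 120  # no replication for two minutes
after_outage = place_order("p1", 2, inventory)
```

Because the clock is injected, a test can simulate an Inventory outage without sleeping, which is the kind of test-only control that should not be reachable in production.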
You most definitely would want to ensure up-front that you can catch as many invalid business cases as possible. There are a couple ways to deal with this. It is the same situation as one would have when booking a seat on an airline. Although they do over-booking which we'll ignore for now :)
Option 1: You could reserve an inventory item as part of the order. This is more of a pessimistic approach, but your item would be reserved while you wait for the order to be confirmed.
Option 2: You could accept the order only if there is an inventory item available but not reserve it and hope it is available later.
You could also create a back-order if the inventory item isn't available and you want to support back-orders.
If you go with option 1 you could miss out on a customer if an item has been reserved for customer A and customer B comes along and cannot order. If customer A decides not to complete the order the inventory item becomes available again but customer B has now gone off somewhere else to try and source the item.
As part of the fulfillment of your order you have to inform the inventory bounded context that you are now taking the item. However, you may now find that both customer A and B have accepted their quote and created an order for the last item. One is going to lose out. At this point the one not able to be fulfilled will send a mail to the customer and inform them of the unfortunate situation and perhaps create a back-order; or ask the customer to try again in X-number of days.
Your domain experts should make the call as to how to handle the scenarios and it all depends on item popularity, etc.
I will not try to convince you to not do this checking before placing an order and to rely on Sagas as it is usually done; I will consider that this is a business requirement that you must implement.
This seems like a new sub-domain to me: bad-behavior-prevention (or whatever you want to call it), which comes with a new responsibility: to stop abusers. You could add this responsibility to the Order microservice, but you would break the SRP. So, it should be done in another microservice.
This new microservice is called from your API Gateway (if you have one) or from the Orders microservice.
If you do not want to add a new microservice (for whatever reason), then you could implement this new functionality as a module inside the Orders microservice, but I strongly recommend keeping it highly decoupled from its host (separate and private persistence/database/table).

How are consistency violations handled in event sourcing?

First of all, let me state that I am new to Command Query Responsibility Segregation and Event Sourcing (Message-Driven Architecture), but I'm already seeing some significant design benefits. However, there are still a few issues on which I'm unclear.
Say I have a Customer class (an aggregate root) that contains a property called postalAddress (an instance of the Address class, which is a value object). I also have an Order class (another aggregate root) that contains (among OrderItem objects and other things) a property called deliveryAddress (also an instance of the Address class) and a string property called status.
The customer places an order by issuing a PlaceOrder command, which triggers the OrderReceived event. At this point in time, the status of the order is "RECEIVED". When the order is shipped, someone in the warehouse issues a ShipOrder command, which triggers the OrderShipped event. At this point in time, the status of the order is "SHIPPED".
One of the business rules is that if a Customer updates their postalAddress before an order is shipped (i.e., while the status is still "RECEIVED"), the deliveryAddress of the Order object should also be updated. If the status of the Order were already "SHIPPED", the deliveryAddress would not be updated.
Question 1. Is the best place to put this "conditionally cascading address update" in a Saga (a.k.a., Process Manager)? I assume so, given that it is translating an event ("The customer just updated their postal address...") to a command ("... so update the delivery address of order 123").
Question 2. If a Saga is the right tool for the job, how does it identify the orders that belong to the user, given that an aggregate can only be retrieved by its unique ID (in my case a UUID)?
Continuing on, given that each aggregate represents a transactional boundary, if the system were to crash after the Customer's postalAddress was updated (the CustomerAddressUpdated event being persisted to the event store) but before the Order's deliveryAddress could be updated and OrderDeliveryAddressUpdated persisted (i.e., between the two transactions), then the system is left in an inconsistent state.
Question 3. How are such "violations" of consistency rules detected and rectified?
In most instances the delivery address of an order should be independent of any other data change, as a customer may want the order sent to an arbitrary address. That being said, I'll give my 2c on how you could approach this:
Is the best place to handle this in a process manager?
Yes. You should have an OrderProcess.
How would one get hold of the correct OrderProcess instance given that it can only be retrieved by aggregate id?
There is nothing preventing one from adding any additional lookup mechanism that associates data to an aggregate id. In my experimental, going-live-soon, mechanism called shuttle-recall I have an IKeyStore mechanism that associates any arbitrary key to an AR Id. So you would be able to associate something like [order-process]:customerId=CID-123; as a key to some aggregate.
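To make the lookup idea concrete, here is a hypothetical in-memory key store in Python. It is not the shuttle-recall IKeyStore API; the class names, the method names and the choice to map one key to a set of order IDs are assumptions for illustration. The process manager registers an order under a customer key when it is received, drops it when it ships, and uses the key to find the orders affected by an address change.

```python
from collections import defaultdict


class InMemoryKeyStore:
    """Associates arbitrary string keys with aggregate IDs (hypothetical sketch)."""

    def __init__(self):
        self._ids_by_key = defaultdict(set)

    def add(self, key: str, aggregate_id: str) -> None:
        self._ids_by_key[key].add(aggregate_id)

    def remove(self, key: str, aggregate_id: str) -> None:
        self._ids_by_key[key].discard(aggregate_id)

    def find(self, key: str) -> set:
        return set(self._ids_by_key.get(key, ()))


def order_key(customer_id: str) -> str:
    return f"[order-process]:customerId={customer_id}"


class AddressUpdateProcess:
    """Translates CustomerAddressUpdated into commands for orders not yet shipped."""

    def __init__(self, key_store: InMemoryKeyStore):
        self._keys = key_store

    def on_order_received(self, customer_id: str, order_id: str) -> None:
        self._keys.add(order_key(customer_id), order_id)

    def on_order_shipped(self, customer_id: str, order_id: str) -> None:
        self._keys.remove(order_key(customer_id), order_id)

    def on_customer_address_updated(self, customer_id: str, new_address: str) -> list:
        order_ids = sorted(self._keys.find(order_key(customer_id)))
        return [{"type": "UpdateDeliveryAddress", "order_id": order_id, "address": new_address}
                for order_id in order_ids]


process = AddressUpdateProcess(InMemoryKeyStore())
process.on_order_received("CID-123", "order-1")
process.on_order_received("CID-123", "order-2")
process.on_order_shipped("CID-123", "order-1")
commands = process.on_customer_address_updated("CID-123", "1 New Street")
```

Only order-2 receives an UpdateDeliveryAddress command, since order-1 had already shipped. The Order aggregate should still check its own status when handling the command, because the key store may lag behind.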
How are such "violations" of consistency rules detected and rectified?
In most cases they could be handled out-of-band, if possible. Should I order something from Amazon and attempt to change my address after the order has shipped, the order is still going to the original address. In your case of linking the customer postal address to the active order address, you could notify the customer that n orders have had their addresses updated but that a recent order (within some tolerance) has not.
As for the system going down before processing, you should have some guaranteed delivery mechanism to handle this. I do not regard these domain events in the same way I regard system events in a messaging infrastructure such as a service bus.
Just some thoughts :)

How to model the requirement using domain driven design

I have a requirement where I need to group two events into one transaction, based on certain criteria. Below are some thoughts on the requirement.
Events:
We will receive events continuously in our systems.
Each event will have some buffer time in which it can be grouped with another event.
If buffer time elapses then we need to discard the event.
We need to group the two events into one group depending on the two events information.
If the event information is not sufficient, then we will send the event info to another component, which will respond with corrected data.
When grouping events, we sometimes want to hold an event back if its related event has gone to the data-correcting component, even though we are not 100% sure about the matching criteria. We do this because we want to match as many events as possible.
I want to model this requirement using domain driven design. Any suggestions will be appreciated.
Without knowing your business requirements, it's kind of hard to answer. But we can start with assumptions and definitions first:
I refer to an event in DDD as something that is important for your domain, has happened (in the past), is an undeniable fact and cannot be undone.
In my definition either aggregates or domain services are responsible for emitting events.
So your group of events looks like a concept that says that a group of related events is something important to my domain, too.
I guess you can go two ways to think about that concept:
A group is a special view on your already happened events. Then a group is just a component whose state is derived from a list of related events.
A group is an aggregate that is a kind of process that has a life cycle and, based on its state, emits a single group event when the criteria for finishing a group are met.
In the first case you can implement a group query that listens to published events and projects them to your group concept.
In the second case you have an aggregate that reacts to business requests (you can call this a command) and manages some persistent state. When you request your aggregate to create a group and your aggregate is in the right state to do this, then your aggregate emits a group event.
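The second option might look roughly like the following Python sketch. The business rules are unknown, so several things here are assumptions: two events match when they share a match_key, the buffer time is one fixed duration for all events, and the names GroupingAggregate, IncomingEvent and EventsGrouped are hypothetical. The data-correction step from the question is left out.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IncomingEvent:
    event_id: str
    match_key: str      # assumption: events with the same key belong together
    received_at: float  # seconds


class GroupingAggregate:
    """Holds unmatched events for a buffer time and emits one group event per match."""

    def __init__(self, buffer_seconds: float):
        self.buffer_seconds = buffer_seconds
        self.pending = {}  # match_key -> IncomingEvent waiting for its partner

    def _discard_expired(self, now: float) -> None:
        self.pending = {key: event for key, event in self.pending.items()
                        if now - event.received_at <= self.buffer_seconds}

    def receive(self, event: IncomingEvent, now: float) -> Optional[dict]:
        self._discard_expired(now)
        partner = self.pending.pop(event.match_key, None)
        if partner is not None:
            return {"type": "EventsGrouped", "event_ids": (partner.event_id, event.event_id)}
        self.pending[event.match_key] = event  # wait for a partner until the buffer elapses
        return None


group = GroupingAggregate(buffer_seconds=10)
r1 = group.receive(IncomingEvent("e1", "k", received_at=0), now=0)
r2 = group.receive(IncomingEvent("e2", "k", received_at=5), now=5)     # within buffer: grouped
r3 = group.receive(IncomingEvent("e3", "j", received_at=5), now=5)
r4 = group.receive(IncomingEvent("e4", "j", received_at=100), now=100)  # e3 expired long ago
```

Here e1 and e2 are grouped, while e3 is discarded before e4 arrives, so e4 starts waiting on its own.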

Resources