Repository and bulk update - avoiding database roundtrips

I work on a wholesale system domain. When some products are delivered, a NewProductsDeliveredEvent domain event is triggered. The event contains a set of ProductDelivery value objects, each holding a product code and a quantity. Something like below:
class NewProductsDeliveredEvent {
    Set<ProductDelivery> productDeliveries;
}

class ProductDelivery {
    ProductCode productCode;
    Quantity quantity;
}
So far so good. Now, when the component responsible for inventory updates receives this type of event, it must update the products table with the current quantity of available products. So I have something like this:
class NewProductsDeliveredHandler {
    ProductRepository productRepo;

    void handle(NewProductsDeliveredEvent event) {
        for (ProductDelivery delivery : event.getProductDeliveries()) {
            Product product = productRepo.getByCode(delivery.getProductCode());
            product.updateQuantity(delivery.getQuantity());
        }
    }
}
It's easy to spot that such logic generates a lot of DB roundtrips, and I'm thinking about a solution to alleviate the pain. One idea might be to use the Specification pattern and build an OR specification over the product codes. However, in my application the product code is a business identifier, so this solution smells a little (maybe I'm just exaggerating).
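One sketch of the bulk-query alternative I have in mind; findByCodes and Product.getCode() are hypothetical additions to my repository and aggregate:

import java.util.*;
import static java.util.stream.Collectors.toMap;

class NewProductsDeliveredHandler {
    ProductRepository productRepo; // assumed to offer a bulk findByCodes

    void handle(NewProductsDeliveredEvent event) {
        Map<ProductCode, Quantity> byCode = event.getProductDeliveries().stream()
                .collect(toMap(ProductDelivery::getProductCode,
                               ProductDelivery::getQuantity));

        // One roundtrip: the repository translates the code set into a
        // single SELECT ... WHERE code IN (...) under the hood.
        for (Product product : productRepo.findByCodes(byCode.keySet())) {
            product.updateQuantity(byCode.get(product.getCode()));
        }
    }
}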
Is there any better way to handle it? Any ideas greatly appreciated.

If you'll allow a slight digression: are you sure that a bulk update is a good idea in your case?
Product is a high-contention aggregate if it manages inventory. Just imagine that hundreds of people may be placing orders for the same product at the same time on Amazon.com, whereas few people will modify the same order at the same time.
Take an example:
Event1: A-5, B-1
Event2: C-1, D-2
Event3: A-2, D-3
Event1 conflicts with Event3, Event2 conflicts with Event3
The more products you update in one transaction, the greater the chance of a concurrency failure if your products are selling well.
Iterating with one product per transaction is even worse; it makes the event harder to retry:
handle(NewProductsDeliveredEvent event) {
    for (ProductDelivery delivery : event.getProductDeliveries()) {
        updateProductTransactionally();
        // How to retry the whole event
        // when the second one failed and the first one committed?
    }
}
Maybe splitting the event into multiple sub-events, each of which triggers only one product update, is more appropriate.
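A rough sketch of that splitting (the publisher port and the sub-event are made-up names, not an existing API):

class NewProductsDeliveredHandler {
    EventPublisher publisher; // hypothetical publishing port

    void handle(NewProductsDeliveredEvent event) {
        // Fan out one sub-event per product; each can fail and be
        // retried independently without re-running the others.
        for (ProductDelivery delivery : event.getProductDeliveries()) {
            publisher.publish(new SingleProductDeliveredEvent(
                    delivery.getProductCode(), delivery.getQuantity()));
        }
    }
}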


DDD: Can aggregates get other aggregates as parameters?

Assume that I have two aggregates: Vehicle and Driver, and I have a rule that a vehicle cannot be assigned to a driver if the driver is on vacation.
So, my implementation is:
class Vehicle {
    public void assignDriver(Driver driver) {
        if (driver.isInVacation()) {
            throw new Exception();
        }
        // ....
    }
}
Is it ok to pass an aggregate to another one as a parameter? Am I doing anything wrong here?
I'd say your design is perfectly valid and reflects the Ubiquitous Language very well. There are several examples in the Implementing Domain-Driven Design book where an AR is passed as an argument to another AR.
e.g.
Forum#moderatePost: Post is not only provided to Forum, but modified by it.
Group#addUser: User provided, but translated to GroupMember.
If you really want to decouple, you could also do something like vehicle.assignDriver(driver.id(), driver.isInVacation()), or introduce some kind of intermediary VO that holds only the state from Driver necessary to make an assignment decision.
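For example, such an intermediary VO could be as small as this (a sketch; the names are illustrative, not from the question):

// Immutable snapshot of the only Driver state the assignment needs.
final class DriverAvailability {
    private final DriverId driverId;
    private final boolean onVacation;

    DriverAvailability(DriverId driverId, boolean onVacation) {
        this.driverId = driverId;
        this.onVacation = onVacation;
    }

    DriverId driverId() { return driverId; }
    boolean isOnVacation() { return onVacation; }
}

// Usage: vehicle.assignDriver(new DriverAvailability(driver.id(), driver.isInVacation()));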
However, note that any decision made using external data is considered stale. For instance, what happens if the driver goes on vacation right after he's been assigned to a vehicle?
In such cases you may want to use exception reports (e.g. list all vehicles with an unavailable driver), flag vehicles for driver re-assignment, etc. Eventual consistency could be achieved either through batch processing or messaging (event processing).
You could also seek to make the rule strongly consistent by inverting the relationship, where Driver keeps a set of vehicleIds it drives. Then you could use a DB unique constraint to ensure the same vehicle doesn't have more than one driver assigned. You could also violate the rule of modifying only one AR per transaction and model the two-way relationship to protect both invariants in the model.
However, I'd advise you to think of the real-world scenario here. I doubt you can prevent a driver from going away. The system must reflect the real world, which is probably the book of record for that scenario, meaning the best you can do with strong consistency is probably to unassign a driver from all his vehicles while he's away. In that case, is it really important that vehicles get unassigned immediately in the same TX, or could a delay be acceptable?
In general, an aggregate should keep its own boundaries (to avoid data-load and transaction-scoping issues), and therefore only reference another aggregate by identity, e.g. assignDriver(Guid driverId).
That means you would have to query the driver prior to invoking assignDriver, in order to perform the validation check:
class MyAppService {
    public void execute() {
        // Get driver...
        if (driver.isInVacation()) {
            throw new Exception();
        }
        // Get vehicle...
        vehicle.assignDriver(driver.id);
    }
}
Suppose you're in a micro-services architecture: you have a 'Driver Management' service and an 'Assignation Service', and you're not sharing code between the two apart from technical libraries.
You'll naturally have two classes for 'Driver':
an aggregate in 'Driver Management', which holds the operations to manage the state of a driver;
and a value object in the 'Assignation Service', which contains only the information relevant for assignation.
This separation is harder to see/achieve in a monolithic codebase.
I also agree with @plalx: there's more to enforcing the rule than a check on assignment, for which you could implement one of the solutions he suggested.
I encourage you to think in events, what happens when:
a driver has scheduled vacation
when he's back from vacation
if he changes his vacation dates
Did you explore creating an Aggregate for Assignation?

How to perform validation across services in microservices

Suppose there are two microservices: Order and Inventory. There is an API in the order service that takes a ProductId, quantity, etc. and places the order.
Ideally an order should only be allowed to be placed if inventory exists in the inventory service. People recommend the Saga pattern or other forms of distributed transactions. That is fine, and eventual consistency will be utilized.
But what if somebody wants to abuse the system? He can push orders with products (ProductIds) which are either invalid or out of inventory. The system will take all these orders, place them in a queue, and the inventory service will have to handle these invalid orders.
Shouldn't this be handled upfront (in the order service) rather than pushing these invalid orders to the next level (especially when the productId is invalid)?
What are the recommendations to handle these scenarios?
What are the recommendations to handle these scenarios?
Give your order service access to the data that it needs to filter out undesirable orders.
The basic plot would be that, while the Inventory service is the authority for the state of inventory, your Orders service can work with a cached copy of the inventory to determine which orders to accept.
Changes to the Inventory are eventually replicated into the cache of the Orders service -- that's your "eventual consistency". If Inventory drops offline for a time, Orders can continue providing business value based on the information in its cache.
You may want to pay attention to the age of the data in the cache as well -- if too much time has passed since the cache was last updated, then you may want to change strategies.
Your "aggregates" won't usually know that they are dealing with a cache; you'll pass along with the order data a domain service that supports the queries the aggregate needs to do its work; the implementation of the domain service accesses the cache to provide the answers.
So long as you don't allow the abuser to provide his own instance of the domain service, or to directly manipulate the cache, then the integrity of the cached data is ensured.
(For example: when you are testing the aggregate, you will likely be providing cached data tuned to your specific test scenario; that sort of hijacking is not something you want the abuser to be able to achieve in your production environment).
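Concretely, that might look something like the sketch below; the interface, the cache type, and all names are invented for illustration:

// Domain service the Order aggregate works against; it has no idea
// the answers come from a local cache of Inventory data.
interface InventoryAvailability {
    boolean isAvailable(ProductId productId, int quantity);
}

// Infrastructure implementation backed by the replicated cache.
class CachedInventoryAvailability implements InventoryAvailability {
    private final InventoryCache cache; // hypothetical read-side store

    CachedInventoryAvailability(InventoryCache cache) {
        this.cache = cache;
    }

    @Override
    public boolean isAvailable(ProductId productId, int quantity) {
        return cache.quantityOnHand(productId) >= quantity;
    }
}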
You most definitely would want to ensure up-front that you can catch as many invalid business cases as possible. There are a couple of ways to deal with this. It is the same situation as one would have when booking a seat on an airline flight, although they do over-booking, which we'll ignore for now :)
Option 1: You could reserve an inventory item as part of the order. This is the more pessimistic approach, but your item would be reserved while you wait for the order to be confirmed.
Option 2: You could accept the order only if there is an inventory item available but not reserve it and hope it is available later.
You could also create a back-order if the inventory item isn't available and you want to support back-orders.
If you go with option 1 you could miss out on a customer if an item has been reserved for customer A and customer B comes along and cannot order. If customer A decides not to complete the order, the inventory item becomes available again, but customer B has now gone off somewhere else to try to source the item.
As part of the fulfillment of your order you have to inform the inventory bounded context that you are now taking the item. However, you may now find that both customer A and customer B have accepted their quotes and created an order for the last item. One is going to lose out. At this point the one that cannot be fulfilled will send a mail to the customer informing them of the unfortunate situation, and perhaps create a back-order or ask the customer to try again in X days.
Your domain experts should make the call as to how to handle the scenarios and it all depends on item popularity, etc.
I will not try to convince you not to do this checking before placing an order and to rely on Sagas as is usually done; I will take it that this is a business requirement you must implement.
This seems like a new sub-domain to me: bad-behavior prevention (or whatever you want to call it), which comes with a new responsibility: preventing abusers. You could add this responsibility to the Order microservice, but you would break the SRP. So it should be done in another microservice.
This new microservice is called from your API Gateway (if you have one) or from the Orders microservice.
If you do not want to add a new microservice (for whatever reason), then you could implement this new functionality as a module inside the Orders microservice, but I strongly recommend making it highly decoupled from its host (separate and private persistence/database/tables).

Event Sourcing: proper way of rolling back aggregate state

I'm looking for an advice related to the proper way of implementing a rollback feature in a CQRS/event-sourcing application.
This application allows a group of editors to edit and update some editorial content, editorial news for instance. We implemented the user interface so that each field has an auto-save feature, and now we would like to give our users the possibility to undo the operations they did, so that it is possible to roll back an editorial news item to a previous known state.
Basically we would like to implement something like the undo command that you have in Microsoft Word and similar text editors. In the backend, the editorial news is an instance of an aggregate defined in our domain and called Story.
We have discussed some ideas to implement the rollback and we are looking for advice based on real-world experience in similar projects. Here are our considerations about this feature.
How rollback works in real world business domains
First of all, we all know that in real world business domains what we are calling rollback is obtained via some form of compensation event.
Imagine a domain related to some sort of service for which it is possible to buy a subscription: we could have an aggregate representing a user subscription and an event describing that a charge has been associated to an instance of the aggregate (the particular subscription of one of the customers). A possible implementation of the event is as follows:
public class ChargeAssociatedToSubscriptionEvent : DomainEvent
{
    public Guid SubscriptionId { get; set; }
    public decimal Amount { get; set; }
    public string Description { get; set; }
    public DateTime DueDate { get; set; }
}
If a charge is wrongly associated to a subscription, it is possible to fix the error by means of an accreditation associated to the same subscription and having the same amount, so that the effect of the charge is completely balanced and the user gets back their money. In other words, we could define the following compensation event:
public class AccreditationAssociatedToSubscription : DomainEvent
{
    public Guid SubscriptionId { get; set; }
    public decimal Amount { get; set; }
    public string Description { get; set; }
    public DateTime AccreditationDate { get; set; }
}
So if a user is wrongly charged for an amount of 50 dollars, we can compensate the error by means of an accreditation of 50 dollars to the user subscription: this way the state of the aggregate has been rolled back to the previous state.
Why things are not as easy as they seem
Based on the previous discussion, the rollback seems quite easy to implement. If you have an instance of the Story aggregate at aggregate revision B and you want to roll it back to a previous aggregate revision, say A (with A < B), you just have to do the following steps:
check the event store and get all the events between revisions A and B
compute the compensation event for each of the occurred events
apply the compensation events to the aggregate in the reverse order
Unfortunately, the second step of the previous procedure is not always possible: given a generic domain event, it is not always possible to compute its compensation event, because the amount of information contained inside the event may not be enough to do so. Maybe it is possible to wisely define all the events so that they contain enough information to compute the corresponding compensation event, but at the current state of our application there are several events for which computing the compensation event is not possible, and we would prefer to avoid changing the shape of our events.
A possible solution based on state comparison
The first idea to overcome the issues with compensation events is to compute the minimum set of events needed to roll back the aggregate by comparing the current state of the aggregate with the target state. The algorithm is basically the following:
get an instance of the aggregate at the current state (call it B)
get an instance of the aggregate at the target state (call it A) by applying only the first n events persisted inside the event store (our repository allows us to do that by specifying the aggregate id and the desired point in time at which to materialize the aggregate)
compare the two instances and compute the minimum set of events to be applied to the aggregate in the state B in order to change its state to A
apply the computed events to the aggregate
A smarter approach based on event replay
Another way to solve the problem of rolling back to a previous state of the aggregate could be to do the same thing the aggregate repository does when an aggregate is materialized at a specific point in time. To do that, we would define an event, say StoryResettedEvent, whose effect is to reset the state of the aggregate by completely emptying it, and then do the following steps (a sketch in code follows the list):
apply the StoryResettedEvent to our aggregate so that its state is emptied
get the first n events for the aggregate we are working on (all the events from the first saved event up to the target state A)
apply all the events to the aggregate instance
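In code, the replay approach might look roughly like the following. This is only a sketch (in Java rather than the C# above); applyAndRecord and eventsFor are hypothetical names for "apply to in-memory state and stage for persistence" and "load the first n events of the stream":

void rollBackTo(Story story, int targetRevision, EventStore store) {
    // 1. Empty the aggregate state via the artificial reset event.
    story.applyAndRecord(new StoryResettedEvent());

    // 2. Replay history up to the target revision A.
    for (DomainEvent e : store.eventsFor(story.getId(), targetRevision)) {
        story.applyAndRecord(e);
    }
}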
The main problem I see with this approach is the event to empty the state of the aggregate: it seems somewhat artificial, not a real domain event with a business meaning, but rather a trick to implement the rollback functionality.
The third way: persisting the compensation event each time an event is saved inside the event store
The third way we figured out to get what we need is based again on the concept of compensation event. The basic idea is that each event of the application could be enriched with a property containing the corresponding compensation event.
At the point in the code where an event is raised, it is possible to immediately compute the compensation event for the event to be raised (based on the current state of the aggregate and the shape of the event), so that the event can be enriched with this information, which will then be saved inside the event store. By doing so, the compensation events are always available, ready to be used in case of a rollback request. The downside of this solution is that each domain event must be modified, and only a minimal part of the compensation events we compute and save inside the event store will ever be useful for an actual rollback (most of them will never be used).
Conclusions
In my opinion the best option to solve the problem is using the algorithm based on state comparison (the first proposed solution), but we are still evaluating what to do.
Has anyone already had a similar requirement? Is there any other way to implement a rollback? Are we completely missing the point and following a bad approach to the problem?
Thanks for helping; any advice will be appreciated.
How the compensation events are generated should be the concern of the Story aggregate (after all, that's the point of an aggregate in event sourcing - it's just the validator of commands and generator of events for a particular stream).
Presumably you are following something like a typical CQRS/ES flow:
the client sends an Undo command, which presumably says what version it wants to undo back to, and what story it is targeting
the Undo command handler loads the Story aggregate in the usual way, possibly from a snapshot and/or by applying the aggregate's events to the aggregate
In some way, the command is passed to the aggregate (possibly a method call with args extracted from the command, or just passing the command directly to the aggregate)
The aggregate "returns" in some way the events to persist, assuming the undo command is valid. These are the compensating events.
compute the compensation event for each of the occurred events
...
Unfortunately, the second step of the previous procedure is not always possible
Why not? The aggregate has been passed all previous events, so what does it need that it doesn't have? The aggregate doesn't just see the events you want to roll back, it necessarily processes all events for that aggregate ever.
You have two options really - reduce the book-keeping that the aggregate needs to do by having the command handler help out in some way, or the whole process is managed internally by the aggregate.
Command handler helps out:
The command handler extracts from the command the version the user wants to roll back to, and then recreates the aggregate as of that version (applying events in the usual way), in addition to creating the current aggregate. Then the old aggregate gets passed to the aggregate's undo method along with the command, so that the aggregate can do the state comparison more easily.
You might consider this to be a bit hacky, but it seems moderately harmless, and could significantly simplify the aggregate code.
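A sketch of that flow (Java; Story.replay, undoTo and the store API are illustrative names, not an actual framework):

class UndoStoryHandler {
    EventStore store;

    void handle(UndoCommand cmd) {
        // Current aggregate, and the aggregate as of the target version.
        Story current = Story.replay(store.eventsFor(cmd.storyId()));
        Story target = Story.replay(store.eventsFor(cmd.storyId(), cmd.targetVersion()));

        // The aggregate compares itself to the older instance and
        // emits the compensating events needed to get back there.
        List<DomainEvent> compensating = current.undoTo(target, cmd);
        store.append(cmd.storyId(), compensating);
    }
}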
Aggregate is on its own:
As events are applied to the aggregate, it adds to its state whatever book-keeping it needs to be able to compute the compensating events if it receives an undo command. This could be a map of compensating events, pre-computed, a list of every previous state that can potentially be reverted to (to allow state comparison), the list of events the aggregate has processed (so it can compute the previous state itself in the undo method), or whatever it needs, and it just stores it in its in-memory state (and snapshot state, if applicable).
The main concern with the aggregate doing it on its own is performance - if the size of the book-keeping state is large, the simplification of allowing the command handler to pass the previous state would be worthwhile. In any case, you should be able to switch between the approaches at any time in the future without any issues (except possibly needing to rebuild your snapshots, if you have them).
My 2 cents.
For a rollback operation, an orchestration class will be responsible for handling it. It will publish an aggregate_modify_generated event, and a projection on the other end will fetch the current state of the aggregates after receiving it. Now, when any of the aggregates fails, it should generate a failure event; upon receiving it, the orchestration class will generate an aggregate_modify_rollback event that will be received by that projection, which will set the aggregate state to the previously fetched state.
One common projector can do the task, because the events will carry the aggregate id.

Loading aggregates on reacting to domain events

I am implementing an application with domain driven design and event sourcing. I am storing all domain events in a DomainEvents table in SQL Server.
I have the following aggregates:
- City
+ Id
+ Enable()
+ Disable()
- Company
+ Id
+ CityId
+ Enable()
+ Disable()
- Employee
+ Id
+ CompanyId
+ Enable()
+ Disable()
Each one encapsulates its own domain logic and invariants. I designed them as separate aggregates because one city may have thousands (maybe more) of companies, and a company may also have a very large number of employees. If these entities belonged to the same aggregate, I would have to load them together, which in most cases would be unnecessary.
Calling Enable or Disable will produce a domain event (e.g. CityEnabled, CompanyDisabled or EmployeeEnabled). These events contain the primary key of the enabled or disabled entity.
Now my problem is a new requirement forcing me to enable/disable all related Companies if a City is enabled/disabled. The same is required for Employees, if a Company is enabled/disabled.
In my event handler, which is invoked when for example CityDisabled has occurred, I need to execute a DisableCompanyCommand for each company belonging to that city.
But how would I know what companies should be affected by that change?
My thoughts:
Querying the event store is not possible, because I can't use conditions like 'where CityId = event.CityId'
Letting the parent know its child ids and putting all child ids in every event the parent produces is also a bad idea, because the event creator shouldn't care who will consume the events later. Only information belonging to the event that happened should be in the event.
Executing the DisableCompanyCommand for every company: only the companies having the matching CityId would change their state. Even if I did that asynchronously, it would produce a huge overhead, loading every company on those events. And for every company getting disabled, the same procedure would have to be repeated to disable all its employees.
Creating read models mapping ParentIds to ChildIds and loading the childIds according to the parentId in the event. This sounds like the most appropriate solution, but the problem is, how would I know if a new Company is created while I am disabling the existing ones?
I am not satisfied with any of the solutions above. Basically, the problem is determining the aggregates affected by an event that has occurred.
Maybe you have better solutions?
What you are describing can be resolved by a Saga/Process manager that listens to the CityDisabled event. It then finds the CompanyIds of all Companies in that City (by using an existing read model or by maintaining a private CityId-to-CompanyIds state) and sends each one a DisableCompany command.
The same applies to the CompanyDisabled event, regarding the disabling of Employees.
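A bare-bones sketch of such a process manager (Java; the read model lookup and the command bus are assumed to exist and are named for illustration):

class CityCompanyProcessManager {
    CompanyIdsByCity readModel; // hypothetical CityId -> CompanyIds lookup
    CommandBus commandBus;

    void on(CityDisabled event) {
        // Fan out one command per affected company; each can be
        // retried independently if it fails.
        for (CompanyId companyId : readModel.companiesIn(event.getCityId())) {
            commandBus.send(new DisableCompanyCommand(companyId));
        }
    }
}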
P.S. Disabling a City/Company/Employee seems like CRUD to me; these don't seem like terms from a normal ubiquitous language. It's not very DDD-ish, but I consider your design correct in regard to this question.
Do your requirements mean you have to fire a CompanyDisabled event when disabling a city?
If not - and your requirement is just that a disabled city means all its companies are disabled - then what you would do is, in your city read model projection, listen for CityDisabled events and mark the companies disabled in your read model. (If your requirements are to fire an event for each company, then Constantin's answer is good.)
Your model is more of a child/parent relationship - it's kind of a break from traditional "blue book" thought, but I recommend representing this relationship in your domain with more than a CityId.
In my app something like this would be coded as
public Task Handle(DoSomething command, IHandlerContext ctx)
{
    var city = ctx.For<City>().Get(command.CityId);
    var company = city.For<Company>().Get(command.CompanyId);
    company.DoSomething();
}

public class Company : Entity<City>
{
    public void DoSomething()
    {
        // Parent is the City
        if (Parent.Disabled)
            throw new BusinessException("City is disabled");

        Apply<SomethingDone>(x => {
            x.CityId = Parent.Id;
            x.CompanyId = Id;
            ...
        });
    }
}
(Pseudocode is NServiceBus-style and uses my library Aggregates.NET)
It's quite probable that you don't have to explicitly enforce rules like 'enable/disable all related Companies if a City is enabled/disabled' on the domain (write) side at all.
If so, there's no need to disable all Companies when a City is disabled within the domain. As Charles mentioned in his answer, just introduce a rule that, e.g., "a Company is disabled if it is disabled itself (directly) or its City is disabled". The same goes for a Company and its Employees.
This rule is realized on the read side. A Company in the read model will have two properties: the first one is Enabled, which is directly mapped from the domain; the second one is EnabledEffective, which is calculated from the Company's Enabled value and its City's Enabled value. When a CityDisabled event happens, the read model's event handler traverses all of the City's Companies in the read model and sets their EnabledEffective property to false; when a CityEnabled event happens, the handler sets each Company's EnabledEffective property back to that Company's own Enabled value. It is the EnabledEffective property that you will use in the UI.
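As a sketch, the read-side handlers could look like this (Java; the read model API is invented for illustration):

class CompanyProjection {
    CompanyReadModel companies; // hypothetical read-side store

    void on(CityDisabled event) {
        // Every company in the city becomes effectively disabled.
        companies.updateByCityId(event.getCityId(),
            c -> c.setEnabledEffective(false));
    }

    void on(CityEnabled event) {
        // Fall back to each company's own Enabled flag.
        companies.updateByCityId(event.getCityId(),
            c -> c.setEnabledEffective(c.isEnabled()));
    }
}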
The logic can be a bit more complex for CompanyEnabled/CompanyDisabled event handling (with respect to Employees), as you must take into account both the event info and the enabled/disabled status of the host City.
If the (effective) enabled/disabled status of a Company/Employee is really needed on the domain side (e.g. affecting the way these aggregates handle their commands), consider taking the EnabledEffective value from the read side and passing it along with the command object.

node.js + mongo + atomic update of multiple entities = head ache

My setup:
Node.js
Mongojs
A simple database containing two collections - inventory and invoices.
Users may concurrently create invoices.
An invoice may refer to several inventory items.
My problem:
Keeping the inventory integrity. Imagine a scenario where two users submit two invoices with overlapping item sets.
A naive (and wrong) implementation would do the following:
For each item in the invoice read the respective item from the inventory collection.
Fix the quantity of the inventory items.
If any item quantity goes below zero - abandon the request with the relevant message to the user.
Save the inventory items.
Save the invoice.
Obviously, this implementation is bad, because the actions of the two users are going to interleave and affect each other. In a typical blocking server + relational database this is solved with complex locking/transaction schemes.
What is the nodish + mongoish way to solve this? Are there any tools that the node.js platform provides for these kind of things?
You can look at a two-phase commit approach with MongoDB, or you can forget about transactions entirely and decouple your processes via a service-bus approach. Take Amazon as an example - they will allow you to submit your order, but they will not confirm it until they have been able to secure your inventory item, charge your card, etc. None of this occurs in a single transaction - it is a series of steps that can occur in isolation and can have compensating steps applied where necessary.
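Whichever orchestration you choose, the individual step that "secures an inventory item" can itself be a single atomic operation in MongoDB: a guarded update that only decrements if enough stock remains. A sketch using the MongoDB Java driver (the collection layout and the qty field are assumptions):

import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import com.mongodb.client.result.UpdateResult;
import org.bson.Document;

class InventoryReservations {
    private final MongoCollection<Document> inventory;

    InventoryReservations(MongoCollection<Document> inventory) {
        this.inventory = inventory;
    }

    // Atomically decrement qty only if at least qty items remain;
    // a modified count of 0 means the reservation failed.
    boolean reserve(String itemId, int qty) {
        UpdateResult result = inventory.updateOne(
            Filters.and(Filters.eq("_id", itemId), Filters.gte("qty", qty)),
            Updates.inc("qty", -qty));
        return result.getModifiedCount() == 1;
    }
}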
A naive bus implementation would do the following (keep in mind that this is just a generic suggestion to work from, and the exact implementation would depend on your specific needs for concurrency, etc.):

Place the order on the queue. At this point, you can continue to have your client wait, or you can thank them for their order and let them know they will receive an email when it's been processed.

An "inventory worker" will grab the order and lock the inventory items that it needs to reserve. This can be done in many different ways. With Mongo you could create a lock collection with one document per inventory item: the document has the inventory item ID as its ID, the order ID, and a reasonable TTL (say 30 seconds). As long as the worker has the lock, it can manage the inventory levels of the items it holds locks for. Once it has made its changes, it deletes the "lock" document (see the sketch after this list).

If another worker comes along that wants to manage the same item while it's locked, you could put the blocked worker into sleep mode for X seconds and then retry or, better yet, you could put the request back onto the message bus to be picked up later by another worker.

Once the worker has resolved all the inventory items, it can then place another message on the service bus indicating that a card should be charged, that processing should receive a notification to pull the inventory, that an email can be sent to the person who made the order, etc.
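For the lock-document idea above, a sketch with the MongoDB Java driver. Assumptions: the lock collection has a TTL index on createdAt (e.g. expireAfterSeconds around 30), created separately; a unique _id insert either acquires the lock or fails with a duplicate-key error:

import com.mongodb.MongoWriteException;
import com.mongodb.client.MongoCollection;
import org.bson.Document;
import java.util.Date;

class InventoryLocks {
    private final MongoCollection<Document> locks;

    InventoryLocks(MongoCollection<Document> locks) {
        this.locks = locks;
    }

    // Try to lock one inventory item for an order; the TTL index
    // reaps the document if the worker dies before releasing it.
    boolean tryLock(String itemId, String orderId) {
        try {
            locks.insertOne(new Document("_id", itemId)
                .append("orderId", orderId)
                .append("createdAt", new Date()));
            return true;                    // lock acquired
        } catch (MongoWriteException duplicateKey) {
            return false;                   // item already locked
        }
    }

    void unlock(String itemId) {
        locks.deleteOne(new Document("_id", itemId));
    }
}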
Sounds complex, but once you have a message bus set up, it's actually relatively simple. A list of Node message bus implementations can be found here.
Some developers will even skip the formal message bus completely and use a database as their message-passing engine, which can work in simple implementations. Google "Mongo and Queues".
If you don't expect more than one server and a message bus implementation is too bulky, Node could handle the locking and message passing for you. For example, if you really wanted to lock with Node, you could keep an in-memory map of the inventory item IDs being worked on. Although, to be frank, I think the message bus is the best way to go. Anyway, here's some code I have used in the past to handle simple external resource locking with Node.
// Requires the "async" module and a shared lock table.
var async = require('async');
var locks = {};   // map: resource id -> array of waiting callbacks

// attempt to take out a lock; if the lock exists, queue the callback.
this.getLock = function( id, cb ) {
    if( locks[id] ) {
        locks[id].push( cb );   // already locked: wait for release
        return false;
    }
    else {
        locks[id] = [];         // take the lock (empty waiting list)
        return true;
    }
};

// call freeLock when done: run all queued callbacks, then release.
this.freeLock = function( that, id ) {
    async.forEach(locks[id], function(item, callback) {
        item.apply( that, [id] );
        callback();
    }, function(err){
        if(err) {
            // do something on error
        }
        locks[id] = null;
    });
};
