DDD child entity validation - domain-driven-design

Which layer should be responsible for checking existence of some entity in the database?
Let's say I have an order as aggregate and that order can contain multiple items. Logic implies that I can add only existing items to order.
Should I write it like this in the application service:
var item = ItemRepository.GetByID(id);
//throws exception if the item is null
order.AddItem(item);
or
//validate item existence inside aggregate function
order.AddItem(item, IItemRepository repo);

Neither, really.
Entities don't cross aggregate boundaries. Either the entity is part of the aggregate, in which case the aggregate manages its own life cycle, or the item is part of some other aggregate, in which case you don't share the entity, you share a reference.
order.AddItem(id)
Part of the definition of aggregate is that changes in different aggregates can happen independently of each other. In other words, there's no way that this aggregate can know what is happening in that aggregate "now".
In other words, you can't ensure transactional consistency across an aggregate boundary.
The right answer, if you are willing to accept a data race, is to use a domain service to query the state outside the boundary.
interface InventoryService{
boolean currentlyInStock(Item id);
}
// ...
order.addItem(id, inventoryService);
A few points:
Use a domain service rather than passing in the other repository, because it better communicates what is going on. The domain service serves as a description of the contract that the aggregate actually needs. Furthermore, by refusing to pass the repository, you exclude any possibility that the order aggregate attempts to write to the item repository.
(The trivial implementation of this domain service is to just forward the call to the repository, but the order aggregate doesn't need to know that).
The domain service, in this case, should not be "helping" by choosing an action based on the availability of the inventory -- maybe the aggregate should throw, maybe the aggregate should throw for low volume orders/low priority purchasers, but use different rules when the order is over a million dollars. It's the order's job to figure that out, the domain service just provides the data.
Given the data race, some false positives can slip through; detection and mitigation is a good idea.
If you aren't willing to accept the data race (are you sure? Amazon accepts order for out of stock items all the time...), then you need to rethink the design of your model, and where you have set your aggregate boundaries.
The null design of a model is to capture all of the business state into a single aggregate; its own state is internally consistent, but it might not be consistent with external state. When you start carving up the model into separate aggregates, you are making the same assertion -- the aggregate needs to be internally consistent, but it might not be consistent with state outside the aggregate.
If that's not acceptable, then you turn Mother's picture to the wall, and implement your business rules directly in the book of record (ie, constraints in your RDBMS).

Related

Relationship between concepts in DDD

I'm developing a budgeting app using Domain Driven Design. I'm new to DDD and therefore need a validation of my design.
Here are the concepts I came up with:
Transaction - which is either income or expense, on annual or monthly or one-off etc. basis.
Budget - which is the calculated income, expenses and balance projection, divided into occurrences (say e.g. 12 months over the next year, based on the Transactions).
I made the Transaction the Entity and Aggregate Root. In my mind it has identity, it's a concrete planned expense or income that I know I'll receive, for a concrete thing, and I also need to persist it, so I can calculate the budget based on all my transactions.
Now, I have an issue with the Budget. It depends on my concrete list of Transactions. If one of the Transactions gets deleted, the budget will need to be re-calculated (seems like a good candidate for a domain event?). It's a function of my identifiable transactions at any given time.
Nothing outside the Aggregate boundary can hold a reference to anything inside, except to the root Entity. Which makes me think the budget is the Aggregate Root as it cannot be a ValueObject or Entity within the Transaction.
What's confusing is that I don't necessarily need to persist the budget (unless I want to cache it). I could calculate it from scratch on request, and send it over to the client app. 2 different budgets could have the same number of occurrences, incomes, expenses and balances (but not Transactions). Perhaps an argument for making it a ValueObject?
So, my questions is - what is the Budget?
Domain context vs Aggregate
First element you get wrong is a point of details about DDD semantics. If there is only one object in your "aggregate", then it is not an aggregate. An aggregate is a structure made of multiple (2+) objects, with at least one being an entity and called the aggregate root. If a TransactionRpository returns a Transaction object that has no value object or entity, then Transaction is an entity but not an aggregate nor an aggregate root. If a BudgetRepository returns a Budget entity that includes a Transaction object, then Budget and Transaction form an aggregate, Budget being the aggregate root. If Budget and Transaction are returned from different repositories, then they form different contexts.
Context being the generic concept that can either be an aggregate or an entity.
Contexts are linked to use cases
Second element you get wrong is that you are trying to design your domain model outside of your use cases context. Your application clearly manipulates both concepts of Budget and Transactions, but does your application handles uses cases for both (budget management and transaction management) ? If yes, are these uses case different in a way that implies different domain constraints ?
If your application only handles Budget management, or both but they share their business constraints, then you only need a single context, that manipulates both concepts in a single aggregate. In that situation, Budget is probably your root aggregate, and it's up to your mode and use cases to tell whether the Transaction is a value object or you need to access them by Id.
If your application handles uses cases for both, with different business constraints, then you should split your domain in two contexts, with two different models, one for the Budget management use cases, the other for the Transaction management use cases.
Polysemic domain model
The third element you get wrong, is that you are trying to build a single, unified, normalized domain model. This is wrong because it introduces very complex structures, and a lot of business rules that are irrelevant to your business cases. Why would you need to manipulate the Budget domain model when the use case does not need knowledge of the Budget concept or linked business rules ?
If your application has use cases for both concepts, you need two models. The Budget management model should not use the Transaction management model. However, that does not implies that the Budget model is not allowed to manipulate the Transaction concept and vice versa. It only means you must write another model for that. You could have a Budget context that manipulates Budget and BudgetTransaction models, and Transaction context that manipulates Transaction and TransactionBudget models. These models can map to the same RDBMS tables with different columns, relevant to their use cases, implementing relevant business rules.
This is called writing a polysemic domain model.
Conclusion
So, my questions is - what is the Budget?
It is not possible to answer definitely your last question, as the answer depends on the use cases your application handles. However, you mention the following constraint:
If one of the Transactions gets deleted, the budget will need to be re-calculated
This seems a very good argument in favor of making your application as a single context application, based on an aggregate with Budget being the aggregate root and Transaction being an entity in the aggregate.
If you don't need to, try to refrain from splitting these two concepts in different contexts, unless you have very good reasons to do so: they manipulate excluding columns, they manipulate excluding business rules, you are interested in deploying these two in different bounded contexts, different services, as they would scale differently, etc ...
Having business constraints that span accross multiple contexts implies a complex implementation based on domain events, 2-phase commits, saga pattern, etc ... It's a lot of work, you should balance that work with the benefits you expect in return.

Agregate root vs child methods

I saw many different approaches and I am fairly new to domain-driven design approach. What I am struggling with is to understand one complex (at least for me) thing. I know the whole DDD is complex to understand on first but I am trying to find any resources I can on it.
Example: I have an order and order can have operations. Operations can not be accessed without order and they make no sense without an order. So order entity will be my aggregate root. Operations will be entity too because each operation will have an id (am I right on this one?). Each operation can have subitems (array of strings for example and these can be added or removed from any operation).
Now what I am struggling to understand and what I found everywhere is that every modification should be called and set only through aggregate root... But is it okay to have private methods like setters and getters on the Operation entity itself but these would be called only through the aggregate root (order entity)?
Sorry if I missed something basic, as the whole DDD concept for me is new and I am trying to explore it.
Thanks.
A couple of DDD concepts to arrive at the answer:
Aggregates are Transaction Boundaries.
Aggregates act as gatekeepers for all changes to domain elements enclosed within itself.
Data changes to an Aggregate and its enclosed domain elements are committed atomically. Either everything within the Aggregate stays in sync, or the whole state change operation fails.
The rule also means that one should not access Domain Elements within the Aggregate directly. It would be best if you did not manipulate the domain objects outside the context of the Aggregate.
If Operation is an entity under Order aggregate, then Order is responsible for ensuring operations satisfy the business invariants (a.k.a validations).
Aggregates are loaded in entirety.
Since an Aggregate represents the transaction and consistency boundary of a domain concept, its data is loaded in entirety to guarantee that all Business Invariants are satisfied. Data here means data of all underlying entities and value objects.
If you cannot load the entire data, you cannot guarantee that the change satisfies all business invariants. It may also mean that a data-intensive entity within the Aggregate may need to become an Aggregate itself.
You are protecting the data sanctity and operational consistency of the system if you adhere to these rules. Within the Aggregate itself, how you organize state changes is wholly left to you.
IMHO, I would go with your approach of enclosing all Operation related behaviors, data attributes, and invariants within the Operation entity. Order is responsible for protecting the data within its boundary, but it need not own the methods/logic of doing everything.
You can create state change methods within the Operation entity too, just like you would have done in the Order aggregate, but invoke them from the order object.

In DDD aggregate root, where should be placed logic of checking existing aggregate

Suppose I have an Order Aggregate root, and when I receive command to create Order, I should check other orders on some condition, and decline creation, if these conditions met. Checking this conditions is definitely business logic, but I should not create order at first place, while conditions are not met.
So how to implement this check that complies with DDD principles?
Is it part of domains service, application service?
EDIT:
There is TableId property in Order.
For example I need to check if table is already taken, and if it is, decline order creation. This table service may reside in another AppDomain, and network call maybe needed.
I'm using Event sourcing, CQRS, Command Handlers. Sorry I can't post a code.
So how to implement this check that complies with DDD principles?
"It depends".
If you don't need everything to be perfectly consistent, then you can give your aggregate a cached copy of the other data to compute its logic. There are different patterns people use for this; using a domain service to fetch the data for you is one, returning control to the application to obtain the data for you is another...
----> create order
<---- here is a list of other information I need
----> the other information
<---- here's the order
It's something to take to the business experts -- if the other data is one second old, is the calculation accurate enough?
On the other hand, if you need everything to be perfectly consistent, then you need to be able to lock the other information, so that nobody can change those details while you are working on your calculation.
That lock can be pessimistic (lock the data, then do the calculation), or optimistic (get a copy of the data, perform the calculation, then lock the data and make sure it hasn't changed).
Here's the "bad" news: the mechanism that defines locks in the domain driven design patterns is the aggregate. An aggregate is an expression of the coarse grained lock pattern; when you need to lock a bunch of data, that is the business telling you that the data should all be part of the same aggregate.
It will sometimes happen that you have a pretty domain model, with aggregates that express a bunch of obvious domain concepts, when you discover that the business rules don't line up with those boundaries at all, and you have to re-organize your data boundaries to get the rules to work.
It's often a good idea to begin your model design imagining your aggregates to be processes, and to group together those processes that need to be able to lock each other's data.
For example I need to check if table is already taken, and if it is, decline order creation. This table service may reside in another AppDomain, and network call maybe needed.
When the authoritative data lives somewhere else, then forget about locking. Think in terms of best effort, exception reports, and conflict mitigation.

Stream aggregate relationship in an event sourced system

So I'm trying to figure out the structure behind general use cases of a CQRS+ES architecture and one of the problems I'm having is how aggregates are represented in the event store. If we divide the events into streams, what exactly would a stream represent? In the context of a hypothetical inventory management system that tracks a collection of items, each with an ID, product code, and location, I'm having trouble visualizing the layout of the system.
From what I could gather on the internet, it could be described succinctly "one stream per aggregate." So I would have an Inventory aggregate, a single stream with ItemAdded, ItemPulled, ItemRestocked, etc. events each with serialized data containing the Item ID, quantity changed, location, etc. The aggregate root would contain a collection of InventoryItem objects (each with their respective quantity, product codes, location, etc.) That seems like it would allow for easily enforcing domain rules, but I see one major flaw to this; when applying those events to the aggregate root, you would have to first rebuild that collection of InventoryItem. Even with snapshotting, that seems be very inefficient with a large number of items.
Another method would be to have one stream per InventoryItem tracking all events pertaining to only item. Each stream is named with the ID of that item. That seems like the simpler route, but now how would you enforce domain rules like ensuring product codes are unique or you're not putting multiple items into the same location? It seems like you would now have to bring in a Read model, but isn't the whole point to keep commands and query's seperate? It just feels wrong.
So my question is 'which is correct?' Partially both? Neither? Like most things, the more I learn, the more I learn that I don't know...
In a typical event store, each event stream is an isolated transaction boundary. Any time you change the model you lock the stream, append new events, and release the lock. (In designs that use optimistic concurrency, the boundaries are the same, but the "locking" mechanism is slightly different).
You will almost certainly want to ensure that any aggregate is enclosed within a single stream -- sharing an aggregate between two streams is analogous to sharing an aggregate across two databases.
A single stream can be dedicated to a single aggregate, to a collection of aggregates, or even to the entire model. Aggregates that are part of the same stream can be changed in the same transaction -- huzzah! -- at the cost of some contention and a bit of extra work to do when loading an aggregate from the stream.
The most commonly discussed design assigns each logical stream to a single aggregate.
That seems like it would allow for easily enforcing domain rules, but I see one major flaw to this; when applying those events to the aggregate root, you would have to first rebuild that collection of InventoryItem. Even with snapshotting, that seems be very inefficient with a large number of items.
There are a couple of possibilities; in some models, especially those with a strong temporal component, it makes sense to model some "entities" as a time series of aggregates. For example, in a scheduling system, rather than Bobs Calendar you might instead have Bobs March Calendar, Bobs April Calendar and so on. Chopping the life cycle into smaller installments can keep the event count in check.
Another possibility is snapshots, with an additional trick to it: each snapshot is annotated with metadata that describes where in the stream the snapshot was made, and you simply read the stream forward from that point.
This, of course, depends on having an implementation of an event stream that supports random access, or an implementation of stream that allows you to read last in first out.
Keep in mind that both of these are really performance optimizations, and the first rule of optimization is... don't.
So I'm trying to figure out the structure behind general use cases of a CQRS+ES architecture and one of the problems I'm having is how aggregates are represented in the event store
The event store in a DDD project is designed around event-sourced Aggregates:
it provides the efficient loading of all events previously emitted by an Aggregate root instance (having a given, specified ID)
those events must be retrieved in the order they where emitted
it must not permit appending events at the same time for the same Aggregate root instance
all events emitted as result of a single command must be all appended atomically; this means that they should all succeed or all fail
The 4th point could be implemented using transactions but this is not a necessity. In fact, for scalability reasons, if you can then you should choose a persistence that provides you atomicity without the use of transactions. For example, you could store the events in a MongoDB document, as MongoDB guaranties document-level atomicity.
The 3rd point can be implemented using optimistic locking, using a version column with an unique index per (version x AggregateType x AggregateId).
At the same time, there is a DDD rule regarding the Aggregates: don't mutate more than one Aggregate per transaction. This rule helps you A LOT to design a scalable system. Break it if you don't need one.
So, the solution to all these requirements is something that is called an Event-stream, that contains all the previous emitted events by an Aggregate instance.
So I would have an Inventory aggregate
The DDD has higher precedence than the Event-store. So, if you have some business rules that force you to decide that you must have a (big) Inventory aggregate, then yes, it would load ALL the previous events generated by itself. Then the InventoryItem would be a nested entity that cannot emit events by itself.
That seems like it would allow for easily enforcing domain rules, but I see one major flaw to this; when applying those events to the aggregate root, you would have to first rebuild that collection of InventoryItem. Even with snapshotting, that seems be very inefficient with a large number of items.
Yes, indeed. The simplest thing would be for us to all have a single Aggregate, with a single instance. Then the consistency would be the strongest possible. But this is not efficient so you need to better think about the real business requirements.
Another method would be to have one stream per InventoryItem tracking all events pertaining to only item. Each stream is named with the ID of that item. That seems like the simpler route, but now how would you enforce domain rules like ensuring product codes are unique or you're not putting multiple items into the same location?
There is another possibility. You should model the assigning of product codes as a Business Process. For this you could use a Saga/Process manager that would orchestrate the entire process. This Saga could use a collection with an unique index added to the product code column in order to ensure that only one product uses a given product code.
You could design the Saga to permit the allocation of an already-taken code to a product and to compensate later or to reject the invalid allocation in the first place.
It seems like you would now have to bring in a Read model, but isn't the whole point to keep commands and query's seperate? It just feels wrong.
The Saga uses indeed a private state maintained from the domain events in an eventual consistent state, just like a Read-model but this does not feel wrong for me. It may use whatever it needs in order to bring (eventually) the system as a hole to a consistent state. It complements the Aggregates, whose purpose is to not allow the building-blocks of the system to get into an invalid state.

Read model for aggregate in DDD CQRS ES

In CQRS + ES and DDD, is it a good thing to have small read model in aggregate to get data from other aggregate or bounded context?
For example, in order validation (In Order aggregate), there is a business rules which validate order only if customer is not flagged. The flag information is put in read model (specific to the aggregate) via synchronous domain events.
What do you think about this ?
is it a good thing to have small read model in aggregate to get data from other aggregate or bounded context?
It's not ideal. Aggregates, due to their nature, are not good at enforcing consistency that involves state outside of themselves.
What this usually means is that the business is going to need some way to respond when two aggregates produce an unacceptable state.
You also have the option of checking for the flag before you run the placeOrder command on the aggregate. That check for the flag could be done in the command handler, or in the client -- basically, you have was of "validating" that the command should succeed before passing it to the aggregate.
That said, if it were critical to try to consult the read model while processing the command, a way to do it would be to use a "domain service"; you pass a service provider to the aggregate as part of the command, and let the interface abstract away the fact that running the query requires looking outside of the aggregate.
That gives you some of the decoupling you need to keep the aggregate testable.
It's doable, but not in the form of a read model, rather a Value Object in the Aggregate (since we're on the Write side).
If you already have a CustomerId in Order, you just have to compose a VO with it and a Flagged member.
Of course, this remains prone to all the problems of cross-aggregate communication since the data originates from Customer. Order has to be kept in sync with the flagged status of its Customer, which can require quite a bit of work.
In any case, you should probably first determine with your domain expert whether immediate consistency is an absolute requirement (in which case you have to somehow wrap Customer + Order in a transaction) or if you can afford a small delay in Flagged freshness when enforcing that invariant.
If the latter, you can choose between duplicating Flagged in the Order aggregate or the first option given by #VoiceOfUnreason - the main difference being probably that if the data is in the aggregate, you'll get it for free at the Domain level should you need it in multiple occasions, instead of duplicating the check in multiple use cases/command handlers at the application level.

Resources