I'm developing a budgeting app using Domain Driven Design. I'm new to DDD and therefore need a validation of my design.
Here are the concepts I came up with:
Transaction - which is either income or expense, on annual or monthly or one-off etc. basis.
Budget - which is the calculated income, expenses and balance projection, divided into occurrences (say e.g. 12 months over the next year, based on the Transactions).
I made the Transaction the Entity and Aggregate Root. In my mind it has identity, it's a concrete planned expense or income that I know I'll receive, for a concrete thing, and I also need to persist it, so I can calculate the budget based on all my transactions.
Now, I have an issue with the Budget. It depends on my concrete list of Transactions. If one of the Transactions gets deleted, the budget will need to be re-calculated (seems like a good candidate for a domain event?). It's a function of my identifiable transactions at any given time.
Nothing outside the Aggregate boundary can hold a reference to anything inside, except to the root Entity. Which makes me think the budget is the Aggregate Root as it cannot be a ValueObject or Entity within the Transaction.
What's confusing is that I don't necessarily need to persist the budget (unless I want to cache it). I could calculate it from scratch on request, and send it over to the client app. 2 different budgets could have the same number of occurrences, incomes, expenses and balances (but not Transactions). Perhaps an argument for making it a ValueObject?
So, my question is: what is the Budget?
Domain context vs Aggregate
The first thing you get wrong is a detail of DDD semantics. If there is only one object in your "aggregate", then it is not an aggregate. An aggregate is a structure made of multiple (2+) objects, at least one of which is an entity, called the aggregate root. If a TransactionRepository returns a Transaction object that contains no value object or entity, then Transaction is an entity, but neither an aggregate nor an aggregate root. If a BudgetRepository returns a Budget entity that includes a Transaction object, then Budget and Transaction form an aggregate, with Budget as the aggregate root. If Budget and Transaction are returned from different repositories, then they form different contexts.
Here, "context" is the generic concept that can be either an aggregate or an entity.
Contexts are linked to use cases
The second thing you get wrong is that you are trying to design your domain model outside the context of your use cases. Your application clearly manipulates both the Budget and Transaction concepts, but does it handle use cases for both (budget management and transaction management)? If so, do these use cases differ in a way that implies different domain constraints?
If your application only handles Budget management, or handles both but they share their business constraints, then you only need a single context that manipulates both concepts in a single aggregate. In that situation, Budget is probably your aggregate root, and it's up to your model and use cases to tell whether Transaction is a value object or whether you need to access transactions by Id.
If your application handles use cases for both, with different business constraints, then you should split your domain into two contexts with two different models: one for the Budget management use cases, the other for the Transaction management use cases.
Polysemic domain model
The third thing you get wrong is that you are trying to build a single, unified, normalized domain model. This is wrong because it introduces very complex structures and many business rules that are irrelevant to a given use case. Why would you need to manipulate the Budget domain model when the use case needs no knowledge of the Budget concept or its business rules?
If your application has use cases for both concepts, you need two models. The Budget management model should not use the Transaction management model. However, that does not imply that the Budget model is not allowed to manipulate the Transaction concept, and vice versa. It only means you must write another model for that. You could have a Budget context that manipulates Budget and BudgetTransaction models, and a Transaction context that manipulates Transaction and TransactionBudget models. These models can map to the same RDBMS tables with different columns, relevant to their use cases, implementing the relevant business rules.
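As a rough illustration of two models reading the same table, here is a minimal Python sketch. Everything in it is an assumption made for the example: the `ROW` dictionary stands in for one row of a shared transactions table, and the column names, the `from_row` helpers and the `rename` rule are hypothetical.

```python
from dataclasses import dataclass
from decimal import Decimal

# Hypothetical row from a shared "transactions" table.
ROW = {"id": 7, "label": "Rent", "amount": "-950.00",
       "recurrence": "monthly", "note": "flat 3B"}


@dataclass(frozen=True)
class BudgetTransaction:
    """Transaction as seen by the Budget context: only what a projection needs."""
    amount: Decimal
    recurrence: str

    @classmethod
    def from_row(cls, row):
        # Reads two columns and ignores the rest.
        return cls(Decimal(row["amount"]), row["recurrence"])


@dataclass
class Transaction:
    """Transaction as seen by the Transaction management context."""
    id: int
    label: str
    amount: Decimal
    recurrence: str
    note: str

    @classmethod
    def from_row(cls, row):
        return cls(row["id"], row["label"], Decimal(row["amount"]),
                   row["recurrence"], row["note"])

    def rename(self, label):
        # A rule that only matters when managing transactions (assumed rule).
        if not label.strip():
            raise ValueError("label must not be empty")
        self.label = label.strip()
```

The Budget-side model is immutable and has no label at all, so the renaming rule cannot leak into budget code.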
This is called writing a polysemic domain model.
Conclusion
So, my question is: what is the Budget?
It is not possible to answer your last question definitively, as the answer depends on the use cases your application handles. However, you mention the following constraint:
If one of the Transactions gets deleted, the budget will need to be re-calculated
This seems a very good argument in favor of building your application around a single context, based on an aggregate with Budget as the aggregate root and Transaction as an entity inside it.
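A minimal Python sketch of that single-aggregate option, under stated assumptions: the class and method names are hypothetical, only "monthly" and "one-off" recurrences are modelled, and the projection is computed on request rather than persisted.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Transaction:
    """Entity inside the Budget aggregate, identified by id."""
    id: int
    amount: Decimal          # positive = income, negative = expense
    recurrence: str          # "monthly" or "one-off" in this sketch
    month: int = 1           # only used by one-off transactions (1-based)


class Budget:
    """Aggregate root: every change to a transaction goes through it."""

    def __init__(self, opening_balance=Decimal("0")):
        self.opening_balance = opening_balance
        self._transactions = {}

    def add(self, tx):
        if tx.id in self._transactions:
            raise ValueError(f"transaction {tx.id} already exists")
        self._transactions[tx.id] = tx

    def remove(self, tx_id):
        del self._transactions[tx_id]

    def projection(self, months=12):
        """Running balance at the end of each month, recomputed on demand."""
        balances = []
        balance = self.opening_balance
        for month in range(1, months + 1):
            for tx in self._transactions.values():
                if tx.recurrence == "monthly" or (
                    tx.recurrence == "one-off" and tx.month == month
                ):
                    balance += tx.amount
            balances.append(balance)
        return balances
```

Because the projection is derived from the current transactions each time, deleting a transaction needs no separate recalculation step in this sketch.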
Unless you have to, refrain from splitting these two concepts into different contexts. Good reasons to split would be: they work with mutually exclusive columns, they enforce mutually exclusive business rules, or you want to deploy them as different bounded contexts or services because they would scale differently.
Business constraints that span multiple contexts imply a complex implementation based on domain events, two-phase commits, the saga pattern, and so on. That is a lot of work, and you should weigh it against the benefits you expect in return.
I'm having trouble getting my head around how to use the repository pattern with a more complex object model. Say I have two aggregate roots Student and Class. Each student may be enrolled in any number of classes. Access to this data would therefore be through the respective repositories StudentRepository and ClassRepository.
Now on my front end say I want to create a student details page that shows the information about the student, and a list of classes they are enrolled in. I would first have to get the Student from StudentRepository and then their Classes from ClassRepository. This makes sense.
Where I get lost is when the domain model becomes more realistic/complex. Say students have a major that is associated with a department, and classes are associated with a course, a room, and instructors. Rooms are associated with a building. Courses are associated with a department, and so on.
I could easily see wanting to show information from all these entities on the student details page. But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
I understand the ClassRepository should only be responsible for updating classes, and not anything in other aggregate roots. But does it violate DDD if the values ClassRepository returns contain information from other related aggregate roots? In most cases this would only need to be a partial summary of those related entities (building name, course name, course number, instructor name, instructor email, etc.).
But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
Yup.
But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots?
Nobody cares about "violate DDD". What we care about is: do you still get the benefits of the repository pattern if you start pulling in data from other aggregates?
Probably not - part of the point of "aggregates" is that when writing the business code you don't have to worry too much about how storage is implemented... but if you start mixing locked data and unlocked data, your abstraction starts leaking into the domain code.
However: if you are trying to support reporting, or some other effectively read only function, you don't necessarily need the domain model at all -- it might make sense to just query your data store and present a representation of the answer.
This substitution isn't necessarily "free" -- the accuracy of the information will depend in part on how closely your stored information matches your in-memory information (i.e., how often you write information to your storage).
This is basically the core idea of CQRS: reads and writes are different, so maybe we should separate the two, so that they each can be optimized without interfering with the correctness of the other.
Can DDD repositories return data from other aggregate roots?
Short answer: No. If that happened, that would not be a DDD repository for a DDD aggregate (that said, nobody will go after you if you do it).
Long answer: Your problem is that you are trying to use tools made to safely modify data (aggregates and repositories) to solve a problem reading data for presentation purposes. An aggregate is a consistency boundary. Its goal is to implement a process and encapsulate the data required for that process. The repository's goal is to read and atomically update a single aggregate. It is not meant to implement queries needed for data presentation to users.
Also, note that the model you present is not a model based on aggregates. If you break that model into aggregates, you'll have multiple clusters of entities without "lines" between them. For example, a Student aggregate might have a collection of ClassEnrollments and a Class aggregate a collection of Attendees (that's just an example; note that modeling many-to-many relationships with aggregates can be a bit tricky). You'll have one repository for each aggregate, which will fully load the aggregate when executing an operation and transactionally update the full aggregate.
Now to your actual question: how do you implement queries for data presentation that require data from multiple aggregates? Well, you have multiple options:
As you say, do multiple round trips using your existing repositories. Load a student and from the list of ClassEnrollments, load the classes that you need.
Use CQRS "lite". Aggregates and repositories are used only for update operations. For query operations, implement Queries that don't use repositories but access the DB directly, so you can join tables from multiple aggregates (Student->Enrollments->Attendees->Classes).
Use "full" CQRS. Create read models optimised for your queries based on the data from your aggregates.
My preferred approach is to use CQRS lite and only create a dedicated read model when it's really needed.
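To make the "CQRS lite" option concrete, here is a small Python sketch using the standard `sqlite3` module. The schema (tables `students`, `classes`, `enrollments`), the sample rows and the function name `student_details` are all assumptions made for the example; the point is only that the read side joins across aggregates without going through any repository.

```python
import sqlite3

# Hypothetical schema and sample data, kept in memory for the example.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE students (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE classes (id INTEGER PRIMARY KEY, title TEXT, room TEXT);
    CREATE TABLE enrollments (student_id INTEGER, class_id INTEGER);
    INSERT INTO students VALUES (1, 'Ada'), (2, 'Bob');
    INSERT INTO classes VALUES (10, 'Algebra', 'B-101'), (11, 'Physics', 'C-202');
    INSERT INTO enrollments VALUES (1, 10), (1, 11);
""")


def student_details(conn, student_id):
    """Read-only query: one round trip, plain dicts, no aggregates loaded."""
    rows = conn.execute(
        """
        SELECT s.name, c.title, c.room
        FROM students s
        LEFT JOIN enrollments e ON e.student_id = s.id
        LEFT JOIN classes c ON c.id = e.class_id
        WHERE s.id = ?
        ORDER BY c.title
        """,
        (student_id,),
    ).fetchall()
    if not rows:
        return None  # unknown student
    return {
        "student": rows[0][0],
        # A student without enrollments yields one row with NULL class columns.
        "classes": [{"title": t, "room": r} for _, t, r in rows if t is not None],
    }
```

Updates would still go through the Student and Class repositories; only the details page uses this query.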
I have a subdomain which involves tracking user financial data across different financial account types.
For example, users can input data for their:
bank accounts,
credit cards,
loans,
lines of credit,
real estate,
and more...
Now within each individual type, there are more subtypes.
For instance, under loans:
personal loans,
business loans,
mortgages,
car loans,
and more...
They would each have their own particular invariants, with some unique properties and functionality, and some shared properties and functionality.
I've been approaching this using composition, creating an aggregate for each subtype, and using interfaces and helper interface implementations to share similar logic between aggregates.
However, it appears as though I'm going to end up with dozens of different aggregates when modelling all these different account types. This doesn't feel right.
Alternatives I've considered:
have a type property on the loan aggregate, and conditional logic based off the type.
create different bounded contexts for each of these types: This feels like overkill, I believe this is all part of the same business subdomain.
create aggregates based off shared functionality - eg SecuredLoan and UnsecuredLoan aggregates
create subclasses in the general aggregates to hold each subtype's unique functionality. This gives some encapsulation of subtype-specific logic, with some conditional logic remaining (e.g. conditional properties on the aggregate). I'm not really sure how this differs from just creating a separate aggregate for each subtype.
Tradeoffs seem to be, the more general the implementation, there will end up being a ton of conditional logic, and conditional properties based off the subtype.
Versus building specific aggregates for each subtype: the logic per aggregate is simplified, but there end up being hundreds of commands in the application layer, many of which do basically the same thing to a different subtype. Additionally, there end up being dozens of repositories.
It feels like I either get an explosion of conditional logic complexity in a general aggregate, or an explosion of the number of aggregates (or contexts) if building one per subtype.
Question - is there a known pattern for dealing with this type of modelling problem? Or is it really just dealing with the above tradeoffs, and finding something which fits best? In that case, is there some precedent I can apply to the decision-making process, as I'm struggling to decide between the above approaches. And is it problematic if there end up being many dozens of aggregates within a given context?
Rather than deriving your aggregates from the data at rest, consider which operations/changes ("commands", one might say) will be performed and how the results of those operations affect future operations; that is what leads to what the aggregates want to be. Event storming style approaches can be helpful for figuring out these relationships between state changes.
For instance, each of these kinds of loans might have AccrueInterest, DrawPrincipal, and RecordPayment commands which operate on the balances identically (given perhaps configurable rate parameters etc.) and which don't affect and aren't affected by other commands. In that scenario, you can have a Loan aggregate which models the idea that there's a loan with interest and principal balances on which interest accrues and payments are made. An AutoLoan aggregate might then just be managing the collateralization of Loan ABC123 with VIN 1G1234567890.
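A minimal Python sketch of that split, with several assumptions labeled: simple (non-compounding) monthly interest, payments applied to interest before principal, and a hypothetical `release_lien` operation on the collateral side. None of these rules come from the question; they only show that the balance logic lives in one `Loan` aggregate while `AutoLoan` refers to it by id.

```python
from dataclasses import dataclass
from decimal import Decimal


class Loan:
    """Balances only: shared by every kind of loan in this sketch."""

    def __init__(self, loan_id, annual_rate):
        self.loan_id = loan_id
        self.annual_rate = Decimal(annual_rate)   # configurable rate parameter
        self.principal = Decimal("0")
        self.interest = Decimal("0")

    def draw_principal(self, amount):
        amount = Decimal(amount)
        if amount <= 0:
            raise ValueError("draw must be positive")
        self.principal += amount

    def accrue_interest(self, months=1):
        # Assumption: simple interest on the outstanding principal.
        accrued = self.principal * self.annual_rate * months / 12
        self.interest += accrued.quantize(Decimal("0.01"))

    def record_payment(self, amount):
        amount = Decimal(amount)
        if amount <= 0 or amount > self.principal + self.interest:
            raise ValueError("invalid payment amount")
        # Assumption: interest is paid off before principal.
        to_interest = min(amount, self.interest)
        self.interest -= to_interest
        self.principal -= amount - to_interest


@dataclass
class AutoLoan:
    """Separate aggregate: manages collateral, references the Loan by id only."""
    loan_id: str
    vin: str
    lien_released: bool = False

    def release_lien(self):
        self.lien_released = True
```

A mortgage or personal loan would reuse `Loan` unchanged and add its own small aggregate only if it has behavior of its own.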
Sorry for the simple starting answer, but …
Build your aggregates based on your actual use cases.
Aggregate A - Scenario A
Aggregate B - Scenario B
…
Avoid building aggregates for general conditions. DDD and the ubiquitous language are about developing language, aggregates, and systems around the use case, not around general purpose.
General purpose has its use cases and isn't necessarily anti-DDD, but the focus is on creating the specifics first and then abstracting to generality.
We are modeling an order system built around the Order concept. An Order has a life cycle that runs from creation to delivery, passing through other states in between. Some states have their own business logic, and some logic is shared, such as an order expiring on a given date if it has not been completed in time.
The team is unsure whether to:
Use the state pattern (one aggregate, one repository), or
Use one aggregate/repository to handle each state of the order.
With the second approach, we are considering using the same table for every repository, that is, a single order table from which each aggregate is persisted and loaded. Is that acceptable from a DDD perspective?
What do you think?
In general, DDD is all about not polluting the domain with infrastructure concerns; whether different aggregates are stored in the same table is an infrastructure concern. As long as the repository/repositories are able to meet their obligations, go for it.
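As a sketch of what "different aggregates stored in the same table" could look like, here is a small Python example. It is only an illustration under assumptions: a dict stands in for the single orders table, the states are reduced to "draft" and "paid", and all class and method names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DraftOrder:
    id: int
    items: list = field(default_factory=list)

    def add_item(self, item):
        self.items.append(item)


@dataclass
class PaidOrder:
    id: int
    items: tuple            # items can no longer change once paid
    delivered: bool = False

    def mark_delivered(self):
        self.delivered = True


class DraftOrderRepository:
    def __init__(self, table):
        self._table = table  # shared storage, one row per order

    def get(self, order_id):
        row = self._table[order_id]
        if row["status"] != "draft":
            raise LookupError(f"order {order_id} is not a draft")
        return DraftOrder(order_id, list(row["items"]))

    def save(self, order):
        self._table[order.id] = {"status": "draft",
                                 "items": list(order.items),
                                 "delivered": False}


class PaidOrderRepository:
    def __init__(self, table):
        self._table = table  # the same table as above

    def get(self, order_id):
        row = self._table[order_id]
        if row["status"] != "paid":
            raise LookupError(f"order {order_id} is not paid")
        return PaidOrder(order_id, tuple(row["items"]), row["delivered"])

    def save(self, order):
        self._table[order.id] = {"status": "paid",
                                 "items": list(order.items),
                                 "delivered": order.delivered}
```

The domain classes never see the table; each repository just refuses to hand out an order that is in the wrong state for its aggregate.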
That said, having an Order have a lot of variation in terms of what operations are legal and what information is available from state to state might be a sign that the states might make sense being apportioned to different bounded contexts (e.g. a context where items are added to an order (e.g. a cart context), a checkout/payment context, an assembly for delivery context, and a being delivered context).
In DDD, a repository loads an entire aggregate: we either load all of it or none of it. This also means that we should avoid lazy loading.
My concern is performance-wise. What if this results in loading into memory thousands of objects? For example, an aggregate for Customer comes back with ten thousand Orders.
In cases like this, could it mean that I need to redesign and rethink my aggregates? Does DDD offer suggestions regarding this issue?
Take a look at this Effective Aggregate Design series of three articles from Vernon. I found them quite useful to understand when and how you can design smaller aggregates rather than a large-cluster aggregate.
EDIT
I would like to give a couple of examples to improve my previous answer, feel free to share your thoughts about them.
First, a quick definition of an Aggregate (taken from the book Patterns, Principles, and Practices of Domain-Driven Design by Scott Millett):
Entities and Value Objects collaborate to form complex relationships that meet invariants within the domain model. When dealing with large interconnected associations of objects, it is often difficult to ensure consistency and concurrency when performing actions against domain objects. Domain-Driven Design has the Aggregate pattern to ensure consistency and to define transactional concurrency boundaries for object graphs. Large models are split by invariants and grouped into aggregates of entities and value objects that are treated as conceptual whole.
Let's go with an example to see the definition in practice.
Simple Example
The first example shows how defining an Aggregate Root helps to ensure consistency when performing actions against domain objects.
Given the next business rule:
Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.
Here there is an aggregate consisting of Auction and Bids where the Auction is the Aggregate Root.
If we said that Bid is also a separate Aggregate Root, you would have a BidsRepository, and you could easily do:
var newBid = new Bid(money);
bidsRepository.save(auctionId, newBid);
You would then be saving a Bid without checking the defined business rule. However, with Auction as the only Aggregate Root, the design enforces the rule, because you need to do something like:
var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);
Therefore, you can check your invariant within the method placeBid and nobody can skip it if they want to place a new Bid.
Here it is pretty clear that the state of a Bid depends on the state of an Auction.
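For a runnable version of the same idea, here is a minimal Python sketch. The names (`Auction`, `Bid`, `place_bid`, `AuctionEnded`) mirror the pseudocode above but are otherwise assumptions, as is the choice to reject a bid placed exactly at the end time.

```python
from dataclasses import dataclass, field
from datetime import datetime
from decimal import Decimal


class AuctionEnded(Exception):
    """Raised when a bid arrives after the auction has closed."""


@dataclass(frozen=True)
class Bid:
    amount: Decimal
    placed_at: datetime


@dataclass
class Auction:
    """Aggregate root: the only way to add a Bid is through place_bid."""
    id: str
    ends_at: datetime
    bids: list = field(default_factory=list)

    def place_bid(self, bid):
        # The invariant lives here, so no caller can bypass it.
        if bid.placed_at >= self.ends_at:
            raise AuctionEnded(f"auction {self.id} ended at {self.ends_at}")
        self.bids.append(bid)

    @property
    def winning_bid(self):
        # Highest amount wins; None when there are no bids yet.
        return max(self.bids, key=lambda b: b.amount, default=None)
```

With no separate repository for bids, saving the auction is the only way a new bid reaches storage.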
Complex Example
Back to your example of Orders associated with a Customer: there seem to be no invariants that force us to define a huge aggregate consisting of a Customer and all her Orders, so we can simply keep the relation between the two entities through an identifier reference. By doing this, we avoid loading all the Orders when fetching a Customer, and we also mitigate concurrency problems.
But say the business now defines the following invariant:
We want to provide Customers with a pocket so they can charge it with money to buy products. Therefore, if a Customer now wants to buy a product, it needs to have enough money to do it.
That said, the pocket is a VO inside the Customer Aggregate Root. Having two separate Aggregate Roots, one for Customer and another for Order, no longer seems the best way to satisfy the new invariant, because we could save a new order without checking the rule. It looks like we are forced to make Customer the root, which is going to hurt performance, scalability, concurrency, and so on.
Solution? Eventual consistency. What if we allow the customer to buy the product? That is, we keep an Aggregate Root for Orders, so we create the order and save it:
var newOrder = new Order(customerId, ...);
orderRepository.save(newOrder);
We publish an event when the order is created, and then we check asynchronously whether the customer has enough funds:
class OrderWasCreatedListener:
    handle(event):
        var customer = customerRepository.findOfId(event.customerId);
        var order = orderRepository.findOfId(event.orderId);
        customer.placeOrder(order); // check business rules
        customerRepository.save(customer);
If everything went well, we have satisfied our invariants while keeping the design we wanted from the beginning, modifying just one Aggregate Root per request. Otherwise, we send an email to the customer telling her about the insufficient funds. We can take advantage of that email by adding alternative options she can purchase with her current budget, and by encouraging her to top up the pocket.
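The asynchronous check can be sketched in Python as follows. This is an assumption-laden illustration, not the original listener: plain dicts stand in for the two repositories, the event is a dict, and the follow-up event names `FundsReserved` and `InsufficientFunds` are hypothetical. The handler modifies only the Customer aggregate and reports the outcome as a new event.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Order:
    id: int
    customer_id: int
    total: Decimal


@dataclass
class Customer:
    id: int
    pocket: Decimal   # money the customer has charged in advance

    def place_order(self, order):
        """Business rule: the pocket must cover the order total."""
        if order.total > self.pocket:
            return False
        self.pocket -= order.total
        return True


def on_order_created(event, customers, orders):
    """Handle an OrderWasCreated event; only the Customer is modified."""
    customer = customers[event["customer_id"]]
    order = orders[event["order_id"]]          # read only
    if customer.place_order(order):
        return {"type": "FundsReserved", "order_id": order.id}
    # A separate handler could react to this by emailing the customer.
    return {"type": "InsufficientFunds", "order_id": order.id,
            "customer_id": customer.id}
```

Whatever consumes the returned event (confirming the order, sending the email) runs as its own step, so each step still touches a single aggregate.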
Keep in mind that the UI can help us prevent customers from paying without enough money, but we cannot blindly trust the UI.
Hope you find both examples useful, and let me know if you find better solutions for the exposed scenarios :-)
In cases like this, could it mean that I need to redesign and rethink my aggregates?
Almost certainly.
The driver for aggregate design isn't structure, but behavior. We don't care that "a user has thousands of orders". What we care about are what pieces of state need to be checked when you try to process a change - what data do you need to load to know if a change is valid.
Typically, you'll come to realize that changing an order doesn't (or shouldn't) depend on the state of other orders in the system, which is a good indication that two different orders should not be part of the same aggregate.
I would like your advice about integrating bounded contexts.
I have a use case that has put me in a corner:
I have a bounded context for Contract management. I can add parties (various external organizations, for example) to a contract and select, for each party, their investment/contribution (e.g. 10% of the total). So contract management is two-fold: one part is administrative (add a party, manage multiple dates, ...), the other is financial (plan contributions that span multiple years, check how contributions are consumed, ...).
I have another bounded context for Budget. This context is responsible for expense management at the organisation level. Example: a service A will have 1000 € of expense capacity. We can plan a budget, and after that each party in the organisation can consume their part by buying things. To build a budget, the user in charge of the enterprise budget can allocate money directly or integrate the yearly financial component of a contract. When we integrate a contract part into the budget, we freeze the data inside the budget, i.e. we copy the monetary data from one database table into another (adding some audit information). We have a single database.
This last part is what I struggle with. Each bounded context is a dedicated application. In the budget application, after a contract part has been integrated into the current budget, I need to display the budget detail lines. Unfortunately, the budget tables contain only the monetary data and none of the basic information about the contract (object, reference, ...).
What I am thinking:
Sometimes it is not bad to duplicate data between bounded contexts. I already freeze the money part of a contract; I could also freeze/duplicate the object and reference of the contract. Querying would then take place only inside the budget context. What is problematic here is the data duplication: today I need object/reference, and tomorrow I may need more fields. I would also need domain events to keep the data in sync between contract and budget.
Query the budget and, for each line, query a contract service that returns the data needed. That keeps each context autonomous, but I need to make lots of database requests to enrich the budget detail line objects.
With only one join at the database level we can make this work. But what about coupling? It is the simplest solution and what we are doing today (is it a shared kernel?). It seems we cannot change the contract structure without rebuilding the budget application, and I don't have a programmatic contract between the contexts.
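The first option (freezing a few contract fields alongside the money) could look roughly like the Python sketch below. All names are assumptions: `ContractSnapshot`, `BudgetLine` and `integrate_contract_part` are hypothetical, and the `published` dict stands in for whatever the contract context would expose to other contexts.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class ContractSnapshot:
    """Copy of the few contract fields the budget screen displays."""
    contract_id: int
    reference: str
    contract_object: str


@dataclass(frozen=True)
class BudgetLine:
    amount: Decimal
    contract: Optional[ContractSnapshot] = None   # None for direct allocations


def integrate_contract_part(published, amount):
    """Freeze both the money and the descriptive fields at integration time.

    `published` is a hypothetical dict exposed by the contract context.
    """
    snapshot = ContractSnapshot(
        contract_id=published["id"],
        reference=published["reference"],
        contract_object=published["object"],
    )
    return BudgetLine(amount=Decimal(amount), contract=snapshot)
```

The budget screen can then be rendered from budget data alone; whether later contract changes should update the snapshot is a business decision this sketch leaves open.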
My question is:
How can I build this UI screen, which needs data from the budget context while each detail line needs data from the contract context?
Side notes:
Perhaps the identification and perimeter of the contexts are wrong from the start (it is a legacy design).
What I would like is to keep the contexts separate (loose coupling). If we can specify design contracts between the contexts, maintenance is easier (or is it?).
I fail to see how to integrate these contexts (I need to re-read about shared kernel, upstream/downstream, etc.).
This is an additional, distinct bounded context. It has some overlap with the existing bounded contexts, which can easily lead you down the wrong path (merging contexts or putting additional behaviour in a context where it doesn't belong).
Sometimes it's OK to have entities in different bounded contexts which refer to the same logical entity, but which each provide a different view of that entity for the purposes of a specific scenario (e.g. in a specific context).
A good example of this is in an e-commerce scenario. In most e-commerce applications you will have the concept of an Order, but there is no global, definitive notion of what an "order" is. In a finance context - the order is simply an invoice. In a fulfilment context - the order is simply a packing list and an address to send the goods to. In a marketing context - the order represents a little piece of intelligence about what the customer is interested in, which can be used for future targeted marketing.
There is a thread of commonality which runs through all of those entities, but you would likely see at least 3 separate Order classes, each one capturing the concept of an order within a context.
And so in your case, you have a bounded context for Contract and a bounded context for Budget. It seems to me that you now have another way of looking at these entities, and specifically at the way in which they interact with each other. This is a new view of the entities, a view which can be captured in its own context. This new context will likely have its own Contract and Budget entities, and there will be overlap with the Contract and Budget contexts, but there will also be additional relationships and behaviour in there which wouldn't make sense in those other contexts.
This is a really difficult idea to explain :) I wrote an answer to a similar question some time ago here: DDD - How to design associations between different bounded contexts