Modelling aggregates with subtypes (DDD/CQRS) - domain-driven-design

I have a subdomain which involves tracking user financial data across different financial account types.
For example, users can input data for their:
bank accounts,
credit cards,
loans,
lines of credit,
real estate,
and more...
Now within each individual type, there are more subtypes.
For instance, under loans:
personal loans,
business loans,
mortgages,
car loans,
and more...
They would each have their own particular invariants, with some unique properties and functionality, and some shared properties and functionality.
I've been approaching this using composition, creating an aggregate for each subtype, and using interfaces and helper interface implementations to share similar logic between aggregates.
However, it appears as though I'm going to end up with dozens of different aggregates when modelling all these different account types. This doesn't feel right.
Alternatives I've considered:
have a type property on the loan aggregate, and conditional logic based off the type.
create different bounded contexts for each of these types: This feels like overkill, I believe this is all part of the same business subdomain.
create aggregates based off shared functionality - eg SecuredLoan and UnsecuredLoan aggregates
creating subclasses in the general aggregates to hold the subtype's unique functionality. get some encapsulation of subtype specific logic, with some conditional logic still (eg conditional properties on the aggregate). Not really sure the difference between this and just creating a separate aggregate for each subtype
Tradeoffs seem to be, the more general the implementation, there will end up being a ton of conditional logic, and conditional properties based off the subtype.
Versus building specific aggregates for each subtype, the logic per aggregate is simplified, but there ends up being hundreds of commands in the application layer, a lot of them which are basically the same thing but to a different subtype. Additionally, there end up being dozens of repositories.
It feels like I either get an explosion of conditional logic complexity in a general aggregate, or an explosion of the number of aggregates (or contexts) if building one per subtype.
Question - is there a known pattern for dealing with this type of modelling problem? Or is it really just dealing with the above tradeoffs, and finding something which fits best? In that case, is there some precedent I can apply to the decision-making process, as I'm struggling to decide between the above approaches. And is it problematic if there end up being many dozens of aggregates within a given context?

Rather than starting with the data at rest to get your aggregates, consider instead what operations/changes ("commands" one might say) will be performed and how the results of those operations affect future operations is what leads to what the aggregates want to be. Event storming style approaches can be helpful for figuring out these relationships between state changes.
For instance, each of these kinds of loans might have AccrueInterest, DrawPrincipal, and RecordPayment commands which operate on the balances identically (given perhaps configurable rate parameters etc.) and which don't affect and aren't affected by other commands. In that scenario, you can have a Loan aggregate which models the idea that there's a loan with interest and principal balances on which interest accrues and payments are made. An AutoLoan aggregate might then just be managing the collateralization of Loan ABC123 with VIN 1G1234567890.

Sorry for the simple starting answer, but …
Build your aggregates based on your actual use cases.
Aggregate A - Scenario A
Aggregate B - Scenario B
…
Avoid building aggregates for general conditions, DDD and ubiquitous language is about developing language, aggregates and systems around the use case not general purpose.
General purpose has its use cases and isn’t necessary anti DDD; but the focus is on creating the specifics then abstracting to generality.

Related

Relationship between concepts in DDD

I'm developing a budgeting app using Domain Driven Design. I'm new to DDD and therefore need a validation of my design.
Here are the concepts I came up with:
Transaction - which is either income or expense, on annual or monthly or one-off etc. basis.
Budget - which is the calculated income, expenses and balance projection, divided into occurrences (say e.g. 12 months over the next year, based on the Transactions).
I made the Transaction the Entity and Aggregate Root. In my mind it has identity, it's a concrete planned expense or income that I know I'll receive, for a concrete thing, and I also need to persist it, so I can calculate the budget based on all my transactions.
Now, I have an issue with the Budget. It depends on my concrete list of Transactions. If one of the Transactions gets deleted, the budget will need to be re-calculated (seems like a good candidate for a domain event?). It's a function of my identifiable transactions at any given time.
Nothing outside the Aggregate boundary can hold a reference to anything inside, except to the root Entity. Which makes me think the budget is the Aggregate Root as it cannot be a ValueObject or Entity within the Transaction.
What's confusing is that I don't necessarily need to persist the budget (unless I want to cache it). I could calculate it from scratch on request, and send it over to the client app. 2 different budgets could have the same number of occurrences, incomes, expenses and balances (but not Transactions). Perhaps an argument for making it a ValueObject?
So, my questions is - what is the Budget?
Domain context vs Aggregate
First element you get wrong is a point of details about DDD semantics. If there is only one object in your "aggregate", then it is not an aggregate. An aggregate is a structure made of multiple (2+) objects, with at least one being an entity and called the aggregate root. If a TransactionRpository returns a Transaction object that has no value object or entity, then Transaction is an entity but not an aggregate nor an aggregate root. If a BudgetRepository returns a Budget entity that includes a Transaction object, then Budget and Transaction form an aggregate, Budget being the aggregate root. If Budget and Transaction are returned from different repositories, then they form different contexts.
Context being the generic concept that can either be an aggregate or an entity.
Contexts are linked to use cases
Second element you get wrong is that you are trying to design your domain model outside of your use cases context. Your application clearly manipulates both concepts of Budget and Transactions, but does your application handles uses cases for both (budget management and transaction management) ? If yes, are these uses case different in a way that implies different domain constraints ?
If your application only handles Budget management, or both but they share their business constraints, then you only need a single context, that manipulates both concepts in a single aggregate. In that situation, Budget is probably your root aggregate, and it's up to your mode and use cases to tell whether the Transaction is a value object or you need to access them by Id.
If your application handles uses cases for both, with different business constraints, then you should split your domain in two contexts, with two different models, one for the Budget management use cases, the other for the Transaction management use cases.
Polysemic domain model
The third element you get wrong, is that you are trying to build a single, unified, normalized domain model. This is wrong because it introduces very complex structures, and a lot of business rules that are irrelevant to your business cases. Why would you need to manipulate the Budget domain model when the use case does not need knowledge of the Budget concept or linked business rules ?
If your application has use cases for both concepts, you need two models. The Budget management model should not use the Transaction management model. However, that does not implies that the Budget model is not allowed to manipulate the Transaction concept and vice versa. It only means you must write another model for that. You could have a Budget context that manipulates Budget and BudgetTransaction models, and Transaction context that manipulates Transaction and TransactionBudget models. These models can map to the same RDBMS tables with different columns, relevant to their use cases, implementing relevant business rules.
This is called writing a polysemic domain model.
Conclusion
So, my questions is - what is the Budget?
It is not possible to answer definitely your last question, as the answer depends on the use cases your application handles. However, you mention the following constraint:
If one of the Transactions gets deleted, the budget will need to be re-calculated
This seems a very good argument in favor of making your application as a single context application, based on an aggregate with Budget being the aggregate root and Transaction being an entity in the aggregate.
If you don't need to, try to refrain from splitting these two concepts in different contexts, unless you have very good reasons to do so: they manipulate excluding columns, they manipulate excluding business rules, you are interested in deploying these two in different bounded contexts, different services, as they would scale differently, etc ...
Having business constraints that span accross multiple contexts implies a complex implementation based on domain events, 2-phase commits, saga pattern, etc ... It's a lot of work, you should balance that work with the benefits you expect in return.

Bounded Context per jurisdiction with similar business process but slight variances in domain language and Entity attributes

How would you handle a domain that shares much of the same business logic, but has slight variances in both the domain language and attributes of entities. These variances change by "region".
A fictitious example is a Real Estate System for managing residential Real Estate. The language used can vary slightly between State/Province, and attributes about the Real Estate can be more detailed in some States. There would be an Office in each State/Province managing the Real Estate for that "region".
Would you create a separate Bounded Context for each State/Province? So there would potentially be 50+ Bounded Contexts?
Would you create a single Bounded Context, and just handle the variances of language, and data through object inheritance or composition?
Let me start off by saying that I don't think there's a silver bullet here. As always it all depends on different factors. That's way my reaction is not a concrete idea but rather a set of consideration you and your team could take into account on the path to a decision.
If I would be asked to define a bounded context I would say that it is the setting where a word has a certain meaning or connotation. If we can agree one the above statement then one would immediately be tempted to create a bounded context per region or setting.
That is, if the benefit you gain from having all those context contributes to the comprehensibility of the codebase. But if the context you are creating are really thin or you pay a heavy technical maintenance costs of having all those contexts I would strongly advise against going for multiple contexts.
I'm not certain but I feel that in your case one word will have the same meaning in each context only the word used will be different in each context. Whereas normally the same word used in different contexts has a different meaning.
I hope that this all makes sense and it helps you figure out your issue.
I reached out to Vaughn Vernon, one of the leaders in the area of DDD and I thought I would share his response (summarized by me).
Based on the limited information and examples I provided he brought up these points:
Metadata driven design would be more effective and efficient. Meaning, each "region" would have metadata to drive its flows, and data
Look at the State Pattern to drive regional situations
Look at the concept of DCI for ideas, but limit your attempts at actual DCI implementations as it can increase complexity of the code

Definition of a set of aggregates?

I'm struggling with how/if to define "a set of aggregates". Aggregates are supposed to be stand alone and isolated but it's easy to think of a bigger set of aggregates that belong together. But is this a trap?
Using this "set of aggregates" it would be possible to for instance enumerate and index aggregates on a unique property within the set and have other domain rules that could be validated across all aggregates in the set. It's tempting but also feels a bit wrong.
Another approach would be to avoid this thinking completely and not allow/define a set of aggregates and not allow enumerating aggregates but only load/save on aggregate-id. Using this option if would be necessary to reference aggregates from other aggregates and by doing this build up an interconnected graph of aggregates.
The approaches are similar to having aggregates in a folder on disk or having an "internet" of aggregates where the references between them are defining the bigger set of aggregates. In any case I'm really stuck on this problem. I have never read anywhere about this and I guess nobody really cares that much? I'm not sure I explain this very good but my question is if there are any definitions of the "set of aggregates" or if we should think of aggregates as totally isolated/on its own and with only a unique aggregate-id (UUID)?
The set of aggregates could for instance be the database being used under the surface. But what I'm wondering is if this database as in the information about what aggregates it contains has any definition in DDD or if we should think about a set of aggregates as an interconnected graph where only traversal of this graph can be used to enumerate all "associated" aggregates.
Aggregates are connected
In any application with sufficient complexity, Aggregates end up referencing one-another. And it is perfectly reasonable to use their unique identifiers as reference IDs to refer to each other.
But take care to load and persist aggregates outside the domain layer, typically in repositories. If you want to traverse links across aggregates and load them into memory, you will be doing that upfront before handing over control to the domain layer for the actual processing.
Traversing the graph to get all related aggregates is correct, but this rarely spans across too many aggregate boundaries. You rarely find a single change or rule to be applied throughout the application. If you do have such a transaction, it is probably a sign of the domain design needing improvement, simply because you are spreading one responsibility/change amongst many aggregates.
The connectivity is so usual that you should watch out for aggregates that have no linkages with the rest of the system. They are either standalone libraries, or they probably belong to a different bounded context.
Aggregates can morph into different forms
They are aggregates because they form a clear invariant boundary, with their primary responsibility being to enforce invariants across state changes for all the entities within themselves. But they can morph into different kinds of DDD objects based on the requirement.
A good example is of a single Currency note. In most applications, they are value objects. But for the federal bank, they are aggregates with clear cut invariant rules. They are aggregates when they are created and referenced, but in a transaction that ships printed notes to banks, they may become value objects.
So you may have to evaluate whether you are talking about a domain entity in its aggregate form, or as a value object when you consider each linkage.
Aggregates are invariant boundaries
It is wrong to validate domain rules across aggregates.
Your aggregate boundary is an invariant boundary, meaning all the domain rules within it should be satisfied at all time. By that logic, you are going to incorrectly build up a structure that will need to ensure that all domain rules across aggregates are valid at all time. Doing so will impose considerable performance burden, not to mention the complexity in business logic.
But this is not to say that there may be domain rules that span across aggregates. The correct way to accomplish this would be using eventual consistency and an Event-driven approach.
The primary changing aggregate would validate and persist the data, and bubble up an event containing the state change. Other aggregates would then act on the event and bring themselves up-to-date. If an aggregate's domain rules break because of the change, there is usually a supplementary mechanism that allows correction of the problem (a preferred mechanism) or a rollback of the first state change (happens very rarely).
Perhaps you can find a common denominator the agg sets have in common and use that to work with?
A simplified example; there is a set of Books and a set of Users that have nothing in common except you want to know whenever they were first registered? What might be an option is to have an interface FirstRegistration and then you can choose to either expand Books/Users or create a specific entity instead.
I'm struggling with how/if to define "a set of aggregates". Aggregates
are supposed to be stand alone and isolated but it's easy to think of
a bigger set of aggregates that belong together. But is this a trap?
I think you're struggling because indeed the idea of a set of aggregates (instances) is very generic, and the uses of such things are contextual and domain-specific. People don't talk specifically about it because of course you may have behaviors that operate on a collection of multiple aggregates, but that doesn't give such collections any particular common properties or requirements that would allow you, from a general DDD perspective, to characterize such collections more specifically than "a set of aggregates", "a list of distinct aggregates", or similar.
Using this "set of aggregates" it would be possible to for instance
enumerate and index aggregates on a unique property within the set and
have other domain rules that could be validated across all aggregates
in the set. It's tempting but also feels a bit wrong.
Tempting why? You've couched the question in very abstract terms, so it's pretty much impossible to contradict you about the "it would be possible", but just because something may be possible doesn't mean it would be useful. In practice, I think you'll find that rules or behaviors that operate on collections of aggregates most naturally belong not to collections of aggregates in an abstract sense, but rather to other aggregate types in your domain model, to domain repositories, or to domain services.
It is entirely plausible that your domain model might want to handle particular sets of aggregates characterized by some rule. For example, if you're an airline, then one of the aggregates in your domain model might a single seat on a flight, since that's the unit you sell. It makes sense in that case that there would be operations on all the seats on a particular flight, for example, but whatever rules and behaviors you might have about that are specifically about that kind of aggregate, selected in that particular way.
Another approach would be to avoid this thinking completely and not
allow/define a set of aggregates and not allow enumerating aggregates
but only load/save on aggregate-id.
It's surely counterproductive to forbid working with sets of aggregates. Just don't attribute more significance to it than is warranted. There is nothing particularly special about sets of aggregates in general.
Using this option if would be
necessary to reference aggregates from other aggregates and by doing
this build up an interconnected graph of aggregates.
I don't follow that. One certainly must be able to retrieve and store individual aggregates from persistence, as that's more or less the defining property of aggregates -- they are the unit of persistence. But that doesn't mean that you must reject the ability to work with collections of aggregates. However, sets of aggregates do not have identity in the same way that individual aggregates do, so yes, relationships between aggregates need to be modeled in terms of individual aggregates. Nevertheless, that does not inherently preclude 1:m or n:m relationships among aggregates.
I'm really stuck on this problem. I have never read anywhere about this and I guess nobody really cares that much?
You'll find all sorts of uses of various sets of aggregates in applications built and maintained based on DDD ideas, but there's not much to talk about at the level of abstraction of your question, and what there is is already summed up in the words "set" and "aggregate".
The set of aggregates could for instance be the database being used
under the surface. But what I'm wondering is if this database as in
the information about what aggregates it contains has any definition
in DDD
Not to my knowledge. I suspect most DDD practitioners would just call it "the data", or something similar.
or if we should think about a set of aggregates as an
interconnected graph where only traversal of this graph can be used to
enumerate all "associated" aggregates.
I'm still not seeing why you set that up as a thing. Sure, depending on the domain model, you might be able to traverse all or substantial chunks of the data by traversing associations between aggregates, and that might be appropriate for some purposes, but DDD doesn't have to give a special name or special rules for sets of aggregates for you to work with them.
Like any useful methodology, DDD exists to solve problems. Its bread & butter is complex applications with complex data and evolving requirements. It is not to be interpreted as a straight jacket preventing designers and developers from (thoughtfully) writing designs and code that incorporate aspects of other design approaches, much less designs and code that provide for the application's idiosyncratic needs.

DDD - Multiple Bounded Contexts because of differing aggregate data?

We try to split up our domain into bounded contexts with the goal to have a modular application design/architecture.
We did an enlightening EventSorming session which helped us a lot to identify bounded contexts and its aggregates. After the workshop every participant agreed on the bounded contexts we identified.
Nevertheless we feel uncomfortable as we fear our bounded contexts are still too large. EventStomring focusses on the domain events / process and that's the major building block we used to identify our bounded contexts.
We also identified aggregates like "Contract". Every contract nearly follows the same process, but the amount of data these contracts contain can differ massively. There are very simple contract types and contract types which include a lot of additional data and attachments.
Is it meaningful to declare another bounded context just because the aggregate's data is more complex?
Both approaches have their drawbacks:
Implementing all contract types in one bounded context might lead to a lot of if-Statements in the code in order to handle the differing data.
Extracting a new bounded context might lead to a lot of duplicate code just because some data differs.
Any suggestions / best practices how to handle this?
...domain events / process and that's the major building block we used
to identify our bounded contexts
BCs are not identified by processes, BCs are related to the language. Each BC has its own ubiquitous language (UL). A BC is the context in which a concept has meaning. Anyway BCs belong to the solution space. First of all you should explore the domain (problem space) and split it in subdomains, distilling the core domain. Then you model each subdomain. A BC is the context where a model lives. Ideally the relationship between subdomains and BCs is 1:1.
The process of discovering subdomains is iterative, and you will find them as you know the domain better, talking to experts. When you find a word that may have different meanings, or when two different words have the same meaning, then it means that you are crossing a boundary between BCs.
So, subdomains identification is not about processes, but about concepts and UL.
Is it meaningful to declare another bounded context just because the
aggregate's data is more complex?
No, you shouldn't create BCs arbitrary just because aggregates are complex. BCs are based on the UL.
Any suggestions / best practices how to handle this?
You should learn more about the domain (contract, types, etc) by talking to domain experts, and study your aggregate (transactional consistency)... Anyway, if you split your aggregate into anothers, it doesn't mean that they belong to different BCs, they still can belong to the same BC. A BC can have more than one aggreagate. It all depends on your concrete domain.
Bounded contexts have little to do with if-statements, so I'm not sure what you mean.
Bounded contexts are a semantically closed set of business functionalities. Basically your bounded context is well defined if it can execute its functions in complete isolation, even if the other contexts are not available.
Other than that, you can have any design inside of the context. I feel the amount of if-statements depends more on your class/code-design, like whether you use polymorphism correctly, interfaces, things like that.
But, to your point: You don't need to have everything perfect the first time. If you identified some valid contexts, you already did the hard part. If any context can be further divided, that could happen later anytime with little impact on others (since contexts are more or less isolated).
No specific business teams for different kinds of contracts
No dedicated dev team for specific kinds of contracts
Same ubiquitous language is used for all contracts
Every contract nearly follows the same process
These to me are signs that all contracts belong to the same business subdomain and should ideally be in the same Bounded Context - unless legacy or third party systems force you otherwise.

Bounded context find the boundary?

In my current project (e-commerce website), we have different Bounded Context like: billing, delivery or payment in our checkout process.
On top of this, depending on what the customer will buy, the checkout process will be different. So depending on the content of her cart the number of steps in the checkout process can be different, or we won't/will ask her for certain informations.
So should one create a different bounded context for each different type of checkout process ?
For example, the Order aggregate root will be different depending on the checkout process
EticketsOrder (in this context we don't need a delivery address so we won't ask one to the user)
Ticket BillingAddress
ClothesOrder (in this context we need a delivery address and there will be an additional step in the checkout process to get this)
Clothes BillingAddress DeliveryAddress
This separation will imply to create two different domain entities even thought they have similar properties.
What's the best way to model this kind of problem ? How to find the context boundary ?
A bounded context is chiefly a linguistic boundary. A quote from the blue book (highlighted key part):
A BOUNDED CONTEXT delimits the applicability of a particular model so
that team members have a clear and shared understanding of what has
to be consistent and how it relates to other CONTEXTS. Within that
CONTEXT, work to keep the model logically unified, but do not worry
about applicability outside those bounds. In other CONTEXTS, other
models apply, with differences in terminology, in concepts and rules,
and in dialects of the UBIQUITOUS LANGUAGE.
A question to ask is whether the different types of orders created are entirely distinct aggregates, or are they all order aggregates with different values. Is there a need to consider order as a whole regardless of how they were created? I've build and worked with ecommerce systems where different types of orders were all modeled as instances of the same aggregate, just with different settings and there were no linguistic issues. On the other hand, the orders in your domain may be different enough to warrant distinct contexts.
I often consider BC boundaries from the perspective of functional cohesion. If you segregate orders into two BCs will there be a high degree of coupling between them? If so, that may be a sign that they should be combined into one BC. On the other hand, if the only place that the BCs interact is for reporting purposes, there is no need to combined them.
It appears as though you may have missed a bounded context. When this happens one tends to try and fit the functionality into an existing BC. The same thing happens to aggregate roots. If something seems clumsy or it doesn't make sense try to see whether you haven't missed something.
In your example I would suggest a Shopping BC (or whatever name makes sense). You are trying to fit your checkout process into your Order BC. Your Shopping BC would be responsible for gathering all the data and then shuttling it along to the relevant parts.
The product type selected will determine whether a physical delivery is required.
Hope that helps.

Resources