I'm developing a budgeting app using Domain Driven Design. I'm new to DDD and therefore need a validation of my design.
Here are the concepts I came up with:
Transaction - which is either income or expense, on annual or monthly or one-off etc. basis.
Budget - which is the calculated income, expenses and balance projection, divided into occurrences (say e.g. 12 months over the next year, based on the Transactions).
I made the Transaction the Entity and Aggregate Root. In my mind it has identity, it's a concrete planned expense or income that I know I'll receive, for a concrete thing, and I also need to persist it, so I can calculate the budget based on all my transactions.
Now, I have an issue with the Budget. It depends on my concrete list of Transactions. If one of the Transactions gets deleted, the budget will need to be re-calculated (seems like a good candidate for a domain event?). It's a function of my identifiable transactions at any given time.
Nothing outside the Aggregate boundary can hold a reference to anything inside, except to the root Entity. Which makes me think the budget is the Aggregate Root as it cannot be a ValueObject or Entity within the Transaction.
What's confusing is that I don't necessarily need to persist the budget (unless I want to cache it). I could calculate it from scratch on request, and send it over to the client app. 2 different budgets could have the same number of occurrences, incomes, expenses and balances (but not Transactions). Perhaps an argument for making it a ValueObject?
So, my question is - what is the Budget?
Domain context vs Aggregate
The first thing you get wrong is a detail of DDD semantics. If there is only one object in your "aggregate", then it is not an aggregate. An aggregate is a structure made of multiple (2+) objects, at least one of which is an entity, called the aggregate root. If a TransactionRepository returns a Transaction object that contains no value object or entity, then Transaction is an entity, but neither an aggregate nor an aggregate root. If a BudgetRepository returns a Budget entity that includes a Transaction object, then Budget and Transaction form an aggregate, with Budget as the aggregate root. If Budget and Transaction are returned from different repositories, then they form different contexts.
Context being the generic concept that can either be an aggregate or an entity.
Contexts are linked to use cases
The second thing you get wrong is that you are trying to design your domain model outside the context of your use cases. Your application clearly manipulates both the Budget and Transaction concepts, but does it handle use cases for both (budget management and transaction management)? If yes, do these use cases differ in a way that implies different domain constraints?
If your application only handles Budget management, or handles both but they share their business constraints, then you only need a single context that manipulates both concepts in a single aggregate. In that situation, Budget is probably your aggregate root, and it's up to your model and use cases to tell whether Transaction is a value object or an entity that you need to access by Id.
If your application handles use cases for both, with different business constraints, then you should split your domain into two contexts, with two different models: one for the Budget management use cases, the other for the Transaction management use cases.
Polysemic domain model
The third thing you get wrong is that you are trying to build a single, unified, normalized domain model. This is a problem because it introduces very complex structures and a lot of business rules that are irrelevant to a given use case. Why would you need to manipulate the Budget domain model when the use case does not need knowledge of the Budget concept or its business rules?
If your application has use cases for both concepts, you need two models. The Budget management model should not use the Transaction management model. However, that does not imply that the Budget model is not allowed to manipulate the Transaction concept, and vice versa. It only means you must write another model for that. You could have a Budget context that manipulates Budget and BudgetTransaction models, and a Transaction context that manipulates Transaction and TransactionBudget models. These models can map to the same RDBMS tables with different columns, relevant to their use cases, implementing the relevant business rules.
This is called writing a polysemic domain model.
Conclusion
So, my question is - what is the Budget?
It is not possible to answer your last question definitively, as the answer depends on the use cases your application handles. However, you mention the following constraint:
If one of the Transactions gets deleted, the budget will need to be re-calculated
This seems a very good argument for building your application around a single context, based on an aggregate with Budget as the aggregate root and Transaction as an entity inside the aggregate.
Unless you have to, refrain from splitting these two concepts into different contexts. Good reasons to split would be, for example: they use disjoint sets of columns, they follow disjoint business rules, or you want to deploy them as different bounded contexts or different services because they would scale differently.
Business constraints that span multiple contexts imply a complex implementation based on domain events, two-phase commits, the saga pattern, etc. That is a lot of work, and you should weigh it against the benefits you expect in return.
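To make the single-context option concrete, here is a minimal sketch in C#. It assumes that every transaction is a fixed monthly amount and that the projection is simply the cumulative net balance per month; the names Budget, AddTransaction, RemoveTransaction and ProjectBalance are illustrative and not taken from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public enum TransactionKind { Income, Expense }

// Entity inside the aggregate; it is only created and removed through Budget.
public sealed class Transaction
{
    public Guid Id { get; } = Guid.NewGuid();
    public TransactionKind Kind { get; }
    public decimal MonthlyAmount { get; }

    public Transaction(TransactionKind kind, decimal monthlyAmount)
    {
        if (monthlyAmount < 0) throw new ArgumentOutOfRangeException(nameof(monthlyAmount));
        Kind = kind;
        MonthlyAmount = monthlyAmount;
    }
}

// Aggregate root: all changes to the transactions go through this class.
public sealed class Budget
{
    private readonly List<Transaction> _transactions = new List<Transaction>();

    public Guid AddTransaction(TransactionKind kind, decimal monthlyAmount)
    {
        var transaction = new Transaction(kind, monthlyAmount);
        _transactions.Add(transaction);
        return transaction.Id;
    }

    public void RemoveTransaction(Guid transactionId) =>
        _transactions.RemoveAll(t => t.Id == transactionId);

    // The projection is computed on demand from the current transactions,
    // so removing a transaction cannot leave a stale budget behind.
    public IReadOnlyList<decimal> ProjectBalance(int months)
    {
        decimal net = _transactions.Sum(t =>
            t.Kind == TransactionKind.Income ? t.MonthlyAmount : -t.MonthlyAmount);
        return Enumerable.Range(1, months).Select(m => net * m).ToList();
    }
}
```

Because the projection is derived inside the aggregate, the "re-calculate on delete" rule needs no domain event in this variant. Whether that holds for your real rules (annual or one-off transactions, cached budgets) depends on your use cases.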
I am re-designing my side project to use DDD. I am doing this for learning purposes. It's an application for planning a home budget and analysing spending. One of the app's features is that the user registers expenses and divides them into categories.
I have general question: how do you design aggregates? What steps to follow?
Below you'll find steps that I followed that lead me nowhere.
I did a design-level event storming session for the project, up to the step where I identified invariants, and now I am trying to name aggregates. Please consider the following slice of the event storming artifact as an example:
I identified relevant entities. Entities relevant to the example:
Expense
Expense category
Expense category group
I designed aggregate that fulfills all of the invariants:
I read this great article about designing aggregates. According to the article, aggregates should follow these rules:
Consistency of the lifecycle
Consistency of the problem domain
Consistency of the scenario frequency
As few elements as possible within the aggregation
In the case of my aggregate I can see that:
The consistency of the lifecycle rule is violated (because an expense is still meaningful when you delete its expense category)
The consistency of the scenario frequency rule is violated (because expenses will be created much more frequently than expense categories will be modified)
There are also too many elements in the aggregate. The list of expenses will keep growing.
I re-designed the aggregates so that the rules are satisfied. Here's what I've got.
I realized that one of the invariants is now no longer covered by transactional consistency, namely the invariant stating "Expense cannot be assigned to category withdrew from usage before the expense date". I know that it is possible to negotiate business rules and replace an invariant with some sort of corrective policy, but in this case I have no idea what this policy could be (this is a side project, I am the stakeholder).
And now I am stuck. Please, help. What am I doing wrong?
So far my conclusions are that:
sometimes I can't have small and well-designed aggregates that satisfy all requirements on consistency
a DDD-style application will probably degenerate fast when developed by a team with the usual structure (more regular/junior developers than seniors/leaders).
developing in DDD style adds a huge overhead spent on analysing which rules should be transactionally consistent, which eventually consistent, and how changes to the rules impact the aggregate structure
Your conclusions are reasonable.
Regarding the invariant:
Expense cannot be assigned to category withdrew from usage before the expense date
I presume you have a method on your Expense that accepts a Category Id and amount, so that this can be called to categorise your expense.
I'd pass in the Category entity itself and go for:
using System;

public class Category
{
    // Null while the category is still in use.
    DateTime? Withdrawn { get; set; }

    public bool IsWithdrawn() => Withdrawn != null && Withdrawn < DateTime.Now;
}

public class Expense
{
    public void Categorise(Category category, decimal amount)
    {
        // The Expense enforces its own invariant before accepting the category.
        if (category.IsWithdrawn())
        {
            throw new InvalidOperationException("Cannot use category. It is withdrawn.");
        }
        // Complete the categorisation
    }
}
Now, your application layer will retrieve the Category to be passed into the Expense method.
The Expense can then enforce its own invariant.
NOTE:
There is the corner case that a Category gets withdrawn by another process after your application layer has retrieved it but before your Expense is committed. As this end-to-end process would be very short, my guess is that this is unlikely to be a concern in your use case, but worth considering.
Approaches for ensuring that Category has not been 'withdrawn' before Expense is saved:
1 - Use a Domain Event
When the Expense is categorised add an appropriate domain event. The domain event handler can then perform a just-in-time check about the validity of the Category before the transaction is committed.
2 - Catch FK Exception
If 'withdrawn' means deleted from the database, then use the approach from above and when you try to save the Categorisation with an FK pointing to a now-deleted Category then you'll get an FK exception which you can catch and handle.
3 - Use Concurrency Token
If 'withdrawn' is just a flag that's put on the Category, or Category is not referenced through a database-enforced foreign key, then you could use a concurrency token. There are various ways of implementing it, but this approach will tell you whether the state of the Category may have changed since you last retrieved it. If so, you can re-run the command. If that state change was the 'withdrawal' of the category, then the second time around the approach above will enforce the invariant.
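As a rough illustration of option 3, the sketch below uses an in-memory store with a version number as the concurrency token. CategorySnapshot, InMemoryCategoryStore and CategoriseExpense are hypothetical names made up for this example, and the retry limit of 3 is an arbitrary assumption. In a real database the version check and the write would have to happen atomically (for example in a single UPDATE with a WHERE clause on the version), which this in-memory version does not attempt.

```csharp
using System;
using System.Collections.Generic;

// Immutable view of a category row, including its concurrency token.
public sealed class CategorySnapshot
{
    public int Id { get; }
    public bool Withdrawn { get; }
    public int Version { get; }

    public CategorySnapshot(int id, bool withdrawn, int version)
    {
        Id = id;
        Withdrawn = withdrawn;
        Version = version;
    }
}

// Hypothetical stand-in for the persistence layer.
public sealed class InMemoryCategoryStore
{
    private readonly Dictionary<int, CategorySnapshot> _rows = new Dictionary<int, CategorySnapshot>();

    public void Add(int id) => _rows[id] = new CategorySnapshot(id, false, 1);

    public CategorySnapshot Load(int id) => _rows[id];

    // Every state change bumps the version.
    public void Withdraw(int id) =>
        _rows[id] = new CategorySnapshot(id, true, _rows[id].Version + 1);

    // True only if nobody changed the category since 'seen' was loaded.
    public bool IsUnchanged(CategorySnapshot seen) => _rows[seen.Id].Version == seen.Version;
}

public static class CategoriseExpense
{
    public static void Run(InMemoryCategoryStore store, int categoryId, int maxAttempts = 3)
    {
        for (int attempt = 0; attempt < maxAttempts; attempt++)
        {
            CategorySnapshot category = store.Load(categoryId);
            if (category.Withdrawn)
                throw new InvalidOperationException("Cannot use category. It is withdrawn.");

            // ... categorise the expense in memory here ...

            // Commit point: succeed only if the token is still the one we read.
            if (store.IsUnchanged(category)) return;
            // Otherwise loop: reload the category and re-run the command.
        }
        throw new InvalidOperationException("Category kept changing; giving up.");
    }
}
```

If the category was withdrawn between the load and the commit, the token no longer matches, the command is re-run, and the second pass hits the IsWithdrawn-style check.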
We are building a system which can sell our api services to multiple companies.
We have
companies (companies which purchased our api)
accounts (each company can have multiple accounts, and each account has its user types)
users (users within account)
Infrastructurally, it looks something like this:
{
    "company1": {
        "accounts": {
            "account1": { "users": ["user1", "user2"], "accountType": "..." },
            "account2": { "users": ["user1", "user2"], "accountType": "..." }
        }
    }
}
One of the business rules states that users can't change accounts after registration.
Another rule states that a user can change their type, but only within that account type.
From my understanding, my domain model should be called UserAccount, and it should consist of Account, User and UserType entities, where Account would be aggregate root.
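class UserAccount
{
    int AccountId;
    string AccountName;
    int AccountTypeId;
    List<UserType> AvailableUserTypesForThisAccount;
    User User;

    void SetUserType(int userTypeId)
    {
        // Invariant: a user may only get a type that this account offers.
        // Assumes UserType exposes an Id (Any requires System.Linq).
        if (!AvailableUserTypesForThisAccount.Any(t => t.Id == userTypeId))
            throw new NotSupportedException();
    }
}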
With this aggregate, we can change type of the user, but it can only be type which is available for that account (one of invariants).
When I fetch UserAccount from the repository, I would fetch all necessary tables (or entity data objects), map them to the aggregate, and return it as a whole.
Is my understanding and modeling going in the right direction?
It's important to understand the design trade-off of aggregates; because aggregates partition your domain model into independent spaces, you gain the ability to modify unrelated parts of the model concurrently. But you lose the ability to enforce business rules that span multiple aggregates at the point of change.
What this means is that you need to have a clear understanding of the business value of those two things. For entities that aren't going to change very often, your business may prefer strict enforcement over concurrent changes; where the data is subject to frequent change, you will probably end up preferring more isolation.
In practice, isolation means evaluating whether or not the business can afford to mitigate the cases where "conflicting" edits leave the model in an unsatisfactory state.
With this aggregate, we can change type of the user, but it can only be type which is available for that account (one of invariants).
With an invariant like this, an important question to ask is "what is the business cost of a failure here"?
If User and Account are separate aggregates, then you face the problem that a user is being assigned to a "type" at the same time that an account is dropping support for that type. So what would it cost you to detect (after the change) that a violation of the "invariant" had occurred, and what would it cost to apply a correction?
If Account is relatively stable (as seems likely), then most of those errors can be mitigated by comparing the user type to a cached list of those allowed in the account. This cache can be evaluated when the user is being changed, or in the UI that supports the edit. That will reduce (but not eliminate) the error rate without compromising concurrent edits.
From my understanding, my domain model should be called UserAccount, and it should consist of Account, User and UserType entities, where Account would be aggregate root.
I think you've lost the plot here. The "domain model" isn't really a named thing, it's just a collection of aggregates.
If you wanted an Account aggregate that contains Users and UserTypes, then you would probably model it something like
Account : Aggregate {
accountId : Id<Account>,
name : AccountName,
users : List<User>,
usertypes : List<UserType>
}
This design implies that all changes to a User need to be accessed via the Account aggregate, and that no User belongs to more than one account, and that no other aggregate can directly reference a user (you need to negotiate directly with the Account aggregate).
Account::SetUserType(UserHint hint, UserType userTypeId){
if(! usertypes.Contains(userTypeId)) {
throw new AccountInvariantViolationException();
}
User u = findUser(users, hint);
...
}
When I fetch UserAccount from the repository, I would fetch all necessary tables (or entity data objects), map them to the aggregate, and return it as a whole.
Yes, that's exactly right -- it's another reason that we generally prefer small aggregates loosely coupled, rather than one large aggregate.
what about having only the relationship between Account and User live in the Account aggregate as well as the type of user (as an AccountUser entity) and have the rest of the user information live in a separate User aggregate?
That model could work for some kinds of problems -- in that case, the Account aggregate would probably look something like
Account : Aggregate {
    accountId : Id<Account>,
    name : AccountName,
    users : Map<Id<User>,UserType>,
    usertypes : List<UserType>
}
This design allows you to throw exceptions if somebody tries to remove a UserType from an Account when some User is currently of that type. But it cannot, for example, ensure that the user type described here is actually consistent with the state of the independent User aggregate -- or even be certain that the identified User exists (you'll be relying on detection and mitigation for those cases).
Is that better? worse? It's not really possible to say without a more thorough understanding of the actual problem being addressed (trying to understand ddd from toy problems is really hard).
The principle is to understand which business invariants must be maintained at all times (as opposed to those where later reconciliation is acceptable), and then group together all of the state which must be kept consistent to satisfy those invariants.
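For what it's worth, the Map-based Account aggregate described above might look roughly like this in C#. It is only a sketch: user types are reduced to plain strings and user ids to Guid, and both choices, as well as the method names AddUserType and RemoveUserType, are assumptions made for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Account
{
    // User types this account currently supports.
    private readonly HashSet<string> _userTypes = new HashSet<string>();

    // Only the user id and its type live here; the rest of the user
    // would belong to a separate User aggregate.
    private readonly Dictionary<Guid, string> _users = new Dictionary<Guid, string>();

    public void AddUserType(string userType) => _userTypes.Add(userType);

    public void SetUserType(Guid userId, string userType)
    {
        if (!_userTypes.Contains(userType))
            throw new InvalidOperationException($"User type '{userType}' is not available in this account.");
        _users[userId] = userType;
    }

    public void RemoveUserType(string userType)
    {
        // Both collections are inside the same aggregate, so this rule
        // can be checked at the point of change.
        if (_users.Values.Contains(userType))
            throw new InvalidOperationException($"User type '{userType}' is still assigned to a user.");
        _userTypes.Remove(userType);
    }
}
```

Note what the sketch cannot do: nothing here verifies that the Guid passed to SetUserType refers to an existing User, which is exactly the detection-and-mitigation gap mentioned above.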
But what if account can have hundreds or thousands of users? What would be your vision of aggregate?
Assuming the same constraints: that we have some aggregate that is responsible for the allowed range of user types.... if the aggregate got to be too large to manage in a reasonable way, and the constraints imposed by the business cannot be relaxed, then I would probably compromise the "repository" abstraction, and allow the enforcement of the set validation rules to leak into the database itself.
The conceit of DDD, taken from its original OO best practices roots, is that the model is real, and the persistence store is just an environmental detail. But looked at with a practical eye, in a world where the processes have life cycles and there are competing consumers and... it's the persistence store that represents the truth of the business.
I have the following code.
public class CountryFactory : IEntityFactory
{
    private readonly IRepository<Country> countryRepository;

    public CountryFactory(IRepository<Country> countryRepository)
    {
        this.countryRepository = countryRepository;
    }

    public Country CreateCountry(string name)
    {
        if (countryRepository.FindAll().Any(c => c.Name == name))
        {
            throw new ArgumentException("There is already a country with that name!");
        }
        return new Country(name);
    }
}
From a DDD point of view, is this the correct way to create a Country? Or is it better to have a CountryService which checks whether or not a country exists and, if it does not, just calls the factory to return a new entity? That would mean the service is responsible for persisting the Entity rather than the Factory.
I'm a bit confused as to where the responsibility should lie, especially if more complex entities need to be created, where creation is not as simple as creating a country.
In DDD, factories are used to encapsulate the creation of complex objects and aggregates. Usually, factories are not implemented as separate classes but rather as static methods on the aggregate root class that return the new aggregate.
Factory methods are better suited than constructors since you might need to have technical constructors for serialization purposes and var x = new Country(name) has very little meaning inside your Ubiquitous Language. What does it mean? Why do you need a name when you create a country? Do you really create countries, how often new countries appear, do you even need to model this process? All these questions arise if you start thinking about your model and ubiquitous language besides tactical pattern.
Factories must return valid objects (i.e. aggregates), checking all invariants inside it, but not outside. Factory might receive services and repositories as parameters but this is also not very common. Normally, you have an application service or command handler that does some validations and then creates a new aggregate using the factory method and adds it to the repository.
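A minimal sketch of that split, using the Country example from the question: the static factory method checks only what the aggregate itself can know, while the uniqueness check and persistence sit in an application-level handler. The names Register, ICountryRepository, ExistsWithName, InMemoryCountryRepository and RegisterCountryHandler are hypothetical, and the in-memory repository is there only to keep the example self-contained.

```csharp
using System;
using System.Collections.Generic;

public sealed class Country
{
    public string Name { get; }

    private Country(string name) { Name = name; }

    // Factory method: validates invariants internal to the aggregate only.
    public static Country Register(string name)
    {
        if (string.IsNullOrWhiteSpace(name))
            throw new ArgumentException("A country needs a name.", nameof(name));
        return new Country(name);
    }
}

public interface ICountryRepository
{
    bool ExistsWithName(string name);
    void Add(Country country);
}

// Illustrative in-memory implementation.
public sealed class InMemoryCountryRepository : ICountryRepository
{
    private readonly List<Country> _items = new List<Country>();

    public bool ExistsWithName(string name) => _items.Exists(c => c.Name == name);

    public void Add(Country country) => _items.Add(country);
}

// Application service / command handler: validation that needs the
// outside world happens here, then the factory is called.
public sealed class RegisterCountryHandler
{
    private readonly ICountryRepository _countries;

    public RegisterCountryHandler(ICountryRepository countries) { _countries = countries; }

    public Country Handle(string name)
    {
        if (_countries.ExistsWithName(name))
            throw new InvalidOperationException("There is already a country with that name!");
        Country country = Country.Register(name);
        _countries.Add(country);
        return country;
    }
}
```

The check in the handler is still a check-then-act and can race with a concurrent insert, so it does not by itself guarantee uniqueness.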
There is also a good answer by Lev Gorodinski to the question "Factory Pattern where should this live in DDD?"
Besides, the implementation of Factories is described extensively in Chapter 11 of the Red Book (Vaughn Vernon's Implementing Domain-Driven Design).
Injecting a Repository into a Factory is OK, but it shouldn't be your first concern. The starting point should be : what kind of consistency does your business domain require ?
By checking Country name uniqueness in CountryFactory which is part of your Domain layer, you give yourself the impression that the countries will always be consistent. But the only aggregate is Country and since there is no AllCountries aggregate to act as a consistency boundary, respect of this invariant will not be guaranteed. Somebody could always sneak in a new Country that has exactly the same name as the one being added, just after you checked it. What you could do is wrap the CreateCountry operation into a transaction that would lock the entire set of Countries (and thus the entire table if you use an RDBMS) but this would hurt concurrency.
There are other options to consider.
Why not leverage a database unique constraint to enforce the Country name invariant ? As a complement, you could also have another checkpoint at the UI level to warn the user that the country name they typed in is already taken. This would necessitate another "query" service that just calls CountryRepository.GetByName() but where the returned Countries are not expected to be modified.
You'll soon realize that there are really two kinds of models - ones that can give you some domain data at a given moment in time so that you can display it on a user interface, and ones that expose operations (AddCountry) and guarantee that domain invariants always hold. This is a first step towards CQRS.
What is the frequency of Countries being added or modified ? If it is that high, do we really need a Country name to be unique at all times ? Wouldn't it solve a lot of problems if we loosened up the constraints and allowed a user to temporarily create a duplicate Country name ? A mechanism could detect the duplicates later on and take a compensating action, putting the newly added Country on hold and reaching out to the user to ask them to change the name. A.k.a eventual consistency instead of immediate consistency.
Does Country need to be an Aggregate ? What would be the cost if it was a Value Object and duplicated in each entity where it is used ?
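To illustrate the eventual-consistency option from the list above, here is one possible shape for the "detect duplicates later" mechanism. It assumes that the earliest record with a given name wins, that names are compared case-insensitively, and that "on hold" is a simple flag; CountryRecord and DuplicateCountryPolicy are hypothetical names. Reaching out to the user is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class CountryRecord
{
    public string Name { get; }
    public DateTime CreatedAt { get; }
    public bool OnHold { get; private set; }

    public CountryRecord(string name, DateTime createdAt)
    {
        Name = name;
        CreatedAt = createdAt;
    }

    public void PutOnHold() => OnHold = true;
}

public static class DuplicateCountryPolicy
{
    // Compensating action: keeps the earliest record per name and puts
    // every later duplicate on hold. Returns the records that were held.
    public static IReadOnlyList<CountryRecord> Apply(IEnumerable<CountryRecord> countries)
    {
        List<CountryRecord> held = countries
            .GroupBy(c => c.Name, StringComparer.OrdinalIgnoreCase)
            .SelectMany(group => group.OrderBy(c => c.CreatedAt).Skip(1))
            .ToList();

        foreach (CountryRecord country in held)
        {
            country.PutOnHold();
        }
        return held;
    }
}
```

Such a policy would typically run in a background job or an event handler, after the write that created the duplicate has already been committed.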
This is a practical Domain Driven Design question:
Conceptually, I think I get Aggregate roots until I go to define one.
I have an Employee entity, which has surfaced as an Aggregate root. In the Business, some employees can have work-related Violations logged against them:
Employee-----*Violations
Since not all Employees are subject to this, I would think that Violations would not be a part of the Employee Aggregate, correct?
So when I want to work with Employees and their related violations, is this two separate Repository interactions by some Service?
Lastly, when I add a Violation, is that method on the Employee Entity?
Thanks for the help!
After doing even MORE research, I think I have the answer to my question.
Paul Stovell had this slightly edited response to a similar question on the DDD messageboard. Substitute "Customer" for "Employee", and "Order" for "Violation" and you get the idea.
Just because Customer references Order doesn't necessarily mean Order falls within the Customer aggregate root. The customer's addresses might, but the orders can be independent (for example, you might have a service that processes all new orders no matter who the customer is. Having to go Customer->Orders makes no sense in this scenario).
From a domain point of view, you can even question the validity of those references (Customer has reference to a list of Orders). How often will you actually need all orders for a customer? In some systems it makes sense, but in others, one customer might make many orders. Chances are you want orders for a customer between a date range, or orders for a customer that aren't processed yet, or orders which have not been paid, and so on. The scenario in which you'll need all of them might be relatively uncommon.
However, it's much more likely that when dealing with an Order, you will want the customer information. So in code, Order.Customer.Name is useful, but Customer.Orders[0].LineItem.SKU - probably not so useful. Of course, that totally depends on your business domain.
In other words, updating a Customer has nothing to do with updating Orders. And orders, or violations in my case, could conceivably be dealt with independently of Customers/Employees.
If Violations had detail lines, then Violation and Violation line would be part of the same aggregate, because changing a violation line would likely affect a Violation.
EDIT
The wrinkle here in my Domain is that Violations have no behavior. They are basically records of an event that happened. Not sure yet about the implications that has.
Eric Evans states in his book, Domain-Driven Design: Tackling Complexity in the Heart of Software,
An AGGREGATE is a cluster of associated objects that we treat as a unit for the purpose of data changes.
There are 2 important points here:
These objects should be treated as a "unit".
For the purpose of "data change".
I believe in your scenario, Employee and Violation are not necessarily a unit together, whereas in the example of Order and OrderItem, they are part of a single unit.
Another thing that is important when modeling the aggregate boundaries is whether you have any invariants in your aggregate. Invariants are business rules that should be valid within the "whole" aggregate. For example, in the Order and OrderItem example, you might have an invariant that states the total cost of the order should be less than a predefined amount. In this case, any time you want to add an OrderItem to the Order, this invariant should be enforced to make sure that your Order is valid. However, in your problem, I don't see any invariants between your entities: Employee and Violation.
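The Order/OrderItem invariant could be sketched like this. The limit is passed to the constructor purely for illustration, and the property names Sku and Price are assumptions; the point is only that the root checks the rule across the whole aggregate before accepting a new item.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class OrderItem
{
    public string Sku { get; }
    public decimal Price { get; }

    public OrderItem(string sku, decimal price)
    {
        Sku = sku;
        Price = price;
    }
}

public sealed class Order
{
    private readonly decimal _maxTotal;
    private readonly List<OrderItem> _items = new List<OrderItem>();

    public Order(decimal maxTotal) { _maxTotal = maxTotal; }

    public decimal Total => _items.Sum(item => item.Price);

    // Items can only be added through the root, so the invariant
    // "total must stay below the limit" is checked on every change.
    public void AddItem(string sku, decimal price)
    {
        if (Total + price >= _maxTotal)
            throw new InvalidOperationException("Order total would reach or exceed the allowed amount.");
        _items.Add(new OrderItem(sku, price));
    }
}
```

There is no comparable rule tying an Employee to its Violations, which is why the same reasoning leads to separate aggregates there.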
So short answer:
I believe Employee and Violation belong to 2 separate aggregates. Each of these entities is also its own aggregate root. So you need 2 repositories: EmployeeRepository and ViolationRepository.
I also believe you should have a unidirectional association from Violation to Employee. This way, each Violation object knows who it belongs to. But if you want to get the list of all Violations for a particular Employee, then you can ask the ViolationRepository:
var list = repository.FindAllViolationsByEmployee(someEmployee);
You say that you have employee entity and violations and each violation does not have any behavior itself. From what I can read above, it seems to me that you may have two aggregate roots:
Employee
EmployeeViolations (call it EmployeeViolationCard or EmployeeViolationRecords)
EmployeeViolations is identified by the same employee ID and it holds a collection of violation objects. You get behavior for employee and violations separated this way and you don't get Violation entity without behavior.
Whether violation is entity or value object you should decide based on its properties.
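A rough sketch of the EmployeeViolations idea, treating each violation as an immutable value object. That is one possible reading of "decide based on its properties", not a recommendation, and the members Log and CountSince are illustrative only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Value object: an immutable record of something that happened.
public sealed class Violation
{
    public DateTime OccurredOn { get; }
    public string Description { get; }

    public Violation(DateTime occurredOn, string description)
    {
        OccurredOn = occurredOn;
        Description = description;
    }
}

// Aggregate root identified by the same id as the employee it belongs to.
public sealed class EmployeeViolations
{
    private readonly List<Violation> _violations = new List<Violation>();

    public Guid EmployeeId { get; }

    public EmployeeViolations(Guid employeeId) { EmployeeId = employeeId; }

    public IReadOnlyList<Violation> Violations => _violations;

    // The behaviour lives on the root, not on the individual violation.
    public void Log(DateTime occurredOn, string description)
    {
        if (string.IsNullOrWhiteSpace(description))
            throw new ArgumentException("A violation needs a description.", nameof(description));
        _violations.Add(new Violation(occurredOn, description));
    }

    public int CountSince(DateTime cutoff) => _violations.Count(v => v.OccurredOn >= cutoff);
}
```

The Employee aggregate stays untouched when a violation is logged; only the EmployeeViolations instance with the matching id is loaded and saved.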
I generally agree with Mosh on this one. However, keep in mind the notion of transactions from the business point of view. So I actually take "for the purpose of data changes" to mean "for the purpose of transaction(s)".
Repositories are views of the domain model. In a domain environment, these "views" really support or represent a business function or capability - a transaction. Case in point: the Employee may have one or more violations, and if so, they are aspects of one or more transactions at a point in time. Consider your use cases.
Scenario: "An employee commits an act that is a violation of the workplace." This is a type of business event (i.e. transaction, or part of a larger, perhaps distributed transaction) that occurred. The root affected domain object actually can be seen from more than one perspective, which is why it is confusing. But the thing to remember is behavior as it pertains to a business transaction, since you want your business processes to model the real world as accurately as possible. In terms of relationships, just like in a relational database, your conceptual domain model should actually indicate this already (i.e. the associativity), which often can be read in either direction:
Employee <----commits a -------committed by ----> Violation
So for this use case, it would be fair to say that it is a transaction dealing with violations, and that the root - or "primary" entity - is a Violation. That, then, would be the aggregate root you would reference for that particular business activity or business process. But that is not to say that, for a different activity or process, you cannot have an Employee aggregate root, such as for the "new employee process". If you take care, there should be no negative impact from cyclic references, or from being able to traverse your domain model in multiple ways. I will warn, however, that governing this should be thought about and handled by the controller piece of your business domain, or whatever equivalent you have.
Aside: Thinking in terms of patterns (i.e. MVC), the repository is a view, the domain objects are the model, and thus one should also employ some form of controller pattern. Typically, the controller declares the concrete implementation of and access to the repositories (collections of aggregate roots).
In the data access world...
Using LINQ-To-SQL as an example, the DataContext would be the controller exposing a view of Customer and Order entities. The view is a non-declarative, framework-oriented Table type (rough equivalent to Repository). Note that the view keeps a reference to its parent controller, and often goes through the controller to control how/when the view gets materialized. Thus, the controller is your provider, taking care of mapping, translation, object hydration, etc. The model is then your data POCOs. Pretty much a typical MVC pattern.
Using N/Hibernate as an example, the ISession would be the controller exposing a view of Customer and Order entities by way of the session.Enumerable(string query) or session.Get(object id) or session.CreateCriteria(typeof(Customer)).List()
In the business logic world...
Customer { /*...*/ }
Employee { /*...*/ }
Repository<T> : IRepository<T>
, IEnumerable<T>
//, IQueryable<T>, IQueryProvider //optional
{ /**/ }
BusinessController {
Repository<Customer> Customers { get{ /*...*/ }} //aggregate root
Repository<Order> Orders { get{ /*...*/ }} // aggregate root
}
In a nutshell, let your business processes and transactions be the guide, and let your business infrastructure naturally evolve as processes/activities are implemented or refactored. Moreover, prefer composability over traditional black box design. When you get to service-oriented or cloud computing, you will be glad you did. :)
I was wondering what the conclusion would be?
'Violations' become a root entity, and 'violations' would be referenced by the 'employee' root entity, i.e. violations repository <-> employee repository.
But you are confused about making violations a root entity because it has no behavior.
But is 'behaviour' a criterion to qualify as a root entity? I don't think so.
A slightly orthogonal question to test my understanding: going back to the Order...OrderItem example, there might be an analytics module in the system that wants to look at OrderItems directly, e.g. get all OrderItems for a particular product, or all OrderItems above some given value. If there are a lot of use cases like that, could we argue, taking "aggregate root" to the extreme, that OrderItem is an aggregate root in its own right?
It depends. Does any change to, addition of, or deletion of a violation change any part of the employee, e.g. are you storing a violation count, or a violation count within the past 3 years, against the employee?