DDD Aggregate Design - domain-driven-design

I'm trying to decide whether to use a single aggregate or if I can get away with using three separate aggregates.
I have three entities, Quiz, Question, and Answer to represent multiple-choice quizzes. Conceptually, a quiz contains multiple questions, and each question contains multiple answers. Each answer belongs to only one question and each question belongs to only one quiz. A separate quiz is created for each user.
One business rule is that once a quiz is submitted, answers can no longer be selected or deselected. The quiz's grade is calculated when it is submitted, so if an answer is changed after the quiz is submitted it would put the domain in an inconsistent state.
If I model this as a single aggregate with Quiz as the aggregate root, this is simple enough to enforce. If I model the domain with each of these entities as its own aggregate root then I would have to check at the application level, rather than at the domain level, whether a quiz is submitted or not before selecting or deselecting an answer.
This is a simplified version of what I would do with separate aggregates at the application level:
async execute({ quizId, answerId }: Request): Promise<Response> {
  const quiz = await this.quizRepository.get(quizId);
  if (quiz.submitted) {
    throw new AnswerSelectQuizAlreadySubmitted();
  }
  const answer = await this.answerRepository.get(answerId);
  answer.select();
  return await this.answerRepository.save(answer);
}
What I would do with a single aggregate at the application level:
async execute({ quizId, questionId, answerId }: Request): Promise<Response> {
  const quiz = await this.quizRepository.get(quizId);
  // quiz entity enforces the business rule
  quiz.selectAnswer(questionId, answerId);
  return await this.quizRepository.save(quiz);
}
I prefer the small-aggregates model because the user can update multiple answers simultaneously without database failures due to optimistic locking at the Quiz level.
My concern with it is that a "submit quiz" command and a "select answer" command could be fired in rapid succession, causing an answer to be selected on an already submitted quiz. My intuition is that the application-level validation I showed above won't necessarily prevent this from happening.
TL;DR: Will an application-level check on a separate aggregate work in an asynchronous environment or will I be forced to include all three entities in a single aggregate?

This single statement
Each answer belongs to only one question and each question belongs to only one quiz.
renders the entire rest of the question moot. The answer is that you'd use a single aggregate in your domain model.
How you deal with database contention is a separate, albeit important, concern that could perhaps be remedied by using a document database where the document is the aggregate root and its children.
My answer would be different if your domain was richer.
For example, you could have a Question aggregate with a list of Answer and a ValidAnswer. If you randomly generated a Quiz and selected from a list of Question aggregates, then you'd have a Quiz aggregate with a list of QuizQuestion (or QuestionReference or some such) as children. Thus your Quiz aggregate references Questions (a separate aggregate, not owned) through a QuestionReference (an owned value object). In this scenario, you're safe from some contention because you're still updating the Quiz aggregate, and not the Question aggregates.
Dealing with click-happy 😀 users
Your domain model in either case must check that answer selection can't occur after quiz submission. That's logic on your Quiz aggregate.
Now, that said your concern that a race condition could occur is valid. The simple model is easy; a single aggregate to a single transaction or single document naturally prevents the race condition; just use normal db concurrency options. But, the two-aggregate model does too! Why? You're still updating only the Quiz aggregate. The Question aggregates are referenced, and don't change (well at least in quiz-answering context).
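To make the "normal db concurrency options" point concrete, here is a minimal sketch of a Quiz aggregate guarded by a version check. Everything in it is an assumption for illustration: the class names, the `version` field, and the in-memory repository standing in for a real database. It only shows the idea that if "submit quiz" and "select answer" both load the same version, whichever saves second is rejected.

```typescript
// All names here are hypothetical; the repository is an in-memory stand-in for a real store.
class QuizAlreadySubmitted extends Error {}
class ConcurrencyConflict extends Error {}

class Quiz {
  constructor(
    readonly id: string,
    readonly version: number, // version the aggregate was loaded at
    private submitted: boolean,
    private readonly selected: Set<string>, // "questionId:answerId" keys
  ) {}

  // The business rule lives on the aggregate root, not in the application service.
  selectAnswer(questionId: string, answerId: string): void {
    if (this.submitted) throw new QuizAlreadySubmitted("quiz already submitted");
    this.selected.add(`${questionId}:${answerId}`);
  }

  submit(): void {
    if (this.submitted) throw new QuizAlreadySubmitted("quiz already submitted");
    this.submitted = true;
  }

  isSubmitted(): boolean {
    return this.submitted;
  }

  selectedAnswers(): string[] {
    return Array.from(this.selected);
  }
}

interface QuizRow {
  version: number;
  submitted: boolean;
  selected: string[];
}

class InMemoryQuizRepository {
  private rows = new Map<string, QuizRow>();

  add(id: string): void {
    this.rows.set(id, { version: 0, submitted: false, selected: [] });
  }

  get(id: string): Quiz {
    const row = this.rows.get(id);
    if (!row) throw new Error(`quiz ${id} not found`);
    return new Quiz(id, row.version, row.submitted, new Set(row.selected));
  }

  // Optimistic concurrency: the write succeeds only if nobody saved since we loaded.
  save(quiz: Quiz): void {
    const row = this.rows.get(quiz.id);
    if (!row || row.version !== quiz.version) {
      throw new ConcurrencyConflict("quiz was modified concurrently");
    }
    this.rows.set(quiz.id, {
      version: quiz.version + 1,
      submitted: quiz.isSubmitted(),
      selected: quiz.selectedAnswers(),
    });
  }
}
```

With this shape, a rejected "select answer" command can simply be retried: the reload sees the submitted quiz and the aggregate itself refuses the change.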
Finally, in all of the above, Answer does not seem like a separate aggregate or entity. How can an Answer mean anything without being tied to a Question? Maybe in Jeopardy!?

Related

Domain Driven Design - Can you use a simplified existing Aggregate in a different Aggregate Root

I have an aggregate called Survey - you can create a Survey, send a Survey and respond to a Survey.
I also have an aggregate called Person. They can receive a Survey and respond to a Survey. This aggregate includes demographic information that describes the Person.
I think my Survey aggregate should contain a list of people that have received the Survey. That would include the date they received it and any of their answers, if they have responded. But I feel like I don't need most of the demographic information that would normally come along with my existing Person aggregate.
I tried playing around with different concepts, like calling a send a "Delivery" and calling the employees "Recipients", but the business doesn't speak in those terms. It's just - "I create a Survey and then I send that Survey to people".
So does it make sense to create a different Person aggregate just within Survey? The action of sending a Survey to someone exists as records in the DB, along with responses to the Survey. Is this a better use case for Value Objects?
As soon as you use phrases like "I don't need all of the...data" or "this is like that but with a few more attributes but not these original ones" you are implicitly introducing the notion of a projection of an entity from one bounded context into another bounded context.
In your case a Person is in a different bounded context from a Person taking a survey (I will call the latter a SurveyParticipant) because even though it's the same actual person taking the survey, the focus of the two entities is different.
Here's a possible solution to your problem (expanded a bit, just for context).
Person is projected into the Survey Taking bounded context as a SurveyParticipant. As a bonus, Survey is really the definition of a survey, but SurveyInstance is a Survey that can be/is taken by these SurveyParticipants.
SurveyParticipant is not a Person. A SurveyParticipant is a person who is (going to) participate in completing a SurveyInstance.
Bottom line,
Can you use a simplified existing Aggregate in a different Aggregate Root
Yes. You do this by projecting the data from one aggregate to an entity in another context (at least in my proposed solution).
So does it make sense to create a different Person aggregate just within Survey?
Sort of. An aggregate can never contain another aggregate, but it can refer to one in the same bounded context. I didn't do this in this solution. The alternative, as shown, is that an aggregate can contain information from another aggregate as a projection of that other aggregate.
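A rough sketch of what such a projection could look like. The field names (`dateOfBirth`, `department`, `receivedOn`, and so on) are made up for illustration and are not taken from the question; the point is only that SurveyParticipant carries the Person's identity plus survey-specific state, and none of the demographics.

```typescript
// Hypothetical shapes; every field name below is an assumption.
interface Person {
  readonly id: string;
  readonly name: string;
  readonly email: string;
  readonly dateOfBirth: string;
  readonly department: string;
}

// Lives in the Survey Taking context: identity reference plus survey-specific data only.
interface SurveyParticipant {
  readonly personId: string;
  readonly displayName: string;
  readonly receivedOn: string; // ISO date the survey was sent
  readonly answers: ReadonlyArray<string>;
}

// Project a Person into the survey context, dropping the demographic fields.
function projectToParticipant(person: Person, receivedOn: string): SurveyParticipant {
  return { personId: person.id, displayName: person.name, receivedOn, answers: [] };
}

// Recording an answer returns a new participant and never touches the Person.
function recordAnswer(participant: SurveyParticipant, answer: string): SurveyParticipant {
  return { ...participant, answers: [...participant.answers, answer] };
}
```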
This is a strategic design problem.
I think you have two Bounded Contexts here, so you should split your Person domain model between them. The Person in the Survey Context will contain only the data and behavior you need for your Surveys. The Person in the other one (a Marketing Context, for example) covers the marketing team's needs.
Here is an example from Martin Fowler's blogpost about BC's
So you're almost right with other aggregate, but it is not a simplified version, it's a separate one.

Aggregate relationships in Domain Driven Design

I have a question related to relationships between aggregates in Domain Driven Design.
I have the following situation: I have an aggregate (questionnaire) which has some children (questions). The questions are entities, but because they are inside the questionnaire aggregate they can have local identities (i.e. question with id 1 in questionnaire with id 1234; I can have another question with id 1 but in another questionnaire). So to refer to a question you always have to qualify it with its parent questionnaire id.
On the other side I have another aggregate (collection campaign) which stores data (response set) for the questions in a questionnaire (the collection campaign points to the questionnaire by its id, and a response set points to a question again by its id). I can have several collection campaigns (which took place at different times perhaps) and each collection campaign stores different response sets, but for the same questionnaire (and questions).
So my question is: have I designed this well (according to DDD)? Or do I have to keep the questionnaire and questions as separate aggregates of themselves in order to refer to them from the collection campaign/response sets?
I hope this makes sense and thank you.
Ask yourself this: what are the invariants that should be protected?
In your case you must ensure that a question that is answered during a campaign exists (i.e. its index is between zero and the number of questions in the questionnaire - 1) and that it's an allowed one; another invariant could be that a questionnaire must not be modified after at least one question has been answered. In any of these cases the campaigns must be synchronized with the questionnaire. I see at least 2 solutions:
The simplest solution would be to have a single big aggregate, the Questionnaire aggregate, with questions, campaigns and answers as sub-entities so you can protect those invariants; this has some performance implications, but only you know whether that is acceptable.
The second solution would be to use an event-driven architecture like CQRS + Event Sourcing. In this case you could have separate aggregates and keep them in sync using a simple Saga that forwards some events from the Questionnaire aggregate (like QuestionAdded, QuestionRemoved) as commands to the Campaign aggregate. I prefer this solution as it better separates the responsibilities.
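A minimal sketch of that second option, without any real CQRS or Event Sourcing framework. Only the event names QuestionAdded and QuestionRemoved come from the text above; the Campaign methods and the saga class are assumptions for illustration. The campaign keeps its own list of known question ids, and the saga translates questionnaire events into calls on the campaign.

```typescript
// Event names follow the answer; everything else is a hypothetical shape.
type QuestionnaireEvent =
  | { type: "QuestionAdded"; questionnaireId: string; questionId: number }
  | { type: "QuestionRemoved"; questionnaireId: string; questionId: number };

class Campaign {
  private knownQuestions = new Set<number>();
  private responses = new Map<number, string[]>();

  constructor(readonly id: string, readonly questionnaireId: string) {}

  registerQuestion(questionId: number): void {
    this.knownQuestions.add(questionId);
  }

  unregisterQuestion(questionId: number): void {
    this.knownQuestions.delete(questionId);
  }

  // Invariant: only questions the campaign currently knows about can be answered.
  recordResponse(questionId: number, response: string): void {
    if (!this.knownQuestions.has(questionId)) {
      throw new Error(`unknown question ${questionId}`);
    }
    const existing = this.responses.get(questionId) ?? [];
    this.responses.set(questionId, [...existing, response]);
  }

  responsesFor(questionId: number): string[] {
    return this.responses.get(questionId) ?? [];
  }
}

// The saga forwards questionnaire events as commands to the matching campaigns.
class QuestionnaireToCampaignSaga {
  constructor(private readonly campaigns: Campaign[]) {}

  handle(event: QuestionnaireEvent): void {
    for (const campaign of this.campaigns) {
      if (campaign.questionnaireId !== event.questionnaireId) continue;
      if (event.type === "QuestionAdded") {
        campaign.registerQuestion(event.questionId);
      } else {
        campaign.unregisterQuestion(event.questionId);
      }
    }
  }
}
```

In a real system the saga would receive events asynchronously, so a campaign may briefly lag behind its questionnaire; that is the eventual-consistency trade-off of this approach.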

DDD: do I really need to load all objects in an aggregate? (Performance concerns)

In DDD, a repository loads an entire aggregate - we either load all of it or none of it. This also means that we should avoid lazy loading.
My concern is performance-wise. What if this results in loading into memory thousands of objects? For example, an aggregate for Customer comes back with ten thousand Orders.
In this sort of cases, could it mean that I need to redesign and re-think my aggregates? Does DDD offer suggestions regarding this issue?
Take a look at this Effective Aggregate Design series of three articles from Vernon. I found them quite useful to understand when and how you can design smaller aggregates rather than a large-cluster aggregate.
EDIT
I would like to give a couple of examples to improve my previous answer, feel free to share your thoughts about them.
First, a quick definition of an Aggregate (taken from the book Patterns, Principles, and Practices of Domain-Driven Design by Scott Millett)
Entities and Value Objects collaborate to form complex relationships that meet invariants within the domain model. When dealing with large interconnected associations of objects, it is often difficult to ensure consistency and concurrency when performing actions against domain objects. Domain-Driven Design has the Aggregate pattern to ensure consistency and to define transactional concurrency boundaries for object graphs. Large models are split by invariants and grouped into aggregates of entities and value objects that are treated as conceptual whole.
Let's go with an example to see the definition in practice.
Simple Example
The first example shows how defining an Aggregate Root helps to ensure consistency when performing actions against domain objects.
Given the next business rule:
Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.
Here there is an aggregate consisting of Auction and Bids where the Auction is the Aggregate Root.
If we made Bid a separate Aggregate Root as well, you would have a BidsRepository, and you could easily do:
var newBid = new Bid(money);
bidsRepository.save(auctionId, newBid);
That would save a Bid without checking the business rule above. However, with the Auction as the only Aggregate Root, the design enforces the rule, because you need to do something like:
var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);
Therefore, you can check your invariant within the method placeBid and nobody can skip it if they want to place a new Bid.
Here it is pretty clear that the state of a Bid depends on the state of an Auction.
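For what it's worth, here is a small TypeScript sketch of how placeBid might guard that invariant. The `endsAt` and `placedAt` fields and the `winningBid` helper are assumptions added for illustration; the only rule taken from the text is that no bid may be placed once the auction has ended.

```typescript
// Hypothetical shapes; only the "no bids after the auction ends" rule comes from the text.
class Bid {
  constructor(readonly amount: number, readonly placedAt: Date) {}
}

class Auction {
  private bids: Bid[] = [];

  constructor(readonly id: string, private readonly endsAt: Date) {}

  // The only way to add a Bid is through the root, so the check cannot be skipped.
  placeBid(bid: Bid): void {
    if (bid.placedAt.getTime() >= this.endsAt.getTime()) {
      throw new Error("auction has already ended");
    }
    this.bids.push(bid);
  }

  // Highest accepted bid, or undefined if there are none yet.
  winningBid(): Bid | undefined {
    return this.bids.reduce<Bid | undefined>(
      (best, bid) => (!best || bid.amount > best.amount ? bid : best),
      undefined,
    );
  }
}
```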
Complex Example
Back to your example of Orders being associated with a Customer: it looks like there are no invariants that force us to define a huge aggregate consisting of a Customer and all her Orders, so we can just keep the relation between both entities through an identifier reference. By doing this, we avoid loading all the Orders when fetching a Customer, and we also mitigate concurrency problems.
But, say that now business defines the next invariant:
We want to provide Customers with a pocket so they can charge it with money to buy products. Therefore, if a Customer now wants to buy a product, it needs to have enough money to do it.
That said, the pocket is a VO inside the Customer Aggregate Root. It now seems that having two separate Aggregate Roots, one for Customer and another for Order, is not the best way to satisfy the new invariant, because we could save a new order without checking the rule. It looks like we are forced to consider Customer as the root. That is going to hurt performance, scalability, concurrency, etc.
Solution? Eventual Consistency. What if we allow the customer to buy the product? That is, we keep an Aggregate Root for Orders, so we create the order and save it:
var newOrder = new Order(customerId, ...);
orderRepository.save(newOrder);
we publish an event when the order is created and then we check asynchronously if the customer has enough funds:
class OrderWasCreatedListener:
    var customer = customerRepository.findOfId(event.customerId);
    var order = orderRepository.findOfId(event.orderId);
    customer.placeOrder(order); //Check business rules
    customerRepository.save(customer);
If everything was good, we have satisfied our invariants while keeping the design we wanted at the beginning, modifying just one Aggregate Root per request. Otherwise, we will send an email to the customer telling her about the insufficient funds issue. We can take advantage of it by adding to the email alternative options she can purchase with her current budget, as well as encouraging her to charge the pocket.
Take into account that the UI can help us avoid having customers pay without enough money, but we cannot blindly trust the UI.
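Here is the listener idea as a slightly more complete TypeScript sketch. The pocket is reduced to a plain balance, the repositories are in-memory maps, and `notifyInsufficientFunds` stands in for the email; all of those are assumptions made so the example is self-contained. Note that the handler still changes only the Customer aggregate.

```typescript
// Hypothetical shapes; maps stand in for repositories and a callback stands in for the email.
interface Order {
  readonly id: string;
  readonly customerId: string;
  readonly total: number;
}

class Customer {
  constructor(readonly id: string, private pocketBalance: number) {}

  // Business rule: the pocket must hold enough money to pay for the order.
  placeOrder(order: Order): boolean {
    if (order.total > this.pocketBalance) return false;
    this.pocketBalance -= order.total;
    return true;
  }

  balance(): number {
    return this.pocketBalance;
  }
}

// Runs asynchronously after the order was saved; it modifies only the Customer.
function onOrderWasCreated(
  event: { orderId: string; customerId: string },
  customers: Map<string, Customer>,
  orders: Map<string, Order>,
  notifyInsufficientFunds: (customerId: string, orderId: string) => void,
): void {
  const customer = customers.get(event.customerId);
  const order = orders.get(event.orderId);
  if (!customer || !order) throw new Error("unknown customer or order");
  if (!customer.placeOrder(order)) {
    notifyInsufficientFunds(customer.id, order.id);
  }
  // With a real repository, customerRepository.save(customer) would go here.
}
```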
Hope you find both examples useful, and let me know if you find better solutions for the exposed scenarios :-)
In this sort of cases, could it mean that I need to redesign and re-think my aggregates?
Almost certainly.
The driver for aggregate design isn't structure, but behavior. We don't care that "a user has thousands of orders". What we care about are what pieces of state need to be checked when you try to process a change - what data do you need to load to know if a change is valid.
Typically, you'll come to realize that changing an order doesn't (or shouldn't) depend on the state of other orders in the system, which is a good indication that two different orders should not be part of the same aggregate.

Aggregate Root, Aggregates, Entities, Value Objects

I'm struggling with some implementation details when looking at the terms mentioned in the title above.
Can someone tell me whether my interpretation is right?
For reference, I'm looking at a CRM domain.
As an Aggregate Root I could see a Customer.
It may have Entities like Address, which contains street, postal code and so on.
Now there is something like Contact and Activity; these should be at least aggregates. Right? Now suppose the Contacts and Activities have complex business logic. For example, "Every time a contact of the type order is created, the order workflow should be started."
Would Contact then need to be an Aggregate Root? What implementation implications could result from this?
Furthermore, when looking at Event Sourcing, would each Aggregate have its own stream? In this scenario a Customer could have thousands of activities.
It would be great if someone could guide me on which parts of my understanding are right and where I differ from the common interpretation.
What do you mean by “at least aggregates”?
An aggregate is a set of one or more connected entities. The aggregate can only be accessed from its root entity, also called the aggregate root. The aggregate defines the transactional boundaries for the entities, which must be preserved at all times. Jimmy Bogard has a good explanation of aggregates here.
When using event sourcing each aggregate should have its own stream. The stream is used to construct the aggregates and there is no reason to let several aggregates use the same stream.
You should try to keep your aggregates small. If you expect your customer object to have thousands of activities, then you should check whether it is possible to design the activities as a separate aggregate, as long as its boundaries ensure that you do not leave the system in an invalid state.
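As a rough illustration of "one stream per aggregate", here is a sketch in which each Activity is its own aggregate with its own stream and refers to the Customer only by id. The event names, the stream id format and the in-memory store are all assumptions; no real event store API is implied.

```typescript
// Hypothetical events and store; stream ids such as "activity-a1" are an assumed convention.
type ActivityEvent =
  | { type: "ActivityLogged"; activityId: string; customerId: string; note: string }
  | { type: "ActivityCompleted"; activityId: string };

class InMemoryEventStore {
  private streams = new Map<string, ActivityEvent[]>();

  // Appends to exactly one stream; expectedVersion guards against concurrent writers.
  append(streamId: string, expectedVersion: number, event: ActivityEvent): void {
    const stream = this.streams.get(streamId) ?? [];
    if (stream.length !== expectedVersion) throw new Error("concurrency conflict");
    this.streams.set(streamId, [...stream, event]);
  }

  read(streamId: string): ActivityEvent[] {
    return this.streams.get(streamId) ?? [];
  }
}

interface ActivityState {
  customerId: string; // reference by id only; the Customer has its own stream
  note: string;
  completed: boolean;
  version: number;
}

// Rebuilds one Activity from its own stream, without loading the Customer.
function rehydrateActivity(events: ActivityEvent[]): ActivityState | undefined {
  let state: ActivityState | undefined = undefined;
  for (const event of events) {
    if (event.type === "ActivityLogged") {
      state = { customerId: event.customerId, note: event.note, completed: false, version: 1 };
    } else if (state) {
      state = { ...state, completed: true, version: state.version + 1 };
    }
  }
  return state;
}
```

Loading one activity then costs only the handful of events in its stream, no matter how many activities the customer has.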

DDD: How to handle large collections

I'm currently designing a backend for a social networking-related application in REST. I'm very intrigued by the DDD principle. Now let's assume I have a User object who has a Collection of Friends. These can be thousands if the app and the user would become very successful. Every Friend would have some properties as well, it is basically a User.
Looking at the DDD Cargo application example, the fully expanded Cargo-object is stored and retrieved from the CargoRepository from time to time. WOW, if there is a list in the aggregate-root, over time this would trigger a OOM eventually. This is why there is pagination, and lazy-loading if you approach the problem from a data-centric point of view. But how could you cope with these large collections in a persistence-unaware DDD?
As @JefClaes mentioned in the comments: You need to determine whether your User AR indeed requires a collection of Friends.
Ownership does not necessarily imply that a collection is necessary.
Take an Order / OrderLine example. An OrderLine has no meaning without being part of an Order. However, the Customer that an Order belongs to does not have a collection of Orders. It may, possibly, have a collection of ActiveOrders if a customer is limited to a maximum number (or amount) in respect of active orders. Keeping a collection of historical orders would be unnecessary.
I suspect the large collection problem is not limited to DDD. If one were to receive an Order with many thousands of lines there may be design trade-offs but the order may much more likely be simply split into smaller orders.
In your case I would assert that the inclusion / exclusion of a Friend has very little to do with the consistency of the User AR.
Something to keep in mind is that as soon as you start using your domain model for querying, you start running into weird sorts of problems. So always try to think in terms of some read/query model with a simple query interface that can access your data directly without using your domain model. This may simplify things.
So perhaps a Relationship AR may assist in this regard.
If paging or other optimization techniques are part of your domain, there's nothing wrong with designing domain classes with this ability.
Some solutions I've thought about
If User is the aggregate root, you can give your UserRepository a method GetUserWithFriends(int userId, int firstFriendNo, int lastFriendNo) that encapsulates the construction of that specific user object. In the same way you can also populate the user model with some counters, etc.
On the other side, it is possible to implement lazy loading for User instance's _friends field. Thus, User instance can itself decide which "part" of friends list to load.
Finally, you can use UserRepository to get all friends of certain user with respect to paging or other filtering conditions. It doesn't violate any DDD principles.
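As an illustration of the last option, a paged friends query on a repository might look roughly like this. The `Page` type, the method names and the in-memory storage are hypothetical; a real implementation would translate offset and limit into a database query.

```typescript
// Hypothetical repository; an in-memory map stands in for the database.
interface Page<T> {
  items: T[];
  total: number; // total number of friends, so the caller can render paging controls
}

class InMemoryFriendRepository {
  private friendIdsByUser = new Map<string, string[]>();

  addFriend(userId: string, friendId: string): void {
    const current = this.friendIdsByUser.get(userId) ?? [];
    this.friendIdsByUser.set(userId, [...current, friendId]);
  }

  // Returns only the requested slice instead of loading the whole collection into a User.
  findFriends(userId: string, offset: number, limit: number): Page<string> {
    const all = this.friendIdsByUser.get(userId) ?? [];
    return { items: all.slice(offset, offset + limit), total: all.length };
  }
}
```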
DDD is too broad to say flatly that it's not for CRUD. When programming in a DDD way you should always take technical limitations into account and adapt your domain to satisfy them.
Do not prematurely optimize. If you are afraid of large stress, then you have to benchmark your application and perform stress tests.
You need to have a table like so:
friends
id, user_id1, user_id2
to handle the n-m relation. Index your fields there.
Also, you need to be aware of whether friendship is symmetrical. If so, then you need a single row for two people if they are friends. If not, then you might have one row, showing that a user is friends with the other user. If the other person considers the first a friend as well, you need another row.
Lazy-loading can be achieved by hidden (AJAX) requests so users will have the impression that it is faster than it really is. However, I would not worry about such problems for now, as later you can migrate the content of the tables to a new structure which is unknown now due to the infinite possible evolutions of your project.
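For the symmetrical case, one common trick is to always store the pair in a fixed order (smaller id first), so that "A is friends with B" and "B is friends with A" map to the same row. The sketch below shows that idea in TypeScript, with a Set standing in for the friends table; the function and class names are made up for illustration.

```typescript
// Hypothetical helper: normalizes a pair so (3, 7) and (7, 3) produce the same row.
function friendshipPair(a: number, b: number): [number, number] {
  if (a === b) throw new Error("a user cannot befriend themselves");
  return a < b ? [a, b] : [b, a];
}

// In-memory stand-in for the friends table (user_id1, user_id2).
class SymmetricFriendsTable {
  private rows = new Set<string>();

  add(a: number, b: number): void {
    const [userId1, userId2] = friendshipPair(a, b);
    this.rows.add(`${userId1}:${userId2}`); // the Set mimics a unique index on the pair
  }

  areFriends(a: number, b: number): boolean {
    const [userId1, userId2] = friendshipPair(a, b);
    return this.rows.has(`${userId1}:${userId2}`);
  }

  rowCount(): number {
    return this.rows.size;
  }
}
```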
Your aggregate root can have a collection of different objects that will only contain a small subset of the information, as reference to the actual business objects. Then when needed, items can be used to fetch the entire information from the underlying repository.

Resources