DDD Every Entitys seems fit inside one aggregate - domain-driven-design

I'm implementing a college system, and I'm trying to use DDD. I'm also reading the blue book. The basics entities of the system are Institution, Courses, Professors and Students. This system will allow a lot of Institutions, each having its courses, students and professors.
Reading about aggregates, all entities fits inside the aggregate Institution, because doesn't exists courses without Institution, the same for students and professors. Am I right thinking in that way?
In some place the professors will access the courses that they teach. Using this approach, should I always access the courses through Institution? This implementation seems strange to me, so I ask myself if Professor, as Students should be their own AR and have their Repository.

Even though you have accepted an answer I am adding this anyway since a comment is too short.
This whole aggregate root business trips up just about everyone when starting out with DDD. I know, since I have been there myself :)
As mentioned, a domain expert may be helpful in some cases but keep in mind that ownership does not imply containment. An Order typically belongs to a Customer but the Customer is not the AR for an Order since an Order can exist without a Customer. You may think: "But wait, that isn't really true!". This is where is comes down to rules. When I walk into a clothing store I can purchase a pair of shoes. I am a customer but they have no record of me other than a receipt I can produce. I am a cash customer. Perhaps my particular brand of shoe is not in stock but I can still order it. They will contact me once it arrives and that will probably be that and I'll in all likelihood still not be registered in any computer system. However, that same store is registered as a Customer with their supplier.
So why this long-winded story? Well, if it is possible to have an Entity stand alone with only a Value Object representing the owner then it is probably going to be an AR. I can include some basic customer information in a CustomerDetails value object in an Order? So the Order can be an AR.
No let's take a look at an OrderLine. Can I include some basic OrderDetails information on an OrderLine? This feels odd since a number of order lines constitute an Order. So it isn't quite as natural.
In the same way a GrapeBunch has to have a GrapeStem and a collection of GrapeBerry objects.
This seems to imply that if anything can be regarded as optional it may indicate that the related instance is an AR. If, however, a related instance is required then it is part of the AR.
These ideas are very broad but may serve as guidelines to consider your structure.
One more thing to remember is that an AR should not be instanced in another AR. Rather use the Id or a Value Object representing the relationship.

I think you're missing some transactional analysis - what typically changes together as part of the same business transaction, and how frequently ? One big aggregate is not necessarily a problem if only 2 users collaborate on it with only a few changes per day, but with dozens of concurrent modifications it could become a contention point.
Besides the data inventory and data structuration aspect of the problem, you want to have an idea of how the system will be used to make educated aggregate design decisions.

Something that might help you to separate those entities into different aggregate roots is to ask you: Which one of those must be used together? This is usually helpful as a first coarse filter.
So, for example, to add a student to a course, you don't need the Institution?
In your example about a professor accessing the courses he teaches. Can he access them by providing his professor id rather than the professor entity? I he provides the professor id, then the entities won't be associated by a reference but by an id.
Lots of this concepts have evolved a lot since the blue book was written 12 years ago. Even though the blue book is a really good book, I suggest you to also read the red book (Implementing Domain-Driven Design by Vaughn Vernon). This book has a more practical approach to DDD and shows more modern approaches, such as CQRS and Event Sourcing.

A professor and a student can exist in their own right, indeed they may associate themselves with institutions. An institution exists in its own right. A course may exist in its own right (what if the same course is offered at more that one institution, are they the same?)... The domain expert would best advise on that (infact they should advise and guide the entire design).
If you make an aggregate too big you will run in to concurrency issues that can avoided if you find the right model.
Some PDFs I recommend reading are here:
http://dddcommunity.org/library/vernon_2011/

Related

What is an Aggregate Root?

No, it is not a duplication question.
I have red many sources on the subject, but still I feel like I don't fully understand it.
This is the information I have so far (from multiple sources, be it articles, videos, etc...) about what is an Aggregate and Aggregate Root:
Aggregate is a collection of multiple Value Objects\Entity references and rules.
An Aggregate is always a command model (meant to change business state).
An Aggregate represents a single unit of (database - because essentialy the changes will be persisted) work, meaning it has to be consistent.
The Aggregate Root is the interface to the external world.
An Aggregate Root must have a globally unique identifier within the system
DDD suggests to have a Repository per Aggregate Root
A simple object from an aggregate can't be changed without its AR(Aggregate Root) knowing it
So with all that in mind, lets get to the part where I get confused:
in this site it says
The Aggregate Root is the interface to the external world. All interaction with an Aggregate is via the Aggregate Root. As such, an Aggregate Root MUST have a globally unique identifier within the system. Other Entites that are present in the Aggregate but are not Aggregate Roots require only a locally unique identifier, that is, an Id that is unique within the Aggregate.
But then, in this example I can see that an Aggregate Root is implemented by a static class called Transfer that acts as an Aggregate and a static function inside called TransferedRegistered that acts as an AR.
So the questions are:
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that its a function. what does have a globaly unique identifier is the Domain Event that this function produces.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function of the Aggregate class itself?
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier), then how can we interact with this Aggregate? the first article clearly stated that all interaction with an Aggregate is by the AR, if the AR is an event, then we can do nothing but react on it.
Is it right to say that the aggregate has two main jobs:
Apply the needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be raised in a Domain Event from the AR
Please correct me on any of the bullet points in the beginning if some/all of them are wrong is some way or another and feel free to add more of them if I have missed any!
Thanks for clarifying things out!
I feel like I don't fully understand it.
That's not your fault. The literature sucks.
As best I can tell, the core ideas of implementing solutions using domain driven design came out of the world of Java circa 2003. So the patterns described by Evans in chapters 5 and six of the blue book were understood to be object oriented (in the Java sense) domain modeling done right.
Chapter 6, which discusses the aggregate pattern, is specifically about life cycle management; how do you create new entities in the domain model, how does the application find the right entity to interact with, and so on.
And so we have Factories, that allow you to create instances of domain entities, and Repositories, that provide an abstraction for retrieving a reference to a domain entity.
But there's a third riddle, which is this: what happens when you have some rule in your domain that requires synchronization between two entities in the domain? If you allow applications to talk to the entities in an uncoordinated fashion, then you may end up with inconsistencies in the data.
So the aggregate pattern is an answer to that; we organize the coordinated entities into graphs. With respect to change (and storage), the graph of entities becomes a single unit that the application is allowed to interact with.
The notion of the aggregate root is that the interface between the application and the graph should be one of the members of the graph. So the application shares information with the root entity, and then the root entity shares that information with the other members of the aggregate.
The aggregate root, being the entry point into the aggregate, plays the role of a coarse grained lock, ensuring that all of the changes to the aggregate members happen together.
It's not entirely wrong to think of this as a form of encapsulation -- to the application, the aggregate looks like a single entity (the root), with the rest of the complexity of the aggregate being hidden from view.
Now, over the past 15 years, there's been some semantic drift; people trying to adapt the pattern in ways that it better fits their problems, or better fits their preferred designs. So you have to exercise some care in designing how to translate the labels that they are using.
In simple terms an aggregate root (AR) is an entity that has a life-cycle of its own. To me this is the most important point. One AR cannot contain another AR but can reference it by Id or some value object (VO) containing at least the Id of the referenced AR. I tend to prefer to have an AR contain only other VOs instead of entities (YMMV). To this end the AR is responsible for consistency and variants w.r.t. the AR. Each VO can have its own invariants such as an EMailAddress requiring a valid e-mail format. Even if one were to call contained classes entities I will call that semantics since one could get the same thing done with a VO. A repository is responsible for AR persistence.
The example implementation you linked to is not something I would do or recommend. I followed some of the comments and I too, as one commenter alluded to, would rather use a domain service to perform something like a Transfer between two accounts. The registration of the transfer is not something that may necessarily be permitted and, as such, the domain service would be required to ensure the validity of the transfer. In fact, the registration of a transfer request would probably be a Journal in an accounting sense as that is my experience. Once the journal is approved it may attempt the actual transfer.
At some point in my DDD journey I thought that there has to be something wrong since it shouldn't be so difficult to understand aggregates. There are many opinions and interpretations w.r.t. to DDD and aggregates which is why it can get confusing. The other aspect is, in IMHO, that there is a fair amount of design involved that requires some creativity and which is based on an understanding of the domain itself. Creativity cannot be taught and design falls into the realm of tacit knowledge. The popular example of tacit knowledge is learning to ride a bike. Now, we can read all we want about how to ride a bike and it may or may not help much. Once we are on the bike and we teach ourselves to balance then we can make progress. Then there are people who end up doing absolutely crazy things on a bike and even if I read how to I don't think that I'll try :)
Keep practicing and modelling until it starts to make sense or until you feel comfortable with the model. If I recall correctly Eric Evans mentions in the Blue Book that it may take a couple of designs to get the model closer to what we need.
Keep in mind that Mike Mogosanu is using a event sourcing approach but in any case (without ES) his approach is very good to avoid unwanted artifacts in mainstream OOP languages.
How can it be that the function is an AR, if there must be a globaly unique identifier to it, and there isn't, reason being that
its a function. what does have a globaly unique identifier is the
Domain Event that this function produces.
TransferNumber acts as natural unique ID; there is also a GUID to avoid the need a full Value Object in some cases.
There is no unique ID state in the computer memory because it is an argument but think about it; why you want a globaly unique ID? It is just to locate the root element and its (non unique ID) childrens for persistence purposes (find, modify or delete it).
Order A has 2 order lines (1 and 2) while Order B has 4 order lines (1,2,3,4); the unique identifier of order lines is a composition of its ID and the Order ID: A1, B3, etc. It is just like relational schemas in relational databases.
So you need that ID just for persistence and the element that goes to persistence is a domain event expressing the changes; all the changes needed to keep consistency, so if you persist the domain event using the global unique ID to find in persistence what you have to modify the system will be in a consistent state.
You could do
var newTransfer = New Transfer(TransferNumber); //newTransfer is now an AG with a global unique ID
var changes = t.RegisterTransfer(Debit debit, Credit credit)
persistence.applyChanges(changes);
but what is the point of instantiate a object to create state in the computer memory if you are not going to do more than one thing with this object? It is pointless and most of OOP detractors use this kind of bad OOP design to criticize OOP and lean to functional programming.
Following question - How does an Aggregate Root looks like in code? is it the event? is it the entity that is returned? is it the function
of the Aggregate class itself?
It is the function itself. You can read in the post:
AR is a role , and the function is the implementation.
An Aggregate represents a single unit of work, meaning it has to be consistent. You can see how the function honors this. It is a single unit of work that keeps the system in a consistent state.
In the case that the Domain Event that the function returns is the AR (As stated that it has to have that globaly unique identifier),
then how can we interact with this Aggregate? the first article
clearly stated that all interaction with an Aggregate is by the AR, if
the AR is an event, then we can do nothing but react on it.
Answered above because the domain event is not the AR.
4 Is it right to say that the aggregate has two main jobs: Apply the
needed changes based on the input it received and rules it knows
Return the needed data to be persisted from AR and/or need to be
raised in a Domain Event from the AR
Yes; again, you can see how the static function honors this.
You could try to contat Mike Mogosanu. I am sure he could explain his approach better than me.

Relationships between multiple aggregate roots

In many applications, I deal with users and finance companies (as an example) and I have long been struggling to model the relationship between the two according to Domain Driven Design principles.
In my system I can do the following:
Add a user to an existing finance company.
Add a finance company to an existing user.
I believe both are aggregate roots... Finance Company and User.
How do I model the relationship between the 2? Is it FinanceCompany.Users? or User.FinanceCompanies? Is it neither? Or am I missing knowledge of some key DDD concept(s)? The problem is if I choose one way over the other, the code is more understandable / clear from one aggregate root entry point, but not the other. Sometimes there are cases where it makes more sense to navigate to a Finance Company and add users to it, and other times there are cases where it makes more sense to navigate to a specific user and add finance companies to the user.
Is there some better way to approach this, maybe through repository methods? Is there some key concept I am not getting or understanding here? It doesn't feel right to assume the relationship between Finance Company and User belongs under either of the 2 ARs. When I store the relationship I have to store it in a table named FinanceCompanyUsers or UserFinanceCompanies, but it still doesn't seem clear.
Would I have code such as FinanceCompany.AddUser() and User.AddFinanceCompany()? or is there some completely different approach for relationships such as this?
You have already determined that both User and FinanceCompany are aggregates so each has its own life-cycle.
The problem with many domains is that we don't have a complete understanding of the relationships. As another example we can take an Order and a Product. Both are aggregates but would we have Order.AddProduct() or Product.AddOrder()? In this case it seems pretty obvious in that an Order contains a limited subset of Product entries whereas a Product may very well contain many orders and we are not really too interested in that relationship since it is a rather weak relationship. A Product can exist and be valid without any orders attached but an Order is pretty useless without at least one product entry. This means that the Order has an invariant imposed in terms of its OrderItem entries. In addition to this we have enough knowledge about this hackneyed example that we know we are going to need an associative entity (in relational theory speak) since we need additional information regarding the relationship and entering the fray would be our OrderItem table. Curiously I have not seen it called OrderProduct.
The guidance I would suggest is to pick the most appropriate side.
However, if no side is a true winner and both aggregates can exist without a relationship to the other in terms of an invariant perhaps the relationship itself is an aggregate as you have certainly alluded to. Perhaps it isn't only a UserFinanceCompany aggregate but perhaps there is a concept that is missing from the ubiquitous language that the domain experts refer to. Perhaps something like Auditor or some such that represents that relationship. This is akin to the OrderItem or OrderLine concept as opposed to OrderProduct.

DDD: do I really need to load all objects in an aggregate? (Performance concerns)

In DDD, a repository loads an entire aggregate - we either load all of it or none of it. This also means that should avoid lazy loading.
My concern is performance-wise. What if this results in loading into memory thousands of objects? For example, an aggregate for Customer comes back with ten thousand Orders.
In this sort of cases, could it mean that I need to redesign and re-think my aggregates? Does DDD offer suggestions regarding this issue?
Take a look at this Effective Aggregate Design series of three articles from Vernon. I found them quite useful to understand when and how you can design smaller aggregates rather than a large-cluster aggregate.
EDIT
I would like to give a couple of examples to improve my previous answer, feel free to share your thoughts about them.
First, a quick definition about an Aggregate (took from Patterns, Principles and Practices of Domain Driven Design book by Scott Millet)
Entities and Value Objects collaborate to form complex relationships that meet invariants within the domain model. When dealing with large interconnected associations of objects, it is often difficult to ensure consistency and concurrency when performing actions against domain objects. Domain-Driven Design has the Aggregate pattern to ensure consistency and to define transactional concurrency boundaries for object graphs. Large models are split by invariants and grouped into aggregates of entities and value objects that are treated as conceptual whole.
Let's go with an example to see the definition in practice.
Simple Example
The first example shows how defining an Aggregate Root helps to ensure consistency when performing actions against domain objects.
Given the next business rule:
Winning auction bids must always be placed before the auction ends. If a winning bid is placed after an auction ends, the domain is in an invalid state because an invariant has been broken and the model has failed to correctly apply domain rules.
Here there is an aggregate consisting of Auction and Bids where the Auction is the Aggregate Root.
If we say that Bid is also a separated Aggregate Root you would have have a BidsRepository, and you could easily do:
var newBid = new Bid(money);
BidsRepository->save(auctionId, newBid);
And you were saving a Bid without passing the defined business rule. However, having the Auction as the only Aggregate Root you are enforcing your design because you need to do something like:
var newBid = new Bid(money);
auction.placeBid(newBid);
auctionRepository.save(auction);
Therefore, you can check your invariant within the method placeBid and nobody can skip it if they want to place a new Bid.
Here it is pretty clear that the state of a Bid depends on the state of an Auction.
Complex Example
Back to your example of Orders being associated to a Customer, looks like there are not invariants that make us define a huge aggregate consisting of a Customer and all her Orders, we can just keep the relation between both entities thru an identifier reference. By doing this, we avoid loading all the Orders when fetching a Customer as well as we mitigate concurrency problems.
But, say that now business defines the next invariant:
We want to provide Customers with a pocket so they can charge it with money to buy products. Therefore, if a Customer now wants to buy a product, it needs to have enough money to do it.
Said so, pocket is a VO inside the Customer Aggregate Root. It seems now that having two separated Aggregate Roots, one for Customer and another one for Order is not the best to satisfy the new invariant because we could save a new order without checking the rule. Looks like we are forced to consider Customer as the root. That is going to affect our performance, scalaibility and concurrency issues, etc.
Solution? Eventual Consistency. What if we allow the customer to buy the product? that is, having an Aggregate Root for Orders so we create the order and save it:
var newOrder = new Order(customerId, ...);
orderRepository.save(newOrder);
we publish an event when the order is created and then we check asynchronously if the customer has enough funds:
class OrderWasCreatedListener:
var customer = customerRepository.findOfId(event.customerId);
var order = orderRepository.findOfId(event.orderId);
customer.placeOrder(order); //Check business rules
customerRepository.save(customer);
If everything was good, we have satisfied our invariants while keeping our design as we wanted at the beginning modifying just one Aggregate Root per request. Otherwise, we will send an email to the customer telling her about the insufficient funds issue. We can take advance of it by adding to the email alternatives options she can purchase with her current budget as well as encourage her to charge the pocket.
Take into account that the UI can help us to avoid having customers paying without enough money, but we cannot blindly trust on the UI.
Hope you find both examples useful, and let me know if you find better solutions for the exposed scenarios :-)
In this sort of cases, could it mean that I need to redesign and re-think my aggregates?
Almost certainly.
The driver for aggregate design isn't structure, but behavior. We don't care that "a user has thousands of orders". What we care about are what pieces of state need to be checked when you try to process a change - what data do you need to load to know if a change is valid.
Typically, you'll come to realize that changing an order doesn't (or shouldn't) depend on the state of other orders in the system, which is a good indication that two different orders should not be part of the same aggregate.

UML Domain model - how to model multiple roles of association between two entities?

Suppose there is a scenario of Users having Tasks. Each User can either be a Watcher or Worker of a Task.
Furthermore, a Worker can file the hours he has worked on a given Task.
Would the following diagram be correct? I have looked around at domain models and I have not seen one with the two associations (works on, watches). Is it acceptable?
EDIT: What about this scenario? An User can make an Offer to another user. A possible way of modelling it is shown on the following diagram.
However, in that diagram it would seem possible for a user to make the offer to himself. Is it possible to model some constraints in, or is this handled further down the development line?
It is in principle correct, this is how you model multiple relationships between two classes.
As for the constraints, UML makes use of OCL (Object Constraint Language), so you can say that the associations are exclusive (xor - exclusive or).
Also note that it is generally a good idea to name the end roles of the associations.
Regarding one of the comments saying
There are no UML police to flag you down for being "unacceptable".
It is like saying: there's no code police to flag you down for writing shitty code.
You create diagrams to convey information (for school project anyway), if you diverge from standards or best practices you make it harder for other people to understand your diagrams.
And just like there are linters (jslint, ...) that check your code for common problems, for models there are model validations that do the same thing.
Also models, just like code, aren't set in stone so don't be afraid to modify them when you find a better way to express your domain.
Update
As Jim aptly pointed out, you usually do stuff not as a User (or Person), but as a Role. E.g. when you are student and you are filling a form, nobody cares that you are human, but that you are a Student. Typically a Person will also have several different roles (you could be a Student and a TA, a Professor, etc.)
Separating it in this way makes the domain much clearer as you are concerned only with the Roles, and not the people implementing them.
On the first model, there's not much to add after Peter's excellent and very interesting answer.
However your second diagram (in the edit section) does seem to give a true and fair view of the narative. From the strict 1 to 1 relationships, I'd understand that every user makes exactly one offer to one other user, and every user receives exactly one offer from another user:
"A user can make an offer to another user" implies a cardinality of 0..1 or *, not 1.
From this we understand implicitly that not all users need to receive an offer, i.e. also a cardinality of 0..1 or *
This can be debated, but "an offer" doesn't in my opinion mean at "at most one offer", so the upper bounds of cardinality shouldn't be * and not 1 to show that every user may make several offers and receive several offers.
As you can see, you can add constraints to the schema to increase expressivity. This is done in an annotation between { } . But you can choose the most suitable syntax. Here an example with the full expressivity of natural language (as suggested by Martin Fowler in his "UML distilled") but you can of course use the more formal OCL, making it {self.offerer<>self.offeree}.
I see #Peter updated his answer before I could post this answer, but I will post this anyway to show you some other tricks.
In general, it is perfectly valid to have multiple associations between the same two classes. However, I don't think it's a good idea here.
You say you want to build a [problem] domain model. I'm glad to hear that! A problem domain model is very important, as I explain in the last paragraph here. One thing to point out is that you want to build a durable model that transcends a system you can conceive. In the problem domain, there is no "User". There are, however, Roles that People play. For example, you mentioned Watcher and Worker. Hopefully these are concepts your customer already has in the "meat world".
In the model you posted, you have no place to hang the hours worked or progress made. You might try an exercise. If you had no computer, how would your customer track these things? They often have (or had) a way, and it was probably pretty optimal for a manual system.
Here is how I would model my understanding of your problem domain:
Some things to note:
A Person plays any number of Roles.
A Role is with respect to exactly one Task and one Person. A Watcher watches exactly one Task, a Worker is assigned to exactly one Task, and any kind of Role is played by exactly one Person.
A Role is abstract, and its subclasses are {complete}, meaning there can be no valid instance of a Role without also being an instance of a subclass.
A Task is watched by any number of Watchers and may be assigned to one Worker at a time. There is a constraint saying {person is not a watcher}. (You can show the OCL for that, but few people will understand it.)
I have added a Progress concept as a way of logging progress on a Task. A Worker makes Progress on one Task. Each bit of Progress has a Description and a Duration. Note that I have not committed to any computational way of representing a Description or a Duration. The designer of a system for this will be free to derive a Duration from start and end times, or ask the user to self-report.

Search query and search result in DomainDrivenDesign

I have a web application with news posts. Those news posts should be searchable. In context of DDD what kind of buililng block are search query and search result?
These are my thoughts
They both don't have identity, therefore they are not Entities. But the absence of identity doesn't imply they are Value Object. As Eric Evans states:
However, if we think of this category of object as just the absence of identity, we haven't added much to our toolbox or vocabulary. In fact, these objects have characteristics of their own and their own significance to the model. These are the objects that describe things.
I would like to say that both are value objects, but I'm not sure. What confuses me are examples that I've found on Internet. Usualy value object are part of other entities, they are not "standalone". Martin Fowler gives for example Money or date range object. Adam Bien event compares them to enums.
If search result would be considered value object, that would be value object that consists of entities. I'm not sure that's completely alright.
I don't think they are DataTransferObject. Because we are not current concerned with transferring data between layers, but we are concerned with their meaning for the model in absence of layer.
I don't think search query is command. Because it's not a request for change.
As stated on CQRS
People request changes to the domain by sending commands.
I'm trying to use and learn DDD, can someone clarify this problem to me? Where did I go wrong with reasoning?
The simple answer is that querying is probably not part of your domain. The domain model is not there to serve queries, it is there to enforce invariants in your domain. Since by nature queries are read-only, there are no invariants to enforce, so why complicate things? I think where people usually go wrong with DDD is that they assume since they are "doing DDD" every aspect of the system must be handled by a domain model. DDD is there to help with the complex business rules and should only be applied when/where you actually have them. Also, you can and probably should have many models to support each bounded context. But that's another discussion.
It's interesting that you mention CQRS because what does that stand for? Command query responsibility segregation. So if commands use the domain model, and query responsibility is segregated from that, what does that tell you to do? The answer is, do whatever is easiest to query and display that data. If select * from news table filled to dataset works, go with that. If you prefer entity framework, go with that. There is no need to get the domain model involved for queries.
One last point I'd like to make is I think a lot of people struggle with DDD by applying it to situations where there aren't many business invariants to enforce and the domain model ends up looking a lot like the database. Be sure you are using the right tool for the job and not over complicating things. Also, you should only apply DDD in the areas of your system where these invariants exist. It's not an all or nothing scenario.

Resources