CQRS and domain model - domain-driven-design

I need to implement a project with CQRS, however I'm in doubt about what entities get a corresponding command and query classes.
If I have classes A, B and C, being that A is my aggregate root and the others are child entities in my aggregate, what classes should have command and query classes?
I mean, should I have a QueryA, QueryB and QueryC, or should I have only QueryA, which will bring the child data using lazy loading, for instance?
For repositories, as my understanding of the domain model, I'm considering only a RepositoryA (for my aggregate root).

Queries are not per aggregate, they are on a per view basis. For instance say you have a customer account and want to display
a list of accounts
account details with confidential info (e.g. credit card details)
account details without confidential info
This would be three queries, one for every view. And usually without such painful things like lazy loading. Either you need some piece of information for a specific view or you don't.
Commands are not per aggregate as well. You would have a command for every behavior. Like OpenAccount, CloseAccount, MergeAccounts, etc.

Related

Can DDD repositories return data from other aggregate roots?

I'm having trouble getting my head around how to use the repository pattern with a more complex object model. Say I have two aggregate roots Student and Class. Each student may be enrolled in any number of classes. Access to this data would therefore be through the respective repositories StudentRepository and ClassRepository.
Now on my front end say I want to create a student details page that shows the information about the student, and a list of classes they are enrolled in. I would first have to get the Student from StudentRepository and then their Classes from ClassRepository. This makes sense.
Where I get lost is when the domain model becomes more realistic/complex. Say students have a major that is associated with a department, and classes are associated with a course, room, and instructors. Rooms are associated with a building. Course are associated with a department etc.. etc..
I could easily see wanting to show information from all these entities on the student details page. But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
I understand the ClassRepository should only be responsible for updating classes, and not anything in other aggregate roots. But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots? In most cases this would only need to be a partial summary of those related entities (building name, course name, course number, instructor name, instructor email etc..).
But then I would have to make a number of calls to separate repositories per each class the student is enrolled in. So now what could have been a couple queries to the database has increased massively. This doesn't seem right.
Yup.
But does it violate DDD if the values ClassRepository returns contains information from other related aggregate roots?
Nobody cares about "violate DDD". What we care about is: do you still get the benefits of the repository pattern if you start pulling in data from other aggregates?
Probably not - part of the point of "aggregates" is that when writing the business code you don't have to worry to much about how storage is implemented... but if you start mixing locked data and unlocked data, your abstraction starts leaking into the domain code.
However: if you are trying to support reporting, or some other effectively read only function, you don't necessarily need the domain model at all -- it might make sense to just query your data store and present a representation of the answer.
This substitution isn't necessarily "free" -- the accuracy of the information will depend in part on how closely your stored information matches your in memory information (ie, how often are you writing information into your storage).
This is basically the core idea of CQRS: reads and writes are different, so maybe we should separate the two, so that they each can be optimized without interfering with the correctness of the other.
Can DDD repositories return data from other aggregate roots?
Short answer: No. If that happened, that would not be a DDD repository for a DDD aggregate (that said, nobody will go after you if you do it).
Long answer: Your problem is that you are trying to use tools made to safely modify data (aggregates and repositories) to solve a problem reading data for presentation purposes. An aggregate is a consistency boundary. Its goal is to implement a process and encapsulate the data required for that process. The repository's goal is to read and atomically update a single aggregate. It is not meant to implement queries needed for data presentation to users.
Also, note that the model you present is not a model based on aggregates. If you break that model into aggregates you'll have multiple clusters of entities without "lines" between them. For example, a Student aggregate might have a collection of ClassEnrollments and a Class aggregate a collection of Atendees (that's just an example, note that modeling many to many relationships with aggregates can be a bit tricky). You'll have one repository for each aggregate, which will fully load the aggregate when executing an operation and transactionally update the full aggregate.
Now to your actual question: how do you implement queries for data presentation that require data from multiple aggregates? well, you have multiple options:
As you say, do multiple round trips using your existing repositories. Load a student and from the list of ClassEnrollments, load the classes that you need.
Use CQRS "lite". Aggregates and respositories will only be used for update operations and for query operations implement Queries, which won't use repositories, but access the DB directly, therefore you can join tables from multiple aggregates (Student->Enrollments->Atendees->Classes)
Use "full" CQRS. Create read models optimised for your queries based on the data from your aggregates.
My preferred approach is to use CQRS lite and only create a dedicated read model when it's really needed.

DDD considering Repositories & Services for entities

I've been getting acquainted with DDD and trying to understand the way Entities and Aggregate Roots interact.
Below is the example of the situation:
Let's say there is a user and he/she has multiple email addresses (can have up to 200 for the sake of example). Each email address has it's own identity and so does the user. And there is one to many relationship between users and their email.
From the above example I consider Users and Emails as two entities while Users is the aggregate root
DDD Rules that I came across:
Rule: Only aggregate root has access to the repository.
Question 1: Does it mean that I cannot have a separate database table/collection to store the emails separately? Meaning that the emails have to be embedded inside the user document.
Rule: Entities outside the aggregate can only access other entities in the aggregate via the aggregate root.
Question 2: Now considering I do split them up into two different tables/collection and link the emails by having a field in email called associatedUserId that holds the reference to the user that email belongs to. I can't directly have an API endpoint like /users/{userId}/emails and handle it directly in the EmailService.getEmailsByUserId(String userId)? If not how do I model this?
I am sorry if the question seems a bit too naive but I can't seem to figure it out.
Only aggregate root has access to the repository
Does it mean that I cannot have a separate database table/collection to store the emails separately? Meaning that the emails have to be embedded inside the user document.
It means that there should be a single lock to acquire if you are going to make any changes to any of the member entities of the aggregate. That certainly means that the data representation of the aggregate is stored in a single database; but you could of course distribute the information across multiple tables in that database.
Back in 2003, using relational databases as the book of record was common; one to many relationships would normally involve multiple tables all within the same database.
Entities outside the aggregate can only access other entities in the aggregate via the aggregate root.
I can't directly have an API endpoint like /users/{userId}/emails and handle it directly in the EmailService.getEmailsByUserId(String userId)?
Of course you can; you'll do that by first loading the root entity of the User aggregate, then invoking methods on that entity to get at the information that you need.
A perspective: Evans was taking a position against the idea that the application should be able to manipulate arbitrary entities in the domain model directly. Instead, the application should only be allowed to the "root" entities in the domain model. The restriction, in effect, means that the application doesn't really need to understand the constraints that are shared by multiple entities.
Four or five years later cqrs appeared, further refining this idea -- it turns out that in read-only use cases, the domain model doesn't necessarily contribute very much; you don't need to worry about the invariants if they have already been satisfied and you aren't changing anything.
In effect, this suggests that GET /users/{userId}/emails can just pull the data out of a read-only view, without necessarily involving the domain model at all. But POST /users/{userId}/emails needs to demonstrate the original care (meaning, we need to modify the data via the domain model)
does this mean that I need to first go to the UserRepo and pull out the user and then pull out the emails, can't I just make a EmailService talking to an Email Repo directly
In the original text by Evans, repositories give access to root entities, rather than arbitrary entities. So if "email" is a an entity within the "user aggregate", then it normally wouldn't have a repository of its own.
Furthermore, if you find yourself fighting against that idea, it may be a "code smell" trying to bring you to recognize that your aggregate boundaries are in the wrong place. If email and user are in different aggregates, then of course you would use different repositories to get at them.
The trick is to recognize that aggregate design is a reflection of how we lock our data for modification, not how we link our data for reporting.

Class diagram for a mini JEE project

I want to start a project in JEE and I need to confirm about my class diagram. I need to know if the methods used are correct and if the composition I used is correct or not.
This is my class diagram:
The project is about an online sales store, that wants to set up a management tool to sell products, and to manage its products. This tool must include the following features:
Identification module: identification of clients, managers, supervisors
Sales module: make purchases for users
Product Management Module: Adding / Deleting Products
Statistical module: visualization of sales statistics
Functional Specifications
It is necessary to act on the application, to connect to the application with a user ID and password. To facilitate its use and in order to avoid any mishandling thereafter, here is the solution:
User Profile:
The user will be able to visualize the products sold by My Online Races. The user can place an order, provided that he has registered with the site My Online Races.
Manager profile:
The manager will be able to manage the products:
Add / Edit / Delete Products
Add / Edit / Delete category
These data insertions can be made using CSV or XML files, but also through various forms on the website.
The manager will be able to view the sales statistics.
Supervisor Profile:
The supervisor can add managers whose roles are specified above.
The supervisor will be able to view the sales statistics.
The supervisor will be able to view all the actions performed by the managers, a sort of audit trail.
Well I wish to know already if you have remarks about my design. As well as I have a confusion for several methods, for example adding, modifying and deleting a product. Should I put them in the manager or product class? Is the composition I put correct or should I remove it?
Quick review of the diagram and advices
First some minor remarks about class naming: Ordered should be called Order.
The composition between Article and Order is just wrong (not from a formal view, but from the meaning it conveys). Use a normal one-to-many association: it would reflect much better the real nature of the relation between the two classes. Please take into account that a new article may exist without having been ordered, so it shoud be 0..* instead of 1..*
+belongs and +do in the middle of an association are syntactically incorect. You should use a plain triangle instead (or nothing at all). The triangle should be oriented in the reading direction Person do |> Order and Article belongs to |> Category
The methods seem ok. You do not need to add a suffix.
How shall objects be managed (created/updated/deleted) ?
A more advanced concern is not about the diagram but about how you want to organise persistence (i.e. database storage):
do you really want the object to be an active record, that is an object that adds, updates and deletes itself (to the database) ? It's simple to set up, works well, but makes the class dependent on the underlying database implementation and thus makes maintenance more difficult;
or wouldn't it be better to use a repository for each object ? In this case the repository acts as a collection that manages all the database operations. The domain object (Article, order, User, ...) then have nothing to know about the database, wich leads to more maintainable code.
But this is a broader architectural question. If it's just for a first experimental project with JEE, you can very well use the active records. It's simpler to set up. Be sure however in this case to disambiguate the Add/Update/Delete on Person, since it currently may give the impression that any person can add anyone.
Improvement of the model
A final remark, again not about the diagram itself, is about the domain. Your model considers that an Order is about a single Article.
In reality however, orders are in general about one or several articles: if this would also be the case here, your Order would become an OrderItem and the real Order would be inserted between Person and OrderItem. You could then make the relation between Order and OrderItem a composition (i.e: OrderItem is owned by Order, which has responsibility for creating its items, and the items have no sense without the related order).

Service Layer DTOs - Large Complex Interactive Report-Like Objects

I have Meeting objects that form the basis of a scheduling system, of which gridviews are used to display the important information. This is for the purpose of scheduling employees to meetings, and for employees to view what has been scheduled.
I have been trying to follow DDD principles, but I'm having difficulty knowing what to pass from my service layer down to presentation area of system. This is because the schedule can be LARGE, and actually consists of many different elements of the system. Eg. Client Name, Address, Case Info, Group,etc, all of which are needed for the meeting scheduler to make a decision.
In addition to this, the scheduler needs to change values within this schedule and pass it back up to the service layer (eg. assign employees from dropdowns, maybe change group, etc). So, the information isn't really "readonly" - it needs to be interacted with. ie. It's not just a report.
Our current approach is to populate a flattened "Schedule Object" from SQL, which is constructed from small parts of different domain objects. It's quite a complex query. When changes have been made, this is then passed back up to the service layer, and the service will retrieve the domain objects in question, and fire business methods on the domain objects using information from the DTOs.
My question is, is this the correct approach? ie. Continue to generate large custom objects from SQL, and then pass down from Service Layer to Presentation Layer objects that feel a lot like View Models?
UPDATE due to an answer
To give a idea of the amount entities / aggregates relationships involved. (this is an obfuscated examples, so relationships are the important things here)
Client is in one default group
Client has one open case but many closed
Cases have many Meetings
Meeting have many assigned Employees
Meeting have many reasons
Meeting can get scheduled to different groups
Employees can be associated with many groups.
The schedule need to loads all meetings in open cases that belong to patients who are in the same groups as the employee.
Scheduler can see Client Name, Client Address, Case Info, MeetingTime, MeetingType, MeetingReasons, scheduledGroup(s) (showstrail), Assigned Employees (also has hidden employee ids).
Editable fields are assign employee dropdowns and scheduled group.
Schedule may be up to two hundred rows.
DTO is coming down from WCF, so domain model is accessed above this service layer, and not below.
Domain model business calls leveraged by service based on DTO values passed back, and repositories deal with inserts/updates.
So, I suppose to update, is using a query to populate an object which contains all of the above acceptable to pass down as one merged DTO? And if not, how would you approach it? ( giving some example calls to service layer, and explaining a little bit about how you conceive the ORM fetching the data keeping in mind performance)
In the service layer and below, I would treat each entity (see aggregate roots in DDD) separate with respect to it's transactional boundary. I.e. even if you could update a client and a case in the same UI view, it would be best to transactionally modify the client and then modify the case. The more you try to modify in one transaction, the more you can conflict with other users.
Although your schedule is large and can contain lots of objects, the service layer should again deal with each entity (aggregate root) separately and then bundle them together into a new view model. Sadly, on brown-field projects, a lot of logic might be in the SQL and the massive multi-table joins might make this harder to refactor into more atomic queries that do exactly what is needed. The old-school data-centric view of 'do everything you can in the database' goes against everything DDD.
Because DDD is a collection of design ideas and patterns and not particularly a methodology or an architecture, it sounds that it might be too late to try shoe-horn your current application into a DDD application-centric design. It sounds as though your current app is very entrenched in the data-centric view.
If everything is currently being passed up through the layers in one monolithic chunk, it might be best to keep with this style and just expose these monolithic chunks to the people in the other team who wish to consume them, for use in their new app. You might be able to put some sort of view model caching in place (a bit like the caching view model element in CQRS).
In my personal opinion, data-centric, normalised data apps have had their day (they made sense in the 1970s when hard disk space was expensive) and all apps should be moving toward more modern practices. In reality, only when legacy systems are crawling on their knees, will stakeholders usually put up the cash to look for alternatives (usually after stuffing every last server with RAM). It might be possible or best to convince them to refactor small sections at a time.

Implementing Udi's Fetching Strategy - How do I search?

Background
Udi Dahan suggests a fetching strategy as a useful pattern to use for data access. I agree.
The concept is to make roles explicit. For example I have an Aggregate Root - Customer. I want customer in several parts of my application - a list of customers to select from, a view of the customer's details, and I want a button to deactivate a customer.
It seems Udi would suggest an interface for each of these roles. So I have ICustomerInList with very basic details, ICustomerDetail which includes the latest 10 products purchased, and IDeactivateCustomer which has a method to deactivate the customer. Each interface exposes just enough of my Customer Aggregate Root to get the job done in each situation. My Customer Aggregate Root implements all these interfaces.
Now I want to implement a fetching strategy for each of these roles. Each strategy can load a different amount of data into my Aggregate Root because it will be behind an interface exposing only the bits of information needed.
The general method to implement this part is to ask a Service Locator or some other style of dependency injection. This code will take the interface you are wanting, for example ICustomerInList, and find a fetching strategy to load it (IStrategyForFetching<ICustomerInList>). This strategy is implemented by a class that knows to only load a Customer with the bits of information needed for the ICustomerInList interface.
So far so good.
Question
What you pass to the Service Locator, or the IStrategyForFetching<ICustomerInList>. All of the examples I see are only selecting one object by a known id. This case is easy, the calling code passes this id through and will get back the specific interface.
What if I want to search? Or I want page 2 of the list of customers? Now I want to pass in more terms that the Fetching Strategy needs.
Possible solutions
Some of the examples I've seen use a predicate - an expression that returns true or false if a particular Aggregate Root should be part of the result set. This works fine for conditions but what about getting back the first n customers and no more? Or getting page 2 of the search results? Or how the results are sorted?
My first reaction is to start adding generic parameters to my IStrategyForFetching<ICustomerInList> It now becomes IStrategyForFetching<TAggregateRoot, TStrategyForSelecting, TStrategyForOrdering>. This quickly becomes complex and ugly. It's further complicated by different repositories. Some repositories only supply data when using a particular strategy for selecting, some only certain types of ordering. I would like to have the flexibility to implement general repositories that can take sorting functions along with specialised repositories that only return Aggregate Roots sorted in a particular fashion.
It sounds like I should apply the same pattern used at the start - How do I make roles explicit? Should I implement a strategy for fetching X (Aggregate Root) using the payload Y (search / ordering parameters)?
Edit (2012-03-05)
This is all still valid if I'm not returning the Aggregate Root each time. If each interface is implemented by a different DTO I can still use IStrategyForFetching. This is why this pattern is powerful - what does the fetching and what is returned doesn't have to map in any way to the aggregate root.
I've ended up using IStrategyForFetching<TEntity, TSpecification>. TEntity is the thing I want to get, TSpecification is how I want to get it.
Have you come across CQRS? Udi is a big proponent of it, and its purpose is to solve this exact issue.
The concept in its most basic form is to separate the domain model from querying. This means that the domain model only comes into play when you want to execute a command / commit a transaction. You don't use data from your aggregates & entities to display information on the screen. Instead, you create a separate data access service (or bunch of them) that contain methods that provide the exact data required for each screen. These methods can accept criteria objects as parameters and therefore do searching with whatever criteria you desire.
A quick sequence of how this works:
A screen shows a list of customers that have made orders in the last week.
The UI calls the CustomerQueryService passing a date as criteria.
The CustomerQueryService executes a query that returns only the fields required for this screen, including the aggregate id of each customer.
The user chooses a customer in the list, and chooses perform the 'Make Important Customer' action /command.
The UI sends a MakeImportantCommand to the Command Service (or Application Service in DDD terms) containing the ID of the customer.
The command service fetches the Customer aggregate from the repository using the ID passed in the command, calls the necessary methods and updates the database.
Building your app using the CQRS architecture opens you up to lot of possibilities regarding performance and scalability. You can take this simple example further by creating separate query databases that contain denormalised tables for every view, eventual consistency & event sourcing. There is a lot of videos/examples/blogs about CQRS that I think would really interest you.
I know your question was regarding 'fetching strategy' but I notice that he wrote this article in 2007, and it's likely that he considers CQRS its sucessor.
To summarise my answer:
Don't try and project cut down DTO's from your domain aggregates. Instead, just create separate query services that give you a tailored query for your needs.
Read up on CQRS (if you haven't already).
To add to the response by David Masters, I think all the fetching strategy interfaces are adding needless complexity. Having the Customer AR implement the various interfaces which are modeled after a UI is a needless constraint on the AR class and you will spend far to much effort trying to enforce it. Moreover, it is a brittle solution. What if a view requires data that while related to Customer, does not belong on the customer class? Does one then coerce the customer class and the corresponding ORM mappings to contain that data? Why not just have a separate set of classes for query purposes and be done with it? This allows you to deal with fetching strategies at the place where they belong - in the repository. Furthermore, what value does the fetching strategy interface abstraction really add? It may be an appropriate model of what is happening in the application, it doesn't help in implementing it.

Resources