Lightweight aggregates and repositories

Lightweight aggregates and repositories - domain-driven-design

Let's assume that we have two simple domain objects :
Topic (entity) -> Messages (value object)
These two domain objects could be included into one aggregate according to DDD principles.
But in some cases we need to retrieve topics without messages (if want just show a list of topics) and sometimes we need to retrieve topics with messages.
What is the best way to design that simple case? Thanks in advance.

I would suggest you to separate domain logic from data needed for presentation. Something like Command-query separation (CQS) or command-query responsibility segregation (CQRS). For example, if someone adds a new message to topic, you create an appropriate command and handle it as a part of your domain logic. And if you need to display some data in user interface, you select only data that you really need through DTO (data transfer object). This solution avoids of unnessesary data retrieving and helps to keep simplicity. You retrieve only data you really need.
If this solution causes a lot of changes in your project, you can create an additional method in repository that returns a lightweight version of your aggregate (with default stub for Messages collection). But this solution has one drawback - you will need to keep in mind that this method returns incomplete data.

Related

How to correctly persist and present information from multiple aggregates?

I'm creating a selling platform. The core aggregate is called Announcement and it holds references to other aggregates such as Categories, User etc. I am using CQRS approach an event-sourcing solution as storage.
For performance reasons, I decided to store some important details about associated objects (Categories, User) inside the Announcement aggregate along with their ids. My reasoning behind it was that when filtering announcements, I want to simplify the access to those information as much as possible (reduce the number of database joins, allow fancy querying syntax). It was possible, because I included all the required information in the command, which creates an announcement. Generation of a detailed view of an announcement is based on information embedded inside the aggregate. Although it seemed reasonable at first, now I'm having second thoughts.
The considerations that made me think are:
I realized that I don't need transactional consistency on all the additional details (categories, seller details, etc.). There are no constraints that would force me to do what I did.
The event store that I'm using offers multistream projections. I'm wondering if that's the puzzle piece that should replace the redundant information in the Announcement aggregate.
Are the following steps a valid solution for the described problem?
Remove the duplicated information from the Announcement aggregate;
Use a domain event to notify other aggregates about creation of an Announcement;
Let other aggregates publish appropriate events in response to the AnnouncementCreated event; these events may contain additional information about associated objects;
Introduce a multistream projection, which will update itself in response to events from multiple aggregates and produce a complete view of the announcement;

Never design aggregates by thinking of how you will read data. That is against the purpose of CQRS. Aggregates are about commands and business rules not queries. Use events to gather data from multiple aggregates then project the data however you want without affecting your aggregates. This concept is called a "projection".

In general, the only reason to include data in a particular aggregate is if that data affects command validation or if there's some other consistency demand. if information about categories or users isn't qualifying under either reason, then it makes a lot of sense to remove it from the announcement aggregate.
I would probably consider modeling a "categorized and associated announcement" aggregate which is fed by domain events from announcement/category/user aggregates. This could be implemented via the multistream projection from your event store, but I think it's useful to keep that detail separate because there are other ways you could feed domain events from multiple aggregates as commands for a different aggregate (the command implicit in any event is "incorporate this event into your view of the world").

Axon Framework: send command on aggregate load

We're building a microservices system with Axon Framework 4.1. In our domain, we have a label concept where we can attach labels to other entities. While labels are normally created and managed by the user, some of these labels are "special" and need to be hard-coded, but they need to be present in the event stream as well.
We have a bunch of aggregates that represent entities that can be labeled with these labels. Some of these aggregates will be used frequently, while others might be used infrequently or are even abandoned by the user.
Sometimes we come up with new special labels. We add them to the code, and then we also need to add them to the event stream. What is a good way to do that?
We can create a special command that we need to send when the updated service is started for the first time. It goes through all the labels and adds the ones that aren't in the event stream yet. This has two disadvantages. First, we need to actually send that command, which either requires us to not forget it, or to add some infrastructure for it outside of the code (e.g., in our build pipeline). Also, other services could have booted up faster with the new labels and started sending commands before we fired our special command. The other disadvantage is that this command will target all aggregates, including the abandoned ones, which could be wasteful of resources and be confusing to end users who might see activity in a document they thought was abandoned.
Ideally, we would like to be able to send the command when Axon has just loaded the aggregate. That way we would be certain that the labels are only introduced in aggregates that are actually used. Also, we could wire this up in code and it wouldn't require us to add infrastructure outside of the application and/or remember to do it manually.
Unfortunately, this feature doesn't seem to exist in Axon (yet) 😉.
Are there other (better) ways to achieve this?

I've got an idea which might help you out on this.
If I understand the use case correctly, the "Label" in your system, which user can introduce themselves but for which also a couple of hard-coded versions exist, is an Aggregate.
Based on that assumption, I suggest to be smart with the Aggregate Identifier you are using.
The sole thing that Axon expects from you, is that the Aggregate Identifier is (or can be made in to) a String. Typically a UUID is used for the Aggregate Identifiers, which is a reasonable first start.
You can however wrap this UUID in a typed-id object. Taking your "Label" Aggregate, that would opt for a LabelId.
That said, let's first go back to verifying whether a given "Label" Aggregate exists within the Event Stream.
The concern you have is rather valid I think; reading the entire Event Stream to figure out whether a given Aggregate instance exists is to big of a hassle.
However, the EventStore can be queried through two mechanism:
The Event Stream from a given point in time (e.g. what the TrackingToken mechanism does).
The Event Stream for a given Aggregate instance, based on the Aggregate Identifier.
It's the second option which is far more ideal in your scenario.
Just query the EventStore for a given "Label" Aggregate's Identifier. If you receive a non-empty Event Stream, you know it already exists.
Vice versa, if no Events are found, you are certain it's a new "Label" that needs to be introduced.
The crux here is in knowing the "Label's" Aggregate Identifier up front, which circles back to the String storage approach for the Aggregate Identifiers using a typed LabelId. What you could do, is deviate in the LabelId object between a custom "Label" (I'd opt for a UUID here) and a hard-coded "Label".
For the latter, you could for example have the label-name, plus a UUID/counter if desired.
Doing so will ensure that all the Events published from a hard-coded "Label" will have an Aggregate Identifier you can anticipate on during start-up.
Hope this is clear and all, if not, please comment on my response below.

DDD: where should logic go that tests the existence of an entity?

I am in the process of refactoring an application and am trying to figure out where certain logic should fit. For example, during the registration process I have to check if a user exists based upon their email address. As this requires testing if the user exists in the database it seems as if this logic should not be tied to the model as its existence is dictated by it being in the database.
However, I will have a method on the repository responsible for fetching the user by email, etc. This handles the part about retrieval of the user if they exist. From a use case perspective, registration seems to be a use case scenario and accordingly it seems there should be a UserService (application service) with a register method that would call the repository method and perform if then logic to determine if the user entity returned was null or not.
Am I on the right track with this approach, in terms of DDD? Am I viewing this scenario the wrong way and if so, how should I revise my thinking about this?
This link was provided as a possible solution, Where to check user email does not already exits?. It does help but it does not seem to close the loop on the issue. The thing I seem to be missing from this article would be who would be responsible for calling the CreateUserService, an application service or a method on the aggregate root where the CreateUserService object would be injected into the method along with any other relevant parameters?
If the answer is the application service that seems like you are loosing some encapsulation by taking the domain service out of the domain layer. On the other hand, going the other way would mean having to inject the repository into the domain service. Which of those two options would be preferable and more in line with DDD?

I think the best fit for that behaviour is a Domain Service. DS could access to persistence so you can check for existence or uniquenes.
Check this blog entry for more info.
I.e:
public class TransferManager
{
private readonly IEventStore _store;
private readonly IDomainServices _svc;
private readonly IDomainQueries _query;
private readonly ICommandResultMediator _result;
public TransferManager(IEventStore store, IDomainServices svc,IDomainQueries query,ICommandResultMediator result)
{
_store = store;
_svc = svc;
_query = query;
_result = result;
}
public void Execute(TransferMoney cmd)
{
//interacting with the Infrastructure
var accFrom = _query.GetAccountNumber(cmd.AccountFrom);
//Setup value objects
var debit=new Debit(cmd.Amount,accFrom);
//invoking Domain Services
var balance = _svc.CalculateAccountBalance(accFrom);
if (!_svc.CanAccountBeDebitted(balance, debit))
{
//return some error message using a mediator
//this approach works well inside monoliths where everything happens in the same process
_result.AddResult(cmd.Id, new CommandResult());
return;
}
//using the Aggregate and getting the business state change expressed as an event
var evnt = Transfer.Create(/* args */);
//storing the event
_store.Append(evnt);
//publish event if you want
}
}
from http://blog.sapiensworks.com/post/2016/08/19/DDD-Application-Services-Explained

The problem that you are facing is called Set based validation. There are a lot of articles describing the possible solutions. I will give here an extract from one of them (the context is CQRS but it can be applied to some degree to any DDD architecture):
1. Locking, Transactions and Database Constraints
Locking, transactions and database constraints are tried and tested tools for maintaining data integrity, but they come at a cost. Often the code/system is difficult to scale and can be complex to write and maintain. But they have the advantage of being well understood with plenty of examples to learn from. By implication, this approach is generally done using CRUD based operations. If you want to maintain the use of event sourcing then you can try a hybrid approach.
2. Hybrid Locking Field
You can adopt a locking field approach. Create a registry or lookup table in a standard database with a unique constraint. If you are unable to insert the row then you should abandon the command. Reserve the address before issuing the command. For these sort of operations, it is best to use a data store that isn’t eventually consistent and can guarantee the constraint (uniqueness in this case). Additional complexity is a clear downside of this approach, but less obvious is the problem of knowing when the operation is complete. Read side updates are often carried out in a different thread or process or even machine to the command and there could be many different operations happening.
3. Rely on the Eventually Consistent Read Model
To some this sounds like an oxymoron, however, it is a rather neat idea. Inconsistent things happen in systems all the time. Event sourcing allows you to handle these inconsistencies. Rather than throwing an exception and losing someone’s work all in the name of data consistency. Simply record the event and fix it later.
As an aside, how do you know a consistent database is consistent? It keeps no record of the failed operations users have tried to carry out. If I try to update a row in a table that has been updated since I read from it, then the chances are I’m going to lose that data. This gives the DBA an illusion of data consistency, but try to explain that to the exasperated user!
Accepting these things happen, and allowing the business to recover, can bring real competitive advantage. First, you can make the deliberate assumption these issues won’t occur, allowing you to deliver the system quicker/cheaper. Only if they do occur and only if it is of business value do you add features to compensate for the problem.
4. Re-examine the Domain Model
Let’s take a simplistic example to illustrate how a change in perspective may be all you need to resolve the issue. Essentially we have a problem checking for uniqueness or cardinality across aggregate roots because consistency is only enforced with the aggregate. An example could be a goalkeeper in a football team. A goalkeeper is a player. You can only have 1 goalkeeper per team on the pitch at any one time. A data-driven approach may have an ‘IsGoalKeeper’ flag on the player. If the goalkeeper is sent off and an outfield player goes in the goal, then you would need to remove the goalkeeper flag from the goalkeeper and add it to one of the outfield players. You would need constraints in place to ensure that assistant managers didn’t accidentally assign a different player resulting in 2 goalkeepers. In this scenario, we could model the IsGoalKeeper property on the Team, OutFieldPlayers or Game aggregate. This way, maintaining the cardinality becomes trivial.

You seems to be on the right way, the only stuff I didn't get is what your UserService.register does.
It should take all the values to register a user as input, validate them (using the repository to check the existence of the email) and, if the input is valid store the new User.
Problems can arise when the validation involve complex queries. In that case maybe you need to create a secondary store with special indexes suited for queries that you can't do with your domain model, so you will have to manage two different stores that can be out of sync (a user exists in one but it isn't replicated in the other one, yet).
This kind of problem happens when you store your aggregates in something like a key-value store where you can search just with the id of the aggregate, but if you are using something like a sql database that permits to search using your entities fields, you can do a lot of stuff with simple queries.
The only thing you need to take care is avoid to mix query logic and commands logic, in your example the lookup you need to do is easy, is just one field and the result is a boolean, sometimes it can be harder like time operations, or query spanning multiple tables aggregating results, in these cases it is better to make your (command) service use a (query) service, that offers a simple api to do the calculation like:
interface UserReportingService {
ComplexResult aComplexQuery(AComplexInput input);
}
That you can implement with a class that use your repositories, or an implementation that executes directly the query on your database (sql, or whatever).
The difference is that if you use the repositories you "think" in terms of your domain object, if you write directly the query you think in terms of your db abstractions (tables/sets in case of sql, documents in case of mongo, etc..). One or the other depends on the query you need to do.

It is fine to inject repository into domain.
Repository should have simple inteface, so that domain objects could use it as simple collection or storage. Repositories' main idea is to hide data access under simple and clear interface.
I don't see any problems in calling domain services from usecase. Usecase is suppossed to be archestrator. And domain services are actions. It is fine (and even unavoidable) to trigger domain actions by usecase.
To decide, you should analyze Where is this restriction come from?
Is it business rule? Or maybe user shouldn't be a part of model at all?
Usualy "User" means authorization and authentification i.e behaviour, that for my mind should placed in usecase. I prefare to create separate entity for domain (e.g. buyer) and relate it with usecase's user. So when new user is registered it possible to trigger creation of new buyer.

Do we really need a separate event store with Event Sourcing and CQRS patterns?

Suppose we have a situation when we need to implement some domain rules that requires examination of object history (event store). For example we have an Order object with CurrentStatus property, and we need to examine Order.CurrentStatus changes history.
Most likely you will answer that I need to move this knowledge to domain and introduce Order.StatusHistory property that contains a collection of status records, and that I should not query event store. And I will agree with you.
What I question is the need of Event Store.
We write in event store events that has business meaning (domain value), we do not record UserMovedMouse events (in most cases). And as with OrderStatusChanged event there is a high chance that most of events from EventStore will be needed at some point for domain logic, and we end up with a domain object that have a EventHistory property with the collection of events.
I can see a value in separate event store for patterns such as CQRS when you have a single write only event store and multiple read only query stores, which gives you some scalability. However the need to to introduce such thing in code is in question too for me. All decent databases support single write server, multiple read servers scalability (master-slave replication). Why should I introduce such thing at source code level? Why not to forget about Web Services, and Message buses and use write your own wrapers around Sockets.
I have a great respect to "old school" DDD as it was described be Eric Evans, and I see some fresh and good ideas in new wave DDD+SQRC+EventSourcing pattern aggregate. However the main idea of CQRS is under big question for me. Am I missing something?

In short: if event sourcing is not needed (for its added benefits or as workarounds for some quirks), then you definitely shouldn't bring it into your system just for the sake of it.
ES is just one of many ways to augment CQRS architectural style within a bounded context. It is not a requirement.

Implementing Udi's Fetching Strategy - How do I search?

Background
Udi Dahan suggests a fetching strategy as a useful pattern to use for data access. I agree.
The concept is to make roles explicit. For example I have an Aggregate Root - Customer. I want customer in several parts of my application - a list of customers to select from, a view of the customer's details, and I want a button to deactivate a customer.
It seems Udi would suggest an interface for each of these roles. So I have ICustomerInList with very basic details, ICustomerDetail which includes the latest 10 products purchased, and IDeactivateCustomer which has a method to deactivate the customer. Each interface exposes just enough of my Customer Aggregate Root to get the job done in each situation. My Customer Aggregate Root implements all these interfaces.
Now I want to implement a fetching strategy for each of these roles. Each strategy can load a different amount of data into my Aggregate Root because it will be behind an interface exposing only the bits of information needed.
The general method to implement this part is to ask a Service Locator or some other style of dependency injection. This code will take the interface you are wanting, for example ICustomerInList, and find a fetching strategy to load it (IStrategyForFetching<ICustomerInList>). This strategy is implemented by a class that knows to only load a Customer with the bits of information needed for the ICustomerInList interface.
So far so good.
Question
What you pass to the Service Locator, or the IStrategyForFetching<ICustomerInList>. All of the examples I see are only selecting one object by a known id. This case is easy, the calling code passes this id through and will get back the specific interface.
What if I want to search? Or I want page 2 of the list of customers? Now I want to pass in more terms that the Fetching Strategy needs.
Possible solutions
Some of the examples I've seen use a predicate - an expression that returns true or false if a particular Aggregate Root should be part of the result set. This works fine for conditions but what about getting back the first n customers and no more? Or getting page 2 of the search results? Or how the results are sorted?
My first reaction is to start adding generic parameters to my IStrategyForFetching<ICustomerInList> It now becomes IStrategyForFetching<TAggregateRoot, TStrategyForSelecting, TStrategyForOrdering>. This quickly becomes complex and ugly. It's further complicated by different repositories. Some repositories only supply data when using a particular strategy for selecting, some only certain types of ordering. I would like to have the flexibility to implement general repositories that can take sorting functions along with specialised repositories that only return Aggregate Roots sorted in a particular fashion.
It sounds like I should apply the same pattern used at the start - How do I make roles explicit? Should I implement a strategy for fetching X (Aggregate Root) using the payload Y (search / ordering parameters)?
Edit (2012-03-05)
This is all still valid if I'm not returning the Aggregate Root each time. If each interface is implemented by a different DTO I can still use IStrategyForFetching. This is why this pattern is powerful - what does the fetching and what is returned doesn't have to map in any way to the aggregate root.
I've ended up using IStrategyForFetching<TEntity, TSpecification>. TEntity is the thing I want to get, TSpecification is how I want to get it.

Have you come across CQRS? Udi is a big proponent of it, and its purpose is to solve this exact issue.
The concept in its most basic form is to separate the domain model from querying. This means that the domain model only comes into play when you want to execute a command / commit a transaction. You don't use data from your aggregates & entities to display information on the screen. Instead, you create a separate data access service (or bunch of them) that contain methods that provide the exact data required for each screen. These methods can accept criteria objects as parameters and therefore do searching with whatever criteria you desire.
A quick sequence of how this works:
A screen shows a list of customers that have made orders in the last week.
The UI calls the CustomerQueryService passing a date as criteria.
The CustomerQueryService executes a query that returns only the fields required for this screen, including the aggregate id of each customer.
The user chooses a customer in the list, and chooses perform the 'Make Important Customer' action /command.
The UI sends a MakeImportantCommand to the Command Service (or Application Service in DDD terms) containing the ID of the customer.
The command service fetches the Customer aggregate from the repository using the ID passed in the command, calls the necessary methods and updates the database.
Building your app using the CQRS architecture opens you up to lot of possibilities regarding performance and scalability. You can take this simple example further by creating separate query databases that contain denormalised tables for every view, eventual consistency & event sourcing. There is a lot of videos/examples/blogs about CQRS that I think would really interest you.
I know your question was regarding 'fetching strategy' but I notice that he wrote this article in 2007, and it's likely that he considers CQRS its sucessor.
To summarise my answer:
Don't try and project cut down DTO's from your domain aggregates. Instead, just create separate query services that give you a tailored query for your needs.
Read up on CQRS (if you haven't already).

To add to the response by David Masters, I think all the fetching strategy interfaces are adding needless complexity. Having the Customer AR implement the various interfaces which are modeled after a UI is a needless constraint on the AR class and you will spend far to much effort trying to enforce it. Moreover, it is a brittle solution. What if a view requires data that while related to Customer, does not belong on the customer class? Does one then coerce the customer class and the corresponding ORM mappings to contain that data? Why not just have a separate set of classes for query purposes and be done with it? This allows you to deal with fetching strategies at the place where they belong - in the repository. Furthermore, what value does the fetching strategy interface abstraction really add? It may be an appropriate model of what is happening in the application, it doesn't help in implementing it.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string