CQRS/Event Sourcing - Does one expect to receive an Aggregate Id from the user/request?

I am currently just trying to learn some new programming patterns and I decided to give event sourcing a shot.
I have decided to model a warehouse as my aggregate root in the domain of shipping/inventory, where the number of warehouses is generally pretty constant (i.e. a company won't be adding warehouses too often).
I have run into the question of how to set my aggregateId, which should correspond to a warehouse, on my server. Most examples I have seen, including this one, show the aggregate ID being generated server side when a new aggregate is being created (in my case a warehouse), and then passed in the command request when referring to that aggregate for subsequent commands.
Would you say this is the correct approach? Can I expect the user to know and pass aggregate IDs when issuing commands? I realize this is probably domain dependent and could also be a UI/UX choice; I'm just wondering what others have done. It would make more sense to me if my event-sourced aggregates were created more frequently, as with meal tabs or shopping carts.
Thanks!

Heuristic: aggregate id, in many cases, is analogous to the primary key used to distinguish entities in a database table. Many of the lessons of natural vs surrogate keys apply.
Can I expect the user to know and pass aggregate Ids when issuing commands?
You probably can't depend on the human to know the aggregate ids. But the client that the human operator is using can very well know them.
For instance, if an operator is going to be working in a single warehouse during a session, then we might look up the appropriate identifier, cache it, and use it when constructing messages on behalf of the user.
Analog: when you fill in a web form and submit it, the browser does the work of looking at the form action and using that information to construct the correct URI, and similarly the correct HTTP Request.
The client will normally know what the ID is, because it just got it during a previous query.
Creation patterns are weird. It can, in some circumstances, make sense for the client to choose the identifier to be used when creating a new aggregate. In others, it makes sense for the client to provide an identifier for the command message, and the server decides for itself what the aggregate identifier should be.
It's messaging, so you want to be careful about coupling the client directly to your internal implementation details -- especially if that client is under a different development schedule. If you get the message contract right, then the server and client can evolve in any way consistent with the contract at any time.
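A minimal Python sketch of the second creation pattern described above, assuming a hypothetical CreateWarehouse message: the client supplies only a command id, the server chooses the aggregate id, and a retried command gets the same aggregate id back instead of creating a second warehouse. Every name here (CreateWarehouse, WarehouseService, the event tuple) is invented for illustration and is not taken from any particular framework.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateWarehouse:
    command_id: str  # chosen by the client, one per logical request
    name: str


class WarehouseService:
    """Hypothetical command handler; storage is an in-memory dict."""

    def __init__(self):
        self._aggregate_by_command = {}  # command_id -> aggregate_id
        self.events = []                 # appended domain events

    def handle(self, cmd: CreateWarehouse) -> str:
        # A retry of the same command must not create a second warehouse.
        if cmd.command_id in self._aggregate_by_command:
            return self._aggregate_by_command[cmd.command_id]
        aggregate_id = str(uuid.uuid4())  # the server decides the aggregate id
        self.events.append(("WarehouseCreated", aggregate_id, cmd.name))
        self._aggregate_by_command[cmd.command_id] = aggregate_id
        return aggregate_id


service = WarehouseService()
cmd = CreateWarehouse(command_id=str(uuid.uuid4()), name="Main warehouse")
first = service.handle(cmd)
second = service.handle(cmd)  # simulated retry after a lost response
```

The message contract only promises "a command id in, an aggregate id out", so the server stays free to change how it generates or stores aggregate ids.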
You may want to review Greg Young's 10 year retrospective, which includes a discussion of warehouse systems. TL;DR - in many cases the messages coming from the human operators are events, not commands.

Would you say this is the correct approach?
You're asking if one of Greg Young's Event Sourcing samples represents the correct approach... Given that the combination of CQRS and Event Sourcing was essentially (re)invented by Greg, I'd say there's a pretty good chance of that.
In general, letting the code that implements the Command-side generate a GUID for every Command, Event, or other persistent object that it needs to write is by far the simplest implementation, since randomly generated GUIDs are unique for all practical purposes (a collision is possible in principle but vanishingly unlikely). In a distributed system, uniqueness without coordination is a big thing.
Can I expect the user to know and pass aggregate Ids when issuing commands?
No, and you particularly can't expect a user to know the GUID of their assets. What you may be able to do is to present the user with a list of his or her assets. Each item in the list will have the GUID associated, but it may not be necessary to surface that ID in the user interface. It's just data that the underlying UI object carries around internally.
In some cases, users do need to know the ID of some of their assets (e.g. if it involves phone support). In that case, you can add a lookup API to address that concern.
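As a rough Python sketch of that idea: the list item carries the GUID, the UI renders only the label, and a separate lookup handles the cases where a person quotes a short reference. The field names, the "WH-001" style references, and the ReceiveStock command are all assumptions made up for this example.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class WarehouseListItem:
    warehouse_id: str  # GUID, carried by the UI object but not displayed
    label: str         # what the user actually sees
    reference: str     # short human-friendly code, e.g. for phone support


ITEMS = [
    WarehouseListItem(str(uuid.uuid4()), "North depot", "WH-001"),
    WarehouseListItem(str(uuid.uuid4()), "South depot", "WH-002"),
]


def visible_labels(items):
    """What the list widget would render: labels only, no GUIDs."""
    return [item.label for item in items]


def build_command(selected: WarehouseListItem, sku: str, quantity: int) -> dict:
    """The client fills in the aggregate id on the user's behalf."""
    return {"type": "ReceiveStock", "warehouse_id": selected.warehouse_id,
            "sku": sku, "quantity": quantity}


def lookup_by_reference(items, reference: str):
    """Lookup API for the cases where a person quotes a reference."""
    return next((i for i in items if i.reference == reference), None)


command = build_command(ITEMS[0], sku="ABC-1", quantity=5)
found = lookup_by_reference(ITEMS, "WH-002")
```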

Related

Command across multiple aggregates with CQRS and ES

I'm having an odd case while thinking about a solution for my problem.
A quick recap: I'm using an event store with CQRS, and I have 2 aggregates called 'Group' and 'User'.
Basically a User defines some characteristics like his region, age, and a couple of interests.
He then can choose to 'match' with a Group that is in the same region, around the same age and same interests.
Now here's the case: the 'matchmaking' part should happen completely on the backend, it can be a long running process, but for the client it's just 1 call to the endpoint and the end result should be him matching with a group.
So for this case, I have to query the groups which have the same region and the same age slice; the interests don't really matter in my query. I now have a list of groups, and the match maker is going to give each group a rating based on the common interests between the group and the user. The group with the best rating will be joined.
So again, using CQRS and ES, and my problem is that this case seems a mix between queries and a command, and mixing queries into a match command seems to go against the purpose of CQRS.
Querying multiple groups and filtering them against my write side, the event store, also is a bad idea as the aggregates have to be rebuilt and loaded in memory before being able to filter them out.
So I'm kind of stuck here. Something is telling me that a long running process / saga could be an answer to my problem, but I don't see how I would avoid mixing queries and commands in my saga, as a saga is basically a chain of commands/events.
How do I tackle this specific case? No real code is needed, a conceptual solution to get me going is perfect.

Hi, this is actually a case where CQRS can shine.
Creating a dedicated matching model seems ideal here, since it can answer a query that might be rather non-trivial in other forms.
So:
1) Create a dedicated (possibly ephemeral, possibly checkpointed/persisted) query model as a derived store.
2) Upon request, run a query to get the top matches.
3) Based on the results of the query, send a command to update the event store with the new links.
The query model will not need to manage commands and could be updated on a push basis from the event store. This will keep it rather simple to build and keep up to date, and it can be further optimized to hold only the data needed for this particular query.
An in-memory graph might do well.
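The three steps above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the event shape ("GroupCreated" with region, age_slice, interests), the JoinGroup command, and the rating rule (count of shared interests) are invented here, and a plain dict stands in for the in-memory model.

```python
from dataclasses import dataclass, field


@dataclass
class GroupView:
    group_id: str
    region: str
    age_slice: str
    interests: set = field(default_factory=set)


class MatchingModel:
    """Derived query model, updated by events pushed from the event store."""

    def __init__(self):
        self.groups = {}

    def apply(self, event: dict) -> None:
        if event["type"] == "GroupCreated":
            self.groups[event["group_id"]] = GroupView(
                event["group_id"], event["region"], event["age_slice"],
                set(event["interests"]))

    def best_match(self, region, age_slice, interests):
        # Filter on region and age slice, then rate by shared interests.
        candidates = [g for g in self.groups.values()
                      if g.region == region and g.age_slice == age_slice]
        if not candidates:
            return None
        wanted = set(interests)
        return max(candidates, key=lambda g: len(g.interests & wanted))


model = MatchingModel()
for e in [
    {"type": "GroupCreated", "group_id": "g1", "region": "EU",
     "age_slice": "20-29", "interests": ["chess"]},
    {"type": "GroupCreated", "group_id": "g2", "region": "EU",
     "age_slice": "20-29", "interests": ["chess", "hiking", "music"]},
    {"type": "GroupCreated", "group_id": "g3", "region": "US",
     "age_slice": "20-29", "interests": ["chess", "hiking"]},
]:
    model.apply(e)

best = model.best_match("EU", "20-29", {"chess", "hiking"})
# The query result feeds a command aimed at a single aggregate.
command = {"type": "JoinGroup", "user_id": "u1", "group_id": best.group_id}
```

The query never touches the event store directly; only the final command does.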
-Chris
p.s.
On the command side: the commands here would each only update a single aggregate instance.
Further, using the write-ahead pattern would remove the need for any sort of process manager or "saga."
e.g.
For each new membership, send one command to add the new membership to the user stream, then one command to the group to add the new member information. Then a simple audit process can scan for incomplete membership assignments, both on startup/recovery and as a periodic data quality check.
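One way to picture the two commands plus the audit scan, as a hedged Python sketch: the stream naming ("user-…", "group-…"), the event names, and the MembershipStore class are hypothetical stand-ins for a real event store, not an API from any library.

```python
class MembershipStore:
    """In-memory stand-in for an event store, keyed by stream name."""

    def __init__(self):
        self.streams = {}

    def _append(self, stream, event):
        self.streams.setdefault(stream, []).append(event)

    def add_membership_to_user(self, user_id, group_id):
        # Step 1 (written first): record the membership on the user stream.
        self._append(f"user-{user_id}", ("MembershipAdded", group_id))

    def add_member_to_group(self, group_id, user_id):
        # Step 2: record the member on the group stream.
        self._append(f"group-{group_id}", ("MemberAdded", user_id))

    def incomplete_memberships(self):
        """Audit: memberships on a user stream with no matching group event."""
        missing = []
        for stream, events in self.streams.items():
            if not stream.startswith("user-"):
                continue
            user_id = stream[len("user-"):]
            for kind, group_id in events:
                group_events = self.streams.get(f"group-{group_id}", [])
                if kind == "MembershipAdded" and \
                        ("MemberAdded", user_id) not in group_events:
                    missing.append((user_id, group_id))
        return missing

    def repair(self):
        # Re-issue the second command for every half-finished membership.
        for user_id, group_id in self.incomplete_memberships():
            self.add_member_to_group(group_id, user_id)


store = MembershipStore()
store.add_membership_to_user("u1", "g1")
store.add_member_to_group("g1", "u1")
store.add_membership_to_user("u2", "g1")  # simulated crash before step 2

before = store.incomplete_memberships()
store.repair()
after = store.incomplete_memberships()
```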
-Chris

Dependent entities within same aggregate

Situation:
We have a classic Order with OrderLines. Each OrderLine has reference to the ProductId.
Each Product has its RelatedProduct. For example:
class Product {
    string Id;
    string Name;
    string RelatedProductId;
    decimal RelatedProductQuantity;
    // ...
}
There is a business rule that whenever Product is added to Order with new OrderLine then Product with id=RelatedProductId should also be added in a quantity=RelatedProductQuantity.
Questions:
How to keep this rule within the domain so it doesn't spill over to application service but at the same time keep Order aggregate clean in a sense not to poison it by injecting repository or any data-fetching thing?
Should we use domain service? And if so, can domain service have repository injected, prepare all the data, create OrderLine (for both, base and related products), fill in the aggregate and save it to repository?
If none of the above, what's the best way to model it?

There are two common patterns that you will see here:
1) Fetch a copy of the information in your application code, then pass that information to the domain model as an argument.
2) Pass the capability to fetch the information as an argument to the domain model.
The second option is your classic "domain service" approach, where you use a "stateless" instance to fetch a copy of "global" state.
But, with the right perspective you might recognize that the first approach is the same mechanism - only it's the application code, rather than the domain code, that fetches the copy of the information.
In both cases, it's still the domain model deciding what to do with the copy of the information, so that's all right.
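Both patterns can be sketched side by side in Python. This is an illustration only: the Order and Product shapes loosely mirror the question, the catalog dict stands in for a repository, and the sketch assumes the rule does not cascade (a related product's own related product is not added).

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import Callable, Optional


@dataclass(frozen=True)
class Product:
    id: str
    name: str
    related_product_id: Optional[str] = None
    related_product_quantity: Decimal = Decimal("0")


@dataclass
class Order:
    lines: list = field(default_factory=list)  # (product_id, quantity) pairs

    # Pattern 1: the application fetched the product and passes the data in.
    def add_product(self, product: Product, quantity: Decimal) -> None:
        self.lines.append((product.id, quantity))
        if product.related_product_id is not None:
            # The business rule lives here, inside the aggregate.
            self.lines.append((product.related_product_id,
                               product.related_product_quantity))

    # Pattern 2: the application passes in the capability to fetch.
    def add_product_by_id(self, product_id: str, quantity: Decimal,
                          find_product: Callable[[str], Product]) -> None:
        self.add_product(find_product(product_id), quantity)


CATALOG = {
    "printer": Product("printer", "Printer", "cable", Decimal("1")),
    "cable": Product("cable", "USB cable"),
}

order_a = Order()
order_a.add_product(CATALOG["printer"], Decimal("2"))

order_b = Order()
order_b.add_product_by_id("printer", Decimal("2"), CATALOG.__getitem__)
```

In both variants the Order decides what to do with the product data; the only difference is who performs the fetch.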
Possible tie breakers:
If the information you need to copy isn't local (ie: you are dealing with a distributed system, and the information isn't available in a local cache), then fetching that information will have failure modes, and you probably don't want to pollute the domain model with a bunch of code to handle that (in much the same way that you don't pollute your domain code with a bunch of database related concerns).
When it's hard to guess in advance which arguments are going to be passed to fetch the data, then it may make sense to let the domain code invoke that function directly. Otherwise, you end up with the application code asking the domain model for the arguments, and then passing the information back into the model, and this could even ping pong back and forth several times.
(Not that it can't be done: you can make it work - what's less clear is how happy you are going to be maintaining the code).
If you aren't sure... use the approach that feels more familiar.

CQRS to command or not to, that is the question

I am new to CQRS, but can see the value in this, so I am trying to apply this to a financial system that we are busy rebuilding.
Like I mentioned, this is a basic financial system with basic balance, withdraw, and deposit functionality.
I have withdraw & deposit commands. But I am struggling with balance.
According to the domain experts, they want to handle a balance inquiry as a transaction, with no financial implication (yet), on the client's behalf. So, when the client does a balance inquiry via the device, it creates a transaction, but also a balance query at the same time.
In the CQRS world, you distinguish between commands, which mutate state, and queries, which retrieve data in some way.
Apologies if my understanding here is flawed. Can someone point me in the correct direction?
EDIT:
Maybe let me put it this way. I was thinking of creating a CheckBalanceCommand that creates a transaction & inserts a BalanceCheckedEvent into the store. But then I would also need to create a CheckBalanceQuery to retrieve the actual balance from the read db.
I would need to invoke both in order to satisfy the balance request.

This is an interesting issue. Your business case is valid: some commands don't mutate aggregate/entity state, yet recording them and their resulting events is still important (e.g. for audit trails).
In order to support these cases, I'd introduce a base event type named IdentityEvent (the name is inspired by the identity values of mathematical operators: applying the operator with its identity value leaves the other operand unchanged). On issuing the corresponding command, derivatives of this event (e.g. BalanceCheckedEvent in your case) will be appended to the aggregate's event stream, and view projections may construct views from them as usual; however, their mutate method will not perform any actual mutation while reconstructing entities from the event stream.
The actual command processing takes place at the domain layer. One of your application services, at the application layer, receives the query request and processes it as usual. Additionally, before or after the query operation, the same application service may issue the command to the domain layer, on the aggregate root itself. That doesn't violate any principle: your read and write models are still separate, and the application service is just coordinating between the two.
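A small Python sketch of the IdentityEvent idea, under assumptions: the aggregate state is reduced to a single balance, and the Deposited event and the replay function are invented for the example. Only the IdentityEvent and BalanceCheckedEvent names come from the discussion above.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Deposited:
    amount: Decimal

    def mutate(self, balance: Decimal) -> Decimal:
        return balance + self.amount


@dataclass(frozen=True)
class IdentityEvent:
    """Base for events that are recorded but leave aggregate state unchanged."""

    def mutate(self, balance: Decimal) -> Decimal:
        return balance  # no mutation, like adding 0 or multiplying by 1


@dataclass(frozen=True)
class BalanceCheckedEvent(IdentityEvent):
    device_id: str


def replay(events) -> Decimal:
    """Rebuild the balance by folding every event over the state."""
    balance = Decimal("0")
    for event in events:
        balance = event.mutate(balance)
    return balance


stream = [
    Deposited(Decimal("100")),
    BalanceCheckedEvent("device-7"),
    Deposited(Decimal("25")),
]
balance = replay(stream)
# A projection can still build an audit view from the identity events.
audit_trail = [e for e in stream if isinstance(e, IdentityEvent)]
```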

This is not as rare as you would imagine. An additional valid business case is when a service provider runs a credit check on someone. Credit reporting companies actually store queries made against one's credit score, and use them to influence future credit scores. Of course, when I say that this isn't as rare as we imagine, I'm not attempting to normalize such practices (and we should push back to understand the real value something like this is offering to our product).
What I suggest though is to model this explicitly and not try to generalize it. This feature is probably driven by some business need, and you should model it as such. By this I mean that you should treat the service serving the reads as a separate service entirely, which can raise its own events for things that have happened, and design the rest of the system in a reactive way (i.e. responding to events generated by another BC/service).
As an example, you could have the service which serves the query fire a BalanceChecked event, which either the same service or another one could store in a stream for subsequent processing.
I would not suggest a command, because if you'll be replying with the data it's not as if someone can reject the command; it has already happened, someone already has the data.
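A minimal Python sketch of a query service that reports what happened as an event rather than accepting a command. The BalanceQueryService class, the dict used as a read db, and the publish callback are assumptions for illustration; in a real system the callback would hand the event to whatever stream or bus is in use.

```python
from datetime import datetime, timezone
from decimal import Decimal


class BalanceQueryService:
    """Hypothetical read-side service that raises its own events."""

    def __init__(self, read_db: dict, publish):
        self._read_db = read_db
        self._publish = publish  # callable that receives each event

    def get_balance(self, account_id: str) -> Decimal:
        balance = self._read_db[account_id]
        # The read has already happened, so this is a fact, not a request.
        self._publish({
            "type": "BalanceChecked",
            "account_id": account_id,
            "at": datetime.now(timezone.utc).isoformat(),
        })
        return balance


published = []
service = BalanceQueryService({"acc-1": Decimal("42.50")}, published.append)
result = service.get_balance("acc-1")
```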

CQRS design: nosql data view

This is a "language agnostic" question.
I started to study the CQRS pattern.
I have a simple question. Am I supposed to have two different storage layers: a relational one for the commands (MySQL etc.) and a NoSQL one (Mongo, Cassandra, etc.) for the queries?
Let me explain a little example:
1) As a user I want to insert a "Todo task"
Command: "Create Task", which will insert a new task into a database that has the User and the Todo tables.
2) As a user I'm able to see a list of created tasks
Query: "GetTasks", which will return a "view" with a collection of tasks taken from a NoSQL table named "UserTasks", which holds a user and a list of that user's created tasks.
Is this the right approach? I'm sorry if the language is poor, it's just a little example.
If it seems a good approach (again, don't consider details), what is the best way to keep the data stores up to date?
I'm thinking of raising an event like "TaskCreated", then taking the new task and inserting that information into the NoSQL storage.
Thanks!

I can't really understand what you're looking for, but typically a command would be something that results in side effects. Queries don't cause side effects. GetTasks wouldn't really be a command, but a query.
Your "CreateTask" would be a command, which would result in the task added to the relevant data store(s). Your GetTasks query would retrieve that information from a datastore. It doesn't really matter if you're using a SQL or NoSQL store for this.
The "CommandStore" is typically the store that has just enough data to enforce invariants. In your case, what data is required for that? Is some information required to decide whether or not a task can be registered? For example, say, you have a requirement that a user can have at most 3 "todo"s. In this case, a table in the "Command Store" storing (UserId, Todo Count) is enough. You could also use (UserId, [TodoId]) - ie. store a list of todo ids so that you can gain idempotence. All other information about the user and tasks would be query data, and would be in the query store.
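A Python sketch of the (UserId, [TodoId]) variant, assuming the example limit of 3 todos per user. The CommandStore class and TodoLimitExceeded exception are made up for illustration; the point is that the store holds only what the invariant needs, and that repeating a command with the same todo id is harmless.

```python
MAX_TODOS = 3  # the example invariant: at most 3 todos per user


class TodoLimitExceeded(Exception):
    pass


class CommandStore:
    """Holds just enough data to enforce the invariant."""

    def __init__(self):
        self._todo_ids = {}  # user_id -> set of todo ids

    def register_todo(self, user_id: str, todo_id: str) -> bool:
        """Return True if newly registered, False if this is a repeat."""
        ids = self._todo_ids.setdefault(user_id, set())
        if todo_id in ids:
            return False  # idempotent: the same command was seen before
        if len(ids) >= MAX_TODOS:
            raise TodoLimitExceeded(user_id)
        ids.add(todo_id)
        return True


store = CommandStore()
results = [store.register_todo("u1", t) for t in ("t1", "t2", "t3", "t1")]

try:
    store.register_todo("u1", "t4")
    rejected = False
except TodoLimitExceeded:
    rejected = True
```

Titles, descriptions, and due dates never appear here; they would live in the query store.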
Hope that makes sense.

While there are times when you may wish to store commands, you generally don't. Rather, a popular approach is to store the domain events that occur as a result of the commands. This is referred to as Event Sourcing. This would make 'STOREA' a store of events or, to put it another way, an event stream. 'STOREB' is typically referred to as the Read Model. It has a de-normalised structure optimised for read speed. It is kept up to date via de-normalisers which respond to specific events. A key point to note here is that there is often a lag between the event being raised and the read model being updated. This in my opinion is a good thing but needs to be thought about when designing the UI.
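The de-normaliser and the lag it introduces can be sketched in Python as below. Everything here is an assumption for illustration: a list stands in for the event stream, a dict for the read model, and catch_up is called by hand where a real system would run it in the background.

```python
class TaskListDenormaliser:
    """Keeps a per-user task list up to date from TaskCreated events."""

    def __init__(self, event_stream):
        self._events = event_stream
        self._position = 0    # how far into the stream we have read
        self.user_tasks = {}  # read model: user_id -> list of task titles

    def catch_up(self):
        for event in self._events[self._position:]:
            if event["type"] == "TaskCreated":
                self.user_tasks.setdefault(event["user_id"], []).append(
                    event["title"])
        self._position = len(self._events)

    def get_tasks(self, user_id):
        return list(self.user_tasks.get(user_id, []))


event_stream = []  # write side: append-only list of events
read_model = TaskListDenormaliser(event_stream)

# Command side: handling "Create Task" appends an event.
event_stream.append({"type": "TaskCreated", "user_id": "u1",
                     "title": "Buy milk"})

stale = read_model.get_tasks("u1")   # read before the de-normaliser has run
read_model.catch_up()
fresh = read_model.get_tasks("u1")   # read after it has caught up
```

The gap between `stale` and `fresh` is the lag the UI has to account for.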
For more info take a look at CQRS – A Step-by-Step Guide to the Flow of a typical Application
I hope that helps

Complex Finds in Domain Driven Design

I'm looking into converting part of a large existing VB6 system into .NET. I'm trying to use domain driven design, but I'm having a hard time getting my head around some things.
One thing that I'm completely stumped on is how I should handle complex find statements. For example, we currently have a screen that displays a list of saved documents, that the user can select and print off, email, edit or delete. I have a SavedDocument object that does the trick for all the actions, but it only has the properties relevant to it, and I need to display the client name that the document is for and their email address if they have one. I also need to show the policy reference that this document may have come from. The Client and Policy are linked to the SavedDocument but are their own aggregate roots, so are not loaded at the same time the SavedDocuments are.
The user is also allowed to specify several filters to reduce the list down. These too can be from properties that are stored on the SavedDocument or the Client and Policy.
I'm not sure how to handle this from a Domain driven design point of view.
Do I have a function on a repository that takes the filters and returns me a list of SavedDocuments, that I then have to turn into a different object or DTO, and fill with the additional client and policy information? That seems a little slow as I have to load all the details using multiple calls.
Do I have a function on a repository that takes the filters and returns me a list of SavedDocumentsForList objects that contain just the information I want? This seems the quickest but doesn't feel like I'm using DDD.
Do I load everything from their objects and do all the filtering and column selection in a service? This seems the slowest, but also appears to be very domain orientated.
I'm just really confused about how to handle these situations, and I'm not really seeing any other people asking questions about it, which makes me feel that I'm missing something.

Queries can be handled in a few ways in DDD. Sometimes you can use the domain entities themselves to serve queries. This approach can become cumbersome in scenarios such as yours when queries require projections of multiple aggregates. In this case, it is easier to use objects explicitly designed for the respective queries - effectively DTOs. These DTOs will be read-only and won't have any behavior. This can be referred to as the read-model pattern.
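A hedged Python sketch of such a read-model DTO for the saved-documents screen. The SavedDocumentListItem fields and the sample rows are invented for the example; the rows stand in for the result of a single joined query across documents, clients, and policies, so no aggregate is loaded just to render the list.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SavedDocumentListItem:
    """Read-only DTO shaped for the list screen; it has no behavior."""
    document_id: int
    title: str
    client_name: str
    client_email: Optional[str]
    policy_reference: Optional[str]


# Stand-in for rows returned by one joined query.
ROWS = [
    {"document_id": 1, "title": "Renewal letter", "client_name": "Ada Smith",
     "client_email": "ada@example.com", "policy_reference": "POL-17"},
    {"document_id": 2, "title": "Quote", "client_name": "Bob Jones",
     "client_email": None, "policy_reference": None},
]


def list_saved_documents(rows, client_name_contains: str = ""):
    """Apply a filter on a client property and map rows straight to DTOs."""
    needle = client_name_contains.lower()
    return [SavedDocumentListItem(**row) for row in rows
            if needle in row["client_name"].lower()]


everything = list_saved_documents(ROWS)
filtered = list_saved_documents(ROWS, client_name_contains="smith")
```

Edit, delete, and similar actions would still go through the SavedDocument aggregate, using the document_id carried by the DTO.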
