I have a Rental entity that is an aggregate root. Among other things it maintains a list of Allocations (chunks of time that are reserved).
How do I add a new allocation? Since Rental is aggregate root, any new allocation should go through it but it is impossible to say if a rental can be allocated, before we try to save the allocation in the database. Another user could have reserved it in the meantime. I'm guessing, I should use a Domain Service for this?
I would hate to have to inject anything every time I need a new Rental but what is the difference between injecting a Domain Service, instead of a Repository, other than the terminology being different?
When and why should I use a domain service?
You use a domain service to allow an aggregate to run queries. Tax calculation is an example that shows up from time to time. The aggregate passes some state to the calculator, the calculator reports the tax, and the aggregate decides what to do with that information (ignore it, reject the update that needs it, etc).
Running the query doesn't modify the domain service instance in any way, so you can repeat queries as often as you like without worrying that the calculations contaminate each other.
Think read only service provider.
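To make the tax example concrete, here is a minimal Java sketch. The names TaxCalculator, FlatRateTaxCalculator and Order, and the flat rate, are hypothetical and only there for illustration. The service is handed to the aggregate as a method argument, which is one way to avoid injecting anything when the aggregate is constructed.

```java
import java.math.BigDecimal;

// Hypothetical domain service: a stateless, read-only query.
interface TaxCalculator {
    BigDecimal taxFor(BigDecimal netAmount);
}

// Hypothetical implementation with a flat rate, for illustration only.
final class FlatRateTaxCalculator implements TaxCalculator {
    private final BigDecimal rate;

    FlatRateTaxCalculator(BigDecimal rate) {
        this.rate = rate;
    }

    @Override
    public BigDecimal taxFor(BigDecimal netAmount) {
        // Pure calculation: no field changes, so repeated calls cannot affect each other.
        return netAmount.multiply(rate);
    }
}

// Hypothetical aggregate: passes its own state to the service, then decides what to do.
final class Order {
    private final BigDecimal netAmount;
    private BigDecimal total;

    Order(BigDecimal netAmount) {
        this.netAmount = netAmount;
    }

    // The service arrives as an argument; nothing is injected at construction time.
    void close(TaxCalculator calculator, BigDecimal maxTotal) {
        BigDecimal candidate = netAmount.add(calculator.taxFor(netAmount));
        if (candidate.compareTo(maxTotal) > 0) {
            throw new IllegalStateException("total exceeds the allowed maximum");
        }
        total = candidate;
    }

    BigDecimal total() {
        return total;
    }
}
```

The aggregate stays in charge of the decision (accept or reject); the calculator only answers a question.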
Since Rental is aggregate root, any new allocation should go through it but it is impossible to say if a rental can be allocated, before we try to save the allocation in the database. Another user could have reserved it in the meantime. I'm guessing, I should use a Domain Service for this?
No - completely the wrong use case.
If an allocation is part of the Rental aggregate, then it's fine to have the Rental aggregate create allocations of its own. You don't need a service for that (you could, potentially, delegate the work to a factory if you like separation of concerns).
If "another user could have reserved that allocation in the meantime", then you have contention -- two users trying to change the same aggregate at the same time. This is normally managed in one of two ways.
Locking: you only let one user at a time modify the Rental aggregate. So in a data race, the loser has to wait for the winner to finish, then the aggregate can reject the loser's command because that particular allocation is already taken.
Optimistic concurrency: you allow both users to modify different copies of the aggregate at the same time, but save is only allowed if the original state is unchanged. Think "compare and swap"; the race is in the save, between these two instructions:
state.compareAndSwap(originalState, loserState)
state.compareAndSwap(originalState, winnerState)
The winner's compare and swap succeeds, but the loser's fails (because the stored state is now winnerState, not originalState), and so the loser's modification is rejected.
Either way, only one write to your database reserving the allocation is allowed.
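A minimal Java sketch of the optimistic variant, assuming an in-memory stand-in for the data store. RentalState, RentalStore and the string slot names are hypothetical; a real store would typically compare a version column instead of an object reference.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical immutable snapshot of a Rental's reserved slots.
final class RentalState {
    private final Set<String> allocations;

    RentalState(Set<String> allocations) {
        this.allocations = Set.copyOf(allocations);
    }

    // The aggregate's rule: a slot can only be reserved once.
    RentalState allocate(String slot) {
        if (allocations.contains(slot)) {
            throw new IllegalStateException("slot already reserved: " + slot);
        }
        Set<String> next = new HashSet<>(allocations);
        next.add(slot);
        return new RentalState(next);
    }

    boolean isReserved(String slot) {
        return allocations.contains(slot);
    }
}

// Stand-in for the data store: a save succeeds only if nobody saved since we loaded.
final class RentalStore {
    private final AtomicReference<RentalState> current =
            new AtomicReference<>(new RentalState(Set.of()));

    RentalState load() {
        return current.get();
    }

    boolean save(RentalState original, RentalState updated) {
        // compareAndSet compares references, which is enough for immutable snapshots.
        return current.compareAndSet(original, updated);
    }
}
```

When save returns false, the loser reloads the aggregate and retries the command; on the retry, allocate rejects the slot because it is now taken.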
If I understand you correctly, you're saying that in this case it would be okay to use a repository from inside the Rental domain entity?
No, you shouldn't need to - the allocation, being part of the Rental aggregate, gets created by the aggregate in memory, and first appears in your data store when the aggregate is saved.
Why use aggregates at all, if everything of consequence has to be extracted into surrounding code or factories?
Some of the answer here is separation of concerns - the primary concern of the aggregate is enforcing the business invariant: ensuring that creating an allocation with some specific state is consistent with everything else going on. The factory is responsible for ensuring that the created object is wired up correctly.
To use your example: the factory would have responsibility for creating the allocation in memory, but would not need to know anything about making sure that the allocation is unique. The rules to ensure that the allocation is unique are described and enforced by the aggregate.
Use a static factory method to create a Rental object.
public static class RentalFactory
{
    // A static class can only contain static members, so the factory method is static too.
    public static Rental CreateRental()
    {
        var allocationSvc = new RentalAllocationService();
        return new Rental(allocationSvc);
    }
}
Repositories should only be concerned about persistence to underlying store.
A domain service's primary concern is carrying out some behavior involving entities or value objects.
Assume that I have two aggregates: Vehicles and Drivers, And I have a rule that a vehicle cannot be assigned to a driver if the driver is on vacation.
So, my implementation is:
class Vehicle {
    public void assignDriver(Driver driver) {
        if (driver.isInVacation()) {
            // Unchecked exception, so the method needs no throws clause.
            throw new IllegalStateException("driver is on vacation");
        }
        // ....
    }
}
Is it ok to pass an aggregate to another one as a parameter? Am I doing anything wrong here?
I'd say your design is perfectly valid and reflects the Ubiquitous Language very well. There are several examples in the Implementing Domain-Driven Design book where an AR is passed as an argument to another AR.
e.g.
Forum#moderatePost: Post is not only provided to Forum, but modified by it.
Group#addUser: User provided, but translated to GroupMember.
If you really want to decouple, you could also do something like vehicle.assignDriver(driver.id(), driver.isInVacation()) or introduce some kind of intermediary VO that holds only the state from Driver needed to make an assignment decision.
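A minimal Java sketch of the intermediary value object idea. DriverAvailability and its fields are hypothetical names; the point is only that Vehicle sees the few facts it needs rather than the whole Driver aggregate.

```java
// Hypothetical value object carrying only what Vehicle needs to know about a Driver.
final class DriverAvailability {
    private final String driverId;
    private final boolean onVacation;

    DriverAvailability(String driverId, boolean onVacation) {
        this.driverId = driverId;
        this.onVacation = onVacation;
    }

    String driverId() {
        return driverId;
    }

    boolean isOnVacation() {
        return onVacation;
    }
}

final class Vehicle {
    private String assignedDriverId;

    // The rule is still enforced by Vehicle, but it no longer depends on the Driver aggregate.
    void assignDriver(DriverAvailability availability) {
        if (availability.isOnVacation()) {
            throw new IllegalStateException("driver is on vacation");
        }
        assignedDriverId = availability.driverId();
    }

    String assignedDriverId() {
        return assignedDriverId;
    }
}
```

The value object is a copy taken at some moment, so the decision can still be made on stale data.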
However, note that any decision made using external data is considered stale. For instance, what happens if the driver goes on vacation right after being assigned to a vehicle?
In such cases you may want to use exception reports (e.g. list all vehicles with an unavailable driver), flag vehicles for driver reassignment, etc. Eventual consistency could be achieved either through batch processing or messaging (event processing).
You could also seek to make the rule strongly consistent by inverting the relationship, so that Driver keeps the set of vehicleId values it drives. Then you could use a DB unique constraint to ensure the same vehicle doesn't have more than one driver assigned. You could also violate the rule of modifying only one AR per transaction and model the two-way relationship to protect both invariants in the model.
However, I'd advise you to think of the real-world scenario here. I doubt you can prevent a driver from going away. The system must reflect the real world, which is probably the book of record for that scenario, meaning the best you can do with strong consistency is probably to unassign a driver from all their vehicles while they're away. In that case, is it really important that vehicles get unassigned immediately in the same TX, or could a delay be acceptable?
In general, an aggregate should keep its own boundaries (to avoid data-load issues and transaction-scoping issues, check this page for example), and therefore only reference another aggregate by identity, e.g. assignDriver(id guid).
That means you would have to query the driver prior to invoking assignDriver, in order to perform the validation check:
class MyAppService {
    public void execute() {
        // Get driver...
        if (driver.isInVacation()) {
            // Unchecked exception, so the method needs no throws clause.
            throw new IllegalStateException("driver is on vacation");
        }
        // Get vehicle...
        vehicle.assignDriver(driver.id);
    }
}
Suppose you're in a microservices architecture:
you have a 'Driver Management' service and an 'Assignation Service', and you're not sharing code between them apart from technical libraries.
You'll naturally have 2 classes for 'Driver':
an aggregate in 'Driver Management', which holds the operations to manage the state of a driver,
and a value object in the 'Assignation Service', which contains only the information relevant for assignation.
This separation is harder to see and achieve when you're in a monolithic codebase.
I also agree with @plalx: there's more to enforcing the rule than a check on creation, and for that you could implement one of the solutions he suggested.
I encourage you to think in events. What happens when:
a driver has scheduled a vacation
he's back from vacation
he changes his vacation dates
Did you explore creating an Aggregate for Assignation?
I have seen information on rehydrating aggregate roots on SO, but I am posting this question because I did not find any information on SO about doing so within the context of an event-sourced framework.
Has a best practice been discovered or developed for how to rehydrate aggregate roots when operating on the command side of an application using the event sourcing and CQRS patterns,
OR is this still more of a “preference” among architects?
I have read through a number of blogs and watched a number of conference presentations on YouTube, and I seem to get different guidance depending on whom I am listening to.
On the one hand, I have found information stating fairly clearly that developers should create aggregates to hydrate themselves using “apply” methods on events obtained directly from the event store.
On the other hand, I have also seen in several places where presenters and bloggers have recommended rehydrating aggregate roots by submitting a query to the read side of the application. Some have suggested creating specific validation “buckets“ / projections on the read side to facilitate this.
Can anyone help point me in the right direction on discovering if there is a single best practice or if the answer primarily depends upon performance issues or some other issue I am not thinking about?
Hydrating Aggregates in an event sourced framework is a well-understood problem.
On the one hand, I have found information stating fairly clearly that
developers should create aggregates to hydrate themselves using
“apply“ methods on events obtained directly from the event store..
This is the prescribed way of handling it. There are various ways of achieving this, but I would suggest keeping any persistence logic (reading or writing events) outside of your Aggregate. One simple way is to expose a constructor that accepts domain events and then applies those events.
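A minimal Java sketch of that constructor, assuming a toy account domain. AccountEvent, Deposited, Withdrawn and Account are hypothetical names. Reading the events from the store is left to the caller, so no persistence logic ends up inside the aggregate.

```java
import java.util.List;

// Hypothetical domain events for an illustrative account aggregate.
interface AccountEvent {}

final class Deposited implements AccountEvent {
    final long amount;

    Deposited(long amount) {
        this.amount = amount;
    }
}

final class Withdrawn implements AccountEvent {
    final long amount;

    Withdrawn(long amount) {
        this.amount = amount;
    }
}

final class Account {
    private long balance;
    private int version;

    // Rehydration: the caller loads the history; the aggregate only folds over it.
    Account(List<AccountEvent> history) {
        for (AccountEvent event : history) {
            apply(event);
        }
    }

    private void apply(AccountEvent event) {
        if (event instanceof Deposited) {
            balance += ((Deposited) event).amount;
        } else if (event instanceof Withdrawn) {
            balance -= ((Withdrawn) event).amount;
        } else {
            throw new IllegalArgumentException("unknown event: " + event);
        }
        version++; // one increment per applied event
    }

    long balance() {
        return balance;
    }

    int version() {
        return version;
    }
}
```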
On the other hand, I have also seen in several places where presenters
and bloggers have recommended rehydrating aggregate roots by
submitting a query to the read side of the application. Some have
suggested creating specific validation “buckets“ / projections on the
read side to facilitate this.
You can use the concept of snapshots as a way of optimizing your reads. This will create a memoized version of your hydrated Aggregate. You can load this snapshot and then only apply events that were generated since the snapshot was created. In this case, your Aggregate can define a constructor that takes two parameters: an existing state (snapshot) and any remaining domain events that can then be applied to that snapshot.
Snapshots are just an optimization and should be considered as such. You can create a system that does not use snapshots and apply them once read performance becomes a bottleneck.
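A minimal Java sketch of the two-parameter constructor described above, assuming a toy inventory domain. StockAdjusted, InventorySnapshot and InventoryItem are hypothetical names; the snapshot records how many events it already contains so the caller knows where to resume reading.

```java
import java.util.List;

// Hypothetical event type, for illustration only.
final class StockAdjusted {
    final int delta;

    StockAdjusted(int delta) {
        this.delta = delta;
    }
}

// Hypothetical snapshot: memoized state plus the number of events folded into it.
final class InventorySnapshot {
    final int quantity;
    final int version;

    InventorySnapshot(int quantity, int version) {
        this.quantity = quantity;
        this.version = version;
    }
}

final class InventoryItem {
    private int quantity;
    private int version;

    // Start from the memoized state, then apply only the events recorded after it.
    InventoryItem(InventorySnapshot snapshot, List<StockAdjusted> eventsSinceSnapshot) {
        this.quantity = snapshot.quantity;
        this.version = snapshot.version;
        for (StockAdjusted event : eventsSinceSnapshot) {
            apply(event);
        }
    }

    private void apply(StockAdjusted event) {
        quantity += event.delta;
        version++;
    }

    InventorySnapshot toSnapshot() {
        return new InventorySnapshot(quantity, version);
    }

    int quantity() {
        return quantity;
    }

    int version() {
        return version;
    }
}
```

Replaying everything from an empty snapshot and replaying from a later snapshot should give the same state, which is why snapshots can be added later without changing the model.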
On the other hand, I have also seen in several places where presenters
and bloggers have recommended rehydrating aggregate roots by
submitting a query to the read side of the application
Snapshots are not really part of the read side of the application. Data on the read side exists to satisfy use cases within the application. Those can change based on requirements even if the underlying domain does not change. As such, you shouldn't use read side data in your domain at all.
Event sourcing has developed different styles over the years. I could divide all of those into two big categories:
an event stream represents one entity (an aggregate in case of DDD)
one (partitioned) event stream for a (sub)system
When you deal with one stream per (sub)system, you aren't able to rehydrate the write-side on the fly, it is physically impossible due to the number of events in that stream. Therefore, you would rely on the projected read-side to retrieve the current entity state. As a consequence, this read-side must be fully consistent.
When going with the DDD-flavoured event sourcing, there's a strong consensus in the community on how it should be done. The state of the aggregate (not just the root, but the whole aggregate) is restored by the command side before calling the domain model. You always restore using events. When snapshotting is enabled, snapshots are also stored as events in the aggregate snapshot stream, so you read the last one and all events from the snapshot version.
Concerning the Apply thing: you need to clearly separate the function that adds new events to the changes list (what you're going to save) from the functions that mutate the aggregate state when events are applied.
The first function is the one called Apply and the second one is often called When. So you call the Apply function in your aggregate code to build up the changelist. The When function is called when restoring the aggregate state from events when you read the stream, and also from the Apply function.
You can find a simplistic example of an event-sourced aggregate in my book repo: https://github.com/alexeyzimarev/ddd-book/blob/master/chapter13/src/Marketplace.Ads.Domain/ClassifiedAds/ClassifiedAd.cs
For example:
public void Publish(UserId userId)
    => Apply(
        new V1.ClassifiedAdPublished
        {
            Id = Id,
            ApprovedBy = userId,
            OwnerId = OwnerId,
            PublishedAt = DateTimeOffset.Now
        }
    );
And for the When:
protected override void When(object @event)
{
    switch (@event)
    {
        // more code here
        case V1.ClassifiedAdPublished e:
            ApprovedBy = UserId.FromGuid(e.ApprovedBy);
            State = ClassifiedAdState.Active;
            break;
        // and more here
    }
}
I would like to present a small scenario which is still at the paper stage and which, under DDD principles, seems a bit tedious to accomplish.
Let's say I have an application for hosting accounts management. Basically, the application is composed of several bounded contexts such as Web accounts management, FTP accounts management, Mail accounts management... each of them represented by its own AR (they can live standalone).
Now, let's imagine I want to provide a UI with an HTML form that is composed of one fieldset per bounded context, for instance to update limits and/or features. How exactly should I proceed to update all ARs without breaking the single-transaction-per-request principle? Can I create a kind of "outer" AR, let's say a ClientHostingProperties AR, which would hold references to the other ARs and update them as part of a single transaction, using its own repository? Or should I rather create an AR that emits messages for listeners provided by the bounded contexts to react to, in which case I should probably think about ES?
Thanks.
How exactly should I proceed to update all ARs without breaking the single-transaction-per-request principle?
You are probably looking for a process manager.
Basic sketch: persisting the details from the submitted form is a transaction unto itself (you are offered an opportunity to accrue business value; step 1 is to capture that opportunity).
That gives you a way to keep track of whether or not this task is "done": you compare the changes in the task to the state of the system, and fire off commands (to run in isolated transactions) to make changes.
Processes, in my mind, end up looking a lot like state machines. These commands are done, these commands are not done, these commands have failed: now what? Eventually you reach a state where there are no additional changes to be made, and this instance of the process is "done".
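As a rough Java sketch of that state machine, assuming one step per bounded context. UpdateLimitsProcess, its Status values and the context names are hypothetical; persistence of the process itself and the actual command dispatch are left out.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical process: tracks, per bounded context, whether its command has completed.
final class UpdateLimitsProcess {
    enum Status { PENDING, DONE, FAILED }

    private final Map<String, Status> steps = new LinkedHashMap<>();

    // Capturing the submitted form is its own transaction: every step starts as PENDING.
    UpdateLimitsProcess(List<String> contexts) {
        for (String context : contexts) {
            steps.put(context, Status.PENDING);
        }
    }

    // Commands that still need to run, each in its own isolated transaction.
    List<String> commandsToSend() {
        List<String> pending = new ArrayList<>();
        for (Map.Entry<String, Status> step : steps.entrySet()) {
            if (step.getValue() == Status.PENDING) {
                pending.add(step.getKey());
            }
        }
        return pending;
    }

    void markDone(String context) {
        steps.replace(context, Status.DONE);
    }

    void markFailed(String context) {
        steps.replace(context, Status.FAILED);
    }

    // "Now what?" for a failed step: here it is simply put back in the queue.
    void retry(String context) {
        steps.replace(context, Status.FAILED, Status.PENDING);
    }

    boolean isDone() {
        return steps.values().stream().allMatch(status -> status == Status.DONE);
    }
}
```

Retrying is only one possible answer to a failure; compensating the steps that already succeeded, or escalating to a human, would be other states in the same machine.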
Short answer: You don't.
An aggregate is a transactional boundary, which means that if you would update multiple aggregates in one "action", you'd have to use multiple transactions. The reason for an aggregate to be equivalent to one transaction is that this allows you to guarantee consistency.
This means that you have two options:
You can make your aggregate larger. Then you can actually guarantee consistency, but your ability to handle concurrent requests gets worse. So this is usually what you want to avoid.
You can live with the fact that it's two transactions, which means you are eventually consistent. If so, you usually use something such as a process manager or a flow to handle updating multiple aggregates. In its simplest form, a flow is nothing but a simple if this event happens, run that command rule. In its more complex form, it has its own state.
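In its simplest form, such a flow could look like the following Java sketch. WebLimitsUpdated, UpdateMailLimits and HostingLimitsFlow are hypothetical names, and the command bus is reduced to a plain Consumer for illustration.

```java
import java.util.function.Consumer;

// Hypothetical event raised after the first aggregate's transaction has committed.
final class WebLimitsUpdated {
    final String clientId;

    WebLimitsUpdated(String clientId) {
        this.clientId = clientId;
    }
}

// Hypothetical command aimed at a second aggregate.
final class UpdateMailLimits {
    final String clientId;

    UpdateMailLimits(String clientId) {
        this.clientId = clientId;
    }
}

// Stateless flow: "if this event happens, run that command".
final class HostingLimitsFlow {
    private final Consumer<UpdateMailLimits> commandBus;

    HostingLimitsFlow(Consumer<UpdateMailLimits> commandBus) {
        this.commandBus = commandBus;
    }

    // The dispatched command is expected to run later, in its own transaction.
    void on(WebLimitsUpdated event) {
        commandBus.accept(new UpdateMailLimits(event.clientId));
    }
}
```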
Hope this helps 😊
We are using CQRS with Event Sourcing.
In our application we can add resources (a business term for a single item) from the UI, and we send a command accordingly to add each resource.
So we have x resources present in the application, which were added previously.
Now we have one special type of resource (I am calling it SpecialResource).
When we add this SpecialResource, it needs to be linked with all existing resources in the application.
Linked means this SpecialResource should hold a list of the ids (GUIDs) of the existing resources.
The solution we tried was to get all resource ids in the application before adding the special resource (i.e. before firing the AddSpecialResource command),
assign this list to SpecialResource, and then send the AddSpecialResource command.
But we are not supposed to do so, because as per CQRS a command should not query.
I.e. a command can't depend upon a query, as the query can have stale records.
How can we achieve this business scenario without querying existing records in the application?
But we are not supposed to do so, because as per CQRS a command should not query. I.e. a command can't depend upon a query, as the query can have stale records.
This isn't quite right.
"Commands" run queries all the time. If you are using event sourcing, in most cases your commands are queries -- "if this command were permitted, what events would be generated?"
The difference between this, and the situation you described, is the aggregate boundary, which in an event sourced domain is a fancy name for the event stream. An aggregate is allowed to run a query against its own event stream (which is to say, its own state) when processing a command. It's the other aggregates (event streams) that are out of bounds.
In practical terms, this means that if SpecialResource really does need to be transactionally consistent with the other resource ids, then all of that data needs to be part of the same aggregate, and therefore part of the same event stream, and everything from that point is pretty straightforward.
So if you have been modeling the resources with separate streams up to this point, and now you need SpecialResource to work as you have described, then you have a fairly significant change to make to your domain model.
The good news: that's probably not your real requirement. Consider what you have described so far - if resourceId:99652 is created one millisecond before SpecialResource, then it should be included in the state of SpecialResource, but if it is created one millisecond after, then it shouldn't. So what's the cost to the business if the resource created one millisecond before the SpecialResource is missed?
Because, a priori, that doesn't sound like something that should be too expensive.
More commonly, the real requirement looks something more like "SpecialResource needs to include all of the resource ids created prior to close of business", but you don't actually need SpecialResource until 5 minutes after close of business. In other words, you've got an SLA here, and you can use that SLA to better inform your command.
How can we achieve this business scenario without querying existing records in application?
Turn it around; run the query, copy the results of the query (the resource ids) into the command that creates SpecialResource, then dispatch the command to be passed to your domain model. The CreateSpecialResource command includes within it the correct list of resource ids, so the aggregate doesn't need to worry about how to discover that information.
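A minimal Java sketch of that ordering: query first, copy the ids into the command, then dispatch. The class shapes (CreateSpecialResource, SpecialResource, SpecialResourceAppService) are hypothetical, and the read-model query is represented by a plain list passed in by the caller.

```java
import java.util.List;
import java.util.UUID;

// Hypothetical command: carries the ids found by the query that ran before dispatch.
final class CreateSpecialResource {
    final UUID specialResourceId;
    final List<UUID> linkedResourceIds;

    CreateSpecialResource(UUID specialResourceId, List<UUID> linkedResourceIds) {
        this.specialResourceId = specialResourceId;
        // Defensive copy: the command is a fixed record of what the query returned.
        this.linkedResourceIds = List.copyOf(linkedResourceIds);
    }
}

// Hypothetical aggregate: uses the ids in the command and never looks at other streams.
final class SpecialResource {
    private final UUID id;
    private final List<UUID> linkedResourceIds;

    SpecialResource(CreateSpecialResource command) {
        this.id = command.specialResourceId;
        this.linkedResourceIds = command.linkedResourceIds;
    }

    UUID id() {
        return id;
    }

    List<UUID> linkedResourceIds() {
        return linkedResourceIds;
    }
}

// Hypothetical application service: the query result goes into the command.
final class SpecialResourceAppService {
    SpecialResource create(List<UUID> idsFromReadModel) {
        CreateSpecialResource command =
                new CreateSpecialResource(UUID.randomUUID(), idsFromReadModel);
        return new SpecialResource(command);
    }
}
```

Resources created after the query ran are simply not in the command, which is the staleness the SLA discussion above is about.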
It is hard to tell what your database is capable of, but the most consistent way of adding a "snapshot" is at the database layer, because there is no other common place in pure CQRS for that. (There are some articles on doing CQRS+ES snapshots, if that is what you are actually trying to achieve with SpecialResource.)
One way may be to materialize the list of ids using some kind of stored procedure on arrival of the AddSpecialResource command (at the database).
Another way is to capture "all existing resources (up to the moment)" with some marker (timestamp), never delete old resources, and add a "SpecialResource" condition to the queries that use the SpecialResource data.
Ok, one more option (depending on your case at hand) is to always have the list of ids handy from the same query that served the UI. This way the definition of "all resources" changes to "all resources as seen by the user (at some moment)".
I do not think any computer system is ever going to be 100% consistent simply because life does not, and can not, work like this. Apparently we are all also living in the past since it takes time for your brain to process input.
The point is that you do the best you can with the information at hand but ensure that your system is able to smooth out any edges. So if you need to associate one or two resources with your SpecialResource then you should be able to do so.
So even if you could associate your SpecialResource with all existing entries in your data store, what is to say that there isn't another resource, not yet entered into the system, that also needs to be associated?
It all, as usual, will depend on your specific use-case. This is why process managers, along with their state, enable one to massage that state until the process can complete.
I hope I didn't misinterpret your question :)
You can do two things in order to solve that problem:
Make a distinction between the write model and the read model. You know what a read model is, right? A "write model", in contrast, is a combination of data structures and behaviors that is just enough to enforce all invariants and generate consistent event(s) as a result of every executed command.
Don't take the rule that states "Event Store is a single source of truth" too literally. Consider the following interpretation: ES is a single source of ALL truth for your application; however, for each specific command you can create "write models" which will provide just enough "truth" to make this command consistent.
I have recently come across a question about multi-threading. I was given a situation where there is a variable number of cars constantly changing their locations. There are also multiple users posting requests to get the location of any car at any moment. What data structure would handle this situation, and why?
You could use a mutex (one per car).
Lock: before changing location of the associated car
Unlock: after changing location of the associated car
Lock: before getting location of the associated car
Unlock: after done doing work that relies on that location being up to date
I'd answer with:
Try to make threading an external concept to your system, yet make the system as modular and encapsulated as possible at the same time. That allows adding concurrency at a later phase at low cost, and if the solution happens to work nicely in a single thread (say by making it event-loop-based), no time will have been wasted.
There are several ways to do this. Which way you choose depends a lot on the number of cars, the frequency of updates and position requests, the expected response time, and how accurate (up to date) you want the position reports to be.
The easiest way to handle this is with a simple mutex (lock) that allows only one thread at a time to access the data structure. Assuming you're using a dictionary or hash map, your code would look something like this:
Map Cars = new Map(...)
Mutex CarsMutex = new Mutex(...)
Location GetLocation(carKey)
{
    acquire mutex
    result = Cars[carKey].Location
    release mutex
    return result
}
You'd do that for Add, Remove, Update, etc. Any method that reads or updates the data structure would require that you acquire the mutex.
If the number of queries far outweighs the number of updates, then you can do better with a reader/writer lock instead of a mutex. With an RW lock, you can have an unlimited number of readers, OR you can have a single writer. With that, querying the data would be:
acquire reader lock
result = Cars[carKey].Location
release reader lock
return result
And Add, Update, and Remove would be:
acquire writer lock
do update
release writer lock
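In Java, the reader/writer pseudocode above could be sketched like this with ReentrantReadWriteLock from the standard library. Location and CarRegistry are hypothetical names chosen for the example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical immutable location type.
final class Location {
    final double latitude;
    final double longitude;

    Location(double latitude, double longitude) {
        this.latitude = latitude;
        this.longitude = longitude;
    }
}

final class CarRegistry {
    private final Map<String, Location> cars = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    // Any number of readers may hold the read lock at the same time.
    Location getLocation(String carKey) {
        lock.readLock().lock();
        try {
            return cars.get(carKey);
        } finally {
            lock.readLock().unlock();
        }
    }

    // A writer has exclusive access; it also covers adding a new car.
    void updateLocation(String carKey, Location location) {
        lock.writeLock().lock();
        try {
            cars.put(carKey, location);
        } finally {
            lock.writeLock().unlock();
        }
    }

    void remove(String carKey) {
        lock.writeLock().lock();
        try {
            cars.remove(carKey);
        } finally {
            lock.writeLock().unlock();
        }
    }
}
```

Releasing in a finally block keeps the lock from leaking if the guarded code throws.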
Many runtime libraries have a concurrent dictionary data structure already built in. .NET, for example, has ConcurrentDictionary. With those, you don't have to worry about explicitly synchronizing access with a Mutex or RW lock; the data structure handles synchronization for you, either with a technique similar to that shown above, or by implementing lock-free algorithms.
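Java's counterpart is ConcurrentHashMap. A minimal sketch, with Position and ConcurrentCarRegistry as hypothetical names:

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical immutable position; immutability means readers never see a half-written value.
final class Position {
    final double lat;
    final double lon;

    Position(double lat, double lon) {
        this.lat = lat;
        this.lon = lon;
    }
}

final class ConcurrentCarRegistry {
    // The map synchronizes internally, so no explicit lock is needed for single operations.
    private final ConcurrentMap<String, Position> cars = new ConcurrentHashMap<>();

    void update(String carKey, Position position) {
        cars.put(carKey, position);
    }

    Position locate(String carKey) {
        return cars.get(carKey);
    }

    void remove(String carKey) {
        cars.remove(carKey);
    }
}
```

Each call is thread-safe on its own; a sequence of calls that must be atomic together would still need extra coordination.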
As mentioned in comments, a relational database can handle this type of thing quite easily and can scale to a very large number of requests. Modern relational databases, properly constructed and with sufficient hardware, are surprisingly fast and can handle huge amounts of data with very high throughput.
There are other, more involved, methods that can increase throughput in some situations depending on what you're trying to optimize. For example, if you're willing to have some latency in reported position, then you could have position requests served from a list that's updated once per minute (or once every five minutes). So position requests are fulfilled immediately with no lock required from a static copy of the list that's updated once per minute. Updates are queued and once per minute a new list is created by applying the updates to the old list, and the new list is made available for requests.
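A rough Java sketch of that last idea, assuming a single background thread calls refresh on a schedule (for example via a ScheduledExecutorService, not shown). SnapshotCarRegistry is a hypothetical name, and positions are plain strings to keep it short.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;

final class SnapshotCarRegistry {
    // One queued position update; the string position is a placeholder for a real type.
    private static final class Update {
        final String carKey;
        final String position;

        Update(String carKey, String position) {
            this.carKey = carKey;
            this.position = position;
        }
    }

    private final ConcurrentLinkedQueue<Update> pending = new ConcurrentLinkedQueue<>();

    // Readers always see a complete, immutable map; volatile publishes each new one.
    private volatile Map<String, String> snapshot = Map.of();

    // Writers only enqueue; they never touch the map that readers use.
    void enqueueUpdate(String carKey, String position) {
        pending.add(new Update(carKey, position));
    }

    // No lock: served from the current snapshot, which may be one interval old.
    String getLocation(String carKey) {
        return snapshot.get(carKey);
    }

    // Intended to be called from a single scheduler thread, e.g. once per minute.
    void refresh() {
        Map<String, String> next = new HashMap<>(snapshot);
        Update update;
        while ((update = pending.poll()) != null) {
            next.put(update.carKey, update.position);
        }
        snapshot = Map.copyOf(next);
    }
}
```

The trade-off is exactly the one described above: reads never block, but a reported position can lag by up to one refresh interval.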
There are many different ways to solve your problem.