I recently had a discussion with a co-worker who insisted that, in Domain-Driven Design, an entity should not have behavior that does not modify its state. I have never heard of this limitation. Is it a valid DDD rule?
To give some context (a simplified scenario): in our domain we have a Computer entity on which you can start Processes. Our integration layer actually delegates this to a remote physical computer and starts a process there.
So, should StartProcess be a behavior of the Computer entity? Or should it live in a Domain Service, since it does not affect the state of the Computer entity directly? (It modifies the state indirectly: once the process is over, data is synchronized back to our system.)
To me, the Entity is the natural place for it, as it follows the ubiquitous language, but I am wondering whether anyone has good reasons against it (or other reasons for it).
IMO an entity behavior does not need to modify state, but at the very least should emit an event. In this case, the event would be something like ProcessStarted. CQRS/event-sourcing views aggregates essentially as command handlers - they handle commands and emit events. State is made explicit when required for behavior or when denormalized for query purposes.
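A minimal sketch of that idea in Python, assuming a hypothetical ProcessStarted event shape and a pending_events list that some integration layer (not shown) would drain. None of these names come from the question; they are only for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class ProcessStarted:
    """Domain event (hypothetical shape): a process was requested on a computer."""
    computer_id: str
    command_line: str


@dataclass
class Computer:
    """Entity whose behavior records an event rather than changing business state."""
    computer_id: str
    pending_events: List[ProcessStarted] = field(default_factory=list)

    def start_process(self, command_line: str) -> ProcessStarted:
        if not command_line.strip():
            raise ValueError("command_line must not be empty")
        event = ProcessStarted(self.computer_id, command_line)
        # The entity only records the fact; an integration layer (not shown)
        # would react to the event and start the process on the remote machine.
        self.pending_events.append(event)
        return event


computer = Computer("pc-42")
event = computer.start_process("backup --full")
```

The behavior stays on the entity, in line with the ubiquitous language, while the remote call itself stays outside the domain model.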
I have flicked through a few popular Event Sourcing frameworks written in a variety of common languages, and I got the impression that all of them affect the domain model to a very high degree. As far as I understand ES is just an infrastructure concern - a way of persisting aggregate state. Of course, it facilitates message-driven inter-context integration, but from the core domain's point of view that is negligible. I consider commands and events to be part of the domain itself, so it looks perfectly fine that an aggregate creates events (but not publishes them) or handles commands.
The problem is that all of DDD building blocks tend to be polluted by ES framework. Events must inherit from some base class. Aggregates at least are supposed to implement foreign interfaces. I wonder if domain models should be even aware of using ES approach within an application. In my opinion, even the necessity of providing apply() methods indicates that another layer is shaping our domain.
How do you approach this issue in your projects?
My answer applies only when CQRS is involved (write and read models are split and they communicate using domain events).
As far as I understand ES is just an infrastructure concern - a way of persisting aggregate state
Event sourcing is indeed an infrastructure concern, a kind of repository, but event-based Aggregates are not. I consider them an architectural style, different from the classical one.
An Aggregate that, in reaction to a command, generates zero or more domain events, which are then applied to itself to build the internal (private) state it uses to decide what events to generate in the future, is just a different way of thinking about and designing an Aggregate. This is a perfectly valid style, alongside the classical style (which uses only objects, not events) and the functional programming style.
Event sourcing just means that every time a command reaches an Aggregate, its entire internal state is rebuilt from its past events instead of being loaded from a flat persistence. Of course there are other huge advantages (!), but they do not affect the design of an Aggregate.
... but not publishes them ...
I like the frameworks that permit us to just return (or better yield - Aggregate's command methods are just generators!) the events.
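To make the "command methods are just generators" remark concrete, here is a rough Python sketch. The Account aggregate, its two events and the handle() helper are all made up for illustration; a real framework would do the rebuilding and collecting differently.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Iterator, List


@dataclass(frozen=True)
class AccountOpened:
    account_id: str


@dataclass(frozen=True)
class MoneyDeposited:
    account_id: str
    amount: int


class Account:
    """Event-based aggregate: private state is built only by applying events."""

    def __init__(self) -> None:
        self._opened = False
        self._balance = 0

    # Command methods are generators: they check invariants, then yield events.
    def open(self, account_id: str) -> Iterator[object]:
        if self._opened:
            raise ValueError("account already opened")
        yield AccountOpened(account_id)

    def deposit(self, account_id: str, amount: int) -> Iterator[object]:
        if not self._opened:
            raise ValueError("account is not open")
        if amount <= 0:
            raise ValueError("amount must be positive")
        yield MoneyDeposited(account_id, amount)

    def apply(self, event: object) -> None:
        if isinstance(event, AccountOpened):
            self._opened = True
        elif isinstance(event, MoneyDeposited):
            self._balance += event.amount


def handle(history: Iterable[object],
           command: Callable[[Account], Iterator[object]]) -> List[object]:
    """Rebuild the aggregate from its history, run the command, collect new events."""
    account = Account()
    for past in history:
        account.apply(past)
    new_events = []
    for event in command(account):
        account.apply(event)  # keep state current for multi-event commands
        new_events.append(event)
    return new_events


history = [AccountOpened("acc-1")]
new_events = handle(history, lambda a: a.deposit("acc-1", 50))
```

Note that the aggregate never publishes anything; the caller of handle() decides what to do with the returned events.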
Events must inherit from some base class
It's sad that some frameworks require that, but it is not necessary. In general, a framework needs some means of detecting an event class. However, it can be implemented to detect events by means other than marker interfaces. For example, the client (as in YOU) could provide a filter method that rejects non-event classes.
However, there is one thing that I couldn't avoid in my framework (yes, I know, I'm guilty, I have one): the Command interface with only one method: getAggregateId.
Aggregates at least are supposed to implement foreign interfaces.
Again, as with events, this is not a necessity. A framework could be given a custom, client-supplied function that applies events to aggregates, or a convention could be used (e.g. all event-applier methods are named applyEventClassNameOrType).
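As an illustration of both points (a client-supplied event filter and a naming convention instead of foreign interfaces), here is a small Python sketch. The Cart aggregate, its events and the apply_<EventClassName> convention are assumptions chosen for the example, not the API of any particular framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ItemAdded:
    sku: str


@dataclass(frozen=True)
class ItemRemoved:
    sku: str


class Cart:
    """Plain domain class: no framework base class, no framework interface."""

    def __init__(self) -> None:
        self.skus = []

    def apply_ItemAdded(self, event: ItemAdded) -> None:
        self.skus.append(event.sku)

    def apply_ItemRemoved(self, event: ItemRemoved) -> None:
        self.skus.remove(event.sku)


def is_event(obj: object) -> bool:
    """Client-supplied filter: here, a simple whitelist of event classes."""
    return isinstance(obj, (ItemAdded, ItemRemoved))


def apply_by_convention(aggregate: object, event: object) -> None:
    """Framework side: look up 'apply_<EventClassName>' on the aggregate by name."""
    if not is_event(event):
        raise TypeError(f"not an event: {event!r}")
    handler = getattr(aggregate, f"apply_{type(event).__name__}", None)
    if handler is None:
        raise LookupError(f"no applier for {type(event).__name__}")
    handler(event)


cart = Cart()
for e in [ItemAdded("A"), ItemAdded("B"), ItemRemoved("A")]:
    apply_by_convention(cart, e)
```

The domain classes import nothing from the "framework" functions; only the method names tie them together.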
I wonder if domain models should be even aware of using ES approach within an application
Of ES, no; of being event-based, YES, so the apply method must still exist.
As far as I understand ES is just an infrastructure concern - a way of persisting aggregate state.
No, events are really core to the domain model.
Technically, you could store diffs in a domain agnostic way. For example, you could look at an aggregate and say "here is the representation before the change, here is the representation after, we'll compute the difference and store that".
The difference between patches and events is the fact that you switch from a domain agnostic spelling to a domain specific spelling. Doing that is normally going to require being intimate with the domain model itself.
The problem is that all of DDD building blocks tend to be polluted by ES framework.
Yup, there's a lot of crap framework in the examples you find in the wild. Sturgeon's Law at work.
Thinking about the domain model from a functional perspective can help a lot. At its core, the most general form of the model is a function that accepts current state as an input, and returns a list of events as the output.
List<Event> change(State current)
From there, if you want to save current state, you just wrap this function in something that knows how to do the fold
State current = ...
List<Event> events = change(current)
State updated = State.fold(current, events)
Similarly, you can get current state by folding over the previous history
List<Event> savedHistory = ...
State current = State.reduce(savedHistory)
List<Event> events = change(current)
State updated = State.fold(current, events)
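For anyone who wants to run this, a rough Python version of the same shape follows. The Deposited event, the balance field and the extra amount argument to change() are assumptions added so there is something concrete to fold; the snippets above leave all of that open.

```python
from dataclasses import dataclass, replace
from functools import reduce
from typing import List


@dataclass(frozen=True)
class Deposited:
    amount: int


@dataclass(frozen=True)
class State:
    balance: int = 0


def evolve(state: State, event: Deposited) -> State:
    """Apply one event to a state, returning a new state."""
    return replace(state, balance=state.balance + event.amount)


def fold(state: State, events: List[Deposited]) -> State:
    """Left fold of events over a starting state."""
    return reduce(evolve, events, state)


def change(current: State, amount: int) -> List[Deposited]:
    """The domain model: current state (plus a command argument) in, events out."""
    if amount <= 0:
        return []
    return [Deposited(amount)]


saved_history = [Deposited(10), Deposited(5)]
current = fold(State(), saved_history)   # state rebuilt from history
events = change(current, 7)              # decide what happened
updated = fold(current, events)          # state after the new events
```

Folding the new events onto the current state gives the same result as folding the whole extended history from an empty state, which is why either persisted representation can sit behind the same model.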
Another way of saying the same thing; the "events" are already there in your (not event sourced) domain model -- they are just implicit. If there is business value in tracking those events, then you should replace the implementation of your domain model with one that makes those events explicit. Then you can decide which persisted representation to use independent of the domain model.
Core of my problem is that domain Event inherits from framework Event and aggregate implements some foreign interface (from framework). How to avoid this?
There are a couple of possibilities.
1) Roll your own: take a close look at the framework -- what is it really buying you? If your answer is "not much", then maybe you can do without it.
From what I've seen, the "win" of these frameworks tends to be in taking a heterogeneous collection of events and managing the routing for you. That's not nothing -- but it's a bit magic, and you might be happier having that code explicit, rather than relying on implicit framework magic.
2) Suck it up: if the framework is unobtrusive, then it may be more practical to accept the tradeoffs that it imposes and live with them. To some degree, event frameworks are like object relational mappers or databases; sure, in theory you should be able to change them out freely. In practice? How often do you derive benefit from the investment in that flexibility?
3) Interfaces: if you squint a little bit, you can see that your domain behaviors don't usually depend on in memory representations, but instead on the algebra of the domain itself.
For example, in the domain model, we deposit Money into an Account updating its Balance. We don't typically care whether those are integers, or longs, or floats, or JSON documents. We can satisfy the model with any implementation that satisfies the constraints of the algebra.
So you can use the framework to provide the implementation (which also happens to have all the hooks the framework needs); the behavior just interacts with the interface it defined itself.
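A small Python sketch of what "the behavior just interacts with the interface it defined itself" could look like, using typing.Protocol. The Money protocol, its two methods and the Cents implementation are invented for the example.

```python
from dataclasses import dataclass
from typing import Protocol, TypeVar

M = TypeVar("M", bound="Money")


class Money(Protocol):
    """The algebra the domain needs: money can be added and checked for sign."""

    def plus(self: M, other: M) -> M: ...

    def is_negative(self) -> bool: ...


def deposit(balance: M, amount: M) -> M:
    """Domain behavior written only against the Money interface."""
    if amount.is_negative():
        raise ValueError("cannot deposit a negative amount")
    return balance.plus(amount)


@dataclass(frozen=True)
class Cents:
    """One possible implementation; a framework-provided type could stand in."""
    value: int

    def plus(self, other: "Cents") -> "Cents":
        return Cents(self.value + other.value)

    def is_negative(self) -> bool:
        return self.value < 0


new_balance = deposit(Cents(100), Cents(25))
```

Cents never inherits from Money; it only has to satisfy the same operations, so the implementation can be swapped without touching deposit().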
In a strongly typed implementation, this can get really twisty. In Java, for instance, if you want the strong type checks you need to be comfortable with the magic of generics and type erasure.
The real answer to this is that DDD is overrated. It is not true that you have to have one model to rule them all. You may have different views on the state of your world, depending on your current needs. One part of the application has one view, another part - completely different view.
To put it another way, your model is not "what is", but "what happened so far". The actual data model of your application is the event stream itself. Everything else you derive from there.
My domain is about Program Management. I have a Program (Aggregate Root) that must have a Customer (Aggregate Root). So I require a CustomerID when creating a new Program, as I have read that aggregates should only refer to other aggregates by identity.
Here are my business rules:
1) Customers can become active and inactive over time.
2) If a Customer is inactivated for some reason, all programs associated with that Customer should also be inactivated.
3) A Program cannot be activated if its Customer is inactive.
Rules #1 & #2 I have implemented. It's #3 that is stumping me.
I can think of 3 solutions:
1) Program holds a reference to the Customer aggregate.
2) Introduce a domain service that checks if the Customer is active and pass it to Program.Activate(CustomerActiveCheckService service).
3) Have the application service look up the Customer and pass it to Program.Activate(Customer customer).
Which is the best solution?
Update
I see both points of view made by #ConstaninGALBENU and #plalx, and I want to suggest a compromise. Can I create a CustomerStatusChecker service? The method would have the following signature: CustomerStatus CheckStatus(CustomerID id); I could then pass Program the service like so: Program.Activate(CustomerStatusChecker service);
Are there any problems with this design?
Which is the best solution?
There isn't a best solution; there are trade offs.
But one possible solution that is consistent with requirements #2 and #3 is that your existing model is wrong -- that Program entities are not isolated aggregates, but are part of the Customer entity, and therefore should be controlled by the same aggregate root.
Hints that this might be the case: that the life cycle of a Program fits within the life cycle of a Customer; that Programs don't normally migrate from one Customer to another, that there are limits to the count of active programs per customer.
Another possibility is that the requirements are "wrong". One way of exploring this is to review whether active/inactive is a decision made by the model, or if it is a decision made somewhere else and reported to the model. Another is to examine the cost to the business if this "rule" is violated.
If the model doesn't find out about the customer right away, or it is an inexpensive problem, then you probably have some room to detect the conflict and report it to a human, rather than trying to have the model do all of the work (See: Greg Young, Stop Over Engineering).
In these cases, having the main code path take a good guess, and implementing an alternative path that operators can use to fix the mistakes, is fine.
In choosing between solution #2 and #3 (I don't like #1 at all), I encourage keeping I/O actions out of the model. So unless you already have the latest version of the Customer in memory, I'm not fond of the domain service as a choice. Passing in a copy of the customer state to the domain model keeps the I/O concerns in the application component, where they belong (see Boundaries, by Gary Bernhardt, for more on this idea).
Solution 1: it breaks the rule about not holding references to other aggregate instances. That rule ensures that only one Aggregate is modified in a transaction. If you need to modify multiple aggregates in a single transaction then your design is definitely wrong.
Solution 2: I really don't like injecting services inside aggregates. My aggregates are pure functions with no touching of the outside world (I/O, repositories or the like).
Solution 3: is somewhat equivalent to 1, even if it is only a temporary reference (Program could call command methods on Customer, thus modifying Customer in the same transaction boundary as Program).
My solution: make that check inside the Application service, before the call to Program.activate(), or pass a customerStatus to Program.activate() and let the Program aggregate decide whether it throws an exception or emits events.
Update:
The idea is that you should pass only read-only/immutable data to the Program AR to ensure that it does not modify other ARs in its transactional boundary. Also, we should not make Program dependent on what it does not need, like the entire Customer AR.
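Roughly, in Python, that could look like the sketch below. The in-memory dictionaries stand in for repositories, and ActivateProgramService, CustomerStatus and ProgramActivated are names made up for the example.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, List


class CustomerStatus(Enum):
    ACTIVE = "active"
    INACTIVE = "inactive"


@dataclass(frozen=True)
class ProgramActivated:
    program_id: str


class Program:
    def __init__(self, program_id: str, customer_id: str) -> None:
        self.program_id = program_id
        self.customer_id = customer_id  # reference by identity only
        self.active = False

    def activate(self, customer_status: CustomerStatus) -> List[ProgramActivated]:
        """Decide using an immutable value; no Customer aggregate, no I/O."""
        if customer_status is not CustomerStatus.ACTIVE:
            raise ValueError("cannot activate a program for an inactive customer")
        if self.active:
            return []  # already active: nothing new happened
        self.active = True
        return [ProgramActivated(self.program_id)]


class ActivateProgramService:
    """Application service: performs the lookups, then calls the aggregate."""

    def __init__(self, programs: Dict[str, Program],
                 customer_statuses: Dict[str, CustomerStatus]) -> None:
        self._programs = programs  # stand-ins for repositories
        self._customer_statuses = customer_statuses

    def activate(self, program_id: str) -> List[ProgramActivated]:
        program = self._programs[program_id]
        status = self._customer_statuses[program.customer_id]
        return program.activate(status)


programs = {"p1": Program("p1", "c1"), "p2": Program("p2", "c2")}
statuses = {"c1": CustomerStatus.ACTIVE, "c2": CustomerStatus.INACTIVE}
service = ActivateProgramService(programs, statuses)
events = service.activate("p1")
```

Program only ever sees an enum value, so it has no way to modify the Customer, and it can be tested without any repository.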
Also, if the architecture is event-driven, then by listening to the right events emitted by Customer you could keep the Program AR in sync: you make it "non activable" if it is not already activated, or you deactivate it if it is already activated, for example by using a Saga.
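A very reduced Python sketch of such a Saga, assuming a hypothetical CustomerDeactivated event and an in-memory list in place of a Program repository. In a real system each Program would be loaded and saved in its own transaction.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class CustomerDeactivated:
    customer_id: str


class Program:
    def __init__(self, program_id: str, customer_id: str, active: bool) -> None:
        self.program_id = program_id
        self.customer_id = customer_id
        self.active = active
        self.activable = True

    def customer_was_deactivated(self) -> None:
        self.active = False      # deactivate it if it was active
        self.activable = False   # and block future activation

    def activate(self) -> None:
        if not self.activable:
            raise ValueError("customer is inactive")
        self.active = True


class DeactivateProgramsSaga:
    """Listens to Customer events and keeps the Program aggregates in sync."""

    def __init__(self, programs: List[Program]) -> None:
        self._programs = programs  # stand-in for a Program repository

    def on_customer_deactivated(self, event: CustomerDeactivated) -> List[str]:
        touched = []
        for program in self._programs:
            if program.customer_id == event.customer_id:
                program.customer_was_deactivated()
                touched.append(program.program_id)
        return touched


p1 = Program("p1", "c1", active=True)
p2 = Program("p2", "c1", active=False)
p3 = Program("p3", "c2", active=True)
saga = DeactivateProgramsSaga([p1, p2, p3])
touched = saga.on_customer_deactivated(CustomerDeactivated("c1"))
```

With this approach rule #3 is enforced from data Program already holds, at the cost of being eventually consistent with Customer.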
Question: what is the best, most efficient and future-proof way to rehydrate an aggregate from a repository? What are the pros and cons of the provided ways, and are my perceptions correct?
Let's say we have an Aggregate Root with private setters but public getters for accessing state
Behaviour is done through methods on the aggregate root.
A repository is instructed to load an aggregate.
At the moment I see a couple of possible ways to achieve this:
1) set the state through reflection (manual or automatic, e.g. Automapper)
2) make constructors that accept properties so state is set
3) load the aggregate with a state object
1) Jimmy Bogard indicates that his tool Automapper isn't meant for two-way mapping. But some people argue that we have to be pragmatic and use tools in whatever way helps.
For me, I don't like a full rehydration through reflection. Maybe Automapper ceases to exist, or aggregate roots get bent in such a way that the mapping can be done (see some of Vaughn's comments on his article).
2) creating constructors for rehydration, with a couple of parameters so the state of the aggregate is rehydrated in a correct way.
These couple of parameters can expand (= new constructors) or the definition can change. I like this approach, except the part of having a bunch of parameters.
3) the state is a property of the aggregate root. The state is encapsulated in a new object; this object is built by the repository and is then given to the aggregate root for proper initialization.
Some people argue that building this state object is more work (new class, exposure of state properties on entity and aggregate root to enforce business rules), but it provides a clean way to initialize the state.
Say that we need event sourcing: does loading a state resemble loading events? And does the state object provide a way of handling events? Is it more future-proof?
I would argue that trying to future-proof too much represents a trap that many people fall into that adds undue complexity to a codebase. There is a fine balancing act between sound architectural decisions and over-architecting a solution to a problem that is not guaranteed to exist.
That being said, I fully agree with what Jimmy says, in regards to AutoMapper not being intended for two-way mapping. Your domain represents the "truth" in your application, and should not be directly mutable. I have worked on projects with two-way mappings, and while they do work, there is a tendency to start treating the domain objects as nothing more than DTOs. It becomes painful when you start having read-only properties, having to reflect to do your setting - tooling or not. From a DDD perspective, we should not be allowing for outside influences to simply say what a property value should be, because it will lead to an anemic domain model over time, most likely.
Internal states do work well, but they are at the cost of additional overhead and complexity. There is a legitimate trade-off, as you mention, in that you are adding a fair amount of work. However, you can use that opportunity to allow the aggregate to validate the state against the self-contained business rules within the aggregate, prior to allowing the state to be set. That addresses the largest concern that I have with two-way mapping. You can at least enforce that a state object contains valid data and then only construct the aggregate if it is valid. It is more testable, as well. The largest problem that I have seen with this approach is that the skill level of your team will have a direct bearing on the success of this being utilized correctly. It could be argued that the complexity does not add enough value to implement domain-wide, as you will likely have aggregates that have different levels of churn. A couple of projects that I have been involved in have used this approach, and I found little advantage over straight constructor usage.
Normally, I use constructors for rehydration in most cases. This avoids being overly complex, and it leaves the aggregate responsible for allowing or disallowing the construction of the object, so the domain stays in control of whether the hydration attempt results in a valid object. A good compromise for constructor bloat is to use a mutable DTO as the constructor parameter, essentially acting as a data structure that keeps the constructor signature consistent over time. In that sense, it is also somewhat future-proof. It takes the most attractive perk of the state object approach, which is the clean signatures, but removes the additional layer of an internal abstraction.
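As a rough Python sketch of that compromise: the repository fills a DTO, and the aggregate's constructor validates and copies it. OrderData, its fields and the list of statuses are hypothetical and only there to make the example runnable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OrderData:
    """Mutable DTO filled by the repository (hypothetical fields)."""
    order_id: Optional[str] = None
    status: Optional[str] = None
    total_cents: int = 0


class Order:
    _STATUSES = {"new", "paid", "shipped"}

    def __init__(self, data: OrderData) -> None:
        # The aggregate, not the repository, decides whether the data is valid.
        if not data.order_id:
            raise ValueError("order_id is required")
        if data.status not in self._STATUSES:
            raise ValueError(f"unknown status: {data.status!r}")
        if data.total_cents < 0:
            raise ValueError("total_cents must not be negative")
        # Copy the values so later changes to the DTO cannot leak in.
        self._order_id = data.order_id
        self._status = data.status
        self._total_cents = data.total_cents

    @property
    def status(self) -> str:
        return self._status


row = OrderData(order_id="o-1", status="paid", total_cents=1999)
order = Order(row)
row.status = "garbage"  # mutating the DTO afterwards does not affect the aggregate
```

New fields can be added to OrderData without changing the constructor signature, which is the "somewhat future-proof" part.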
You mention event sourcing as a possibility down the road. State loading is not very similar to what you would be doing, at all (in my opinion). With a state object, you are snapshotting the state of the aggregate at a given point in time. With event sourcing, you will be replaying events, each of which represents the data required to mutate the state, as opposed to the state, itself. As such, your constructor will likely be a collection of events, representing a chain of deltas to mutate the state repeatedly, until it reaches the current state. When you want to hydrate your aggregate, you will supply it with the events that are related to that aggregate, and it will replay them to get to the current state. This is one of the true strengths of event sourcing, as well. You are forcing the hydration of your domain objects to go through the business logic required to create them, each time. Given a list of events, the aggregate will enforce that each state change is valid by applying the event in a consistent fashion, whether the event is being applied in real-time, or replayed to get to the current state.
Back to the future-proof aspect, as it relates to event sourcing, there is a conscious effort required when events require change. Since you have to replay an event to get to the current state, you will very likely have to deprecate events and bring up new events to transition to as your business logic changes. You may (read as "likely will") find yourself versioning events. Not only does your aggregate need to understand current state change requirements, but it also needs to understand previous state change requirements. So, if you change an event handler, you will have to ensure that it will be valid for existing events, as well. When you are adding additional data to an event, it is usually not too involved. But when you start removing data from an event signature, you instantly make that event at risk for being incompatible with earlier structures. Likewise, even changing the names of the data structures inside of an event can cause backwards compatibility issues. If you start event sourcing, you do not need to worry as much about future-proofing as you do backwards compatibility. Event sourcing is great, but be prepared for additional complexity.
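One common way to cope with renamed or added event data is to translate old events to the current shape while replaying them (often called upcasting). That technique is my own choice for this illustration; the event name, the version numbers and the fields below are all invented.

```python
from typing import Any, Dict, Iterable

Event = Dict[str, Any]


def upcast(event: Event) -> Event:
    """Translate a stored event of any known version to the current shape."""
    version = event.get("version", 1)
    if event["type"] == "OrderStatusChanged" and version == 1:
        # v1 called the field "state"; v2 renamed it to "status" and
        # added "changed_by", which old events cannot know.
        return {
            "type": "OrderStatusChanged",
            "version": 2,
            "status": event["state"],
            "changed_by": "unknown",
        }
    return event


def current_status(stored_events: Iterable[Event]) -> str:
    """Replay events (after upcasting) to find the latest status."""
    status = "new"
    for stored in stored_events:
        event = upcast(stored)
        if event["type"] == "OrderStatusChanged":
            status = event["status"]
    return status


stored = [
    {"type": "OrderStatusChanged", "state": "paid"},  # old v1 record
    {"type": "OrderStatusChanged", "version": 2,
     "status": "shipped", "changed_by": "alice"},
]
latest = current_status(stored)
```

The replay logic only ever deals with the current event shape, while the knowledge about older shapes is kept in one place.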
Suppose we have a situation where we need to implement some domain rules that require examining an object's history (the event store). For example, we have an Order object with a CurrentStatus property, and we need to examine the history of Order.CurrentStatus changes.
Most likely you will answer that I need to move this knowledge to domain and introduce Order.StatusHistory property that contains a collection of status records, and that I should not query event store. And I will agree with you.
What I question is the need of Event Store.
We write to the event store events that have business meaning (domain value); we do not record UserMovedMouse events (in most cases). And, as with the OrderStatusChanged event, there is a high chance that most of the events in the event store will be needed at some point for domain logic, and we end up with a domain object that has an EventHistory property holding the collection of events.
I can see value in a separate event store for patterns such as CQRS, where you have a single write-only event store and multiple read-only query stores, which gives you some scalability. However, the need to introduce such a thing in code is questionable to me as well. All decent databases support single-write-server, multiple-read-server scalability (master-slave replication). Why should I introduce such a thing at the source-code level? Why not forget about Web Services and message buses, too, and write your own wrappers around sockets?
I have great respect for "old school" DDD as described by Eric Evans, and I see some fresh and good ideas in the new-wave DDD+CQRS+EventSourcing combination of patterns. However, the main idea of CQRS is still very much in question for me. Am I missing something?
In short: if event sourcing is not needed (for its added benefits or as workarounds for some quirks), then you definitely shouldn't bring it into your system just for the sake of it.
ES is just one of many ways to augment the CQRS architectural style within a bounded context. It is not a requirement.
As I understand it, a UnitOfWork class is meant to represent the concept of a business transaction in the domain. It's not directly supposed to represent a database transaction, which is a detail of only one possible implementation.
Q: So why does so much documentation about the Unit of Work pattern refer to "Commit" and "Rollback" methods?
These concepts mean nothing to the domain, or to domain experts. A business transaction can be "completed", and therefore the UnitOfWork should provide a "Complete" method. Likewise, instead of a "Rollback" method, shouldn't it be modeled as "Clear"?
Update:
Answer: Both answers below are correct. There are two variants of UoW: object registration and caller registration. In object registration, Rollback serves to undo changes to all in-memory objects. In caller registration, Rollback serves to clear all recorded changes such that a subsequent call to Commit will do nothing.
The Unit of Work design pattern, at least as defined by Fowler in Patterns of Enterprise Application Architecture, is an implementation detail concerning object-relational persistence mapping. It is not an entity defined in Evans' Domain Driven Design.
As such, it should neither be part of the business discussion, nor an entity that's directly exposed in a domain model - perhaps excepting the commit() method. Instead its intent is tracking "clean" and "dirty" business entities - the objects from a domain model exposed to clients. The purpose is allowing multiple interactions - in web context requests - with a domain model without the need to read and write from persistence (usually a database) each time.
Business entities call it when their methods are called: when their state is altered, they register themselves as dirty with the Unit of Work. Then the Unit of Work's commit() handles the entire persistence transaction by writing out the object graph, and rollback() means restoring the entities to the state they were in. So it's very much the implementation leaking through to the "abstraction", but its intent is very clear.
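A stripped-down Python sketch of this object-registration variant follows. It is not Fowler's code; the Customer entity, its snapshot()/restore() helpers and the dictionary used as "storage" are assumptions made to keep the example self-contained.

```python
from typing import Dict


class UnitOfWork:
    """Tracks dirty entities; commit writes them, rollback restores snapshots."""

    def __init__(self, storage: Dict[str, dict]) -> None:
        self._storage = storage  # stand-in for a database
        self._dirty: Dict[str, "Customer"] = {}
        self._snapshots: Dict[str, dict] = {}

    def register_dirty(self, entity: "Customer") -> None:
        if entity.customer_id not in self._dirty:
            # Remember the state before the first change in this unit of work.
            self._snapshots[entity.customer_id] = entity.snapshot()
            self._dirty[entity.customer_id] = entity

    def commit(self) -> None:
        for key, entity in self._dirty.items():
            self._storage[key] = entity.snapshot()
        self._clear()

    def rollback(self) -> None:
        for key, entity in self._dirty.items():
            entity.restore(self._snapshots[key])
        self._clear()

    def _clear(self) -> None:
        self._dirty.clear()
        self._snapshots.clear()


class Customer:
    def __init__(self, customer_id: str, name: str, uow: UnitOfWork) -> None:
        self.customer_id = customer_id
        self.name = name
        self._uow = uow

    def rename(self, new_name: str) -> None:
        self._uow.register_dirty(self)  # register before the state changes
        self.name = new_name

    def snapshot(self) -> dict:
        return {"name": self.name}

    def restore(self, snap: dict) -> None:
        self.name = snap["name"]


storage: Dict[str, dict] = {}
uow = UnitOfWork(storage)
alice = Customer("c1", "Alice", uow)
alice.rename("Alicia")
uow.rollback()
name_after_rollback = alice.name
alice.rename("Alice B.")
uow.commit()
```

Here rollback() undoes the in-memory change and commit() is the only place that touches storage, which matches the "object registration" reading of the two method names.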
On the other hand, "Undo" and "Complete" don't necessarily map one-to-one with this definition. An "Undo" or "Clear" may only roll back an object graph partially, for instance, depending on the business context, while "Complete" may well alter state on some entity as well as commit the graph. As such, I would put these methods, with business meaning, on a Service Layer or Aggregate Root object.
I agree. My guess is that it uses the terms "Rollback" and "Commit" because they are indeed known terms (and do reveal intent, especially to programmers). However, I think that it would be more correct to use the term "Complete". With regard to "Clear", I'm not as inclined to agree with you. I don't think that any domain expert would agree that you "Clear" a business transaction. "Undo" is a more suitable term in my opinion.