CQRS & event sourcing can I use an auto incremented INT as the aggregate ID? - domain-driven-design

I am working on a legacy project and trying to introduce CQRS in some places where it's appropriate. In order to integrate with all of the legacy which is relational I would like to project my aggregate (or part of it) into a table in the relational database.
I would also like the aggregate ID to be the auto-incremented value on that projected table. I know this seems like going against the grain since it's mixing the read model with the write model. However I don't want to pollute the legacy schema with foreign key GUUIDs.
Would this be a complete no-no, and if so what would you suggest?
Edit: Maybe I could just store the GUUID in the projected table, that way when the events get projected I can identify the row to update, but then still have an auto incremented column for joining on?

There is nothing wrong with using an id created by the infrastructure layer for your entities. This pattern is commonly used in Vaughn Vernon's 'Implementing DDD' book:
Get the next available ID from the repository.
Create an entity.
Save the entity in the repository.
Your problem is that you want to use an id created in another Bounded Context. That is a huge and complete no-no, not the fact that the id is created by the Infrastructure Layer.
You should create the id in your Bounded Context and use it to reference the aggregate from other Contexts (just as you wrote when you edited your question).

Related

How to model relationship using event sourcing

In our scenario, we have a Course entity to represent course content. For each student attending a course, there is a CourseSession entity representing the learning progress of the student in the course. So there is a one-to-many relationship between Course and CourseSession. If using relational database, there will be a course table and course_session table, in which course has a unique ID and course session is uniquely identified by (courseId + studentId). We try to model this using event sourcing, and our event table is like following
-----------------------------------------------------
| entity_type | entity_id | event_type | event_data |
-----------------------------------------------------
this is fine for storing course, there is a courseId we can use as entity_id. But for CourseSession, there isn't an intrinsic id attribute, we have to use the concatenation of (courseId + studentId) as entity_id, which is not quite natural. Is there a better way to model this kind of relationship?
I’m not an expert, so take this answer with a grain of salt
But for CourseSession, there isn't an intrinsic id attribute, we have
to use the concatenation of (courseId + studentId) as entity_id, which
is not quite natural
It's normal to have a composite ID, and sometimes recommended, to keep your domain model aligned with the domain language.
The composite ID can be modeled as Value Object: CourseSessionId { CoursId: string, studentId: string }.
In addition to this domain-specific ID, you may need to add a surrogate ID to the entity to satisfy some infra requirements:
Some ORMs force to have a numeric sequence ID
Some Key-value stores require a ULID Key
Short and user-friendly ID
The surrogate ID is an infra detail and must be hidden as much as possible from the domain layer.
Is there a better way to model this kind of relationship?
The event sourcing pattern I saw in the DDD context suggests having a stream of events per aggregate.
In DDD, an aggregate can be considered as:
A subsystem within the bounded context
It has boundaries and invariants to protect its state
It’s represented by an entity (aggregate root) and can contain other entities and value-objects.
If you consider that CourseSession entity belongs to Course aggregate, then you should keep using course ID as entity_id (or aggregate_id) for both Course and CourseSession related events.
in this case, the write model (main model) can easily build and presents the relationship Course / CourseSessions by playing the Course stream.
Otherwise, you must introduce a read model, and define a projector that will subscribe to both Course and CourseSession streams, and build the needed views.
This read model can be queried directly or by Course and CourseSession aggregates’ commands to take decisions, but keep in mind that’s often eventually consistent, and your business should tolerate that.
Event sourcing is a different way of thinking about data. So the 'old' ways of thinking in terms of relationships don't really translate like that.
So the first point is that an event store isn't a table structure. It is a list of things that have happened in your system. The fact a student spent time on a course is a thing which happened.
If you want/need to access the data in relationships like you describe the easiest thing to do is to create a projection from the events which creates the data in the table form you are looking for.
However, as the projection is not the source of truth, why not think about creating de-normalised tables so you the database won't need to do any joins or other more complex jobs and your data will already be shaped as you need it for use in your application. This leads to super fast highly efficient read models.
Your users will thank you!

How to model an entity's current status in DDD

I am trying to get to grips with the ideas behind DDD and apply them to a pet project we have, and I am having some questions that I hope that someone here would be able to answer.
The project is a document management system. The particular problem we have regards two notions that our system handles: That of a Document and that of a DocumentStatus.
A Document has a number of properties (such as title, author, etc). Users can change any of the Document's properties through out its life time.
A Document may be, at any time, be at a particular state such as NEW, UNDER_REVISION, REVISED, APPROVED etc. For each state we need to know who made that change to that state.
We need to be able to query the system based on a document status. An example query would be "Get me all documents that are in the REVISED state".
"Get me all documents whose status has been changed by user X"
The only time that a Document and a DocumentStatus need to be changed in the same transaction is when the Document is created (create the document and at the same time assign it a status of NEW).
For all other times, the UI allows the update of either but not both (i.e. you may change a document's property such as the author, but not its state.) Or you can update its state (from NEW to UNDER_REVISION) but not its properties.
I think we are safe to consider that a Document is an Entity and an Aggregate Root.
We are buffled about what DocumentStatus is. One option is to make it a Value Object part of the Document's aggregate.
The other option is to make it an Entity and be the root of its own aggregate.
We would also liked to mention that we considered CQRS as described in various DDD documents, but we think it is too much of a hassle, especially given the fact that we need to perform queries on the DocumentStatus.
Any pointers or ideas would be welcomed.
Domain
You say you need to be able to see past status changes, so the status history becomes a domain concept. A simple solution would then be the following:
Define a StatusHistory within the Document entity.
The StatusHistory is a list of StatusUpdate value objects.
The first element in the StatusHistory always reflects the current state - make sure you add the initial state as StatusUpdate value object when creating Document entities.
Depending on how much additional logic you need for the status history, consider creating a dedicated value object (or even entity) for the history itself.
Persistence
You don't really say how your persistence layer looks like, but I think creating queries against the first element of the StatusHistory list should be possible with every persistence mechanism. With a map-reduce data store, for example, create a view that is indexed by Document.StatusHistory[0] and use that view to realize the queries you need.
If you were only to record the current status, then that could well be a value object.
Since you're composing more qualifying - if not identifying - data into it, for which you also intend to query, then that sounds to me as if no DocumentStatus is like another, so a value object doesn't make much sense, does it?
It is identified by
the document
the author
the time it occurred
Furthermore, it makes even more sense in the context of the previous DocumentStatus (if you consider more states than just NEW and UNDER_REVISION).
To me, this clearly rules out modeling DocumentStatus as a value object.
In terms of the state as a property of DocumentStatus, and following the notion of everything is an object (currently reading David West's Object Thinking), then that could of course be modeled as a value object.
Follows How to model an entity's current status in DDD.

Few confusing things about globally accessible Value Objects

Quotes are from DDD: Tackling Complexity in the Heart of Software ( pg. 150 )
a)
global search access to a VALUE is often meaningles, because finding a
VALUE by its properties would be equivalent to creating a new instance
with those properties. There are exceptions. For example, when I am
planning travel online, I sometimes save a few prospective itineraries
and return later to select one to book. Those itineraries are VALUES
(if there were two made up of the same flights, I would not care which
was which), but they have been associated with my user name and
retrieved for me intact.
I don't understand author's reasoning as for why it would be more appropriate to make Itinierary Value Object globally accessible instead of clients having to globally search for Customer root entity and then traverse from it to this Itinierary object?
b)
A subset of persistent objects must be globaly accessible through a
search based on object attributes ... They are usualy ENTITIES,
sometimes VALUE OBJECTS with complex internal structure ...
Why is it more common for Values Objects with complex internal structure to be globally accesible rather than simpler Value Objects?
c) Anyways, are there some general guidelines on how to determine whether a particular Value Object should be made globally accessible?
UPDATE:
a)
There is no domain reason to make an itinerary traverse-able through
the customer entity. Why load the customer entity if it isn't needed
for any behavior? Queries are usually best handled without
complicating the behavioral domain.
I'm probably wrong about this, but isn't it common that when user ( Ie Customer root entity ) logs in, domain model retrieves user's Customer Aggregate?
And if users have an option to book flights, then it would also be common for them to check from time to time the Itineraries ( though English isn't my first language so the term Itinerary may actually mean something a bit different than I think it means ) they have selected or booked.
And since Customer Aggregate is already retrieved from the DB, why issue another global search for Itinerary ( which will probably search for it in DB ) when it was already retrieved together with Customer Aggregate?
c)
The rule is quite simple IMO - if there is a need for it. It doesn't
depend on the structure of the VO itself but on whether an instance of
a particular VO is needed for a use case.
But this VO instance has to be related to some entity ( ie Itinerary is related to particular Customer ), else as the author pointed out, instead of searching for VO by its properties, we could simply create a new VO instance with those properties?
SECOND UPDATE:
a) From your link:
Another method for expressing relationships is with a repository.
When relationship is expressed via repository, do you implement a SalesOrder.LineItems property ( which I doubt, since you advise against entities calling repositories directly ), which in turns calls a repository, or do you implement something like SalesOrder.MyLineItems(IOrderRepository repo)? If the latter, then I assume there is no need for SalesOrder.LineItems property?
b)
The important thing to remember is that aggregates aren't meant to be
used for displaying data.
True that domain model doesn't care what upper layers will do with the data, but if not using DTO's between Application and UI layers, then I'd assume UI will extract the data to display from an aggregate ( assuming we sent to UI whole aggregate and not just some entity residing within it )?
Thank you
a) There is no domain reason to make an itinerary traverse-able through the customer entity. Why load the customer entity if it isn't needed for any behavior? Queries are usually best handled without complicating the behavioral domain.
b) I assume that his reasoning is that complex value objects are those that you want to query since you can't easily recreate them. This issue and all query related issues can be addressed with the read-model pattern.
c) The rule is quite simple IMO - if there is a need for it. It doesn't depend on the structure of the VO itself but on whether an instance of a particular VO is needed for a use case.
UPDATE
a) It is unlikely that a customer aggregate would have references to the customer's itineraries. The reason is that I don't see how an itinerary would be related to behaviors that would exist in the customer aggregate. It is also unnecessary to load the customer aggregate at all if all that is needed is some data to display. However, if you do load the aggregate and it does contain reference data that you need you may as well display it. The important thing to remember is that aggregates aren't meant to be used for displaying data.
c) The relationship between customer and itinerary could be expressed by a shared ID - each itinerary would have a customerId. This would allow lookup as required. However, just because these two things are related it does not mean that you need to traverse customer to get to the related entities or value objects for viewing purposes. More generally, associations can be implemented either as direct references or via repository search. There are trade-offs either way.
UPDATE 2
a) If implemented with a repository, there is no LineItems property - no direct references. Instead, to obtain a list of line items a repository is called.
b) Or you can create a DTO-like object, a read-model, which would be returned directly from the repository. The repository can in turn execute a simple SQL query to get all required data. This allows you to get to data that isn't part of the aggregate but is related. If an aggregate does have all the data needed for a view, then use that aggregate. But as soon as you have a need for more data that doesn't concern the aggregate, switch to a read-model.

How should I enforce relationships and constraints between aggregate roots?

I have a couple questions regarding the relationship between references between two aggregate roots in a DDD model. Refer to the typical Customer/Order model diagrammed below.
First, should references between the actual object implementation of aggregates always be done through ID values and not object references? For example if I want details on the customer of an Order I would need to take the CustomerId and pass it to a ICustomerRepository to get a Customer rather then setting up the Order object to return a Customer directly correct? I'm confused because returning a Customer directly seems like it would make writing code against the model easier, and is not much harder to setup if I am using an ORM like NHibernate. Yet I'm fairly certain this would be violating the boundaries between aggregate roots/repositories.
Second, where and how should a cascade on delete relationship be enforced for two aggregate roots? For example say I want all the associated orders to be deleted when a customer is deleted. The ICustomerRepository.DeleteCustomer() method should not be referencing the IOrderRepostiory should it? That seems like that would be breaking the boundaries between the aggregates/repositories? Should I instead have a CustomerManagment service which handles deleting Customers and their associated Orders which would references both a IOrderRepository and ICustomerRepository? In that case how can I be sure that people know to use the Service and not the repository to delete Customers. Is that just down to educating them on how to use the model correctly?
First, should references between aggregates always be done through ID values and not actual object references?
Not really - though some would make that change for performance reasons.
For example if I want details on the customer of an Order I would need to take the CustomerId and pass it to a ICustomerRepository to get a Customer rather then setting up the Order object to return a Customer directly correct?
Generally, you'd model 1 side of the relationship (eg., Customer.Orders or Order.Customer) for traversal. The other can be fetched from the appropriate Repository (eg., CustomerRepository.GetCustomerFor(Order) or OrderRepository.GetOrdersFor(Customer)).
Wouldn't that mean that the OrderRepository would have to know something about how to create a Customer? Wouldn't that be beyond what OrderRepository should be responsible for...
The OrderRepository would know how to use an ICustomerRepository.FindById(int). You can inject the ICustomerRepository. Some may be uncomfortable with that, and choose to put it into a service layer - but I think that's overkill. There's no particular reason repositories can't know about and use each other.
I'm confused because returning a Customer directly seems like it would make writing code against the model easier, and is not much harder to setup if I am using an ORM like NHibernate. Yet I'm fairly certain this would be violating the boundaries between aggregate roots/repositories.
Aggregate roots are allowed to hold references to other aggregate roots. In fact, anything is allowed to hold a reference to an aggregate root. An aggregate root cannot hold a reference to a non-aggregate root entity that doesn't belong to it, though.
Eg., Customer cannot hold a reference to OrderLines - since OrderLines properly belongs as an entity on the Order aggregate root.
Second, where and how should a cascade on delete relationship be enforced for two aggregate roots?
If (and I stress if, because it's a peculiar requirement) that's actually a use case, it's an indication that Customer should be your sole aggregate root. In most real-world systems, however, we wouldn't actually delete a Customer that has associated Orders - we may deactivate them, move their Orders to a merged Customer, etc. - but not out and out delete the Orders.
That being said, while I don't think it's pure-DDD, most folks will allow some leniency in following a unit of work pattern where you delete the Orders and then the Customer (which would fail if Orders still existed). You could even have the CustomerRepository do the work, if you like (though I'd prefer to make it more explicit myself). It's also acceptable to allow the orphaned Orders to be cleaned up later (or not). The use case makes all the difference here.
Should I instead have a CustomerManagment service which handles deleting Customers and their associated Orders which would references both a IOrderRepository and ICustomerRepository? In that case how can I be sure that people know to use the Service and not the repository to delete Customers. Is that just down to educating them on how to use the model correctly?
I probably wouldn't go a service route for something so intimately tied to the repository. As for how to make sure a service is used...you just don't put a public Delete on the CustomerRepository. Or, you throw an error if deleting a Customer would leave orphaned Orders.
Another option would be to have a ValueObject describing the association between the Order and the Customer ARs, VO which will contain the CustomerId and additional information you might need - name,address etc (something like ClientInfo or CustomerData).
This has several advantages:
Your ARs are decoupled - and now can be partitioned, stored as event streams etc.
In the Order ARs you usually need to keep the information you had about the customer at the time of the order creation and not reflect on it any future changes made to the customer.
In almost all the cases the information in the value object will be enough to perform the read operations ( display customer info with the order ).
To handle the Deletion/deactivation of a Customer you have the freedom to chose any behavior you like. You can use DomainEvents and publish a CustomerDeleted event for which you can have a handler that moves the Orders to an archive, or deletes them or whatever you need. You can also perform more than one operation on that event.
If for whatever reason DomainEvents are not your choice you can have the Delete operation implemented as a service operation and not as a repository operation and use a UOW to perform the operations on both ARs.
I have seen a lot of problems like this when trying to do DDD and i think that the source of the problems is that developers/modelers have a tendency to think in DB terms. You ( we :) ) have a natural tendency to remove redundancy and normalize the domain model. Once you get over it and allow your model to evolve and implicate the domain expert(s) in it's evolution you will see that it's not that complicated and it's quite natural.
UPDATE: and a similar VO - OrderInfo can be placed inside the Customer AR if needed, with only the needed information - order total, order items count etc.

Lookup tables in Core Data

Core data is not a database and so I am getting confused as to how to create, manage or even implement Lookup tables in core data.
Here is a specific example that relates to my project.
Staff (1) -> (Many) Talents (1)
The talents table consists of:
TalentSkillName (String)
TalentSkillLevel (int)
But I do not want to keep entering the TalentSkillName, so I want to put this information into another, separate table/entity.
But as Core Data is not really a database, I'm getting confused as to what the relationships should look like, or even if Lookup tables should even be stored in core data.
One solution I'm thinking of is to use a PLIST of all the TalentSkillNames and then in the Talents entity simply have a numeric value which points to the PLIST version.
Thanks.
I've added a diagram which I believe is what you're meant to do, but I am unsure if this is correct.
I'd suggest that you have a third entity, Skill. This can have a one to many relationship with Talent, which then just has the level as an attribute.
Effectively, this means you are modelling a many-to-many relationship between Staff and Talent through the Skill entity. Logically, that seems to fit with the situation you're describing.

Resources