How to model the StackOverflow website with DDD

How to model the StackOverflow website with DDD - domain-driven-design

I take the example of StackOverflow because obviously you know that website, and my real usecase is really close.
So let's imagine a simplified SO domain description:
There are users
Users can create new questions
Users can create answers to these questions
Users can edit their own questions and answers
Users can edit other users questions if they have more than 1000 reputation (took that threshold randomly)
The last bold rule is the one that matters to me.
What I understand about an AggregateRoot is that it should contain the state that serves to take decision to accept or reject commands, and it should not query a DB to do so. It guarantees the consistency of the app. It should only listen to the events it emits to update its state.
Now what I think is that the SO domain has an aggregate root called Question.
That question would then handle commands like:
CreateAnswer
EditQuestion
The thing is, when the EditQuestion is fired, how the Question AggregateRoot would be able to decide wether to accept or refuse that command? Because if you remember, the command should be rejected if you try to edit the question of another user if you have < 1000 reputation.
It does not seem to make sens to me that the Question AR maintain a list of all users reputations, to be able to know how to act on that command.
The problem is that when trying to model my domain I have this modeling problem coming over and over again, and I always end up with a single big fat AggregateRoot
Can someone tell me what I am missing and help me solve this problem? thanks
This question seems to say that we should not put the authorization system inside the domain model. I agree this may be practical for things like role-based authentication. However, to me the "users can't edit unless they have enough reputation" is really an SO business rule, so how could it be outside of the domain?
IMPORTANT: when answering, please consider YOU are the business expert. You know StackOverflow as an user and can guess what are the SO constraints by yourself. Even if you are wrong about them, it's not a big deal: just make a proposal for your wrong business constraints I'm fine with that!!!
It's not the first time I ask this kind of question and it always ends up with no answer but just endless discussions. What I want to know is how you would model StackOverflow if you had to build this site, with a focus on the business rules about the minimum reputation to edit.

Well, things are quite simple IMO (in this SO scenario only). This is how I would do it (obviously, other devs probably have different approaches):
You tagged the question well with the "cqrs". In the EditQuestion handler, I'd use a Domain Service (a Query from a CQRS point of view) which will check if a certain use has the required points and then return a true/false. Something like this (more or less pseudo code)
public class CanUserEditQuestionService
{
//constructor with deps\\
public bool Handle(CanUserEditQuestion input)
{
//query the read model, maybe a query object to get us the rep of the user
var rep=getReputation.Get(input.UserId);
//we can have a dependency here which tell us the number of points required for a specific permission
return(rep>=1000);
}
}
If the query returns true, then the handler will perform changes on the Question entity i.e question.ChangeText() or smth (I think SO takes an event sourcing approach).
What you have here is a simple use case of a concept "Question" , its command behaviour "Edit" and the business rule that dictates who can do what. The thing is, the 1000 rep rule is never part of the Question concept definition, therefore it doesn't belong to that aggregate i.e how the question is edited. However it's part of the use case itself and part of the application service.
I'm sure that you'll ask me : "What if the read model used by the domain query is behind the command model?". In this case it matters very little, the delay it's probably measured in seconds at most. Also the main thing here is: the business rule is not part of the Question aggregate so it doesn't care about being immediate consistent.
Another thing is that user rep is always a different concept than of a Question so dealing with rep should be never a part of a Question aggregate. But it is part of the application service.
If you view an application as a group of use cases doing things with concepts (which themselves encapsulate data and business constraints), it's quite easy to identify which is the application service, aggregate, domain service etc.

A slightly different approach if you want immediate consistency
public class UserCommandHandler
{
public void Handle(EditQuestion command)
{
// (start transaction)
var user = userRepository.Get(command.UserId);
if (user.Reputation < 1000)
// reject command here
var edit = user.EditedQuestion(...); // or just "new QuestionEdit(...)"
questionEditRepository.Add(edit);
// (commit, saving the QuestionEdited event)
}
}
Since we're using CQRS, the state of a Question as reflected on a page would not be contained in an Question Aggregate but a projection of a series of QuestionEdited events that were listened to and cumulated over time.

Related

DateTime.Now in Domain Layer of DDD

Recently I faced with the following invariants in my domain Model:
An Offer treated as Expired if ExpiryAt (DateTimeOffset) < DateTimeOffset.Now.
A Director of the Company cannot be younger than 18 years old
When Document is downloaded we should set DownloadedAt field with DateTimeOffset.Now
In Application Layer to keep purity and for better testing we usually isolate System.DateTime with IDateTime interface which allow to mock Now in UnitTests.
But all these 3 scenarios belong to Domain Layer and not to Application Layer. We should not Inject external interfaces into DomainModel to keep it pure. But from other side it might be bad to use DateTime.Now or DateTimeOffset.Now directly in DomainLayer since this adds a dependency to system clock and make it harder to test sometimes since DateTime.Now will never return the same result.
So the question is - how do you deal with this dilemma?
Options I see:
Provide now as parameter to Domain Entity methods. This is viable option and simplify testing though makes code more verbose and sometimes even stupid.
Just use DateTime.Now in domain layer. I already mentioned cons of this approach.
Anything else you might suggest from your experience?

From the different options accessing the static DateTime.Now() functionality is obviously the most disadvantageous. It both does not allow for testing and also hides the domain models dependency to some non-deterministic infrastructure inside the implementation details.
The option to inject some interface to a service that can be viewed is a little better because it makes the dependency explicit and also allows for unit testing by stubbing the non-deterministic output to return some deterministic value of your choice.
But still, at runtime your domain model needs to access some infrastructure dependency. This might be a reasonable compromise in some cases, but if possible I would try to avoid that to keep the domain model pure.
If you look at the current date time in your case from a different angle it becomes more obvious that it is actually nothing else like a normal input parameter. You could see it as something like a reference date time instead of the current date time.
Referring to your first example - checking if an offer has expired - from the domain model's point-of-view it needs to check if the offer has expired at some given point in time. This given point in time just happens to be current date time in one of the use cases where the domain logic is exercised.
So bottom line, I recommend to inject the value of the (current) date time rather than an interface to some functionality in such cases. It makes explicit what data is really need in addition to the data the domain encapsulates on its own and requires for performing the business logic.
Also, it makes more explicit what the client code (e.g. the use case or application service) wants to tell or ask the domain model. For instance, check if the offer has expired as of now or if needed, tell me if the offer was already expired at a given point in time or even if it will be expired at an important point in time.
As further reading I recommend this great article from Vladimir Khorikov where he elaborates more on that topic.

How to name an event describing the acknowledgment of the existence of an entity in an event sourced system?

I am new to Event Sourcing and I am considering using it for an industrial application to track events happening in a production facility.
Since the book of record is the production facility itself and not the system, and also because not everything is automated, workers will need to report at a given point in time (the recorded time) what they did at another point in time (the effective time). Therefore, I will be using events such as: TankFilledRecorded, TankOutputConnectedToPipeInputRecorded, ContainerMovedToFacilityAreaRecorded, etc. where these events refer to entities such as a tank, a pipe, or a facility area for example. These events will have both a recorded time and an effective time. Note that there is no submission or approval process for a record to be considered legit.
Domain-driven design (DDD) encourages to design events that are representative of what happens in the domain (like the ones above).
However, in my domain, I don’t care so much about how a tank, a pipe or a facility area came to existence. I just need to know that something exists from a particular point in time, and I also need to know if it is not there after a particular point in time. The main objective of the software is to track liquids and powders flowing in a circuit made of these pipes, tanks and other components. It is not an asset management system and should not become one.
Therefore, what would be the correct DDD way to design an event that represents the fact that there is a tank, a pipe or an area in the production facility?
It is a subtle question but language is important, particularly in DDD.
Here is what I came up with:
1 EntityExistenceAcknowledgmentRecorded
TankExistenceAcknowledgmentRecorded
PipeExistenceAcknowledgmentRecorded
FacilityAreaExistenceAcknowledgmentRecorded
TankDisappearanceAcknowledgmentRecorded
PipeDisappearanceAcknowledgmentRecorded
FacilityAreaDisappearanceAcknowledgmentRecorded
It seems awful to use this in the ubiquitous language. I don’t see myself talking in these terms or providing a UI with such vocabulary. But it does represent exactly what happens though.
2 EntityRegistered
TankRegistered
PipeRegistered
FacilityAreaRegistered
TankUnregistered
PipeUnregistered
FacilityAreaUnregistered
It seems much simpler and it also seems to be meaningful except for one thing. “Registered” conveys the existence of the representation of an entity in the system with immediate effect, without the possibility of saying now that the entity existed 2 days ago. Think about a UserRegistered event in a website that would indicate that the user “existed” from 10 days ago. What would that even mean?
Events are facts and you cannot change the past. However, I do need a way for my users to invalidate a record in which they made a mistake such as a typo. They can record now that they acknowledged the existence of a facility area a week ago and might realize later than there was something wrong, such as a typo in the name of the entity. They would invalidate the record and create a new one. But, invalidate something that has been “registered” does not sound right.
3 Keep looking
Try to dig more in the domain (event storming) and find the real events that brought the entities into existence even if these events are of no use in the problem that needs to be solved.
TankBuiltRecorded
PipeBuiltRecorded, PipeDeliveredRecorded
FacilityArea<something_meaningful>Recorded
TankDestroyedRecorded, TankDecommissionedRecorded
PipeDecommissionedRecorded
FacilityArea<something_meaningful>Recorded

A caution
TankFilled
TankFilledReported
TankFilledReportSubmitted
TankFilledReportSubmissionReceived
Think carefully about whether the increased precision is motivated by business value.
Therefore, what would be the correct DDD way to design an event that represents the fact that there is a tank, a pipe or an area in the production facility?
What is the business doing today? Is there already a process in place for tracking the lifetime of the hardware in the plant (a maintenance log, perhaps?) There's likely to be vocabulary in that place that gives you ideas as to what spellings would make sense in the code.
Events are facts and you cannot change the past.
That's true - but you can back date events. The effective date of the information is often distinct from the reported date of information.
I do need a way for my users to invalidate a record in which they made a mistake such as a typo.
Yes - error correction is an important part of the process that you are modeling.
You should probably review Greg Young's talk Answering a Question, which was based on this thread. It's a discussion of capturing and modeling of temporality.
Here's the good news: you are running into the right problem. Because you are capturing information about an external system, there are going to be opportunities for errors and conflicts, and you need to (a) figure out the protocols for addressing them, and then (b) model that process correctly. That might include exception reports generated by the system when it observes conflicting information, or compensating events, or even automated conflict resolution (for the easy cases -- see also Stop Over Engineering).

How to implement Commands and Events for complex form using Event Sourcing?

I would like to implement CQRS and ES using Axon framework
I've got a pretty complex HTML form which represents recruitment process with six steps.
ES would be helpful to generate historical statistics for selected dates and track changes in form.
Admin can always perform several operations:
assign person responsible for each step
provide notes for each step
accept or reject candidate on every step
turn on/off SMS or email notifications
assign tags
Form update (difference only) is sent from UI application to backend.
Assuming I want to make changes only for servers side application, question is what should be a Command and what should be an Event, I consider three options:
Form patch is a Command which generates Form Update Event
Drawback of this solution is that each event handler needs to check if changes in form refers to this handler ex. if email about rejection should be sent
Form patch is a Command which generates several Events ex:. Interviewer Assigned, Notifications Turned Off, Rejected on technical interview
Drawback of this solution is that some events could be generated and other will not because of breaking constraints ex: Notifications Turned Off will succeed but Interviewer Assigned will fail due to assigning unauthorized user. Maybe I should check all constraints before commands generation ?
Form patch is converted to several Commands ex: Assign Interviewer, Turn Off Notifications and each command generates event ex: Interviewer Assigned, Notifications Turned Off
Drawback of this solution is that some commands can fail ex: Assign Interviewer can fail due to assigning unauthorized user. This will end up with inconsistent state because some events would be stored in repository, some will not. Maybe I should check all constraints before commands generation ?

The question I would call your attention to: are you creating an authority for the information you store, or are you just tracking information from the outside world?
Udi Dahan wrote Race Conditions Don't Exist; raising this interesting point
A microsecond difference in timing shouldn’t make a difference to core business behaviors.
If you have an unauthorized user in your system, is it really critical to the business that they be authorized before they are assigned responsibility for a particular step? Can the system really tell that the "fault" is that the responsibility was assigned to the wrong user, rather than that the user is wrongly not authorized?
Greg Young talks about exception reports in warehouse systems, noting that the responsibility of the model in that case is not to prevent data changes, but to report when a data change has produced an inconsistent state.
What's the cost to the business if you update the data anyway?
If the semantics of the message is that a Decision Has Been Made, or that Something In The Real World Has Changed, then your model shouldn't be trying to block that information from being recorded.
FormUpdated isn't a particularly satisfactory event, for the reason you mention; you have to do a bunch of extra work to cast it in domain specific terms. Given a choice, you'd prefer to do that once. It's reasonable to think in terms of translating events from domain agnostic forms to domain specific forms as you go along.
HttpRequestReceived ->
FormSubmitted ->
InterviewerAssigned
where the intermediate representations are short lived.

I can see one big drawback of the first option. One of the biggest advantage of CQRS/ES with Axon is scalability. We can add new features without worring about regression bugs. Adding new feature is the result of defining new commands, event and handlers for both of them. None of them should not iterfere with ones existing in our system.
FormUpdate as a command require adding extra logic in one of the handler. Adding new attribute to patch and in consequence to command will cause changes in current logic. Scalability is no longer advantage in that case.

VoiceOfUnreason is giving a very good explanation what you should think about when starting with such a system, so definitely take a look at his answer.
The only thing I'd like to add, is that I'd suggest you take the third option.
With the examples you gave, the more generic commands/events don't tell that much about what's happening in your domain. The more granular events far better explain what exactly has happened, as the event message its name already points it out.
Pulling Axon Framework in to the loop, I can also add a couple of pointers.
From a command message perspective, it's safe to just take a route and not over think it to much. The framework quite easily allows you to adjust the command structure later on. In Axon Framework trainings it is typically suggested to let a command message take the form of a specific action you're performing. So 'assigning a person to a step would typically be a AssignPersonToStepCommand, as that is the exact action you'd like the system to perform.
From events it's typically a bit nastier to decide later on that you want fine grained or generic events. This follows from doing Event Sourcing. Since the events are your source of truth, you'll thus be required to deal with all forms of events you've got in your system.
Due to this I'd argue that the weight of your decision should lie with how fine grained your events become. To loop back to your question: in the example you give, I'd say option 3 would fit best.

CQRS aggregates

I'm new to the CQRS/ES world and I have a question. I'm working on an invoicing web application which uses event sourcing and CQRS.
My question is this - to my understanding, a new command coming into the system (let's say ChangeLineItemPrice) should pass through the domain model so it can be validated as a legal command (for example, to check if this line item actually exists, the price doesn't violate any business rules, etc). If all goes well (the command is not rejected) - then the appropriate event is created and stored (for example LineItemPriceChanged)
The thing I didn't quite get is how do I keep this aggregate in memory to begin with, before trying to apply the command. If I have a million invoices in the system, should I playback the whole history every time I want to apply a command? Do I always save the event without any validations and do the validations when constructing the view models / projections?
If I misunderstood any part of the process I would appreciate your feedback.
Thanks for your help!

You are not alone, this is a common misunderstanding. Let me answer the validation part first:
There are 2 types of validation which take place in this kind of system. The first is the kind where you look for valid email addresses, numeric only or required fields. This type is done before the command is even issued. A command which contains these sorts of problems should not be raised as commands (for belt and braces you can check at the domain side but this is not a domain concern and you are better off just preventing this scenario).
The next type of validation is when it is a domain concern. It could be the kind of thing you mention where you check prices are within a set of specified parameters. This is a domain concept the business people would understand, do and be able to articulate.
The next phase is for the domain to apply the state change and raise the associated events. These are then persisted and on success, published for the rest of the app.
All of this is can be done with the aggregate in memory. The actions are coordinated with a domain service which handles the command. It loads the aggregate, apply's all it's past events (or loads a snapshot) then issues the command. On success of the command it requests all the new uncommitted events and tries to persist them. On success it publishes the new events.
As you see it only loads the events for that specific aggregate. Even with a lot of events this process is lightning fast. If performance is a problem there are strategies such as keeping aggregates in memory or snapshotting which you can apply.
To your last point about validating events. As they can only be generated by your aggregate they are trustworthy.
If you want more detail check out my overview of CQRS and ES here. And take a look at my post about how to build aggregate roots here.
Good luck - I hope they help!

It is right that you have to replay the event to 'rehydrate' the domain aggregate. But you don't have to replay all events for all invoices. If you store the entity id of the root aggregate in the events, you can just select and replay the events that with the relevant id.
Then, how do you find the relevant aggregate root id? One of the read repositories should contain the relevant information to get the id, based on a set of search criteria.

Should I use Command to implement a domain derivations in CQRS

I'm using CQRS on an air booking application. one use case is help customer cancel their tickets. But before the acutal cancellation, the customer wants to know the penalty.
The penalty is calculated based on air rules. Some of our provider could calculate the penalty through exposing an web service while the others don't. (They publish some paper explaining the algorithm instead). So I define a domain service
public interface AirTicketService {
//ticket demand method
MonetaryAmount penalty(String ticketNumber);
void cancel(String ticketNumber, MonetaryAmount penalty);
}
My question is which side(command/query) is responsible for invoking this domain service and returning result in a CQRS style application?
I want to use a Command: CalculatePenlatyCommand, In this way, it's easy to resuse the domain model, but it's a little odd because this command does not modify state.
Or should I retrieve a readmodel of ticket if this is a query? But the same DomainService is needed on both command and query side, it's odd too.
Is domain derivation a query?

There is no need to shoehorn everything in to the command-query pipeline. You could query this service independently from the UI without issuing a command or asking the read-model.

There is nothing wrong with satisfying a query using an existing model if it "fits" both the terminology and the structure of that model. No need to build up a separate read model for that purpose. It's not without risk, since the semantics and the context of the query should be closely tied to the model that is otherwise used for write purposes only. The risk I allude to is the fact that the write and read concerns could drift apart (and we're back at square one, i.e. the reason why people pick CQRS in the first place). So you must keep paying attention as new requirements come in.
Queries that fit this model really well are what I call "simulators', where you want to run a simulation using current state to e.g. to give feedback to an end user. On more than one occasion I've found that the simulation logic could be reused both as a feedback mechanism and as an execution (of a write operation/command) steering mechanism. The difference is in what we do with the outcome of the simulation. Again, this is not without risk and requires careful judgement.

You may bring arguments that Calculate Penalty Command is not odd at all.
The user asks the system to do something - command enough.
You can even have a Penalty Calculation Requested Event event in your domain, and it would feel right. Because, at some time, you may be interested in, let's say, unsure clients, ones that want to cancel tickets but they change their mind every time etc. The calculation may be performed asynchronously, too - you can provide the result (penalty cost) to the user in various ways afterwards...
Or, in some other way: on your ticket booked event, store cancellation penalty, too. Then, you can make that value accessible any time, without the need to recompute it... But this may be wrong (?) because penalty would largely depend on time, right (the late you cancel your ticket, the more you pay)?
If all this would like over-complications etc., then I guess I agree with rmac's answer, too :)

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string