Design pattern to address invitation race condition - multithreading

Before I begin describing the problem I'm facing, let me assure you that I've checked if there already is another thread where this has been talked about. After about 5-6 tries clicking on suggestions, I gave up, since it's hard to get an idea from threads with generic names like "What design pattern can I use?"
So I've given this question as descriptive a title as I could come up with. The reason for my concern about this being asked already is that it feels like it should be a fairly common problem (surely others would've encountered this in their client-server program).
=====
So here's my problem...
I've got a single server S, and several clients C1, C2, ..., Cn.
A client can do one of three things at any given time:
Create an event.
Invite other clients to created events.
Accept or reject invitations to events created by other clients.
A client sees names for events they've created (and possibly invited other clients to) as well as names for events they've accepted invitations to. The server processes all invitations; when a client invites another client to an event, the invitation goes through S but S knows nothing about an event E other than the name associated with it, the inviting client, and the invited clients. Let's symbolise the name of an event E as |E|.
Now for two events Ea and Eb, |Ea| != |Eb| does not imply Ea != Eb. That is, just because two events have different names does not mean they are different. I won't formally define what makes two events the same here, but as a use case, let's say two events are the same if they have the same location and time. Remember, however, that the server never knows this information; only the clients do. The clients may not communicate well enough with each other beforehand, and so may choose different names for the same (intended) event.
My problem: I want to avoid a situation where a client Ca accepts an invitation from a client Cb to an event Eb, and Cb accepts an invitation from Ca to Ea, where Ea = Eb. This would lead to each client seeing both |Ea| and |Eb|, which actually represent the same event.
Question: How do I avoid the above? Is there a design pattern that can work on the server alone, client alone, or both server and client together? The solution can include dialogs/prompts for clients.
=====
A practical implementation for such a client-server setup could be discussion topics as events and employees as clients. Imagine a situation where Craig and Matt are colleagues who rarely see each other. They suddenly realise that their boss had asked them to look into why a recent software upgrade wasn't working for some of their customers. But neither knows the other person has been asked to look into the issue as well. So Craig creates the event 'Discuss recent upgrade', and Matt (who's done a bit more research than Craig), suspecting it to be an (ahem) Adobe issue, creates 'Investigate new Adobe add-on'. They both invite each other to these topics, and both being very polite, readily accept. Confusion ensues.

How about a solution that runs the core component at the server?
I am trying to map this problem onto the standard scheduling problem, with the variation that the server detects conflicting appointments and informs the client about them.
Here the server keeps the list of events (or appointments). When a request for a new event comes in, a conflict check is made. What counts as a conflict may be location and time or something else entirely; abstract this with a strategy pattern. If a conflict is found, the server reports it to the client. The server conceals all of this complexity behind a simple facade.
On the client side, a conflict can be handled with a dialog that asks the user whether to create a new event or use the existing one.
The next key question is what is stored for an event, and how. Again, this depends heavily on the use case. Suppose we are designing for an event with the following data:
Name, Date, Start time, Location, Organizer, Description
The storage at the server can then follow the typical 'Outlook-based calendar' approach or a 'list of events in buckets of time' approach.
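The strategy-plus-facade idea above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions: the names Event, EventServer, propose and same_time_and_place are hypothetical, the server is assumed to know each event's time and location, and a lock is added so that two concurrent proposals cannot both pass the conflict check.

```python
import threading
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Optional


@dataclass(frozen=True)
class Event:
    name: str
    start: datetime
    location: str
    organizer: str


# A conflict strategy decides whether two events count as "the same".
ConflictStrategy = Callable[[Event, Event], bool]


def same_time_and_place(a: Event, b: Event) -> bool:
    """Hypothetical rule: same start time and same location."""
    return a.start == b.start and a.location == b.location


class EventServer:
    """Facade: clients only call propose(); the conflict logic stays hidden."""

    def __init__(self, conflicts: ConflictStrategy) -> None:
        self._conflicts = conflicts
        self._events: List[Event] = []
        self._lock = threading.Lock()

    def propose(self, event: Event) -> Optional[Event]:
        """Store the event, or return the existing event it conflicts with."""
        with self._lock:  # check and insert as one atomic step
            for existing in self._events:
                if self._conflicts(existing, event):
                    return existing  # client can prompt: reuse or create anyway?
            self._events.append(event)
            return None


server = EventServer(same_time_and_place)
when = datetime(2024, 5, 6, 10, 0)
first = server.propose(Event("Discuss recent upgrade", when, "Room 2", "Craig"))
second = server.propose(Event("Investigate new Adobe add-on", when, "Room 2", "Matt"))
```

Here Matt's proposal comes back with Craig's existing event, which is the point where the client dialog would ask whether to join it instead of creating a duplicate.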


Handling Race condition in CQRS/ES with read-side

I am building an app for managing a health clinic.
We found a race condition when an appointment is scheduled, and so far no one on the team has come up with a solution.
When an appointment is scheduled, some business rules need to be verified:
cannot be scheduled to the same time as another with the same doctor or same patient
doctors can only attend N appointments in the month
in a week, doctors can only attend N appointments
So, the first approach we thought of was to create an aggregate that holds all appointments and is responsible for scheduling them, but this aggregate would be huge, which is technically not acceptable.
The second approach, and the current one, is to create Appointment as an Aggregate Root, and then validate it using a domain service (interface in domain layer and implementation in infra layer), which queries the read side.
Today it looks like this:
Inside the command handler, instantiate a new Appointment, passing a domain service to its constructor.
The Appointment calls the domain service, which queries the read side and validates the rules. However, a race condition can occur here (when two appointments are scheduled at the same time, neither sees the other, so both will be created).
If the domain service validates the rules, the Appointment is created, but with status PENDING, and a domain event AppointmentRequested is fired.
On the read side, a subscriber handles this event and inserts a projection into the read db (status = PENDING). In the same transaction, a command CompleteAppointmentSchedule is inserted into my outbox; it is soon sent and received asynchronously by the write side.
The write side handles the command by calling appointment.CompleteSchedule(domainService). The same domain service that was passed when instantiating the new appointment is passed to the appointment again. But now the appointment is already in the read db, so it is possible to check the business rules.
Is it correct to use the read side this way? We cannot think of another way to check these rules without using the read side. A team member suggested that we could create a private read-side for our write-side, and use it instead of a read-side in these cases, but, as we use EventStore DB, we would have to create another database like the one we use on the read-side (pgsql) to be able to do it that way on this private read-side.
I am building an app for managing a health clinic.
Reserve an office, get the entire team together, and watch Trench Talk: Evolving a Model. Yves Reynhout has been doing (and talking about) domain driven design, and his domain is appointment scheduling for healthcare.
When an appointment is scheduled, some business rules need to be verified:
cannot be scheduled to the same time as another with the same doctor or same patient
doctors can only attend N appointments in the month
in a week, doctors can only attend N appointments
One of the things you are going to need to discuss with your domain experts: do you need to prevent scheduling conflicts, or do you need to identify scheduling conflicts and resolve them?
Recommended reading:
Race Conditions Don't Exist - Udi Dahan, 2010
Memories, Guesses, and Apologies - Pat Helland, 2007
That said, you are really close to a common answer.
You make your checks against a cached copy of the calendar, to avoid the most common collisions (note that there is still a race condition, when you are checking the schedule at the same time somebody else is trying to cancel the conflicting appointment). You then put an appointment request message into a queue.
Subscribing to the queue is a Service-as-in-SOA, which is the technical authority for all information related to scheduling. That service has its own database, and checks its own authoritative copy of everything before committing a change.
The critical difference here is that the service is working directly with locked instances of the data. That might be because the event handler in the service is the only process that has write permissions on the authoritative data (and is itself handling only one message at a time), or it might be because the event handler locks all of the data necessary to ensure that the result of the write is still consistent with the business rules (conflicting writes competing for the same lock, thus ensuring that data changes are controlled).
In effect, all attempts to change the authoritative calendar data are (logically) serialized, to ensure that the writes cannot conflict with each other.
In the language of CQRS, all of this locking is happening in the write model of the calendar service. Everybody else works from unlocked copies of the data, which are provided by the read model (with some modest plumbing involved in copying data change from the write model to the read model).
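The "single writer, one message at a time" variant described above might look roughly like this. It is a sketch under assumptions, not the answer's actual design: AppointmentRequest, SchedulingService, drain and the slot/week string encoding are all hypothetical, and an in-process queue.Queue stands in for a real message queue.

```python
import queue
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class AppointmentRequest:
    doctor: str
    patient: str
    slot: str  # hypothetical slot identifier, e.g. "2024-05-06T10:00"
    week: str  # hypothetical week identifier, e.g. "2024-W19"


class SchedulingService:
    """Sole writer of the authoritative calendar; handles one message at a time."""

    def __init__(self, max_per_week: int) -> None:
        self._max_per_week = max_per_week
        self._booked: List[AppointmentRequest] = []

    def handle(self, req: AppointmentRequest) -> bool:
        # Rule: no overlap for the same doctor or the same patient.
        for a in self._booked:
            if a.slot == req.slot and (a.doctor == req.doctor or a.patient == req.patient):
                return False
        # Rule: a doctor attends at most N appointments per week.
        in_week = sum(1 for a in self._booked
                      if a.doctor == req.doctor and a.week == req.week)
        if in_week >= self._max_per_week:
            return False
        self._booked.append(req)
        return True


def drain(inbox: "queue.Queue[AppointmentRequest]",
          service: SchedulingService) -> List[Tuple[AppointmentRequest, bool]]:
    """Process queued requests strictly in order, so writes are serialized."""
    results = []
    while not inbox.empty():
        req = inbox.get()
        results.append((req, service.handle(req)))
    return results


inbox: "queue.Queue[AppointmentRequest]" = queue.Queue()
inbox.put(AppointmentRequest("dr-a", "p1", "2024-05-06T10:00", "2024-W19"))
inbox.put(AppointmentRequest("dr-a", "p2", "2024-05-06T10:00", "2024-W19"))  # same doctor, same slot
inbox.put(AppointmentRequest("dr-a", "p2", "2024-05-06T11:00", "2024-W19"))
outcomes = [ok for _, ok in drain(inbox, SchedulingService(max_per_week=2))]
```

Because only this handler touches the authoritative list, the second request is rejected even if both clients saw a free slot in their cached copy of the calendar.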

Microservices: how to effectively deal with data dependencies between microservices

I am developing an application using the microservices approach with the MEAN stack. I am running into a situation where data needs to be shared between multiple microservices. For example, let's say I have user, video, and message (sending/receiving, inbox, etc.) services. The video and message records belong to an account record. As users create videos and send/receive messages, a foreign key (userId) has to be associated with the video and message records they create. I have scenarios where I need to display, for example, the first, middle and last name associated with each video. Now suppose that on the front end a user is scrolling through a list of videos uploaded to the system, 50 at a time. In the worst case, a pull of 50 occurs where each video is tied to a unique user.
There seem to be two approaches to this issue:
One, I make an API call to the user service and get each user tied to each video in the list. This seems inefficient, as it could get really chatty if I make one call per video. In a second variant of the API call approach, I would get the list of videos and send a distinct list of user foreign keys in one query to get each user tied to each video. This seems more efficient, but it still seems like I am losing performance putting everything back together to send out for display or however it needs to be manipulated.
Two, whenever a new user is created, the account service sends a message containing the user information each other service needs to a fanout queue, and it is then the responsibility of the individual services to add the new user to a table in their own database, thus maintaining loose coupling. The extreme downside here would be the data duplication, and having to use the fanout queue whenever updates need to be made to ensure eventual consistency. In the long run, though, this approach seems like it would be the most efficient from a performance perspective.
I am torn between these two approaches, as they both have their share of tradeoffs. Which approach makes the most sense to implement and why?
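For reference, the batched variant of the first approach (one call with the distinct user ids, then stitching the results back onto the videos) can be sketched as below. This is an illustrative sketch only: attach_user_names and fake_fetch_users are hypothetical names, and the fake stands in for whatever bulk endpoint the user service would expose.

```python
from typing import Callable, Dict, Iterable, List


def attach_user_names(
    videos: List[dict],
    fetch_users: Callable[[Iterable[str]], Dict[str, dict]],
) -> List[dict]:
    """Resolve all users for a page of videos with a single lookup."""
    user_ids = sorted({v["userId"] for v in videos})  # distinct foreign keys
    users = fetch_users(user_ids)  # one call instead of one call per video
    return [{**v, "user": users.get(v["userId"])} for v in videos]


calls = []  # records each call made to the (fake) user service


def fake_fetch_users(user_ids: Iterable[str]) -> Dict[str, dict]:
    """Stand-in for a hypothetical bulk endpoint on the user service."""
    ids = list(user_ids)
    calls.append(ids)
    directory = {
        "u1": {"firstName": "Ada", "lastName": "Lovelace"},
        "u2": {"firstName": "Alan", "lastName": "Turing"},
    }
    return {uid: directory[uid] for uid in ids if uid in directory}


videos = [
    {"id": "v1", "userId": "u1"},
    {"id": "v2", "userId": "u2"},
    {"id": "v3", "userId": "u1"},
]
page = attach_user_names(videos, fake_fetch_users)
```

The stitching step is a dictionary lookup per video, so for a page of 50 the cost is dominated by the single network call rather than by putting the data back together.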
I'm also interested in this question.
First of all, the scenario you described is very common. Users, videos and messages are definitely three different microservices. There is no issue with how you broke the system down into pieces.
Secondly, there are multiple options for solving the data sharing problem. Take a look at this great article from auth0: https://auth0.com/blog/introduction-to-microservices-part-4-dependencies/
Don't restrict your design decision to those 2 options you've outlined. The hardest thing about microservices is to get your head around what a service is and how to cut your application into chunks/services that make sense to be implemented as a 'microservice'.
Just because you have those 3 entities (user, video & message) doesn't mean you have to implement 3 services. If your actual use case shows that these services (or entities) depend heavily on each other to fulfil a simple request from the front-end, then that's a clear signal that your cutting was not right.
From what I see from your example I'd design 1 microservice that fulfills the request. Remember that one of the design fundamentals of a microservice is to be as independent as possible.
There's no need to overcomplicate services; it's not SOA.
https://martinfowler.com/articles/microservices.html -> great read!
Regards,
Lars

CQRS and synchronous operations (such as user registration)

I'm in the process of adopting DDD concepts for designing our next projects, and more specifically CQRS.
After reading a LOT of stuff I'm now trying to implement a simple Proof Of Concept.
The thing is I'm stuck right after I started :p
I'm trying to apply this approach to a simple user registration process, where steps are:
User fills the registration form & submit the request
The app creates the user
The app authenticates the user (auto log in)
The app sends a verification email to the user
The app redirects the user somewhere else with a confirmation message
From an implementation point of view, what I get so far is:
The controller action maps the request data to a RegisterCommand object
The controller action asks the Command Bus to handle the RegisterCommand
The command handler (UserService) "register" method creates a new User object (whether by a new command or a factory object)
The model raises a RegisterEvent
The command handler asks the repository to store the new user object
That's it, the controller action doesn't know about any of that.
So, my guess is, since everything in this context HAS TO be done synchronously (except for the email sending), I can use a direct/synchronous command bus, and in the controller action, right after the command bus invocation, I can query for a read only User (query database) and if it exists assume that everything went well, so I can give the user a confirmation message.
The automatic log in process being handled by an Event Handler.
Assuming that this is correct, what if something goes wrong? How do I inform the user with the correct information?
A common example is often used in articles we can find over the internet: A customer pays his order by using an expired credit card. The system accepts the request, informs the user that everything is OK, but the user receives an email a few minutes later telling him that his order could not be processed.
Well, this scenario is acceptable in many cases, but for some others it is just not possible. So where are the examples dealing with these use cases? :p
Thank you !
I think this registration use case is closer to the paying for an order use case than you think.
Most of the CQRS thought leaders suggest validating on the read side before issuing a command, thus giving your command a higher probability of success.
If the validation fails on the read side, you know how to handle this - make the user pick another name before you even send off the registration command. If validation succeeds, send the command - now you're talking probably a few hundred microseconds AT MOST where another user could've come in and taken the same username between the time you validated the command and sent it off. Highly unlikely.
And in the very rare case when that does happen, you act in the same way as in the expired credit card example - the next time the user logs in, you present them with an explanation and a form to submit a new username - or send them an email saying "hey - someone else has that username, please click here to select a new one". Why does this work? Because you have a unique ID for that user.
Look at a user registration page like Twitter. As soon as you enter a username, it does a little Ajax call and says "nope, this is taken" or "this one is good!" That's pre-validation.
I hope this helps!
The problem with contrived examples is that you can change your mind about how the "domain" functions, so there's little use in discussing this example in particular. The basic premise you seem to forego is that we must assume that things are just going to work. Everything else is about risk and mitigating it. Taking this example, if I ask you, what if I lost 1 user registration in 100000? What if I lost 1 out of 10? Why would that happen? Do I have bigger problems at that point in time? Would future users be likely to register again when the system comes back online and works as expected? When would that be? What if we monitored our quality of service and prevented users from registering because we can't assure the quality they've come to associate with our brand? What if the server exploded, or the datacenter got nuked? Do we want to protect against that? You see, there is no right answer. Just various shades of grey. So how do we mitigate the risk? We could make things synchronous but that is only a guarantee at that limited point in time. What if I had to restore a backup that's 2 hours old (e.g. because the disk corrupted)? That's 2 hours of registered users lost (maybe). These things happen ... I just wanted to point out the relativity of what I consider a false sense of security. Mitigate it, invest in what you can't afford to lose, make sure you have a good audit trail. Probably not the answer you were looking for ...

CQRS Event Sourcing: Validate UserName uniqueness

Let's take a simple "Account Registration" example, here is the flow:
User visits the website
Clicks the "Register" button, fills out the form, and clicks the "Save" button
MVC Controller: Validate UserName uniqueness by reading from ReadModel
RegisterCommand: Validate UserName uniqueness again (here is the question)
Of course, we can validate UserName uniqueness by reading from ReadModel in the MVC controller to improve performance and user experience. However, we still need to validate the uniqueness again in RegisterCommand, and obviously, we should NOT access ReadModel in Commands.
If we do not use Event Sourcing, we can query the domain model, so that's not a problem. But if we're using Event Sourcing, we are not able to query the domain model, so how can we validate UserName uniqueness in RegisterCommand?
Notice: User class has an Id property, and UserName is not the key property of the User class. We can only get the domain object by Id when using event sourcing.
BTW: In the requirement, if the entered UserName is already taken, the website should show the error message "Sorry, the user name XXX is not available" to the visitor. It's not acceptable to show a message, that says, "We are creating your account, please wait, we will send the registration result to you via Email later", to the visitor.
Any ideas? Many thanks!
[UPDATE]
A more complex example:
Requirement:
When placing an order, the system should check the client's ordering history. If he is a valuable client (that is, the client placed at least 10 orders per month in the last year), we give 10% off the order.
Implementation:
We create PlaceOrderCommand, and in the command, we need to query the ordering history to see if the client is valuable. But how can we do that? We shouldn't access ReadModel in command! As Mikael said, we can use compensating commands in the account registration example, but if we also use that in this ordering example, it would be too complex, and the code might be too difficult to maintain.
If you validate the username using the read model before you send the command, we are talking about a race condition window of a couple of hundred milliseconds where a real race condition can happen, which in my system is not handled. It is just too unlikely to happen compared to the cost of dealing with it.
However, if you feel you must handle it for some reason or if you just feel you want to know how to master such a case, here is one way:
You shouldn't access the read model from the command handler nor the domain when using event sourcing. However, what you could do is to use a domain service that would listen to the UserRegistered event in which you access the read model again and check whether the username still isn't a duplicate. Of course you need to use the UserGuid here as well as your read model might have been updated with the user you just created. If there is a duplicate found, you have the chance of sending compensating commands such as changing the username and notifying the user that the username was taken.
That is one approach to the problem.
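A rough sketch of that domain service follows. It is only an illustration under assumptions: the read model is represented as a plain dictionary from user GUID to username, and the class name DuplicateUsernameChecker and the shape of the event and command objects are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class UserRegistered:
    user_guid: str
    username: str


@dataclass(frozen=True)
class ChangeUsernameCommand:
    """Compensating command issued when a duplicate is detected."""
    user_guid: str
    reason: str


class DuplicateUsernameChecker:
    """Domain service that reacts to UserRegistered events."""

    def __init__(self, read_model: Dict[str, str]) -> None:
        # Hypothetical read model shape: user_guid -> username.
        self._read_model = read_model

    def on_user_registered(self, event: UserRegistered) -> Optional[ChangeUsernameCommand]:
        for guid, name in self._read_model.items():
            # Skip our own row: the read model may already contain this user.
            if guid != event.user_guid and name == event.username:
                return ChangeUsernameCommand(event.user_guid, "username already taken")
        return None


read_model = {"g1": "alice", "g2": "alice", "g3": "bob"}
checker = DuplicateUsernameChecker(read_model)
compensation = checker.on_user_registered(UserRegistered("g2", "alice"))
no_action = checker.on_user_registered(UserRegistered("g3", "bob"))
```

Comparing on the GUID is what keeps the freshly projected user from being reported as a duplicate of itself.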
As you probably can see, it is not possible to do this in a synchronous request-response manner. To solve that, we are using SignalR to update the UI whenever there is something we want to push to the client (if they are still connected, that is). What we do is that we let the web client subscribe to events that contain information that is useful for the client to see immediately.
Update
For the more complex case:
I would say the order placement is less complex, since you can use the read model to find out if the client is valuable before you send the command. Actually, you could query that when you load the order form since you probably want to show the client that they'll get the 10% off before they place the order. Just add a discount to the PlaceOrderCommand and perhaps a reason for the discount, so that you can track why you are cutting profits.
But then again, if you really need to calculate the discount after the order was placed for some reason, again use a domain service that would listen to OrderPlacedEvent and the "compensating" command in this case would probably be a DiscountOrderCommand or something. That command would affect the Order Aggregate root and the information could be propagated to your read models.
For the duplicate username case:
You could send a ChangeUsernameCommand as the compensating command from the domain service. Or even something more specific, that would describe the reason why the username changed which also could result in the creation of an event that the web client could subscribe to so that you can let the user see that the username was a duplicate.
In the domain service context I would say that you also have the possibility to use other means to notify the user, such as sending an email, which could be useful since you cannot know if the user is still connected. Maybe that notification functionality could be initiated by the very same event that the web client is subscribing to.
When it comes to SignalR, I use a SignalR Hub that the users connect to when they load a certain form. I use the SignalR Group functionality, which allows me to create a group that I name after the value of the Guid I send in the command. This could be the userGuid in your case. Then I have event handlers that subscribe to events that could be useful for the client, and when an event arrives I can invoke a javascript function on all clients in the SignalR Group (which in this case would be only the one client creating the duplicate username). I know it sounds complex, but it really isn't. I had it all set up in an afternoon. There are great docs and examples on the SignalR Github page.
I think you are yet to have the mindset shift to eventual consistency and the nature of event sourcing. I had the same problem. Specifically I refused to accept that you should trust commands from the client that, using your example, say "Place this order with 10% discount" without the domain validating that the discount should go ahead. One thing that really hit home for me was something that Udi himself said to me (check the comments of the accepted answer).
Basically I came to realise that there is no reason not to trust the client; everything on the read side has been produced from the domain model, so there is no reason not to accept the commands. Whatever in the read side that says the customer qualifies for discount has been put there by the domain.
BTW: In the requirement, if the entered UserName is already taken, the website should show error message "Sorry, the user name XXX is not available" to the visitor. It's not acceptable to show a message, say, "We are creating your account, please wait, we will send the registration result to you via Email later", to the visitor.
If you are going to adopt event sourcing & eventual consistency, you will need to accept that sometimes it will not be possible to show error messages instantly after submitting a command. With the unique username example the chances of this happening are so slim (given that you check the read side before sending the command) that it's not worth worrying about too much, but a subsequent notification would need to be sent for this scenario, or perhaps ask them for a different username the next time they log on. The great thing about these scenarios is that it gets you thinking about business value & what's really important.
UPDATE : Oct 2015
Just wanted to add, that in actual fact, where public facing websites are concerned - indicating that an email is already taken is actually against security best practices. Instead, the registration should appear to have gone through successfully informing the user that a verification email has been sent, but in the case where the username exists, the email should inform them of this and prompt them to login or reset their password. Although this only works when using email addresses as the username, which I think is advisable for this reason.
There is nothing wrong with creating some immediately consistent read models (e.g. not over a distributed network) that get updated in the same transaction as the command.
Having read models be eventually consistent over a distributed network helps support scaling of the read model for heavy reading systems. But there's nothing to say you can't have a domain specific read model that's immediately consistent.
The immediately consistent read model is only ever used to check data before issuing a command; you should never use it for directly displaying read data to a user (i.e. from a GET web request or similar). Use eventually consistent, scalable read models for that.
About uniqueness, I implemented the following:
A first command like "StartUserRegistration". The UserAggregate would be created regardless of whether the username is unique, but with a status of RegistrationRequested.
On "UserRegistrationStarted", an asynchronous message, something like "RegisterName", would be sent to a stateless service "UsernamesRegistry".
The service would try to update (no queries, "tell, don't ask") a table that includes a unique constraint.
If successful, the service would reply with another message (asynchronously), a sort of authorization "UsernameRegistration", stating that the username was successfully registered. You can include some requestId to keep track in case of concurrent competition (unlikely).
The issuer of the above message now has an authorization that the name was registered by itself, so it can safely mark the UserRegistration aggregate as successful. Otherwise, mark it as discarded.
Wrapping up:
This approach involves no queries.
User registration would be always created with no validation.
Process for confirmation would involve two asynchronous messages and one db insertion. The table is not part of a read model, but of a service.
Finally, one asynchronous command to confirm that User is valid.
At this point, a denormaliser could react to a UserRegistrationConfirmed event and create a read model for the user.
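The core of this flow, an insert guarded by a unique constraint instead of a query, can be sketched with Python's built-in sqlite3 module. The messaging is left out, and the table layout, the confirm_registration helper and the "UserRegistrationDiscarded" name are assumptions made for the illustration.

```python
import sqlite3


class UsernamesRegistry:
    """Service-owned table with a unique constraint; not part of any read model."""

    def __init__(self) -> None:
        self._db = sqlite3.connect(":memory:")
        self._db.execute(
            "CREATE TABLE usernames (name TEXT PRIMARY KEY, request_id TEXT NOT NULL)"
        )

    def register_name(self, name: str, request_id: str) -> bool:
        """Tell, don't ask: just insert and let the constraint decide."""
        try:
            with self._db:  # commits on success, rolls back on error
                self._db.execute(
                    "INSERT INTO usernames (name, request_id) VALUES (?, ?)",
                    (name, request_id),
                )
            return True
        except sqlite3.IntegrityError:
            return False  # another registration already holds this name


def confirm_registration(registry: UsernamesRegistry, username: str, request_id: str) -> str:
    """Return the name of the event the registration aggregate would raise."""
    if registry.register_name(username, request_id):
        return "UserRegistrationConfirmed"
    return "UserRegistrationDiscarded"  # hypothetical event name


registry = UsernamesRegistry()
first = confirm_registration(registry, "alice", "req-1")
second = confirm_registration(registry, "alice", "req-2")
```

Two concurrent requests for the same name both reach the insert, but the database lets only one succeed, which is what makes the earlier "create without validation" step safe.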
Like many others implementing an event-sourced system, we encountered the uniqueness problem.
At first I was a supporter of letting the client access the query side before sending a command in order to find out if a username is unique or not. But then I came to see that having a back-end that has zero validation on uniqueness is a bad idea. Why enforce anything at all when it's possible to post a command that would corrupt the system? A back-end should validate all its input, or else you're open to inconsistent data.
What we did was create an index table at the command side. For example, in the simple case of a username that needs to be unique, just create a user_name_index table containing the field(s) that need to be unique. Now the command side is able to query a username's uniqueness. After the command has been executed it's safe to store the new username in the index.
Something like that could also work for the Order discount problem.
The benefits are that your command back-end properly validates all input so no inconsistent data could be stored.
A downside might be that you need an extra query for each uniqueness constraint and you are enforcing extra complexity.
I think for such cases, we can use a mechanism like "advisory lock with expiration".
Sample execution:
Check whether the username exists in the eventually consistent read model.
If it does not exist, try to push the username as the key, with some expiration, into a key-value store or cache such as Redis or Couchbase.
If successful, raise userRegisteredEvent.
If the username exists in either the read model or the cache storage, inform the visitor that the username is taken.
You can even use an SQL database: insert the username as the primary key of some lock table, and let a scheduled job handle expirations.
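The steps above can be sketched with an in-memory stand-in for the key-value store. This is a minimal illustration under assumptions: ExpiringReservations and register are hypothetical names, the dictionary plays the role of a store offering "set if absent, with expiry", and the clock is injectable so that expiry can be exercised without waiting.

```python
import threading
import time
from typing import Callable, Dict, Set


class ExpiringReservations:
    """In-memory stand-in for a key-value store with set-if-absent plus expiry."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}
        self._lock = threading.Lock()

    def try_reserve(self, key: str) -> bool:
        """Reserve the key unless a live (unexpired) reservation already exists."""
        with self._lock:
            now = self._clock()
            expiry = self._expires_at.get(key)
            if expiry is not None and expiry > now:
                return False
            self._expires_at[key] = now + self._ttl
            return True


def register(username: str, read_model: Set[str],
             reservations: ExpiringReservations) -> str:
    """Check the read model first, then take the advisory lock."""
    if username in read_model or not reservations.try_reserve(username):
        return "taken"
    return "userRegisteredEvent"


now = [0.0]  # fake clock value, advanced manually below
reservations = ExpiringReservations(ttl_seconds=30, clock=lambda: now[0])
first = register("alice", set(), reservations)
second = register("alice", set(), reservations)
now[0] = 31.0  # the reservation has expired by now
third = reservations.try_reserve("alice")
```

The expiration should be long enough for the read model to catch up; after that, the read model check in the first step takes over from the reservation.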
Have you considered using a "working" cache as sort of an RSVP? It's hard to explain because it works in a bit of a cycle, but basically, when a new username is "claimed" (that is, the command was issued to create it), you place the username in the cache with a short expiration (long enough to account for another request getting through the queue and denormalized into the read model). If it's one service instance, then in memory would probably work, otherwise centralize it with Redis or something.
Then while the next user is filling out the form (assuming there's a front end), you asynchronously check the read model for availability of the username and alert the user if it's already taken. When the command is submitted, you check the cache (not the read model) in order to validate the request before accepting the command (before returning 202); if the name is in the cache, don't accept the command, if it's not then you add it to the cache; if adding it fails (duplicate key because some other process beat you to it), then assume the name is taken -- then respond to the client appropriately. Between the two things, I don't think there'll be much opportunity for a collision.
If there's no front end, then you can skip the async look up or at least have your API provide the endpoint to look it up. You really shouldn't be allowing the client to speak directly to the command model anyway, and placing an API in front of it would allow you to have the API to act as a mediator between the command and read hosts.
It seems to me that perhaps the aggregate is wrong here.
In general terms, if you need to guarantee that value Z belonging to Y is unique within set X, then use X as the aggregate. X, after all, is where the invariant really exists (only one Z can be in X).
In other words, your invariant is that a username may only appear once within the scope of all of your application's users (or could be a different scope, such as within an Organization, etc.). If you have an aggregate "ApplicationUsers" and send the "RegisterUser" command to that, then you should be able to have what you need in order to ensure that the command is valid prior to storing the "UserRegistered" event. (And, of course, you can then use that event to create the projections you need in order to do things such as authenticate the user without having to load the entire "ApplicationUsers" aggregate.)
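As a rough sketch of that idea, an event-sourced "ApplicationUsers" aggregate could rebuild its set of usernames from past "UserRegistered" events and reject duplicates before emitting a new one. The method names, the UsernameTaken exception and the uncommitted list are assumptions for illustration, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set


@dataclass(frozen=True)
class UserRegistered:
    user_id: str
    username: str


class UsernameTaken(Exception):
    pass


class ApplicationUsers:
    """Aggregate that owns the invariant: a username appears at most once."""

    def __init__(self, history: Iterable[UserRegistered] = ()) -> None:
        self._usernames: Set[str] = set()
        self.uncommitted: List[UserRegistered] = []
        for event in history:  # rehydrate state from stored events
            self._apply(event)

    def register_user(self, user_id: str, username: str) -> UserRegistered:
        """Handle the RegisterUser command."""
        if username in self._usernames:
            raise UsernameTaken(username)
        event = UserRegistered(user_id, username)
        self._apply(event)
        self.uncommitted.append(event)  # to be appended to the event store
        return event

    def _apply(self, event: UserRegistered) -> None:
        self._usernames.add(event.username)


users = ApplicationUsers(history=[UserRegistered("u1", "alice")])
new_event = users.register_user("u2", "bob")
try:
    users.register_user("u3", "alice")
    rejected = False
except UsernameTaken:
    rejected = True
```

The trade-off is that every registration goes through one aggregate, so concurrent registrations contend on a single event stream.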

How does XEP-0114 work?

I am a bit confused about how XEP-0114 works. Does servicing a domain using a component mean that the server will no longer do anything on behalf of that domain, or does it just mean that the component will ALSO be allowed to service all users on that domain?
More specifically, is it possible to have multiple components servicing the same domain? For example, one component could handle MUC, another could store all messages in a history store, and a third could handle the roster, etc... All while the XMPP server continues handling the user like it normally would - and replying to presence, iq packets, etc... What this means is that components would have to be written so that their realm doesn't intersect with each other.
Answering @dhruvbird's second question in the comments above, if you have delegated a domain to your XEP-0114 component, that component is responsible for everything about that domain, including all of the presence states of the users in that domain. That is possible, if tedious, but make sure you've read the new RFC 6121 recently.
Note: most servers have a component that implements all of this presence subscription logic - it's where the real IM business logic is implemented. You'll effectively be writing a replacement for that logic, so make sure there's no other way to solve your problem first.
