Handling Eventual Consistency fail between aggregates - domain-driven-design

I am a beginner in DDD and I came across a situation that involves a rule of not modifying more than 1 aggregate in the same transaction, using Domain Events to resolve changes in other aggregates. (see Effective Aggregate Project).
The situation is as follows: The user schedules to transfer a patient to another hospital. When the transfer time comes, the user selects it in a list and clicks 'Start'. However, this action changes three aggregates:
Transfer: marked as started. ex: transfer.Start();
Patient: is marked as being transferred. ex: patient.MarkAsInTransfer();
Hospital: you must reserve a place for the patient who is now coming. ex: hospita;.ReservePlace(patient);
Thus, when transfer starts, it raise an event TransferStarted.
But, for some reason, when the transfer is already occurring, an error occurs when handling the TransferStarted event (changing the patient's status or reserving a place in destination hospital).
How to deal with this situation, since the patient is already in transfer? I need to forget and use transactional consistency, modifying three aggregates in the same transaction? Using a Domain Service to do it?
Remembering that I am following an aggregate transaction rule.

How to deal with this situation, since the patient is already in transfer? I need to forget and use transactional consistency, modifying three aggregates in the same transaction? Using a Domain Service to do it?
There are a couple of aspects to what's going on here.
1) From your description, you are dealing with entities out in the real world; the book of record for the real world is the real world, not your domain model. So when you receive a "domain event" from the real world, you need to treat it appropriately.
2) collaborative domains, with contributions from multiple resources out in the real world, are inherently "eventually consistent". The people over here don't know what's going on over there, and vice versa -- they can only act on the information they have locally, and report faithfully what they are doing.
What this means, in practice, is that you need to be thinking about your "aggregates" as bookkeeping about what's going on in the real world, and documenting actions that conflict with policy as they occur (sometimes referred to as "exception reports").
3) Often in the case of collaborative processes, the "aggregate" is the instance of the process itself, rather than the entities participating in it.
How to deal with this situation, since the patient is already in transfer?
You invoke the contingency protocol provided to you by the domain experts.
A way to think of it is to imagine a bunch of SMS messages going around. You get a message from the attending announcing that the transfer is starting, and the moments later you get a message from the destination hospital that it is in lockdown.
Now what?
Well, I'm not sure - it isn't my domain. But it's probably something like sending a message to the attending to announce that the destination has been closed.
The important things to notice here are that (a) conflicting things happening in different places is a property of distributed collaborative systems, and you have to plan for it -- the race conditions are real and (b) the information you have about the state of affairs anywhere else is always stale, and subject to revision.
Take a careful read of Data on the Outside versus Data on the Inside. The real world is outside, all of the information you have about it is stale. Also, review Memories, Guesses, and Apologies.

When I face this sort of issue, the first thing I ask myself is: "are my aggregates correct, and do they have the right responsibilities"? After all, an aggregate is a transaction boundary which encapsulates the data and the business logic for a given process. If a process needs 3 aggregates, is it possible that they are, in fact, a single aggregate?
In your particular case, a Transfer sounds like an aggregate to me (as in, there must be some business rules to enforce and some data related to it), but Hospital and Patient look suspicious to me. What kind of data and business logic do they encapsulate and are in charge of? It obviously depends on the bounded context these aggregates are in, but it's something I would double-check. I assume though that they are all in the same BC.
For example, I would consider: why does a Patient need to be marked as in transfer? what kind of business rule does it enforce? If the reason is to avoid a Patient being transferred more than once, then it shouldn't be the listening an event from the Transfer (where does the transfer come from?), instead, it should be the one creating transfers (see Udi Dahan's Don't create aggregate roots). So, if you want to transfer a Patient, do Patient.TransferTo(otherHospital), which will check if the conditions are met to initiate a transfer and, if they are, send a Command to create a transfer. The Patient then can listen to TransferStarted, TransferCancelled, TransferCompleted events to update its own state safely (as a new transfer won't start until the previous one is completed either successfully or not).
From this point, the Hospital Room allocation would be something between the Transfer and the Hospital and the Patient doesn't need to know anything about it.
But regarding the Hospital, I don't know at this point, because it doesn't seem right to me that a single aggregate manages the room allocations of a whole Hospital. It seems a lot of data and responsibility and also, it's not clear to me that there's the need for transactionality: why the allocation of Patient A in Room 100 has to be transactional with the allocation of Patient B in Room 210? If you imagine the full Hospital, that's a lot of rooms and patients and a lot of contingency in a single aggregate, which won't allow concurrent changes. Obviously, I don't have enough knowledge of the domain and details to make a suggestion, these are only considerations.

Related

Update two aggregate instances in a single transaction

Lets say we have an Account aggregate for a banking service. Someone wants to transfer money from their Account to another person's Account. There are a number of rules: payer needs to have enough money in their Account, and the payee's Account must be active. If these rules pass then the balance on both Accounts are updated. In a traditional system this can easily be done in a single acid db transaction.
In DDD this would not be allowed, as we can't update two aggregate instances in a single transaction? Firstly, why? Secondly, does that mean using eventual consistency to handle the two Accounts? If so, I can see how that can be done, but it adds a lot of complexity.
In DDD this would not be allowed
Not really true - there's a lot going on here.
What Evans (2003), and also Vaughn 2013, wrote is that transaction management is not a domain model concern, but rather that transaction control belongs in the application code.
There is, however, a real concern with changing multiple aggregates at the same time: to do so assumes that you can acquire locks on those entities at the same time and also commit all of those changes together.
That's relatively straightforward when all of the aggregates that you are changing are stored in a single relational database; but it becomes very difficult when the aggregates are stored in different places.
If you design your system such that it assumes that all aggregates are stored together, then you greatly restrict your scaling options.
Be careful not to overuse the ability to commit modifications to multiple Aggregates in a single transaction just because it works in a unit test environment -- Vaughn 2013.
does that mean using eventual consistency to handle the two Accounts
That, or changing how you model your aggregates. Sometimes both.
For instance, it's somewhat common to have aggregates that handle (short lived) processes, which are different from the long lived aggregates.
When I look at my credit card statement, a charge will normally fall in one of three states: it's not yet posted to my statement (not visible), or it's pending (visible), or it is actually posted as a charge (visible). Clearly, there is stuff going on "somewhere else", and that information is eventually copied to my statement where I can see it.
I can see how that can be done, but it adds a lot of complexity.
Yup. If it wasn't complicated/complex, we wouldn't be creating our own model; we'd instead be buying some general purpose solution off the self.
Greg Young talked about this in a 2011 presentation: domain driven design makes sense in places where we can derive a competitive advantage from the work we are doing. In other words, we are using it in places where giving the business control over that complexity improves the bottom line.
Making sure you are working on the correct side of the build versus buy line is an important step. Don't skip it.

DDD - Relaxing the rule of Eventual Consistency between aggregate

I`m reading the book PATTERNS, PRINCIPLES, AND PRACTICES OF DOMAIN-DRIVEN DESIGN, written by Scott Millett with Nike Tune. In the chapter 19, Aggregates, he states:
Sometimes it is actually good practice to modify multiple aggregates within a transaction. But it’s
important to understand why the guidelines exist in the first place so that you can be aware of the
consequences of ignoring them.
When the cost of eventual consistency is too high, it’s acceptable to consider modifying two objects in the same transaction. Exceptional circumstances will usually be when the business tells you that the customer experience will be too unsatisfactory.
To summarize, saving one aggregate per transaction is the default approach. But you should
collaborate with the business, assess the technical complexity of each use case, and consciously ignore
the guideline if there is a worthwhile advantage, such as a better user experience.
I face to a case in my project when user request a operation to my app and this operation affects two aggregate, and there are rules that must be verified by the two aggregates for the operation takes place successfully.
it is something like "Allocating a cell for a detainee":
the user makes the request
the Detainee (AR1) is fetched from database and receives a command: detainee.AllocateTo(cellId);
3 the Cell (AR2) is fetched and receive a command: cell.Allocate(detaineeId);
Both steps 2 and 3 could throw an exception, depending on the detainee's status and cell capacity. But abstract it.
Using eventual consistency, if step 2 is executed successfully, emiting the event DetaineeAllocated, but step 3 fails (will run in another transaction, inside an event handler), the state of aggregates will be inconsistent, and worse, the operation seemed to be executed successfully for the user.
I know that there are cases like "when the user makes a purchase over $ 100, its type must be changed to VIP" that can be implemented using eventual consistency, but the case I mentioned does not seem to be one.
Do you think that this is a special case that the book mentions?
Each aggregate must not have an invalid state (internal state), but that does not imply aggregates have to be consistent with one another (external, or system state).
Given the context of your question, the answer could be either yes or no.
The Case for No
The external state can become eventually consistent, which may be acceptable to your product owner. In this case you design ways to detect the inconsistency and deal with it (e.g. by retrying operations, issuing compensating transactions, etc.)
The Case for Yes
In your orchestration layer, go ahead and update the aggregates in a transaction. You might choose to do this because it's "easy" and "right", or you might choose to do this because your product owner says the inconsistency can't be tolerated for whatever reason.
Another Case for No
There's another way out for saying this is not a special case, not a reason for more than one transaction. That way out requires a change to your model. Consider removing the mutual dependency between your detainee and the cell, and instead introducing another aggregate, CellAssignment, which represents a moment-interval (a temporal relationship) that can be constructed and saved in a single transaction. In this case, your detainee and the cell don't change.
"the state of aggregates will be inconsistent"
Well, it shouldn't be inconsistent forever or that wouldn't be eventual consistency. You would normally discuss with business experts to establish an acceptable consistency timeframe.
Should something go wrong an event will be raised which should trigger compensating actions and perhaps a notification to a human stating something went wrong after-all.
Another approach could be to introduce a process manager which is responsible to carry out the business process by triggering commands and listening to events, until completion or timeout. The ARs are often designed to allow small incremental steps
towards consistency. For instance, there could be a command to reserve cell space first rather than directly allocating the detainee. The UI could always poll the state of the process to know when it's complete if necessary.
Eventual consistency obviously comes at a cost. If you have a single DB in a monolith that doesn't need extreme scalability you could very well favor to modify both ARs in a single transaction until that becomes a problem.
Eventual consistency is often sold as less costly that strong consistency, but I believe that's mostly for distributed systems where you'd have to deal with XA transactions.
Do you think that this is a special case that the book mentions?
No.
What I suspect you have here is a modeling error.
From your description, it sounds like you are dealing with something like a CellAssignment, and the invariant that you are trying to maintain is to ensure that there are no conflicts among active cell assignments.
That suggests to me that you are missing some sort of aggregate - something like a seating chart? - that keeps track of all of the active assignments and conflicts.
How can you know? One way is to graph your aggregates; create a node for each piece of information you need to save, and join nodes with lines if there is a rule that requires locking both nodes. If you find yourself with disconnected graphs, or two graphs that only connect at the root id, then it's a good bet that separating some information into a new graph will improve your modeling.
All Our Aggregates Are Wrong, by Mauro Servienti, would be a good talk to review.

Should a single command address multiple aggregates?

In CQRS and DDD, an aggregate is a transactional boundary. Hence I have been modeling commands always in such a way that each command always only ever addresses a single aggregate. Of course, technically, it would be possible to write a command handler that addresses multiple aggregates, but that would not be within a single transaction and hence would not be consistent.
If you actually have to address multiple aggregates, I usually go with a process manager, but this sometimes feels like pretty much overhead. In addition, from my understanding a process manager always only reacts to domain events, it is not directly addressed by commands. So you need to decide which aggregate to put the starting point to.
I have seen that some people solve this using so-called domain or application services, which can receive commands as well, and then work on multiple aggregates – but in this case the transactional nature of the process gets lost.
To give a simple example, to better illustrate the scenario:
A user shall join a group.
A user has a max number of groups.
A group has a max number of users.
Where to put the command that triggers the initial joining process, and what to call it? user.join(group) feels as right or wrong as group.welcome(user). I'd probably go for the first one, because this is closer to the ubiquitous language, but anyway…
If I had something above the aggregates, like the aforementioned services, then I could run something such as:
userManagement.addUserToGroup(user, group);
However, this addUserToGroup function would then need to call both commands, which in turn means it has to take care of both commands being processed – which is somewhat counterintuitive to having separate aggregates at all, and having aggregates as transactional boundaries.
What would be the correct way to model this?
It may be worth reviewing Greg Young on Eventual Consistency and Set Validation.
What is the business impact of having a failure
This is the key question we need to ask and it will drive our solution
in how to handle this issue as we have many choices of varying degrees
of difficulty.
And certainly Pat Helland on Memories, Guesses, and Apologies.
Short version: the two generals tell us that, if two pieces of information must be consistent, then we need to write both pieces of information in the same place. The "invariant" constrains our data model.
The invariant you describe is effectively a couple of set validation problems: the "membership" collection allows only so many members with user A, and only so many members with group B. And if you really are in a "we go out of business if those rules are violated" situation, then you cannot distribute the members of that set -- you have to lock the entire set when you modify it to ensure that the rule is not broken and that first writer wins.
An element that requires some care in your modeling: is the domain model the authority for membership? or is the "real world" responsible for membership and the domain model is just caching that information for later use? You want to be very careful about trying to enforce an invariant on the real world.
There's a risk that you end up over constraining the order in which information is accepted by the model.
Essentially what you have is many to many relationships between users and groups with restrictions on both sides:
Restriction on the number of groups a user can join
Restriction on the number of users a group can have
VoiceOfUnreason already gave a great answer, so I'll share one way I've solved similar problems and go straight to the model and implementation in case you have to ensure that these constraints are enforced at all costs. If you don't have to, do not make the model and implementation that complex.
Ensuring consistency with such constraints on both Group and User entities will be difficult in a single operation because of the concurrency of the operations.
You can model this by adding a collection of RegisteredUsers to a Group or vice versa, adding a collection of JoinedGroups to a User, and enforce the constraint on one side, but enforcing it on the other side is still an issue.
What you can do is introduce another concept in your domain. The concept of a
"Slot" in a Group. "Slots" are limited by the max number of Slots for a Group.
Then a User will issue a JoinGroupRequest that can be Accepted or Rejected.
A Slot can be either Taken or Reserved. Then you can introduce the concept of SlotReservation. The process of joining a User to a Group will be:
Issue a JoinGroupRequest from a User
Try to Reserve a Slot enforcing the MaxUsersPerGroup constraint.
Acquire a Slot or Reject the SlotReservation of a User enforcing the MaxGroupsPerUser constraint.
Accept or Reject the JoinGroupRequest depending on the outcome of the SlotReservation
If the SlotReservation is Rejected, another User will be able to use this Slot later.
For the implementation, you can add SlotReservation Queue Per Group to ensure that once a Slot is free after a Rejected SlotReservation, the next User that wants to join the Group will be able to.
For the implementation, you can add a collection of Slots to a Group, or you can make Slot an aggregate in its own right.
You can use a Saga for this process. The Saga will be triggered when a JoinGroupRequest is made by a User.
Essentially, this operation becomes a Tentative Operation.
For more details take a look and the Accountability Pattern and Life beyond distributed transactions an apostate's opinion and Life beyond distributed transactions an apostate's implementation.

Enforce invariants spanning multiple aggregates (set validation) in Domain-driven Design

To illustrate the problem we use a simple case: there are two aggregates - Lamp and Socket. The following business rule always must be enforced: Neither a Lamp nor a Socket can be connected more than once at the same time. To provide an appropriate command we conceive a Connector-service with the Connect(Lamp, Socket)-method to plug them.
Because we want to comply to the rule that one transaction should involve only one aggregate, it's not advisable to set the association on both aggregates in the Connect-transaction. So we need an intermediate aggregate which symbolizes the Connection itself. So the Connect-transaction would just create a new Connection with the given components. Unfortunately, at this point the troubles begin; how can we ensure the consistency of connection-state? It may happen that many simultaneous users want to plug the same components at the exact same time, so our "consistency check" wouldn't reject the request. New Connection-aggregates would be stored, because we only lock at aggregate-level. The system would be inconsistent without even knowing that.
But how should we set the boundary of our aggregates to ensure our business rule? We could conceive a Connections-aggregate which gathers all active connections (as Connection-entity), thereby enabling our locking-algorithm which would properly reject duplicate Connect-requests. On the other hand this approach is inefficient and does not scale, further it is counter-intuitive in terms of domain language.
Do you know what I'm missing?
Edit: To sum up the problem, imagine an aggregate User. Since the definition of an aggregate is to be a transaction-based unit we are able to enforce invariants by locking this unit per transaction. All is fine. But now a business rule arises: the username must be unique. Therefore we must somehow reconcile our aggregate boundaries with this new requirement. Assuming millions of users registering at the same time, it becomes a problem. We try to ensure this invariant in a non-locked state since multiple users means multiple aggregates.
According to the book "Domain-driven Design" by Eric Evans one should apply eventual consistency as soon as multiple aggregates are involved in a single transaction. But is this really the case here and does is make sense?
Applying eventual consistency here would entail registering the User and afterwards checking the invariant with the username. If two Users actually set the same username the system would undo the second registering and notify the User. Thinking about this scenario disconcerts me because it disrupts the whole registering process. Sending the confirmation e-mail, for example, had to be delayed and so forth.
I think I'm just forgetting about something in general but I don't know what. It seems to me that I need something like invariants on Repository-level.
We could conceive a Connections-aggregate which gathers all active
connections (as Connection-entity), thereby enabling our
locking-algorithm which would properly reject duplicate
Connect-requests. On the other hand this approach is inefficient and
does not scale, further it is counter-intuitive in terms of domain
language
On the contrary, I think you're on the right track with this approach. It seems convoluted because you're using an example that doesn't make any sense - there is no real-life system that checks if a lamp is connected to more than one socket or a socket to more than one lamp.
But applying that approach to the second example would lead you to ask yourself what the "connection" aggregate is in that case, i.e. inside which scope a user name is unique. In a Company? For a given Tenant or Customer? For the whole <whatever-subdomain-youre-in>System? Find the name of the scope and there you have it - an Aggregate to enforce the unique name invariant. Choose the name carefully and if it doesn't exist in the ubiquitous language yet, invent a new concept with the help of a domain expert. DDD is not only about respecting existing domain terms, you're also allowed to introduce new ones when Breakthroughs are achieved.
Sometimes though, you will find that concurrent access to this aggregate is too intensive and generates problematic contention. With domain expert assent, you can introduce eventual consistency with a compensating action in case of conflict - appending a suffix to the nickname and notifying the user, for instance. Or you can split the "hot" aggregate into smaller, smarter, more efficient ones.
The problem you are describing is called set validation. Greg Young makes a very good point that a key question is whether or not the cost/benefit analysis justifies enforcing this constraint in code.
But let's suppose it does....
I find it's most useful to think about set validation from the perspective of an RDBMS. How would we handle this problem if we were doing things with tables? A likely candidate is that we would have some sort of connection table, with foreign keys for the Lamp and the Socket. Then we would define constraints that would say that each of those foreign keys must be unique in the table.
Those foreign key constraints span the entire table; which is the database's way of telling us that the entire table represents a single aggregate.
So if you were going to lift those constraints into your domain model, you would do so by making an aggregate of all connections, so that the domain model can immediately rule on whether or not a given Lamp-Socket connection should be allowed.
Now, there's an important caveat here -- we're assuming that the domain model is the authority for connections between lamps and sockets. If we are modeling lamps in the real world connected to sockets in the real world, then its important to recognize that the real world is the authority, not the model.
Put another way, if the domain model gets conflicting information about the real world (two lamps are reportedly connected to the same socket), the model only knows that its information about the world is incorrect -- maybe the first lamp was plugged in, maybe the second, maybe there's a message missing about a lamp being unplugged. So in this sort of case, it's common that you'll want to allow the conflict, with an escalation to a human being for resolution.
the username must be unique
This is the single most commonly asked variation of the set validation problem.
The basic remedy is the same: you now have a User Profile aggregate, with an identifier, and a separate user name directory aggregate, which ensures that each name is uniquely associated with a profile.
If you aren't worried that a profile has at most one user name linked to it, then there is another approach you can take, which is to introduce an aggregate for each user name, which includes the profileId as a member. Thus, each aggregate can enforce the constraint that the name can only be assigned if the previous assignment was terminated.
I think I'm just forgetting about something in general but I don't know what.
Only that constraints don't come from nowhere -- there should be a business motivation for them; and somebody (the domain expert) should be able to document the cost to the business of failing to maintain the proposed set constraint.
For instance, if you are already collecting an email address, do you really need a unique username? How much additional value are you creating by including username in the model? How much more by making it unique...?
If we plan an online game, for example, with millions of users which request games constantly, it's a real problem.
Yes, it is; but that may indicate that the game design is wrong. Review Udi Dahan's discussion of high contention domains, and his essay Race Conditions Don't Exist.
A thing to notice, however, is that if you really have an aggregate, you can scale it independently from the rest of your system. One monster box is dedicated to managing the set aggregate and nothing else (analog: an RDBMS dedicated to managing a single table).
A more likely choice is going to be sharding by realm/instance/whatzit; in which case you'd have a smaller set aggregate for each realm instance.
In addition to the suggestions already made, consider that some of these problems are very similar to database concurrency problems. Say that you have a contact, and one user changes the name, and another user changes the phone number for this contact. If you write a command that updates the whole contact with the state as it was after modification, then one of the two will overwrite the change of the other with the old value, unless measures are taken.
If, however, you write a 'ChangeEmailForContact' command, then already you will only change that one field and not have a conflict with the name change, which would similarly be a 'Name' or 'RenameContact' command.
Now what if two people change the email address shortly after the other? A really efficient way is to pass the original value (original email address) along with the new value in your command. Now you can check when updating the email address if the original email address was the same as the current email address (so it is a valid starting point), or if the new email address is the same as the current email address (no need to do anything). If not, then, only then, are you in a conflict situation.
Now, apply this to your 'set operation'. The first time a lightbulb is moved into a 'connection' (perhaps I would call it fixture), it is moving from unassigned to connection1. Then, when a lightbulb is moved, it must be moved from connection1 to connection2, say. Now you can validate if that lightbulb is already assigned, if it was assigned to connection1 or if something has changed in the meantime.
It doesn't solve everything of course, but for the tiny case that remains, that tiny moment where two initial assignments happen close enough together, you either have to go for say a redis cache of assigned usernames to validate against or give an admin an easy tool to solve this very rare instance. You could for instance make a projection that occasionally reports on such situations and make sure renaming isn't too painful.

What if domain event failed?

I am new to DDD. Now I was looking at the domain event. I am not sure if I understand this domain event correctly, but I am just thinking what will happen if domain event published failed?
I have a case here. When a buyer order something from my website, firstly we will create a object, Order with line of items. The domain event, OrderWasMade, will be published to deduct the stock in Inventory. So here is the case, what if when the event was handled, the item quantity will be deducted, but what if when the system try to deduct the stock, it found out that there is no stock remaining for the item (amount = 0). So, the item amount can't be deducted but the order had already being committed.
Will this kind of scenario happen?
Sorry to have squeeze in 2 other questions here.
It seems like each event will be in its own transaction scope, which means the system requires to open multiple connection to database at once. So if I am using IIS Server, I must enable DTC, am I correct?
Is there any relationship between domain-events and domain-services?
A domain event never fails because it's a notification of things that happened (note the past tense). But the operation which will generate that event might fail and the event won't be generated.
The scenario you told us shows that you're not really doing DDD, you're doing CRUD using DDD words. Yes, I know you're new to it, don't worry, everybody misunderstood DDD until they got it (but it might take some time and plenty of practice).
DDD is about identifying the domain model abstraction, which is not code. Code is when you're implementing that abstraction. It's very obvious you haven't done the proper modelling, because the domain expert should tell you what happens if products are out of stock.
Next, there's no db/acid transactions at this level. Those are an implementation detail. The way DDD works is identifying where the business needs things to be consistent together and that's called an aggregate.
The order was submitted and this where that use case stops. When you publish the OrderWasMadeevent, another use case (deducting the inventory or whatever) is triggered. This is a different business scenario related but not part of "submit order". If there isn't enough stock then another event is published NotEnoughInventory and another use case will be triggered. We follow the business here and we identify each step that the business does in order to fulfill the order.
The art of DDD consists in understanding and identifying granular business functionality, the involved aggregates, business behaviour which makes decisions etc and this has nothing to do the database or transactions.
In DDD the aggregate is the only place where a unit of work needs to be used.
To answer your questions:
It seems like each event will be in its own transaction scope, which means the system requires to open multiple connection to database at once. So if I am using IIS Server, I must enable DTC, am I correct?
No, transactions,events and distributed transactions are different things. IIS is a web server, I think you want to say SqlServer. You're always opening multiple connections to the db in a web app, DTC has nothing to do with it. Actually, the question tells me that you need to read a lot more about DDD and not just Evans' book. To be honest, from a DDD pov it doesn't make much sense what you're asking.. You know one of principles of DD: the db (as in persistence details) doesn't exist.
Is there any relationship between domain-events and domain-services
They're both part of the domain but they have different roles:
Domain events tell the world that something changed in the domain
Domain services encapsulate domain behaviour which doesn't have its own persisted state (like Calculate Tax)
Usually an application service (which acts as a host for a business use case) will use a domain service to verify constraints or to gather data required to change an aggregate which in turn will generate one or more events. Aggregates are the ones persisted and always, an aggregate is persisted in an atomic manner i.e db transaction / unit of work.
what will happen if domain event published failed?
MikeSW already described this - publishing the event (which is to say, making it part of the history) is a separate concern from consuming the event.
what if when the system try to deduct the stock, it found out that there is no stock remaining for the item (amount = 0). So, the item amount can't be deducted but the order had already being committed.
Will this kind of scenario happen?
So the DDD answer is: ask your domain experts!
If you sit down with your domain experts, and explore the ubiquitous language, you are likely to discover that this is a well understood exception to the happy path for ordering, with an understood mitigation ("we mark the status of the order as pending, and we check to see if we've already ordered more inventory from the supplier..."). This is basically a requirements discovery exercise.
And when you understand these requirements, you go do it.
Go do it typically means a "saga" (a somewhat misleading and overloaded use of the term); a business process/workflow/state machine implementation that keeps track of what is going on.
Using your example: OrderWasMade triggers an OrderFulfillment process, which tracks the "state" of the order. There might be an "AwaitingInventory" state where OrderFulfillment parks until the next delivery from the supplier, for example.
Recommended reading:
http://udidahan.com/2010/08/31/race-conditions-dont-exist/
http://udidahan.com/2009/04/20/saga-persistence-and-event-driven-architectures/
http://joshkodroff.com/blog/2015/08/21/an-elegant-abandoned-cart-email-using-nservicebus/
If you need the stock to be immediately consistent at all times, a common way of handling this in event sourced systems (can also in non-event based systems, this is orthogonal really) is to rely on optimistic locking at the event store level.
Events basically have a revision number that they expect the stream of events to be at to take effect. Once the event hits the persistent store, its revision number is checked against the real stream number and if they don't match, a conflict exception is raised and the transaction is aborted.
Now as #MikeSW pointed out, depending on your business requirements, stock checking can be an out-of-band process that handles the problem in an eventually consistent way. Eventually can range from milliseconds if another part of the process takes over immediately, to hours if an email is sent with human action needing to be taken.
In other words, if your domain requires it, you can choose to trade this sequence of events
(OrderAbortedOutOfStock)
for
(OrderMade, <-- Some amount of time --> OrderAbortedOutOfStock)
which amounts to the same aggregate state in the end

Resources