How to handle a down messaging queue in event-driven microservices?

Assume there are two services A and B, in a microservice environment.
In between A and B sits a messaging queue M that is a broker.
A<---->'M'<----->B
The problem is what if the broker M is down?
Possible solution I can think of:
Service A pings the messaging queue M at regular intervals for as long as it is down. In the meantime, service A stores the data in a local DB and dumps it into the queue once the broker M is back up.
Considering the above problem, I would be grateful if someone could suggest whether threads or reactive programming is better suited for this scenario, and how it could be handled in code.

The problem is what if the broker M is down?
If the broker is down, then A and B can't use it to communicate.
What A and B should do in that scenario is going to depend very much on the details of your particular application/use-case.
Is there useful work they can do in that scenario?
If not, then they might as well just stop trying to handle any work/transactions for the time being, and instead just sit and wait for M to come back up. Having them do periodic pings/queries of M (to see if it's back yet) while in this state is a good idea.
If they can do something useful in this scenario, then you can have them continue to work in some sort of "offline mode", caching their results locally in anticipation of M's re-appearance at some point in the future. Of course this can become problematic, especially if M doesn't come back up for a long time -- e.g.
what if the set of cached local results becomes unreasonably large, such that A/B runs out of space to store it?
Or what if A and B cache local results that will both apply to the same data structure(s) within M, such that when M comes back online, some of A's results will overwrite B's (or vice-versa, depending on the order in which they reconnect)? (This is analogous to the sort of thing that source-code-control servers have to deal with after several developers have been working offline, both making changes to the same lines in the same file, and then they both come back online and want to commit their changes to that file. It can get a bit complex and there's not always an obvious "correct" way to resolve conflicts)
Finally what if it was something A or B "said" that caused M to crash in the first place? In that case, re-uploading the same requests to M after it comes back online might only cause it to crash again, and so on in an infinite loop, making the service perpetually unusable. (In this case, of course, the proper fix would be to debug M)
Another approach might be to try to avoid the problem by having multiple redundant brokers (e.g. M1, M2, M3, ...) such that as long as at least one of them is still available, productive work can continue. Or perhaps allow A and B to communicate with each other directly rather than through an intermediary.
As for whether this sort of thing would best be handled by threads or reactive programming, that's a matter of personal preference -- personally I prefer reactive programming, because the multiple-threads style usually means blocking RPC operations, and a thread that is blocked inside a blocking operation is a frozen/helpless thread until the remote party responds. For example, if M takes 2 minutes to respond to an RPC request, then A's RPC call to M cannot return for 2 minutes, which means the calling thread is unable to do anything at all for those 2 minutes. In a reactive approach, A's thread could also be doing other things during that period (such as pinging M to make sure it's okay, or contacting a backup broker, or whatever) if it wanted to.
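To make the difference concrete, here is a minimal store-and-forward sketch in Node.js, assuming a hypothetical broker client that exposes publish() and ping(): while the broker is unreachable, outgoing messages are buffered locally and flushed in order once a periodic, non-blocking health check succeeds.
const buffer = [];            // in production this would be a local DB table, not memory
let brokerUp = true;

async function send(brokerClient, message) {
  if (!brokerUp) {
    buffer.push(message);     // broker known to be down: buffer and return immediately
    return;
  }
  try {
    await brokerClient.publish(message);   // hypothetical publish call
  } catch (err) {
    brokerUp = false;         // mark the broker as down and keep the message
    buffer.push(message);
  }
}

// Periodic health check; because everything is awaited, the event loop stays free
// to do other work (serve requests, contact a backup broker, ...) between pings.
function startHealthCheck(brokerClient, intervalMs = 5000) {
  return setInterval(async () => {
    if (brokerUp) return;
    try {
      await brokerClient.ping();           // hypothetical health-check call
      while (buffer.length > 0) {          // flush buffered messages in arrival order
        await brokerClient.publish(buffer[0]);
        buffer.shift();
      }
      brokerUp = true;
    } catch (err) {
      // still down; try again on the next tick
    }
  }, intervalMs);
}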

Related

How are the missing events replayed?

I am trying to learn more about CQRS and Event Sourcing (Event Store).
My understanding is that a message queue/bus is not normally used in this scenario - a message bus can be used to facilitate communication between microservices, but it is not typically used specifically for CQRS. However, the way I see it at the moment, a message bus would be very useful for guaranteeing that the read model eventually gets back in sync, hence eventual consistency, e.g. when the server hosting the read model database is brought back online.
I understand that eventual consistency is often acceptable with CQRS. My question is: how does the read side know it is out of sync with the write side? For example, let's say there are 2,000,000 events created in Event Store on a typical day and 1,999,050 are also written to the read store. The remaining 950 events are not written because of a software bug somewhere, or because the server hosting the read model is offline for a few seconds, etc. How does eventual consistency work here? How does the application know to replay the 950 events that are missing at the end of the day, or the x events that were missed because of the downtime ten minutes ago?
I have read questions on here over the last week or so which talk about messages being replayed from the event store, e.g. this one: CQRS - Event replay for read side, but none talk about how this is done. Do I need to set up a scheduled task that runs once per day and replays all events that were created since the date the scheduled task last succeeded? Is there a more elegant approach?
I've used two approaches in my projects, depending on the requirements:
Synchronous, in-process Readmodels. After the events are persisted, in the same request lifetime, in the same process, the Readmodels are fed with those events. In case of a Readmodel's failure (bug or catchable error/exception), the error is logged and that Readmodel is just skipped; the next Readmodel is fed with the events, and so on. Then follow the Sagas, which may generate commands that generate more events, and the cycle is repeated.
I use this approach when the impact of a Readmodel's failure is acceptable by the business, when the readiness of a Readmodel's data is more important than the risk of failure. For example, they wanted the data immediately available in the UI.
The error log should be easily accessible on some admin panel so someone would look at it in case a client reports inconsistency between write/commands and read/query.
This also works if you have your Readmodels coupled to each other, i.e. one Readmodel needs data from another canonical Readmodel. Although this seems bad, it isn't necessarily; it always depends. There are cases when you trade updater code/logic duplication for resilience.
Asynchronous, in-another-process Readmodel updater. This is used when I want total separation of the Readmodel from the other Readmodels, when a Readmodel's failure should not bring the whole read side down, or when a Readmodel needs another language, different from the monolith. Basically this is a microservice. When something bad happens inside a Readmodel, it is necessary that some authoritative higher-level component is notified, i.e. an Admin is notified by email or SMS or whatever.
The Readmodel should also have a status panel with all kinds of metrics about the events it has processed, whether there are gaps, and whether there are errors or warnings; it should also have a command panel where an Admin can rebuild it at any time, preferably without system downtime.
In any approach, the Readmodels should be easily rebuildable.
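A minimal sketch of the synchronous, in-process approach described above (the Readmodel interface is hypothetical): every Readmodel is fed the new events in the same request, and a failing Readmodel is logged and skipped so the others still get updated.
async function feedReadmodels(readmodels, events, logger) {
  for (const readmodel of readmodels) {
    try {
      for (const event of events) {
        await readmodel.apply(event);                // hypothetical handler
      }
    } catch (err) {
      // log and skip this Readmodel; the error log should surface on an admin panel
      logger.error('readmodel ' + readmodel.name + ' failed', err);
    }
  }
}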
How would you choose between a pull approach and a push approach? Would you use a message queue with a push (events)?
I prefer the pull-based approach (a minimal polling loop is sketched at the end of this answer) because:
it does not use another stateful component like a message queue, yet another thing that must be managed, that consumes resources and that can (and so eventually will) fail
every Readmodel consumes the events at the rate it wants
every Readmodel can easily change at any moment what event types it consumes
every Readmodel can easily be rebuilt at any time by requesting all the events from the beginning
the order of events is exactly the same as in the source of truth, because you pull directly from the source of truth
There are cases when I would choose a message queue:
you need the events to be available even if the Event store is not
you need competing/parallel consumers
you don't want to track what messages you consume; as they are consumed they are removed automatically from the queue
This talk from Greg Young may help.
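Here is the polling loop mentioned above, a minimal sketch of the pull-based approach (eventStore.readEventsForward() and the Readmodel interface are hypothetical). The Readmodel owns its checkpoint, so it can change what it consumes, or rebuild from position 0, at any time.
async function pollOnce(eventStore, readmodel, batchSize = 100) {
  const from = await readmodel.loadCheckpoint();              // e.g. 1999050
  const events = await eventStore.readEventsForward(from, batchSize);
  for (const event of events) {
    if (readmodel.handles(event.type)) {
      await readmodel.apply(event);                           // idempotent handler
    }
    await readmodel.saveCheckpoint(event.position + 1);
  }
  return events.length;                                       // 0 means caught up
}

async function runUpdater(eventStore, readmodel) {
  for (;;) {
    const processed = await pollOnce(eventStore, readmodel);
    if (processed === 0) {
      await new Promise((resolve) => setTimeout(resolve, 1000));  // back off when idle
    }
  }
}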
How does the application know to replay the 950 events that are missing at the end of the day or the x events that were missed because of the downtime ten minutes ago?
So there are two different approaches here.
One is perhaps simpler than you expect - each time you need to rebuild a read model, just start from event 0 in the stream.
Yeah, the scale on that will eventually suck, so you won't want that to be your first strategy. But notice that it does work.
For updates with not-so-embarrassing scaling properties, the usual idea is that the read model tracks metadata about the stream position used to construct the previous model. Thus, the query from the read model becomes "What has happened since event #1,999,050?"
In the case of Event Store, the call might look something like:
EventStore.ReadStreamEventsForwardAsync(stream, 1999050, 100, false)
i.e. read up to 100 events forward from the stream, starting at position 1,999,050.
The application doesn't know it hasn't processed some events due to a bug.
First of all, I don't understand why you assume that the number of events written on the write side must equal the number of events processed by the read side. Some projections may subscribe to the same event and some events may have no subscriptions on the read side.
In case of a bug in a projection or in the infrastructure that results in a certain projection being invalid, you might need to rebuild that projection. In most cases this would be a manual intervention that resets the checkpoint of the projection to 0 (the beginning of time), so the projection picks up all events from the event store from scratch and reprocesses all of them again.
The event store should have a global sequence number across all events starting, say, at 1.
Each projection has a position tracking where it is along the sequence number. The projections are like logical queues.
You can clear a projection's data and reset the position back to 0 and it should be rebuilt.
In your case the projection fails for some reason (like the server going offline) at position 1,999,050, but when the server starts up again it will continue from that point.
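Rebuilding then amounts to clearing the projection's data and resetting its position, after which the normal catch-up loop replays everything from the beginning; a small sketch with a hypothetical projection interface:
async function rebuildProjection(projection) {
  await projection.clearData();        // drop the read-side tables/collections
  await projection.saveCheckpoint(0);  // the next catch-up run starts from the first event
}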

"Resequencing" messages after processing them out-of-order

I'm working on what's basically a highly-available distributed message-passing system. The system receives messages from someplace over HTTP or TCP, performs various transformations on them, and then sends them to one or more destinations (also using TCP/HTTP).
The system has a requirement that all messages sent to a given destination are in-order, because some messages build on the content of previous ones. This limits us to processing the messages sequentially, which takes about 750ms per message. So if someone sends us, for example, one message every 250ms, we're forced to queue the messages behind each other. This eventually introduces intolerable delay in message processing under high load, as each message may have to wait for hundreds of other messages to be processed before it gets its turn.
In order to solve this problem, I want to be able to parallelize our message processing without breaking the requirement that we send them in-order.
We can easily scale our processing horizontally. The missing piece is a way to ensure that, even if messages are processed out-of-order, they are "resequenced" and sent to the destinations in the order in which they were received. I'm trying to find the best way to achieve that.
Apache Camel has a thing called a Resequencer that does this, and it includes a nice diagram (which I don't have enough rep to embed directly). This is exactly what I want: something that takes out-of-order messages and puts them in-order.
But, I don't want it to be written in Java, and I need the solution to be highly available (i.e. resistant to typical system failures like crashes or system restarts) which I don't think Apache Camel offers.
Our application is written in Node.js, with Redis and Postgresql for data persistence. We use the Kue library for our message queues. Although Kue offers priority queueing, the featureset is too limited for the use-case described above, so I think we need an alternative technology to work in tandem with Kue to resequence our messages.
I was trying to research this topic online, and I can't find as much information as I expected. It seems like the type of distributed architecture pattern that would have articles and implementations galore, but I don't see that many. Searching for things like "message resequencing", "out of order processing", "parallelizing message processing", etc. turns up solutions that mostly just relax the "in-order" requirement based on partitions or topics or whatnot. Alternatively, they talk about parallelization on a single machine. I need a solution that:
Can handle processing on multiple messages simultaneously in any order.
Will always send messages in the order in which they arrived in the system, no matter what order they were processed in.
Is usable from Node.js
Can operate in a HA environment (i.e. multiple instances of it running on the same message queue at once w/o inconsistencies.)
Our current plan, which makes sense to me but which I cannot find described anywhere online, is to use Redis to maintain sets of in-progress and ready-to-send messages, sorted by their arrival time. Roughly, it works like this:
When a message is received, that message is put on the in-progress set.
When message processing is finished, that message is put on the ready-to-send set.
Whenever there's the same message at the front of both the in-progress and ready-to-send sets, that message can be sent and it will be in order.
I would write a small Node library that implements this behavior with a priority-queue-esque API using atomic Redis transactions. But this is just something I came up with myself, so I am wondering: Are there other technologies (ideally using the Node/Redis stack we're already on) that are out there for solving the problem of resequencing out-of-order messages? Or is there some other term for this problem that I can use as a keyword for research? Thanks for your help!
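For what it's worth, here is a rough in-process sketch of the resequencing idea above, assuming each message is given a monotonically increasing sequence number on arrival (in the real system this state would live in Redis sorted sets so multiple instances can share it):
class Resequencer {
  constructor(sendFn) {
    this.send = sendFn;          // delivers a message to its destination
    this.nextToSend = 0;         // sequence number the destination expects next
    this.ready = new Map();      // seq -> processed message waiting for its turn
  }

  // Called whenever a worker finishes processing a message, in any order.
  onProcessed(seq, message) {
    this.ready.set(seq, message);
    // Drain every message that is now at the front of the line.
    while (this.ready.has(this.nextToSend)) {
      const next = this.ready.get(this.nextToSend);
      this.ready.delete(this.nextToSend);
      this.send(next);
      this.nextToSend += 1;
    }
  }
}

// Usage: workers report completion out of order, delivery stays in order.
const r = new Resequencer((m) => console.log('sent', m.seq));
r.onProcessed(1, { seq: 1 });    // held back: 0 has not been sent yet
r.onProcessed(0, { seq: 0 });    // sends 0, then 1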
This is a common problem, so there are surely many solutions available. This is also quite a simple problem, and a good learning opportunity in the field of distributed systems. I would suggest writing your own.
You're going to have a few problems building this, namely
1: Guaranteed order of messages
2: Exactly-once delivery
You've found number 1, and you're solving this by resequencing them in redis, which is an ok solution. The other one, however, is not solved.
It looks like your architecture is not geared towards fault tolerance, so currently, if a server crashes, you restart it and continue with your life. This works fine when processing all requests sequentially, because then you know exactly when you crashed, based on what the last successfully completed request was.
What you need is either a strategy for finding out what requests you actually completed, and which ones failed, or a well-written apology letter to send to your customers when something crashes.
If Redis is not sharded, it is strongly consistent. It will fail and possibly lose all data if that single node crashes, but you will not have any problems with out-of-order data, or data popping in and out of existence. A single Redis node can thus hold the guarantee that if a message is inserted into the to-process-set, and then into the done-set, no node will see the message in the done-set without it also being in the to-process-set.
How I would do it
Using Redis seems like too much fuss, assuming that the messages are not huge, that losing them is ok if a process crashes, and that running them more than once, or even running multiple copies of a single request at the same time, is not a problem.
I would recommend setting up a supervisor server that takes incoming requests, dispatches each to a randomly chosen slave, stores the responses and puts them back in order again before sending them on. You said you expected the processing to take 750ms. If a slave hasn't responded within say 2 seconds, dispatch it again to another node randomly within 0-1 seconds. The first one responding is the one we're going to use. Beware of duplicate responses.
If the retry request also fails, double the maximum wait time. After 5 failures or so, each waiting up to twice (or any multiple greater than one) as long as the previous one, we probably have a permanent error, so we should probably ask for human intervention. This algorithm is called exponential backoff, and prevents a sudden spike in requests from taking down the entire cluster. Not using a random interval, and retrying after n seconds would probably cause a DOS-attack every n seconds until the cluster dies, if it ever gets a big enough load spike.
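A minimal sketch of that retry policy (exponential backoff with full random jitter); the task and the retry limits here are placeholders:
async function withBackoff(task, { retries = 5, baseMs = 1000 } = {}) {
  for (let attempt = 0; attempt <= retries; attempt += 1) {
    try {
      return await task();
    } catch (err) {
      if (attempt === retries) throw err;        // give up and ask for human intervention
      const maxDelay = baseMs * 2 ** attempt;    // cap doubles each attempt: 1s, 2s, 4s, ...
      const delay = Math.random() * maxDelay;    // random jitter avoids retry lockstep
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}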
There are many ways this could fail, so make sure this system is not the only place data is stored. However, this will probably work 99+% of the time, it's probably at least as good as your current system, and you can implement it in a few hundred lines of code. Just make sure your supervisor is using asynchronous requests so that you can handle retries and timeouts. Javascript is by nature single-threaded, so this is slightly trickier than normal, but I'm confident you can do it.

Concurrent readers and writers through cloned data structures?

I read this question but it didn't really help.
First and most important thing: time performance is the focus of the application that I'm developing.
We have a client/server model (even distributed or cloud if we wish) and a data structure D hosted on the server. Each client request consists of:
Read something from D
Possibly write something to D
Possibly delete something from D
We can say that in this application the relation between the number of received operations can be described as delete<<write<<read. In addition:
Read ops absolutely cannot wait: they must be processed immediately
Write and delete ops can wait some time, but sooner is better.
From the description above, any locking mechanism is not acceptable: it would imply that read operations could wait, which is not acceptable (sorry if I stress it so much, but it's really a crucial point).
Consistency is not necessary: if a write/delete operation has been performed and then a read operation doesn't see the write/delete effect it's not a big deal. It would be better, but it's not required.
The solution should be data-structure-independent, so it shouldn't matter if we write on a vector, list, map or Donald Trump's face.
The data structure could occupy a big amount of memory.
My solution so far:
We use two servers: the first server (called f) has Df, the second server (called s) has Ds, which is kept updated.
f answers client requests using Df and sends write/delete operations to s. Then s applies the write/delete operations to Ds sequentially.
At a certain point, all future client requests are redirected to s. At the same time, f copies s's updated Ds into its Df.
Now, the roles of f and s are swapped: s will answer client requests using Ds and f will keep an updated version of Ds. The swapping process is repeated periodically.
Notice that I omitted on purpose A LOT of details for simplicity (for example, once the swap has been done, f has to finish all the pending client requests before applying the write/delete operations received from s in the meantime).
Why do we need two servers? Because the data structure is potentially too big to fit into one memory.
Now, my question is: is there a similar approach in the literature? I came up with this protocol in 10 minutes; I find it strange that no (better) solution similar to this one has already been proposed!
PS: I could have forgot some application specs, don't hesitate to ask for any clarification!
The scheme that you have works. I don't see any particular problem with it. This is basically how many databases run their HA solution: they apply a log of writes to replicas. This model affords a great deal of flexibility in how the replicas are formed, accessed and maintained. Failovers are easy, too.
An alternative technique is to use persistent datastructures. Each write returns you a new and independent version of the data. All versions can be read in a stable and lock-free way. Versions can be kept or discarded at will. Versions share as much of the underlying state as possible.
Usually, trees underlie such persistent datastructures because it is easy to update a small part of the tree and reuse most of the old tree.
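A tiny path-copying sketch of that idea, using a persistent binary search tree: insert() returns a new root that shares every untouched subtree with the old version, so readers can keep using the old root without locks while writers build new versions.
function node(key, value, left = null, right = null) {
  return Object.freeze({ key, value, left, right });
}

function insert(root, key, value) {
  if (root === null) return node(key, value);
  if (key < root.key) return node(root.key, root.value, insert(root.left, key, value), root.right);
  if (key > root.key) return node(root.key, root.value, root.left, insert(root.right, key, value));
  return node(key, value, root.left, root.right);   // same key: replace the value
}

// v1 stays valid and unchanged after v2 is created; both share most nodes.
const v1 = insert(insert(null, 2, 'b'), 1, 'a');
const v2 = insert(v1, 3, 'c');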
A reason you might not have found a more sophisticated approach is that your problem is extremely general: You want this to work with any data structure at all and the data can be big.
SQL Server Hekaton uses a quite sophisticated data structure to achieve lock-free, readable, point in time snapshots of any database contents. Maybe it's worth a look how they are doing it (they released a paper describing every detail of the system). They also allow for ACID transactions, serializability and concurrent writes. All lock-free.
At the same time, f copies s updated Ds into its Df.
This copy will take a long time because the data is big. It will block readers. A better approach is to apply the log of writes to the writable copy before accepting new writes there. That way reads can be accepted continuously.
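As a sketch of what "apply the log of writes" means here, with a plain Map standing in for the data structure D and an array as the operation log (all names are placeholders):
function applyLog(replica, log) {
  for (const op of log) {
    if (op.type === 'write') replica.set(op.key, op.value);
    if (op.type === 'delete') replica.delete(op.key);
  }
  log.length = 0;   // the replica is now caught up; new requests can be served from it
}

const Ds = new Map([['x', 1]]);
applyLog(Ds, [{ type: 'write', key: 'y', value: 2 }, { type: 'delete', key: 'x' }]);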
The switchover also is a short period where reads might have a slightly higher latency than normal.

What are the best practices to introduce a new BC to a DDD app?

This is a theoretical question about the introduction of new BCs into a system where we use ES and CQRS with DDD. So there won't be concrete examples.
There can be interesting problems when introducing new BCs which communicate with the old ones by receiving and publishing domain events. The root of these problems is that we already have domain events in the event store. When the new BC reacts to those old domain events, it will do so in a way that is out of sync and/or out of sequence.
For example, we have an old BC A and we introduce a new BC B. Both publish domain events, which we call a and b. In the new system the order matters: for example, b1 must always come after a1, but before a2. What can we do when we already have the a1, a2, a3 sequence in the event store? Should we inject b1 after a1 and so on? Is this a viable solution with a huge event store? It will certainly take a long time to replay all the old events one by one and react to them. How can we prevent sending an email to the customer when handling the newly created b1 event, which reacts to a 5-year-old topic? Is there a pattern to prevent these kinds of problems?
Problem Analysis
The root of these problems is that we already have domain events in the event store.
If you introduce a new BC B to an existing system, that means the system was functional without B. This is clear by the above statement and has the following consequences:
Events that B would have produced in response to events from A do not need to be published. No other system should take action based on these events, because they are artificial.
You can go live with B at any time you choose. The only thing that you need to do beforehand is getting B in sync with the current state of the system.
Getting B in Sync
This is not difficult if you design B accordingly.
First, you need a replay-mode mechanism to import all domain events into B without publishing events from B in response. You need to keep B's events internally, of course, if you use event sourcing, but do not publish them. Also, make sure B does not modify the state of the world by other means while in replay mode, e.g. don't send emails.
Then, switch B over to live mode. Now B consumes the new events from the system and also publishes its own.
The problem you mention with event ordering is only a problem when you use a unified event store for all domain events, and also use that store to publish events from. If this is the case, then you need to mark B's events as "internal" during the replay phase and hide them from the publishing mechanism.
Note: If B is a purely reactive BC (this could be the case for a very simple BC), then you don't even need the replay stuff. But most BC's probably do.
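A minimal sketch of such a replay mode (the event store and BC interfaces are hypothetical): during replay, B persists its own events but publishes nothing and performs no external side effects; once caught up, it switches to live mode.
async function catchUp(bcB, eventStore) {
  bcB.mode = 'replay';
  const history = await eventStore.readAllEvents();   // existing events, oldest first
  for (const event of history) {
    const produced = bcB.handle(event);               // B's reaction, if any
    if (produced) {
      await bcB.append(produced);                     // keep internally (event sourcing) ...
    }
    // ... but do not publish, send emails, or otherwise touch the outside world
  }
  bcB.mode = 'live';                                  // from now on, publish normally
}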
First of all DDD does not require Event sourcing.
we have an old BC A and we introduce a new BC B. Both publish domain events, which we call a and b. In the new system the order matters: for example, b1 must always come after a1, but before a2.
Events can be out of order, even in the same component (bounded context). Transactional integrity is only guaranteed within aggregates.
when we already have the a1, a2, a3 sequence in the event storage?
Doesn't matter. By the way you don't have this guarantee with SQL databases unless you work in SERIALIZABLE isolation (or its vendor specific equivalent). Protip: It's so taxing on performance that it's never enabled by default; therefore you are not using it.
Pay special attention to this part in the above link:
Other transactions cannot insert new rows with key values that would fall in the range of keys read by any statements in the current transaction until the current transaction completes.
Furthermore, though an event store shouldn't have multiple copies of an event, events (and other messages such as commands) may arrive multiple times between components.
Should we inject b1 after a1 and so on?
Since your components should be able to handle out-of-order (and duplicate) events: no.
What can we do,
Depending on the technology used to integrate components, and the semantics of the messages:
If you are reading events from a web service, feed, or DB table, such that the event never goes away, you might be able to ignore an event until it is relevant.
Equivalently, you might be able to put an event back on the message queue it came from until it is relevant (see the sketch below).
You may use the pattern known as Saga/Process Manager.
Is there a real race condition, at all?
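A small sketch of the "ignore/requeue until relevant" option mentioned above (the handler API is hypothetical): an event that arrives before the one it depends on is parked and retried later instead of being processed out of order.
const parked = [];

async function onEvent(handler, event) {
  if (!handler.isReadyFor(event)) {     // e.g. the event it depends on has not been seen yet
    parked.push(event);                 // or nack/requeue it on the broker instead
    return false;
  }
  await handler.process(event);
  return true;
}

async function retryParked(handler) {
  const toRetry = parked.splice(0);     // take everything currently parked
  for (const event of toRetry) {
    await onEvent(handler, event);      // still-unready events simply get parked again
  }
}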

Independent server side processing in node

Is it possible, or even practical, to create a node program (or sub-program/loop) that executes independently of the connected clients?
So in my specific use case, I would like to make a multiplayer game where each turn a player performs actions, and at the end of that turn those actions are computed. Is it possible to perform those computations at a specific time, regardless of the clients/players connecting?
I assume this involves the use of threads somewhere.
Possibly an easier solution would be to compute the outcome when it is observed, but this could cause difficulties if it has an influence on other entities. But this problem has been a curiosity of mine for a while.
Well, basically, the easiest solution would probably be to run the computation on a cluster worker. This spawns a worker process that runs an independent task and communicates with the main process via messages.
If you wish, however, to run a completely separate process (I probably wouldn't, but it is an option), this can happen too. You then just need a communication protocol between the two processes. Usually this would be handled by a messaging or task queue system. A popular queue solving this issue is RabbitMQ.
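A minimal sketch of the separate-worker idea using Node's built-in child_process.fork(), which gives you a message channel without an external queue (game-loop.js is a hypothetical worker script that would listen with process.on('message', ...)):
const { fork } = require('child_process');

const worker = fork('./game-loop.js');       // runs independently of client connections

worker.on('message', (result) => {
  console.log('turn resolved:', result);     // e.g. broadcast the outcome to connected players
});

worker.send({ type: 'resolve-turn', turn: 42 });   // ask the worker to resolve the current turn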
If the computation each turn is not too heavy, you could solve the issue with a simple setInterval():
function turnCalculations() {
  // do loads of stuff every 30 seconds
}

// run the turn calculations every 30 seconds, regardless of connected clients
setInterval(turnCalculations, 30000)

// normal node server stuff here
This would run the turn calculations every 30 seconds regardless of users connected, but if the calculations take too long they might block your server.
