I have a microservices-based architecture in place.
Service A has details about Cart and the Cart model looks like this:
Cart {
Id,
Items,
Price,
UserId
}
Service B has User details. User model:
User {
Id,
FirstName,
LastName,
Email
}
I want to fetch these User details from Service A by communicating with Service B. I am trying to implement this communication between the microservices asynchronously, using a message queue. The problem I am facing is how to associate a particular message with the corresponding HTTP request from the client.
Is it a good idea to handle client HTTP requests with asynchronous communication among the services?
To be more specific: how do I associate the message received in step 5 of the diagram above with the HTTP request handler in step 1?
Service Bus works great when your two processes are completely disconnected: downstream processing of an order after it is placed, for example. You want the extra layer of certainty that the downstream process will run, even if it takes a retry or two, without having to wait for it. Trying to use it for a case where you need an immediate response is going to cause the issues you are seeing; that's not what it is for.
The best solution is going to be something along the lines of having your cart service call the user service over HTTP, so that the data stays properly correlated. If you want to keep a layer in between to remove the direct dependency between the two services, something like API Management is going to be a much better fit.
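As a rough illustration of that direct call, here is a minimal sketch of Service A enriching a cart with user details over HTTP; the service URL, response shape, and 2-second timeout are assumptions, not part of the answer:

```ts
// Sketch only: Service A fetches the user that owns the cart from Service B.
// Requires Node 18+ for the global fetch and AbortSignal.timeout.
async function getCartWithUser(cart: { id: string; userId: string }) {
  const res = await fetch(`http://service-b/users/${cart.userId}`, {
    signal: AbortSignal.timeout(2000), // fail fast instead of hanging the client request
  });
  if (!res.ok) throw new Error(`user service returned ${res.status}`);
  const user = await res.json(); // { id, firstName, lastName, email }
  return { ...cart, user };
}
```

Because the call happens inside the original request handler, the response stays correlated with the client's HTTP request, which is exactly what the queue made difficult.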
Background
I have a monolithic Node.js + PostgreSQL app that, among other things, needs to provide real-time in-app notifications to end users.
It is currently implemented in the following way:
there's a db table notifications with state (pending/sent), userId (id of the notification receiver), isRead (whether the user has read the notification), and type and body (the notification data).
once specific resources get created or specific events occur, a varying number of users should receive in-app notifications. When a notification is created, it gets persisted to the db and sent to the user over WebSockets. Notifications can also be created by a cron job.
when a user receives N notifications of the same type, they get collapsed into a single notification. This is done via a db trigger that deletes the repeated notifications and inserts a new one.
usually this works fine, but when the number of receivers exceeds several thousand, the app lags, other requests get blocked, or not all notifications get sent via WebSockets.
Examples of notifications
Article published
A user is awarded with points
A user logged in multiple times but didn't perform some action
One user sends a friend request to another
One user sends a message to another
if a user receives 3+ "Article published" notifications, they get collapsed into one "N articles published" notification (N gets updated as more notifications of the same type arrive).
What I currently have doesn't seem to work very well. For example, for the "Article created" event, the API endpoint that handles the creation also handles the notification send-out (which is maybe not a good approach: it creates ~5-6k notifications and sends them to users via WebSockets).
Question
How to correctly design such functionality?
Should I stay with a node.js + db approach or add a queuing service? Redis Pub/Sub? RabbitMQ?
We deploy to a k8s cluster, so adding another service is not a problem. The more important question: is it really needed in my case?
I would love some general advice or resources to read on this topic.
I've read several articles on messaging/queuing/notifications system design but still don't quite get if this fits my case.
Should the queue store the notifications, or should they live in the db? What's the correct way to notify thousands of users in real time (WebSockets? SSE?)?
Also, the more I read about queues and message brokers, the more it feels like I'm overcomplicating things and getting more confused.
Consider using the Temporal open source project. It would allow modeling each user lifecycle as a separate program. Temporal makes the code fully fault tolerant and preserves its full state (including local variables and blocking await calls) across process restarts.
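To make that concrete, here is a minimal sketch using Temporal's TypeScript SDK; the activity name, payload, and timeout are assumptions for illustration:

```ts
// Workflow code (runs inside the Temporal worker sandbox).
import { proxyActivities } from '@temporalio/workflow';

type Activities = {
  // Hypothetical activity that persists the notification and pushes it
  // over WebSockets; implemented and registered on the worker side.
  sendNotification(userId: string, body: string): Promise<void>;
};

const { sendNotification } = proxyActivities<Activities>({
  startToCloseTimeout: '30 seconds', // assumed SLA; Temporal retries on failure
});

// One workflow execution per recipient: a 5-6k fan-out becomes 5-6k small,
// independently retried workflows instead of one blocking API request.
export async function notifyUserWorkflow(userId: string, body: string): Promise<void> {
  await sendNotification(userId, body);
}
```

The API endpoint would then only start these workflows and return immediately, leaving delivery, retries, and state to the Temporal workers.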
I am developing an application where there is a dashboard for data insights.
The backend is a set of microservices written in Node.js with the Express framework, with a MySQL backend. The pattern used is Database-Per-Service, with a message broker in between.
The problem I am facing is that this dashboard derives its data from multiple backend services (different databases altogether: some SQL, some NoSQL, and some graph databases).
I want to avoid multiple queries between front end and backend for this screen. However, I want to avoid a single point of failure as well. I have come up with the following solutions.
Use an API gateway aggregator/composition that makes multiple calls to backend services on behalf of a single frontend request, and then compose all the responses together and send it to the client. However, scaling even one server would require scaling of the gateway itself. Also, it makes the gateway a single point of contact.
Create a facade service, maybe called the dashboard service, that issues calls to multiple backend services, composes the responses together, and sends a single payload back to the client. However, this creates a synchronous dependency.
I favor approach 2. However, I have a question there as well. Since the services are written in Node.js, is there a way to enforce time-bound SLAs for each service, so that if a service doesn't respond to the facade aggregator, the client is returned partial or cached data? Is there any mechanism for this?
GraphQL has been designed for this.
You start by defining a global GraphQL schema that covers all the schemas of your microservices. Then you implement the fetchers (resolvers) that populate the response by querying the appropriate microservices. You can run several instances so that you don't have a single point of failure. You can return partial responses if you hit a timeout (the answer will include resolver errors). GraphQL also knows how to manage caching.
Honestly, it is a bit confusing at first, but once you get it, it is really simple to extend the schema and include new microservices in it.
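As a rough sketch of this (assuming Apollo Server and Node 18+; the schema, service URL, and 2-second timeout are all illustrative, not prescribed by the answer):

```ts
import { ApolloServer } from '@apollo/server';
import { startStandaloneServer } from '@apollo/server/standalone';

const typeDefs = `#graphql
  type Sales { total: Float }
  type Query { sales: Sales }
`;

const resolvers = {
  Query: {
    // If the sales service exceeds its timeout, this resolver throws;
    // GraphQL then returns a partial response with an entry in `errors`
    // instead of failing the whole dashboard query.
    sales: async () => {
      const res = await fetch('http://sales-service/metrics', {
        signal: AbortSignal.timeout(2000), // hypothetical per-service SLA
      });
      return res.json();
    },
  },
};

const server = new ApolloServer({ typeDefs, resolvers });
const { url } = await startStandaloneServer(server, { listen: { port: 4000 } });
console.log(`gateway ready at ${url}`);
```

Each microservice gets its own resolver, so the front end issues one query and the gateway handles the fan-out.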
I can't speak to the Node.js implementation details, but indeed the second approach lets you model the calls to the remote services so that each answer is expected within some time boundary.
It depends on how you interconnect the services. The easiest approach is to spawn an HTTP request from the aggregator service to the service that actually provides the data.
This HTTP request can be configured so that it won't wait longer than X seconds for a response. So you spawn multiple HTTP requests to different services simultaneously and wait for the responses. I come from the Java world, where these settings live at the level of the HTTP client making the connections; I'm sure the Node ecosystem has something similar.
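In Node 18+, one way to sketch this (the endpoints, the 2-second limit, and the cache fallback are assumptions):

```ts
// Hypothetical service endpoints behind the aggregator.
const SERVICES: Record<string, string> = {
  orders: 'http://orders-service/summary',
  users: 'http://users-service/summary',
};

async function fetchWithTimeout(url: string, ms: number): Promise<unknown> {
  const res = await fetch(url, { signal: AbortSignal.timeout(ms) });
  return res.json();
}

// Fan out to all services in parallel; fall back to the last cached value
// (partial data) for any service that fails or exceeds its SLA.
export async function aggregateDashboard(cache: Map<string, unknown>) {
  const entries = await Promise.all(
    Object.entries(SERVICES).map(async ([name, url]) => {
      try {
        const data = await fetchWithTimeout(url, 2000);
        cache.set(name, data);
        return [name, data] as const;
      } catch {
        return [name, cache.get(name) ?? null] as const;
      }
    }),
  );
  return Object.fromEntries(entries);
}
```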
If you prefer an asynchronous style of communication between the services, the situation is somewhat more complicated. In this case you can design some kind of 'transactionId' into the message protocol: the requests from the aggregator service include a 'transactionId' (a UUID works) and "demand" that the answer carry the same transactionId. The sender, having sent the messages, waits for the responses for a certain amount of time and then "quits waiting" after X seconds/milliseconds. Any responses that arrive after that time are discarded, because no one on the aggregator side is expecting them anymore.
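A minimal sketch of that correlation pattern (the broker interface is a stand-in for whatever messaging client you use):

```ts
import { randomUUID } from 'node:crypto';

// Pending requests keyed by transactionId; entries are removed on reply or timeout.
const pending = new Map<string, (reply: unknown) => void>();

export function request(
  broker: { publish(msg: object): void }, // placeholder for your broker client
  payload: object,
  timeoutMs = 2000,
): Promise<unknown> {
  return new Promise((resolve, reject) => {
    const transactionId = randomUUID();
    pending.set(transactionId, resolve);
    broker.publish({ transactionId, ...payload });
    setTimeout(() => {
      // Only reject if no reply arrived in time; delete() tells us which case we're in.
      if (pending.delete(transactionId)) reject(new Error(`timed out after ${timeoutMs}ms`));
    }, timeoutMs);
  });
}

// Wire this into the broker's handler for the reply topic/queue.
export function onReply(msg: { transactionId: string; data: unknown }): void {
  const resolve = pending.get(msg.transactionId);
  if (!resolve) return; // late reply after timeout: discard, as described above
  pending.delete(msg.transactionId);
  resolve(msg.data);
}
```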
BTW, this "aggregator" approach is also good and simple from the front-end perspective, because the front end deals with only one request to the backend rather than many, as in the gateway approach. So I completely agree that the aggregator approach is better here.
I'm trying to understand how to do two-way communication with Google Pub/Sub in the following architecture.
EDIT: I meant to say subscribers instead of consumers
I'm trying to support the following workflow:
UI sends a request to an API service to kick off an async process
API service publishes the request to a topic to begin the process kick-off
The consumer picks up the message and runs the async process service
Once the async process service is done, it publishes to a process-complete topic
Here is where I want the UI to pick up the process complete message and I'm trying to figure out the best approach.
So two questions:
Are multiple topics the preferred approach when doing two-way communication back to the client? Or is there a way to do this with a single topic and multiple subscriptions?
How should the consumer of the Process-Complete get the response back to the UI? Should the UI be the consumer of the subscription? Or should I send it back to the api service and publish a websocket message? Both these approaches seem to have tradeoffs.
Multiple topics are going to be preferred in this situation, one for messages going to the asynchronous processors and then one for the responses that go back. Otherwise, your asynchronous processors are going to needlessly receive the response messages and have to ack them immediately, which is unnecessary extra delivery of messages.
With regard to getting the response back to the UI, the UI should not be the consumer of the subscription. In order to do that, you'd need every running instance of the UI to have its own subscription because otherwise, they would load balance messages across them and you couldn't guarantee that the particular client that sent the request would actually receive the response. The same would be true if you have multiple API servers that need to receive particular responses based on the requests that transmitted through them. Cloud Pub/Sub isn't really designed for topics and subscriptions to be ephemeral in this way; it is best when these are created once and all of the data is transmitted across them.
Additionally, having the UI act as a subscriber means that you'd have to have the credentials in the UI to subscribe, which could be a security issue.
You might also consider not using a topic for the asynchronous response. Instead, you could encode as part of the message the address or socket of the client or API server that expects the response. Then, the asynchronous processor could receive a message, process it, send a response to the address specified in the message, and then ack the message it received. This would ensure responses are routed to where they need to go and minimize the delivery of messages that subscribers just ack that they don't need to process, e.g., messages that were intended for a different API server.
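A rough sketch of that idea with the Google Cloud Pub/Sub Node client (the topic and subscription names, the replyTo value, and the two helper functions are assumptions):

```ts
import { PubSub, Message } from '@google-cloud/pubsub';

const pubsub = new PubSub();

// API server side: publish the work item, tagged with where the reply should go.
export async function submitJob(payload: object, replyTo: string): Promise<void> {
  await pubsub.topic('process-requests').publishMessage({
    data: Buffer.from(JSON.stringify(payload)),
    attributes: { replyTo }, // e.g. a specific API server's callback address
  });
}

// Placeholders for the actual work and for delivering the result (e.g. an
// HTTP callback to the API server, which then pushes to the UI over a socket).
async function runProcess(payload: unknown): Promise<unknown> { return payload; }
async function sendTo(address: string, result: unknown): Promise<void> { /* ... */ }

// Asynchronous processor side: do the work, route the response, then ack.
pubsub.subscription('process-requests-sub').on('message', async (message: Message) => {
  const payload = JSON.parse(message.data.toString());
  const result = await runProcess(payload);
  await sendTo(message.attributes.replyTo, result);
  message.ack(); // ack only after the response has been routed
});
```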
Given an event driven micro service architecture with asynchronous messaging, what solutions are there to implementing a 'synchronous' REST API wrapper such that requests to the REST interface wait for a response event to be published before sending a response to the client?
Example: POST /api/articles
Internally this would send a CreateArticleEvent in the services layer, eventually expecting an ArticleCreatedEvent in response containing the ID of the persisted article.
Only then would the REST interface respond to the end client with this ID.
Dealing with multiple simultaneous requests: is keeping an in-memory map of in-flight requests in the REST API layer, keyed by some correlating identifier, conceptually a workable approach?
How can we deal with timing out requests after a certain period?
Generally you don't need to maintain a map of in-flight requests, because this is basically done for you by Node.js's http library.
Just use Express as it's intended, and this is probably something you never really have to worry about, as long as you avoid any global state.
If you have a weirder pattern in mind and aren't sure how to build it, it might help to share a simple example. Chances are it's not hard to rebuild it while avoiding global state.
With Express, have you tried middleware? You can chain a series of callback functions, with a certain timeout, after the article is created.
I assume you are in the context of Event Sourcing and microservices? If so, I recommend that you don't publish a CreateArticleEvent to the event store, and instead directly create the article in the database and then publish the ArticleCreatedEvent to the event store.
Why, you ask? Generally this pattern is meant to orchestrate different microservices. In the example shown in the link above, it was used to orchestrate how the Customer service should react when an Order is created. Note the past tense. The Order service created the order, and the Customer service reacts to it.
In your case it is easier (and probably better) to just insert the article into the database (by calling the ArticleService directly) and respond with the article ID. Then publish the ArticleCreatedEvent to your event store, to trigger other microservices that may want to listen to it (for example, to notify the editor for review).
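A minimal sketch of that create-then-publish flow (the db and eventStore clients are placeholders, as is the route shape):

```ts
import express from 'express';

// Placeholder clients; swap in your real database and event store.
declare const db: { articles: { insert(data: unknown): Promise<{ id: string }> } };
declare const eventStore: { publish(type: string, payload: object): Promise<void> };

const app = express();
app.use(express.json());

app.post('/api/articles', async (req, res) => {
  // 1. Create the article directly; no CreateArticleEvent needed.
  const article = await db.articles.insert(req.body);

  // 2. Publish the fact, in past tense, for other services to react to.
  await eventStore.publish('ArticleCreatedEvent', { articleId: article.id });

  // 3. The client gets the ID immediately; no correlation map required.
  res.status(201).json({ id: article.id });
});

app.listen(3000);
```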
Event Sourcing is a good pattern, but we don't need to apply it to everything.
I'm planning a non-trivial realtime chat platform. The app has several types of resources: Users, Groups, Channels, Messages. There are roughly 20 types of realtime events having to do with these resources: for instance, submitting a message, a user connecting or disconnecting, a user joining a group, a moderator kicking a user from a group, etc...
Overall, I see two paths to organizing all this complexity.
The first is to build a REST API to manage the resources. For instance, to send a message, POST to /api/v1/messages. Or, to kick a user from a group, POST to /api/v1/group/:group_id/kick/. Then, from within the Express route handler, call io.emit (made accessible through res.locals) with the updated data to notify all related clients. In this case, clients talk to the server through HTTP and the server notifies clients through socket.io.
The other option is to not have a REST API at all and handle all events through socket.io. For instance, to send a message, emit a SEND_MESSAGE event. Or, to kick a user, emit a KICK_USER event. Then, from within the socket.io event handler, call io.emit with the updated data to notify all clients.
Yet another option is to have certain actions handled by a REST API, others by socket.IO. For instance, to get all messages, GET api/v1/channel/:id/messages. But to post a message, emit SEND_MESSAGE to the socket.
Which is the most suitable option? How do I determine which actions should be sent through an API, and which through socket.io? Is it better not to have a REST API at all for this type of application?
Some of my thoughts so far, nothing conclusive:
Advantages of REST API over the socket.io-only approach:
Easier to organize hierarchically, more modular
Easier to test
More robust and elegant
Simpler auth implementation with middleware
Disadvantages of REST API over the socket.io-only approach:
Slightly less performant (source)
Since a socket connection needs to be open anyways, why not use it for everything?
Slightly harder to manage on the client side.
Thanks for reading!
This could be achieved using sockets.
Why? Because a chat application will have dozens of actions, like
'STARTS_TYPING', 'STOPS_TYPING', 'SEND_MESSAGE', 'RECEIVE_MESSAGE', ...
Accommodating all these features using REST APIs would produce a complex system that lacks performance.
Also, the concept of rooms in socket.io removes a lot of the headache of implementing group chat, as sketched below.
So it's better to build everything on sockets (socket.io or a web cluster).
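A small sketch of rooms in socket.io (event and room names are illustrative):

```ts
import { Server } from 'socket.io';

const io = new Server(3000);

io.on('connection', (socket) => {
  socket.on('JOIN_GROUP', (groupId: string) => {
    socket.join(groupId); // rooms give you group fan-out for free
  });

  socket.on('SEND_MESSAGE', (groupId: string, text: string) => {
    // Broadcast to everyone currently in the group's room.
    io.to(groupId).emit('RECEIVE_MESSAGE', { groupId, text });
  });
});
```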
Here is the solution I found to solve this problem.
The key mistake in my question was assuming that a REST API and WebSockets were mutually exclusive, because I intended to put the business and database logic directly in Express routes and socket.io handlers. Choosing between socket.io and HTTP therefore seemed important, because it would shape the core business logic of my app.
Instead, it shouldn't matter which transport is used: the business logic has to be independent of the transport logic, in its own module.
To do this, I developed a service layer that handles CRUD tasks, but also more specific tasks such as authentication. This service layer can then be easily consumed from Express routes, socket.io handlers, or both, as sketched below.
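A condensed sketch of that layering (all names are illustrative; app and io stand for an already-configured Express app and socket.io server):

```ts
// Transport-independent service layer: plain business logic, no req/res, no sockets.
export const messageService = {
  async send(userId: string, channelId: string, text: string) {
    // validate + persist here (db access omitted); return the created message
    return { id: 'generated-id', userId, channelId, text };
  },
};

// Placeholders for the existing transports.
declare const app: import('express').Express;
declare const io: import('socket.io').Server;

// The HTTP transport consumes the service...
app.post('/api/v1/messages', async (req, res) => {
  const msg = await messageService.send(req.body.userId, req.body.channelId, req.body.text);
  io.to(msg.channelId).emit('MESSAGE', msg);
  res.status(201).json(msg);
});

// ...and so does the WebSocket transport, with identical business logic.
io.on('connection', (socket) => {
  socket.on('SEND_MESSAGE', async (channelId: string, text: string) => {
    const msg = await messageService.send(socket.data.userId, channelId, text);
    io.to(channelId).emit('MESSAGE', msg);
  });
});
```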
In the end, this architecture allowed me to easily switch between transport technologies.