Given an event-driven microservice architecture with asynchronous messaging, what solutions are there for implementing a 'synchronous' REST API wrapper, such that requests to the REST interface wait for a response event to be published before sending a response to the client?
Example: POST /api/articles
Internally this would send a CreateArticleEvent in the services layer, eventually expecting an ArticleCreatedEvent in response containing the ID of the persisted article.
Only then would the REST interface respond to the end client with this ID.
Dealing with multiple simultaneous requests - is keeping an in-memory map of inflight requests in the REST api layer keyed by some correlating identifier conceptually a workable approach?
How can we deal with timing out requests after a certain period?
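To make the in-memory map idea concrete, here is a minimal sketch of it, including the timeout. It assumes a Node.js `EventEmitter` as a stand-in for the real message broker; `bus`, `pending` and `requestReply` are hypothetical names, and the `correlationId` field on the events is an assumption about the message format, not something the question specifies.

```javascript
const { EventEmitter } = require('node:events');
const { randomUUID } = require('node:crypto');

// Stand-in for the real message broker (hypothetical; any pub/sub client would do).
const bus = new EventEmitter();

// correlationId -> { resolve, timer } for every request still waiting for its reply.
const pending = new Map();

// Publish a command and return a promise that resolves when the matching
// response event arrives, or rejects once timeoutMs has elapsed.
function requestReply(commandName, payload, timeoutMs = 5000) {
  const correlationId = randomUUID();
  return new Promise((resolve, reject) => {
    const timer = setTimeout(() => {
      // Remove the entry so a late reply finds nothing and is ignored.
      pending.delete(correlationId);
      reject(new Error(`Timed out waiting for reply to ${commandName}`));
    }, timeoutMs);
    // Register before publishing so even an immediate reply is matched.
    pending.set(correlationId, { resolve, timer });
    bus.emit(commandName, { correlationId, ...payload });
  });
}

// One shared listener routes each response event to the request waiting for it.
bus.on('ArticleCreatedEvent', (event) => {
  const entry = pending.get(event.correlationId);
  if (!entry) return; // unknown id, or the request already timed out
  clearTimeout(entry.timer);
  pending.delete(event.correlationId);
  entry.resolve(event);
});
```

A route handler for POST /api/articles would then await `requestReply('CreateArticleEvent', ...)`, respond with the ID from the resolved event, and turn a rejection into something like a 504. Note that the map lives in one process, so with several REST instances the reply has to be delivered to the instance that holds the entry.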
Generally you don't need to maintain a map of in-flight requests, because this is basically done for you by node.js's http library.
Just use express as it's intended, and this is probably something you never really have to worry about, as long as you avoid any global state.
If you have a more unusual pattern in mind and are not sure how to build it, it might help to share a simple example. Chances are that it's not hard to restructure it so that it avoids global state.
With express, have you tried middleware? You can chain a series of callback functions with a certain timeout after the article is created.
I assume you are in the context of Event Sourcing and microservices? If so I recommend that you don't publish a CreateArticleEvent to the event store, and instead directly create the article in the database and then publish the ArticleCreatedEvent to the Event store.
Why, you ask? Generally this pattern is created to orchestrate different microservices. In the example shown in the link above, it was used to orchestrate how the Customer service should react when an Order is created. Note the past tense. The Order Service created the order, and Customer Service reacts to it.
In your case it is easier (and probably better) to just insert the article into the database (by calling the ArticleService directly) and respond with the article ID. Then just publish the ArticleCreatedEvent to your event store, to trigger other microservices that may want to listen to it (for example, to trigger a notification to the editor for review).
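As a rough sketch of that ordering (persist first, then publish the past-tense event), assuming an in-memory `Map` in place of the database and an `EventEmitter` in place of the event store; `articles`, `eventStore` and `createArticle` are hypothetical names:

```javascript
const { EventEmitter } = require('node:events');

// Stand-ins for the real database and event store (hypothetical).
const articles = new Map();
const eventStore = new EventEmitter();
let nextId = 1;

// Persist the article first, then announce the fact in the past tense.
async function createArticle({ title, body }) {
  const id = nextId++;
  articles.set(id, { id, title, body });
  eventStore.emit('ArticleCreatedEvent', { articleId: id, title });
  return id; // the REST handler can respond with this right away
}

// Another service reacting to the event, e.g. asking an editor for a review.
eventStore.on('ArticleCreatedEvent', (event) => {
  console.log(`Review requested for article ${event.articleId}`);
});
```

The REST handler only awaits `createArticle`, so the client gets its ID without waiting on any subscriber.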
Event Sourcing is a good pattern, but we don't need to apply it to everything.
Related
I am developing an application where there is a dashboard for data insights.
The backend is a set of microservices written in NodeJS express framework, with MySQL backend. The pattern used is the Database-Per-Service pattern, with a message broker in between.
The problem I am facing is that I have this dashboard that derives data from multiple backend services (different databases altogether: some are SQL, some are NoSQL, and some are graph databases).
I want to avoid multiple queries between front end and backend for this screen. However, I want to avoid a single point of failure as well. I have come up with the following solutions.
Use an API gateway aggregator/composition that makes multiple calls to backend services on behalf of a single frontend request, and then compose all the responses together and send it to the client. However, scaling even one server would require scaling of the gateway itself. Also, it makes the gateway a single point of contact.
Create a facade service, maybe called dashboard service, that issues calls to multiple services in the backend and then composes the responses together and sends a single payload back to the client. However, this creates a synchronous dependency.
I favor approach 2. However, I have a question there as well. Since the services are written in nodeJs, is there a way to enforce time-bound SLAs for each service, and if the service doesn't respond to the facade aggregator, the client shall be returned partial, or cached data? Is there any mechanism for the same?
GraphQL has been designed for this.
You start by defining a global GraphQL schema that covers all the schemas of your microservices. Then you implement the fetchers (resolvers) that will "populate" the response by querying the appropriate microservices. You can start several instances to avoid a single point of failure. You can return partial responses if you have a timeout (your answer will include resolver errors). GraphQL knows how to manage cache.
Honestly, it is a bit confusing at first, but once you get it, it is really simple to extend the schema and include new microservices in it.
I can’t answer for Node’s technical implementation, but the second approach does let you model the calls to remote services so that each answer is expected within some time bound.
It depends on the way you interconnect the services. The easiest approach is to spawn an http request from the aggregator service to the service that actually brings the data.
This http request can be set up so that it won’t wait longer than X seconds for a response. So you spawn multiple http requests to different services simultaneously and wait for the responses. I come from the Java world, where these settings can be set at the level of the http client making those connections; I’m sure the Node ecosystem has something similar…
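In Node terms, the "fire all requests at once, give each a deadline, return whatever arrived" idea could look roughly like this. It is only a sketch: `withTimeout` and `buildDashboard` are hypothetical helpers, `services` is assumed to be an object of async functions (one per backend call), and `cache` is assumed to be a `Map` holding the last good value for each.

```javascript
// Reject if the given promise has not settled within ms milliseconds.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Call every service in parallel; keep what arrived in time, fall back to
// cached data (if any) for the rest, and report which parts failed.
async function buildDashboard(services, cache, ms) {
  const names = Object.keys(services);
  const results = await Promise.allSettled(
    names.map((name) => withTimeout(services[name](), ms, name))
  );
  const data = {};
  const errors = {};
  results.forEach((result, i) => {
    const name = names[i];
    if (result.status === 'fulfilled') {
      data[name] = result.value;
      cache.set(name, result.value); // remember the last good value
    } else {
      errors[name] = result.reason.message;
      if (cache.has(name)) data[name] = cache.get(name); // stale but usable
    }
  });
  return { data, errors };
}
```

`Promise.race` only stops waiting; it does not cancel the slow request itself. With a real HTTP client you would normally also set the client's own timeout option so the connection is released.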
If you prefer an asynchronous style of communication between the services, the situation is somewhat more complicated. In this case you can design some kind of ‘transactionId’ in the message protocol. So the requests from the aggregator service might include such a ‘transactionId’ (a UUID might work) and “demand” that the answer include the same transactionId. The sender, having sent the messages, should then wait for the responses for a certain amount of time and “quit waiting” after X seconds/milliseconds. All the responses that arrive after that time will be discarded, because nothing on the aggregator side is waiting to handle them any more.
BTW, this “aggregator” approach is also good and simple from the front end’s point of view, because the front end doesn’t have to deal with many requests to the backend as in the gateway approach, but only with one request. So I completely agree that the aggregator approach is better here.
My project is a full stack application where a web client subscribes to an unready object. When the subscription is triggered, the backend will run an observation loop to that unready object until it becomes ready. When that happens it sends a message to the frontend through socketIO (suggestions are welcome, I'm not quite sure if it's the best method). My question is how do I construct the observation loop.
My frontend basically subscribes to the backend and gets a 200 response, after which it connects to the server via WebSocket (socketIO) if it got subscribed correctly, or a 4XX error code if something went wrong. On the backend, when the user subscribes, it should start, for that user, a "thread" (I know Nodejs doesn't support threads, it's just for the mental image) that polls information from an api every 10 or so seconds.
I do that, because the API that I poll from does not support WebHooks, so I need to observe the API response until it's at the state that I want it (this part I already got cleared).
What I'm asking is: is there a third party library that is actually meant for those kinds of tasks? Should I use worker threads or simple setTimeouts abstracted by classes? The response will be sent over SocketIO, and I already have that part working as well; it's just the polling mechanism itself that I'm not quite sure how to build.
I'm also open to use another fitting programming language that makes solving this case easier. I'm not in a hurry.
A polling network request (which it sounds like this is) is non-blocking and asynchronous so it doesn't really take much of your nodejs CPU unless you're doing some heavy-weight computation of the result.
So, a single nodejs thread can make a lot of network requests (for your polling and for sending data over socket.io connection) without adding WorkerThreads or clustering. This is something that nodejs is very, very good at.
I'm not aware of any third party library specifically for this as you have to custom code looking at the results of the network request anyway and that's most of the coding. There are a bunch of libraries for making http requests of other servers from nodejs listed here. My favorite in that list is got(), but you can look at the choices and decide what you like.
As for making the repeated requests, I would probably just use either repeated setTimeout() calls or a setInterval() call.
You don't say whether you have to make separate requests for every single client that is subscribed to something or whether you can somehow combine all clients watching the same resource so that you use the same polling interval for all of them. If you can do the latter, that would certainly be more efficient.
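Here's a minimal sketch of what I mean by repeated setTimeout() calls combined with one shared polling loop per resource. `SharedPoller` is a hypothetical class; `fetchState` stands for whatever async call hits the API you are polling, and `isReady` for your own check on its result.

```javascript
// One polling loop per resource, shared by every subscriber of that resource.
class SharedPoller {
  constructor(fetchState, isReady, intervalMs) {
    this.fetchState = fetchState; // async (resourceId) => state
    this.isReady = isReady;       // (state) => boolean
    this.intervalMs = intervalMs;
    this.watchers = new Map();    // resourceId -> Set of callbacks
  }

  subscribe(resourceId, onReady) {
    const existing = this.watchers.get(resourceId);
    if (existing) {
      existing.add(onReady); // piggyback on the loop that is already running
      return;
    }
    this.watchers.set(resourceId, new Set([onReady]));
    this.poll(resourceId);
  }

  async poll(resourceId) {
    const callbacks = this.watchers.get(resourceId);
    if (!callbacks) return;
    try {
      const state = await this.fetchState(resourceId);
      if (this.isReady(state)) {
        this.watchers.delete(resourceId); // stop polling this resource
        for (const callback of callbacks) callback(state);
        return;
      }
    } catch (err) {
      // A failed poll is treated as "not ready yet"; the next attempt retries.
    }
    // Schedule the next poll only after this one finished, so polls never overlap.
    setTimeout(() => this.poll(resourceId), this.intervalMs);
  }
}
```

Each `onReady` callback is where you would emit the socket.io message to the client that subscribed. A real version would also want an unsubscribe (for disconnects) and probably a maximum number of attempts.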
If, as you scale, you run into scaling issues, you can then move the polling code to one or more child processes or WorkerThreads and then just communicate back to the main thread via messaging when you have found a new state that needs to be sent to the client. But, I would not anticipate you would need to code that extra step until you reach larger scale. As with most scaling things, you would need to code up the more basic option (which should scale well by itself) and then measure and benchmark and see where any bottlenecks are and modify the architecture based on data, not speculation. Far too often, the architecture is over-designed and over-implemented based on where people think the bottlenecks might be rather than where they actually turn out to be. Not only does this make the development take longer and end up with more complicated implementation than required, but it can target development at the wrong part of the problem. Profile, measure, then decide.
I want to show the user exactly to the second when he can have access to a given page, otherwise it will be blocked. Let's say that I receive a specific date and time from the server.
I guess I could use the setTimeout function but I'm sure it's a bad idea.
I can use a scheduler like node cron in backend but I'd need to send a message to frontend somehow after given time has passed.
Are WebSockets an option? Or is there an easier way?
I want to show the user exactly to the second when he can have access to a given page
For such accuracy, indeed the WebSocket communication is the way to go. This protocol is widely used on the web for push notification in email/social services like Gmail, Facebook etc.
Regarding the backend, I would suggest using a more scalable approach. You could use Bull to create a scheduling service. Bull uses Redis as a store and can operate with multiple processors (Node processes), ensuring that each task is processed by only one processor. In short, it abstracts away the complexities that arise in distributed systems.
I'm planning a non-trivial realtime chat platform. The app has several types of resources: Users, Groups, Channels, Messages. There are roughly 20 types of realtime events having to do with these resources: for instance, submitting a message, a user connecting or disconnecting, a user joining a group, a moderator kicking a user from a group, etc...
Overall, I see two paths to organizing all this complexity.
The first is to build a REST API to manage the resources. For instance, to send a message, POST to /api/v1/messages. Or, to kick a user from a group, POST to /api/v1/group/:group_id/kick/. Then, from within the Express route handler, call io.emit (made accessible through res.locals) with the updated data to notify all related clients. In this case, clients talk to the server through HTTP and the server notifies clients through socket.io.
The other option is to not have a rest API at all, and handle all events through socket.IO. For instance, to send a message, emit a SEND_MESSAGE event. Or, to kick a user, emit a KICK_USER event. Then, from within the socket.io event handler, call io.emit with the updated data to notify all clients.
Yet another option is to have certain actions handled by a REST API, others by socket.IO. For instance, to get all messages, GET api/v1/channel/:id/messages. But to post a message, emit SEND_MESSAGE to the socket.
Which is the most suitable option? How do I determine which actions need to be sent through an API, and which need to be sent through socket.io? Is it better not to have a REST API for this type of application?
Some of my thoughts so far, nothing conclusive:
Advantages of REST API over the socket.io-only approach:
Easier to organize hierarchically, more modular
Easier to test
More robust and elegant
Simpler auth implementation with middleware
Disadvantages of REST API over the socket.io-only approach:
Slightly less performant (source)
Since a socket connection needs to be open anyways, why not use it for everything?
Slightly harder to manage on the client side.
Thanks for reading !
This could be achieved using sockets.
Why? Because a chat application will have dozens of actions, like:
'STARTS_TYPING', 'STOPS_TYPING', 'SEND_MESSAGE', 'RECEIVE_MESSAGE',...
Accommodating all these features using REST APIs will produce a complex system that performs poorly.
Also, the concept of rooms in socket.io removes a lot of the headache of implementing group chat.
So it's better to build everything based on sockets [socket.io or web cluster].
Here is the solution I found to solve this problem.
The key mistake in my question was that I assumed a rest API and websockets were mutually exclusive, because I intended on integrating the business and database logic directly in express routes and socket.io handlers. Thus, choosing between socket.io and http was important, because it would influence the core business logic of my app.
Instead, it shouldn't matter which transport to use. The business logic has to be independent from the transport logic, in its own module.
To do this, I developed a service layer that handles CRUD tasks, but also more specific tasks such as authentication. Then, this service layer can be easily consumed from either or both express routes and socket.io handlers.
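A stripped-down sketch of that layering, with an in-memory array instead of a database. All names here (`MessageService`, `httpSendMessage`, `socketSendMessage`) are made up for illustration, and `req.user` / `socket.userId` are assumed to be set by whatever auth step runs earlier.

```javascript
// Transport-agnostic business logic: knows nothing about HTTP or sockets.
class MessageService {
  constructor() {
    this.messages = []; // in-memory stand-in for the database
  }

  sendMessage(userId, channelId, text) {
    if (!text || !text.trim()) throw new Error('Message text is required');
    const message = { id: this.messages.length + 1, userId, channelId, text: text.trim() };
    this.messages.push(message);
    return message;
  }

  listMessages(channelId) {
    return this.messages.filter((m) => m.channelId === channelId);
  }
}

// Thin HTTP adapter with the (req, res) shape of an Express route handler.
function httpSendMessage(service) {
  return (req, res) => {
    try {
      const message = service.sendMessage(req.user.id, req.params.channelId, req.body.text);
      res.status(201).json(message);
    } catch (err) {
      res.status(400).json({ error: err.message });
    }
  };
}

// Thin socket adapter with the (payload, ack) shape of a socket.io event handler.
function socketSendMessage(service, socket) {
  return (payload, ack) => {
    let reply;
    try {
      const message = service.sendMessage(socket.userId, payload.channelId, payload.text);
      reply = { ok: true, message };
    } catch (err) {
      reply = { ok: false, error: err.message };
    }
    ack(reply);
  };
}
```

Wiring is then one line per transport, e.g. registering `httpSendMessage(service)` on a POST route and `socketSendMessage(service, socket)` on the SEND_MESSAGE event. Validation and persistence live only in the service, so both paths behave the same.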
In the end, this architecture allowed me to easily switch between transport technologies.
I'm trying to figure out an appropriate method to carry a request-id (x-request-id from a restify request header) through my stack; across thrift inter-service calls, and with rabbitmq queue messages. The goal is that anywhere, in any service, I can correlate an error or event back to an initiating http request. Is there a known practice for doing this with Node? I'd like to avoid passing a context around through virtually every function call.
I've looked into the way New Relic handles instrumentation, and there's this blog: https://opbeat.com/blog/posts/how-we-instrument-nodejs/; but these types of instrumentation require hooking into tons of node core library calls, and don't really help with carrying the context across thrift calls.
How can I take a restify header id such as "x-request-id" from a request, and have access to it deeper in my stack (even in async callbacks) without modifying every function to pass the values through?
I'm also looking for a clean way to pass it through all thrift calls (getting it across service boundaries).
This is with TypeScript and Node.js 5.x
Thanks!
Is there a known practice for doing this with Node
Within NodeJs you pass the request around to wherever you need the request context.
With every other system you need to carry the stuff around in the request system format for that thing. E.g. for Event store we store it in the event metadata.
For thrift I recommend just adding it as a property in every query that is echoed back in every response.
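A sketch of the "property on every query, echoed back on every response" idea. To avoid threading the id through every function, this version reads it from `AsyncLocalStorage`, which is a swapped-in technique and an assumption on my part: it only exists in much newer Node versions (12.17+ / 13.10+), not in the 5.x mentioned in the question, where a userland package such as continuation-local-storage was the usual way to get similar behaviour. `runWithRequestId`, `currentRequestId` and `withRequestId` are hypothetical helpers, and `call` stands for any async client method (thrift or otherwise) that takes a query object.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');

const requestContext = new AsyncLocalStorage();

// Run the rest of the request inside a context that carries the id,
// e.g. from a middleware that reads the x-request-id header.
function runWithRequestId(requestId, handler) {
  return requestContext.run({ requestId }, handler);
}

// Readable anywhere down the async call chain, without passing it as an argument.
function currentRequestId() {
  const store = requestContext.getStore();
  return store ? store.requestId : undefined;
}

// Wrap an outgoing call so every query carries the id as a property and
// every response is checked for the echoed value.
function withRequestId(call) {
  return async (query) => {
    const requestId = currentRequestId();
    const response = await call({ ...query, requestId });
    if (response.requestId !== requestId) {
      throw new Error(`Request id mismatch: sent ${requestId}, got ${response.requestId}`);
    }
    return response;
  };
}
```

The same `currentRequestId()` lookup can be used when building rabbitmq message headers or event metadata, so the id crosses each boundary in that system's own format.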