Handling Response on a Worker Process in NodeJs

I am trying to design a service following the Command Query Responsibility Segregation Pattern (CQRS) in NodeJs. To handle segregation, I am taking the following approach:
Create separate workers for querying and executing commands
Expose them using a REST API
The REST API has been designed with ExpressJs. All endpoints starting with 'update', 'create' and 'delete' keywords are treated as commands; all endpoints with 'get' or 'find' are treated as queries.
When a request reaches its designated handler, one of the following occurs:
1. If it's a command, a response is sent immediately after delegating the task to a worker process; other services are notified by generating appropriate events when the master process receives a completion message from the worker.
2. If it's a query, the response is handled by a designated worker that can use a reference to the database connection, passed in as an argument, to fetch and send the query result.
For (2) above, I am trying to create a mechanism that somehow "passes" the response object to the worker, which can then complete the request. Can this be done by "cloning" the response object and passing it as plain arguments? If not, what is the preferred way of achieving this?

I think you are better off in (2) passing the query to a worker process, which returns the result to the master process, which then sends back the response.
First of all, you don't really want to give the worker processes "access" to the outside. They should be all internal workers, managed by the master process.
Second, the Express server's job is to receive requests, do something with them, then return a result. It seems like over-complicating to try to pass the communication off to a worker.
If you are really worried about your Express server getting overwhelmed with requests, you should consider something like Docker to create a "swarm" of express instances.
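A minimal sketch of that flow, under a few assumptions: it uses a worker_threads Worker as a stand-in for a separate worker process so that it fits in one file, the "query" is a placeholder rather than a real database call, and the names createQueryDispatcher and runQuery are hypothetical. The response object never leaves the master; only plain, cloneable data crosses the boundary, matched up by a request id.

```javascript
const { Worker } = require('node:worker_threads');

// Worker source is inlined (eval: true) only to keep this sketch in one file.
const workerSource = `
  const { parentPort } = require('node:worker_threads');
  parentPort.on('message', ({ id, query }) => {
    // Placeholder for a real database lookup.
    parentPort.postMessage({ id, result: { found: query.name.toUpperCase() } });
  });
`;

function createQueryDispatcher() {
  const worker = new Worker(workerSource, { eval: true });
  const pending = new Map(); // request id -> resolve callback
  let nextId = 0;

  // The worker replies with the same id, so the master can find the waiting caller.
  worker.on('message', ({ id, result }) => {
    const resolve = pending.get(id);
    pending.delete(id);
    if (resolve) resolve(result);
  });

  return {
    runQuery(query) {
      return new Promise((resolve) => {
        const id = nextId++;
        pending.set(id, resolve);
        worker.postMessage({ id, query }); // only plain data is sent
      });
    },
    close: () => worker.terminate(),
  };
}

// Hypothetical use inside an Express handler in the master process:
//   app.get('/findUser', async (req, res) => {
//     res.json(await dispatcher.runQuery({ name: req.query.name }));
//   });
```

With child_process.fork the shape would be similar, with process.send and the 'message' event taking the place of postMessage; error handling and timeouts are left out here.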

Related

Does NodeJS spin up a new process for new request?

I have a backend NodeJS API and I am trying to set a trace ID. What I have been thinking is that I would generate a UUID through a Singleton module and then use it across the application for logging. But since NodeJS is single-threaded, would that mean that the UUID will always remain the same for all clients?
For eg: If the API gets a request from https://www.example.com/client-1 and https://www.example-two.com/client-2, would it spin a new process and thereby generate separate UUIDs? or it's just one process that would be running with a single thread? If it's just one process with one thread then I think both the client apps will get the same UUID assigned.
Is this understanding correct?
Node.js uses a single thread to run all your JavaScript (unless you specifically create a Worker thread or a child process). Node.js uses some threads internally for some of the library functions, but those aren't used for running your JavaScript and are transparent to you.
So, unlike some other environments, each new request runs in the same thread. There is no new process or thread created for an incoming request.
If you use some singleton, it will have the same value for every request.
But since NodeJS is single threaded, would that mean that UUID will always remain the same for all clients?
Yes, the UUID would be the same for all requests.
For eg: If the API gets a request from https://www.example.com/client-1 and https://www.example-two.com/client-2, would it spin a new process and thereby generate separate UUIDs?
No, it would not spin a new process and would not generate a new UUID.
or it's just one process that would be running with a single thread? If it's just one process with one thread then I think both the client apps will get the same UUID assigned.
One process. One thread. Same UUID from a singleton.
If you're trying to put some request-specific UUID in every log statement, then there aren't many options. The usual option is to coin a new UUID for each new request in some middleware and attach it to the req object as a property such as req.uuid and then pass the req object or the uuid itself as a function argument to all code that might want to have access to it.
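A minimal sketch of the middleware approach, assuming Express-style middleware with the usual (req, res, next) signature; attachRequestId and logWithRequest are hypothetical names, and crypto.randomUUID() is used to coin the id.

```javascript
const { randomUUID } = require('node:crypto');

// Express-style middleware: coins a fresh UUID for every incoming request.
function attachRequestId(req, res, next) {
  req.uuid = randomUUID();
  next();
}

// Any code that logs takes the req object (or just the uuid) as an argument.
function logWithRequest(req, message) {
  return `[${req.uuid}] ${message}`;
}

// Hypothetical wiring in an Express app:
//   app.use(attachRequestId);
//   app.get('/orders', (req, res) => {
//     console.log(logWithRequest(req, 'listing orders'));
//     res.sendStatus(200);
//   });
```

Because the id lives on req rather than in a module-level singleton, two concurrent requests never share it.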
There is also a technology called "async local storage" that could serve you here; it is documented as the AsyncLocalStorage class in the node:async_hooks module. It can be used much like "thread local storage" in environments that do use a thread for each new request. It provides storage tied to an execution context, which each incoming request that is still being processed will have, even as it goes through various asynchronous operations and even when it temporarily returns control to the event loop.
The async local storage interface went through several different implementations and was experimental for some time; in current Node.js releases AsyncLocalStorage is marked stable.
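A minimal sketch of the async local storage approach, assuming Express-style middleware; requestContext, withTraceId and log are hypothetical names. The store set by run() is visible to everything called from within that request's callback chain, including code that runs after awaits and timers.

```javascript
const { AsyncLocalStorage } = require('node:async_hooks');
const { randomUUID } = require('node:crypto');

const requestContext = new AsyncLocalStorage();

// Express-style middleware: everything downstream of next() sees this store.
function withTraceId(req, res, next) {
  requestContext.run({ traceId: randomUUID() }, next);
}

// Logging helper that needs no req argument; it reads the current context.
function log(message) {
  const store = requestContext.getStore();
  const traceId = store ? store.traceId : 'no-request';
  return `[${traceId}] ${message}`;
}
```

Compared with passing req.uuid around explicitly, this keeps function signatures unchanged, at the cost of some implicit state.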
See this diagram to understand how a Node.js server handles requests compared to servers in other languages.
So in your case there won't be a separate thread.
And unless you are creating separate processes, by using pm2 to run your app or by explicitly creating a process with the built-in modules, there won't be a separate process either.
Node.js runs your JavaScript on a single thread, although internally it uses additional threads for certain operations that would otherwise block the event loop.
What I have been thinking is that I would generate a UUID through a Singleton module
Yes, it will generate the UUID only once, and every new request will reuse the same UUID; that is the whole point of the Singleton design pattern.
would it spin a new process and thereby generate separate UUIDs? or it's just one process that would be running with a single thread?
A process is an instance of a running program and can have one or more threads. Here the process is Node.js, and the event loop and the call stack both run on its single main thread. Each time a request is received, it goes through the event loop and its handler is pushed onto the call stack for execution.
You can create a separate process in Node.js using the child_process module.
Is this understanding correct?
Yes, your understanding of the UUID Singleton pattern is correct. I recommend looking at how Node.js processes requests. This video helps explain how the event loop works.

Play Framework Scala thread affinity

We have our HTTP layer served by Play Framework in Scala. One of our APIs is something of the form:
POST /customer/:id
Requests are sent by our UI team which calls these APIs through a React Framework.
The issue is that, sometimes, the requests are issued in batches, successively one after the other for the same customer ID. When this happens, different threads process these requests and so our persistent layer (MySQL) reaches an inconsistent state due to the difference in the timestamp of the handling of these requests.
Is it possible to configure some sort of thread affinity in Play Scala? What I mean by that is, can I configure Play to ensure that requests of a particular customer ID are handled by the same thread throughout the life-cycle of the application?
Batching means putting several API calls into a single HTTP request.
A batch request is a set of commands in one HTTP request, like here https://developers.facebook.com/docs/graph-api/making-multiple-requests/
You describe it as
The issue is that, sometimes, the requests are issued in batches, successively one after the other for the same customer ID. When this happens, different threads process these requests and so our persistent layer (MySQL) reaches an inconsistent state due to the difference in the timestamp of the handling of these requests.
This is a set of concurrent requests. Play Framework usually works as a stateless server, and I assume you have also organized it as stateless. Nothing binds one request to another, so you can't control the order. Well, you can, if you create a special protocol such as "opening batch request", request #1, #2, ..., "closing batch request". You would need to check that every request arrived and was correct. You would also need to run some stateful threads and some queues ... I thought Akka could help with this, but I am pretty sure you won't want to do that.
This issue is not specific to Play Framework. You will reproduce it on any server. For example, the general case: Is it possible to receive out-of-order responses with HTTP?
You can go either way:
1. "Batch" the command in one request
You need to change the client so it jams "batch" requests into one. You also need to change the server so it processes all the commands from the batch one after another.
Example of the requests: https://developers.facebook.com/docs/graph-api/making-multiple-requests/
2. "Pipeline" requests
You need to change the client so it sends the next request only after receiving the response to the previous one.
Example: Is it possible to receive out-of-order responses with HTTP?
"The solution to this is to pipeline Ajax requests, transmitting them serially. ... . The next request sent only after the previous one has returned successfully."
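Option 2 can be sketched on the client side as a per-customer promise chain. This is only an illustration under assumptions: enqueue and chains are hypothetical names, task stands for whatever function actually issues the HTTP request, and unlike the quoted description this version moves on to the next request even if the previous one failed.

```javascript
// One promise chain per customer ID; requests for the same customer run serially.
const chains = new Map();

function enqueue(customerId, task) {
  const previous = chains.get(customerId) ?? Promise.resolve();
  // Swallow the previous failure so one bad request does not block the rest.
  const next = previous.catch(() => {}).then(task);
  chains.set(customerId, next);
  return next; // settles with the outcome of this particular task
}

// Hypothetical usage from the UI:
//   enqueue(id, () => fetch(`/customer/${id}`, { method: 'POST', body: first }));
//   enqueue(id, () => fetch(`/customer/${id}`, { method: 'POST', body: second }));
```

Requests for different customer IDs still run concurrently, since each ID has its own chain. In a long-lived page the map would need pruning once a chain has drained.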

Background processes in the aiohttp event loop

I have a web service that accepts post requests. A post request specifies a specific job to be executed in the background, that modifies a database used for later analysis. The sender of the request does not care about the result, and only needs to receive a 202 acknowledgment from the web service.
How it was implemented so far:
The Flask web service gets the HTTP request, adds the necessary parameters to the task queue (rq workers), and returns an acknowledgement. A separate rq worker process listens on the queue and processes the job.
We have now switched to aiohttp, and realized that the web service can now schedule the actual job request in its own event loop, by using the asyncio.ensure_future() method.
This however blurs the lines between the web-server and the task queue. On the positive side, it eliminates the need of having to manage the rq workers.
Is this considered a good practice?
If your tasks are not CPU-heavy, then yes, it is good practice.
But if they are, you need to move them to a separate service or use run_in_executor(). Otherwise your aiohttp event loop will be blocked by these tasks and the server will not be able to accept new requests.

Java ExecutorService for Async web service

We need to implement an async web service.
Behaviour of web service:
We send the request for an account to the server and it sends back a sync response with an acknowledgement ID. After that we get multiple callback requests which contain that acknowledgement ID. The last callback request for an acknowledgement ID will contain a text (completed:true) in the response, which tells us that this is the last callback request for that account and acknowledgement ID. This lets us know that the async call for a particular account is completed and we can mark its final status. We need to execute this web service for multiple accounts, so we will be getting callback requests for many accounts.
Question:
What is the optimal way to process these multiple callback requests coming for multiple accounts.
Solutions that we thought of:
ExecutorService Fixed Thread Pool: This will process our callback requests in parallel, but the concern is that it does not maintain the sequence. So it will be difficult for us to determine that the last callback request for an acknowledgement ID (account) has arrived. Hence, we will not be able to mark the final status of that account as completed with certainty.
ExecutorService Single Thread Executor: Here there is only one thread in the pool, with an unbounded queue. If we use this, processing will be pretty slow, as only one thread will actually be processing.
Please suggest an optimal way to implement this requirement, both memory- and performance-wise.
Let's be clear about one thing: HTTP is a blocking, synchronous protocol. Request/response pairs aren't async. What you're doing is spawning async requests and returning to the caller to let them know whether the request is being processed (HTTP 200) or not (HTTP 500).
I'm not sure that I know optimal for this situation, but there are other considerations:
Use an ExecutorService thread pool that you can configure. Make sure the threads have a name prefix that lets you distinguish them from others.
Add request tasks to a blocking deque and have a pool of consumer threads process them. You can tune the deque and the consumer thread pool sizes.
If processing is intensive, send request messages to a queue running on another server. Have a pool of queue listeners process the requests.
You cannot assume that the callbacks will return in a certain order. Don't depend on "last" being "true". You'll have to join all those threads together to know when they're finished.
It sounds like the web service should have a URL that lets users query for status.

API with Work Queue Design Pattern

I am building an API that is connected to a work queue and I'm having trouble with the structure. What I'm looking for is a design pattern for a worker queue that is interfaced via an API.
Details:
I'm using a Node.js server and Express to create an API that takes a request and returns JSON. These requests can take a long time to process (very data intensive), which is why we use a queuing system (RabbitMQ).
So for example, let's say I send a request to the API that will take 15 min to process. The Express API formats the request and puts it in a RabbitMQ (AMQP) queue. The next available worker takes the request off the queue and starts to process it. After it's done (in this case 15 min) it saves the data into MongoDB. .... now what .....
My issue is, how do I get the finished data back to the caller of the API? The caller is a completely separate program that contacts the API via something like an Ajax request.
The worker will save the processed data into a database but I have no way to push back to the original calling program.
Does anyone have any API with a work queue resources?
Please and thank you.
On the initiating call by the client you should return to the client a task identifier that will persist with the data all the way to MongoDB.
You can then provide an additional API method for the client to check the task's status. This method should take a single parameter, the task identifier, and check whether a document with that identifier has made it into your collection in MongoDB. Return false if it doesn't exist yet, true when it does.
The client will have to repeatedly poll (but maybe at a 1 minute interval) the task status API method until it returns true.
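The task-identifier pattern can be sketched as follows. Everything here is an assumption for illustration: an in-memory Map stands in for the MongoDB collection, the enqueue callback stands in for publishing to RabbitMQ, and submitTask, isTaskDone and pollUntilDone are hypothetical names.

```javascript
const { randomUUID } = require('node:crypto');

// Stand-in for the MongoDB collection the worker writes finished results to.
const results = new Map();

// Initiating call: coin a task id, queue the job, and hand the id to the client.
function submitTask(payload, enqueue) {
  const taskId = randomUUID();
  enqueue({ taskId, payload }); // the id travels with the job to the worker
  return { taskId };
}

// Status method: true once a result with that id has been stored.
function isTaskDone(taskId) {
  return results.has(taskId);
}

// Client side: poll the status method at a fixed interval, up to a limit.
async function pollUntilDone(taskId, intervalMs, maxAttempts) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (isTaskDone(taskId)) return true;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  return false;
}
```

In a real deployment the client would call the status endpoint over HTTP rather than isTaskDone directly, and the interval would be on the order of the one minute suggested above.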

Resources