NodeJS server: run an individual process for each request in the queue

See this example node.js code:
const http = require('http');
const server = http.createServer(function (req, res) {
  if (req.url === '/loop') {
    console.log('LOOP');
    while (true) {}
  }
  res.write('Hello World');
  res.end();
});
server.listen(3000);
In my script each request takes 3 to 5 seconds to process; the while (true) {} is just an example.
The problem is that Node.js does not process another request while one request is being processed.
I want to handle multiple requests at the same time, but the server handles only one request at a time.
NOTE: I don't want to start a cluster worker or a child_process for each request, because starting one takes another ~65 ms.

When you create a server (and listen), Node.js runs your request handler on a single event loop. You cannot put an infinite loop inside it, since that blocks the very event loop your server is running on.
I hope you are not really dealing with an infinite loop but with some processing that takes time; for that you can use modules like async.
In the request/response handler, use the async module like this:
async.map(['param1', 'param2', 'param3'], task, function (err, results) {
  // results of the task function
});
What this does is make use of the already running event loop instead of blocking it.
Points to note:
Most JavaScript VMs are single threaded (including Node.js), so you can also use the setTimeout function instead of an infinite while loop (see the sketch below).
You will not be able to create a thread in Node.js (single-threaded VM); use a process-based solution such as cluster or child_process instead.
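To make that concrete, here is a minimal sketch (my own illustration, not part of the original answer) of the same server with the long-running job split into small chunks that yield back to the event loop via setTimeout, so other requests can be served in between:
const http = require('http');

// Stand-in for the 3-5 second job from the question, split into small chunks.
function doChunkOfWork(state) {
  state.done += 1;
  return state.done >= 1000; // true once the whole job is finished
}

const server = http.createServer(function (req, res) {
  if (req.url === '/loop') {
    console.log('LOOP');
    const state = { done: 0 };
    const step = () => {
      if (doChunkOfWork(state)) {
        res.end('Hello World');   // job finished, answer this request
      } else {
        setTimeout(step, 0);      // yield to the event loop before the next chunk
      }
    };
    return step();
  }
  res.write('Hello World');
  res.end();
});

server.listen(3000);
With the work chunked like this, a second request arriving while /loop is in progress is still answered promptly.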

Related

Concurrency in node js express app for get request with setTimeout

const express = require('express');
const app = express();
const port = 4444;

app.get('/', async (req, res) => {
  console.log('got request');
  await new Promise(resolve => setTimeout(resolve, 10000));
  console.log('done');
  res.send('Hello World!');
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
If I hit http://localhost:4444 with three GET requests concurrently, the logs come out like this:
got request
done
got request
done
got request
done
Shouldn't it print the output in the way shown below, because of Node's event loop and callback queues, which are external to the process thread? (Maybe I am wrong, but I need some understanding of Node's internals and its external APIs; please see the attached image.)
[image: JavaScript runtime environment]
got request
got request
got request
done
done
done
Thanks to https://stackoverflow.com/users/5330340/phani-kumar I found the reason why it was blocking. I was testing this in Chrome: I was making the GET requests from the Chrome browser, and when I tried the same in Firefox it worked as expected.
The reason is this:
Chrome locks the cache and waits to see the result of one request before requesting the same resource again.
Chrome stalls when making multiple requests to same resource?
With the browser cache out of the way, the responses come back in the expected interleaved order (got request × 3, then done × 3).
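If you want to rule the browser out entirely, the following small script (my own addition, assuming the server above is listening on port 4444) fires three requests at once from Node itself; the server logs should then show "got request" three times before any "done":
const http = require('http');

// Fire three concurrent requests against the server from the question.
for (let i = 0; i < 3; i++) {
  http.get('http://localhost:4444/', (res) => {
    res.resume(); // drain the body so the connection can close
    res.on('end', () => console.log(`response ${i} finished`));
  });
}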
Node.js is event driven. To understand the concurrency here, you should look at how Node executes this code. Node is single threaded (although internally it uses multiple threads) and accepts requests as they come in. In this case, Node accepts a request and registers a callback for the promise; in the meantime, while it is waiting for the event loop to execute that callback, it accepts as many further requests as it can handle (limited by memory, CPU, etc.). Because setTimeout has its own queue in the event loop, all of these callbacks are registered there, and once each timer completes the event loop drains that queue.
Single Threaded Event Loop Model processing steps:
Clients send requests to the Node.js server.
Node.js internally maintains a limited (configurable) thread pool to provide services for client requests.
Node.js receives those requests and places them into a queue known as the "Event Queue".
Node.js internally has a component known as the "Event Loop"; it got this name because it uses an indefinite loop to receive requests and process them.
The Event Loop uses a single thread only. It is the heart of the Node.js processing model.
The Event Loop checks whether any client request is placed in the Event Queue. If not, it waits indefinitely for incoming requests.
If yes, it picks up one client request from the Event Queue and starts processing it.
If that client request does not require any blocking I/O operations, it processes everything, prepares the response and sends it back to the client.
If that client request requires blocking I/O operations (interacting with a database, the file system or external services), it follows a different approach:
it checks thread availability in the internal thread pool,
picks up one thread and assigns the client request to that thread,
and that thread is responsible for taking the request, processing it, performing the blocking I/O, preparing the response and sending it back to the Event Loop.
You can check here for more details (very well explained).
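As a small illustration of those last steps (my own sketch, using crypto as a stand-in for work that Node hands off to the libuv thread pool), the asynchronous call below runs off the event loop, while the commented-out synchronous variant would block it:
const crypto = require('crypto');

// Non-blocking: the hashing runs on a thread-pool thread, the event loop stays free.
crypto.pbkdf2('secret', 'salt', 100000, 64, 'sha512', (err, key) => {
  if (err) throw err;
  console.log('async pbkdf2 done');
});

// Blocking: the same work on the event-loop thread would stall every other request.
// crypto.pbkdf2Sync('secret', 'salt', 100000, 64, 'sha512');

console.log('this line runs immediately, which shows the event loop is not blocked');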

How do I avoid blocking an Express REST service?

When making a REST service using Express in Node, how do I prevent a blocking task from blocking the entire REST service? Take as an example the following Express REST service:
const express = require('express');
const app = express();

app.get('/', (req, res) => res.send('Hello, World'));

const blockService = async function () {
  return new Promise((resolve, reject) => {
    const end = Date.now() + 20000;
    while (Date.now() < end) {
      const doSomethingHeavyInJavaScript = 1 + 2 + 3;
    }
    resolve('I am done');
  });
};

const blockController = function (req, res) {
  blockService().then((val) => {
    res.send(val);
  });
};

app.get('/block', blockController);
app.listen(3000, () => console.log('app listening on port 3000'));
In this case, a call to /block will render the entire service unreachable for 20 seconds. This is a big problem if there are many clients using the service, since no other client will be able to access it during that time. This is obviously a problem of the while loop being blocking code and thus hanging the main thread. The code can be confusing because, despite using a promise in blockService, the main thread still hangs. How do I ensure that blockService runs on a worker thread and not on the event loop?
By default node.js runs your Javascript code in a single thread. So, if you really have CPU intensive code in a request handler (like you show above), then that is indeed a problem. Your options are as follows:
Start up a Worker Thread and run the CPU-intensive code in a worker thread. Since version 10, node.js has had worker threads for this purpose. You then communicate back the result to the main thread with messaging.
Start up any other process that runs node.js code or any type of code and compute the result in that other process. You then communicate back the result to the main thread with messaging.
Use node clustering to start N processes so that if one process is stuck with a CPU-intensive operation, at least one of the others is hopefully free to run other requests.
Please note that a lot of things that servers do like read files, do networking, make requests to databases are all asynchronous and non-blocking so it's not incredibly common to actually have lots of CPU intensive code. So, if this is just a made up example for your own curiosity, you should make sure you actually have a CPU-intensive problem in your server before you go designing threads or clusters.
Node.js uses an event-based model with a single runtime thread. For the reasons you've discovered, Node.js is not a good choice for CPU-bound tasks (or synchronously blocking tasks); it works best for coordinating I/O asynchronously.
worker_threads are available as a stable feature since Node.js v12. They allow you to use another thread for blocking tasks, are relatively simple to use, and could work if you absolutely need to offload blocking work, as in the sketch below.
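As a concrete illustration of the worker-thread option (a sketch of my own, not the asker's code verbatim), the busy loop from blockService can be moved into a worker spawned from the same file, so the event loop keeps serving / while /block is busy:
const { Worker, isMainThread, parentPort } = require('worker_threads');

if (isMainThread) {
  const express = require('express');
  const app = express();

  app.get('/', (req, res) => res.send('Hello, World'));

  app.get('/block', (req, res) => {
    // Re-run this same file as a worker; the heavy loop runs there.
    const worker = new Worker(__filename);
    worker.once('message', (val) => res.send(val));
    worker.once('error', (err) => res.status(500).send(err.message));
  });

  app.listen(3000, () => console.log('app listening on port 3000'));
} else {
  // Worker thread: 20 seconds of busy work, off the main event loop.
  const end = Date.now() + 20000;
  while (Date.now() < end) {
    const doSomethingHeavyInJavaScript = 1 + 2 + 3;
  }
  parentPort.postMessage('I am done');
}
Spawning a worker per request has its own startup cost, so under real load you would typically keep a small pool of workers alive and hand jobs to them instead.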

How to spawn a single worker instance from a parent that is itself spawned multiple times?

I have a server and a scraper. The server spawns the scraper, but the server itself is spawned multiple times by pm2 depending on the machine's core count. I want the server to be spawned multiple times but only one instance of the scraper to run. Is this possible?
// index.js (this file is forked multiple times by pm2)
const server = http.createServer(handler);
require('./scraper'); // this should only be forked once
You definitely need some kind of shared place to store startup state: the file system, a socket, memory, etc. The snippet below uses a TCP port as a lock; only the first process that manages to bind it runs the function.
function runOnce(fn) {
  const http = require('http');
  // Binding a fixed port acts as a cross-process lock: only the first process
  // to bind it succeeds and runs fn; every other process gets an error ('skip').
  http.createServer()
    .on('error', _ => console.log('skip'))
    .listen(9999, _ => fn() && console.log('run'));
}
runOnce(_ => require('./scraper'));
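If you prefer the file-system option mentioned above, a hypothetical variant of the same runOnce idea uses an exclusive lock file instead of a port (note that a stale lock file survives a crash, which is why the port trick is often more convenient):
const fs = require('fs');

function runOnceWithLockFile(fn, lockPath = '/tmp/scraper.lock') {
  try {
    // 'wx' creates the file and fails with EEXIST if it already exists,
    // so only the first process gets past this line.
    const fd = fs.openSync(lockPath, 'wx');
    fs.writeSync(fd, String(process.pid));
    fn();
    console.log('run');
  } catch (err) {
    if (err.code !== 'EEXIST') throw err;
    console.log('skip');
  }
}

runOnceWithLockFile(_ => require('./scraper'));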

Why clusters don't work when requesting the same route at the same time in Express Node JS?

I wrote a simple Express application example handling 2 GET routes. The first route contains a while loop which represents a blocking operation that takes 5 seconds.
The second route simply returns a Hello world text.
I also set up a cluster following the simple guide in the Node.js documentation.
Result of what I've tried:
Make 2 requests to the 2 different routes at the same time => They work independently, as expected. Route / took 5 seconds and route /hello took several ms.
Make 2 requests to the same route / at the same time => They run sequentially: one responds after 5 seconds and the other after 10 seconds.
const cluster = require("cluster");
const express = require("express");
const app = express();
if (cluster.isMaster) {
cluster.fork();
cluster.fork();
} else {
function doWork(duration) {
const start = Date.now();
while (Date.now() - start < duration) {}
}
app.get("/", (req, res) => {
doWork(5000);
res.send("Done");
});
app.get("/hello", (req, res) => {
res.send("Hello world");
});
app.listen(3000);
}
I expect it would handle 2 requests of the same route in parallel. Can anyone explain what is going on?
I expect it would handle 2 requests of the same route in parallel. Can
anyone explain what is going on?
This is not the case: you have created two server instances (two event loops, via cluster.fork()), so the two requests are executed in different event loops (server instances). The /hello request gets a prompt response, whereas the / request still waits 5 seconds before its response is sent.
If you had not created a cluster, the / request would have blocked the event loop, and /hello would not have executed until / finished (sent its response to the browser).
/ will always take 5 seconds to execute because you are blocking the event loop it runs in, so whether you create a single event loop or two (using fork()), it responds after 5 seconds.
I tried your scenario in two different browsers and both requests took about 5.05 seconds (both were executed by different workers at the same time).
const cluster = require("cluster");
const express = require("express");
const app = express();
if (cluster.isMaster) {
cluster.fork();
cluster.fork();
} else {
function doWork(duration) {
const start = Date.now();
while (Date.now() - start < duration) {}
}
app.get("/", (req, res) => {
console.log("Cluster ID",cluster.worker.id); // publish the workerid
doWork(5000);
res.send("Done");
});
app.listen(3000);
}
But with the same browser, the requests always went to one worker, which executed the second request only after it had executed the first. So I guess it all comes down to how the requests are distributed among the workers created by cluster.fork().
As quoted from the Node docs:
The cluster module supports two methods of distributing incoming
connections.
The first one (and the default one on all platforms except Windows),
is the round-robin approach, where the master process listens on a
port, accepts new connections and distributes them across the workers
in a round-robin fashion, with some built-in smarts to avoid
overloading a worker process.
The second approach is where the master process creates the listen
socket and sends it to interested workers. The workers then accept
incoming connections directly.
Node.js does not provide routing logic. It is, therefore important to
design an application such that it does not rely too heavily on
in-memory data objects for things like sessions and login.
I ran your code; the first response came after 5 seconds and the other after 8 seconds, so the clusters are working. Find out the number of cores on your machine using the code below. If it is one, then there is effectively only one thread of execution available.
const cpuCount = require('os').cpus().length;
This happens because of the cleverness of modern browsers. If you make the same request in two different tabs at the same time, the browser notices this, waits for the first request to finish and then uses its cached data to answer the second request, no matter whether you use clusters or how many times you fork().
To get rid of this, simply disable the cache in the Network tab, as shown below:
[screenshot: "Disable cache" option in the browser's Network tab]

NodeJS child_process or nextTick or setTimeout for long waiting task?

I have seen some questions about sending the response immediately and running CPU-intensive tasks afterwards.
My case is that my Node application depends on third-party service responses, so the process flow is:
Node receives a request and authenticates with the third-party service
Send the response to the user after authentication
Do some tasks that need responses from the third-party service
Save the results to the database
In my case there are no CPU-intensive tasks and no need to return the results of the additional tasks to the user, but Node needs to wait for the responses from the third-party service. I have to do multiple requests to and from the third-party service after the authentication to complete the task.
How can I achieve this?
I have seen some workarounds with child_process, nextTick and setTimeout.
Ultimately I want to send the response to the user immediately and then do the tasks related to that user.
Thanks in advance.
Elsewhere in your code:
function do_some_tasks() { /* ... */ }

// route handler
(req, res) => {
  // call the async task
  do_some_tasks();
  // if do_some_tasks() performs its work asynchronously, res.send() below is
  // reached immediately without waiting for it (the question is: is that the case?)
  res.send();
};

// if your do_some_tasks() is a synchronous function, you can do this instead:
// the call is queued and executed asynchronously
setImmediate(() => {
  do_some_tasks();
});
// this will be called in the current iteration
res.send(something);
Just writing a very general code block here:
var do_some_tasks = (req, tp_response) => {
  third_party_tasks(args, (err, result) => {
    // save to DB
  });
};

var your_request_handler = (req, res) => {
  third_party_auth(args, (tp_response) => {
    res.send();
    // just do your tasks here
    do_some_tasks(req, tp_response);
  });
};
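For completeness, here is a hedged sketch of the same flow with async/await; thirdPartyAuth, doSomeTasks and saveToDatabase are placeholder stubs, not a real API. The response is sent as soon as authentication succeeds, and the remaining work continues after it:
const express = require('express');
const app = express();
app.use(express.json());

// Placeholder stubs standing in for the real third-party calls and DB write.
const thirdPartyAuth = async (body) => ({ token: 'fake' });
const doSomeTasks = async (body, auth) => ({ saved: true });
const saveToDatabase = async (result) => result;

app.post('/job', async (req, res) => {
  try {
    const auth = await thirdPartyAuth(req.body); // 1. authenticate with the third-party service
    res.send({ ok: true });                      // 2. respond to the user immediately

    // 3-4. keep talking to the third-party service and save the results;
    // errors can no longer reach the client, so log or report them here.
    const result = await doSomeTasks(req.body, auth);
    await saveToDatabase(result);
  } catch (err) {
    if (!res.headersSent) res.status(500).send({ ok: false });
    console.error('background task failed', err);
  }
});

app.listen(3000);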
