How to increase event loop capacity in Node.js?

I know that Node.js uses a single thread and an event loop to process requests, handling one at a time (non-blocking). But I am unable to determine the event loop's capacity to run 100k requests per second.
I want to do capacity planning for a Node.js server that must handle 100k requests per second.
Please let me know how I can determine the capacity of the event loop and how to increase it.

A single instance of Node.js runs in a single thread. To take advantage of multi-core systems the user will sometimes want to launch a cluster of Node.js processes to handle the load.
More info in the Node.js cluster documentation: https://nodejs.org/api/cluster.html
For reference, the following code is a simple implementation of a cluster in Node.js:
var cluster = require('cluster');
var express = require('express');
var numCPUs = require('os').cpus().length;

if (cluster.isMaster) {
  // Create one worker per CPU core
  for (var i = 0; i < numCPUs; i++) {
    cluster.fork();
  }
} else {
  // Workers share the TCP connection in this server
  var app = express();
  app.get('/', function (req, res) {
    res.send('Hello World!');
  });
  // All workers use this port
  app.listen(8080);
}
Cluster is an extensible multi-core server manager for Node.js; see the cluster documentation linked above for more.
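As for actually measuring how much headroom the event loop has, one common approach is to track event-loop lag, i.e. how late a timer fires compared to its schedule. Here's a minimal, dependency-free sketch (the 500 ms sample interval is an arbitrary choice):

// Measure event-loop lag: how late each tick fires versus its schedule.
var INTERVAL_MS = 500; // arbitrary sample interval

var last = Date.now();
setInterval(function () {
  var now = Date.now();
  var lag = now - last - INTERVAL_MS; // > 0 means the loop was blocked
  console.log('event loop lag: ' + lag + ' ms');
  last = now;
}, INTERVAL_MS);

If the reported lag grows under load, the process is saturated and it's time to add workers (or instances) rather than expecting one event loop to absorb more.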

Related

Why Websocket/ws is connecting socket on single core? (Nodejs/Cluster)

I am running Node.js with the cluster module. In each child process I create a WebSocket server, and on every WebSocket connection I log console.log(process.pid). I even added a loop in the worker to slow it down, which apparently works, but the same process is still assigned to every WebSocket client. I have written a bash script that opens my HTML file with N concurrent connections to test whether the cluster module is working correctly. Is the issue with the cluster module or with the WebSocket server?
const os = require('os');
const cluster = require('cluster');

if (cluster.isMaster) {
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }
} else {
  const WebSocket = require('ws');
  const server = require('http').createServer();
  const wss = new WebSocket.Server({ server: server });

  wss.on('connection', function connection(ws) {
    // Busy-wait to slow this worker down
    let h = 0;
    for (let i = 0; i < 2e10; i++) {
      h++;
    }
    console.log('handled by:', process.pid);
  });

  server.listen(8080);
}
Your loop just blocks the main thread of the process that has already accepted the connection and is currently handling it. If I'm not mistaken, under the hood and by Node.js design, incoming connections are accepted by the master process and distributed to the workers (round-robin on most platforms), so a busy loop inside the handler does not influence which worker gets the next connection.
If you wish to check the distribution of load on your local machine, spawn 4 child processes as servers and 4 other child processes requesting connections. If you then check core utilization you will see all of your cores working, assuming you have an 8-logical-core machine (in Windows: Task Manager advanced view -> Performance -> CPU -> logical processors). You'll see a bunch of different process PIDs being logged.
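If you'd rather generate the concurrent connections from Node instead of a bash script, a minimal client sketch like this should work (assuming the clustered server above listens on port 8080; the client count of 20 is arbitrary):

// ws-clients.js -- open many concurrent WebSocket connections
const WebSocket = require('ws');

const N = 20; // arbitrary number of concurrent clients
for (let i = 0; i < N; i++) {
  const ws = new WebSocket('ws://localhost:8080');
  ws.on('open', function () {
    console.log('client', i, 'connected');
  });
  ws.on('error', function (err) {
    console.error('client', i, 'error:', err.message);
  });
}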

How to restart if NodeJS API service failed?

I have Node.js code similar to the following:
cluster.js
'use strict';
const cluster = require('cluster');
var express = require('express');
const metricsServer = express();
const AggregatorRegistry = require('prom-client').AggregatorRegistry;
const aggregatorRegistry = new AggregatorRegistry();
var os = require('os');

if (cluster.isMaster) {
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }

  metricsServer.get('/metrics', (req, res) => {
    aggregatorRegistry.clusterMetrics((err, metrics) => {
      if (err) console.log(err);
      res.set('Content-Type', aggregatorRegistry.contentType);
      res.send(metrics);
    });
  });

  metricsServer.listen(3013);
  console.log(
    'Cluster metrics server listening to 3013, metrics exposed on /metrics'
  );
} else {
  require('./app.js'); // Handles all of our API service; runs on port 3000
}
As you can see in the above code, I'm using the manual Node.js cluster method instead of the PM2 cluster mode, because I need to monitor my API via Prometheus. I usually start the whole thing with pm2 start cluster.js. Recently, due to a DB connection problem, the app.js service failed but cluster.js didn't; it looks like I haven't handled the DB connection error. I want to know:
1. How can I make sure my app.js and cluster.js always restart if they crash?
2. Is there a Linux crontab that could check that certain ports are always being served (i.e. 3000 and 3013)? (If this is a good idea, I'd appreciate it if you could provide the code; I'm not very familiar with Linux.)
3. Alternatively, I could deploy another Node.js API to check that these services are running, but since my API is real-time and carries a fair amount of load, I'm not happy to do that.
Any help would be appreciated. Thanks in advance.
You can use monit (https://www.digitalocean.com/community/tutorials/how-to-install-and-configure-monit) on your server to regularly monitor your process. If your project crashes, monit restarts it and can even notify you. You do have to do some configuration on the server: monit regularly probes a port, and if it doesn't get a reply from that port, it restarts the process.
Otherwise you can use the forever module. It's easy to install and easy to use: https://www.npmjs.com/package/forever
It monitors your application and restarts it within about a second.
I recently found out that we can listen for the worker exit event when a worker dies or is closed, and restart it accordingly.
Here is the code:
'use strict';
const cluster = require('cluster');
var express = require('express');
const metricsServer = express();
var os = require('os');

if (cluster.isMaster) {
  for (let i = 0; i < os.cpus().length; i++) {
    cluster.fork();
  }

  cluster.on(
    'exit',
    function handleExit(worker, code, signal) {
      console.log('Worker has died.', worker.process.pid);
      console.log('Death was suicide:', worker.exitedAfterDisconnect);

      // If a worker was terminated accidentally (such as by an uncaught
      // exception), then we can try to restart it.
      if (!worker.exitedAfterDisconnect) {
        cluster.fork();
        // CAUTION: If the worker dies immediately, perhaps due to a bug in
        // the code, you can run [from what I have read] into rapid CPU
        // consumption as the master continually tries to create new workers.
      }
    }
  );
} else {
  require('./app.js');
}
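To guard against the runaway-restart scenario that CAUTION comment mentions, the master can throttle respawns. Here's a minimal sketch that would replace the plain cluster.fork() retry above (the 1-second window and 5-death cap are arbitrary assumptions to tune for your app):

const cluster = require('cluster');

// Treat a worker that dies within 1s of forking as an "immediate" death.
const RESTART_WINDOW_MS = 1000; // assumption: tune for your app
const MAX_FAST_DEATHS = 5;      // assumption: give up after 5 fast deaths
let fastDeaths = 0;
const startTimes = new Map();

cluster.on('fork', (worker) => {
  startTimes.set(worker.id, Date.now());
});

cluster.on('exit', (worker) => {
  const lifetime = Date.now() - startTimes.get(worker.id);
  startTimes.delete(worker.id);
  if (worker.exitedAfterDisconnect) return; // deliberate shutdown
  if (lifetime < RESTART_WINDOW_MS && ++fastDeaths >= MAX_FAST_DEATHS) {
    console.error('Workers are crashing on startup; giving up on respawning.');
    return;
  }
  cluster.fork();
});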

Node cluster; only one process being used

I'm running a clustered node app, with 8 worker processes. I'm giving output when serving requests, and the output includes the ID of the process which handled the request:
app.get('/some-url', function (req, res) {
  console.log('Request being handled by process #' + process.pid);
  res.status(200).send('yayyy');
});
When I furiously refresh /some-url, I see in the output that the same process is handling the request every time.
I used node load-test to query my app. Again, even with 8 workers available, only one of them handles every single request. This is obviously undesirable as I wish to load-test the clustered app to see the overall performance of all processes working together.
Here's how I'm initializing the app:
var cluster = require('cluster');

if (cluster.isMaster) {
  for (var i = 0; i < 8; i++) cluster.fork();
} else {
  var app = require('express')();
  // ... do all setup on `app`...
  var server = require('http').createServer(app);
  server.listen(8000);
}
How do I get all my workers working?
Your request does not use any resources. I suspect that the same worker is always called because it finishes handling one request before the next one comes in.
What happens if you do some calculation inside the handler that takes longer than the time between requests? As it stands, the worker is never busy between accepting a request and answering it.
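For example, a quick way to keep each worker busy long enough for the distribution to become visible is to burn some CPU in the handler. A rough sketch (the iteration count is arbitrary, and app is the Express app from the question):

app.get('/some-url', function (req, res) {
  // Burn CPU so this worker stays busy and the master
  // hands the next connection to a different worker.
  var sum = 0;
  for (var i = 0; i < 1e8; i++) sum += i;
  console.log('Request handled by process #' + process.pid);
  res.status(200).send('done: ' + sum);
});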

Node.js Cluster: Managing Workers

We're diving deeper into Node.js architecture to fully understand how to scale our application.
The clear solution is to use cluster (https://nodejs.org/api/cluster.html). Everything seems fine, apart from the description of worker management:
Node.js does not automatically manage the number of workers for you, however. It is your responsibility to manage the worker pool for your application's needs.
I was searching for how to really manage the workers, but most solutions just say:
Start as many workers as you have cores.
But I would like to dynamically scale my worker count up or down depending on the current load on the server. So if there is load on the server and the queue is getting longer, I would like to start the next worker; conversely, when there isn't much load, I would like to shut workers down (leaving, e.g., a minimum of 2 running).
The ideal place, for me, would be the master process queue, plus an event fired when a new request reaches the master process. There we could decide whether the next worker is needed.
Do you have any solution or experience with managing workers from the master process in a cluster, starting and killing them dynamically?
Regards,
Radek
The following code will help you understand how to create workers on a per-request basis.
This program spawns a new worker for every 10 requests.
Note: you need to open http://localhost:8000/ and refresh the page to increase the request count.
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;
var numReqs = 0;
var initialRequest = 10;
var maxcluster = 10;
var totalcluster = 2;

if (cluster.isMaster) {
  // Fork a worker and count the requests it reports back.
  var forkWorker = function () {
    var worker = cluster.fork();
    worker.on('message', function (msg) {
      if (msg.cmd && msg.cmd == 'notifyRequest') {
        numReqs++;
      }
    });
  };

  // Fork the initial workers.
  for (var i = 0; i < 2; i++) {
    console.log('cluster master');
    forkWorker();
  }

  // Every second, check whether another worker is needed.
  setInterval(function () {
    console.log("numReqs =", numReqs);
    isNeedWorker(numReqs) && forkWorker();
  }, 1000);
} else {
  console.log('cluster worker initialized');
  // Worker processes have an http server.
  http.Server(function (req, res) {
    res.writeHead(200);
    res.end("hello world\n");
    // Send message to master process
    process.send({ cmd: 'notifyRequest' });
  }).listen(8000);
}

function isNeedWorker(numReqs) {
  if (numReqs >= initialRequest && totalcluster < numCPUs) {
    initialRequest = initialRequest + 10;
    totalcluster = totalcluster + 1;
    return true;
  }
  return false;
}
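The sketch above only scales up. To scale down, as the question asks, the master can gracefully retire a worker when load drops; a minimal sketch along these lines (which worker to retire, and the minimum of 2, are assumptions):

// Gracefully retire one worker when load drops (sketch).
function retireOneWorker() {
  var ids = Object.keys(cluster.workers);
  if (ids.length <= 2) return; // keep a minimum of 2 workers alive
  var worker = cluster.workers[ids[0]]; // arbitrary pick
  worker.disconnect(); // finish in-flight requests, then exit
  totalcluster = totalcluster - 1;
}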
To manually manage your workers, you need a messaging layer to facilitate inter-process communication. With IPC, master and workers can communicate effectively; by default and from an architecture standpoint, this behavior is already implemented in the native process module. However, I find the native implementation not flexible or robust enough to handle horizontal scaling of network requests.
One obvious solution is Redis as a message broker to facilitate this style of master and worker communication. However, this solution also has its faults, namely context latency directly linked to commands and replies.
Further research led me to RabbitMQ, a great fit for distributing time-consuming tasks among multiple workers. The main idea behind work queues (aka task queues) is to avoid doing a resource-intensive task immediately and having to wait for it to complete. Instead we schedule the task to be done later. We encapsulate a task as a message and send it to the queue. A worker process running in the background will pop the tasks and eventually execute the job. When you run many workers, the tasks will be shared between them.
To implement a robust server, read this link; it may give some insights.
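As a rough illustration of the work-queue idea with RabbitMQ, here is a sketch using the amqplib package (the queue name 'tasks' and the localhost broker URL are assumptions):

// worker.js -- consume tasks from a RabbitMQ work queue (sketch)
const amqp = require('amqplib');

async function startWorker() {
  const conn = await amqp.connect('amqp://localhost'); // assumed broker URL
  const ch = await conn.createChannel();
  await ch.assertQueue('tasks', { durable: true });    // assumed queue name
  ch.prefetch(1); // give each worker one task at a time
  ch.consume('tasks', (msg) => {
    // ... perform the resource-intensive job encoded in msg.content ...
    console.log('worker %d got task: %s', process.pid, msg.content.toString());
    ch.ack(msg); // acknowledge so the broker can dispatch the next task
  });
}

startWorker().catch(console.error);

Running several of these worker processes spreads the queued tasks across them, which is exactly the load distribution the question is after.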

Why is Node cluster.fork() forking the parent scope when implemented as a module

I'm attempting to implement a Node module which uses cluster. The problem is that the entire parent scope is forked alongside the intended cluster code. I discovered it while writing tests in Mocha for the module: the test suite will run many times, instead of once.
See below, myModule.js creates N workers, one for each CPU. These workers are http servers, or could be anything else.
Each time the test.js runs, the script runs N + 1 times. In the example below, console.log runs 5 times on my quad core.
Can someone explain whether this is an implementation issue or a cluster configuration issue? Is there any way to limit the scope of fork() without having to import a separate module (as in this solution: https://github.com/mochajs/mocha/issues/826)?
/// myModule.js ////////////////////////////////////
var cluster = require('cluster');
var http = require('http');
var numCPUs = require('os').cpus().length;

var startCluster = function () {
  if (cluster.isMaster) {
    // CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU
    // master does not listen to UDP messages.
    for (var i = 0; i < numCPUs; i++) {
      var worker = cluster.fork();
    }
  } else {
    // Worker processes have an http server.
    http.Server(function (req, res) {
      res.writeHead(200);
      res.end('hello world\n');
    }).listen(8000);
  }
  return;
};

module.exports = startCluster;
/////////////////////////////////////////////////

//// test.js ////////////////////////////////////
var startCluster = require('./myModule.js');
startCluster();
console.log('hello');
////////////////////////////////////////////////////////
So I'll venture an answer. Looking closer at the Node docs, there is cluster.setupMaster, which can override the defaults. The default for cluster.fork() is to execute the current script: the worker file path defaults to process.argv[1].
https://nodejs.org/docs/latest/api/cluster.html#cluster_cluster_settings
So if another module imports a script containing a cluster.fork() call, the fork will still use the path in process.argv[1], which may not be the path you expect, with unintended consequences.
So we shouldn't initialize the cluster master and worker in the same file, as the official docs suggest. It would be prudent to separate the worker into a new file and override the default settings. (For safety, you can also build the path with __dirname.)
cluster.setupMaster({ exec: __dirname + '/worker.js' });
So here would be the corrected implementation:
/// myModule.js ////////////////////////////////////
var cluster = require('cluster');
var numCPUs = require('os').cpus().length;

var startCluster = function () {
  cluster.setupMaster({
    exec: __dirname + '/worker.js'
  });
  if (cluster.isMaster) {
    // CREATE A CLUSTER OF FORKED WORKERS, ONE PER CPU
    for (var i = 0; i < numCPUs; i++) {
      var worker = cluster.fork();
    }
  }
  return;
};

module.exports = startCluster;
/////////////////////////////////////////////////

//// worker.js ////////////////////////////////////
var http = require('http');

// All worker processes have an http server.
http.Server(function (req, res) {
  res.writeHead(200);
  res.end('hello world\n');
}).listen(8000);
////////////////////////////////////////////////////////

//// test.js ////////////////////////////////////
var startCluster = require('./myModule.js');
startCluster();
console.log('hello');
////////////////////////////////////////////////////////
You should only see 'hello' once, instead of once per CPU.
You need to have the isMaster check at the top level of your code, not inside the function. The worker runs the module from the top (it's not like a C++ fork(), where the child starts executing at the fork() point).
I assume you expected startCluster = require('./cluster-bug.js') to be evaluated only once? Well, that's because your whole script runs clustered: what you specify inside startCluster only varies the behavior between master and workers. cluster spawns a fork of the file in which it was initialized.
