Does the node.js express framework create a new lightweight process per client connection? - node.js

Say this code is run inside a node.js express application, and two different clients, ClientA and ClientB, request the index resource, with ClientA's request arriving first. In this case the console will log the value 1 for ClientA and the value 2 for ClientB. My main question is: does each client request get its own lightweight process, with the router being the shared code portion between those processes, the variables visible to the router (but not part of it) being the shared heap, and each client getting its own stack? My sub-question is: if yes to my main question, then in this example each of these clients would have to queue waiting for the lock on global_counter before incrementing, correct?
var global_counter = 0;

router.get('/', function (req, res) {
    global_counter += 1;
    console.log(global_counter);
    res.render('index');
});

Nope. Node runs your JavaScript on a single thread in a single process. Concurrency is accomplished via a work queue (the event loop). Some ways to get stuff into the work queue include setTimeout() and process.nextTick(). Check out http://howtonode.org/understanding-process-next-tick
Only one thing is running at a time, so no need to do any locking.
It takes a while to get your brain to warm up to the idea.
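To make the "only one thing at a time" point concrete, here is a minimal sketch (plain Node, no Express) of two queued callbacks incrementing a shared counter. Because the event loop runs each callback to completion before starting the next, the read-modify-write can never interleave:

```javascript
// Sketch: two "client handlers" queued on the event loop increment a shared
// counter. Only one callback runs at a time, so no lock is needed.
let counter = 0;
const log = [];

function handler(name) {
    const before = counter;  // read
    counter = before + 1;    // write: no other callback can run in between
    log.push(`${name} saw ${before}, wrote ${counter}`);
}

// both "client requests" land on the work queue
setTimeout(() => handler('ClientA'), 0);
setTimeout(() => handler('ClientB'), 0);

setTimeout(() => console.log(log.join('\n')), 10);
```

Run it with node: the counter always ends at 2, with no lock anywhere.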

Related

Concurrency in node js express app for get request with setTimeout

const express = require('express');
const app = express();
const port = 4444;

app.get('/', async (req, res) => {
    console.log('got request');
    await new Promise(resolve => setTimeout(resolve, 10000));
    console.log('done');
    res.send('Hello World!');
});

app.listen(port, () => {
    console.log(`Example app listening at http://localhost:${port}`);
});
If I hit http://localhost:4444 with three GET requests concurrently, it returns the logs as below:
got request
done
got request
done
got request
done
Shouldn't it return the output in the way below, given Node's event loop and callback queues, which are external to the process's main thread? (Maybe I am wrong, but I need some understanding of Node's internals and external APIs; please find the attached image.)
[Image: JavaScript runtime environment]
got request
got request
got request
done
done
done
Thanks to https://stackoverflow.com/users/5330340/phani-kumar
I found the reason it was blocking. I was testing this in Chrome, making the GET requests from the browser; when I tried the same in Firefox it worked as expected.
The reason is this: Chrome locks the cache and waits to see the result of one request before requesting the same resource again.
Chrome stalls when making multiple requests to same resource?
Node.js is an event-driven platform. To understand the concurrency, you should look at how Node executes this code. Node runs your JavaScript on a single thread (though internally it uses multiple threads) and accepts requests as they come. In this case, Node accepts a request and registers a callback for the promise; in the meantime, while waiting for the event loop to run that callback, it accepts as many further requests as it can handle (limited by memory, CPU, etc.). Each setTimeout registers a timer with the event loop, and once a timer completes, the event loop runs the corresponding callback from its queue.
Single-threaded event loop model, processing steps:
Clients send requests to the Node.js server.
Node.js internally maintains a limited (configurable) thread pool to service client requests.
Node.js places incoming requests into a queue known as the “Event Queue”.
Node.js internally has a component known as the “Event Loop”, so named because it uses an indefinite loop to receive and process requests.
The event loop uses a single thread only; it is the heart of the Node.js processing model.
The event loop checks whether any client request is waiting in the event queue. If not, it waits indefinitely for incoming requests.
If yes, it picks up one client request from the event queue
and starts processing that request.
If the request does not require any blocking I/O operations, the loop processes everything, prepares the response, and sends it back to the client.
If the request requires blocking I/O operations (interacting with a database, file system, or external services), it follows a different approach:
it checks thread availability in the internal thread pool,
picks up one thread, and assigns the client request to that thread.
That thread takes the request, processes it, performs the blocking I/O operations, prepares the response, and sends it back to the event loop.
You can check here for more details (very well explained).

Why clusters don't work when requesting the same route at the same time in Express Node JS?

I wrote a simple express application example handling 2 GET routes. The first route contains a while loop which represents a blocking operation taking 5 seconds.
The second route simply returns a "Hello world" text.
I also set up a cluster following the simple guide in the Node.js documentation.
Result of what I've tried:
Make 2 requests to 2 different routes at the same time => they work independently, as expected: route / took 5 seconds and route /hello took several ms.
Make 2 requests to the same route / at the same time => they are handled sequentially: one responds after 5 seconds and the other after 10 seconds.
const cluster = require("cluster");
const express = require("express");
const app = express();

if (cluster.isMaster) {
    cluster.fork();
    cluster.fork();
} else {
    function doWork(duration) {
        const start = Date.now();
        while (Date.now() - start < duration) {}
    }
    app.get("/", (req, res) => {
        doWork(5000);
        res.send("Done");
    });
    app.get("/hello", (req, res) => {
        res.send("Hello world");
    });
    app.listen(3000);
}
I expect it would handle 2 requests of the same route in parallel. Can anyone explain what is going on?
I expect it would handle 2 requests of the same route in parallel. Can
anyone explain what is going on?
This is not the case. You have created two server instances (two event loops, via cluster.fork()), so each of these requests is executed in a different event loop (server instance): /hello responds promptly, while the / request still waits 5 seconds before sending its response.
If you hadn't created a cluster, the / request would have blocked the event loop, and /hello would not have executed until / finished (sent its response to the browser).
/ will always take 5 seconds, because you are blocking the event loop it executes in; whether you create one event loop or two (using fork()), it responds after 5 seconds.
I tried your scenario in two different browsers and both requests took about 5.05 seconds (each executed by a different worker at the same time).
const cluster = require("cluster");
const express = require("express");
const app = express();

if (cluster.isMaster) {
    cluster.fork();
    cluster.fork();
} else {
    function doWork(duration) {
        const start = Date.now();
        while (Date.now() - start < duration) {}
    }
    app.get("/", (req, res) => {
        console.log("Cluster ID", cluster.worker.id); // log the worker id
        doWork(5000);
        res.send("Done");
    });
    app.listen(3000);
}
But with the same browser, the requests always went to one worker, which executed the second request only after it had executed the first. So I guess it's all about how the requests are distributed among the workers created by cluster.fork().
As quoted from node docs
The cluster module supports two methods of distributing incoming
connections.
The first one (and the default one on all platforms except Windows),
is the round-robin approach, where the master process listens on a
port, accepts new connections and distributes them across the workers
in a round-robin fashion, with some built-in smarts to avoid
overloading a worker process.
The second approach is where the master process creates the listen
socket and sends it to interested workers. The workers then accept
incoming connections directly.
Node.js does not provide routing logic. It is, therefore important to
design an application such that it does not rely too heavily on
in-memory data objects for things like sessions and login.
I ran your code; the first response came after 5 seconds and the other after 8 seconds, so the clusters are working. Find the number of cores on your machine using the code below. If it is one, there is only one main thread.
const cpuCount = require('os').cpus().length;
It happens due to the cleverness of modern browsers. If you make the same request in two different tabs at the same time, the browser notices, waits for the first request to finish, and uses its cached data to answer the second, no matter whether you use clusters or how many times you call fork().
To get rid of this, simply disable the cache in the network tab, as shown below:
Disable Cache

Node JS Socket.IO Emitter (and redis)

I'll give a small premise of what I'm trying to do. I have a game concept in mind which requires multiple players sitting around a table somewhat like poker.
The normal interaction between different players is easy to handle via socket.io in conjunction with node js.
What I'm having a hard time figuring out is this: I have a cron job running in another process which fetches new information every minute, which then needs to be sent to each of those players. Since this is a different process, I'm not sure how I send certain clients this information.
socket.io does have information for this and I'm quoting it below:
In some cases, you might want to emit events to sockets in Socket.IO namespaces / rooms from outside the context of your Socket.IO processes.
There’s several ways to tackle this problem, like implementing your own channel to send messages into the process.
To facilitate this use case, we created two modules:
socket.io-redis
socket.io-emitter
From what I understand I need these two modules to do what I mentioned earlier. What I do not understand however is why is redis in the equation when I just need to send some messages.
Is it used to just store the messages temporarily?
Any help will be appreciated.
There are several ways to achieve this if you just need to emit after an external event. It depends on what you're using to get the new data you want to send:
/* if the other process reports in via an http post, you can for example use
   express and your io object in a custom middleware: */
// pass the io in the req object
app.use('/incoming', (req, res, next) => {
    req.io = io;
    next(); // without next() the POST handler below is never reached
});
// then you can do:
app.post('/incoming', (req, res, next) => {
    req.io.emit('incoming', req.body);
    res.send('data received from http post request, then sent on the socket');
});

// if you fetch data every minute, why don't you just emit after your job:
// (scheduleJob as in e.g. node-schedule; io is captured from the enclosing scope)
var job = scheduleJob('* */1 * * * *', () => {
    axios.get('/myApi/someRessource').then(data => io.emit('newData', data.data));
});
As for socket.io providing those two modules: my reading is that you do need both. However, this may not be what you want. But yes, redis is probably just used to store data temporarily, and it does a really good job there, being close to what a message queue does.
Your cron then wouldn't need a message queue or similar behaviour.
My suggestion, though, would be to run the cron via some node package from within your process as a child_process, hook onto its readable stream, and push directly to your sockets.
If the cron job process is also a Node.js process, you can exchange data through Redis's pub/sub client mechanism.
Let me know what your cron job process is written in, and whether you need further help with the pub/sub mechanism.
Redis is one of the memory stores socket.io can use (if you configure it that way).
You need Redis only if you have a multi-server configuration (cluster), to establish a connection and room/namespace sync between those node.js instances. It has nothing to do with storing data in this case; it works as a pub/sub machine.

node, is each request and response unique or cached irrespective of url

In an app that I was working on, I encountered a "headers already sent" error when I tested using concurrent and parallel request methods.
I ultimately resolved the problem using !response.headersSent, but my question is: why am I forced to use it? Is node caching similar requests and reusing them for the next repeated call?
if (request.headers.accept == "application/json") {
    if (!response.headersSent) {
        response.writeHead(200, {'Content-Type': 'application/json'});
    }
    response.end(JSON.stringify({result: {authToken: data.authToken}}));
}
Edit
var express = require('express');
var app = express();
var server = app.listen(process.env.PORT || 3000, function () {
console.log('Example app listening at http://%s:%s', server.address().address, server.address().port);
});
Edit 2:
Another problem: while testing with mocha and superagent, if I send another request through Postman on the side while the tests are in progress, one of the mocha tests ends with a timeout error. I'm taking these steps to ensure the code is production-ready for simultaneous, parallel requests; please advise on what measures I can take to ensure node/the code works under stress.
Edit 3:
app.use(function(request, response, next){
    request.id = Math.random();
    next();
});
OK, in an attempt to capture what solved this for you over the course of our conversation in the comments, I'll summarize here:
The message "headers already sent" is nearly always caused by improper async handling that calls methods on the response object in the wrong sequence. The most common case is non-async code that ends the response, followed by an async operation that completes some time later and then tries to use the response again (but there are other ways to misuse it too).
Each request and response object is uniquely created at the time each individual HTTP request arrives at the node/express server. They are not cached or reused.
Because of asynchronous operations in the processing of a request, more than one request/response pair may be in use at any given time. Code that processes these must not store the objects in any sort of single global variable, because multiple requests can be mid-processing at once. Because node is single threaded, code for only one request is actually running at any given moment, but as soon as that code hits an async operation (and thus has nothing to do until the async operation is done), another request can start running. So multiple requests can easily be "in flight" at the same time.
If you have a system where you need to keep track of multiple requests at once, you can coin a request id and attach it to each new request. One way to do that is with a few lines of express middleware that is early in the middleware stack that just adds a unique id property to each new request.
One simple way of coining a unique id is to just use a monotonically increasing counter.
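A minimal sketch of that counter-based middleware (Express-style signature; the demonstration below calls it directly rather than through a real server):

```javascript
// Sketch of the monotonic request-id middleware described above.
// Wiring it up would be: app.use(assignRequestId); early in the stack.
let nextRequestId = 1;

function assignRequestId(request, response, next) {
    request.id = nextRequestId++; // safe without a lock: one callback runs at a time
    next();
}

// demonstration without a server: call the middleware directly
const reqA = {};
const reqB = {};
assignRequestId(reqA, {}, () => {});
assignRequestId(reqB, {}, () => {});
console.log(reqA.id, reqB.id); // 1 2
```

Unlike the Math.random() version in the question's Edit 3, the counter can never collide and the ids also record arrival order.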

How to lock (Mutex) in NodeJS?

There are external resources (accessing available inventories through an API) that can only be accessed one thread at a time.
My problems are:
NodeJS server handles requests concurrently, we might have multiple requests at the same time trying to reserve inventories.
If I hit the inventory API concurrently, then it will return duplicate available inventories
Therefore, I need to make sure that I am hitting the inventory API one thread at a time
There is no way for me to change the inventory API (legacy), therefore I must find a way to synchronize my nodejs server.
Note:
There is only one nodejs server, running one process, so I only need to synchronize the requests within that server
Low traffic server running on express.js
I'd use something like the async module's queue and set its concurrency parameter to 1. That way, you can put as many tasks in the queue as you need to run, but they'll only run one at a time.
The queue would look something like:
var inventoryQueue = async.queue(function(task, callback) {
    // use the values in "task" to call your inventory API here
    // pass your results to "callback" when you're done
}, 1);
Then, to make an inventory API request, you'd do something like:
var inventoryRequestData = { /* data you need to make your request; product id, etc. */ };
inventoryQueue.push(inventoryRequestData, function(err, results) {
    // this will be called with your results
});
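If you'd rather not pull in the async library, the same one-at-a-time behaviour can be sketched with a promise chain. This is an alternative pattern, not what the answer above uses, and fakeInventoryCall is a made-up stand-in for the legacy API:

```javascript
// Alternative sketch: serialize calls with a promise chain instead of
// async.queue(…, 1). Each task starts only after the previous one settles.
function createMutex() {
    let last = Promise.resolve();
    return function runExclusive(task) {
        const result = last.then(() => task());
        last = result.catch(() => {}); // keep the chain alive after failures
        return result;
    };
}

const runExclusive = createMutex();
const order = [];
const fakeInventoryCall = id => new Promise(resolve =>
    setTimeout(() => { order.push(id); resolve(id); }, 10));

// three "concurrent" requests are executed strictly one after another
runExclusive(() => fakeInventoryCall('A'));
runExclusive(() => fakeInventoryCall('B'));
runExclusive(() => fakeInventoryCall('C'))
    .then(() => console.log(order)); // [ 'A', 'B', 'C' ]
```

Each Express handler would wrap its inventory call in runExclusive, so overlapping requests queue up instead of hitting the API concurrently.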

Resources