Why does response.write block the thread? (Node.js)

Here's the example:
const http = require('http')
const server = http.createServer()
const log = console.log.bind(console)

server.on('request', (req, res) => {
  log('visited!')
  res.writeHead(200, {
    'Content-Type': 'text/html; charset=utf-8',
  })
  for (let i = 0; i < 100000; i++) {
    res.write('<p>hello</p>')
  }
  res.end('ok')
})

server.listen(3000, '0.0.0.0')
When the server is handling the first request, the thread is blocked and can't handle a second request. I wonder why this happens, since Node.js uses an event-driven, non-blocking I/O model.

Great question.
NodeJS uses a non-blocking I/O model; that is, I/O operations will run on separate threads under the hood, but all JavaScript code will run on the same event-loop driven thread.
In your example, when you ask your HTTP server to listen for incoming requests, Node will manage the socket listening operations on a separate thread under the hood so that your code can continue to run after calling server.listen().
When a request comes in, your server.on('request') callback is executed back on the main event-loop thread. If another request arrives, its callback cannot run until the first callback (and any other code currently executing on the main thread) finishes.
Most of the time, callbacks are short-lived so you rarely need to worry about blocking the main thread. If the callbacks aren't short-lived, they are almost always calling an asynchronous I/O related function that actually does run in a different thread under the hood, therefore freeing up the main thread for other code to execute.

Related

Concurrency in node js express app for get request with setTimeout

const express = require('express');
const app = express();
const port = 4444;

app.get('/', async (req, res) => {
  console.log('got request');
  await new Promise(resolve => setTimeout(resolve, 10000));
  console.log('done');
  res.send('Hello World!');
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
If I hit http://localhost:4444 with a GET request three times concurrently, it returns the logs like this:
got request
done
got request
done
got request
done
Shouldn't it return the output in the way below, because of Node's event loop and callback queues, which are external to the process thread? (Maybe I am wrong, but I need some understanding of Node's internals and of the external APIs.)
(Attached image: JavaScript run-time environment diagram.)
got request
got request
got request
done
done
done
Thanks to https://stackoverflow.com/users/5330340/phani-kumar I found out why it was blocking. I was testing this in Chrome, making the GET requests from the browser; when I tried the same in Firefox it worked as expected.
The reason is this:
Chrome locks the cache and waits to see the result of one request before requesting the same resource again.
Chrome stalls when making multiple requests to same resource?
Node.js is event driven. To understand the concurrency here, you should look at how Node executes this code. Node runs your JavaScript on a single thread (though internally it uses multiple threads) and accepts requests as they come. In this case, Node accepts each request and registers a callback for the promise; in the meantime, while waiting for the event loop to execute that callback, it accepts as many further requests as it can handle (limited by memory, CPU, etc.). Because there is a timer queue in the event loop, all of these callbacks are registered there, and once each timer completes the event loop drains its queue.
Single-threaded event loop model processing steps:
Clients send requests to the Node.js server.
Node.js internally maintains a limited (configurable) thread pool to provide services to client requests.
Node.js receives those requests and places them into a queue known as the "Event Queue".
Node.js internally has a component known as the "Event Loop"; it got this name because it uses an indefinite loop to receive requests and process them.
The event loop uses a single thread only. It is the heart of the Node.js processing model.
The event loop checks whether any client request is placed in the event queue. If not, it waits for incoming requests indefinitely.
If yes, it picks up one client request from the event queue
and starts processing that client request.
If that client request does not require any blocking I/O operations, it processes everything, prepares the response and sends it back to the client.
If that client request requires blocking I/O operations (interacting with a database, the file system, external services), it follows a different approach:
it checks for thread availability in the internal thread pool,
picks up one thread and assigns the client request to that thread.
That thread is responsible for taking the request, processing it, performing the blocking I/O operations, preparing the response and sending it back to the event loop.
You can check here for more details (very well explained).
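The interleaving the asker expected can be reproduced without HTTP at all. In this sketch, three "requests" are simulated with setTimeout; because all three handlers register timers with the same delay, all the "got request" lines are recorded before any "done" line, exactly as in the expected output above:

```javascript
// simulate three concurrent "requests", each awaiting a 100 ms timer
const order = []

for (let i = 1; i <= 3; i++) {
  // handler starts immediately...
  order.push(`got request ${i}`)
  // ...and its continuation is queued in the timers phase
  setTimeout(() => order.push(`done ${i}`), 100)
}

// after all timers have fired, print the observed ordering
setTimeout(() => console.log(order.join('\n')), 200)
```

So the server in the question behaves correctly; it was the browser serializing identical requests, not Node.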

How do I avoid blocking an express rest service?

When making a REST service using express in Node, how do I prevent a blocking task from blocking the entire REST service? Take as an example the following express REST service:
const express = require('express');
const app = express();

app.get('/', (req, res) => res.send('Hello, World'));

const blockService = async function () {
  return new Promise((resolve, reject) => {
    const end = Date.now() + 20000;
    while (Date.now() < end) {
      const doSomethingHeavyInJavaScript = 1 + 2 + 3;
    }
    resolve('I am done');
  });
};

const blockController = function (req, res) {
  blockService().then((val) => {
    res.send(val);
  });
};

app.get('/block', blockController);
app.listen(3000, () => console.log('app listening on port 3000'));
In this case, a call to /block renders the entire service unreachable for 20 seconds. This is a big problem if many clients use the service, since no other client can reach it during that time. The cause is obviously that the while loop is blocking code and thus hangs the main thread. The code might be confusing, since, despite using a promise in blockService, the main thread still hangs. How do I ensure that blockService runs on a worker thread and not on the event loop?
By default node.js runs your Javascript code in a single thread. So, if you really have CPU intensive code in a request handler (like you show above), then that is indeed a problem. Your options are as follows:
Start up a Worker Thread and run the CPU-intensive code in a worker thread. Since version 10, node.js has had worker threads for this purpose. You then communicate back the result to the main thread with messaging.
Start up any other process that runs node.js code or any type of code and compute the result in that other process. You then communicate back the result to the main thread with messaging.
Use node clustering to start N processes so that if one process is stuck with a CPU-intensive operation, at least one of the others is hopefully free to run other requests.
Please note that a lot of things that servers do like read files, do networking, make requests to databases are all asynchronous and non-blocking so it's not incredibly common to actually have lots of CPU intensive code. So, if this is just a made up example for your own curiosity, you should make sure you actually have a CPU-intensive problem in your server before you go designing threads or clusters.
Node.js is an event-based model that uses a single runtime thread. For the reasons you've discovered, Node.js is not a good choice for CPU bound tasks (or synchronously blocking tasks). Node.js works best for coordinating I/O asynchronously.
Worker threads became stable in Node.js v12. They allow you to use another thread for blocking tasks; they are relatively simple to use and could work if you absolutely need to offload blocking tasks.

In Node js, what happens if a new request arrives and event loop is already busy processing a request?

I have this file named index.js:
const express = require('express')
const app = express()
const port = 3000
app.get('/home', (req, res) => {
res.send('Hello World!')
})
app.get('/route1', (req, res) => {
var num = 0;
for(var i=0; i<1000000; i++) {
num = num+1;
console.log(num);
}
res.send('This is Route1 '+ num)
})
app.listen(port, () => console.log(`Example app listening on port ${port}!`))
I first call the endpoint /route1 and then immediately the endpoint /home. /route1 has a for loop and takes some time to finish, and then /home runs and finishes. My question is: while the app was busy processing /route1, how was the request to /home handled, given that Node.js is single threaded?
The incoming request will be queued in the nodejs event queue until nodejs gets a chance to process the next event (when your long running event handler is done).
Since nodejs is an event-driven system, it gets an event from the event queue, runs that event's callback until completion, then gets the next event, runs it to completion and so on. The internals of nodejs add things that are waiting to be run to the event queue so they are queued up ready for the next cycle of the event loop.
Depending upon the internals of how nodejs does networking, the incoming request might be queued in the OS for a bit and then later moved to the event queue until nodejs gets a chance to serve that event.
My question is while app was busy processing /route1, how was the request to /home handled, given node js is single threaded?
Keep in mind that node.js runs your Javascript as single threaded (though we do now have Worker Threads if you want), but it does use threads internally to manage things like file I/O and some other types of asynchronous operations. It does not need threads for networking, though. That is managed with actual asynchronous interfaces from the OS.
Node.js has an event loop, which is what allows it to perform non-blocking I/O operations. Each event loop iteration is called a tick, and the loop has several phases.
First is the timers phase; since there are no timers in your script, the event loop moves on to check for I/O.
When you hit /route1, the request is placed in a FIFO queue and the event loop moves on to the poll phase.
The poll phase waits for pending I/O, which here is the request to /route1. The event loop checks whether any client request is placed in the event queue; if not, it waits for incoming requests indefinitely.
Meanwhile the next request, /home, arrives in the FIFO queue.
FIFO means first in, first out: /route1 is therefore executed to completion before /home.
A Node.js application runs on a single thread, and the event loop also runs on that same thread.
Node.js internally uses the libuv library, which is responsible for handling operating-system-related tasks such as asynchronous I/O, networking and concurrency.
More info
Node has an internal thread pool from which a thread is assigned when a blocking (I/O, memory or network) request is made. Otherwise, the request is processed and responded to directly. If the thread pool is full, the request waits in the queue. Refer to How, in general, does Node.js handle 10,000 concurrent requests? for clearer answers.

NodeJS server run individual process for each request against queue

See this example node.js code:
const http = require('http');
const server = http.createServer(function (req, res) {
if (req.url === '/loop') {
console.log('LOOP');
while (true) {}
}
res.write('Hello World');
res.end();
});
server.listen(3000);
In my real script each request takes 3 to 5 seconds to process; while (true) {} is just for the example.
But here Node.js does not process another request while one request is in progress.
I want to run multiple requests at the same time, but the server runs only one request at a time.
NOTE: I don't want to spawn a cluster worker or child_process for each request, because Node.js takes another ~65 ms to start one.
When you create a server (and listen), Node.js creates an event loop in which it processes requests. You cannot use an infinite loop in it, since it would block the very event loop your server runs on.
I hope you are not really dealing with an infinite loop, but with some process that takes time; for that you can use modules like async.
In your request/response handler, use the async module like this:
async.map(['param1', 'param2', 'param3'], task, function (err, results) {
  // results of the task function
});
What this does is make use of the already-running event loop to run the work.
Points to note:
Most JavaScript VMs are single threaded (including Node.js), so you can also use the setTimeout function instead of an infinite while loop.
You cannot create a thread in Node.js (single-threaded VM); instead, use a process-based solution like cluster or child_process.

How is the event-loop implemented in node.js?

If node simply has two threads, one to execute the main code and the other for all the callbacks, then blocking can still occur if the callbacks are resource/time intensive.
Say you have 100,000 concurrent users and each client request to the node app runs a complicated and time consuming database query, (assuming no caching is done) will the later users experience blocking when waiting for the query to return?
function onRequest(request, response) {
  // hypothetical database call
  database.query("SELECT * FROM hugetable", function (data) {
    response.writeHead(200, { "Content-Type": "text/plain" });
    response.write("database result: " + data);
    response.end();
  });
}

http.createServer(onRequest).listen(8888);
If each callback can run on its own thread, then this is a non-issue. But if all the callbacks run on a single separate dedicated thread then node doesn't really help us much in such a scenario.
Please read this good article: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
There are no multiple threads: only one executes your Node.js logic. (Behind-the-scenes I/O threads should not be considered part of your application logic.)
Requests to the database are async as well. All you do is put the query in a queue for transfer to the db socket; everything else happens behind the scenes, and the callback fires only when the response comes back from the database, so there is no blocking in your application logic.
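This can be simulated without a real database. Below, a hypothetical stand-in for an async driver resolves each "query" after one second via a timer, playing the role of the round trip that happens off the JS thread. One hundred concurrent requests all get their queries in flight immediately, so the total time is about one second, not one hundred:

```javascript
// hypothetical stand-in for an async database driver: the 1-second
// "query" runs outside the JS thread (here, just a timer)
function query (sql, callback) {
  setTimeout(() => callback(`result of ${sql}`), 1000)
}

const start = Date.now()
let pending = 100
let elapsed = 0

for (let i = 0; i < 100; i++) {
  query('SELECT * FROM hugetable', (data) => {
    if (--pending === 0) {
      elapsed = Date.now() - start
      // the queries were in flight concurrently; only their
      // callbacks ran one at a time on the main thread
      console.log(`all 100 queries finished in ${elapsed} ms`)
    }
  })
}
```

Blocking only becomes an issue if an individual callback itself does heavy CPU work, which is the scenario in the question's premise.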
