How is the event-loop implemented in node.js?

If node simply has two threads, one to execute the main code and the other for all the callbacks, then blocking can still occur if the callbacks are resource/time intensive.
Say you have 100,000 concurrent users and each client request to the node app runs a complicated, time-consuming database query (assuming no caching is done). Will the later users experience blocking while waiting for the query to return?
var http = require("http");

function onRequest(request, response) {
  // hypothetical database call
  database.query("SELECT * FROM hugetable", function(data) {
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.write("database result: " + data);
    response.end();
  });
}

http.createServer(onRequest).listen(8888);
If each callback can run on its own thread, then this is a non-issue. But if all the callbacks run on a single separate dedicated thread then node doesn't really help us much in such a scenario.

Please read this good article: http://blog.mixu.net/2011/02/01/understanding-the-node-js-event-loop/
There are not multiple threads; only one thread executes your Node.js logic. (The I/O happening behind the scenes should not be considered part of your application logic.)
Requests to the database are async as well: all you do is put the query in a queue for transfer to the database socket, then everything else happens behind the scenes, and the callback fires only once the response comes back from the database, so there is no blocking in your application logic.
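A rough sketch of what that means for the handler from the question (the database client is still hypothetical): the query is only dispatched here, the handler returns right away, and the callback runs later, back on the main thread, once the driver delivers the result.
// Sketch only: "database" is the same hypothetical async client used in the question above.
function onRequest(request, response) {
  console.log('1. query dispatched');           // runs immediately

  database.query("SELECT * FROM hugetable", function(data) {
    // 3. runs only when the database responds; until then the event loop
    //    is free to accept and dispatch other requests
    response.writeHead(200, {"Content-Type": "text/plain"});
    response.end("database result: " + data);
  });

  console.log('2. handler returned, event loop is free for other clients');
}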

Related

Concurrency in node js express app for get request with setTimeout

const express = require('express');
const app = express();
const port = 4444;

app.get('/', async (req, res) => {
  console.log('got request');
  await new Promise(resolve => setTimeout(resolve, 10000));
  console.log('done');
  res.send('Hello World!');
});

app.listen(port, () => {
  console.log(`Example app listening at http://localhost:${port}`);
});
If I hit http://localhost:4444 with three concurrent GET requests, it returns the logs below:
got request
done
got request
done
got request
done
Shouldn't it return the output in the way shown below, because of Node's event loop and callback queues, which are external to the process thread? (Maybe I am wrong, but I need some understanding of Node's internals and external APIs; see the attached image of the JavaScript runtime environment.)
got request
got request
got request
done
done
done
Thanks to https://stackoverflow.com/users/5330340/phani-kumar I found out why it was blocking. I was testing this in Chrome. I was making the GET requests from the Chrome browser, and when I tried the same in Firefox it worked as expected.
The reason is this:
Chrome locks the cache and waits to see the result of one request before requesting the same resource again.
Chrome stalls when making multiple requests to same resource?
Node.js is event-driven. To understand the concurrency, you should look at how Node executes this code. Node is single-threaded (though internally it uses multiple threads) and accepts requests as they come in. In this case, Node accepts a request and registers a callback for the promise; in the meantime, while it waits for the event loop to execute that callback, it will accept as many requests as it can handle (limited by memory, CPU, etc.). Because there is a timer queue in the event loop, all these callbacks are registered there, and once each timer completes the event loop drains its queue.
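A minimal way to observe the expected interleaving without the browser caching issue described above is to fire the three requests from a small Node script (the port matches the example server above):
const http = require('http');

// Fires three GET requests at once against the example server.
// All three "got request" logs should appear on the server immediately,
// followed roughly 10 seconds later by the three "done" logs.
for (let i = 0; i < 3; i++) {
  http.get('http://localhost:4444/', (res) => {
    res.resume(); // drain the body so the socket can close
    res.on('end', () => console.log(`response ${i + 1} received`));
  });
}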
Single Threaded Event Loop Model Processing Steps:
Clients send requests to the Node.js server.
Node.js internally maintains a limited (configurable) thread pool to provide services for the client requests.
Node.js receives those requests and places them into a queue known as the "Event Queue".
Node.js internally has a component known as the "Event Loop". It got this name because it uses an indefinite loop to receive requests and process them.
The Event Loop uses a single thread only. It is the heart of the Node.js processing model.
The Event Loop checks whether any client request is placed in the Event Queue. If not, it waits indefinitely for incoming requests.
If yes, it picks up one client request from the Event Queue
and starts processing that client request.
If that client request does not require any blocking IO operations, it processes everything, prepares the response and sends it back to the client.
If that client request requires some blocking IO operations, like interacting with a database, the file system or external services, it follows a different approach (see the sketch after this answer):
it checks thread availability in the internal thread pool,
picks up one thread and assigns the client request to that thread.
That thread is responsible for taking the request, processing it, performing the blocking IO operations, preparing the response and sending it back to the Event Loop.
You can check here for more details (very well explained).
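As a rough sketch of the blocking-IO path in the list above: a file read is handed to the internal thread pool while the event loop keeps serving other requests. The file path here is just an example; fs.readFile is a standard Node API whose work is performed by a libuv worker thread.
const http = require('http');
const fs = require('fs');

http.createServer((req, res) => {
  // The read is handed to a worker thread from the internal thread pool;
  // the event loop stays free to accept other connections while it runs.
  fs.readFile('./hugefile.txt', 'utf8', (err, data) => {
    // Queued back onto the event loop once the worker thread finishes.
    if (err) {
      res.writeHead(500);
      return res.end('read failed');
    }
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end(data);
  });
}).listen(8888);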

How do developers guarantee safe database operations in asynchronous environments?

I am in the process of building an API service using Node.js with ExpressJS. I understand Node is single threaded, but allows for asynchronous operations.
For example, this simple express route:
app.post('/a', (req, res) => {
  console.log('processing');
  setTimeout(() => {
    res.send('Hello World!');
    console.log('sent');
  }, 5000);
  console.log('waiting');
});
If I make 2 immediately consecutive requests to this route, the console logs run in parallel. What if these were writes to a database instead of setTimeouts? How do developers guarantee synchronous writes, so that if this endpoint were pinged twice in quick succession it would not result in duplicate inserts, for example? Is it controlled on the database side (with indexes, perhaps)? Maybe with transactions, but what if the database doesn't support transactions?
Any help or understanding would be more than appreciated.
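As a hedged sketch of the database-side approach the question mentions (indexes): let a unique index reject the second concurrent insert and handle the resulting error in the route. The collection, field name, and MongoDB-style error code below are illustrative assumptions, not part of the original question.
// Sketch only: assumes a MongoDB-style driver ("db") and a unique index on "email".
app.post('/a', async (req, res) => {
  try {
    await db.collection('users').insertOne({ email: req.body.email });
    res.send('created');
  } catch (err) {
    // 11000 is MongoDB's duplicate-key error code; the second concurrent
    // insert fails here instead of producing a duplicate record.
    if (err.code === 11000) {
      return res.status(409).send('already exists');
    }
    throw err;
  }
});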

why does response.write block the thread?

Here's the example below:
const http = require('http')
const server = http.createServer()
const log = console.log.bind(console)

server.on('request', (req, res) => {
  log('visited!')
  res.writeHead(200, {
    'Content-Type': 'text/html; charset=utf-8',
  })
  for (let i = 0; i < 100000; i++) {
    res.write('<p>hello</p>')
  }
  res.end('ok')
})

server.listen(3000, '0.0.0.0')
When the server is handling the first request, the thread is blocked and can't handle the second request. I wonder why this happens, since Node.js uses an event-driven, non-blocking I/O model.
Great question.
NodeJS uses a non-blocking I/O model; that is, I/O operations will run on separate threads under the hood, but all JavaScript code will run on the same event-loop driven thread.
In your example, when you ask your HTTP server to listen for incoming requests, Node will manage the socket listening operations on a separate thread under the hood so that your code can continue to run after calling server.listen().
When a request comes, your server.on('request') callback is executed back on the main event-loop thread. If another request comes in, its callback cannot run until the first callback and any other code that is currently executing on the main thread finishes.
Most of the time, callbacks are short-lived so you rarely need to worry about blocking the main thread. If the callbacks aren't short-lived, they are almost always calling an asynchronous I/O related function that actually does run in a different thread under the hood, therefore freeing up the main thread for other code to execute.
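One common pattern for the example in the question, not stated in the answer above but sketched here, is to break the long synchronous loop into batches and yield back to the event loop with setImmediate, so another request's callback can run between batches:
// Sketch only: reuses the "server" from the question's example.
server.on('request', (req, res) => {
  res.writeHead(200, { 'Content-Type': 'text/html; charset=utf-8' });
  let i = 0;
  function writeBatch() {
    const end = Math.min(i + 1000, 100000);
    for (; i < end; i++) {
      res.write('<p>hello</p>');
    }
    if (i < 100000) {
      setImmediate(writeBatch); // let queued callbacks (other requests) run first
    } else {
      res.end('ok');
    }
  }
  writeBatch();
});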

NodeJS/SailsJS app database block

I understand that NodeJS is a single-threaded process, but if I have to run a long database process, do I need to start a web worker to do that?
For example, in a Sails.js app I can call the database to create a record, but if the database call takes time to finish, it will block other users from accessing the database.
Below is a sample of the code I tried:
var test = function(cb) {
  for (i = 0; i < 10000; i++) {
    Company.create({companyName: 'Walter Jr' + i}).exec(cb);
  }
}

test(function(err, result) {
});

console.log("return to client");
return res.view('cargo/view', {
  model: result
});
On the first request, I see the return almost instantly. But if I request it again, I have to wait for all the records to be inserted before it returns the view again.
What is the common practice for this kind of blocking issue?
Node.js has non-blocking, asynchronous IO.
Read the article below; it will help you restructure your code:
http://hueniverse.com/2011/06/29/the-style-of-non-blocking/
Also start using Promises to help you avoid writing blocking IO.
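Following that suggestion, here is a rough sketch of how the Sails example above could be restructured so the response is sent only after the inserts finish. It assumes Company.create(...) returns a thenable, as it does in newer Waterline versions; the rendered model value is just a placeholder.
// Sketch only: creates the records sequentially and renders the view when done,
// instead of returning the view before the inserts have completed.
function createCompanies() {
  let chain = Promise.resolve();
  for (let i = 0; i < 10000; i++) {
    chain = chain.then(() => Company.create({ companyName: 'Walter Jr' + i }));
  }
  return chain;
}

createCompanies()
  .then(() => res.view('cargo/view', { model: 'created 10000 companies' }))
  .catch(err => res.serverError(err));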

node.js wait for response

I have very limited knowledge about Node and non-blocking IO, so forgive me if my question is too naive.
In order to return the needed information in the response body, I need to:
Make a call to 3rd party API
Wait for response
Add some modifications and return JSON response with the information I got from API.
My question is: how can I wait for the response? Or is it possible to send the information to the client only once I have received the response from the API? (As far as I know, the connection would have to be bidirectional in that case, which means I wouldn't be able to do so over HTTP.)
And yet another question: if one request is waiting for a response from the API, does this mean that other users will be forced to wait too (since Node is single-threaded) until I increase the number of threads/processes from 1 to N?
You pass a callback to the function which calls the service. If the service is a database, for example:
db.connect(host, callback);
And somewhere else in the code:
var callback = function(err, dbObject) {
  // The connection was made, it's safe to handle the code here
  console.log(dbObject.status);
  res.json(jsonObject, 200);
};
Or you can use an anonymous function:
db.connect(host, function(err, dbObject) {
  // The connection was made, it's safe to handle the code here
  console.log(dbObject.status);
  res.json(jsonObject, 200);
});
Between the call and the callback, node handles other clients / connections freely, "non-blocking".
This type of situation is exactly what node was designed to solve. Once you receive the request from your client, you can make a http request, which should take a callback parameter. This will call your callback function when the request is done, but node can do other work (including serving other clients) while you are waiting for the response. Once the request is done, you can have your code return the response to the client that is still waiting.
The amount of memory and CPU used by the node process will increase as additional clients connect to it, but only one process is needed to handle many simultaneous clients.
Node focuses on doing slow I/O asynchronously, so that the application code can start a task, and then have code start executing again after the I/O has completed.
A typical example might make it clear. We make a call to the FB API. When we get a response, we modify it and then send JSON to the user.
var express = require('express');
var fb = require('facebook-js');
var app = express();

app.get('/user', function(req, res) {
  fb.apiCall('GET', '/me/', {access_token: access_token}, function(error, response, body) { // access the FB API
    // when FB responds, this part of the code will execute
    if (error) {
      throw new Error('Error getting user information');
    }
    body.platform = 'Facebook'; // modify the Facebook response, available as JSON in body
    res.json(body); // send the response to the client
  });
});
