Concurrency in Node.js - node.js

I have a Node server like the one below, and I send it two requests almost simultaneously (both to the same URL, "localhost:8080/").
My question is: why does the server wait until the first request has been handled before it starts handling the second one?
Console output from my test:
Home ...
Home ...
(Note: the second line is displayed 12 seconds after the first.)
- server.js:
var express = require('express')
var app = express()
app.use(express.json())
app.get('/', async (request, response) => {
  try {
    console.log('Home ...')
    await sleep(12000)
    response.send('End')
  } catch (error) {
    response.end('Error')
  }
})
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
var server = app.listen(8080, '127.0.0.1', async () => {
  var host = server.address().address
  var port = server.address().port
  console.log('==================== START SERVER ===================')
  console.log('* Running on http://%s:%s (Press CTRL+C to quit)', host, port)
  console.log('* Create time of ' + new Date() + '\n')
})

Agreed with @unclexo: understand which calls are blocking and which are non-blocking, and optimize your code around that. If you really want to add capacity for parallel requests, you could consider using the cluster module.
https://nodejs.org/api/cluster.html
It starts child processes and distributes incoming HTTP connections among them. Each child can block for as long as it wants without affecting the other processes (unless there is some race condition between them).

"Why the server wait for 1st request handle done, then 2st request
will be handle"?
That is not how Node.js works. Your handler makes both a synchronous call (console.log) and an asynchronous one (sleep()). The asynchronous call to sleep() takes 12 seconds to complete.
That does not mean other work has to wait until it finishes; waiting like that would be synchronous behavior. Consider the following example:
asyncFunction1()
asyncFunction2()
asyncFunction3()
Here, as soon as the first function has started, Node.js releases the thread so the second function can start; Node releases the thread again, and the third function starts. Each function delivers its result whenever it finishes and does not wait for the others to return.
Node.js runs your JavaScript on a single thread and has a non-blocking I/O architecture. Its asynchronous nature is its beauty.
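A quick way to check this is to time two overlapping requests against a handler that awaits a timer. The sketch below uses Node's built-in http module instead of Express so that it stays self-contained; the 500 ms delay, the OS-assigned port, and the helper names (sleep, get, measure) are illustrative assumptions, not part of the original code.

```javascript
const http = require('http');

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Handler that awaits a timer, like the handler in the question (shorter delay).
const server = http.createServer(async (request, response) => {
  await sleep(500);
  response.end('End');
});

// Issue one GET request and resolve with the response body.
function get(port) {
  return new Promise((resolve, reject) => {
    http.get({ host: '127.0.0.1', port, agent: false }, res => {
      let body = '';
      res.on('data', chunk => (body += chunk));
      res.on('end', () => resolve(body));
    }).on('error', reject);
  });
}

// Start the server, send two requests at once, and report the total time.
async function measure() {
  await new Promise(resolve => server.listen(0, '127.0.0.1', resolve));
  const { port } = server.address();
  const started = Date.now();
  const bodies = await Promise.all([get(port), get(port)]);
  const elapsed = Date.now() - started;
  await new Promise(resolve => server.close(resolve));
  return { bodies, elapsed };
}

const result = measure();
result.then(({ elapsed }) => console.log(`both responses after ${elapsed} ms`));
```

If the two requests were handled one after the other, the total would be roughly twice the delay; with overlapping handlers it stays close to a single delay. When testing from a browser, keep in mind that some browsers may hold back a second identical request to the same URL until the first one completes, which can make a server look serial when it is not.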

The await in your app.get('/', ...) handler holds back that request's response until sleep() resolves.
If you want the sleep to run in the background instead of delaying the response, wrap it in a separate async function and call it without await. The code will look like this:
app.get('/', (request, response) => {
  try {
    print()
    response.send('End')
  } catch (error) {
    response.end('Error')
  }
})
async function print() {
  await sleep(12000)
  console.log('Home ...')
}
function sleep(ms) {
  return new Promise(resolve => setTimeout(resolve, ms));
}
This way, if you send one request and then another, the second request's print() call starts right away instead of waiting 12 seconds for the first request. Each Home ... line is therefore printed about 12 seconds after its request was sent. To understand this better, read about the Node.js event loop.

Related

Abandoned http requests after server.close()?

I have a vanilla nodejs server like this:
let someVar // to be set to a Promise
const getData = url => {
  return new Promise((resolve, reject) => {
    https.get(
      url,
      { headers: { ...COMMON_REQUEST_HEADERS, 'X-Request-Time': '' + Date.now() } },
      res => {
        if (res.statusCode === 401) return reject(new RequestError(INVALID_KEY, res))
        if (res.statusCode !== 200) return reject(new RequestError(BAD_REQUEST, res))
        let json = ''
        res.on('data', chunk => json += chunk)
        res.on('end', () => {
          try {
            resolve(JSON.parse(json).data)
          } catch (error) {
            return reject(new RequestError(INVALID_RESPONSE, res, json))
          }
        })
      }
    ).on('error', error => reject(new RequestError(FAILED, error)))
  })
}
const aCallback = () => {
  console.log('making api call')
  someVar = getData('someApiEndpoint')
    .then(data => { ... })
}
const main = () => {
  const server = http.createServer(handleRequest)
  anInterval = setInterval(aCallback, SOME_LENGTH_OF_TIME)
  const exit = () => {
    server.close(() => process.exit())
    log('Server is closed')
  }
  process.on('SIGINT', exit)
  process.on('SIGTERM', exit)
  process.on('uncaughtException', (err, origin) => {
    log(`Process caught unhandled exception ${err} ${origin}`, 'ERROR')
  })
}
main()
I was running into a situation where I would press Ctrl+C and see the Server is closed log, followed by my command prompt, but then I would see more logs printed, indicating that more API calls were being made.
Calling clearInterval(anInterval) inside exit() (before server.close()) seems to have solved the issue of the interval continuing even when the server is closed, so that's good. BUT:
From these node docs:
Closes all connections connected to this server which are not sending a request or waiting for a response.
I.e., I assume server.close() will not automatically kill the http request.
What happens to the http response information when my computer / node are no longer keeping track of the variable someVar?
What are the consequences of not specifically killing the thread that made the http request (and is waiting for the response)?
Is there a best practice for cancelling the request?
What does that consist of (i.e. would I ultimately tell the API's servers 'never mind please don't send anything', or would I just instruct node to not receive any new information)?
There are a couple things you should be aware of. First off, handling SIGINT is a complicated thing in software. Next, you should never need to call process.exit() as node will always exit when it's ready. If your process doesn't exit correctly, that means there is "work being done" that you need to stop. As soon as there is no more work to be done, node will safely exit on its own. This is best explained by example. Let's start with this simple program:
const interval = setInterval(() => console.log('Hello'), 5000);
If you run this program and then press Ctrl + C (which sends the SIGINT signal), node will automatically clear the interval for you and exit (well... it's more of a "fatal" exit, but that's beyond the scope of this answer). This auto-exit behavior changes as soon as you listen for the SIGINT event:
const interval = setInterval(() => console.log('Hello'), 5000);
process.on('SIGINT', () => {
  console.log('SIGINT received');
});
Now if you run this program and press Ctrl + C, you will see the "SIGINT received" message, but the process will never exit. When you listen for SIGINT, you are telling node "hey, I have some things I need to clean up before you exit". Node will then wait for any "ongoing work" to finish before it exits. If node doesn't eventually exit on its own, it's telling you "hey, I can see that there are some things still running - you need to stop them before I'll exit".
Let's see what happens if we clear the interval:
const interval = setInterval(() => console.log('Hello'), 5000);
process.on('SIGINT', () => {
  console.log('SIGINT received');
  clearInterval(interval);
});
Now if you run this program and press Ctrl + C, you will see the "SIGINT received" message and the process will exit nicely. As soon as we clear the interval, node is smart enough to see that nothing is happening, and it exits. The important lesson here is that if you listen for SIGINT, it's on you to wait for any tasks to finish, and you should never need to call process.exit().
As far as how this relates to your code, you have 3 things going on:
http server listening for requests
an interval
outgoing https.get request
When your program exits, it's on you to clean up the above items. In the most simple of circumstances, you should do the following:
close the server: server.close();
clear the interval: clearInterval(anInterval);
destroy any outgoing request: request.destroy()
You may decide to wait for any incoming requests to finish before closing your server, or you may want to listen for the 'close' event on your outgoing request in order to detect any lost connection. That's on you. You should read about the methods and events which are available in the node http docs. Hopefully by now you are starting to see how SIGINT is a complicated matter in software. Good luck.
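The three cleanup steps above can be gathered into one helper. This is only a minimal sketch: the function name shutdown and the shape of its argument are hypothetical, and it assumes you have kept references to the server, the interval, and the object returned by https.get (or http.get).

```javascript
const http = require('http');

// Stop everything that keeps the event loop busy; node can then exit on its own.
// The argument is a hypothetical bag holding the three resources listed above.
function shutdown({ server, interval, request }) {
  clearInterval(interval);          // no more scheduled API calls
  if (request) request.destroy();   // abort the outgoing request, if one is in flight
  // Stop accepting new connections; resolves once existing ones have ended.
  return new Promise(resolve => server.close(resolve));
}
```

It could be wired up with something like process.once('SIGINT', () => shutdown({ server, interval: anInterval, request })). A request destroyed before its response arrives emits an 'error' event, so the request needs an 'error' listener (the question's getData already attaches one).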

Node Express. Continue function execution after res.render

I need to render the page and then continue executing a function that takes a long time. But res.render waits for that function to finish, no matter where I call it.
I want the page to start rendering without waiting for the data to be processed.
Here is my code:
let promise = new Promise((resolve, reject) => {
  let wb = new Workbook();
  let ws = sheet_from_array_of_arrays(public_data); //May take up to 5 seconds
  wb.SheetNames.push(ws_name);
  wb.Sheets[ws_name] = ws;
  XLSX.writeFile(wb, '/tmp/' + name); //May take up to 10 seconds
  resolve('End of promise!!!');
})
res.render('ihelp/lists/person_selection_view_data', {
  data: public_data,
  name
});
promise.then(answer => { console.log(answer) });
How can I do this?
The code that you want to run looks to be synchronous (I don't see any promises or callbacks related to it). If it takes multiple seconds to run, that will mean that it will block your entire app for that amount of time.
This means that asynchronous functions, like res.render(), will not complete until the processing is done. Even if you change the order:
res.render(...);
long_running_code();
Not only will this not make res.render() send back a response before long_running_code is started, it will also stop your app from responding to new incoming requests (and/or block any current requests) until it's done.
If you have CPU-intensive code that will block the event loop, take a look at worker_threads, which can be used to offload CPU-intensive code to separate threads and therefore keep your main JS thread free to handle the HTTP part of your app.
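A minimal sketch of that idea follows. The summing loop is only a stand-in for the real CPU-heavy work (such as building the spreadsheet), and runInWorker and workerSource are hypothetical names. The worker source is passed as a string with eval: true purely to keep the sketch in one file; a real app would more likely point the Worker at a separate script.

```javascript
const { Worker } = require('worker_threads');

// Source of the worker, passed as a string (eval: true) to keep the sketch in one file.
// The loop is a stand-in for CPU-heavy work.
const workerSource = `
  const { parentPort, workerData } = require('worker_threads');
  let sum = 0;
  for (let i = 1; i <= workerData.upTo; i++) sum += i;
  parentPort.postMessage(sum);
`;

// Run the heavy work on another thread and resolve with its result.
function runInWorker(upTo) {
  return new Promise((resolve, reject) => {
    const worker = new Worker(workerSource, { eval: true, workerData: { upTo } });
    worker.once('message', resolve);
    worker.once('error', reject);
  });
}

runInWorker(1e6).then(sum => console.log('worker finished:', sum));
```

In a route handler you would call res.render(...) and then start runInWorker(...) without awaiting it, so the main thread stays free to send the response while the worker computes.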
res.render takes an optional third argument, a callback. If it is supplied, Express does not send the rendered HTML string automatically, so you can send it yourself and then run any lengthy processing afterwards.
res.render('ihelp/lists/person_selection_view_data', {data: public_data, name}, function (err, html) {
  res.send(html);
  let wb = new Workbook();
  let ws = sheet_from_array_of_arrays(public_data); //May take up to 5 seconds
  wb.SheetNames.push(ws_name);
  wb.Sheets[ws_name] = ws;
  XLSX.writeFile(wb, '/tmp/' + name); //May take up to 10 seconds
})
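The executor function (the function passed to the Promise constructor) is executed synchronously. If it takes too long, you can defer it by starting it from another promise callback, as shown here. Note that this only postpones the work: it still runs on the main thread and blocks the event loop while it executes.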
res.render('ihelp/lists/person_selection_view_data', {
  data: public_data,
  name
});
Promise.resolve().then(function() {
  new Promise((resolve, reject) => {
    // your executor code
  }).then(answer => { console.log(answer) });
});

Socket.io async/await for .on()

I'm building a socket.io Node.js application. My socket.io server will be listening for data from many socket.io clients, and I need to save that data to an API via the server as quickly as possible. I figure that async/await is the best way forward.
Right now, I've got a function inside my .on('connection'), but is there a way I can make this an async function rather than have a nested function inside?
io.use((socket, next) => {
  if (!socket.handshake.query || !socket.handshake.query.token) {
    console.log('Authentication Error 1')
    return
  }
  jwt.verify(socket.handshake.query.token, process.env.AGENT_SIGNING_SECRET, (err, decoded) => {
    if (err) {
      console.log('Authentication Error 2')
      return
    }
    socket.decoded = decoded
    next()
  })
}).on('connection', socket => {
  socket.on('agent-performance', data => {
    async function savePerformance () {
      const saved = await db.saveToDb('http://127.0.0.1:8000/api/profiler/save', data)
      console.log(saved)
    }
    savePerformance()
  })
})
Sort of, but you'll probably want to keep your current code if there can be multiple agent-performance events. You can modify the following, but it'd be messy and less readable. Event emitters still exist for a reason, they're not made obsolete by the introduction of promises. If it's performance you're after, your current code is probably faster and more resistant to backpressure and easier to error-handle.
events.on is a utility function that takes an event emitter (like socket) and returns an async iterator that yields the arguments of each emitted event as an array. You can consume it with for await...of.
events.once is a utility function that takes an event emitter (like socket) and returns a promise that resolves, with an array of the event's arguments, when the specified event is emitted.
const { on, once } = require('events');
(async function() {
  // This is an async iterator that can yield an unlimited number of times.
  const iterator = on(io, 'connection');
  // Wait for the next event, run what is between `{ }`, and repeat.
  // Both `on` and `once` deliver the event arguments as an array, hence the destructuring.
  for await (const [socket] of iterator) {
    const [data] = await once(socket, 'agent-performance');
    const saved = await db.saveToDb(/* etc */);
  }
})();
As the names imply, on is similar to socket.on and once is similar to socket.once. In the above example:
connected user 1, first agent-performance event: OK
connected user 1, second agent-performance event: not handled, there's no more event handler, since once is "used up".
connected user 2, first agent-performance event: OK
The documentation for on has a note about concurrency when using for await (x of on(...)), but I don't know if that would be a problem in your use case.
// The execution of this inner block is synchronous and it
// processes one event at a time (even with await). Do not use
// if concurrent execution is required.
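For the original question, it may also help to note that a listener passed to .on() can itself be an async function, which removes the nested function without changing the event-emitter style. The sketch below uses a plain EventEmitter as a stand-in for a connected socket and a hypothetical saveToDb stub in place of db.saveToDb; both are assumptions made so the example runs on its own.

```javascript
const { EventEmitter } = require('events');

// Hypothetical stand-in for db.saveToDb: pretends to save and echoes the data.
const saved = [];
async function saveToDb(url, data) {
  await new Promise(resolve => setTimeout(resolve, 10));
  saved.push(data);
  return { ok: true, data };
}

// Stand-in for a connected socket; socket.io sockets expose the same .on() method.
const socket = new EventEmitter();

// The listener itself is async, so no nested function is needed.
// The emitter ignores the returned promise, so errors must be caught here.
socket.on('agent-performance', async data => {
  try {
    const result = await saveToDb('http://127.0.0.1:8000/api/profiler/save', data);
    console.log(result);
  } catch (error) {
    console.error('save failed', error);
  }
});

socket.emit('agent-performance', { cpu: 0.5 });
socket.emit('agent-performance', { cpu: 0.7 });
```

Each event starts its own save without waiting for the previous one, which matches the behavior of the code in the question.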

How to deal with acks for asynchronous types of tasks

Right now I have a RabbitMQ queue setup, and I have several workers that are listening for events pushed onto this queue.
The events are essentially string urls (e.g., "https://www.youtube.com"), and it'll be processed through puppeteer.
What I'm wondering is: given that puppeteer is asynchronous, is there a way for me to return an ack only once all the asynchronous work has finished?
Right now, I think my workers listening to the queue are hanging because the ack isn't being fired.
Edit: the code below is pretty much what I call within the consume part of RabbitMQ. Because it is async, execution just goes past this operation and immediately acks.
(async () => {
  const args = {
    evaluatePage: (() => ({
      title: $('title').text(),
    })),
    persistCache: true,
    cache,
    onSuccess: (result => {
      console.log('value for result -- ', result.result.title);
    }),
  };
  // we need to first setup the crawler
  // then we can start sending information to it
  const crawler = await HCCrawler.launch(args);
  crawler.queue(urls);
  await crawler.onIdle();
  await crawler.close();
})();
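One possible way to ack only after the asynchronous work is done is to make the consume callback async and await the work before acknowledging. The sketch below is an assumption-laden illustration: it assumes the channel behaves like an amqplib channel (consume, ack, nack), which the question does not confirm, and consumeWithAck and processUrl are hypothetical names. processUrl stands for an async function that wraps the crawler code above for a single message.

```javascript
// Wrap a consumer so the message is acked only after the async work has finished.
// `channel` is assumed to look like an amqplib channel (consume/ack/nack);
// `processUrl` is a hypothetical async function that runs the crawler for one URL.
function consumeWithAck(channel, queue, processUrl) {
  return channel.consume(queue, async message => {
    if (message === null) return; // consumer was cancelled by the broker
    try {
      await processUrl(message.content.toString());
      channel.ack(message);                // success: remove from the queue
    } catch (error) {
      channel.nack(message, false, false); // failure: reject without requeueing
    }
  });
}
```

Whether a failed message should be requeued or dropped depends on the application; the nack arguments here are just one choice.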

Why I cannot get value from async await by waiting synchronously?

I am trying to get a MongoDB instance synchronously. I know this is not recommended, but I am just experimenting and wonder why it doesn't work. this.db is still undefined after 10 seconds of waiting, whereas normal asynchronous code gets it in less than 500 milliseconds.
Repository.js:
var mongodb = require('mongodb');
var config = require('../config/config');
var mongoConfig = config.mongodb;
var mongoClient = mongodb.MongoClient;
class Repository {
  constructor() {
    (async () => {
      this.db = await mongoClient.connect(mongoConfig.host);
    })();
  }
  _getDb(t) {
    t = t || 500;
    if(!this.db && t < 10000) {
      sleep(t);
      t += 500;
      this._getDb(t);
    } else {
      return this.db;
    }
  }
  collection(collectionName) {
    return this._getDb().collection(collectionName);
  }
}
function sleep(ms) {
  console.log('sleeping for ' + ms + ' ms');
  var t = new Date().getTime();
  while (t + ms >= new Date().getTime()) {}
}
module.exports = Repository;
app.js:
require('../babelize');
var Repository = require('../lib/Repository');
var collection = new Repository().collection('products');
JavaScript uses an event-driven architecture. All code is initiated by an event from the event queue, and the next event is pulled from the queue ONLY when the code for the previous event has finished executing. This means that your main JavaScript code is single-threaded.
As such, when you fire an async operation, it starts up an operation and when that operation finishes, it puts an event in the event queue. That event (which will trigger the callback for the async operation) will not run until the code from the previous event finishes running and returns back to the system.
So, now to your code. You start running some code that launches an async operation. Then you spin in a busy loop and do not return to the system. Because of that, the next event in the event queue, the completion of your async operation, can NEVER run while you are waiting.
So, in a nutshell, you cannot spin in a loop waiting for an async operation to complete. That is incompatible with the event-driven scheme that JavaScript uses. You never return to the event system to let the async completion callback run, so you just have a deadlock or an infinite loop.
Instead, you need to code for an async response by returning a promise or by passing in a callback that is called sometime later. And your code needs to finish executing and then let the callback get called sometime in the future. No spinning in loops in Javascript waiting for something else to run.
You can see the async coding options here: How do I return the response from an asynchronous call?
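As a rough illustration of the promise-based approach, the repository can keep the connection promise and let callers await it. This is only a sketch: the connect parameter is a hypothetical async function that resolves with the db object (for example, a thin wrapper around the driver's connect call), introduced here so the example does not depend on a running database.

```javascript
// Keep the connection promise instead of busy-waiting for the connection.
// `connect` is a hypothetical async function that resolves with the db object.
class Repository {
  constructor(connect) {
    this.ready = connect(); // starts connecting; nothing blocks here
  }

  // Resolves with the collection once the connection is available.
  async collection(collectionName) {
    const db = await this.ready;
    return db.collection(collectionName);
  }
}
```

The caller then awaits the result inside an async function, for example const products = await new Repository(connect).collection('products').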

Resources