Node.js long poll logic help! - node.js

I m trying to implement a long polling strategy with node.js
What i want is when a request is made to node.js it will wait maximum 30 seconds for some data to become available. If there is data, it will output it and exit and if there is no data, it will just wait out 30 seconds max, and then exit.
here is the basic code logic i came up with -
var http = require('http');
var poll_function = function(req,res,counter)
{
if(counter > 30)
{
res.writeHeader(200,{'Content-Type':'text/html;charset=utf8'});
res.end('Output after 5 seconds!');
}
else
{
var rand = Math.random();
if(rand > 0.85)
{
res.writeHeader(200,{'Content-Type':'text/html;charset=utf8'});
res.end('Output done because rand: ' + rand + '! in counter: ' + counter);
}
}
setTimeout
(
function()
{
poll_function.apply(this,[req,res,counter+1]);
},
1000
);
};
http.createServer
(
function(req,res)
{
poll_function(req,res,1);
}
).listen(8088);
What i figure is, When a request is made the poll_function is called which calls itself after 1 second, via a setTimeout within itself. So, it should remain asynchronous means, it will not block other requests and will provide its output when its done.
I have used a Math.random() logic here to simulate data availability scenario at various interval.
Now, what i concern is -
1) Will there be any problem with it? - I simply don't wish to deploy it, without being sure it will not strike back!
2) Is it efficient? if not, any suggestion how can i improve it?
Thanks,
Anjan

All nodejs code is nonblocking as long as you don't get hunk in a tight CPU loop (like while(true)) or use a library that has blocking I/O. Putting a setTimeout at the end of a function doesn't make it any more parallel, it just defers some cpu work till a later event.
Here is a simple demo chat server that randomly emits "Hello World" every 0 to 60 seconds to and and all connection clients.
// A simple chat server using long-poll and timeout
var Http = require('http');
// Array of open callbacks listening for a result
var listeners = [];
Http.createServer(function (req, res) {
function onData(data) {
res.end(data);
}
listeners.push(onData);
// Set a timeout of 30 seconds
var timeout = setTimeout(function () {
// Remove our callback from the listeners array
listeners.splice(listeners.indexOf(onData), 1);
res.end("Timeout!");
}, 30000);
}).listen(8080);
console.log("Server listening on 8080");
function emitEvent(data) {
for (var i = 0; l = listeners.length; i < l; i++) {
listeners[i](data);
}
listeners.length = 0;
}
// Simulate random events
function randomEvents() {
emitData("Hello World");
setTimeout(RandomEvents, Math.random() * 60000);
}
setTimeout(RandomEvents, Math.random() * 60000);
This will be quite fast. The only dangerous part is the splice. Splice can be slow if the array gets very large. This can be made possibly more efficient by instead of closing the connection 30 seconds from when it started to closing all the handlers at once every 30 seconds or 30 seconds after the last event. But again, this is unlikely to be the bottleneck since each of those array items is backed by a real client connection that probably more expensive.

Related

How to perform recurring long running background tasks in an node.js web server

I'm working on a node.js web server using express.js that should offer a dashboard to monitor database servers.
The architecture is quite simple:
a gatherer retrieves the information in a predefined interval and stores the data
express.js listens to user requests and shows a dashboard based on the stored data
I'm now wondering how to best implement the gatherer to make sure that it does not block the main loop and the simplest solution seems be to just use a setTimeout based approach but I was wondering what the "proper way" to architecture this would be?
Your concern is your information-gathering step. It probably is not as CPU-intensive as it seems. Because it's a monitoring app, it probably gathers information by contacting other machines, something like this.
async function gather () {
const results = []
let result
result = await getOracleMetrics ('server1')
results.push(result)
result = await getMySQLMetrics ('server2')
results.push(result)
result = await getMySQLMetrics ('server3')
results.push(result)
await storeMetrics(results)
}
This is not a cpu-intensive function. (If you were doing a fast Fourier transform on an image, that would be a cpu-intensive function.)
It spends most of its time awaiting results, and then a little time storing them. Using async / await gives you the illusion it runs synchronously. But, each await yields the main loop to other things.
You might invoke it every minute something like this. The .then().catch() stuff invokes it asynchronously.
setInterval (
function go () {
gather()
.then()
.catch(console.error)
}, 1000 * 60 * 60)
If you do actually have some cpu-intensive computation to do, you have a few choices.
offload it to a worker thread.
break it up into short chunks, with sleeps between them.
sleep = function sleep (howLong) {
return new Promise(function (resolve) {
setTimeout(() => {resolve()}, howLong)
})
}
async function gather () {
for (let chunkNo = 0; chunkNo < 100; chunkNo++) {
doComputationChunk(chunkNo)
await sleep(1)
}
}
That sleep() function yields to the main loop by waiting for a timeout to expire.
None of this is debugged, sorry to say.
For recurring tasks I prefer to use node-scheduler and shedule the jobs on app start-up.
In case you don't want to run CPU-expensive tasks in the main-thread, you can always run the code below in a worker-thread in parallel instead of the main thread - see info here
Here are two examples, one with a recurrence rule and one with interval in minutes using a cron expression:
app.js
let mySheduler = require('./mysheduler.js');
mySheduler.sheduleRecurrence();
// And/Or
mySheduler.sheduleInterval();
mysheduler.js
/* INFO: Require node-schedule for starting jobs of sheduled-tasks */
var schedule = require('node-schedule');
/* INFO: Helper for constructing a cron-expression */
function getCronExpression(minutes) {
if (minutes < 60) {
return `*/${minutes} * * * *`;
}
else {
let hours = (minutes - minutes % 60) / 60;
let minutesRemainder = minutes % 60;
return `*/${minutesRemainder} */${hours} * * *`;
}
}
module.exports = {
sheduleRecurrence: () => {
// Schedule a job # 01:00 AM every day (Mo-Su)
var rule = new schedule.RecurrenceRule();
rule.hour = 01;
rule.minute = 00;
rule.second = 00;
rule.dayOfWeek = new schedule.Range(0,6);
var dailyJob = schedule.scheduleJob(rule, function(){
/* INFO: Put your database-ops or other routines here */
// ...
// ..
// .
});
// INFO: Verbose output to check if job was scheduled:
console.log(`JOB:\n${dailyJob}\n HAS BEEN SCHEDULED..`);
},
sheduleInterval: () => {
let intervalInMinutes = 60;
let cronExpressions = getCronExpression(intervalInMinutes);
// INFO: Define unique job-name in case you want to cancel it
let uniqueJobName = "myIntervalJob"; // should be unique
// INFO: Schedule the job
var job = schedule.scheduleJob(uniqueJobName,cronExpressions, function() {
/* INFO: Put your database-ops or other routines here */
// ...
// ..
// .
})
// INFO: Verbose output to check if job was scheduled:
console.log(`JOB:\n${job}\n HAS BEEN SCHEDULED..`);
}
}
In case you want to cancel a job, you can use its unique job-name:
function cancelCronJob(uniqueJobName) {
/* INFO: Get job-instance for canceling scheduled task/job */
let current_job = schedule.scheduledJobs[uniqueJobName];
if (!current_job || current_job == 'undefinded') {
/* INFO: Cron-job not found (already cancelled or unknown) */
console.log(`CRON JOB WITH UNIQUE NAME: '${uniqueJobName}' UNDEFINED OR ALREADY CANCELLED..`);
}
else {
/* INFO: Cron-job found and cancelled */
console.log(`CANCELLING CRON JOB WITH UNIQUE NAME: '${uniqueJobName}`)
current_job.cancel();
}
};
In my example the recurrence and the interval are hardcoded, obviously you can also pass the recurrence-rules or the interval as argument to the respective function..
As per your comment:
'When looking at the implementation of node-schedule it feels like a this layer on top of setTimeout..'
Actually, node-schedule is using long-timeout -> https://www.npmjs.com/package/long-timeout so you are right, it's basically a convenient layer on top of timeOuts

How to write non-blocking async function in Express request handler [duplicate]

This question already has answers here:
How node.js server serve next request, if current request have huge computation?
(2 answers)
NodeJs how to create a non-blocking computation
(5 answers)
Closed 5 years ago.
All:
I am pretty new to Node async programming, I wonder how can I write some Express request handler which can handle time consuming heavy calculation task without block Express handling following request?
I thought setTimeout can do that to put the job in a event loop, but it still block other requests:
var express = require('express');
var router = express.Router();
function heavy(callback){
setTimeout(callback, 1);
}
router.get('/', function(req, res, next) {
var callback = function(req, res){
var loop = +req.query.loop;
for(var i=0; i<loop; i++){
for(var j=0; j<loop; j++){}
}
res.send("finished task: "+Date.now());
}.bind(null, req, res);
heavy(callback)
});
I guess I did not understand how setTimeout works(my understanding about setTimeout is after that 1ms delay it will fire up the callback in a seperated thread/process without blocking other call of heavy), could any one show me how to do this without blocking other request to heavy()?
Thanks
Instead of setTimeout it's better to use process.nextTick or setImmediate (depending od when you want your callback to be run). But it is not enough to put a long running code into a function because it will still block your thread, just a millisecond later.
You need to break your code and run setImmediate or process.nextTick multiple times - like in every iteration and then schedule a new iteration from that. Otherwise you will not gain anything.
Example
Instead of a code like this:
var a = 0, b = 10000000;
function numbers() {
while (a < b) {
console.log("Number " + a++);
}
}
numbers();
you can use code like this:
var a = 0, b = 10000000;
function numbers() {
var i = 0;
while (a < b && i++ < 100) {
console.log("Number " + a++);
}
if (a < b) setImmediate(numbers);
}
numbers();
The first one will block your thread (and likely overflow your call stack) and the second one will not block (or, more precisely, it will block your thread 10000000 times for a very brief moment, letting other stuff to run in between those moments).
You can also consider spawning an external process or writing a native add on in C/C++ where you can use threads.
For more info see:
How node.js server serve next request, if current request have huge computation?
Maximum call stack size exceeded in nodejs
Node; Q Promise delay
How to avoid jimp blocking the code node.js
NodeJS, Promises and performance

How to call a function each 2 min while webserver is running?

my Problem is to call a function every two minutes WHILE a Webserver is running. So my Server starts like this:
app.listen(1309, function(){
console.log("server is listening");
doSomething();
});
And this is my function doSomething()
var doSomething = function(){
while(true) {
sleep.sleep(10);
console.log("hello"); //the original function will be called here as soon as my code works :P
}
};
So yes the function prints every 10 seconds (10 sec
'cause testcase, don't want to wait 2 min atm) after starting my webserver but then it can't receive any get requests. (tested this with console.log)
I tried it without the function and it receives them. So I guess the while loop blocks the rest of the sever. How can I call this function every 2 minutes (or 10 sec) while the server is running and without missen any requests to it ?
You need to use setInteval function:
const 2mins = 2 * 60 * 1000;
var doSomething = function() {
setInterval(function() {
console.log("hello");
}, 2mins);
}

Node.js setTimeout() behaviour

I want a piece of code to repeat 100 times with 1 sec of delay in between. This is my code:
for(var i = 0; i < 100; i++){
setTimeout(function(){
//do stuff
},1000);
}
While this seems correct to me it is not. Instead of running "do stuff" 100 times and waiting 1 sec in between what it does is wait 1 sec and then run "do stuff" 100 times with no delay.
Anybody has any idea about this?
You can accomplish it by using setInterval().
It calls function of our choice as long as clearTimeout is called to a variable timer which stores it.
See example below with comments: (and remember to open your developer console (in chrome right click -> inspect element -> console) to view console.log).
// Total count we have called doStuff()
var count = 0;
/**
* Method for calling doStuff() 100 times
*
*/
var timer = setInterval(function() {
// If count increased by one is smaller than 100, keep running and return
if(count++ < 100) {
return doStuff();
}
// mission complete, clear timeout
clearTimeout(timer);
}, 1000); // One second in milliseconds
/**
* Method for doing stuff
*
*/
function doStuff() {
console.log("doing stuff");
}
Here is also: jsfiddle example
As a bonus: Your original method won't work because you are basically assigning 100 setTimeout calls as fast as possible. So instead of them running with one second gaps. They will run as fast as the for loop is placing them to queue, starting after 1000 milliseconds of current time.
For instance, following code shows timestamps when your approach is used:
for(var i = 0; i < 100; i++){
setTimeout(function(){
// Current time in milliseconds
console.log(new Date().getTime());
},1000);
}
It will output something like (milliseconds):
1404911593267 (14 times called with this timestamp...)
1404911593268 (10 times called with this timestamp...)
1404911593269 (12 times called with this timestamp...)
1404911593270 (15 times called with this timestamp...)
1404911593271 (12 times called with this timestamp...)
You can see the behaviour also in: js fiddle
You need to use callback, node.js is asynchronous:
function call_1000times(callback) {
var i = 0,
function do_stuff() {
//do stuff
if (i < 1000) {
i = i + 1;
do_stuff();
} else {
callback(list);
}
}
do_stuff();
}
Or, more cleaner:
setInterval(function () {
//do stuff
}, 1000);
Now that you appreciate that the for loop is iterating in a matter of milliseconds, another way to do it would be to simply adjust the setTimeout delay according to the count.
for(var i = 0; i < 100; i++){
setTimeout(function(){
//do stuff
}, i * 1000);
}
For many use-cases, this could be seen as bad. But in particular circumstances where you know that you definitely want to run code x number of times after y number of seconds, it could be useful.
It's also worth noting there are some that believe using setInterval is bad practise.
I prefer the recursive function. Call the function initially with the value of counter = 0, and then within the function check to see that counter is less than 100. If so, do your stuff, then call setTimeout with another call to doStuff but with a value of counter + 1. The function will run exactly 100 times, once per second, then quit :
const doStuff = counter => {
if (counter < 100) {
// do some stuff
setTimeout(()=>doStuff(counter + 1), 1000)
}
return;
}
doStuff(0)

Close listener after idle time

I've a simple nodejs server that is started automatically.
It uses express to host the endpoint, which is started with a simple app.listen(port); command.
Since I've an automatic startup, I'd like to shutdown the server after an idle period - say 3 mins.
I've coded it manually just using the function below, which is called on each app.post:
//Idle timer
var timer;
function resetIdleTimer() {
if (timer != null) clearTimeout(timer);
timer = setTimeout(function () {
logger.info('idle shutdown');
process.exit();
}, 3 * 60 * 1000);
}
This seems a little crude though, so I wondered if there is an neater way (some sort of timer within express maybe).
Looking in the express docs I didn't see an easy way to configure this.
Is there a neater way to have this idle shutdown implemented?
app.listen() returns a wrapped HTTP server (as can be seen here in the source), on which you can then the .close() method.
var app = express();
var server = app.listen(port);
setTimeout(function() {
server.close();
}, 3 * 60 * 1000);
This will prevent the server from accepting new connection. When it has stopped serving existing connections, it will gracefully stop. This will then stop Nodejs entirely.
Edit: You might also find this GitHub issue relevant.
Take a look at forever . You can require it as a module into your application and it provides you with some functions that can help you achieve what you are looking for (such as forever.stop(index) which terminates the node process running at that index. Before terminating the process, you could retrieve the list of processes and manipulate the strings in order to get the uptime. Then, I would monitor the time that passes between server calls. If there is a gap of 3 minutes between requests, I would call forever.stop() in order to terminate the process.
I dont think it's "crude" to use your timer solution; I would take a slightly different tack:
app.timeOutDate = new Date().valueOf() + 1000*60*3; // 3 minutes from now, in ms
function quitIfTimedout(req, res, next){
if(new Date().valueOf() > app.timeOutDate){
logger.info('idle shutdown');
process.exit();
} else {
app.timeOutDate = new Date().valueOf() + 1000*60*3; //reset
next();
}
};
app.all('*', quitIfTimedout);
however this wont actually quit after 3 minutes, it would instead quit on the next request after 3 minutes. so that might not solve your problem

Resources