How to detect and measure event loop blocking in node.js? - node.js

I'd like to monitor how long each run of the event loop in node.js takes. However I'm uncertain about the best way to measure this. The best way I could come up with looks like this:
var interval = 500;   // sampling period in ms
var blockDelta = 10;  // example threshold in ms (not defined in the original snippet)
var timer = setInterval(function() {
  var last = Date.now();
  setImmediate(function() {
    var delta = Date.now() - last;
    if (delta > blockDelta) {
      report("node.eventloop_blocked", delta);
    }
  });
}, interval);
I basically infer the event loop run time by looking at the delay of a setInterval. I've seen the same approach in the blocked node module but it feels inaccurate and heavy. Is there a better way to get to this information?
Update: Changed the code to use setImmediate as done by hapi.js.

"Is there a better way to get this information?"
I don't have a better way to test the event loop than checking the time delay of setImmediate, but you can get better precision using Node's high-resolution timer instead of Date.now():
var interval = 500;   // sampling period in ms
var blockDelta = 10;  // example threshold in ms (not defined in the original snippet)
var timer = setInterval(function() {
  var last = process.hrtime(); // replace Date.now()
  setImmediate(function() {
    var delta = process.hrtime(last); // with process.hrtime()
    var deltaMs = delta[0] * 1e3 + delta[1] / 1e6; // tuple -> milliseconds
    if (deltaMs > blockDelta) {
      report("node.eventloop_blocked", deltaMs);
    }
  });
}, interval);
NOTE: delta will be a tuple Array [seconds, nanoseconds].
For more details on process.hrtime():
https://nodejs.org/api/all.html#all_process_hrtime
"The primary use is for measuring performance between intervals."

Check out the blocked module (https://github.com/tj/node-blocked). I'm using it now and it seems to do what you want.
let blocked = require("blocked");
blocked(ms => {
  console.log("EVENT LOOP Blocked", ms);
});
It will print out how long, in ms, the event loop was blocked.
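For intuition, here is a minimal sketch of one common way to detect a blocked loop: schedule a repeating timer and measure how much later than expected it fires. It illustrates the general technique and is not necessarily how the blocked module is implemented; `hrtimeToMs`, `driftMs`, `watchTimerDrift`, and the interval and threshold values are hypothetical names and numbers chosen for the example.

```javascript
// Convert a process.hrtime() tuple [seconds, nanoseconds] to milliseconds.
function hrtimeToMs(tuple) {
  return tuple[0] * 1e3 + tuple[1] / 1e6;
}

// How late a timer fired: elapsed time minus the interval it was scheduled for.
// Clamped at zero, since a timer may fire marginally early due to rounding.
function driftMs(elapsedTuple, intervalMs) {
  return Math.max(0, hrtimeToMs(elapsedTuple) - intervalMs);
}

// Hypothetical helper: calls onBlocked(ms) whenever the timer fires more than
// thresholdMs late. Returns a function that stops the watcher.
function watchTimerDrift(onBlocked, intervalMs, thresholdMs) {
  let start = process.hrtime();
  const timer = setInterval(function () {
    const late = driftMs(process.hrtime(start), intervalMs);
    start = process.hrtime();
    if (late > thresholdMs) onBlocked(Math.round(late));
  }, intervalMs);
  timer.unref(); // monitoring alone should not keep the process alive
  return function stop() { clearInterval(timer); };
}
```

A call such as `watchTimerDrift(ms => console.log("blocked for", ms, "ms"), 100, 10)` would then report any stall longer than roughly 10 ms.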

Code
This code measures the time, in nanoseconds, between the currently running code and the moment the process.nextTick callback fires.
var time = process.hrtime();
process.nextTick(function() {
  var diff = process.hrtime(time);
  console.log('benchmark took %d nanoseconds', diff[0] * 1e9 + diff[1]);
  // prints the elapsed time as a single nanosecond count
});
EDIT: added explanation:
process.hrtime([time])
Returns the current high-resolution real time in a [seconds, nanoseconds] tuple Array. time is an optional parameter that must be the result of a previous process.hrtime() call (and therefore, a real time in a [seconds, nanoseconds] tuple Array containing a previous time) to diff with the current time. These times are relative to an arbitrary time in the past, and not related to the time of day and therefore not subject to clock drift. The primary use is for measuring performance between intervals.
process.nextTick(callback[, arg][, ...])
Once the current event loop turn runs to completion, call the callback function.
This is not a simple alias to setTimeout(fn, 0), it's much more efficient. It runs before any additional I/O events (including timers) fire in subsequent ticks of the event loop.

You may also want to look at the profiling built into node and io.js. See for example this article http://www.brendangregg.com/flamegraphs.html
And this related SO answer How to debug Node.js applications
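Besides profiling, recent Node versions (11.10 and later) ship a built-in event loop delay histogram in the perf_hooks module, which none of the answers above mention. The following is a minimal sketch under that assumption; `summarize` and `sampleLoopDelay` are hypothetical helper names, and the 10 ms resolution is just an example value.

```javascript
const { monitorEventLoopDelay } = require('perf_hooks');

// Histogram values are reported in nanoseconds; convert them to milliseconds.
function summarize(histogram) {
  return {
    meanMs: histogram.mean / 1e6,
    maxMs: histogram.max / 1e6,
    p99Ms: histogram.percentile(99) / 1e6,
  };
}

// Sample the loop delay for durationMs, then pass a summary to the callback.
function sampleLoopDelay(durationMs, callback) {
  const histogram = monitorEventLoopDelay({ resolution: 10 }); // sample every 10 ms
  histogram.enable();
  setTimeout(function () {
    histogram.disable();
    callback(summarize(histogram));
  }, durationMs);
}
```

For example, `sampleLoopDelay(5000, console.log)` would print the mean, maximum, and 99th percentile delay observed over five seconds.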

Related

How to measure execution time of a particular block of code in Cloud Function for Firebase?

I want to measure the execution time of a particular block of code running in a Google Cloud Function (triggered by a Firebase write event). Can anyone tell me how to do it?
Is there a specific tool for measuring execution time?
I have written two implementations and want to know which one executes faster and therefore performs better.
I tried to use process.hrtime() in the following code, but it yields different results for the same data.
Algo 1 Running time 299661890
Algo 1 Running time 5684236
Algo 1 Running time 10185061
start time [ 87, 594147806 ]
'Algo 1 Running time 9251749'
start time [ 22, 803098325 ]
'Algo 1 Running time 1498176261'
// Import the Firebase SDK for Google Cloud Functions.
var functions = require('firebase-functions');
var t0; // start time, set inside myFunction
var mymap = new Map();
exports.processData = functions.database.ref("/test").onWrite(event => {
  const dataValue = event.data.child('data').val();
  dataValue.body = myFunction(dataValue.body);
  const promise = event.data.ref.child('data').set(dataValue);
  // Finish time
  var t1 = process.hrtime(t0);
  var RunTime = Math.round((t1[0] * 1000000000) + t1[1]);
  console.log("Algo 1 Running time " + RunTime);
  return promise;
});
function myFunction(s) {
  // Start time
  t0 = process.hrtime();
  var newValue = 0;
  var reduction = 0; // not declared in the original snippet
  // myProbdict is not defined in the original snippet; presumably a Map like mymap above
  myProbdict.forEach(mapElements);
  function mapElements(value, key, map) {
    if (newValue < 68) {
      reduction += parseInt(value, 10);
      var regexstring = "\\b" + `${key}` + "\\b";
      var regexp = new RegExp(regexstring, "gi");
      s = s.replace(regexp, "#");
    }
  }
  return s; // the original returned dataValue, which is not in scope here
}
You can't expect to have a constant execution time. The time it takes to execute a function is always different because it depends on the current network status (and probably the current server usage as well).
I found a good example in this blog post. The blog writer wrote a function that is supposed to be executed in 1ms:
var start = new Date();
var hrstart = process.hrtime();
setTimeout(function (argument) {
// execution time simulated with setTimeout function
var end = new Date() - start,
hrend = process.hrtime(hrstart);
console.info("Execution time: %dms", end);
console.info("Execution time (hr): %ds %dms", hrend[0], hrend[1]/1000000);
}, 1);
On the first execution, he got the expected result:
Execution time: 1ms
Execution time (hr): 0s 1.025075ms
But on the second execution, the function took a little more than 1ms:
Execution time: 3ms
Execution time (hr): 0s 2.875302ms
If you need to know the execution time of your code block, you can take these outputs and calculate the average: (29961890 + 5684236 + 10185061) ÷ 3, which comes to roughly 15277062.
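The averaging step can be sketched in a few lines. The sample values are the three nanosecond figures quoted in the paragraph above; the median is an addition not in the original answer, included only because it is less sensitive to a single slow run. `mean` and `median` are hypothetical helper names.

```javascript
// Timings in nanoseconds, as quoted in the paragraph above.
const samplesNs = [29961890, 5684236, 10185061];

function mean(values) {
  return values.reduce(function (sum, v) { return sum + v; }, 0) / values.length;
}

function median(values) {
  const sorted = values.slice().sort(function (a, b) { return a - b; });
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

console.log(Math.round(mean(samplesNs)));                // 15277062
console.log(median(samplesNs));                          // 10185061
console.log((mean(samplesNs) / 1e6).toFixed(2) + ' ms'); // 15.28 ms
```

With only three samples the figures are rough; collecting more runs before comparing two implementations gives a steadier picture.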
No need to do anything complex like the other answers!
You can simply do this:
console.time(`time spent on complex thing`)
await doSomethingComplex()
console.timeEnd(`time spent on complex thing`)
And then you will see it in the Google Cloud Console > Cloud Functions > select your function > Click "LOGS"
Please note:
The string you put into console.time needs to be the same as the one in console.timeEnd for it to work.

node delay execution - What's right/wrong with it?

First of all, I'm a newbie with no experience in Node.js and would like to learn more. I wrote a delay function and I'm interested in what you, as JavaScript professionals, think about it. What is good or bad about it, and why?
I'm trying to write a bot. It has two functions. Function 1 starts function 2, but function 2 should not start immediately afterwards; it has to start after a delay.
Of course I researched the topic and found posts like these:
How Can I Wait In Node.js (Javascript), l need to pause for a period of time
How to create a sleep/delay in nodejs that is Blocking?
Unfortunately I'm not able to understand and use them, so I made my own attempt. It works on my computer, but should I deploy it to a server?
//function 1 (example)
function start(){
...;
delay(2500, 'That could be an answer');
}
//Delay
function delay(ms, msg){
var started = new Date();
var now;
var diff = 0;
while(diff < ms){
now = new Date();
diff = now - started;
console.log('Diff time: '+diff);
}
console.log('Delay started at: '+started);
console.log('Now time: '+now);
console.log('ms time: '+ms);
console.log('While loop is done.');
answer(msg);
}
//function 2 (example)
function answer(msg){
...
}
Thanks!
This is blocking. Your event loop will be blocked while this code executes. No other work will be done during the 2500 ms interval except busy-waiting inside the loop.
I'm not sure why you would want to do this. If you want to start function 2 at some point after function 1, use setTimeout. This way, function 2 will start no sooner than the delay you pass to setTimeout, while other code can still execute and the Node event loop is not blocked.
setTimeout(function(){
answer(msg);
}, 2500);
It still does not work. My delay time is more than an hour, but function 2 is executed after a couple of seconds.
setTimeout(function(){
answer(msg);
}, Math.floor(Math.random()*1000*87));
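Note that Math.floor(Math.random()*1000*87) can never exceed 86999 ms, i.e. just under 87 seconds, so a delay between zero and about a minute and a half is exactly what that expression asks for. A minimal sketch of picking a random delay from a range given in minutes follows; `randomDelayMs` and its optional `rng` parameter are hypothetical names introduced for illustration.

```javascript
const MS_PER_MINUTE = 60 * 1000;

// Random whole number of milliseconds in [minMinutes, maxMinutes).
// rng defaults to Math.random and can be replaced for testing.
function randomDelayMs(minMinutes, maxMinutes, rng) {
  const random = rng || Math.random;
  const minMs = minMinutes * MS_PER_MINUTE;
  const maxMs = maxMinutes * MS_PER_MINUTE;
  return Math.floor(minMs + random() * (maxMs - minMs));
}

// Largest value the expression in the comment above can produce.
const originalMax = Math.floor(0.999999 * 1000 * 87);
console.log(originalMax); // 86999
```

Passing `randomDelayMs(60, 90)` as the second argument to setTimeout would then wait between 60 and 90 minutes before calling answer(msg).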
You can use bluebird promises with .delay to keep your code cleaner.
http://bluebirdjs.com/docs/api/promise.delay.html
Make your start function return a bluebird promise, then:
start().delay(2500).then(function (result) {
  // result = return value of the start function
});
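If you would rather not add a dependency, a similar effect can be had with native promises. This is a sketch rather than a drop-in replacement for bluebird's .delay; `delay` is a hypothetical helper, and `start` and `answer` below are stand-ins for the two bot functions from the question.

```javascript
// Resolves with `value` after `ms` milliseconds without blocking the event loop.
function delay(ms, value) {
  return new Promise(function (resolve) {
    setTimeout(resolve, ms, value);
  });
}

// Hypothetical stand-ins for the two bot functions from the question.
function start() { return Promise.resolve('That could be an answer'); }
function answer(msg) { console.log(msg); }

start()
  .then(function (msg) { return delay(2500, msg); }) // wait 2.5 s, keep the message
  .then(answer);
```

While the timer is pending, the process remains free to handle other events, which is the difference from the busy-wait loop in the question.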

Queue up javascript code in a single process

Let's say I have a bunch of tasks in an object, each with a date. I was wondering if it's possible to have the tasks run within a single process and trigger when their date arrives.
Here's an example:
var tasks = [
  {
    "when": "1501121620",
    "what": function(){
      console.log("hello world");
    }
  },
  {
    "when": "1501121625",
    "what": function(){
      console.log("hello world x2");
    }
  }
]
I'm fine with having these stored within a database and the what script being evaled from a string. I need a point in the right direction. I've never seen anything like this in the node world.
I'm thinking about using hotload and using the file system so I don't need to deal with databases.
Should I just look into setInterval or is there something out there that is more sophisticated? I know things like cron exist, the thing is I need all of these tasks to occur within an already existing running process. I need to be able to add a new task to the queue without ending the process.
To add a little context I need some way of queuing up socket.io .emit() functions.
Do not reinvent the wheel. Use the cron package from npm. It is written in pure JavaScript (using the second variant described below), so all of these tasks will run within an already existing running process. For example, you can create a CronJob like this:
var CronJob = require('cron').CronJob;
var job = new CronJob(1421110908157);
job.addCallback(function() { /* some stuff to do */ });
In pure JavaScript you can do it only through the setTimeout and setInterval methods. There are two variants:
1) Set an interval callback that checks your task queue and executes callbacks at the appropriate time:
setInterval(function() {
  for (var i = 0; i < tasks.length; ++i) {
    var task = tasks[i];
    if (task.when * 1000 < Date.now()) {
      task.what();
      tasks.splice(i, 1);
      --i; // stay on the same index after removing the task
    }
  }
}, 1000);
As you can see, the accuracy of the callback timing depends on the interval: a shorter interval gives more accuracy but also more CPU usage.
2) Create a wrapper around your tasks. When you want to add a new task, you call a method such as addTask, which calls setTimeout with your task callback. Beware that the maximum delay for setTimeout is 2147483647 ms (around 25 days). If the delay exceeds that maximum, set a timeout for the maximum time with a callback that sets a new timeout for the remaining time. For example:
var MAX_TIME = 2147483647;
function addTask(task) {
  var delay = task.when * 1000 - Date.now(); // ms until the task is due
  if (delay <= MAX_TIME) {
    setTimeout(task.what, Math.max(delay, 0));
  }
  else {
    // Too far ahead for a single timer: wait the maximum, then re-evaluate.
    setTimeout(addTask.bind(null, task), MAX_TIME);
  }
}

Node.js Synchronous Library Code Blocking Async Execution

Suppose you've got a 3rd-party library that's got a synchronous API. Naturally, attempting to use it in an async fashion yields undesirable results in the sense that you get blocked when trying to do multiple things in "parallel".
Are there any common patterns that allow us to use such libraries in an async fashion?
Consider the following example (using the async library from NPM for brevity):
var async = require('async');
function ts() {
return new Date().getTime();
}
var startTs = ts();
process.on('exit', function() {
console.log('Total Time: ~' + (ts() - startTs) + ' ms');
});
// This is a dummy function that simulates some 3rd-party synchronous code.
function vendorSyncCode() {
var future = ts() + 50; // ~50 ms in the future.
while(ts() <= future) {} // Spin to simulate blocking work.
}
// My code that handles the workload and uses `vendorSyncCode`.
function myTaskRunner(task, callback) {
// Do async stuff with `task`...
vendorSyncCode(task);
// Do more async stuff...
callback();
}
// Dummy workload.
var work = (function() {
var result = [];
for(var i = 0; i < 100; ++i) result.push(i);
return result;
})();
// Problem:
// -------
// The following two calls will take roughly the same amount of time to complete.
// In this case, ~6 seconds each.
async.each(work, myTaskRunner, function(err) {});
async.eachLimit(work, 10, myTaskRunner, function(err) {});
// Desired:
// --------
// The latter call with 10 "workers" should complete roughly an order of magnitude
// faster than the former.
Are fork/join or spawning worker processes manually my only options?
Yes, it is your only option.
If you need to use 50ms of CPU time to do something, and need to do it 10 times, then you'll need 500ms of CPU time to do it. If you want it to be done in less than 500ms of wall clock time, you need to use more CPUs. That means multiple node instances (or a C++ addon that pushes the work out onto the thread pool). How to get multiple instances depends on your app structure: a child that you feed the work to using child_process.send() is one way, and running multiple servers with cluster is another. Breaking up your server is yet another way. Say it's an image store application that is mostly fast at processing requests, unless someone asks to convert an image into another format, which is CPU intensive. You could push the image processing portion into a different app and access it through a REST API, leaving the main app server responsive.
If you aren't concerned that it takes 50ms of cpu to do the request, but instead you are concerned that you can't interleave handling of other requests with the processing of the cpu intensive request, then you could break the work up into small chunks, and schedule the next chunk with setInterval(). That's usually a horrid hack, though. Better to restructure the app.
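The chunking idea in the last paragraph can be sketched as follows. Note that this sketch schedules the next chunk with setImmediate rather than setInterval, which is a substitution and not the answer's wording; `runChunk` and `processInChunks` are hypothetical helper names.

```javascript
// Run fn on up to chunkSize items starting at `start`; return the next index.
function runChunk(items, start, chunkSize, fn) {
  const end = Math.min(start + chunkSize, items.length);
  for (let i = start; i < end; ++i) fn(items[i]);
  return end;
}

// Process all items, yielding to the event loop between chunks so that
// pending I/O callbacks and timers get a chance to run in between.
function processInChunks(items, chunkSize, fn, done) {
  let index = 0;
  (function next() {
    index = runChunk(items, index, chunkSize, fn);
    if (index < items.length) {
      setImmediate(next); // schedule the next chunk after pending events
    } else {
      done();
    }
  })();
}
```

The total CPU time stays the same; the work is only interleaved with other events, so this helps responsiveness and not throughput.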

How do I execute a piece of code no more than every X minutes?

Say I have a link aggregation app where users vote on links. I sort the links using hotness scores generated by an algorithm that runs whenever a link is voted on. However, running it on every vote seems excessive. How do I limit it so that it runs no more than, say, once every 5 minutes?
a) use cron job
b) keep track of the timestamp when the procedure was last run, and when the current timestamp - the timestamp you have stored > 5 minutes then run the procedure and update the timestamp.
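Option b) can be sketched as a small wrapper that remembers when the procedure last ran. This is a minimal illustration, assuming an in-process timestamp is sufficient; `throttle` and its optional `now` parameter are hypothetical names, and a multi-server setup would need to keep the timestamp in shared storage instead.

```javascript
// Wrap fn so that it runs at most once per minIntervalMs.
// `now` defaults to Date.now and can be replaced for testing.
function throttle(fn, minIntervalMs, now) {
  const clock = now || Date.now;
  let lastRun = -Infinity; // ensures the very first call always runs
  return function () {
    const t = clock();
    if (t - lastRun < minIntervalMs) return false; // too soon, skip this call
    lastRun = t;
    fn.apply(this, arguments);
    return true;
  };
}
```

For the voting case, something like `var recompute = throttle(updateHotness, 5 * 60 * 1000)` (with `updateHotness` standing in for the scoring algorithm) could be called on every vote, and the algorithm would run at most once per 5 minutes.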
var yourVoteStuff = function() {
...
setTimeout(yourVoteStuff, 5 * 60 * 1000);
};
yourVoteStuff();
Before asking why not to use setInterval, well, read the comment below.
Why "why setInterval" and not "why cron job?"? Am I that wrong?
First, build a receiver that receives all your link submissions.
Second, the receiver push()es each received link to a queue (I strongly recommend redis).
Then have an aggregator that loops at a time interval of your choosing. Within this loop, each queued link should be poll()ed and passed on to your business logic.
I have used this solution in production and I can tell you that it scales and performs well.
Example of use:
var MIN = 5; // don't run aggregation for short queue, saves resources
var THROTTLE = 10; // aggregation/sec
var queue = [];
var bucket = [];
var interval = 1000; // 1sec
flow.on("submission", function(link) {
queue.push(link);
});
___aggregationLoop(interval);
function ___aggregationLoop(interval) {
setTimeout(function() {
bucket = [];
if(queue.length<=MIN) {
___aggregationLoop(100); // intensive
return;
}
for(var i=0; i<THROTTLE && queue.length>0; ++i) {
bucket.push(queue.pop()); // stop early once the queue is empty
}
___aggregationLoop(interval);
}, interval);
}
Cheers!
