How to find the source of a memory leak in a Node.js app

I have a memory leak in a Node.js/Express app. The app dies after 3-5 days with the following log message:
FATAL ERROR: JS Allocation failed - process out of memory
I set up a server with no users connecting, and it still crashes, so I know the leak originates in the following code, which runs in the background to sync API changes to the db.
poll(config.refreshInterval)

function poll(refreshRate) {
    return apiSync.syncDatabase()
        .then(function(){
            return wait(refreshRate)
        })
        .then(function(){
            return poll(refreshRate)
        })
}

var wait = function wait(time) {
    return new Promise(function(resolve){
        applog.info('waiting for %s ms..', time)
        setTimeout(function(){
            resolve(true)
        }, time)
    })
}
What techniques are available for profiling the heap to find the source object(s) responsible for taking all the memory?
It takes a while to crash, so I need something that logs data I can come back to later and analyze.
Is there any option like Java's JVM flag -XX:HeapDumpOnOutOfMemoryError?

Check out node-memwatch.
It provides a heap diff class:
var hd = new memwatch.HeapDiff();
// your code here ...
var diff = hd.end();
It also has event emitters for leaks:
memwatch.on('leak', function(info) {
    // look at info to find out about what might be leaking
});
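Since the process runs unattended for days, one option is to keep a rolling HeapDiff and dump a heap snapshot whenever memwatch reports a leak. This is only a sketch: it assumes the heapdump module is installed alongside node-memwatch, and the snapshot path is arbitrary.

var memwatch = require('memwatch');
var heapdump = require('heapdump'); // writes V8 .heapsnapshot files you can load in Chrome DevTools

var hd = new memwatch.HeapDiff();

memwatch.on('stats', function (stats) {
    // emitted after each full GC; log what grew since the last diff
    var diff = hd.end();
    console.log('heap diff:', JSON.stringify(diff.change));
    hd = new memwatch.HeapDiff();
});

memwatch.on('leak', function (info) {
    console.log('possible leak:', info);
    // roughly the Node counterpart of -XX:HeapDumpOnOutOfMemoryError:
    // snapshot the heap so it can be analyzed after the fact
    heapdump.writeSnapshot('/tmp/leak-' + Date.now() + '.heapsnapshot');
});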

Related

Node memory leak while using redis brpop

I'm having a memory leak issue in a node application. The application is subscribed to a topic in redis, and on receiving a message it pops a message from a list using brpop. There are a number of instances of this application running in production, so one instance might be blocking while waiting for a message in the redis list. Here is the code snippet which consumes a message from redis:
private doWork(): void {
    this.storage.subscribe("newRoom", (message: [any, any]) => {
        const [msg] = message;
        if (msg === "room") {
            return new Promise( async (resolve, reject) => {
                process.nextTick( async () => {
                    const roomIdData = await this.storage.brpop("newRoomList"); // a promisified version of brpop with timeout of 5s
                    if (roomIdData) {
                        const roomId = roomIdData[1];
                        this.createRoom(roomId);
                    }
                });
                resolve();
            });
        }
    });
}
I've tried debugging the memory leak using the Chrome debugger, and I've observed too many closure objects getting created. I suspect this code is the cause, as I can see the redis client object name in the closure objects, but I'm not able to figure out how to fix it. I added process.nextTick but it didn't help. I'm using the node-redis client for connecting to redis. Attaching an object retainer map screenshot from the Chrome debugger tool.
P.S. blk is the redis client object name used exclusively for blocking commands i.e. brpop.
Edit: Replaced brpop with rpop and we're seeing a significant drop in memory growth rate, but now the distribution of messages between the workers has become skewed.
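For reference, a plain-JavaScript sketch of the edit described above, assuming the storage object exposes a promisified rpop wrapper analogous to the existing brpop one (hypothetical names; note that RPOP returns the element itself rather than BRPOP's [key, element] pair, and never blocks the connection):

async function doWork(storage, createRoom) {
    storage.subscribe("newRoom", async ([msg]) => {
        if (msg !== "room") return;
        // non-blocking pop: resolves to null immediately when the list is empty,
        // so no connection sits blocked holding callback closures alive
        const roomId = await storage.rpop("newRoomList");
        if (roomId) {
            createRoom(roomId);
        }
    });
}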

Does v8/Node actually garbage collect during function calls? - Or is this a sailsJS memory leak

I am creating a sailsJS webserver with a background task that needs to run continuously (when the server is idle) - a task to synchronize a database with some external data and pre-cache data to speed up requests.
I am using sails version 1.0. The adapter is postgresql (adapter: 'sails-postgresql'), adapter version: 1.0.0-12.
Now while running this application I noticed a major problem: it seems that after some time the application inexplicably crashes with an out of heap memory error. (I can't even catch this; the node process just quits.)
While hunting for the memory leak I tried many different approaches, and ultimately I can reduce my code to the following function:
async DoRun(runCount=0, maxCount=undefined) {
    while (maxCount === undefined || runCount < maxCount) {
        this.count += 1;
        runCount += 1;
        console.log(`total run count: ${this.count}`);
        let taskList;
        try {
            this.active = true;
            taskList = await Task.find({}).populate('relatedTasks').populate('notBefore');
            //taskList = await this.makeload();
        } catch (err) {
            console.error(err);
            this.active = false;
            return;
        }
    }
}
To make it "testable" I reduced the heap size allowed to be used by the application: --max-old-space-size=100; With this heapsize it always crashes about around 2000 runs. However even with an "unlimited" heap it crashes after a few (ten)thousand runs.
Now to further test this I commented out the Task.find() command and implimented a dummy that creates the "same" result".
async makeload() {
    const promise = new Promise(resolve => {
        setTimeout(resolve, 10, this);
    });
    await promise;
    const ret = [];
    for (let i = 0; i < 10000; i++) {
        ret.push({
            relatedTasks: [],
            notBefore: [],
            id: 1,
            orderId: 1,
            queueStatus: 'new',
            jobType: 'test',
            result: 'success',
            argData: 'test',
            detail: 'blah',
            lastActive: new Date(),
            updatedAt: Date.now(),
            priority: 2 });
    }
    return ret;
}
This runs well (so far), even after 20000 calls, with 90 MB of heap allocated.
What am I doing wrong in the first case? This leads me to believe that sails has a memory leak. Or is node unable to free the database connections somehow?
I can't see anything that is blatantly "leaking" here. As I can see in the log, this.count is not a string, so it's not even leaking there (same for runCount).
How can I progress from this point?
EDIT
Some further clarifications/summary:
I run on node 8.9.0
Sails version 1.0
using the sails-postgresql adapter (1.0.0-12) (a beta version, as other versions don't work with sails 1.0)
I run with the flag: --max-old-space-size=100
Environment variable: node_env=production
It crashes after approx 2000-2500 runs when in production environment (500 when in debug mode).
I've created a github repository containing a workable example of the code here.
Once again, to reproduce the crash "soon", set the flag --max-old-space-size=80 (or something alike).
I don't know anything about sailsJS, but I can answer the first half of the question in the title:
Does V8/Node actually garbage collect during function calls?
Yes, absolutely. The details are complicated (most garbage collection work is done in small incremental chunks, and as much as possible in the background) and keep changing as the garbage collector is improved. One of the fundamental principles is that allocations trigger chunks of GC work.
The garbage collector does not care about function calls or the event loop.
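To see this for yourself, here is a small standalone sketch (not sails-specific): heap usage stays bounded even though the function never returns, because the short-lived allocations on each iteration give the garbage collector work to reclaim.

async function churn(iterations) {
    for (let i = 0; i < iterations; i++) {
        // allocate a short-lived object graph, roughly in the spirit of a query result
        const batch = new Array(10000).fill(null).map((_, j) => ({ id: j, detail: 'blah' }));
        if (i % 1000 === 0) {
            const mb = Math.round(process.memoryUsage().heapUsed / 1024 / 1024);
            console.log(`run ${i}: heapUsed ${mb} MB (last batch: ${batch.length} objects)`);
        }
        await new Promise(resolve => setImmediate(resolve)); // yield to the event loop, like the awaits in DoRun
    }
}

churn(20000);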

NodeJS sometimes gets killed because out of memory while streaming/piping files

Problem
I have a NodeJS server that uses the request module.
I use request's pipe() for serving files.
Sometimes, the app throws an exception, all downloads cancel and I have to restart the app:
Out of memory: Kill process 9342 (nodejs) score 793 or sacrifice child
Killed process 9342 (nodejs) total-vm:1333552kB, anon-rss:410648kB, file-rss:0kB
I wrote another script which restarts the server automatically (with child_process & fork) when it ends unexpectedly; that restarted app sometimes throws this error:
FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory
Server data
RAM: 500MB (I know that this is not much, but it's cheap)
Ubuntu 12.04.5 LTS
NodeJS version: v0.10.36
Assumptions
Too many downloads in parallel
Something wrong with pipe() related to the RAM
Regarding 1:
When somebody downloads a big file, a bit of it is loaded into RAM (I know little about this; say 20MB at once, please correct me if I'm badly wrong). With 400MB available and 20 concurrent downloads at the same download speed, the server crashes because it can't hold more than 400MB at once in RAM.
Regarding 2:
In addition to pipe() I use the following code to track current & canceled downloads:
req.on("close", function() {
currentDownloads--;
});
The pipe() doesn't close properly and the RAM it used doesn't get cleared.
Questions
If any of my assumptions should be right, how could I fix it?
If not, what could it be, or rather where could the cause lie (is it NodeJS, the request module, is my code wrong or bad, are there better methods)?
Full Code
var currentDownloads = 0;

app.post("/", function (req, res) {
    var open = false;
    req.on("close", function () {
        if (open) {
            currentDownloads--;
            open = false;
        }
    });
    request.get(url)
        .on("error", function (err) {
            log("err " + err);
            if (open) {
                currentDownloads--;
                open = false;
            }
        })
        .on("response", function () {
            open = true;
            currentDownloads++;
        })
        .pipe(res);
});
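No answer is recorded here, but as a way to test assumption 1, here is a minimal sketch that reuses the existing currentDownloads counter to cap concurrency; the limit of 15 is an arbitrary assumption for a 500MB box, not a recommendation from the thread:

var currentDownloads = 0;
var MAX_DOWNLOADS = 15; // assumed limit; tune it to the available RAM

app.post("/", function (req, res) {
    if (currentDownloads >= MAX_DOWNLOADS) {
        // shed load instead of letting the kernel OOM-kill the whole process
        res.statusCode = 503;
        return res.end("Too many downloads in progress, please try again later.");
    }
    var open = false;
    req.on("close", function () {
        if (open) {
            currentDownloads--;
            open = false;
        }
    });
    request.get(url)
        .on("error", function (err) {
            log("err " + err);
            if (open) {
                currentDownloads--;
                open = false;
            }
        })
        .on("response", function () {
            open = true;
            currentDownloads++;
        })
        .pipe(res);
});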

jsdom and node.js leaking memory

I found a few references to people having a similar issue where the answer always was: make sure you call window.close() when done. However, that does not seem to be working for me (node 0.8.14 and jsdom 0.3.1).
A simple repro
var util = require('util');
var jsdom = require('jsdom');

function doOne() {
    var htmlDoc = '<html><head></head><body id="' + i + '"></body></html>';
    jsdom.env(htmlDoc, null, null, function(errors, window) {
        window.close();
    });
}

for (var i = 1; i < 100000; i++) {
    doOne();
    if (i % 500 == 0) {
        console.log(i + ":" + util.inspect(process.memoryUsage()));
    }
}
console.log("done");
console.log ("done");
Output I get is
500:{ rss: 108847104, heapTotal: 115979520, heapUsed: 102696768 }
1000:{ rss: 198250496, heapTotal: 194394624, heapUsed: 190892120 }
1500:{ rss: 267304960, heapTotal: 254246912, heapUsed: 223847712 }
...
11000:{ rss: 1565204480, heapTotal: 1593723904, heapUsed: 1466889432 }
At this point the fan goes wild and the test actually stops... or at least starts going very slowly.
Does anyone have any tips other than window.close to get rid of the memory leak (or what sure looks like a memory leak)?
Thanks!
Peter
I was using jsdom 0.6.0 to help scrape some data and ran into the same problem.
window.close only helped slow the memory leak, but it did eventually creep up until the process got killed.
Running the script with
node --expose-gc myscript.js
Until they fix the memory leak, manually calling the garbage collector in addition to calling window.close seems to work:
if (process.memoryUsage().heapUsed > 200000000) { // memory use is above 200MB
    global.gc();
}
I stuck that after the call to window.close. Memory use immediately drops back to baseline (around 50MB for me) every time it gets triggered, with a barely perceptible halt.
update: also consider calling global.gc() multiple times in succession rather than only once (i.e. global.gc();global.gc();global.gc();global.gc();global.gc();)
Calling window.gc() multiple times was more effective (based on my imperfect tests), I suspect because it possibly caused chrome to trigger a major GC event rather than a minor one. - https://github.com/cypress-io/cypress/issues/350#issuecomment-688969443
You are not giving the program any idle time to do garbage collection. I believe you will run into the same problem with any large object graph created many times tightly in a loop with no breaks.
This is substantiated by CheapSteaks's answer, which manually forces the garbage collection. There can't be a memory leak in jsdom if that works, since memory leaks by definition prevent the garbage collector from collecting the leaked memory.
I had the same problem with jsdom and switched to cheerio, which is much faster than jsdom and works even after scanning hundreds of sites. Perhaps you should try it, too. The only problem is that it doesn't have all the selectors you can use in jsdom.
I hope it works for you, too.
Daniel
with gulp, memory usage, cleanup, variable delete, window.close()
var gb = setInterval(function () {
    // only call if memory use is above 200MB
    if (process.memoryUsage().heapUsed > 200000000) {
        global.gc();
    }
}, 10000); // 10sec

gulp.task('tester', ['clean:raw2'], function() {
    return gulp.src('./raw/*.html')
        .pipe(logger())
        .pipe(map(function(contents, filename) {
            var doc = jsdom.jsdom(contents);
            var window = doc.parentWindow;
            var $ = jquery(window);
            console.log( $('title').text() );
            var html = window.document.documentElement.outerHTML;
            $( doc ).ready(function() {
                console.log( "document loaded" );
                window.close();
            });
            return html;
        }))
        .pipe(gulp.dest('./raw2'))
        .on('end', onEnd);
});
I consistently had between 200MB and 300MB usage for 7k files; it took 30 minutes.
It might be helpful for someone, as I googled and didn't find anything helpful.
A workaround for this is to run the jsdom-related code in a forked child_process and send back the relevant results when done, then kill the child_process.
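A minimal sketch of that workaround (file names and message shapes are assumptions; the worker uses the same jsdom.env call as the question's repro):

// parent.js
var fork = require('child_process').fork;

function scrapeInChild(html, callback) {
    var child = fork('./jsdom-worker.js');
    child.once('message', function (result) {
        child.kill();                 // everything jsdom allocated goes away with the process
        callback(null, result);
    });
    child.once('error', callback);
    child.send(html);
}

// jsdom-worker.js
var jsdom = require('jsdom');

process.on('message', function (html) {
    jsdom.env(html, null, null, function (errors, window) {
        var title = window.document.title; // extract whatever you need here
        window.close();
        process.send({ errors: errors, title: title });
    });
});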

Make node.js not exit on error

I am working on a websocket-oriented node.js server using Socket.IO. I noticed a bug where certain browsers aren't following the correct connect procedure to the server, and the code isn't written to gracefully handle it; in short, it calls a method on an object that was never set up, thus killing the server due to an error.
My concern isn't with the bug in particular, but the fact that when such errors occur, the entire server goes down. Is there anything I can do on a global level in node so that if an error occurs it will simply log a message, perhaps kill the event, but the server process will keep on running?
I don't want other users' connections to go down due to one clever user exploiting an uncaught error in a large included codebase.
You can attach a listener to the uncaughtException event of the process object.
Code taken from the actual Node.js API reference (it's the second item under "process"):
process.on('uncaughtException', function (err) {
    console.log('Caught exception: ', err);
});

setTimeout(function () {
    console.log('This will still run.');
}, 500);

// Intentionally cause an exception, but don't catch it.
nonexistentFunc();
console.log('This will not run.');
All you've got to do now is log it or do something with it. If you know under what circumstances the bug occurs, you should file a bug over at Socket.IO's GitHub page:
https://github.com/LearnBoost/Socket.IO-node/issues
Using uncaughtException is a very bad idea.
The best alternative is to use domains in Node.js 0.8. If you're on an earlier version of Node.js, rather use forever to restart your processes, or even better use node cluster to spawn multiple worker processes and restart a worker on the event of an uncaughtException.
From: http://nodejs.org/api/process.html#process_event_uncaughtexception
Warning: Using 'uncaughtException' correctly
Note that 'uncaughtException' is a crude mechanism for exception handling intended to be used only as a last resort. The event should not be used as an equivalent to On Error Resume Next. Unhandled exceptions inherently mean that an application is in an undefined state. Attempting to resume application code without properly recovering from the exception can cause additional unforeseen and unpredictable issues.
Exceptions thrown from within the event handler will not be caught. Instead the process will exit with a non-zero exit code and the stack trace will be printed. This is to avoid infinite recursion.
Attempting to resume normally after an uncaught exception can be similar to pulling out the power cord when upgrading a computer -- nine out of ten times nothing happens, but the tenth time, the system becomes corrupted.
The correct use of 'uncaughtException' is to perform synchronous cleanup of allocated resources (e.g. file descriptors, handles, etc) before shutting down the process. It is not safe to resume normal operation after 'uncaughtException'.
To restart a crashed application in a more reliable way, whether uncaughtException is emitted or not, an external monitor should be employed in a separate process to detect application failures and recover or restart as needed.
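Putting those recommendations together (an external monitor that restarts, plus 'uncaughtException' used only for synchronous cleanup before exiting), here is a minimal sketch; the worker's HTTP server is just a placeholder, not code from this thread:

var cluster = require('cluster');
var http = require('http');

if (cluster.isMaster) {
    // the master acts as the "external monitor": it only forks and restarts workers
    cluster.fork();
    cluster.on('exit', function (worker, code, signal) {
        console.log('worker ' + worker.process.pid + ' died (' + (signal || code) + '), restarting');
        cluster.fork();
    });
} else {
    process.on('uncaughtException', function (err) {
        // synchronous cleanup only, then get out; do not resume normal operation
        console.error('uncaught exception:', err.stack || err);
        process.exit(1);
    });

    http.createServer(function (req, res) {
        res.end('ok\n');
    }).listen(8000);
}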
I just did a bunch of research on this (see here, here, here, and here) and the answer to your question is that Node will not allow you to write one error handler that will catch every error scenario that could possibly occur in your system.
Some frameworks like express will allow you to catch certain types of errors (when an async method returns an error object), but there are other conditions that you cannot catch with a global error handler. This is a limitation (in my opinion) of Node and possibly inherent to async programming in general.
For example, say you have the following express handler:
app.get("/test", function(req, res, next) {
require("fs").readFile("/some/file", function(err, data) {
if(err)
next(err);
else
res.send("yay");
});
});
Let's say that the file "/some/file" does not actually exist. In this case fs.readFile will return an error as the first argument to the callback. If you check for that and call next(err) when it happens, the default express error handler will take over and do whatever you make it do (e.g. return a 500 to the user). That's a graceful way to handle an error. Of course, if you forget to call next(err), it doesn't work.
So that's the error condition that a global handler can deal with, however consider another case:
app.get("/test", function(req, res, next) {
require("fs").readFile("/some/file", function(err, data) {
if(err)
next(err);
else {
nullObject.someMethod(); //throws a null reference exception
res.send("yay");
}
});
});
In this case, there is a bug in your code that results in calling a method on a null object. Here an exception will be thrown, it will not be caught by the global error handler, and your node app will terminate. All clients currently executing requests on that service will be suddenly disconnected with no explanation as to why. Ungraceful.
There is currently no global error handler functionality in Node to handle this case. You cannot put a giant try/catch around all your express handlers, because by the time your async callback executes, those try/catch blocks are no longer in scope. That's just the nature of async code; it breaks the try/catch error handling paradigm.
AFAIK, your only recourse here is to put try/catch blocks around the synchronous parts of your code inside each one of your async callbacks, something like this:
app.get("/test", function(req, res, next) {
require("fs").readFile("/some/file", function(err, data) {
if(err) {
next(err);
}
else {
try {
nullObject.someMethod(); //throws a null reference exception
res.send("yay");
}
catch(e) {
res.send(500);
}
}
});
});
That's going to make for some nasty code, especially once you start getting into nested async calls.
Some people think that what Node does in these cases (that is, die) is the proper thing to do because your system is in an inconsistent state and you have no other option. I disagree with that reasoning but I won't get into a philosophical debate about it. The point is that with Node, your options are lots of little try/catch blocks or hope that your test coverage is good enough so that this doesn't happen. You can put something like upstart or supervisor in place to restart your app when it goes down but that's simply mitigation of the problem, not a solution.
Node.js has a currently unstable feature called domains that appears to address this issue, though I don't know much about it.
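For completeness, a minimal sketch of what using a domain looks like; note the domain module was later deprecated, so treat this only as an illustration of the idea, not a recommendation:

var domain = require('domain');

var d = domain.create();
d.on('error', function (err) {
    // errors thrown anywhere inside d.run(), including in async callbacks, land here
    console.error('domain caught:', err.stack || err);
});

d.run(function () {
    require('fs').readFile('/some/file', function (err, data) {
        nullObject.someMethod(); // would normally crash the process; now routed to the domain's 'error' handler
    });
});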
I've just put together a class which listens for unhandled exceptions, and when it sees one it:
prints the stack trace to the console
logs it in its own logfile
emails you the stack trace
restarts the server (or kills it, up to you)
It will require a little tweaking for your application as I haven't made it generic yet, but it's only a few lines and it might be what you're looking for!
Check it out!
(Note: this is over 4 years old at this point, unfinished, and there may now be a better way - I don't know!)
process.on('uncaughtException', function (err) {
    var stack = err.stack;
    var timeout = 1;

    // print note to logger
    logger.log("SERVER CRASHED!");
    // logger.printLastLogs();
    logger.log(err, stack);

    // save log to timestamped logfile
    // var filename = "crash_" + _2.formatDate(new Date()) + ".log";
    // logger.log("LOGGING ERROR TO " + filename);
    // var fs = require('fs');
    // fs.writeFile('logs/' + filename, log);

    // email log to developer
    if (helper.Config.get('email_on_error') == 'true') {
        logger.log("EMAILING ERROR");
        require('./Mailer'); // this is a simple wrapper around nodemailer http://documentup.com/andris9/nodemailer/
        helper.Mailer.sendMail("GAMEHUB NODE SERVER CRASHED", stack);
        timeout = 10;
    }

    // Send signal to clients
    // logger.log("EMITTING SERVER DOWN CODE");
    // helper.IO.emit(SIGNALS.SERVER.DOWN, "The server has crashed unexpectedly. Restarting in 10s..");

    // If we exit straight away, the write log and send email operations wont have time to run
    setTimeout(function () {
        logger.log("KILLING PROCESS");
        process.exit();
    },
    // timeout * 1000
    timeout * 100000); // extra time. pm2 auto-restarts on crash...
});
I had a similar problem. Ivo's answer is good. But how can you catch an error in a loop and continue?
var folder = '/anyFolder';

fs.readdir(folder, function(err, files) {
    for (var i = 0; i < files.length; i++) {
        var stats = fs.statSync(folder + '/' + files[i]);
    }
});
Here, fs.statSync throws an error (on a hidden file in Windows that barfs, I don't know why). The error can be caught by the process.on(...) trick, but the loop stops.
I tried adding a handler directly:
var stats = fs.statSync(folder+'/'+files[i]).on('error',function(err){console.log(err);});
This did not work either.
Adding a try/catch around the questionable fs.statSync() was the best solution for me:
var stats;
try {
    stats = fs.statSync(path);
} catch(err) {
    console.log(err);
}
This then led to the code fix (making a clean path var from folder and file).
I found PM2 to be the best solution for handling node servers, for both single and multiple instances.
One way of doing this would be to spin up a child process and communicate with the parent process via the 'message' event.
In the child process where the error occurs, catch it with 'uncaughtException' to avoid crashing the application. Mind that exceptions thrown from within the event handler will not be caught. Once the error is caught safely, send a message like {finish: false}.
The parent process listens for the message event and sends a message back to the child process to re-run the function.
Child Process:
// In child.js

// function causing an exception
const errorComputation = function() {
    for (let i = 0; i < 50; i++) {
        console.log('i is.......', i);
        if (i === 25) {
            throw new Error('i = 25');
        }
    }
    process.send({finish: true});
}

// Note: exceptions thrown from within this handler are not caught; the process
// would then exit with a non-zero exit code and print the stack trace, to avoid infinite recursion.
process.on('uncaughtException', err => {
    console.log('uncaught exception..', err.message);
    process.send({finish: false});
});

// listen to the parent process and run errorComputation again
process.on('message', () => {
    console.log('starting process ...');
    errorComputation();
})
Parent Process:
// In parent.js
const { fork } = require('child_process');
const compute = fork('child.js');

// listen to the child process
compute.on('message', (data) => {
    if (!data.finish) {
        compute.send('start');
    } else {
        console.log('Child process finished successfully!')
    }
});

// send initial message to start the child process
compute.send('start');
