Node Memory Leak Causes setInterval Delay - node.js

I'm trying to ID what's slowing down a DB connection. I've narrowed it down to probably being a memory leak. Following instructions in this guide, I've set up a heap profiling function to run at intervals throughout the program. Essentially like this:
setInterval(function(){heapingFunction()},100);
//some code
const pgClient = new pg.Client(dbConfig);
app.listen(port, function (err) {
if (err) {
logger.debug(err);
} else {
logger.info("listening at " + port + " port");
}
});
pgClient.connect()
.then(function (connection) {
logger.info("database connect");
console.log("database connect");
return pgClient.query("query");
})
.then(function (result) { /* further queries */ });
When I run this, instead of consistent heap snapshots at .1s intervals, I get a huge jump:
info: listening at 3001 port
Program is using 39904440 bytes of Heap.
Program is using 39927960 bytes of Heap.
Program is using 40055272 bytes of Heap.
Program is using 40086448 bytes of Heap.
info: database connect
database connect
Program is using 206523904 bytes of Heap.
Program is using 206546224 bytes of Heap.
Program is using 206665472 bytes of Heap.
Program is using 206874608 bytes of Heap.
Program is using 206929464 bytes of Heap.
It works as expected until the heap snapshot before info: database connect. There, it stops until the DB connect (~5min). As you can see, it's using 5x as much memory once it resumes (It's also much slower). It would be more useful to have snapshots during this time period, not just before and after. What's going on here? Is the memory leak so severe that setInterval can't even run?
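One possible explanation (an assumption, since the question does not show what runs during the connect) is that synchronous work is blocking the event loop: setInterval callbacks only fire when the loop is free, so no samples can be taken while it is blocked. Below is a minimal sketch that logs heap usage and also reports how late each tick fired. `heapingFunction` mirrors the name used in the question; `startHeapSampler` and its drift threshold are hypothetical.

```javascript
// Format one sample in the same shape as the log lines above.
function heapingFunction() {
  return 'Program is using ' + process.memoryUsage().heapUsed + ' bytes of Heap.';
}

// Hypothetical helper: sample every `intervalMs` and flag ticks that fire late.
function startHeapSampler(intervalMs, log) {
  let last = Date.now();
  const timer = setInterval(function () {
    const now = Date.now();
    const drift = now - last - intervalMs; // how late this tick fired
    last = now;
    log(heapingFunction());
    if (drift > intervalMs) {
      log('Event loop was blocked for roughly ' + drift + ' ms');
    }
  }, intervalMs);
  timer.unref(); // do not keep the process alive just for sampling
  return timer;
}
```

Calling `startHeapSampler(100, console.log)` at startup would reproduce the log above. A large reported drift right after the gap would suggest that blocking work, rather than the leak itself, is what stops the samples.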

Related

Does v8/Node actually garbage collect during function calls? - Or is this a sailsJS memory leak

I am creating a sailsJS webserver with a background task that needs to run continuously (if the server is idle). This is a task to synchronize a database with some external data and pre-cache data to speed up requests.
I am using sails version 1.0. The adapter is postgresql (adapter: 'sails-postgresql'), adapter version: 1.0.0-12
Now while running this application I noticed a major problem: it seems that after some time the application inexplicably crashes with an out of heap memory error. (I can't even catch this, the node process just quits).
While I tried to hunt for a memory leak I tried many different approaches, and ultimately I can reduce my code to the following function:
async DoRun(runCount=0, maxCount=undefined) {
while (maxCount === undefined || runCount < maxCount) {
this.count += 1;
runCount += 1;
console.log(`total run count: ${this.count}`);
let taskList;
try {
this.active = true;
taskList = await Task.find({}).populate('relatedTasks').populate('notBefore');
//taskList = await this.makeload();
} catch (err) {
console.error(err);
this.active = false;
return;
}
}
}
To make it "testable" I reduced the heap size allowed to be used by the application: --max-old-space-size=100. With this heap size it always crashes after about 2000 runs. However, even with an "unlimited" heap it crashes after a few (ten) thousand runs.
Now to further test this I commented out the Task.find() command and implemented a dummy that creates the "same" result.
async makeload() {
const promise = new Promise(resolve => {
setTimeout(resolve, 10, this);
});
await promise;
const ret = [];
for (let i = 0; i < 10000; i++) {
ret.push({
relatedTasks: [],
notBefore: [],
id: 1,
orderId: 1,
queueStatus: 'new',
jobType: 'test',
result: 'success',
argData: 'test',
detail: 'blah',
lastActive: new Date(),
updatedAt: Date.now(),
priority: 2 });
}
return ret;
}
This runs fine (so far) even after 20000 calls, with 90 MB of heap allocated.
What am I doing wrong in the first case? This led me to believe that sails has a memory leak. Or is node somehow unable to free the database connections?
I can't see anything that is blatantly "leaking" here. As I can see in the log, this.count is not a string, so it's not even leaking there (same for runCount).
How can I progress from this point?
EDIT
Some further clarifications/summary:
I run on node 8.9.0
Sails version 1.0
using sails-postgresql adapter (1.0.0-12) (beta version as other version doesn't work with sails 1.0)
I run with the flag: --max-old-space-size=100
Environment variable: node_env=production
It crashes after approx 2000-2500 runs when in production environment (500 when in debug mode).
I've created a GitHub repository containing a working example of the code. Once again, to see the code crash "soon", set the flag --max-old-space-size=80 (or something similar).
I don't know anything about sailsJS, but I can answer the first half of the question in the title:
Does V8/Node actually garbage collect during function calls?
Yes, absolutely. The details are complicated (most garbage collection work is done in small incremental chunks, and as much as possible in the background) and keep changing as the garbage collector is improved. One of the fundamental principles is that allocations trigger chunks of GC work.
The garbage collector does not care about function calls or the event loop.
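A small sketch illustrating the point, assuming a default V8 configuration: a single synchronous function allocates far more memory in total than the heap ever holds at once, which can only happen if garbage is collected while the function is still running. The function name and sizes are made up for illustration.

```javascript
// Allocate many short-lived arrays inside ONE function call and track peak heap usage.
function allocateGarbage(iterations, elementsPerArray) {
  let peakHeapUsed = 0;
  let checksum = 0;
  for (let i = 0; i < iterations; i++) {
    const garbage = new Array(elementsPerArray).fill(i); // unreachable after this iteration
    checksum += garbage[garbage.length - 1];
    peakHeapUsed = Math.max(peakHeapUsed, process.memoryUsage().heapUsed);
  }
  return { peakHeapUsed: peakHeapUsed, checksum: checksum };
}

const iterations = 500;
const elementsPerArray = 100000; // roughly 800 KB per array at 8 bytes per element
const approxTotalAllocated = iterations * elementsPerArray * 8;
const result = allocateGarbage(iterations, elementsPerArray);
console.log('allocated in total (approx):', approxTotalAllocated, 'bytes');
console.log('peak heapUsed during the call:', result.peakHeapUsed, 'bytes');
```

If the peak stays well below the total, memory was reclaimed in the middle of the call, with no return to the event loop in between.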

node process memory usage, resident set size constantly increasing

Quoted from "What do the return values of node.js process.memoryUsage() stand for?": RSS is the resident set size, the portion of the process's memory held in RAM (how much memory is held in RAM by this process, in bytes). The file 'text.txt' used in the example is 370 KB (378880 bytes).
var http = require('http');
var fs = require('fs');
var express = require('express');
var app = express();
console.log("On app bootstrap = ", process.memoryUsage());
app.get('/test', function(req, res) {
fs.readFile(__dirname + '/text.txt', function(err, file) {
console.log("When File is available = ", process.memoryUsage());
res.end(file);
});
setTimeout(function() {
console.log("After sending = ", process.memoryUsage());
}, 5000);
});
app.listen(8081);
So on app bootstrap: { rss: 22069248, heapTotal: 15551232, heapUsed: 9169152 }
After I made 10 requests for '/test' the situation is:
When File is available = { rss: 33087488, heapTotal: 18635008, heapUsed: 6553552 }
After sending = { rss: 33447936, heapTotal: 18635008, heapUsed: 6566856 }
So from app bootstrap to the 10th request, rss increased by 11378688 bytes, which is roughly 30 times the size of the text.txt file.
I know that this code buffers the entire text.txt file into memory for every request before writing the result back to the client, but I expected that the memory occupied for 'text.txt' would be released after the request finished. Is that not the case?
Second, how do I set the maximum amount of RAM that a node process can consume?
In JavaScript the garbage collector does not run immediately after your code executes, so memory is not freed right away. If you care about memory consumption, you can trigger GC yourself after working with large objects. Note that global.gc() is only available when node is started with the --expose-gc flag.
setTimeout(function() {
global.gc();
console.log("After sending = ", process.memoryUsage());
}, 5000);
To look at your memory allocation you can run your server with v8-profiler and get a heap snapshot.
Try running your example again and give the process some time to run garbage collection. Keep an eye on the process's memory usage with a system monitor; it should drop after some time. Even if it doesn't, the process can't grow beyond the limit mentioned below.
According to the node documentation, the memory limit is 512 MB on 32-bit and 1 GB on 64-bit systems. It can be increased if necessary with the --max-old-space-size flag (value in megabytes).
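To check which limit actually applies to a running process, the built-in v8 module reports it. A minimal sketch: the `heapLimitInfo` helper name is made up, while `v8.getHeapStatistics()` and its `heap_size_limit` field are part of Node's standard library.

```javascript
const v8 = require('v8');

// Hypothetical helper: report the V8 heap limit next to current usage.
function heapLimitInfo() {
  const stats = v8.getHeapStatistics();
  const usage = process.memoryUsage();
  return {
    heapLimitBytes: stats.heap_size_limit, // ceiling V8 will try to stay under
    heapUsedBytes: usage.heapUsed,
    rssBytes: usage.rss // also counts memory outside the JS heap, e.g. Buffers
  };
}

console.log(heapLimitInfo());
```

Starting the process with, for example, `node --max-old-space-size=2048 app.js` should show a correspondingly larger heapLimitBytes. The flag bounds the JavaScript heap, not rss as a whole, which is one reason rss can keep growing while heapUsed stays flat.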

How to find source of memory leak in Node.JS App

I have a memory leak in Node.js/Express app. The app dies after 3-5 days with the following log message:
FATAL ERROR: JS Allocation failed - process out of memory
I set up a server without users connecting, and it still crashes, so I know the leak originates in the following code, which runs in the background to sync API changes to the DB.
poll(config.refreshInterval)
function poll(refreshRate) {
return apiSync.syncDatabase()
.then(function(){
return wait(refreshRate)
})
.then(function(){
return poll(refreshRate)
})
}
var wait = function wait(time) {
return new Promise(function(resolve){
applog.info('waiting for %s ms..', time)
setTimeout(function(){
resolve(true)
},time)
})
}
What techniques are available for profiling the heap to find the source object(s) of what is taking all the memory?
This takes awhile to crash, so I would need something that logs and I can come back later and analyze.
Is there any option like Java's JVM flag -XX:HeapDumpOnOutOfMemoryError ?
Check out node-memwatch.
It provides a heap diff class:
var hd = new memwatch.HeapDiff();
// your code here ...
var diff = hd.end();
It also has event emitters for leaks:
memwatch.on('leak', function(info) {
// look at info to find out about what might be leaking
});
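If adding a native module is not an option, newer Node versions ship comparable building blocks. The sketch below is an alternative built on assumptions, not part of node-memwatch: it appends periodic `process.memoryUsage()` samples to a log file for later analysis and writes a heap snapshot with `v8.writeHeapSnapshot()` (available since Node 11.13) once heap usage crosses a threshold. The file path, threshold and helper names are made up. Recent Node versions also accept a `--heapsnapshot-near-heap-limit=<count>` flag, which is probably the closest counterpart to the JVM option mentioned in the question.

```javascript
const fs = require('fs');
const v8 = require('v8');

// Append one JSON line with a timestamp and the current memory figures.
function logMemorySample(logPath) {
  const sample = Object.assign({ time: new Date().toISOString() }, process.memoryUsage());
  fs.appendFileSync(logPath, JSON.stringify(sample) + '\n');
  return sample;
}

// Write a heap snapshot once heapUsed exceeds the threshold; returns the file name or null.
function snapshotIfAbove(thresholdBytes) {
  if (process.memoryUsage().heapUsed <= thresholdBytes) {
    return null;
  }
  return v8.writeHeapSnapshot(); // blocks while writing; open the file in Chrome DevTools
}

// Hypothetical wiring: sample once a minute, take a single snapshot above the threshold.
function startMemoryWatch(logPath, thresholdBytes) {
  let snapshotTaken = false;
  const timer = setInterval(function () {
    logMemorySample(logPath);
    if (!snapshotTaken && snapshotIfAbove(thresholdBytes) !== null) {
      snapshotTaken = true;
    }
  }, 60 * 1000);
  timer.unref(); // sampling alone should not keep the process alive
  return timer;
}
```

Since the crash takes days, the log file gives the growth curve, and comparing a snapshot taken early with one taken near the threshold shows which object types accumulated.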

Node.js readStream for end of large files

I want to occasionally send the last 2kB of my large log file (>100MB) in an email notification. Right now, I am trying the following:
var endLogBytes = fs.statSync(logFilePath).size;
var endOfLogfile = fs.createReadStream(logFilePath, {start: endLogBytes-2000, end: endLogBytes - 1, autoClose: true, encoding: 'utf8'});
endOfLogfile.on('data', function(chunk) {
sendEmailFunction(chunk);
});
Since I just rebooted, my log files are only ~2MB, but as they get larger I am wondering:
1) Does it take a long time to read out the data (Does Node go through the entire file until it gets to the Bytes I want OR does Node jump to the Bytes that I want?)
2) How much memory is consumed?
3) When is the memory space freed up? How do I free the memory space?
You should not use ReadStream in that case: because it is a stream, it has to (I suppose) grind through all the preceding data before it gets to the last two kilobytes.
So I would just use fs.open and then fs.read with the descriptor of the opened file, like this:
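fs.open(logFilePath, 'r', function(e, fd) {
if (e)
throw e; // or handle the error however you usually do
var endLogBytes = fs.fstatSync(fd).size;
var length = Math.min(2048, endLogBytes);
var endOfLogfile = Buffer.alloc(length);
// arguments: fd, buffer, offset into buffer, bytes to read, position in file
fs.read(fd, endOfLogfile, 0, length, endLogBytes - length, function(e, bytesRead, data) {
fs.close(fd, function() {});
if (e)
throw e;
// decode with whichever encoding the log uses ('ascii', 'utf8', ...)
sendEmailFunction(data.toString('ascii', 0, bytesRead));
});
});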
UPDATE:
It seems the current implementation of ReadStream is smart enough to read only the required amount of data. See: https://github.com/joyent/node/blob/v0.10.29/lib/fs.js#L1550. It uses fs.open and fs.read under the hood, so you can use ReadStream without worry.
Still, I would go with fs.open/fs.read because it is more explicit, in the C style.
About memory and freeing it: you will need at least 2 KB of memory for the data buffer plus some overhead. I don't think there is a way to tell exactly how much overhead there will be; just test it with your target OS and node version. You can use this module for profiling: https://www.npmjs.org/package/webkit-devtools-agent.
The memory will be freed once you no longer reference the buffer and the GC decides it is a good time to collect. GC is non-deterministic (i.e. unpredictable). You should not try to predict its behaviour or force it to collect.
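For reference, the same positional read can be written as a small synchronous helper. This is a sketch under the assumption that blocking briefly is acceptable (for example in a notification script); `readTail` is a made-up name.

```javascript
const fs = require('fs');

// Return the last `maxBytes` bytes of a file as a string without reading the rest.
function readTail(filePath, maxBytes, encoding) {
  const fd = fs.openSync(filePath, 'r');
  try {
    const size = fs.fstatSync(fd).size;
    const length = Math.min(maxBytes, size);
    const buffer = Buffer.alloc(length);
    // Arguments: fd, target buffer, offset in buffer, bytes to read, position in file.
    const bytesRead = fs.readSync(fd, buffer, 0, length, size - length);
    return buffer.toString(encoding || 'utf8', 0, bytesRead);
  } finally {
    fs.closeSync(fd); // always release the descriptor
  }
}
```

With this, `sendEmailFunction(readTail(logFilePath, 2048))` needs a buffer of at most 2048 bytes regardless of how large the log grows. One caveat: a cut at an arbitrary byte offset can split a multi-byte UTF-8 character, so the first character of the result may be garbled.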

jsdom and node.js leaking memory

I found a few references to people having a similar issue, where the answer was always to make sure you call window.close() when done. However, that does not seem to be working for me (node 0.8.14 and jsdom 0.3.1).
A simple repro
var util = require('util');
var jsdom=require('jsdom');
function doOne() {
var htmlDoc = '<html><head></head><body id="' + i + '"></body></html>';
jsdom.env(htmlDoc, null, null, function(errors, window) {
window.close();
});
}
for (var i=1;i< 100000;i++ ) {
doOne();
if(i % 500 == 0) {
console.log(i + ":" + util.inspect(process.memoryUsage()));
}
}
console.log ("done");
Output I get is
500:{ rss: 108847104, heapTotal: 115979520, heapUsed: 102696768 }
1000:{ rss: 198250496, heapTotal: 194394624, heapUsed: 190892120 }
1500:{ rss: 267304960, heapTotal: 254246912, heapUsed: 223847712 }
...
11000:{ rss: 1565204480, heapTotal: 1593723904, heapUsed: 1466889432 }
At this point the fan goes wild and the test actually stops... or at least starts going very slowly.
Does anyone have any tips other than window.close to get rid of the memory leak (or what sure looks like a memory leak)?
Thanks!
Peter
Using jsdom 0.6.0 to help scrape some data and ran into the same problem.
window.close only helped slow the memory leak, but it did eventually creep up till the process got killed.
Running the script with
node --expose-gc myscript.js
Until they fix the memory leak, manually calling the garbage collector in addition to calling window.close seems to work:
if (process.memoryUsage().heapUsed > 200000000) { // memory use is above 200MB
global.gc();
}
Stuck that after the call to window.close. Memory use immediately drops back to baseline (around 50MB for me) every time it gets triggered. Barely perceptible halt.
update: also consider calling global.gc() multiple times in succession rather than only once (i.e. global.gc();global.gc();global.gc();global.gc();global.gc();)
Calling window.gc() multiple times was more effective (based on my imperfect tests), I suspect because it possibly caused chrome to trigger a major GC event rather than a minor one. - https://github.com/cypress-io/cypress/issues/350#issuecomment-688969443
You are not giving the program any idle time to do garbage collection. I believe you will run into the same problem with any large object graph created many times tightly in a loop with no breaks.
This is substantiated by CheapSteaks's answer, which manually forces the garbage collection. There can't be a memory leak in jsdom if that works, since memory leaks by definition prevent the garbage collector from collecting the leaked memory.
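If the tight loop really is the problem, one way to give the runtime breathing room is to yield to the event loop between batches, so that queued callbacks (such as the jsdom.env callbacks that call window.close(), assuming they are invoked asynchronously) get a chance to run before more work is created. A generic sketch, not jsdom-specific; `processInBatches` and its parameters are made up.

```javascript
// Run `worker(item)` for every item, pausing after each batch so the event loop can turn.
async function processInBatches(items, batchSize, worker) {
  const results = [];
  for (let i = 0; i < items.length; i += batchSize) {
    const batch = items.slice(i, i + batchSize);
    for (const item of batch) {
      results.push(await worker(item));
    }
    // Yield: pending timers, I/O callbacks and setImmediate handlers run here.
    await new Promise(function (resolve) { setImmediate(resolve); });
  }
  return results;
}
```

In the repro above, the worker would wrap one `doOne()` call in a promise that resolves inside the jsdom.env callback, so only one batch of windows is alive at any time.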
I had the same problem with jsdom and switched to cheerio, which is much faster than jsdom and still works after scanning hundreds of sites. Perhaps you should try it, too. The only problem is that it doesn't have all the selectors you can use in jsdom.
Hope it works for you, too.
Daniel
with gulp, memory usage, cleanup, variable delete, window.close()
var gb = setInterval(function () {
//only call if memory use is above 200MB
if (process.memoryUsage().heapUsed > 200000000) {
global.gc();
}
}, 10000); // 10sec
gulp.task('tester', ['clean:raw2'], function() {
return gulp.src('./raw/*.html')
.pipe(logger())
.pipe(map(function(contents, filename) {
var doc = jsdom.jsdom(contents);
var window = doc.parentWindow;
var $ = jquery(window);
console.log( $('title').text() );
var html = window.document.documentElement.outerHTML;
$( doc ).ready(function() {
console.log( "document loaded" );
window.close();
});
return html;
}))
.pipe(gulp.dest('./raw2'))
.on('end', onEnd);
});
With this I had constantly between 200 MB and 300 MB of usage for 7k files. It took 30 minutes.
It might be helpful for someone, as I googled and didn't find anything helpful.
A work around for this is to run the jsdom related code in a forked child_process and send back the relevant results when done. then kill the child_process.
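A minimal sketch of that workaround. For brevity it uses child_process.execFileSync to run an inline script in a fresh node process, rather than fork with message passing, and the child does trivial string matching as a stand-in for the jsdom code. `runIsolated` and the child script are made up. Whatever memory the child allocated is returned to the OS when it exits.

```javascript
const { execFileSync } = require('child_process');

// Stand-in for the memory-hungry work; a real child script would load jsdom here.
const childScript = [
  "const html = process.argv[process.argv.length - 1];",
  "const match = html.match(new RegExp('<title>(.*?)</title>'));",
  "process.stdout.write(JSON.stringify({ title: match ? match[1] : null }));"
].join('\n');

// Run the script in a separate node process and parse the JSON it prints.
function runIsolated(html) {
  const output = execFileSync(process.execPath, ['-e', childScript, html], {
    encoding: 'utf8'
  });
  return JSON.parse(output);
}

console.log(runIsolated('<html><head><title>Example</title></head></html>'));
```

Spawning a process per document is expensive, so in practice one child would usually handle a batch of documents before it is replaced.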
