To understand the memory usage pattern of Node.js's V8 engine, I wrote a simple web server as shown below:
contents of server.js:
var http = require("http");
var server = http.createServer(function(req, res) {
    res.write("Hello World");
    res.end();
});
server.listen(3000);
When the program is launched using node server.js, the initial memory snapshot is as below:
After I kept making repeated URL hits to this server, I could see a pattern of increasing heap usage. To be more precise, for every 6 or 7 hits there was an increase of about 4 KB. I kept repeating the hits continuously for about 2 minutes, and then took this snapshot.
I didn't see any eventual decrease in heap usage, even when I left the server idle with no load.
My question is:
Is this normal behavior, or is there a memory leak in Node.js?
Or am I misunderstanding or misinterpreting the numbers?
Node uses V8 under the hood, so the answer to this question most likely applies:
How does V8 manage its heap?
The code appears to be valid, so to test it you could write a small application that repeatedly calls your API and then examine Node's memory while it runs. The approach described here can help detect a possible leak (it flags one if usage keeps growing over 5 consecutive runs of the garbage collector): http://www.nearform.com/nodecrunch/self-detect-memory-leak-node/
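As a rough illustration (not taken from the linked article), a small driver like the one below can hammer the server from the question and print process.memoryUsage() as it goes; the port and the hit interval are assumptions based on the code above.

var http = require("http");

var hits = 0;

// Hit the local server in a tight loop and report heap usage periodically
function hit() {
    http.get("http://localhost:3000/", function (res) {
        res.resume(); // drain the response so the connection can be reused
        hits++;
        if (hits % 100 === 0) {
            console.log(hits + " hits", process.memoryUsage());
        }
        setTimeout(hit, 10);
    }).on("error", function (err) {
        console.error(err);
    });
}

hit();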
Related
At startup, it seems my node.js app uses around 200MB of memory. If I leave it alone for a while, it shrinks to around 9MB.
Is it possible from within the app to:
Check how much memory the app is using?
Request the garbage collector to run?
The reason I ask is, I load a number of files from disk, which are processed temporarily. This probably causes the memory usage to spike. But I don't want to load more files until the GC runs, otherwise there is the risk that I will run out of memory.
Any suggestions ?
If you launch the node process with the --expose-gc flag, you can then call global.gc() to force node to run garbage collection. Keep in mind that all other execution within your node app is paused until GC completes, so don't use it too often or it will affect performance.
You might want to include a check when making GC calls from within your code so things don't go bad if node was run without the flag:
if (global.gc) {
    global.gc();
} else {
    // global.gc is only defined when node was started with --expose-gc
    console.log('Garbage collection unavailable. Launch node with the flag, e.g. `node --expose-gc index.js`.');
}
If for some reason you cannot pass the --expose-gc flag when starting your node process, you may try this:
import { setFlagsFromString } from 'v8';
import { runInNewContext } from 'vm';

// Turn the gc() hook on at runtime, then grab a reference to it from a fresh VM context
setFlagsFromString('--expose_gc');
const gc = runInNewContext('gc');
gc();
Notes:
This worked for me in node 16.x
You may want to check process.memoryUsage() before and after running the gc (see the sketch after these notes)
Use with care. Quoting the Node docs for v8.setFlagsFromString:
This method should be used with care. Changing settings after the VM has started may result in unpredictable behavior, including crashes and data loss; or it may simply do nothing.
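For example, a quick before-and-after check might look like this; it assumes the gc reference obtained above, and the exact numbers will of course vary:

// Compare heap usage immediately before and after a forced collection
const before = process.memoryUsage().heapUsed;
gc();
const after = process.memoryUsage().heapUsed;
console.log(`heapUsed: ${before} -> ${after} (freed ${before - after} bytes)`);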
One thing I would suggest: unless you need those files right at startup, try to load each one only when you actually need it, as in the sketch below.
EDIT: Refer to the post above.
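A minimal sketch of the load-on-demand suggestion above; the processing step here is only a placeholder, since the question doesn't say what is done with the files:

var fs = require('fs');

// Read a file right before it is needed; once this function returns,
// `contents` becomes unreachable and can be reclaimed by the next GC run.
function processFile(filePath) {
    var contents = fs.readFileSync(filePath, 'utf8');
    return contents.split('\n').length; // placeholder "processing": count lines
}

console.log(processFile(__filename), 'lines');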
I am using node and am considering manually running garbage collection in node. Are there any drawbacks to this? The reason I am doing this is that it looks like node is not running garbage collection frequently enough. Does anyone know how often V8 does its garbage collection routine in node?
Thanks!
I actually had the same problem running node on Heroku with 1GB instances.
When running the node server on production traffic, the memory would grow constantly until it exceeded the memory limit, which caused it to run slowly.
This was probably caused by the app generating a lot of garbage, since it mostly serves JSON API responses. But it wasn't a memory leak, just uncollected garbage.
It seems that node doesn't prioritize running enough garbage collections on the old object space for my app, so memory would grow constantly.
Running global.gc() manually (enabled with node --expose_gc) would reduce memory usage by 50MB every time and would pause the app for about 400ms.
What I ended up doing is running gc manually on a randomized schedule (so that heroku instances wouldn't do GC all at once). This decreased the memory usage and stopped the memory quota exceeded errors.
A simplified version would be something like this:
function scheduleGc() {
    if (!global.gc) {
        console.log('Garbage collection is not exposed');
        return;
    }

    // schedule next gc within a random interval (e.g. 15-45 minutes)
    // tweak this based on your app's memory usage
    var nextMinutes = Math.random() * 30 + 15;

    setTimeout(function () {
        global.gc();
        console.log('Manual gc', process.memoryUsage());
        scheduleGc();
    }, nextMinutes * 60 * 1000);
}
// call this in the startup script of your app (once per process)
scheduleGc();
You need to run your app with garbage collection exposed:
node --expose_gc app.js
I know this may be a bit of a tardy reply to help the OP, but I thought I would share my recent experience with Node.js memory allocation and garbage collection.
We are currently working on a Node.js server running on a Raspberry Pi 3. Every so often it would crash due to running out of memory. I initially thought this was a memory leak, and after a week and a half of searching through my code and coming up with nothing, I started to suspect that the problem was instead that Node.js allocates more memory than is available on the RPi 3 for its processes before it runs the GC.
I have been running new instances of my server with the following commands:
node --max-executable-size=96 --max-old-space-size=128 --max-semi-space-size=2 server.js
This effectively limits the total amount of memory that node is allowed to take up on the local machine and forces garbage collections to run more frequently. Thus far we are seeing stable memory usage, which confirms to me that my code was not leaking after all; rather, node was allocating more memory than the device could spare.
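To confirm that such limits actually took effect, one option (not part of the original setup, and it requires a Node version that ships the v8 core module) is to print V8's heap statistics at startup; heap_size_limit should roughly reflect the old-space setting:

var v8 = require('v8');

// Report the configured heap ceiling and the current usage, in MB
var stats = v8.getHeapStatistics();
console.log('heap_size_limit:', (stats.heap_size_limit / 1048576).toFixed(1), 'MB');
console.log('used_heap_size: ', (stats.used_heap_size / 1048576).toFixed(1), 'MB');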
EDIT: These links outline in more specific terms the issue I was dealing with:
- nodejs decrease v8 garbage collector memory usage
- https://github.com/nodejs/node/issues/2738
V8 runs garbage collection when it thinks it's useful; there is no fixed interval. You can read this article to learn about garbage collection in V8: https://strongloop.com/strongblog/node-js-performance-garbage-collection/
In any case, it's a bad idea to run the garbage collector manually in your project, because it completely blocks the node process. During garbage collection, your program won't handle any requests.
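If you do experiment with forcing a collection anyway, it is easy to see how long that pause actually is; a small sketch (it assumes node was started with --expose-gc):

// Measure how long a forced full collection blocks the event loop
if (global.gc) {
    var start = process.hrtime();
    global.gc();
    var diff = process.hrtime(start);
    console.log('gc pause: ' + (diff[0] * 1e3 + diff[1] / 1e6).toFixed(1) + ' ms');
} else {
    console.log('Run node with --expose-gc to enable global.gc()');
}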
I've set up a simple loop to poll an IronMQ messaging system, and everything works fine... except that memory usage increases more and more until it finally stabilizes at over 250MB. I've read that it's normal for Node to use more memory over time when run in a (sort of) recursive loop like this, even when running setTimeout and doing nothing, but I still don't understand the exact mechanics behind this behavior, or whether there is any way to control it. When making HTTP requests within the loop, memory usage more than doubles.
The code is running on a Heroku worker with a limit of 512MB RAM, leaving no breathing room to use cluster to use the rest of the available CPU cores. The memory usage can increase slowly or extremely quickly, depending on the jobs that run after receiving the messages.
This is the simplest code that reproduces this.
var request = require('request');

(function loop() {
    request.get('http://www.example.com', function (err, response, body) {
        if (err) console.log(err);
        setTimeout(loop, 200);
    });
})();
I've tried many, many ways of restructuring this code to prevent memory from increasing so high, but nothing has made any changes. Only the received HTTP response seems to have any effect on the upper limit of RAM used.
Is there a way to rewrite this entirely, or am I stuck with V8's behavior? All examples I've found use the same basic structure for infinite async loops, from kue to the async library.
I am using the Node.js vm module to run untrusted code safely. I have noticed a huge memory leak that takes about 10M of memory on each execution and does not release it. Eventually, my node process ends up using 500M+ of memory. After some digging, I traced the problem to the constant creation of VMs. To test my theory, I commented out the code that creates the VMs. Sure enough, the memory usage dropped dramatically. I then uncommented the code again, placed global.gc() calls strategically around the problem areas, and ran node with the --expose-gc flag. This reduced my memory usage dramatically and retained the functionality.
Is there a better way of cleaning up VMs after I am done using them?
My next approach is to cache the vm containing the given unsafe code and reuse it if I see the same unsafe code again (sketched below, after the reference code). (Background: I am letting users write their own parsing functions for blocks of text, so a given piece of unsafe code may be executed frequently, or executed once and never seen again.)
Some reference code.
async.each(items, function (i, cb) {
    // Initialize context...
    var context = vm.createContext(init);

    // Execute untrusted code
    var captured = vm.runInContext(parse, context);

    // This dramatically improves the usage, but isn't
    // part of the standard API
    // global.gc();

    // Return result via a callback
    cb(null, captured);
});
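A minimal sketch of the caching idea mentioned above, keyed on the source of the user-supplied parser; the use of vm.Script and the cache shape are assumptions, not part of the original code:

var vm = require('vm');

// Cache compiled scripts by their source so the same user parser is not
// recompiled on every execution, avoiding a fresh VM per run of identical code.
var scriptCache = {};

function runUserParser(parseSource, init) {
    var script = scriptCache[parseSource];
    if (!script) {
        script = new vm.Script(parseSource);
        scriptCache[parseSource] = script;
    }
    // A fresh context per run keeps callers isolated from each other's state
    var context = vm.createContext(init);
    return script.runInContext(context);
}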
If I see this right, this was fixed in v5.9.0; see this PR. It appears that in cases like this neither the node core maintainers nor application programmers can do much; we pretty much have to wait for an upstream fix in V8.
So no, you can't do anything more about it. Catching this bug was good though!
For over a month I've been struggling with a very annoying memory leak and I have no clue how to solve it.
I'm writing a general purpose web crawler based on: http, async, cheerio and nano. From the very beginning I've been struggling with a memory leak which was very difficult to isolate.
I know it's possible to do a heapdump and analyse it with Google Chrome, but I can't understand the output. It's usually a bunch of meaningless strings and objects leading to some anonymous functions that tell me exactly nothing (it might be a lack of experience on my side).
Eventually I came to the conclusion that the library I had been using at the time (jQuery) had issues, and I replaced it with Cheerio. I had the impression that Cheerio solved the problem, but now I'm sure it only made it less dramatic.
You can find my code at: https://github.com/lukaszkujawa/node-web-crawler. I understand it might be a lot of code to analyse, but perhaps I'm doing something stupid which might be obvious straight away. I'm suspecting the main agent class which does HTTP requests, https://github.com/lukaszkujawa/node-web-crawler/blob/master/webcrawler/agent.js, from multiple "threads" (with async.queue).
If you would like to run the code it requires CouchDB and after npm install do:
$ node crawler.js -c conf.example.json
I know that Node doesn't go crazy with garbage collection, but after 10 minutes of heavy crawling the used memory can easily go over 1GB.
(tested with v0.10.21 and v0.10.22)
For what it's worth, Node's memory usage will grow and grow even if your actual used memory isn't very large. This is an optimization on the part of the V8 engine. To see your real memory usage (and to determine whether there is actually a memory leak), consider dropping this code (or something like it) into your application:
setInterval(function () {
    // gc() is only defined when node is run with --expose-gc
    if (typeof gc === 'function') {
        gc();
    }
    // log real usage right after a forced collection (use your own logger if you have one)
    console.log('Memory Usage', process.memoryUsage());
}, 60000);
Run node --expose-gc yourApp.js. Every minute there will be a log line indicating real memory usage immediately after a forced garbage collection. I've found that watching the output of this over time is a good way to determine if there is a leak.
If you do find a leak, the best way I've found to debug it is to eliminate large sections of your code at a time. If the leak goes away, put it back and eliminate a smaller section of it. Use this method to narrow it down to where the problem is occurring. Closures are a common source, but also check for anywhere else references may not be cleaned up. Many network applications will attach handlers for sockets that aren't immediately destroyed.
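As one concrete illustration of that last point (a sketch, not tied to any code above): a listener that captures large per-connection state keeps that state alive for as long as anything still references the socket, so detach handlers when the connection goes away:

var net = require('net');

var server = net.createServer(function (socket) {
    var bigBuffer = Buffer.alloc(10 * 1024 * 1024); // large per-connection state

    function onData(chunk) {
        // ... use bigBuffer while the connection is alive ...
    }

    socket.on('data', onData);

    // Without this cleanup, the closure over bigBuffer stays reachable
    // for as long as the socket object itself is referenced anywhere.
    socket.on('close', function () {
        socket.removeListener('data', onData);
        bigBuffer = null;
    });
});

server.listen(0); // ephemeral port, just for the example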