I am using the Node.js VM module to run untrusted code safely. I have noticed a huge memory leak that takes about 10 MB of memory on each execution and does not release it. Eventually, my node process ends up using 500 MB+ of memory. After some digging, I traced the problem to the constant creation of VMs. To test my theory, I commented out the code that creates the VMs. Sure enough, the memory usage dropped dramatically. I then uncommented the code again, placed global.gc() calls strategically around the problem areas, and ran node with the --expose-gc flag. This reduced my memory usage dramatically and retained the functionality.
Is there a better way of cleaning up VMs after I am done using them?
My next approach is to cache the VM containing the given unsafe code and reuse it if I see the unsafe code again (background: I am letting users write their own parsing functions for blocks of text, so the unsafe code may be executed frequently, or executed once and never seen again).
Some reference code.
async.each(items, function (i, cb) {
  // Initialize context...
  var context = vm.createContext(init);
  // Execute untrusted code
  var captured = vm.runInContext(parse, context);
  // This dramatically improves the usage, but isn't
  // part of the standard API
  // global.gc();
  // Return Result via a callback
  cb(null, captured);
});
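A minimal sketch of the caching idea mentioned above, assuming the untrusted source text itself is used as the cache key. The names scriptCache and runCached are hypothetical. Note that this only avoids recompiling the same source; it still creates a fresh context per run, so whether it reduces the memory growth described here is untested.

```javascript
const vm = require('node:vm');

// Hypothetical cache: untrusted source text -> compiled vm.Script.
const scriptCache = new Map();

function runCached(source, sandbox) {
  let script = scriptCache.get(source);
  if (!script) {
    // Compile once per distinct source string.
    script = new vm.Script(source);
    scriptCache.set(source, script);
  }
  // A fresh context per run keeps executions isolated from each other.
  const context = vm.createContext({ ...sandbox });
  return script.runInContext(context, { timeout: 1000 });
}

console.log(runCached('text.toUpperCase()', { text: 'abc' }));
```

Reusing the context as well would save more work, but state written by one run would then be visible to the next, which may not be acceptable for user-supplied code.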
If I understand this correctly, this was fixed in v5.9.0; see this PR. It appears that in such cases neither the Node core maintainers nor programmers can do much; we pretty much have to wait for an upstream fix in V8.
So no, you can't do anything more about it. Catching this bug was good though!
Related
I have a simple node app with 1 function that defines 1000+ functions inside it (without running them).
When I call this function (the wrapper) around 200 times, the RSS memory of the process spikes from 100MB to 1000MB and immediately goes down. (The memory spike only happens after around 200 calls; none of the calls before or after that point cause a memory spike.)
This issue is happening to us in our node server in production, and I was able to reproduce it in a simple node app here:
https://github.com/gileck/node-v8-memory-issue
When I use --jitless or --no-opt the issue does not happen (no spikes), but obviously we do not want to remove all the V8 optimizations in production.
This issue must be caused by some specific V8 optimization. I tried a few other V8 flags, but none of them fix the issue (only --jitless and --no-opt fix it).
Does anyone know which V8 optimization could cause this?
Update:
We found that --no-concurrent-recompilation fixes this issue (no memory spikes at all).
But still, we can't explain it.
We are not sure why it happens and which code changes might fix it (without the flag).
As one of the answers suggests, moving all the 1000+ function definitions out of the main function will solve it, but then those functions will not be able to access the context of the main function which is why they are defined inside it.
Imagine that you have a server and you want to handle a request.
Obviously, the request handler is going to run many times as the server gets a lot of requests from the client.
Would you define functions inside the request handler (so you can access the request context in those functions) or define them outside of the request handler and pass the request context as a parameter to all of them? We chose the first option... what do you think?
anyone knows which v8 optimization could cause this?
Load Elimination.
I guess it's fair to say that any optimization could cause lots of memory consumption in pathological cases (such as: a nearly 14 MB monster of a function as input, wow!), but Load Elimination is what causes it in this particular case.
You can see for yourself when you run with --turbo-stats (and optionally --turbo-filter=foo to zoom in on just that function).
You can disable Load Elimination if you feel that you must. A preferable approach would probably be to reorganize your code somewhat: defining 2,000 functions is totally fine, but the function defining all these other functions probably doesn't need to be run in a loop long enough until it gets optimized? You'll avoid not only this particular issue, but get better efficiency in general, if you define functions only once each.
There may or may not be room for improving Load Elimination in Turbofan to be more efficient for huge inputs; that's a longer investigation and I'm not sure it's worth it (compared to working on other things that likely show up more frequently in practice).
I do want to emphasize for any future readers of this that disabling optimization(s) is not generally a good rule of thumb for improving performance (or anything else), on the contrary; nor are any other "secret" flags needed to unlock "secret" performance: the default configuration is very carefully optimized to give you what's (usually) best. It's a very rare special case that a particular optimization pass interacts badly with a particular code pattern in an input function.
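The reorganization suggested above (define each function once and pass the context in, rather than redefining closures on every call) might look like the following sketch. The names formatGreeting, buildResponse and handleRequest are hypothetical and only stand in for the 1000+ real functions.

```javascript
// Helpers are defined once at module scope. The request context is passed
// in explicitly instead of being captured by closures created on every call.
function formatGreeting(ctx) {
  return `Hello, ${ctx.user}`;
}

function buildResponse(ctx) {
  return { status: 200, body: formatGreeting(ctx) };
}

// Hypothetical request handler: running it creates no new function objects.
function handleRequest(request) {
  const ctx = { user: request.user };
  return buildResponse(ctx);
}

console.log(handleRequest({ user: 'alice' }));
```

The trade-off is the one raised in the question: every helper needs the context as a parameter, but the wrapper stays small, so there is no huge function for the optimizing compiler to process.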
At startup, it seems my node.js app uses around 200MB of memory. If I leave it alone for a while, it shrinks to around 9MB.
Is it possible from within the app to:
Check how much memory the app is using?
Request the garbage collector to run?
The reason I ask is, I load a number of files from disk, which are processed temporarily. This probably causes the memory usage to spike. But I don't want to load more files until the GC runs, otherwise there is the risk that I will run out of memory.
Any suggestions ?
If you launch the node process with the --expose-gc flag, you can then call global.gc() to force node to run garbage collection. Keep in mind that all other execution within your node app is paused until GC completes, so don't use it too often or it will affect performance.
You might want to include a check when making GC calls from within your code so things don't go bad if node was run without the flag:
try {
  // global.gc only exists when node was started with --expose-gc
  if (global.gc) { global.gc(); }
} catch (e) {
  console.log("`node --expose-gc index.js`");
  process.exit();
}
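For the first part of the question (checking how much memory the app is using), process.memoryUsage() returns byte counts that can be inspected from within the app. A rough sketch, where memoryInMb and hasHeadroom are hypothetical helper names and the limit is whatever threshold suits your app:

```javascript
// Hypothetical helper: convert the byte counts from process.memoryUsage()
// to megabytes, rounded to one decimal place.
function memoryInMb() {
  const usage = process.memoryUsage();
  const mb = {};
  for (const [key, bytes] of Object.entries(usage)) {
    mb[key] = Math.round((bytes / 1024 / 1024) * 10) / 10;
  }
  return mb;
}

// Hypothetical check to run before loading another batch of files:
// true while the used heap is below the given limit (in MB).
function hasHeadroom(limitMb) {
  return memoryInMb().heapUsed < limitMb;
}

console.log(memoryInMb(), hasHeadroom(200));
```

This only looks at heapUsed; if the files are held in Buffers, the rss or external fields may be the more relevant ones to compare against.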
When you cannot pass the --expose-gc flag to your node process on start for any reason, you may try this:
import { setFlagsFromString } from 'v8';
import { runInNewContext } from 'vm';
setFlagsFromString('--expose_gc');
const gc = runInNewContext('gc'); // nocommit
gc();
Notes:
This worked for me in node 16.x
You may want to check process.memoryUsage() before and after running the gc
Use with care: Quote from the node docs v8.setFlagsFromString:
This method should be used with care. Changing settings after the VM has started may result in unpredictable behavior, including crashes and data loss; or it may simply do nothing.
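The second note above (checking process.memoryUsage() before and after running the gc) could be sketched like this. measureGc is a hypothetical helper; it returns null instead of throwing when gc is not exposed.

```javascript
// Hypothetical helper: report heapUsed before and after a forced collection.
// Returns null when global.gc is unavailable (node started without
// --expose-gc and no flag set at runtime).
function measureGc() {
  if (typeof global.gc !== 'function') {
    return null;
  }
  const before = process.memoryUsage().heapUsed;
  global.gc();
  const after = process.memoryUsage().heapUsed;
  return { before, after, freed: before - after };
}

console.log(measureGc());
```

The freed value can be negative or noisy for a single call, so it is probably more useful as a trend over several measurements than as an exact figure.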
One thing I would suggest, is that unless you need those files right at startup, try to load only when you need them.
EDIT: Refer to the post above.
I have researched a lot before posting this. This is a collection of all the things I have discovered about garbage collecting and at the end I'm asking for a better solution than the one I found.
Summary
I am hosting a Node.js app on Heroku. When a particular endpoint of my server is hit, which uses a lot of buffers for image manipulation (using sharp, but this is a buffer issue, not a sharp one), it takes very few requests for the buffers to occupy all the external and rss memory (I used process.memoryUsage() for diagnostics), because even though such variables have fallen out of scope or been set to null, they are never garbage collected. The outcome is that external and rss memory keep growing, and after a few requests my 512 MB dyno quota is reached and my dyno crashes.
Now, I have made a minimal reproducible example, which shows that simply declaring a new buffer within a function, and calling that function 10 times, results in the buffers never being garbage collected, even when the functions have finished executing.
I'm writing to find a better way to make sure Node garbage collects the unreferenced buffers and to understand why it doesn't do so by default. The only solution I have found now is to call global.gc().
NOTE
In the minimal reproducible example I simply use a buffer, no external libraries, and it is enough to recreate the issue I am having with sharp, because it's just an issue that Node.js buffers have.
Also note that what increases is the external memory and rss. The arrayBuffers, heapUsed and heapTotal values are not affected. I have not yet found a way to trigger the garbage collector when a certain threshold of external memory is used.
Finally, my Heroku server has been running, with no incoming requests, for up to 8 hours now, and the garbage collector hasn't yet cleared out the external memory and the RSS, so it is not a matter of waiting. The same holds true for the minimal reproducible example: even with timers, the garbage collector doesn't do its job.
Minimal reproducible example - garbage collection is not triggered
This is the snippet of code that logs out the memory used after each function call, where the external memory and RSS memory keep building up without being freed:
async function getPrintFile() {
  let buffer = Buffer.alloc(1000000);
  return
}
async function test() {
  for (let i = 0; i < 10; i++) {
    console.log(process.memoryUsage())
    await getPrintFile()
  }
}
test()
console.log(process.memoryUsage())
Below I will share the endless list of things that I have tried in order to make sure those buffers get garbage collected, but without succeeding. First I'll share the only working solution that is not optimal.
Minimal reproducible example - garbage collection is triggered through code
To make this work, I have to call global.gc() in two parts of the code. For some weird reason, that I hope someone could explain to me, if I call global.gc() only at the end of the function that creates the buffer, or only just after calling that function, it won't garbage collect. However, if I call global.gc() from both places, it will.
This is the only solution that has worked for me so far, but obviously it is not ideal, as global.gc() is blocking.
async function getPrintFile() {
  let buffer = Buffer.alloc(1000000);
  global.gc()
  return
}
async function test() {
  for (let i = 0; i < 10; i++) {
    console.log(process.memoryUsage())
    await getPrintFile()
    global.gc()
  }
}
test()
console.log(process.memoryUsage())
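To make the two variants easier to compare, the fields of interest (rss, external, arrayBuffers) can be reported as a difference instead of full process.memoryUsage() dumps. A rough sketch, where measureGrowth is a hypothetical helper and the numbers it prints will vary from run to run:

```javascript
// Hypothetical helper: run fn the given number of times and report how much
// the rss, external and arrayBuffers values changed, in bytes.
function measureGrowth(fn, iterations) {
  const start = process.memoryUsage();
  for (let i = 0; i < iterations; i++) {
    fn();
  }
  const end = process.memoryUsage();
  return {
    rss: end.rss - start.rss,
    external: end.external - start.external,
    arrayBuffers: end.arrayBuffers - start.arrayBuffers,
  };
}

// Same allocation as in the example above: a 1,000,000-byte buffer per call.
const growth = measureGrowth(() => { Buffer.alloc(1000000); }, 10);
console.log(growth);
```

Calling global.gc() inside or after fn (when node runs with --expose-gc) and comparing the two outputs would show the same effect as the snippets above in a more compact form.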
What I have tried
I tried setting the buffers to null. Logically, if they are not referenced anymore, they should be garbage collected, but the garbage collector is apparently very lazy.
I tried "delete buffer", and looked for a way to resize or reallocate buffer memory, but apparently that doesn't exist in Node.js.
I tried buffer.fill(0), but that simply fills the buffer with zeros; it doesn't resize it.
I installed a memory allocator (jemalloc) on my Heroku server, following this guide: Jemalloc Heroku Buildpack, but it was pointless.
I ran my script with node --max-old-space-size=4 index.js, but again it was pointless; it didn't work even with the space size set to 4 MB.
I thought maybe it was because the functions were asynchronous, or because I was using a loop. Nope: I wrote 5 different versions of that snippet, and each and every one of them had the same issue of the external memory growing like crazy.
Questions
By any remote chance, is there something super easy I'm missing, like a keyword, a function to use, that would easily sort this out? Or does anyone have anything that has worked for them so far? A library, a snippet, anything?
Why the hell do I have to call global.gc() TWO TIMES, from within and outside the function, for the garbage collector to work, and why is once not enough??
Why is it that the garbage collector for Buffers in Node.js is such dog s**t?
How is this not an issue? Literally, every single buffer ever declared on a running application NEVER gets garbage collected, and there is no way to find out if you are using buffers on your laptop because of the big memory, but as soon as you upload a snippet online, then by the time you realise it's probably too late. What's up with that?
I hope someone can give me a hand, as running global.gc() twice for each n type of request is not very efficient, and I'm not sure what repercussions it might have on my code.
I have some code in a library that has in the past leaked badly, and I would like to add regression tests to avoid that in the future. I understand how to find memory leaks manually, by looking at memory usage profiles or Valgrind, but I have had trouble writing automatic tests for them.
I tried using global.gc() followed by process.memoryUsage() after running the operation I was checking for leaks, then doing this repeatedly to try to establish a linear relationship between number of operations and memory usage, but there seems to be noise in the memory usage numbers that makes this hard to measure accurately.
So, my question is this: is there an effective way to write a test in Node that consistently fails when an operation leaks memory, and passes when it does not leak memory?
One wrinkle that I should mention is that the memory leaks were occurring in a C++ addon, and some of the leaked memory was not managed by the Node VM, so I was measuring process.memoryUsage().rss.
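The measurement described above (force a collection, sample memory, and look for a linear trend against the number of operations) could be sketched roughly as follows. The helper names slope and sampleRss are hypothetical, and any pass/fail threshold on the slope would be an assumption to tune per project.

```javascript
// Least-squares slope of the samples against their index (0, 1, 2, ...).
// Needs at least two samples.
function slope(samples) {
  const n = samples.length;
  const meanX = (n - 1) / 2;
  const meanY = samples.reduce((sum, y) => sum + y, 0) / n;
  let numerator = 0;
  let denominator = 0;
  samples.forEach((y, x) => {
    numerator += (x - meanX) * (y - meanY);
    denominator += (x - meanX) ** 2;
  });
  return numerator / denominator;
}

// Hypothetical sampler: run the operation in rounds and record rss after
// each round, forcing a collection first when global.gc is exposed.
function sampleRss(operation, rounds, opsPerRound) {
  const samples = [];
  for (let round = 0; round < rounds; round++) {
    for (let i = 0; i < opsPerRound; i++) {
      operation();
    }
    if (typeof global.gc === 'function') {
      global.gc();
    }
    samples.push(process.memoryUsage().rss);
  }
  return samples;
}

const samples = sampleRss(() => JSON.parse('{"a":1}'), 5, 1000);
console.log('bytes per round:', slope(samples));
```

A test could then assert that the slope stays below some byte budget per round. Using many operations per round and discarding the first few rounds as warm-up may help with the noise mentioned above, but that is a guess, not something verified here.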
Automating and logging information to test for memory leaks in node js.
There is a great module called memwatch-next.
npm install --save memwatch-next
Add to app.js:
const memwatch = require('memwatch-next');
// ...
memwatch.on('leak', (info) => {
// Some logging code...
console.error('Memory leak detected:\n', info);
});
This will allow you to automatically measure if there is a memory leak.
Now to put it to a test:
A good tool for this is Apache JMeter. More information here.
If you are using HTTP, you can use JMeter to soak test the application's endpoints.
Soak testing is done to verify a system's stability and performance characteristics over an extended period of time; it's good when you are looking for memory leaks, connection leaks, etc.
Continuous integration software:
If you are using continuous integration software like Jenkins, you can create a Jenkins job to do this for you prior to deployment to production. It will test the application with the parameters provided, and after the test it will either deploy the application or report that there is a memory leak (depending on your Jenkins job configuration).
Hope it helps, update me on how it goes;
Good luck,
Given some arbitrary program, is it always possible to determine if it will ever terminate? The halting problem describes this. Consider the following program:
function collatz(n){
if(n==1)
return;
if(n%2==0)
return collatz(n/2);
else
return collatz(3*n+1);
}
The same idea can be applied to data in memory. It's not always possible to identify what memory isn't needed anymore and can thus be garbage collected. There is also the case of the program being designed to consume a lot of memory in some situation. The only known option is coming up with some heuristic like you have done, but it will most likely result in false positives and negatives. It may be easier to determine the root cause of the leak so it can be corrected.
For over a month I've been struggling with a very annoying memory leak issue and I have no clue how to solve it.
I'm writing a general purpose web crawler based on: http, async, cheerio and nano. From the very beginning I've been struggling with memory leak which was very difficult to isolate.
I know it's possible to do a heapdump and analyse it with Google Chrome but I can't understand the output. It's usually a bunch of meaningless strings and objects leading to some anonymous functions telling me exactly nothing (it might be lack of experience on my side).
Eventually I came to a conclusion that the library I had been using at the time (jQuery) had issues and I replaced it with Cheerio. I had an impression that Cheerio solved the problem but now I'm sure it only made it less dramatic.
You can find my code at: https://github.com/lukaszkujawa/node-web-crawler. I understand it might be a lot of code to analyse, but perhaps I'm doing something stupid which will be obvious straight away. I suspect the main agent class, which does HTTP requests https://github.com/lukaszkujawa/node-web-crawler/blob/master/webcrawler/agent.js from multiple "threads" (with async.queue).
If you would like to run the code it requires CouchDB and after npm install do:
$ node crawler.js -c conf.example.json
I know that Node doesn't go crazy with garbage collection but after 10min of heavy crawling used memory can go easily over 1GB.
(tested with v0.10.21 and v0.10.22)
For what it's worth, Node's memory usage will grow and grow even if your actual used memory isn't very large. This is for optimization on behalf of the V8 engine. To see your real memory usage (to determine if there is actually a memory leak) consider dropping this code (or something like it) into your application:
setInterval(function () {
if (typeof gc === 'function') {
gc();
}
applog.debug('Memory Usage', process.memoryUsage());
}, 60000);
Run node --expose-gc yourApp.js. Every minute there will be a log line indicating real memory usage immediately after a forced garbage collection. I've found that watching the output of this over time is a good way to determine if there is a leak.
If you do find a leak, the best way I've found to debug it is to eliminate large sections of your code at a time. If the leak goes away, put it back and eliminate a smaller section of it. Use this method to narrow it down to where the problem is occurring. Closures are a common source, but also check for anywhere else references may not be cleaned up. Many network applications will attach handlers for sockets that aren't immediately destroyed.
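As an illustration of the last point (handlers that keep references alive), here is a small sketch using a plain EventEmitter as a stand-in for a socket. The attach helper is hypothetical; the idea is simply that a handler's closure keeps its captured data reachable until the handler is removed.

```javascript
const { EventEmitter } = require('node:events');

// Stand-in for a long-lived socket or connection object.
const source = new EventEmitter();

// Hypothetical helper: attaches a handler whose closure captures a large
// buffer, and returns a function that removes the handler again.
function attach(emitter) {
  const bigState = Buffer.alloc(1024 * 1024);
  const onData = () => bigState.length;
  emitter.on('data', onData);
  return () => emitter.removeListener('data', onData);
}

const detach = attach(source);
console.log(source.listenerCount('data')); // 1: handler (and buffer) still referenced

detach();
console.log(source.listenerCount('data')); // 0: nothing on the emitter references them now
```

In a crawler, the equivalent would be making sure per-request handlers are removed (or the request objects destroyed) once a response has been processed.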