I'm stuck on a memory leak problem in JS.
JavaScript:
var index = 0;

function leak() {
    console.log(index);
    index++;
    setTimeout(leak, 0);
}

leak();
Here is my test code. I use Instruments.app to monitor its memory use, and the memory grows very fast.
I'm confused, because no variables seem to be occupying the memory.
Why?
Any thoughts are appreciated.
Your code creates a chain of closures, which prevents the memory from being released. In your example the memory will only be released after all of the timeouts have completed.
This can be seen (after 100 seconds):
var index = 0;
var timeout;

function leak() {
    index++;
    timeout = setTimeout(leak, 0);
}

leak();

setTimeout(function() {
    clearTimeout(timeout);
}, 100000);

setInterval(function() {
    console.log(process.memoryUsage());
}, 2000);
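To see the reclamation explicitly rather than waiting for the collector, here is a small variation of the snippet above (a sketch; it assumes Node.js started with --expose-gc) that forces a collection once the chain is cut:

setTimeout(function() {
    clearTimeout(timeout);      // cut the chain of pending timeouts
    if (global.gc) global.gc(); // force a collection (requires --expose-gc)
    console.log(process.memoryUsage().heapUsed); // back near the baseline
}, 100000);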
Related
The following code increases memory usage until it crashes:
const httpContext = require('express-http-context');

async function t2() {
}

async function t1() {
    for (let i = 0; i < 100000000; i++) {
        httpContext.ns.run(t2);
    }
}

t1();
Run it with: node --inspect --max-old-space-size=300 ns
The problem: the namespace's _contexts map is never cleaned up.
There is a destroy(id) function inside cls-hooked/context.js, but it is never called.
I also tried ns.bind and ns.runPromise (which does an ns.exit()).
How can I delete the contexts after a run has finished?
The code:
const httpContext = require('express-http-context');

function t2() {
}

async function t1() {
    for (let i = 0; i < 100000000; i++) {
        httpContext.ns.run(t2);
    }
}

t1();
works.
The code:
const httpContext = require('express-http-context');

async function t3() {
}

function t2() {
    t3();
}

async function t1() {
    for (let i = 0; i < 100000000; i++) {
        httpContext.ns.run(t2);
    }
}

t1();
has the memory leak again.
The cls-hooked async_hook method init() adds the context to the _contexts map.
The cls-hooked async_hook method destroy() deletes the context from the _contexts map.
The problem is that destroy is never called.
Is this a bug in cls-hooked or an incompatibility with the current async_hooks?
As pointed out to OP, the usage is definitely incorrect.
OP should only execute ns.run() once; everything within that run will share the same context.
Look at this example of proper usage:
var createNamespace = require('cls-hooked').createNamespace;
var writer = createNamespace('writer');

writer.run(function () {
    writer.set('value', 0);
    requestHandler();
});

function requestHandler() {
    writer.run(function (outer) {
        // writer.get('value') returns 0
        // outer.value is 0
        writer.set('value', 1);
        // writer.get('value') returns 1
        // outer.value is 1
        process.nextTick(function () {
            // writer.get('value') returns 1
            // outer.value is 1
            writer.run(function (inner) {
                // writer.get('value') returns 1
                // outer.value is 1
                // inner.value is 1
                writer.set('value', 2);
                // writer.get('value') returns 2
                // outer.value is 1
                // inner.value is 2
            });
        });
    });

    setTimeout(function () {
        // runs with the default context, because nested contexts have ended
        console.log(writer.get('value')); // prints 0
    }, 1000);
}
Furthermore, the implementation inside cls-hooked does show that the context is destroyed via the async hook callback destroy(asyncId). Quoting the Node.js async_hooks documentation:
destroy(asyncId) is called after the resource corresponding to asyncId is destroyed. It is also called asynchronously from the embedder API emitDestroy(). Some resources depend on garbage collection for cleanup, so if a reference is made to the resource object passed to init it is possible that destroy will never be called, causing a memory leak in the application. If the resource does not depend on garbage collection, then this will not be an issue.
https://github.com/Jeff-Lewis/cls-hooked/blob/0ff594bf6b2edd6fb046b10b67363c3213e4726c/context.js#L416-L425
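For reference, here is a minimal sketch of the async_hooks mechanism that cls-hooked builds on, in plain Node.js (no cls-hooked involved), showing where init() and destroy() fire:

const async_hooks = require('async_hooks');
const fs = require('fs');

// fs.writeSync is used for logging because console.log is itself
// asynchronous and would re-enter the hooks.
const hook = async_hooks.createHook({
    init(asyncId, type) {
        fs.writeSync(1, 'init    ' + type + ' ' + asyncId + '\n');
    },
    destroy(asyncId) {
        fs.writeSync(1, 'destroy ' + asyncId + '\n');
    }
});
hook.enable();

// A Timeout resource triggers init() now and destroy() after it has run.
setTimeout(() => {}, 10);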
Here is my repo for comparison and test runs of memory usage, bombarding the server with tonnes of requests using autocannon:
https://github.com/Darkripper214/AsyncMemoryTest
Based on the test, there is only a negligible increase in heap utilization (as expected, since we're processing HTTP requests).
Memory Utilization of CLS-Hooked & Async-Hook
Purpose
The repository is a miniature test to see how memory is utilized when using cls-hooked and async-hook to pass context within Node.js.
Usage
npm run start for the CLS-hooked server, or npm run async for the Async-hook server
Go to Chrome and paste chrome://inspect
Click inspect to access the server's Dev Tools
Go to the Memory tab; you may take snapshots and inspect the heap before, during, and after bombarding the server with requests
node benchmark.js to start bombarding the server with requests. This is powered by autocannon; you may want to increase connections or duration to see the difference (a minimal sketch follows).
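For orientation, here is a minimal autocannon invocation along the lines of what benchmark.js does (the URL, connections, and duration are placeholders, not the repo's actual values):

const autocannon = require('autocannon');

autocannon({
    url: 'http://localhost:3000', // placeholder; point this at the running server
    connections: 10,              // increase to raise the pressure
    duration: 15                  // seconds
}, (err, result) => {
    if (err) throw err;
    console.log(result.requests); // per-second request statistics
});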
Results
CLS-hooked
Stat        1%      2.5%    50%     97.5%   Avg     Stdev   Max
Req/Sec     839     839     871     897     870.74  14.23   839
Bytes/Sec   237kB   237kB   246kB   253kB   246kB   4.01kB  237kB
Req/Bytes counts sampled once per second. (Note that this is run with the debugger attached; performance per second is impacted.)
13k requests in 15.05s, 3.68 MB read
Async-Hook
Stat        1%      2.5%    50%     97.5%   Avg     Stdev   Max
Req/Sec     300     300     347     400     346.4   31.35   300
Bytes/Sec   84.6kB  84.6kB  97.9kB  113kB   97.7kB  8.84kB  84.6kB
Req/Bytes counts sampled once per second. (Note that this is run with the debugger attached and plenty of debug() messages to show how the store is destroyed; performance per second is impacted.)
5k requests in 15.15s, 1.47 MB read
Edit 1
OP is complaining about the length of _contexts, which grows every time namespace.run() is executed. As highlighted earlier, the way OP is testing is not correct, as it is running in a loop.
The scenario OP is complaining about will only occur when namespace.run() executes a callback that is, or contains, an async function.
async function t3() {} // this async function causes the _contexts length to not be cleared

function t2() {
    t3();
}

function t1() {
    for (let i = 0; i < 500; i++) {
        session.run(t2);
    }
}

t1();
So why is _contexts not cleared? Because the async function t3 never gets a chance to complete on the Node.js event loop while the synchronous for loop is running, hence the near-unbounded appending of items to _contexts.
To prove that this behavior is the cause, I've updated the repo to include a file cls-gc.js that can be run with npm run gc. It explicitly runs garbage collection in between, and garbage collection does not affect the length of _contexts.
The length of _contexts will be large during the execution of t1() and t5(), as both are synchronous. However, it will be back to normal right after the setTimeout callback is called. Please use the debugger to check this.
The length of _contexts is available on the session object.
// process.env.DEBUG_CLS_HOOKED = true;
'use strict';

let createNamespace = require('cls-hooked').createNamespace;
let session = createNamespace('benchmark');

async function t3() {}

function t2() {
    t3();
}

function t1() {
    for (let i = 0; i < 500; i++) {
        session.run(t2);
        try {
            if (global.gc) {
                global.gc();
                console.log('garbage collection ran');
            }
        } catch (e) {
            console.log('`node --expose-gc index.js`');
            process.exit();
        }
    }
}

t1();

function t5() {
    for (let i = 0; i < 1000; i++) {
        // Check _contexts here; it should have a length of at least 500
        session.run(t2);
        try {
            if (global.gc) {
                global.gc();
                console.log('garbage collection ran');
            }
        } catch (e) {
            console.log('`node --expose-gc index.js`');
            process.exit();
        }
    }
}

t5();

setTimeout(() => {
    console.log('here');
    // Check _contexts here; the length should be 0
    session.run(t2);
}, 3000);
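To tie this back to OP's loop: here is a hedged sketch of the same workload with the iterations deferred via setImmediate, so the event loop (and the GC-driven destroy() hooks) get a chance to run between iterations and _contexts can be pruned. t2/t3 are the same toy functions as above; this is an illustration of the behavior, not code from the repo:

const createNamespace = require('cls-hooked').createNamespace;
const session = createNamespace('benchmark-deferred');

async function t3() {}
function t2() { t3(); }

let i = 0;
function step() {
    if (i++ >= 500) return;
    session.run(t2);
    // Yield to the event loop so pending destroy() callbacks can run
    // and session._contexts does not grow without bound.
    setImmediate(step);
}
step();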
I am trying to compare the read performance of a library called Memored to regular RAM variables in Node.js.
I expected the data stored with Memored to be at least slightly slower to read than RAM storage, but the results show the opposite (see below for my outputs).
I am running this in the terminal of Visual Studio Code on Windows 10. It's all written in TypeScript, which gets compiled down to JavaScript and then run with the node command.
This is my RAM test:
var normalRAM = {
    firstname: 'qwe',
    lastname: 'fsa'
};

var s = process.hrtime();  // start timer
console.log(normalRAM);    // read from RAM
var e = process.hrtime(s); // stop timer

console.log("end0", e[0]); // result in seconds
console.log("end1", e[1]); // result in nanoseconds
This is my Memored test:
// Assumed imports (not shown in the original snippet):
import * as cluster from 'cluster';
const memored = require('memored');

// Clustering needed to show Memored in action
if (cluster.isMaster) {
    // Fork workers.
    for (let i = 0; i < 1; i++) {
        cluster.fork();
    }
} else {
    var han = {
        firstname: 'Han',
        lastname: 'Solo'
    };

    // Store and read
    memored.store('character1', han, function () {
        console.log('Value stored!');
        var hrstart = process.hrtime(); // start timer
        memored.read('character1', function (err: any, value: any) {
            var hrend = process.hrtime(hrstart); // stop timer
            console.log('Read value:', value);
            console.log("hrend0", hrend[0]); // result in seconds
            console.log("hrend1", hrend[1]); // result in nanoseconds
        });
    });
}
The results:
The RAM read speeds are around 6,500,000 nanoseconds.
The Memored read speeds are around 1,000,000 nanoseconds.
Am I testing the speeds incorrectly here? What are the flaws in my methodology? Perhaps my initial assumption is wrong?
I switched the following two lines:
var hrend = process.hrtime(hrstart) // stop timer
console.log('Read value:', value);
To this:
console.log('Read value:', value);
var hrend = process.hrtime(hrstart) // stop timer
This makes more sense for a real scenario, since I would need to read the value like that anyway after the data is returned. The answer to my question is probably: the Memored test appeared faster because it only measured the time until the data came back for my callback to use, not the time to actually read it from the value variable. Meanwhile, the RAM test was timing console.log itself, which is comparatively slow terminal I/O.
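For what it's worth, a fairer version of the RAM test keeps console.log out of the timed section entirely (a sketch, not the original test):

var normalRAM = {
    firstname: 'qwe',
    lastname: 'fsa'
};

var s = process.hrtime();        // start timer
var first = normalRAM.firstname; // the actual RAM read being measured
var e = process.hrtime(s);       // stop timer

console.log('Read value:', first); // log *after* the measurement
console.log('seconds:', e[0], 'nanoseconds:', e[1]);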
I am creating a sailsJS webserver with a background task that needs to run continuously (when the server is otherwise idle) - a task to synchronize a database with some external data and pre-cache data to speed up requests.
I am using Sails version 1.0. The adapter is postgresql (adapter: 'sails-postgresql'), adapter version 1.0.0-12.
While running this application I noticed a major problem: after some time the application inexplicably crashes with an out-of-heap-memory error (I can't even catch this; the node process just quits).
While hunting for the memory leak I tried many different approaches, and ultimately I can reduce my code to the following function:
async DoRun(runCount = 0, maxCount = undefined) {
    while (maxCount === undefined || runCount < maxCount) {
        this.count += 1;
        runCount += 1;
        console.log(`total run count: ${this.count}`);

        let taskList;
        try {
            this.active = true;
            taskList = await Task.find({}).populate('relatedTasks').populate('notBefore');
            // taskList = await this.makeload();
        } catch (err) {
            console.error(err);
            this.active = false;
            return;
        }
    }
}
To make it "testable" I reduced the heap size allowed to be used by the application: --max-old-space-size=100; With this heapsize it always crashes about around 2000 runs. However even with an "unlimited" heap it crashes after a few (ten)thousand runs.
Now to further test this I commented out the Task.find() command and implimented a dummy that creates the "same" result".
async makeload() {
    const promise = new Promise(resolve => {
        setTimeout(resolve, 10, this);
    });
    await promise;

    const ret = [];
    for (let i = 0; i < 10000; i++) {
        ret.push({
            relatedTasks: [],
            notBefore: [],
            id: 1,
            orderId: 1,
            queueStatus: 'new',
            jobType: 'test',
            result: 'success',
            argData: 'test',
            detail: 'blah',
            lastActive: new Date(),
            updatedAt: Date.now(),
            priority: 2
        });
    }
    return ret;
}
This runs fine (so far), even after 20,000 calls, with 90 MB of heap allocated.
What am I doing wrong in the first case? This leads me to believe that Sails has a memory leak, or that node is somehow unable to free the database connections.
I can't see anything that is blatantly "leaking" here. As I can see in the log, this.count is not a string, so it's not even leaking there (same for runCount).
How can I progress from this point?
EDIT
Some further clarifications/summary:
I run on node 8.9.0
Sails version 1.0
using the sails-postgresql adapter (1.0.0-12) (a beta version, as other versions don't work with Sails 1.0)
I run with the flag --max-old-space-size=100
Environment variable: node_env=production
It crashes after approximately 2,000-2,500 runs in the production environment (500 when in debug mode).
I've created a github repository containing a workable example of the code here. Once again, to see the problem "soon", set the flag --max-old-space-size=80 (or something alike).
I don't know anything about sailsJS, but I can answer the first half of the question in the title:
Does V8/Node actually garbage collect during function calls?
Yes, absolutely. The details are complicated (most garbage collection work is done in small incremental chunks, and as much as possible in the background) and keep changing as the garbage collector is improved. One of the fundamental principles is that allocations trigger chunks of GC work.
The garbage collector does not care about function calls or the event loop.
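You can see this for yourself with a tight synchronous loop (a sketch): garbage is allocated on every iteration and control never returns to the event loop, yet heapUsed stays bounded because the allocations themselves trigger collection work.

// Run with plain `node`; no flags needed.
for (let i = 0; i < 1e6; i++) {
    const garbage = new Array(1000).fill(i); // unreachable after each iteration
    if (i % 100000 === 0) {
        console.log(Math.round(process.memoryUsage().heapUsed / 1024 / 1024) + ' MB');
    }
}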
Apache Web Server has a config parameter called MaxRequestsPerChild.
http://httpd.apache.org/docs/2.0/en/mod/mpm_common.html#maxrequestsperchild
"After MaxRequestsPerChild requests, the child process will die."
To avoid crashes caused by memory leaks, too many connections, or other unexpected errors, should I do the same thing when using the node.js Cluster module?
*I'm using Nginx in front of node.js, not Apache. I mentioned it only so that I could explain more easily.
I just implemented it like this:
var cluster = require('cluster');
var http = require('http');

var maxReqsPerChild = 10; // Small number for debug
var numReqs = 0;

if (cluster.isMaster) {
    var numCPUs = require('os').cpus().length;
    for (var i = 0; i < numCPUs; i++) {
        cluster.fork();
    }
    cluster.on('death', function (worker) { // 'death' was renamed to 'exit' in Node >= 0.8
        // Fork another when one died
        cluster.fork();
    });
} else {
    http.createServer(function (webReq, webRes) {
        // Count up
        numReqs++;

        // Doing something here

        // Kill myself
        if (numReqs > maxReqsPerChild) {
            process.kill(process.pid); // Or more simply, is process.exit() better?
        }
    }).listen(1338);
}
This has been working well so far, but I'm wondering whether there is a more proper way.
MaxRequestsPerChild is good for hiding memory leak trouble, but it shouldn't be used too often, because it only hides the real problem. First try to avoid the memory leaks themselves.
It shouldn't be used to avoid other issues like too many connections, nor other unexpected errors.
When you do use MaxRequestsPerChild, you shouldn't use process.kill or process.exit, because they immediately close all in-flight connections.
Instead, use server.close, which waits for all in-flight connections to finish and then fires the 'close' event.
var server = http.createServer(...);

var requestCount = 0; // assumed declarations, not shown in the original excerpt
var options = { max_requests_per_child: 100 };

server.on("close", function () {
    process.exit(0);
});

server.on("request", function () {
    requestCount += 1;
    if (options.max_requests_per_child && (requestCount >= options.max_requests_per_child)) {
        process.send({ cmd: "set", key: "overMaxRequests", value: 1 });
        if (!server.isClosed) {
            server.close();
            server.isClosed = 1;
        }
    }
});
see a complete working example here:
https://github.com/mash/node_angel
We know node.js gives us great power, but with great power comes great responsibility.
As far as I know, the V8 engine doesn't do any garbage collection. So what are the most common mistakes we should avoid to ensure that no memory leaks from my node server?
EDIT:
Sorry for my ignorance, V8 does have a powerful garbage collector.
As far as I know the V8 engine doesn't do any garbage collection.
V8 has a powerful and intelligent garbage collector built in.
Your main problem is not understanding how closures maintain references to the scope and context of outer functions. This means there are various ways you can create circular references, or otherwise create variables that just do not get cleaned up.
This happens because your code is ambiguous and the engine cannot tell whether it is safe to garbage collect a value.
A way to force the GC to pick up data is to null out your variables.
function handle(foo, cb) { // named here; the original snippet used a bare anonymous function
    var bigObject = new BigObject();
    doFoo(foo).on("change", function (e) {
        if (e.type === bigObject.type) {
            cb();
            // bigObject = null;
        }
    });
}
How does V8 know whether it is safe to garbage collect bigObject while it's captured by an event handler? It doesn't, so you need to tell it the object is no longer used by setting the variable to null.
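Concretely, the fixed version of the sketch above (same hypothetical BigObject and doFoo) nulls the reference once the handler is done with it:

function handle(foo, cb) {
    var bigObject = new BigObject();
    doFoo(foo).on("change", function (e) {
        if (e.type === bigObject.type) {
            cb();
            bigObject = null; // release the closure's reference so GC can reclaim it
        }
    });
}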
Various articles to read:
http://www.ibm.com/developerworks/web/library/wa-memleak/
I wanted to convince myself of the accepted answer, specifically:
not understanding how closures maintain a reference to scope and context of outer functions.
So I wrote the following code to demonstrate how variables can fail to be cleaned up, which people may find of interest.
If you have watch -n 0.2 'ps -o rss $(pgrep node)' running in another terminal, you can watch the leak occurring. Note how commenting in either the buffer = null line or the process.nextTick wrapper allows the process to complete:
(function () {
    "use strict";

    var fs = require('fs'),
        iterations = 0,
        work = function (callback) {
            var buffer = '',
                i;

            console.log('Work ' + iterations);

            for (i = 0; i < 50; i += 1) {
                buffer += fs.readFileSync('/usr/share/dict/words');
            }

            iterations += 1;
            if (iterations < 100) {
                // buffer = null;
                // process.nextTick(function () {
                work(callback);
                // });
            } else {
                callback();
            }
        };

    work(function () {
        console.log('Done');
    });
}());
Activate garbage collection with:
node --expose-gc test.js
and use it with:
global.gc();
Happy Coding :)