Memory leak in nodejs zlib - node.js

I am doing some tests using zlib provided by nodejs.
When I called the deflate function 5000 times in a loop, a memory leak occurred, and the brk syscall was called more than 10000 times on Linux (measured with strace -cfe mmap,munmap,mprotect,brk -p {process id}).
However, when I made the same 5000 calls at a rate of one per second using setInterval, there was no memory leak, and brk was called far less often.
Node.js makes the same 5000 calls in both cases, so why does the number of brk calls on Linux differ?
My guess is that Linux is reusing memory space. Is this correct? If not, what is the exact cause?
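For reference, a minimal sketch of the two test variants (run one variant at a time; the buffer contents and size are only illustrative):

const zlib = require('zlib');

const input = Buffer.alloc(16 * 1024, 'x'); // arbitrary sample payload

// Variant 1: a tight loop that never yields to the event loop between calls.
for (let i = 0; i < 5000; i++) {
  zlib.deflate(input, (err) => { if (err) throw err; });
}

// Variant 2: one call per second via setInterval, yielding between calls.
let n = 0;
const timer = setInterval(() => {
  zlib.deflate(input, (err) => { if (err) throw err; });
  if (++n === 5000) clearInterval(timer);
}, 1000);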

Most likely, you are seeing some side effect of the garbage collector. Analyzing this requires more information than you've posted here, but if I were to hazard an educated guess, NodeJS (or v8 via libuv) probably does an opportunistic GC at the bottom of the event loop, to collect garbage in at least its shortest-lived pool.
When you run this in a loop, you never fall off the event loop, which means that garbage collection probably has fewer opportunities to run. It probably also runs based on a timer or an allocation counter, and almost certainly from within the memory allocator when there is memory pressure.
Run your loop for a few hundred thousand iterations; I bet total process allocation stabilizes at some point.
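To check whether it stabilizes, you can log memory statistics while the test runs; process.memoryUsage() is a standard Node.js API:

// Print RSS and heap figures every 5 seconds during the test.
setInterval(() => {
  const m = process.memoryUsage();
  console.log('rss:', m.rss, 'heapUsed:', m.heapUsed, 'heapTotal:', m.heapTotal);
}, 5000);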

Related

How to tweak the NodeJS Garbage Collector to schedule fewer Scavenger rounds?

I have a real-time application in NodeJS that listens to multiple websockets and reacts to their events by placing HTTPS requests; it runs continuously. I noticed that the response time, many times during execution, was much higher than the expected network latency alone, which led me to investigate the problem. It turns out that the Garbage Collector was running multiple times in a row, adding significant latency (5s, up to 30s) to the run time.
The problem seems to be related to the frequent scheduling of Scavenger rounds to free up resources due to allocation failures. Although each round takes less than 100ms to execute, thousands of rounds in a row do add up to many seconds. My only guess is that at some point the allocated heap is close to its limit and little memory is actually freed in each GC round, which keeps triggering GC in a long loop. The application does not seem to be leaking memory, because memory allocation never actually fails; it just seems to be hitting an edge that triggers GC madness.
Could someone with knowledge/experience share some tips on how to use the GC options that V8 provides to overcome this situation? I have tried --max_old_space_size and --nouse-idle-notification (as suggested by the few articles that tackle this problem), but I do not fully understand the internals of the node GC implementation or the effects of the available options. I would like more control over when a Scavenger round runs, or at least to increase the interval between successive rounds so that each becomes more efficient.
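For reference, these V8 options are passed on the node command line; the values and script name below are only illustrative, and adding --trace-gc (which prints one line per collection) can help confirm whether Scavenge rounds are the culprit:

node --max_old_space_size=2048 --nouse-idle-notification --trace-gc app.js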

Why call AdjustAmountOfExternalAllocatedMemory

Why should external code call
v8::Isolate::AdjustAmountOfExternalAllocatedMemory (formerly v8::V8::AdjustAmountOfExternalAllocatedMemory, and also wrapped as NanAdjustExternalMemory)?
I see some bits of documentation on the web saying that these functions exist and that they somehow help the garbage collector. But how? Why? What repercussions are to be expected if some external code doesn't call them? In a Node.js module that makes use of asynchronous execution, is it worth the effort to communicate changes in memory allocation from the worker threads back to the v8 thread, where this function can be safely called? Why should anyone care how much memory the external code is using? And if there is a good reason, should I try to provide fine-grained updates for every malloc and free, or should I only call the function once in a while, when the situation changes significantly?
You should only update it for memory that is kept alive by JavaScript objects, i.e. when you have used SetInternalField on a JavaScript object so that it points to C memory it owns.
This doesn't appear to be the case for you as you say:
is it worth the effort to communicate changes in memory allocation from the worker threads back to the v8 thread where this function can be safely called
Whatever memory your workers allocate cannot be kept alive by some 'separate v8 thread', because an isolate can only be executed by a single thread. So it cannot be keeping alive any memory your other threads are allocating, and the function is irrelevant to them.
In general you want to call this function because it forces v8 to do a global GC more often, which it normally avoids doing at all costs. E.g. if you have 1000 dead JavaScript buffer objects, each keeping a 20MB external allocation alive, you have ~20GB of garbage while V8 thinks you only have some 20kB of garbage and thus won't try to GC anything. If you then told V8 that there are 20GB of external memory (AdjustAmountOfExternalAllocatedMemory(20LL * 1024 * 1024 * 1024)), it would trigger a global GC, which would collect the JavaScript buffer objects, which would call their finalizers, where you would free() the 20MB buffers, freeing the 20GB of memory.
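The same mechanism can be observed from plain JavaScript, since Node's own Buffer implementation reports its backing memory to V8 this way; a small sketch, with the counts scaled down from the example above:

// Each Buffer is a tiny object on the V8 heap, but its backing store is
// external memory that Node reports to V8 on the Buffer's behalf.
let bufs = [];
for (let i = 0; i < 10; i++) {
  bufs.push(Buffer.alloc(20 * 1024 * 1024)); // 20MB each, counted as external
}
console.log('external bytes:', process.memoryUsage().external);
bufs = null; // the handles are now garbage; because the external bytes were
             // reported, V8 feels the pressure and collects them (freeing
             // the 200MB of backing stores) sooner than it otherwise would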

Could calling core.memory's GC.collect consistently make for a more consistent framerate?

I'm looking into making a real-time game with OpenGL and D, and I'm worried about the garbage collector. From what I'm hearing, this is a possibility:
10 frames run
Garbage collector kicks in automatically and runs for 10ms
10 frames run
Garbage collector kicks in automatically and runs for 10ms
10 frames run
and so on
This could be bad because it causes stuttering. However, if I force the garbage collector to run consistently, like with GC.collect, will it make my game smoother? Like so:
1 frame runs
Garbage collector runs for 1-2ms
1 frame runs
Garbage collector runs for 1-2ms
1 frame runs
and so on
Would this approach actually work and make my framerate more consistent? I'd like to use D, but if I can't make my framerate consistent, then I'll have to use C++11 instead.
I realize that it might not be as efficient, but the important thing is that it will be smoother, at a more consistent framerate. I'd rather have a smooth 30 fps than a stuttering 35 fps, if you know what I mean.
Yes, but it will likely not make a dramatic difference.
The bulk of the time spent in a GC cycle is the "mark" stage, in which the GC transitively visits every allocated memory block that is known to contain pointers, starting from the root areas (static data, TLS, the stack, and registers).
There are several approaches to optimize an application's memory so that D's GC makes a smaller impact on performance:
Use bulk allocation (allocate objects in bulk as arrays)
Use custom allocators (std.allocator is on its way, but you could use your own or third party solutions)
Use manual memory management, like in C++ (you can use RefCounted as you would use shared_ptr)
Avoid memory allocation entirely during gameplay, and preallocate everything beforehand instead
Disable the GC, and run collections manually when it is more convenient
Generally, I would not recommend being concerned about the GC before writing any code. D provides the tools to avoid the bulk of GC allocations. If you keep the managed heap small, GC cycles will likely not take long enough to interfere with your application's responsiveness.
If you were to run the GC every frame, you still would not get a smooth run, because you could have different amounts of garbage every frame.
You're then left with two options, both of which involve turning off the GC:
Use (and re-use) pre-allocated memory (structs, classes, arrays, whatever) so that you do not allocate during a frame, and do not need to.
Just run and eat up memory.
For both these, you would do a GC.disable() before you start your frames and then a GC.enable() after you're finished with all your frames (at the end of the battle or whatever).
The first option is the one most high-performance games use anyway, regardless of whether they're written in a language with a GC. They simply do not allocate or deallocate during the main frame loop. (Which is why you get the "loading" and "unloading" before and after battles and the like, and why there are usually hard limits on the number of units.)
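The preallocate-and-reuse pattern is language-agnostic; here is a minimal sketch in JavaScript (to match the rest of this page), with purely illustrative names:

// A minimal fixed-size object pool: allocate everything up front, then
// acquire and release objects during the frame loop without ever allocating.
function makePool(size, create) {
  const items = new Array(size);
  for (let i = 0; i < size; i++) items[i] = create();
  let free = size;
  return {
    acquire() { return free > 0 ? items[--free] : null; },
    release(obj) { items[free++] = obj; },
  };
}

// Hard limit of 1000 bullets, all created before the frame loop starts.
const bullets = makePool(1000, () => ({ x: 0, y: 0, vx: 0, vy: 0, live: false }));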

Garbage collector in Node.js

According to Google, V8 employs an efficient "stop-the-world, generational, accurate" garbage collector. Part of the claim is that V8 stops program execution when performing a garbage collection cycle.
An obvious question is: how can you have an efficient GC when you pause program execution?
I was trying to find out more about this topic, as I would be interested to know how the GC impacts response time when you have possibly tens of thousands of requests per second hitting your node.js server.
Any expert help, personal experience or links would be greatly appreciated
Thank you
"Efficient" can mean several things. Here it probably refers to high throughput. When looking at response time, you're more interested in latency, which could indeed be worse than with alternative GC strategies.
The main alternatives to stop-the-world GCs are
incremental GCs, which need not finish a collection cycle before temporarily handing back control to the mutator [1], and
concurrent GCs, which (virtually) operate at the same time as the mutator, interrupting it only very briefly (e.g. to scan the stack).
Both need to perform extra work to be correct in the face of concurrent modification of the heap (e.g. if a new object is created and attached to an already-scanned object, this new reference must be noticed). This impacts total throughput, i.e., it takes longer to actually clean the entire heap. The upside is that they do not (usually) interrupt the program for very long, if at all, so latency is low(er).
Although the V8 documentation still mentions a stop-the-world collector, the V8 GC has been incremental since 2011. So while it does stop program execution once in a while, it does not [2] stop the program for however long it takes to scan the entire heap. Instead it can scan for, say, a couple of milliseconds, and then let the program resume.
1 "Mutator" is GC terminology for the program whose heap is garbage collected.
2 At least in principle, this is probably configurable.
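To measure the actual impact on your own workload, newer Node.js versions expose GC pauses through the perf_hooks module; a small sketch (the 'gc' entry type is standard, though the detail fields vary between Node versions):

const { PerformanceObserver } = require('perf_hooks');

// Log the duration of every GC pause in this process.
const obs = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log('GC pause:', entry.duration.toFixed(2), 'ms');
  }
});
obs.observe({ entryTypes: ['gc'] });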

timing for node.js Garbage Collection

Recently, I have installed https://github.com/lloyd/node-memwatch for development, to investigate how GC interacts with my program.
I have bound the "stat" event; the author states that this event is emitted when GC is performed.
I've found that when the script is under high load, the "stat" events are not triggered. I am not sure whether this means GC was not performed, but it is a sign that GC may not have been triggered.
On my production server, the load is even higher throughout the day. I am quite sure that GC gets no chance to run, so memory usage has no chance to decrease. It looks just like a memory leak.
Is my observation correct? Is the GC unable to run when there is continuously high load?
If so, should I use the exposed GC interface to force GC?
Is GC blocking? Should I perform GC more frequently so that each GC does not block for a long time?
I know manual GC is not a good idea (there are people opposing the idea of manual GC in node.js, but I cannot find the link for reference), but I am seeing memory usage increase continuously. It really needs to be solved.
There are 3 types of GC events in V8
kGCTypeMarkSweepCompact
kGCTypeScavenge
kGCTypeAll
V8 runs scavenge collections quite often, but only on newly created objects. Under heavy load, the other types of GC may occur only infrequently.
You can try running the NodeFly agent, which uses the nodefly-gcinfo module to track ongoing memory usage.
You can also call nodefly-gcinfo directly; it has a callback that runs each time a GC event occurs:
// Called on every GC event; reports heap usage after the collection,
// plus the GC type and flags.
require('nodefly-gcinfo').onGC(function (usage, type, flags) {
  console.log('GC Event Occurred');
  console.log('Heap After GC:', usage, type, flags);
});
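For reference, the "exposed GC interface" mentioned in the question is enabled with the --expose-gc flag, which makes a global gc() function available:

// Run with: node --expose-gc script.js
if (global.gc) {
  global.gc(); // force a full collection
  console.log('heapUsed after forced GC:', process.memoryUsage().heapUsed);
} else {
  console.log('start node with --expose-gc to enable global.gc()');
}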
