NodeJS, PM2, GC, Grafana - better understanding

I would like to understand the GC process a little bit better in Node.js/V8.
Could you provide some information for the following questions:
When GC is triggered, does this block the event loop of Node.js?
Is GC running in its own process, or is it just a sub-task of the event loop?
When spawning Node.js processes via PM2 (cluster mode), does each instance really have its own process, or is the GC shared between the instances?
For logging purposes I am using Grafana (https://github.com/RuntimeTools/appmetrics-statsd). Can someone explain the differences / give more details about these gauges:
gc.size: the size of the JavaScript heap in bytes.
gc.used: the amount of memory used on the JavaScript heap in bytes.
Are there any scenarios where GC does not free memory (gc.used), in connection with stress tests?
The questions are related to an issue I am currently facing: the used heap memory keeps rising and is never released (a classic memory leak). The problem is that it only appears when we receive a lot of requests.
I played around with --max-old-space-size to avoid PM2 restarts, but it looks like the GC is not freeing anything up anymore and the whole application is getting really slow...
Any ideas?
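For reference, one common way to pass a larger old-space limit to each instance when starting under PM2 (app.js and the 2048 MB value are placeholders, not recommendations):

pm2 start app.js --node-args="--max-old-space-size=2048"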

OK, some of the questions I already figured out:
gc.size = total_heap_size (https://nodejs.org/api/v8.html -> getHeapStatistics),
gc.used = used_heap_size
It seems to be normal that once gc.size hits a plateau it never goes down again; see:
Memory usage doesn't decrease in node.js? What's going on?
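These numbers can be read directly from Node's built-in v8 module; a minimal sketch:

const v8 = require('v8');

const stats = v8.getHeapStatistics();
// total_heap_size corresponds to gc.size, used_heap_size to gc.used
console.log('gc.size ~', stats.total_heap_size, 'bytes');
console.log('gc.used ~', stats.used_heap_size, 'bytes');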

Why is garbage collection expensive? The V8 JavaScript engine employs a stop-the-world garbage collection mechanism. In practice, this means that the program stops execution while garbage collection is in progress.
https://blog.risingstack.com/finding-a-memory-leak-in-node-js/
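One way to observe such pauses from inside a process is to watch event-loop delay with Node's built-in perf_hooks module (available since Node 11.10). GC is only one possible cause of delay, but long stalls under load are a strong hint. A minimal sketch:

const { monitorEventLoopDelay } = require('perf_hooks');

const h = monitorEventLoopDelay({ resolution: 20 });
h.enable();

// report the worst observed stall every 5 seconds (histogram values are in nanoseconds)
setInterval(() => {
  console.log('max event-loop delay:', (h.max / 1e6).toFixed(1), 'ms');
  h.reset();
}, 5000);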

Related

Google Cloud Profiling: how to optimize my code by analyzing it?

I am trying to analyse Google Cloud Stackdriver's profiling; can anyone please tell me how I can optimize my code using it?
Also, I cannot see any of my function names; I don't know what this _tickCallback is or which part of the code it is executing.
Please help me, anyone.
When looking at Node.js heap profiles, I find that it's really helpful to know whether the code is busy and allocating memory (i.e. does a web server have a load?).
This is because the heap profiler takes a snapshot of everything that is in the heap at the time of profile collection, including allocations which are no longer in use but have not yet been garbage collected.
Garbage collection doesn't happen very often if the code isn't allocating memory. So, when the code isn't very busy, the heap profile will show a lot of memory allocations which are in the heap but no longer really in use.
Looking at your flame graph, I would guess that your code isn't very busy (or isn't allocating very much memory), so memory allocations from the profiler dominate the heap profiles. If your code is a web server and you profiled it when it didn't have a load, it may help to generate a load for it while profiling.
To answer the secondary question: _tickCallback is a Node.js internal function used to run scheduled callbacks, for example anything queued with process.nextTick or scheduled on a timer such as setTimeout. Anything scheduled that way would have _tickCallback at the bottom of its stack.
Elsewhere in the picture, you can see some green and cyan 'end' functions on the stack. I suspect those are the places where you are calling response.end in your Express request handlers.
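If you need to generate a load while profiling, even a crude request loop will do. A minimal sketch, assuming a hypothetical server on localhost:3000:

// fire roughly 200 requests per second at a local server (URL is a placeholder)
const http = require('http');

setInterval(() => {
  for (let i = 0; i < 20; i++) {
    http.get('http://localhost:3000/', (res) => res.resume())
        .on('error', () => { /* ignore errors in this sketch */ });
  }
}, 100);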

When is the V8 engine's garbage collector activated by default in Node.js?

If my Node.js process's memory has reached up to 1.5 GB, and I then apply no load and the system is idle for 30 minutes, the garbage collector still does not free the memory.
It's impossible to say anything specific, like "why the GC is not collecting the garbage" as you ask in the comments, when you say nothing about what your program is doing or what the code looks like.
I can only point you to a good explanation of how GC works in Node and explain how to run GC manually to see if that helps. When you run node with the --expose-gc flag, you can use:
global.gc();
in your code. You can try to run that call in a timeout, on a regular interval, or at any other specific moment, and see if that frees your memory. If it does, that would mean that the GC was indeed not running and that was the problem. If it doesn't free your memory, that could mean that the problem is not the GC failing to run, but rather the GC not being able to free anything.
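A minimal sketch of the interval approach (the 30-second period is arbitrary):

// run with: node --expose-gc app.js
setInterval(() => {
  if (global.gc) {
    const before = process.memoryUsage().heapUsed;
    global.gc();
    const freed = before - process.memoryUsage().heapUsed;
    console.log('manual GC freed', (freed / 1048576).toFixed(1), 'MB');
  }
}, 30000);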
Memory not being freed after a manual GC invocation would mean that you have some memory leak, or that you genuinely use that much memory and the GC cannot free it.
If the memory is freed after running GC manually, it could mean that the GC is not running by itself, perhaps because you are doing a very long, blocking operation that doesn't give the event loop any chance to run. For example, a long-running for or while loop could have such an effect.
Not knowing anything about your code or what it does, it's not possible to give you a more specific solution to your problem.
If you want to know how and when the GC in Node works, then there is some good documentation online.
There is a nice article by StrongLoop about how the GC works:
Node.js Performance Tip of the Week: Managing Garbage Collection
Also this article by Daniel Khan is worth reading:
Understanding Garbage Collection and hunting Memory Leaks in Node.js
It's possible that the GC is running but can't free any memory because you have some memory leak. Without seeing the actual code, or even an explanation of what it does, it's really impossible to say more.

Timing for Node.js garbage collection

Recently, I have installed https://github.com/lloyd/node-memwatch for development, to investigate how GC interacts with my program.
I have bound the "stats" event, which the author states is triggered when GC is performed.
I've found that when the script is under high load, the "stats" events are not triggered. I am not sure whether this implies GC is not performed, but it is a sign that GC may not have been triggered.
On my production server, the load is even higher throughout the day. I am quite sure that GC gets no chance to run. The memory usage gets no chance to decrease. It looks just like a memory leak.
Is my observation correct? Is GC unable to run when there is continuously high load?
If so, should I use the exposed GC interface to force GC?
Is GC blocking? Should I perform GC more frequently, so that each individual GC does not block for so long?
I know manual GC is not a good idea (I recall someone opposing the idea of manual GC in Node.js, but I cannot find the link for reference), but I am seeing that the memory usage is increasing continuously. It really needs to be solved.
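For reference, a minimal sketch of how that event binding typically looks with node-memwatch (the event name and stats fields follow the module's README and may differ between versions):

const memwatch = require('memwatch');

// 'stats' fires after each full GC and heap compaction
memwatch.on('stats', function (stats) {
  console.log('full GCs so far:', stats.num_full_gc);
  console.log('heap used after this GC:', stats.current_base, 'bytes');
});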
There are 3 types of GC events in V8
kGCTypeMarkSweepCompact
kGCTypeScavenge
kGCTypeAll
V8 runs the scavenge event quite often, but only on newly created objects. Under heavy load, the other types of GC may occur only infrequently.
You can try running the NodeFly agent, which uses the nodefly-gcinfo module to track ongoing memory usage.
You can also use nodefly-gcinfo directly; it has a callback that runs each time a GC event occurs:
require('nodefly-gcinfo').onGC(function(usage, type, flags){
  // runs after every GC event; per the log line below, `usage` is the
  // heap usage after the collection and `type`/`flags` describe the GC
  console.log("GC Event Occurred");
  console.log("Heap After GC:", usage, type, flags);
});
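The NodeFly modules are quite old; on recent Node versions the built-in perf_hooks module can deliver similar GC notifications. A minimal sketch (note: the GC kind moved from entry.kind to entry.detail.kind around Node 16, so this checks both):

const { PerformanceObserver } = require('perf_hooks');

const obs = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    const kind = entry.detail ? entry.detail.kind : entry.kind;
    console.log(`GC event: kind=${kind} duration=${entry.duration.toFixed(2)}ms`);
  }
});
obs.observe({ entryTypes: ['gc'] });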

JVM process killed by OS

I've implemented a web service using Camel's Jetty component through Akka (endpoint), which forwards received messages to an actor pool with this setup:
def receive = _route()
def lowerBound = 5
def upperBound = 20
def rampupRate = 0.1
def partialFill = true
def selectionCount = 1
def instance() = Actor.actorOf[Processor]
And Processor is a class that processes the received message and replies with the result. The app has been working normally and flawlessly on my local machine; however, after deploying it on an EC2 micro instance (512 MB of memory, a CentOS-like OS), the OS (oom-killer) kills the process due to running out of memory (not a JVM OOM) after 30 calls or so, regardless of the frequency of calls.
Profiling the application locally doesn't show any significant memory leaks, if any exist at all. Due to some difficulties I could not perform proper profiling on the remote machine, but monitoring top's output I observed something interesting: the free memory stays around 400 MB after the app is initialized, and afterwards it bounces between 380 MB and 400 MB, which seems pretty natural (GC, etc.). But the interesting part is that after receiving the 30th or so call, it suddenly goes from there to 5 MB of free memory and boom, it's killed. The oom-killer log in /var/log/messages verifies that this was done by the OS due to lack of memory/free swap.
Now this is not totally Akka-relevant but I finally decided I should seek some advice from you guys, after 3 days of hopeless wrestling.
Thanks for any leads.
I have observed that when a lot of small objects are created which should be garbage collected immediately, the Java process gets killed, perhaps because the memory limit is reached before the temporary objects are reclaimed by GC.
Try running it with concurrent mark and sweep garbage collector:
java -XX:+UseConcMarkSweepGC
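On a 512 MB instance it may also help to cap the JVM heap explicitly so the process cannot outgrow physical memory. An example using standard HotSpot flags (MyApp is a placeholder for the real main class; 256m is just a starting point, not a recommendation):

java -Xmx256m -XX:+UseConcMarkSweepGC MyApp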
My general observation is that the JVM uses a lot of memory beyond the Java heap. I don't know exactly for what, but can only speculate that it might be using the normal C heap for compilation, compiled-code storage, permgen, or whatnot. Either way, I have found it difficult to control this usage.
Unless you're very pressed for disk storage, you may want to simply create a swap file of a GB or two so that the JVM has some place to overflow. In my experience, the memory it uses outside the Java heap isn't referenced very often anyway and can just lie swapped out safely without causing much I/O.
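For reference, creating and enabling a 2 GB swap file on a typical Linux system looks like this (standard commands; the path and size are examples):

dd if=/dev/zero of=/swapfile bs=1M count=2048
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile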

Node.js and V8 garbage collection

Here's what I've read so far, and correct me if I'm wrong:
Node.js is based on the V8 JavaScript engine.
The V8 JavaScript engine implements stop-the-world garbage collection.
This sometimes causes Node.js to pause completely, for anywhere from a few seconds to a few minutes, to handle garbage collection.
If this is running as production code, that's a few seconds for 10,000 users.
Is this really acceptable in a production environment?
Whether it is acceptable depends on your application and your heap size. A big GC takes around 1.3 ms per MB (YMMV), and about half that for a compacting GC. Around 1 GC in 10 is big, and around 1 big GC in 3 is compacting. Use the V8 flag --trace-gc to log GCs. We have done some work on reducing pauses; no promises, no timetables. See branches/experimental/gc in the V8 repo.
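If you want to see these pauses yourself, here is a toy allocation loop to run under --trace-gc; V8 then prints one line per scavenge or mark-sweep, including the pause time (the file name is a placeholder):

// run with: node --trace-gc alloc.js
let junk = [];
for (let i = 0; i < 1e6; i++) {
  junk.push({ id: i, payload: 'x'.repeat(100) });
  if (i % 1e5 === 0) junk = []; // drop references so the GC has something to reclaim
}
console.log('done');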
