When running ANTS Memory Profiler, I see that a lot of unused memory is allocated to .NET.
How do I determine what is causing this?
I have put a screenshot of the summary report generated by ANTS here: ANTS Summary report
Thanks
Thomas
I had the same issue when I ran about 20 Selenium tests in parallel. After analyzing it, I narrowed it down to one method that allocates a lot of memory. The GC was probably not reclaiming that memory fast enough across the parallel calls, so, as the discussion under the question suggested, I imagine the heap ended up fragmented like Swiss cheese.
I first added this code to analyze the problem, as suggested here: https://www.codeguru.com/dotnet/memory-leaks-dot-net/
GC.Collect();
GC.WaitForPendingFinalizers();
GC.Collect();
Sure enough, the unused memory allocated to .NET decreased from 500MB to 50MB.
However, this code is not advisable because it can cause deadlocks. Instead, my solution is to read the allocated bytes at the beginning of the method and again at the end, then tell the GC how many bytes were allocated during that call. As a result, I saw memory consumption slowly decrease to 50MB.
public async Task<IActionResult> MemoryConsumingMethod()
{
var allocatedBytesStart = GC.GetTotalAllocatedBytes();
...
//main content
var model = new MyViewModel(){...};
...
var allocatedBytesEnd = GC.GetTotalAllocatedBytes();
GC.AddMemoryPressure(allocatedBytesEnd - allocatedBytesStart);
return View(model);
}
I'm investigating a memory leak in my Node.js script. According to process.memoryUsage().heapUsed, the usage is around 3000MB.
chrome://inspect also shows memory usage of around 3000MB. However, every time I take a heap snapshot, the saved snapshot is only around 73MB, and process.memoryUsage().heapUsed also drops to that figure.
Does anyone have a theory about how this is happening?
It sounds like the garbage collector is running after you check the usage. Basically, every once in a while it checks whether there are objects that are no longer referenced by anything and removes them, freeing up space. See this article for more details:
https://blog.sessionstack.com/how-javascript-works-memory-management-how-to-handle-4-common-memory-leaks-3f28b94cfbec
Currently working on optimizing a library for speed. I've already reduced execution time drastically, using V8 CPU and Memory Profiling through Webstorm. This was achieved mainly by changing the core method from recursive to iterative.
Now the self time distribution breaks down as
I'm assuming the first entry, "node", is the time spent in internal function calls, which is great. The other entries also make sense. I'm new to Node.js profiling, but 31.6% for GC seems high, so I've decided to investigate.
I've now created a heap dump through Webstorm, but unfortunately that doesn't give me much information.
These seem to be system internal memory references mainly. Stepping through the core iteration code logic again, there also don't seem to be a lot of places where memory is explicitly allocated (using this as a reference).
Question
Can the GC overhead be reduced?
Is this amount of allocation just expected here?
Is it possible to get better memory profiling information?
Setup Instructions
In case someone wants to try debugging this, I'm including setup instructions.
Download or clone object-scan and run
yarn install --frozen-lockfile
yarn run test-simple --verbose
Now create a file test.js in the project root containing this content and run node --trace_gc test.js or run it through Webstorm for advanced profiling.
In JavaScript, and in V8 (Node) in particular, the time spent on garbage collection depends on the amount of data stored in the heap, but that's only one of many factors.
In the V8 engine there are two main "types" of GC: minor (scavenge) and major (mark-sweep/mark-compact). You can see which GC types occur during your tests in the console with --trace-gc enabled. Depending on the case, one type can "eat" more time than the other, and vice versa. So before optimizing, you should determine which GC takes more time.
There are not many options for optimizing major GC, because it is heavily affected by the amount of data that stays in memory for a "long" period (here, "long" means that the object survives scavenge GC). Such data is stored in the so-called "old space" of the heap. Major GC works on this space: it has to scan all of that memory and mark the objects that no longer have any references so they can be cleared.
In your case, the test data you're loading ends up in old space. As a result, it affects major GC during the whole test. Major GC will not clear much here, because you're still using your test object, but it still spends time scanning the entire old space. So you may consider preventing V8 from doing that by launching node with GC-specific flags such as --nouse-idle-notification --expose-gc --gc_interval=100500 (where 100500 is a number of allocations; pick a value high enough that GC does not run before the whole test has passed), which lets you trigger garbage collections manually. Test your code using this approach and see how major GC affects it; try the tests with different amounts of data passed to the function. If the impact is quite high, you may try to refactor your code to minimize long-lived variables, closures, etc.
If you discover that major GC doesn't have much impact on performance, then scavenge GC is taking most of the time. Unlike major GC, it operates on the so-called "new space" of the heap, where all new objects are stored. Objects that survive a scavenge are moved to old space. New space is much smaller than old space (you can control it by setting --max_semi_space_size; note: new space size = 2 * semi space size), and the more new objects and variables you allocate, the more scavenge GC runs will happen. If this GC hurts performance too much, you may consider refactoring your code to make fewer new allocations. But if you reuse variables, that may also slow down performance, and those objects will go to old space and may become the problem described in the "major GC" section.
Also, V8's GC doesn't always work in the same thread that your program runs in. It does some work in the background too, but I don't know what Webstorm shows in your case. If it just counts the total time spent in GC, maybe it doesn't have that much impact.
You may find more details on v8 GC in this blog post.
TL;DR:
Can the GC overhead be reduced?
Yes, but first you should find out what needs to be optimized by following the steps above.
Is this amount of allocation just expected here?
That can only be discovered by comparing different approaches. There's no absolute number that separates a "good" amount from a "bad" one, because it depends on lots of factors, including the amount of input data.
Is it possible to get better memory profiling information?
You may find some good tools here, but in general you can use Chrome DevTools, which can provide a bit more detail than Webstorm does.
I have run into a bit of trouble lately: the memory used by GenServer processes is very high, probably because of large binary leaks.
The problem comes from here: we receive large binaries through the GenServer and we pass them to the consumer, which then interacts with that data. Now, these large binaries are never assigned to a variable and the GC doesn't go over them.
I have tried hibernating the processes after managing the data, which partially worked because the memory used by processes lowered a lot, but since binaries were not getting GC'd, the amount of memory used by them increased slowly but steadily, from 30 MBs without hibernating to 200MBs with process hibernation in about 25 minutes.
I have also tried to set :erlang.system_flag(:fullsweep_after, 0), which has also worked and lowered the memory used by processes by around 20%.
Before and after.
I must say it goes down to 60-70MB used by processes from time to time.
Edit: Using :recon.bin_leak(15) frees a lot of memory -- result of :recon.bin_leak(15)
Anyhow the memory used is still high and I'm completely sure it can be fixed.
Here you have a screenshot taken from the observer in the Processes tab. As you can see, GenServer is the one eating the memory like the cookie monster.
I have researched a lot about this topic, tried all the suggestions and possible solutions that were given out there, and nevertheless, I am still in this position.
Any help is welcome.
The code is in this Github Repository
Code of interest that is probably causing this + Applications tree. 3 out of 4 processes there (<0.294.0>, <0.295.0>, <0.297.0>) are using 27MB of memory.
Thank you in advance for reading.
You can try to add the :hibernate atom to your handle_events return values in your GenStage related modules. For example:
def handle_events(events, _from, %{handler: handler, public: public} = state) do
public = handle(handler, events, public)
{:noreply, [], %{state | public: public}, :hibernate}
end
Another option is to record the PIDs after :recon.bin_leak() and then pass them to Process.info(PID) to get some more information about the offending GenServers.
Some additional resources:
https://elixirforum.com/t/extremely-high-memory-usage-in-genservers/4035/23
https://www.erlang-in-anger.com/ (Specifically Chapter 7 on Memory Leaks)
My project has moved from Java 7 to Java 8.
After switching to Java 8, we are seeing memory consumption grow over time.
Here are the investigations that we have done:
The issue appears only after migrating from Java 7 to Java 8.
Metaspace is the only memory-related thing that changed from Java 7 to Java 8. We monitored metaspace, and it does not grow beyond 20 MB.
Heap also remains consistent.
Now the only path left is to analyze how memory gets distributed to the process in Java 7 and Java 8, specifically private bytes. Any thoughts or links here would be appreciated.
NOTE: this javaw application is a swing based application.
UPDATE 1: We analyzed the native memory with the NMT tool and generated a diff of the memory occupied compared to the baseline. We found that the heap remained the same but threads are leaking all this memory. Since there is no change in the heap, I am assuming that this leak comes from native code.
So the challenge still remains open. Any thoughts on how to analyze the memory occupied by all the threads would be helpful here.
Below are the snapshots taken from native memory tracking.
In this picture, you can see that thread memory increased by 88 MB, and that the arena and resource handle counts increased a lot.
In this picture, you can see that malloc increased by 73 MB, but no method name is shown.
So please shed some light on how to interpret these two screenshots.
You may try another GC implementation such as G1, which was introduced in Java 7 and is the default GC in Java 9. To do so, just launch your Java apps with:
-XX:+UseG1GC
There's also an interesting functionality with G1 GC in Java 8u20 that can look for duplicated Strings in the heap and "deduplicate" them (this only works if you activate G1, not with the default Java 8's GC).
-XX:+UseStringDeduplication
Be sure to test your system thoroughly before going to production with such a change!
Here you can find a nice description of the different GCs you can use
I encountered the exact same issue.
Heap usage was constant and only metaspace increased; NMT diffs showed a slow but steady leak in the memory used by threads, specifically in the arena allocation. I tried to fix it by setting the MALLOC_ARENA_MAX=1 env var, but that was not fruitful. Profiling native memory allocation with jemalloc/jeprof showed no leakage that could be attributed to client code, pointing instead to a JDK issue, as the only smoking gun was the memory leak due to malloc calls which, in theory, should come from JVM code.
Like you, I found that upgrading the JDK fixed the problem. The reason I am posting an answer here is because I know the reason it fixes the issue - it's a JDK bug that was fixed in JDK8 u152: https://bugs.openjdk.java.net/browse/JDK-8164293
The bug report mentions Class/malloc increase, not Thread/arena, but a bit further down one of the comments clarifies that the bug reproduction clearly shows increase in Thread/arena.
consider optimising the JVM options
Parallel Collector(throughput collector)
-XX:+UseParallelGC
concurrent collectors (low-latency collectors)
-XX:+UseConcMarkSweepGC
use string deduplication
-XX:+UseStringDeduplication
optimise compact ratio
-XXcompactRatio:
and refer
link1
link2
In this answer of mine you can find information and references on how to profile the JVM's native memory to find memory leaks. In short, see this.
UPDATE
Did you use the -XX:NativeMemoryTracking=detail option? The results are straightforward: they show that most of the memory is allocated by malloc. :) That's a little bit obvious. Your next step is to profile your application. To analyze native methods and Java, I use (and we use in production) flame graphs with perf_events. Look at this blog post for a good start.
Note that your memory increased for threads, so the number of threads in your application is likely growing. Before using perf, I recommend analyzing thread dumps before and after to check whether the number of Java threads grows, and why. You can get thread dumps with jstack/jvisualvm/jmc, etc.
This issue does not occur with Java 8 update 152. The exact root cause of why it occurred with earlier versions is still not clearly identified.
I need to check for a memory leak in an embedded system.
The IDE is HEW and we are using uCOSIII RTOS.
Valgrind does not support the above configurations. Can you please suggest a tool or a method to check for memory leaks?
First rule of dynamically allocating memory in embedded systems is "don't". Allocate it all once at the start of execution and then leave well alone. Otherwise you have to assess and decide what to do when a malloc (or similar operation) fails.
If you must dynamically allocate memory at runtime, then at its simplest you may be able to use a logging infrastructure to track calls to malloc/free by writing wrappers around them. Then you can track where and when the allocations and deallocations are happening and hopefully see what is missing.
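As a rough illustration of the wrapper idea, here is a minimal sketch in C. The names logged_malloc, logged_free and logged_outstanding are made up for this example, and fprintf to stderr is only a stand-in for whatever logging channel your target actually has (a UART, a trace buffer, etc.). It also assumes a single thread of execution; under an RTOS the counters would need to be protected by a critical section.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counters for this sketch. They are not protected against
   concurrent access; a real port would wrap updates in a critical section. */
static unsigned long log_alloc_calls;
static unsigned long log_free_calls;

/* Wrapper around malloc(): logs the returned address, the size and a
   caller-supplied tag so allocations can be matched against frees later. */
void *logged_malloc(size_t size, const char *tag)
{
    void *p = malloc(size);
    if (p != NULL)
        log_alloc_calls++;
    fprintf(stderr, "ALLOC %p %lu %s\n", p, (unsigned long)size, tag);
    return p;
}

/* Wrapper around free(): logs the address before releasing it. */
void logged_free(void *p, const char *tag)
{
    if (p != NULL)
        log_free_calls++;
    fprintf(stderr, "FREE %p %s\n", p, tag);
    free(p);
}

/* Successful allocations that have not been freed yet. */
long logged_outstanding(void)
{
    return (long)log_alloc_calls - (long)log_free_calls;
}
```

If logged_outstanding() keeps climbing while the system is in a steady state, the ALLOC lines without a matching FREE line for the same address point at the code that is leaking.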
Take a look at libtalloc, the core memory allocator used in Samba. It may not work out-of-the-box for you if you don't have atexit() or stdio.h, but it shouldn't take too much work to port it to your environment.
Have a look at talloc_enable_leak_report_full() and talloc_report_full() (among others) to get you started.
I have been giving some thoughts about it, and here is a random try on how to do this with embedded systems:
First, you need to find out in which thread the leak occurs. When allocating, also count the number of active allocations for each thread. A thread whose number of allocations keeps growing without matching deallocations is the suspicious task.
Second, you need to count the allocations coming from that thread per call site. To do this, replace alloc with a macro. Using a macro, you can save the name of the file and the line number where the call originated.
For example:
#include <stdlib.h>

#define alloc(x) my_alloc((x), __LINE__, __FILE__)
void *my_alloc(size_t size, int line, const char *file)
{
    // increase the number of allocations and deallocations for each line/file combination
    return malloc(size);
}
Similarly you need to define my_free.
After this, run the program and, from time to time, printf the allocation counts that keep growing. This should help find memory leaks.
P.S. I didn't test this, but I saw somebody do something similar in our code :)
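For what it's worth, the per-call-site counting described in this answer could be fleshed out roughly as follows. This is only a sketch under a few assumptions: the table size MAX_SITES, the header layout, and the helper names find_site, live_total, report_sites and dealloc are all invented here, the underlying allocator is plain malloc/free, and there is no locking or per-thread bookkeeping, so on uC/OS-III you would have to add both.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SITES 64  /* assumed upper bound on distinct call sites */

typedef struct {
    const char *file;  /* __FILE__ of the call site */
    int line;          /* __LINE__ of the call site */
    long live;         /* allocations minus frees for this site */
} alloc_site;

static alloc_site sites[MAX_SITES];
static int site_count;

/* Header placed in front of each block so my_free can find its site.
   The extra union members only force a conservative alignment. */
typedef union {
    int site;
    long double align_ld;
    void *align_ptr;
} alloc_header;

/* Return the table index for file/line, adding it if new; -1 if full. */
static int find_site(const char *file, int line)
{
    for (int i = 0; i < site_count; i++)
        if (sites[i].line == line && strcmp(sites[i].file, file) == 0)
            return i;
    if (site_count == MAX_SITES)
        return -1;
    sites[site_count].file = file;
    sites[site_count].line = line;
    sites[site_count].live = 0;
    return site_count++;
}

void *my_alloc(size_t size, int line, const char *file)
{
    if (size > (size_t)-1 - sizeof(alloc_header))
        return NULL;  /* size plus header would overflow */
    alloc_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->site = find_site(file, line);
    if (h->site >= 0)
        sites[h->site].live++;
    return h + 1;  /* caller sees the memory just past the header */
}

void my_free(void *p)
{
    if (p == NULL)
        return;
    alloc_header *h = (alloc_header *)p - 1;
    assert(h->site >= -1 && h->site < site_count);  /* sanity check on the header */
    if (h->site >= 0)
        sites[h->site].live--;
    free(h);
}

/* Total number of tracked blocks that have not been freed yet. */
long live_total(void)
{
    long total = 0;
    for (int i = 0; i < site_count; i++)
        total += sites[i].live;
    return total;
}

/* Print every call site that still has live allocations. */
void report_sites(void)
{
    for (int i = 0; i < site_count; i++)
        if (sites[i].live > 0)
            printf("%s:%d live=%ld\n", sites[i].file, sites[i].line, sites[i].live);
}

#define alloc(x)   my_alloc((x), __LINE__, __FILE__)
#define dealloc(p) my_free(p)
```

Calling report_sites() periodically (for example from a low-priority task) and comparing successive outputs shows which file/line pairs have a live count that only ever grows. Blocks obtained through my_alloc must be released through my_free, not free, because of the header in front of them.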
Your requirement is not completely clear. If you are looking for a tool like Valgrind that can find memory leaks in your environment, that will be difficult to find.
If you have the source code, you can check all the memory allocations and frees in the particular application. See link1 Link2
There are also some files available that can help you find the memory leak when you run them.
http://code.axter.com/debugalloc.cpp
http://code.axter.com/debugalloc.h
http://code.axter.com/debuglogger.cpp
http://code.axter.com/debuglogger.h
http://code.axter.com/debuglog.c
http://code.axter.com/debuglog.h
The debugalloc.* code can track memory leaks, and it has a description and usage information in comments.
The debuglogger.* code has some code for profiling your code.
debuglog.* is a limited C version of the code.