jemalloc / jeprof shows addresses not function names based on lg_prof_interval - memory-leaks

I'm trying to debug a memory leak in some vendor-provided JNI/native code and, like many people it seems, started here:
https://technology.blog.gov.uk/2015/12/11/using-jemalloc-to-get-to-the-bottom-of-a-memory-leak/
I've found if I run:
export MALLOC_CONF=prof:true,lg_prof_interval:30,lg_prof_sample:17
then run my application, and then run the jeprof ... command to generate the .gif I can see the JNI function names causing the leak (which is good!).
I tried changing lg_prof_interval:30 to lg_prof_interval:25 which produces .heap files much more frequently, but I've found the JNI function names I mentioned previously have been replaced by addresses (0x000123...). I haven't made any other changes, and have double- and triple- checked that running with lg_prof_interval:30 lets me see the function names in the resulting .gifs, but lg_prof_interval:25 doesn't.
The reason I want to use lg_prof_interval:25 is because I need to re-run the profiling with a slimmed-down version of the application suffering the memory leaks, but I fear it will take forever to use 2^30 bytes so will never see a .heap file.
I've seen similar questions where the answers have suggested installing a dbg version of the jdk, yet I struggle to see why this could be the cause in my case.
Many thanks in advance.
TLDR: can see function names with lg_prof_interval:30 but not lg_prof_interval:25

Related

Nodejs | Chrome Memory Debugging

Context:
I have an Node.js application which memory seems to very high, I don't know if that is memory leak or not, because it is reduced after certain period time, but some times on heavy load it keeps on increasing takes much longer to get reduced.
So going through the articles and couple of videos, i figured that i have to heap snapshot and analyse what is causing the memory leak.
Steps:
I have taken 4 snap shots as of now in my local to reproduce the memory leak.
Snapshot 1: 800MB
Snapshot 2: 1400MB
Snapshot 3: 1600MB
Snapshot 4: 2000+MB
When i uploaded the heapdump files to chrome dev tools I see there a lot of information but i don't know how to proceed from there.
Please check below screenshot, it says there is constructor [array] which has 687206 as shallow Size & Retained Size is 721414 in the columns, so when expanded that constructor i can see there are 4097716 constructors created ( refer the second screenshot attached below ).
Question
What does internal array [] means ? Why is there 4097716 created ?
How can a filter out the constructor which created by my app and showing me that instead of some system/v8 engine constructor ?
In the same screenshot one of the constructor uses global variable called tenantRequire function, this is custom global function which is being used internally in some places instead of normal Node.js require, I see the variable across all the constructor like "Array", "Object". This is that global tenantRequire code for reference. It is just patched require function with trycatch. Is this causing the memory leak somehow ?
Refer screenshot 3, [string] constructor it has 270303848 as shallow size. When i expanded it shows modules loaded by Node.js. Question why is this taking that much size ? & Why is my lodash modules are repeated in that string constructor ?
Without knowing much about your app and the actions that cause the high memory usage, it's hard to tell what could be the issue. Which tool did you use to record the heap snapshot? What is the sequence of operations you did when you recorded the snapshot? Could you add this information to your question?
A couple of remarks
You tagged the question with node.js and showed Chrome DevTools. That's ok. You can totally take a heap snapshot of a Node.js application and analyze it in Chrome DevTools. But since both Node.js and Chrome use the same JS engine (V8) and the same garbage collector (Orinoco), it might be a bit confusing for someone who reads the question. Just to make sure I understand it correctly: the issue is in a Node.js app, not in a browser app. And you are using Chrome just to analyze the heap snapshot. Right?
Also, you wrote that you took the snapshots to reproduce the memory leak. That's not correct. You performed some action which you thought would cause a high memory usage, recorded a heap snapshot, and later loaded the snapshot in Chrome DevTools to observe the supposed memory leak.
Trace first, profile second
Every time you suspect a performance issue, you should first use tracing to understand which functions in your applications are problematic (i.e. slow, create a lot of objects that have to be garbage-collected, etc).
Then, when you know which functions to focus on, you can profile them.
Try these visual tools
There are a few tools that can help you with tracing/profiling your app. Have a look a FlameScope (a web app) and node-clinic (a suite of tools). There is also Perfetto, but I think it's for Chrome apps, not Node.js apps.
I also highly recommend the V8 blog.

Meaning warning "File is touched by more than one package"

I am creating a simple linux kernel with buildroot and I am adding a small driver I've done myself, I created the Config.in file and drivername.mk to be able to select the driver in make menuconfig succesfully.
When executing make to build the image, the compilation goes correctly until my driver starts to compile, it looks to compile and create the image right but I get loooots of warnings saying that different files in ./lib/gcc/arm-buildroot-linux-uclibcgnueabihf/ are touched by more than one package: [u'host-gcc-initial', u'host-gcc-final'].
Anyone can explain me a bit about this issue and what is causing it? Do you need any more info to know what is happening? Is it safe to ignore them?
Thanks beforehand
Actually, doing a search on 'touched by more than one package', I found http://lists.busybox.net/pipermail/buildroot/2017-October/205602.html, where we find that this warning can safely be ignored if you're not doing a parallel build and aren't a kernel maintainer.
That said, if you're submitting code for inclusion in the Linux kernel, please be a good citizen and make sure you identify all of the things your code is dependent upon. (I'm not actually an active kernel hacker, so I don't know what method they're using for this right now.)
The basic idea is that there are a bunch of steps in compiling things that need to be done in a logical order. In a small project, we simply use dependencies that we know to put in because we also coded in that dependency. But with a project the size of the kernel, you can guarantee that not everyone does this. Some of them instead just specify dependencies if they're needed for things to build properly - if the default order works, things could go years before someone figures out that there was a missing dependency, causing them grief when they were trying to update just the one thing that was a missing dependency, and the other code not getting updated as a result.
When you're doing things in parallel, on the other hand, it becomes a lot more complicated. Now you really need to have every dependency specified, because there is no longer any inherent dependable order. Some people will probably still build serially, while others use two processing threads. I'll use 8. I've worked in groups that would be inclined to do 30, because they're on a 32 processor machine, and don't really need all of those during the off hours. Suddenly the fact that the file you needed from a directory that normally got processed 30 directories before yours is now getting processed at the same time as your file that needed it, because you didn't list the dependency and everything in those 30 directories that hasn't already been processed and isn't being processed has a dependency that's not yet finished its processing.

I-7188ex's weird behaviour

I have quite complex I/O program (written by someone else) for controller ICPDAS i-7188ex and I am writing a library (.lib) for it that does some calculations based on data from that program.
Problem is, if I import function with only one line printf("123") and embed it inside I/O, program crashes at some point. Without imported function I/O works fine, same goes for imported function without I/O.
Maybe it is a memory issue but why should considerable memory be allocated for function which only outputs a string? Or I am completely wrong?
I am using Borland C++ 3.1. And yes, I can't use anything newer since controller takes only 80186 instruction set.
If your code is complex then sometimes your compiler can get stuck and compile it wrongly messing things up with unpredictable behavior. Happen to me many times when the code grows ... In such case usually swapping few lines of code (if you can without breaking functionality) or even adding few empty or rem lines inside code sometimes helps. Problem is to find the place where it do its thing. You can also divide your program into several files compile each separately to obj and then just link them to the final file ...
The error description remembers me of one I did fight with a long time. If you are using class/struct/template try this:
bds 2006 C hidden memory manager conflicts
may be it will help (did not test this for old turbo).
What do you mean by embed into I/O ? are you creating a sys driver file? If that is the case you need to make sure you are not messing with CPU registers. That could cause a lot of problems try to use
void some_function_or_whatever()
{
asm { pusha };
// here your code
printf("123");
asm { popa };
}
If you writing ISR handlers then you need to use interrupt keyword so compiler returns from it properly.
Without actual code and or MCVE is hard to point any specifics ...
If you can port this into BDS2006 or newer version (just for debug not really functional) then it will analyse your code more carefully and can detect a lot of hidden errors (was supprised when I ported from BCB series into BDS2006). Also there is CodeGuard option in the compiler which is ideal for finding such errors on runtime (but I fear you will not be able to run your lib without the I/O hw present in emulated DOS)

memory leaks in Microsoft.FSharp.Control.Mailbox?

I'm hunting for some memory-leaks in a long runing service (using F#) right now.
The only "strange" thing I've seen so far is the following:
I use a MailboxProcessor in a subsystem with an algebraic-datatype named QueueChannelCommands (more or less a bunch of Add/Get commands - some with AsyncReplyChannels attached)
when I profile the service (using Ants Memory Profiler) I see instances of arrays of mentioned type (most having lenght 4, but growing) - all empty (null) whose references seems to be held by Control.Mailbox:
I cannot see any reason in my code for this behaviour (your standard code you can find in every Mailbox-example out there - just a loop with a let! = receive and a match to follow ended with a return! loop()
Has anyone seen this kind of behaviour before or even knows how to handle this?
Or is this even a (known) bug?
Update: the growing of the arrays is really strange - seems like there is additional space appended without beeing used properly:
I am not a F# expert by any means but maybe you can look at the first answer in this thread:
Does Async.StartChild have a memory leak?
The first reply mentions a tutorial for memory profiling on the following page:
http://moiraesoftware.com/blog/2011/12/11/fixing-a-hole/
But they mention this open source version of F#
https://github.com/fsharp/fsharp/blob/master/src/fsharp/FSharp.Core/control.fs
And I am not sure it is what you are looking for (about this open source version of F# in the last point), but maybe it can help you to find the source of the leak or prove that it is actually leaking memory.
Hope that helps somehow maybe ?
Tony
.NET has its own garbage collector, which works quite nicely.
The most common way to cause memory leaks in .NET technologies is by setting up delegates, and not removing them on object deconstructors.

Under Linux, how do I track down a memory leak in pre-built software?

I have a new Ubuntu Linux Server 64bit 10.04 LTS.
A default install of Mysql with replication turned on appears to be leaking memory.
However, we've tried going back to an earlier version and memory is still leaking but I can't tell where.
What tools/techniques can I use to pinpoint where memory is leaking so that I can rectify the problem?
Valgrind, http://valgrind.org/, can be very useful in these situations. It runs on unmodified executables but it does help tremendously if you can install the debugging symbols. Be sure to use the --show-reachable=yes flag as the leaked memory may still be reachable in some way but just not the way you want it. Also --trace-children in case of a fork. You'll likely have to track down in the start-up script where the executable is called and then add something like the following:
valgrind --show-reachable=yes --trace-children=yes --log-file=/path/to/log SQL-cmdline sqlargs
The man page has lots of other potentially useful options.
Have you tried the MySQL mailing list? Something like this would certainly be of interest to them if you can reproduce it in a straightforward manner.
You can use Valgrind as ninjalj suggests, but I doubt you'll get that close to anything useful. Even if you see a real leak (and they will be hard enough to validate), tracking down the root cause through the C call stacks will likely be very annoying (for example if the leak is triggered by a particular SQL pattern or stored procedure, you'll be looking at the call stack from the resultant optimized query, and not the original calls, which are likely in a different language).
Normally you might have no recourse, and have to resort to tracking it down through callstacks and iterative testing, but you have the source code to MySQL (including the source for the exact default package install), so you can use more advanced tools like MemoryScape (or at least build with symbols in order to provide Valgrind more food for thought).
Try using valgrind.
A very good and powerful tool, which is installed/available for most distributions is Valgrind.
It has a plethora of different options and is pretty much (as far as I've seen) the default profiler under linux systems.

Resources