Valgrind is not detecting HDF5 leaked resources - memory-leaks

I have noticed that Valgrind is not detecting resources created with the C API of HDF5 and that are not closed before the end of the program, though I launched it with the option --leak-check=full. Is that normal ?
I often rely on Valgrind before shipping the code, but today I was surprised and frustrated when reviewing the code that it was not detected by it.

valgrind memcheck tool detects memory allocated/released by the 'standard' allocators, such as malloc/free/new/delete/...
If the C API of HDF5 is not using (internally) the above standard allocators,
then there is no way that valgrind could guess by itself what to monitor.
If HDF5 is implementing its own heap management (e.g.based on mmap, and cutting
these blocks in smaller allocated blocks),
then valgrind provides 'client requests' allowing to have some valgrind support
for such non standard allocators. But that all implies some work in the HDF5
sources.
See e.g. http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools
for more information about how to describe such non standard allocators.
Some libraries/tools that are implementing their own non standard allocators
have sometime a way (e.g. an environment variable) to indicate to bypass
these non standard allocators, and still use malloc/free/...
Again, up to HDF5 to provide this.
If now HDF5 really uses the standard allocators and valgrind cannot track
what it does, then file a bug on valgrind bugzilla.

Related

best approach to debug "corrupted double-linked list" crash

I am in the process of debugging a "corrupted double-linked list" crash. I have seen the source and understand the chunk struct and the fd/bk pointers, etc, so I think I know why this crash has occurred. I am now trying to fix it and I have a couple of questions.
Question #1: where (with respect to the pointer returned from malloc) is the malloc_chunks struct maintained? Are they before the memory block or after it?
Question #2: the malloc_chunks for allocated memory are different from the malloc_chunks for unallocated memory. It appears (??) that the allocated buffer case does not have the fd/bk pointers. Is this correct?
Question #3: what is the recommended approach to debug this type of error? I am assuming that I should put a break point for the malloc_chunks so I can break on when the struct is overwritten. But I am not sure how to access those malloc structs so I can set a break point in gdb.
Any suggestions on how to proceed would be very appreciated.
Thanks,
-Andres
what is the recommended approach to debug this type of error?
The usual way is not to peek into GLIBC internals, but to use a tool like Valgrind or AddressSanitizer, either of which is likely to point you straight at the problem.
Update:
Valgrind crashes ...
You should try building the latest Valgrind version from source, and if that still crashes, report the crash to Valgrind developers.
Chances are the Valgrind problem is already fixed, and building new Valgrind and testing your program with it will still be faster than trying to debug GLIBC internals (heap corruption bugs are notoriously difficult to find by program inspection or debugging).
AddressSanitizer, I thought it was a clang only tool -- I do not think it is available for linux.
Two points:
Clang works just fine on Linux, I use it almost every day,
Recent GCC versions have an equivalent -fsanitize=address option.
There are ways to debug heap overruns without valgrind.
One way is to use a malloc debug library such as Electric Fence. It will make your rogram crash exactly at the moment of accessing an illegal address in the heap.
The other way is to use built-in debug capabilities of GNU malloc. See man mcheck. If you call mcheck_pedantic before the first call to malloc, then every memory block is checked at every allocation. This is very slow but does allow you to isolate the fault.

Static Analysis Tools for GLib and GDBus

Does anyone know of any tools or techniques for detecting memory leaks when using GLib and GDBus? I am relatively new to using both libraries and believe I am using the API's correctly, but it would be great if there was a tool that I could use to confirm that I am cleaning up my resources correctly. I have ran my code through various lint-type programs, but these likely do not detect anything abstracted away into a library.
I am looking for either a tool aimed specifically at GLib or GDBus or a tool that I could instrument so target these libraries? Maybe there are even some compile time flags that I can set for GLib or GDBus?
I just recently did some voodoo with glib/gdbus/libsoup and from my experience valgrind and valgrind/massif do a very good job (though not really static analysis but runtime analysis).
valgrind (use malloc even for g_slice_alloc/g_slice_new, makes valgrind less confused, gc-friendly nullifies all glib internal pointers)
G_DEBUG=gc-friendly G_SLICE=always-malloc valgrind ./yourapp
There will be still false positives in the output – use a supression file to hide them.
massif (use resident modules to prevent a lot of noise)
G_DEBUG=resident-modules valgrind --tool=massif --depth=10 --max-snapshots=1000 --alloc-fn=g_malloc --alloc-fn=g_realloc --alloc-fn=g_try_malloc --alloc-fn=g_malloc0 --alloc-fn=g_mem_chunk_alloc --threshold=0.01 ./yourapp --your --app --options
Use some visualization tool to make massifs output readable (couple of MB logs) massif-visualizer does a good job
Keep in mind that glib has a couple of MB of static allocated stuff (all the GObject type classes)
If you need to debug the libraries themself, there is no way around compiling them with debug flags (-g)

Allocating a data page in linux with NX bit turned off

I would like to generate some machine code in my program and then run it. One way to do it would be to write out a .so file and then load it in the program but that seems too expensive.
IS there a way in linux for me to write out the code in my data pages and then set my function ointer there and just call it? I've seen something similar on windows where you can allocate a page with the NX protection turned off for that page, but I can't find a similar OS call for linux.
The mmap(2) (with munmap(2)) and mprotect(2) syscalls are the elementary operations to do that. Recall that syscalls are elementary operations from the point of view of an application. You want PROT_EXEC
You could just strace any dynamically linked executable to get a clue about how you might call them, since the dynamic linker ld.so is using them.
Generating a shared object might be less expensive than you imagine. Actually, generating C code, running the compiler, then dlopen-ing the resulting shared object has some sense, even when you work interactively. My MELT domain specific language (to extend GCC) is doing this. Recall that you can do a big lot of dlopen-s without issues.
If you want to generate machine code in memory, you could use GNU lightning (quick generation of slow machine code), libjit from dotgnu (generate less bad machine code), LuaJit, asmjit (x86 or amd64 specific), LLVM (slowly generate optimized machine code). BTW, the SBCL Common Lisp implementation is dynamically compiling to memory and produces good machine code at runtime (and there is also all the JIT for JVMs doing that).

Under Linux, how do I track down a memory leak in pre-built software?

I have a new Ubuntu Linux Server 64bit 10.04 LTS.
A default install of Mysql with replication turned on appears to be leaking memory.
However, we've tried going back to an earlier version and memory is still leaking but I can't tell where.
What tools/techniques can I use to pinpoint where memory is leaking so that I can rectify the problem?
Valgrind, http://valgrind.org/, can be very useful in these situations. It runs on unmodified executables but it does help tremendously if you can install the debugging symbols. Be sure to use the --show-reachable=yes flag as the leaked memory may still be reachable in some way but just not the way you want it. Also --trace-children in case of a fork. You'll likely have to track down in the start-up script where the executable is called and then add something like the following:
valgrind --show-reachable=yes --trace-children=yes --log-file=/path/to/log SQL-cmdline sqlargs
The man page has lots of other potentially useful options.
Have you tried the MySQL mailing list? Something like this would certainly be of interest to them if you can reproduce it in a straightforward manner.
You can use Valgrind as ninjalj suggests, but I doubt you'll get that close to anything useful. Even if you see a real leak (and they will be hard enough to validate), tracking down the root cause through the C call stacks will likely be very annoying (for example if the leak is triggered by a particular SQL pattern or stored procedure, you'll be looking at the call stack from the resultant optimized query, and not the original calls, which are likely in a different language).
Normally you might have no recourse, and have to resort to tracking it down through callstacks and iterative testing, but you have the source code to MySQL (including the source for the exact default package install), so you can use more advanced tools like MemoryScape (or at least build with symbols in order to provide Valgrind more food for thought).
Try using valgrind.
A very good and powerful tool, which is installed/available for most distributions is Valgrind.
It has a plethora of different options and is pretty much (as far as I've seen) the default profiler under linux systems.

Can you recommend a good debugging malloc library for linux?

Can you recommend a good debugging malloc library for linux? I know there are a lot of options out there, I just need to know which libraries people are actually using to solve real-life problems.
Thanks!
EDIT: I know about Valgrind, but sometimes the performance is really too low.
Valgrind. :-) It's not a malloc library, but, it's really good at finding memory management and memory usage bugs.
http://valgrind.org/ for finding memory leaks and heap corruption.
http://dmalloc.com/ for general purpose heap debugging.
gcc now comes with sanitizers which are much more faster than valgrind. you can check different compiler options under -fsanitize. More info here
The GNU C library itself has some debugging features and hooks you can use to add your own.
For documentation on a Linux system type info libc and then g Heap<TAB>. Another useful info node is "Hooks for Malloc", you can get there with g Hooks<TAB>
This might not be very useful to you, but you could write your own malloc wrapper. In our special "diagnostic" builds it keeps a table of all outstanding allocations (including the file name and line number where the allocation occurred) and prints out anything that was still outstanding at exit time. It also uses canary words (to check for buffer overflows) and a combination of memory re-writing and block checksumming after free and before reallocation (to check for use-after-free).
If your product is sufficiently large it might be annoying to have to find-replace your entire source, hoping for the best. Also, the development time for your own malloc wrapper is probably not negligible. Doing lots of heavyweight stuff like what I mentioned above probably won't help out your speed problem, either. Writing your own wrapper would allow the most flexibility, though.

Resources