Static Analysis Tools for GLib and GDBus

Does anyone know of any tools or techniques for detecting memory leaks when using GLib and GDBus? I am relatively new to both libraries and believe I am using the APIs correctly, but it would be great if there were a tool I could use to confirm that I am cleaning up my resources correctly. I have run my code through various lint-type programs, but these likely cannot detect anything abstracted away into a library.
I am looking for either a tool aimed specifically at GLib or GDBus, or a tool that I could instrument to target these libraries. Maybe there are even some compile-time flags that I can set for GLib or GDBus?
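To make the question concrete, here is a minimal sketch of the kind of cleanup involved, assuming gio-2.0 is available; the ListNames call is only an illustrative example, not taken from any real code:

/* build: gcc gdbus-demo.c $(pkg-config --cflags --libs gio-2.0) -o gdbus-demo */
#include <gio/gio.h>

int main(void)
{
    GError *error = NULL;

    /* we own a reference on the shared connection */
    GDBusConnection *bus = g_bus_get_sync(G_BUS_TYPE_SESSION, NULL, &error);
    if (bus == NULL) {
        g_printerr("bus error: %s\n", error->message);
        g_error_free(error);
        return 1;
    }

    /* a simple synchronous call; the reply is a new reference we must drop */
    GVariant *reply = g_dbus_connection_call_sync(
        bus, "org.freedesktop.DBus", "/org/freedesktop/DBus",
        "org.freedesktop.DBus", "ListNames", NULL,
        G_VARIANT_TYPE("(as)"), G_DBUS_CALL_FLAGS_NONE, -1, NULL, &error);

    if (reply != NULL)
        g_variant_unref(reply);   /* forgetting this is a typical GDBus leak */
    else
        g_error_free(error);

    g_object_unref(bus);          /* drop our connection reference */
    return 0;
}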

I just recently did some voodoo with glib/gdbus/libsoup, and from my experience valgrind and valgrind/massif do a very good job (though they are runtime analysis rather than static analysis).
valgrind (G_SLICE=always-malloc makes GLib use malloc even for g_slice_alloc()/g_slice_new(), which confuses valgrind less; G_DEBUG=gc-friendly makes GLib nullify all of its internal pointers):
G_DEBUG=gc-friendly G_SLICE=always-malloc valgrind ./yourapp
There will still be false positives in the output – use a suppression file to hide them.
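To sanity-check the setup, a deliberately leaky toy program should show up as "definitely lost" under the invocation above (a minimal sketch; the file name leaky.c and the strings are made up):

/* build: gcc leaky.c $(pkg-config --cflags --libs glib-2.0) -g -o leaky */
#include <glib.h>

int main(void)
{
    gchar *leaked = g_strdup("never freed");     /* deliberately leaked */
    gchar *freed  = g_strdup("freed properly");

    g_print("%s %s\n", leaked, freed);

    g_free(freed);
    /* g_free(leaked) is intentionally missing: valgrind --leak-check=full
       should report this allocation as definitely lost */
    return 0;
}

G_DEBUG=gc-friendly G_SLICE=always-malloc valgrind --leak-check=full ./leaky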
massif (use G_DEBUG=resident-modules to prevent a lot of noise):
G_DEBUG=resident-modules valgrind --tool=massif --depth=10 --max-snapshots=1000 --alloc-fn=g_malloc --alloc-fn=g_realloc --alloc-fn=g_try_malloc --alloc-fn=g_malloc0 --alloc-fn=g_mem_chunk_alloc --threshold=0.01 ./yourapp --your --app --options
Use a visualization tool to make massif's output readable (the logs can run to a couple of MB); massif-visualizer does a good job.
Keep in mind that glib has a couple of MB of statically allocated data (all the GObject type classes).
If you need to debug the libraries themselves, there is no way around compiling them with debug flags (-g).

Related

Valgrind is not detecting HDF5 leaked resources

I have noticed that Valgrind does not detect resources created with the C API of HDF5 that are not closed before the end of the program, even though I launched it with the option --leak-check=full. Is that normal?
I often rely on Valgrind before shipping code, but today I was surprised and frustrated to find, when reviewing the code, that the leak was not detected by it.
The valgrind memcheck tool detects memory allocated/released by the 'standard' allocators, such as malloc/free/new/delete/...
If the C API of HDF5 is not using (internally) the above standard allocators, then there is no way that valgrind could guess by itself what to monitor.
If HDF5 implements its own heap management (e.g. based on mmap, cutting those blocks into smaller allocated blocks), then valgrind provides 'client requests' that allow you to add some valgrind support for such non-standard allocators. But that all implies some work in the HDF5 sources.
See e.g. http://www.valgrind.org/docs/manual/mc-manual.html#mc-manual.mempools for more information about how to describe such non-standard allocators.
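To give an idea of what those client requests look like, here is a minimal sketch of a toy bump allocator annotated with the memcheck mempool macros (the Pool type and functions are made up for illustration; HDF5 would have to do something similar in its own sources):

#include <stdlib.h>
#include <valgrind/memcheck.h>

typedef struct {
    char  *base;   /* one big block carved into smaller allocations */
    size_t used;
    size_t size;
} Pool;

Pool *pool_create(size_t size)
{
    Pool *pool = malloc(sizeof *pool);
    pool->base = malloc(size);
    pool->used = 0;
    pool->size = size;
    /* declare a custom pool anchored at 'pool': no redzones, not zeroed */
    VALGRIND_CREATE_MEMPOOL(pool, 0, 0);
    return pool;
}

void *pool_alloc(Pool *pool, size_t size)
{
    void *p = pool->base + pool->used;   /* naive bump allocation */
    pool->used += size;
    /* let memcheck track this chunk as if it were malloc'd */
    VALGRIND_MEMPOOL_ALLOC(pool, p, size);
    return p;
}

void pool_free(Pool *pool, void *p)
{
    /* let memcheck track this as if it were free'd */
    VALGRIND_MEMPOOL_FREE(pool, p);
}

void pool_destroy(Pool *pool)
{
    VALGRIND_DESTROY_MEMPOOL(pool);
    free(pool->base);
    free(pool);
}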
Some libraries/tools that implement their own non-standard allocators sometimes have a way (e.g. an environment variable) to bypass them and still use malloc/free/... Again, it is up to HDF5 to provide this.
If HDF5 really does use the standard allocators and valgrind cannot track what it does, then file a bug on the valgrind bugzilla.

Profiling a preloaded shared library with LD_PROFILE

I'm currently trying to profile a preloaded shared library by using the LD_PROFILE environment variable.
I compile the library with the "-g" flag and export LD_PROFILE_OUTPUT as well as LD_PROFILE before running an application (ncat in my case) with the preloaded library. More precisely, what I do is the following:
Compile the shared library libexample.so with the "-g" flag.
export LD_PROFILE_OUTPUT=`pwd`
export LD_PROFILE=libexample.so
run LD_PRELOAD=`pwd`/libexample.so ncat ...
The preloading itself does work and my library is used, but no file libexample.so.profile gets created. If I use export LD_PROFILE=libc.so.6 instead, there is a file libc.so.6.profile as expected.
Is this a problem of combining LD_PRELOAD and LD_PROFILE or is there anything I might have done wrong?
I'm using glibc v2.12 on CentOS 6.4 if that is of any relevance.
Thanks a lot!
Sorry, I don't know why LD_PROFILE does not work with LD_PRELOAD.
However, for profiling binaries compiled with -g I really like valgrind together with the graphical tool kcachegrind.
valgrind --tool=callgrind /path/to/some/binary with options
will create a file called something like callgrind.out.1234, where 1234 is the pid of the program when it was run. That file can be analyzed with:
kcachegrind callgrind.out.1234
In kcachegrind you will easily see in which functions most CPU time is spent; the callee map also shows this in a nice graphical way. The call graph might help you understand how the program works. You can even look at the source code to see how much CPU time is spent on each line.
I hope you will find valgrind useful even though this was not the answer to your LD_PROFILE question. The drawback of valgrind is that it slows things down, whether it is used for profiling or memory checking.

Under Linux, how do I track down a memory leak in pre-built software?

I have a new Ubuntu Linux Server 64bit 10.04 LTS.
A default install of MySQL with replication turned on appears to be leaking memory.
We've tried going back to an earlier version, but memory is still leaking and I can't tell where.
What tools/techniques can I use to pinpoint where memory is leaking so that I can rectify the problem?
Valgrind, http://valgrind.org/, can be very useful in these situations. It runs on unmodified executables, but it does help tremendously if you can install the debugging symbols. Be sure to use the --show-reachable=yes flag, as the leaked memory may still be reachable in some way, just not the way you want it. Also use --trace-children=yes in case of a fork. You'll likely have to track down in the start-up script where the executable is called and then add something like the following:
valgrind --show-reachable=yes --trace-children=yes --log-file=/path/to/log SQL-cmdline sqlargs
The man page has lots of other potentially useful options.
Have you tried the MySQL mailing list? Something like this would certainly be of interest to them if you can reproduce it in a straightforward manner.
You can use Valgrind as ninjalj suggests, but I doubt you'll get that close to anything useful. Even if you see a real leak (and they will be hard enough to validate), tracking down the root cause through the C call stacks will likely be very annoying: for example, if the leak is triggered by a particular SQL pattern or stored procedure, you'll be looking at the call stack of the resulting optimized query, not the original calls, which are likely in a different language.
Normally you might have no recourse and have to resort to tracking it down through call stacks and iterative testing, but you have the source code to MySQL (including the source for the exact default package install), so you can use more advanced tools like MemoryScape (or at least build with symbols in order to provide Valgrind more food for thought).
Try using valgrind.
A very good and powerful tool, which is installed or available for most distributions, is Valgrind.
It has a plethora of different options and is pretty much (as far as I've seen) the default profiler on Linux systems.

How to profile program on Linux platform without rebuilding?

I've used two profiling tools (VTune on Windows and dbx (within Sun Studio) on Solaris) which can profile a program without rebuilding it, and during profiling the program runs at the same speed as normal. Both of these features saved me a lot of time.
Now I want to know if there are free tools available on the Linux platform that can do the same thing. I think I need a sampling-based profiler. VTune is good but expensive... I've heard of gprof and valgrind, but it seems gprof needs to instrument the program (so we have to rebuild it), and valgrind slows down program execution quite a lot (according to valgrind's introduction, Cachegrind runs programs about 20--100x slower than normal, and Callgrind, which is what I would need for profiling, is based on Cachegrind).
For profiling, I just need to figure out the execution time of function calls so I can find out where the performance degradation happens. I don't actually need the low-level profiling information that Cachegrind provides...
oprofile is pretty good, but it can be difficult to set up. It also doesn't require you to rebuild your program.
Agreeing with Paul, I think Zoom is probably the best Linux profiler you can pay for.
However, for real results, I rely on this simple method, which I've been using since before profilers were invented.
Performance Counters for Linux (perf) is a new tool usable on kernels 2.6.31 and later; it's less intrusive (to both the program and the system as a whole) than valgrind or OProfile.
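Assuming perf is installed, a typical sampling session looks something like this (./yourapp stands in for your binary):
perf record -g ./yourapp --your --app --options
perf report
perf record samples the running program with call-graph information, and perf report presents an interactive breakdown of where the time went, by function.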
A nicer option than oprofile is Zoom. It's similar to Shark on Mac OS X, if you have ever used that. It's commercial ($199) but you can get a free trial from www.rotateright.com.

Why is the startup of an app on Linux slower when using shared libs?

On the embedded device I'm working on, the startup time is an important issue. The whole application consists of several executables that use a set of libraries. Because space in FLASH memory is limited we'd like to use shared libraries.
The application works as usual when compiled and linked with shared libraries, and the amount of FLASH memory used is reduced as expected.
The difference from the version linked against static libs is that the application's startup time is about 20 s longer, and I have no idea why.
The application runs on an ARM9 CPU at 180 MHz with a Linux 2.6.17 kernel, 16 MB of FLASH (JFFS file system) and 32 MB of RAM.
Because shared libraries have to be linked at runtime, usually by dlopen() or something similar. There's no such step for static libraries.
Edit: some more detail. dlopen has to perform the following tasks.
Find the shared library
Load it into memory
Recursively load all dependencies (and their dependencies....)
Resolve all symbols
This requires quite a lot of I/O operations to accomplish.
In a statically linked program all of the above is done at compile time, not runtime, so a statically linked program loads much faster.
In your case, the difference is exaggerated by the relatively slow hardware your code has to run on.
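As a rough illustration of those steps, the same work can be triggered by hand through the libdl API (a minimal sketch; libm and cos are just stand-in names for whatever library and symbol you load):

#include <dlfcn.h>
#include <stdio.h>

int main(void)
{
    /* find the library, map it and its dependencies into memory,
       and resolve all symbols immediately (RTLD_NOW) */
    void *handle = dlopen("libm.so.6", RTLD_NOW);
    if (handle == NULL) {
        fprintf(stderr, "dlopen: %s\n", dlerror());
        return 1;
    }

    /* look up one symbol by name */
    double (*cosine)(double) = (double (*)(double))dlsym(handle, "cos");
    if (cosine != NULL)
        printf("cos(0.0) = %f\n", cosine(0.0));

    dlclose(handle);
    return 0;
}

Build with -ldl; RTLD_NOW forces all symbol resolution to happen at load time, which is exactly the work that makes a dynamically linked start-up slower.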
This is a fine example of the classic tradeoff of speed and space.
You can statically link all your executables so that they are faster but then they will take more space
OR
You can have shared libraries that take less space but also more time to load.
So decide what you want to sacrifice.
There are many factors behind this difference (OS, compiler, etc.), but a good list of reasons can be found here. Basically, shared libraries were created for space reasons, and much of the "magic" involved in making them work comes at a performance cost.
(As a historical note, the original Netscape Navigator on Linux/Unix was a statically linked big fat executable.)
This may help others with similar problems:
The reason startup took so long in my case was that GCC's default is to export all symbols inside a library.
A big improvement is to compile with -fvisibility=hidden.
All symbols that the lib has to export then have to be annotated with
__attribute__ ((visibility("default")))
See the GCC wiki
and the very fine article "How To Write Shared Libraries".
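A minimal sketch of what that looks like (the EXPORT macro and function names are only illustrative):

/* build: gcc -shared -fPIC -fvisibility=hidden example.c -o libexample.so */
#define EXPORT __attribute__ ((visibility("default")))

/* part of the public API: explicitly exported */
EXPORT int example_api_call(void)
{
    return 42;
}

/* hidden by -fvisibility=hidden: the dynamic linker never
   has to process this symbol at load time */
int example_internal_helper(void)
{
    return 7;
}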
OK, I have learned now that the usage of shared libraries has its disadvantages concerning speed. I found this article about dynamic linking and loading enlightening. The loading process seems to be much lengthier than I had expected.
Interesting... typically the loading time for a shared library is unnoticeable compared to a statically linked fat app. So I can only surmise that the system is either very slow to load a library from flash memory, or the library that is loaded is being checked in some way (e.g. .NET apps verify a checksum for all loaded DLLs, which increases startup time considerably in some cases). It could also be that the shared libraries are loaded as needed and unloaded afterwards, which could indicate a configuration problem.
So, sorry, I can't say why, but I think it's an issue with your ARM device/OS. Have you tried instrumenting the startup code, or statically linking with one of the most commonly used libraries to see if that makes a large difference? Also, put the shared libs in the same directory as the app to reduce the time it takes to search the file system for them.
One option which seems obvious to me is to statically link the several programs into a single binary. That way you continue to share as much code as possible (probably more than before), but you also avoid the overhead of the dynamic linker AND save the space of having the dynamic linker on the system at all.
It's pretty easy to combine several executables into one; you normally just examine argv[0] and decide which routine to call based on that.
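A minimal sketch of that dispatch (BusyBox uses the same trick; tool_foo/tool_bar are hypothetical applets):

#include <libgen.h>
#include <stdio.h>
#include <string.h>

static int tool_foo(void) { printf("running foo\n"); return 0; }
static int tool_bar(void) { printf("running bar\n"); return 0; }

int main(int argc, char **argv)
{
    (void)argc;  /* only argv[0] matters here */

    /* install one binary and create symlinks to it named foo and bar;
       dispatch on the name we were invoked as */
    char *name = basename(argv[0]);

    if (strcmp(name, "foo") == 0)
        return tool_foo();
    if (strcmp(name, "bar") == 0)
        return tool_bar();

    fprintf(stderr, "unknown applet: %s\n", name);
    return 1;
}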
