Alternative to valgrind (memcheck) for finding leaks on linux? [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
We don’t allow questions seeking recommendations for books, tools, software libraries, and more. You can edit the question so it can be answered with facts and citations.
Closed 2 years ago.
I have a linux x86 application that makes use of various third-party shared-object libraries. I suspect these libraries are leaking memory (since it can't possibly be my code ;-)
I tried the trusty valgrind, but it died a horrible death because one of the third-party libraries is using an obscure x86 instruction that valgrind doesn't implement.
I found a recommendation for DUMA and gave it a try (using the LD_PRELOAD technique to bring DUMA in at run-time), but it aborted complaining about a free operation on memory that wasn't allocated via DUMA (almost certainly by some constructor of a static object in one of the previously mentioned third-party libraries).
Are there other run-time-linkable (or otherwise not requiring a recompilation/relink) tools around that will work on linux?

Give Dr. Memory a try. It is based on DynamoRIO and shares many features with Valgrind.

The TotalView debugger (or, more precisely, its MemoryScape) has a feature set similar to Valgrind's.
You can also try Electric Fence (the origin of DUMA) for buffer overflows or touch-after-free cases, though not for memory leaks.

In 2020, to find memory leaks on Linux, you may try:
Address Sanitizer
Both GCC (4.8 and later) and Clang (3.1 and later) support the address sanitizer (ASan).
The tool has proved useful in large projects such as Chromium and Firefox.
It is much faster than older alternatives such as Valgrind.
ASan provides very detailed information about the memory region involved, which is very helpful when analyzing a leak.
The drawbacks of ASan: you need to build your program with the option -fsanitize=address, and the extra memory cost is much higher.
TCMalloc
TCMalloc can be used either via LD_PRELOAD or by linking it directly into your program. The result can be visualized with the pprof program, which has both a web UI and a console text mode. I suggest using it if the address sanitizer is not applicable in your environment (for example, if you have a very old compiler or your machine has too little memory to run ASan).
TCMalloc is also used in large-scale production and has proved to be robust.
Linux perf tools and BCC
Linux perf tools can also be used to find memory leaks. perf is based on sampling, so it cannot be precise, but it is still a great help in analyzing memory usage.
There is also the memleak script from the bcc tools:
./memleak -p $(pidof allocs)
        Trace allocations and display a summary of "leaked" (outstanding)
        allocations every 5 seconds
./memleak -p $(pidof allocs) -t
        Trace allocations and display each individual allocator function call
./memleak -ap $(pidof allocs) 10
        Trace allocations and display allocated addresses, sizes, and stacks
        every 10 seconds for outstanding allocations
./memleak -c "./allocs"
        Run the specified command and trace its allocations
./memleak
        Trace allocations in kernel mode and display a summary of outstanding
        allocations every 5 seconds
./memleak -o 60000
        Trace allocations in kernel mode and display a summary of outstanding
        allocations that are at least one minute (60 seconds) old
./memleak -s 5
        Trace roughly every 5th allocation, to reduce overhead
The advantage of such tools: the program does not need to be rebuilt, so they are handy for analyzing services that are already running.

Heapusage is a simple run-time tool for finding memory leaks on Linux and macOS. The output logging format for leaks is quite similar to Valgrind, but it only logs definite leaks (i.e. allocations not free'd at termination).
Full disclosure: I wrote Heapusage for usage in situations when Valgrind is inadequate (high performance applications, and also for CPU architectures not supported by Valgrind).

Related

Valgrind shows memory leak but no memory allocation took place

this is a rather simple question.
In my school we use a remote CentOS server to compile and test our programs. For some reason, valgrind always shows a 4096B leak in spite of the fact that no malloc was used. Does anyone here have any idea where this problem might stem from?
Your program makes a call to printf. This library might allocate memory for its own usage. More generally, depending on the OS/libc/..., various allocations might be done just to start a program.
Note also that in this case, you see that there is one block still allocated at exit, and that this block is part of the suppressed count. That means that valgrind suppression file already ensures that this memory does not appear in the list of leaks to be examined.
In summary: no problem.
In any case, when you suspect you have a leak, you can look at the details of the leaks e.g. their allocation stack trace to see if these are triggered by your application.
In addition to @phd's answer, there are a few things you can do to see more clearly what is going on.
If you run Valgrind with -s or -v it will show details of the suppressions used.
You can use --trace-malloc=yes to see all calls to allocation functions (only do that for small applications). Similarly, you can run with --default-suppressions=no and then you will see the details of the memory (with --leak-check=full --show-reachable=yes in this case).
Finally, are you using an old CentOS / GNU libc? A few years ago Valgrind got a mechanism to clean up things like I/O buffers, so you shouldn't get this sort of message with a recent Valgrind and a recent Linux + libc.

Following memory allocation in gdb

Why is memory consumption jumping unpredictably as I step through a program in the gdb debugger? I'm trying to use gdb to find out why a program is using far more memory than it should, and it's not cooperating.
I step through the source code while monitoring process memory usage, but I can't find what line(s) allocate the memory for two reasons:
Reported memory usage only jumps up in increments of (usually, but not always exactly) 64 MB. I suspect I'm seeing the effects of some memory manager I don't know about which reserves 64 MB at a time and masks multiple smaller allocations.
The jump doesn't happen at a consistent location in code. Not only does it occur on different lines during different gdb runs; it also sometimes happens in illogical places like the closing bracket of a (c++) function. Is it possible that gdb itself is affecting memory allocations?
Any ideas/suggestions for more effective tools to help me drill down to the code lines that are really responsible for these memory allocations?
Here's some relevant system info: I'm running x86_64-redhat-linux-gnu version 7.2-64.el6-5.2 on a virtual CentOS Linux machine under Windows. The program is built on a remote server via a complicated build script, so tracking down exactly what options were used at any point is itself a bit of a chore. I'm monitoring memory usage both with the top utility ("virt" or virtual memory column) and by reading the real-time monitoring file /proc/<pid>/status, and they agree. Since this program uses a large suite of third-party libraries, there may be one or more overridden malloc() functions involved somewhere that I don't know about--hunting them down is part of this task.
gdb, left to its own devices, will not affect the memory use of your program, though a run under gdb may differ from a standalone run for other reasons.
However, this also depends on the way you use gdb. If you are just setting simple breakpoints, stepping, and printing things, then you are ok. But sometimes, to evaluate an expression, gdb will allocate memory in the inferior. For example, if you have a breakpoint condition like strcmp(arg, "string") == 0, then gdb will allocate memory for that string constant. There are other cases like this as well.
This answer is in several parts because there were several things going on:
Valgrind with the Massif module (a memory profiler) was much more helpful than gdb for this problem. Sometimes a quick look with the debugger works, sometimes it doesn't. http://valgrind.org/docs/manual/ms-manual.html
top is a poor tool for profiling memory usage because it only reports virtual memory allocations, which in this case were about 3x the actual heap memory usage. Virtual memory is mapped and made available by the Unix kernel when a process asks for a memory block, but it's not necessarily used. The underlying system call is mmap(). I still don't know how to check the block size. top can only tell you what the Unix kernel knows about your memory consumption, which isn't enough to be helpful. Don't use it (or the memory files under /proc/) to do detailed memory profiling.
Memory allocation when stepping out of a function was caused by autolocks--that's a thread lock class whose destructor releases the lock when it goes out of scope. Then a different thread goes into action and allocates some memory, leaving the operator (me) mystified. Non-repeatability is probably because some threads were waiting for external resources like Internet connections.
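To make the point about virtual versus actually used memory concrete, here is a minimal Python sketch (Linux only). It assumes the usual VmSize and VmRSS fields in /proc/self/status; the helper names parse_status and read_self and the 64 MB size are illustrative choices, not taken from the program discussed. It creates an anonymous mapping with mmap and prints both figures before mapping, after mapping, and after touching every page, so you can see which counter moves at which step.

```python
import mmap


def parse_status(text):
    """Return {'VmSize': kB, 'VmRSS': kB} parsed from /proc/<pid>/status text."""
    fields = {}
    for line in text.splitlines():
        key, _, rest = line.partition(":")
        if key in ("VmSize", "VmRSS"):
            fields[key] = int(rest.split()[0])  # the kernel reports these in kB
    return fields


def read_self():
    # Assumption: a Linux /proc filesystem is mounted.
    with open("/proc/self/status") as f:
        return parse_status(f.read())


if __name__ == "__main__":
    size = 64 * 1024 * 1024
    before = read_self()
    region = mmap.mmap(-1, size)      # anonymous mapping: reserved, not yet used
    mapped = read_self()
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1            # touching a page makes it resident
    touched = read_self()
    for label, s in (("before", before), ("mapped", mapped), ("touched", touched)):
        print(f"{label:8s} VmSize={s['VmSize']} kB  VmRSS={s['VmRSS']} kB")
    region.close()
```

This only illustrates why the virtual figure overstates heap usage; it is not a substitute for a real memory profiler such as Massif.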

tcl script aborts : unable to realloc xxx bytes [closed]

Closed. This question does not meet Stack Overflow guidelines. It is not currently accepting answers.
This question appears to be off-topic because it lacks sufficient information to diagnose the problem. Describe your problem in more detail or include a minimal example in the question itself.
Closed 8 years ago.
My Tcl script aborts, saying that it is unable to realloc 2191392 bytes. This happens when the script runs for a long time, say, more than 10 hours. My Tcl script connects to devices using telnet and ssh connections and executes/verifies some command outputs on the devices. The Linux machine has plenty of RAM (32 GB), and ulimit is unlimited for process, data, and file size. My script's process doesn't use much memory; the worst case is < 1 GB. I wonder why memory allocation fails when there is plenty of RAM.
That message is an indication that an underlying malloc() call returned NULL (in a non-recoverable location), and given that it's not for very much, it's an indication that the system is thoroughly unable to allocate much memory. Depending on how your system is configured (32-bit vs. 64-bit; check what parray tcl_platform prints to find out) that could be an artefact of a few things, but if you think that it shouldn't be using more than a gigabyte of memory, it's an indication of a memory leak.
Unfortunately, it's hard to chase down memory leaks in general. Tcl's built-in memory command (enabled via configure --enable-symbols=mem at build time) can help, as can a tool like Electric Fence, but they are imperfect and can't generally tell you where you're getting things wrong (as you'll be looking for the absence of something to release memory). At the Tcl level, see whether each of the variables listed by info globals is of sensible size or whether there's a growing number of globals. You'll want to use tools like string length, array exists, array size and array names for this.
It's also possible for memory allocation to fail due to another process consuming so much memory that the OS starts to feel highly constrained. I hope this isn't happening in your case, since it's much harder to prevent.

Memory Debugging

Currently I analyze a C++ application and its memory consumption. Checking the memory consumption of the process before and after a certain function call is possible. However, it seems that, for technical reasons or for better efficiency the OS (Linux) assigns not only the required number of bytes but always a few more which can be consumed later by the application. This makes it hard to analyze the memory behavior of the application.
Is there a workaround? Can one switch Linux to a mode where it assigns just the required number of bytes/pages?
If you use malloc/new, the allocator will always allocate a few more bytes than you requested, as it needs some room for its housekeeping; it may also need to align the memory on page boundaries. The number of extra bytes allocated is implementation dependent.
You can consider using tools such as gperftools (Google) to monitor the memory used.
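As a rough illustration of the allocator handing out more than was requested, the following Python sketch calls malloc_usable_size through ctypes. It assumes a Linux system whose C library exports malloc_usable_size (glibc does); the helper name usable_size is made up for this example. It shows only the rounding of the usable block; the allocator's own housekeeping header comes on top of that and is not visible here.

```python
import ctypes

# Assumption: the process's C library (glibc on typical Linux) exports
# malloc, free and malloc_usable_size.
libc = ctypes.CDLL(None)
libc.malloc.restype = ctypes.c_void_p
libc.malloc.argtypes = [ctypes.c_size_t]
libc.free.restype = None
libc.free.argtypes = [ctypes.c_void_p]
libc.malloc_usable_size.restype = ctypes.c_size_t
libc.malloc_usable_size.argtypes = [ctypes.c_void_p]


def usable_size(requested):
    """Allocate `requested` bytes and return how many bytes are actually usable."""
    ptr = libc.malloc(requested)
    if not ptr:
        raise MemoryError(f"malloc({requested}) failed")
    try:
        return libc.malloc_usable_size(ptr)
    finally:
        libc.free(ptr)


if __name__ == "__main__":
    for n in (1, 13, 100, 1000):
        print(f"requested {n:5d} bytes, usable {usable_size(n)} bytes")
```

The exact numbers depend on the allocator and platform, which is why no expected output is shown.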
I wanted to check a process for memory leaks some years ago.
What I did was the following: I wrote a very small debugger (it is easier than it sounds) that simply set breakpoints to malloc(), free(), mmap(), ... and similar functions (I did that under Windows but under Linux it is simpler - I did it in Linux for another purpose!).
Whenever a breakpoint was reached I logged the function arguments and continued program execution...
By processing the logfile (semi-automated) I could find memory leaks.
Disadvantage: It is not possible to debug the program using another debugger in parallel.
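The log-processing step of this approach can be sketched as follows. The log format is hypothetical (one event per line, either "malloc <addr> <size>" or "free <addr>"), as is the function name find_leaks; a real log would depend on what the tracing debugger writes out.

```python
def find_leaks(lines):
    """Return {address: size} for allocations that were never freed.

    Hypothetical log format, one event per line:
        malloc 0x1000 32
        free 0x1000
    """
    outstanding = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        if parts[0] == "malloc" and len(parts) == 3:
            outstanding[parts[1]] = int(parts[2])
        elif parts[0] == "free" and len(parts) == 2:
            # a free of an address we never saw allocated is ignored here
            outstanding.pop(parts[1], None)
    return outstanding


if __name__ == "__main__":
    log = [
        "malloc 0x1000 32",
        "malloc 0x2000 64",
        "free 0x1000",
        "malloc 0x3000 16",
    ]
    for addr, size in find_leaks(log).items():
        print(f"leaked {size} bytes at {addr}")
```

In practice you would probably also log the caller's address or a backtrace with each allocation, so that an outstanding block can be traced back to the code that requested it.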

clGetPlatformIDs Memory Leak

I'm testing my code on Ubuntu 12.04 with NVIDIA hardware.
No actual OpenCL processing takes place; but my initialization code is still running. This code calls clGetPlatformIDs. However, Valgrind is reporting a memory leak:
==2718== 8 bytes in 1 blocks are definitely lost in loss record 4 of 74
==2718== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2718== by 0x509ECB6: ??? (in /usr/lib/nvidia-current/libOpenCL.so.1.0.0)
==2718== by 0x50A04E1: ??? (in /usr/lib/nvidia-current/libOpenCL.so.1.0.0)
==2718== by 0x509FE9F: clGetPlatformIDs (in /usr/lib/nvidia-current/libOpenCL.so.1.0.0)
I was unaware this was even possible. Can this be fixed? Note that no special deinitialization is currently taking place--do I need to call something after this? The docs don't mention anything about having to deallocate anything.
regarding: "Check this out: devgurus.amd.com/thread/136242. valgrind cannot deal with custom memory allocators by design, which OpenCL is likely using"
to quote from the link given: "The behaviour not to free pools at the exit could be called a bug of the library though."
If you want to create a pool of memory and allocate from that, go ahead; but you should still deallocate it properly. A memory pool as a whole is no less complex than a regular memory reference and deserves at least the same attention, if not more. Also, an 8-byte structure is highly unlikely to be a memory pool.
Tim Child would have a point about how you use clGetPlatformIDs if it were designed to return allocated memory. However, reading http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clGetPlatformIDs.html I am not sufficiently convinced this should be the case.
The leak in question may or may not be serious, may or may not accumulate by successive calls, but you might be left only with the option to report the bug to nvidia in hopes they fix it or to find a different opencl implementation for development. Still, there might be reasons for an opencl library to create references to data which from the viewpoint of valgrind are not in use.
Sadly, this still leaves us with a memory leak caused by an external factor we cannot control, and it still leaves us with excess valgrind output.
Say you are sufficiently sure you are not responsible for this leak (say, we know for a fact that an nvidia engineer allocated a random value in OpenCL.so which he didn't deallocate just to spite you). Valgrind has a flag --gen-suppressions=yes, with which you can generate suppression entries for particular errors; you can then save these to a file and feed them back to valgrind using --suppressions=$filename. Read the valgrind page for more details about how it works.
Be very wary of using suppressions though. Obviously suppressing errors does not fix them, and liberal usage of the mechanism will lead to situations where you suppress errors made by your code, rather than by nvidia or valgrind. Do not suppress warnings whose origin you are not absolutely sure of, or at least re-verify your suppressions regularly.
