How to detect if an application leaks memory? - linux

I have a fairly complex system with 30 applications running. One complex C++ application was leaking memory, and I think I have fixed it.
What I've done so far:
I executed the application under valgrind's memcheck, and it detected no problems.
I monitored the application with htop, and I noticed that virtual and resident memory are not increasing.
I am planning to run valgrind's massif to see whether it keeps allocating new memory.
The question is: how can I make sure there are no leaks? I thought that if virtual memory stopped increasing, I could be sure there are no leaks. When I test the application, I trigger the loop where the memory is allocated and deallocated several times, just to make sure.

You can't be sure unless you know exactly all the conditions under which the application allocates new memory. If you can't induce all of these conditions, neither valgrind nor htop can guarantee that your application doesn't leak memory under all circumstances.
Still, you should at least make sure that the application doesn't leak memory under normal conditions.

If valgrind doesn't report leaks, there are no leaks in the sense of memory areas that are no longer accessible (during the runs you checked). That doesn't mean the program doesn't allocate memory, use it, and then never free it even though it will never use it again (while keeping it reachable). Think of the typical to-do stack: you place new items on top, work on the item on top, and then push another one. You never go back to the old ones, so the memory used for them is wasted, but technically it isn't a leak.
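As a minimal illustrative sketch of that pattern (names and sizes invented here), memcheck reports nothing because every entry stays reachable through the container, yet the memory of finished items is never given back:

#include <string>
#include <vector>

int main() {
    std::vector<std::string> todo;   // lives for the whole run
    for (int i = 0; i < 1000000; ++i) {
        todo.push_back("task #" + std::to_string(i));  // push a new item
        // ... work only on todo.back(); old entries are never popped or erased ...
    }
    // Everything is still reachable here, so valgrind sees no leak,
    // but the process holds on to all of that memory until it exits.
    return 0;
}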
What you can do is to monitor the memory usage by the process. If it steadily increases, you might have a problem there (either a bona fide leak, or some data structure that grows without need).
If this isn't really pressing, it might be cheaper in the long run just letting it be...

You need to use a tool called Valgrind. It is a memory debugging, memory leak detection, and profiling tool for Linux and Mac OS X, and a flexible framework for debugging and profiling Linux executables.
Follow these steps.
Install valgrind.
Run your program as you normally would:
./a.out arg1 arg2
Then use this command line to turn on the detailed memory leak detector:
valgrind --leak-check=yes ./a.out arg1 arg2
valgrind --leak-check=yes /path/to/myapp arg1 arg2
Alternatively, you can write the output to a log file:
valgrind --log-file=output.file --leak-check=yes --tool=memcheck ./a.out arg1 arg2
Then check the log for memory errors and leaks:
cat output.file
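If you want to sanity-check the setup first, a deliberately leaky test program (illustrative, not part of the question) gives memcheck something to find:

// leaky.cpp - deliberately leaks one allocation
int main() {
    char *buf = new char[64];  // allocated with new[] ...
    buf[0] = 'x';              // ... used briefly ...
    return 0;                  // ... but never released with delete[]
}

Compile it with debug info and run it under valgrind, e.g. g++ -g leaky.cpp -o leaky && valgrind --leak-check=yes ./leaky; memcheck should then report the 64 bytes as "definitely lost" with a stack trace pointing at main.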

Related

Valgrind shows memory leak but no memory allocation took place

This is a rather simple question.
In my school we use a remote CentOS server to compile and test our programs. For some reason, valgrind always shows a 4096B leak in spite of the fact that no malloc was used. Does anyone here have any idea where this problem might stem from?
Your program makes a call to printf. The C library might allocate memory for its own usage. More generally, depending on the OS/libc/..., various allocations might be done just to start a program.
Note also that in this case, you see that there is one block still allocated at exit, and that this block is part of the suppressed count. That means that valgrind's suppression file already ensures that this memory does not appear in the list of leaks to be examined.
In summary: no problem.
In any case, when you suspect you have a leak, you can look at the details of the leaks e.g. their allocation stack trace to see if these are triggered by your application.
In addition to #phd's answer, there are a few things you can do to see more clearly what is going on.
If you run Valgrind with -s or -v it will show details of the suppressions used.
You can use --trace-malloc=yes to see all calls to allocation functions (only do that for small applications). Similarly, you can run with --default-suppressions=no and then you will see the details of that memory (together with --leak-check=full --show-reachable=yes in this case).
Finally, are you using an old CentOS / GNU libc? A few years ago Valgrind got a mechanism to clean up things like I/O buffers, so you shouldn't get this sort of message with a recent Valgrind and a recent Linux + libc.
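Putting the options above together on a reasonably recent Valgrind, an invocation along these lines (the program name is just a placeholder) shows every allocation and the details of still-reachable memory with the default suppressions disabled:

valgrind -s --default-suppressions=no --trace-malloc=yes --leak-check=full --show-reachable=yes ./a.out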

Following memory allocation in gdb

Why is memory consumption jumping unpredictably as I step through a program in the gdb debugger? I'm trying to use gdb to find out why a program is using far more memory than it should, and it's not cooperating.
I step through the source code while monitoring process memory usage, but I can't find what line(s) allocate the memory for two reasons:
Reported memory usage only jumps up in increments of (usually, but not always exactly) 64 MB. I suspect I'm seeing the effects of some memory manager I don't know about which reserves 64 MB at a time and masks multiple smaller allocations.
The jump doesn't happen at a consistent location in the code. Not only does it occur on different lines during different gdb runs; it also sometimes happens in illogical places like the closing brace of a (C++) function. Is it possible that gdb itself is affecting memory allocations?
Any ideas/suggestions for more effective tools to help me drill down to the code lines that are really responsible for these memory allocations?
Here's some relevant system info: I'm running x86_64-redhat-linux-gnu version 7.2-64.el6-5.2 on a virtual CentOS Linux machine under Windows. The program is built on a remote server via a complicated build script, so tracking down exactly what options were used at any point is itself a bit of a chore. I'm monitoring memory usage both with the top utility ("virt" or virtual memory column) and by reading the real-time monitoring file /proc/<pid>/status, and they agree. Since this program uses a large suite of third-party libraries, there may be one or more overridden malloc() functions involved somewhere that I don't know about--hunting them down is part of this task.
gdb, left to its own devices, will not affect the memory use of your program, though a run under gdb may differ from a standalone run for other reasons.
However, this also depends on the way you use gdb. If you are just setting simple breakpoints, stepping, and printing things, then you are ok. But sometimes, to evaluate an expression, gdb will allocate memory in the inferior. For example, if you have a breakpoint condition like strcmp(arg, "string") == 0, then gdb will allocate memory for that string constant. There are other cases like this as well.
This answer is in several parts because there were several things going on:
Valgrind with the Massif module (a memory profiler) was much more helpful than gdb for this problem; a sample invocation is shown below. Sometimes a quick look with the debugger works, sometimes it doesn't. http://valgrind.org/docs/manual/ms-manual.html
top is a poor tool for profiling memory usage because the virtual memory figure it reports was, in this case, about 3x the actual heap memory usage. Virtual memory is mapped and made available by the Unix kernel when a process asks for a memory block, but it's not necessarily used. The underlying system call is mmap(). I still don't know how to check the block size. top can only tell you what the Unix kernel knows about your memory consumption, which isn't enough to be helpful. Don't use it (or the memory files under /proc/) to do detailed memory profiling.
Memory allocation when stepping out of a function was caused by autolocks--that's a thread lock class whose destructor releases the lock when it goes out of scope. Then a different thread goes into action and allocates some memory, leaving the operator (me) mystified. Non-repeatability is probably because some threads were waiting for external resources like Internet connections.
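The Massif invocation mentioned in the first point typically looks something like this (program name and arguments are placeholders); ms_print then renders the recorded snapshots as a text graph with per-snapshot allocation trees:

valgrind --tool=massif ./myapp arg1 arg2
ms_print massif.out.<pid>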

clGetPlatformIDs Memory Leak

I'm testing my code on Ubuntu 12.04 with NVIDIA hardware.
No actual OpenCL processing takes place; but my initialization code is still running. This code calls clGetPlatformIDs. However, Valgrind is reporting a memory leak:
==2718== 8 bytes in 1 blocks are definitely lost in loss record 4 of 74
==2718== at 0x4C2B6CD: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==2718== by 0x509ECB6: ??? (in /usr/lib/nvidia-current/libOpenCL.so.1.0.0)
==2718== by 0x50A04E1: ??? (in /usr/lib/nvidia-current/libOpenCL.so.1.0.0)
==2718== by 0x509FE9F: clGetPlatformIDs (in /usr/lib/nvidia-current/libOpenCL.so.1.0.0)
I was unaware this was even possible. Can this be fixed? Note that no special deinitialization is currently taking place--do I need to call something after this? The docs don't mention anything about having to deallocate anything.
regarding: "Check this out: devgurus.amd.com/thread/136242. valgrind cannot deal with custom memory allocators by design, which OpenCL is likely using"
to quote from the link given: "The behaviour not to free pools at the exit could be called a bug of the library though."
If you want to create a pool of memory and allocate from that, go ahead; but you still should properly deallocate it. A memory pool as a whole is no less complex than a regular memory reference and deserves at least as much attention, if not more. Also, an 8-byte structure is highly unlikely to be a memory pool.
Tim Child would have a point about how you use clGetPlatformIDs if it were designed to return allocated memory. However, reading http://www.khronos.org/registry/cl/sdk/1.0/docs/man/xhtml/clGetPlatformIDs.html I am not sufficiently convinced this should be the case.
The leak in question may or may not be serious, and may or may not accumulate over successive calls, but you might be left only with the option of reporting the bug to NVIDIA in the hope that they fix it, or of finding a different OpenCL implementation for development. Still, there might be reasons for an OpenCL library to create references to data which, from the viewpoint of valgrind, are not in use.
Sadly, this still leaves us with a memory leak caused by an external factor we cannot control, and it still leaves us with excess valgrind output.
Say you are sufficiently sure you are not responsible for this leak (say, we know for a fact that an nvidia engineer allocated a random value in OpenCL.so which he didn't deallocate just to spite you). Valgrind has a flag --gen-suppressions=yes, which prints a suppression entry for each warning; you can collect those entries in a file and feed them back to valgrind using --suppressions=$filename. Read the valgrind documentation for more details about how it works.
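For the stack trace in the question, such a suppression entry would look roughly like this (the name is arbitrary and the exact frames are whatever --gen-suppressions prints on your system):

{
   nvidia_libopencl_clgetplatformids_leak
   Memcheck:Leak
   fun:malloc
   obj:/usr/lib/nvidia-current/libOpenCL.so.1.0.0
   obj:/usr/lib/nvidia-current/libOpenCL.so.1.0.0
   fun:clGetPlatformIDs
}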
Be very wary of using suppressions though. Obviously, suppressing errors does not fix them, and liberal use of the mechanism will lead to situations where you suppress errors made by your own code rather than by nvidia or valgrind. Do not suppress warnings unless you are absolutely sure where they come from, and revisit your suppressions regularly.

Does an Application memory leak cause an Operating System memory leak?

When we say a program leaks memory, say a new without a delete in C++, does it really leak? I mean, when the program ends, is that memory still allocated to some non-running program and unusable, or does the OS know what memory was requested by each program and release it when the program ends? If I run that program a lot of times, will I run out of memory?
No: in all practical operating systems, when a program exits, all its resources are reclaimed by the OS. Memory leaks become a more serious issue in programs that run for an extended time, and in functions that are called often within the same program.
On operating systems with protected memory (Mac OS 10+, all Unix-clones such as Linux, and NT-based Windows systems meaning Windows 2000 and younger), the memory gets released when the program ends.
If you run any program often enough without closing previous instances (so that more and more instances run at the same time), you will eventually run out of memory regardless of whether there is a memory leak, so that observation doesn't tell you much on its own. Obviously, a program that leaks memory will fill the memory faster than an identical program without leaks, but how many times you can run it before filling the memory depends much more on how much memory the program needs for normal operation than on whether it leaks. That comparison is only meaningful between two otherwise identical programs, one with a memory leak and one without.
Memory leaks become most serious when you have a program running for a very long time. A classic example is server software, such as web servers. With games, spreadsheets, or word processors, memory leaks aren't nearly as serious because you eventually close those programs, freeing up the memory. But of course memory leaks are nasty little beasts which should always be tackled as a matter of principle.
But as stated earlier, all modern operating systems release the memory when the program closes, so even with a memory leak, you won't fill up the memory if you're continuously opening and closing the program.
Leaked memory is reclaimed by the OS after the program has stopped.
That's why it isn't always a big problem with desktop applications, but it is a big problem with servers and services (they tend to run for a long time).
Let's look at the following scenario:
1. Program A asks the OS for memory.
2. The OS marks block X as being used by A and returns it to the program.
3. The program now holds a pointer to X.
4. The program returns the memory.
5. The OS marks the block as free. Using the block now results in an access violation.
6. Program A ends and all memory used by A is marked unused.
Nothing wrong with that.
But if the memory is allocated in a loop and the delete is forgotten, you run into real problems:
1. Program A asks the OS for memory.
2. The OS marks block X as being used by A and returns it to the program.
3. The program now holds a pointer to X.
4. Goto 1.
If the OS runs out of memory, the program will probably crash.
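A concrete C++ sketch of that second scenario (purely illustrative; the block size is arbitrary):

#include <cstddef>

void work_on(char *block, std::size_t size) {
    block[0] = 0;   // pretend to do something useful with the block
    (void)size;
}

int main() {
    const std::size_t kBlockSize = 1024 * 1024;  // 1 MiB per iteration
    for (;;) {                                   // step 4: goto 1
        char *block = new char[kBlockSize];      // steps 1-3: ask for memory, keep a pointer
        work_on(block, kBlockSize);
        // delete[] block;  <-- forgotten, so nothing is ever handed back
    }
    // Memory use grows until new throws std::bad_alloc or the OS's
    // out-of-memory handling ends the process.
}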
No. Once the OS finishes closing the program, the memory comes back (given a reasonably modern OS). The problem is with long-running processes.
When the process ends, its memory is cleaned up as well. The problem is that a program which leaks memory requests more and more memory from the OS while it runs, and can eventually exhaust the system's memory.
It's a leak in the sense that the code itself no longer has any handle on that piece of memory.
The OS can release the memory when the program ends. If a leak exists in a program then it is just an issue whilst the program is running. This is a problem for long running programs such as server processes. Or for example, if your web browser had a memory leak and you kept it running for days then it would gradually consume more memory.
As far as I know, on most operating systems a program receives its own region of memory when it starts, and that memory is completely released once the program ends.
Memory leaks are one of the main reasons why garbage collection algorithms were invented: once plugged into the runtime, the collector becomes responsible for reclaiming memory that is no longer accessible to the program.
Memory leaks don't persist past end of execution so a "solution" to any memory leak is to simply end program execution. Obviously this is more of an issue on certain types of software. Having a database server which needs to go offline every 8 hours due to memory leaks is more of an issue than a video game which needs to be restarted after 8 hours of continual play.
The term "leak" refers to the fact that over time memory consumption will grow without any increased benefit. The "leaked" memory is memory neither used by the program nor usable by the OS (and other programs).
Sadly, memory leaks are very common in unmanaged code. I have had Firefox running for a couple of days now and memory usage is 424 MB despite only having 4 tabs open. If I closed Firefox and re-opened the same tabs, memory usage would likely be under 100 MB. Thus 300+ MB has "leaked".

The output of my Fortran code is "killed" - any suggestions?

I'm trying to run, over ssh, a code that works perfectly for a smaller mesh, but since the new mesh is much bigger I used the ifort command to compile it:
ifort -mcmodel=medium -i-dynamic -o test.out *.f
It compiles, but when I run it, the output is:
killed
I know that the problem is memory-related; does anyone know if there's any way to run it?
How can I find out which part of the code causes the memory problem?
Thanks
shadi
From the ifort command line, I think you are running on Linux.
Seeing "killed" as output is generally the result of Linux's Out Of Memory (OOM) killer getting involved to prevent an impending crash (because it's common practice for applications to ask for more memory than they need, requests for more memory than is currently available are accepted - check for "Out of memory: Killed process [PID] [process name]" in the system log files). The OOM killer is generally pretty good at disposing of the application responsible for using all the memory, so the place to start is your application's memory usage.
The first thing to do is try to estimate (even if only roughly) how much memory you expect your application to use. One approach is to guesstimate the size of the major arrays and multiply them by the number of bits needed per element. Another approach is to think about how you would expect the memory use to grow with mesh size. You can study this by experiment (run with different mesh sizes, measure the memory use, and extrapolate) or from one measurement and knowledge of how the major arrays scale. It may be that you are asking for much more memory than you have on the machine, and the solution to that is probably to get access to a bigger computer. (Or you could try to find an alternative algorithm which uses less memory.)
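As a purely illustrative back-of-the-envelope estimate (numbers invented here): three double-precision work arrays on a 1000 x 1000 x 1000 mesh need about

3 arrays x 1000^3 elements x 64 bits (8 bytes) per element ≈ 24 GB

and doubling the mesh resolution in each dimension multiplies that figure by 8.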
If there is a memory leak, you should see more memory use than expected even for the smaller mesh size. If that is the case, valgrind should help. Moving from static to dynamic storage probably isn't going to help here - I would expect to see a segmentation fault if you were just exceeding the available space on the stack.
Try using valgrind. I used it to find memory leaks in my Fortran code with good success.
http://valgrind.org/
