How do I find which process is leaking memory? [closed] - linux

I have a system (Ubuntu) with many processes, and one (or more) of them has a memory leak. Is there a good way to find the process that has the leak? Some of the processes are JVMs, some are not. Some are home-grown, some are open source.

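You can run the top command (to run it non-interactively, type top -b -n 1). To see which applications are leaking memory, watch the memory columns over time. In the macOS version of top these are:
RPRVT - resident private address space size
RSHRD - resident shared address space size
RSIZE - resident memory size
VPRVT - private address space size
VSIZE - total memory size
On Linux (procps top, as shipped with Ubuntu) the closest equivalents are RES (resident memory), SHR (shared memory) and VIRT (total virtual memory).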

If the program leaks over a long time, top might not be practical. I would write a simple shell script that appends the output of "ps aux" to a file every X seconds, depending on how long it takes to leak a significant amount of memory. Something like:
# Append a timestamped "ps aux" snapshot to /tmp/mem_usage once a minute
while true
do
    echo "---------------------------------" >> /tmp/mem_usage
    date >> /tmp/mem_usage
    ps aux >> /tmp/mem_usage
    sleep 60
done
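Once the log has collected a few snapshots, the growing process still has to be picked out of it. The following is a minimal sketch of one way to do that, assuming the log was produced by the loop above and that ps aux prints its usual eleven columns (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND). The function name rss_growth is made up for this example, and a PID that is reused by a different program during the run will be misreported.

```python
def rss_growth(log_text):
    """Return (growth_kb, pid, command) tuples, largest RSS growth first."""
    first, last, names = {}, {}, {}
    for line in log_text.splitlines():
        fields = line.split(None, 10)
        # Only process rows have 11 fields with a numeric PID and RSS;
        # separator lines, date lines and the ps header are skipped.
        if len(fields) < 11 or not fields[1].isdigit() or not fields[5].isdigit():
            continue
        pid, rss_kb = int(fields[1]), int(fields[5])
        first.setdefault(pid, rss_kb)   # RSS in the first snapshot seen
        last[pid] = rss_kb              # RSS in the most recent snapshot
        names[pid] = fields[10]
    growth = [(last[pid] - first[pid], pid, names[pid]) for pid in first]
    return sorted(growth, reverse=True)
```

For example, rss_growth(open("/tmp/mem_usage").read())[:10] would list the ten processes whose resident size grew most between their first and last appearance in the log. Steady growth across many snapshots is a hint of a leak, not proof of one.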

I suggest htop as a better alternative to top.

In addition to top, you can use System Monitor (System - Administration - System Monitor, then select Processes tab). Select View - All Processes, go to Edit - Preferences and enable Virtual Memory column. Sort either by this column, or by Memory column

If you can't do it deductively, consider the Signal Flare debugging pattern: Increase the amount of memory allocated by one process by a factor of ten. Then run your program.
If the amount of the memory leaked is the same, that process was not the source of the leak; restore the process and make the same modification to the next process.
When you hit the process that is responsible, you'll see the size of your memory leak jump (the "signal flare"). You can narrow it down still further by selectively increasing the allocation size of separate statements within this process.

Difficult task. I would normally suggest grabbing a debugger/memory profiler like Valgrind and running the programs in it one after another. Sooner or later you will find the program that leaks, and you can report it to the developer or fix it yourself.

As suggested, the way to go is Valgrind. It's a profiler that checks many aspects of the runtime behavior of your application, including its memory usage.
Running your application through Valgrind will allow you to verify if you forget to release memory allocated with malloc, if you free the same memory twice etc.

Related

How to disable the oom killer in linux? [closed]

My current configs are:
> cat /proc/sys/vm/panic_on_oom
0
> cat /proc/sys/vm/oom_kill_allocating_task
0
> cat /proc/sys/vm/overcommit_memory
1
but when I run a task, it's killed anyway.
> ./test/mem.sh
Killed
> dmesg | tail -2
[24281.788131] Memory cgroup out of memory: Kill process 10565 (bash) score 1001 or sacrifice child
[24281.788133] Killed process 10565 (bash) total-vm:12601088kB, anon-rss:5242544kB, file-rss:64kB
Update
My tasks do scientific computing, which uses a lot of memory, so it seems that overcommit_memory=1 may be the best choice.
Update 2
Actually, I'm working on a data analysis project that needs more than 16G of memory, but I was asked to limit it to about 5G. It might be impossible to meet this requirement by optimizing the program itself, because the project uses many sub-commands, and most of them do not have options like Java's Xms or Xmx.
Update 3
My project should be an overcommitted system. Exactly as a3f says, it seems that my apps prefer to crash in xmalloc when a memory allocation fails.
> cat /proc/sys/vm/overcommit_memory
2
> ./test/mem.sh
./test/mem.sh: xmalloc: .././subst.c:3542: cannot allocate 1073741825 bytes (4295237632 bytes allocated)
I don't want to surrender, although so many awful tests have left me exhausted.
So please show me a way to the light ; )
The OOM killer won't go away. If there is no memory, someone's got to pay. What you can do is set a limit after which memory allocations fail.
That's exactly what setting vm.overcommit_memory to 2 achieves.
From the docs:
The Linux kernel supports the following overcommit handling modes
2 - Don't overcommit. The total address space commit for the system
is not permitted to exceed swap + a configurable amount (default is
50%) of physical RAM. Depending on the amount you use, in most
situations this means a process will not be killed while accessing
pages but will receive errors on memory allocation as appropriate.
Normally, the kernel will happily hand out virtual memory (overcommit). Only when you reference a page, the kernel has to map the page to a real physical frame. If it can't service that request, a process needs to be killed by the OOM killer to make space.
Disabling overcommit means that e.g. malloc(3) will return NULL if the kernel couldn't commit the amount of memory requested. This makes things a bit more predictable, albeit limited (many applications allocate more than they would ever need).
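To make the quoted rule concrete, here is a small sketch of the arithmetic behind mode 2, assuming the simplified formula from the documentation excerpt above (swap plus a percentage of physical RAM, 50% by default). The function names are invented for this example, and the real kernel calculation also accounts for details such as huge pages, so the kernel's own CommitLimit and Committed_AS lines in /proc/meminfo remain the authoritative numbers.

```python
def parse_meminfo(text):
    """Return {field: value} from /proc/meminfo-style text (values mostly in kB)."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def approx_commit_limit_kb(mem_total_kb, swap_total_kb, overcommit_ratio=50):
    """Approximate limit in mode 2: swap + overcommit_ratio percent of RAM."""
    return swap_total_kb + mem_total_kb * overcommit_ratio // 100


def commit_headroom_kb(meminfo):
    """How much more address space can be committed before allocations fail."""
    return meminfo["CommitLimit"] - meminfo["Committed_AS"]
```

On a live system, parse_meminfo(open("/proc/meminfo").read()) supplies the input. When the headroom approaches zero under overcommit_memory=2, further allocations start to fail instead of triggering the OOM killer later.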
The possible values of oom_adj range from -17 to +15. The higher the value, the more likely the associated process is to be killed by the OOM killer. If oom_adj is set to -17, the process is not considered for OOM killing. (On newer kernels oom_adj is deprecated in favour of oom_score_adj, which ranges from -1000 to +1000.)
However, adding RAM is the better choice; if that is not possible, add swap space.
To increase swap memory try this link,

Does Linux have a page file? [closed]

I found at several places that Linux uses pages and a paging mechanism but I didn't find anywhere where this file is or how to configure it.
All the information I found is about the Linux swap file / partition. There is a difference between paging and swapping:
Paging moves pages (small fixed-size blocks of memory, usually 4 KB, though the size can vary between operating systems) from main memory to backing storage; it happens all the time as a normal function of the operating system.
Swapping moves an entire process to storage; it happens when the system is under memory pressure or, on Windows 8, when an application is hibernating.
Does Linux use its swap file / partition for both cases?
If so, how can I see how many pages are currently paged out? This information does not appear in the output of vmstat, free or swapon (or I fail to see it).
Or is there another file used for paging?
If so, how can I configure it (and watch its usage)?
Or perhaps Linux does not use paging at all and I was misled?
I would appreciate answers specific to Red Hat Enterprise Linux 6 and 7, but a general answer covering all Linux distributions would also be good.
Thanks in advance.
On Linux, the swap partition(s) are used for paging.
Linux does not respond to memory pressure by swapping out whole processes. The virtual memory system does demand paging, page by page. Under extreme memory pressure, one or more processes will be killed by the OOM killer. (There are some useful links to documentation in the first NOTE in man malloc)
There is a line in the top header which shows swap partition usage, but if that is all the information you want, use
swapon -s
See man swapon for more information.
The swap partition usage is not the same as the number of unmapped pages. A page might be memory-mapped to a file using the mmap call; since that page has backing store in the file, there is no need to also write it to a swap partition, and the system won't use swap space for that. But swap partition usage is a pretty good indicator.
Also note that Linux (unlike Windows) does not allocate swap space for pages when they are allocated. Instead, it adds the new page to the virtual memory map without any backing store, and allocates the swap space only when the page needs to be swapped out. The consequence (as described in the malloc manpage referenced earlier) is that a malloc call may succeed in allocating virtual memory, but a subsequent attempt to use that virtual memory may fail.
Although Linux retains the term 'swap partition' as a historical relic, it actually performs paging. So your expectation is borne out; you were just thrown by the archaic terminology.
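For watching paging activity rather than just swap usage, the kernel exposes cumulative counters in /proc/vmstat; pswpin and pswpout count pages swapped in and out since boot. The sketch below, with function names chosen only for illustration, parses two samples of that file and reports how many pages moved in between. It assumes the usual "name value" line format.

```python
def parse_vmstat(text):
    """Return {counter: value} from /proc/vmstat-style text."""
    counters = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[1].isdigit():
            counters[parts[0]] = int(parts[1])
    return counters


def swap_activity(before, after):
    """Pages swapped in and out between two parsed samples."""
    return {name: after[name] - before[name] for name in ("pswpin", "pswpout")}
```

Taking one sample with parse_vmstat(open("/proc/vmstat").read()), waiting a while, and taking a second gives the number of pages paged in and out during that interval. A pswpout figure that keeps climbing indicates ongoing memory pressure.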

Who determines the block size when writing to a disk? [closed]

This might be a naive question but I can't find a straight answer for it.
When using I/O tools such as dd, fio and bonnie++, one of the parameters sets the block size used in the test. So one can set the block size to 512 KB, 1 MB or even more. As the block size gets larger, the reported MB/s also gets higher, which I believe is logical, since fewer blocks have to be written.
So my questions are:
- How does this work when the default block size is 4 KB, or 32 KB in some kernels?
- In any other application, who determines the block size used to write to a disk? Is it the application itself or the operating system?
- What would be a typical block size for a database application, for instance?
Thanks in advance :)
If you use something like dd, you're doing a block-level operation, so you get to specify a block size. Up to a point, you'll get greater speed by using a bigger block size, but it will quickly tail off. It's very inefficient to read from a disk byte by byte, but by the time you've hit a few megabytes, you won't notice any further speed increase.
When an application writes to disk, it is generally not doing block-level access, but reading and writing files. It's the operating system that is responsible for turning this file-level access into block-level access. An application, unless it's a specialised one running as root, won't care about block-level access, and won't be involved in determining block sizes for this kind of thing.
It's further complicated by the disk cache: when you read something at the application level, if you're lucky, you won't touch the disk at all: it'll be something already cached, and you'll retrieve it from there (without being aware of it). When you write, you will hopefully find that you write into the cache and appear to finish immediately, and then the operating system will do the actual write when it gets round to it. It's only if you're doing lots of writing, or if the cache is turned off, that you will exhaust the cache and the writes will need to happen before control gets passed back to your application.
In short: unless you muck about at a fairly low level, you don't need to worry about block sizes.
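The effect of the write size chosen by an application can be observed with a small experiment. This is only a sketch: timed_write is a hypothetical helper, the "block size" here is simply the size of each write() call made by the program (not the filesystem or device block size), and the fsync call is there so that the page cache described above does not hide the cost entirely.

```python
import os
import time


def timed_write(path, total_bytes, block_size):
    """Write total_bytes of zeros in block_size chunks; return elapsed seconds."""
    block = b"\0" * block_size
    written = 0
    start = time.perf_counter()
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < total_bytes:
            chunk = block[: total_bytes - written]  # last chunk may be shorter
            written += os.write(fd, chunk)          # os.write may write less than asked
        os.fsync(fd)                                # force the data out of the cache
    finally:
        os.close(fd)
    return time.perf_counter() - start
```

Calling it with the same total size and block sizes of, say, 512 bytes, 4 KB and 1 MB should show the per-call overhead shrinking as the block grows, then levelling off. The actual numbers depend heavily on the filesystem and the device.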

tcl script aborts : unable to realloc xxx bytes [closed]

My Tcl script aborts, saying that it is unable to realloc 2191392 bytes. This happens when the script runs for a long time, say more than 10 hours. The script connects to devices over telnet and ssh and executes and verifies command output on them. The Linux machine has plenty of RAM (32 GB), and ulimit is unlimited for process, data and file size. My script's process does not use much memory; the worst case is < 1 GB. I wonder why memory allocation fails when there is plenty of RAM.
That message is an indication that an underlying malloc() call returned NULL (in a non-recoverable location), and given that it's not for very much, it's an indication that the system is thoroughly unable to allocate much memory. Depending on how your system is configured (32-bit vs. 64-bit; check what parray tcl_platform prints to find out) that could be an artefact of a few things, but if you think that it shouldn't be using more than a gigabyte of memory, it's an indication of a memory leak.
Unfortunately, it's hard to chase down memory leaks in general. Tcl's built-in memory command (enabled via configure --enable-symbols=mem at build time) can help, as can a tool like Electric Fence, but they are imperfect and can't generally tell you where you're getting things wrong (as you'll be looking for the absence of something to release memory). At the Tcl level, see whether each of the variables listed by info globals is of sensible size or whether there's a growing number of globals. You'll want to use tools like string length, array exists, array size and array names for this.
It's also possible for memory allocation to fail due to another process consuming so much memory that the OS starts to feel highly constrained. I hope this isn't happening in your case, since it's much harder to prevent.

Memory Debugging

I am currently analyzing a C++ application and its memory consumption. Checking the memory consumption of the process before and after a certain function call is possible. However, it seems that, for technical reasons or for better efficiency, the OS (Linux) assigns not only the required number of bytes but always a few more, which the application can consume later. This makes it hard to analyze the memory behavior of the application.
Is there a workaround? Can one switch Linux to a mode where it assigns just the required number of bytes/pages?
If you use malloc/new, the allocator will always allocate a few more bytes than you requested, as it needs some room for its housekeeping; it may also need to align the memory on page boundaries. The number of extra bytes allocated is implementation dependent.
You can consider using tools such as gperftools (Google) to monitor the memory used.
I wanted to check a process for memory leaks some years ago.
What I did was the following: I wrote a very small debugger (it is easier than it sounds) that simply set breakpoints to malloc(), free(), mmap(), ... and similar functions (I did that under Windows but under Linux it is simpler - I did it in Linux for another purpose!).
Whenever a breakpoint was reached I logged the function arguments and continued program execution...
By processing the logfile (semi-automated) I could find memory leaks.
Disadvantage: It is not possible to debug the program using another debugger in parallel.
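The log-processing step could look roughly like the sketch below. The log format is an assumption made for this example, since the answer does not describe its own: one record per line, either "malloc SIZE ADDRESS" or "free ADDRESS". The function name find_unfreed is likewise hypothetical.

```python
def find_unfreed(log_lines):
    """Return {address: size} for allocations with no matching free."""
    live = {}
    for line in log_lines:
        parts = line.split()
        if len(parts) == 3 and parts[0] == "malloc":
            live[parts[2]] = int(parts[1])   # remember the size by address
        elif len(parts) == 2 and parts[0] == "free":
            live.pop(parts[1], None)         # ignore frees of unknown addresses
    return live
```

Whatever remains in the returned dictionary at the end of the run was allocated but never released. Logging the call site alongside each record would make it possible to group the leftovers by where they were allocated.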

Resources