The question:
How can I tell how much memory is in use by the VMAs of my process (either from userspace or from the kernel)?
I'll give a short explanation of what I'm doing so you can understand why I'm asking this.
I run a few processes and one driver (kernel module) on my Linux machine. The processes' memory is locked (not swappable), so I want to make sure that the memory consumed by the module along with the processes doesn't exceed 90% of my total physical memory. To reduce malloc overhead I'm using mmap.
What I really need to know is how much memory my processes are really consuming rather than how much they asked for, and as far as I can tell I'm only missing the VMA overhead of each allocation.
After digging I've found the answer:
While I'm in the driver I can use
current->mm->map_count
to get the current number of VMAs for the current process.
Multiplying it by sizeof(struct vm_area_struct) gives me what I was looking for.
From here the accounting is pretty simple.
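For the userspace half of the question, a rough counterpart is to count the lines of /proc/self/maps, since the kernel prints one line per mapping there. The sketch below assumes a Linux /proc filesystem; count_vmas is a hypothetical helper name, and its result should only approximate map_count (special entries such as [vsyscall] may be listed without being ordinary VMAs). sizeof(struct vm_area_struct) is not visible from userspace and varies with kernel version and configuration, so the per-VMA size still has to come from the kernel side.

```c
#include <stdio.h>

/* Count the mappings of the calling process by counting the lines of
 * /proc/self/maps (one line per mapping).
 * Returns the number of lines, or -1 if the file cannot be opened. */
long count_vmas(void)
{
    FILE *f = fopen("/proc/self/maps", "r");
    if (f == NULL)
        return -1;

    long count = 0;
    int c;
    int last = '\n';
    while ((c = fgetc(f)) != EOF) {
        if (c == '\n')
            count++;
        last = c;
    }
    fclose(f);

    /* Count a final line that lacks a trailing newline. */
    if (last != '\n')
        count++;
    return count;
}
```

Calling count_vmas() periodically and multiplying the result by the per-VMA size obtained in the driver gives an estimate of the same overhead without entering the kernel.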
Here is my system, based on Linux 2.6.32.12:
1. It runs 20 processes that use a lot of user CPU.
2. It needs to write data to disk at a rate of 100M/s, and that data will not be read again soon.
What I expect:
It runs steadily and disk I/O does not affect my system.
My problem:
At the beginning, the system runs as I expected. But as time passes, Linux caches a lot of data for the disk I/O, which reduces free physical memory. Eventually there is not enough memory, and Linux starts swapping my processes in and out. This causes an I/O problem: a lot of CPU time is spent waiting on I/O.
What I have tried:
I tried to solve the problem by calling fsync every time I write a large block, but free physical memory still decreases while cached memory increases.
How can I stop the page cache from growing here? It's useless for me.
More information:
When top shows 46963m free, all is well: CPU %wa is low and vmstat shows no si or so.
When top shows 273m free, %wa is so high that it affects my processes, and vmstat shows a lot of si and so.
I'm not sure that changing something will affect overall performance.
You might use posix_fadvise(2) and sync_file_range(2) in your program (and more rarely fsync(2), fdatasync(2), sync(2) or syncfs(2)). Also look at madvise(2), mlock(2) and munlock(2), and of course mmap(2) and munmap(2). Perhaps ionice(1) could help.
In the reader process, you might use readahead(2) (perhaps in a separate thread).
Upgrading your kernel (to a 3.6 or better) could certainly help: Linux has improved significantly on these points since 2.6.32 which is really old.
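As a minimal sketch of the posix_fadvise(2) suggestion: write a block, flush it, then tell the kernel the range will not be needed again. POSIX_FADV_DONTNEED can only discard clean pages, which is why the flush comes first. The function name write_and_drop is hypothetical, and how much this actually limits page cache growth on a given kernel is something to measure rather than assume.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Write len bytes from buf at the given file offset, force them to disk,
 * then advise the kernel that the cached pages for that range are no
 * longer needed. Returns 0 on success, -1 on failure. */
int write_and_drop(int fd, const void *buf, size_t len, off_t offset)
{
    const char *p = buf;
    size_t done = 0;

    while (done < len) {
        ssize_t n = pwrite(fd, p + done, len - done, offset + (off_t)done);
        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted by a signal: retry */
            return -1;
        }
        done += (size_t)n;
    }

    /* Dirty pages cannot be dropped, so write them back first. */
    if (fdatasync(fd) != 0)
        return -1;

    /* posix_fadvise returns an error number directly instead of setting errno. */
    if (posix_fadvise(fd, offset, (off_t)len, POSIX_FADV_DONTNEED) != 0)
        return -1;

    return 0;
}
```

Calling fdatasync after every block serializes the writer on the disk; sync_file_range(2) on the previous block while writing the next one is the usual refinement if throughput suffers.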
To drop pagecache you can do the following:
"echo 1 > /proc/sys/vm/drop_caches"
drop_caches is normally 0 and can be changed as needed. Since you've identified that you need to free the page cache, this is how to do it. You can also take a look at dirty_writeback_centisecs and its related tunables (http://lxr.linux.no/linux+*/Documentation/sysctl/vm.txt#L129) to make writeback happen sooner, but note that this might have consequences, as it wakes up the kernel flusher threads more often to write out dirty pages. Also note dirty_expire_centisecs, which defines how old dirty data must be before it is eligible for writeout.
I need to check for a memory leak in an embedded system.
The IDE is HEW and we are using uCOSIII RTOS.
Valgrind does not support the above configurations. Can you please suggest a tool or a method to check for memory leaks?
First rule of dynamically allocating memory in embedded systems is "don't". Allocate it all once at the start of execution and then leave well alone. Otherwise you have to assess and decide what to do when a malloc (or similar operation) fails.
If you must dynamically allocate memory at runtime, then at its simplest you may be able to use a logging infrastructure to track calls to malloc/free by writing wrappers around them. Then you can track where and when the allocations and deallocations are happening and hopefully see what is missing.
Take a look at libtalloc, the core memory allocator used in Samba. It may not work out-of-the-box for you if you don't have atexit() or stdio.h, but it shouldn't take too much work to port it to your environment.
Have a look at talloc_enable_leak_report_full() and talloc_report_full() (among others) to get you started.
I have been giving this some thought, and here is a rough attempt at how to do it on an embedded system:
First, you need to find out in which thread the leak occurs. When allocating, also count the number of active allocations per thread. A task whose allocation count keeps growing without matching deallocations is the suspicious one.
Second, you need to count the allocations coming from that thread per call site. To do this, replace alloc with a macro. With a macro you can record the name of the file and the line number where the call originated.
For example:
#include <stdlib.h>
#define alloc(x) my_alloc((x), __LINE__, __FILE__)
void *my_alloc(size_t size, int line, const char *file)
{
    /* increment the allocation count for this line/file combination */
    return malloc(size);
}
Similarly, you need to define my_free, which decrements the count.
After this, run the program and from time to time printf the call sites whose allocation counts keep growing. This should help find memory leaks.
P.S. I didn't test this, but I saw somebody do something similar in our code :)
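A slightly fuller sketch of the same idea follows. It assumes a working malloc/free underneath and a small fixed table of call sites; all names (track_alloc, track_free, track_live, track_report, TRACK_ALLOC, MAX_SITES) are hypothetical. It is not thread-safe as written, so under an RTOS the table updates would need a mutex or critical section.

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_SITES 64   /* assumed upper bound on distinct call sites */

struct site { const char *file; int line; long live; };
static struct site sites[MAX_SITES];
static int nsites;

/* Header stored in front of each user block; the union keeps the user
 * pointer aligned for the widest basic types. */
union header { int site; long long ll; double d; void *p; };

/* Return the table index for file/line, adding it if new; -1 if full. */
static int find_site(const char *file, int line)
{
    for (int i = 0; i < nsites; i++)
        if (sites[i].line == line && strcmp(sites[i].file, file) == 0)
            return i;
    if (nsites == MAX_SITES)
        return -1;
    sites[nsites].file = file;
    sites[nsites].line = line;
    sites[nsites].live = 0;
    return nsites++;
}

void *track_alloc(size_t size, const char *file, int line)
{
    if (size > SIZE_MAX - sizeof(union header))
        return NULL;
    union header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;
    h->site = find_site(file, line);
    if (h->site >= 0)
        sites[h->site].live++;
    return h + 1;                    /* user memory starts after the header */
}

void track_free(void *ptr)
{
    if (ptr == NULL)
        return;
    union header *h = (union header *)ptr - 1;
    if (h->site >= 0)
        sites[h->site].live--;
    free(h);
}

/* Number of blocks from file/line that are still allocated. */
long track_live(const char *file, int line)
{
    for (int i = 0; i < nsites; i++)
        if (sites[i].line == line && strcmp(sites[i].file, file) == 0)
            return sites[i].live;
    return 0;
}

/* Print every call site that still has live blocks. */
void track_report(void)
{
    for (int i = 0; i < nsites; i++)
        if (sites[i].live > 0)
            printf("%s:%d live=%ld\n", sites[i].file, sites[i].line, sites[i].live);
}

#define TRACK_ALLOC(n) track_alloc((n), __FILE__, __LINE__)
```

Calling track_report() periodically from a low-priority task shows which call sites have a live count that only ever grows. Storing the site index in a header costs a few bytes per block but lets track_free find the counter without searching.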
Your requirement is not completely clear. If you are looking for a tool like valgrind that can find memory leaks in your environment, that will be difficult to find.
If you have the source code, you can check all the memory allocations and frees in the particular application.
There are also some files available that you can build and run to find memory leaks:
http://code.axter.com/debugalloc.cpp
http://code.axter.com/debugalloc.h
http://code.axter.com/debuglogger.cpp
http://code.axter.com/debuglogger.h
http://code.axter.com/debuglog.c
http://code.axter.com/debuglog.h
debugalloc.* can track memory leaks, and it has a description and usage information in its comments.
debuglogger.* has some code for profiling your code.
debuglog.* is a limited C version of the code.
I'm trying to run a code over ssh that works perfectly for a smaller mesh, but since the new mesh is much bigger I used this ifort command to compile it:
ifort -mcmodel=medium -i-dynamic -o test.out *.f
It compiles, but when I run it the output is:
killed
I know that the problem is memory. Does anyone know if there's any way to run it?
How can I find out where in the code the memory problem is caused?
Thanks
shadi
From the ifort command line, I think you are running on Linux.
Seeing "killed" as output is generally the result of Linux's Out Of Memory (OOM) killer getting involved to prevent an impending crash. Because it's common practice for applications to ask for more memory than they need, requests for more memory than is currently available are accepted; check for "Out of Memory: Killed process [PID] [process name]" in the system log files. The OOM killer is generally pretty good at disposing of the application responsible for using all the memory, so the place to start is your application's memory usage.
The first thing to do is to estimate (even if only roughly) how much memory you expect your application to use. One approach is to estimate the number of elements in the major arrays and multiply by the number of bytes needed per element. Another approach is to think about how you would expect the memory use to grow with mesh size. You can study this by experiment (run with different mesh sizes, measure the memory use and extrapolate) or from one measurement and knowledge of how the major arrays scale. It may be that you are asking for much more memory than you have on the machine, and the solution to this is probably to get access to a bigger computer. (Or you could try to find an alternative algorithm which uses less memory.)
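The arithmetic behind such an estimate can be sketched as follows. The mesh dimensions, array count and element size are hypothetical inputs, and the extrapolation assumes memory grows linearly with the number of cells, which may not hold for every solver.

```c
#include <stdint.h>

/* Bytes needed for n_arrays arrays of nx*ny*nz elements each.
 * No overflow checking: intended for back-of-envelope estimates. */
uint64_t mesh_bytes(uint64_t nx, uint64_t ny, uint64_t nz,
                    uint64_t n_arrays, uint64_t bytes_per_element)
{
    return nx * ny * nz * n_arrays * bytes_per_element;
}

/* Scale a measured footprint to a larger mesh, assuming memory use is
 * proportional to the number of cells. */
double extrapolate_bytes(uint64_t measured_bytes,
                         uint64_t measured_cells, uint64_t target_cells)
{
    return (double)measured_bytes *
           ((double)target_cells / (double)measured_cells);
}
```

For example, ten double precision arrays (8 bytes per element) on a 200 x 200 x 200 mesh come to mesh_bytes(200, 200, 200, 10, 8) = 640,000,000 bytes, roughly 610 MiB. Comparing such a figure with the machine's RAM shows quickly whether the run can fit at all.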
If there is a memory leak you should see more memory use than expected, even for the smaller mesh size. If this is the case, valgrind should help. Moving from static to dynamic storage probably isn't going to help here: I would expect to see a segmentation fault if you were just exceeding the available space on the stack.
Try using valgrind. I have used it to find memory leaks in my Fortran code with good success.
http://valgrind.org/
Hello. I developed a multi-threaded TCP server application that allows 10 concurrent connections and receives continuous requests from them; after some processing, it responds to the clients. I'm running it on a TI OMAP L137 processor-based board that runs MontaVista Linux. Threads are created per client (i.e. 10 threads), and the server is pre-threaded. Its physical memory usage is about 1.5% and its CPU usage is about 2% according to ps, top and meminfo. Its VM usage rises up to 80M where I have 48M (I reduced it from U-Boot to reserve some memory for the DSP). Any help is appreciated; how can I reduce it? (/proc/sys/vm/.. tricks don't help :)
Thanks.
You can try using a drop in garbage collecting replacement for malloc(), and see if that solves your problem. If it does, find the leaks and fix them, then get rid of the garbage collector.
It's 'interesting' to chase these kinds of problems on platforms that most heap analyzers and profilers (e.g. valgrind) don't fully (if at all) support.
On another note, given the constraints .. I'm assuming you have decreased the default thread stack size? I think the default is 8M, you probably don't need that much. See pthread_attr_setstacksize() if you haven't adjusted it.
Edit:
You can check the default stack size with pthread_attr_getstacksize(). If it is at 8M, you've already blown your ceiling during thread creation (10 threads, as you mentioned).
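A minimal sketch of both calls, assuming POSIX threads: default_stack_size reports what pthread_attr_getstacksize returns for a fresh attribute object, and start_thread_with_stack creates a thread with an explicit, smaller stack. Both helper names and the client_worker placeholder are hypothetical; the right stack size depends on how deep the per-client handler's call chain and local buffers go.

```c
#include <pthread.h>
#include <stddef.h>

/* Default stack size for newly created threads, or 0 on error. */
size_t default_stack_size(void)
{
    pthread_attr_t attr;
    size_t size = 0;

    if (pthread_attr_init(&attr) != 0)
        return 0;
    if (pthread_attr_getstacksize(&attr, &size) != 0)
        size = 0;
    pthread_attr_destroy(&attr);
    return size;
}

/* Create a thread with an explicit stack size.
 * Returns 0 on success or a pthread error number. */
int start_thread_with_stack(pthread_t *tid, size_t stack_bytes,
                            void *(*fn)(void *), void *arg)
{
    pthread_attr_t attr;
    int rc = pthread_attr_init(&attr);
    if (rc != 0)
        return rc;

    rc = pthread_attr_setstacksize(&attr, stack_bytes);
    if (rc == 0)
        rc = pthread_create(tid, &attr, fn, arg);

    pthread_attr_destroy(&attr);
    return rc;
}

/* Placeholder for the per-client handler; it just hands back its argument. */
void *client_worker(void *arg)
{
    return arg;
}
```

Ten threads at an 8M default come to 80M of virtual address space, which matches the figure in the question; at 256 KiB each the same ten threads would reserve 2.5M.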
Most of the VM is probably just for stacks. Of course, it's virtual, so it doesn't get committed if you don't use it.
(I'm wondering if thread's default stack size has anything to do with ulimit -s)
Apparently yes, according to
this other SO question
Does it rise to that level and stay there? Or does it eventually run out of memory? If the former, you simply need to figure out a way to have a smaller working set. If the latter, you have a memory leak and need to fix it.
I have a program that repeatedly solves large systems of linear equations using Cholesky decomposition. The distinguishing feature is that I sometimes need to store the complete factorisation, which can exceed about 20 GB of memory. The factorisation happens inside a library that I call. Furthermore, the matrix and the resulting factorisation change quite frequently, and so do the memory requirements.
I am not the only person to use this compute-node. Therefore, is there a way to start the program under Linux and preallocate free memory for the process?
Something like: $: prealloc -m 25G ./program
I'll stick my neck out and say that I don't think that there is such a way under Linux. I think that the philosophy of Linux (and every other multi-tasking o/s I've used or heard of) is to provide the programmer (and the program) with the illusion that they have the whole of the computer's memory to play with and to make it very difficult indeed for a programmer to interfere with the o/s.
Instead, I think that you should plan to modify your program to grab the memory it will (or may) require when it starts up, that is, do the memory management yourself in whatever your chosen language is. How easy this might be for you, considering calls to a library, I don't know.
I've never heard of such a way. Usually it would be bad for other users on the node if one program went ahead and hogged all available memory. It's not good practice.
But opinions aside, I would probably write my program in such a way that it acts like a small environment that is able to make multiple runs of the routine in question without ending. It would allocate lots of memory on startup, then wait for user commands (through a minimal shell) and make the runs requested with the allocated memory pool. It would hold on to the pool until the user requests termination.
Of course this requires you to have an interactive session on the node, which you may not have.
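The "allocate on startup and hold on to the pool" idea can be sketched as a simple bump allocator. All names here (struct pool, pool_init, pool_alloc, pool_reset, pool_destroy) are hypothetical, and the sketch only helps if the factorisation library can be told to allocate from the pool, which the question does not say. Touching each page makes the kernel back the pool with physical memory up front, but without mlock(2) those pages can still be swapped out later.

```c
#include <stddef.h>
#include <stdlib.h>
#include <unistd.h>

struct pool {
    unsigned char *base;   /* start of the reserved region */
    size_t size;           /* total bytes reserved */
    size_t used;           /* bytes handed out so far */
};

/* Reserve size bytes and write one byte per page so the memory is
 * actually committed. Returns 0 on success, -1 on failure. */
int pool_init(struct pool *p, size_t size)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        page = 4096;       /* assumed fallback page size */

    p->base = malloc(size);
    if (p->base == NULL)
        return -1;

    volatile unsigned char *touch = p->base;
    for (size_t off = 0; off < size; off += (size_t)page)
        touch[off] = 0;

    p->size = size;
    p->used = 0;
    return 0;
}

/* Hand out n bytes, rounded up to a multiple of 16; NULL when exhausted. */
void *pool_alloc(struct pool *p, size_t n)
{
    size_t rounded = (n + 15) & ~(size_t)15;
    if (rounded < n || rounded > p->size - p->used)
        return NULL;
    void *out = p->base + p->used;
    p->used += rounded;
    return out;
}

/* Make the whole pool available again between runs. */
void pool_reset(struct pool *p)
{
    p->used = 0;
}

/* Release the reserved memory. */
void pool_destroy(struct pool *p)
{
    free(p->base);
    p->base = NULL;
    p->size = 0;
    p->used = 0;
}
```

Each run of the routine would call pool_alloc for its working storage and pool_reset when finished, so the process keeps its reservation for as long as it lives.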