On Linux, malloc/free are managed by glibc, and when we free memory, glibc does not necessarily return it to the OS right away (it may be cached for future malloc calls). So if there are lots of small allocations and frees, the heap size (VSS) can grow a lot even though the memory has been freed.
http://www.gnu.org/software/libc/manual/html_mono/libc.html#Efficiency-and-Malloc
So the VSS includes both memory that is allocated and in use, and memory that has been freed but not returned to the OS. How can we check the size of each?
Thx.
The traditional mallinfo function is a bad match to answer your question, because its interface is fundamentally broken: it reports sizes in int fields, which cannot represent large heaps.
A non-portable, glibc-specific answer is to use malloc_stats or malloc_info.
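As a rough illustration of the malloc_info route, the sketch below captures glibc's XML heap report into a string. The helper name heap_report is made up for this example, and the exact XML layout differs between glibc versions, so treat any parsing of it as an assumption to verify against your own system.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <malloc.h>   /* glibc-specific: malloc_info() */
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap-allocated string holding glibc's
 * XML heap report, or NULL on failure. The caller must free() it. */
char *heap_report(void)
{
    char *buf = NULL;
    size_t len = 0;
    FILE *fp = open_memstream(&buf, &len);  /* POSIX in-memory stream */
    if (fp == NULL)
        return NULL;

    int rc = malloc_info(0, fp);            /* the options argument must be 0 */
    if (fclose(fp) != 0 || rc != 0) {
        free(buf);
        return NULL;
    }
    assert(buf != NULL);                    /* set by open_memstream on close */
    return buf;
}
```

Printing the returned string (for example with fputs(report, stderr)) shows per-arena totals; on the glibc versions I have looked at, the free-chunk totals and the "system" sizes together give the split between cached and in-use heap memory.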
I am rather new to VxWorks, and I am building an RTP application which needs to allocate some memory dynamically. I have configured the kernel for a memory size of 750MB.
I am allocating ten blocks of 32MB each at the very beginning of the program, but after the 5th or 6th block allocation, I get an allocation failure with the message memPartAlloc: block too big 15912260 bytes (0x10 aligned) in partition 0xe004608 on the console.
How can memory allocation be failing when there is enough memory available? I do not think memory has fragmented enough for allocation to fail right at the beginning of my program, and as per the output of memShow(), there is indeed enough free memory to satisfy the request.
If memory has indeed fragmented for some strange reason, is there some way to compact free space and continue in VxWorks?
This is an old question, so this answer may be moot now, and it is to an extent speculation based on the limited information in the question.
Whilst the kernel may be configured to support 750MB, this will be the total memory available. Some of this will be used by the OS image, although we won't expect much, and we can assume that at least 700MB should be available for use.
Some extra memory will be used to provide the stacks for each task. How much is very application dependent, as it is specified in the taskSpawn call. You can check this, but again, it is unlikely to make a significant difference.
Let's be generous, and assume that you really only have 650MB. This should, in theory, be plenty.
And yet we have this error:
memPartAlloc: block too big 15912260 bytes (0x10 aligned) in partition 0xe004608
What can be happening? And what does this mean?
This error tells you that the memory allocator could not allocate memory, as the request was too large. Interestingly, the request is for 15912260 bytes, which is not 32MB; it is actually a shade over 15MB. So it would be worth checking what you are actually requesting.
Secondly, this error message is coming from memPartAlloc. Are you allocating memory using malloc() or memPartAlloc()? The distinction matters, since malloc will allocate memory from the system memory partition, whereas memPartAlloc allocates memory from a user-specified, user-created partition.
If you are using memPartAlloc, ensure that you are allocating memory from the correct partition, and that it has been created with enough memory to fulfill the request.
EDIT:
As it appears that this was an RTP, you should also confirm that the RTP has a large enough heap allocated. This is specified via an environment variable, as this answer describes.
Is it OK to call malloc and free too many times in a program?
I have a program that does a malloc and free for each record. Although it sounds bad, does it cause performance issues if I use too many malloc and free calls?
Most modern malloc(3) implementations work like a memory pool. Since most modern OSes treat memory with pages (usually 4KB size), a malloc will probably request at least 4KB from the OS.
Suppose you keep calling malloc with a size of 32. In your first malloc, at least one new page is requested from the OS (via sbrk(2) on Unix). The successive mallocs have nothing to do with the OS; they just return the next free chunk of memory in the memory pool as long as memory is available. So, calling malloc many times is usually not a big deal. The point here is that system calls (the communication between the user process and the OS) are usually expensive, and malloc tries its best to avoid them as much as possible.
free is similar. When you free memory, the OS usually isn't notified. When a page is totally freed, the page may be returned to the OS. Some implementations do not return the page to the OS unless the process already holds many unused pages.
To sum it up, malloc and free are like generic memory managers working with arbitrary size. The problem you might face is that malloc is designed to work with arbitrary size allocations, which might be slower than a memory manager that's designed to work with fixed size allocations. If you're usually allocating the same types of memory, you might be better off with implementing your own memory pool. Another case would be that malloc calls involve locking/unlocking in most modern implementations to support multithreading. If you're working with a single thread, that might also be an overhead: another reason to implement your own memory pool.
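To make the "own memory pool" idea concrete, here is a minimal sketch of a fixed-size block pool built on an intrusive free list. All names (fixed_pool, pool_init, and so on) are invented for this example; it is not thread-safe and only guarantees pointer-size alignment, which is part of why it can be cheaper than a general-purpose malloc.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Unused blocks double as list nodes, so the free list needs no extra memory. */
typedef struct pool_node { struct pool_node *next; } pool_node;

typedef struct {
    void      *storage;    /* single backing allocation */
    pool_node *free_list;  /* head of the list of unused blocks */
} fixed_pool;

/* Carve `count` blocks of `block_size` bytes out of one malloc.
 * Returns 0 on success, -1 on failure. */
int pool_init(fixed_pool *p, size_t block_size, size_t count)
{
    assert(p != NULL && count > 0);
    /* A block must be able to hold a list node; round up to pointer size. */
    if (block_size < sizeof(pool_node))
        block_size = sizeof(pool_node);
    block_size = (block_size + sizeof(void *) - 1) / sizeof(void *) * sizeof(void *);
    if (count > SIZE_MAX / block_size)
        return -1;                       /* size computation would overflow */

    p->storage = malloc(block_size * count);
    if (p->storage == NULL)
        return -1;

    p->free_list = NULL;
    for (size_t i = 0; i < count; i++) {
        pool_node *n = (pool_node *)((char *)p->storage + i * block_size);
        n->next = p->free_list;
        p->free_list = n;
    }
    return 0;
}

/* O(1): pop a block off the free list; NULL when the pool is exhausted. */
void *pool_alloc(fixed_pool *p)
{
    pool_node *n = p->free_list;
    if (n != NULL)
        p->free_list = n->next;
    return n;
}

/* O(1): push the block back onto the free list. */
void pool_free(fixed_pool *p, void *block)
{
    pool_node *n = block;
    n->next = p->free_list;
    p->free_list = n;
}

void pool_destroy(fixed_pool *p)
{
    free(p->storage);
    p->storage = NULL;
    p->free_list = NULL;
}
```

Because the most recently freed block is handed out first, a free followed by an alloc returns the same pointer, which tends to be friendly to the CPU cache.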
You might also want to work with different malloc implementations, benchmark them and decide to go with either one. Starting with a clean implementation and stripping off unnecessary parts might also be a good idea here.
Yes and no. Large volumes of malloc/free can fragment the heap to the point where malloc can fail. It is less of an issue now that memory is pretty cheap.
There is some overhead in calling malloc, but not a lot. malloc basically has to go through the heap and find a block of memory that is unused and large enough to hold the number of bytes you asked for, then it designates that block as used and returns a pointer to it. Only when the heap has no suitable block does it ask the operating system for more memory (via brk or mmap).
It's a few steps, but really not a lot of work for your computer. The difference between using malloc to get memory for you and putting a variable on the stack is a handful of instructions and, occasionally, a system call, and unless you're programming on an embedded system, you honestly shouldn't worry about it. You'll only take a real performance hit if you allocate so much memory that you actually run out of RAM (in which case your virtual memory manager will have to move some things into swap space to make more room; as it turns out, with Linux's default overcommit policy malloc rarely fails)!
Freeing memory is even easier than allocating it, and in the end it's better to free what you allocate (future malloc calls will be faster, more memory will be available).
In short, use malloc to your heart's content! Decades of advances in technology have worked hard to earn you that right; there's no sense squandering it!
By definition 'too many' is 'too many'.
But more seriously, on most systems heap allocation is reasonably fast, because it's done a lot. Allocating space for a record each time it's processed doesn't sound bad.
The real answer is: write your program and measure its speed. Is it acceptable? If not, then profile it and find where the bottlenecks are. My 10c says it won't be heap processing.
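If you want a quick number before reaching for a profiler, a micro-benchmark along these lines can give a first impression. The function name time_malloc_free is made up here, and the result only reflects this synthetic pattern (same size, immediate free), not your real workload.

```c
#include <assert.h>
#include <stdlib.h>
#include <time.h>

/* Time `count` malloc/free pairs of `size` bytes each.
 * Returns elapsed CPU seconds, or -1.0 if an allocation fails. */
double time_malloc_free(size_t count, size_t size)
{
    assert(size > 0);
    clock_t start = clock();
    for (size_t i = 0; i < count; i++) {
        volatile char *rec = malloc(size);
        if (rec == NULL)
            return -1.0;
        rec[0] = (char)i;     /* touch the block so the pair is not optimized away */
        free((void *)rec);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

Dividing the result by count gives a rough cost per record; compare that against the time spent actually processing a record before deciding whether the allocator is worth optimizing.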
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory? Specifically, how does it allocate memory, and how does it release memory? Also, in what cases can it fail to allocate and deallocate, respectively?
That is a very general question, but I want to speak to the failure to allocate. It's important to realize that memory is actually allocated by the kernel upon first access. What you are doing when calling malloc/calloc/realloc is reserving some addresses inside the virtual address space of a process (libc does that via syscalls such as brk and mmap).
When I get malloc or similar to fail (or when libc gets brk or mmap to fail), it's usually because I exhausted the virtual address space of a process. This happens when there is no contiguous block of free addresses, and no room to expand an existing one. You can either exhaust all the space available or hit the RLIMIT_AS limit. It's pretty common, especially on 32-bit systems when using multiple threads, because people sometimes forget that each thread needs its own stack. Stacks usually consume several megabytes, which means you can create only a few hundred threads before you have no more free address space. Maybe an even more common reason for exhausted address space is memory leaks. Libc of course tries to reuse space on the heap (space obtained by a brk syscall) and tries to munmap unneeded mappings. However, it can't reuse something that is not "deallocated".
A shortage of physical memory is not detectable from within a process (or libc, which is part of the process) by a failure to allocate. Yes, you can hit the "overcommit limit", but that doesn't mean the physical memory is all taken. When free physical memory is low, the kernel invokes a special task called the OOM killer (Out Of Memory killer), which terminates some processes in order to free memory.
Regarding failure to deallocate, my guess is it doesn't happen unless you do something silly. I can imagine setting the program break (end of heap) below its original position (by a brk syscall). That is, of course, a recipe for disaster. Hopefully libc won't do that, and it doesn't make much sense either. But it can be seen as failed deallocation. munmap can also fail if you supply some silly argument, but I can't think of a regular reason for it to fail. That doesn't mean one doesn't exist. We would have to dig deep into the source code of glibc and the kernel to find out.
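The RLIMIT_AS case is easy to reproduce. The sketch below temporarily lowers the soft address-space limit, attempts one allocation, and restores the limit; the function name is hypothetical, and it assumes Linux behaviour where RLIM_INFINITY is the largest rlim_t value. It changes a process-wide limit, so it is only meant for an experiment in a single-threaded program.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/resource.h>

/* Cap the address space at `limit` bytes (never raising the current soft
 * limit), then try to malloc `request` bytes.
 * Returns 1 if malloc failed, 0 if it succeeded, -1 on setup errors. */
int malloc_fails_under_as_limit(rlim_t limit, size_t request)
{
    assert(request > 0);
    struct rlimit old, capped;
    if (getrlimit(RLIMIT_AS, &old) != 0)
        return -1;

    capped = old;
    capped.rlim_cur = limit < old.rlim_cur ? limit : old.rlim_cur;
    if (setrlimit(RLIMIT_AS, &capped) != 0)
        return -1;

    volatile char *p = malloc(request);
    int failed = (p == NULL);
    if (p != NULL)
        p[0] = 1;              /* keep the compiler from eliding the allocation */
    free((void *)p);

    setrlimit(RLIMIT_AS, &old); /* restore the original soft limit */
    return failed;
}
```

With a 1 GiB cap, a 2 GiB request cannot fit in the address space regardless of how much RAM is free, so malloc returns NULL even though no physical memory was ever touched.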
1) how does it allocate memory
libc provides malloc() to C programs.
Normally, malloc allocates memory from the heap, and adjusts the
size of the heap as required, using sbrk(2). When allocating blocks of
memory larger than MMAP_THRESHOLD bytes, the glibc malloc()
implementation allocates the memory as a private anonymous mapping
using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable
using mallopt(3). Allocations performed using mmap(2) are unaffected
by the RLIMIT_DATA resource limit (see getrlimit(2)).
And this is about sbrk.
sbrk - change data segment size
2) in what cases can it fail to allocate
Also from malloc
By default, Linux follows an optimistic memory allocation strategy.
This means that when malloc() returns non-NULL there is no guarantee
that the memory really is available.
And from proc
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
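To see which of these modes a given machine is running in, the file can simply be read. This is a small sketch with made-up helper names; the mode descriptions are copied from the proc documentation quoted above.

```c
#include <assert.h>
#include <stdio.h>

/* Read the kernel's overcommit mode (0, 1 or 2) from procfs.
 * Returns -1 if the file cannot be read, e.g. on a non-Linux system. */
int read_overcommit_mode(void)
{
    FILE *fp = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (fp == NULL)
        return -1;

    int mode = -1;
    if (fscanf(fp, "%d", &mode) != 1)
        mode = -1;
    fclose(fp);
    return mode;
}

/* Map a valid mode number to the description used in proc(5). */
const char *overcommit_mode_name(int mode)
{
    assert(mode >= 0 && mode <= 2);
    switch (mode) {
    case 0:  return "heuristic overcommit";
    case 1:  return "always overcommit, never check";
    default: return "always check, never overcommit";
    }
}
```

In mode 2 a malloc that would exceed the commit limit fails up front, whereas in modes 0 and 1 a non-NULL return may still be followed by trouble when the pages are first touched.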
Mostly it uses the sbrk system call to adjust the size of the data segment, thereby reserving more memory for it to parcel out. Memory allocated in that way is generally not released back to the operating system because it is only possible to do it when the blocks available to be released are at the end of the data segment.
Larger blocks are sometimes allocated using mmap instead, and that memory can be released again with a munmap call.
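The mmap/munmap path can be exercised directly, without going through malloc. The two wrappers below are hypothetical names for this sketch; they request a private anonymous mapping, which is the kind of mapping the malloc man page describes for large blocks, and then hand it straight back.

```c
#define _DEFAULT_SOURCE        /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Map `len` bytes of zero-filled, private anonymous memory.
 * Returns NULL on failure. */
void *map_anonymous(size_t len)
{
    assert(len > 0);
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return the mapping to the kernel immediately. Returns 0 on success. */
int unmap_anonymous(void *p, size_t len)
{
    return munmap(p, len);
}
```

Unlike memory in the middle of the brk-managed data segment, a mapping like this can be released independently of everything else the process has allocated.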
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory?
Through system calls - this is a low-level API that the kernel provides.
Specifically, how does it allocate memory, and how does it release memory?
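Unix-like systems provide "brk"/"sbrk" for growing and shrinking the data segment (on Linux, sbrk is a library wrapper around the brk syscall), plus mmap and munmap for separate mappings.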
Also, in what cases can it fail to allocate and deallocate, respectively?
Allocation can fail, for example, when there's not enough available memory. Deallocation should not fail.
I am trying to allocate a 5-page 800x600 frame buffer (roughly 5MB). But during DRAM memory map initialization, dma_alloc_coherent() only returns a null pointer; the buffer is not allocated.
It used to work when allocating just a 4-page frame buffer (4MB). I have already tried setting CONSISTENT_DMA_SIZE to 8MB, 10MB, and 12MB, but this doesn't seem to have any effect.
Is there any other setting I'm overlooking?
Thanks a lot,
nazekimi
P.S.
working on a Linux 2.6.10 Mobilinux kernel
The kernel does power-of-2 allocation, so 5MB means an 8MB allocation. You probably need to increase CONSISTENT_DMA_SIZE even more.
Thx,
Jeffrey
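Taking the power-of-2 claim above at face value, the rounding can be sketched as follows. The function name is invented for this example, and it works on raw byte counts; the kernel's own bookkeeping is done in pages, which gives the same result for sizes like these.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round `n` up to the next power of two (n itself if it already is one). */
size_t round_up_pow2(size_t n)
{
    /* The upper bound keeps the shift below from overflowing. */
    assert(n > 0 && n <= SIZE_MAX / 2 + 1);
    size_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}
```

For a 5MB request, round_up_pow2((size_t)5 << 20) yields 8MB, while a 4MB request stays at 4MB, which would explain why the 4-page buffer fit where the 5-page one did not.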
After reading the two books "Understanding Linux Network Internals" and "Understanding the Linux Kernel", as well as other references, I am quite confused and need some clarification about the "memory cache" and "memory pool" techniques.
1) Are they the same or different techniques?
2) If not the same, what makes the difference, or the distinct goals?
3) Also, how does the Slab Allocator come in?
Regarding the slab allocator:
So imagine memory is flat, that is, you have a block of 4 gigs of contiguous memory. Then one of your programs requests 256 bytes of memory, so what the memory allocator has to do is choose a suitable block of 256 bytes from these 4 gigs. So now your memory looks something like
<============256bytes=======================>
(each = is a contiguous block of memory). Some time passes, and a lot of programs operating with the memory request more blocks of 256 bytes, or of larger or smaller sizes, so in the end your memory might look like:
<==256==256=256=86=68=121===>
So it gets fragmented, and there is no trace left of your beautiful 4-gig block of memory. This is fragmentation. Now, what the slab allocator does is keep track of allocated objects, and once they are not used anymore it reports the memory as free when in fact it is retained in some sort of list (you might want to read about free lists).
So now imagine that the first program relinquishes its 256 bytes, and then a new one asks for 256 bytes. Instead of allocating a new chunk of main memory, the allocator can reuse the most recently freed 256 bytes without the burden of searching physical memory for an appropriate contiguous block of space. This is essentially how you implement a memory cache. It is done so that memory fragmentation is reduced overall, because otherwise you might end up in a situation where memory is so fragmented that it is unusable and the memory manager has to do some magic to get you a block of the appropriate size. Using a slab allocator, by contrast, proactively combats (but doesn't eliminate) the problem.
The Linux memory allocator, a.k.a. the slab allocator, maintains lists/pools of frequently used memory objects of similar or approximate size. Slab gives the programmer extra flexibility: you can create your own pool of frequently used memory objects of the same size, label it as you want, allocate from it, deallocate to it, and finally destroy it. This cache is known to your driver and private to it. But there is a problem: under memory pressure there is a high chance of allocation failures, which may be unacceptable in some drivers. What to do then? Always keep some memory reserved so that we never feel the memory crunch. Since the kmem cache is a more generic pool mechanism, we need something that always maintains a minimum amount of reserved memory, and that's our buddy, the memory pool.
Lookaside caches: the cache manager in the Linux kernel is sometimes called the slab allocator. You might end up allocating many objects of the same size over and over, so with this mechanism you can set up a cache of objects of that size and take them from it later, without paying the full allocation cost each time.
A memory pool is just a form of lookaside cache that tries to always keep a list of memory around for use in emergencies. When the memory pool is created, the allocation functions (slab allocators) create a pool of preallocated objects so you can acquire them when you need them.
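The "keep some memory around for emergencies" idea can be sketched in userspace as below. This is an analogy only, not the kernel's mempool API: every name here (reserve_pool, failing_alloc, and so on) is invented for illustration, and failing_alloc merely stands in for an allocator under memory pressure.

```c
#include <assert.h>
#include <stdlib.h>

typedef void *(*alloc_fn)(size_t);   /* must return memory releasable with free() */

typedef struct {
    alloc_fn alloc;     /* normal allocator, always tried first */
    void   **reserve;   /* preallocated objects held back for emergencies */
    size_t   nr_free;   /* reserve objects currently available */
    size_t   min_nr;    /* capacity of the reserve */
    size_t   obj_size;
} reserve_pool;

void reserve_pool_destroy(reserve_pool *p)
{
    while (p->nr_free > 0)
        free(p->reserve[--p->nr_free]);
    free(p->reserve);
    p->reserve = NULL;
}

/* Preallocate `min_nr` objects up front. Returns 0 on success, -1 on failure. */
int reserve_pool_init(reserve_pool *p, size_t min_nr, size_t obj_size, alloc_fn alloc)
{
    assert(p != NULL && min_nr > 0 && obj_size > 0 && alloc != NULL);
    p->alloc = alloc;
    p->min_nr = min_nr;
    p->obj_size = obj_size;
    p->nr_free = 0;
    p->reserve = malloc(min_nr * sizeof *p->reserve);
    if (p->reserve == NULL)
        return -1;
    while (p->nr_free < min_nr) {
        void *obj = malloc(obj_size);
        if (obj == NULL) {
            reserve_pool_destroy(p);
            return -1;
        }
        p->reserve[p->nr_free++] = obj;
    }
    return 0;
}

/* Try the normal allocator; dip into the reserve only if it fails. */
void *reserve_pool_alloc(reserve_pool *p)
{
    void *obj = p->alloc(p->obj_size);
    if (obj == NULL && p->nr_free > 0)
        obj = p->reserve[--p->nr_free];
    return obj;
}

/* Refill the reserve first; release the object only when the reserve is full. */
void reserve_pool_free(reserve_pool *p, void *obj)
{
    if (p->nr_free < p->min_nr)
        p->reserve[p->nr_free++] = obj;
    else
        free(obj);
}

/* Stand-in for an allocator that is failing under memory pressure. */
void *failing_alloc(size_t size)
{
    (void)size;
    return NULL;
}
```

Passing malloc as the allocator gives the normal behaviour; passing failing_alloc shows that exactly min_nr allocations still succeed from the reserve, which is the guarantee a driver is after.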