#include <stdlib.h>

int main()
{
    char *str = (char *) malloc(100);   /* allocated, but never freed */
    return 0;
}
I was told that the code above would cause a memory leak. But consider the role of virtual memory:
Suppose the executable is a.out; then a.out has its own address space and page table.
When a.out terminates without free(str), the memory leak happens inside a.out's virtual memory space.
However, after termination (or perhaps after the parent process reaps the terminated process?), the data structures describing a.out's memory are also destroyed.
Does this mean a memory leak is totally impossible, as long as the process terminates?
The C standard has nothing to say about what happens after the program exits.
That's an environmental thing.
However, in the vast majority of situations the malloc arena is part of the process's address space, which is auto-magically released along with the process.
Memory leaks generally involve memory that you can no longer access (and therefore can no longer free) because you've overwritten the pointer to it with some other value, and that tends to be a potential problem only while the process exists.
Related
On Linux, malloc behaves opportunistically, only backing virtual memory with real memory when it is first accessed. Would it be possible to modify calloc so that it also behaves this way (allocating and zeroing pages when they are first accessed)?
It is not a feature of malloc() that makes it "opportunistic". It's a feature of the kernel, with which malloc() has nothing to do whatsoever.
malloc() asks the kernel for a slab of memory every time it needs more memory to fulfill a request, and it's the kernel that says "Yeah, sure, you have it" every time without actually supplying memory. It is also the kernel that handles the subsequent page faults by supplying zeroed memory pages. Note that any memory the kernel supplies will already be zeroed out for safety reasons, so it is equally well suited for malloc() and for calloc().
That is, unless the calloc() implementation spoils this by unconditionally zeroing out the pages itself (generating the page faults that prompt the kernel to actually supply memory), it will have the same "opportunistic" behavior as malloc().
Update:
On my system, the following program successfully allocates 1 TiB (!) on a machine with only 2 GiB of memory:
#include <stdlib.h>
#include <stdio.h>

int main() {
    size_t allocationCount = 1024, successfulAllocations = 0;
    char* allocations[allocationCount];

    /* Request 1024 blocks of 1 GiB each. */
    for(int i = allocationCount; i--; ) {
        if((allocations[i] = calloc(1, 1024*1024*1024))) successfulAllocations++;
    }
    if(successfulAllocations == allocationCount) {
        printf("all %zu allocations were successful\n", successfulAllocations);
    } else {
        printf("there were %zu failed allocations\n", allocationCount - successfulAllocations);
    }
}
I think it's safe to say that at least the calloc() implementation on my box behaves "opportunistically".
From the related /proc/sys/vm/overcommit_memory section in proc:
The amount of memory presently allocated on the system. The committed memory is a sum of all of the memory which has been allocated by processes, even if it has not been "used" by them as of yet. A process which allocates 1GB of memory (using malloc(3) or similar), but only touches 300MB of that memory will only show up as using 300MB of memory even if it has the address space allocated for the entire 1GB. This 1GB is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 /proc/sys/vm/overcommit_memory), allocations which would exceed the CommitLimit (detailed above) will not be permitted. This is useful if one needs to guarantee that processes will not fail due to lack of memory once that memory has been successfully allocated.
Though it is not explicitly stated, I think "similar" here includes calloc and realloc. So calloc already behaves as opportunistically as malloc does.
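If you want to watch this accounting on your own machine, here is a minimal sketch (assuming a Linux /proc filesystem) that prints the CommitLimit and Committed_AS lines from /proc/meminfo:

#include <stdio.h>
#include <string.h>

int main(void) {
    /* CommitLimit is the ceiling under strict overcommit;
       Committed_AS is what has been promised to all processes. */
    FILE *f = fopen("/proc/meminfo", "r");
    if (!f) { perror("fopen"); return 1; }
    char line[256];
    while (fgets(line, sizeof line, f)) {
        if (strncmp(line, "CommitLimit", 11) == 0 ||
            strncmp(line, "Committed_AS", 12) == 0)
            fputs(line, stdout);
    }
    fclose(f);
    return 0;
}

Run it before and after a large untouched calloc to see Committed_AS grow while actual memory usage stays flat.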
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory? Specifically, how does it allocate memory, and how does it release memory? Also, in what cases can it fail to allocate and deallocate, respectively?
That is a very general question, but I want to speak to the failure to allocate. It's important to realize that memory is actually allocated by the kernel upon first access. What you are doing when calling malloc/calloc/realloc is reserving some addresses inside the virtual address space of the process (libc does that via syscalls such as brk and mmap).
When malloc or similar fails (or when libc's brk or mmap calls fail), it's usually because the virtual address space of the process is exhausted. That happens when there is no contiguous block of free addresses and no room to expand an existing one. You can either exhaust all available space or hit the RLIMIT_AS limit. This is especially common on 32-bit systems when using multiple threads, because people sometimes forget that each thread needs its own stack. Stacks usually consume several megabytes, which means you can create only a few hundred threads before you have no more free address space. Perhaps an even more common cause of exhausted address space is memory leaks. Libc of course tries to reuse space on the heap (space obtained by the brk syscall) and tries to munmap unneeded mappings, but it can't reuse something that was never deallocated.
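To see the address-space ceiling mentioned above, you can query RLIMIT_AS directly; a small sketch using the standard getrlimit(2) interface:

#include <stdio.h>
#include <sys/resource.h>

int main(void) {
    struct rlimit rl;
    /* RLIMIT_AS caps the total virtual address space of the process. */
    if (getrlimit(RLIMIT_AS, &rl) != 0) { perror("getrlimit"); return 1; }
    if (rl.rlim_cur == RLIM_INFINITY)
        puts("RLIMIT_AS: unlimited");
    else
        printf("RLIMIT_AS: %llu bytes\n", (unsigned long long)rl.rlim_cur);
    return 0;
}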
A shortage of physical memory is not detectable from within a process (or from libc, which is part of the process) as a failure to allocate. You can hit the overcommit limit, but that doesn't mean physical memory is all taken. When free physical memory is low, the kernel invokes a special task called the OOM killer (Out Of Memory killer), which terminates some processes in order to free memory.
Regarding failure to deallocate, my guess is that it doesn't happen unless you do something silly. I can imagine setting the program break (the end of the heap) below its original position via the brk syscall, which is of course a recipe for disaster; hopefully libc won't do that, and it doesn't make much sense either, but it could be seen as a failed deallocation. munmap can also fail if you supply a silly argument, but I can't think of a regular reason for it to fail. That doesn't mean one doesn't exist; we would have to dig deep into the source code of glibc and the kernel to find out.
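For completeness, the "silly argument" case is easy to reproduce; a sketch that makes munmap(2) fail with EINVAL by passing an address that is not page-aligned:

#include <stdio.h>
#include <string.h>
#include <errno.h>
#include <sys/mman.h>

int main(void) {
    /* munmap requires a page-aligned address; (void *)1 is not. */
    if (munmap((void *)1, 4096) == -1)
        printf("munmap failed as expected: %s\n", strerror(errno));
    return 0;
}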
1) how does it allocate memory
libc provides malloc() to C programs.
Normally, malloc allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3). Allocations performed using mmap(2) are unaffected by the RLIMIT_DATA resource limit (see getrlimit(2)).
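Since the quote says the threshold is adjustable, here is a sketch (glibc-specific, as mallopt(3) is a glibc interface) that lowers it so smaller allocations are also served by mmap(2):

#include <stdlib.h>
#include <malloc.h>

int main(void) {
    /* Lower the threshold: allocations of 64 KiB and up now use mmap(2). */
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);
    char *big = malloc(1024 * 1024);  /* backed by a private anonymous mapping */
    free(big);                        /* returned to the kernel via munmap(2) */
    return 0;
}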
And this is about sbrk.
sbrk - change data segment size
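You can watch the program break move as the heap grows; a minimal sketch, assuming a glibc-style allocator where a small allocation is served from the sbrk-grown heap:

#define _DEFAULT_SOURCE   /* for sbrk() on modern glibc */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void) {
    void *before = sbrk(0);        /* current end of the data segment */
    void *p = malloc(64 * 1024);   /* small enough to stay below MMAP_THRESHOLD */
    void *after = sbrk(0);
    printf("break before: %p, after: %p\n", before, after);
    free(p);
    return 0;
}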
2) in what cases can it fail to allocate
Also from malloc
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available.
And from proc
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
Mostly it uses the sbrk system call to adjust the size of the data segment, thereby reserving more memory for it to parcel out. Memory allocated that way is generally not released back to the operating system, because releasing it is only possible when the blocks to be released sit at the end of the data segment.
Larger blocks are sometimes allocated using mmap, and that memory can be released again with an munmap call.
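As an illustration of that second path, here is a minimal sketch of obtaining a large block with a private anonymous mapping and handing it back to the OS in full:

#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 1024 * 1024;
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }
    /* ... use the memory ... */
    if (munmap(p, len) == -1) { perror("munmap"); return 1; }  /* given back to the OS */
    return 0;
}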
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory?
Through system calls - this is a low-level API that the kernel provides.
Specifically, how does it allocate memory, and how does it release memory?
Unix-like systems provide the "sbrk" syscall.
Also, in what cases can it fail to allocate and deallocate, respectively?
Allocation can fail, for example, when there is not enough available memory. Deallocation should not fail.
What exactly is a memory leak?
And how will it affect the system the program is running on?
When your process allocates memory from the OS on an ongoing basis and never frees any of it, you will eventually be using more memory than physically exists in the machine. At that point, the OS will first swap pages out to disk (which degrades performance) if it has swap space, and eventually your process will reach the stage where the OS can no longer grant it more memory, because you've exceeded the maximum amount of addressable space (4 GB on a 32-bit OS).
There are basically two reasons this can happen. You've allocated memory and lost the pointer to it (it has become unreachable to your program), so you can no longer free it; that's what most people call a memory leak. Alternatively, you may just be allocating memory and never freeing it because your program is lazy. That's not so much a leak, but in the end the problems you run into are the same.
A memory leak is when your code allocates memory and then loses track of it, along with any ability to free it later.
In C, for example, this can be done with the simple sequence:
void *pointer = malloc (2718); // Alloc, store address in pointer.
pointer = malloc (31415); // And again.
free (pointer); // Only frees the second block.
The original block of memory is still allocated but, because pointer no longer points to it, you have no way to free it.
That sequence, on its own, isn't that bad (well, it is bad, but the effects may not be). It's usually when you do it repeatedly that problems occur. Such as in a loop, or in a function that's repeatedly called:
static char firstDigit (int val) {
    char *buff = malloc (100);  // Allocates.
    if (val < 0)
        val = -val;
    sprintf (buff, "%d", val);
    return buff[0];             // But never frees.
}
Every time you call that function, you will leak the hundred bytes (plus any housekeeping information).
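One way to plug that particular leak is to grab the character you need and free the buffer before returning; a sketch of the repaired function (the '?' fallback on allocation failure is just an illustrative choice):

#include <stdio.h>
#include <stdlib.h>

static char firstDigit (int val) {
    char *buff = malloc (100);   // Allocates, as before.
    if (buff == NULL)
        return '?';              // Illustrative fallback if allocation fails.
    if (val < 0)
        val = -val;
    sprintf (buff, "%d", val);
    char result = buff[0];
    free (buff);                 // Frees before the pointer goes out of scope.
    return result;
}

int main (void) {
    printf ("%c\n", firstDigit (-31415));  // Prints '3'; nothing leaks.
    return 0;
}

An even simpler fix here would be a plain automatic array (char buff[100];), which removes the allocation entirely.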
And, yes, memory leaks will affect other things. But the effects should be limited.
It will eventually affect the process that is leaking as it runs out of address space for allocating more objects. While that may not necessarily matter for short-lived processes, long-lived processes will eventually fail.
However, a decent operating system (and that includes Windows) will limit the resources that a single process can use, which will minimise the effects on other processes. Since modern environments disconnect virtual from physical memory, the only real effect that can be carried from process to process is if one tries to keep all its virtual memory resident in physical memory all the time, reducing the allocation of that physical memory to other processes.
But, even if a single process leaks gigabytes of memory, the memory itself won't be being used by the process (the crux of the leak is that the process has lost access to the memory). And, since it's not being used, the OS will almost certainly swap it out to disk and never have to bring it back into RAM again.
Of course, it uses up swap space and that may affect other processes but the amount of disk far outweighs the amount of physical RAM.
Your program will eventually crash. And if it does not crash itself, it may cause other programs to crash due to the resulting lack of memory.
When you leak memory, it means that you are dynamically creating objects but are not destroying them. If the leak is severe enough, your program will eventually run out of address space and future allocation attempts will fail (likely causing your application to terminate or crash, since if you are leaking memory, you probably aren't handling out of memory conditions very well either), or the OS will halt your process if it attempts to allocate too much memory.
Additionally, you have to remember that in C++, many objects have destructors: when you fail to destroy a dynamically allocated object, its destructor will not be called.
A memory leak is a situation where a program allocates dynamic memory and then loses all pointers to that memory, so that it can neither address nor free it. The memory remains marked as allocated, so it will never be handed out again when the program requests more memory.
The program will exhaust limited resources at some rate. Depending on the amount of memory and swap file available, this will cause either the program itself eventually getting a "can't allocate memory" indication, or the operating system running out of both physical memory and swap file, with every program getting that indication. The latter can have serious consequences on some operating systems: we sometimes see Windows XP completely falling apart, with critical services malfunctioning severely, once extreme memory consumption in one program exhausts all memory. When that happens, the only way to fix the problem is to reboot the system.
My application is similar to this hypothetical program:
char *p[1000];
int i;

for (;;) {
    for (i = 0; i < 1000; i++) {
        p[i] = malloc(random_number_between_1000_and_100000());
        p[i][0] = 0; // update: touch the memory
    }
    for (i = 0; i < 1000; i++) {
        free(p[i]);
    }
}
It has no memory leaks, but on my system the memory consumption (top, column VSS) grows without limit (e.g., to 300% of available physical memory). Is this normal?
Update: I now use the memory for a while and then free it. Does this make a difference?
The behavior is normal. Quoting man 3 malloc:
BUGS
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.
You need to touch (read/write) the memory for the Linux kernel to actually reserve it.
Try adding
sbrk(-1);
at the end of each loop to see if it makes a difference.
free() only deallocates the memory within the allocator; it doesn't give it back to the OS.
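If you do want the allocator to hand free heap pages back, glibc exposes malloc_trim(3); a minimal sketch (glibc-specific):

#include <stdlib.h>
#include <malloc.h>

int main(void) {
    enum { N = 4096 };
    char *p[N];
    for (int i = 0; i < N; i++)
        p[i] = malloc(1024);   /* small blocks come from the sbrk heap */
    for (int i = 0; i < N; i++)
        free(p[i]);            /* returned to the allocator, not the OS */
    malloc_trim(0);            /* ask glibc to release free pages at the top of the heap */
    return 0;
}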
The OS usually allocates all pages as copy-on-write clones of the "0" page, that is, a fixed page filled with zeros. Reading from the pages will return 0 as expected. As long as you only read, all references go to the same physical memory. Once you write a value, the "COW" is broken and a real, physical page frame is allocated for you. This means that as long as you don't write to the memory, you can keep allocating memory until the virtual address space runs out or your page table fills up all available memory.
As long as you don't touch those allocated chunks, the system will not really allocate them for you.
However, you can run out of addressable space, which is a limit the OS imposes on processes and is not necessarily the maximum you can address with the system's pointer type.
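A sketch that makes this visible (assuming a Linux /proc filesystem; the second field of /proc/self/statm is the resident set size, in pages):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static long resident_pages(void) {
    long size = 0, resident = 0;
    FILE *f = fopen("/proc/self/statm", "r");
    if (f) { fscanf(f, "%ld %ld", &size, &resident); fclose(f); }
    return resident;
}

int main(void) {
    size_t len = 256 * 1024 * 1024;   /* 256 MiB of address space */
    char *p = malloc(len);
    if (!p) return 1;
    printf("after malloc: %ld resident pages\n", resident_pages());
    memset(p, 1, len);                /* touch every page: real frames appear */
    printf("after touch:  %ld resident pages\n", resident_pages());
    free(p);
    return 0;
}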
I am wondering: if I new some object but forget to delete it, when the process exits, will the leaked memory be returned to the OS?
This isn't so much a C++ question as an operating system question.
All of the operating systems that I have knowledge of will reclaim conventional memory that has been allocated. That's because the allocation generally comes from a process's private address space, which is reclaimed on exit.
This may not be true for other resources such as shared memory. There are implementations that will not release shared memory segments unless you specifically mark them for deletion before your process exits (and, even then, they don't get deleted until everyone has detached).
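System V shared memory is the classic example; a sketch showing the explicit mark-for-deletion step that process exit does not perform for you:

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void) {
    /* The segment would outlive this process if we skipped IPC_RMID. */
    int id = shmget(IPC_PRIVATE, 4096, IPC_CREAT | 0600);
    if (id == -1) { perror("shmget"); return 1; }
    /* ... shmat(), use the memory, shmdt() ... */
    if (shmctl(id, IPC_RMID, NULL) == -1) { perror("shmctl"); return 1; }
    return 0;
}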
For most modern operating systems (most flavors of Unix, and anything running in protected mode on x86), the memory allocation occurs within a program's heap (through malloc in C or new in C++). So when the program exits, the memory is released for use elsewhere.