Make calloc opportunistic - linux

On linux malloc behaves opportunistically, only backing virtual memory with real memory when it is first accessed. Would it be possible to modify calloc so that it also behaves this way (allocating and zeroing pages when they are first accessed)?

It is not a feature of malloc() that makes it "opportunistic". It's a feature of the kernel with which malloc() has nothing to do whatsoever.
malloc() asks the kernel for a slab of memory every time it needs more memory to fulfill a request, and it's the kernel that says "Yeah, sure, you have it" every time without actually supplying memory. It is also the kernel that handles the subsequent page faults by supplying zeroed memory pages. Note that any memory the kernel supplies will already be zeroed out for security reasons, so it is equally well suited for malloc() and for calloc().
That is, unless the calloc() implementation spoils this by unconditionally zeroing out the pages itself (generating the page faults that prompt the kernel to actually supply memory), it will have the same "opportunistic" behavior as malloc().
Update:
The following program successfully allocates 1 TiB (!) on a system with only 2 GiB of memory:
#include <stdlib.h>
#include <stdio.h>

int main() {
    size_t allocationCount = 1024, successfulAllocations = 0;
    char* allocations[allocationCount];
    for(size_t i = allocationCount; i--; ) {
        if((allocations[i] = calloc(1, 1024*1024*1024))) successfulAllocations++;
    }
    if(successfulAllocations == allocationCount) {
        printf("all %zu allocations were successful\n", successfulAllocations);
    } else {
        printf("there were %zu failed allocations\n", allocationCount - successfulAllocations);
    }
}
I think it's safe to say that at least the calloc() implementation on my box behaves "opportunistically".

From the related /proc/sys/vm/overcommit_memory section in proc:
The amount of memory presently allocated on the system. The committed memory is a sum of all of the memory which has been allocated by processes, even if it has not been "used" by them as of yet. A process which allocates 1GB of memory (using malloc(3) or similar), but only touches 300MB of that memory will only show up as using 300MB of memory even if it has the address space allocated for the entire 1GB. This 1GB is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 /proc/sys/vm/overcommit_memory), allocations which would exceed the CommitLimit (detailed above) will not be permitted. This is useful if one needs to guarantee that processes will not fail due to lack of memory once that memory has been successfully allocated.
Though not explicitly stated, I think "similar" here includes calloc and realloc. So calloc already behaves as opportunistically as malloc does.
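For reference, a minimal sketch (not part of the original answers) that simply reads the current overcommit mode described above; it assumes a Linux system with /proc mounted:

#include <stdio.h>

/* Print the current overcommit mode (0, 1, or 2). */
int main(void) {
    FILE *f = fopen("/proc/sys/vm/overcommit_memory", "r");
    if (!f) {
        perror("fopen");
        return 1;
    }
    int mode;
    if (fscanf(f, "%d", &mode) == 1)
        printf("overcommit_memory = %d\n", mode);
    fclose(f);
    return 0;
}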

Related

Virtual memory: memory leak on unfreed malloc?

#include <stdlib.h>

int main()
{
    char * str = (char *) malloc(100);
    return 0;
}
I was told that the code above would cause a memory leak. But with the help of virtual memory:
Suppose the executable is a.out; then a.out has its own address space and page table.
When a.out terminates without free(str), the memory leak happens in a.out's virtual memory space.
However, after termination (or maybe after the parent process reaps the terminated process?), the data structures describing a.out's memory are also destroyed.
Does this mean a memory leak is totally impossible, as long as the process terminates?
The C standard has nothing to say about what happens after the program exits.
That's an environmental thing.
However, the vast majority of situations have the malloc arena as part of process space which is auto-magically released with the process.
Memory leaks are generally related to memory that you can no longer access (and hence can no longer free) because you've overwritten the pointer to it with some other value, and that tends to be a problem only while the process exists.

libc memory management

How does libc communicate with the OS (e.g., a Linux kernel) to manage memory? Specifically, how does it allocate memory, and how does it release memory? Also, in what cases can it fail to allocate and deallocate, respectively?
That is a very general question, but I want to speak to the failure to allocate. It's important to realize that memory is actually allocated by the kernel upon first access. What you are doing when calling malloc/calloc/realloc is reserving some addresses inside the virtual address space of the process (libc does that via syscalls such as brk and mmap).
When I get malloc() or similar to fail (or when libc gets brk or mmap to fail), it's usually because I have exhausted the virtual address space of the process. This happens when there is no contiguous block of free addresses and no room to expand an existing one. You can either exhaust all available space or hit the RLIMIT_AS limit. It's pretty common, especially on 32-bit systems, when using multiple threads, because people sometimes forget that each thread needs its own stack. Stacks usually consume several megabytes, which means you can create only a few hundred threads before you run out of free address space. Maybe an even more common reason for exhausted address space is memory leaks. Libc of course tries to reuse space on the heap (space obtained by the brk syscall) and tries to munmap unneeded mappings, but it can't reuse something that has not been deallocated.
A shortage of physical memory is not detectable from within a process (or from libc, which is part of the process) by a failure to allocate. Yes, you can hit the overcommit limit, but that doesn't mean all physical memory is taken. When free physical memory is low, the kernel invokes a special task called the OOM killer (Out Of Memory killer), which terminates some processes in order to free memory.
Regarding failure to deallocate, my guess is it doesn't happen unless you do something silly. I can imagine setting the program break (the end of the heap) below its original position with a brk syscall; that is, of course, a recipe for disaster. Hopefully libc won't do that, and it doesn't make much sense anyway, but it could be seen as a failed deallocation. munmap can also fail if you supply a silly argument, but I can't think of a normal reason for it to fail. That doesn't mean one doesn't exist; we would have to dig deep into the source code of glibc/the kernel to find out.
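As an illustration (my own sketch, not part of the original answer), the RLIMIT_AS limit mentioned above can be inspected with getrlimit():

#include <stdio.h>
#include <sys/resource.h>

/* Print the soft RLIMIT_AS limit, which caps how much virtual
   address space this process may reserve. */
int main(void) {
    struct rlimit rl;
    if (getrlimit(RLIMIT_AS, &rl) != 0) {
        perror("getrlimit");
        return 1;
    }
    if (rl.rlim_cur == RLIM_INFINITY)
        printf("RLIMIT_AS soft limit: unlimited\n");
    else
        printf("RLIMIT_AS soft limit: %llu bytes\n",
               (unsigned long long)rl.rlim_cur);
    return 0;
}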
1) how does it allocate memory
libc provides malloc() to C programs.
Normally, malloc allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3). Allocations performed using mmap(2) are unaffected by the RLIMIT_DATA resource limit (see getrlimit(2)).
And this is about sbrk.
sbrk - change data segment size
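To make the mallopt(3) knob from the quote above concrete, here is a small sketch of mine (the threshold value is arbitrary): lowering M_MMAP_THRESHOLD makes glibc serve more allocations via mmap(2), which can be handed back to the kernel individually on free():

#include <malloc.h>
#include <stdlib.h>

int main(void) {
    /* Lower the mmap threshold so allocations above 64 KiB are
       served by mmap(2) instead of the brk-based heap. */
    mallopt(M_MMAP_THRESHOLD, 64 * 1024);

    char *big = malloc(100 * 1024);   /* above 64 KiB, below the 128 KiB default */
    free(big);                        /* glibc can munmap this mapping right away */
    return 0;
}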
2) in what cases can it fail to allocate
Also from the malloc man page:
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available.
And from the proc man page:
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
Mostly it uses the sbrk system call to adjust the size of the data segment, thereby reserving more memory for it to parcel out. Memory allocated in that way is generally not released back to the operating system, because it is only possible to do so when the blocks available to be released are at the end of the data segment.
Larger blocks are sometimes allocated using mmap, and that memory can be released again with an munmap call.
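For comparison, here is a minimal sketch (mine, not from the answer) of doing directly what glibc does for large blocks: an anonymous private mapping that can be handed straight back with munmap():

#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 16 * 1024 * 1024;   /* 16 MiB */
    /* Anonymous private mapping: what glibc malloc uses for large blocks. */
    void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) {
        perror("mmap");
        return 1;
    }
    memset(p, 0, len);   /* touching the pages makes them resident */
    munmap(p, len);      /* unlike brk-based memory, this goes straight back */
    return 0;
}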
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory?
Through system calls - this is a low-level API that the kernel provides.
Specifically, how does it allocate memory, and how does it release memory?
Unix-like systems provide the "sbrk" syscall.
Also, in what cases can it fail to allocate and deallocate, respectively?
Allocation can fail, for example, when there's not enough available memory. Deallocation should not fail.
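A trivial sketch of handling the failure case (my addition; the policy of aborting on failure is just one choice):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void) {
    /* malloc() reports failure by returning NULL. */
    char *p = malloc(1 << 20);
    if (p == NULL) {
        perror("malloc");
        return 1;
    }
    memset(p, 0, 1 << 20);
    free(p);   /* free() itself does not report failure */
    return 0;
}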

What happens if memory is leaking?

What exactly is a memory leak?
And how will it affect the system the program is running on?
When your process allocates memory from the OS on an ongoing basis and never frees any of it, you will eventually be using more memory than there is physically in the machine. At this point, the OS will first swap pages out to disk (which degrades performance) if it has swap space, and at some point your process will reach a point where the OS can no longer grant it more memory, because you've exceeded the maximum amount of addressable space (4 GB on a 32-bit OS).
There are basically two reasons this can happen: You've allocated memory and you've lost the pointer to it (it has become unreachable to your program), so you cannot free it any longer. That's what most people call a memory leak. Alternatively, you may just be allocating memory and never freeing it, because your program is lazy. That's not so much a leak, but in the end the problems you run into are the same.
A memory leak is when your code allocates memory and then loses track of it, including the ability to free it later.
In C, for example, this can be done with the simple sequence:
void *pointer = malloc (2718); // Alloc, store address in pointer.
pointer = malloc (31415); // And again.
free (pointer); // Only frees the second block.
The original block of memory is still allocated but, because pointer no longer points to it, you have no way to free it.
That sequence, on its own, isn't that bad (well, it is bad, but the effects may not be). It's usually when you do it repeatedly that problems occur. Such as in a loop, or in a function that's repeatedly called:
#include <stdio.h>
#include <stdlib.h>

static char firstDigit (int val) {
    char *buff = malloc (100); // Allocates.
    if (val < 0)
        val = -val;
    sprintf (buff, "%d", val);
    return buff[0];            // But never frees.
}
Every time you call that function, you will leak the hundred bytes (plus any housekeeping information).
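For contrast, here is a non-leaking variant (my rewrite, not from the original answer): either free the buffer before returning, or avoid the heap entirely, as below.

#include <stdio.h>

/* Same behaviour without any allocation: a small automatic buffer
   is large enough for any int and is reclaimed when the function returns. */
static char firstDigit(int val) {
    char buff[32];
    if (val < 0)
        val = -val;
    snprintf(buff, sizeof buff, "%d", val);
    return buff[0];
}

int main(void) {
    printf("%c\n", firstDigit(-31415));  /* prints '3' */
    return 0;
}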
And, yes, memory leaks will affect other things. But the effects should be limited.
It will eventually affect the process that is leaking as it runs out of address space for allocating more objects. While that may not necessarily matter for short-lived processes, long-lived processes will eventually fail.
However, a decent operating system (and that includes Windows) will limit the resources that a single process can use, which will minimise the effects on other processes. Since modern environments disconnect virtual from physical memory, the only real effect that can be carried from process to process is if one tries to keep all its virtual memory resident in physical memory all the time, reducing the allocation of that physical memory to other processes.
But, even if a single process leaks gigabytes of memory, the memory itself won't actually be in use by the process (the crux of the leak is that the process has lost access to the memory). And, since it's not being used, the OS will almost certainly swap it out to disk and never have to bring it back into RAM again.
Of course, it uses up swap space and that may affect other processes but the amount of disk far outweighs the amount of physical RAM.
Your program will eventually crash. If it does not crash itself, it will cause other programs to fail because of the lack of memory.
When you leak memory, it means that you are dynamically creating objects but are not destroying them. If the leak is severe enough, your program will eventually run out of address space and future allocation attempts will fail (likely causing your application to terminate or crash, since if you are leaking memory, you probably aren't handling out of memory conditions very well either), or the OS will halt your process if it attempts to allocate too much memory.
Additionally, you have to remember that in C++, many objects have destructors: when you fail to destroy a dynamically allocated object, its destructor will not be called.
A memory leak is a situation where a program allocates dynamic memory and then loses all pointers to that memory, so it can neither address nor free it. The memory remains marked as allocated, so it will never be handed out again when more memory is requested by the program.
The program will exhaust this limited resource at some rate. Depending on the amount of memory and swap space, this causes either the program itself eventually getting a "can't allocate memory" indication, or the operating system running out of both physical memory and swap space so that any program may get a "can't allocate memory" indication. The latter can have serious consequences on some operating systems - we sometimes see Windows XP completely falling apart, with critical services malfunctioning severely, once extreme memory consumption in one program exhausts all memory. If that happens, the only way to fix the problem is to reboot the system.

Will malloc implementations return freed memory back to the system?

I have a long-living application with frequent memory allocation-deallocation. Will any malloc implementation return freed memory back to the system?
What is, in this respect, the behavior of:
ptmalloc 1, 2 (glibc default) or 3
dlmalloc
tcmalloc (google threaded malloc)
solaris 10-11 default malloc and mtmalloc
FreeBSD 8 default malloc (jemalloc)
Hoard malloc?
Update
If I have an application whose memory consumption can be very different in daytime and nighttime (e.g.), can I force any of these mallocs to return freed memory to the system?
Without such a return, freed memory will be swapped out and back in many times, even though it contains only garbage.
The following analysis applies only to glibc (based on the ptmalloc2 algorithm).
There are certain options that seem helpful to return the freed memory back to the system:
mallopt() (defined in malloc.h) provides an option to set the trim threshold via the M_TRIM_THRESHOLD parameter. This specifies the minimum amount of contiguous free memory (in bytes) that must accumulate at the top of the data segment before glibc invokes brk() to give memory back to the kernel.
The default value of M_TRIM_THRESHOLD on Linux is 128 KB; setting a smaller value might save space.
The same behavior can be achieved by setting the trim threshold in the environment variable MALLOC_TRIM_THRESHOLD_, with no source changes at all.
However, preliminary test programs using M_TRIM_THRESHOLD have shown that even though the memory allocated by malloc does return to the system, the remaining portion of the actual chunk of memory (the arena) initially requested via brk() tends to be retained.
It is possible to trim the memory arena and give any unused memory back to the system by calling malloc_trim(pad) (defined in malloc.h). This function resizes the data segment, leaving at least pad bytes at the end of it, and fails if less than one page worth of bytes can be freed. The segment size is always a multiple of one page, which is 4,096 bytes on i386.
This modified behavior of free() using malloc_trim could be implemented using the malloc hook functionality, without any source code changes to the core glibc library.
Another option is using the madvise() system call inside the free() implementation of glibc.
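To make the first two options concrete, here is a minimal sketch (mine, not from the original answer; the threshold value is arbitrary) that lowers M_TRIM_THRESHOLD and then explicitly trims:

#include <malloc.h>
#include <stdlib.h>

int main(void) {
    /* Give memory back to the kernel once more than 64 KiB of free
       space accumulates at the top of the heap. */
    mallopt(M_TRIM_THRESHOLD, 64 * 1024);

    /* Many small blocks land on the brk-based heap, not in mmap'ed chunks. */
    enum { N = 100000 };
    char *blocks[N];
    for (int i = 0; i < N; i++)
        blocks[i] = malloc(100);
    for (int i = 0; i < N; i++)
        free(blocks[i]);

    /* Explicitly hand back whatever free space is still held,
       keeping 0 bytes of padding at the top of the heap. */
    malloc_trim(0);
    return 0;
}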
Most implementations don't bother identifying those (relatively rare) cases where entire "blocks" (of whatever size suits the OS) have been freed and could be returned, but there are of course exceptions. For example, and I quote from the wikipedia page, in OpenBSD:
On a call to free, memory is released and unmapped from the process address space using munmap. This system is designed to improve security by taking advantage of the address space layout randomization and gap page features implemented as part of OpenBSD's mmap system call, and to detect use-after-free bugs: as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault and termination of the program.
Most systems are not as security-focused as OpenBSD, though.
Knowing this, when I'm coding a long-running system that has a known-to-be-transitory requirement for a large amount of memory, I always try to fork the process: the parent then just waits for results from the child [[typically on a pipe]], the child does the computation (including memory allocation), returns the results [[on said pipe]], then terminates. This way, my long-running process won't be uselessly hogging memory during the long times between occasional "spikes" in its demand for memory. Other alternative strategies include switching to a custom memory allocator for such special requirements (C++ makes it reasonably easy, though languages with virtual machines underneath such as Java and Python typically don't).
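A bare-bones sketch of that fork-and-pipe pattern (my own illustration, not from the original answer; error handling and the actual computation are placeholders):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>

int main(void) {
    int fds[2];
    if (pipe(fds) != 0) { perror("pipe"); return 1; }

    pid_t pid = fork();
    if (pid == 0) {                       /* child: do the memory-hungry work */
        close(fds[0]);
        size_t len = 256 * 1024 * 1024;   /* transient large allocation */
        char *scratch = malloc(len);
        memset(scratch, 0, len);          /* ... heavy computation here ... */
        long result = 42;                 /* placeholder result */
        write(fds[1], &result, sizeof result);
        _exit(0);                         /* all of the child's memory goes back to the OS */
    }

    close(fds[1]);                        /* parent: just wait for the answer */
    long result = 0;
    read(fds[0], &result, sizeof result);
    waitpid(pid, NULL, 0);
    printf("result = %ld\n", result);
    return 0;
}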
I had a similar problem in my app. After some investigation, I noticed that for some reason glibc does not return memory to the system when the allocated objects are small (in my case, less than 120 bytes).
Look at this code:
#include <list>
#include <malloc.h>

template<size_t s> class x { char buf[s]; };

int main(int argc, char** argv) {
    typedef x<100> X;
    std::list<X> lx;
    for(size_t i = 0; i < 500000; ++i) {
        lx.push_back(X());
    }
    lx.clear();
    malloc_stats();
    return 0;
}
Program output:
Arena 0:
system bytes = 64069632
in use bytes = 0
Total (incl. mmap):
system bytes = 64069632
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0
About 64 MB are not returned to the system. When I changed the typedef to typedef x<110> X;, the program output looks like this:
Arena 0:
system bytes = 135168
in use bytes = 0
Total (incl. mmap):
system bytes = 135168
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0
Almost all memory was freed. I also noticed that using malloc_trim(0) in either case released memory back to the system.
Here is output after adding malloc_trim to the code above:
Arena 0:
system bytes = 4096
in use bytes = 0
Total (incl. mmap):
system bytes = 4096
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0
I am dealing with the same problem as the OP. So far, it seems possible with tcmalloc. I found two solutions:
compile your program with tcmalloc linked in, then launch it as:
env TCMALLOC_RELEASE=100 ./my_pthread_soft
the documentation mentions that
Reasonable rates are in the range [0,10].
but 10 doesn't seem enough for me (i.e., I see no change).
find somewhere in your code where it would be interesting to release all the freed memory, and then add this code:
#include "google/malloc_extension_c.h" // C include
#include "google/malloc_extension.h" // C++ include
/* ... */
MallocExtension_ReleaseFreeMemory();
The second solution has been very effective in my case; the first would be nicer, but it isn't very successful, and it is complicated to find the right number, for example.
Of the ones you list, only Hoard will return memory to the system... but whether it can actually do that will depend a lot on your program's allocation behaviour.
The short answer: to force the malloc subsystem to return memory to the OS, use malloc_trim(). Otherwise, the behavior of returning memory is implementation dependent.
For all 'normal' mallocs, including the ones you've mentioned, memory is released to be reused by your process, but not back to the whole system. Releasing back to the whole system happens only when your process is finally terminated.
FreeBSD 12's malloc(3) uses jemalloc 5.1, which returns freed memory ("dirty pages") to the OS using madvise(...MADV_FREE).
Freed memory is only returned after a time delay controlled by opt.dirty_decay_ms and opt.muzzy_decay_ms; see the manual page and this issue on implementing decay-based unused dirty page purging for more details.
Earlier versions of FreeBSD shipped with older versions of jemalloc, which also returns freed memory, but uses a different algorithm to decide what to purge and when.
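As a practical note (my addition; my_app is a placeholder, and the exact option names should be checked against the jemalloc manual for your version), those decay times can typically be overridden at startup through jemalloc's MALLOC_CONF environment variable, e.g. to purge dirty pages immediately:
MALLOC_CONF="dirty_decay_ms:0,muzzy_decay_ms:0" ./my_app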

Can I run out of virtual memory on linux?

My application is similar to this hypothetical program:
char *p[1000];

for(;;) {
    for (int i = 0; i < 1000; i++) {
        p[i] = malloc(random_number_between_1000_and_100000());
        p[i][0] = 0; // update
    }
    for (int i = 0; i < 1000; i++) {
        free(p[i]);
    }
}
It has no memory leaks, but on my system the memory consumption (top, VSS column) grows without limit (e.g., to 300% of available physical memory). Is this normal?
Update: I use the memory for a while and then free it. Does this make a difference?
The behavior is normal. Quoting man 3 malloc:
BUGS
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.
You need to touch (read/write) the memory for the Linux kernel to actually reserve it.
Try to add
sbrk(-1);
at the end of each loop to see if it makes a difference.
free() only deallocates the memory, but it doesn't give it back to the OS.
The OS usually allocates all pages as copy-on-write clones of the "zero page", a fixed page filled with zeros. Reading from the pages will return 0 as expected. As long as you only read, all references go to the same physical page. Once you write a value, the COW is broken and a real, physical page frame is allocated for you. This means that as long as you don't write to the memory, you can keep allocating memory until the virtual address space runs out or your page tables fill up all available memory.
As long as you don't touch those allocated chunks, the system will not really allocate them for you.
However, you can run out of addressable space, which is a limit the OS imposes on processes and is not necessarily the maximum you can address with the system's pointer type.
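A small sketch (mine, not from the original answers) that makes this visible by comparing resident pages before and after touching an allocation, using /proc/self/statm on Linux:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print virtual size and resident set size, in pages, from /proc/self/statm. */
static void print_statm(const char *label) {
    FILE *f = fopen("/proc/self/statm", "r");
    long vsize = 0, rss = 0;
    if (f && fscanf(f, "%ld %ld", &vsize, &rss) == 2)
        printf("%s: vsize=%ld pages, rss=%ld pages\n", label, vsize, rss);
    if (f) fclose(f);
}

int main(void) {
    size_t len = 64 * 1024 * 1024;        /* 64 MiB */
    print_statm("before malloc");
    char *p = malloc(len);                /* reserves address space only */
    print_statm("after malloc ");
    memset(p, 1, len);                    /* touching the pages makes them resident */
    print_statm("after memset ");
    free(p);
    return 0;
}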

Resources