Will malloc implementations return free'd memory back to the system?

I have a long-living application with frequent memory allocation-deallocation. Will any malloc implementation return freed memory back to the system?
What is, in this respect, the behavior of:
ptmalloc 1, 2 (glibc default) or 3
dlmalloc
tcmalloc (google threaded malloc)
solaris 10-11 default malloc and mtmalloc
FreeBSD 8 default malloc (jemalloc)
Hoard malloc?
Update
If I have an application whose memory consumption can be very different in daytime and nighttime (e.g.), can I force any of these mallocs to return freed memory to the system?
Without such a return, freed memory will be swapped out and in many times, even though it contains only garbage.

The following analysis applies only to glibc (based on the ptmalloc2 algorithm).
There are certain options that seem helpful for returning freed memory back to the system:
mallopt() (defined in malloc.h) provides an option to set the trim threshold via the M_TRIM_THRESHOLD parameter, which specifies the minimum amount of free memory (in bytes) that must accumulate at the top of the data segment before glibc invokes brk()/sbrk() to give memory back to the kernel.
The default value of M_TRIM_THRESHOLD on Linux is 128 KB; setting a smaller value might save space.
The same behavior can be achieved by setting the trim threshold in the environment variable MALLOC_TRIM_THRESHOLD_, without any source changes at all.
However, preliminary test programs run using M_TRIM_THRESHOLD have shown that even though the memory allocated by malloc does return to the system, the remaining portion of the actual chunk of memory (the arena) initially requested via brk() tends to be retained.
It is possible to trim the memory arena and give any unused memory back to the system by calling malloc_trim(pad) (defined in malloc.h). This function resizes the data segment, leaving at least pad bytes at the end of it, and fails if less than one page worth of bytes can be freed. The segment size is always a multiple of one page, which is 4,096 bytes on i386.
This modified behavior of free() (calling malloc_trim) could be implemented using the malloc hook functionality, which would not require any source code changes to the core glibc library.
Another option is to use the madvise() system call inside glibc's free() implementation.
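A minimal sketch of the first and third options above (glibc-specific; mallopt(), malloc_trim() and malloc_stats() are declared in malloc.h, and the 64 KB threshold and the block count are arbitrary demo values):

#include <malloc.h>
#include <stdlib.h>

enum { N = 100000 };

int main(void) {
    /* Trim more aggressively: return memory to the kernel whenever more than
       64 KB of free space accumulates at the top of the heap (default: 128 KB). */
    mallopt(M_TRIM_THRESHOLD, 64 * 1024);

    /* Small blocks are served from the brk heap, not from mmap. */
    static char *p[N];
    for (int i = 0; i < N; i++)
        p[i] = malloc(200);
    for (int i = 0; i < N; i++)
        free(p[i]);

    /* Explicitly give any unused heap space (with a zero-byte pad) back to the kernel. */
    malloc_trim(0);

    malloc_stats();   /* arena statistics are printed to stderr */
    return 0;
}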

Most implementations don't bother identifying those (relatively rare) cases where entire "blocks" (of whatever size suits the OS) have been freed and could be returned, but there are of course exceptions. For example, and I quote from the wikipedia page, in OpenBSD:
On a call to free, memory is released and unmapped from the process address space using munmap. This system is designed to improve security by taking advantage of the address space layout randomization and gap page features implemented as part of OpenBSD's mmap system call, and to detect use-after-free bugs: as a large memory allocation is completely unmapped after it is freed, further use causes a segmentation fault and termination of the program.
Most systems are not as security-focused as OpenBSD, though.
Knowing this, when I'm coding a long-running system that has a known-to-be-transitory requirement for a large amount of memory, I always try to fork the process: the parent then just waits for results from the child [[typically on a pipe]], the child does the computation (including memory allocation), returns the results [[on said pipe]], then terminates. This way, my long-running process won't be uselessly hogging memory during the long times between occasional "spikes" in its demand for memory. Other alternative strategies include switching to a custom memory allocator for such special requirements (C++ makes it reasonably easy, though languages with virtual machines underneath such as Java and Python typically don't).
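A minimal sketch of that fork-and-pipe pattern, assuming the transient computation can report its result as a single long written over the pipe (expensive_computation() is a hypothetical stand-in for the real work):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/wait.h>

/* Hypothetical memory-hungry computation; stands in for the real work. */
static long expensive_computation(void) {
    size_t n = 50 * 1024 * 1024;           /* demo value */
    char *buf = malloc(n);
    if (!buf) return -1;
    memset(buf, 1, n);
    long sum = 0;
    for (size_t i = 0; i < n; i += 4096)
        sum += buf[i];
    free(buf);                              /* freed here, but the whole child exits anyway */
    return sum;
}

int main(void) {
    int fd[2];
    if (pipe(fd) != 0) return 1;

    pid_t pid = fork();
    if (pid == 0) {                         /* child: allocate, compute, write result, exit */
        close(fd[0]);
        long result = expensive_computation();
        if (write(fd[1], &result, sizeof result) != sizeof result) _exit(1);
        _exit(0);                           /* all of the child's memory goes back to the OS */
    }

    close(fd[1]);                           /* parent: stays small, just reads the result */
    long result;
    if (read(fd[0], &result, sizeof result) != sizeof result) return 1;
    close(fd[0]);
    waitpid(pid, NULL, 0);
    printf("result = %ld\n", result);
    return 0;
}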

I had a similar problem in my app. After some investigation I noticed that, for some reason, glibc does not return memory to the system when the allocated objects are small (in my case, less than 120 bytes).
Look at this code:
#include <cstddef>
#include <list>
#include <malloc.h>

template<size_t s> class x { char buf[s]; };  // member renamed: it may not share the class's name

int main() {
    typedef x<100> X;
    std::list<X> lx;
    for (size_t i = 0; i < 500000; ++i) {
        lx.push_back(X());
    }
    lx.clear();
    malloc_stats();
    return 0;
}
Program output:
Arena 0:
system bytes = 64069632
in use bytes = 0
Total (incl. mmap):
system bytes = 64069632
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0
About 64 MB are not returned to the system. When I changed the typedef to typedef x<110> X;, the program output looks like this:
Arena 0:
system bytes = 135168
in use bytes = 0
Total (incl. mmap):
system bytes = 135168
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0
Almost all the memory was freed. I also noticed that calling malloc_trim(0) in either case released the memory back to the system.
Here is the output after adding malloc_trim(0) to the code above:
Arena 0:
system bytes = 4096
in use bytes = 0
Total (incl. mmap):
system bytes = 4096
in use bytes = 0
max mmap regions = 0
max mmap bytes = 0

I am dealing with the same problem as the OP. So far, it seems possible with tcmalloc. I found two solutions:
Compile your program with tcmalloc linked in, then launch it as:
env TCMALLOC_RELEASE=100 ./my_pthread_soft
the documentation mentions that
Reasonable rates are in the range [0,10].
but 10 doesn't seem to be enough for me (i.e. I see no change).
Find a place in your code where it would be useful to release all of the freed memory, and then add this code:
#include "google/malloc_extension_c.h" // C include
#include "google/malloc_extension.h" // C++ include
/* ... */
MallocExtension_ReleaseFreeMemory();
The second solution has been very effective in my case; the first would be more convenient, but it hasn't been very successful for me, and it is hard to find the right rate, for example.

Of the ones you list, only Hoard will return memory to the system... but whether it can actually do that will depend a lot on your program's allocation behaviour.

The short answer: to force the malloc subsystem to return memory to the OS, use malloc_trim(). Otherwise, the behavior of returning memory is implementation-dependent.

For all 'normal' mallocs, including the ones you've mentioned, memory is released to be reused by your process, but not back to the whole system. Releasing back to the whole system happens only when your process finally terminates.

FreeBSD 12's malloc(3) uses jemalloc 5.1, which returns freed memory ("dirty pages") to the OS using madvise(...MADV_FREE).
Freed memory is only returned after a time delay controlled by opt.dirty_decay_ms and opt.muzzy_decay_ms; see the manual page and this issue on implementing decay-based unused dirty page purging for more details.
Earlier versions of FreeBSD shipped with older versions of jemalloc, which also return freed memory but use a different algorithm to decide what to purge and when.

Related

Check if a System V shared memory segment is backed by huge pages or regular pages

To allocate a System V shared memory segment, one can use shmget() with the SHM_HUGETLB flag.
Is there a way to check whether a System V shared memory segment is backed by huge pages or regular pages, assuming that we do not know how the original creator of the memory segment used the shmget() system call?
Ok, I think I figured it out.
One method is to attach to the shared memory segment (or rely on a process which is already attached) and examine /proc/[PID]/smaps: find the shared memory segment of interest and look at the corresponding KernelPageSize field to see whether it matches the server's configured Hugepagesize.
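A minimal sketch of that method (error handling kept short; the shmid would come from ipcs -m or similar): attach to the segment, then scan /proc/self/smaps for the mapping that contains the attach address and print its KernelPageSize line:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(int argc, char **argv) {
    if (argc != 2) {
        fprintf(stderr, "usage: %s <shmid>\n", argv[0]);   /* shmid e.g. from ipcs -m */
        return 1;
    }
    int shmid = atoi(argv[1]);

    /* Attach read-only so the segment shows up in our own smaps. */
    void *addr = shmat(shmid, NULL, SHM_RDONLY);
    if (addr == (void *) -1) { perror("shmat"); return 1; }

    FILE *f = fopen("/proc/self/smaps", "r");
    if (!f) { perror("fopen"); return 1; }

    char line[512];
    unsigned long lo, hi;
    int in_region = 0;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "%lx-%lx ", &lo, &hi) == 2)        /* a mapping header line */
            in_region = ((unsigned long) addr >= lo && (unsigned long) addr < hi);
        else if (in_region && strncmp(line, "KernelPageSize:", 15) == 0) {
            printf("%s", line);  /* e.g. "KernelPageSize:     2048 kB" for 2 MB huge pages */
            break;
        }
    }
    fclose(f);
    shmdt(addr);
    return 0;
}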
I don't believe a System V shared memory segment influences the size of a page. That is a function of the OS and CPU configuration. Also see What is the size of shared memory page? and friends.
On Linux you can call getpagesize(2) to determine the page size:
#include <unistd.h>
int size = getpagesize();
You can also call sysconf(3):
#include <unistd.h>
long size = sysconf(_SC_PAGESIZE);
One thing though... glibc may not be able to determine the page size. You should check that size > 0 and that size is a power of 2; treat anything else as an error and use a default page size:
#include <unistd.h>
long size = sysconf(_SC_PAGESIZE);
if (size <= 0)
    size = 4096;
Though -1 indicates failure, I've had glibc return bogus values on PowerPC, like 0 instead of a failure, for a cache line size (the cache line size is either 64 or 128, never 0). Also see Bug 0014599, sysconf(_SC_LEVEL1_DCACHE_LINESIZE) returns 0 instead of 128.
Also see How can I calculate the size of shared memory available to the system, where another bogus value is returned on Red Hat systems.

Make calloc opportunistic

On Linux, malloc behaves opportunistically, only backing virtual memory with real memory when it is first accessed. Would it be possible to modify calloc so that it also behaves this way (allocating and zeroing pages when they are first accessed)?
It is not a feature of malloc() that makes it "opportunistic". It's a feature of the kernel with which malloc() has nothing to do whatsoever.
malloc() asks the kernel for a slab of memory every time it needs more memory to fulfill a request, and it's the kernel that says "Yeah, sure, you have it" every time without actually supplying memory. It is also the kernel that handles the subsequent page faults by supplying zeroed memory pages. Note that any memory the kernel supplies will already be zeroed out for security reasons, so it is equally well suited for malloc() and for calloc().
That is, unless the calloc() implementation spoils this by unconditionally zeroing out the pages itself (generating the page faults that prompt the kernel to actually supply memory), it will have the same "opportunistic" behavior as malloc().
Update:
On my system, the following program successfully allocates 1 TiB (!) on a system with only 2 GiB of memory:
#include <stdlib.h>
#include <stdio.h>

int main() {
    size_t allocationCount = 1024, successfulAllocations = 0;
    char* allocations[allocationCount];
    for (size_t i = allocationCount; i--; ) {
        if ((allocations[i] = calloc(1, 1024*1024*1024))) successfulAllocations++;
    }
    if (successfulAllocations == allocationCount) {
        printf("all %zu allocations were successful\n", successfulAllocations);
    } else {
        printf("there were %zu failed allocations\n", allocationCount - successfulAllocations);
    }
}
I think it's safe to say that at least the calloc() implementation on my box behaves "opportunistically".
From the related /proc/sys/vm/overcommit_memory section in proc:
The amount of memory presently allocated on the system. The committed memory is a sum of all of the memory which has been allocated by processes, even if it has not been "used" by them as of yet. A process which allocates 1GB of memory (using malloc(3) or similar), but only touches 300MB of that memory will only show up as using 300MB of memory even if it has the address space allocated for the entire 1GB. This 1GB is memory which has been "committed" to by the VM and can be used at any time by the allocating application. With strict overcommit enabled on the system (mode 2 /proc/sys/vm/overcommit_memory), allocations which would exceed the CommitLimit (detailed above) will not be permitted. This is useful if one needs to guarantee that processes will not fail due to lack of memory once that memory has been successfully allocated.
Though it is not explicitly said, I think "similar" here includes calloc and realloc. So calloc already behaves as opportunistically as malloc does.

libc memory management

How does libc communicate with the OS (e.g., a Linux kernel) to manage memory? Specifically, how does it allocate memory, and how does it release memory? Also, in what cases can it fail to allocate and deallocate, respectively?
That is a very general question, but I want to address the failure to allocate. It's important to realize that memory is actually allocated by the kernel upon first access. What you are doing when calling malloc/calloc/realloc is reserving some addresses inside the virtual address space of the process (libc does that via syscalls such as brk and mmap).
When I get malloc or similar to fail (or when libc gets brk or mmap to fail), it's usually because I have exhausted the virtual address space of the process. This happens when there is no contiguous block of free addresses and no room to expand an existing one. You can either exhaust all the available space or hit the RLIMIT_AS limit. It's pretty common especially on 32-bit systems when using multiple threads, because people sometimes forget that each thread needs its own stack. Stacks usually consume several megabytes, which means you can create only a few hundred threads before you run out of free address space. Maybe an even more common reason for exhausted address space is memory leaks. Libc of course tries to reuse space on the heap (space obtained by the brk syscall) and tries to munmap unneeded mappings, but it can't reuse something that has not been deallocated.
A shortage of physical memory is not detectable from within a process (or libc, which is part of the process) by a failure to allocate. Yes, you can hit the overcommit limit, but that doesn't mean physical memory is all taken. When free physical memory is low, the kernel invokes a special task called the OOM killer (Out Of Memory killer), which terminates some processes in order to free memory.
Regarding failure to deallocate, my guess is that it doesn't happen unless you do something silly. I can imagine setting the program break (the end of the heap) below its original position (via the brk syscall); that is, of course, a recipe for disaster. Hopefully libc won't do that, and it doesn't make much sense either, but it could be seen as a failed deallocation. munmap can also fail if you supply a silly argument, but I can't think of a regular reason for it to fail. That doesn't mean one doesn't exist; we would have to dig deep into the source code of glibc and the kernel to find out.
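A small sketch of the RLIMIT_AS case (the 256 MB cap and 512 MB request are arbitrary demo values): after setrlimit(), a large malloc() fails even though plenty of physical memory may still be free:

#include <stdio.h>
#include <stdlib.h>
#include <sys/resource.h>

int main(void) {
    /* Arbitrary demo value: cap the virtual address space at 256 MB. */
    struct rlimit rl = { 256UL * 1024 * 1024, 256UL * 1024 * 1024 };
    if (setrlimit(RLIMIT_AS, &rl) != 0) { perror("setrlimit"); return 1; }

    void *p = malloc(512UL * 1024 * 1024);   /* more than the limit allows */
    if (p == NULL)
        printf("malloc failed: address space limit reached\n");
    else
        free(p);
    return 0;
}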
1) how does it allocate memory
libc provides malloc() to C programs.
Normally, malloc allocates memory from the heap, and adjusts the size of the heap as required, using sbrk(2). When allocating blocks of memory larger than MMAP_THRESHOLD bytes, the glibc malloc() implementation allocates the memory as a private anonymous mapping using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable using mallopt(3). Allocations performed using mmap(2) are unaffected by the RLIMIT_DATA resource limit (see getrlimit(2)).
And this is about sbrk.
sbrk - change data segment size
2) in what cases can it fail to allocate
Also from malloc
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available.
And from proc
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
Mostly it uses the sbrk system call to adjust the size of the data segment, thereby reserving more memory for it to parcel out. Memory allocated that way is generally not released back to the operating system, because it can only be released when the blocks to be released are at the end of the data segment.
Larger blocks are sometimes allocated using mmap, and that memory can be released again with a munmap call.
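A small glibc-specific sketch of that difference (assuming the default M_MMAP_THRESHOLD of 128 kB; malloc_stats() is declared in malloc.h): the 1 MB block is served by mmap and is unmapped as soon as it is freed, while small brk-heap blocks linger until the allocator decides to trim:

#include <malloc.h>
#include <stdlib.h>
#include <stdio.h>

int main(void) {
    puts("--- after a 1 MB allocation (above MMAP_THRESHOLD, served by mmap) ---");
    char *big = malloc(1024 * 1024);
    malloc_stats();                /* "max mmap regions" / "max mmap bytes" reflect it */

    free(big);                     /* the mapping is munmap'ed right away */

    puts("--- after freeing it ---");
    malloc_stats();                /* mmap'ed bytes in use drop back to zero */
    return 0;
}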
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory?
Through system calls - this is a low-level API that the kernel provides.
Specifically, how does it allocate memory, and how does it release memory?
Unix-like systems provide the "sbrk" syscall.
Also, in what cases can it fail to allocate and deallocate, respectively?
Allocation can fail, for example, when there's not enough available memory. Deallocation should not fail.

Can I run out of virtual memory on linux?

My application is similar to this hypothetical program:
for (;;) {
    for (i = 0; i < 1000; i++) {
        p[i] = malloc(random_number_between_1000_and_100000());
        p[i][0] = 0; // update
    }
    for (i = 0; i < 1000; i++) {
        free(p[i]);
    }
}
It has no memory leaks, but on my system the memory consumption (top, column VSS) grows without limit (e.g. to 300% of the available physical memory). Is this normal?
Update: I use the memory for a while and then free it. Does this make a difference?
The behavior is normal. Quoting man 3 malloc:
BUGS
By default, Linux follows an optimistic memory allocation strategy. This means that when malloc() returns non-NULL there is no guarantee that the memory really is available. This is a really bad bug. In case it turns out that the system is out of memory, one or more processes will be killed by the infamous OOM killer. In case Linux is employed under circumstances where it would be less desirable to suddenly lose some randomly picked processes, and moreover the kernel version is sufficiently recent, one can switch off this overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files vm/overcommit-accounting and sysctl/vm.txt.
You need to touch (read/write) the memory for the Linux kernel to actually reserve it.
Try to add
sbrk(-1);
at the end of each loop to see if it makes a difference.
free() only deallocates the memory; it doesn't necessarily give it back to the OS.
The OS usually allocates all pages as copy-on-write clones of the "0" page, that is, a fixed page filled with zeros. Reading from the pages will return 0 as expected. As long as you only read, all references go to the same physical memory. Once you write a value, the COW is broken and a real, physical page frame is allocated for you. This means that as long as you don't write to the memory, you can keep allocating memory until the virtual memory address space runs out or your page tables fill up all available memory.
As long as you don't touch those allocated chunks the system will not really allocate them for you.
However, you can run out of addressable space, which is a limit the OS imposes on processes and is not necessarily the maximum you can address with the system's pointer type.
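A small Linux-specific sketch of that behavior (the 512 MB size is an arbitrary demo value; resident usage is read from the VmRSS line of /proc/self/status): the calloc barely changes RSS, while writing to the block does:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Print the VmRSS line from /proc/self/status. */
static void print_rss(const char *label) {
    FILE *f = fopen("/proc/self/status", "r");
    char line[256];
    while (f && fgets(line, sizeof line, f))
        if (strncmp(line, "VmRSS:", 6) == 0)
            printf("%s %s", label, line);
    if (f) fclose(f);
}

int main(void) {
    size_t n = 512UL * 1024 * 1024;      /* demo value */

    print_rss("start:       ");
    char *p = calloc(1, n);              /* address space reserved, pages not yet backed */
    if (!p) return 1;
    print_rss("after calloc:");
    memset(p, 1, n);                     /* touching the pages forces real page frames */
    print_rss("after memset:");
    free(p);
    return 0;
}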

A way to determine a process's "real" memory usage, i.e. private dirty RSS?

Tools like 'ps' and 'top' report various kinds of memory usages, such as the VM size and the Resident Set Size. However, none of those are the "real" memory usage:
Program code is shared between multiple instances of the same program.
Shared library program code is shared between all processes that use that library.
Some apps fork off processes and share memory with them (e.g. via shared memory segments).
The virtual memory system makes the VM size report pretty much useless.
RSS is 0 when a process is swapped out, making it not very useful.
Etc etc.
I've found that the private dirty RSS, as reported by Linux, is the closest thing to the "real" memory usage. This can be obtained by summing all Private_Dirty values in /proc/somepid/smaps.
However, do other operating systems provide similar functionality? If not, what are the alternatives? In particular, I'm interested in FreeBSD and OS X.
On OS X, Activity Monitor actually gives you a very good guess.
Private memory is for sure memory that is only used by your application. E.g. stack memory and all memory dynamically reserved using malloc() and comparable functions/methods (the alloc method for Objective-C) is private memory. If you fork, private memory will be shared with your child, but marked copy-on-write. That means as long as a page is not modified by either process (parent or child), it is shared between them. As soon as either process modifies any page, that page is copied before it is modified. Even while this memory is shared with fork children (and it can only be shared with fork children), it is still shown as "private" memory, because in the worst case every page of it will get modified (sooner or later) and then be private to each process again.
Shared memory is either memory that is currently shared (the same pages are visible in the virtual process space of different processes) or that is likely to become shared in the future (e.g. read-only memory, since there is no reason for not sharing read-only memory). At least that's how I read the source code of some command line tools from Apple. So if you share memory between processes using mmap (or a comparable call that maps the same memory into multiple processes), this would be shared memory. However the executable code itself is also shared memory, since if another instance of your application is started there is no reason why it may not share the code already loaded in memory (executable code pages are read-only by default, unless you are running your app in a debugger). Thus shared memory is really memory used by your application, just like private one, but it might additionally be shared with another process (or it might not, but why would it not count towards your application if it was shared?)
Real memory is the amount of RAM currently "assigned" to your process, no matter whether private or shared. This can be exactly the sum of private and shared, but usually it is not. Your process might have more memory assigned to it than it currently needs (this speeds up requests for more memory in the future), but that is no loss to the system. If another process needs memory and no free memory is available, the system will, before it starts swapping, take that extra memory away from your process and assign it to another process (which is a fast and painless operation); your next malloc call might therefore be somewhat slower. Real memory can also be smaller than private plus shared memory; this is because when your process requests memory from the system, it only receives "virtual memory". This virtual memory is not linked to any real memory pages as long as you don't use it (so if you malloc 10 MB of memory and use only one byte of it, your process will get only a single page, 4,096 bytes, of memory assigned; the rest is only assigned if you actually ever need it). Further, memory that is swapped out may not count towards real memory either (not sure about this), but it will count towards shared and private memory.
Virtual memory is the sum of all address blocks that are consider valid in your apps process space. These addresses might be linked to physical memory (that is again private or shared), or they might not, but in that case they will be linked to physical memory as soon as you use the address. Accessing memory addresses outside of the known addresses will cause a SIGBUS and your app will crash. When memory is swapped, the virtual address space for this memory remains valid and accessing those addresses causes memory to be swapped back in.
Conclusion:
If your app does not explicitly or implicitly use shared memory, private memory is the amount of memory your app needs because of the stack size (or sizes if multithreaded) and because of the malloc() calls you made for dynamic memory. You don't have to care a lot for shared or real memory in that case.
If your app uses shared memory, and this includes a graphical UI, where memory is shared between your application and the WindowServer for example, then you might have a look at shared memory as well. A very high shared memory number may mean you have too many graphical resources loaded in memory at the moment.
Real memory is of little interest for app development. If it is bigger than the sum of shared and private, then this means nothing other than that the system is lazy at taking memory away from your process. If it is smaller, then your process has requested more memory than it actually needed, which is not bad either, since as long as you don't use all of the requested memory, you are not "stealing" memory from the system. If it is much smaller than the sum of shared and private, you might consider requesting less memory where possible, as you are over-requesting memory a bit (again, this is not bad, but it tells me that your code is not optimized for minimal memory usage, and if it is cross-platform, other platforms may not have such sophisticated memory handling, so you may prefer to allocate many small blocks instead of a few big ones, for example, or free memory a lot sooner, and so on).
If you are still not happy with all that information, you can get even more information. Open a terminal and run:
sudo vmmap <pid>
where <pid> is the process ID of your process. This will show you statistics for EVERY block of memory in your process space, with start and end addresses. It will also tell you where this memory came from (a mapped file? stack memory? malloc'ed memory? a __DATA or __TEXT section of your executable?), how big it is in KB, the access rights, and whether it is private, shared or copy-on-write. If it is mapped from a file, it will even give you the path to the file.
If you want only "actual" RAM usage, use
sudo vmmap -resident <pid>
Now it will show for every memory block how big the memory block is virtually and how much of it is really currently present in physical memory.
At the end of each dump is also an overview table with the sums of different memory types. This table looks like this for Firefox right now on my system:
REGION TYPE [ VIRTUAL/RESIDENT]
=========== [ =======/========]
ATS (font support) [ 33.8M/ 2496K]
CG backing stores [ 5588K/ 5460K]
CG image [ 20K/ 20K]
CG raster data [ 576K/ 576K]
CG shared images [ 2572K/ 2404K]
Carbon [ 1516K/ 1516K]
CoreGraphics [ 8K/ 8K]
IOKit [ 256.0M/ 0K]
MALLOC [ 256.9M/ 247.2M]
Memory tag=240 [ 4K/ 4K]
Memory tag=242 [ 12K/ 12K]
Memory tag=243 [ 8K/ 8K]
Memory tag=249 [ 156K/ 76K]
STACK GUARD [ 101.2M/ 9908K]
Stack [ 14.0M/ 248K]
VM_ALLOCATE [ 25.9M/ 25.6M]
__DATA [ 6752K/ 3808K]
__DATA/__OBJC [ 28K/ 28K]
__IMAGE [ 1240K/ 112K]
__IMPORT [ 104K/ 104K]
__LINKEDIT [ 30.7M/ 3184K]
__OBJC [ 1388K/ 1336K]
__OBJC/__DATA [ 72K/ 72K]
__PAGEZERO [ 4K/ 0K]
__TEXT [ 108.6M/ 63.5M]
__UNICODE [ 536K/ 512K]
mapped file [ 118.8M/ 50.8M]
shared memory [ 300K/ 276K]
shared pmap [ 6396K/ 3120K]
What does this tell us? E.g. the Firefox binary and all the libraries it loads have 108 MB of data together in their __TEXT sections, but only 63 MB of those are currently resident in memory. The font support (ATS) needs 33 MB, but only about 2.5 MB are really in memory. It uses a bit over 5 MB of CG backing stores (CG = Core Graphics); those are most likely window contents, buttons, images and other data that is cached for fast drawing. It has requested 256 MB via malloc calls, and currently 247 MB of that is really mapped to memory pages. It has 14 MB of space reserved for stacks, but only 248 KB of stack space is really in use right now.
vmmap also has a good summary above the table
ReadOnly portion of Libraries: Total=139.3M resident=66.6M(48%) swapped_out_or_unallocated=72.7M(52%)
Writable regions: Total=595.4M written=201.8M(34%) resident=283.1M(48%) swapped_out=0K(0%) unallocated=312.3M(52%)
And this shows an interesting aspect of OS X: for read-only memory that comes from libraries, it plays no role whether it is swapped out or simply unallocated; there is only resident and not resident. For writable memory, this makes a difference (in my case, 52% of all requested memory has never been used and is thus unallocated, and 0% of memory has been swapped out to disk).
The reason for that is simple: Read-only memory from mapped files is not swapped. If the memory is needed by the system, the current pages are simply dropped from the process, as the memory is already "swapped". It consisted only of content mapped directly from files and this content can be remapped whenever needed, as the files are still there. That way this memory won't waste space in the swap file either. Only writable memory must first be swapped to file before it is dropped, as its content wasn't stored on disk before.
On Linux, you may want the PSS (proportional set size) numbers in /proc/self/smaps. A mapping's PSS is its RSS divided by the number of processes which are using that mapping.
Top knows how to do this. It shows VIRT, RES and SHR by default on Debian Linux. VIRT = SWAP + RES. RES = CODE + DATA. SHR is the memory that may be shared with another process (shared library or other memory.)
Also, 'dirty' memory is merely RES memory that has been used, and/or has not been swapped.
It can be hard to tell, but the best way to understand is to look at a system that isn't swapping. Then, RES - SHR is the process exclusive memory. However, that's not a good way of looking at it, because you don't know that the memory in SHR is being used by another process. It may represent unwritten shared object pages that are only used by the process.
You really can't.
I mean, shared memory between processes... are you going to count it, or not? If you don't count it, you are wrong; the sum of all processes' memory usage is not going to be the total memory usage. If you count it, you are going to count it twice: the sum is not going to be correct.
Me, I'm happy with RSS. And knowing you can't really rely on it completely...
You can get private dirty and private clean RSS from /proc/pid/smaps
Take a look at smem. It will give you PSS information
http://www.selenic.com/smem/
I reworked this to be much cleaner, to demonstrate some proper best practices in bash, and in particular to use awk instead of bc.
find /proc/ -maxdepth 1 -name '[0-9]*' -print0 | while read -r -d $'\0' pidpath; do
    [ -f "${pidpath}/smaps" ] || continue
    awk '!/^Private_Dirty:/ {next;}
        $3=="kB" {pd += $2 * (1024^1); next}
        $3=="mB" {pd += $2 * (1024^2); next}
        $3=="gB" {pd += $2 * (1024^3); next}
        $3=="tB" {pd += $2 * (1024^4); next}
        $3=="pB" {pd += $2 * (1024^5); next}
        {print "ERROR!! "$0 >"/dev/stderr"; exit(1)}
        END {printf("%10d: %d\n", '"${pidpath##*/}"', pd)}' "${pidpath}/smaps" || break
done
On a handy little container on my machine, with | sort -n -k 2 to sort the output, this looks like:
56: 106496
1: 147456
55: 155648
Use the mincore(2) system call. Quoting the man page:
DESCRIPTION
The mincore() system call determines whether each of the pages in the region beginning at addr and continuing for len bytes is resident. The status is returned in the vec array, one character per page. Each character is either 0 if the page is not resident, or a combination of the following flags (defined in <sys/mman.h>):
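A small sketch of how mincore() could be used to count resident pages in a mapping (the 64 MB size is an arbitrary demo value; note that the vec parameter is unsigned char * on Linux and char * on the BSDs):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/mman.h>

int main(void) {
    size_t len = 64UL * 1024 * 1024;                 /* demo value */
    long page = sysconf(_SC_PAGESIZE);
    size_t npages = (len + page - 1) / page;

    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    memset(p, 1, len / 2);                           /* touch only the first half */

    unsigned char *vec = malloc(npages);
    if (mincore(p, len, vec) != 0) { perror("mincore"); return 1; }

    size_t resident = 0;
    for (size_t i = 0; i < npages; i++)
        if (vec[i] & 1) resident++;                  /* low bit = page is resident */

    printf("%zu of %zu pages resident\n", resident, npages);
    free(vec);
    munmap(p, len);
    return 0;
}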
For a question that mentioned FreeBSD, I'm surprised no one has written this yet:
If you want Linux-style /proc/PROCESSID/status output, do the following:
mount -t linprocfs none /proc
cat /proc/PROCESSID/status
At least in FreeBSD 7.0, the mounting was not done by default (7.0 is a much older release, but for something this basic, the answer was hidden in a mailing list!).
Check it out: this is the source code of gnome-system-monitor. It considers the memory "really used" by one process (info->mem) to be the sum of the X Server Memory (info->memxserver) and the Writable Memory (info->memwritable); the "Writable Memory" is the set of memory blocks which are marked as Private_Dirty in the /proc/PID/smaps file.
On systems other than Linux the calculation may differ, as the gnome-system-monitor code shows.
static void
get_process_memory_writable (ProcInfo *info)
{
    glibtop_proc_map buf;
    glibtop_map_entry *maps;

    maps = glibtop_get_proc_map(&buf, info->pid);

    gulong memwritable = 0;
    const unsigned number = buf.number;

    for (unsigned i = 0; i < number; ++i) {
#ifdef __linux__
        memwritable += maps[i].private_dirty;
#else
        if (maps[i].perm & GLIBTOP_MAP_PERM_WRITE)
            memwritable += maps[i].size;
#endif
    }

    info->memwritable = memwritable;

    g_free(maps);
}

static void
get_process_memory_info (ProcInfo *info)
{
    glibtop_proc_mem procmem;
    WnckResourceUsage xresources;

    wnck_pid_read_resource_usage (gdk_screen_get_display (gdk_screen_get_default ()),
                                  info->pid,
                                  &xresources);

    glibtop_get_proc_mem(&procmem, info->pid);

    info->vmsize    = procmem.vsize;
    info->memres    = procmem.resident;
    info->memshared = procmem.share;

    info->memxserver = xresources.total_bytes_estimate;

    get_process_memory_writable(info);

    // fake the smart memory column if writable is not available
    info->mem = info->memxserver + (info->memwritable ? info->memwritable : info->memres);
}
