How does free calculate used memory?

How does free calculate used memory and why does it differ from what /proc reports?
# cat /proc/*/status | grep VmSize | awk '{sum += $2} END {print sum}'
281260
But free says:
# free
             total       used       free     shared    buffers     cached
Mem:        524288     326488     197800          0          0          0
Who is right? Is 281260 kB of memory used, or 326488 kB?

The title asks: "How does free calculate used memory?"
Answer: It asks the OS, which has to keep track of that to do its job.
More specifically, it asks the memory management subsystem. As sheepsimulator notes in the comments, the Linux kernel exposes all kinds of OS-maintained data in the /proc virtual filesystem, but every full-service OS has to keep track of that kind of data anyway, so it is a small matter to provide an API for free to use.
The question asks: "Why does this differ from adding up the VmSize reported for all processes?"
Answer: There are at least two things going on here:
Linux will promise memory to a program without actually allocating it. When you do char *p = new char[1024*1024*1024]; the kernel doesn't go out and get you a gigabyte right away. It just says "OK", and figures it will grab it when you start using it. Thus the need for the infamous OOM killer.
Dynamic libraries are shared, and a single page of real memory can be mapped into the virtual address space of more than one process.
Furthermore, your pass over the /proc filesystem is not atomic: processes start, exit, and change their memory use while you are summing.
The upshot is that the output of free more accurately reflects the use of physical memory on your machine at a given moment.
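A rough way to watch both effects is to compare each process's promised (VmSize) and resident (VmRSS) memory; here's a sketch, assuming the /proc/*/status files are readable:
for f in /proc/[0-9]*/status; do
    # VmSize = promised virtual memory, VmRSS = what is actually resident;
    # kernel threads have neither field, so the 'if (v)' guard skips them
    awk '/^VmSize:/ {v=$2} /^VmRSS:/ {r=$2}
         END {if (v) printf "%-20s VmSize=%8d kB VmRSS=%8d kB\n", FILENAME, v, r}' "$f"
done
The per-process gap between the two columns is exactly what inflates the VmSize sum from the question.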

By 'free' I assume you mean the standard Linux version, which usually comes from the procps suite of command-line tools. Different versions of free (such as the one from busybox) report different numbers.
The procps version of 'free' obtains information about the system memory by reading /proc/meminfo. There is also a syscall (sysinfo) for getting memory numbers from the kernel. It can be used if a system does not have the /proc filesystem, but that is rare outside of deeply embedded systems, and to my knowledge procps free does not use it.
The calculation for "used" memory is derived by taking the total memory, and subtracting free memory, cached memory, reclaimable slab memory and buffer memory. The formula, using the names from /proc/meminfo, is:
used = MemTotal - MemFree - Cached - SReclaimable - Buffers
Note that free does not reference the Vm* values for any individual processes. These are numbers for virtual memory usage, which likely do not match the physical memory usage for a process. The numbers that free reports are for physical memory.
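If you want to check that arithmetic against your own system, here is a quick sketch (field names exactly as in /proc/meminfo):
awk '/^MemTotal:/ {t=$2} /^MemFree:/ {f=$2} /^Buffers:/ {b=$2}
     /^Cached:/ {c=$2} /^SReclaimable:/ {s=$2}
     END {printf "used = %d kB\n", t - f - c - s - b}' /proc/meminfo
The result should match the 'used' column of free; note that older procps versions may omit SReclaimable from the subtraction.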

The result from 'free' is more likely accurate than adding up the virtual memory size of each process (which is just virtual memory after all; it might even add up to more memory than is physically present!)
/proc/meminfo will give you more of the details than 'free' does.

Related

RAM analysis on Linux

I want to get a map of the memory allocated in RAM on a running Linux system.
I am looking to find the utilization of memory at a given time, as well as the specifics of allocation in terms of user processes, kernel modules, and the kernel itself.
This is very, very hard to get right because of shared memory, caching and on-demand paging.
You can indeed use /proc/PID/maps as the previous answer stated to learn that glibc, as an example, occupies a certain range of virtual memory in a certain process, but part of that is shared between all processes mapping that library (the code section) and part isn't (the dynamic linker tables, for example). Part of this memory space might not be in RAM at all (paged to disk), and copy-on-write semantics mean that the memory picture a moment later might be very different.
Add to that the sophisticated use Linux makes of caching (the page and buffer caches being the major ones), parts of which can be evicted at the kernel's whim (cached IO buffers which are not dirty) but some of which cannot (e.g. tmpfs pages), and it gets really hairy really quickly.
In short, there is no one good answer for getting a true view of what uses RAM, and for what, in a Linux system. The best answer I know of is pagemap and its related tools; read all about it here: http://lwn.net/Articles/230975/
You can find it out by checking every process's memory mapping:
cat /proc/<PID>/maps
and for the overall memory state:
cat /proc/meminfo

Difference between memory usage in WHM/cPanel and Linux

If I go to WHM and see my server's memory usage, it says that only 16% of memory is in use.
But when I connect to the server using SSH and run "free -m", it shows that 80% is in use. Why is that? I want to know the exact memory usage of all running applications like MySQL, Apache, etc.
How do I view that?
Thanks
As they say, "It's Complicated".
Linux uses unused memory for disk buffering and caching. It speeds things up. But you may need to look at the -/+ buffers/cache line of free.
'ps' can show you, for any given process, or for all processes, the %cpu, %mem, cumulative cpu-time, rss (resident set size, the non-swapped physical memory that a process is using), size (very approximate amount of swap space that would be required if the process were to dirty all writable pages and then be swapped out), vsize (virtual memory usage of entire process (vm_lib + vm_exe + vm_data + vm_stack)), and much much more.
For any given process, you can cat /proc/$PID/status -- it's human readable -- and check out the VmSize, VmLck, VmRSS, VmData, VmStk, VmExe, VmLib, and VmPTE values, along with others...
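For example, a sketch where 1234 stands in for a real PID:
# Resident set and virtual size (both in kB), plus %mem, for one process
ps -o pid,rss,vsz,pmem,comm -p 1234
# The same counters straight from the kernel, human-readable
grep -E '^Vm(Size|Lck|RSS|Data|Stk|Exe|Lib|PTE)' /proc/1234/status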
But that's just for starters... Processes can allocate memory but not use it. (Memory can be allocated, but the memory pages are not created/issued until they're actually used. That whole on-demand thing.)
Processes can map in hardware space, showing up as using a large quantity of memory that's not actually coming from system RAM. (X servers are known to sometimes do this. It's some wonky stuff involving kernel drivers...)
There's the executable, which is usually a memory-mapped file. Meaning that parts that are swapped-in are taking up RAM, but when swapped out it never takes up swapfile space.
Processes can have other memory-mapped files...
There are shared libraries, where the same RAM pages are used by multiple programs concurrently.
So we have to ask, as far as memory goes, what exactly counts and what doesn't?

Calculating % memory used on Linux

Linux noob question:
If I have 500MB of RAM, and 500MB of swap space, can the OS and processes then use 1GB of memory?
In other words, is the total amount of memory available to programs and the OS the total of the physical memory size and swap size?
I'm trying to figure out which SNMP counters to query, but need to understand how Linux uses virtual memory a little better first.
Thanks
Actually, it IS essentially correct, but your "virtual" memory does NOT reside beside your "physical memory" (as Matthew Scharley stated).
Your "virtual memory" is an abstraction layer covering both "physical" (as in RAM) and "swap" (as in hard-disk, which is of course as much physical as RAM is) memory.
Virtual memory is in essence an abstraction layer. Your program always addresses a "virtual" address, which your OS translates to an address in RAM or on disk (which needs to be loaded into RAM first) depending on where the data resides. So your program never has to worry about lack of memory.
Nothing is ever quite so simple anymore...
Memory pages are lazily allocated. A process can malloc() a large quantity of memory and never use it. So on your 500MB_RAM + 500MB_SWAP system, I could -- at least in theory -- allocate 2 gig of memory off the heap and things will run merrily along until I try to use too much of that memory. (At which point whatever process couldn't acquire more memory pages gets nuked. Hopefully it's my process. But not always.)
Individual processes may be limited to 4 gig as a hard address limitation on 32-bit systems. Even when you have more than 4 gig of RAM on the machine, and you're using that bizarre segmented 36-bit addressing atrocity from hell, individual processes are still limited to only 4 gigs. Some of that 4 gigs has to go for shared libraries and program code. So you're down to 2-3 gigs of stack+heap as an ADDRESSING limitation.
You can mmap files in, effectively giving you more memory. It basically acts as extra swap. I.e. Rather than loading a program's binary code data into memory and then swapping it out to the swapfile, the file is just mmapped. As needed, pages are swapped into RAM directly from the file.
You can get into some interesting stuff with sparse data and mmapped sparse files. I've seen X-windows claim enormous memory usage when in fact it was only using up a tiny bit.
BTW: "free" might help you. As might "cat /proc/meminfo" or the Vm lines in /proc/$PID/status. (Especially VmData and VmStk.) Or perhaps "ps up $PID"
Although mostly it's true, it's not entirely correct. For a particular process, the environment you run it in may limit the memory available to your process. Check the output of ulimit -v as well.
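For instance, a sketch where ./myprog is a hypothetical binary:
# Cap the virtual address space at 1 GB (ulimit -v takes kB); the
# subshell keeps the limit from sticking to your interactive shell
( ulimit -v 1048576; ./myprog )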
Yes, this is essentially correct. The actual numbers might be (very) marginally lower, but for all intents and purposes, if you have x physical memory and y virtual memory (swap in linux), then you have x + y memory available to the operating system and any programs running underneath the OS.

How is Linux calculating MemFree

I am trying to understand my embedded linux memory usage.
By using the top utility and /proc/meminfo I can see how much virtual memory a process is using and how much physical memory is available to the system. But it would seem that, for any given process, the virtual memory can be very much higher than the physical memory used. As this is an embedded system, memory swapping is disabled (SwapTotal = 0).
How is Linux calculating the free physical memory? It doesn't seem to account for everything allocated in the virtual memory space.
MemFree in /proc/meminfo is a count of how many pages are free in the buddy allocator. The buddy allocator is the fundamental unit of physical memory allocation in the kernel; however, there are a lot of ways pages can be returned to the buddy allocator in times of need: for example, freeing empty SLABs, discarding cache/buffer RAM (even if this means invalidating PTEs in a running process), or, as a last resort, swapping things out.
In fact, MemFree is generally controlled to be only 5-10% of total physical RAM, with any extra free RAM being co-opted into cache as time goes on. As such, MemFree alone is a very incomplete view of the overall memory situation.
As for the virtual memory (VSIZE) of a given process, this refers to the sum total of the sizes of all mapped memory segments in the process's address space. However, not all of these will be physically present; some may only be paged in upon first access, and as such will not register as memory in use until actually used. The resident size (RSIZE) is a more accurate view, as it only counts pages that are mapped in right now, although even this can overcount if a given page is mapped at multiple virtual addresses (which is very common when you consider multiple processes: shared libraries have the same physical RAM mapped into all processes that are using that library).
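You can watch both halves of this directly; a sketch (MemAvailable requires a reasonably recent kernel):
# Free pages per allocation order, straight from the buddy allocator
cat /proc/buddyinfo
# MemFree vs. what the kernel thinks it could reclaim on demand
grep -E '^(MemFree|MemAvailable|Buffers|Cached|SReclaimable):' /proc/meminfo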
Try using htop. You will have to install it first (sudo apt-get install htop or yum install htop, or whatever your distribution uses).
It will show you a more accurate representation of memory usage.
Basically, it comes down to "buffers/cache".
free -m
Look at the free column in the buffers/cache row; this is a more accurate representation of what is actually available.
             total       used       free     shared    buffers     cached
Mem:          3770       3586        183          0        112       1498
-/+ buffers/cache:       1976       1793
Swap:         7624        750       6874
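The same figure can be recomputed straight from /proc/meminfo; here is a sketch of the arithmetic behind that row:
awk '/^MemFree:/ {f=$2} /^Buffers:/ {b=$2} /^Cached:/ {c=$2}
     END {printf "available = %d MB\n", (f + b + c) / 1024}' /proc/meminfo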

A way to determine a process's "real" memory usage, i.e. private dirty RSS?

Tools like 'ps' and 'top' report various kinds of memory usages, such as the VM size and the Resident Set Size. However, none of those are the "real" memory usage:
Program code is shared between multiple instances of the same program.
Shared library program code is shared between all processes that use that library.
Some apps fork off processes and share memory with them (e.g. via shared memory segments).
The virtual memory system makes the VM size report pretty much useless.
RSS is 0 when a process is swapped out, making it not very useful.
Etc etc.
I've found that the private dirty RSS, as reported by Linux, is the closest thing to the "real" memory usage. This can be obtained by summing all Private_Dirty values in /proc/somepid/smaps.
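For reference, that summing is a one-liner ('somepid' as above):
awk '/^Private_Dirty:/ {sum += $2} END {print sum " kB"}' /proc/somepid/smaps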
However, do other operating systems provide similar functionality? If not, what are the alternatives? In particular, I'm interested in FreeBSD and OS X.
On OS X, the Activity Monitor actually gives you a very good guess.
Private memory is for sure memory that is only used by your application. E.g. stack memory and all memory dynamically reserved using malloc() and comparable functions/methods (the alloc method in Objective-C) is private memory. If you fork, private memory will be shared with your child, but marked copy-on-write. That means as long as a page is not modified by either process (parent or child), it is shared between them. As soon as either process modifies any page, that page is copied before it is modified. Even while this memory is shared with fork children (and it can only be shared with fork children), it is still shown as "private" memory, because in the worst case every page of it will get modified (sooner or later) and then be private to each process again.
Shared memory is either memory that is currently shared (the same pages are visible in the virtual process space of different processes) or that is likely to become shared in the future (e.g. read-only memory, since there is no reason not to share read-only memory). At least that's how I read the source code of some command-line tools from Apple. So if you share memory between processes using mmap (or a comparable call that maps the same memory into multiple processes), this is shared memory. However, the executable code itself is also shared memory, since if another instance of your application is started, there is no reason why it may not share the code already loaded in memory (executable code pages are read-only by default, unless you are running your app in a debugger). Thus shared memory is really memory used by your application, just like private memory, but it might additionally be shared with another process (or it might not, but why would it not count towards your application if it were shared?)
Real memory is the amount of RAM currently "assigned" to your process, no matter whether private or shared. This can be exactly the sum of private and shared, but usually it is not. Your process might have more memory assigned to it than it currently needs (this speeds up requests for more memory in the future), but that is no loss to the system. If another process needs memory and no free memory is available, before the system starts swapping, it will take that extra memory away from your process and assign it to another process (which is a fast and painless operation); therefore your next malloc call might be somewhat slower. Real memory can also be smaller than private and shared memory; this is because when your process requests memory from the system, it only receives "virtual memory". This virtual memory is not linked to any real memory pages as long as you don't use it (so if you malloc 10 MB of memory and use only one byte of it, your process gets only a single page, 4096 bytes, of memory assigned; the rest is only assigned if you actually ever need it). Further, memory that is swapped may not count towards real memory either (not sure about this), but it will count towards shared and private memory.
Virtual memory is the sum of all address blocks that are considered valid in your app's process space. These addresses might be linked to physical memory (which is again private or shared), or they might not be, in which case they will be linked to physical memory as soon as you use the address. Accessing memory addresses outside of the known addresses will cause a SIGBUS and your app will crash. When memory is swapped, the virtual address space for this memory remains valid and accessing those addresses causes memory to be swapped back in.
Conclusion:
If your app does not explicitly or implicitly use shared memory, private memory is the amount of memory your app needs because of the stack size (or sizes if multithreaded) and because of the malloc() calls you made for dynamic memory. You don't have to care much about shared or real memory in that case.
If your app uses shared memory, and this includes a graphical UI, where memory is shared between your application and the WindowServer for example, then you might have a look at shared memory as well. A very high shared memory number may mean you have too many graphical resources loaded in memory at the moment.
Real memory is of little interest for app development. If it is bigger than the sum of shared and private, then this means nothing other than that the system is lazy about taking memory away from your process. If it is smaller, then your process has requested more memory than it actually needed, which is not bad either, since as long as you don't use all of the requested memory, you are not "stealing" memory from the system. If it is much smaller than the sum of shared and private, you might consider requesting less memory where possible, as you are over-requesting a bit (again, this is not bad, but it tells me that your code is not optimized for minimal memory usage, and if it is cross-platform, other platforms may not have such sophisticated memory handling; so you may prefer to alloc many small blocks instead of a few big ones, for example, or to free memory a lot sooner, and so on).
If you are still not happy with all that information, you can get even more information. Open a terminal and run:
sudo vmmap <pid>
where <pid> is the process ID of your process. This will show you statistics for EVERY block of memory in your process space, with start and end addresses. It will also tell you where this memory came from (a mapped file? stack memory? malloc'ed memory? a __DATA or __TEXT section of your executable?), how big it is in KB, the access rights, and whether it is private, shared or copy-on-write. If it is mapped from a file, it will even give you the path to the file.
If you want only "actual" RAM usage, use
sudo vmmap -resident <pid>
Now it will show, for every memory block, how big the block is virtually and how much of it is actually present in physical memory right now.
At the end of each dump is also an overview table with the sums of different memory types. This table looks like this for Firefox right now on my system:
REGION TYPE          [ VIRTUAL/RESIDENT]
===========          [ =======/========]
ATS (font support)   [   33.8M/   2496K]
CG backing stores    [   5588K/   5460K]
CG image             [     20K/     20K]
CG raster data       [    576K/    576K]
CG shared images     [   2572K/   2404K]
Carbon               [   1516K/   1516K]
CoreGraphics         [      8K/      8K]
IOKit                [  256.0M/      0K]
MALLOC               [  256.9M/  247.2M]
Memory tag=240       [      4K/      4K]
Memory tag=242       [     12K/     12K]
Memory tag=243       [      8K/      8K]
Memory tag=249       [    156K/     76K]
STACK GUARD          [  101.2M/   9908K]
Stack                [   14.0M/    248K]
VM_ALLOCATE          [   25.9M/   25.6M]
__DATA               [   6752K/   3808K]
__DATA/__OBJC        [     28K/     28K]
__IMAGE              [   1240K/    112K]
__IMPORT             [    104K/    104K]
__LINKEDIT           [   30.7M/   3184K]
__OBJC               [   1388K/   1336K]
__OBJC/__DATA        [     72K/     72K]
__PAGEZERO           [      4K/      0K]
__TEXT               [  108.6M/   63.5M]
__UNICODE            [    536K/    512K]
mapped file          [  118.8M/   50.8M]
shared memory        [    300K/    276K]
shared pmap          [   6396K/   3120K]
What does this tell us? E.g. the Firefox binary and all the libraries it loads have 108 MB of data together in their __TEXT sections, but only 63 MB of those are currently resident in memory. The font support (ATS) needs 33 MB, but only about 2.5 MB are really in memory. It uses a bit over 5 MB of CG backing stores (CG = Core Graphics); those are most likely window contents, buttons, images and other data that is cached for fast drawing. It has requested 256 MB via malloc calls, and currently 247 MB of that are really mapped to memory pages. It has 14 MB reserved for stacks, but only 248 KB of stack space is really in use right now.
vmmap also has a good summary above the table
ReadOnly portion of Libraries: Total=139.3M resident=66.6M(48%) swapped_out_or_unallocated=72.7M(52%)
Writable regions: Total=595.4M written=201.8M(34%) resident=283.1M(48%) swapped_out=0K(0%) unallocated=312.3M(52%)
And this shows an interesting aspect of OS X: for read-only memory that comes from libraries, it plays no role whether it is swapped out or simply unallocated; there is only resident and not resident. For writable memory this makes a difference (in my case 52% of all requested memory has never been used and is thus unallocated, and 0% of memory has been swapped out to disk).
The reason for that is simple: Read-only memory from mapped files is not swapped. If the memory is needed by the system, the current pages are simply dropped from the process, as the memory is already "swapped". It consisted only of content mapped directly from files and this content can be remapped whenever needed, as the files are still there. That way this memory won't waste space in the swap file either. Only writable memory must first be swapped to file before it is dropped, as its content wasn't stored on disk before.
On Linux, you may want the PSS (proportional set size) numbers in /proc/self/smaps. A mapping's PSS is its RSS divided by the number of processes which are using that mapping.
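Summing PSS has the same shape as summing Private_Dirty; a sketch, where $$ is the current shell's PID:
awk '/^Pss:/ {sum += $2} END {print sum " kB"}' /proc/$$/smaps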
Top knows how to do this. It shows VIRT, RES and SHR by default on Debian Linux. VIRT = SWAP + RES. RES = CODE + DATA. SHR is the memory that may be shared with another process (a shared library or other memory).
Also, 'dirty' memory is merely RES memory that has been used and/or has not been swapped.
It can be hard to tell, but the best way to understand is to look at a system that isn't swapping. Then RES - SHR is the process-exclusive memory. However, that's not a perfect way of looking at it, because you don't know that the memory in SHR is being used by another process; it may represent unwritten shared-object pages that are only used by this process.
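To eyeball those columns sorted by resident size, something like this should work (a sketch; -o needs a reasonably recent procps-ng top):
top -b -n 1 -o RES | head -15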
You really can't.
I mean, shared memory between processes... are you going to count it, or not? If you don't count it, you are wrong: the sum of all processes' memory usage is not going to be the total memory usage. If you count it, you are going to count it twice: the sum isn't going to be correct.
Me, I'm happy with RSS. And knowing you can't really rely on it completely...
You can get private dirty and private clean RSS from /proc/pid/smaps
Take a look at smem. It will give you PSS information
http://www.selenic.com/smem/
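Typical usage, assuming smem is installed and going by the documentation at that URL, would be something like:
# Per-process USS/PSS/RSS, largest PSS first
smem -s pss -r | head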
Reworked this to be much cleaner, to demonstrate some proper best practices in bash, and in particular to use awk instead of bc.
# Walk every /proc/<pid> directory and total the Private_Dirty fields of
# its smaps; print one "pid: bytes" line per process
find /proc/ -maxdepth 1 -name '[0-9]*' -print0 | while read -r -d $'\0' pidpath; do
    [ -f "${pidpath}/smaps" ] || continue
    # Convert each Private_Dirty line to bytes according to its unit suffix
    awk '!/^Private_Dirty:/ {next}
        $3=="kB" {pd += $2 * (1024^1); next}
        $3=="mB" {pd += $2 * (1024^2); next}
        $3=="gB" {pd += $2 * (1024^3); next}
        $3=="tB" {pd += $2 * (1024^4); next}
        $3=="pB" {pd += $2 * (1024^5); next}
        {print "ERROR!! "$0 > "/dev/stderr"; exit(1)}
        END {printf("%10d: %d\n", '"${pidpath##*/}"', pd)}' "${pidpath}/smaps" || break
done
On a handy little container on my machine, with | sort -n -k 2 to sort the output, this looks like:
56: 106496
1: 147456
55: 155648
Use the mincore(2) system call. Quoting the man page:
DESCRIPTION
     The mincore() system call determines whether each of the pages in the
     region beginning at addr and continuing for len bytes is resident.  The
     status is returned in the vec array, one character per page.  Each
     character is either 0 if the page is not resident, or a combination of
     the following flags (defined in <sys/mman.h>):
For a question that mentions FreeBSD, I'm surprised no one has written this yet:
If you want Linux-style /proc/PROCESSID/status output, do the following:
mount -t linprocfs none /proc
cat /proc/PROCESSID/status
At least in FreeBSD 7.0, the mounting was not done by default (7.0 is a much older release, but for something this basic, the answer was hidden in a mailing list!)
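To make the mount survive a reboot, an /etc/fstab entry along these lines should work (a sketch based on the linprocfs documentation; mount point as in the commands above):
linproc  /proc  linprocfs  rw  0  0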
Check out the source code of gnome-system-monitor. It considers the memory "really used" by one process (info->mem) to be the sum of X server memory (info->memxserver) and writable memory (info->memwritable), where "writable memory" is the set of memory blocks marked "Private_Dirty" in the /proc/PID/smaps file.
On systems other than Linux, gnome-system-monitor computes writable memory differently, as the code shows.
static void
get_process_memory_writable (ProcInfo *info)
{
    glibtop_proc_map buf;
    glibtop_map_entry *maps;

    maps = glibtop_get_proc_map(&buf, info->pid);

    gulong memwritable = 0;
    const unsigned number = buf.number;

    for (unsigned i = 0; i < number; ++i) {
#ifdef __linux__
        memwritable += maps[i].private_dirty;
#else
        if (maps[i].perm & GLIBTOP_MAP_PERM_WRITE)
            memwritable += maps[i].size;
#endif
    }

    info->memwritable = memwritable;
    g_free(maps);
}

static void
get_process_memory_info (ProcInfo *info)
{
    glibtop_proc_mem procmem;
    WnckResourceUsage xresources;

    wnck_pid_read_resource_usage (gdk_screen_get_display (gdk_screen_get_default ()),
                                  info->pid,
                                  &xresources);

    glibtop_get_proc_mem(&procmem, info->pid);

    info->vmsize     = procmem.vsize;
    info->memres     = procmem.resident;
    info->memshared  = procmem.share;
    info->memxserver = xresources.total_bytes_estimate;

    get_process_memory_writable(info);

    // fake the smart memory column if writable is not available
    info->mem = info->memxserver + (info->memwritable ? info->memwritable : info->memres);
}
