Today I received a few alerts about swapping activity of 3000 KB/sec. This Linux box has very few processes running and a total of 32 GB of RAM. When I logged in and ran free, I did not see anything suspicious: the ratio of free to used memory in the -/+ buffers/cache row was high (25 GB free versus 5 GB used).
So I am wondering: what are the main causes of paging on a Linux system?
How does swappiness impact paging?
How long does a page stay in physical RAM before it is swapped out, and what controls this behavior on Linux?
Is it possible that, even when there is adequate free physical RAM, a process's memory access pattern spreads its data over so many pages that this alone causes paging?
For example, consider a 5 GB array that the program walks through in a loop, slowly enough that pages which have not been touched recently get swapped out. Again, keep in mind that even if the buffer is 5 GB, there could still be 20 GB of physical RAM available.
UPDATE:
The distribution is RHEL 6.3, kernel version 2.6.32-279.el6.x86_64.
(I'm new to Linux)
Say I have 1300 MB of memory on an Ubuntu machine. The OS and other default programs consume 300 MB, leaving 1000 MB free for my own applications.
I installed my application and configured it to use 700 MB of memory when it starts.
However, I couldn't verify its actual memory usage, even after disabling swap space.
In top, the "VIRT" column shows a huge value, while "RES", "SHR" and "%MEM" show very small values.
It is difficult to find the actual physical memory usage, the way Resource Monitor in Windows would tell me that my application is using 700 MB of memory.
Is there any way to find the actual physical memory usage of a process in Ubuntu/Linux?
TL;DR - Virtual memory is complicated.
The best measure of a Linux process's current usage of physical memory is RES.
The RES value represents the sum of all of the process's pages that are currently resident in physical memory. It includes resident code pages and resident data pages. It also includes shared pages (SHR) that are currently RAM resident, though these pages cannot be exclusively ascribed to this particular process.
The VIRT value is actually the sum of all notionally allocated pages for the process, and it includes both pages that are currently RAM resident and pages that are currently swapped out to disk.
See https://stackoverflow.com/a/56351211/1184752 for another explanation.
Note that RES is giving you (roughly) instantaneous RAM usage. That is what you asked about ...
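If you want to read these numbers programmatically rather than from top, the same figures are exposed in /proc/<pid>/status as VmSize (roughly VIRT) and VmRSS (roughly RES). A minimal C sketch, assuming you only care about those two fields:

    /* Print this process's VIRT (VmSize) and RES (VmRSS) by scanning
     * /proc/self/status. The kernel reports both values in kB. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *f = fopen("/proc/self/status", "r");
        if (!f) {
            perror("fopen");
            return 1;
        }

        char line[256];
        while (fgets(line, sizeof line, f)) {
            /* VmSize ~ VIRT in top, VmRSS ~ RES in top */
            if (strncmp(line, "VmSize:", 7) == 0 ||
                strncmp(line, "VmRSS:", 6) == 0)
                fputs(line, stdout);
        }
        fclose(f);
        return 0;
    }

Both values come from the same kernel counters that top reports, so comparing VmRSS against the 700 MB you expect is a reasonable sanity check.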
The "actual" memory usage over time is more complicated because the OS's virtual memory subsystem is typically be swapping pages in and out according to demand. So, for example, some of your application's pages may not have been accesses recently, and the OS may then swap them out (to swap space) to free up RAM for other pages required by your application ... or something else.
The VIRT value, while actually representing virtual address space, is a good approximation of total (virtual) memory usage. However, it may be an over-estimate:
Some pages in a process's address space are shared between multiple processes. This includes read-only code segments, pages shared between parent and child processes between vfork and exec, and shared memory segments created using mmap.
Some pages may be set to have illegal access (e.g. for stack red-zones) and may not be backed by either RAM or swap device pages.
Some pages of the address space in certain states may not have been committed to either RAM or disk yet ... depending on how the virtual memory system is implemented. (Consider the case where a process requests a huge memory segment and neither reads from it nor writes to it. It is possible that the virtual memory implementation will not allocate RAM pages until the first read or write to a page. And with lazy swap reservation, swap pages may not be committed either. But beware that you can get into trouble with lazy swap reservation.)
VIRT can also be an under-estimate, because the OS usually reserves swap space for all pages ... whether they are currently swapped in or swapped out. So if you count the RAM and swap versions of a given page as separate units of storage, VIRT usually underestimates the total storage used.
Finally, if your real goal is to limit your application to using at most 700 MB (of virtual address space), then you can use ulimit -v ... to do this. If the application tries to request memory beyond its limit, the request fails.
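If you need to apply such a limit from inside the program rather than from the shell, setrlimit(RLIMIT_AS, ...) is the mechanism that ulimit -v manipulates. A minimal sketch, assuming a 700 MB cap as in the question:

    /* Limit this process's virtual address space to ~700 MB, then try to
     * allocate past the limit. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    int main(void)
    {
        struct rlimit lim = { .rlim_cur = 700UL * 1024 * 1024,
                              .rlim_max = 700UL * 1024 * 1024 };
        if (setrlimit(RLIMIT_AS, &lim) != 0) {
            perror("setrlimit");
            return 1;
        }

        void *p = malloc(800UL * 1024 * 1024);   /* exceeds the 700 MB cap */
        if (p == NULL)
            puts("malloc failed: address-space limit exceeded");
        else
            puts("malloc unexpectedly succeeded");
        free(p);
        return 0;
    }

Note that RLIMIT_AS limits virtual address space, not resident memory, so it caps the same VIRT-style pages discussed above.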
I am trying to understand what the limits of ArangoDB are and what the ideal setup is. From what I have understood, ArangoDB stores all collection data in virtual memory, and ideally you want this to fit in RAM. If a collection grows and cannot fit in RAM, it will be swapped to disk.
So my first question: if my db grows, will I need to adjust the swap partition/file to accommodate it?
Since ArangoDB also syncs the data to disk, does this mean that the data will always be located both in RAM and on disk? So if I have a db that is 1.5 GB and my RAM is 1 GB, will I need at least 0.5 GB of swap space and 1.5 GB of regular disk space?
I am a bit confused about how ArangoDB uses virtual memory. Right now I have 7 collections that are practically empty. I have 1 GB of RAM and 1 GB of swap space.
The admin interface reports that ArangoDB is using 4.5 GB of virtual memory. How is this possible if the swap space is only 1 GB? It's currently using 80 MB of RAM. Shouldn't this be 224 MB if the journal size is 32 MB for each of the 7 collections?
What is the recommendation on the journal size vs collection size? Can this be dynamically adjusted as the collection grows?
What kind of performance can be expected if swap is used a lot and the disk is an SSD? Would the performance then be similar to a more traditional db such as MySQL?
ArangoDB stores all data in memory-mapped files.
Each collection can have 0 to n datafiles, with a default filesize of 32 MB each (note that this filesize can be adjusted globally or on a per-collection level). An empty collection (that never had any data) will not have a datafile. The first write to a collection will create the datafile, and whenever a datafile is full, a new one will be created automatically.
Collections allocate datafiles in chunks of 32 MB by default. If you have many small collections, this might waste some memory. If you have few but big collections, the potential waste (free space at the end of a datafile) probably doesn't matter too much.
Whenever any ArangoDB operation reads data from or writes data to a memory-mapped datafile, the operating system will first translate the file offset into a page number, because each datafile is implicitly split into pages of a specific size. The page size is platform-dependent, but let's assume pages are 4 KB. A datafile of the default size will then consist of 8192 pages.
After the OS has translated the file offset into a page number, it will make sure the data of the requested page is present in physical RAM. If the page is not yet in physical RAM, a page fault is triggered and the operating system loads the requested page from disk or swap into physical RAM. This eventually makes the complete page available in RAM, and any reads or writes to the page's data can happen after that.
All of this is done by the operating system's virtual memory manager. The operating system is free to map as many pages from a datafile into RAM as it thinks is good.
For example, when a memory-mapped file is accessed sequentially, the operating system will likely be clever and read-ahead many pages, so they are already in physical RAM when actually accessed.
The OS is also free to swap out some or all pages of a datafile. It will likely swap out pages if there is not enough physical RAM available to keep all pages from all datafiles in RAM at the same time. It may also swap out pages that haven't been used for a while, to make RAM available for other operations. It will likely use some LRU algorithm for this.
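The page-fault mechanism described above is easy to observe in a few lines of C: mapping a file gives you an address range immediately, but the data only reaches physical RAM when a page is first touched. A minimal sketch (the file name is just a placeholder, error handling trimmed):

    /* Map a file and touch one byte. mmap() only sets up the mapping;
     * the first access to a page triggers a page fault, at which point
     * the kernel reads that page of the file into physical RAM. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        int fd = open("datafile", O_RDONLY);     /* placeholder file name */
        if (fd < 0) { perror("open"); return 1; }

        struct stat st;
        fstat(fd, &st);

        char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Address space is reserved now (it counts towards VIRT), but no
         * data has been read yet. This access faults in exactly one page: */
        printf("first byte: %d\n", p[0]);

        munmap(p, st.st_size);
        close(fd);
        return 0;
    }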
How the virtual memory manager of an OS behaves exactly differs wildly across platforms and implementations. Most systems also allow configuring the VM subsystem; on Linux, for example, there are a number of tunables under /proc/sys/vm, such as vm.swappiness and vm.overcommit_memory.
It is therefore hard to tell how much physical memory ArangoDB will actually use for a given number of collections and their datafiles. If the collections aren't accessed at all, having the datafiles memory-mapped might use almost no RAM, as the OS has probably swapped the collections out fully or at least partially. If the collections are heavily in use, the OS will likely have their datafiles fully mapped into RAM. But in both cases the memory counts as memory-mapped. This is why you can have a much higher virtual memory usage than physical RAM.
As mentioned before, the OS has to do a lot of work when accessing pages that are not in RAM, and you want to avoid this if possible. If the total size of your frequently used collections exceeds the size of the physical RAM, the OS has no alternative but to swap pages out and in a lot when you access these collections. Using an SSD for the swap will likely be better than using a spinning HDD, but is still far slower than RAM access. Long story short: the data of your active collections (datafiles plus indexes) should fit into physical RAM if possible, or you will see a lot of disk activity.
Apart from that, ArangoDB does not only allocate virtual memory for the collection datafiles, but it also starts a few V8 threads (V8 is the JavaScript engine in ArangoDB) that also use virtual memory. This virtual memory is not file-backed.
In an empty ArangoDB instance, V8 accounts for most of the virtual memory usage. For example, on my 64-bit computer the V8 threads consume about 5 GB of virtual memory (while ArangoDB in total uses only 140 MB of RAM), whereas on my 32-bit computer with less RAM, the V8 threads use about 600 - 700 MB of virtual memory. In your case, with the 4.5 GB of virtual memory usage, I suspect V8 is the reason, too.
The virtual memory usage for the V8 threads obviously correlates with the number of V8 threads started. For example, increasing the value of the startup parameter --server.threads will start more threads and use more virtual memory for V8, and lowering the value will start fewer threads and use less virtual memory.
In a 4 GB RAM system running Linux, 3 GB is given to user space and 1 GB to the kernel. Does this mean that even if the kernel is using only 50 MB and user space is running low, user space cannot use the kernel's portion? If not, why? Why can't Linux map those pages into user space?
The 3/1 separation refers to VIRTUAL memory. Virtual memory, however, is sparse: even though there is 1 GB "on paper", in practice a LOT less than that is used. Whenever possible, the "virtual" memory is backed by physical pages (meaning that if your virtual memory footprint is 50 MB, you're using 50 MB of physical memory), up until the point where there is no more physical memory, at which point you either A) spill over to swap, or B) the system encounters a low-memory condition and frees memory the hard way, by killing processes.
It gets more complicated. Virtual memory is not really used (committed) until actually used. This means that when you allocate memory, you get an "IOU" or "promise" for memory, but the memory only gets consumed when you actually use it, as in writing some value to it. Overall, however, you are correct in that there is segregation, at the hardware level, between kernel and user mode. In other words, of the 4 GB addressable (assuming 32-bit), the top 1 GB, even though it is in your address space, is not accessible to you, and in practice belongs to the kernel. (The 4 GB limit stems from 32-bit pointers; for 64-bit it's effectively 48 bits of address space, which means 256 TB, split as 128 TB for user and 128 TB for the kernel.) Further, this 1 GB of your space that is the kernel's is identical in every other process, too. So it doesn't matter which process you are in: when you "call the kernel" (i.e. make a system call), you end up in the top 1 GB, which is shared between all processes.
Again, the key point is that the 1 GB isn't REALLY used in full. The actual memory footprint of the kernel is a lot smaller, in the tens of MB. It's just that theoretically the kernel can use UP to 1 GB, assuming it can be backed either by RAM or (rarely) by swap. You can look at /proc/meminfo. As for the answer above about changing the 3/1 split: it actually CAN be changed (in Windows it's as easy as a kernel command line option in boot.ini; in Linux it requires recompiling the kernel).
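To see the "IOU" behaviour described above in practice, you can allocate a large block and watch the resident set size grow only once the pages are actually written. A minimal sketch (the 256 MB size is arbitrary; on Linux, ru_maxrss is reported in kilobytes):

    /* malloc() hands out address space; physical pages are only committed
     * when the memory is first written. Compare the peak resident set size
     * before and after touching the allocation. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <sys/resource.h>

    static long maxrss_kb(void)
    {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_maxrss;                     /* kilobytes on Linux */
    }

    int main(void)
    {
        size_t size = 256UL * 1024 * 1024;       /* 256 MB, arbitrary */
        char *p = malloc(size);
        if (!p) { perror("malloc"); return 1; }

        printf("after malloc: %ld kB resident\n", maxrss_kb());
        memset(p, 1, size);                      /* touch every page */
        printf("after memset: %ld kB resident\n", maxrss_kb());

        free(p);
        return 0;
    }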
The 3GB/1GB split in process space is fixed. There is no way to change it regardless of how much RAM is actually in use.
I have a Linux system running entirely on rootfs (which, as I understand it, is an instance of ramfs). There's no hard disk and no swap. And I have a process that leaks memory continuously. The virtual memory eventually grows to 4 times the size of physical memory, as shown by top. I can't understand what's happening. rootfs is supposed to live in RAM only, right? If I have no disk to swap to, how does the virtual memory grow to 4 times the physical memory?
Not all allocated memory has to be backed by a block device; the glibc people consider this behavior a bug:
BUGS
By default, Linux follows an optimistic memory allocation
strategy. This means that when malloc() returns non-NULL
there is no guarantee that the memory really is available.
This is a really bad bug. In case it turns out that the
system is out of memory, one or more processes will be killed
by the infamous OOM killer. In case Linux is employed under
circumstances where it would be less desirable to suddenly
lose some randomly picked processes, and moreover the kernel
version is sufficiently recent, one can switch off this
overcommitting behavior using a command like:
# echo 2 > /proc/sys/vm/overcommit_memory
See also the kernel Documentation directory, files
vm/overcommit-accounting and sysctl/vm.txt.
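The effect described in that BUGS section is easy to reproduce: with the default overcommit policy, untouched allocations can push VIRT far beyond RAM plus swap, which is one way a leaking process can show a VIRT several times larger than physical RAM even with no swap at all. A rough sketch; how far it gets depends on your overcommit setting and any address-space limits:

    /* Under the default overcommit policy (overcommit_memory = 0), malloc()
     * can hand out far more address space than RAM + swap, as long as the
     * pages are never touched. Each successful call grows VIRT by 1 GB. */
    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        size_t chunk = 1UL << 30;                /* 1 GB per allocation */
        size_t total = 0;

        for (int i = 0; i < 16; i++) {           /* try up to 16 GB */
            if (malloc(chunk) == NULL)
                break;
            total += chunk;
        }
        printf("allocated (but never touched) %zu GB of address space\n",
               total >> 30);
        return 0;
    }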
Linux noob question:
If I have 500MB of RAM, and 500MB of swap space, can the OS and processes then use 1GB of memory?
In other words, is the total amount of memory available to programs and the OS the total of the physical memory size and swap size?
I'm trying to figure out which SNMP counters to query, but need to understand how Linux uses virtual memory a little better first.
Thanks
Actually, it IS essentially correct, but your "virtual" memory does NOT reside beside your "physical memory" (as Matthew Scharley stated).
Your "virtual memory" is an abstraction layer covering both "physical" (as in RAM) and "swap" (as in hard-disk, which is of course as much physical as RAM is) memory.
Virtual memory is in essence an abstraction layer. Your program always addresses a "virtual" address, which your OS translates to an address in RAM or on disk (which needs to be loaded into RAM first), depending on where the data resides. So your program never has to worry about lack of memory.
Nothing is ever quite so simple anymore...
Memory pages are lazily allocated. A process can malloc() a large quantity of memory and never use it. So on your 500MB_RAM + 500MB_SWAP system, I could -- at least in theory -- allocate 2 gig of memory off the heap and things will run merrily along until I try to use too much of that memory. (At which point whatever process couldn't acquire more memory pages gets nuked. Hopefully it's my process. But not always.)
Individual processes may be limited to 4 gig as a hard address limitation on 32-bit systems. Even when you have more than 4 gig of RAM on the machine and you're using that bizarre segmented 36-bit addressing scheme (PAE), individual processes are still limited to only 4 gigs. Some of that 4 gigs has to go to shared libraries and program code, so you're down to 2-3 gigs of stack+heap as an ADDRESSING limitation.
You can mmap files in, effectively giving you more memory; it basically acts as extra swap. I.e. rather than loading a program's binary code and data into memory and then swapping them out to the swapfile, the file is just mmapped. As needed, pages are swapped into RAM directly from the file.
You can get into some interesting stuff with sparse data and mmapped sparse files. I've seen X-windows claim enormous memory usage when in fact it was only using up a tiny bit.
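You can reproduce that yourself with a sparse file: extend an empty file to a large size and map it, and VIRT jumps while RES barely moves until pages are actually touched. A rough sketch (the 1 GB size and the file name are arbitrary):

    /* Create a 1 GB sparse file and map it. The mapping adds ~1 GB to VIRT,
     * but almost nothing to RES, because no page has been touched yet. */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        const off_t size = 1UL << 30;                         /* 1 GB, arbitrary */
        int fd = open("sparse.dat", O_RDWR | O_CREAT, 0600);  /* example name */
        if (fd < 0) { perror("open"); return 1; }
        ftruncate(fd, size);                     /* sparse: no blocks written */

        char *p = mmap(NULL, size, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        printf("mapped 1 GB at %p -- now compare VIRT and RES in top\n",
               (void *)p);
        pause();                                 /* leave time to inspect */
        return 0;
    }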
BTW: "free" might help you. As might "cat /proc/meminfo" or the Vm lines in /proc/$PID/status. (Especially VmData and VmStk.) Or perhaps "ps up $PID"
Although mostly it's true, it's not entirely correct. For a particular process, the environment you run it in may limit the memory available to your process. Check the output of ulimit -v as well.
Yes, this is essentially correct. The actual numbers might be (very) marginally lower, but for all intents and purposes, if you have x physical memory and y swap (what Linux calls virtual memory here), then you have x + y memory available to the operating system and any programs running under it.