Virtual memory clarification - allocation of large contiguous memory - linux

I have an application where I have to allocate on Windows (using operator new) quite a large memory space (hundreds of MB). The application is 32 bit (we don't use 64 bit for now, even on 64 bit systems) and I enabled /LARGEADDRESSAWARE linker option to be able to use 4 GB of user space memory.
Question If I need to allocate, say 450 MB of contiguous memory does the virtual address space of the process need to have a contiguous large enough space and additionally the physical memory does not have to be fragmented on the system ? I ask this because I can make it so that my application reserves a large enough contiguous space but don't know if other applications on the system can affect me in this way. Does OS page tables need to translate contiguous virtual addresses seen by the application into contiguous physical addresses ?

If the memory is simply used in your software, then your 450MB allocation will only need a hole of 450MB in the virtual space. It can be satisfied with pages from every corner of the memory system [as long as there is at least 450MB available somewhere in the system - including swapspace].
Your system will get a little bit better performance if the OS is able to allocate the pages in contiguous blocks of 2MB a piece [using "large pages" of 2MB at a time]. But the system will fall back to individual 4KB pages if it needs to.
One of several benefits with a paged memory architecture is that any physical page can be placed at any virtual address. In some systems, for example Xen virutalization manager in Debug mode, pages are INTENTIONALLY allocated out of sequence, to make it easier to detect when the system makes assumptions about memory pages being contiguous.

You don't need to be concerned about contiguity of the physical memory. That's one thing that virtual to physical address translation helps you with. As long as you can reserve a chunk of the address space and back it with physical memory, wherever it happens to be, things are going to work.

Related

Find exact physical memory usage in Ubuntu/Linux

(I'm new to Linux)
Say I've 1300 MB memory, on a Ubuntu machine. OS and other default programs consumes 300 MB memory and 1000 MB is free for my own applications.
I installed my application and I could configure it to use 700 MB memory, when the application starts.
However I couldn't verify its actual memory usage. Even I disabled swap space.
The "VIRT" value shows a huge value and "RES", "SHR", "%MEM" shows very less value.
It is difficult to find actual physical memory usage, similar to "Resource monitor" in Windows, which will say my application is using 700 MB memory.
Is there any way to find actual physical memory in Ubuntu/Linux ?
TL;DR - Virtual memory is complicated.
The best measure of a Linux processes current usage of physical memory is RES.
The RES value represents the sum of all of the processes pages that are currently resident in physical memory. It includes resident code pages and resident data pages. It also includes shared pages (SHR) that are currently RAM resident, though these pages cannot be exclusively ascribed to >>this<< process.
The VIRT value is actually the sum of all notionally allocated pages for the process, and it includes pages that are currently RAM resident, pages that are currently swapped to disk.
See https://stackoverflow.com/a/56351211/1184752 for another explanation.
Note that RES is giving you (roughly) instantaneous RAM usage. That is what you asked about ...
The "actual" memory usage over time is more complicated because the OS's virtual memory subsystem is typically be swapping pages in and out according to demand. So, for example, some of your application's pages may not have been accesses recently, and the OS may then swap them out (to swap space) to free up RAM for other pages required by your application ... or something else.
The VIRT value while actually representing virtual address space, is a good approximation of total (virtual) memory usage. However, it may be an over-estimate:
Some pages in a processes address space are shared between multiple processes. This includes read-only code segments, pages shared between parent and child processes between vfork and exec, and shared memory segments created using mmap.
Some pages may be set to have illegal access (e.g. for stack red-zones) and may not be backed by either RAM or swap device pages.
Some pages of the address space in certain states may not have been committed to either RAM or disk yet ... depending on how the virtual memory system is implemented. (Consider the case where a process requests a huge memory segment and neither reads from it or writes to it. It is possible that the virtual memory implementation will not allocate RAM pages until the first read or write in the page. And if you use lazy swap reservation, swap pages not be committed either. But beware that you can get into trouble with lazy swap reservation.)
VIRT can also be under-estimate because the OS usually reserves swap space for all pages ... whether they are currently swapped in or swapped out. So if you count the RAM and swap versions of a given page as separate units of storage, VIRT usually underestimates the total storage used.
Finally, if your real goal is to limit your application to using at most
700 MB (of virtual address space) then you can use ulimit -v ... to do this. If the application tries to request memory beyond its limit, the request fails.

linux kernel and user address spaces

In 4GB RAM system running linux, 3gb is given to user-space and 1gb to kernel, does it mean that even if kernel is using 50MB and user space is running low, user cannot use kernel space? if no, why? why cannot linux map their pages to user space?
The 3/1 separation refers to VIRTUAL memory. The virtual memory, however, is sparse. Meaning that even though there is "on paper" 1 GB, in practice a LOT less than that is used. Whenever possible, the "virtual" memory is backed by physical pages (meaning, if your virtual memory footprint is 50MB, then you're using 50 MB of physical memory), up until the point where there is no more physical memory, in which case you either A) spill over to swap or B) the system encounters a low memory condition and frees memory the hard way - by killing processes.
It gets more complicated. Virtual memory is not really used (committed) until actually used. THis means when you allcoate memory, you get an "IOU" or "promise" for memory, but the memory only gets consumed when you actually use the memory, as in write some value to it. Overall, however, you are correct in that there is segregation - at the hardware level - between kernel and user mode. In other words, of the 4GB addressable (assuming 32bit), the top 1GB, even though it is in your address space, is not accessible to you, and in practice belongs to the kernel. (The limit of 4 GB stems from 32-bit pointers - for 64 bits, it's effectively 48, which means 256TB, btw, 128TB user, 128TB kernel). Further, this 1GB of your space that is the kernel's is identical in other processes, too. So it doesnt matter which process you are in, when you "call kernel", (i.e. a system call), you end up in the top 1GB, which is shared in between all processes.
Again, the key point is that the 1GB isn't REALLY used in full. The actual memory footprint of the kernel is a lot smaller - in the tens of MB. It's jsut that theoretically, the kernel can use UP to 1GB, but that is assuming it can be backed up either by RAM or (rarely) swap. You can look at /proc/meminfo. As for the answer above, about changing 3/1 - it actually CAN be changed (in Windows it's as easy as a kernel command line option in boot.ini, in Linux it requires recompilation).
The 3GB/1GB split in process space is fixed. There is no way to change it regardless of how much RAM is actually in use.

Can I allocate one large and guaranteed continued range physical memory (100MB)?

Can I allocate one large and guaranteed continued range physical memory (100 MB consecutive without breaks) on Linux, and if I can, then how can I do this?
It is necessary to mapping this a continuous block of memory through the PCI-Express BAR from one CPU1 to the other CPU2 located behind the PCIe Non-Transparent Bridge.
You don't allocate physical memory in user applications (physical memory only makes sense inside the kernel).
I don't understand if you are coding a kernel module or some Linux application (e.g. a numerical finite-element code=.
Inside applications, you can allocate virtual memory with e.g. mmap(2) (and then you can allocate a big contiguous segment of address space)
I guess that some GPU cards give access to a large amount of GPU memory thru mmap so I believe it is possible to do what you want.
You might be interested by numa(7) man page. Probably the numa(3) library should give you what you want. Did you consider also open MPI? See also msync(2) and mlock(2)
From user space -- there is no guarantee depends on you luck.
if you compile your driver into the kernel -- you can use the mmap and allocate the required amount of memory.
if it is required to use it as storage or some other work not specifically for a driver then you should set the memmap parameter in the boot command line.
e.g. memmap=200M$1700M
it will block 200 MB memory starting from the end of 1700M (address).
Later it can be used to as FS as well ;)

Virtual memory sections and memory mapping area

As process has virtual memory which is copied into RAM during run time. As given in the previous post.
Which part of process virtual memory layout does mmap() uses?
I have following doubles :
If memory mapping is inside unallocated memory and it is inside process's virtual memory. As virtual memory helps to avoid one process to touch other process's virtual memory. Then how can memory mapping is used for Interprocess Communication(IPC)?
In OS like Linux, whether has each individual process separate section of heap, stack and memory mapping or all processes have one common section for heap, stack and MMAP?
Example :
if there are P1,P2 and P3 processes are running on linux OS. will all have common table as given in picture or each individual task have separate table to each section.
In 32 bit system, 2^32=4 gigabytes of virtual memory is possible and 1G byte is reserved for kernel and 3 gigabytes for userspace applications. can each individual process have up to 3 gigabytes of virtual memory or sum of all userspace applications size could be 3 gigabytes (i.e virtual memory size of (P1+P2+P3)<=3 gigabytes)?
--
Learner
Using memory mapping for IPC works by mapping the same range of physical memory into two or more virtual address ranges in different processes. This works for communication because both processes are using the exact same memory cells (although they might "see" them differently, at different addresses). You change a value in one mapping, and it is instantly visible in the other mapping in a different process because it is the very same memory.
Every process has its own independent stack and heap. The OS does not care about that at all, it only cares about pages. The heap and the stack are things that are implemented by the application (via the runtime). When you call a function like malloc, the allocator in the runtime either returns a block that it already had reserved earlier or one that it has recylced (you called free earlier), or it asks the OS to reserve some more memory (sbrk or mmap). When you first access this memory, the OS sees a page fault and verifies that you are allowed to access this location (because you've reserved it) and then provides a valid page.
Every process can use (as in "reserve") the whole available address space (3GiB in your example). This does not interfere with any other process. Note that due to fragmentation and alignment, and because your executable and the stack take away a little bit, you will in practice not be able to allocate the full 3 GiB, but you can get close to it.
All processes together can use as much virtual memory as is available on the system (physical RAM plus swap space), but they can only use as much as there is physical memory available at the same time (minus a little bit for this and that, like unpageable kernel memory and such).

Where does virtual memory exist in linux?

As program is stored on flash/disk. For it execution, program is loaded into virtual memory and is mapped to RAM by virtual manager. During its execution process is in RAM. Then where does virtual memory exist (where it has all .text, .data, .stack, .heap)?
The virtual memory is a view of the RAM plus maybe some swap space provided by a virtual memory manager. Modern OSs have virtual memory managers and provide virtual memory to processes so that the executing program can behave as if it had a contiguous address space whose size is not limited by the actual RAM. The pages or blocks making up the virtual memory can be mapped anywhere in the RAM, so that contiguos virtual pages need to be stored in contiguos RAM areas. Or they can be swapped out to page space or swap space, waiting there until needed, whereupon they're read by the OS and mapped to some RAM page.
When you say
During its execution process is in RAM.
This is not entirely correct. Some or all memory pages that belong to the process may be swapped out, as explained.
One more word concerning the answers and comments that say that "virtual" means it doesn't exist. This makes no sense. On the contrary, according to Webster:
being such in essence or effect ...
Hence virtual memory is something (therefore, it exists!) that behaves as if it were memory.
Virtual memory is just like an illusion of RAM. It uses paging to acquire additional RAM that could be used by the processes in operating system.
Virtual memory means memory you can access with "normal" momory access methods, although it isn't clear where the data is actually stored.
It may be
actually in RAM
in a swap area
in another file (memory mapped file)
and access to it will be handled appropriately.
It is a layer of, well, virtualization so that you as a programmer don't have to worry about where the data is actually put.
The original purpose was mainly to be able to provide more memory to processes than we actually have and to extend it with means of swap space, but there are even more:
The OS is free to use the RAM for whatever it seems necessary, e. g. caching. Under some circumstances, it may be more effective to use RAM for cache than for holding parts of a program which hasn't been used for a long time.
Provide additional memory to a program when it requests it: if you call malloc(), the program's library may request the OS to provide a part of memory which can be attached seamlessly into the address space.
Avoid stack overflow: if the stack grows larger and larger, the respective memory section may be extended as well transparently so that the program won't have to worry about it.
A system can even do "overcommitment" of memory: if a process requests a large amount of memory, the OS may say "yes, ok", i. e. provide the memory to the program. That means in the first place "allow the program to access a certain address space area", but this address space is not immediately backed by memory. Only as soon as the program accesses this memory the mapping will be done, and if this cannot be fulfilled, the program is crashed by the Out of emory killer (at least, under Linux).
All this works by page-wise (1 page = 4 kiB) assignment of physical memory to a program, viewed via the program's address space, and this in the amount and frequency as it is needed.

Resources