Does Virtual Memory Really Exist? [closed] - linux

Does virtual memory exist somewhere in our computer system in reality (i.e., on the hard disk)?
If not, how is a mapping from virtual memory to the real data on the hard disk made when the data is not in main memory (i.e., a page fault occurs)? Is there any table that maintains the mapping from virtual memory to the data on the hard disk?

Memory is called virtual because a process sees its address space as one contiguous chunk of available memory, spanning the full width of the underlying address bus, say 4GB on a 32-bit system. So every single process has a 4GB address space, yet this memory is not fully backed by physical memory on a 1-to-1 basis. And even if you had 4GB of physical memory to back the 4GB address space of the process, where would the kernel and the other processes go? This memory has to be virtual.
Yes, tables maintain the process address space. To keep it simple: some of the pages are currently mapped to volatile physical memory, while others are not and are instead backed by a file on the HDD (swap space or a memory-mapped file). When a page fault occurs, the fault handler checks whether the page is mapped to physical memory (usually a bit inside the page's attributes), and if not, it fetches the page from the backing file on the HDD and evicts an old page from physical memory to make room for it.
Hope this helps.
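As an illustration (not from the original answer), here is a minimal sketch of how a 32-bit x86 virtual address is split into the indices those tables are keyed by; the address value is arbitrary:

    #include <stdio.h>
    #include <stdint.h>

    /* Split a 32-bit x86 virtual address (4 KiB pages, 2-level paging)
     * into the indices the MMU uses to walk the page tables. */
    int main(void)
    {
        uint32_t vaddr = 0xB7701234;                /* arbitrary example address */

        uint32_t pd_index = (vaddr >> 22) & 0x3FF;  /* top 10 bits: page directory */
        uint32_t pt_index = (vaddr >> 12) & 0x3FF;  /* next 10 bits: page table   */
        uint32_t offset   = vaddr & 0xFFF;          /* low 12 bits: byte in page  */

        printf("PD index %u, PT index %u, offset 0x%03x\n",
               pd_index, pt_index, offset);
        /* If the "present" bit of the chosen page-table entry is clear,
         * the access faults and the kernel fetches the page from the
         * backing store (swap or a mapped file). */
        return 0;
    }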

Yes, virtual memory really does exist, and yes, there is a table that maintains the mapping. Look up "page table" on Wikipedia, for instance. In fact, most of the virtual memory article will answer your question in full.

Most of your questions are answered by http://en.wikipedia.org/wiki/Virtual_memory.
A backing store must exist for virtual memory. This is usually a hard disk. Basically, it's some other device that is usually slower than RAM but much bigger in capacity.
When a page fault occurs, the page is obtained from the backing store.
The page table contains information on where in the backing store the page is to be found.
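If you want to see this from user space, a small Linux-specific sketch using mincore() can report, per page, whether a mapped page is currently resident or would have to come from the backing store on access; error handling is kept minimal here:

    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        size_t page = (size_t)sysconf(_SC_PAGESIZE);
        size_t len  = 8 * page;

        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        p[0] = 1;                 /* touch only the first page */

        unsigned char vec[8];
        if (mincore(p, len, vec) == 0)
            for (int i = 0; i < 8; i++)
                printf("page %d: %s\n", i,
                       (vec[i] & 1) ? "resident" : "not resident");

        munmap(p, len);
        return 0;
    }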

Short answer, no :) Virtual Memory is virtual!
Especially if you consider virtual memory as "the memory that can be addressed by a process". On 64-bit systems, the whole disk could hardly back the entire virtual memory. So "in reality", as you asked, I would say no.
Long(-ish) answer: virtual memory exists as a series of data structures in the kernel. They mostly keep track of which page/segment is currently reserved, allocated, mapped to a file, or mapped to physical memory.
Also, the answer is different if what you are looking at is "allocated virtual memory". This always exists in one form or another (usually as pages backed by hard-disk swap space).
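One way to observe those kernel data structures from user space (a minimal sketch, assuming a Linux /proc filesystem) is to dump /proc/self/maps, where each line is one virtual memory area the kernel tracks:

    /* Sketch: dump this process's virtual memory areas. Each line in
     * /proc/self/maps corresponds to a VMA the kernel tracks: its address
     * range, permissions, and the file (or anonymous region) behind it. */
    #include <stdio.h>

    int main(void)
    {
        FILE *f = fopen("/proc/self/maps", "r");
        if (!f) { perror("fopen"); return 1; }

        char line[512];
        while (fgets(line, sizeof line, f))
            fputs(line, stdout);

        fclose(f);
        return 0;
    }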

Yes, most used bytes of virtual memory exist somewhere. I say "most" because pages that map registers of some special hardware can have holes. But all memory allocated by your app exists either in RAM or on the hard disk.
The wikipedia article explains all the details: http://en.wikipedia.org/wiki/Virtual_memory

Related

How does the Linux kernel save task_struct in dynamic memory?

While reading Understanding the Linux Kernel, I came across this sentence:
process descriptors are stored in dynamic memory.
As far as I know, for a 32-bit computer system:
The kernel reserves almost 128MB at the top of its virtual address space to map dynamic (high-memory) physical addresses.
My question is: although high memory can reach all physical addresses, it can only map 128MB at most at once. The kernel data structures are so numerous that they could exceed 128MB. If the kernel wants to remap some of the high memory, the virtual addresses of some data structures saved there might become invalid. How can the kernel keep more than 128MB of kernel data structures in dynamic physical memory?
Although I have tried hard to express this clearly and obey this site's rules, there could still be something I got wrong. I'm very sorry if so.
What does "The kernel data structure is so much that it could exceed 128MB." mean? There is no "kernel data structure". There are things the kernel allocates, but they are few pages long tops. In particular there is no "single object" which would be > 128MB long.
If something is physically really big (say there is a file entirely read into RAM and it takes 512MB), the kernel just maps and unmaps physical pages as it needs them. In particular there is no need for the file to be mapped entirely at the same time and virtual addresses the parts get temporarily map into are meaningless.
Also note that today x86_64 provides a 128TB address space, so there are no shenaningans of the sort.
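For the 32-bit highmem case the question describes, the temporary map/unmap looks roughly like the kernel-side sketch below; this uses the classic kmap_atomic()/kunmap_atomic() pair (newer kernels prefer kmap_local_page()) and is only an illustrative fragment, not a complete module:

    /* Kernel-side sketch (32-bit highmem era): temporarily map a physical
     * page into the kernel's small virtual window, use it, then unmap so
     * the window can be reused for other pages. */
    #include <linux/highmem.h>
    #include <linux/mm.h>
    #include <linux/string.h>

    static void zero_one_page(struct page *page)
    {
        void *vaddr = kmap_atomic(page);   /* map into a per-CPU fixmap slot */
        memset(vaddr, 0, PAGE_SIZE);       /* touch the page via that mapping */
        kunmap_atomic(vaddr);              /* release the slot immediately    */
    }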

How does the kernel-side page cache virt <-> phys mapping interact with the TLB?

I am writing an application that makes heavy use of mmap, including from distinct processes (not concurrently, but serially). A big determinant of performance is how the TLB is managed on the user and kernel side for such mappings.
I understand reasonably well the user-visible aspects of the Linux page cache. I think this understanding extends to the userland performance impacts [1].
What I don't understand is how those same pages are mapped into kernel space, and how this interacts with the TLB (on x86-64). You can find lots of information on how this worked in the 32-bit x86 world [2], but I didn't dig up the answer for 64-bit.
So the two questions are (both interrelated and probably answered in one shot):
How is the page cache mapped [3] in kernel space on x86-64?
If you read() N pages from a file in some process, then read exactly those N pages again from another process on the same CPU, is it possible that all the kernel-side reads (during the kernel -> userspace copy of the contents) hit in the TLB? Note that this is (probably) a direct consequence of (1).
My overall goal here is to understand at a deep level the performance difference of one-off accessing of cached files via mmap or non-mmap calls such as read.
[1] For example, if you mmap a file into your process's virtual address space, you have effectively asked for your process page tables to contain a mapping from the returned/requested virtual address range to a physical range corresponding to the pages for that file in the page cache (even if they don't exist in the page cache yet). If MAP_POPULATE is specified, all the page table entries will actually be populated before the mmap call returns; if not, they will be populated as you fault in the associated pages (sometimes with optimizations such as fault-around).
[2] Basically (for 3:1 mappings anyway), Linux uses a single 1 GB page to map approximately the first 1 GB of physical RAM directly (and places it at the top 1 GB of virtual memory), which is the end of the story for machines with <= 1 GB RAM (the page cache necessarily goes in that 1 GB mapping, and hence a single 1 GB TLB entry covers everything). With more than 1 GB of RAM, the page cache is preferentially allocated from "HIGHMEM" - the region above 1 GB which isn't covered by the kernel's 1 GB mapping - so various temporary mapping strategies are used.
[3] By mapped I mean how the page tables are set up for its access, i.e. how the virtual <-> physical mapping works.
Due to the vast virtual address space compared to the physical RAM installed (128TB for the kernel), the common trick is to permanently map all of the RAM. This is known as the "direct map".
In principle it is possible that both the relevant TLB and cache entries survive the context switch and all the other code executed in between, but it is hard to say how likely this is in the real world.
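As a rough illustration of the two access paths being compared (a sketch only; the file path is an assumption, and timing and error handling are omitted):

    /* Sketch: read() copies cached pages kernel -> user through the direct
     * map; mmap() instead installs user page-table entries pointing at the
     * page-cache pages, so each first touch is a minor fault. */
    #define _DEFAULT_SOURCE
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static unsigned long sum_read(const char *path)
    {
        int fd = open(path, O_RDONLY);
        char buf[4096];
        ssize_t n;
        unsigned long sum = 0;
        while ((n = read(fd, buf, sizeof buf)) > 0)
            for (ssize_t i = 0; i < n; i++)
                sum += (unsigned char)buf[i];
        close(fd);
        return sum;
    }

    static unsigned long sum_mmap(const char *path)
    {
        int fd = open(path, O_RDONLY);
        struct stat st;
        fstat(fd, &st);
        unsigned char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
        unsigned long sum = 0;
        for (off_t i = 0; i < st.st_size; i++)
            sum += p[i];
        munmap(p, st.st_size);
        close(fd);
        return sum;
    }

    int main(void)
    {
        const char *path = "testfile.bin";  /* placeholder: assumed to exist and be cached */
        printf("read: %lu, mmap: %lu\n", sum_read(path), sum_mmap(path));
        return 0;
    }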

Virtual memory without any swap partition

There are a few other threads on this subject, but I couldn't find a clear answer.
On Linux, how can virtual memory work when there is no swap partition to page to, and even no secondary storage device (HDD, SSD, etc.)?
If I take my example: I'm running a custom distribution (from initramfs) on an embedded target which hasn't got any swap partition or secondary storage.
In top, I can clearly see that the running processes are consuming a lot more virtual address space (VIRT) than physical memory (RSS), e.g. 500MB vs 20MB.
Is the difference between VIRT and RSS just the memory allocated but never accessed (hence never mapped by the OS)? (memory over-commitment)
I thought virtual memory needed paging (not talking about swapping) to work, but I'm starting to believe that I was wrong (and that there is a lot of crap online about Linux memory management).
Does it mean that a Page Fault in such configuration will systematically invoke the oom-killer?
Cheers
Virtual Memory is just what the process sees in its memory space. This includes a lot of things:
Actual used RAM
Swapped memory
Memory mapped real files
Memory mapped devices
Copy-on-write anonymous mmaps used for large mallocs
Copy-on-write memory from a forked process
Shared memory
Loaded libraries shared between processes
Only swapped pages and mmapped pages from real files require hitting a disk on page fault.
If two processes share libc, they will immediately have VIRT > RSS without any overcommitment.
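A quick way to see VIRT rise without RSS, assuming a Linux /proc filesystem (sketch, minimal error handling):

    /* Sketch: reserve a large anonymous mapping (raises VIRT) without
     * touching it (RSS stays flat), then touch every page and re-check.
     * VmSize/VmRSS are read from /proc/self/status. */
    #define _DEFAULT_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    static void show_vm(const char *label)
    {
        FILE *f = fopen("/proc/self/status", "r");
        char line[256];
        printf("-- %s --\n", label);
        while (fgets(line, sizeof line, f))
            if (!strncmp(line, "VmSize", 6) || !strncmp(line, "VmRSS", 5))
                fputs(line, stdout);
        fclose(f);
    }

    int main(void)
    {
        show_vm("before");

        size_t len = 512UL * 1024 * 1024;   /* 512 MB, reserved but untouched */
        char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) return 1;
        show_vm("after mmap (VIRT up, RSS unchanged)");

        for (size_t i = 0; i < len; i += 4096) /* touch every page */
            p[i] = 1;
        show_vm("after touching (RSS up too)");

        munmap(p, len);
        return 0;
    }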
It sounds like you are suffering from the conflation of two distinct concepts: virtual memory and logical address translation.
In logical address translation (logical memory), the CPU presents to each process a unique linear address space. The operating system manages a set of page tables that translate logical addresses to physical memory.
Virtual memory is the process of simulating physical memory by using a secondary storage device. Virtual memory handles the situation where a logical address has no corresponding physical address.
Sadly, most processor documentation conflates those two terms.
Virtual memory requires secondary storage. Logical memory does not. Thus you can have logical memory translation when there is no secondary storage. Such translations can end up being called "virtual" when they are technically "logical".

In Linux, are physical memory pages belonging to the kernel data segment swappable or not?

I'm asking because I remember that all physical pages belonging to the kernel are pinned in memory and thus unswappable, as is said here: http://www.cse.psu.edu/~axs53/spring01/linux/memory.ppt
However, I'm reading a research paper and am confused because it says,
"(physical) pages frequently move between the kernel data segment and user space."
It also mentions that, in contrast, physical pages do not move between the kernel code segment and user space.
I think if a physical page sometimes belongs to the kernel data segment and sometimes belongs to user space, it must mean that physical pages belonging to the kernel data segment are swappable, which is against my current understanding.
So, are physical pages belonging to the kernel data segment swappable or unswappable?
P.S. The research paper is available here:
https://www.cs.cmu.edu/~arvinds/pubs/secvisor.pdf
Please search "move between" and you will find it.
P.S. again, a virtual memory area ranging from [3G + 896M] to 4G belongs to the kernel and is used for mapping physical pages in ZONE_HIGHMEM (x86 32-bit Linux, 3G + 1G setting). In such a case, the kernel may first map some virtual pages in the area to the physical pages that host the current process's page table, modify some page table entries, and unmap the virtual pages. This way, the physical pages may sometimes belong to the kernel and sometimes belong to user space, because they do not belong to the kernel after the unmapping and thus become swappable. Is this the reason?
tl;dr - the memory pools and swapping are different concepts. You can not make any deductions from one about the other.
kmalloc() and other kernel data allocations come from the slab/slub allocators, etc. - the same place the kernel gets memory for user space. Ergo, pages frequently move between the kernel data segment and user space. This is correct. It doesn't say anything about swapping. That is a separate issue, and you cannot deduce anything about it.
The kernel code is typically populated at boot and marked read-only and never changes after that. Ergo physical pages do not move between the kernel code segment and user space.
Why do you think that because something comes from the same pool, it is the same? Network sockets also come from the same memory pool. It is a separation of concerns. The linux-mm (memory management system) handles swap. A page can be pinned (unswappable). The check for static kernel memory (this may include .bss and .data) is a simple range check. That memory is normally pinned and marked unswappable at the linux-mm layer. User data (whose allocations come from the same pool) can be marked as swappable by linux-mm. For instance, even without swap, user-space text is still swappable because it is backed by an inode. Caching is much simpler for read-only data. If data is swapped, it is marked as such in the MMU tables, and a fault handler must distinguish between swap and a SIGBUS; this is part of linux-mm.
There are also versions of Linux with no-mm (no MMU), and these will never swap anything. In theory someone might be able to swap kernel data, but then why is it in the kernel? The Linux way would be to use a module and only load it as needed. Certainly, the linux-mm data is itself kernel data, and hopefully you can see the problem with swapping that.
The problem with conceptual questions like this,
It can differ with Linux versions.
It can differ with Linux configurations.
The advice can change as Linux evolves.
For certain, the linux-mm code can not be swappable, nor any interrupt handler. It is possible that at some point in time, kernel code and/or data could be swapped. I don't think that this is ever the current case outside of module loading/unloading (and it is rather pedantic/esoteric as to whether you call this swapping or not).
I think if a physical page sometimes belongs to the kernel data segment and sometimes belongs to user space, it must mean that physical pages belonging to the kernel data segment are swappable, which is against my current understanding.
There is no connection between swappable memory and page movement between user space and kernel space. Whether a page can be swapped or not depends entirely on whether it is pinned. Pinned pages are not swapped, so their mapping is considered permanent.
So, are physical pages belonging to the kernel data segment swappable or unswappable?
Usually, pages used by the kernel are pinned and so are not swappable.
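From user space, pinning can be requested explicitly with mlock(); the kernel pins its own pages by internal means, but the effect on swappability is the same idea (sketch):

    /* Sketch: mlock() asks the kernel to pin a range of this process's
     * pages in RAM so they are never swapped out; munlock() releases the
     * pin. Subject to RLIMIT_MEMLOCK. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 1 << 20;               /* 1 MB */
        char *buf = malloc(len);
        if (!buf) return 1;

        if (mlock(buf, len) != 0) {         /* pin: pages become unswappable */
            perror("mlock");
            return 1;
        }
        /* ... use buf knowing it cannot be paged out ... */
        munlock(buf, len);                  /* unpin */
        free(buf);
        return 0;
    }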
However, I'm reading a research paper and am confused because it says, "(physical) pages frequently move between the kernel data segment and user space."
Could you please give a link to this research paper?
As far as I know (just from UNIX lectures and labs at school), the pages for kernel space are allocated for the kernel with a simple, fixed mapping algorithm, and they are all pinned. After the kernel turns on paging mode (bit operations on CR0 and CR3 for x86), the first user-mode process appears, and the pages which have been allocated for the kernel are not in the set of pages available to user space.

Where does virtual memory exist in linux?

A program is stored on flash/disk. For its execution, the program is loaded into virtual memory and mapped to RAM by the virtual memory manager. During its execution the process is in RAM. So where does virtual memory exist (where does it hold all the .text, .data, .stack, .heap)?
Virtual memory is a view of the RAM, plus maybe some swap space, provided by a virtual memory manager. Modern OSs have virtual memory managers and provide virtual memory to processes, so that the executing program can behave as if it had a contiguous address space whose size is not limited by the actual RAM. The pages or blocks making up the virtual memory can be mapped anywhere in the RAM, so that contiguous virtual pages need not be stored in contiguous RAM areas. Or they can be swapped out to page space or swap space, waiting there until needed, whereupon they're read by the OS and mapped to some RAM page.
When you say
During its execution the process is in RAM.
This is not entirely correct. Some or all memory pages that belong to the process may be swapped out, as explained.
One more word concerning the answers and comments that say that "virtual" means it doesn't exist. This makes no sense. On the contrary, according to Webster:
being such in essence or effect ...
Hence virtual memory is something (therefore, it exists!) that behaves as if it were memory.
Virtual memory is like an illusion of RAM. It uses paging to give processes more usable address space than the physical RAM available to the operating system.
Virtual memory means memory you can access with "normal" memory access methods, although it isn't clear where the data is actually stored.
It may be
actually in RAM
in a swap area
in another file (memory mapped file)
and access to it will be handled appropriately.
It is a layer of, well, virtualization so that you as a programmer don't have to worry about where the data is actually put.
The original purpose was mainly to be able to provide more memory to processes than we actually have and to extend it by means of swap space, but there is even more to it:
The OS is free to use the RAM for whatever seems necessary, e.g. caching. Under some circumstances, it may be more effective to use RAM for cache than for holding parts of a program which haven't been used for a long time.
Provide additional memory to a program when it requests it: if you call malloc(), the program's library may ask the OS to provide a chunk of memory which can be attached seamlessly to the address space.
Avoid stack overflow: if the stack grows larger and larger, the respective memory section may be extended transparently as well, so that the program won't have to worry about it.
A system can even do "overcommitment" of memory: if a process requests a large amount of memory, the OS may say "yes, ok", i.e. provide the memory to the program. That means in the first place "allow the program to access a certain address space area", but this address space is not immediately backed by memory. Only as soon as the program accesses this memory is the mapping done, and if it cannot be fulfilled, the program is killed by the Out-of-memory (OOM) killer (at least under Linux).
All this works by page-wise (1 page = 4 KiB) assignment of physical memory to a program, viewed via the program's address space, in the amount and at the frequency it is needed.
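The page-wise, on-demand backing can be observed by counting minor faults around the first touch of freshly requested memory (sketch; the 64 MB size and 4 KiB page size are assumptions):

    /* Sketch: overcommitted memory is backed page by page on first touch.
     * ru_minflt (minor fault count) from getrusage() grows roughly by one
     * per page touched. */
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/resource.h>

    static long minor_faults(void)
    {
        struct rusage ru;
        getrusage(RUSAGE_SELF, &ru);
        return ru.ru_minflt;
    }

    int main(void)
    {
        size_t len = 64UL * 1024 * 1024;    /* 64 MB */
        char *p = malloc(len);
        if (!p) return 1;

        long before = minor_faults();
        for (size_t i = 0; i < len; i += 4096)
            p[i] = 1;                       /* first touch => minor fault */
        long after = minor_faults();

        printf("minor faults during touch: %ld (~%zu pages)\n",
               after - before, len / 4096);
        free(p);
        return 0;
    }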
