global or local linear address space in Linux? - linux

In linux, because the bases of segments are all 0, so the logical address coincide with the linear address (Book "Understanding the linux kernel"). I think the logical address of different process may be the same, so the linear address of different process may be the same and as each process view 4GB, each process will have its own linear address space (local address space). But some other articles says there is a large linear address space shared by all process, and the segment mechanism is used to map different process into different part of the linear address space. Sounds like a global linear address space with wider address bits. Where am I wrong? Or they are used in different architecture?

Each Linux process has its own address space; it is virtual memory. Different processes have different address spaces (but all the threads inside a process share the same address space).
You can get a map of process 1234 on Linux by reading /proc/1234/maps or from inside the process /proc/self/maps
Try the following commands
cat /proc/$$/maps
cat /proc/self/maps
and think about their output; the first command shows the memory map of your shell; the second one shows the memory map of the process running cat
The address space is set with execve(2) at program startup and changed with the mmap(2) and related syscalls.
An application interact with the kernel only thru syscalls. The kernel has a "different" address space, which you should not care about (unless you are coding inside the kernel).
Read also a good book like Advanced Unix Programming and/or Advanced Linux Programming
See also this explanation on syscalls.
Notice that segmented addressing is specific to i386 and is obsolete: most systems don't use it anymore. It has completely disappeared in 64 bits mode of x86-64. All Linux systems use a flat memory model
Please read carefully all the references.

Intel support 3 kinds of addresses:
logical address --(segment unit)---> linear address ---(paging unit)---> physical address
as you know, all kernel and user code access data or text thought virtual address (logical address in CPU). The address is translated into linear address as the following graph:
As linux implementation does not support the concept of linear addressing and the segments is only provided for permission control. Linux kernel configures each segment's offset value to zero. That is why you can't see the linear address in kernel and kernel directly use virtual address on paging units.
After getting the linear address, the MMU paging unit reference CR3 register to get base of paing table to generate physical address.
The same with cpu cache, the paging unit also has a TLB cache per CPU core to speed up the address translation that performed on memory.
Reference:
intel64 software developer's manual

Related

what is kernel mapping in linux?

what is kernel mapping? What are permanent mapping and temporary mapping. What is a window in this context? I went through code and explanation of this but could not understand this
I'm assuming you're talking about memory mapping in linux kernel.
Memory mapping is a process of mapping kernel address space directly to users process's address space.
Types of addresses :
User virtual address : These are the regular addresses seen by user-space programs
Physical addresses : The addresses used between the processor and the system’s memory.
Bus addresses : The addresses used between peripheral buses and memory. Often, they are the same as the physical addresses used by the processor, but that is not necessarily the case.
Kernel logical addresses : These make up the normal address space of the kernel.
Kernel virtual addresses : Kernel virtual addresses are similar to logical addresses in that they are a mapping from a kernel-space address to a physical address.
High and Low Memory :
Low memory : Memory for which logical addresses exist in kernel space. On almost every system you will likely encounter, all memory is low memory.
High memory : Memory for which logical addresses do not exist, because it is beyond the address range set aside for kernel virtual addresses.This means the kernel needs to start using temporary mappings of the pieces of physical memory that it wants to access.
Kernel splits virtual address into two part user address space and kernel address space. The kernel’s code and data structures must fit into that space, but the biggest consumer of kernel address space is virtual mappings for physical memory. Thus kernel needs its own virtual address for any memory it must touch directly. So, the maximum amount of physical memory that could be handled by the kernel was the amount that could be mapped into the kernel’s portion of the virtual address space, minus the space used by kernel code.
Temporary mapping : When a mapping must be created but the current context cannot sleep, the kernel provides temporary mappings (also called atomic mappings). The kernel can atomically map a high memory page into one of the reserved mappings (which can hold temporary mappings). Consequently, a temporary mapping can be used in places that cannot sleep, such as interrupt handlers, because obtaining the mapping never blocks.
Ref :
kernel.org/doc/Documentation/vm/highmem.txt
static.lwn.net/images/pdf/LDD3/ch15.pdf
man mmap
notes.shichao.io/lkd/ch12/
A full answer would be very long, for details refers (for example) to Linux Kernel Addressing or Understanding the Linux Kernel (pages 306-). These concepts are related to the way address spaces are organized in Linux. Firstly how kernel space is mapped into user space (kernel mapped onto user space simplifies the switching in between user and kernel mode) and, secondly the way physical memory is mapped onto kernel space (because kernel have to manage physical memory).
Beware that this is of no concern in modern 64bit architectures.

Which type of memory model (i.e. flat / segmentation) is used by linux kernel?

I am reading about x86 protected mode working, In that I have seen the flat memory model and segmentation memory model.
If linux kernel is using flat memory model then, How it protects the access of unprivileged applications to critical data?
Linux generally uses neither. On x86, Linux has separate page tables for userspace processes and the kernel. The userspace page tables do not contain user mappings to kernel memory, which makes it impossible for user-space processes to access kernel memory directly.
Technically, "virtual addresses" on x86 pass through segmentation first (and are converted from logical addresses to linear addresses) before being remapped from linear addresses to physical addresses through the page tables. Except in unusual cases, segmentation won't change the resulting physical address in 64 bit mode (segmentation is just used to store traits like the current privilege level, and enforce features like SMEP).
One well known "unusual case" is the implementation of Thread Local Storage by most compilers on x86, which uses the FS and GS segments to define per logical processor offsets into the address space. Other segments can not have non-zero bases, and therefore cannot shift addresses through segmentation.

How exactly do kernel virtual addresses get translated to physical RAM?

On the surface, this appears to be a silly question. Some patience please.. :-)
Am structuring this qs into 2 parts:
Part 1:
I fully understand that platform RAM is mapped into the kernel segment; esp on 64-bit systems this will work well. So each kernel virtual address is indeed just an offset from physical memory (DRAM).
Also, it's my understanding that as Linux is a modern virtual memory OS, (pretty much) all addresses are treated as virtual addresses and must "go" via hardware - the TLB/MMU - at runtime and then get translated by the TLB/MMU via kernel paging tables. Again, easy to understand for user-mode processes.
HOWEVER, what about kernel virtual addresses? For efficiency, would it not be simpler to direct-map these (and an identity mapping is indeed setup from PAGE_OFFSET onwards). But still, at runtime, the kernel virtual address must go via the TLB/MMU and get translated right??? Is this actually the case? Or is kernel virtual addr translation just an offset calculation?? (But how can that be, as we must go via hardware TLB/MMU?). As a simple example, lets consider:
char *kptr = kmalloc(1024, GFP_KERNEL);
Now kptr is a kernel virtual address.
I understand that virt_to_phys() can perform the offset calculation and return the physical DRAM address.
But, here's the Actual Question: it can't be done in this manner via software - that would be pathetically slow! So, back to my earlier point: it would have to be translated via hardware (TLB/MMU).
Is this actually the case??
Part 2:
Okay, lets say this is the case, and we do use paging in the kernel to do this, we must of course setup kernel paging tables; I understand it's rooted at swapper_pg_dir.
(I also understand that vmalloc() unlike kmalloc() is a special case- it's a pure virtual region that gets backed by physical frames only on page fault).
If (in Part 1) we do conclude that kernel virtual address translation is done via kernel paging tables, then how exactly does the kernel paging table (swapper_pg_dir) get "attached" or "mapped" to a user-mode process?? This should happen in the context-switch code? How? Where?
Eg.
On an x86_64, 2 processes A and B are alive, 1 cpu.
A is running, so it's higher-canonical addr
0xFFFF8000 00000000 through 0xFFFFFFFF FFFFFFFF "map" to the kernel segment, and it's lower-canonical addr
0x0 through 0x00007FFF FFFFFFFF map to it's private userspace.
Now, if we context-switch A->B, process B's lower-canonical region is unique But
it must "map" to the same kernel of course!
How exactly does this happen? How do we "auto" refer to the kernel paging table when
in kernel mode? Or is that a wrong statement?
Thanks for your patience, would really appreciate a well thought out answer!
First a bit of background.
This is an area where there is a lot of potential variation between
architectures, however the original poster has indicated he is mainly
interested in x86 and ARM, which share several characteristics:
no hardware segments or similar partitioning of the virtual address space (when used by Linux)
hardware page table walk
multiple page sizes
physically tagged caches (at least on modern ARMs)
So if we restrict ourselves to those systems it keeps things simpler.
Once the MMU is enabled, it is never normally turned off. So all CPU
addresses are virtual, and will be translated to physical addresses
using the MMU. The MMU will first look up the virtual address in the
TLB, and only if it doesn't find it in the TLB will it refer to the
page table - the TLB is a cache of the page table - and so we can
ignore the TLB for this discussion.
The page table
describes the entire virtual 32 or 64 bit address space, and includes
information like:
whether the virtual address is valid
which mode(s) the processor must be in for it to be valid
special attributes for things like memory mapped hardware registers
and the physical address to use
Linux divides the virtual address space into two: the lower portion is
used for user processes, and there is a different virtual to physical
mapping for each process. The upper portion is used for the kernel,
and the mapping is the same even when switching between different user
processes. This keep things simple, as an address is unambiguously in
user or kernel space, the page table doesn't need to be changed when
entering or leaving the kernel, and the kernel can simply dereference
pointers into user space for the
current user process. Typically on 32bit processors the split is 3G
user/1G kernel, although this can vary. Pages for the kernel portion
of the address space will be marked as accessible only when the processor
is in kernel mode to prevent them being accessible to user processes.
The portion of the kernel address space which is identity mapped to RAM
(kernel logical addresses) will be mapped using big pages when possible,
which may allow the page table to be smaller but more importantly
reduces the number of TLB misses.
When the kernel starts it creates a single page table for itself
(swapper_pg_dir) which just describes the kernel portion of the
virtual address space and with no mappings for the user portion of the
address space. Then every time a user process is created a new page
table will be generated for that process, the portion which describes
kernel memory will be the same in each of these page tables. This could be
done by copying all of the relevant portion of swapper_pg_dir, but
because page tables are normally a tree structures, the kernel is
frequently able to graft the portion of the tree which describes the
kernel address space from swapper_pg_dir into the page tables for each
user process by just copying a few entries in the upper layer of the
page table structure. As well as being more efficient in memory (and possibly
cache) usage, it makes it easier to keep the mappings consistent. This
is one of the reasons why the split between kernel and user virtual
address spaces can only occur at certain addresses.
To see how this is done for a particular architecture look at the
implementation of pgd_alloc(). For example ARM
(arch/arm/mm/pgd.c) uses:
pgd_t *pgd_alloc(struct mm_struct *mm)
{
...
init_pgd = pgd_offset_k(0);
memcpy(new_pgd + USER_PTRS_PER_PGD, init_pgd + USER_PTRS_PER_PGD,
(PTRS_PER_PGD - USER_PTRS_PER_PGD) * sizeof(pgd_t));
...
}
or
x86 (arch/x86/mm/pgtable.c) pgd_alloc() calls pgd_ctor():
static void pgd_ctor(struct mm_struct *mm, pgd_t *pgd)
{
/* If the pgd points to a shared pagetable level (either the
ptes in non-PAE, or shared PMD in PAE), then just copy the
references from swapper_pg_dir. */
...
clone_pgd_range(pgd + KERNEL_PGD_BOUNDARY,
swapper_pg_dir + KERNEL_PGD_BOUNDARY,
KERNEL_PGD_PTRS);
...
}
So, back to the original questions:
Part 1: Are kernel virtual addresses really translated by the TLB/MMU?
Yes.
Part 2: How is swapper_pg_dir "attached" to a user mode process.
All page tables (whether swapper_pg_dir or those for user processes)
have the same mappings for the portion used for kernel virtual
addresses. So as the kernel context switches between user processes,
changing the current page table, the mappings for the kernel portion
of the address space remain the same.
The kernel address space is mapped to a section of each process for example on 3:1 mapping after address 0xC0000000. If the user code try to access this address space it will generate a page fault and it is guarded by the kernel.
The kernel address space is divided into 2 parts, the logical address space and the virtual address space. It is defined by the constant VMALLOC_START. The CPU is using the MMU all the time, in user space and in kernel space (can't switch on/off).
The kernel virtual address space is mapped the same way as user space mapping. The logical address space is continuous and it is simple to translate it to physical so it can be done on demand using the MMU fault exception. That is the kernel is trying to access an address, the MMU generate fault , the fault handler map the page using macros __pa , __va and change the CPU pc register back to the previous instruction before the fault happened, now everything is ok. This process is actually platform dependent and in some hardware architectures it mapped the same way as user (because the kernel doesn't use a lot of memory).

Linux uses Paging or Segmentation or Both? [duplicate]

I'm reading "Understanding Linux Kernel". This is the snippet that explains how Linux uses Segmentation which I didn't understand.
Segmentation has been included in 80 x
86 microprocessors to encourage
programmers to split their
applications into logically related
entities, such as subroutines or
global and local data areas. However,
Linux uses segmentation in a very
limited way. In fact, segmentation
and paging are somewhat redundant,
because both can be used to separate
the physical address spaces of
processes: segmentation can assign a
different linear address space to each
process, while paging can map the same
linear address space into different
physical address spaces. Linux prefers
paging to segmentation for the
following reasons:
Memory management is simpler when all
processes use the same segment
register values that is, when they
share the same set of linear
addresses.
One of the design objectives of Linux
is portability to a wide range of
architectures; RISC architectures in
particular have limited support for
segmentation.
All Linux processes running in User
Mode use the same pair of segments to
address instructions and data. These
segments are called user code segment
and user data segment , respectively.
Similarly, all Linux processes running
in Kernel Mode use the same pair of
segments to address instructions and
data: they are called kernel code
segment and kernel data segment ,
respectively. Table 2-3 shows the
values of the Segment Descriptor
fields for these four crucial
segments.
I'm unable to understand 1st and last paragraph.
The 80x86 family of CPUs generate a real address by adding the contents of a CPU register called a segment register to that of the program counter. Thus by changing the segment register contents you can change the physical addresses that the program accesses. Paging does something similar by mapping the same virtual address to different real addresses. Linux using uses the latter - the segment registers for Linux processes will always have the same unchanging contents.
Segmentation and Paging are not at all redundant. The Linux OS fully incorporates demand paging, but it does not use memory segmentation. This gives all tasks a flat, linear, virtual address space of 32/64 bits.
Paging adds on another layer of abstraction to the memory address translation. With paging, linear memory addresses are mapped to pages of memory, instead of being translated directly to physical memory. Since pages can be swapped in and out of physical RAM, paging allows more memory to be allocated than what is physically available. Only pages that are being actively used need to be mapped into physical memory.
An alternative to page swapping is segment swapping, but it is generally much less efficient given that segments are usually larger than pages.
Segmentation of memory is a method of allocating multiple chunks of memory (per task) for different purposes and allowing those chunks to be protected from each other. In Linux a task's code, data, and stack sections are all mapped to a single segment of memory.
The 32-bit processors do not have a mode bit for disabling
segmentation, but the same effect can be achieved by mapping the
stack, code, and data spaces to the same range of linear addresses.
The 32-bit offsets used by 32-bit processor instructions can cover a
four-gigabyte linear address space.
Aditionally, the Intel documentation states:
A flat model without paging minimally requires a GDT with one code and
one data segment descriptor. A null descriptor in the first GDT entry
is also required. A flat model with paging may provide code and data
descriptors for supervisor mode and another set of code and data
descriptors for user mode
This is the reason for having a one pair of CS/DS for kernel privilege execution (ring 0), and one pair of CS/DS for user privilege execution (ring 3).
Summary: Segmentation provides a means to isolate and protect sections of memory. Paging provides a means to allocate more memory that what is physically available.
Windows uses the fs segment for local thread storage.
Therefore, wine has to use it, and the linux kernel needs to support it.
Modern operating systems (i.e. Linux, other Unixen, Windows NT, etc.) do not use the segmentation facility provided by the x86 processor. Instead, they use a flat 32 bit memory model. Each user mode process has it's own 32 bit virtual address space.
(Naturally the widths are expanded to 64 bits on x86_64 systems)
Intel first added segmentation on the 80286, and then paging on the 80386. Unix-like OSes typically use paging for virtual memory.
Anyway, since paging on x86 didn't support execute permissions until recently, OpenWall Linux used segmentation to provide non-executable stack regions, i.e. it set the code segment limit to a lower value than the other segment's limits, and did some emulation to support trampolines on the stack.

Processors and virtual/physical addresses

In nutshell, as I understand memory management, processor produces virtual addresses. These addresses are translated to corresponding physical addresses using per-process address table by MMU (with TLBs and page-faults in-between, as and when needed).
My question is does processor always produces Virtual addresses? In terms of Address-spaces(user/kernel), Processor modes (user/kernel) and contexts (process/system) when all times does processor produce physical addresses?
Memory typically knows nothing about virtual addresses or segments, which are CPU concepts, it is just memory, a collection of addressable and readable/writable bits. The processor talks to memory using physical addresses. Many simple processors (especially old ones or for special embedded uses) have no MMUs, virtual addresses or privileged modes. Those that have MMUs and virtual addresses normally start either with those disabled or they at first use fixed mapping because otherwise nothing would be able to work if there's no mapping at all.
So, physical addresses are always in use, while for virtual addresses it depends on the CPU and the software in use.
Processor is unaware about whether it is physical address or virtual address , it is the job of respective MMU to do the translation.
Processor has to place the address on it address bus , so now the path depends whether MMU is enabled or disabled. if MMU is enabled it will follow the path of MMU translation and respective physical address will get placed on address bus and if MMU is disabled the same address generated by respective instruction will be placed on address bus.
so it is responsibility of programmer if MMU is disabled all address access should be physical address otherwise it will be a exception or abort in system
Generally the CPU ONLY knows "virtual addresses". Ie, when you do any assembly programming, any "load regs, *(memory ptr)" or such-like operation, the addresses are virtual addresses.
The following diagram illustrates well the concept:
http://slideplayer.com/slide/4394245/ (page 7)
Any addresses coming out of CPU, is always virtual. But if the processor has a MMU, then the MMU will intercept the addresses and convert it to physical addresses (through Page Table mechanism) before putting it on the memory bus. So if you sniff the memory bus, you will see physical addresses.
References:
https://www.quora.com/What-is-the-return-address-of-kmalloc-Physical-or-Virtual
Yes when a system works on VM then instructions/programs produce only VA even at the time of booting, addresses generated are not the same as physical despite page tables are still not in existence.
But processor need physical addresses to access memory. So there is mechanism to produce physical addresses from virtual addresses which is called address translation done through page tables. Yes whatever you see in kernel mode/user mode, register values all are virtual addresses. but when processor actually perform computation, processor always uses physical addresses
My question is does processor always produces Virtual addresses? In terms of Address-spaces(user/kernel), Processor modes (user/kernel) and contexts (process/system) when all times does processor produce physical addresses?
An x86 CPU operates with 3 different "kinds" of addresses:
physical: the actual addresses that are used to select the byte in memory. Used for things like segment base addresses, interrupt descriptor table, global descriptor table.
logical: these are addresses that you use most of the time when paging is disabled. They're converted to physical addresses by adding the respective segments (physical) base address.
virtual: the addresses used by most instructions when paging is enabled. These are translated into logical addresses using page directories and tables.
Which kind of address is used isn't dependent on the current privilege level (CPL; "system mode", "user mode" or between). It depends on the state of the processor (paging enabled or not) and the actual instruction (lidt for example). Though I guess it's safe to assume that there won't be physical addressing in "user mode" (CPL > 0) since instructions using physical addresses are usually privileged instructions.

Resources