Process region table & Global Descriptor Table & virtual address - Linux

I am going through Linux notes from one of the training institutes here.
As per those notes, whenever a process is created a region is allocated to it.
The region contains all the segments for the process.
The region is described by a region table, which contains the following entry:
virtual address -> physical address pointer + disk block descriptor
The disk block descriptor points to the swap area or the executable file on disk.
I have two doubts:
1> Where do the Global and Local Descriptor Tables come in here?
http://iakovlev.org/images/intel/31.jpg (from http://iakovlev.org/index.html?p=945)
2> Does each process have its own Global Descriptor Table?
I think yes, otherwise the same virtual address in two processes would point to the same physical address.
Please suggest.

1) The global descriptor table gives the base address that is added to the offset to form the linear address. It is NEARLY always zero, and the "Limit" is set to "all ones" (that is, all addressable memory). In effect, the segment selectors are not actually used. The architecture requires them to be present and loaded, but Linux does not make use of what they "can be used for".
The Local descriptor table works exactly the same way, except there is an LDT per process. On Linux it is normally empty unless the process installs entries itself (through modify_ldt(2)). The process can modify its LDT, it can't modify the GDT.
To tell if a selector refers to the GDT or the LDT, look at bit 2 (the one worth 4), the table indicator: 0 means GDT, 1 means LDT. For example, on my system ss has the value 0x2b and cs is 0x33. Neither has bit 2 set, so both come out of the GDT.
2) No. There is one GDT (per CPU core, to be precise) - that's why it's called "global" - there is one for everything. Two processes using the same virtual address still reach different physical addresses because each process has its own page tables, not because it has its own GDT.
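The selector layout is fixed by the x86 architecture: bits 0-1 hold the requested privilege level, bit 2 is the table indicator, and bits 3-15 are the index into the table. A minimal sketch that decodes the two values quoted above (the function name is made up for illustration):

```python
def decode_selector(selector):
    """Split a 16-bit x86 segment selector into its three fields."""
    rpl = selector & 0b11        # bits 0-1: requested privilege level
    ti = (selector >> 2) & 1     # bit 2: table indicator, 0 = GDT, 1 = LDT
    index = selector >> 3        # bits 3-15: index into the chosen table
    return {"index": index, "table": "LDT" if ti else "GDT", "rpl": rpl}


for name, value in (("ss", 0x2B), ("cs", 0x33)):
    print(name, hex(value), decode_selector(value))
```

Both values decode to privilege level 3 (user mode) and to the GDT, at indices 5 and 6.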

Related

What structure is traversed to deallocate pages, when a process terminates? (Page Table or something else?)

I am trying to understand the nature of the operations carried out regarding the deallocation of physical memory when a process terminates.
Assume that the page table for the process is a multi-level tree structure, as implemented on Linux.
My current understanding is that the OS would need to deallocate each physical page frame that is mapped to whatever subset of the virtual addresses for which the Page Table entry (PTE) exists. This could happen by a traversal of the multi-level tree PT structure & for the PTEs that have their valid bit set, the physical frame descriptor corresponding to the PTE is added to the free list (which is used in the Buddy allocation process).
My question is: Is the traversal of the Page Table actually done for this? An alternative, faster way would be to maintain, for each process, a linked list of the page frame descriptors allotted to it and then traverse that list linearly during process termination. Is this more generic and faster method the one that is followed instead?
I'm not sure that pages get physically deallocated when a process ends.
My understanding is that the MMU is managed by the kernel.
But each process has its own virtual address space, which the kernel changes:
for explicit syscalls changing it, ie. mmap(2)
at program start through execve(2) (which can be thought of as several virtual mmap-s as described by the segments of the ELF program executable file)
at process termination, as if each segment of the address space was virtually munmap-ed
And when a process terminates, it is its virtual address space which gets destroyed or deallocated; a physical RAM page is released only as a consequence, once no other process still maps it.
So the page table (whatever definition you give to it) is probably managed inside the kernel by a few primitives like adding a segment to virtual address space and removing a segment from it. The virtual space is lazily managed, since the kernel uses copy on write techniques to make fork fast.
Don't forget that some pages (e.g. the code segment of shared libraries) are shared between processes and that all the tasks of a multi-threaded process share the same virtual address space.
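To make the region-by-region teardown concrete, here is a toy model, not kernel code. The class, its methods and the frame numbers are all invented for illustration, and the "page table" is a flat dictionary rather than a multi-level tree. It only shows the idea that unmapping walks the mapped regions, drops one reference per mapped page, and returns a frame to the free list only when nothing else maps it:

```python
from collections import Counter

PAGE = 4096


class AddressSpace:
    """Toy address space: a list of mapped regions plus a flat page table."""

    def __init__(self):
        self.regions = []       # (start, end) virtual ranges
        self.page_table = {}    # virtual page number -> physical frame number

    def map_region(self, start, end, frames, refcount):
        self.regions.append((start, end))
        for vpn, frame in zip(range(start // PAGE, end // PAGE), frames):
            self.page_table[vpn] = frame
            refcount[frame] += 1    # one more mapping of this frame

    def teardown(self, refcount, free_list):
        """Unmap every region; free a frame only when nothing maps it any more."""
        for start, end in self.regions:
            for vpn in range(start // PAGE, end // PAGE):
                frame = self.page_table.pop(vpn, None)
                if frame is None:
                    continue        # page was never populated
                refcount[frame] -= 1
                if refcount[frame] == 0:
                    free_list.append(frame)
        self.regions.clear()


refcount, free_list = Counter(), []
p1, p2 = AddressSpace(), AddressSpace()
p1.map_region(0x400000, 0x402000, frames=[7, 8], refcount=refcount)   # shared code
p2.map_region(0x400000, 0x402000, frames=[7, 8], refcount=refcount)   # same frames
p1.map_region(0x600000, 0x601000, frames=[20], refcount=refcount)     # private data
p1.teardown(refcount, free_list)
print(free_list)   # only the private frame is freed while p2 still runs
```

The shared frames survive the first teardown because the second address space still references them; they are freed only when that one is torn down too.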
BTW, the Linux kernel is free software, so you should study its source code (from http://kernel.org/). Look also on http://kernelnewbies.org ; memory management happens inside the mm/ subtree of the kernel source.
There are lots of resources. Look into linux-kernel-slides, slides#245 for a start, and there are many books and resources about the Linux kernel. Look for vm_area_struct, pgtable, etc.

How does virtual to physical memory mapping work

I'm currently trying to understand systems programming for Linux and have a hard time understanding how virtual to physical memory mappings work.
What I understand so far is that two processes P1 and P2 can make references to the same virtual address, for example 0xf11001. Now this memory address is split up into two parts: 0xf11 is the page number and 0x001 is the offset within that page (assuming a 4096-byte page size is used). To find the physical address, the MMU has hardware registers that map the page number to a physical frame, let's say 0xfff. The last stage is to combine 0xfff with 0x001 to find the physical address 0xfff001.
However, this understanding makes no sense: the same virtual addresses would still point to the same physical location. What step am I missing in order for my understanding to be correct?
You're missing one (crucial) step here. In general, the MMU doesn't have hardware registers holding the mappings, but instead one register (the page table base pointer) which points to the physical memory address of the page table (with the mappings) for the currently running process. Page tables are unique to every process. On a context switch, the kernel will change this register's value, so a different mapping is applied for each running process.
Here's a nice presentation on this topic: http://www.eecs.harvard.edu/~mdw/course/cs161/notes/vm.pdf
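A minimal sketch of that idea, using the numbers from the question. The frame number for P2 and the ToyMMU class are assumptions made up for the example, and a real page table is a multi-level structure in memory rather than a Python dictionary:

```python
PAGE_SHIFT = 12                          # 4096-byte pages, as in the question
OFFSET_MASK = (1 << PAGE_SHIFT) - 1

# Hypothetical per-process page tables: virtual page number -> physical frame.
page_tables = {
    "P1": {0xF11: 0xFFF},
    "P2": {0xF11: 0x2A0},
}


class ToyMMU:
    def __init__(self):
        self.base = None                 # stands in for the page table base register

    def context_switch(self, process):
        self.base = page_tables[process]     # the kernel reloads the register

    def translate(self, vaddr):
        frame = self.base[vaddr >> PAGE_SHIFT]   # a KeyError plays the page fault
        return (frame << PAGE_SHIFT) | (vaddr & OFFSET_MASK)


mmu = ToyMMU()
for process in ("P1", "P2"):
    mmu.context_switch(process)
    print(process, hex(mmu.translate(0xF11001)))
```

The same virtual address 0xf11001 comes out as 0xfff001 for P1 and 0x2a0001 for P2, because only the table behind the base register changed.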

Address of gdtr in Linux

I am not clear about the address of gdtr.
from the book "Understanding The Linux Kernel". 2.2.2 Segment Descriptor( page 38)
"The address of the GDT in main memory is contained in the gdtr processor register and the address of the currently used LDT is contained in the ldtr processor register."
My question:
is the address in gdtr logical address/linear address or physical address?
I think it should be a physical address, because paging has not been set up before that.
I need someone to confirm this and provide a better explanation.
Another question about paragraph:
book "Understanding The Linux Kernel". 2.2.4 Segment Linux (page 43).
"For each process, therefore, the GDT contains two different Segment Descriptors: one for the TSS segment and one for the LDT segment. The maximum number of entries allowed in the GDT is 12+2xNR_TASKS, where, in turn, NR_TASKS denotes the maximum number of processes. In the previous list we described the six main Segment Descriptors used by Linux. Four additional Segment Descriptors cover Advanced Power Management (APM) features, and four entries of the GDT are left unused, for a grand total of 14."
12+2xNR_TASKS: where does the 12 come from?
I think it should be 14 as
"In the previous list we described the six main Segment Descriptors used by Linux. Four additional Segment Descriptors cover Advanced Power Management (APM) features, and four entries of the GDT are left unused, for a grand total of 14."
I might be misunderstanding something, please help me clear this up.
Thanks,
$XSM
The Intel manual (64-ia-32-architectures-software-developer-vol-3a-part-1-manual) says that the linear address of the GDT is stored in the GDTR register, and the linear address of the LDT is stored in the LDTR register.
I believe the address is a linear address. Paging is turned on in startup_32() just after the segment registers (ds, es, fs, gs) are set to a known value (0x18 for i386). The page directory is located at 0x00101000 (also called swapper_pg_dir). Initialization of the GDT and IDT comes way after paging is set up.
For more information, you can look at the source listing here
I want to explain why GDTR must hold a linear address. There is an instruction, lgdt, which means programmers can set GDTR. But since physical addresses are invisible to programmers, GDTR must hold a linear address.
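For reference, in 32-bit mode the memory operand that lgdt reads is a 6-byte structure: a 16-bit limit (the size of the table in bytes minus one) followed by the 32-bit linear base address. A small sketch of that layout, where the base address and the entry count are made-up values and the helper names are hypothetical:

```python
import struct

DESCRIPTOR_SIZE = 8    # each GDT entry is 8 bytes


def gdt_pseudo_descriptor(linear_base, entries):
    """Build the 6-byte operand that lgdt reads in 32-bit mode."""
    limit = entries * DESCRIPTOR_SIZE - 1     # offset of the last valid byte
    return struct.pack("<HI", limit, linear_base)


def parse_pseudo_descriptor(raw):
    """Recover (linear base, number of entries) from the 6-byte operand."""
    limit, base = struct.unpack("<HI", raw)
    return base, (limit + 1) // DESCRIPTOR_SIZE


raw = gdt_pseudo_descriptor(0x00105000, entries=32)   # base address is made up
print(raw.hex(), parse_pseudo_descriptor(raw))
```

The "<" prefix in the format string disables padding, so the packed value is exactly 6 bytes, little-endian, as the CPU expects.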

Segmentation registers use

I am trying to understand how memory management goes on low level and have a couple of questions.
1) A book about assembly language by Kip R. Irvine says that in real mode the first three segment registers are loaded with the base addresses of the code, data, and stack segments when the program starts. This is a bit ambiguous to me. Are these values specified manually, or does the assembler generate instructions to write the values into the registers? If it happens automatically, how does it find out the size of these segments?
2) I know that Linux uses a flat linear model, i.e. uses segmentation in a very limited way. Also, according to "Understanding the Linux Kernel" by Daniel P. Bovet and Marco Cesati, there are four main segments in the GDT: user data, user code, kernel data and kernel code. All four segments have the same size and base address. I do not understand why four of them are needed if they differ only in type and access rights (they all produce the same linear address, right?). Why not use just one of them and write its descriptor to all segment registers?
3) How do operating systems that do not use segmentation divide programs into logical segments? For example, how do they differentiate stack from code without segment descriptors? I read that paging can be used to handle such things, but I don't understand how.
You must have read some really old books, because nobody programs for real mode anymore ;-) In real mode, you get the physical address of a memory access with physical address = segment register * 0x10 + offset, the offset being a value held in one of the general-purpose registers. Because these registers are 16 bits wide, a segment is 64 KB long and there is nothing you can do about its size, simply because there is no size attribute! With the * 0x10 multiplication, 1 MB of memory becomes available, but many segment:offset combinations overlap, depending on what you put in the segment register and the address register. I haven't compiled any code for real mode, but I think it's up to the OS to set up the segment registers while loading the binary, just like a loader would allocate some pages when loading an ELF binary. However, I have compiled bare-metal kernel code, and I had to set up these registers myself.
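The real-mode formula above is easy to check by hand. A short sketch, assuming a 20-bit address bus as on the original 8086 (the function name and the wrap-around parameter are illustration only):

```python
def real_mode_address(segment, offset, address_bits=20):
    """physical = segment * 0x10 + offset, wrapped to the width of the address bus."""
    return ((segment << 4) + offset) & ((1 << address_bits) - 1)


# Two different segment:offset pairs that name the same physical byte.
print(hex(real_mode_address(0x1234, 0x0005)))
print(hex(real_mode_address(0x1230, 0x0045)))

# Span reachable through one segment register value: 16-bit offsets, so 64 KB.
first = real_mode_address(0x1000, 0x0000)
last = real_mode_address(0x1000, 0xFFFF)
print(hex(last - first + 1))
```

Both pairs resolve to 0x12345, which is the overlap mentioned in the answer, and the span comes out as 0x10000 bytes.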
Four segments are mandatory in the flat model because of architecture constraints. In protected mode the segment registers no longer contain the segment base address, but a segment selector, which is basically an offset into the GDT. Depending on the value of the segment selector, the CPU will be in a given level of privilege; this is the CPL (Current Privilege Level). The segment selector points to a segment descriptor which has a DPL (Descriptor Privilege Level), which eventually becomes the CPL if the segment register is loaded with this selector (at least for the code-segment selector). Therefore you need at least a pair of segment selectors to differentiate the kernel from userland. Moreover, segments are either code segments or data segments, so you end up with four segment descriptors in the GDT.
I don't have any example of a serious OS which makes real use of segmentation, simply because segmentation is still present only for backward compatibility. Using the flat model approach is nothing but a means to get rid of it. Anyway, you're right, paging is way more efficient and versatile, and available on almost all architectures (the concepts at least). I can't explain paging internals here, but all the information you need is in the excellent Intel manual: Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A: System Programming Guide, Part 1.
Expanding on Benoit's answer to question 3...
The division of programs into logical parts such as code, constant data, modifiable data and stack is done by different agents at different points in time.
First, your compiler (and linker) creates executable files where this division is specified. If you look at a number of executable file formats (PE, ELF, etc), you'll see that they support some kind of sections or segments or whatever you want to call it. Besides addresses and sizes and locations within the file, those sections bear attributes telling the OS the purpose of these sections, e.g. this section contains code (and here's the entry point), this - initialized constant data, that - uninitialized data (typically not taking space in the file), here's something about the stack, over there is the list of dependencies (e.g. DLLs), etc.
Next, when the OS starts executing the program, it parses the file to see how much memory the program needs, where and what memory protection is needed for every section. The latter is commonly done via page tables. The code pages are marked as executable and read-only, the constant data pages are marked as not executable and read-only, other data pages (including those of the stack) are marked as not executable and read-write. This is how it ought to be normally.
Oftentimes programs need regions that are read-write and executable at the same time, for dynamically generated code or just to be able to modify the existing code. The combined RWX access can be either specified in the executable file or requested at run time.
There can be other special pages such as guard pages for dynamic stack expansion, they're placed next to the stack pages. For example, your program starts with enough pages allocated for a 64KB stack and then when the program tries to access beyond that point, the OS intercepts access to those guard pages, allocates more pages for the stack (up to the maximum supported size) and moves the guard pages further. These pages don't need to be specified in the executable file, the OS can handle them on its own. The file should only specify the stack size(s) and perhaps the location.
If there's no hardware or code in the OS to distinguish code memory from data memory or to enforce memory access rights, the division is very formal. 16-bit real-mode DOS programs (COM and EXE) didn't have code, data and stack segments marked in some special way. COM programs had everything in one common 64KB segment and they started with IP=0x100 and SP=0xFFxx and the order of code and data could be arbitrary inside, they could intertwine practically freely. DOS EXE files only specified the starting CS:IP and SS:SP locations and beyond that the code, data and stack segments were indistinguishable to DOS. All it needed to do was load the file, perform relocation (for EXEs only), set up the PSP (Program Segment Prefix, containing the command line parameter and some other control info), load SS:SP and CS:IP. It could not protect memory because memory protection isn't available in the real address mode, and so the 16-bit DOS executable formats were very simple.
Wikipedia is your friend in this case. http://en.wikipedia.org/wiki/Memory_segmentation and http://en.wikipedia.org/wiki/X86_memory_segmentation should be good starting points.
I'm sure there are others here who can personally provide in-depth explanations, though.

program life in terms of paged segmentation memory

I have a confused notion of how segmentation and paging work on x86 Linux machines. I will be glad if someone clarifies all the steps involved from start to end.
x86 uses the paged segmentation technique for memory management.
Can anyone please explain what happens from the moment an executable ELF file is loaded from the hard disk into main memory to the time it dies? When compiled, the executable has different sections in it (text, data, stack, heap, bss). How will these be loaded? How will they be set up under the paged segmentation technique?
I also want to know how the page tables get set up for the loaded program, how the GDT gets set up, and how the registers are loaded. And why is it said that logical addresses (the ones processed by the segmentation unit of the MMU) are 48 bits (16 bits of segment selector + 32 bits of offset) when it is a 32-bit machine? How are the other 16 bits stored? Anything accessed from RAM must be 32 bits or 4 bytes, so how are the remaining 16 bits accessed (to be loaded into the segment registers)?
Thanks in advance. The question covers a lot of things, but I wanted to get clarification about the entire life cycle of an executable. I will be glad if someone answers and starts a discussion on this.
Unix traditionally has implemented protection via paging. 286+ provides segmentation, and 386+ provides paging. Everyone uses paging, few make any real use of segmentation.
In x86, every memory operand has an implicit segment (so the address is really 16 bit selector + 32 bit offset), depending on the register used. So if you access [ESP + 8] or [EBP - 4] the implied segment register is SS, if you access [ESI] or [EDI + 4] the implied segment register is DS, and string instructions such as MOVS write their destination through ES:[EDI]. You can override this via segment prefix overrides.
Linux, and virtually every modern x86 OS, uses a flat memory model (or something similar). Under a flat memory model each segment provides access to the whole memory, with a base of 0 and a limit of 4 GB, so you don't have to worry about the complications segmentation brings about. Basically there are 4 segments: kernelspace code (RX), kernelspace data (RW), userspace code (RX), userspace data (RW).
An ELF file consists of some headers that point to "program segments" and "sections". Sections are used for linking. Program segments are used for loading. Program segments are mapped into memory via mmap(), which sets up page-table entries with appropriate permissions.
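To see the program segments a loader would map, the program headers can be read directly from the file. The sketch below assumes a little-endian 64-bit ELF image and rejects anything else. The function name is hypothetical, while the struct layouts and the PT_LOAD / PF_* values follow the standard Elf64_Ehdr and Elf64_Phdr definitions:

```python
import struct

ELF64_HEADER = struct.Struct("<16sHHIQQQIHHHHHH")   # Elf64_Ehdr, little-endian
ELF64_PHDR = struct.Struct("<IIQQQQQQ")             # Elf64_Phdr, little-endian
PT_LOAD = 1
PF_X, PF_W, PF_R = 1, 2, 4


def loadable_segments(image):
    """Return (vaddr, memsz, 'rwx' string) for every PT_LOAD program header."""
    fields = ELF64_HEADER.unpack_from(image, 0)
    ident, phoff, phentsize, phnum = fields[0], fields[5], fields[9], fields[10]
    if ident[:4] != b"\x7fELF" or ident[4] != 2 or ident[5] != 1:
        raise ValueError("expected a little-endian 64-bit ELF image")
    segments = []
    for i in range(phnum):
        p_type, p_flags, _off, vaddr, _paddr, _filesz, memsz, _align = (
            ELF64_PHDR.unpack_from(image, phoff + i * phentsize)
        )
        if p_type == PT_LOAD:
            perms = "".join(
                char if p_flags & bit else "-"
                for char, bit in (("r", PF_R), ("w", PF_W), ("x", PF_X))
            )
            segments.append((vaddr, memsz, perms))
    return segments
```

Passing the bytes of an x86-64 executable to loadable_segments() lists each loadable segment with the permissions its pages should get, typically one r-x segment for code and one rw- segment for data.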
Now, older x86 CPUs' paging mechanism only provided RW access control (read permission implies execute permission), while segmentation provided RWX access control. The end permission takes into account both segmentation and paging (e.g: RW (data segment) + R (read only page) = R (read only), while RX (code segment) + R (read only page) = RX (read and execute)).
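The two combinations above can be modelled as a set intersection, since an access has to pass both the segment check and the page check. This is only a sketch of the rule as described for those older CPUs, with made-up names, and it ignores privilege levels entirely:

```python
def page_rights(writable):
    """Rights granted by a present page on older x86: readable implies executable."""
    rights = {"r", "x"}
    if writable:
        rights.add("w")
    return rights


SEGMENT_RIGHTS = {
    "data": {"r", "w"},   # writable data segment, never executable
    "code": {"r", "x"},   # readable code segment, never writable
}


def effective_rights(segment, writable_page):
    # An access must pass both checks, so the result is the intersection.
    return SEGMENT_RIGHTS[segment] & page_rights(writable_page)


print(sorted(effective_rights("data", writable_page=False)))
print(sorted(effective_rights("code", writable_page=False)))
```

A read-only page reached through the data segment ends up read-only, and the same page reached through the code segment ends up readable and executable, matching the two examples in the text.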
So there are some patches that provide execution prevention via segmentation: e.g. OpenWall provided a non-executable stack by shrinking the code segment (the one with execute permission), and having special emulation in the page fault handler for anything that needed execution from a high memory address (e.g: GCC trampolines, self-modified code created on the stack to efficiently implement nested functions).
There's no such thing as paged segmentation, not in the official documentation at least. There are two different mechanisms working together and more or less independently of each other:
1) Translation of a logical address of the form 16-bit segment selector value:16/32/64-bit segment offset value, that is, a pair of 2 numbers, into a 32/64-bit virtual address.
2) Translation of the virtual address into a 32/64-bit physical address.
Logical addresses are what your applications operate with directly. Then follows the above 2-step translation of them into what the RAM will understand, physical addresses.
In the first step the GDT (or it can be LDT, depends on the selector value) is indexed by the selector to find the relevant segment's base address and size. The virtual address will be the sum of the segment base address and the offset. The segment size and other things in segment descriptors are needed to provide protection.
In the second step the page tables are indexed by different parts of the virtual address and the last indexed table in the hierarchy gives the final, physical address that goes out on the address bus for the RAM to see. Just like with segment descriptors, page table entries contain not only addresses but also protection control bits.
That's about it on the mechanisms.
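A compact sketch of the two steps for the classic 32-bit case with 4 KB pages, where the virtual (linear) address splits into 10 + 10 + 12 bits. The table contents, descriptor indices and function names are invented for the example, the descriptor table is reduced to (base, limit) pairs, and the GDT/LDT selection bit and all protection checks other than the limit are left out:

```python
def logical_to_linear(descriptor_table, selector, offset):
    """Step 1: segmentation. Returns base + offset after a limit check."""
    base, limit = descriptor_table[selector >> 3]    # bits 3-15 index the table
    if offset > limit:
        raise MemoryError("offset outside segment limit")
    return (base + offset) & 0xFFFFFFFF


def linear_to_physical(page_directory, linear):
    """Step 2: two-level paging, 10-bit directory index, 10-bit table index."""
    dir_index = linear >> 22
    table_index = (linear >> 12) & 0x3FF
    frame = page_directory[dir_index][table_index]   # a KeyError plays the page fault
    return (frame << 12) | (linear & 0xFFF)


# Flat model: every descriptor has base 0 and the maximum limit.
flat_gdt = {1: (0, 0xFFFFFFFF), 2: (0, 0xFFFFFFFF)}
page_directory = {0x20: {0x48: 0x1A2B3}}             # made-up mapping

linear = logical_to_linear(flat_gdt, selector=0x10, offset=0x08048123)
print(hex(linear), hex(linear_to_physical(page_directory, linear)))
```

With a flat descriptor the first step returns the offset unchanged (0x8048123), and only the second step changes the number, here to 0x1a2b3123.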
Now, in many x86 OSes the segment selectors that are used for applications are fixed, they are the same in all of them, they never change and they point to segment descriptors that have base addresses equal to 0 and sizes equal to the possible maximum (e.g. 4GB in non-64-bit modes). Such a GDT setup effectively means that the first step does no useful work and the offset part of the logical address translates into numerically equal virtual address.
This makes the segment selector values practically useless. They still have to be loaded into the CPU's segment registers (in non-64-bit modes into at least CS, SS, DS and ES), but beyond that point they can be forgotten about.
This all (except Linux-related details and the ELF format) is explained in or directly follows from Intel's and AMD's x86 CPU manuals. You'll find many more details there.
Perhaps read the Assembly HOWTO. When a Linux process starts to execute an ELF executable using the execve system call, it is essentially (sort of) mmap-ing some segments (and initializing registers, and a tiny part of the stack). Read also the SVR4 x86 ABI supplement and its x86-64 variant. Don't forget that a Linux process only sees the memory mappings of its own address space and only cares about virtual memory.
There are many good books on operating system (O.S.) kernels, notably by A. Tanenbaum and by M. Bach, and some on the Linux kernel.
NB: segment registers are nearly unused on Linux.

Resources