How does the Linux kernel save task_struct in dynamic memory?

While reading Understanding the Linux Kernel, I came across this sentence:
process descriptors are stored in dynamic memory.
As far as I know, on a 32-bit system:
the kernel reserves roughly 128MB at the top of its virtual address space (high memory) for dynamically mapping physical memory.
My question is: although the high-memory region can eventually reach any physical address, it can only map 128MB at any one time. The kernel's data structures are so numerous that they could exceed 128MB. If the kernel has to remap part of the high-memory region, the virtual addresses of some data structures saved there might become invalid. How can the kernel keep more than 128MB of data structures in dynamically mapped physical memory?
Although I have tried hard to express myself clearly and to follow this site's rules, there may still be something I got wrong. I'm very sorry if so.

What does "The kernel's data structures are so numerous that they could exceed 128MB" mean? There is no single "kernel data structure". There are things the kernel allocates, but they are at most a few pages long each. In particular, there is no single object that would be more than 128MB long.
If something really is physically big (say a file read entirely into RAM that takes 512MB), the kernel just maps and unmaps physical pages as it needs them. In particular, there is no need for the file to be mapped in its entirety at the same time, and the virtual addresses its parts get temporarily mapped to are meaningless.
Also note that today x86_64 provides a 128TB address space, so there are no shenanigans of this sort.
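To make "stored in dynamic memory" concrete, here is a minimal, hedged sketch of the allocation pattern involved: process descriptors are ordinary slab allocations made at fork() time, and the real kernel keeps a dedicated task_struct cache in kernel/fork.c. The cache and helper names below (demo_task_cachep, demo_alloc_task) are made up for illustration.

#include <linux/errno.h>
#include <linux/init.h>
#include <linux/sched.h>
#include <linux/slab.h>

static struct kmem_cache *demo_task_cachep;   /* hypothetical cache */

static int __init demo_task_cache_init(void)
{
    /* One slab cache sized for task_struct, mirroring the kernel's own. */
    demo_task_cachep = kmem_cache_create("demo_task_struct",
                                         sizeof(struct task_struct),
                                         0, SLAB_HWCACHE_ALIGN, NULL);
    return demo_task_cachep ? 0 : -ENOMEM;
}

static struct task_struct *demo_alloc_task(void)
{
    /* GFP_KERNEL: the object comes from directly mapped (low) memory,
     * so its kernel virtual address stays valid; nothing is remapped. */
    return kmem_cache_alloc(demo_task_cachep, GFP_KERNEL);
}

Because such objects live in the permanently mapped low-memory region, the remapping concern in the question never arises for them.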

Related

Why __GFP_HIGHMEM flag can't be applied to the __get_free_page() or kmalloc()

I basically want to know two things:
How does kmalloc work? I mean, which function does kmalloc call to allocate memory: alloc_pages() or __get_free_pages()?
Why can't the __GFP_HIGHMEM flag be applied to __get_free_pages() or kmalloc()?
I got the following extract from Linux Kernel Development by Robert Love. Can anybody explain better what exactly the problem is with alloc_pages() when it is given the __GFP_HIGHMEM flag?
Page 240, Chapter 12:
You cannot specify __GFP_HIGHMEM to either __get_free_pages() or kmalloc(). Because these both return a logical address, and not a page structure, it is possible that these functions would allocate memory not currently mapped in the kernel's virtual address space and, thus, does not have a logical address. Only alloc_pages() can allocate high memory. The majority of your allocations, however, will not specify a zone modifier because ZONE_NORMAL is sufficient.
As explained in the book Linux Device Drivers 3rd edition (freely available here), "the Linux kernel knows about a minimum of three memory zones: DMA-capable memory, normal memory, and high memory". The __GFP_HIGHMEM flag indicates that "the allocated memory may be located in high memory". This flag has a platform-dependent role, although its usage is valid on all platforms.
Now, as explained here, "high memory is the part of physical memory in a computer which is not directly mapped by the page tables of its operating system kernel". This zone of memory is not mapped into the kernel's virtual address space, which prevents the kernel from referring to it directly. Unfortunately, the memory used for kernel-mode data structures must be direct-mapped in the kernel, and therefore cannot come from the HIGHMEM zone.
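To make the contrast concrete, here is a short, hedged sketch (not from the book): kmalloc() must hand back a logical address, so it can only draw from directly mapped memory, while alloc_pages() with __GFP_HIGHMEM returns a struct page descriptor that has to be mapped with kmap() before its contents can be touched. The function name highmem_vs_lowmem_demo is made up.

#include <linux/gfp.h>
#include <linux/highmem.h>
#include <linux/slab.h>
#include <linux/string.h>

static void highmem_vs_lowmem_demo(void)
{
    /* Low memory: kmalloc() returns a logical (directly mapped) address,
     * which is why __GFP_HIGHMEM makes no sense here. */
    void *buf = kmalloc(PAGE_SIZE, GFP_KERNEL);

    /* High memory: alloc_pages() only returns a page descriptor.  The
     * page may have no kernel mapping until kmap() creates one. */
    struct page *page = alloc_pages(GFP_KERNEL | __GFP_HIGHMEM, 0);

    if (page) {
        void *vaddr = kmap(page);    /* temporary kernel mapping */
        memset(vaddr, 0, PAGE_SIZE);
        kunmap(page);                /* drop the mapping again */
        __free_pages(page, 0);
    }
    kfree(buf);
}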

What is the rationale for the Linux kernel mapping as much RAM as possible into the direct-mapping (linear-mapping) area?

The discussion below applies to 32-bit ARM Linux.
Suppose there is 512MB of physical RAM in my system. With common configurations, all 512MB of this physical RAM will be direct-mapped by the kernel (0xC0000000 to 0xE0000000).
The question is: the kernel itself only uses part of this RAM; most of it is allocated to user space. Why bother mapping all 512MB of physical RAM into the kernel's virtual space (0xC0000000 to 0xE0000000)? Why doesn't the kernel map only the part it needs for its own usage (say, 64MB of RAM)?
If the physical RAM is greater than 1GB, things get a little more complicated. Let's say the directly mapped area is 768MB in size. The result would be 768MB out of 1GB being directly mapped into the kernel's virtual space. I guess the rest of the RAM (256MB) goes to two places: either the high-memory area or allocations the kernel makes on behalf of user space. But I still don't see any advantage of mapping that much physical RAM into the kernel's virtual space.
Actually this question can be reduced to:
What are the drawbacks if the kernel directly maps only a small part of physical RAM (say 64MB out of 512MB)?
Before further discussion, it is beneficial to know that:
After the MMU is turned on, every address issued by the CPU is a virtual address.
If the kernel wants to access ANY address in RAM, a mapping must be set up before the actual access happens.
If the kernel directly maps only a small part of physical RAM, the cost is that every time the kernel needs to access another part of RAM, it has to set up a temporary mapping before the access and tear that mapping down afterwards, which is tedious and inefficient.
If that mapping is set up in advance and is always there, it saves the kernel a lot of trouble.
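A minimal sketch of why the permanent direct map is cheap, assuming the usual 3G/1G split with PAGE_OFFSET at 0xC0000000: for lowmem, converting between a physical address and a kernel virtual address is plain arithmetic, with no page-table setup or teardown per access. The wrapper names below are made up for illustration.

#include <linux/mm.h>

/* For directly mapped (lowmem) physical addresses, __va() is effectively
 * just (void *)(phys + PAGE_OFFSET), and __pa() the reverse. */
static void *lowmem_phys_to_virt(phys_addr_t phys)
{
    return __va(phys);
}

static phys_addr_t lowmem_virt_to_phys(const void *vaddr)
{
    return __pa(vaddr);
}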

Understanding Memory Mapped Files

I have started reading about memory-mapped I/O and I'm having some difficulty grasping the concepts.
This is what I have understood so far:
Each process has a virtual address space. Memory-mapped files are allocated a specific address range in the virtual address space, which maps to the same address in physical memory. This way, all the writes done by the disk controller to that memory (through DMA) will be reflected in the process without any additional copying. (In the non-memory-mapped case, the CPU has to copy the contents over to the process's buffer.)
My Doubts:
Is my understanding correct?
What will happen if multiple processes try to mmap a file and there is no contiguous block of memory available for direct mapping?
The memory subsystem itself doesn't have any understanding of "files", which are an OS concept, and there have been some operating systems that didn't use files at all. You're close but a little off in your understanding of how mmap works.
Each process does have its own virtual address space, which may have very little to do with physical memory (lots of virtual address space never has any memory associated with it at all, and virtual memory that's swapped out doesn't have any physical memory either). The system uses lookup tables (page tables on x86) that specify which virtual address ranges map to which physical address ranges. Virtual memory that isn't "resident" (swapped out, mmapped but not loaded) has a "not present" entry.
Whenever a program tries to access this memory, the CPU causes a page fault, which tells the OS to go find the appropriate contents somewhere and load them into physical memory. In the case of swap, the contents are loaded out of a swap file or partition; in the case of mmap, they're loaded out of somewhere in the filesystem.
The mechanism for getting them into physical memory and updating the descriptor table can vary. What you're describing is DMA, which lets the drive controller copy contents directly into a block of physical memory, and zero-copy I/O, which is a technique where the OS just creates a new descriptor mapping telling the processor to "teleport" the region of physical memory into the program's address space. Neither is technically required for mmap (the OS could load the file "by hand" and copy it into a new buffer for the program, and this may happen in a read-copy-update situation), but modern systems do it like you described.
The physical memory doesn't necessarily have to be contiguous. When the POSIX version of mmap is called, the OS allocates length bytes for the mapping, but thanks to virtual memory, those bytes could be split up among multiple blocks and mapped together by the processor.
If multiple processes are trying to mmap the same file, the OS behavior depends on whether the access is read-only or read/write; read-only copies can be shared among many processes (such as the actual executable code; this is why even though Chrome may have dozens of processes running, the Chrome binary is only in memory once).
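For reference, a minimal userspace sketch of the POSIX mmap() call discussed above (the file name example.dat is made up): the call itself only establishes the virtual mapping, and the file's pages are faulted in lazily as they are first touched.

#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("example.dat", O_RDONLY);   /* hypothetical input file */
    if (fd < 0) { perror("open"); return 1; }

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) { close(fd); return 1; }

    /* Only page-table entries are created here; physical pages arrive
     * on demand when the mapping is first read. */
    char *data = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (data == MAP_FAILED) { perror("mmap"); return 1; }

    printf("first byte: 0x%02x\n", (unsigned char)data[0]);

    munmap(data, st.st_size);
    close(fd);
    return 0;
}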

What does 'low memory' mean in Linux?

Hi, I'm Korean and I'm getting a little confused by "The boot program first copies itself to a fixed high-memory address to free up low memory for the operating system".
What I found about low memory by googling is that it is the first 640K of memory on a DOS system. Does this mean all of the OS (like the kernel) goes into low memory (640K)?
Thanks for reading this.
This link could be helpful: Virtual Memory
Mainly,
On 32-bit systems, memory is now divided into "high" and "low" memory. Low memory continues to be mapped directly into the kernel's address space, and is thus always reachable via a kernel-space pointer. High memory, instead, has no direct kernel mapping. When the kernel needs to work with a page in high memory, it must explicitly set up a special page table to map it into the kernel's address space first. This operation can be expensive, and there are limits on the number of high-memory pages which can be mapped at any particular time.
This question on unix.stackexchange is a little more in-depth: High and low memory
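A hedged kernel-side sketch of the "special page table" step the quote describes, using the short-lived kmap_atomic()/kunmap_atomic() interface (newer kernels offer kmap_local_page() instead). The mapping comes from a small, fixed pool of per-CPU slots, which is one of the limits the text mentions, so it has to be released promptly; the helper name is made up.

#include <linux/highmem.h>
#include <linux/string.h>

/* Touch one high-memory page: map it, use it, unmap it. */
static void zero_highmem_page(struct page *page)
{
    void *vaddr = kmap_atomic(page);   /* borrow a fixed mapping slot */
    memset(vaddr, 0, PAGE_SIZE);       /* the page is now addressable */
    kunmap_atomic(vaddr);              /* give the slot back at once  */
}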

Why do we need ZONE_HIGHMEM on x86?

In the Linux kernel, mem_map is the array which holds all struct page descriptors. Those pages include the 128MiB of lowmem used for dynamically mapping highmem.
Since the lowmem size is 1GiB, the mem_map array has only 1GiB/4KiB = 256Ki entries. If each entry is 32 bytes, the mem_map takes 8MiB of memory. But even if we used mem_map to describe all 4GiB of physical memory (if we had that much available on x86-32), the mem_map array would occupy only 32MiB, which is not a lot of kernel memory (or am I wrong?).
So my question is: why do we need to use that 128MiB of lowmem for indirect highmem mapping in the first place? Or, put another way, why not map all of the (up to) 4GiB of physical memory (if available) into the kernel space directly?
Note: if my understanding of the kernel source above is wrong, please correct me. Thanks!
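A quick, hedged check of the arithmetic above, assuming 4KiB pages and a 32-byte struct page (the real sizeof(struct page) varies with kernel version and configuration):

#include <stdio.h>

int main(void)
{
    unsigned long long page_size  = 4ULL << 10;   /* 4 KiB page           */
    unsigned long long entry_size = 32;           /* assumed struct page  */

    unsigned long long sizes[] = { 1ULL << 30, 4ULL << 30 };   /* 1GiB, 4GiB */
    for (int i = 0; i < 2; i++) {
        unsigned long long entries = sizes[i] / page_size;
        printf("%lluGiB RAM -> %llu mem_map entries -> %llu MiB of mem_map\n",
               sizes[i] >> 30, entries, entries * entry_size >> 20);
    }
    return 0;
}

This reproduces the 8MiB and 32MiB figures from the question; note that 1GiB/4KiB is 256Ki entries (262,144), a count rather than a size in KiB.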
Look Here: http://www.xml.com/ldd/chapter/book/ch13.html
Kernel low memory is the 'real' memory map, addressed with 32-bit pointers on x86.
Kernel high memory is the 'virtual' memory map, addressed with virtual structures on x86.
You don't want to map it all into the kernel address space, because you can't always address all of it, and you need most of your memory for virtual memory segments (virtual, page-mapped process space.)
At least, that's how I read it. Wow, that's a complicated question you asked.
To add more confusion, chapter 13 talks about some PCI devices not being able to address the 32-bit space, which was the genesis of my previous comment:
On x86, some kernel memory usage is limited to the first gigabyte of memory because of DMA addressing concerns. I'm not 100% familiar with the topic, but there's a compatibility mode for DMA on the PCI bus. That may be what you are looking at.
3.6 GB is not the ceiling when using physical address extension, which is commonly needed on most modern x86 boards, especially with memory hotplug.
Or, put another way, why not map all of the (up to) 4GiB of physical memory (if available) into the kernel space directly?
One reason is userspace: every userspace process has its own virtual address space. Suppose you have 4GB of RAM on x86. If we say that the kernel owns 1GB of memory (~800MB directly mapped + ~200MB vmalloc), the other ~3GB has to be dynamically distributed among the processes running in user space. So how can you map your 4GB directly when you have several address spaces?
Why do we need ZONE_HIGHMEM on x86?
The reason is the same. The kernel reserves only ~800MB for low memory. All other memory is allocated and connected to a particular virtual address only on demand. For example, if you execute a binary, a new virtual address space is created and some pages are allocated to store the binary's code and data (heap, stack, ...). So the key role of high memory is to serve dynamic memory allocation requests; you never know in advance what will be triggered by userspace...
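A tiny, hedged userspace illustration of that on-demand behavior: malloc() mostly just grows the process's virtual address space, and the kernel attaches physical page frames (on 32-bit x86, often from ZONE_HIGHMEM) one page fault at a time as the pages are first written.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    size_t len = 64 * 1024 * 1024;          /* 64 MiB of virtual space */
    char *buf = malloc(len);
    if (!buf) return 1;

    /* Up to here the kernel has mostly just recorded the mapping.
     * Writing the pages is what triggers page faults and makes the
     * kernel hand out physical page frames. */
    memset(buf, 0xAA, len);

    printf("touched %zu bytes\n", len);
    free(buf);
    return 0;
}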
