Why __GFP_HIGHMEM flag can't be applied to the __get_free_page() or kmalloc() - linux

I basically want to know two things:
How does kmalloc() work? That is, which function does kmalloc() call to allocate memory: alloc_pages() or __get_free_pages()?
Why can't the __GFP_HIGHMEM flag be applied to __get_free_pages() or kmalloc()?
I found the following extract in LKD by Robert Love; can anybody explain in more detail what exactly goes wrong if __GFP_HIGHMEM is passed to these functions rather than to alloc_pages()?
Page # 240 CHAPTER 12
You cannot specify __GFP_HIGHMEM to either __get_free_pages() or
kmalloc(). Because these both return a logical address, and not a page
structure, it is possible that these functions would allocate memory
not currently mapped in the kernel’s virtual address space and, thus,
does not have a logical address. Only alloc_pages() can allocate high
memory. The majority of your allocations, however, will not specify a
zone modifier because ZONE_NORMAL is sufficient.

As explained in the book Linux Device Drivers 3rd edition (freely available here), "the Linux kernel knows about a minimum of three memory zones: DMA-capable memory, normal memory, and high memory". The __GFP_HIGHMEM flag indicates that "the allocated memory may be located in high memory". This flag has a platform-dependent role, although its usage is valid on all platforms.
Now, as explained here, "high Memory is the part of physical memory in a computer which is not directly mapped by the page tables of its operating system kernel". This zone of memory is not mapped in the kernel's virtual address space, and this prevents the kernel from being capable of directly referring it. Unfortunately, the memory used for kernel-mode data structures must be direct-mapped in the kernel, and therefore cannot be in the HIGHMEM zone.
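As a minimal sketch (kernel context; the function names here are mine, not from the book), the distinction the quote draws looks like this: alloc_pages() returns a struct page pointer, which always has a valid address, and kmap() creates a logical mapping only on demand:

```c
#include <linux/gfp.h>
#include <linux/highmem.h>

/*
 * Sketch only: allocate one page that is allowed to come from
 * ZONE_HIGHMEM. alloc_pages() returns a struct page *, which is
 * always a valid kernel address even when the page frame itself
 * has no logical (directly mapped) address yet.
 */
static void *grab_high_page(struct page **pagep)
{
	struct page *page = alloc_pages(GFP_KERNEL | __GFP_HIGHMEM, 0);

	if (!page)
		return NULL;
	*pagep = page;
	/*
	 * kmap() gives a highmem page a temporary kernel mapping; for
	 * a lowmem page it simply returns the existing logical address.
	 * __get_free_pages()/kmalloc() have no way to do this, which
	 * is why they must not be passed __GFP_HIGHMEM.
	 */
	return kmap(page);
}

static void drop_high_page(struct page *page)
{
	kunmap(page);
	__free_pages(page, 0);
}
```

On recent kernels kmap_local_page()/kunmap_local() is preferred over kmap()/kunmap(), but the idea is the same.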

Related

High memory mappings in kernel virtual address space

The linear addresses beyond 896 MB correspond to the high-memory region, ZONE_HIGHMEM.
So the page allocator functions that return linear addresses will not work on this region, since they return the linear addresses of directly mapped page frames in ZONE_NORMAL and ZONE_DMA.
I am confused about these lines from Understanding the Linux Kernel:
What do they mean when they say "In 64 bit hardware platforms ZONE_HIGHMEM is always empty"?
What does this highlighted statement mean: "The allocation of high-memory page frames is done only through alloc_pages() function. These functions do not return linear address since they do not exist. Instead the functions return linear address of the page descriptor of the first allocated page frame. These linear addresses always exist, because all page descriptors are allocated in low memory once and forever during kernel initialization."
What are these page descriptors, and does the 896 MB of low memory already hold the page descriptors for the entire RAM?
The x86-32 kernel needs high memory to access more than 1G of physical memory, as it is impossible to permanently map more than 2^32 addresses within a 32-bit address space and the kernel/user split is 1G/3G.
The x86-64 kernel has no such limitation, as the amount of physically-addressable memory (currently 256T) fits within its 64-bit address space and thus may always be permanently mapped.
High memory is a hack. Ideally you don't need it. Indeed, the point of x86-64 is to be able to directly address all the memory you could possibly want. Taken
from https://www.quora.com/Linux-Kernel/What-is-the-difference-between-high-memory-and-normal-memory
I think "page descriptor" means struct page. And considering the size of struct page: yes, all of them can be stored in ZONE_NORMAL.

kmalloc() functionality in linux kernel

I came across, in the LDD book, the statement that using kmalloc we can allocate from high memory. I have one basic question here.
1) To my knowledge we can't access high memory directly from the kernel (unless it is mapped into kernel space through kmap()). And I don't see any mapping area reserved for kmalloc(), though for vmalloc() there is one. So to which part of the kernel address space will kmalloc() map the allocation if it comes from high memory?
This is on the x86 architecture, a 32-bit system.
My knowledge may be out of date but the stack is something like this:
kmalloc allocates physically contiguous memory by calling get_free_pages (this is what the acronym GFP stands for). The GFP_* flags passed to kmalloc end up in get_free_pages, which is the page allocator.
Since special handling is required for highmem pages, you won't get them unless you add the __GFP_HIGHMEM flag to the request.
All memory in Linux is virtual (a generalization that is not exactly true and that is architecture-dependent, but let's go with it, until the next parenthesized statement in this paragraph). There is a range of memory, however, that is not subject to the virtualization in the sense of remapping of pages: it is just a linear mapping between virtual addresses and physical addresses. The memory allocated by get_free_pages is linearly mapped, except for high memory. (On some architectures, linear mappings are supported for memory ranges without the use of a MMU: it's just a simple arithmetic translation of logical addresses to physical: add a displacement. On other architectures, linear mappings are done with the MMU.)
Anyway, if you call get_free_pages (directly, or via kmalloc) to allocate two or more pages, it has to find physically contiguous ones.
Now virtual memory is also implemented on top of get_free_pages, because we can take a page allocated that way, and install it into a virtual address space.
This is how mmap works and everything else for user space. When a piece of virtual memory is committed (becomes backed by a physical page, on a page fault or whatever), a page comes from get_free_pages. Unless that page is highmem, it has a linear mapping so that it is visible in the kernel. Additionally, it is wired into the virtual address space for which the request is being made. Some kernel data structures keep track of this, and of course it is punched into the page tables so the MMU makes it happen.
vmalloc is similar in principle to mmap, but far simpler because it doesn't deal with multiple backends (devices in the filesystem with a mmap virtual function) and doesn't deal with issues like coalescing and splitting of mappings that mmap allows. The vmalloc area consists of a reserved range of virtual addresses visible only to the kernel (whose base address is architecture-dependent and can be tweaked at kernel compile time). The vmalloc allocator carves out this virtual space, and populates it with pages from get_free_pages. These need not be contiguous, and so can be obtained one at a time, and wired into the allocated virtual space.
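As a small sketch (kernel context) of the allocator just described: vmalloc() hands back a virtually contiguous range that may be backed by physically scattered pages, each obtained from the page allocator individually:

```c
#include <linux/errno.h>
#include <linux/vmalloc.h>

static int *big_table;	/* virtually contiguous, physically scattered */

static int make_table(size_t nents)
{
	big_table = vmalloc(nents * sizeof(*big_table));
	return big_table ? 0 : -ENOMEM;
}

static void drop_table(void)
{
	vfree(big_table);	/* vfree(NULL) is a no-op */
}
```

Compare with kmalloc(), which would have to find a physically contiguous run of pages for the same request.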
Highmem pages are physical memory that is not addressable in the kernel's linear map representing physical memory. Highmem exists because the kernel's linear "window" on physical memory isn't always large enough to cover all of memory. (E.g. suppose you have a 1GB window, but 4GB of RAM.) So, for coverage of all of memory, there is, in addition to the linear map, a smaller "non-linear" map where pages are selectively made visible on a temporary basis using kmap and kunmap. Placement of a page into this view is considered the acquisition of a precious resource that must be used sparingly and released as soon as possible.
A highmem page can be installed into a virtual memory map just like any other page, and no special "highmem" handling is needed for that view of the page. Any map: that of a process, or the vmalloc range.
If you're dealing with some virtual memory that could be a mixture of highmem and non-highmem pages, which you have to view through the kernel's linear space, you have to be prepared to use the mapping functions.
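To make that last point concrete, here is a sketch (kernel context) of touching a page that may or may not be highmem through a short-lived mapping:

```c
#include <linux/highmem.h>
#include <linux/string.h>

/*
 * Sketch only: kmap_local_page() is effectively free for a lowmem
 * page (it returns the existing linear address) and sets up a
 * temporary mapping for a highmem one, so the caller need not care
 * which kind of page it was handed.
 */
static void zero_any_page(struct page *page)
{
	void *vaddr = kmap_local_page(page);

	memset(vaddr, 0, PAGE_SIZE);
	kunmap_local(vaddr);
}
```

On kernels before 5.11 the equivalents would be kmap_atomic()/kunmap_atomic() or kmap()/kunmap().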

Can I allocate one large and guaranteed continued range physical memory (100MB)?

Can I allocate one large, guaranteed physically contiguous range of memory (100 MB, consecutive without gaps) on Linux, and if I can, how?
I need to map this contiguous block of memory through a PCI-Express BAR from one CPU (CPU1) to another (CPU2) located behind a PCIe Non-Transparent Bridge.
You don't allocate physical memory in user applications (physical memory only makes sense inside the kernel).
I don't understand whether you are coding a kernel module or some Linux application (e.g. a numerical finite-element code).
Inside applications, you can allocate virtual memory with e.g. mmap(2) (and then you can allocate a big contiguous segment of address space)
I guess that some GPU cards give access to a large amount of GPU memory thru mmap so I believe it is possible to do what you want.
You might be interested in the numa(7) man page. Probably the numa(3) library would give you what you want. Did you also consider Open MPI? See also msync(2) and mlock(2).
From user space there is no guarantee; it depends on your luck.
If you compile your driver into the kernel, you can use mmap and allocate the required amount of memory.
If the memory is needed as storage or for some other purpose not specific to a driver, then you should set the memmap parameter on the boot command line,
e.g. memmap=200M$1700M
This reserves 200 MB of memory starting at physical address 1700 MB, hiding it from the kernel's allocator.
Later it can even be used to back a filesystem ;)
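A sketch (kernel module context) of how a driver might then claim the range hidden by the memmap=200M$1700M example above. The addresses mirror that example and are purely illustrative; on recent kernels memremap() is usually preferred over ioremap() for ordinary RAM:

```c
#include <linux/errno.h>
#include <linux/io.h>

/* Matches the boot parameter memmap=200M$1700M from the example. */
#define RESERVED_BASE	(1700UL << 20)	/* physical start: 1700 MiB */
#define RESERVED_SIZE	(200UL << 20)	/* length:          200 MiB */

static void __iomem *reserved;

static int claim_reserved(void)
{
	/* The page allocator never hands this range out, so the
	 * driver can map it for its own use. */
	reserved = ioremap(RESERVED_BASE, RESERVED_SIZE);
	return reserved ? 0 : -ENOMEM;
}

static void release_reserved(void)
{
	if (reserved)
		iounmap(reserved);
}
```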

what does 'low memory' mean in linux

Hi, I'm Korean and am getting a little confused by the sentence "The boot program first copies itself to a fixed high-memory address to free up low memory for the operating system".
What I found about low memory by googling is that it is the first 640 KB of memory on a DOS system. Does this mean all of the OS (like the kernel) goes into low memory (640 KB)?
Thanks for reading this.
This link could be helpful: Virtual Memory
Mainly,
On 32-bit systems, memory is now divided into "high" and "low" memory. Low memory continues to be mapped directly into the kernel's address space, and is thus always reachable via a kernel-space pointer. High memory, instead, has no direct kernel mapping. When the kernel needs to work with a page in high memory, it must explicitly set up a special page table to map it into the kernel's address space first. This operation can be expensive, and there are limits on the number of high-memory pages which can be mapped at any particular time.
This question on unix.stackexchange is a little more in-depth: High and low memory

why do we need zone_highmem on x86?

In the Linux kernel, mem_map is the array which holds all the "struct page" descriptors. Those pages include the 128 MiB of lowmem used for dynamically mapping highmem.
Since the lowmem size is 1 GiB, the mem_map array has only 1 GiB / 4 KiB = 256 Ki entries. If each entry is 32 bytes, then the mem_map size is 8 MiB. But if we used mem_map to cover all 4 GiB of physical memory (if we had that much available on x86-32), the mem_map array would occupy 32 MiB, which is not a lot of kernel memory (or am I wrong?).
So my question is: why do we need to use that 128 MiB of lowmem for indirect highmem mapping in the first place? Or, put another way, why not map all of that (up to) 4 GiB of physical memory (if available) into the kernel space directly?
Note: if my understanding of the kernel source above is wrong, please correct it. Thanks!
Look Here: http://www.xml.com/ldd/chapter/book/ch13.html
Kernel low memory is the 'real' memory map, addressed with 32-bit pointers on x86.
Kernel high memory is the 'virtual' memory map, addressed with virtual structures on x86.
You don't want to map it all into the kernel address space, because you can't always address all of it, and you need most of your memory for virtual memory segments (virtual, page-mapped process space.)
At least, that's how I read it. Wow, that's a complicated question you asked.
To throw more confusion, chapter 13 talks about some PCI devices not being able to address the 32-bit space, which was the genesis of my previous comment:
On x86, some kernel memory usage is limited to the first gigabyte of memory because of DMA addressing concerns. I'm not 100% familiar with the topic, but there's a compatibility mode for DMA on the PCI bus. That may be what you are looking at.
3.6 GB is not the ceiling when using Physical Address Extension (PAE), which is commonly available on most modern x86 boards, especially with memory hotplug.
Or put another way, why not to map all those max 4GiB physical
memory(if available) in the kernel space directly?
One reason is userspace: every userspace process has its own virtual address space. Suppose you have 4 GB of RAM on x86. If we say the kernel owns 1 GB of it (~800 MB directly mapped + ~200 MB for vmalloc), the other ~3 GB must be dynamically distributed among the processes running in user space. So how could you map your 4 GB directly when there are several address spaces?
why do we need zone_highmem on x86?
The reason is the same. The kernel reserves only ~800 MB for low memory. All other memory is allocated and connected to particular virtual addresses only on demand. For example, if you execute a binary, a new virtual address space is created and pages are allocated for storing your binary's code and data (heap, stack, ...). So the key role of high memory is to serve dynamic memory allocation requests; you never know in advance what userspace will trigger...
