Memory unutilized in 32-bit systems? - linux

The address space for a 32-bit system is 0x00000000 to 0xffffffff. From what I understand, this address space is split among system memory (RAM), ROM and memory-mapped peripherals. If the entire address space were used to address the 4GB of RAM, every RAM byte would be accessible. But since the address space is shared with memory-mapped peripherals, does this mean that some RAM will be unaddressable/unutilized?

Here is the memory map of a typical x86 system. As you can see, the lower ranges of memory are riddled with BIOS and ROM data with small gaps in between, and a substantial portion of the upper ranges is reserved for memory-mapped devices. All of these details may vary between platforms, and it is nothing short of a nightmare to detect which memory areas can be safely used.
The kernel also typically reserves a large portion of the available memory for its internals, buffers and cache.
With virtual addressing, the kernel can present the address space to user space as one consistent, gapless memory range, even though that is not true of the physical memory behind the scenes. So yes: on a 32-bit machine fitted with a full 4GB of RAM, the RAM that sits behind the memory-mapped device window is typically either lost or remapped by the chipset above the 4GB mark, where it then takes PAE to reach.
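If you want to see the physical memory map on your own machine, Linux exposes it in /proc/iomem: the "System RAM" entries are the usable RAM, and everything else (reserved ranges, ACPI tables, PCI device windows) makes up the rest of the picture described above. Below is a minimal sketch of my own that sums up the usable RAM; run it as root, since unprivileged reads of /proc/iomem show zeroed addresses.

    /* iomem_ram.c - sum the "System RAM" ranges in /proc/iomem to see how
     * much of the physical address space is actually usable RAM (the rest
     * is ROM, ACPI data, PCI device windows, etc.). Run as root. */
    #include <stdio.h>
    #include <string.h>
    #include <inttypes.h>

    int main(void)
    {
        FILE *f = fopen("/proc/iomem", "r");
        if (!f) { perror("/proc/iomem"); return 1; }

        char line[256];
        uint64_t total = 0;
        while (fgets(line, sizeof line, f)) {
            uint64_t start, end;
            /* top-level lines look like: "00100000-bffdffff : System RAM" */
            if (sscanf(line, "%" SCNx64 "-%" SCNx64, &start, &end) == 2 &&
                strstr(line, "System RAM"))
                total += end - start + 1;
        }
        fclose(f);

        printf("Usable RAM reported in /proc/iomem: %" PRIu64 " MiB\n",
               total >> 20);
        return 0;
    }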

Related

How much memory does a 64-bit Linux Kernel take up?

The address space for x86-64 is huge, even though mainly 48-bit addresses are used.
On 32-bit x86 machines it was pretty clear how much RAM the kernel took up. Generally, around 1 GB of ZONE_NORMAL sat at the bottom of memory, while everything above that 1 GB in PHYSICAL (not virtual) addresses was ZONE_HIGHMEM (for user space). This would be a 3:1 split. Of course, we can have configurations where the split is 1:3, 2:2, etc. (by changing the VMSPLIT config option).
How much memory in RAM is for kernel space for 64-bit kernels?
I know PAGE_OFFSET is set to a value far above physically addressable memory on x86-64 (for both 4- and 5-level paging). PAGE_OFFSET there just describes the split in virtual address space, not physical (with 4-level paging, PAGE_OFFSET is ffff888000000000).
So does 1 GB of memory house kernel space? 2 GB? 3? Are there variables or macros that describe the size? Is it calculated?
Each user-space process can use its own 2^47 bytes (128 TiB) of virtual address space. Or more on a system with PML5 support.
The available physical RAM to back those pages is the total size of physical RAM, minus maybe 30 MiB or so that the kernel needs for its own code/data. (Not including the pagecache: Linux will use any spare pages as buffers and disk cache). This is mostly unrelated to virtual address-space limits.
1 GB is how much virtual address space a 32-bit kernel used up, not how much physical RAM.
The address-space question mattered for how much memory a single process could use at the same time, but the kernel can still use all your RAM for caching file data, etc. Unless you're finding the 2^(48-1) or 2^(57-1) bytes of the low half virtual address-space range cramped, there's no equivalent problem.
See the kernel's Documentation/x86/x86-64/mm.txt for the x86-64 virtual memory map. Also see "Why 4-level paging can only cover 64 TiB of physical address" regarding x86-64 Linux not doing inconvenient HIGHMEM stuff: the entire high half of virtual address space is reserved for the kernel, and it maps all the RAM because it's a kernel.
Virtual address space usage does indirectly set a 64 TiB limit on how much physical RAM the kernel can use, but if you have less than that there's no effect. Just like how a 32-bit kernel wasn't a problem if your machine had less than 1 or 2 GiB of RAM.
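To see how decoupled virtual address space is from physical RAM, a user-space process can reserve far more virtual memory than the machine has installed; nothing is backed by physical pages until it is actually touched. A minimal sketch (the 1 TiB figure is arbitrary, a 64-bit build and the usual overcommit settings are assumed):

    /* reserve_virtual.c - reserve 1 TiB of virtual address space without
     * consuming physical RAM. MAP_NORESERVE skips swap accounting; pages
     * are only allocated if and when they are written to. */
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t size = 1ULL << 40;           /* 1 TiB of virtual space */
        void *p = mmap(NULL, size, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap");
            return 1;
        }
        printf("Reserved 1 TiB of virtual address space at %p\n", p);
        /* Check RSS in /proc/self/status at this point: it stays tiny,
         * because no physical pages back this mapping yet. */
        munmap(p, size);
        return 0;
    }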
The amount of physical RAM actually reserved by the kernel depends on build options and modules, but might be something like 16 to 32 MiB.
Check dmesg output and look for something like this kernel log line, taken from an x86-64 5.16.3-arch1 kernel's old boot log:
Memory: 32538176K/33352340K available (14344K kernel code, 2040K rwdata, 8996K rodata, 1652K init, 4336K bss, 813904K reserved, 0K cma-reserved)
Don't count the init (freed after boot) or reserved parts; I'm pretty sure Linux doesn't actually reserve ~800 MiB in a way that makes it unusable for anything else.
Also look for the later Freeing unused decrypted memory: 2036K / Freeing unused kernel image (initmem) memory: 1652K etc. (That's the same size as the init part listed earlier, which is why you don't have to count it.)
It might also dynamically allocate some memory during startup; that initial "Memory:" line is just the sum of the static code+data sizes (its .text, .data, and .bss sections).
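For a rough runtime view of how much physical RAM kernel-side allocations are currently using, you can also add up a few counters from /proc/meminfo (Slab, KernelStack, PageTables). This is only a ballpark for comparison with the dmesg line above; it ignores the static kernel image and vmalloc'd data. A sketch:

    /* kernel_mem.c - rough estimate of physical RAM used by kernel-side
     * allocations, from a few /proc/meminfo counters. */
    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        FILE *f = fopen("/proc/meminfo", "r");
        if (!f) { perror("/proc/meminfo"); return 1; }

        const char *fields[] = { "Slab:", "KernelStack:", "PageTables:" };
        char line[128];
        long total_kb = 0;

        while (fgets(line, sizeof line, f)) {
            for (size_t i = 0; i < sizeof fields / sizeof fields[0]; i++) {
                long kb;
                if (!strncmp(line, fields[i], strlen(fields[i])) &&
                    sscanf(line + strlen(fields[i]), "%ld", &kb) == 1) {
                    printf("%-13s %8ld kB\n", fields[i], kb);
                    total_kb += kb;
                }
            }
        }
        fclose(f);

        printf("%-13s %8ld kB (~%ld MiB)\n", "total:", total_kb, total_kb >> 10);
        return 0;
    }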
On 64-bit systems, the only limitation is how much physical memory the kernel can use. The kernel will map all the available RAM, and user-space applications can get access to as much as the kernel can provide, while the kernel keeps enough for itself to operate.

Why is contiguous memory allocation required in linux?

GPUs and VPUs need contiguous memory.
CMA and static memory allocation are examples of contiguous memory allocation.
Why is contiguous memory required here?
Contiguous memory allocation (CMA) is needed for I/O devices that can only work with contiguous ranges of physical memory. Such devices are built that way in order to simplify their design.
On systems with an I/O memory management unit (IOMMU), this would not be an issue because a buffer that is contiguous in the device address space can be mapped by the IOMMU to non-contiguous regions of physical memory. Also some devices can do scatter/gather DMA (i.e., can read/write from/to multiple non-contiguous buffers). Ideally, all I/O devices should be designed to either work behind an IOMMU or should be capable of scatter/gather DMA. Unfortunately, this is not the case and there are devices that require physically contiguous buffers. There are two ways for a device driver to allocate a contiguous buffer:
The device driver can allocate a chunk of physical memory at boot-time. This is reliable because most of the physical memory would be available at boot-time. However, if the I/O device is not used, then the allocated physical memory is just wasted.
A chunk of physical memory can be allocated on demand, but it may be difficult to find a contiguous free range of the required size. The advantage, though, is that memory is only allocated when needed.
CMA solves this exact problem by providing the advantages of both of these approaches with none of their downsides. The basic idea is to make it possible to migrate allocated physical pages to create enough space for a contiguous buffer. More information on how CMA works can be found here.
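For driver authors, the usual interface for such buffers is the DMA API; when the kernel is built with CONFIG_DMA_CMA, large coherent allocations are typically satisfied from the CMA region by migrating movable pages out of the way. Here is a hedged sketch of a probe/remove pair for a hypothetical platform device (the names and the 8 MiB size are made up for illustration):

    /* Sketch: allocating a physically contiguous DMA buffer in a driver. */
    #include <linux/module.h>
    #include <linux/platform_device.h>
    #include <linux/dma-mapping.h>

    #define EXAMPLE_BUF_SIZE  (8 * 1024 * 1024)   /* 8 MiB, arbitrary */

    static void *cpu_addr;          /* kernel virtual address of the buffer */
    static dma_addr_t dma_handle;   /* bus address the device will use */

    static int example_probe(struct platform_device *pdev)
    {
        /* Returns memory that is contiguous in the device's address space;
         * without an IOMMU that means physically contiguous. */
        cpu_addr = dma_alloc_coherent(&pdev->dev, EXAMPLE_BUF_SIZE,
                                      &dma_handle, GFP_KERNEL);
        if (!cpu_addr)
            return -ENOMEM;

        dev_info(&pdev->dev, "DMA buffer at bus address %pad\n", &dma_handle);
        return 0;
    }

    static int example_remove(struct platform_device *pdev)
    {
        dma_free_coherent(&pdev->dev, EXAMPLE_BUF_SIZE, cpu_addr, dma_handle);
        return 0;
    }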

Why does high-memory not exist for 64-bit cpu?

I am trying to understand the high memory problem for 32-bit CPUs running Linux. Why is there no high-memory problem for 64-bit CPUs?
In particular, how is the division of virtual memory into kernel space and user space changed, so that the requirement of high memory doesn't exist for 64-bit cpu?
Thanks.
A 32-bit system can only address 4GB of memory. In Linux this is divided into 3GB of user space and 1GB of kernel space. That 1GB of kernel address space is not enough to permanently map all of the physical memory, so the kernel has to map and unmap areas of "high" memory on demand, which incurs a fairly significant performance penalty; hence the "high memory problem".
A 64-bit system can address a huge amount of memory (16 EB), so this issue does not occur there.
With 32-bit addresses, you can only address 2^32 bytes of memory (4GB). So if you have more than that, you need to address it in some special way. With 64-bit addresses, you can address 2^64 bytes of memory without special effort, and that number is way bigger than all the memory that exists on the planet.
That number of bits refers to the word size of the processor. Among other things, the word size is the size of a memory address on your machine. The size of a memory address determines how many bytes can be referenced uniquely. Doing some simple math, we find that a 32-bit system has at most 2^32 = 4294967296 memory addresses, meaning a mathematical limit of about 4GB of RAM.
On a 64-bit system, however, you have 2^64 ≈ 1.8446744e+19 memory addresses available. This means that your computer can theoretically reference almost 20 exabytes of RAM, which is more RAM than anyone has ever needed in the history of computing.
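The arithmetic from the answers above, spelled out as a quick check:

    /* addr_space.c - the address-space arithmetic from the answers above. */
    #include <stdio.h>

    int main(void)
    {
        unsigned long long b32 = 1ULL << 32;   /* 32-bit: 4 GiB            */
        unsigned long long b48 = 1ULL << 48;   /* x86-64, 4-level: 256 TiB */

        printf("2^32 bytes = %llu (= %llu GiB)\n", b32, b32 >> 30);
        printf("2^48 bytes = %llu (= %llu TiB)\n", b48, b48 >> 40);
        /* 2^64 does not fit in a 64-bit integer; it is
         * 18446744073709551616 bytes, i.e. 16 EiB - the theoretical
         * limit of a full 64-bit address. */
        printf("2^64 bytes = 16 EiB\n");
        return 0;
    }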

How does the Linux kernel decide which memory zone to use?

When I check pagetypeinfo
cat /proc/pagetypeinfo
I see three types of memory zones:
DMA
DMA32
Normal
How does Linux choose a memory zone to allocate a new page from?
The zone boundaries described below are for 32-bit systems; 64-bit kernels have no ZONE_HIGHMEM (which is why your /proc/pagetypeinfo shows DMA, DMA32 and Normal instead).
Remember, we are talking about the main memory as the kernel sees it. In a 32-bit (4GB) system, the virtual address split between kernel and user space is 1:3, meaning the kernel gets 1GB and user space 3GB. The kernel's 1GB is split as follows:
ZONE_DMA (0-16MB): permanently mapped into the kernel address space.
For compatibility with older ISA devices that can address only the lower 16MB of main memory.
ZONE_NORMAL (16MB-896MB): permanently mapped into the kernel address space.
Many kernel operations can only take place using ZONE_NORMAL, so it is the most performance-critical zone and is the memory mostly allocated by the kernel.
ZONE_HIGHMEM (896MB and above): not permanently mapped into the kernel's address space.
The kernel can reach the entire 4GB of main memory: the first ~896MB directly through ZONE_DMA and ZONE_NORMAL, and everything above that through temporary mappings of ZONE_HIGHMEM. With Intel's Physical Address Extension (PAE), four extra address bits are available, giving 36-bit physical addresses and up to 64GB of addressable memory; all of that additional physical memory also falls into ZONE_HIGHMEM.
Read more:
http://www.quora.com/Linux-Kernel/Why-is-there-ZONE_HIGHMEM-in-the-x86-32-Linux-kernel-but-not-in-the-x86-64-kernel
http://www.quora.com/Linux-Kernel/What-is-the-difference-between-high-memory-and-normal-memory
Linux 3/1 virtual address split
For every memory allocation request (e.g., via kmalloc), the kernel selects the memory zone based on the flags passed to the function. These requests internally trigger the kernel function alloc_pages().
zonelist is an argument that gets passed to alloc_pages(); it points to a zonelist data structure describing, in order of preference, the memory zones suitable for the memory allocation.
Refer to the memory management chapter in the book Understanding the Linux Kernel.
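As a concrete illustration, here is a hedged sketch of how a caller's GFP flags steer an allocation toward a particular zone on a 32-bit x86 kernel (the sizes and the zone_examples function are made up for illustration):

    /* Sketch: how GFP flags steer an allocation toward a zone.
     * The zone fallback order comes from the zonelist that alloc_pages()
     * walks. */
    #include <linux/slab.h>
    #include <linux/gfp.h>

    static void zone_examples(void)
    {
        /* GFP_KERNEL: normal kernel allocation, served from ZONE_NORMAL
         * (with fallback per the zonelist). */
        void *buf = kmalloc(4096, GFP_KERNEL);

        /* GFP_DMA: restrict the allocation to ZONE_DMA (first 16MB on x86),
         * for old ISA-style devices that can only address that range. */
        void *isa_buf = kmalloc(512, GFP_KERNEL | GFP_DMA);

        /* GFP_HIGHUSER: the page may come from ZONE_HIGHMEM (32-bit only);
         * it has no permanent kernel mapping, so it must be kmap()'d
         * before the kernel touches it. */
        struct page *page = alloc_pages(GFP_HIGHUSER, 0);

        kfree(buf);
        kfree(isa_buf);
        if (page)
            __free_pages(page, 0);
    }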

why do we need zone_highmem on x86?

In the Linux kernel, mem_map is the array which holds all "struct page" descriptors. Those pages include the 128MiB of lowmem used for dynamically mapping highmem.
Since the lowmem size is 1GiB, the mem_map array has only 1GiB/4KiB = 256Ki entries. If each entry is 32 bytes, then mem_map takes 8MiB of memory. But if we could use mem_map to map all 4GiB of physical memory (if we had that much physical memory available on x86-32), then the mem_map array would occupy 32MiB, which is not a lot of kernel memory (or am I wrong?).
So my question is: why do we need to use that 128MiB of lowmem for indirect highmem mapping in the first place? Or, put another way, why not map all of that (at most 4GiB of) physical memory directly into the kernel space?
Note: if my understanding of the kernel source above is wrong, please correct. Thanks!
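For reference, the sizing arithmetic from the question as a quick calculation (the 32-byte struct page size is the question's assumption; on 64-bit kernels it is 64 bytes):

    /* memmap_size.c - size of the mem_map array for a given amount of RAM,
     * assuming 4 KiB pages and a 32-byte struct page (the question's
     * assumption). */
    #include <stdio.h>

    int main(void)
    {
        const unsigned long long page_size   = 4096;  /* 4 KiB pages */
        const unsigned long long struct_page = 32;    /* bytes per entry */

        unsigned long long ram[] = { 1ULL << 30, 4ULL << 30 };  /* 1 GiB, 4 GiB */
        for (int i = 0; i < 2; i++) {
            unsigned long long entries = ram[i] / page_size;
            printf("%llu GiB RAM -> %llu mem_map entries -> %llu MiB of mem_map\n",
                   ram[i] >> 30, entries, (entries * struct_page) >> 20);
        }
        return 0;
    }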
Look Here: http://www.xml.com/ldd/chapter/book/ch13.html
Kernel low memory is the 'real' memory map, addressed with 32-bit pointers on x86.
Kernel high memory is the 'virtual' memory map, addressed with virtual structures on x86.
You don't want to map it all into the kernel address space, because you can't always address all of it, and you need most of your memory for virtual memory segments (virtual, page-mapped process space.)
At least, that's how I read it. Wow, that's a complicated question you asked.
To add to the confusion, chapter 13 talks about some PCI devices not being able to address the 32-bit space, which was the genesis of my previous comment:
On x86, some kernel memory usage is limited to the first gigabyte of memory because of DMA addressing concerns. I'm not 100% familiar with the topic, but there's a compatibility mode for DMA on the PCI bus. That may be what you are looking at.
3.6 GB is not the ceiling when using physical address extension, which is commonly needed on most modern x86 boards, especially with memory hotplug.
Or put another way, why not map all that (at most 4GiB of) physical memory in the kernel space directly?
One reason is user space: every user-space process has its own virtual address space. Suppose you have 4GB of RAM on x86. If we say the kernel owns 1GB of it (~800MB directly mapped + ~200MB vmalloc), all the other ~3GB has to be dynamically distributed between the processes running in user space. So how could you map your 4GB directly when you have several address spaces?
why do we need zone_highmem on x86?
The reason is the same. The kernel reserves only ~800MB for low mem. All other memory is allocated and bound to a particular virtual address only on demand. For example, when you execute a binary, a new virtual address space is created and some pages are allocated for storing your binary's code and data (heap, stack, ...). So the key role of high mem is to serve dynamic memory allocation requests; you never know in advance what will be triggered by user space...
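On the kernel side, that on-demand access is what the ~128MiB mapping window mentioned in the question is for: a ZONE_HIGHMEM page gets a temporary kernel mapping only while the kernel actually needs to touch it. A hedged sketch using the classic kmap interface (largely superseded by kmap_local_page in recent kernels; touch_highmem_page is a made-up name):

    /* Sketch: temporarily mapping a ZONE_HIGHMEM page into the kernel's
     * ~128MiB mapping window on 32-bit x86. The mapping only exists between
     * kmap() and kunmap(); the page has no permanent kernel address. */
    #include <linux/gfp.h>
    #include <linux/highmem.h>
    #include <linux/string.h>

    static void touch_highmem_page(void)
    {
        struct page *page = alloc_pages(GFP_HIGHUSER, 0);  /* may be in highmem */
        if (!page)
            return;

        void *vaddr = kmap(page);       /* borrow a slot in the kmap window   */
        memset(vaddr, 0, PAGE_SIZE);    /* now the kernel can access the page */
        kunmap(page);                   /* release the slot for other pages   */

        __free_pages(page, 0);
    }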

Resources