Why can a 32-bit processor only address 4 GiB of memory, even with a large word size? - memory-address

Until now I thought that a 32-bit processor can use 4 GiB of memory because 2^32 bytes is 4 GiB, but that reasoning assumes the processor has a word size of 1 byte. A processor with a 32-bit program counter can then address 2^32 different memory words, and hence we have 4 GiB.
But what if a processor has a word size larger than 1 byte, which I believe is the case with most processors nowadays? (My understanding is that word size is equal to the width of the data bus, so a processor with a 64-bit data bus must have a word size of 8 bytes.)
Now the same processor with a 32-bit program counter can still address 2^32 different memory words, but each word is now 8 bytes, so it should be able to address more memory, which contradicts the 4 GiB figure. What is wrong with my argument?

Your premise is incorrect. 32-bit architectures can address more than 4GB of memory, just like most (if not all) 8-bit microcontrollers can use more than 256 bytes of memory. Indeed a 32-bit program counter can address 2^32 different memory locations, but word-addressable memory is only used in architectures for very special purposes like DSPs, or in antique architectures of the past. Modern architectures for general computing all use byte-addressable memory.
See Why byte-addressable memory and not 4-byte-addressable memory?
Even in 32-bit byte-addressable architectures there are many ways to access more than 4GB of memory. For example, the 64-bit JVM can address 32GB of memory with 32-bit pointers using compressed Oops. See the Trick behind JVM's compressed Oops.
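The compressed-oops trick works because Java objects are 8-byte aligned: the low 3 bits of every object offset are zero, so a 32-bit reference scaled by 8 can cover 32GB relative to a heap base. A minimal sketch of that encoding in C (`heap_base`, `compress` and `decompress` are illustrative names, not the JVM's actual code):

```c
#include <stdint.h>
#include <stdio.h>

/* Illustrative heap base; the real JVM picks this when it reserves the heap. */
static uint8_t *heap_base;

/* Objects are 8-byte aligned, so the low 3 bits of any offset into the heap
 * are zero. Dropping them lets a 32-bit reference cover 2^32 * 8 = 32 GB. */
static uint32_t compress(void *obj) {
    uintptr_t offset = (uintptr_t)((uint8_t *)obj - heap_base);
    return (uint32_t)(offset >> 3);
}

static void *decompress(uint32_t ref) {
    return heap_base + ((uintptr_t)ref << 3);
}

int main(void) {
    static uint64_t fake_heap[1 << 17];   /* 1 MiB of 8-byte-aligned storage */
    heap_base = (uint8_t *)fake_heap;

    void *obj = heap_base + 4096;         /* some 8-byte-aligned "object" */
    uint32_t ref = compress(obj);         /* fits in 32 bits */
    printf("object %p -> compressed 0x%x -> %p\n", obj, ref, decompress(ref));
    return 0;
}
```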
32-bit x86 CPUs can also address 64GB (or more in later versions) of memory with PAE. It basically adds another level of indirection to the page-table walk, along with a few more physical address bits. That allows the whole system to access more than 4GB of memory. However, pointers in applications are still 32 bits long, so each process is still limited to 4GB at most. The analog on ARM is LPAE.
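Concretely, PAE splits the 32-bit linear address as 2 + 9 + 9 + 12 bits (PDPT index, page-directory index, page-table index, page offset) instead of the classic 10 + 10 + 12, and widens the page-table entries so they can hold physical frame numbers beyond 32 bits. A small sketch of that decomposition (the field names are descriptive, not taken from any header):

```c
#include <stdint.h>
#include <stdio.h>

/* Split a 32-bit linear address the way PAE paging does:
 * 2-bit PDPT index, 9-bit page-directory index, 9-bit page-table index,
 * 12-bit offset into the 4 KiB page. The physical frame number stored in
 * the (now 64-bit wide) page-table entry can exceed 32 bits, which is how
 * the system as a whole reaches beyond 4 GB of physical memory. */
int main(void) {
    uint32_t linear = 0xBFC01234;

    unsigned pdpt_index = (linear >> 30) & 0x3;    /* bits 31-30 */
    unsigned pd_index   = (linear >> 21) & 0x1FF;  /* bits 29-21 */
    unsigned pt_index   = (linear >> 12) & 0x1FF;  /* bits 20-12 */
    unsigned offset     =  linear        & 0xFFF;  /* bits 11-0  */

    printf("linear 0x%08x -> PDPT %u, PD %u, PT %u, offset 0x%03x\n",
           linear, pdpt_index, pd_index, pt_index, offset);
    return 0;
}
```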
The 4GB address space of each process is often split into user and kernel space (at least before Meltdown), which limits the usable user memory even more. There are several ways to work around this:
Spawning multiple processes, which is used in Adobe Premiere CS4
Mapping the needed part of memory into the current address space, like Address Windowing Extensions on Windows (see the sketch after this list)
...
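On Linux, the moral equivalent of that windowing approach is to keep the data in a file (or shared-memory object) and mmap only a window of it at a time. A rough sketch, assuming a large file named big.dat already exists (the file name and window size are arbitrary):

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

/* Map a 256 MiB window of a file that may be much larger than the address
 * space, sliding the window instead of mapping the whole file at once. */
int main(void) {
    const size_t window = 256UL << 20;            /* 256 MiB window */
    int fd = open("big.dat", O_RDONLY);           /* assumed to be many GB */
    if (fd < 0) { perror("open"); return 1; }

    for (off_t pos = 0; pos < (off_t)(4 * window); pos += window) {
        unsigned char *p = mmap(NULL, window, PROT_READ, MAP_PRIVATE, fd, pos);
        if (p == MAP_FAILED) { perror("mmap"); break; }
        /* ... work on p[0 .. window-1], i.e. file offsets pos .. pos+window ... */
        printf("window at file offset %lld mapped at %p, first byte 0x%02x\n",
               (long long)pos, (void *)p, p[0]);
        munmap(p, window);                        /* slide the window */
    }
    close(fd);
    return 0;
}
```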

A CPU (at least the 32-bit x86 family) must be able to access any byte/word/dword in the 4GB space. So instructions are encoded in such a way that the operand size and the memory address (usually) live in different bit-fields. It doesn't matter whether the CPU accesses a byte or a dword; the encoded memory address is the same.
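In other words, an address names a byte, and the operand size only decides how many bytes starting at that address are touched. A quick C illustration (the byte order printed depends on the machine's endianness):

```c
#include <stdint.h>
#include <stdio.h>

int main(void) {
    uint32_t value = 0x11223344;
    unsigned char *p = (unsigned char *)&value;   /* same address, byte view */

    /* One dword access and four byte accesses, all starting at &value:
     * the address is identical, only the operand size differs. */
    printf("dword at %p: 0x%08x\n", (void *)&value, value);
    printf("bytes at %p: %02x %02x %02x %02x\n",
           (void *)p, p[0], p[1], p[2], p[3]);
    return 0;
}
```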
Note that a 32-bit OS on an x86 CPU is technically able to access more than 4GB of physical memory using PAE mode. But it is not supported by, say, the current Windows OS family (except Server editions). Some versions of WinXP, as well as Linux and other 32-bit OSes, can address 64GB of memory on an x86 CPU.
Also, the OS usually reserves some part of the virtual address space (for the OS kernel, video memory, etc.), so user programs may use, say, no more than 3 GB of the 4GB each process can address.
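A quick way to see that split from user space is to print a few addresses in a 32-bit build: with Linux's default 3G/1G layout they all come out below 0xC0000000 (just a sketch; compile with -m32 on x86, and the numbers will of course differ on a 64-bit build):

```c
#include <stdio.h>
#include <stdlib.h>

static int in_data = 42;              /* static data segment */

int main(void) {
    int on_stack = 0;
    void *on_heap = malloc(16);

    /* With Linux's default 3G/1G split, all user-space addresses in a
     * 32-bit process stay below 0xC0000000 (3 GB); the top 1 GB of each
     * process's address space belongs to the kernel. */
    printf("data  : %p\n", (void *)&in_data);
    printf("heap  : %p\n", on_heap);
    printf("stack : %p\n", (void *)&on_stack);

    free(on_heap);
    return on_stack;
}
```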

Related

How much memory does a 64bit Linux Kernel take up?

The address space is huge for the x86-64 even though 48-bit addresses are mainly used.
On 32-bit x86 machines it was pretty clear how much RAM the kernel took up. Generally around 1 GB of ZONE_NORMAL sits at the bottom of memory, while everything above that 1GB in PHYSICAL (not virtual) addresses was ZONE_HIGHMEM (for user space). This would be a 3:1 split. Of course we can have configurations where the split is 1:3, 2:2, etc. (by changing VM_SPLIT).
How much memory in RAM is for kernel space for 64 bit kernels?
I know PAGE_OFFSET is set to a value far above physically addressable memory on x64 (for both 48 and 56). PAGE_OFFSET on x64 just describes the split in virtual address space, not physical (with 48-bit addresses PAGE_OFFSET is ffff888000000000).
So does 1 GB of memory house kernel space? 2GB? 3GB? Are there variables or macros that describe the size? Is it calculated?
Each user-space process can use its own 2^47 bytes (128 TiB) of virtual address space. Or more on a system with PML5 support.
The available physical RAM to back those pages is the total size of physical RAM, minus maybe 30 MiB or so that the kernel needs for its own code/data. (Not including the pagecache: Linux will use any spare pages as buffers and disk cache). This is mostly unrelated to virtual address-space limits.
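One way to see how loosely virtual address space is coupled to physical RAM is to reserve far more address space than the machine has memory; untouched PROT_NONE mappings consume no physical pages. A sketch (assumes a 64-bit Linux build):

```c
#define _DEFAULT_SOURCE
#include <stdio.h>
#include <sys/mman.h>

int main(void) {
    /* Reserve 1 TiB of virtual address space -- far more than the RAM in
     * most machines. No physical pages are allocated until something is
     * actually touched (and PROT_NONE mappings can't be touched at all). */
    size_t len = (size_t)1 << 40;   /* 1 TiB */
    void *p = mmap(NULL, len, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    printf("reserved 1 TiB of address space at %p\n", p);
    munmap(p, len);
    return 0;
}
```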
The 1G figure is how much virtual address space a 32-bit kernel used up, not how much physical RAM.
The address-space question mattered for how much memory a single process could use at the same time, but the kernel can still use all your RAM for caching file data, etc. Unless you're finding the 2^(48-1) or 2^(57-1) bytes of the low half virtual address-space range cramped, there's no equivalent problem.
See the kernel's Documentation/x86/x86-64/mm.txt for the x86-64 virtual memory map. See also Why 4-level paging can only cover 64 TiB of physical address regarding why x86-64 Linux doesn't need the inconvenient HIGHMEM stuff: the entire high half of virtual address space is reserved for the kernel, and it maps all the RAM because it's the kernel.
Virtual address space usage does indirectly set a 64 TiB limit on how much physical RAM the kernel can use, but if you have less than that there's no effect. Just like how a 32-bit kernel wasn't a problem if your machine had less than 1 or 2 GiB of RAM.
The amount of physical RAM actually reserved by the kernel depends on build options and modules, but might be something like 16 to 32 MiB.
Check dmesg output and look for something like this kernel log line, taken from the boot log of an x86-64 5.16.3-arch1 kernel:
Memory: 32538176K/33352340K available (14344K kernel code, 2040K rwdata, 8996K rodata, 1652K init, 4336K bss, 813904K reserved, 0K cma-reserved)
Don't count the init (freed after boot) or reserved parts; I'm pretty sure Linux doesn't actually reserve ~800 MiB in a way that makes it unusable for anything else.
Also look for the later Freeing unused decrypted memory: 2036K / Freeing unused kernel image (initmem) memory: 1652K etc. (That's the same size as the init part listed earlier, which is why you don't have to count it.)
It might also dynamically allocate some memory during startup; the kernel sizes in that initial "Memory:" line are just its .text, .data, and .bss sections, i.e. static code+data sizes.
On 64-bit systems, the only limitation is how much physical memory the kernel can use. The kernel will map all the available RAM, and user-space applications should be able to get access to as much as the kernel can provide while keeping enough for the kernel itself to operate.

Why does high-memory not exist for 64-bit cpu?

I am trying to understand the high-memory problem for 32-bit CPUs and Linux: why is there no high-memory problem for 64-bit CPUs?
In particular, how does the division of virtual memory into kernel space and user space change, so that the requirement for high memory doesn't exist on 64-bit CPUs?
Thanks.
A 32-bit system can only address 4GB of memory. In Linux this is divided into 3GB of user space and 1GB of kernel space. This 1GB is sometimes not enough for the kernel to map all of physical memory permanently, so it needs to map and unmap areas of memory on the fly, which incurs a fairly significant performance penalty. The physical memory that doesn't fit in that permanent 1GB mapping is the "high" memory, hence the name "high memory problem".
A 64-bit system can address a huge amount of memory (16 EB), so this issue does not occur there.
With 32-bit addresses, you can only address 2^32 bytes of memory (4GB). So if you have more than that, you need to address it in some special way. With 64-bit addresses, you can address 2^64 bytes of memory without special effort, and that number is way bigger than all the memory that exists on the planet.
That number of bits refers to the word size of the processor. Among other things, the word size is the size of a memory address on your machine. The size of the memory address determines how many bytes can be referenced uniquely. So doing some simple math we find that on a 32-bit system at most 2^32 = 4294967296 memory addresses exist, meaning you have a mathematical limitation of about 4GB of RAM.
However, on a 64-bit system you have 2^64 = 1.8446744e+19 memory addresses available. This means that your computer can theoretically reference almost 20 exabytes of RAM, which is more RAM than anyone has ever needed in the history of computing.
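For reference, here is how the address-space sizes quoted in these answers work out; a quick calculation (nothing platform-specific about it):

```c
#include <stdio.h>

int main(void) {
    /* 2^32 bytes = 4 GiB    (classic 32-bit pointers)
     * 2^48 bytes = 256 TiB  (x86-64 with 4-level paging; 128 TiB per half)
     * 2^57 bytes = 128 PiB  (x86-64 with 5-level paging)
     * 2^64 bytes = 16 EiB   (a full 64-bit address) */
    printf("2^32 = %llu bytes = %.0f GiB\n", 1ULL << 32, (double)(1ULL << 32) / (1ULL << 30));
    printf("2^48 = %llu bytes = %.0f TiB\n", 1ULL << 48, (double)(1ULL << 48) / (1ULL << 40));
    printf("2^57 = %llu bytes = %.0f PiB\n", 1ULL << 57, (double)(1ULL << 57) / (1ULL << 50));
    printf("2^64 = %.7e bytes = 16 EiB\n", 2.0 * (double)(1ULL << 63));
    return 0;
}
```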

What's the size of the virtual memory the Linux kernel occupies on a 64-bit machine with 48GB of memory?

What's the size of the virtual memory the Linux kernel occupies on a 64-bit machine with 48GB of memory? I know that on a 32-bit machine, the Linux kernel occupies 1GB of virtual memory.
AMD64 uses addresses of "canonical form" (see pages 131-135 here) for implementations that do not implement the full 64 bits. The rationale behind this weird scheme is that it is possible to add more bits in the future as hardware evolves, and the two halves will grow together towards the middle.
Currently, all implementations (i.e. all existing processors) have 48 bit addresses, thus 00000000'00000000--00007FFF'FFFFFFFF, and FFFF8000'00000000--FFFFFFFF'FFFFFFFF are valid address ranges, with 128TB of memory in each half of the usable address space (256TB total).
So that would be 128TB, which is also the maximum per-process address space under Linux under AMD64.
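A 48-bit canonical address is simply one whose top 17 bits (63 down to 47) are all equal, i.e. a sign-extension of the low 48 bits. That is easy to check; a small sketch (`is_canonical48` is just an illustrative name):

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* With 48 implemented bits, an address is canonical iff bits 63..47 are all
 * equal, i.e. the value is a sign-extension of its low 48 bits. */
static bool is_canonical48(uint64_t addr) {
    uint64_t top = addr >> 47;             /* the 17 bits 63..47 */
    return top == 0 || top == 0x1FFFF;     /* all zeros or all ones */
}

int main(void) {
    uint64_t examples[] = {
        0x00007FFFFFFFFFFFULL,   /* top of the low half     -> canonical */
        0xFFFF800000000000ULL,   /* bottom of the high half -> canonical */
        0x0000800000000000ULL,   /* inside the "hole"       -> non-canonical */
    };
    for (size_t i = 0; i < sizeof examples / sizeof examples[0]; i++)
        printf("%#018llx canonical: %s\n",
               (unsigned long long)examples[i],
               is_canonical48(examples[i]) ? "yes" : "no");
    return 0;
}
```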

Virtual addresses for processes using more memory than 4GB

If a process uses 6GB of memory and pointers are 32 bits, how can the 2GB above 4GB be addressed, since pointers hold virtual addresses in Linux?
Is running on 64 bits the only solution? Sorry for the naive question.
Completing Basile's answer, most architectures have extended the physical address space to 36 bits (see Intel's PSE-36, PowerPC's Extended Real Page Number, ...). Therefore, although any single process can only address 4GB of memory through 32-bit pointers, two different processes can address different 4GB regions of a 64GB physical memory address space. This is a way for a 32-bit OS to address up to 64GB of memory (for instance, 32GB for Windows 2003 Server).
As I said in a comment, running on 64 bits is the practical solution. You really don't want to munmap then mmap again large segments on temporary files.
You could change your address space during runtime, but you don't want to do that (except when allocating memory, e.g. thru malloc, which may increase the available space thru mmap).
Changing the address space to get the illusion of a huge memory is a nightmare. Avoid that (you'll spend months debugging hard-to-reproduce bugs). In the 1960s the IBM 1130 did such insane tricks.
Today, computers are cheaper than a developer's time. So just buy a 64-bit processor with 8GB (gigabytes) of RAM.
Several 32-bit processors with the PAE feature are able to use more than 4GB of RAM, but each process still sees at most 4GB (in practice 3GB) of virtual memory.
This is about virtual memory, not Intel-specific segmentation. Current Linux (and other) operating systems use a flat memory model, even on Intel processors.

Memory limit to a 32-bit process running on a 64-bit Linux OS

How much virtual memory can a 32-bit process have on 64-bit Linux
(i.e. how much memory can I allocate and use with malloc() before I start getting a NULL pointer)?
I tried it on my 32-bit Linux and reached a limit of about 3 GB. Will I be able to get more on 64-bit Linux?
In the standard 32-bit x86 smp kernel, each process can use 3GB of the 4GB address space and 1GB is used by the kernel (shared in the address space of each process).
With the 4G/4G split "hugemem" 32-bit x86 kernel, each process can use (almost) the entire 4GB of address space and the kernel has a separate 4GB of address space. This kernel was supported by Red Hat in RHEL 3 and 4, but they dropped it in RHEL 5 because the patch was not accepted into the mainline kernel and most people use 64-bit kernels now anyway.
With the 64-bit x86_64 kernel, a 32-bit process can use the entire 4GB address space, except for a couple pages (8KB) at the end of the 4GB address space which are managed by the kernel. The kernel itself uses a part of the address space that is beyond the 4GB accessible to 32-bit code, so it does not reduce the user address space. A 64-bit process can use much more address space (128TB in RHEL 6).
Note that some of the address space will be used by the program code, libraries, and stack space, so you won't be able to malloc() your entire address space. The size of these things varies by program. Take a look at /proc/<pid>/maps to see how the address space is being used in your process; the amount you can malloc() will be limited by the largest unused address range.
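To see where the limit falls in practice, you can probe it exactly the way the question describes: keep allocating fixed-size chunks until malloc() returns NULL. A sketch (compile with -m32 to test the 32-bit case on a 64-bit system; on a 64-bit build, overcommit can let untouched allocations run far beyond physical RAM, so this measures address space rather than usable memory):

```c
#include <stdio.h>
#include <stdlib.h>

int main(void) {
    const size_t chunk = 64UL << 20;   /* allocate in 64 MiB chunks */
    size_t total = 0;

    for (;;) {
        void *p = malloc(chunk);
        if (p == NULL)
            break;                     /* address space (or commit limit) exhausted */
        total += chunk;
        /* Note: the memory is never touched, so this measures address
         * space, not how much RAM could actually be committed. */
    }
    printf("allocated about %zu MiB before malloc() returned NULL\n",
           total >> 20);
    return 0;
}
```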
As stated above, a 32-bit process on a 32-bit kernel will be able to allocate more or less 3GB of memory, while a 32-bit process on a 64-bit kernel will be able to allocate around 4GB of memory.
A 32-bit process will only be able to access 4GB of virtual memory regardless of the OS. This is because the process can only form 32-bit memory addresses. If you do the math you'll see that 32-bit addresses can only access a maximum of 4GB, even if you're running on a 128-bit OS.
On 64-bit linux, the maximum memory space for a single process is 2^48 bytes. (In theory, more is possible, but current chips do not allow the entire virtual address space of 2^64 bytes to be used.)
See Wikipedia for more information.

Resources