Real use of Physical Address Extension (PAE) - memory-address

I'm a beginner. I came across an article, http://en.wikipedia.org/wiki/Physical_Address_Extension
Though I could partially understand it, I couldn't see the practical advantage of increasing the physical address size. Could anyone kindly explain it? Thank you.

Increasing the physical address space beyond 32 bits allows the operating system to access more than 4GB of memory. However, the virtual address space (pointer size) is still 32 bits, so an application has to use special APIs to change which part of physical memory its virtual address space maps onto. PAE was a stopgap that let applications use more than 4GB of memory on a 32-bit processor.
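On Windows, one such API is Address Windowing Extensions (AWE), which lets a 32-bit process re-point a virtual window at different sets of physical pages. A minimal sketch, with error handling and the required SeLockMemoryPrivilege setup omitted:

    #include <windows.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Reserve a 64 MiB virtual window; MEM_PHYSICAL marks it for AWE. */
        SIZE_T window_bytes = 64 * 1024 * 1024;
        void *window = VirtualAlloc(NULL, window_bytes,
                                    MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);

        SYSTEM_INFO si;
        GetSystemInfo(&si);
        ULONG_PTR pages = window_bytes / si.dwPageSize;
        ULONG_PTR *pfns = malloc(pages * sizeof *pfns);

        /* Grab physical pages (which may live above 4GB on a PAE system),
           then map them into the 32-bit window. */
        AllocateUserPhysicalPages(GetCurrentProcess(), &pages, pfns);
        MapUserPhysicalPages(window, pages, pfns);

        /* The same window can later be re-pointed at other physical pages
           with further MapUserPhysicalPages calls. */
        return 0;
    }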
There is no practical use for PAE today, given the widespread adoption of x86-64, which has 64-bit pointers.

Related

virtual address for processes having more memory than 4GB

If a process uses 6GB of memory and pointers are 32 bits, how can the 2GB above 4GB be addressed, given that pointers hold virtual addresses on Linux?
Is running on 64 bits the only solution? Sorry for the naive question.
Completing Basile's answer, most architectures have extended the physical address space to 36 bits (see Intel's PAE/PSE-36, PowerPC's Extended Real Page Number, ...). Therefore, although any single process can only address 4GB of memory through 32-bit pointers, two different processes can address two different 4GB portions of a 64GB physical address space. This is how a 32-bit OS can address up to 64GB of memory (for instance, 32GB for Windows 2003 Server).
As I said in a comment, running on 64 bits is the practical solution. You really don't want to munmap and then mmap again large segments backed by temporary files.
You could change your address space during runtime, but you don't want to do that (except when allocating memory, e.g. through malloc, which may grow the available space through mmap).
Changing the address space to get the illusion of a huge memory is a nightmare. Avoid it (you'll spend months debugging hard-to-reproduce bugs). In the 1960s the IBM 1130 did such insane tricks.
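For concreteness, the window-remapping trick being warned against looks roughly like this on Linux (a sketch only, using a hypothetical temporary backing file):

    #define _FILE_OFFSET_BITS 64   /* 64-bit off_t even on a 32-bit build */
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <unistd.h>

    #define WINDOW_SIZE (256UL * 1024 * 1024)   /* 256 MiB view into the file */

    /* Replace the mapping at 'window' with a different slice of the file.
       MAP_FIXED reuses the same virtual addresses for the new slice. */
    static void *remap_window(void *window, int fd, off_t file_offset)
    {
        return mmap(window, WINDOW_SIZE, PROT_READ | PROT_WRITE,
                    MAP_SHARED | MAP_FIXED, fd, file_offset);
    }

    int main(void)
    {
        int fd = open("/tmp/bigdata", O_RDWR | O_CREAT, 0600); /* hypothetical backing file */
        ftruncate(fd, (off_t)6 * 1024 * 1024 * 1024);          /* 6GB of backing storage */

        void *window = mmap(NULL, WINDOW_SIZE, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, 0);                 /* view of the first slice */

        /* ... later, slide the window to the slice starting at 4GB ... */
        window = remap_window(window, fd, (off_t)4 * 1024 * 1024 * 1024);
        return 0;
    }

Every pointer into the window silently changes meaning after each remap, which is exactly why this gets painful.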
Today, computers are cheaper than developer time. So just buy a 64-bit processor with 8GB (gigabytes) of RAM.
Several 32-bit processors with the PAE feature can use more than 4GB of RAM, but each process still sees at most 4GB (in practice 3GB) of virtual memory.
This is about virtual memory, not Intel-specific segmentation. Current Linux (and other) operating systems use a flat memory model, even on Intel processors.

great size pointer in gcc

I want to define a large pointer (64-bit or 128-bit) in gcc that does not depend on the platform.
I think there is something like __ptr128 or __ptr64 in MSDN.
sizeof(__ptr128) is 16 bytes.
sizeof(__ptr64) is 8 bytes.
Is it possible?
It can be useful when you have a 32-bit application that uses 32-bit addresses but needs to call a kernel function in a 64-bit OS that takes an 8-byte pointer argument.
Your question makes no sense. Pointers, by definition, hold the memory address of something - the size must depend on the platform. How would you dereference a 128-bit pointer on a hardware platform that supports 64-bit addressing?!
You can create 64 or 128-bit values, but a pointer is directly related to the memory addressing scheme of the underlying hardware.
EDIT
With your additional statement, I think I see what you're trying to do. Unfortunately, I doubt it's possible. If the kernel function you want to use takes a 64-bit pointer argument, it's highly likely to be a 64-bit function (unless you're developing for some unusual hardware).
Even though it's technically possible to mix 64-bit instructions into a 32-bit executable, no compiler will actually let you do this. A 64-bit API call will use 64-bit code, 64-bit registers and a 64-bit stack - it would be extremely awkward for the compiler and operating system to manage arbitrary switching from a 32-bit environment to a 64-bit environment.
You should look at finding the equivalent API for a 32-bit environment. Perhaps you could post the kernel function prototype (name+parameters) you want to use and someone could help you find a better solution.
Just so there's no confusion, __ptr64 in MSDN is not platform independent:
On a 32-bit system, a pointer declared with __ptr64 is truncated to a 32-bit pointer.
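For completeness, __ptr32/__ptr64 are MSVC-specific annotations on a pointer declarator, not portable gcc types; roughly:

    /* MSVC-only; gcc rejects this syntax.  Sizes are for a 64-bit Windows build. */
    int * __ptr64 p64;   /* 8-byte pointer */
    int * __ptr32 p32;   /* 4-byte pointer, zero-extended when used as an address */

As far as I can tell there is no __ptr128 at all.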
Can't comment, but the statement that you can't use 64-bit instructions in a "32-bit executable" is misleading, since the definition of "32-bit executable" is open to interpretation. If you mean an executable that uses 32-bit pointers, nothing says you can't use instructions that manipulate 64-bit values while using 32-bit pointers. The processor doesn't know the difference.
Linux even supports a mode with a 32-bit userspace and a 64-bit kernel. Each app then has access to 4GB of RAM, but the system as a whole can use much more. This keeps pointers down to 4 bytes without restricting 64-bit data manipulation.
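To illustrate the point, this builds and runs fine as a 32-bit executable (gcc -m32) even though it manipulates 64-bit values; the compiler simply emits register-pair arithmetic:

    #include <stdint.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t big = 0x123456789ABCDEF0ULL;   /* 64-bit value, 32-bit pointers */
        uint64_t sum = big + 0x1000000000ULL;   /* 64-bit arithmetic */

        printf("sum = %llu\n", (unsigned long long)sum);
        printf("sizeof(void *) = %zu\n", sizeof(void *));   /* prints 4 with -m32 */
        return 0;
    }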
I'm late to the party, but the question makes quite a lot of sense on embedded platforms.
If you combine a CPU with additional accelerators on the same SoC, they don't necessarily have the same address space or address-space size.
For the firmware on the accelerator you want pointers that cover its address space from both the CPU's and the accelerator's perspective, and those are not necessarily the same size.
For example, with a 64-bit CPU and a 32-bit accelerator, the firmware's pointers cover a 32-bit address space while the CPU's pointers cover a 64-bit one. C does not offer two or more void * types for the different address spaces you want to talk to.
People generally solve this by casting void * to uintN_t, with N as large as needed, and passing that around between the different parts of the system.
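A rough sketch of that convention, assuming a hypothetical mailbox message shared between a 32-bit accelerator firmware and a 64-bit host (all names here are made up):

    #include <stdint.h>

    /* Message layout shared by the 32-bit firmware and the 64-bit host:
       addresses travel as fixed-width integers, never as void *. */
    struct mailbox_msg {
        uint64_t buffer_addr;   /* wide enough for the largest address space */
        uint32_t buffer_len;
    };

    /* Firmware side (32-bit pointers): widen before sending. */
    static void send_buffer(struct mailbox_msg *msg, void *buf, uint32_t len)
    {
        msg->buffer_addr = (uint64_t)(uintptr_t)buf;   /* zero-extended 32-bit address */
        msg->buffer_len  = len;
    }

    /* Host side (64-bit pointers): convert back according to how the
       accelerator's memory is mapped into the host's address space. */
    static void *receive_buffer(const struct mailbox_msg *msg)
    {
        return (void *)(uintptr_t)msg->buffer_addr;
    }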
There is none, because gcc was not designed for such embedded architectures. There are architectures with multiple pointer sizes, for example the M16C: RAM has 16-bit addresses and ROM (flash) has 20-bit addresses in the same address space. Performance and code size are better with the smaller pointers.

Is it possible to allocate a certain sector of RAM under Linux?

I have recently ended up with some faulty RAM and, despite having already found this out, I would like to try a much easier approach - write a program that allocates the faulty regions of RAM and never releases them. It might not work well if they get allocated before the program runs, but it would be much easier to reboot on failure than to build a kernel with patches.
So the question is:
How to write a program that would allocate given sectors (or pages containing given sectors)
and (if possible) report if it was successful.
This will be problematic. To understand why, you have to understand the relationship between physical and virtual memory.
On any modern Operating System, programs will get a very large address space for themselves, with the remainder of the address space being used for the OS itself. Other programs are simply invisible: there's no address at which they're found. How is this possible? Simple: processes use virtual addresses. A virtual address does not correspond directly to physical RAM. Instead, there's an address translation table, managed by the OS. When your process runs, the table only contains mappings for RAM that's allocated to you.
Now, that implies that the OS decides what physical RAM is allocated to your program. It can (and will) change that at runtime. For instance, swapping is implemented using the same mechanism. When swapping out, a page of RAM is written to disk and its mapping deleted from the translation table. When you try to use the virtual address, the OS detects the missing mapping, restores the page from disk to RAM, and puts back a mapping. You're unlikely to get back the same page of physical RAM, but the virtual address doesn't change across the whole swap-out/swap-in. So even if you happened to allocate a page of bad memory, you couldn't keep it. Programs don't own RAM, they own a virtual address space.
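You can actually watch this indirection from userspace: on Linux, /proc/self/pagemap reports which physical frame (if any) currently backs a given virtual page. A rough sketch (recent kernels hide the frame numbers from non-root users):

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Print the physical frame number currently backing 'addr', if any. */
    static void show_mapping(const void *addr)
    {
        int fd = open("/proc/self/pagemap", O_RDONLY);
        long page_size = sysconf(_SC_PAGESIZE);
        uint64_t entry;
        off_t offset = ((uintptr_t)addr / page_size) * sizeof(entry);

        pread(fd, &entry, sizeof(entry), offset);
        if (entry & (1ULL << 63))                       /* "page present" bit */
            printf("%p -> physical frame 0x%llx\n", addr,
                   (unsigned long long)(entry & ((1ULL << 55) - 1)));
        else
            printf("%p is not in physical RAM right now\n", addr);
        close(fd);
    }

Call it twice around a swap-out and the frame number can change while the virtual address stays the same.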
Now, Linux does offer some specific kernel functions that allocate memory in a slightly different way, but it seems that you want to bypass the kernel entirely. You can find a much more detailed description in http://lwn.net/images/pdf/LDD3/ch08.pdf
Check out BadRAM: it seems to do exactly what you want.
Well, it's not an answer on how to write a program, but it fixes the issue without compiling a kernel:
Use memmap or mem parameters:
http://gquigs.blogspot.com/2009/01/bad-memory-howto.html
I will edit this answer when I get it running and give details.
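For reference, the method in that howto boils down to a boot parameter of the form below, which marks the faulty physical range as reserved so the kernel never hands it out (the address here is purely hypothetical, and the $ usually needs escaping or quoting in the GRUB config):

    memmap=4K$0x7ddf3000

That reserves the 4KiB page starting at physical address 0x7ddf3000; repeat the parameter once per faulty region.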
Another option is to write your own kernel module that can allocate specific physical addresses, and to make that memory unswappable (cf. mlock(2)).
I've never tried it. No warranty.

Force Linux to use only memory over 4G?

I have a Linux device driver that interfaces to a device that, in theory, can perform DMA using 64-bit addresses. I'd like to test to see that this actually works.
Is there a simple way that I can force a Linux machine not to use any memory below physical address 4G? It's OK if the kernel image is in low memory; I just want to be able to force a situation where I know all my dynamically allocated buffers, and any kernel or user buffers allocated for me are not addressable in 32 bits. This is a little brute force, but would be more comprehensive than anything else I can think of.
This should help me catch (1) hardware that wasn't configured correctly or loaded with the full address (or is just plain broken) as well as (2) accidental and unnecessary use of bounce buffers (because there's nowhere to bounce to).
clarification: I'm running x86_64, so I don't care about most of the old 32-bit addressing issues. I just want to test that a driver can correctly interface with multitudes of buffers using 64-bit physical addresses.
/usr/src/linux/Documentation/kernel-parameters.txt
memmap=exactmap [KNL,X86] Enable setting of an exact
E820 memory map, as specified by the user.
Such memmap=exactmap lines can be constructed based on
BIOS output or other requirements. See the memmap=nn#ss
option description.
memmap=nn[KMG]@ss[KMG]
[KNL] Force usage of a specific region of memory
Region of memory to be used, from ss to ss+nn.
memmap=nn[KMG]#ss[KMG]
[KNL,ACPI] Mark specific memory as ACPI data.
Region of memory to be used, from ss to ss+nn.
memmap=nn[KMG]$ss[KMG]
[KNL,ACPI] Mark specific memory as reserved.
Region of memory to be used, from ss to ss+nn.
Example: Exclude memory from 0x18690000-0x1869ffff
memmap=64K$0x18690000
or
memmap=0x10000$0x18690000
If you add memmap=4G$0 to the kernel's boot parameters, the lower 4GB of physical memory will no longer be accessible. Also, your system will no longer boot... but some variation hereof (memmap=3584M$512M?) may allow for enough memory below 4GB for the system to boot but not enough that your driver's DMA buffers will be allocated there.
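To double-check from inside the driver that the buffers really end up above 4GB, something along these lines can be logged at allocation time (a sketch; the device pointer and the 1MB size are placeholders):

    #include <linux/device.h>
    #include <linux/dma-mapping.h>
    #include <linux/errno.h>
    #include <linux/sizes.h>

    /* Declare 64-bit DMA capability, then check where a coherent buffer lands. */
    static int mydev_alloc_dma(struct device *dev)
    {
        dma_addr_t bus_addr;
        void *cpu_addr;

        if (dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64)))
            return -EIO;

        cpu_addr = dma_alloc_coherent(dev, SZ_1M, &bus_addr, GFP_KERNEL);
        if (!cpu_addr)
            return -ENOMEM;

        dev_info(dev, "DMA buffer at bus address %pad (%s 4GB)\n",
                 &bus_addr, bus_addr >= SZ_4G ? "above" : "below");
        return 0;
    }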
IIRC there's an option in the kernel configuration to enable the PAE extension, which will let a 32-bit kernel use more than 4GB (I am a bit rusty on kernel config - the last kernel I recompiled was 2.6.4 - so please excuse my lack of recall). You do know how to get to the kernel config, right?
make clean && make menuconfig
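If memory serves, on a 32-bit x86 kernel the relevant choice sits under "Processor type and features" -> "High Memory Support"; the resulting .config lines look roughly like this (exact symbol names can vary between kernel versions):

    CONFIG_HIGHMEM64G=y
    CONFIG_X86_PAE=y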
Hope this helps,
Best regards,
Tom.

why do we need zone_highmem on x86?

In the Linux kernel, mem_map is the array that holds all "struct page" descriptors. Those pages include the 128MiB of lowmem that is used for dynamically mapping highmem.
Since lowmem is 1GiB, the mem_map array has only 1GiB/4KiB = 256Ki entries. If each entry is 32 bytes, mem_map occupies 8MiB of memory. But if mem_map could describe all 4GiB of physical memory (if we had that much available on x86-32), it would occupy 32MiB, which is still not a lot of kernel memory (or am I wrong?).
So my question is: why do we need to use that 128MiB of lowmem for indirect highmem mapping in the first place? Or, put another way, why not map all of the (at most) 4GiB of physical memory into kernel space directly?
Note: if my understanding of the kernel source above is wrong, please correct. Thanks!
Look Here: http://www.xml.com/ldd/chapter/book/ch13.html
Kernel low memory is the 'real' part of the memory map, permanently mapped and addressed directly with 32-bit pointers on x86.
Kernel high memory is the 'virtual' part, reached through temporary mappings that the kernel sets up on demand on x86.
You don't want to map it all into the kernel address space, because you can't always address all of it, and you need most of your memory for virtual memory segments (virtual, page-mapped process space.)
At least, that's how I read it. Wow, that's a complicated question you asked.
To throw more confusion in, chapter 13 talks about some PCI devices not being able to address the full 32-bit space, which was the genesis of my earlier comment:
On x86, some kernel memory usage is limited to the first Gigabyte of memory because of DMA addressing concerns. I'm not 100% familiar with the topic, but there's a compatibility mode for DMA on the PCI bus. That may be what you are looking at.
3.6 GB is not the ceiling when physical address extension is in use, which is common on modern x86 boards, especially with memory hotplug.
Or, put another way, why not map all of the (at most) 4GiB of physical memory into kernel space directly?
One reason is userspace: every userspace process has its own virtual address space. Suppose you have 4GB of RAM on x86. If we say the kernel owns 1GB of memory (~800MB directly mapped + ~200MB vmalloc), the other ~3GB has to be dynamically distributed between the processes running in user space. So how could you map all 4GB into the kernel directly when there are several address spaces to serve?
why do we need zone_highmem on x86?
The reason is the same. The kernel reserves only ~800MB for lowmem. All other memory is allocated and connected to a particular virtual address only on demand. For example, when you execute a binary, a new virtual address space is created and pages are allocated to hold its code and data (heap, stack, ...). So the key purpose of highmem is to serve dynamic memory allocation requests; you never know in advance what userspace will trigger...
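Inside the kernel, that on-demand use of a highmem page looks roughly like this (a sketch using the classic kmap interface; newer kernels prefer kmap_local_page):

    #include <linux/highmem.h>
    #include <linux/mm.h>
    #include <linux/string.h>

    /* Allocate a page that may come from ZONE_HIGHMEM and touch it from
       kernel code: it only gets a kernel virtual address while mapped. */
    static void touch_highmem_page(void)
    {
        struct page *page = alloc_page(GFP_HIGHUSER);
        void *vaddr;

        if (!page)
            return;

        vaddr = kmap(page);           /* create a temporary kernel mapping */
        memset(vaddr, 0, PAGE_SIZE);  /* now addressable like lowmem */
        kunmap(page);                 /* mapping released; the page stays allocated */

        __free_page(page);
    }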

Resources