Large physically contiguous memory area - linux

For my M.Sc. thesis, I have to reverse-engineer the hash function Intel uses inside its CPUs to spread data among Last Level Cache slices in Sandy Bridge and newer generations. To this end, I am developing a Linux application that needs a physically contiguous memory area to run my tests. The idea is to read data from this area so that it is cached, probe whether older data has been evicted (through timing measurements or LLC miss counters) in order to find colliding memory addresses, and finally discover the hash function by comparing these colliding addresses.
The same procedure has already been used on Windows by a researcher and proved to work.
To do this, I need to allocate an area that is large (64 MB or more) and fully cacheable, so without DMA-friendly options in the TLB. How can I perform this allocation?
To have full control over the allocation (i.e., for it to be really physically contiguous), my idea was to write a Linux kernel module, export a device and mmap() it from userspace, but I do not know how to allocate that much contiguous memory inside the kernel.
I have heard about the Linux Contiguous Memory Allocator (CMA), but I don't know how it works.

Applications don't see physical memory; a process has an address space in virtual memory. Read about the MMU (what is contiguous in virtual space might not be physically contiguous, and vice versa).
You might want to lock some memory using mlock(2).
But your application will be scheduled, and other processes (or scheduled tasks) will dirty your CPU cache. See also sched_setaffinity(2)
(and even kernel code might be preempted).
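A minimal sketch of the two calls mentioned above, in Python for brevity. The helper names pin_to_one_cpu and lock_buffer are made up for this example, and mlock() can fail if the RLIMIT_MEMLOCK limit is too low, so the result is reported rather than assumed:

```python
import ctypes
import mmap
import os


def pin_to_one_cpu():
    """Restrict the calling process to a single CPU it is allowed to use."""
    cpu = min(os.sched_getaffinity(0))
    os.sched_setaffinity(0, {cpu})
    return cpu


def lock_buffer(buf):
    """Try to mlock() a writable buffer; return True on success."""
    libc = ctypes.CDLL(None, use_errno=True)
    libc.mlock.argtypes = [ctypes.c_void_p, ctypes.c_size_t]
    libc.mlock.restype = ctypes.c_int
    # Get the buffer's virtual address through a ctypes view of it.
    view = (ctypes.c_char * len(buf)).from_buffer(buf)
    return libc.mlock(ctypes.addressof(view), len(buf)) == 0
```

Pinning only keeps this process on one core; it does not stop other tasks or the kernel from running there and touching the cache.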

This page on Kernel Newbies has some ideas about memory allocation, but the maximum for get_free_pages looks like 8 MiB. (Perhaps that's a compile-time constraint?)
Since this would be all-custom, you could explore the mem= boot parameter of the Linux kernel. It limits the amount of memory the kernel uses, and you can party all over the remaining memory without anyone knowing. Heck, if you boot up a busybox system, you could probably do mem=32M, but even mem=256M should work if you're not booting a GUI.
You will also want to look into the Offline Scheduler (and here). It "unplugs" the CPU from Linux so you can have full control over ALL code running on it. (Some parts of this are already in the mainline kernel, and maybe all of it is.)
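One way to reach the memory left over by mem= from user space is to map it through /dev/mem. The following is only a sketch under several assumptions: it runs as root, the kernel configuration permits /dev/mem access to that range, and the function name map_physical is hypothetical. Whether the resulting mapping is cacheable depends on the kernel and CPU configuration, which I have not verified.

```python
import mmap
import os


def map_physical(phys_start, length, path="/dev/mem"):
    """Map [phys_start, phys_start + length) into this process (needs root)."""
    if phys_start % mmap.PAGESIZE or length % mmap.PAGESIZE:
        raise ValueError("start and length must be page-aligned")
    fd = os.open(path, os.O_RDWR)
    try:
        # The file offset of /dev/mem is interpreted as a physical address.
        return mmap.mmap(fd, length, mmap.MAP_SHARED,
                         mmap.PROT_READ | mmap.PROT_WRITE,
                         offset=phys_start)
    finally:
        os.close(fd)
```

With mem=256M, for example, a call such as map_physical(256 * 2**20, 64 * 2**20) would ask for 64 MiB starting right above the kernel's limit, provided that range is really RAM on the machine in question.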

Related

Is memory allocated with "ftruncate" physically contiguous? [duplicate]

Is there a way to allocate contiguous physical memory from userspace in Linux? At least a few guaranteed contiguous memory pages. One huge page isn't the answer.
No. There is not. You do need to do this from kernel space.
If you say "we need to do this from user space" - without anything going on in kernel space it makes little sense - because a user space program has no way of controlling or even knowing whether the underlying memory is contiguous or not.
The only reason you would need to do this is if you were working in conjunction with a piece of hardware, or some other low-level (i.e. kernel) service that had this requirement. So again, you would have to deal with it at that level.
So the answer isn't just "you can't" - but "you should never need to".
I have written memory managers that do allow me to do this - but it was always because of some underlying issue at the kernel level, which had to be addressed at the kernel level. Generally it was because some other agent on the bus (a PCI card, the BIOS or even another computer over an RDMA interface) had the physically contiguous memory requirement. Again, all of this had to be addressed in kernel space.
When you talk about "cache lines" - you don't need to worry. You can be assured that each page of your user-space memory is contiguous, and each page is much larger than a cache-line (no matter what architecture you're talking about).
Yes, if all you need is a few pages, this may indeed be possible.
The file /proc/[pid]/pagemap now allows programs to inspect the mapping of their virtual memory to physical memory.
While you cannot explicitly modify the mapping, you can just allocate a virtual page, lock it into memory via a call to mlock, record its physical address via a lookup into /proc/self/pagemap, and repeat until you just happen to get enough blocks touching each other to create a large enough contiguous block. Then unlock and free your excess blocks.
It's hackish, clunky and potentially slow, but it's worth a try. On the other hand, there's a decently large chance that this isn't actually what you really need.
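The lookup step can be sketched as follows. Each pagemap entry is 64 bits, with bit 63 set when the page is present and bits 0-54 holding the page frame number (PFN). The function names are invented for this example, and note that on recent kernels the PFN field reads as zero unless the process has CAP_SYS_ADMIN, so this treats a zero PFN as "unknown":

```python
import mmap
import struct

PAGE_SIZE = mmap.PAGESIZE
PRESENT_BIT = 1 << 63          # bit 63: page is present in RAM
PFN_MASK = (1 << 55) - 1       # bits 0-54: page frame number


def decode_entry(raw):
    """Split one 64-bit pagemap entry into (present, pfn)."""
    return bool(raw & PRESENT_BIT), raw & PFN_MASK


def read_pfn(vaddr, pagemap_path="/proc/self/pagemap"):
    """Return the PFN backing vaddr, or None if absent or hidden."""
    with open(pagemap_path, "rb") as f:
        f.seek((vaddr // PAGE_SIZE) * 8)   # one 8-byte entry per page
        data = f.read(8)
    if len(data) != 8:
        return None
    present, pfn = decode_entry(struct.unpack("=Q", data)[0])
    return pfn if present and pfn else None


def longest_contiguous_run(pfns):
    """Return (start_index, length) of the longest run of consecutive PFNs."""
    best = (0, 0)
    start = 0
    for i in range(1, len(pfns) + 1):
        if i == len(pfns) or pfns[i] != pfns[i - 1] + 1:
            if i - start > best[1]:
                best = (start, i - start)
            start = i
    return best
```

In practice you would touch and mlock each page first, call read_pfn on each page's virtual address, sort the results by PFN, and then keep the pages that longest_contiguous_run selects.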
The DPDK library's memory allocator uses the approach @Wallacoloo described; see eal_memory.c. The code is BSD licensed.
If a specific device driver exports a DMA buffer that is physically contiguous, user space can access it through the dma-buf APIs.
So a user task can access such memory, but cannot allocate it directly.
That is because the physical contiguity constraint does not come from user applications but only from the device,
so only device drivers should care.

Can I allocate one large and guaranteed contiguous range of physical memory (100 MB)?

Can I allocate one large and guaranteed contiguous range of physical memory (100 MB consecutive without breaks) on Linux, and if I can, then how can I do this?
I need to map this contiguous block of memory through a PCI Express BAR from one CPU (CPU1) to another CPU (CPU2) located behind a PCIe Non-Transparent Bridge.
You don't allocate physical memory in user applications (physical memory only makes sense inside the kernel).
I don't understand if you are coding a kernel module or some Linux application (e.g. a numerical finite-element code).
Inside applications, you can allocate virtual memory with e.g. mmap(2) (and then you can allocate a big contiguous segment of address space).
I guess that some GPU cards give access to a large amount of GPU memory through mmap, so I believe it is possible to do what you want.
You might be interested in the numa(7) man page. Probably the numa(3) library should give you what you want. Did you also consider Open MPI? See also msync(2) and mlock(2).
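To illustrate the mmap(2) point, here is a small sketch (Python's mmap module wraps the same system call). It obtains one contiguous 100 MiB range of virtual addresses; nothing in it says anything about how the underlying physical pages are laid out:

```python
import mmap

SIZE = 100 * 2**20  # 100 MiB, matching the size asked about above

# Anonymous private mapping: one contiguous range of virtual addresses.
region = mmap.mmap(-1, SIZE, flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS)

# Physical pages are only assigned when a page is first touched, and the
# kernel is free to pick unrelated physical frames for neighbouring pages.
region[0] = 1
region[SIZE - 1] = 2
```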
From user space there is no guarantee; it depends on your luck.
If you compile your driver into the kernel, you can allocate the required amount of memory and expose it via mmap.
If you need the memory as storage or for some other purpose not tied to a driver, set the memmap parameter on the boot command line,
e.g. memmap=200M$1700M
This reserves 200 MB of memory starting at physical address 1700M.
Later it can be used as a filesystem as well ;)
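To make the size$start order of that parameter explicit, here is a small sketch that turns such a value into a byte range. The helper names parse_size and parse_memmap are made up for this example, and only the K/M/G suffixes are handled:

```python
def parse_size(text):
    """Convert a kernel-style size such as '200M' or '1G' to bytes."""
    units = {"K": 2**10, "M": 2**20, "G": 2**30}
    suffix = text[-1].upper()
    if suffix in units:
        return int(text[:-1], 0) * units[suffix]
    return int(text, 0)


def parse_memmap(spec):
    """Parse 'SIZE$START' and return (start_bytes, size_bytes)."""
    size_text, start_text = spec.split("$", 1)
    return parse_size(start_text), parse_size(size_text)


start, size = parse_memmap("200M$1700M")
print(hex(start), hex(start + size - 1))  # first and last reserved byte
```

The resulting start address and size are what a driver or a /dev/mem mapping would then need in order to use the reserved region.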

Why does the kernel need virtual addressing?

In Linux each process has its own virtual address space (e.g. 4 GB on a 32-bit system, of which 3 GB is reserved for the process and 1 GB for the kernel). This virtual addressing mechanism helps isolate the address space of each process. This is understandable for processes, since there are many of them. But since there is only one kernel, why do we need virtual addressing for the kernel?
The reason the kernel is "virtual" is not to deal with paging as such; it is because the processor can only run in one mode at a time. Once you turn on paged memory mapping (bit 31 in CR0 on x86), the processor expects ALL memory accesses to go through the page-mapping mechanism. So, since we do want to access the kernel even after we have enabled paging (virtual memory), it needs to exist somewhere in the virtual space.
The "reserving" of memory is more about having an "easy way to determine if an address is kernel or user-space" than anything else. It would be perfectly possible to put a little bit of kernel at address 12345-34121, another bit of kernel at 101900-102400 and some other bit of kernel at 40000000-40001000. But it would make life difficult for every aspect of the kernel and userspace - there would be gaps/holes to deal with [there already are such holes/gaps, but having more wouldn't exactly help things]. Setting a fixed limit of "userspace is from here to here, kernel is from end of userspace to X" makes life much easier in that respect. We can just say kernel = 0; if (address > max_userspace) kernel = 1; in some code.
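As a concrete instance of that one-line check, assuming the classic 32-bit 3 GB / 1 GB split where kernel space starts at 0xC0000000 (the names PAGE_OFFSET and is_kernel_address are chosen for this sketch):

```python
PAGE_OFFSET = 0xC0000000  # first kernel address in a 3 GB / 1 GB split


def is_kernel_address(address):
    """True if the address lies above the end of user space."""
    return address >= PAGE_OFFSET
```

With scattered kernel regions, the same test would need a list of ranges instead of a single comparison.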
Of course, the kernel only takes up as much PHYSICAL memory as it actually uses - so the common thinking that "it's a waste to take up a whole gigabyte for the kernel" is wrong - the kernel itself is a few (a dozen or so for a very "big" kernel) megabytes. The modules loaded can easily add up to several more megabytes, and graphics drivers from ATI and nVidia easily take another few megabytes just for the kernel module itself. The kernel also uses some memory to store "kernel data", such as tasks, queues, semaphores, files and other "stuff" the kernel has to deal with. A few megabytes are used for this as well.
Virtual memory management (VMM) is the Linux feature that enables multitasking without a fixed limit on the number of tasks or the amount of memory used by each task. The Linux memory manager subsystem (along with the MMU hardware) provides VMM support, where memory or memory-mapped devices are accessed through virtual addresses. Within Linux everything, both kernel and user components, works with virtual addresses except when dealing with real hardware. That is where the memory manager steps in, performs the virtual-to-physical address translation and points to the physical memory or device location.
A process is an abstract entity, defined by the kernel, to which system resources are allocated in order to execute a program. In Linux process management the kernel is an integral part of a process's memory map. A process has two main regions, like two faces of one coin:
User space view - contains the user program sections (code, data, stack, heap, etc.) used by the process
Kernel space view - contains the kernel data structures that maintain information (PID, state, file descriptors, resource usage, etc.) about the process
Every process in a Linux system has a unique and separate user space region. This feature of Linux VMM isolates each process's program sections from one another. But all processes in the system share the common kernel space region. When a process needs a service from the kernel, it must execute the kernel code in this region; in other words, the kernel acts on behalf of the user process's request.

Is it possible to allocate a certain sector of RAM under Linux?

I recently got a faulty RAM module, and despite having already found this, I would like to try a much simpler approach - write a program that allocates the faulty regions of RAM and never releases them. It might not work well if they get allocated before the program runs, but it would be much easier to reboot on failure than to build a kernel with patches.
So the question is:
How to write a program that would allocate given sectors (or pages containing given sectors)
and (if possible) report if it was successful.
This will be problematic. To understand why, you have to understand the relation between physical and virtual memory.
On any modern operating system, programs get a very large address space for themselves, with the remainder of the address space being used for the OS itself. Other programs are simply invisible: there's no address at which they're found. How is this possible? Simple: processes use virtual addresses. A virtual address does not correspond directly to physical RAM. Instead, there's an address translation table, managed by the OS. When your process runs, the table only contains mappings for RAM that's allocated to you.
Now, that implies that the OS decides what physical RAM is allocated to your program. It can (and will) change that at runtime. For instance, swapping is implemented using the same mechanism. When swapping out, a page of RAM is written to disk, and its mapping is deleted from the translation table. When you try to use the virtual address, the OS detects the missing mapping, restores the page from disk to RAM, and puts back a mapping. It's unlikely that you get back the same page of physical RAM, but the virtual address doesn't change during the whole swap-out/swap-in. So, even if you happened to allocate a page of bad memory, you couldn't keep it. Programs don't own RAM, they own a virtual address space.
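The swap-out/swap-in behaviour described above can be illustrated with a toy model. This is not how the kernel is implemented; ToyMemory and its methods are invented purely to show that the virtual page stays the same while the physical frame behind it changes:

```python
class ToyMemory:
    """Toy model of a page table with swapping (not real kernel logic)."""

    def __init__(self, frames):
        self.free_frames = list(frames)  # unused physical frames
        self.page_table = {}             # virtual page -> physical frame
        self.ram = {}                    # physical frame -> contents
        self.swap = {}                   # virtual page -> contents on disk

    def map_page(self, vpage, contents):
        frame = self.free_frames.pop(0)
        self.page_table[vpage] = frame
        self.ram[frame] = contents
        return frame

    def swap_out(self, vpage):
        frame = self.page_table.pop(vpage)      # mapping is deleted
        self.swap[vpage] = self.ram.pop(frame)  # contents go to "disk"
        self.free_frames.append(frame)

    def read(self, vpage):
        if vpage not in self.page_table:        # the "page fault" case
            self.map_page(vpage, self.swap.pop(vpage))
        return self.ram[self.page_table[vpage]]


mem = ToyMemory(frames=[7, 8])
first_frame = mem.map_page(0x10, "data")
mem.swap_out(0x10)
value = mem.read(0x10)              # same virtual page, same contents
second_frame = mem.page_table[0x10]
```

Here the page comes back in a different frame than it started in, which is why a user program cannot hold on to a particular piece of bad RAM this way.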
Now, Linux does offer some specific kernel functions that allocate memory in a slightly different way, but it seems that you want to bypass the kernel entirely. You can find a much more detailed description in http://lwn.net/images/pdf/LDD3/ch08.pdf
Check out BadRAM: it seems to do exactly what you want.
Well, it's not an answer on how to write a program, but it fixes the issue without compiling a kernel:
Use memmap or mem parameters:
http://gquigs.blogspot.com/2009/01/bad-memory-howto.html
I will edit this answer when I get it running and give details.
The thing to do is write your own kernel module, which can allocate memory at a given physical address, and make it non-swappable with mlock(2).
I've never tried it. No warranty.
