Index into a mmap? - malloc

I'm trying to create an array of structs as a sort of rudimentary cache.
Given a void* pointer to a mmap, does mmap provide any affordances for indexing into it? I think conceptually a mmap is simply providing a block of memory, but then I'm a bit confused as to what I can do with it. Can I just think of it as a malloc?
void * mptr = mmap(NULL, 1024*1024, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
Thanks for any clarification here.

Yes, you can think of it as a malloc, but you must deallocate it with munmap(mptr,1024*1024) rather than free(mptr).
If you want to index into it, cast it to another type, for example char:
char *cptr = (char*) mptr;
Then you can index into it using cptr[10], for example.

Regardless of which allocator you use (mmap, malloc, sbrk, ...), you're still left with a pointer to memory. Before you can use that memory, you must tell the compiler what types live in it. Use C-style or C++ casts to tell the compiler how to treat the memory.

Related

On Rust, is it possible to alloc a slice of memory, in such a way that the returned pointer fits in a u32?

HVM is a functional runtime which represents pointers as 32-bit values. Its allocator reserves a huge (4 GB) buffer preemptively, which it uses to create internal objects. This is not ideal. Instead, I'd like to use the system allocator, but that's not possible, since it returns 64-bit pointers, which may be larger than the space available to store them. Is there any cross-platform way in Rust to allocate a buffer, such that the pointer to the buffer is guaranteed to fit in an u32? In other words, I'm looking for something akin to:
let ptr = Box::new_with_small_ptr(size);
assert!(ptr as u64 + size as u64 <= u32::MAX as u64);
There isn't, because it's an extremely niche need and requires a lot of care.
It's not as simple as "just returning a low pointer" - you need to actually allocate that space from the OS. Your entry point into that would be mmap. Be prepared to do some low-level work with MAP_FIXED and reading /proc/self/maps, and also implementing an allocator on top of the memory region you get from mmap.
If your concern is just excess memory usage, note that Linux overcommits memory by default - allocating 4GB of memory won't reserve physical memory unless you actually try to use it all.

What is the benefit of calling ioread functions when using memory mapped IO

To use memory mapped I/O, we need to first call request_mem_region.
struct resource *request_mem_region(unsigned long start,
                                    unsigned long len,
                                    char *name);
Then, since the kernel runs in a virtual address space, we need to map the physical addresses into it with the ioremap function.
void *ioremap(unsigned long phys_addr, unsigned long size);
Then why can't we access the return value directly?
From Linux Device Drivers Book
Once equipped with ioremap (and iounmap), a device driver can access any I/O memory address, whether or not it is directly mapped to virtual address space. Remember, though, that the addresses returned from ioremap should not be dereferenced directly; instead, accessor functions provided by the kernel should be used.
Can anyone explain the reason behind this or the advantage with accessor functions like ioread32 or iowrite8()?
You need ioread8 / iowrite8 (or whichever width) to, at minimum, cast to volatile* so that optimization still results in exactly one access (not zero, and not more than one). In fact they do more than that: they handle endianness (device memory is accessed as little-endian; there's ioread32be for big-endian) and include some compile-time reordering memory-barrier semantics that Linux chooses to put in these functions. There's even a runtime barrier after reads, because of DMA. Use the _rep versions to copy a chunk from device memory with only one barrier.
In C, data races are UB (Undefined Behaviour). This means the compiler is allowed to assume that memory accessed through a non-volatile pointer doesn't change between accesses, and that if (x) y = *ptr; can be transformed into tmp = *ptr; if (x) y = tmp; i.e. a compile-time speculative load, if *ptr is known not to fault. (Related: Who's afraid of a big bad optimizing compiler? re: why the Linux kernel needs volatile for rolling its own atomics.)
MMIO registers may have side effects even for reading so you must stop the compiler from doing loads that aren't in the source, and must force it to do all the loads that are in the source exactly once.
Same deal for stores. Compilers aren't allowed to invent writes even to non-volatile objects, but they can remove dead stores: e.g. *ioreg = 1; *ioreg = 2; would typically compile the same as just *ioreg = 2;. The first store gets removed as "dead" because it's not considered to have a visible side effect.
C volatile semantics are ideal for MMIO, but Linux wraps more stuff around them than just volatile.
From a quick look after googling ioread8 and poking around in https://elixir.bootlin.com/linux/latest/source/lib/iomap.c#L11 we see that Linux I/O addresses can encode IO address space (port I/O, aka PIO; in / out instructions on x86) vs. memory address space (normal load/store to special addresses). And ioread* functions actually check that and dispatch accordingly.
/*
* Read/write from/to an (offsettable) iomem cookie. It might be a PIO
* access or a MMIO access, these functions don't care. The info is
* encoded in the hardware mapping set up by the mapping functions
* (or the cookie itself, depending on implementation and hw).
*
* The generic routines don't assume any hardware mappings, and just
* encode the PIO/MMIO as part of the cookie. They coldly assume that
* the MMIO IO mappings are not in the low address range.
*
* Architectures for which this is not true can't use this generic
* implementation and should do their own copy.
*/
For an example implementation, here's ioread16. (IO_COND is a macro that checks the address against a predefined constant: low addresses are PIO addresses.)
unsigned int ioread16(void __iomem *addr)
{
    IO_COND(addr, return inw(port), return readw(addr));
    return 0xffff;
}
What would break if you just cast the ioremap result to volatile uint32_t*?
e.g. as if you had used READ_ONCE / WRITE_ONCE, which just cast to volatile unsigned char* (or the appropriate width) and are used for atomic access to shared variables, in Linux's hand-rolled volatile + inline-asm implementation of atomics which it uses instead of C11 _Atomic.
That might actually work on some little-endian ISAs like x86 if compile-time reordering wasn't a problem, but others need more barriers. If you look at the definition of readl (which ioread32 uses for MMIO, as opposed to inl for PIO), it uses barriers around a dereference of a volatile pointer.
(This and the macros this uses are defined in the same io.h as this, or you can navigate using the LXR links: every identifier is a hyperlink.)
static inline u32 readl(const volatile void __iomem *addr)
{
    u32 val;

    __io_br();
    val = __le32_to_cpu(__raw_readl(addr));
    __io_ar(val);
    return val;
}
The generic __raw_readl is just the volatile dereference; some ISAs may provide their own.
__io_ar() uses rmb() or barrier() After the Read (/* prevent prefetching of coherent DMA data ahead of a dma-complete */). The Before-Read barrier __io_br() is just barrier(), which blocks compile-time reordering without emitting any asm instructions.
Old answer to the wrong question: the text below answers why you need to call ioremap.
Because it's a physical address and kernel memory isn't identity-mapped (virt = phys) to physical addresses.
And returning a virtual address isn't an option: not all systems have enough virtual address space to even direct-map all of physical address space as a contiguous range of virtual addresses. (But when there is enough space, Linux does do this; e.g. x86-64 Linux's virtual address-space layout is documented in x86_64/mm.txt.)
Notably, this affects 32-bit x86 kernels on systems with more than 1 or 2 GB of RAM (depending on whether the kernel is configured with a 2:2 or 1:3 kernel:user split of virtual address space). With PAE for a 36-bit physical address space, a 32-bit x86 kernel can use much more physical memory than it can map at once. (This is pretty horrible and makes life difficult for a kernel: some random blog reposted Linus Torvalds's comments about how PAE really, really sucks.)
Other ISAs may have this too, and IDK what Alpha does about IO memory when byte accesses are needed; maybe the region of physical address space that maps word loads/stores to byte loads/stores is handled earlier so you request the right physical address. (http://www.tldp.org/HOWTO/Alpha-HOWTO-8.html)
But 32-bit x86 with PAE is obviously a configuration that Linux has cared a lot about, from quite early in its history.

To find the size of the memory obtained using malloc

char buf[50];
char* ptr;
scanf("%s", buf);
ptr = (char*)malloc(sizeof(buf)+1);
//And this, how to know that dynamic allocation is correctly done?
I want to know the size of the memory pointed to by ptr.
Standard malloc function does not provide this information.
There is no easy way to know the size of an object pointed to by a pointer unless you use a custom allocator, or keep metadata recording the size alongside each heap object (malloc allocates from the heap) by interposing the malloc function.
There are solutions that provide this information. One idea is the fat pointer, and several fat-pointer libraries exist, such as Cello. An improved version of the fat pointer can be found in this work.

how to avoid caching when using mmap()

I'm writing a driver in PetaLinux for a device in my FPGA, and I have implemented the mmap function in order to control the device from user space. My problem is that, even though I'm using
vma->vm_page_prot = pgprot_noncached(vma->vm_page_prot);
in the mmap function, and the MAP_SHARED flag in the user application, it seems that the cache is still enabled.
The test I did was to write a value (say 5) to a specific register of my mmaped device that actually stores only the least significant bit of the data coming from the AXI bus. If I read back immediately after the write, I expect to read 1 (this is what happened with a bare-metal application on MicroBlaze); instead I read 5. However, the value is correctly written to the register, because what has to happen does happen.
Thanks in advance.
Based on what was discussed in the question comments, the address pointer being assigned here:
address = mmap(NULL, PAGE_SIZE, PROT_READ|PROT_WRITE, MAP_SHARED, fd, 0);
wasn't declared with the volatile type qualifier, allowing the compiler to make assumptions about the memory and perform compile-time optimizations over the read/write operations.

What is the time complexity of mmap in Linux?

In big-O notation, I guess, and with respect to the size of memory requested. Also, can we assume that the memory is not committed lazily? That makes things complicated.
To be precise for the call mmap(0, n, PROT_READ | PROT_WRITE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0) where n is a variable.
This reference states that
MAP_ANONYMOUS initializes the region to zeros.
I believe this makes the process O(n) in the requested size, though possibly more efficient in practice:
On some systems using private anonymous mmaps is more efficient than using malloc for large blocks. This is not an issue with the GNU C Library, as the included malloc automatically uses mmap where appropriate.