I'm trying to find a way to do heap memory allocation in armv8-a assembly, and after looking through syscall tables and trying to look at the Linux Programmer's Manual I can't find any way to allocate and de-allocate memory at runtime without using malloc and free from the c standard library.
I've looked at brk() but that doesn't appear to have any way to de-allocate memory.
mmap with MAP_ANONYMOUS is preferred to sbrk/brk for most purposes in modern programs. Use munmap to free.
By the way, brk can deallocate memory; simply pass an address lower than the current break point. But this does limit you to freeing in a last-in-first-out fashion.
Related
I'm building a small x86-64 kernel that I write completely from scratch. I'm starting to write a very simple memory allocator where I simply iterate on page structs to find a free page and break the loop when I find one. When I began writing the allocator, I stumbled upon an issue.
In my kernel, I managed to get a memory map of RAM using a UEFI call (GetMemoryMap()). By iterating on the memory map, I managed to find that I would have around 1021MB of usable memory (out of 1024MB). I test my kernel on QEMU.
I read here and there that the Linux kernel holds a page struct for every page in the system. I'm guessing that the memory allocator of the Linux kernel uses the page structs to determine what page is free and which isn't. By using a proper binary structure and an efficient algorithm, it attempts to find a free page as fast as possible to use for itself or to provide to a user mode process.
If my assumption is correct, the issue arises here. If the Linux kernel's memory allocator relies on the page struct to work, how does it allocate memory for the page structs themselves?
I thought of a simple algorithm where I simply start at the physical address of the first usable memory region. If this memory region has enough room for all the page structs (representing the whole memory), I stop there. Otherwise, I use the second memory region an so on. This seems quite simple but I wanted to know how the Linux kernel handles that issue.
Even if my above assumption is wrong, the memory allocator probably requires some memory to work. At boot, how does the Linux kernel allocate memory for its own memory allocator?
I guess you can google some stuff about memblock allocator, which manages the physical memory in kernel boot phase.
For fun I am just trying to write a program in assembly for Linux on a laptop with an x86 processor to get some system information. So one of the things I am trying to find is how much memory is available to my program, and where e.g. the stack is and if and how I can allocate additional memory if needed.
Long time ago I did things like this on an Atari ST and there was just a system 'malloc' I could ask memory from and there were functions to find the available memory.
I know Linux is set up differently and I kind of have the whole address space to myself, but I guess there are some memory areas I am not allowed to touch.
And somehow a default stack seems to have been setup.
I researched quite a bit for this, but I can't find any 'assembly' system call. Most people point to linking the C malloc for memory management, but I am not looking for a memory manager. I just want to know the memory boundaries of my program.
I find things like getrlimit, setrlimit, prlimit and brk and sbrk, but those seem to be C functions and not system calls.
What am i missing?
Linux uses virtual memory (and ASLR). Atari ST doesn't use either so it did have a fixed memory map for some OS data structures and code. (Because the OS was in ROM and couldn't be easily updated, some people even documented some internal addresses.)
Linux tries to keep the boundary between kernel and user-space rigid, with a well-defined documented API / ABI for user-space to interact with the kernel via system calls. (e.g. on x86-64, via the syscall instruction). User-space doesn't need to care what's on the other side of that wall, and usually not even where its pages are in virtual memory as long as it has pointers to them.
When glibc malloc wants more pages from the OS, it uses mmap(MAP_ANONYMOUS) or brk to get them, and hand out chunks of them for small calls to malloc. It keeps bookkeeping data structures in user-space (so that's per-process of course).
I know Linux is set up differently and I kind of have the whole address space to myself, but I guess there are some memory areas I am not allowed to touch.
Yeah, every process has its own virtual address space. You can only touch the parts you've allocated, otherwise the resulting page fault will be "invalid" (OS knows there isn't supposed to be a physical page for that virtual page) and will deliver a SIGSEGV signal to your process if you try to read or write it. ("valid" page faults happen because of swap space or lazy allocation / copy-on-write; the kernel updates the HW page tables and returns to user-space for it to re-run the instruction that faulted.)
Also, the kernel claims the high half of virtual address space for its own use. (https://wiki.osdev.org/Higher_Half_Kernel). See also https://www.kernel.org/doc/Documentation/x86/x86_64/mm.txt for Linux's x86-64 memory map layout.
I can't find any 'assembly' system call.
mmap and brk are true system calls. See the "notes" section of the brk(2) man page. Section 2 man pages are system calls, section 3 are libc functions.
Of course in C when you call mmap(...), you're actually calling a wrapper function in glibc. glibc provides wrapper functions, not inline asm macros that use the syscall instruction directly.
See also The Definitive Guide to Linux System Calls which explains the asm interface, and also the VDSO pages. Linux maps some kernel memory (read-only) into your user-space process, holding code and data so getpid() and clock_gettime() can run in user-space.
Also various Q&As on Stack Overflow, including What are the calling conventions for UNIX & Linux system calls on i386 and x86-64
So one of the things I am trying to find is how much memory is available to my program
There isn't a system call to query the current memory map of your process. Parsing /proc/self/maps would be your best bet.
See Finding mapped memory from inside a process for some fun ideas on using system calls to scan ranges of virtual address space for mapped pages. e.g. Like Linux's mincore(2) syscall returns -ENOMEM if the specified range contains any unmapped pages.
I want to develop my own operating system from the beginning. But I have some doubts about dynamic memory allocation. For example: There will be a linked list that implements a ready process queue. When I allocate memory in my programs, I use malloc. But, could I use malloc directly in the kernel implementation? Or I must develop my own memory allocator? I am not sure, but I think that malloc uses system calls to check the page table, so I could not use that in my own kernel. If I cant use malloc, how could I allocate memory for the queue?
Thanks.
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory? Specifically, how does it allocate memory, and how does it release memory? Also, in what cases can it fail to allocate and deallocate, respectively?
That is very general question, but I want to speak to the failure to allocate. It's important to realize that memory is actually allocated by kernel upon first access. What you are doing when calling malloc/calloc/realloc is reserving some addresses inside the virtual address space of a process (via syscalls brk, mmap, etc. libc does that).
When I get malloc or similar to fail (or when libc get brk or mmap to fail), it's usually because I exhausted the virtual address space of a process. This happens when there is no continuous block of free address, an no room to expand an existing one. You can either exhaust all space available or hit a limit RLIMIT_AS. It's pretty common especially on 32bit systems when using multiple threads, because people sometimes forget that each thread needs it's own stack. Stacks usually consume several megabytes, which means you can create only few hundreds threads before you have no more free address space. Maybe an even more common reason for exhausted address space are memory leaks. Libc of course tries to reuse space on the heap (space obtained by a brk syscall) and tries to munmmap unneeded mappings. However, it can't reuse something that is not "deallocated".
The shortage of physical memory is not detectable from within a process (or libc which is part of the process) by failure to allocate. Yeah, you can hit "overcommitting limit", but that doesn't mean the physical memory is all taken. When free physical memory is low, kernel invokes special task called OOM killer (Out Of Memory Killer) which terminates some processes in order to free memory.
Regarding failure to deallocate, my guess is it doesn't happen unless you do something silly. I can imagine setting program break (end of heap) below it's original position (by a brk syscall). That is, of course, recipe for a disaster. Hopefully libc won't do that and it doesn't make much sense either. But it can be seen as failed deallocation. munmap can also fail if you supply some silly argument, but I can't think of regular reason for it to fail. That doesn't mean it doesn't exists. We would have to dig deep within source code of glibc/kernel to find out.
1) how does it allocate memory
libc provides malloc() to C programs.
Normally, malloc allocates memory from the heap, and adjusts the
size of the heap as required, using sbrk(2). When allocating blocks of
memory larger than MMAP_THRESHOLD bytes, the glibc malloc()
implementation allocates the memory as a private anonymous mapping
using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable
using mallopt(3). Allocations performed using mmap(2) are unaffected
by the RLIMIT_DATA resource limit (see getrlimit(2)).
And this is about sbrk.
sbrk - change data segment size
2) in what cases can it fail to allocate
Also from malloc
By default, Linux follows an optimistic memory allocation strategy.
This means that when malloc() returns non-NULL there is no guarantee
that the memory really is available.
And from proc
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit
Mostly it uses the sbrk system call to adjust the size of the data segment, thereby reserving more memory for it to parcel out. Memory allocated in that way is generally not released back to the operating system because it is only possible to do it when the blocks available to be released are at the end of the data segment.
Larger blocks are sometime done by using mmap to allocate memory, and that memory can be released again with an munmap call.
How does libc communicate with the OS (e.g., a Linux kernel) to manage memory?
Through system calls - this is a low-level API that the kernel provides.
Specifically, how does it allocate memory, and how does it release memory?
Unix-like systems provide the "sbrk" syscall.
Also, in what cases can it fail to allocate and deallocate, respectively?
Allocation can fail, for example, when there's no enough available memory. Deallocation shall not fail.
I was wondering that if the space required on heap is not large enough
such that there is no need for a brk/sbrk system all (to shift the break pointer (brk) of data segment), how does a library function (such as malloc) allocates space on heap.
I am not asking about the data-structures and algorithms for heap management. I am just asking how does malloc get the address of the first location of the heap if it doesn't invoke a system call. I am asking this because I have heard that it is not always necessary to invoke a system call (brk/sbrk) as these are only required to expand the space.Please correct me if I am wrong.
The basic idea is that when your program starts, the heap is very small, but not necessarily zero. If you only allocate (malloc) a small amount of memory, the library is able to handle it within the small amount of space it has when it is loaded. However, when malloc runs out of that space, it needs to make a system call to get more memory.
That system call is often sbrk(), which moves the top of the heap's memory region up by a certain amount. Usually, the malloc library routine increases the heap by larger than what is needed for the current allocation, with the hope that future allocations can be performed w/o making a system call.
Other implementations of malloc use mmap() instead -- this allows the program to create a sparse virtual memory mapping. However, mmap() based malloc implementations do the same thing as the sbrk()-based ones: each system call reserves more memory than what is necessarily needed for the current call.
One way to look at this is to trace a program that uses malloc: you'll see that for N calls to malloc, you will see M system calls (where M is much smaller than N).
The short answer is that it uses sbrk() to allocate a big hunk, which at that point belongs to your app process. It can then further parcel out subsections of that as individual malloc calls without needing to ask the system for anything, until it exhausts that space and needs to resort sbrk() again.
You said you didn't want the details on the data structures, but suffice it to say that the implementation of malloc (i.e. your own process, not the OS kernel) is keeping track of which space in the region it got from the system is spoken for and which is still available to dole out as individual mallocs. It's like buying a big tract of land, then subdividing it into lots for individual houses.
Use sbrk() or mmap() — http://linux.die.net/man/2/sbrk, http://linux.die.net/man/2/mmap