Operating system development - dynamic memory allocation

Operating system development - dynamic memory allocation - linux

I want to develop my own operating system from the beginning. But I have some doubts about dynamic memory allocation. For example: There will be a linked list that implements a ready process queue. When I allocate memory in my programs, I use malloc. But, could I use malloc directly in the kernel implementation? Or I must develop my own memory allocator? I am not sure, but I think that malloc uses system calls to check the page table, so I could not use that in my own kernel. If I cant use malloc, how could I allocate memory for the queue?
Thanks.

Related

Heap Memory Allocation in ARM64 Assembly without the C Standard Library

I'm trying to find a way to do heap memory allocation in armv8-a assembly, and after looking through syscall tables and trying to look at the Linux Programmer's Manual I can't find any way to allocate and de-allocate memory at runtime without using malloc and free from the c standard library.
I've looked at brk() but that doesn't appear to have any way to de-allocate memory.

mmap with MAP_ANONYMOUS is preferred to sbrk/brk for most purposes in modern programs. Use munmap to free.
By the way, brk can deallocate memory; simply pass an address lower than the current break point. But this does limit you to freeing in a last-in-first-out fashion.

At boot, how does the Linux kernel allocate memory for its own memory allocator?

I'm building a small x86-64 kernel that I write completely from scratch. I'm starting to write a very simple memory allocator where I simply iterate on page structs to find a free page and break the loop when I find one. When I began writing the allocator, I stumbled upon an issue.
In my kernel, I managed to get a memory map of RAM using a UEFI call (GetMemoryMap()). By iterating on the memory map, I managed to find that I would have around 1021MB of usable memory (out of 1024MB). I test my kernel on QEMU.
I read here and there that the Linux kernel holds a page struct for every page in the system. I'm guessing that the memory allocator of the Linux kernel uses the page structs to determine what page is free and which isn't. By using a proper binary structure and an efficient algorithm, it attempts to find a free page as fast as possible to use for itself or to provide to a user mode process.
If my assumption is correct, the issue arises here. If the Linux kernel's memory allocator relies on the page struct to work, how does it allocate memory for the page structs themselves?
I thought of a simple algorithm where I simply start at the physical address of the first usable memory region. If this memory region has enough room for all the page structs (representing the whole memory), I stop there. Otherwise, I use the second memory region an so on. This seems quite simple but I wanted to know how the Linux kernel handles that issue.
Even if my above assumption is wrong, the memory allocator probably requires some memory to work. At boot, how does the Linux kernel allocate memory for its own memory allocator?

I guess you can google some stuff about memblock allocator, which manages the physical memory in kernel boot phase.

Does kmalloc() reserve Copy-On-Write (COW) mappings?

My understanding is that kmalloc() allocates from anonymous memory. Does this actually reserve the physical memory immediately or is that going to happen only when a write page fault happens?

kmalloc() does not usually allocate memory pages(1), it's more complicated than that. kmalloc() is used to request memory for small objects (smaller than the page size), and manages those requests using already existing memory pages, similarly to what libc's malloc() does to manage the heap in userspace programs.
There are different allocators that can be used in the Linux kernel: SLAB, SLOB and SLUB. Those use different approaches for the same goal: managing allocation and deallocation of kernel memory for small objects at runtime. A call to kmalloc() may use any of the three depending on which one was configured into the kernel.
A call to kmalloc() does not usually reserve memory at all, but rather just manages already reserved memory. Therefore memory returned by kmalloc() does not require a subsequent page fault like a page requested through mmap() normally would.
Copy on Write (CoW) is a different concept. Although it's still triggered through page faults, CoW is a mechanism used by the kernel to save space sharing existing mappings until they are modified. It is not what happens when a fault is triggered on a newly allocated memory page. A great example of CoW is what happens when the fork syscall is invoked: the process memory of the child process is not immediately duplicated, instead existing pages are marked as CoW, and the duplication only happens at the first attempt to write.
I believe this should clear any doubt you have. The short answer is that kmalloc() does not usually "reserve the physical memory immediately" because it merely allocates an object from already reserved memory.
(1) Unless the request is very large, in which case it falls back to actually allocating new memory through alloc_pages().

libc memory management

How does libc communicate with the OS (e.g., a Linux kernel) to manage memory? Specifically, how does it allocate memory, and how does it release memory? Also, in what cases can it fail to allocate and deallocate, respectively?

That is very general question, but I want to speak to the failure to allocate. It's important to realize that memory is actually allocated by kernel upon first access. What you are doing when calling malloc/calloc/realloc is reserving some addresses inside the virtual address space of a process (via syscalls brk, mmap, etc. libc does that).
When I get malloc or similar to fail (or when libc get brk or mmap to fail), it's usually because I exhausted the virtual address space of a process. This happens when there is no continuous block of free address, an no room to expand an existing one. You can either exhaust all space available or hit a limit RLIMIT_AS. It's pretty common especially on 32bit systems when using multiple threads, because people sometimes forget that each thread needs it's own stack. Stacks usually consume several megabytes, which means you can create only few hundreds threads before you have no more free address space. Maybe an even more common reason for exhausted address space are memory leaks. Libc of course tries to reuse space on the heap (space obtained by a brk syscall) and tries to munmmap unneeded mappings. However, it can't reuse something that is not "deallocated".
The shortage of physical memory is not detectable from within a process (or libc which is part of the process) by failure to allocate. Yeah, you can hit "overcommitting limit", but that doesn't mean the physical memory is all taken. When free physical memory is low, kernel invokes special task called OOM killer (Out Of Memory Killer) which terminates some processes in order to free memory.
Regarding failure to deallocate, my guess is it doesn't happen unless you do something silly. I can imagine setting program break (end of heap) below it's original position (by a brk syscall). That is, of course, recipe for a disaster. Hopefully libc won't do that and it doesn't make much sense either. But it can be seen as failed deallocation. munmap can also fail if you supply some silly argument, but I can't think of regular reason for it to fail. That doesn't mean it doesn't exists. We would have to dig deep within source code of glibc/kernel to find out.

1) how does it allocate memory
libc provides malloc() to C programs.
Normally, malloc allocates memory from the heap, and adjusts the
size of the heap as required, using sbrk(2). When allocating blocks of
memory larger than MMAP_THRESHOLD bytes, the glibc malloc()
implementation allocates the memory as a private anonymous mapping
using mmap(2). MMAP_THRESHOLD is 128 kB by default, but is adjustable
using mallopt(3). Allocations performed using mmap(2) are unaffected
by the RLIMIT_DATA resource limit (see getrlimit(2)).
And this is about sbrk.
sbrk - change data segment size
2) in what cases can it fail to allocate
Also from malloc
By default, Linux follows an optimistic memory allocation strategy.
This means that when malloc() returns non-NULL there is no guarantee
that the memory really is available.
And from proc
/proc/sys/vm/overcommit_memory
This file contains the kernel virtual memory accounting mode. Values are:
0: heuristic overcommit (this is the default)
1: always overcommit, never check
2: always check, never overcommit

Mostly it uses the sbrk system call to adjust the size of the data segment, thereby reserving more memory for it to parcel out. Memory allocated in that way is generally not released back to the operating system because it is only possible to do it when the blocks available to be released are at the end of the data segment.
Larger blocks are sometime done by using mmap to allocate memory, and that memory can be released again with an munmap call.

How does libc communicate with the OS (e.g., a Linux kernel) to manage memory?
Through system calls - this is a low-level API that the kernel provides.
Specifically, how does it allocate memory, and how does it release memory?
Unix-like systems provide the "sbrk" syscall.
Also, in what cases can it fail to allocate and deallocate, respectively?
Allocation can fail, for example, when there's no enough available memory. Deallocation shall not fail.

How is memory allocated on heap without a system call?

I was wondering that if the space required on heap is not large enough
such that there is no need for a brk/sbrk system all (to shift the break pointer (brk) of data segment), how does a library function (such as malloc) allocates space on heap.
I am not asking about the data-structures and algorithms for heap management. I am just asking how does malloc get the address of the first location of the heap if it doesn't invoke a system call. I am asking this because I have heard that it is not always necessary to invoke a system call (brk/sbrk) as these are only required to expand the space.Please correct me if I am wrong.

The basic idea is that when your program starts, the heap is very small, but not necessarily zero. If you only allocate (malloc) a small amount of memory, the library is able to handle it within the small amount of space it has when it is loaded. However, when malloc runs out of that space, it needs to make a system call to get more memory.
That system call is often sbrk(), which moves the top of the heap's memory region up by a certain amount. Usually, the malloc library routine increases the heap by larger than what is needed for the current allocation, with the hope that future allocations can be performed w/o making a system call.
Other implementations of malloc use mmap() instead -- this allows the program to create a sparse virtual memory mapping. However, mmap() based malloc implementations do the same thing as the sbrk()-based ones: each system call reserves more memory than what is necessarily needed for the current call.
One way to look at this is to trace a program that uses malloc: you'll see that for N calls to malloc, you will see M system calls (where M is much smaller than N).

The short answer is that it uses sbrk() to allocate a big hunk, which at that point belongs to your app process. It can then further parcel out subsections of that as individual malloc calls without needing to ask the system for anything, until it exhausts that space and needs to resort sbrk() again.
You said you didn't want the details on the data structures, but suffice it to say that the implementation of malloc (i.e. your own process, not the OS kernel) is keeping track of which space in the region it got from the system is spoken for and which is still available to dole out as individual mallocs. It's like buying a big tract of land, then subdividing it into lots for individual houses.

Use sbrk() or mmap() — http://linux.die.net/man/2/sbrk, http://linux.die.net/man/2/mmap

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string