FreeBSD zone allocator - freebsd

I am reading The Design and Implementation of the FreeBSD Operating System by Marshall Kirk McKusick and George Neville-Neil. In the chapter on kernel memory management, it says the following about the zone allocator:
Each memory type is given its own zone from which all its allocations are made. Memory allocated in one zone cannot be used by any other zone or by the general memory allocator.
My questions are:
1) What memory types are being referred to here?
2) What is meant by the different zones in the context of the zone allocator?
If someone can also provide a reference that explains this better, it would be appreciated.
Thanks.

The zone allocator in FreeBSD is uma(9).
From the manual page:
The zone allocator first appeared in FreeBSD 3.0. It was radically
changed in FreeBSD 5.0 to function as a slab allocator.
A zone is similar to a memory arena/region in a memory pool, but as the manual page mentions, with slab allocator-like features. As your quote implies, you cannot uma_zalloc() from one zone and then uma_zfree() that chunk into a different zone. That'd screw with the internal bookkeeping.
As for memory types, I assume it refers to the different kernel data structures: each kind of structure (vnodes, processes, mbufs, and so on) would typically get a zone of its own.


How to analyse memory distribution in a process?

I have a running (private) server which uses about 1.1 GB of virtual memory (1.0 GB physical memory). I have the source code of the server, and I want to find a good way to get a big picture of the memory distribution among the objects in the server, something like this:
HashTable: 50%, 500M
PlayerCache: 20%, 200M
OtherA: 10%, 100M
...
where pointers inside the objects may point to dynamically allocated memory.
On Linux, you could use proc(5) and pmap(1) (BTW, pmap(1), top(1), ps(1) are all using /proc/ and are useful to you).
So if your server process has pid 1234, you could try pmap 1234 and cat /proc/1234/maps in your terminal to understand the virtual address space of your process. Try also cat /proc/1234/status.
Remember that processes run in virtual memory and each of them has its own virtual address space, and that the physical RAM is a resource managed by the kernel. You might be interested in the RSS (resident set size).
You won't have a report detailing dynamic memory usage per type, or per variable (because that does not make any sense in general: for example, in C, a given malloc-ed memory zone could be dereferenced through casts to different types and be accessible, perhaps indirectly, through several variables). You might use malloc_stats(3) (it can give statistics related to the size of memory zones) if your program is using C dynamic memory allocation.
You could use valgrind, very useful to hunt memory leaks.
You could modify the source code of your server to do some accounting, in your own specific way, of the heap allocation it is doing. For example, if you code in C++, you could modify the constructors and destructors of your classes to increment and decrement some static or global counters, or provide class-specific operator new and delete. Heap memory consumption is a whole-program property (and is not tied, in general, to a specific type or variable). Some programs have their own (type-specific) allocators or use arena-based or region-based allocation.
Read also something about garbage collection, e.g. the GC handbook, to get useful concepts, terminology and techniques (e.g. tracing GC, reference counting, weak references, circular references, smart pointers, etc.). They matter even for manual memory management.
Memory management and allocation is specific to every program, particularly in programming languages like C or C++ (or Rust) with manual memory management, where conventions matter a lot. Study the source code of various existing free software servers (e.g. Apache, Lighttpd, PostgreSQL, Xorg, Unison, Git, ...) and you'll find that each of them has different conventions about memory management.

Find out pages used by mem_map[] array

Recently I have been working on the ARM Linux kernel, and I need to split the HighMem zone into two parts. So I have added a new zone to the kernel, let's say "NewMem". Therefore I have three zones in my system: Normal, NewMem, and HighMem. The size of the NewMem zone is 512MB (131072 pages in total). My purpose is to manage all the page frames in the NewMem zone in my own way; currently I use a doubly linked list to allocate/de-allocate pages. Note that the buddy system for the NewMem zone still exists, but I do not use it. To achieve this, I modified the page allocation routine to make sure that the kernel cannot allocate any page frame from my zone.
My concern is whether I can use all the page frames in that zone, since it is suggested that each zone is concerned with a subset of the mem_map[] array. I found that only 131084 pages are free in the NewMem zone. Therefore, some page frames in my zone may be used to store mem_map[], and writing data to these pages may lead to unpredictable errors. Is there any way to find out which page frames are used to store mem_map[], so that I can avoid overwriting them?
You have to check the breakdown of physical and virtual memory. Usually mem_map is stored at the first mappable address of the kernel's virtual memory. On x86 Linux, the kernel image (usually about 8 MiB) is loaded at physical address 1 MiB and accessed through virtual address PAGE_OFFSET + 0x00100000; 8 MiB of virtual memory is reserved for the kernel image. Then comes the 16 MiB of ZONE_DMA. So the first address the kernel can use for mappings is 0xC1000000, which is supposed to contain the mem_map array.
I am not familiar with the ARM memory breakdown, but from your post it is evident that there is no ZONE_DMA, at least in your case. So your best bet is that address 0xC0800000 stores mem_map, assuming that the kernel image is 8 MiB.
As stated above, in general the first mappable virtual address stores mem_map. You can calculate that address from the size and location of the kernel image and of ZONE_DMA (present or not).
Please report back with your findings.

How to add a memory zone to Linux

I'd like to add a new memory zone (alongside ZONE_DMA, ZONE_NORMAL, ZONE_HIGHMEM, etc.) to the Linux kernel. For example, this memory zone (ZONE_ALTERNATE) would have special properties that make it desirable to use in certain instances. Imagine having a special DDR DIMM that performs some operations on the data. I would like to be able to use kmalloc to allocate from this zone. Specifically I would like to add this zone to kernel 2.6.32, but current kernels would also be of use.
Are there any resources or ideas on how to implement this?

Why the __GFP_HIGHMEM flag can't be applied to __get_free_pages() or kmalloc()

I basically want to know two things:
1) How does kmalloc() work? That is, which function does kmalloc() call to allocate memory: alloc_pages() or __get_free_pages()?
2) Why can't the __GFP_HIGHMEM flag be passed to __get_free_pages() or kmalloc()?
I found the following extract in Linux Kernel Development by Robert Love. Can anybody explain in more detail what the exact problem is with passing the __GFP_HIGHMEM flag to these functions?
Page # 240 CHAPTER 12
You cannot specify __GFP_HIGHMEM to either __get_free_pages() or
kmalloc(). Because these both return a logical address, and not a page
structure, it is possible that these functions would allocate memory
not currently mapped in the kernel’s virtual address space and, thus,
does not have a logical address. Only alloc_pages() can allocate high
memory. The majority of your allocations, however, will not specify a
zone modifier because ZONE_NORMAL is sufficient.
As explained in the book Linux Device Drivers 3rd edition (freely available here), "the Linux kernel knows about a minimum of three memory zones: DMA-capable memory, normal memory, and high memory". The __GFP_HIGHMEM flag indicates that "the allocated memory may be located in high memory". This flag has a platform-dependent role, although its usage is valid on all platforms.
Now, as explained here, "high memory is the part of physical memory in a computer which is not directly mapped by the page tables of its operating system kernel". This zone of memory is not mapped into the kernel's virtual address space, which prevents the kernel from referring to it directly. Unfortunately, the memory used for kernel-mode data structures must be direct-mapped in the kernel, and therefore cannot come from the HIGHMEM zone.

Any way to reserve but not commit memory in linux?

Windows has VirtualAlloc, which allows you to reserve a contiguous region of address space, but not actually use any physical memory. Later when you want to use it (or part of it) you call VirtualAlloc again to commit the region of previously reserved pages.
This is actually really useful, but I want to eventually port my application to linux - so I don't want to use it if I can't port it later. Does linux have a way to do this?
EDIT - Use Case
I'm thinking of allocating 4 GB or some such of virtual address space, but only committing it 64K at a time. This would give me a zero-copy way to grow an array up to 4 GB. Which is important, because the typical double-the-size-and-copy approach introduces seemingly random, unacceptable latency for very large arrays.
mmap a special file, like /dev/zero (or use MAP_ANONYMOUS) as PROT_NONE, later use mprotect to commit.
You can turn this functionality on system-wide by enabling kernel overcommit. This is usually the default setting on many distributions.
Here is the explanation http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
The Linux equivalent of VirtualAlloc() is mmap(), which provides the same behaviour. However, as a commenter points out, reserving contiguous memory is also what malloc() effectively does, as long as the memory is not initialized (such as by calloc() or user code).
"seemingly random unacceptable latency for very large arrays"
You could also consider mlock() or mmap() + MAP_LOCKED to mitigate the impact of paging. Many CPUs support huge (aka large) pages, i.e. pages larger than 4 KiB. These larger pages can mitigate the impact of the TLB on streaming reads/writes.
