Find out pages used by mem_map[] array - linux

Recently I have been working on the ARM Linux kernel, and I need to split the HighMem zone into two parts. So I added a new zone to the kernel, let's call it "NewMem". I therefore have three zones in my system: Normal, NewMem, and HighMem. The size of the NewMem zone is 512MB (131072 pages in total). My purpose is to manage all the page frames in the NewMem zone in my own way; currently I use a doubly linked list to allocate/de-allocate pages. Note that the buddy system for the NewMem zone still exists, but I do not use it. To achieve this, I modified the page allocation routine to make sure that the kernel cannot allocate any page frame from my zone.
My concern is whether I can use all the page frames in that zone, since each zone is said to cover a subset of the mem_map[] array. I found that only 131084 pages are free in the NewMem zone, so some page frames in my zone may be used to store mem_map[], and writing data to those pages could lead to unpredictable errors. Is there any way to find out which page frames are used to store mem_map[], so that I can avoid overwriting them?

You have to check the breakdown of physical and virtual memory. Usually mem_map is stored at the first mappable address of the kernel's virtual memory. Typically the kernel image, usually about 8MiB, is loaded at physical address 1MiB and accessed through virtual address PAGE_OFFSET + 0x00100000, and 8MiB of virtual memory is reserved for the kernel image. Then come the 16MiB of ZONE_DMA. So the first address the kernel can use for mapping is 0xC1000000, which is expected to hold the mem_map array.
I am not familiar with the ARM memory breakdown, but from your post it is evident that there is no ZONE_DMA, at least in your case. So your best bet is that address 0xC0800000 stores mem_map. I am assuming the kernel image is 8MiB.
As stated above, in general the first mappable virtual address stores mem_map. You can calculate that address from the size and location of the kernel image and of ZONE_DMA (present or not).
Please follow up with your feedback.
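If it helps, here is a minimal in-kernel sketch of the calculation described above: take the physical address backing mem_map and the array's size, and derive the range of page frame numbers it occupies. This assumes a FLATMEM build where mem_map[] is a single contiguous lowmem array; the function name print_mem_map_pfns is just for illustration, and with SPARSEMEM the array is split into sections so this does not apply as-is.

    /*
     * Sketch only: assumes FLATMEM, where mem_map[] is one contiguous
     * lowmem allocation. mem_map, max_mapnr, virt_to_phys() and PAGE_SHIFT
     * are standard kernel symbols.
     */
    #include <linux/mm.h>
    #include <linux/io.h>
    #include <linux/printk.h>

    static void print_mem_map_pfns(void)
    {
        phys_addr_t   start_phys = virt_to_phys(mem_map);
        unsigned long bytes      = max_mapnr * sizeof(struct page);
        unsigned long first      = start_phys >> PAGE_SHIFT;
        unsigned long last       = (start_phys + bytes - 1) >> PAGE_SHIFT;

        /* Any page frame in [first, last] backs mem_map[] and must not be reused. */
        pr_info("mem_map[] occupies PFNs %lu..%lu\n", first, last);
    }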

Related

How to test if address is virtual or logical in linux kernel?

I know that Linux kernel memory is usually 1:1 mapped (up to a certain limit of the zone). From what I understood, to make this 1:1 mapping more efficient, the array of struct page is virtually mapped.
I wanted to check whether that is the case. Is there a way to test, given an address (let's say that of a struct page), whether it is 1:1 mapped or virtually mapped?
The address space of a 64-bit machine encompasses 2^64 addresses. This is far larger than any modern amount of physical memory in one machine, so it is possible to map the entire physical memory into the address space with plenty of room to spare. As discussed in this post and shown here, Linux leaves 64 TB of the address space for the physical mapping. Therefore, if the kernel needed to iterate through all bytes in physical memory, it could just iterate through addresses 0 + offset to total_bytes_of_RAM + offset, where offset is the address where the direct mapping starts (ffff888000000000 in the 64-bit memory layout linked above). This direct-mapping region lies within the kernel address range that is "shared between all processes", so addresses in this range should always be logical.
Your post has two questions. The first is how to test whether an address is logical or virtual. As mentioned, if the address falls within the direct-mapping range, it is logical; otherwise it is virtual. If it is a virtual address, then obtaining the physical address through the page tables lets you access it logically via the physical_addr + offset arithmetic mentioned above.
Additionally, kmalloc allocates/reserves memory directly through this logical mapping, so if an address came from kmalloc you immediately know it is a logical address. However, vmalloc and any user-space memory allocations use virtual addresses that must be translated to obtain the logical equivalent.
Your second question is whether "logically mapped pages" can be swapped out. The question should be rephrased, because technically all pages that are in RAM are logically mapped in that direct-mapping region. And yes, certain pages in main memory can be swapped out or evicted so the page frame can be reused for another page. If you are asking whether pages that are only mapped logically and not virtually (like those from kmalloc, which gets memory from the slab allocator) can be swapped out, I think the answer is that they can be reclaimed if unused, but they are not generally swapped out. Kernel pages are generally not swapped out, except for hibernation.
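If you want to do the check from kernel code, a small sketch of the classification described above: virt_addr_valid() and is_vmalloc_addr() are existing kernel helpers, while the wrapper function classify_kernel_addr below is only illustrative.

    #include <linux/mm.h>
    #include <linux/vmalloc.h>
    #include <linux/printk.h>

    /* Report whether a kernel address lies in the direct (logical) mapping
     * or in the virtually mapped vmalloc range. */
    static void classify_kernel_addr(const void *addr)
    {
        if (virt_addr_valid(addr))
            pr_info("%px is in the direct (1:1) mapping\n", addr);
        else if (is_vmalloc_addr(addr))
            pr_info("%px is in the vmalloc (virtually mapped) range\n", addr);
        else
            pr_info("%px is neither lowmem nor vmalloc\n", addr);
    }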

Is there a way to know if any kind of page move or swap happened in Linux?

The virtual-address-to-physical-page mapping can change during application runtime due to swapping, physical page reallocation for memory defragmentation, and so on.
Suppose I want to cache the physical page numbers (PPNs) of some virtual address range from /proc/PID/pagemap, since reading /proc/PID/pagemap every time incurs an extremely expensive overhead. Is there a way to be notified when a page has been moved to another physical address or swapped out, either for the given address range or for any part of memory?
Any kind of method is fine (not just user-space methods, but also those that can only be implemented in kernel space).
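For reference, the /proc/PID/pagemap lookup mentioned in the question looks roughly like this: each virtual page has a 64-bit entry whose low 55 bits hold the PFN and whose bit 63 is the "present" flag (see Documentation/admin-guide/mm/pagemap.rst). The program below is only a sketch of reading one entry for the calling process; on recent kernels you need CAP_SYS_ADMIN to see real PFNs.

    #include <fcntl.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <unistd.h>

    int main(void)
    {
        static char buf[4096];
        long page_size = sysconf(_SC_PAGESIZE);

        buf[0] = 1;                               /* touch the page so it is present */

        int fd = open("/proc/self/pagemap", O_RDONLY);
        if (fd < 0)
            return 1;

        uint64_t entry;
        off_t off = ((uintptr_t)buf / page_size) * sizeof(entry);
        if (pread(fd, &entry, sizeof(entry), off) != sizeof(entry))
            return 1;
        close(fd);

        if (entry & (1ULL << 63))                 /* bit 63: page present in RAM */
            printf("vaddr %p -> PFN %#llx\n", (void *)buf,
                   (unsigned long long)(entry & ((1ULL << 55) - 1)));
        else
            printf("vaddr %p is not present (possibly swapped)\n", (void *)buf);
        return 0;
    }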

How to find the unique swap page from a virtual address on a page fault

For example, suppose there are 3 processes, each using the virtual address 0x400000 for its text section, and there is only one 4KB physical page available for user processes.
Suppose process 0 is using the physical page (virtual address 0x400000), and assume the data in that physical page is page_pid_0_0x400000.
When process 1 is scheduled by the OS, page_pid_1_0x400000 of process 1 is loaded into the physical page from the executable, so the page_pid_0_0x400000 data has to be swapped out to disk.
When process 2 is loaded as well, the data then occupying the physical page (page_pid_1_0x400000) also has to be swapped out to disk so that page_pid_2_0x400000 can be loaded.
Now, on disk, we have two pages belonging to the same virtual address, 0x400000: page_pid_0_0x400000 and page_pid_1_0x400000.
If process 1 is scheduled now, how can I (the OS) identify page_pid_1_0x400000 from the virtual address 0x400000 (since memory-access instructions only know the virtual address 0x400000, not the process id)?
The operating system can keep all sorts of associated data structures. For example, each process can have its own data structures (and page tables) representing its address space, and the operating system just has to make sure to point the CPU at the correct set of page tables when it resumes the process.
Similarly, the swap handling isn't constrained to use just a virtual address; it can use the pair (address space, virtual address) to look up the swap location. It can make this as flexible or as rigid as needed. For example, it might treat a virtual address as part of a contiguous collection of virtual addresses that share something about where their pages are stored in files or in swap.
The page tables, and the notion of a virtual address, are an interface to the CPU+MMU address translation. The operating system can maintain whatever associated data structures it wants.
In older systems, each page descriptor (sometimes page table entry, or PTE) had a bit which determined whether the page was valid. The CPU/MMU ignores pages that are not valid, so when a page was swapped to disk, the remaining bits of the page table entry were a handy place to store the disk swap address.
Modern systems tend to have more complex data structures to accommodate transient sharing and locking of pages, so an auxiliary structure is often used.
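As a purely conceptual sketch of the older scheme described above (not actual Linux code): when the "valid"/"present" bit of a page table entry is clear, the hardware ignores the remaining bits, so the OS can reuse them to record where on disk the page went. Because each process has its own page table, the pair (process, virtual address) selects a unique entry and therefore a unique swap slot. The layout and field widths below are made up for illustration.

    #include <stdint.h>

    /* Hypothetical page-table-entry layout, for illustration only. */
    struct pte {
        uint64_t present   : 1;   /* 1 = rest of the entry names a physical frame   */
        uint64_t swap_type : 5;   /* when not present: which swap device/file        */
        uint64_t swap_slot : 58;  /* when not present: offset within that device     */
    };

    /* The faulting virtual address plus the current process's own page table
     * are enough to find the swap slot; no PID is encoded in the address. */
    static inline uint64_t swap_slot_of(const struct pte *e)
    {
        return e->present ? UINT64_MAX : e->swap_slot;
    }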

How does the Linux kernel save task_struct in dynamic memory?

While reading Understanding the Linux Kernel, I came across this sentence:
process descriptors are stored in dynamic memory.
As far as I know, for a 32-bit computer system:
The kernel reserves roughly 128MB of virtual address space at the top of its address range (the high-memory area) to map dynamic physical memory.
My question is: although high memory can reach every physical address, it can map at most 128MB at once. The kernel data structures could together exceed 128MB. If the kernel wants to remap part of the high-memory area, the virtual addresses of some data structures stored there might become invalid. How can the kernel keep more than 128MB of kernel data structures in dynamic physical memory?
Although I have tried hard to express myself clearly and to follow this site's rules, there may still be something I got wrong. I'm very sorry if so.
What does "The kernel data structures could together exceed 128MB." mean? There is no single "kernel data structure". There are things the kernel allocates, but they are at most a few pages long. In particular, there is no single object larger than 128MB.
If something really is physically big (say a file read entirely into RAM that takes 512MB), the kernel just maps and unmaps physical pages as it needs them. In particular, there is no need for the whole file to be mapped at the same time, and the virtual addresses its parts are temporarily mapped at are meaningless.
Also note that today x86_64 provides a 128TB address space, so there are no shenanigans of this sort.
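A minimal sketch of the map-as-needed pattern the answer describes, using the kernel's kmap_local_page()/kunmap_local() helpers (kmap_atomic() on older kernels); the function zero_one_page is just an illustrative example.

    #include <linux/highmem.h>
    #include <linux/mm.h>
    #include <linux/string.h>

    /* A highmem page has no permanent kernel virtual address, so borrow one
     * only for the duration of the access and then give it back. */
    static void zero_one_page(struct page *page)
    {
        void *vaddr = kmap_local_page(page);   /* temporarily map the page */
        memset(vaddr, 0, PAGE_SIZE);           /* use the mapping ...      */
        kunmap_local(vaddr);                   /* ... and release it again */
    }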

How does virtual to physical memory mapping work

I'm currently trying to understand systems programming for Linux and have a hard time understanding how virtual-to-physical memory mappings work.
What I understand so far is that two processes P1 and P2 can reference the same virtual address, for example 0xf11001. This memory address is split into two parts: 0xf11 is the page number and 0x001 is the offset within that page (assuming a 4096-byte page size). To find the physical address, the MMU has hardware registers that map the page number to a physical frame, let's say 0xfff. The last step is to combine 0xfff with 0x001 to get the physical address 0xfff001.
However, with this understanding it makes no sense: the same virtual addresses would still point to the same physical location. What step am I missing for my understanding to be correct?
You're missing one crucial step here. In general, the MMU doesn't have hardware registers holding the mappings themselves, but instead a single register (the page table base pointer) which points to the physical memory address of the page table (with the mappings) of the currently running process, and page tables are unique to every process. On a context switch, the kernel changes this register's value, so a different mapping is used for each running process.
Here's a nice presentation on this topic: http://www.eecs.harvard.edu/~mdw/course/cs161/notes/vm.pdf
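To make the crucial step concrete, here is a toy user-space illustration (not real MMU code): the same virtual page number resolves to different physical frames simply because each process has its own page table, and the base-pointer register is switched on every context switch. The table contents (0xfff, 0xabc) are made-up example frame numbers.

    #include <stdint.h>
    #include <stdio.h>

    #define PAGE_SHIFT 12
    #define PAGE_MASK  ((1u << PAGE_SHIFT) - 1)

    /* One tiny "page table" per process, indexed by virtual page number. */
    static const uint32_t page_table_p1[0x1000] = { [0xf11] = 0xfff };
    static const uint32_t page_table_p2[0x1000] = { [0xf11] = 0xabc };

    /* What the MMU does: split the address, look up the frame, recombine. */
    static uint32_t translate(const uint32_t *page_table_base, uint32_t vaddr)
    {
        uint32_t vpn    = vaddr >> PAGE_SHIFT;   /* 0xf11 */
        uint32_t offset = vaddr & PAGE_MASK;     /* 0x001 */
        return (page_table_base[vpn] << PAGE_SHIFT) | offset;
    }

    int main(void)
    {
        uint32_t vaddr = 0xf11001;
        /* A context switch amounts to pointing the MMU at a different table. */
        printf("P1: %#x -> %#x\n", vaddr, translate(page_table_p1, vaddr));
        printf("P2: %#x -> %#x\n", vaddr, translate(page_table_p2, vaddr));
        return 0;
    }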
