Corrupted page table at address - handle

I have a driver that maps system RAM using the function remap_pfn_range. Recently, however, I ran into the following problem when writing to the mapped memory region:
BUG: unable to handle kernel
mydriver: Corrupted page table at address ffff88117ff72000
Could anyone explain what exactly "Corrupted page table at address" means?
Thank you.

The page table is the data structure the OS uses to keep track of pages of memory and where they are (disk, RAM, etc.).
Somewhere there is a pointer to this page table (0xffff88117ff72000), and either the pointer itself is corrupted or the memory it points to is. Either way, the error message indicates that the page table could not be interpreted at this point.

Related

Page Table Entry, Present Bit?

Quoting from: http://www.cburch.com/books/vm/index.html
The final bit (labeled P) indicates whether the page is present in
RAM. If this bit is 0, then any access to the page will trigger a page
fault.
My professor doesn't agree. He said the bit can be 0 while the page is in RAM, and he added that this can happen when the page is shared between multiple processes and one of them does something to it.
Can someone kindly explain this? I still don't get it. I'm looking for detailed examples where a page is in RAM but the present bit in its PTE is 0 rather than 1.
Yes, it's possible to have a page in RAM with the P bit cleared.
This technique is useful when writing a kernel or other software for a multi-threaded, multi-processor environment, where a process needs exclusive rights to a page or where one piece of code must not run concurrently with another. We can temporarily revoke access from other cores/processors by clearing the P bit in the page table entry, and the kernel/software must then handle the resulting page fault accordingly.

How to handle the Linux page cache (tag lookup) returning fewer pages than what was asked?

This is from a file system perspective.
The file system page size is 8K (i.e. double the block size, 4k).
So, when I dirty pages and go for a flush, I make sure that the range passed to pvec_lookup_tag() is 8k aligned at all costs. The page cache should give me pages starting at an 8k-aligned address (i.e. an even index).
So, down to the problem.
I have already dirtied the pages and then I ask the page cache for 14 dirty pages in some specified range and mapping.
But, surprisingly it gives me just one page which is odd aligned.
In short, I'm getting just the second 4k page of my originally intended 8k page.
Also, I checked the mapping by taking a crashdump. All 14 pages I had asked for were right there and also marked dirty.
Just retrying the same lookup gives me the correct pages.
But I feel there must be a better solution here.
Is there some weird window between marking the pages dirty and trying a tag lookup that is causing this?
(I'm on Linux Kernel v3.10.x)
Okay, let me rephrase the question in simpler terms.
Is it possible that a tag lookup in Linux gives me fewer pages than I asked for?
If yes, how should I handle such cases?

Page fault in copy_to_user: how does the kernel map a page for a user-space address?

I've learned that when a page fault occurs in the copy_to_user function, the exception table is used.
But I found that almost all fixups just set the return value and jump to the instruction after the one that triggered the page fault.
Where does the kernel do the mapping work for the user-space address?
I mean, there must be at least some place where the kernel modifies the page table.
Your question is rather unclear. copy_to_user is basically a function for copying data from kernel space to user space. It exists mainly for security reasons: we don't want to give the user direct access to kernel data structures and kernel space, so we need a mechanism for asking the kernel to hand us this data.
A new mapping will indeed be added to the page tables. The mapping is done in kernel space, where the page tables reside.

In Linux, are free blocks (free pages) related to the file system?

I want to aggregate free pages in order to remove duplicated pages from a Xen image.
Before saving, I collected the free pages using the zone free_list array.
But after restoring, Linux hit a Bad page panic.
init[1]: segfault at 2b ip 00007ffad1220329 sp 00007fffb125ad8 error 4 in init[7ffad1211000+ec000]
init: Caught segmentation fault, core dumped
I know I may not get a solution from this question,
but can you guess why these errors occur?
I also wonder whether free blocks are related to the file system (e.g. ext4).
Can anybody give me some advice?
Thank you.

3.10 kernel crash BUG() in mark_bootmem()

I get a kernel crash at BUG() here - http://lxr.free-electrons.com/source/mm/bootmem.c?v=3.10#L385 with the following message
2kernel BUG at /kernel/mm/bootmem.c:385!
What could be a possible reason for this?
The function call trace follows:
[<c0e165f8>] (mark_bootmem+0xd0/0xe0) from [<c0e05d64>] (bootmem_init+0x16c/0x26
[<c0e05d64>] (bootmem_init+0x16c/0x264) from [<c0e07980>] (paging_init+0x734/0x7
[<c0e07980>] (paging_init+0x734/0x7d4) from [<c0e03f20>] (setup_arch+0x3e8/0x69c
[<c0e03f20>] (setup_arch+0x3e8/0x69c) from [<c0e007d8>] (start_kernel+0x78/0x370
[<c0e007d8>] (start_kernel+0x78/0x370) from [<10008074>] (0x10008074)
Thanks
The mm/bootmem.c file implements the boot memory allocator. The function mark_bootmem marks the memory pages between the start and end addresses (start is rounded down and end is rounded up to page boundaries) as reserved (or as not reserved, when used for freeing) for this allocator.
It iterates over bdata_list, trying to find a region containing the first page of the requested address range. If it doesn't find one, the BUG() you mentioned is triggered. The same BUG() is triggered if it does find one but the region is not large enough (end is outside of the region). So this BUG() means that it wasn't able to find the requested memory region to mark.
Now, if I understand the kernel code correctly, on normal UMA systems there will be only one entry in bdata_list, and it should describe the range of low-memory pages available in the system. Since you didn't provide much information about your system, it's hard to guess the exact reason for the problem, but in general it seems that your memory setup is broken. This is very architecture-specific, so it's hard to tell what exactly is going on.
