Does Linux support 2MB pages at compile time? - linux

I know that some processors these days support 2MB and 1GB page sizes. Is it possible to compile the Linux kernel to natively use 2MB pages instead of the standard 4KB page?
Thanks.

Well, I can say yes and no.
The base page size is fixed, but whether changing it is worth attempting depends on your patience for the errors and issues you will encounter.
The page size is determined by the MMU hardware, so the operating system has to take it into account. However, notice that some Linux systems (and hardware!) support hugetlbpage, and Linux mmap(2) may accept MAP_HUGETLB (but your code should handle processors or kernels without huge page support, e.g. by calling mmap again without MAP_HUGETLB when the first mmap with MAP_HUGETLB has failed).
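A minimal sketch of that fallback pattern (assuming a kernel built with hugetlbfs support and huge pages reserved, e.g. via /proc/sys/vm/nr_hugepages; note that a MAP_HUGETLB mapping's length must be a multiple of the huge page size):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    /* Try a huge-page mapping first; fall back to normal pages if
     * the kernel or hardware lacks huge page support. */
    static void *alloc_buffer(size_t len)
    {
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_HUGETLB, -1, 0);
        if (p == MAP_FAILED)
            /* No huge pages available (or none reserved): retry
             * with the default 4KB pages. */
            p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        return p == MAP_FAILED ? NULL : p;
    }

    int main(void)
    {
        size_t len = 2 * 1024 * 1024;   /* one 2MB huge page */
        void *buf = alloc_buffer(len);
        if (!buf) { perror("mmap"); return 1; }
        memset(buf, 0, len);            /* touch the memory */
        munmap(buf, len);
        return 0;
    }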
You may find these links interesting:
https://www.cloudibee.com/linux-hugepages/
https://forums.opensuse.org/showthread.php/437078-changing-pagesize-kernel
https://linuxgazette.net/155/krishnakumar.html
https://lwn.net/Articles/375096/

Related

How does the OS know a page is dirty in mapped memory?

I mean when data is updated directly in memory, without using write().
In Linux, I thought all the data specified in an msync call was flushed.
But in Windows, the documentation of FlushViewOfFile says "writing of dirty pages", so somehow the OS knows which pages have been updated.
How does that work? Do we have to use WriteFile to update mapped memory?
If we use write() in Linux, does msync only sync dirty pages?
On most (perhaps all) modern-day computers running either Linux or Windows, the CPU keeps track of dirty pages on the operating system's behalf. This information is stored in the page table.
(See, for example, section 4.8 of the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 3A, and section 5.4.2 of the AMD64 Architecture Programmer's Manual, Volume 2.)
If that functionality isn't available on a particular CPU, an operating system could instead use page faults to detect the first write to a page, as described in datenwolf's answer.
When flushing pages (i.e. cleaning them), the OS internally removes the "writable" flag. After that, when a program attempts to write to a memory location in such a page, the kernel's page fault handler is invoked. The page fault handler sets the page access permissions to allow the write and marks the page dirty, then returns control to the program to let it perform the actual write.
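As an illustration, here is a userspace analogue of that fault-based scheme, a sketch of my own (not kernel code): write-protect a page, let the first store fault, and have the SIGSEGV handler mark the page dirty and re-enable writes. (Strictly speaking mprotect is not async-signal-safe, but this demonstrates the mechanism.)

    #define _GNU_SOURCE
    #include <signal.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    static char *region;
    static long page_size;
    static volatile sig_atomic_t dirty;   /* set by the fault handler */

    /* First write to the read-only page faults; unprotect the page,
     * record it as dirty, and let the write retry. */
    static void on_segv(int sig, siginfo_t *si, void *ctx)
    {
        (void)sig; (void)ctx;
        char *page = (char *)((uintptr_t)si->si_addr & ~(uintptr_t)(page_size - 1));
        mprotect(page, page_size, PROT_READ | PROT_WRITE);
        dirty = 1;
    }

    int main(void)
    {
        page_size = sysconf(_SC_PAGE_SIZE);
        region = mmap(NULL, page_size, PROT_READ,   /* read-only = "clean" */
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED) { perror("mmap"); return 1; }

        struct sigaction sa;
        sa.sa_flags = SA_SIGINFO;
        sa.sa_sigaction = on_segv;
        sigemptyset(&sa.sa_mask);
        sigaction(SIGSEGV, &sa, NULL);

        region[0] = 42;                     /* faults exactly once */
        printf("dirty = %d\n", (int)dirty); /* prints: dirty = 1 */
        return 0;
    }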

how to enable hugetlb on mips32

Here is the problem I have:
Packets are received and transmitted in a kernel driver, and a user-space program needs to access each of these packets, so there is a huge amount of data transferred between kernel and user space. (Data stream: kernel rx -> user space process -> kernel tx.)
Throughput is the KPI.
I decided to use shared memory/mmap to avoid copying the data. Although I haven't tested it yet, others have told me that TLB misses will be a problem.
The system I use is a
mips32 system (mips74kc, single core)
default page size 4KB.
kernel 2.6.32
A 4KB page can only fit one data packet, so during the data transfer there will be lots of TLB misses, which hurt throughput.
I found that huge pages might be a solution, but it seems that only mips64 supports hugetlbfs currently.
https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt
https://www.linux-mips.org/archives/linux-mips/2009-05/msg00429.html
So, my question is: how can I use hugetlbfs on mips32? Or is there another way to solve the throughput problem? (I must do the data processing part in user space.)
According to ddaney's patch,
Currently the patch only works for 64-bit kernels because the value of
PTRS_PER_PTE in 32-bit kernels is such that it is impossible to have a
valid PageMask. It is thought that by adjusting the page allocation
scheme, 32-bit kernels could be supported in the future.
It seems possible. Could someone give me a hint about what needs to be modified in order to enable hugetlb?
thank you!
Does the documentation of your core list support for non-4KB pages in its TLB? If it is not supported, you would have to change the CPU (replace it with one that supports larger pages, or redesign the core and make a new chip).
But most probably you are on the wrong track: TLB misses are not yet proven to be the problem (and a 2MB huge page is the wrong solution for 8KB or 15KB packets).
Consider instead "zero-copy" and/or user-space networking (netmap, snabb, PF_RING, DPDK, a network stack in user space), a user-space network driver, or a kernel-based data handler. Note that many of these tools are only available for newer kernels.

Changing memory page size

I was reading that the number of virtual memory pages equals the number of physical memory frames, and that the size of frames and pages is the same; for example, on my 32-bit system the page size is 4096 bytes.
I was wondering: is there any way to change the page size or the frame size?
I am using Linux. I have searched a lot, and what I found is that we can increase the page size by switching to huge pages. Is there any other way to change (increase or decrease) the page size, or to set a page size of our choice?
(Not coding anything, just a general question.)
In practice it is (nearly) impossible to "change" the memory page size, since the page size is determined by the MMU hardware, so the operating system has to take that into account. However, notice that some Linux systems (and hardware!) support hugetlbpage, and Linux mmap(2) may accept MAP_HUGETLB (but your code should handle processors or kernels without huge page support, e.g. by calling mmap again without MAP_HUGETLB when the first mmap with MAP_HUGETLB has failed).
From what I read, on some Linux systems you can use hugetlbpage with various sizes. But the sysadmin can restrict these (or some kernels disable them), so your code should always be prepared for an mmap with MAP_HUGETLB to fail.
Even with those "huge pages", the page size is not arbitrary. Use sysconf(_SC_PAGE_SIZE) on POSIX systems to get the standard page size (it is usually 4KB). See also sysconf(3).
AFAIK, even on systems with the hugetlbpage feature, mmap can be called without MAP_HUGETLB, and the page size (as reported by sysconf(_SC_PAGE_SIZE)) is still 4KB. Perhaps some recent kernels with some unusual configurations use huge pages everywhere, and IIRC some kernels might be configured with a 1MB page (I am not sure about that and I might be wrong)...
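As a quick check, a small program can report the base page size portably and, assuming a Linux /proc filesystem, the default huge page size advertised in /proc/meminfo:

    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    int main(void)
    {
        /* Portable: the base page size. */
        long psz = sysconf(_SC_PAGE_SIZE);
        printf("base page size: %ld bytes\n", psz);   /* usually 4096 */

        /* Linux-specific: the default huge page size. */
        FILE *f = fopen("/proc/meminfo", "r");
        if (f) {
            char line[256];
            while (fgets(line, sizeof line, f))
                if (strncmp(line, "Hugepagesize:", 13) == 0)
                    printf("%s", line);   /* e.g. "Hugepagesize: 2048 kB" */
            fclose(f);
        }
        return 0;
    }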

Configure virtual page size in linux

I'm testing CPU's hardware prefetcher.
It is known that hardware prefetching does not cross a page boundary.
I want to make sure that my test works correctly.
Does anybody know how I can change the virtual page size in Linux?
On x86-64, the page sizes supported by the hardware are 4KB, 2MB, and (on CPUs with the pdpe1gb feature) 1GB. 4KB is used by default; for 2MB pages, you can use Linux's hugetlb system to allocate them.
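If setting up hugetlb is inconvenient for a quick experiment, one alternative (not mentioned in the answer above, so treat it as an assumption about your kernel) is transparent huge pages via madvise(MADV_HUGEPAGE), which is advisory only:

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 4 * 1024 * 1024;   /* a few MB, so a 2MB-aligned
                                           subrange exists inside */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED) { perror("mmap"); return 1; }

        /* Advisory only: the kernel may still back this with 4KB
         * pages (e.g. THP disabled or the range not 2MB-aligned). */
        if (madvise(p, len, MADV_HUGEPAGE) != 0)
            perror("madvise(MADV_HUGEPAGE)");

        munmap(p, len);
        return 0;
    }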

Any way to reserve but not commit memory in linux?

Windows has VirtualAlloc, which allows you to reserve a contiguous region of address space, but not actually use any physical memory. Later when you want to use it (or part of it) you call VirtualAlloc again to commit the region of previously reserved pages.
This is actually really useful, but I want to eventually port my application to Linux, so I don't want to use it if I can't port it later. Does Linux have a way to do this?
EDIT - Use Case
I'm thinking of allocating 4 GB or some such of virtual address space, but only committing it 64K at a time. This would give me a zero-copy way to grow an array up to 4 GB, which is important because the typical double-the-size-and-copy approach introduces seemingly random, unacceptable latency for very large arrays.
mmap a special file like /dev/zero (or use MAP_ANONYMOUS) with PROT_NONE; later use mprotect to commit.
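A minimal sketch of that reserve-then-commit pattern (my own illustration, assuming a 64-bit Linux; MAP_NORESERVE keeps the reservation uncharged even under stricter overcommit settings):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <string.h>
    #include <sys/mman.h>

    #define RESERVE (4UL << 30)   /* 4GB of address space */
    #define CHUNK   (64UL << 10)  /* commit 64KB at a time */

    int main(void)
    {
        /* Reserve: PROT_NONE means the range can't be touched yet and
         * (with MAP_NORESERVE) no swap space is charged for it. */
        char *base = mmap(NULL, RESERVE, PROT_NONE,
                          MAP_PRIVATE | MAP_ANONYMOUS | MAP_NORESERVE, -1, 0);
        if (base == MAP_FAILED) { perror("mmap"); return 1; }

        /* Commit the first chunk by making it accessible. */
        if (mprotect(base, CHUNK, PROT_READ | PROT_WRITE) != 0) {
            perror("mprotect");
            return 1;
        }
        memset(base, 0, CHUNK);   /* physical pages get allocated here */

        munmap(base, RESERVE);
        return 0;
    }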
You can turn this functionality on system-wide by using kernel overcommit. This is usually the default setting on many distributions.
Here is the explanation http://www.mjmwired.net/kernel/Documentation/vm/overcommit-accounting
The Linux equivalent of VirtualAlloc() is mmap(), which provides the same behaviour. However, as a commenter points out, reserving contiguous memory is also the behaviour of malloc() calls, as long as the memory is not initialized (whether by calloc() or by user code).
"seemingly random unacceptable latency
for very large arrays
You could also consider mlock() or mmap() + MAP_LOCKED to mitigate the impact of paging. Many CPUs support huge (aka large) pages, pages larger than 4kb. These larger pages can mitigate the impact of the TLB on streaming reads/writes.
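For the mlock() route, a small sketch (assuming RLIMIT_MEMLOCK permits locking the region):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        size_t len = 1 << 20;   /* 1MB */
        /* MAP_LOCKED asks the kernel to fault the pages in and pin
         * them in RAM; this fails if RLIMIT_MEMLOCK is too low. */
        void *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS | MAP_LOCKED, -1, 0);
        if (p == MAP_FAILED) {
            perror("mmap(MAP_LOCKED)");
            return 1;
        }
        /* ... place the latency-sensitive array in p ... */
        munmap(p, len);
        return 0;
    }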
