Write protecting a page in linux kernel - linux

I am trying to write protect a certain memory location on an ARM SoC. If possible, the whole page which would contain the memory location. I am aware of mprotect, but I am not looking for something from userspace. The aim is to implement a wrapper function to access that memory location after temporarily removing the write protection and once the write is completed, the protection is activated again. Other drivers will be using this wrapper function. There is pte_modify() which takes a pte_t. I have the physical address and I can get the pfn. Is there a way to get pte from pfn? Are they same?
Is there a way to do this? There is no userspace involved and remap_pfn_range() like functions are not needed, I guess.

Related

Can kernel code make things read-only in a way that other kernel code can't undo?

I'm under the impression that the Linux kernel's attempts to protect itself revolve around not letting malicious code run in kernelspace. In particular, if a malicious kernel module were to get loaded, that it would be "game over" from a security standpoint. However, I recently came across a post that contradicts this belief and says that there is some way that the kernel can protect parts of itself from other parts of itself:
There is plenty of mechanisms to protect you against malicious modules. I write kernel code for fun so I have some experience in the field; it's basically a flag in the pagetable.
What's there to stop any kernel module from changing that flag in the pagetable back? The only protection against malicious modules is keeping them from loading at all. Once one loads, it's game over.
Make the pagetable readonly. Done.
Kernel modules can just make it read-write again, the same way your code made it read-only, then carry on with their changes.
You can actually lock this down so that kernel mode cannot modify the page table until an interrupt occurs. If your IDT is read-only as well there is no way for the module to do anything about it.
That doesn't make any sense to me. Am I missing something big about how kernel memory works? Can kernelspace code restrict itself from modifying the page table? Would that really prevent kernel rootkits? If so, then why doesn't the Linux kernel do that today to put an end to all kernel rootkits?
If the malicious kernel code is loaded in the trusted way (e.g. loading a kernel module and not exploiting a vulnerability) then no: kernel code is kernel code.
Intel CPUs do have a series of mechanisms to disable read/write access to kernel memory:
CR0.WP if set disallows writes accesses to both user and kernel read-only pages. Used to detect bugs in the kernel code.
CR4.PKE if set (4-level paging must be enabled, mandatory in 64-bit mode) disallows the kernel from accessing (not including instruction fetches) the user page mode unless these are tagged with the right key (which marks their RW permissions). Used to allow the kernel to write to structures like VSDO and KUSER_SHARED_DATA but not other user mode structures. The keys permissions are in an MSR, not in memory; the keys themselves are in the page table entries.
CR4.SMEP if set disallows kernel instruction fetching from user mode pages. Used to prevent attacks where a kernel function pointer is redirected to a user mode allocated page (e.g. the nelson.c privilege escalation exploit).
CR4.SMAP if set disallows kernel access to user mode pages during implicit access or during any type (implicit or explicit) of access (if EFLAGS.AC=0, thus overriding the protection keys). Used to enforce a more strictly no-user-mode-access policy.
Of course the R/W and U/S bits in the paging structures control if the item is read-only/read-write and assigned to user or kernel.
You can read how permissions are applied for supervisor-mode accesses in the Intel manual:
Data writes to supervisor-mode addresses.
Access rights depend on the value of CR0.WP:
- If CR0.WP = 0, data may be written to any supervisor-mode address.
- If CR0.WP = 1, data may be written to any supervisor-mode address with a translation for which the
R/W flag (bit 1) is 1 in every paging-structure entry controlling the translation; data may not be written
to any supervisor-mode address with a translation for which the R/W flag is 0 in any paging-structure
entry controlling the translation.
So even if the kernel protected a page X as read-only and then protected the page structures themselves as read-only, a malicious module could simply clear CR0.WP.
It could also change CR3 and use its own paging structures.
Note that Intel developed SGX to address the threat model where the kernel itself is evil.
However, running the kernel components into enclaves in a secure way (i.e. no single point of failure) may not be trivial.
Another approach is virtualizing the kernel with the VMX extension, though this is by no way trivial to implement.
Finally, the CPU has four protection levels at the segmentation layer but paging has only two: supervisor (CPL = 0) and user (CPL > 0).
It is theoretically possible to run a kernel component in "Ring 1" but then you'd need to make an interface (e.g. something like a call gate or syscall) for it to access the other kernel functions.
It's easier to run it in user mode altogether (since you don't trust the module in the first place).
I have no idea what this is supposed to mean:
You can actually lock this down so that kernel mode cannot modify the page table until an interrupt occurs.
I don't recall any mechanism by which the interrupt handling will lock/unlock anything.
I'm curious though, if anybody can shed some light they are welcome.
Security in the x86 CPUs (but this may be generalized) has always been hierarchical: whoever cames first set up the constraints for whoever cames later.
There is usually little to no protection between nonisolated components at the same hierarchical level.

How to write data to process address space from kernel?

When a process is created by do_execve, I want to write some data somewhere (say 0x0100_0000) such that after the process is run it can access that address to retrieve the data? How to achieve this task?
You can use VDSO. Example of using VDSO mechanism for own calls. The idea is to link userspace application and your code in kernel through special shared library. gettimeofday syscall is implemented in such way, what allows to reduce number of context switches.

Linux kernel module to monitor a particular process

I would like to write a kernel module in Linux that can monitor all the memory accesses made by a particular process(that I specify by name in the kernel module). I would also like to keep track of all the signals generated by the process and log all memory accesses that result in page faults, and memory accesses that cause a TRAP or a SEGV. How could I go about doing this? Could you point me towards any resources that could get me started off?
Well if you have never written a kernel module before this might be a great start:
https://web.archive.org/web/20180901094541/http://www.freesoftwaremagazine.com/articles/drivers_linux?page=0%2C2
From there you basically wan't to grab process information and output it, perhaps create some kind of /proc device..
But you should know this isn't really something you need kernel mode for. You could probably do this easily right from user space.

How a process can refer to objects that are not in its address space (for example, a file or another process)?

I have the homework question:
Explain how a process can refer to objects that are not in its
address space (for example, a file or another process)?
I know that each process is created with an address space that defines access to every memory mapped resource in that process (got that from this book). I think that the second part to this question does not make sense. How can a process reference an object of another process? Isn't the OS suppose to restrict that? maybe I am not understanding the question correctly. Anyways if I understood the question correctly the only way that will be possible is by using the kernel I believe.
If you are asking it in a general sense, then its a no. Operating systems do not allow one process to access another process's virtual address space under the normal circumstances.
However there are ways in which you can create a controlled environment where such a thing can be done using various techniques.
A perfect example is the debugger. It uses process tracing mechanism (like reading from /proc filesystem or using the ptrace() system calls) to gain access to read and write from another address space.
There is also a shared memory concept, where a particular piece of memory is explicitly shared between two processes and can be controlled via a shared memory object.
You can attach as a debugger to the application. Or if using Windows, you can use windows hooks
I have researched and I have the answer to the file part of the question.
first of an address space is the collection of addresses that a
thread can reference. Normally these addresses reference to an
executable in memory. Some operating systems allow a programmer to
read and write the contents of a file using addresses from the process
address space. This is accomplished by opening the file, then binding each byte in the file to an address in the address space.
The second part of the question this is what I will answer:
Most operating systems will not allow reading addresses from another
process. This will imply a huge security risk. I have not heard of any
operating system that enables you to read data from a thread that is
not owned by the current process. So in short I believe this will not
be possible.

Linux Paging and interrupt handler

Dear Sir/Madam,
I am trying to implement ready boost feature in LINUX for my final year undergraduate project.I was just researching and I found out that whenever a page fault occurs the CPU sends Interrupt 14.So, I need your guidance on the foll scheme I am thinking of:
I will create an interrupt handler which will be activated when an interrupt occurs.
This handler can extract the linear address of the fault from cr2 register and we can use LINUX page table to get the physical address.
Do you think that this ,will be a feasible scheme?
Also any tutorial on the same will be highly appreciated.
Thanks to all in advance.
_Regards
Isn't "ReadyBoost" simply implemented by running mkswap followed by swapon on the /dev/sd* device special file for the flash disk? As far as I'm aware, all the necessary kernel-side support is in place.
We're not going to do your assignment for you.
IIUI ReadyBoost is not the same as swap #caf. It is about caching disk content on a faster medium to make random disk accesses faster. Linux will never page disk-backed pages to swap, they will just be dropped and reread from the disk. Only anonymous pages go to swap.
Also ReadyBoost data is mirrored to disk, so the USB drive can be removed at any time, and also encrypted so if the key is removed and analysed on another system nothing is disclosed.
So #R-The_Master you could implement something like ReadyBoost for Linux. But it has basically nothing to do with int 14.

Resources