How can kernel threads ask pages for only themselves? - linux

As you know, all kernel threads share one kernel address space. The mm field of the task_struct describing a kernel thread is NULL; the thread borrows the address space of the previously running ('prev') task instead.
I think this lets any kernel thread access another kernel thread's private memory region. For example, a device driver may have allocated a 4KB page for its own buffer, but there is no way to prevent other threads from accessing it, because all kernel threads share one address space.
So, I have a question. Is there any way to ask for pages that should be private?

Is there any way to ask for pages that should be private?
No, any process executing kernel code has access to everything in the operating system.
It is up to the operating system and its policies to prevent malicious drivers from being loaded into the kernel, if it needs security guarantees.

Related

Linux - Accessing mmap()ed memory from Thread from Userspace in Kernel Space

Mapped this memory in my Thread in Userspace:
b7fd0000-b7fd1000 rwxp 00000000 00:00 0
Thread is running (endless loop)
Made a breakpoint in the Kernel and trying to access it:
Thread 466 received signal SIGTRAP, Trace/breakpoint trap.
[Switching to Thread 3908]
0xc10d4060 in kgdb_breakpoint ()
(gdb) x/01i 0xb7fd0000
0xb7fd0000: Cannot access memory at address 0xb7fd0000
But it is not accessible.
How can I access 0xb7fd0000 from kernel space? What address will it appear under?
Is it even possible?
Thanks,
The address the memory will appear under depends on which user space context is currently mapped.
The way this works is, some of the virtual addresses are reserved to the kernel, and these are the same in all contexts. This is why you can set a break point on a kernel address without worrying about which user space process is currently mapped.
For user space, this is not the case. Each time a different process is switched in, the user space virtual address mappings change completely.
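A minimal user space sketch of that point, assuming a POSIX system: after fork(), parent and child use the same virtual address for a variable, yet each context maps it to its own copy. The function and variable names are made up for illustration.

```c
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Lives at the same virtual address in parent and child after fork(). */
static volatile int per_context_value = 1;

/*
 * Forks a child that overwrites the variable and reports the address it
 * used. Returns 1 if the child used the same virtual address as the parent
 * while the parent's copy stayed untouched, 0 if not, -1 on error.
 */
int same_address_different_contents(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child context: the write lands in the child's own page. */
        close(fds[0]);
        per_context_value = 2;
        uintptr_t addr = (uintptr_t)&per_context_value;
        ssize_t n = write(fds[1], &addr, sizeof addr);
        _exit(n == (ssize_t)sizeof addr ? 0 : 1);
    }

    /* Parent context: read back the address the child used. */
    close(fds[1]);
    uintptr_t child_addr = 0;
    ssize_t n = read(fds[0], &child_addr, sizeof child_addr);
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0 || n != (ssize_t)sizeof child_addr)
        return -1;

    return child_addr == (uintptr_t)&per_context_value
           && per_context_value == 1;
}
```

This is why a debugger stopped in the kernel cannot interpret 0xb7fd0000 without knowing which process context is current.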
This is, likely, an X-Y problem. You're trying to do something, and you think that a kernel level break point is how you want to achieve it.
Taking a guess, you want your kernel driver to do something to communicate with your user space thread. If that's the case, your best bet is to export a character device, and have the userspace open it and mmap from there (rather than just an anonymous mmap). You can then control which memory it receives, and thus also map it to the kernel address space, where pointers are stable.
I'm not an expert in kgdb, but it is worth checking that current->pid, at the time the breakpoint is hit, contains the PID of the process you're tracking. Although a location inside kernel space has been hit, the current user space process may not be the one you are interested in. Also, to guard against the possibility of pages being swapped out, it might be safer to lock the mapped pages using mlock.
There are at least two things to pay attention to.
1. When to break. If you simply hit ctrl-C in the kernel debugger, you don't know what the userspace context is going to be. It could be any userspace process. You want the kernel debugger to pause and give you control when the userspace context refers to the process you are interested in. One way to do this is as follows: if you have the ability to recompile the kernel, add a new system call. Invoke this system call from the userspace process after the region is mmap'd in. Start debugging the kernel after placing a breakpoint on the newly added system call. When the breakpoint is hit, you know for a fact that the userspace context is that of the process you are interested in.
2. Virtual memory late binding. Even if you follow the steps in [1], you will still have trouble accessing the contents of the buffer in userspace if you have not read or written anything at that location. Make sure that after mmap'ing the region, you either read from or write to the mmap'd location before invoking your newly added system call.
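The late binding can be observed from user space. The sketch below assumes Linux and uses mincore() to ask whether a freshly mmap'd anonymous page is resident, before and after the first write. The function name is invented for this example.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Maps one anonymous page and queries its residency with mincore(),
 * once before and once after writing to it. Stores 0 or 1 in *before
 * and *after. Returns 0 on success, -1 on error.
 */
int residency_before_and_after_touch(int *before, int *after)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    unsigned char *p = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    unsigned char vec = 0;
    int rc = -1;
    if (mincore(p, (size_t)page, &vec) == 0) {
        *before = vec & 1;
        /* The first access faults the page in; until then nothing backs it. */
        ((volatile unsigned char *)p)[0] = 0x5a;
        if (mincore(p, (size_t)page, &vec) == 0) {
            *after = vec & 1;
            rc = 0;
        }
    }

    munmap(p, (size_t)page);
    return rc;
}
```

On a typical Linux system one would expect the page to be reported as not resident before the write and resident after it, although the "before" result can differ in some environments.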

why kernel needs virtual addressing?

In Linux, each process has its own virtual address space (e.g. 4 GB on a 32-bit system, of which 3 GB is reserved for the process and 1 GB for the kernel). This virtual addressing mechanism helps isolate the address space of each process. This is understandable for processes, since there are many of them. But since there is only one kernel, why do we need virtual addressing for the kernel?
The reason the kernel is "virtual" is not to deal with paging as such; it is because the processor can only run in one mode at a time. Once you turn on paged memory mapping (bit 31 in CR0 on x86), the processor expects ALL memory accesses to go through the page-mapping mechanism. So, since we do want to access the kernel even after we have enabled paging (virtual memory), it needs to exist somewhere in the virtual space.
The "reserving" of memory is more about having an easy way to determine whether an address is kernel or user space than anything else. It would be perfectly possible to put a little bit of kernel at address 12345-34121, another bit of kernel at 101900-102400 and some other bit of kernel at 40000000-40001000. But it would make life difficult for every aspect of the kernel and userspace: there would be gaps/holes to deal with [there already are such holes/gaps, but having more wouldn't exactly help things]. Setting a fixed limit of "userspace is from here to here, kernel is from end of userspace to X" makes life much easier in that respect. We can just say kernel = 0; if (address > max_userspace) kernel=1; in some code.
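As a sketch of that check, assuming the traditional 32-bit 3 GB / 1 GB split: the boundary constant below is an assumption for illustration (real kernels make it configurable), and the function name is invented.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Assumed start of kernel space for a 32-bit 3 GB / 1 GB split.
 * This value is illustrative; the real boundary depends on the
 * architecture and kernel configuration.
 */
#define ASSUMED_KERNEL_BASE UINT32_C(0xC0000000)

/* With one fixed boundary, classifying an address is a single comparison. */
bool is_kernel_address(uint32_t address)
{
    return address >= ASSUMED_KERNEL_BASE;
}
```

Under this assumption, an address such as 0xc10d4060 classifies as kernel and 0xb7fd0000 as user space, with no table of scattered kernel regions to consult.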
Of course, the kernel only takes up as much PHYSICAL memory as it will actually use, so the common thinking that "it's a waste to take up a whole gigabyte for the kernel" is wrong. The kernel itself is a few megabytes (a dozen or so for a very "big" kernel). The loaded modules can easily add up to several more megabytes, and graphics drivers from ATI and nVidia easily add another few megabytes just for the kernel module itself. The kernel also uses some memory to store "kernel data", such as tasks, queues, semaphores, files and other "stuff" the kernel has to deal with. A few megabytes are used for this as well.
Virtual Memory Management (VMM) is the feature of Linux that enables multitasking without any limitation on the number of tasks or the amount of memory used by each task. The Linux Memory Manager subsystem (along with the MMU hardware) provides VMM support, where memory or memory-mapped devices are accessed through virtual addresses. Within Linux everything, both kernel and user components, works with virtual addresses except when dealing with real hardware. That is where the Memory Manager steps in: it does the virtual-to-physical address translation and points to the physical memory or device location.
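To make the translation step concrete, here is a small sketch of the arithmetic, assuming classic 32-bit x86 paging with 4 KB pages and no PAE (10 bits of page directory index, 10 bits of page table index, 12 bits of offset). The struct and function names are hypothetical, and the frame base passed in stands for whatever the page tables would supply.

```c
#include <stdint.h>

/* The three fields the MMU extracts from a 32-bit virtual address. */
struct va_parts {
    uint32_t dir_index;   /* bits 31..22: entry in the page directory */
    uint32_t table_index; /* bits 21..12: entry in the page table */
    uint32_t offset;      /* bits 11..0: byte within the 4 KB page */
};

struct va_parts split_virtual_address(uint32_t va)
{
    struct va_parts parts;
    parts.dir_index = va >> 22;
    parts.table_index = (va >> 12) & 0x3FFu;
    parts.offset = va & 0xFFFu;
    return parts;
}

/* Combines the frame base found in the page table with the page offset. */
uint32_t physical_address(uint32_t frame_base, uint32_t va)
{
    return (frame_base & ~UINT32_C(0xFFF)) | (va & 0xFFFu);
}
```

Other architectures and paging modes split the address differently, but the principle of index lookups followed by an offset is the same.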
A process is an abstract entity, defined by kernel to which system resources are allocated in order to execute a program. In Linux Process Management the kernel is an integrated part of a process memory map. A process has two main regions, like two faces of one coin:
User Space view - contains user program sections (Code, Data, Stack, Heap, etc...) used by process
Kernel Space view - contains kernel data structures that maintain information (PID, state, FDs, resource usage, etc...) about the process
Every process in a Linux system has a unique and separate User Space region. This feature of Linux VMM isolates each process's program sections from one another. But all processes in the system share the common Kernel Space region. When a process needs a service from the kernel, it must execute the kernel code in this region; in other words, the kernel performs work on behalf of the user process's request.

Direct access user memory from kernel on Linux

I've got a user-mode process and kernel module. Now I want to read certain regions of usermode process from kernel, but there's one catch: no copying of usermode memory and simple access by VA.
So what we have: task_struct for target process, other related structs (like mm_struct, vma_struct) and virtual address like 0x0070abcd that I want to read or rather map somehow to my kernel module.
I can get page list using get_user_pages for desired memory regions, but what next? Should I map pages somehow into kernel and then try to read them as continuous memory region or there are better solutions?
The problem is that "looking" at user space requires locking a ton of stuff. So it's better to do a short copy than to leave everything locked for arbitrary amounts of time. Your user-space process may not be VM-mapped on the current CPU. In fact, it may be entirely swapped out to disk, running on another CPU, in the middle of its own kernel call, etc.
Linux Kernel: copy_from_user - struct with pointers

For arm Linux, could threads in user space access virtual address of Kernel space?

Virtual memory is split into two parts. Traditionally, 0~3GB is for user space and 3GB~4GB is for kernel space.
My question:
Could the thread in user space access memory of kernel space?
According to the ARM datasheet, access permissions are governed by the domain access control register. But in the kernel source code, the domain value in the page table entries for user space virtual memory is the same as in the kernel space page table entries.
In fact, your application might access page 0xFFFF0000, as it contains the swi-handler and a couple of other userspace-helpers. So no, the 3/1 split is nothing magical, it's just very easy for the kernel to manage.
Usually the kernel will set up all memory above 3GB to be accessible only by the kernel domain itself. If a driver needs to share memory between user and kernel space, it will usually provide an mmap interface, which then creates an aliased mapping, so you have two virtual addresses for the same physical address. This only works reliably on VIPT-cache systems or with a LOT of careful explicit cache flushing. If you don't want this, you CAN hack the kernel to make a chunk of memory ABOVE the 3G split accessible to userspace. But then all userspace applications will share this memory. I've done this once for a special application on an ARMv5 system.
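The idea of an aliased mapping can be tried out entirely in user space. The sketch below is only an analogy to the driver case, assuming a POSIX system: it maps the same page of a temporary file twice with MAP_SHARED, giving two virtual addresses for one backing page. The function name is invented.

```c
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Maps the first page of a temporary file at two different virtual
 * addresses. Returns 1 if a write through one mapping is visible through
 * the other, 0 if not, -1 on error.
 */
int alias_mapping_demo(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;

    FILE *f = tmpfile();
    if (f == NULL)
        return -1;

    int fd = fileno(f);
    int result = -1;

    if (ftruncate(fd, (off_t)page) == 0) {
        char *a = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        char *b = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);

        if (a != MAP_FAILED && b != MAP_FAILED) {
            /* Write through the first alias, read through the second. */
            strcpy(a, "hello");
            result = (a != b) && strcmp(b, "hello") == 0;
        }
        if (a != MAP_FAILED)
            munmap(a, (size_t)page);
        if (b != MAP_FAILED)
            munmap(b, (size_t)page);
    }

    fclose(f);
    return result;
}
```

In a driver, one of the two addresses would be a kernel address and the other the address returned by the process's mmap call; the cache aliasing concerns mentioned above apply to that real case, not to this simplified demonstration.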
Userspace code getting Kernel memory? The only kernel that ever allowed that was DOS and its archaic friends.
But back to the question, look at this example C code:
char c=42;
*c=42;
We take one byte (a char) and assign it the numeric value 42. We then dereference this non-pointer, which will probably try to access the 42nd byte of virtual memory, which is almost definitely not your memory and, for the sake of this example, is kernel memory. Guess what happens when you run this (if you manage to hold the compiler at gunpoint):
Segmentation fault
Linux has memory protection like any modern operating system. If you try to access the memory of another process, your process will be terminated before it can do anything (other things I'm not so sure about happen with debuggers though). Even if that memory belonged to another userland process, you would still get terminated. I'm almost sure that root programs can't access other programs' memory, or kernel memory. The only way to access kernel memory is to be part of the kernel, or to go indirectly through the kernel's cooperation.
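A variant of this experiment that compiles cleanly is sketched below. It assumes Linux and picks an address that is presumed to lie in kernel space (0xffff800000000000 on 64-bit, 0xC0000000 on 32-bit with a 3/1 split); both constants are assumptions, not values taken from any particular system. The read happens in a forked child so the caller survives, and the function name is invented.

```c
#include <signal.h>
#include <stdint.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Addresses assumed to belong to the kernel; adjust for your platform. */
#if UINTPTR_MAX > 0xFFFFFFFFu
#define ASSUMED_KERNEL_ADDR ((uintptr_t)0xffff800000000000ULL)
#else
#define ASSUMED_KERNEL_ADDR ((uintptr_t)0xC0000000UL)
#endif

/*
 * Forks a child that reads one byte from the assumed kernel address.
 * Returns the number of the signal that killed the child, 0 if the child
 * exited normally, or -1 on error.
 */
int signal_from_kernel_address_read(void)
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* volatile keeps the compiler from dropping the read. */
        volatile unsigned char *p = (volatile unsigned char *)ASSUMED_KERNEL_ADDR;
        unsigned char byte = *p;
        (void)byte;
        _exit(0); /* only reached if the read was allowed */
    }

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;

    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

On Linux the returned value would normally be SIGSEGV, matching the "Segmentation fault" message above.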

No address space for Linux Kernel threads

Why do Linux kernel threads not have an address space? For any task to execute, it should have a memory region, right? Where do the text and data of kernel threads go?
Kernel threads do have an address space. It's just that they all share the same one. This does not prevent them from each having a different stack.
Text and data are laid out in the kernel address space (the one that is shared by all the threads), depending on how and when it was allocated, and what it's used for.
The Linux MM site has a lot of documentation about this aspect of Linux. Head over there.
I don't know the precise answer, because I'm not a Linux architect.
But in general, so-called kernel threads do have an address space: it is the address space which contains the kernel. It might not need to be explicitly represented for each kernel thread, since it is shared among many threads.
I'd expect any real thread implementation to have a machine context block containing register values (and stack pointer, etc.), and a pointer to the address space in which the thread is supposed to run. Then a scheduler, starting a ready thread, can easily determine whether the memory management unit is set up to enable access to that address space (and if not, set it up) so that the thread can run in its desired space.
