Are user code in the end executed in kernel mode? - multithreading

For what i learned from Operating System Concepts and online searching:
all user threads are finally mapped to kernel threads for being scheduled to physical CPUs
kernel threads can only be executed in kernel mode
above two arguments leads to the conclusion:
user code are all executed in kernel mode
is this right?
i have read the whole book and searched for many articles, the question still holds.
at Wikipedia, it says about LWP:
Kernel threads
Kernel threads are handled entirely by the kernel. They need not be associated with a process; a kernel can create them whenever it needs to perform a particular task. Kernel threads cannot execute in user mode. LWPs (in systems where they are a separate layer) bind to kernel threads and provide a user-level context. This includes a link to the shared resources of the process to which the LWP belongs. When a LWP is suspended, it needs to store its user-level registers until it resumes, and the underlying kernel thread must also store its own kernel-level registers.
also what does it means when saying about user-level registers and kernel level registers?
after digging and digging, i have following temp conclusion, but i am not sure. Hope the question further be answered and clearifed:
kernel thread, depending on discussion context, has two meanings:
when talking about user/kernel threading, kernel thread means a kernel task that totally execute in kernel mode and only execute kernel codes, like ksoftirqd for handling bottom half of interrupts
when taking about threading model, namely how user code is mapped into schedulable entities in kernel, kernel thread means a task that is schedulable by kernel
further about threading model and light weight processes in Linux:
in old times the operating system does not know thread, it only know processes(tasks) and threads are implmented by thread libraries totally in user side. There is a inherent problem for this that is if one user thread is blocked, such as I/O, all the user threads are blocked, because there is only one schedulable tasks in the kernel for this process. From the perspective of the kernel, the whole process is blocked. To solve this problem, light weight process(LWP), also called virtual processor(VP) is invented.
LWP is a intermedia data structure between user thread and a kernel thread(the second meaning above). LWP binds a user thread with a kernel thread(task), which in before is bounded with a user process. Simply put: in before a user process occupies a kernel thread(task), now with LWP a user thread can occupy a individual kernel thread(task), without sharing it with other user threads. (I think) This is why it is called light weight process. The advantage of this model is obvious, if one of the user thread is blocked, other user threads has ways to continue being executed by other kernel threads(tasks).
A kernel thread(task) acutually knows nothing about user process. It is just a task, a schedulable entity created, managed, destroyed totally by kernel itself. But a LWP belongs to a specific process and knows other LWPs that also belongs to the same one. LWP is like a bridge between user process and kernel thread(task).
When a kernel thread(task) that is bound to a LWP is scheduled by the kernel, the user level registers(pointed by LWP) is loaded into CPU, also the kernel thread(task) has registers and they are also loaded into CPU. From the standing point of CPU, a LWP is a kernel thread(task). It does not care it executes kernel code or user code.
user/kernel mode, user/kernel thread: they are independent. In Linux, a user thread created by pthread essentially is a kernel thread and this thread can execute in both user mode or kernel mode, depending on whether the thread is executing user code or kernel code.

All user threads are finally mapped to kernel threads.
That is not a useful way to think about threads. In most operating systems, a program can ask the OS to create a new thread and the program can provide a pointer to a function for the new thread to call. There's no "mapping" that happens there.* The new thread runs in exactly the same way as the program's original (a.k.a., "main") thread. It runs application code in user mode except, occasionally, when it makes a system call, and then for the duration of the system call it runs kernel code in kernel mode.
Many programming languages come with an OS-independent library that provides some kind of a Thread object. The thread object is not the same thing as the actual thread. It's more of a handle that the application uses to control the OS thread. If you like, you can say that those thread objects are "mapped" to OS threads, but that's still somewhat abusing the notion of what a "mapping" is.
kernel threads can only be executed in kernel mode
If you aren't writing OS code, it's best to avoid saying "kernel thread" altogether. In the Linux OS in particular, "kernel thread" means something, and it has nothing whatever to do with application code. Linux kernel threads are threads that are created by the OS for the OS, and they never run "user" (i.e., application) code.
It's possible for an application program to create and schedule its own threads, completely unknown to the OS. Some people call those "user threads." Some used to call them "green threads." Back in the old days, before any OS had thread support, we just called them "threads." Doing threads that way is a lot of work, for little reward. (Can't schedule them preemptively.) Outside of the realm of tiny, embedded, real-time systems, almost nobody bothers to do it anymore.
* But wait! Things will get more complicated in the near future when Java's Project Loom hits the main stream. Threads traditionally are expensive. In particular, each thread must have its own contiguous call stack—usually a chunk of at least a few megabytes—allocated to it. The goal of project loom is to make threads as cheap as any other object.
They way they intend to make threads "cheap" is to "virtualize" them, and to break up their call stacks into linked lists of reclaimable heap objects. Under project loom, a limited number of real OS threads that are scheduled by the OS scheduler will, in turn, schedule and execute the code of a multitude of "virtual" application threads, and so there really will be something going on that feels a bit like "mapping."
I won't be at all surprised if the same idea spreads to other languages.

There are two different meanings of kernel threads. When threading people talk about "kernel threads" they mean "threads the kernel knows about" i.e. "threads that are controlled by the kernel". When kernel people talk about "kernel threads" they mean "threads that run in kernel mode".
"Threads the kernel knows about" are contrasted to "user threads" which are hidden from the kernel and controlled by the program itself.
No, not all threads controlled by the kernel run in kernel mode. The kernel controls the scheduling of threads that run in kernel mode, and also threads that run in user mode.
The quote about LWPs is talking about systems where the scheduler thinks that all threads are kernel-mode threads. To run a user-mode thread (which they call an LWP because it's not really a thread because all threads are kernel-mode threads) the thread has to call a function like RunLWP(pointer_to_lwp);.
I don't know which system is like this. Linux is not like this; Windows is not like this. This is a weird, overly complicated design which is why it's not normally used.
The "registers" are where the CPU remembers what it is currently doing. The most important one is the "instruction pointer" register (some CPUs call it something different) which remembers which instruction is next. If you remember all of the register values, and then come back later and set them to the same values, the CPU will carry on like nothing happened. That's why threading works - the thread can't tell that it's been interrupted, because all of the registers have the same values as if it wasn't interrupted. Here's a list of registers on x86-class CPUs. You don't need to know them for this question - it just might be interesting.
When an interrupt happens, depending on the CPU type, the CPU will save the instruction pointer and maybe one or two other registers. The interrupt handler has to save the rest (or be careful not to change them). Here about halfway down you can see how an x86-class CPU switches from user-space to an interrupt handler when an interrupt occurs.
So this RunLWP function would save the current registers (from the kernel) and set them according to the last time the LWP stopped running. Then the LWP runs. Then when some interrupt happens, the interrupt handler would save the current registers (from user-space) and set them according to the saved kernel handlers, so the kernel code after RunLWP runs. Probably. Again, I don't know any actual system like this, but it seems like the logical way to do things. The reason it should return back to the kernel code instead of the user code is so that the kernel code can decide whether it wants to keep running the LWP or not.
I don't know why they would say the interrupt handler would save both the kernel-space and user-space registers. Current CPUs generally only have one set of registers which software has to swap out when it wants to make the CPU change what it is doing. RunLWP would have to save the kernel registers and load the user ones, then the interrupt handler would have to save the user registers and load the interrupt handler ones. It could be that the CPUs which these systems were designed for did have two sets of registers.

Related

Is a coroutine a kind of thread that is managed by the user-program itself (rather than managed by the kernel)?

In my opinion,
Kernel is an alias for a running program whose program text is in the kernel area and can access all memory spaces;
Process is an alias for a running program whose program has an independent memory space in the user memory area. Which process can get the use of the CPU is completely managed by the kernel;
Thread is an alias for a running program whose program-text is in the memory space of a process and completely shares the memory space with another thread of the same process. Which thread can get the use of the CPU is completely managed by the kernel;
Coroutine is an alias for a running program whose program-text is in the memory space of a process.And it is a user thread that the process decides itself (not the kernel) how to use, and the kernel is only responsible for allocating CPU resources to the process.
Since the process itself has no right to schedule like the kernel, the coroutine can only be concurrent but not parallel.
Am I correct in saying Above?
process is an alias for a running program...
The modern way to think of a process is to think of it as a container for a collection of threads and the resources that those threads need to execute.
Every process (except for "zombie" processes that can exist in some systems) must have at least one thread. It also has a virtual address space, open file handles, and maybe sockets and other resources that are shared by the threads.
Thread is an alias for a running program...
The problem with saying that is, "running program" sounds too much like "process," and a thread is most definitely not a process. (E.g., a thread can only exist in a process.)
A computer scientist might tell you that a thread is one particular execution of the application's code. I like to think of a thread as an independent agent who executes the code.
coroutine...is a user thread...
I'm going to mostly leave that one alone. "Coroutine" seems to mean something different from the highly formalized, and not particularly useful coroutines that I learned about more than forty years ago. What people call "coroutines" today seem to have somewhat in common with what I call "green threads," but there are details of how and when and why they are used that I don't yet understand.
Green threads (a.k.a., "user mode threads") simply are threads that the kernel doesn't know about. They are pretty much just like the threads that the kernel does know about except, the kernel scheduler never preempts them because, Duh! it doesn't know about them. Context switches between green threads can only happen at specific points where the application allows it (e.g., by calling a yield() function or, by calling some library function that is a documented yield point.)
kernel is an alias for a running program...
The kernel also is most definitely not a process.
I don't know every detail about every operating system, but the bulk of kernel code does not run independently of the applications that the kernel serves. It only runs when an application thread enters kernel mode by making a system call. The thread that runs the kernel code still belongs to the application process, but the code that determines the thread's behavior at that point is written or chosen by the kernel developers, not by the application developers.

Since all user threads are mapped to kernel threads , then user threads are running in kernal mode?

I know this is not true. But why? Am I'm confusing kernel threads and kernel mode?
Sometimes, when you perform a system call. But that's not what the term really means.
"Kernel threads" refers to the fact that the kernel itself recognizes each thread as being separate. This means that they each have a corresponding data structure in the kernel, and the kernel can treat them individually, such as by scheduling them separately. The mode the thread is running in has nothing to do with it since it is just the concept of the thread in the kernel.
"User threads" are implemented in user space. The kernel has no idea that there are multiple threads in the process, and so cannot treat them individually. The kernel just sees the main thread, and might not separate it from the process at all.

Why do Kernel threads run in process context?

I have recently learned that Linux kernel threads run in process context.
Why are they run in process context?
Why are they not simply run in a traditional "thread"? (if that even makes sense to ask)
No it doesn't make sense to ask :) (see here)
Process context merely means the thread is a normal thread, such as the threads you get in processes.
Interrupt context just means the thread was started by an interrupt.
Caveat: the following is highly simplified and not be completely accurate:
Interrupts are low level events that cause the CPU to stop what it is doing and execute special code called an interrupt handler (do a context change to the interrupt handler). Interrupts are caused by hardware e.g. a network card signals that a packet has arrived and needs to be read, or by software events e.g. virtual memory uses interrupts to ask the kernel to load a page from disk physical memory, etc..
In modern CPUs interrupts and threads are quite complex, they have priorities, privilege levels, can be individually masked, etc..
Why is it called a process context and not a thread context? I assume this is for historic reasons.
Traditionally Unix, and by extension Linux, did not support threads only processes.
CPUs don't really know about processes and threads, from a CPU point of view they are all execution contexts, the difference between threads and a processes is a function of how the operating system arranges the virtual memory and other OS related attributes (user context, permissions, etc.) of the different execution contexts.
A Kernel Thread can be referred to as a context in execution that does not possess a userspace counterpart (unlike other threads/processes in Linux where the userspace process is mapped to a kernel space process). These are generally used as daemons eg. kswapd - the swapper process for evicting virtual memory pages. There is no userspace existence of this process.
Secondly, because these have a definite context associated with itself that can be switched (say the state of registers save on its own stack), the Kernel Threads are schedulable. And, anything that is schedulable can be considered as a "Process context".
On the other hand, interrupts are not schedulable. They occur and execute the interrupt handler spawning its own context.

user threads v.s. kernel threads

Could someone help clarify my understanding of kernel threads. I heard that, on Linux/Unix, kernel threads(such as those of system calls) get executed faster than user threads. But, aren't those user threads scheduled by kernel and executed using kernel threads? could someone please tell me what is the difference between a kernel thread and a user thread other than the fact that they have access to different address spaces. what are other difference between them? Is it true that on a single processor box, when user thread is running, kernel will be suspended?
Thanks in advance,
Alex
I heard that, on Linux/Unix, kernel threads(such as those of system calls) get executed faster than user threads.
This is a largely inaccurate statement.
Kernel threads are used for "background" tasks internal to the kernel, such as handling interrupts and flushing data to disk. The bulk of system calls are processed by the kernel within the context of the process that called them.
Kernel threads are scheduled more or less the same way as user processes. Some kernel threads have higher than default priority (up to realtime priority in some cases), but saying that they are "executed faster" is misleading.
Is it true that on a single processor box, when user thread is running, kernel will be suspended?
Of course. Only one process can be running at a time on a single CPU core.
That being said, there are a number of situations where the kernel can interrupt a running task and switch to another one (which may be a kernel thread):
When the timer interrupt fires. By default, this occurs 100 times every second.
When the task makes a blocking system call (such as select() or read()).
When a CPU exception occurs in the task (e.g, a memory access fault).

Mapping User-level threads and Kernel-level threads

How are User-level threads mapped to Kernel-level threads?
It varies by implementation. The three most common threading models are:
1-to-1: Each user-level thread has a corresponding entity that is scheduled by the kernel.
n-to-1: Each process is scheduled by the kernel. Thread scheduling takes place entirely in user space.
n-to-m: Each process has a pool of entities that are scheduled by the kernel. These are assigned to run particular user-level threads by a user-space scheduler that is part of the process.
Modern implementations are almost all 1-to-1.
There's a bit of confusion about the terminology used for referring to ULTs and KLTs.
Following are the two different interpretations. Please correct me if I got this wrong:
KLTs are needed to achieve concurrency in the kernel (Note the interpretation of Kernel as a Process or a live entity). This is true about Micro kernels like Symbian, where a kernel thread is responsible for every hardware resource of the system (e.g File Server, Location Server, Calendar Server, etc). However, in a kernel like Linux, which is mostly a library (and not a process or a living entity on its own), there's really no meaning for Kernel threads. In Linux, every thread you create is treated by the Kernel as a process and Kernel always runs either in the Process context or the Interrupt context.
Second interpretation is based on whether Threading (or concurrency) is visible to the Kernel or not. For instance, using setjmp, longjmp one can achieve concurrency at user space. Like already discussed, Kernel is totally unaware of this. This concurrency may be termed as ULT. And the thread whose creation the Kernel is aware of (one using Clone() system call) may be called KLT.

Resources