Could someone help clarify my understanding of kernel threads. I heard that, on Linux/Unix, kernel threads(such as those of system calls) get executed faster than user threads. But, aren't those user threads scheduled by kernel and executed using kernel threads? could someone please tell me what is the difference between a kernel thread and a user thread other than the fact that they have access to different address spaces. what are other difference between them? Is it true that on a single processor box, when user thread is running, kernel will be suspended?
Thanks in advance,
Alex
I heard that, on Linux/Unix, kernel threads(such as those of system calls) get executed faster than user threads.
This is a largely inaccurate statement.
Kernel threads are used for "background" tasks internal to the kernel, such as handling interrupts and flushing data to disk. The bulk of system calls are processed by the kernel within the context of the process that called them.
Kernel threads are scheduled more or less the same way as user processes. Some kernel threads have higher than default priority (up to realtime priority in some cases), but saying that they are "executed faster" is misleading.
Is it true that on a single processor box, when user thread is running, kernel will be suspended?
Of course. Only one process can be running at a time on a single CPU core.
That being said, there are a number of situations where the kernel can interrupt a running task and switch to another one (which may be a kernel thread):
When the timer interrupt fires. By default, this occurs 100 times every second.
When the task makes a blocking system call (such as select() or read()).
When a CPU exception occurs in the task (e.g, a memory access fault).
Related
For what i learned from Operating System Concepts and online searching:
all user threads are finally mapped to kernel threads for being scheduled to physical CPUs
kernel threads can only be executed in kernel mode
above two arguments leads to the conclusion:
user code are all executed in kernel mode
is this right?
i have read the whole book and searched for many articles, the question still holds.
at Wikipedia, it says about LWP:
Kernel threads
Kernel threads are handled entirely by the kernel. They need not be associated with a process; a kernel can create them whenever it needs to perform a particular task. Kernel threads cannot execute in user mode. LWPs (in systems where they are a separate layer) bind to kernel threads and provide a user-level context. This includes a link to the shared resources of the process to which the LWP belongs. When a LWP is suspended, it needs to store its user-level registers until it resumes, and the underlying kernel thread must also store its own kernel-level registers.
also what does it means when saying about user-level registers and kernel level registers?
after digging and digging, i have following temp conclusion, but i am not sure. Hope the question further be answered and clearifed:
kernel thread, depending on discussion context, has two meanings:
when talking about user/kernel threading, kernel thread means a kernel task that totally execute in kernel mode and only execute kernel codes, like ksoftirqd for handling bottom half of interrupts
when taking about threading model, namely how user code is mapped into schedulable entities in kernel, kernel thread means a task that is schedulable by kernel
further about threading model and light weight processes in Linux:
in old times the operating system does not know thread, it only know processes(tasks) and threads are implmented by thread libraries totally in user side. There is a inherent problem for this that is if one user thread is blocked, such as I/O, all the user threads are blocked, because there is only one schedulable tasks in the kernel for this process. From the perspective of the kernel, the whole process is blocked. To solve this problem, light weight process(LWP), also called virtual processor(VP) is invented.
LWP is a intermedia data structure between user thread and a kernel thread(the second meaning above). LWP binds a user thread with a kernel thread(task), which in before is bounded with a user process. Simply put: in before a user process occupies a kernel thread(task), now with LWP a user thread can occupy a individual kernel thread(task), without sharing it with other user threads. (I think) This is why it is called light weight process. The advantage of this model is obvious, if one of the user thread is blocked, other user threads has ways to continue being executed by other kernel threads(tasks).
A kernel thread(task) acutually knows nothing about user process. It is just a task, a schedulable entity created, managed, destroyed totally by kernel itself. But a LWP belongs to a specific process and knows other LWPs that also belongs to the same one. LWP is like a bridge between user process and kernel thread(task).
When a kernel thread(task) that is bound to a LWP is scheduled by the kernel, the user level registers(pointed by LWP) is loaded into CPU, also the kernel thread(task) has registers and they are also loaded into CPU. From the standing point of CPU, a LWP is a kernel thread(task). It does not care it executes kernel code or user code.
user/kernel mode, user/kernel thread: they are independent. In Linux, a user thread created by pthread essentially is a kernel thread and this thread can execute in both user mode or kernel mode, depending on whether the thread is executing user code or kernel code.
All user threads are finally mapped to kernel threads.
That is not a useful way to think about threads. In most operating systems, a program can ask the OS to create a new thread and the program can provide a pointer to a function for the new thread to call. There's no "mapping" that happens there.* The new thread runs in exactly the same way as the program's original (a.k.a., "main") thread. It runs application code in user mode except, occasionally, when it makes a system call, and then for the duration of the system call it runs kernel code in kernel mode.
Many programming languages come with an OS-independent library that provides some kind of a Thread object. The thread object is not the same thing as the actual thread. It's more of a handle that the application uses to control the OS thread. If you like, you can say that those thread objects are "mapped" to OS threads, but that's still somewhat abusing the notion of what a "mapping" is.
kernel threads can only be executed in kernel mode
If you aren't writing OS code, it's best to avoid saying "kernel thread" altogether. In the Linux OS in particular, "kernel thread" means something, and it has nothing whatever to do with application code. Linux kernel threads are threads that are created by the OS for the OS, and they never run "user" (i.e., application) code.
It's possible for an application program to create and schedule its own threads, completely unknown to the OS. Some people call those "user threads." Some used to call them "green threads." Back in the old days, before any OS had thread support, we just called them "threads." Doing threads that way is a lot of work, for little reward. (Can't schedule them preemptively.) Outside of the realm of tiny, embedded, real-time systems, almost nobody bothers to do it anymore.
* But wait! Things will get more complicated in the near future when Java's Project Loom hits the main stream. Threads traditionally are expensive. In particular, each thread must have its own contiguous call stack—usually a chunk of at least a few megabytes—allocated to it. The goal of project loom is to make threads as cheap as any other object.
They way they intend to make threads "cheap" is to "virtualize" them, and to break up their call stacks into linked lists of reclaimable heap objects. Under project loom, a limited number of real OS threads that are scheduled by the OS scheduler will, in turn, schedule and execute the code of a multitude of "virtual" application threads, and so there really will be something going on that feels a bit like "mapping."
I won't be at all surprised if the same idea spreads to other languages.
There are two different meanings of kernel threads. When threading people talk about "kernel threads" they mean "threads the kernel knows about" i.e. "threads that are controlled by the kernel". When kernel people talk about "kernel threads" they mean "threads that run in kernel mode".
"Threads the kernel knows about" are contrasted to "user threads" which are hidden from the kernel and controlled by the program itself.
No, not all threads controlled by the kernel run in kernel mode. The kernel controls the scheduling of threads that run in kernel mode, and also threads that run in user mode.
The quote about LWPs is talking about systems where the scheduler thinks that all threads are kernel-mode threads. To run a user-mode thread (which they call an LWP because it's not really a thread because all threads are kernel-mode threads) the thread has to call a function like RunLWP(pointer_to_lwp);.
I don't know which system is like this. Linux is not like this; Windows is not like this. This is a weird, overly complicated design which is why it's not normally used.
The "registers" are where the CPU remembers what it is currently doing. The most important one is the "instruction pointer" register (some CPUs call it something different) which remembers which instruction is next. If you remember all of the register values, and then come back later and set them to the same values, the CPU will carry on like nothing happened. That's why threading works - the thread can't tell that it's been interrupted, because all of the registers have the same values as if it wasn't interrupted. Here's a list of registers on x86-class CPUs. You don't need to know them for this question - it just might be interesting.
When an interrupt happens, depending on the CPU type, the CPU will save the instruction pointer and maybe one or two other registers. The interrupt handler has to save the rest (or be careful not to change them). Here about halfway down you can see how an x86-class CPU switches from user-space to an interrupt handler when an interrupt occurs.
So this RunLWP function would save the current registers (from the kernel) and set them according to the last time the LWP stopped running. Then the LWP runs. Then when some interrupt happens, the interrupt handler would save the current registers (from user-space) and set them according to the saved kernel handlers, so the kernel code after RunLWP runs. Probably. Again, I don't know any actual system like this, but it seems like the logical way to do things. The reason it should return back to the kernel code instead of the user code is so that the kernel code can decide whether it wants to keep running the LWP or not.
I don't know why they would say the interrupt handler would save both the kernel-space and user-space registers. Current CPUs generally only have one set of registers which software has to swap out when it wants to make the CPU change what it is doing. RunLWP would have to save the kernel registers and load the user ones, then the interrupt handler would have to save the user registers and load the interrupt handler ones. It could be that the CPUs which these systems were designed for did have two sets of registers.
I looked up the CPU scheduler source code built into the kernel.
https://github.com/torvalds/linux/tree/master/kernel/sched
But I have a question.
There are mixed opinions on the cpu scheduler on the Internet.
I saw an opinion that CPU scheduler is a process.
Question: If so, when ps-ef on Linux, the scheduler process should be visible. It was difficult to find the PID and name of the scheduler process.
The PID for the CPU scheduler process is not on the internet either. However, the PID 0 SWAPPER process is called SCHED, but in Linux, PID 0 functions as an idle process.
I saw an opinion that CPU scheduler is not a process.
CPU scheduler is a passive source code built into the kernel, and user processes frequently enter the kernel and rotate the source code.
Question: How does the user process execute the kernel's scheduler source code on its own?
What if you created a user program without adding a system call using the scheduler of the kernel?
How does the user process self-rotate the scheduler in the kernel without such code?
You have 2 similar questions (The opinion that the scheduler built into the kernel is the program and the opinion that it is the process and I want to know how to implement the cpu scheduling process in Linux operating system) so I'll answer for both of these here.
The answer is that it doesn't work that way at all. The scheduler is not called by user mode processes by using system calls. The scheduler isn't a system call. There are timers that are programmed to throw interrupts after some time has elapsed. Timers are accessed using registers that are memory in RAM often called memory mapped IO (MMIO). You write to some position in RAM specified by the ACPI tables (https://wiki.osdev.org/ACPI) and it will allow to control the chips in the CPU or external PCI devices (PCI is everything nowadays).
When the timer reaches 0, it will trigger an interrupt. Interrupts are thrown by hardware (the CPU). The CPU thus includes special mechanism to let the OS determine the position at which it will jump on interrupt (https://wiki.osdev.org/Interrupt_Descriptor_Table). Interrupts are used by the CPU to notify the OS that an event happened. Without interrupts, the OS would have to reserve at least one core of the processor for a special kernel process that would constantly poll the registers of peripherals and other things. It would be impossible to implement. Also, if user mode processes did the scheduler system call by themselves, the kernel would be slave to user mode because it wouldn't be able to tell if a process is finished and processes could be selfish over CPU time.
I didn't look at the source code but I think the scheduler is also often called on some IO completion (also on interrupt but not always on timer interrupt). I am quite sure that the scheduler must not be preempted. That is interrupts (and other things) will be disabled while the schedule() function runs.
I don't think you can call the scheduler a process (not even a kernel thread). The scheduler can be called by kernel threads that are created by interrupts due to bottom half processing. In bottom half processing, the top "half" of the interrupt handler runs fast and efficiently while the bottom "half" is added to the queue of processes and runs when the scheduler decides it should be scheduled. This has the effect of creating some kernel threads. The scheduler can thus be called from kernel threads but not always from bottom half of interrupts. There has to be a mechanism to call the scheduler without the scheduler having to schedule the task itself. Otherwise, the kernel will stop functioning.
After the Kernel schedules a process that has threads, How does said process schedule its own threads during its time splice?
For most modern kernels, the kernel only schedules threads, and processes are mostly just a container for the threads to execute inside (e.g. a container that contains a virtual address space, however many threads, and a few other scraps like file handles).
For some kernels (mostly very old unix kernels that existed before threads were invented) the kernel schedules processes and then a user-space library emulates threads. For this to work properly all of the "blocking" system calls (e.g. write()) have to be replaced by asynchronous system calls (e.g. aio_write()) so that other threads in the process can be given CPU time; however I wouldn't want to assume it works properly (e.g. if any thread blocks, then maybe all threads in the process block).
Also it may not work when there's multiple CPUs (kernel gives a process one CPU, but then from the kernel's perspective that process is running and can't use a second CPU). There are sophisticated work-arounds for this (to support "M:N threading") but it's just easier and better to fix the scheduler so it works with threads. Fortunately/unfortunately this didn't matter much in the early days because very few computers had more than one CPU anyway.
Lastly; it doesn't work for thread priorities - e.g. one process might keep CPU busy executing an unimportant/low priority thread while another process doesn't get that CPU time when it desperately needs it for an important/high priority thread. This occurs because no process knows about threads belonging to other processes and the kernel only knows about processes and not threads.
Of course these are also the reasons why every kernel adopted "kernel schedules threads and not processes" (and those that didn't died).
It's down to jargon definitions, but threads are simply a bunch of processes sharing an address space. Older Unixes even called them Light Weight Processes.
With that classical understanding of threads, the answer is that, these days, it's the OS that does the scheduling and each thread gets its own timeslices.
Extras
Some OSes do things to "the whole process" - e.g. Windows will give the process that has mouse focus a priority boost (all it's threads get dynamically notched up a few priority places), to make that application appear to be more sprightly (this goes back to Windows 3).
Other operating systems will increase the priority of a thread dynamically, to solve priority inversion situations. This is where a low priority thread that has control of a resource (I/O, or perhaps a semaphore) is blocking a higher priority thread from running (because the resource is not available. This is the priority inversion, and it's solved by the OS boosting the priority of the blocking thread until it gives up the required resource.
Either the kernel schedules the threads or the kernel schedules processes simulates thread by scheduling it own threads.
Usually, the process schedules its own threads using a library that sets timers. When the timer handler saves the current "thread's" registers then loads a new set of registers from another "thread."
I am running an application,where certain user threads must not be preempted by kernel .I will explain my setup:
OS:
Linux 2.6.32 kernel
Kernel level:
1.There are many modules which are insmoded into kernel.
2.Work_queues are also initialized in some modules(I guess separate threads are created for work_queues)
3.If I am getting any hardware interrupt, I would queue this work in any of these initialized work_queue during my isr.
Application level:
there are multiple threads running in parallel,some of which are of higher priority than any other thread in process.(Even kernel)
Objective:
1.If I get any hardware interrupt,isr will be automatically called in which work will be queued for any work_queue.But,I do not want scheduling of these work_queues if higher priority user level threads are running during that time.ie,certain User level threads should not be preempted by any work_queue handling in kernel.Now, i have observed that kernel gets priority than any other user thread.
2.I have multiple work_queues in kernel.How can i give different priority for different work queues.I havent seen any api to set priority for work_queues in kernel.
User threads must always be pre-emptible by kernel-mode threads because kernel-mode threads need to respond to hardware-events. This is by design.
If your user-mode threads need to interact with the hardware or are real-time and hence must not be pre-empted, consider making them kernel-mode threads.
If you are merely encountering errors caused by your thread being descheduled during a critical operation and ending up with another thread trampling over your operation, then you should implement locking.
If you have a custom need for breaking this fundamental design of the linux-kernel, you will need to change the behaviour of the kernel-mode scheduler.
What is a reentrant kernel?
Much simpler answer:
Kernel Re-Entrance
If the kernel is not re-entrant, a process can only be suspended while it is in user mode. Although it could be suspended in kernel mode, that would still block kernel mode execution on all other processes. The reason for this is that all kernel threads share the same memory. If execution would jump between them arbitrarily, corruption might occur.
A re-entrant kernel enables processes (or, to be more precise, their corresponding kernel threads) to give away the CPU while in kernel mode. They do not hinder other processes from also entering kernel mode. A typical use case is IO wait. The process wants to read a file. It calls a kernel function for this. Inside the kernel function, the disk controller is asked for the data. Getting the data will take some time and the function is blocked during that time.
With a re-entrant kernel, the scheduler will assign the CPU to another process (kernel thread) until an interrupt from the disk controller indicates that the data is available and our thread can be resumed. This process can still access IO (which needs kernel functions), like user input. The system stays responsive and CPU time waste due to IO wait is reduced.
This is pretty much standard for today's desktop operating systems.
Kernel pre-emption
Kernel pre-emption does not help in the overall throughput of the system. Instead, it seeks for better responsiveness.
The idea here is that normally kernel functions are only interrupted by hardware causes: Either external interrupts, or IO wait cases, where it voluntarily gives away control to the scheduler. A pre-emptive kernel instead also interrupts and suspends kernel functions just like it would interrupt processes in user mode. The system is more responsive, as processes e.g. handling mouse input, are woken up even while heavy work is done inside the kernel.
Pre-emption on kernel level makes things harder for the kernel developer: The kernel function cannot be suspended only voluntarily or by interrupt handlers (which are somewhat a controlled environment), but also by any other process due to the scheduler. Care has to be taken to e.g. avoid deadlocks: A thread locks resource A but needing resource B is interrupted by another thread which locks resource B, but then needs resource A.
Take my explanation of pre-emption with a grain of salt. I'm happy for any corrections.
All Unix kernels are reentrant. This means that several processes may be executing in Kernel Mode at the same time. Of course, on uniprocessor systems, only one process can progress, but many can be blocked in Kernel Mode when waiting for the CPU or the completion of some I/O operation. For instance, after issuing a read to a disk on behalf of a process, the kernel lets the disk controller handle it and resumes executing other processes. An interrupt notifies the kernel when the device has satisfied the read, so the former process can resume the execution.
One way to provide reentrancy is to write functions so that they modify only local variables and do not alter global data structures. Such functions are called reentrant functions . But a reentrant kernel is not limited only to such reentrant functions (although that is how some real-time kernels are implemented). Instead, the kernel can include nonreentrant functions and use locking mechanisms to ensure that only one process can execute a nonreentrant function at a time.
If a hardware interrupt occurs, a reentrant kernel is able to suspend the current running process even if that process is in Kernel Mode. This capability is very important, because it improves the throughput of the device controllers that issue interrupts. Once a device has issued an interrupt, it waits until the CPU acknowledges it. If the kernel is able to answer quickly, the device controller will be able to perform other tasks while the CPU handles the interrupt.
Now let's look at kernel reentrancy and its impact on the organization of the kernel. A kernel control path denotes the sequence of instructions executed by the kernel to handle a system call, an exception, or an interrupt.
In the simplest case, the CPU executes a kernel control path sequentially from the first instruction to the last. When one of the following events occurs, however, the CPU interleaves the kernel control paths :
A process executing in User Mode invokes a system call, and the corresponding kernel control path verifies that the request cannot be satisfied immediately; it then invokes the scheduler to select a new process to run. As a result, a process switch occurs. The first kernel control path is left unfinished, and the CPU resumes the execution of some other kernel control path. In this case, the two control paths are executed on behalf of two different processes.
The CPU detects an exception-for example, access to a page not present in RAM-while running a kernel control path. The first control path is suspended, and the CPU starts the execution of a suitable procedure. In our example, this type of procedure can allocate a new page for the process and read its contents from disk. When the procedure terminates, the first control path can be resumed. In this case, the two control paths are executed on behalf of the same process.
A hardware interrupt occurs while the CPU is running a kernel control path with the interrupts enabled. The first kernel control path is left unfinished, and the CPU starts processing another kernel control path to handle the interrupt. The first kernel control path resumes when the interrupt handler terminates. In this case, the two kernel control paths run in the execution context of the same process, and the total system CPU time is accounted to it. However, the interrupt handler doesn't necessarily operate on behalf of the process.
An interrupt occurs while the CPU is running with kernel preemption enabled, and a higher priority process is runnable. In this case, the first kernel control path is left unfinished, and the CPU resumes executing another kernel control path on behalf of the higher priority process. This occurs only if the kernel has been compiled with kernel preemption support.
These information available on http://jno.glas.net/data/prog_books/lin_kern_2.6/0596005652/understandlk-CHP-1-SECT-6.html
More On http://linux.omnipotent.net/article.php?article_id=12496&page=-1
The kernel is the core part of an operating system that interfaces directly with the hardware and schedules processes to run.
Processes call kernel functions to perform tasks such as accessing hardware or starting new processes. For certain periods of time, therefore, a process will be executing kernel code. A kernel is called reentrant if more than one process can be executing kernel code at the same time. "At the same time" can mean either that two processes are actually executing kernel code concurrently (on a multiprocessor system) or that one process has been interrupted while it is executing kernel code (because it is waiting for hardware to respond, for instance) and that another process that has been scheduled to run has also called into the kernel.
A reentrant kernel provides better performance because there is no contention for the kernel. A kernel that is not reentrant needs to use a lock to make sure that no two processes are executing kernel code at the same time.
A reentrant function is one that can be used by more than one task concurrently without fear of data corruption. Conversely, a non-reentrant function is one that cannot be shared by more than one task unless mutual exclusion to the function is ensured either by using a semaphore or by disabling interrupts during critical sections of code. A reentrant function can be interrupted at any time and resumed at a later time without loss of data. Reentrant functions either use local variables or protect their data when global variables are used.
A reentrant function:
Does not hold static data over successive calls
Does not return a pointer to static data; all data is provided by the caller of the function
Uses local data or ensures protection of global data by making a local copy of it
Must not call any non-reentrant functions