What is a context switch? - multithreading

I was reading about the DebuggerStepperBoundary attribute, and a site says it is useful in a context switch.
What exactly is a context switch? I'm assuming it's a switch from one thread to another, or perhaps a switch of execution or security context? These are not particularly educated guesses, though, so I'm asking here.

A context switch (also sometimes referred to as a process switch or a task switch) is the switching of the CPU (central processing unit) from one process or thread to another.
Context switching can be described in slightly more detail as the kernel (i.e., the core of the operating system) performing the following activities with regard to processes (including threads) on the CPU:
1. suspending the progression of one process and storing the CPU's state (i.e., the context) for that process somewhere in memory,
2. retrieving the context of the next process from memory and restoring it in the CPU's registers, and
3. returning to the location indicated by the program counter (i.e., returning to the line of code at which the process was interrupted) in order to resume the process.
A context switch is sometimes described as the kernel suspending execution of one process on the CPU and resuming execution of some other process that had previously been suspended. Although this wording can help clarify the concept, it can be confusing in itself because a process is, by definition, an executing instance of a program. Thus the wording "suspending progression of a process" might be preferable.
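A rough way to see the suspend/store/resume cycle in ordinary code is with Python generators, where the generator frame plays the role of the stored context. This is only an analogy (a real context switch saves CPU registers and the program counter, not Python frames), and the task names and dispatcher loop are invented for illustration:

```python
def task(name):
    # Each `yield` suspends progression and stores this task's
    # "context" (locals and resume point) in the generator frame.
    for step in range(3):
        yield f"{name} step {step}"  # suspended here; resumed later

# A trivial "dispatcher": alternate between two saved contexts.
a, b = task("A"), task("B")
trace = []
for _ in range(3):
    trace.append(next(a))  # restore A's context, run to the next yield
    trace.append(next(b))  # restore B's context, run to the next yield

print(trace)
```

Each `next()` call corresponds to step 2 and 3 above (restore the context, jump back to where the task was interrupted), and each `yield` corresponds to step 1 (suspend and store).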

Context switch is the switching of the CPU from one process/thread to another process/thread.
People sometimes use the term context switch outside of the specific computer world to reflect what they are doing in their own lives. "If I am going to answer that question, I need to context switch from thinking about A to thinking about B".
Source: Wikipedia

It usually refers to switching between threads or processes. Wikipedia has a more thorough description.

It is the process of switching between processes on a CPU.

Switching the CPU core to another process requires performing a state save of the current process and a state restore of a different process. This task is known as a context switch.
Excerpted from: Operating System Concepts, Silberschatz, Galvin, and Gagne (the book with the dinosaurs on the cover)
But it doesn't necessarily help me understand the concept. I prefer to see animations or images when studying, so the RTOS link is genuinely helpful.

Related

Context switch between kernel threads vs user threads

Copied from this link:
Thread switching does not require Kernel mode privileges.
User level threads are fast to create and manage.
Kernel threads are generally slower to create and manage than the user threads.
Transfer of control from one thread to another within the same process requires a mode switch to the Kernel.
I never came across these points while reading standard operating systems reference books. Though these points sound logical, I wanted to know how they play out in Linux. To be precise:
Can someone give the detailed steps involved in context switching between user threads and kernel threads, so that I can see the step-by-step difference between the two?
Can someone explain the difference with an actual context switch example or code? Maybe the system calls involved (in the case of context switching between kernel threads) and the thread library calls involved (in the case of context switching between user threads).
Can someone link me to the Linux source code (say, on GitHub) handling context switches?
I also wonder why a context switch between kernel threads requires changing to kernel mode. Aren't we already in kernel mode for the first thread?
Can someone give the detailed steps involved in context switching between user threads and kernel threads, so that I can see the step-by-step difference between the two?
Let's imagine a thread needs to read data from a file, but the file isn't cached in memory and disk drives are slow so the thread has to wait; and for simplicity let's also assume that the kernel is monolithic.
For kernel threading:
The thread calls a "read()" function in a library or something, which must cause at least a switch to kernel code (because it's going to involve device drivers).
The kernel adds the IO request to the disk driver's "queue of possibly many pending requests"; realizes the thread will need to wait until the request completes, sets the thread to "blocked waiting for IO", and switches to a different thread (which may belong to a completely different process, depending on global thread priorities). The kernel returns to the user-space of whatever thread it switched to.
Later, the disk hardware causes an IRQ, which causes a switch back to the IRQ handler in kernel code. The disk driver finishes up the work it had to do for the (currently blocked) thread and unblocks that thread. At this point the kernel might decide to switch to the "now unblocked" thread; and the kernel returns to the user-space of the "now unblocked" thread.
For user threading:
The thread calls a "read()" function in a library or something, which must cause at least a switch to kernel code (because it's going to involve device drivers).
The kernel adds the IO request to the disk driver's "queue of possibly many pending requests"; realizes the thread will need to wait until the request completes, but can't take care of that because some fool decided to make everything worse by doing thread switching in user space; so the kernel returns to user-space with an "IO request has been queued" status.
After the pointless extra overhead of switching back to user-space, the user-space scheduler does the thread switch that the kernel could have done. At this point the user-space scheduler will either tell the kernel it has nothing to do (and you'll have more pointless extra overhead switching back to the kernel), or it will do a thread switch to another thread in the same process (which may be the wrong thread, because a thread in a different process is higher priority).
Later, the disk hardware causes an IRQ, which causes a switch back to the IRQ handler in kernel code. The disk driver finishes up the work it had to do for the (currently blocked) thread; but the kernel isn't able to do the thread switch to unblock the thread, because some fool decided to make everything worse by doing thread switching in user space. Now we've got a problem: how does the kernel inform the user-space scheduler that the IO has finished? To solve this (without any "user-space scheduler running zero threads constantly polls the kernel" insanity) you have to have some kind of "kernel puts a notification of IO completion on some kind of queue and (if the process was idle) wakes the process up", which (on its own) is more expensive than just doing the thread switch in the kernel. Of course, if the process wasn't idle, then code in user-space is going to have to poll its notification queue to find out if/when the "notification of IO completion" arrives, and that's going to increase latency and overhead. In any case, after lots of stupid, pointless, and avoidable overhead, the user-space scheduler can finally do the thread switch.
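The "completion notification on a queue" scheme described above can be simulated in plain Python. Everything here (the queues, the generator "threads", the io_id, the polling loop) is invented for illustration; the kernel/user-space boundary is only mimicked, not real:

```python
from collections import deque

# Simulated user-space scheduler state.
ready = deque()        # runnable user threads (generators)
blocked = {}           # io_id -> blocked thread
completions = deque()  # the "kernel" posts IO completion notices here
log = []

def reader():
    # "read()" returned immediately with "request queued" status, so
    # the user-space scheduler must block this thread itself.
    yield ("wait_io", 42)          # block until IO request 42 completes
    log.append("reader: got data")

def worker():
    log.append("worker: running while reader waits")
    completions.append(42)         # simulate the disk IRQ posting completion
    yield None

ready.extend([reader(), worker()])
while ready or blocked:
    # Poll the notification queue (the extra overhead the answer complains about).
    while completions:
        io_id = completions.popleft()
        if io_id in blocked:
            ready.append(blocked.pop(io_id))  # user-space "unblock"
    thread = ready.popleft()
    try:
        request = next(thread)
        if request and request[0] == "wait_io":
            blocked[request[1]] = thread      # user-space "blocked" state
        else:
            ready.append(thread)
    except StopIteration:
        pass

print(log)
```

Note how the reader can only make progress after the scheduler loop polls `completions`; in the kernel-threading walkthrough, the IRQ handler would have unblocked it directly.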
Can someone explain the difference with an actual context switch example or code? Maybe the system calls involved (in the case of context switching between kernel threads) and the thread library calls involved (in the case of context switching between user threads).
The actual low-level context switch code typically begins with something like:
push whichever registers are callee-saved ("preserved") according to the calling convention onto the stack
save the current stack top in some kind of "thread info structure" belonging to the old thread
load a new stack top from some kind of "thread info structure" belonging to the new thread
pop the callee-saved registers again
return
However:
usually (for modern CPUs) there's a relatively large amount of "SIMD register state" (e.g. for 80x86 with support for AVX-512, I think it's over 4 KiB of stuff). CPU manufacturers often have mechanisms to avoid saving parts of that state if it wasn't changed, and to (optionally) postpone the loading of (pieces of) that state until it's actually used (and avoid it completely if it's not actually used). All of that requires the kernel.
if it's a full task switch and not just a thread switch, you might need some kind of "if virtual address space needs to change { change virtual address space }" on top of that
normally you want to keep track of statistics, like how much CPU time a thread has used. This requires some kind of "thread_info.time_used += now() - time_at_last_thread_switch;", which gets difficult/ugly when "process switching" is separated from "thread switching".
normally there's other state (e.g. pointer to thread local storage, special registers for performance monitoring and/or debugging, ...) that may need to be saved/loaded during thread switches. Often this state is not directly accessible in user code.
normally you also want to set a timer to expire when the thread has used too much time; either because you're doing some kind of "time multiplexing" (e.g. a round-robin scheduler), or because it's a cooperative scheduler where you need some kind of "terminate this task after 5 seconds of not responding, in case it's stuck in an infinite loop" safeguard.
this is just the low level task/thread switching in isolation. There is almost always higher level code to select a task to switch to, handle "thread used too much CPU time", etc.
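The statistics bookkeeping mentioned above (the "thread_info.time_used" accounting) might look something like this toy cooperative dispatcher. time.perf_counter() and the spin() workload are illustrative stand-ins, not what a real kernel measures with:

```python
import time

class ThreadInfo:
    def __init__(self, gen):
        self.gen = gen        # the "thread" (a generator here)
        self.time_used = 0.0  # accumulated CPU time, as in the text

def spin(n):
    # Burn a little CPU between voluntary switch points.
    for _ in range(n):
        sum(range(1000))
        yield

infos = [ThreadInfo(spin(5)), ThreadInfo(spin(10))]
run = list(infos)
while run:
    for info in list(run):
        start = time.perf_counter()     # time_at_last_thread_switch
        try:
            next(info.gen)              # run until the next switch point
        except StopIteration:
            run.remove(info)            # thread finished
        info.time_used += time.perf_counter() - start

for info in infos:
    print(f"time_used: {info.time_used:.6f}s")
```

The charging happens at every switch point, which is exactly why separating "process switching" from "thread switching" makes this bookkeeping awkward: the accounting has to run on every switch of either kind.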
Can someone link me to the Linux source code (say, on GitHub) handling context switches?
Someone probably can't. It's not one line; it's many lines of assembly for each different architecture, plus extra higher-level code (for timers, support routines, the "select a task to switch to" code, for exception handlers to support "lazy SIMD state load", ...); which probably all adds up to something like 10 thousand lines of code spread across 50 files.
I also wonder why a context switch between kernel threads requires changing to kernel mode. Aren't we already in kernel mode for the first thread?
Yes; often you're already in kernel code when you find out that a thread switch is needed.
Rarely/sometimes (mostly only due to communication between threads belonging to the same process - e.g. 2 or more threads in the same process trying to acquire the same mutex/semaphore at the same time; or threads sending data to each other and waiting for data from each other to arrive) kernel isn't involved; and in some cases (which are almost always massive design failures - e.g. extreme lock contention problems, failure to use "worker thread pools" to limit the number of threads needed, etc) it's possible for this to be the dominant cause of thread switches, and therefore possible that doing thread switches in user space can be beneficial (e.g. as a work-around for the massive design failures).
Don't limit yourself to Linux or even UNIX; they are neither the first nor the last word on systems or programming models. The synchronous execution model dates back to the early days of computing, and it is not particularly well suited to larger-scale concurrent and reactive programming.
Golang, for example, employs a great many lightweight user threads -- goroutines -- and multiplexes them on a smaller set of heavyweight kernel threads to produce a more compelling concurrency paradigm. Some other programming systems take similar approaches.

Why threads implemented in kernel space are slow?

When a thread does something that may cause it to become blocked locally, for example, waiting for another thread in its process to complete some work, it calls a run-time system procedure. This procedure checks to see if the thread must be put into blocked state. If so, it stores the thread's registers in the thread table, looks in the table for a ready thread to run, and reloads the machine registers with the new thread's saved values. As soon as the stack pointer and program counter have been switched, the new thread comes to life again automatically. If the machine happens to have an instruction to store all the registers and another one to load them all, the entire thread switch can be done in just a handful of instructions. Doing thread switching like this is at least an order of magnitude (maybe more) faster than trapping to the kernel and is a strong argument in favor of user-level threads packages.
Source: Modern Operating Systems (Andrew S. Tanenbaum | Herbert Bos)
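The run-time procedure in the quote (check whether the thread must block, store its state in the thread table, look for a READY thread, reload it) can be sketched in Python, using a generator as a stand-in for the saved registers. The thread names, the producer/consumer pairing, and the runtime_* helpers are all invented for illustration:

```python
# Toy user-space run-time: a thread table plus the switch procedure
# from the quote. A generator frame stands in for saved registers.
READY, BLOCKED, DONE = "ready", "blocked", "done"
thread_table = {}  # name -> {"state": ..., "frame": generator}
log = []

def runtime_block(name):
    # Called when a thread must wait for another thread's work.
    thread_table[name]["state"] = BLOCKED

def runtime_unblock(name):
    thread_table[name]["state"] = READY

def producer():
    log.append("producer: working")
    runtime_unblock("consumer")   # work done, wake the waiter
    yield

def consumer():
    runtime_block("consumer")     # must wait for the producer
    yield                         # "registers stored in thread table"
    log.append("consumer: consuming result")

thread_table["consumer"] = {"state": READY, "frame": consumer()}
thread_table["producer"] = {"state": READY, "frame": producer()}

# The dispatcher: look in the table for a READY thread and reload it.
while any(t["state"] == READY for t in thread_table.values()):
    for name, t in thread_table.items():
        if t["state"] != READY:
            continue
        try:
            next(t["frame"])      # reload "registers", resume the thread
        except StopIteration:
            t["state"] = DONE
print(log)
```

No trap to the kernel happens anywhere in this loop, which is the whole point of the quoted argument; the answers below explain why that advantage mostly evaporates in practice.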
The above argument is made in favor of user-level threads. The user-level thread implementation is depicted as the kernel managing all the processes, where each individual process can have its own run-time (provided by a library package) that manages all the threads in that process.
Of course, merely calling a function in the run-time rather than trapping to the kernel might mean a few fewer instructions to execute, but why is the difference so huge?
For example, if threads are implemented in kernel space, every time a thread has to be created the program is required to make a system call. Yes. But the call only involves adding an entry to the thread table with certain attributes (which is also the case for user-space threads). When a thread switch has to happen, the kernel can simply do what the run-time (in user space) would do. The only real difference I can see here is that the kernel is involved in all of this. How can the performance difference be so significant?
Threads implemented as a library package in user space perform significantly better. Why?
They're not.
The fact is that most task switches are caused by threads blocking (having to wait for IO from disk or network, or for user input, or for time to pass, or for some kind of semaphore/mutex shared with a different process, or some kind of pipe/message/packet from a different process), or caused by threads unblocking (because whatever they were waiting for happened); and most reasons to block and unblock involve the kernel in some way (e.g. device drivers, the networking stack, ...). So doing task switches in the kernel when you're already in the kernel is faster (because it avoids the overhead of switching to user-space and back for no sane reason).
Where user-space task switching "works" is when kernel isn't involved at all. This mostly only happens when someone failed to do threads properly (e.g. they've got thousands of threads and coarse-grained locking and are constantly switching between threads due to lock contention, instead of something sensible like a "worker thread pool"). It also only works when all threads are the same priority - you don't want a situation where very important threads belonging to one process don't get CPU time because very unimportant threads belonging to a different process are hogging the CPU (but that's exactly what happens with user-space threading because one process has no idea about threads belonging to a different process).
Mostly; user-space threading is a silly broken mess. It's not faster or "significantly better"; it's worse.
When a thread does something that may cause it to become blocked locally, for example, waiting for another thread in its process to complete some work, it calls a run-time system procedure. This procedure checks to see if the thread must be put into blocked state. If so, it stores the thread's registers in the thread table, looks in the table for a ready thread to run, and reloads the machine registers with the new thread's saved values. As soon as the stack pointer and program counter have been switched, the new thread comes to life again automatically. If the machine happens to have an instruction to store all the registers and another one to load them all, the entire thread switch can be done in just a handful of instructions. Doing thread switching like this is at least an order of magnitude (maybe more) faster than trapping to the kernel and is a strong argument in favor of user-level threads packages.
This is talking about a situation where the CPU itself does the actual task switch (and either the kernel or a user-space library tells the CPU when to do a task switch to what). This has some relatively interesting history behind it...
In the 1980s Intel designed a CPU ("iAPX" - see https://en.wikipedia.org/wiki/Intel_iAPX_432 ) for "secure object oriented programming"; where each object has its own isolated memory segments and its own privilege level, and can transfer control directly to other objects. The general idea being that you'd have a single-tasking system consisting of global objects using cooperating flow control. This failed for multiple reasons, partly because all the protection checks ruined performance, and partly because the majority of software at the time was designed for "multi-process preemptive time sharing, with procedural programming".
When Intel designed protected mode (80286, 80386) they still had hopes for a "single-tasking system consisting of global objects using cooperating flow control". They included hardware task/object switching, the local descriptor table (so each task/object can have its own isolated segments), call gates (so tasks/objects can transfer control to each other directly), and modified a few control flow instructions (call far and jmp far) to support the new control flow. Of course this failed for the same reason iAPX failed; and (as far as I know) nobody has ever used these things for the "global objects using cooperative flow control" they were originally designed for. Some people (e.g. very early Linux) did try to use the hardware task switching for more traditional "multi-process preemptive time sharing, with procedural programming" systems; but found that it was slow because the hardware task switch did too many protection checks that could be avoided by software task switching and saved/reloaded too much state that could be avoided by software task switching, and didn't do any of the other stuff needed for a task switch (e.g. keeping statistics of CPU time used, saving/restoring debug registers, etc.).
Now, Andrew S. Tanenbaum is a micro-kernel advocate. His ideal system consists of isolated pieces in user-space (processes, services, drivers, ...) communicating via synchronous messaging. In practice (ignoring superficial differences in terminology) this "isolated pieces in user-space communicating via synchronous messaging" is almost entirely identical to Intel's twice-failed "global objects using cooperative flow control".
Mostly, in theory (if you ignore all the practical problems, like the CPU not saving all of the state, and wanting to do extra work on task switches like tracking statistics), for the specific type of OS that Andrew S. Tanenbaum prefers (micro-kernel with synchronous message passing, without any thread priorities), it's plausible that the paragraph quoted above is more than just wishful thinking.
I think answering this takes a fair amount of OS and parallel/distributed computing knowledge (and I am not sure about the answer, but I will try my best).
If you think about it, the library package will perform better than an implementation in the kernel itself. With the library, the switch runs to completion at once, whereas in the kernel other interrupts can arrive in the middle. Also, touching the threads again and again is hard on the kernel, since each access means a trap into the kernel. I hope this gives a better view.
It's not correct to say that user-space threads are better than kernel-space threads, since each has its own pros and cons.
In the case of user-space threads, the application is responsible for managing the threads, so they are easier to implement and don't rely much on the OS. However, you are not able to take advantage of multiprocessing.
By contrast, kernel-space threads are handled by the OS, so you need to implement them according to the OS that you use, which is a more complicated task. However, you have more control over your threads.
For a more comprehensive tutorial, take a look here.

Difference between context switch of thread and context switch of process

I know there is an explanation for this question here, but I am a bit confused about some points:
Say I have threads T(1-a) and T(1-b) belonging to process P1, and threads T(2-a) and T(2-b) belonging to process P2.
Now my questions are:
Thread T(1-a) wants to context switch to thread T(1-b). According to this answer:
Both types (process context switch and thread context switch) involve handing control over to the operating system kernel to perform the context switch (I am mainly talking about the thread context switch).
Doubt
If T(1-a) and T(1-b) are user-level threads, the kernel won't be able to distinguish T(1-a) from T(1-b), so how will the context switch be done?
Now let all the threads T(1-a), T(1-b), T(2-a), and T(2-b) be kernel-level threads, and suppose thread T(1-a) wants to context switch to T(2-b).
Doubt
Won't the cost/delay be the same as that of a process context switch, since not only does the virtual memory space change but the TLB also gets flushed?
The way to think about this is that User Threads and Kernel Threads are two completely different animals.
There is no context switch in user threads under the term's normal and customary meaning.
Both types (process context switch and thread context switch) involve handing control over to the operating system kernel to perform the context switch (I am mainly talking about the thread context switch).
I see your confusion. First of all, what is being described here only applies to Kernel Threads. Second, (and this problem is clearly the result of wording and not the overall message), in Kernel Threads, there are no "process context switches," there are only "Thread Context Switches".
In ye olde days when there were no threads, a change in scheduling meant a change in process context. The hardware instructions to do this have names like Load Process Context and Save Process Context. But in operating systems that schedule threads (kernel threads), we are left with this old terminology that is no longer very precise. (This is a problem that occurs in many places when explaining operating systems.)
If T(1-a) and T(1-b) are user-level threads, the kernel won't be able to distinguish T(1-a) from T(1-b), so how will the context switch be done?
If they are user level threads, the kernel does not know or care about them. The switch among them is handled by a user library. The switching of "threads" is not a context switch.
Won't the cost/delay be the same as that of a process context switch, since not only does the virtual memory space change but the TLB also gets flushed?
Hopefully, the operating system is smart enough to know not to flush memory caches when it switches among threads in the same process.
I am convinced that academics need to abandon the kernel/user thread (and even worse, 1-to-1, many-to-1, and many-to-many [yuk]) constructs. In their place I suggest using the terms "real threads" and "simulated threads".

How does pre-emption work?

As per my understanding, schedulers do the following:
Calculate the time slice for the task (this could be algorithm dependent).
Switch tasks. An ideal scheduler likes to do this in O(1); a good scheduling algorithm provides O(log N) complexity. The criteria for picking the new task are again scheduling-algorithm dependent.
My question is about pre-emption. For example, a new task is created and it needs to run right away (and it satisfies the condition, e.g. it has a higher priority than the currently running task).
How will the scheduler know that a new task with higher priority is available and needs to run? We need to have some controlling code in the kernel implementation that detects such a task's arrival and invokes the scheduler to save the state of the currently running task and schedule the new task. I would like to know more detail about this software entity.
Additionally, I would expect this code itself to be scheduled to run on the CPU in order to invoke the scheduler and make it switch tasks.
Please advise how this is implemented, or maybe I have some gaps in my understanding.
Thanks in advance
The best way to understand this is to read a book like "The design of the X Operating System" where X is one of {Unix, Linux, BSD...}. You should find a chapter on Context Switches and a chapter on the Scheduler. You could also look at https://en.wikipedia.org/wiki/Context_switch and https://en.wikipedia.org/wiki/Scheduling_%28computing%29#Linux, but the book is probably better.
Basically, when user code does a system call (such as to create a new process, or to release a semaphore, or ...) or when you get a clock interrupt, or when you get some sort of other interrupt, the running state of the user process is always dumped out to memory so that kernel code can be run without messing up the user process. Once you have done this, the user process that was running isn't much different from any other runnable user process.
As part of the work required to service the system call, or interrupt, or whatever, the system can notice that there is a new runnable process or that some other process that was not runnable before is now runnable, and ask the scheduler to update its notion of the highest priority runnable process. It might also notice that a scheduling quantum has just expired, and ask the scheduler to run a complete reschedule.
Once the kernel code has done its stuff it will probably see that the scheduler has marked the highest priority runnable process, and the kernel code will read that process's state out of memory and return to it without worrying very much about whether it is the process that was running before the system call or whatever or not.
Exception: once upon a time machines worried about the cost of dumping and restoring floating point registers, which kernel mode didn't really need, because it could be written so that it never did floating point. In this case, the save/restore code might be written so that it didn't save the floating point registers unless it had to, and the kernel might check as part of the restore to see if it was switching to a new process, and needed to dump out and restore the floating point registers. For all I know, stuff might still do this, or there might be some more modern state that is only saved and restored when the process really is changing. But this is really just a detail in either case.
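The "notice a new runnable process and return to the highest-priority one" step can be sketched with a priority queue. The task names, the priorities, and the on_kernel_entry() helper are all invented for illustration:

```python
import heapq

# Run queue keyed on negative priority so heappop yields the
# highest-priority runnable task.
run_queue = []

def make_runnable(priority, name):
    heapq.heappush(run_queue, (-priority, name))

def on_kernel_entry(current):
    # On any kernel entry (system call, clock tick, interrupt) the
    # running task's state has been "dumped to memory", so it's just
    # another runnable task competing on priority.
    make_runnable(*current)
    prio, name = heapq.heappop(run_queue)
    return (-prio, name)  # this task's state is restored on kernel exit

current = (5, "editor")
make_runnable(1, "backup")          # a low-priority task appears
current = on_kernel_entry(current)  # the editor still wins the CPU
make_runnable(9, "daemon")          # a high-priority task is woken
current = on_kernel_entry(current)  # the daemon preempts the editor
print(current)
```

Note that the kernel never "decides to preempt" in a separate step: simply re-entering the kernel and re-picking the best runnable task on the way out is what preemption is.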

tracing context switches of a process on linux

I need to monitor the context switches of a process and find out the reasons for the context switches, such as the specific kernel daemon causing the switch. I have seen related posts, but I didn't find the answers satisfactory. I tried pidstat, but it only shows the number of context switches. I would like to do this without recompiling the kernel for some profiling tool unless necessary. Please help.
I don't think that it really makes sense; context switches happen inside the kernel, not "inside the process". They affect some process, and most of them are not related to a kernel task. They happen "nearly inside" the scheduler. Most context switches are related to jiffies: running tasks are rescheduled after a small quantum of time (e.g. 20 milliseconds).
And the information about each traced context switch would have to go somewhere, i.e. into some process, which will also get context switched, ...
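That said, without recompiling anything you can at least read the per-process counters: getrusage() distinguishes voluntary switches (the process blocked, e.g. waiting for IO) from involuntary ones (it was preempted). A minimal check in Python (Unix-only; the rusage field names are the standard ones):

```python
import resource

usage = resource.getrusage(resource.RUSAGE_SELF)
# ru_nvcsw:  voluntary switches (the task blocked, e.g. waiting for IO)
# ru_nivcsw: involuntary switches (the task was preempted)
print("voluntary:", usage.ru_nvcsw, "involuntary:", usage.ru_nivcsw)
```

On Linux the same counts appear in /proc/&lt;pid&gt;/status as voluntary_ctxt_switches and nonvoluntary_ctxt_switches, and tools like perf sched can record individual switch events, which is closer to the "reasons" the question asks about.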

Resources