Pthread Concepts - multithreading

I'm studying threads and I am not sure if I understand some concepts. What is the difference between preemption and yield? So far I know that preemption is a forced yield but I am not sure what it actually means.
Thanks for your help.

Preemption is when one thread stops another thread from running so that it may run.
To yield is when a thread voluntarily gives up processor time.

Have a gander at these...
http://en.wikipedia.org/wiki/Preemption_(computing)
http://en.wikipedia.org/wiki/Thread_(computing)

The difference is how the OS is entered.
'yield' is a software interrupt AKA system call, one of the many that may result in a change in the set of running threads, (there are lots of other system calls that can do this - blocking reads, synchronization calls). yield() is called from a running thread and may result in another ready, (but not running), thread of the same priority being run instead of the calling thread - if there is one.
The exact behaviour of yield() is somewhat hardware/OS/language-dependent. Unless you are developing low-level lock-free thread comms mechanisms, and you are very good at it, it's best to just forget about yield().
Preemption is the act of interrupting one thread and dispatching another in its place. It can only occur after a hardware interrupt. When hardware interrupts, its driver is entered. The driver may decide that it can usefully make a thread ready, (eg. a thread is blocked on a read() call to the driver and the driver has accumulated a nice, big buffer of data). The driver can do this by signaling a semaphore and exiting via. the OS, (which provides an entry point for just such a purpose). This driver exit path causes a reschedule and, probably, makes the read thread running instead of some other thread that was running before the interrupt - the other thread has been preempted. Essentially and simply, preemption occurs when the OS decides to interrupt-return to a different set of threads than the one that was interrupted.

Yield: The thread calls a function in the scheduler, which potentially "parks" that thread, and starts another one. The other thread is one which called yield earlier, and now appears to return from it. Many functions can have yielding semantics, such as reading from a device.
Preempt: an external event comes into the system: some kind of interrupt (clock, network data arriving, disk I/O completing ...). Whichever thread is running at that time is suspended, and the machine is running operating system code the interrupt context. When the interrupt is serviced, and it's time to return from the interrupt, a scheduling decision can be made to keep the interrupted thread parked, and instead resume another one. That is a preemption. If/when that original thread gets to run again, the context which was saved by the interrupt will be activated and it will pick up exactly where it left off.
Scheduling systems which rely on yield exclusively are called "cooperative" or "cooperative multitasking" as opposed to "preemptive".
Traditional (read: old, 1970's and 80's) Unix is cooperatively multitasked in the kernel, with a preemptive user space. The kernel routines are trusted to yield in a reasonable time, and so preemption is disabled when running kernel code. This greatly simplifies kernel coding and improves reliability, at the expense of performance, especially when multiple processors are introduced. Linux was like this for many years.

Related

Context switch between kernel threads vs user threads

Copy pasted from this link:
Thread switching does not require Kernel mode privileges.
User level threads are fast to create and manage.
Kernel threads are generally slower to create and manage than the user threads.
Transfer of control from one thread to another within the same process requires a mode switch to the Kernel.
I never came across these points while reading standard operating systems reference books. Though these points sound logical, I wanted to know how they reflect in Linux. To be precise :
Can someone give detailed steps involved in context switching between user threads and kernel threads, so that I can find the step difference between the two.
Can someone explain the difference with actual context switch example or code. May be system calls involved (in case of context switching between kernel threads) and thread library calls involved (in case of context switching between user threads).
Can someone link me to Linux source code line (say on github) handling context switch.
I also doubt why context switch between kernel threads requires changing to kernel mode. Aren't we already in kernel mode for first thread?
Can someone give detailed steps involved in context switching between user threads and kernel threads, so that I can find the step difference between the two.
Let's imagine a thread needs to read data from a file, but the file isn't cached in memory and disk drives are slow so the thread has to wait; and for simplicity let's also assume that the kernel is monolithic.
For kernel threading:
thread calls a "read()" function in a library or something; which must cause at least a switch to kernel code (because it's going to involve device drivers).
the kernel adds the IO request to the disk driver's "queue of possibly many pending requests"; realizes the thread will need to wait until the request completes, sets the thread to "blocked waiting for IO" and switches to a different thread (that may belong to a completely different process, depending on global thread priorities). The kernel returns to the user-space of whatever thread it switch to.
later; the disk hardware causes an IRQ which causes a switch back to the IRQ handler in kernel code. The disk driver finishes up the work it had to do the for (currently blocked) thread and unblocks that thread. At this point the kernel might decide to switch to the "now unblocked" thread; and the kernel returns to the user-space of the "now unblocked" thread.
For user threading:
thread calls a "read()" function in a library or something; which must cause at least a switch to kernel code (because it's going to involve device drivers).
the kernel adds the IO request to the disk driver's "queue of possibly many pending requests"; realizes the thread will need to wait until the request completes but can't take care of that because some fool decided to make everything worse by doing thread switching in user space, so the kernel returns to user-space with "IO request has been queued" status.
after the pointless extra overhead of switching back to user-space; the user-space scheduler does the thread switch that the kernel could have done. At this point the user-space scheduler will either tell kernel it has nothing to do and you'll have more pointless extra overhead switching back to kernel; or user-space scheduler will do a thread switch to another thread in the same process (which may be the wrong thread because a thread in a different process is higher priority).
later; the disk hardware causes an IRQ which causes a switch back to the IRQ handler in kernel code. The disk driver finishes up the work it had to do for the (currently blocked) thread; but the kernel isn't able to do the thread switch to unblock the thread because some fool decided to make everything worse by doing thread switching in user space. Now we've got a problem - how does kernel inform the user-space scheduler that the IO has finished? To solve this (without any "user-space scheduler running zero threads constantly polls kernel" insanity) you have to have some kind of "kernel puts notification of IO completion on some kind of queue and (if the process was idle) wakes the process up" which (on its own) will be more expensive than just doing the thread switch in the kernel. Of course if the process wasn't idle then code in user-space is going to have to poll its notification queue to find out if/when the "notification of IO completion" arrives, and that's going to increase latency and overhead. In any case, after lots of stupid pointless and avoidable overhead; the user-space scheduler can do the thread switch.
Can someone explain the difference with actual context switch example or code. May be system calls involved (in case of context switching between kernel threads) and thread library calls involved (in case of context switching between user threads).
The actual low-level context switch code typically begins with something like:
save whichever registers are "caller preserved" according to the calling conventions on the stack
save the current stack top in some kind of "thread info structure" belonging to the old thread
load a new stack top from some kind of "thread info structure" belonging to the new thread
pop whichever registers are "caller preserved" according to the calling conventions
return
However:
usually (for modern CPUs) there's a relatively large amount of "SIMD register state" (e.g. for 80x86 with support for AVX-512 I think it's over 4 KiB of of stuff). CPU manufacturers often have mechanisms to avoid saving parts of that state if it wasn't changed, and to (optionally) postpone the loading of (pieces of) that state until its actually used (and avoid it completely if its not actually used). All of that requires kernel.
if it's a task switch and not just used for thread switches you might need some kind of "if virtual address space needs to change { change virtual address space }" on top of that
normally you want to keep track of statistics, like how much CPU time a thread has used. This requires some kind of "thread_info.time_used += now() - time_at_last_thread_switch;"; which gets difficulty/ugly when "process switching" is separated from "thread switching".
normally there's other state (e.g. pointer to thread local storage, special registers for performance monitoring and/or debugging, ...) that may need to be saved/loaded during thread switches. Often this state is not directly accessible in user code.
normally you also want to set a timer to expire when the thread has used too much time; either because you're doing some kind of "time multiplexing" (e.g. round-robin scheduler) or because its a cooperating scheduler where you need to have some kind of "terminate this task after 5 seconds of not responding in case it goes into an infinite loop forever" safe-guard.
this is just the low level task/thread switching in isolation. There is almost always higher level code to select a task to switch to, handle "thread used too much CPU time", etc.
Can someone link me to Linux source code line (say on github) handling context switch
Someone probably can't. It's not one line; it's many lines of assembly for each different architecture, plus extra higher-level code (for timers, support routines, the "select a task to switch to" code, for exception handlers to support "lazy SIMD state load", ...); which probably all adds up to something like 10 thousand lines of code spread across 50 files.
I also doubt why context switch between kernel threads requires changing to kernel mode. Aren't we already in kernel mode for first thread?
Yes; often you're already in kernel code when you find out that a thread switch is needed.
Rarely/sometimes (mostly only due to communication between threads belonging to the same process - e.g. 2 or more threads in the same process trying to acquire the same mutex/semaphore at the same time; or threads sending data to each other and waiting for data from each other to arrive) kernel isn't involved; and in some cases (which are almost always massive design failures - e.g. extreme lock contention problems, failure to use "worker thread pools" to limit the number of threads needed, etc) it's possible for this to be the dominant cause of thread switches, and therefore possible that doing thread switches in user space can be beneficial (e.g. as a work-around for the massive design failures).
Don't limit yourself to Linux or even UNIX, they are neither the first nor last word on systems or programming models. The synchronous execution model dates back to the early days of computing, and are not particularly well suited to larger scale concurrent and reactive programming.
Golang, for example, employs a great many lightweight user threads -- goroutines -- and multiplexes them on a smaller set of heavyweight kernel threads to produce a more compelling concurrency paradigm. Some other programming systems take similar approaches.

Threads giving up CPU control - Seeming contradiction in textbook

I'm learning about threads and processes in an Operating Systems course, and I've come across an apparent contradiction in my textbook (Modern Operating Systems, 4th Ed. by Tanenbaum and Bos). I'm sure there's a something I'm misinterpreting here, it'd be great if someone could clear things up.
On page 106:
Another common thread call is thread_yield, which allows a thread to voluntarily give up the CPU to let another thread run. Such a call is important because there is no clock interrupt to actually enforce multiprogramming as there is with processes
Ok fine - so how I interpret that is that threads will never give up control unless they willingly cede it. Makes sense.
Then on page 116, in an example of threads mishandling shared information:
As an example, consider the errno variable maintained by UNIX. When a process (or a thread) makes a system call that fails, the error code is put into errno. In Fig. 2-19, thread 1 executes the system call access to find out if it has permission to access a certain file. The operating system returns the answer in the global variable errno. After control has returned to thread 1, but before it has a chance to read errno, the scheduler decides that thread 1 has had enough CPU time for the moment and decides to switch to thread 2.
But didn't thread 1 just get pulled from the CPU involuntarily? I thought there was no way to enforce thread switching as there is with process switching?
This makes sense if we're going about process-level threads instead of OS-level threads. The CPU can interrupt a process (regardless of what thread is running), but because the OS is not aware of process-level threads, it cannot interrupt them. If one thread inside the process wants to allow another thread to run, it has to specifically yield to the other thread.
However, most languages these days use OS-level threads, which the OS does know about and can pre-empt.
The confusion is that there are two different ways threads are implemented. In ye olde days there was no thread support at all. The DoD's mandate of the Ada programming language (in which tasks—aka threads—were is an integral part) forced the adoption of threads.
Run time libraries were created (largely to support Ada). That worked within a process. The process maintained a timer that would interrupt a threads and the library would switch among threads much like the operating system switches processes.
Note that this system only allows one thread of a process at a time to execute, even on a multiprocessor system.
Your first example is describing such a library but it is describing a very primitive thread library where thread scheduling is based upon cooperation among the various threads of the process.
Later, operating system started to develop support for threads. Rather than scheduling a process, the operating system schedules threads for execution. A process is then an address space with a collection of threads. Your second example is talking about this kind of thread.

suspendThread in windows

Keeping my question short... i am writing simulation for a RTOS. As usual the main problem comes with context switch simulation. In case of interrupts it is really becoming hard not to deviate from 'Good' coding guidelines.
Say Task A is running and user application is calculating its harmless private stuff which will run for a long time. during this task A, an interrupt X is supposed to occur. (hint: task A has nothing to do with triggering this interrupt X)... now how do i perform context switch from Task A to interrupt X handler?
My current implementation is based on a context thread that waits till some context switch is requested; an interrupt controller thread that can generate interrupts if someone request interrupt triggering; and a main thread that is running Task A. Now i use interrupt controller thread to generate a new thread for interrupt X and then request context thread to do the context switch. Context thread Suspends Task A main thread and resumes interrupt X handler thread. At the end of interrupt X handler thread, Task A main thread is resumed..
[Edit] just to clarify, i already know suspending and terminating threads from outside is really bad. That is why i asked this question. Plus please don't recommend using event etc. for controlling Task A. it is user application code and i can't control it. He can even use while(1){} if he wants...
I suspect that you can't do what you want to do in that way.
You mentioned that suspending a thread from outside is really bad. The reason is that you have no idea what the thread is doing when you suspend it. It's impossible to know whether the thread currently owns a mutex; if it does then any other thread that tries to access the same mutex is going to deadlock.
You have the problem that the runtime being used by the threads that might be suspended is the same as the one being used by the supervisor. That means there are many potential such deadlocks between the supervisor and the other threads.
In a real environment (i.e. not a simulator), the operating system kernel can suspend threads because there are checks in place to ensure that these deadlocks can't happen. I don't know the details, but it probably involves masking interrupts at certain critical points, and probably not sharing the same mutexes between user-mode code and critical parts of the kernel scheduler. (In your case that would mean your scheduler could not use any of the same OS API functions, either directly or indirectly, as are allowed to be used by the user threads, in case they involve mutexes. This of course would be virtually impossible to achieve.)
The reason I asked in a comment whether you have any control over the user code compiler is that if you controlled the compiler then you could arrange for the user code to effectively mask interrupts for the duration of each instruction and only yield to another thread at well-defined points between instructions. This is how it is done in a control system that I work on.
The other aspect is platform dependence. In Linux and other unix-like operating systems, you have signals, which are like user-mode interrupts. You could potentially use signals to emulate context switching, although you would still have the same problem with mutexes. There is absolutely no equivalent on Windows (as far as I know) precisely because of the problem already stated. The nearest thing is an asynchronous procedure call, but this will run only when the thread has put itself into an alertable wait state (which means the thread is in a deterministic state and is now safe to interrupt).
I think you are going to have to re-think the whole concept so that your supervisory thread has the sort of privileged control above the user threads that the OS has in a non-emulated environment. That will probably involve replacing the compiler or the run-time libraries, or both, with something of your own making.

C# When thread switching will most probably occur?

I was wondering when .Net would most probably switch from a thread to another?
I understand we can't predict when this will happen exactly, but is there any intelligence in this? For example, when a thread is executed will it try to wait for a method to returns or a loop to finish before switching?
I'm not an expert on .NET, but in general scheduling is handled by the kernel.
Either your thread's timeslice has expired (threads/processes only get a certain amount of CPU time)
Your thread has blocked for IO.
Some other obscure reason, like waiting for an IPC message, a network packet or something.
Threads can be preempted at any point along their execution path, be it in a loop or returning from a function. This in general isn't handled by the underlying VM (.NET or JVM) but is controlled by the OS.
Of course there is 'intelligence', of a sort:). The set of running threads can only change upon an interrupt, either:
An actual hardware interrupt from a peripheral device, eg. disk, NIC, KB, mouse, timer.
A software interrupt, (ie. a system call), that can change the state of thread/s. This encompasses sleep calls and calls to wait/signal on inter-thread synchro objects, as well as I/O calls that request data that is not immediately available.
If there is no interrupt, the OS cannot change the set of running threads because it is not entered. The OS does not know or care about loops, function/methods calls, (except those that make system calls as above), gotos or any other user-level flow-control mechanisms.
I read your question now, it may not be rellevant anymore, but after reading the above answers, i want to just to make sure:
Threads are managed (or as i know) by the process they belong to. There is nothing to do with the Operation System(and that's is the main reason why working with multithreads is more faster than working with multiprocess, because there are data sharing between threads and the switching between them is occuring faster than the context switch wich occure between process by the Short-Term-Scheduler).
(NOTE: There are two types of threads: USER_MODE' threads and KERNEL_MODE' threadss, and each os can have both of them or just on of them. Anyway a thread that working in a user application environment is considered as a USER_MODE' thread and managed by the process it's belong to.)
Am I Write?
Thanks!!!

what is kernel thread dispatching?

Can someone give me an easy to understand definition of kernel thread dispatching or just thread dispatching if there's no difference between the two?
From what I understand it's just doing a context switch while the currently active thread waits on a lock from another thread, so the CPU goes and does something else while this thread is in blocking mode.
I might however have misunderstood.
It's basically the process by which the operating system determines which of the many active threads is sent (dispatched) to the CPU for processing at any given point.
Each operating system has its own implementation, but the basic concept is to keep a sorted list of threads by priority, and dispatch them as needed to the CPU. Time slicing is added to allow multiple programs to run concurrently, etc.

Resources