Blocking in user-level thread - multithreading

I have some confusion regarding user-level thread blocking. As we know, one user-level thread blocking leads to the whole process being blocked, so why is "responsiveness" listed as one of the benefits of multithreading? From Silberschatz's "Operating System Concepts":
Multithreading in interactive application may allow a program to
continue running even if part of it is blocked or is performing a
lengthy operation, thereby increasing responsiveness to the user.
Is this referring to kernel-level threads only, or is there something I am not able to understand?
So, the question is, how can responsiveness be an advantage of threads, when one user-level thread blocking results in the entire process being blocked?

So, the question is, how can responsiveness be an advantage of threads, when one user-level thread blocking results in the entire process being blocked?
User-level threads only block under two circumstances:
When they hit a page fault or some other condition that the threading library can't handle.
When the entire process has no way to make forward progress.
This is the primary job that a user-level threading implementation has -- to take functions that would normally block and replace them with non-blocking versions so that other user-level threads can make forward progress.
That still means that if any user-level thread hits a page fault, the entire process cannot make forward progress until the page fault is serviced.
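As a rough illustration of "replace them with non-blocking versions", here is a minimal sketch in Python. It is not how any particular threading library does it: generators stand in for user-level threads, and the names green_recv, reader, writer, run and demo are all made up for this example. The only real calls are select.select with a zero timeout (a poll that never blocks) and the usual socket methods.

```python
import select
import socket
from collections import deque


def green_recv(sock, log, name):
    """Hypothetical stand-in for a blocking recv(): poll, and yield if not ready."""
    while True:
        # Timeout 0 means "just check"; this call never blocks the process.
        readable, _, _ = select.select([sock], [], [], 0)
        if readable:
            return sock.recv(1024)
        log.append(f"{name}: would block, yielding")
        yield  # hand control back to the user-level scheduler


def reader(sock, log):
    data = yield from green_recv(sock, log, "reader")
    log.append(f"reader: got {data!r}")


def writer(sock, log):
    log.append("writer: working")
    yield  # give the other thread a turn before sending
    sock.sendall(b"hello")
    log.append("writer: sent")


def run(tasks):
    """Round-robin scheduler living entirely in user space."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # run the task until its next yield
        except StopIteration:
            continue  # task finished, drop it
        ready.append(task)


def demo():
    a, b = socket.socketpair()
    log = []
    try:
        run([reader(a, log), writer(b, log)])
    finally:
        a.close()
        b.close()
    return log
```

Calling demo() returns a log in which the reader reports that it would block and yields, the writer then gets to run and send, and only afterwards does the reader receive the bytes. Had reader called sock.recv directly, the whole process (scheduler and writer included) would have been stuck in the kernel waiting for data that the writer never gets a chance to send.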

With user-level threading, the actual threads NEVER block -- any operation that might block is instead intercepted and the non-blocking equivalent done instead. If the user-level thread DID block, as you note, it would inadvertently block other threads, which is broken.

I have some confusion regarding user-level thread blocking. As we know, one user-level thread blocking leads to the whole process being blocked, so why is "responsiveness" listed as one of the benefits of multithreading?
That is only true on some operating systems.
So, the question is, how can responsiveness be an advantage of threads, when one user-level thread blocking results in the entire process being blocked?
The systems widely in use when "user threads" were developed often had software interrupts that prevented the entire thread from being blocked.
Now, any operating system worth its salt has real threads and "user threads" are a relic of the ages and operating systems that have not kept up.

Related

Why does a process get blocked if a thread waits for I/O in many to one mapping

Why does a multi-threaded process using a user level thread library get blocked when one of its threads waits for an I/O? This makes sense, but when I think more, a question pops up. Can the user level thread library not schedule another thread?
The OS can schedule only processes (or jobs); it knows nothing about the threads within a program and cannot schedule them itself.
When part of the process (here, the thread that blocked on I/O) blocks for an I/O operation, the OS suspends the entire process, since the OS deals only with processes, not the threads within them.
In the many-to-one model there is only a single kernel thread, so the process whose thread blocked can't execute until the blocked thread resumes.
In a many-to-many or one-to-one model, by contrast, each kernel thread runs its own piece of code and is unaffected by threads blocked on other kernel threads.
There are two types of thread: OS threads and green threads (which is what I think you're talking about).
OS threads are scheduled by the operating system, and one will not block another (at least not on any OS you're likely to come across these days) unless you deliberately introduce something to synchronise them (e.g. Semaphores).
Green threads, where a process schedules different paths of execution for itself, will block unless the scheduler is clever enough to provide (and therefore catch) all potentially blocking function calls and use them as a scheduling opportunity. This is also closely related to cooperative multitasking.
So the answer is yes, but only if written that way. Threads in Python famously were not written this way (read up on the GIL), and so would cause no end of problems. Python may have fixed this now.

Benefits of user-level threads

I was looking at the differences between user-level threads and kernel-level threads, which I basically understood.
What's not clear to me is the point of implementing user-level threads at all.
If the kernel is unaware of the existence of multiple threads within a single process, then which benefits could I experience?
I have read a couple of articles that stated user-level implementation of threads is advisable only if such threads do not perform blocking operations (which would cause the entire process to block).
This being said, what's the difference between a sequential execution of all the threads and a "parallel" execution of them, considering they cannot take advantage of multiple processors and independent scheduling?
An answer to a previously asked question (similar to mine) was something like:
No modern operating system actually maps n user-level threads to 1
kernel-level thread.
But for some reason, many people on the Internet state that user-level threads can never take advantage of multiple processors.
Could you help me understand this, please?
I strongly recommend Modern Operating Systems 4th Edition by Andrew S. Tanenbaum (starring in shows such as the debate about Linux; also participating: Linus Torvalds). Costs a whole lot of bucks but it's definitely worth it if you really want to know stuff. For eager students and desperate enthusiasts it's great.
Your questions answered
[...] what's not clear to me is the point of implementing User-level threads
at all.
Read my post. It is comprehensive, I daresay.
If the kernel is unaware of the existence of multiple threads within a
single process, then which benefits could I experience?
Read the section "Disadvantages" below.
I have read a couple of articles that stated that user-level
implementation of threads is advisable only if such threads do not
perform blocking operations (which would cause the entire process to
block).
Read the subsection "No coordination with system calls" in "Disadvantages."
All citations are from the book I recommended in the top of this answer, Chapter 2.2.4, "Implementing Threads in User Space."
Advantages
Enables threads on systems without threads
The first advantage is that user-level threads are a way to work with threads on a system without threads.
The first, and most obvious, advantage is that
a user-level threads package can be implemented on an operating system that does not support threads. All operating systems used to
fall into this category, and even now some still do.
No kernel interaction required
A further benefit is the light overhead when switching threads, as opposed to switching to the kernel mode, doing stuff, switching back, etc. The lighter thread switching is described like this in the book:
When a thread does something that may cause it to become blocked
locally, for example, waiting for another thread in its process to
complete some work, it calls a run-time system procedure. This
procedure checks to see if the thread must be put into blocked state.
If so, it stores the thread’s registers (i.e., its own) [...] and
reloads the machine registers with the new thread’s saved values. As soon as the stack
pointer and program counter have been switched, the new thread comes
to life again automatically. If the machine happens to have an
instruction to store all the registers and another one to load them
all, the entire thread switch can be done in just a handful of
instructions. Doing thread switching like this is at least an order of
magnitude—maybe more—faster than trapping to the kernel and is a
strong argument in favor of user-level threads packages.
This efficiency is also nice because it spares us from incredibly heavy context switches and all that stuff.
Individually adjusted scheduling algorithms
Also, since there is no central scheduling algorithm, every process can have its own scheduling algorithm, which gives it far more choice. The "private" scheduling algorithm is also more flexible about the information it gets from the threads: the amount of information can be adjusted manually and per process, so it is very fine-grained. This is because, again, a central scheduling algorithm has to fit the needs of every process, so it must be very general and deliver adequate performance in every case. User-level threads allow an extremely specialized scheduling algorithm.
This is only restricted by the disadvantage "No automatic switching to the scheduler."
They [user-level threads] allow each process to have its own
customized scheduling algorithm. For some applications, for example,
those with a garbage-collector thread, not having to worry about a
thread being stopped at an inconvenient moment is a plus. They also
scale better, since kernel threads invariably require some table space
and stack space in the kernel, which can be a problem if there are a
very large number of threads.
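To make the "own customized scheduling algorithm" point a bit more concrete, here is a small sketch in Python, under the assumption that generators are an acceptable stand-in for user-level threads. Everything in it (worker, round_robin, gc_last, run, demo) is invented for illustration. The scheduler takes its policy as a plain function, so the process can swap in whatever rule suits it, here one that only lets a "gc" thread run when nothing else is ready.

```python
def worker(name, steps, log):
    """A user-level 'thread': does one unit of work, then yields."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield


def round_robin(ready):
    # General-purpose policy: always take the head of the ready list.
    return 0


def gc_last(ready):
    # Application-specific policy: run the "gc" thread only when
    # no other thread is ready.
    for i, (name, _) in enumerate(ready):
        if name != "gc":
            return i
    return 0


def run(tasks, pick):
    """User-space scheduler; `pick` chooses which ready entry runs next."""
    ready = list(tasks)  # entries are (name, generator)
    while ready:
        name, task = ready.pop(pick(ready))
        try:
            next(task)
        except StopIteration:
            continue  # finished, do not requeue
        ready.append((name, task))


def demo(pick):
    log = []
    run([("gc", worker("gc", 2, log)), ("app", worker("app", 2, log))], pick)
    return log
```

With demo(round_robin) the two threads alternate (gc0, app0, gc1, app1); with demo(gc_last) the application thread finishes first (app0, app1, gc0, gc1). No kernel change was needed to switch policy, which is the flexibility being described, though the policy still only gets a say when a thread yields.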
Disadvantages
No coordination with system calls
The user-level scheduling algorithm has no idea if some thread has called a blocking read system call. OTOH, a kernel-level scheduling algorithm would've known because it can be notified by the system call; both belong to the kernel code base.
Suppose that a thread reads from the keyboard before any keys have
been hit. Letting the thread actually make the system call is
unacceptable, since this will stop all the threads. One of the main
goals of having threads in the first place was to allow each one to
use blocking calls, but to prevent one blocked thread from affecting
the others. With blocking system calls, it is hard to see how this
goal can be achieved readily.
He goes on to say that system calls could be made non-blocking, but that would be very inconvenient and would drastically hurt compatibility with existing OSes.
Mr Tanenbaum also says that the library wrappers around the system calls (as found in glibc, for example) could be modified to predict when a system call will block, using select, but he notes that this is inelegant.
Building upon that, he says that threads do block often. Often blocking requires many system calls. And many system calls are bad. And without blocking, threads become less useful:
For applications that are essentially entirely CPU bound and rarely
block, what is the point of having threads at all? No one would
seriously propose computing the first n prime numbers or playing chess
using threads because there is nothing to be gained by doing it that
way.
Page faults block per-process if unaware of threads
The OS has no notion of threads. Therefore, if a page fault occurs, the whole process will be blocked, effectively blocking all user-level threads.
Somewhat analogous to the problem of blocking system calls is the
problem of page faults. [...] If the program calls or jumps to an
instruction that is not in memory, a page fault occurs and the
operating system will go and get the missing instruction (and its
neighbors) from disk. [...] The process is blocked while the necessary
instruction is being located and read in. If a thread causes a page
fault, the kernel, unaware of even the existence of threads, naturally
blocks the entire process until the disk I/O is complete, even though
other threads might be runnable.
I think this can be generalized to all interrupts.
No automatic switching to the scheduler
Since there is no per-process clock interrupt, a thread acquires the CPU forever unless some OS-dependent mechanism (such as a context switch) occurs or it voluntarily releases the CPU.
This prevents usual scheduling algorithms from working, including the Round-Robin algorithm.
[...] if a thread starts running, no other thread in that process
will ever run unless the first thread voluntarily gives up the CPU.
Within a single process, there are no clock interrupts, making it
impossible to schedule processes round-robin fashion (taking turns).
Unless a thread enters the run-time system of its own free will, the scheduler will never get a chance.
He says that a possible solution would be
[...] to have the run-time system request a clock signal (interrupt) once a
second to give it control, but this, too, is crude and messy to
program.
I would even go further and say that such a "request" would require some system call to happen, whose drawback is already explained in "No coordination with system calls." Without a system call, the program would need free access to the timer, which is a security hole and unacceptable in modern OSes.
What's not clear to me is the point of implementing user-level threads at all.
User-level threads largely came into the mainstream due to Ada and its requirement for threads (tasks in Ada terminology). At the time, there were few multiprocessor systems and most multiprocessors were of the master/slave variety. Kernel threads simply did not exist. User threads had to be created to implement languages like Ada.
If the kernel is unaware of the existence of multiple threads within a single process, then which benefits could I experience?
If you have kernel threads, multiple threads within a single process can run simultaneously. With user threads, the threads always execute interleaved.
Using threads can simplify some types of programming.
I have read a couple of articles that stated user-level implementation of threads is advisable only if such threads do not perform blocking operations (which would cause the entire process to block).
That is true on Unix, and maybe not on all Unix implementations. User threads on many operating systems function perfectly fine with blocking I/O.
This being said, what's the difference between a sequential execution of all the threads and a "parallel" execution of them, considering they cannot take advantage of multiple processors and independent scheduling?
In user threads, there is never parallel execution. In kernel threads, there can be parallel execution IF there are multiple processors. On a single-processor system, there is not much advantage to using kernel threads over single threads (contra: note the blocking I/O issue on Unix and user threads).
But for some reason, many people on the Internet state that user-level threads can never take advantage of multiple processors.
In user threads, the process manages its own "threads" by interleaving execution within itself. The process can only have a thread run in the processor that the process is running in.
If the operating system provides system services to schedule code to run on a different processor, user threads could run on multiple processors.
I conclude by saying that for practical purposes there are no advantages to user threads over kernel threads. There are those who will assert that there are performance advantages, but any such advantage would be system dependent.

How does a user-level thread come out of execution?

I understand that regarding the kernel-level threads, there is an interrupt caused by reaching a certain cycle count, that is signaling the kernel to perform the required context switch over to another thread depending on the scheduler.
As I understand user-level threads, in a many-to-one model the scheduling of the user threads is done entirely in user space. The kernel just schedules the kernel thread that the user-level threads are mapped to.
My problem is that I can't comprehend the bit after "the control has been transferred to a certain user-level thread". How does it cease to execute for the scheduler to get the control back to perform needed context switching and selecting of another thread for execution? I am not sure if there are any timer registers being used to cause an interrupt when it comes to user-level threads.
So once again how does the user-level scheduler get the control back?
Please enlighten me.
Funny thing (what a real coincidence) I've been formulating the answer to this in my head on my way home yesterday. For real.
The answer is that a user-level thread has to give control back. Only kernel-level threads can be preempted. Giving up control can happen either explicitly, by calling functions like yield(), or implicitly, by calling any other function that knows how to transfer control. Those would most likely be thread-synchronization functions.
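A toy version of this in Python, if it helps. Generators play the role of user-level threads and the yield statement plays the role of yield(); polite, greedy, run and demo are names I made up for the sketch, not part of any real thread library.

```python
from collections import deque


def polite(name, steps, log):
    """Gives control back after every step."""
    for i in range(steps):
        log.append(f"{name}{i}")
        yield  # explicit yield: the scheduler regains control here


def greedy(name, steps, log):
    """Never gives control back until all its work is done."""
    for i in range(steps):
        log.append(f"{name}{i}")  # no yield inside the loop
    yield


def run(tasks):
    """User-level round-robin scheduler; it only runs between yields."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)  # control passes to the thread until it yields
        except StopIteration:
            continue
        ready.append(task)


def demo(first):
    log = []
    run([first("A", 3, log), polite("B", 3, log)])
    return log
```

demo(polite) interleaves the two threads (A0, B0, A1, B1, A2, B2), while demo(greedy) lets A run to completion before B gets a single step (A0, A1, A2, B0, B1, B2). The scheduler has no way to interrupt greedy; it simply is not running while the thread is.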

Multithreading Models - One to Many model

I've been reading the dinosaur book and have been confused by this particular model.
The book says of this model (which it calls many-to-one): "Thread management is done by the thread library in user space, so it is efficient; but the entire process will block if a thread makes a blocking system call. Also, because only one thread can access the kernel at a time, multiple threads are unable to run in parallel on multiprocessors"
What I'm confused about is what is meant by an entire process will block if a blocking system call is made? Does this mean if I have a multi-threaded program and one of its threads blocks then all of its threads will have to wait, effectively stalling the program?
If a program undergoing execution causes a block with this model, does it mean that another separate program can't be swapped in to be executed because the kernel thread is blocking? If the answer is YES, another program (process) could be swapped in, then why couldn't a multi-threaded program simply execute another one of its threads while the blocking thread is forced to wait?
If you manage your threads at user level, the switching is done by your application, not by the OS scheduler. Each thread must reach some point where it surrenders (or loses) control to the management mechanism, but that mechanism is also user-level, so if one of the threads is in the middle of a system call, your thread management system (and with it all the other threads) must wait until the kernel code is done.
The OS is still active the whole time and may still preempt the entire program, so other processes will not starve; only the internal "threads" you manage yourself will. These threads can't be started during that block because the mechanism responsible for starting them is also blocked in the kernel.

What is kernel thread dispatching?

Can someone give me an easy to understand definition of kernel thread dispatching or just thread dispatching if there's no difference between the two?
From what I understand it's just doing a context switch while the currently active thread waits on a lock from another thread, so the CPU goes and does something else while this thread is in blocking mode.
I might however have misunderstood.
It's basically the process by which the operating system determines which of the many active threads is sent (dispatched) to the CPU for processing at any given point.
Each operating system has its own implementation, but the basic concept is to keep a sorted list of threads by priority, and dispatch them as needed to the CPU. Time slicing is added to allow multiple programs to run concurrently, etc.
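A very simplified simulation of that idea in Python, just to show the shape of it. Real dispatchers are far more involved (and differ per OS); the dispatch function, the tuple layout, and the rule "lower number means higher priority" are all assumptions made for this sketch.

```python
import heapq
from itertools import count


def dispatch(threads, time_slice):
    """Simulate a strict-priority dispatcher with time slicing.

    threads: list of (priority, name, work_units); a lower priority
    number is dispatched first (an assumption of this sketch).
    Returns the order in which threads got the CPU as (name, units_run).
    """
    seq = count()  # tie-breaker so equal priorities are served in FIFO order
    ready = [(prio, next(seq), name, work) for prio, name, work in threads]
    heapq.heapify(ready)  # ready queue kept ordered by priority

    timeline = []
    while ready:
        prio, _, name, remaining = heapq.heappop(ready)  # pick the next thread
        ran = min(time_slice, remaining)  # run for at most one slice
        timeline.append((name, ran))
        remaining -= ran
        if remaining > 0:
            # Slice expired: back into the queue behind its priority peers.
            heapq.heappush(ready, (prio, next(seq), name, remaining))
    return timeline
```

For example, dispatch([(1, "ui", 3), (1, "net", 2), (2, "log", 1)], 2) gives [("ui", 2), ("net", 2), ("ui", 1), ("log", 1)]: the two priority-1 threads share the CPU in slices of 2, and the lower-priority "log" thread only runs once they are done.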

Resources