Asynchronous methods on single-threaded machine - multithreading

Anatomy of a Program in Memory states that libraries (DLLs etc.) are mapped into the memory-mapped segment of a process. Now, when a process runs and calls a function of a library, I believe that the program counter (PC) of the thread changes to the position of the function's code in the memory-mapped segment and, after execution is complete, returns to the code segment. This makes sense if the function is synchronous, because we waited for the function call to complete and then moved ahead in the code segment.
Now, consider an asynchronous programming model. The library, say MySql.dll, is loaded in the memory-mapped segment, and the main code calls an asynchronous function in the DLL. An asynchronous function means that the PC of the thread moves ahead in the code and the thread gets a callback when the called async procedure is completed. But, in this case the async procedure is within the address space of the thread. A thread can have only one PC, which begins executing the function in the DLL. Therefore, the main program in the code segment is stalled.
This leads me to believe that async programs are no good on single-threaded systems, because the program can't move ahead until the async function completes. If more than one thread were allowed, MySql.dll could spawn a new thread (which would have its own PC) and return control to the caller in the code segment. The PC in the code segment would proceed ahead, and thus we could see some parallelization.
I know I am wrong somewhere, because async programming is very much possible on single-threaded systems (e.g. JavaScript). Therefore, I want to identify the fallacy in the arguments above. I have the following doubts, which may or may not be the source of my confusion:
Does every library share the address space of the linked process, or does it have its own address space?
If a library has its own address space, that means it is a separate process. Does that mean that calling a function in the library, and the library calling the callback, involve IPC mechanisms?
EDIT:
The question above can be confusing, so I am going to explain the main scenario here using some notation.
A thread can have only one PC. Suppose a single-threaded environment. Process P1 has thread T1. Say P1 refers to a library L1 for an async function. During loading, L1 would have been mapped into the memory-mapped segment of P1. Now, when the code in T1 calls the async function of L1, the PC (program counter) of T1 moves to the L1 segment to execute the async function. One PC can't be in two places, so T1 doesn't proceed until the async function finishes. How, then, does async benefit us in a single-threaded environment?

"But, in this case the async procedure is within the address space of the thread"
Think about what you mean by that. A procedure, whether sync or async, involves several pointers: the program counter points to the code, which is always outside the address range (not space) of the thread, while the stack-frame and stack-top pointers always belong to the address range of the thread and are used only while the procedure is running.
So from the address perspective, the async case is no different from the sync case.
And an address space always belongs to a process, not to a library or a thread. Libraries and threads each occupy parts (ranges) of the common address space; that is precisely what lets them work together.
UPDATE:
"when the code in T1 calls the async function of L1, the PC (program counter) of T1 moves to the L1 segment to execute the async function" -
No, it does not. When the PC moves, that is a sync call. An async call arranges a task which executes the async procedure later. See https://en.wikipedia.org/wiki/Asynchronous_method_invocation
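To make that concrete, here is a toy sketch in C of what a single-threaded runtime does, assuming an invented async_query/on_done pair and a one-slot callback "queue" (real runtimes hand the work to the OS via non-blocking I/O and keep a proper queue). The point is that the async call only records the callback and returns, so T1's one PC keeps moving through the caller's code:

#include <stdio.h>

typedef void (*callback_t)(int result);

static callback_t pending_cb;     // toy "queue" holding one pending callback

// The "async function": it does NOT run the work to completion. It only
// arranges the task and returns immediately, so the caller's PC moves on.
void async_query(callback_t cb) {
    pending_cb = cb;              // a real runtime would also start
}                                 // non-blocking I/O here

void on_done(int result) {
    printf("query finished: %d\n", result);
}

int main(void) {
    async_query(on_done);         // returns at once; T1 keeps running
    printf("doing other work\n"); // this runs before the callback
    // The event loop: once the OS reports completion, invoke the callback
    // on the same single thread.
    if (pending_cb)
        pending_cb(42);
    return 0;
}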

Related

How user threads are really scheduled? How is the OS (kernel) involved in such scheduling?

I'm reading the popular Operating System Concepts book, but I can't get how user threads are really scheduled. There is one statement in particular that confuses me:
“User-level threads are managed by a thread library, and the kernel is unaware of them”.
Let's say I create process A and 3 threads with the Pthreads library. My assumption is that user threads must necessarily be scheduled by the kernel (OS). Isn't the OS responsible for allocating the CPU? Don't threads have their own registers and stack? So there must be a context switch (a register switch), and therefore there must also be some handling by the OS. How can the kernel be unaware of them?
How are user threads exactly scheduled?
In the simplest implementation of user-level threads (a.k.a. "green threads"), there would be a function named yield(). The yield() function is the scheduler. When one thread calls it, it would:
1. Choose which thread should run next according to the scheduling policy. If it happens to choose the calling thread, it would then simply return. Otherwise, it would...
2. ...Save whatever registers any called function in the high-level language is obliged to save in the saved context area for the calling thread. This would include, at a minimum, the stack pointer. It probably also would include a frame pointer, and maybe a few general-purpose registers.
3. ...Restore the registers from the saved context area for the chosen thread. This would include the stack pointer, and since we're changing the stack pointer in mid-stream, so to speak, we'll have to be very careful about where the scheduler's own variables are kept. It won't really work for it to have "local" variables in the usual sense. There's a good chance that at least part of this "context switch" code will have to be written in assembly language.
4. Finally, all it has to do is a normal function return, and it will be returning from a yield() call in the chosen thread instead of the original thread.
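For illustration, here is a minimal sketch of such a yield() in C using the POSIX ucontext API, which packages exactly the register save/restore described above (the round-robin policy, the fixed-size table, and the names ctx/current/nthreads are my assumptions, not part of any particular library):

#include <ucontext.h>

#define MAX_THREADS 8
static ucontext_t ctx[MAX_THREADS];   // per-thread "saved context area"
static int current = 0, nthreads = 1; // slot 0 is the initial thread

static int pick_next(void) {          // the scheduling policy: round robin
    return (current + 1) % nthreads;
}

void yield(void) {
    int prev = current;
    int next = pick_next();
    if (next == prev)
        return;                       // chose the calling thread: just return
    current = next;
    // swapcontext() saves our registers (including the stack pointer) into
    // ctx[prev] and restores ctx[next]; this is the delicate save/restore
    // work described in steps 2 and 3, done for us by libc.
    swapcontext(&ctx[prev], &ctx[next]);
}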
The function to create a new thread would:
1. Allocate a block of memory for the thread's stack, and
2. Construct an artificial stack frame in the new stack that looks like the following function just called yield() from the line marked // 1:
void root_function(void (*thread_function)(void *), void *args) {
    yield();                              // 1: new thread is first resumed here
    thread_function(args);                // run the thread's body
    mark_my_own_saved_context_as_dead();  // tell the scheduler we are done
    yield();                              // 2: switch away for the last time
}
When thread_function() returns, the thread calls the mark_my_own_saved_context_as_dead() function to notify the scheduler algorithm that the thread is dead. When the thread calls yield() for the last time (// 2), the scheduler algorithm frees up its stack and cleans up whatever else needs to be cleaned up before selecting some other thread to run.
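Continuing the ucontext sketch from above, thread creation could look like this; makecontext() builds the "artificial stack frame", and uc_link is a rough stand-in for the dead-marking and final yield() of root_function (STACK_SIZE and spawn are my names):

#include <stdlib.h>

#define STACK_SIZE (64 * 1024)

// Create a green thread: after this, the first swapcontext() into
// ctx[nthreads] starts executing fn on the freshly allocated stack.
void spawn(void (*fn)(void)) {
    ucontext_t *uc = &ctx[nthreads];
    getcontext(uc);                           // initialize the context
    uc->uc_stack.ss_sp = malloc(STACK_SIZE);  // step 1: the thread's stack
    uc->uc_stack.ss_size = STACK_SIZE;
    uc->uc_link = &ctx[0];                    // where to continue when fn
                                              // returns (a simplification)
    makecontext(uc, fn, 0);                   // step 2: the artificial frame
    nthreads++;
}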
In a typical implementation of green threads, there will be many other places where one thread implicitly yields to another. Any blocking I/O call, for example, or a sleep(), or any call to acquire a mutex or a semaphore.
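As one hypothetical example of such an implicit yield point, a sleep() replacement for the toy runtime above could poll and yield until its deadline passes (green_sleep and the polling loop are my simplifications; a real runtime would keep a timer queue instead):

#include <time.h>

static unsigned long now_ms(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return ts.tv_sec * 1000UL + ts.tv_nsec / 1000000UL;
}

// A "blocking" sleep that never blocks the one real thread: it keeps
// yielding, so every other green thread gets to run in the meantime.
void green_sleep(unsigned ms) {
    unsigned long deadline = now_ms() + ms;
    while (now_ms() < deadline)
        yield();
}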

Does every thread have its main function?

Does every thread have its own main function?
I know that it has its own stack, but what about a main function (not necessarily a function called main)?
For example, when creating a thread, we pass a function as an argument for the new thread to run.
I tried to search about this topic, but couldn't find answers.
Quote from this article:
In a multi-threaded process, all of the process’ threads share the same memory and open files. Within the shared memory, each thread gets its own stack. Each thread has its own instruction pointer and registers. Since the memory is shared, it is important to note that there is no memory protection among the threads in a process.
Therefore, the «main» function could be said to be the function with which execution of the thread begins, i.e. the function whose first instruction's address is initially loaded into the instruction pointer. It is worth noting that the first code executed in a thread may be a routine in the standard library that does some initialization and then calls a user-supplied function, which in that case could be called the «main».
But this is not a common term; it is usually simply called a thread function.
However, there is the concept of the main thread. This is the first thread that executes when the program (process) starts.

What is the purpose of the thread entry point method/function?

When creating a thread, we pass an entry-point method/function. Why do I have to provide this method, and what is its purpose?
The OS needs to know where a new thread of execution starts. When using a high-level programming language, one does not specify an address of machine instructions in memory to be executed in the context of a new thread, but instead uses execution units defined in the language, like functions or methods. If thread creation worked like fork, and execution of a new thread started at the point of the fork invocation, then both threads would have the same local variables, which usually reside on the stack. Even if a copy of the stack were created for the new thread, both threads would run the same clean-up code when leaving scopes (e.g., in C++ a smart pointer would be freed twice). So when you specify a starting point for a new thread, you are sure it will allocate a stack frame of its own, and the function's epilogue won't be executed twice.
A thread has to start somewhere. The pthread interface requires you to provide a function of the form
void *start_thread( void *arg );
void * is used because it can refer to anything.
When a thread is created, the function you provide is called as the thread's starting point. Think of it like main() for the thread, but with different argument and return types.
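For instance, a minimal, self-contained pthread example (the name start_thread and the "T1" argument are just for illustration):

#include <pthread.h>
#include <stdio.h>

// The entry point: in effect, this is main() for the new thread.
void *start_thread(void *arg) {
    printf("hello from thread %s\n", (const char *)arg);
    return NULL;   // the return value can be collected via pthread_join
}

int main(void) {
    pthread_t t;
    pthread_create(&t, NULL, start_thread, "T1");  // thread begins at start_thread
    pthread_join(t, NULL);   // wait until the thread's "main" returns
    return 0;
}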

Can my CurrentThreadID ever change?

Is it possible for the thread that is running a section of code to change during that code block?
I am specifically thinking of code running inside ASP.net, where methods are executed against the Thread Pool.
If my code initiates an I/O operation (e.g. against a database), execution is suspended pending completion, which is signaled through an I/O completion port. In the meantime, that thread-pool thread can be re-used to handle another web request.
When my I/O completes and my code resumes on a thread-pool thread, is it guaranteed to be the same thread?
E.g.
private void DoStuff()
{
    DWORD threadID = GetCurrentThreadId();
    // And what if this 3rd-party code (e.g. ADO/ADO.NET) uses completion ports?
    // My thread-pool thread is given to someone else(?)
    ExecuteSynchronousOperationThatWaitsOnIOCompletionPort();
    // Synchronous operation has completed
    DWORD threadID2 = GetCurrentThreadId();
}
Is it possible that threadID2 != threadID?
I mentioned the .NET thread pool, but there is also the native thread pool, and I have written code for both.
Is it ever possible for my ThreadID to be ripped out from under me? Ever.
Why do I care?
The reason I care is that I'm trying to make an object thread-safe. That means that sometimes I have to know which so-called "thread" of execution called the method, so that later, when they return, I know that "they" are still "them".
The only way I know to identify a "series of machine instructions that were written to be executed by one virtual processing unit" is through GetCurrentThreadId. But if GetCurrentThreadId changes, that is, if my series of machine instructions can be moved to a different "virtual processing unit" (i.e. a thread) during execution, then I cannot rely on GetCurrentThreadId.
The short answer is no. No synchronous function would do this.
With sufficient cleverness and evil you could create a function that did this. However, you would expect it to break things and that's why nobody actually does this.
As just the most obvious issue: what happens if the calling thread holds a lock for the duration of the synchronous call?
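To demonstrate the point for a plain synchronous call, here is a minimal Win32 sketch (Sleep() merely stands in for the synchronous database operation in the question):

#include <windows.h>
#include <stdio.h>

int main(void) {
    DWORD before = GetCurrentThreadId();
    Sleep(100);  // any synchronous, blocking call
    DWORD after = GetCurrentThreadId();
    // A synchronous call always returns on the thread that made it,
    // so this prints same=1.
    printf("before=%lu after=%lu same=%d\n", before, after, before == after);
    return 0;
}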

Understanding how threads are implemented

I've been trying to understand the mechanics of implementing user-space threads. I'm not able to understand the mechanics of the stack and frames. I've come across two really great resources (here and here) that explain threading and how it's implemented but I still don't understand the following details:
How is the machine context used in thread execution? I know it consists of a stack pointer and a bunch of register values. But how exactly does the OS use it to execute the thread?
Why do we require a trampoline function (mctx_create_trampoline)? In link #2, they set up a function as a signal handler that saves the machine context and starts the thread function (mctx_create_boot).
Based on these functions, how does one implement a "yield" function that the threads can call? Also, how can we interrupt a running thread? I assume you'd have a timer and SIGALRM that calls a signal handler when it goes off. But if the yield function switches contexts, then the signal handler won't return, which would block further signals from being delivered.
Once a thread has been set off executing on a physical CPU, the OS is no longer involved until either the time slice expires or some other re-scheduling needs to take place. So the key question is: how are threads scheduled onto physical CPUs? Well, the OS sets the physical CPU's registers to the appropriate values and jumps to the location where the thread was last interrupted (which in effect sets the instruction pointer). At that point the OS has lost control and is no longer involved. It can only regain control if a hardware interrupt occurs or some other physical CPU core decides to take control of the CPU.
Can't open that document right now.
"yield" cannot be implemented in user-space. It generally is a kernel API that picks some other thread to schedule and schedules it to the current CPU.
