Does the Go runtime create OS threads (M)?

numLogicalProcessors on an Intel Core i7 is 8 (2 hardware threads × 4 physical cores), running Linux. So eight OS threads (M) can work in parallel, and the Go runtime can assign eight contexts (P1, P2, ..., P8 - runtime.GOMAXPROCS(numLogicalProcessors)) in my Go program.
Go follows an M:N threading model, where N is the number of OS threads and M is the number of goroutines in a Go program.
The OS scheduler schedules OS threads. Thread states are WAITING, RUNNABLE and EXECUTING.
The Go scheduler schedules goroutines. Goroutine states are WAITING, RUNNABLE and EXECUTING. A goroutine is a user-level thread.
Does the runtime of a Go program explicitly create those eight OS threads (M) before assigning each context (P) to an OS thread (M)?
If OS thread M1 is preempted by the OS scheduler (its time slice expires), how does the goroutine scheduler context P1 manage the state of goroutine G1 in its local run queue (LRQ)? Does P1 get a notification from the OS that M1's state has changed?

Yes, the Go scheduler starts the execution threads. The number of them can be examined or changed with runtime.GOMAXPROCS.
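A minimal sketch of inspecting and adjusting that setting (the printed values are machine-dependent):

```go
package main

import (
	"fmt"
	"runtime"
)

func main() {
	// An argument <= 0 queries the current value without changing it.
	fmt.Println("GOMAXPROCS:", runtime.GOMAXPROCS(0)) // typically the number of logical CPUs
	fmt.Println("logical CPUs:", runtime.NumCPU())    // e.g. 8 on the i7 described above

	// Restrict the runtime to four simultaneously executing threads;
	// the previous setting is returned.
	prev := runtime.GOMAXPROCS(4)
	fmt.Println("previous GOMAXPROCS:", prev)
}
```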
No, operating system preemption is transparent to the running process, so P1 gets no such notification. The Go runtime since version 1.14 can itself preempt goroutines, but that exists to keep tight loops from monopolizing a thread. It is not related to operating system preemption.
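As an illustration of the kind of loop this addresses (a sketch, assuming Go 1.14+ semantics): before 1.14, cooperative preemption only happened at function calls, so the busy goroutine below could starve the main goroutine forever; with asynchronous preemption the program terminates.

```go
package main

import (
	"fmt"
	"runtime"
	"time"
)

func main() {
	runtime.GOMAXPROCS(1) // a single execution thread

	go func() {
		for {
			// Tight loop: no function calls, so no cooperative
			// preemption points. Go 1.14+ preempts it with a signal.
		}
	}()

	time.Sleep(10 * time.Millisecond)
	fmt.Println("main goroutine got scheduled again")
}
```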

Related

Can the threads of the same process run on different cores?

Can the threads spawned by a process run on different cores of a multicore system?
Let's say I have a process P from which I have spawned two threads t1 and t2, on a multicore system with two cores C1 and C2. My questions are:
Will the threads t1 and t2 run in the same memory space as process P?
Can a thread t1 execute on a different core than the one process P is running on? E.g., process P is running on core C1 and thread t1 is running on core C2?
Can the threads spawned from a process run on different cores of a multicore system?
Yes. Assuming that the hardware has multiple cores, and provided that the operating system supports / permits this. (Modern operating systems do support it. Whether it is permitted typically depends on the admin policy.)
Will the threads t1 and t2 run in the same memory space as process P?
Yes. They will use the same memory / virtual address space.
Can a thread t1 execute in a different core than which process P is running on? For example, process P is running on the core C1 and thread t1 is running in core C2?
This question does not make sense.
A POSIX process doesn't have the ability to execute code. It is the process's threads that execute code. So the idea of a "process running on core C1" is nonsensical.
Remember: every (live) POSIX process has at least one thread. The process starts with one thread, and that thread may spawn others if it needs to. The actual allocation of threads to cores is done by the operating system and will vary over the lifetime of the process.
This is how threads work in modern operating systems. For Linux, the current (POSIX-compliant) way of implementing threads, NPTL, was introduced with Linux 2.6 in 2003. Prior to the Linux 2.6 kernel, Linux did not have true native threads. Instead it had a facility called LinuxThreads:
"LinuxThreads had a number of problems, mainly owing to the implementation, which used the clone system call to create a new process sharing the parent's address space. For example, threads had distinct process identifiers, causing problems for signal handling; LinuxThreads used the signals SIGUSR1 and SIGUSR2 for inter-thread coordination, meaning these signals could not be used by programs."
(From Wikipedia.)
In the (pre 2003!) LinuxThreads model, a "thread" was actually a process, and a process could share its address space with other processes.
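To keep the examples in one language, here is a small Go sketch of both points (runtime.LockOSThread pins each goroutine to its own OS thread; which core each thread lands on is up to the OS scheduler):

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

func main() {
	counter := 0 // shared: lives in the process's single address space
	var mu sync.Mutex
	var wg sync.WaitGroup

	for i := 0; i < 2; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			runtime.LockOSThread() // dedicate an OS thread to this goroutine
			for j := 0; j < 1000; j++ {
				mu.Lock()
				counter++ // both threads update the same memory
				mu.Unlock()
			}
		}()
	}
	wg.Wait()
	// 2000: writes made by both threads (possibly running on
	// different cores, as placed by the OS) are visible here.
	fmt.Println(counter)
}
```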
Generally this depends on how threads are implemented in the OS scheduler.
That said, all modern OSes I know of will explicitly try to distribute threads in a way that achieves a good balance between the cost of context switches and good parallelism. This implies that if there is at least one idle core, and no power-capping / power-napping / power-preserving mode is active, a waiting thread will be scheduled onto the idle core.
If harsh power management is in place, the scheduler might choose to wait a tick or two before waking up that idle core - if the already-running core frees up, it can save a lot of juice.

How does a process schedule its own threads

After the kernel schedules a process that has threads, how does said process schedule its own threads during its time slice?
For most modern kernels, the kernel only schedules threads, and processes are mostly just a container for the threads to execute inside (e.g. a container that contains a virtual address space, however many threads, and a few other scraps like file handles).
For some kernels (mostly very old unix kernels that existed before threads were invented) the kernel schedules processes and then a user-space library emulates threads. For this to work properly all of the "blocking" system calls (e.g. write()) have to be replaced by asynchronous system calls (e.g. aio_write()) so that other threads in the process can be given CPU time; however I wouldn't want to assume it works properly (e.g. if any thread blocks, then maybe all threads in the process block).
Also, it may not work when there are multiple CPUs (the kernel gives a process one CPU, but then from the kernel's perspective that process is running and can't use a second CPU). There are sophisticated work-arounds for this (to support "M:N threading"), but it's just easier and better to fix the scheduler so it works with threads. Fortunately/unfortunately this didn't matter much in the early days, because very few computers had more than one CPU anyway.
Lastly, it doesn't work for thread priorities - e.g. one process might keep a CPU busy executing an unimportant/low-priority thread while another process doesn't get that CPU time when it desperately needs it for an important/high-priority thread. This occurs because no process knows about threads belonging to other processes, and the kernel only knows about processes and not threads.
Of course these are also the reasons why every kernel adopted "kernel schedules threads and not processes" (and those that didn't died).
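A toy sketch of that old user-space model, in Go for consistency with the rest of this page (the task type and the scheduler loop are illustrative, not any real library's API): the kernel sees a single thread here, and all "scheduling" happens inside the process.

```go
package main

import "fmt"

// task is a resumable unit of work: it advances one "timeslice"
// and reports whether it wants to keep running.
type task func() (more bool)

func main() {
	counters := []int{0, 0, 0}
	var tasks []task
	for i := range counters {
		i := i
		tasks = append(tasks, func() bool {
			counters[i]++ // one slice of this "thread's" work
			return counters[i] < 3
		})
	}

	// Round-robin user-space scheduler. Note that if any task made a
	// blocking system call here, the whole process - every "thread" -
	// would stop, which is exactly the weakness described above.
	for len(tasks) > 0 {
		var next []task
		for _, t := range tasks {
			if t() {
				next = append(next, t)
			}
		}
		tasks = next
	}
	fmt.Println(counters) // [3 3 3]
}
```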
It's down to jargon definitions, but threads are simply a bunch of processes sharing an address space. Older Unixes even called them Light Weight Processes.
With that classical understanding of threads, the answer is that, these days, it's the OS that does the scheduling and each thread gets its own timeslices.
Extras
Some OSes do things to "the whole process" - e.g. Windows will give the process that has mouse focus a priority boost (all its threads get dynamically notched up a few priority places) to make that application appear more sprightly (this goes back to Windows 3).
Other operating systems will increase the priority of a thread dynamically, to solve priority-inversion situations. This is where a low-priority thread that has control of a resource (I/O, or perhaps a semaphore) is blocking a higher-priority thread from running (because the resource is not available). This is the priority inversion, and it's solved by the OS boosting the priority of the blocking thread until it gives up the required resource.
Either the kernel schedules the threads, or the kernel schedules processes and each process simulates threads by scheduling its own threads.
In the latter case, the process usually schedules its own threads using a library that sets timers. When a timer fires, its handler saves the current "thread's" registers and then loads a new set of registers from another "thread."
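A rough sketch of that timer-driven switching (a Go ticker stands in for the library's timer, and a saved counter stands in for saved registers; purely illustrative):

```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// Each "thread's" saved state; a real library would save and
	// restore actual CPU registers here.
	state := []int{0, 0}
	current := 0

	tick := time.NewTicker(time.Millisecond) // the "timer interrupt"
	defer tick.Stop()

	for state[0] < 5 || state[1] < 5 {
		<-tick.C              // handler fires: time to switch "threads"
		current = 1 - current // save one context, load the other
		if state[current] < 5 {
			state[current]++ // resume the newly loaded "thread"
		}
	}
	fmt.Println(state) // [5 5]
}
```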

System threads vs non-system threads

I have noticed that the expression "system thread" comes up very often. What does it actually mean? In particular, I cannot imagine non-system threads. Surely the system must be aware of every thread: the operating system (its scheduler) switches contexts, so it must know about them!
For example, system threads are discussed on the fourth page of:
http://www.dabeaz.com/python/GIL.pdf
A system thread is something provided by the OS. The OS kernel is in charge of scheduling system threads. If your runtime provides something like threads and a scheduler, then you have non-system threads. These are often called green threads. Sometimes non-system threads are more efficient, or the system doesn't provide threads. For Python, examples of non-system threads would be provided by greenlet or eventlet.
Threads are a construct of the operating system, which is itself just a program, so one could implement a thread scheduler in another program on top of the OS if they so desire (usually they don't reinvent the wheel, though). The pertinent components would likely include some interrupt mechanism, a memory manager (to virtualize memory allocation), and a priority queue of instruction pointers for each thread.
The concept of green threads, event loops, cooperative multitasking and coroutines is generally what is meant by non-system threads.
It essentially refers to ways of structuring programs so that instead of blocking a thread to do things like IO, we allow the thread to be used by another task.
When we park a native thread, the OS can schedule another thread to use that CPU. With cooperative multitasking approaches it is also possible to have the application choose which task to execute next.
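Go's goroutines are a concrete example of non-system threads. A minimal sketch: the program below starts far more goroutines than it has OS threads, and each sleeping goroutine is parked so its underlying thread can run others.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
	"time"
)

func main() {
	var wg sync.WaitGroup
	for i := 0; i < 10000; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			// Sleeping parks the goroutine (a green thread); the OS
			// thread underneath is reused for other goroutines.
			time.Sleep(50 * time.Millisecond)
		}()
	}
	fmt.Println("goroutines:", runtime.NumGoroutine())      // ~10001
	fmt.Println("parallel threads:", runtime.GOMAXPROCS(0)) // e.g. 8
	wg.Wait()
}
```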

Do I need to change the process priority or scheduler?

I'm running on Linux (Ubuntu). I have a process which contains 20 threads. All the threads have the same default priority and scheduler (priority 20; the default scheduler, which is not an RT scheduler). In my process, all threads need to get the same priority
(there is no thread which is more important than any other thread).
The other processes on the system are the OS's processes.
I'm running on an Intel i7 (quad core).
When using top I'm getting 140-150% CPU.
It seems that sometimes the process is choking without reason (some threads take longer to calculate some data).
I read some papers on priorities and schedulers, but they didn't note whether the change can harm the OS. Do I need to change the scheduler or priority of my process (threads)?
I haven't changed anything yet, because I don't know if the change will harm the OS.

How is context switching of threads done on a multi-core processor?

When doing context switching on a single-core processor, the code responsible for the switch executes on the only CPU, which takes care of switching the threads.
But how is this done when we have multiple CPUs? Is there a master CPU which does all the context switching of all slave CPUs? Is each CPU responsible for its own context switching? If so, how is the switching synchronized so that two CPUs are not executing the same thread? Or is there some other mechanism in place?
The kernel is multi-threaded. It can execute on any core. When a core needs to swap threads, it invokes the part of the kernel responsible for selecting the next thread it should execute.
The kernel is multi-threaded in the sense that it is written to be safe when executing concurrently on multiple cores. As such, only one CPU ends up running any given thread, because the code is constructed such that if multiple CPUs reschedule concurrently, the correct outcome occurs.
CPUs don't do context switching. Operating systems do.
In essence, an OS executes a context switch by loading a new context (registers, memory mappings, etc) in a CPU core. Threads are an OS structure in which such contexts can be saved. Hence, the OS is also responsible for picking a nonrunning thread to load the CPU context from.
If the OS were to pick a running thread, two cores would try to run the same thread. That's bound to cause confusion as they'd share the same memory, and that single thread won't expect to be run in parallel with itself (!) So no OS would do such a thing.
Assume we have two processes, P1 and P2. The approximate sequence of steps is as follows.
The current registers are stored into the process structure for P1.
The stored register values from the process structure for P2 are loaded into the CPU's registers.
The CPU returns to user mode.
P1 has been context-switched out, and P2 has been context-switched in and is running.
Note: the CPU only sends interrupts to the OS; it is the OS that performs the context switch.
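A toy model of those steps (the regs and proc types are made up for illustration; a real switch happens in privileged kernel code, not in a user program):

```go
package main

import "fmt"

// Hypothetical stand-ins for a CPU register set and a process
// structure; real kernels save far more state than this.
type regs struct{ pc, sp uint64 }
type proc struct {
	name  string
	saved regs
}

// contextSwitch mirrors the steps above: store the current registers
// into P1's structure, then load P2's saved registers into the CPU.
func contextSwitch(cpu *regs, out, in *proc) {
	out.saved = *cpu // step 1: save P1's context
	*cpu = in.saved  // step 2: load P2's context
	// steps 3-4: the kernel would now return to user mode, and P2 runs.
}

func main() {
	cpu := regs{pc: 0x1000, sp: 0x7000} // P1 currently executing
	p1 := proc{name: "P1"}
	p2 := proc{name: "P2", saved: regs{pc: 0x2000, sp: 0x8000}}

	contextSwitch(&cpu, &p1, &p2)
	fmt.Printf("CPU now at pc=%#x (P2); P1 saved pc=%#x\n", cpu.pc, p1.saved.pc)
}
```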
