How does a kernel return from the thread - multithreading

I am doing some hardcore study on computers so I can get started on my own mini Hello World OS.
I was looking at how kernels work and I was wondering how the kernel makes the current thread return to the kernel (so it can switch to another) even though the kernel isn't running and the thread has no instruction to do so.
Does it use some kind of CPU interrupt that goes back to the kernel after a few nanoseconds?

Does it use some kind of CPU interrupt that goes back to the kernel after a few nanoseconds?
It is during timer interrupts and (blocking) system calls that the kernel decides whether to keep executing the currently active thread(s) or switch to another thread. The timer interrupt handler updates resource usage, such as consumed system and user time, for the currently running process and calls the scheduler_tick() function, which decides whether a process/thread needs to be preempted.
See "Preemption and Context Switching" on page 62 of the book Linux Kernel Development.
The kernel, however, must know when to call schedule(). If it called schedule() only when code explicitly did so, user-space programs could run indefinitely. Instead, the kernel provides the need_resched flag to signify whether a reschedule should be performed (see Table 4.1). This flag is set by scheduler_tick() when a process should be preempted, and by try_to_wake_up() when a process that has a higher priority than the currently running process is awakened. The kernel checks the flag, sees that it is set, and calls schedule() to switch to a new process. The flag is a message to the kernel that the scheduler should be invoked as soon as possible because another process deserves to run.
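To make the flag idea concrete, here is a minimal toy simulation in Python. It is only a sketch, not kernel code: the names scheduler_tick, schedule and need_resched are borrowed from the quote above, while the Task class, its timeslice field, the three-tick slice and the round-robin queue are assumptions made up for the illustration.

```python
# Toy model of the need_resched mechanism; not real kernel code.
from dataclasses import dataclass


@dataclass
class Task:
    name: str
    timeslice: int              # ticks left before preemption is requested (hypothetical field)
    need_resched: bool = False


def scheduler_tick(current: Task) -> None:
    """Called from the simulated timer interrupt: charge one tick, maybe set the flag."""
    current.timeslice -= 1
    if current.timeslice <= 0:
        current.need_resched = True


def schedule(run_queue: list, current: Task, fresh_slice: int = 3) -> Task:
    """Put the current task at the back of the queue and pick the next one."""
    current.need_resched = False
    current.timeslice = fresh_slice
    run_queue.append(current)
    return run_queue.pop(0)


def run(ticks: int) -> list:
    """Simulate `ticks` timer interrupts and return which task ran during each tick."""
    run_queue = [Task("B", 3), Task("C", 3)]
    current = Task("A", 3)
    history = []
    for _ in range(ticks):
        history.append(current.name)    # the task runs user code for one tick
        scheduler_tick(current)         # the timer interrupt fires
        if current.need_resched:        # flag is checked on the way back to user mode
            current = schedule(run_queue, current)
    return history


print("".join(run(9)))   # AAABBBCCC
```

Note that none of the tasks ever asks to be switched out. The switch happens only because the tick handler set the flag and the simulated kernel checked it before returning to the task.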

Does it use some kind of CPU interrupt
Yes! Modern preemptive kernels are absolutely dependent upon interrupts from hardware to deliver good I/O performance. Keyboard, mouse, disk, NIC, USB, etc. drivers are all entered from interrupts and can make threads that are waiting on them ready/running when required (e.g., when data is available).
Threads can also change state as a result of making an OS call that changes the caller's own state or that of another thread.
The interrupt from the hardware timer is one of many interrupt sources and is only special in that many system operations have timeouts that are signaled by this interrupt. Other than that, the timer interrupt just causes a reschedule which, in most cases, changes nothing re. the ready/running state of threads. If the machine is grossly CPU-overloaded to the point where there are more ready threads than there are cores, there is a side-effect of the timer interrupt that causes CPU time to be shared amongst the ready threads.
Do not fixate on the timer interrupt—the other driver interrupts are absolutely essential. It is not impossible to build a functional preemptive multithreaded kernel with no timer interrupt at all.

Related

Is the scheduler built into the kernel a program or a process?

I looked up the CPU scheduler source code built into the kernel.
https://github.com/torvalds/linux/tree/master/kernel/sched
But I have a question.
There are mixed opinions on the cpu scheduler on the Internet.
I saw an opinion that CPU scheduler is a process.
Question: If so, the scheduler process should be visible when running ps -ef on Linux. It was difficult to find the PID and name of the scheduler process.
The PID for the CPU scheduler process is not on the internet either. However, the PID 0 SWAPPER process is called SCHED, but in Linux, PID 0 functions as an idle process.
I saw an opinion that CPU scheduler is not a process.
The CPU scheduler is passive code built into the kernel, and user processes frequently enter the kernel and run that code.
Question: How does a user process execute the kernel's scheduler code on its own?
What if you write a user program that never makes a system call that uses the kernel's scheduler?
How does the user process run the scheduler in the kernel without such code?
You have 2 similar questions (The opinion that the scheduler built into the kernel is the program and the opinion that it is the process and I want to know how to implement the cpu scheduling process in Linux operating system) so I'll answer for both of these here.
The answer is that it doesn't work that way at all. The scheduler is not called by user mode processes by using system calls. The scheduler isn't a system call. There are timers that are programmed to raise interrupts after some time has elapsed. Timers are accessed through registers that are mapped into the physical address space alongside RAM, which is called memory-mapped IO (MMIO). You write to an address specified by the ACPI tables (https://wiki.osdev.org/ACPI) and this lets you control the chips in the CPU or external PCI devices (PCI is everything nowadays).
When the timer reaches 0, it will trigger an interrupt. Interrupts are thrown by hardware (the CPU). The CPU thus includes special mechanism to let the OS determine the position at which it will jump on interrupt (https://wiki.osdev.org/Interrupt_Descriptor_Table). Interrupts are used by the CPU to notify the OS that an event happened. Without interrupts, the OS would have to reserve at least one core of the processor for a special kernel process that would constantly poll the registers of peripherals and other things. It would be impossible to implement. Also, if user mode processes did the scheduler system call by themselves, the kernel would be slave to user mode because it wouldn't be able to tell if a process is finished and processes could be selfish over CPU time.
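As a rough illustration of the countdown timer and the handler table, here is a small Python simulation. It is a sketch under assumptions: ToyCPU, ToyTimer, the idt dictionary and the vector number 32 are all invented for this example and do not correspond to any real hardware interface.

```python
# Toy model of a countdown timer raising an interrupt through a vector table.
TIMER_VECTOR = 32   # hypothetical vector number chosen for this sketch


class ToyCPU:
    def __init__(self):
        self.idt = {}    # vector number -> handler, filled in by the "kernel"
        self.log = []

    def raise_interrupt(self, vector: int) -> None:
        """Hardware side: look up the handler for this vector and jump to it."""
        self.idt[vector](self)


class ToyTimer:
    def __init__(self, cpu: ToyCPU, reload: int):
        self.cpu = cpu
        self.reload = reload    # value the "kernel" programmed into the timer
        self.count = reload

    def clock(self) -> None:
        """One hardware clock cycle: count down and fire at zero."""
        self.count -= 1
        if self.count == 0:
            self.count = self.reload
            self.cpu.raise_interrupt(TIMER_VECTOR)


def timer_handler(cpu: ToyCPU) -> None:
    """Kernel side: this is the point where a scheduler could be invoked."""
    cpu.log.append("kernel: timer interrupt")


def simulate(cycles: int, reload: int) -> list:
    cpu = ToyCPU()
    cpu.idt[TIMER_VECTOR] = timer_handler    # the kernel installs its handler
    timer = ToyTimer(cpu, reload)
    for _ in range(cycles):
        cpu.log.append("user: running")      # user code knows nothing about the timer
        timer.clock()
    return cpu.log


for line in simulate(cycles=4, reload=2):
    print(line)
```

The user code in the loop never calls into the kernel. Control reaches timer_handler only because the simulated hardware looked up the handler and jumped to it.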
I didn't look at the source code but I think the scheduler is also often called on some IO completion (also on interrupt but not always on timer interrupt). I am quite sure that the scheduler must not be preempted. That is, interrupts (and other things) will be disabled while the schedule() function runs.
I don't think you can call the scheduler a process (not even a kernel thread). The scheduler can be called by kernel threads that are created by interrupts due to bottom half processing. In bottom half processing, the top "half" of the interrupt handler runs fast and efficiently while the bottom "half" is added to the queue of processes and runs when the scheduler decides it should be scheduled. This has the effect of creating some kernel threads. The scheduler can thus be called from kernel threads but not always from bottom half of interrupts. There has to be a mechanism to call the scheduler without the scheduler having to schedule the task itself. Otherwise, the kernel will stop functioning.

How does interrupt polling perform context switching?

Consider a very old single-core CPU that does not support hardware interrupts, and let's say I want to write a multi-tasked operating system. Using a hardware timer, one can poll an IRQ line in order to determine whether the timer has elapsed, and if so then switch threads/processes.
However, in order to be able to poll, the kernel has to be executing on the CPU. For a CPU that supports hardware interrupts, an ISR is called upon an interrupt and (correct me if I'm wrong) if the interrupt is by the context-switch timer, the appropriate ISR calls the kernel code that handles context switching.
If a CPU does not support hardware interrupts (again, correct me if I'm wrong), then the kernel has to repeatedly check for interrupts and the appropriate ISR is called in kernel space.
But, if a user thread is currently in execution on this hypothetical processor, the thread has to manually yield execution to the kernel for it to be able to check whether the context switch is due according to the timer through the appropriate IRQ line. This can be done by calling an appropriate kernel function.
Is there a way to implement non-cooperative multithreading on a single-core processor that only supports software interrupts? Are my thoughts correct, or am I missing something?
Well, you are generally correct that the kernel can't do multitasking until it gains control of the CPU. That happens via an interrupt or when user code makes a system call.
The timer interrupt, in particular, is used for preemptive time slicing. I think it would be pretty hard to find a whole CPU that didn't support a timer interrupt, that you didn't have to program with punch cards or switches. Interrupts are much older than multiple cores or virtual memory or DMA or anything fancy at all.
Some SoCs have real time sub-components that have this sort of restriction (like Beaglebone), and it might come up if you were coding a small CPU in an FPGA or something.
Without interrupts, you have to wait for system calls, which basically becomes cooperative multitasking.
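A small Python sketch of what that looks like in practice, using generators as stand-ins for tasks and yield as a stand-in for a system call. The task names and the run_cooperative helper are made up for the example.

```python
# Cooperative multitasking sketch: the "kernel" only regains control at a yield.
from collections import deque


def polite_task(name, steps):
    """A task that makes a 'system call' (yield) after every step."""
    for i in range(steps):
        yield f"{name}{i}"


def greedy_task(name, steps):
    """A task that does all of its work before giving the CPU back."""
    work = [f"{name}{i}" for i in range(steps)]
    yield " ".join(work)    # a single yield, reached only once it is finished


def run_cooperative(tasks):
    """Round-robin over the tasks; a switch happens only when a task yields."""
    ready = deque(tasks)
    trace = []
    while ready:
        task = ready.popleft()
        try:
            trace.append(next(task))    # run until the task yields control
            ready.append(task)
        except StopIteration:
            pass                        # task finished; drop it
    return trace


print(run_cooperative([polite_task("a", 2), greedy_task("G", 3), polite_task("b", 2)]))
# ['a0', 'G0 G1 G2', 'b0', 'a1', 'b1']
```

The greedy task gets all three of its steps done in one turn, and the loop cannot stop it. A task that never yielded at all would hang the whole system, which is exactly the problem a timer interrupt solves.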

Why can software interrupts sleep while hardware interrupts cannot?

Why can we sleep in the software interrupt case, while it is not allowed in the hardware interrupt case?
E.g. system calls can sleep, while an ISR cannot sleep.
When you enter kernel code through a process (e.g., via a syscall), the kernel is said to be in process context. This means that the kernel is executed on behalf of a process. The execution of the kernel is synchronous with the user level, and therefore it is possible to access user-level data. It is also possible to call sleeping functions, because the scheduler is able to schedule a new process.
When you enter the kernel from a hardware source (i.e., an interrupt), the kernel is said to be in interrupt context. The execution of the kernel is asynchronous with respect to the user level, and you cannot make any assumption about what is being executed at user level. For example, some resources may be in an inconsistent state. For this reason, the code cannot block, because the scheduler cannot schedule a new process.
This difference is well explained in Rubini's book Linux Device Drivers, 3rd edition which is freely available on the web.
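Here is a toy model of the two contexts in Python. It is only a sketch: the ToyKernel class, its in_interrupt flag and the error message are invented for the illustration and are not how any real kernel is written.

```python
# Toy model of process context vs. interrupt context; not real kernel code.
class ToyKernel:
    def __init__(self):
        self.in_interrupt = False    # hypothetical flag for this sketch
        self.events = []

    def sleep(self, who: str) -> None:
        """Blocking call: only legal when there is a process to put to sleep."""
        if self.in_interrupt:
            raise RuntimeError("cannot sleep in interrupt context")
        self.events.append(f"{who} sleeps; scheduler picks another process")

    def syscall_read(self, process: str) -> None:
        """Entered synchronously on behalf of `process`, so it may block."""
        self.sleep(process)

    def isr(self) -> None:
        """Entered asynchronously from hardware; the interrupted process is unrelated."""
        self.in_interrupt = True
        try:
            self.sleep("isr")
        except RuntimeError as err:
            self.events.append(f"isr rejected: {err}")
        finally:
            self.in_interrupt = False


kernel = ToyKernel()
kernel.syscall_read("P1")
kernel.isr()
print(kernel.events)
```

In the syscall path there is a well-defined process that can be parked and resumed later. In the handler path there is nothing sensible to put to sleep, so the sketch refuses the request.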
Normally, ISRs run with interrupts disabled, so if we slept in an ISR we would have no chance to wake up.
An interrupt handler uses the interrupted process's kernel stack. If we switched to another process in an ISR, the kernel stack would be changed to the other process's stack.

Process Scheduling from Processor point of view

I understand that the scheduling is done by the kernel. Let us suppose a
process (P1) in Linux is currently executing on the processor.
Since the current process doesn't know anything about the time slice
and the kernel is currently not executing on the processor, how does the kernel schedule the next process to execute?
Is there some kind of interrupt to tell the processor to switch to execute the kernel or any other mechanism for the purpose?
In brief, it is an interrupt which gives control back to the kernel. The interrupt may appear due to any reason.
Most of the time the kernel gets control due to a timer interrupt, or a key-press interrupt might wake up the kernel.
An interrupt signalling completion of IO with peripheral systems, or virtually anything else that changes the system state, may wake up the kernel.
More about interrupts:
Interrupts as such are divided into top-half and bottom half. Bottom Halves are for deferring work from interrupt context.
Top half: runs with interrupts disabled, hence should be superfast and relinquish the CPU as soon as possible. Usually it
1) stores the interrupt state flag and disables interrupts (resets some pin on the processor),
2) communicates with the hardware, stores state information, and delegates the remaining responsibility to the bottom half,
3) restores the interrupt state flag and enables interrupts (sets some pin on the processor).
Bottom half: handles the deferred work (work delegated by the top half). It runs with interrupts enabled, hence may take a while before completion.
Two mechanisms are used to implement bottom-half processing.
1) Tasklets
2) Work queues
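The split between the two halves can be sketched with a plain queue. This is a Python toy, not the Linux tasklet or work queue API: top_half, bottom_half, the device FIFO and the packet names are all assumptions made up for the example.

```python
# Toy top-half / bottom-half split using a deferred-work queue.
from collections import deque


def top_half(device_fifo: deque, work_queue: deque, log: list) -> None:
    """Simulated top half: do the minimum (copy data out of the device) and defer the rest."""
    raw = device_fifo.popleft()
    work_queue.append(raw)
    log.append(f"top half: queued {raw!r}")


def bottom_half(work_queue: deque, log: list) -> None:
    """Simulated bottom half: runs later and does the slow part of the work."""
    while work_queue:
        raw = work_queue.popleft()
        log.append(f"bottom half: processed {raw.upper()!r}")


def demo() -> list:
    device_fifo = deque(["pkt1", "pkt2"])
    work_queue = deque()
    log = []
    top_half(device_fifo, work_queue, log)    # first interrupt
    top_half(device_fifo, work_queue, log)    # second interrupt, before any deferred work ran
    bottom_half(work_queue, log)              # the kernel gets around to the deferred work
    return log


for line in demo():
    print(line)
```

Because the top half only moves data into the queue, the second interrupt can be serviced before the slow processing of the first one has even started.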
If timer is the interrupt to switch back to kernel, is the interrupt a hardware interrupt???
The timer interrupt of interest in the context of this discussion is the hardware timer interrupt.
Inside the kernel, the term timer interrupt may mean either (architecture-dependent) hardware timer interrupts or software timer interrupts.
Read this for a brief overview.
More about timers
Remember, "timers" are an advanced topic and difficult to comprehend.
is the interrupt a hardware interrupt??? if it is a hardware
interrupt, what is the frequency of the timer?
Read Chapter 10. Timers and Time Management
if the interval of the timer is shorter than time slice, will kernel give the CPU back the same process, which was running early?
It depends upon many factors, for example the scheduler being used, the load on the system, process priorities, and things like that.
The most popular CFS doesn't really depend upon the notion of time slice for preemption!
The next suitable process as picked up by CFS will get the CPU time.
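To give a feel for how a scheduler can work without fixed time slices, here is a very rough Python sketch of the core CFS idea: always run the task that has accumulated the least virtual runtime. Everything here is simplified. The simulate_cfs function, the integer weights and the one-tick accounting are assumptions for illustration, and the real kernel uses a red-black tree and nice-level weights rather than a heap.

```python
# Very rough CFS-like loop: always run the task with the smallest virtual runtime.
import heapq


def simulate_cfs(weights: dict, ticks: int) -> list:
    """Return the order in which tasks get the CPU.

    `weights` maps task name -> weight (a stand-in for priority; higher means more CPU).
    """
    # Heap entries are (vruntime, name); ties are broken by name.
    heap = [(0.0, name) for name in sorted(weights)]
    heapq.heapify(heap)
    order = []
    for _ in range(ticks):
        vruntime, name = heapq.heappop(heap)    # task with the smallest vruntime
        order.append(name)
        vruntime += 1.0 / weights[name]         # heavier tasks age more slowly
        heapq.heappush(heap, (vruntime, name))
    return order


print(simulate_cfs({"A": 2, "B": 1}, ticks=6))   # ['A', 'B', 'A', 'A', 'B', 'A']
```

With a weight of 2 against 1, task A ends up with roughly twice the CPU time of task B, and no explicit time slice appears anywhere in the loop.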
The relation between timer ticks, time-slice and context switching is not so straight-forward.
Each process has its own (dynamically calculated) time slice. The kernel keeps track of the time slice used by the process.
On SMP, CPU-specific activities such as monitoring the execution time of the currently running process are done by the interrupts raised by the local APIC timer.
The local APIC timer sends an interrupt only to its processor.
However, the default time slice for the round-robin real-time policy (RR_TIMESLICE) is defined in include/linux/sched/rt.h
Read this.
A few things could happen:
a. The current process (p1) can finish its timeslice, and then the scheduler will check if there is any other process that could be run. If there's no other process, the scheduler will put itself in the idle state. The scheduler will assign p1 to the CPU again if p1 is a CPU-hogging task or p1 didn't leave the CPU voluntarily.
b. Another possibility is that a high priority task has jumped in. On every scheduler tick, the scheduler will check if there's any process which needs the CPU badly and is likely to preempt the current task.
In other words, a process can leave the CPU in two ways: voluntarily or involuntarily. In the first case, the process puts itself to sleep and therefore releases the CPU (case a). In the other case, a process has been preempted by a higher priority task.
(Note: This answer is based on the CFS task scheduler
of the current Linux kernel)

what is a reentrant kernel

What is a reentrant kernel?
Much simpler answer:
Kernel Re-Entrance
If the kernel is not re-entrant, a process can only be suspended while it is in user mode. Although it could be suspended in kernel mode, that would still block kernel mode execution on all other processes. The reason for this is that all kernel threads share the same memory. If execution would jump between them arbitrarily, corruption might occur.
A re-entrant kernel enables processes (or, to be more precise, their corresponding kernel threads) to give away the CPU while in kernel mode. They do not hinder other processes from also entering kernel mode. A typical use case is IO wait. The process wants to read a file. It calls a kernel function for this. Inside the kernel function, the disk controller is asked for the data. Getting the data will take some time and the function is blocked during that time.
With a re-entrant kernel, the scheduler will assign the CPU to another process (kernel thread) until an interrupt from the disk controller indicates that the data is available and our thread can be resumed. This process can still access IO (which needs kernel functions), like user input. The system stays responsive and CPU time waste due to IO wait is reduced.
This is pretty much standard for today's desktop operating systems.
Kernel pre-emption
Kernel pre-emption does not help the overall throughput of the system. Instead, it aims for better responsiveness.
The idea here is that normally kernel functions are only interrupted by hardware causes: Either external interrupts, or IO wait cases, where it voluntarily gives away control to the scheduler. A pre-emptive kernel instead also interrupts and suspends kernel functions just like it would interrupt processes in user mode. The system is more responsive, as processes e.g. handling mouse input, are woken up even while heavy work is done inside the kernel.
Pre-emption on kernel level makes things harder for the kernel developer: a kernel function can be suspended not only voluntarily or by interrupt handlers (which are a somewhat controlled environment), but also by any other process due to the scheduler. Care has to be taken to e.g. avoid deadlocks: a thread locks resource A but, while still needing resource B, is interrupted by another thread which locks resource B and then needs resource A.
Take my explanation of pre-emption with a grain of salt. I'm happy for any corrections.
All Unix kernels are reentrant. This means that several processes may be executing in Kernel Mode at the same time. Of course, on uniprocessor systems, only one process can progress, but many can be blocked in Kernel Mode when waiting for the CPU or the completion of some I/O operation. For instance, after issuing a read to a disk on behalf of a process, the kernel lets the disk controller handle it and resumes executing other processes. An interrupt notifies the kernel when the device has satisfied the read, so the former process can resume the execution.
One way to provide reentrancy is to write functions so that they modify only local variables and do not alter global data structures. Such functions are called reentrant functions. But a reentrant kernel is not limited only to such reentrant functions (although that is how some real-time kernels are implemented). Instead, the kernel can include nonreentrant functions and use locking mechanisms to ensure that only one process can execute a nonreentrant function at a time.
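The locking approach can be sketched in user-space Python with threads. This is only an analogy for what a kernel does with its own locking primitives: SharedCounter, unsafe_increment, locked_increment and hammer are names invented for this example.

```python
# Guarding a nonreentrant read-modify-write with a lock (user-space analogy).
import threading


class SharedCounter:
    """Global-style state touched by a nonreentrant read-modify-write."""

    def __init__(self):
        self.value = 0
        self.lock = threading.Lock()

    def unsafe_increment(self) -> None:
        current = self.value        # read
        self.value = current + 1    # write; another thread may have run in between

    def locked_increment(self) -> None:
        with self.lock:             # only one thread at a time gets past this point
            self.unsafe_increment()


def hammer(counter: SharedCounter, threads: int, per_thread: int) -> int:
    """Run `threads` workers that each call locked_increment `per_thread` times."""
    def worker():
        for _ in range(per_thread):
            counter.locked_increment()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    return counter.value


print(hammer(SharedCounter(), threads=4, per_thread=1000))   # 4000
```

The function itself stays nonreentrant. It is the lock around it that guarantees only one caller is inside at any moment, so no update is lost.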
If a hardware interrupt occurs, a reentrant kernel is able to suspend the current running process even if that process is in Kernel Mode. This capability is very important, because it improves the throughput of the device controllers that issue interrupts. Once a device has issued an interrupt, it waits until the CPU acknowledges it. If the kernel is able to answer quickly, the device controller will be able to perform other tasks while the CPU handles the interrupt.
Now let's look at kernel reentrancy and its impact on the organization of the kernel. A kernel control path denotes the sequence of instructions executed by the kernel to handle a system call, an exception, or an interrupt.
In the simplest case, the CPU executes a kernel control path sequentially from the first instruction to the last. When one of the following events occurs, however, the CPU interleaves the kernel control paths:
A process executing in User Mode invokes a system call, and the corresponding kernel control path verifies that the request cannot be satisfied immediately; it then invokes the scheduler to select a new process to run. As a result, a process switch occurs. The first kernel control path is left unfinished, and the CPU resumes the execution of some other kernel control path. In this case, the two control paths are executed on behalf of two different processes.
The CPU detects an exception (for example, access to a page not present in RAM) while running a kernel control path. The first control path is suspended, and the CPU starts the execution of a suitable procedure. In our example, this type of procedure can allocate a new page for the process and read its contents from disk. When the procedure terminates, the first control path can be resumed. In this case, the two control paths are executed on behalf of the same process.
A hardware interrupt occurs while the CPU is running a kernel control path with the interrupts enabled. The first kernel control path is left unfinished, and the CPU starts processing another kernel control path to handle the interrupt. The first kernel control path resumes when the interrupt handler terminates. In this case, the two kernel control paths run in the execution context of the same process, and the total system CPU time is accounted to it. However, the interrupt handler doesn't necessarily operate on behalf of the process.
An interrupt occurs while the CPU is running with kernel preemption enabled, and a higher priority process is runnable. In this case, the first kernel control path is left unfinished, and the CPU resumes executing another kernel control path on behalf of the higher priority process. This occurs only if the kernel has been compiled with kernel preemption support.
This information is available at http://jno.glas.net/data/prog_books/lin_kern_2.6/0596005652/understandlk-CHP-1-SECT-6.html
More at http://linux.omnipotent.net/article.php?article_id=12496&page=-1
The kernel is the core part of an operating system that interfaces directly with the hardware and schedules processes to run.
Processes call kernel functions to perform tasks such as accessing hardware or starting new processes. For certain periods of time, therefore, a process will be executing kernel code. A kernel is called reentrant if more than one process can be executing kernel code at the same time. "At the same time" can mean either that two processes are actually executing kernel code concurrently (on a multiprocessor system) or that one process has been interrupted while it is executing kernel code (because it is waiting for hardware to respond, for instance) and that another process that has been scheduled to run has also called into the kernel.
A reentrant kernel provides better performance because there is no contention for the kernel. A kernel that is not reentrant needs to use a lock to make sure that no two processes are executing kernel code at the same time.
A reentrant function is one that can be used by more than one task concurrently without fear of data corruption. Conversely, a non-reentrant function is one that cannot be shared by more than one task unless mutual exclusion to the function is ensured either by using a semaphore or by disabling interrupts during critical sections of code. A reentrant function can be interrupted at any time and resumed at a later time without loss of data. Reentrant functions either use local variables or protect their data when global variables are used.
A reentrant function:
Does not hold static data over successive calls
Does not return a pointer to static data; all data is provided by the caller of the function
Uses local data or ensures protection of global data by making a local copy of it
Must not call any non-reentrant functions
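The rules in the list above can be seen in a tiny Python example. Both functions are hypothetical and exist only to contrast shared static data with caller-provided storage.

```python
# Contrast between a function that keeps static data and one that does not.
_static_buffer = []    # module-level state shared by every call


def format_nonreentrant(number: int) -> list:
    """Returns a reference to shared static data, so a later call clobbers earlier results."""
    _static_buffer.clear()
    _static_buffer.extend(str(number))
    return _static_buffer


def format_reentrant(number: int, out: list) -> list:
    """All storage is supplied by the caller; nothing is kept between calls."""
    out.clear()
    out.extend(str(number))
    return out


first = format_nonreentrant(12)
second = format_nonreentrant(345)
print(first, second)    # both names refer to the same buffer, now holding 345

a = format_reentrant(12, [])
b = format_reentrant(345, [])
print(a, b)             # independent results
```

If a second caller runs format_nonreentrant between the first caller's call and its use of the result, the first caller silently sees the wrong data. The reentrant version cannot be disturbed that way because each caller owns its own buffer.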

Resources