Are hardware interrupts and system calls/exceptions dispatched by the same dispatcher procedure in Linux? If you look at the Linux source, you will notice that the hardware interrupt stubs (on the x86 arch) at their interrupt vectors contain nothing more than instructions that PUSH the interrupt vector number on the stack and JUMP to common_interrupt.
My question:
Is every interrupt in Linux (exceptions, including system calls, and hardware interrupts) dispatched in the same way up to some point where the paths branch according to their type?
Sorry for my English.
Are hardware interrupts and system calls/exceptions dispatched by the same dispatcher procedure in Linux?
No. Exceptions, system calls and hardware interrupts are dispatched in different ways. If you look in arch/x86/entry/entry_64.S, you will find all of them there. First is the idtentry macro:
.macro idtentry sym do_sym has_error_code:req paranoid=0 shift_ist=-1
ENTRY(\sym)
...
...
...
END(\sym)
.endm
which prepares for exception handling (stores registers, calls the exception handler, etc.). The exception handlers themselves are then defined with the idtentry macro:
idtentry divide_error do_divide_error has_error_code=0
idtentry overflow do_overflow has_error_code=0
idtentry bounds do_bounds has_error_code=0
Most of the exception handlers are in arch/x86/kernel/traps.c.
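As a rough C model of what those idtentry lines set up: each exception vector gets its own entry point, which ends up in the matching do_* handler. This sketch assumes the standard x86 vector numbers for the three exceptions shown (0 for divide error, 4 for overflow, 5 for BOUND range exceeded). The table, the empty handler bodies and lookup_exception() are hypothetical stand-ins for illustration, not kernel code.

```c
#include <stddef.h>

/* Standard x86 vector numbers for the three exceptions listed above. */
enum {
    VEC_DIVIDE_ERROR = 0,
    VEC_OVERFLOW     = 4,
    VEC_BOUNDS       = 5,
    NR_EXC_VECTORS   = 32   /* vectors 0..31 are reserved for exceptions */
};

typedef void (*exc_handler_t)(void);

/* Hypothetical stand-ins for the real handlers; bodies left empty. */
void do_divide_error(void) { }
void do_overflow(void)     { }
void do_bounds(void)       { }

/* One slot per exception vector; unlisted slots stay NULL. */
const exc_handler_t exc_table[NR_EXC_VECTORS] = {
    [VEC_DIVIDE_ERROR] = do_divide_error,
    [VEC_OVERFLOW]     = do_overflow,
    [VEC_BOUNDS]       = do_bounds,
};

/* Return the handler registered for a vector, or NULL if there is none. */
exc_handler_t lookup_exception(unsigned int vector)
{
    if (vector >= NR_EXC_VECTORS)
        return NULL;
    return exc_table[vector];
}
```

For example, lookup_exception(VEC_OVERFLOW) returns do_overflow, while a vector with no entry in this toy table returns NULL.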
The entry point of hardware interrupts is irq_entries_start, and system call handling starts at entry_SYSCALL_64.
My question: Is every interrupt in Linux (exceptions, including system calls, and hardware interrupts) dispatched in the same way up to some point where the paths branch according to their type?
No. They are similar, but not the same. For example, the system call entry routine (entry_SYSCALL_64) checks the type of system call (64-bit or 32-bit emulation) and always starts from the same register state, as defined by the ABI, whereas an exception handler first checks the type of exception in order to select the correct stack from the IST.
You can find more information in the Intel® 64 and IA-32 Architectures Software Developer’s Manual, Volume 3A.
Understanding The Linux Kernel says:
A kernel control path denotes the sequence of instructions executed by the kernel to
handle a system call, an exception, or an interrupt.
and
Besides user processes, Unix systems include a few privileged processes called kernel
threads with the following characteristics:
• They run in Kernel Mode in the kernel address space.
• They do not interact with users, and thus do not require terminal
devices.
• They are usually created during system startup and remain alive
until the system is shut down.
What is the relation between the two concepts: a kernel control path and a kernel thread?
Is a kernel control path a kernel thread?
Is a kernel thread a kernel control path?
If I am correct, a kernel thread is represented as a task_struct
object.
So is a kernel control path?
If not, what kinds of kernel control paths can be and what kinds
can't be?
If I am correct, a kernel thread can be scheduled together with processes.
Can a kernel control path? If not, what kinds of kernel control paths can be and what kinds can't be?
Keep in mind there is no standard terminology. Using your definitions:
Is a kernel control path a kernel thread?
No, not under your definition.
Is a kernel thread a kernel control path?
No.
If I am correct, a kernel thread is represented as a task_struct object.
Probably.
So is [it] a kernel control path?
Not under your definition.
If not, what kinds of kernel control paths can be and what kinds can't be?
You defined it as:
A kernel control path denotes the sequence of instructions executed by the kernel to handle a system call, an exception, or an interrupt.
A kernel control path is the sequence of instructions executed by a kernel to handle a system call, an interrupt or an exception.
The kernel is the core of an operating system, and it controls virtually everything that occurs on a computer. An interrupt is a signal to the kernel that an event has occurred. Hardware interrupts are initiated by hardware devices, including the keyboard, the mouse, a printer or a disk drive. Interrupt signals initiated by programs are called software interrupts or exceptions.
In the most simple situation, the CPU executes a kernel control path sequentially, that is, beginning with the first instruction and ending with the last instruction.
source: http://www.linfo.org/kernel_control_path.html
In the IDT each entry has some bits called "DPL" (Descriptor Privilege Level): 0 for the kernel and 3 for normal users (maybe there are more levels). I don't understand two things:
Is this the level required to run the interrupt handler code, or the level required to trigger the event that leads to it? system_call has DPL=3, so in user mode we can execute "int 0x80". But in Linux only the kernel handles interrupts, so does that mean we can trigger the event but not handle it, even though we have the right CPL?
In Linux only the kernel handles interrupts, but when an interrupt (or trap) happens, what gets us into kernel mode?
Sorry for any mistakes, I am new to all this stuff and just trying to learn.
The IDT has 3 types of entries: trap gates, interrupt gates and task gates (which nobody uses). For trap gates and interrupt gates, the entry mostly describes the target CS and EIP of the interrupt handler.
The DPL field in an IDT entry determines the privilege level required to use the gate (or, to switch to the target CS and EIP described by the gate). Software can only use a gate via a software interrupt (e.g. int 0x80).
For IRQs and exceptions, hardware uses the gate and not software. Hardware has no privilege level and is always able to use a gate (regardless of which privilege level software is currently using and regardless of the gate's DPL). This means that IRQ handlers should have DPL=0 (to ensure that software running at CPL=3 can't use them via software interrupts).
When an interrupt handler is started, the CPU determines whether there will be a privilege level change (based on the privilege level that was in use beforehand and the target privilege level, which is almost always zero) and automatically switches privilege level where necessary. This is what causes the switch to CPL=0. Note: the CPU will also switch stacks and save the "return SS:ESP" on the new stack if a privilege level change was necessary.
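The two rules above (who may use a gate, and when the CPU switches stacks) can be sketched in C. This is only a model of the checks as described in this answer; the function names and the int_source enum are made up for illustration and do not correspond to any kernel or hardware interface.

```c
#include <stdbool.h>

/* Who is trying to go through the gate (hypothetical names). */
enum int_source {
    SRC_SOFTWARE_INT,   /* an "int n" instruction executed by software */
    SRC_HARDWARE        /* an IRQ or a CPU exception */
};

/*
 * Privilege levels are numeric: 0 is most privileged, 3 is least.
 * Software may use a gate only if its CPL is at least as privileged as
 * the gate's DPL; hardware-originated events skip the DPL check.
 */
bool gate_usable(enum int_source src, unsigned int cpl, unsigned int gate_dpl)
{
    if (src == SRC_HARDWARE)
        return true;
    return cpl <= gate_dpl;
}

/*
 * A stack switch happens only when control moves to a more privileged
 * level, e.g. from CPL=3 to a handler running at CPL=0.
 */
bool needs_stack_switch(unsigned int cpl, unsigned int handler_cpl)
{
    return handler_cpl < cpl;
}
```

With these definitions, gate_usable(SRC_SOFTWARE_INT, 3, 3) is true (the int 0x80 case), gate_usable(SRC_SOFTWARE_INT, 3, 0) is false (user code cannot fake an IRQ), and gate_usable(SRC_HARDWARE, 3, 0) is true.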
In a multitasking system, a hardware device can raise an interrupt to a particular CPU while that CPU is in either of the two states below (assuming it is not already serving an ISR):
A user mode process is executing on the CPU
A kernel mode process is executing on the CPU
I would like to know which stack is used by the interrupt handler in these two situations, and why.
All interrupts are handled by the kernel, by the interrupt handler written for that particular interrupt. Interrupt handlers can run on a dedicated IRQ stack; how an interrupt handler's stacks are set up is a configuration option. The regular kernel stack might not always be large enough for both the kernel's own work and the space required by IRQ processing routines, hence two additional stacks come into the picture:
Hardware IRQ stack.
Software IRQ stack.
In contrast to the regular kernel stack, which is allocated per process, the two additional stacks are allocated per CPU. Whenever a hardware interrupt occurs (or a softIRQ is processed), the kernel needs to switch to the appropriate stack.
Historically, interrupt handlers did not receive their own stacks. Instead, they shared the stack of the running process that they interrupted. The kernel stack is two pages in size; typically, that is 8KB on 32-bit architectures and 16KB on 64-bit architectures. Because interrupt handlers share the stack in this setup, they must be exceptionally frugal with what data they allocate there. Of course, the kernel stack is limited to begin with, so all kernel code should be cautious.
Interrupts are only handled by the kernel. So it is some kernel stack which is used (in both cases).
Interrupts do not affect (directly) user processes.
Processes may get signals, but these are not interrupts. See signal(7)...
Historically, interrupt handlers did not receive their own stacks.
Instead, they would share the stack of the process that they interrupted.
Note that a process is always running. When nothing else is schedulable, the idle task runs.
The kernel stack is two pages in size:
8KB on 32-bit architectures.
16KB on 64-bit architectures.
Because of sharing the stack, interrupt handlers must be exceptionally frugal with what data they allocate there.
Early in the 2.6 kernel process, an option was added to reduce the stack size from two pages to one, providing only a 4KB stack on 32-bit system, and interrupt handlers were given their own stack, one stack per processor, one page in size. This stack is referred to as the interrupt stack.
Although the total size of the interrupt stack is half that of the original shared stack, the average stack space available is greater because interrupt handlers get the full page of memory to themselves. Previously, every process on the system needed two pages of contiguous, nonswappable kernel memory.
Your interrupt handler should not care what stack setup is in use or what the size of the kernel stack is. Always use an absolute minimum amount of stack space.
https://notes.shichao.io/lkd/ch7/#stacks-of-an-interrupt-handler
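To put rough numbers on the trade-off in the quoted passage, here is a small arithmetic sketch. It assumes 4KB pages and one interrupt stack per CPU, as the quote describes; the function name is made up, and the task and CPU counts in the note below are invented purely for illustration.

```c
#include <stddef.h>

/*
 * Total nonswappable memory used for kernel-side stacks:
 * one stack per task plus (optionally) one interrupt stack per CPU.
 */
size_t stack_memory_bytes(size_t nr_tasks, size_t pages_per_task,
                          size_t nr_cpus, size_t irq_pages_per_cpu,
                          size_t page_size)
{
    return (nr_tasks * pages_per_task + nr_cpus * irq_pages_per_cpu)
           * page_size;
}
```

With 1000 tasks on 4 CPUs, the old shared layout is stack_memory_bytes(1000, 2, 4, 0, 4096) = 8192000 bytes, while one page per task plus one interrupt page per CPU is stack_memory_bytes(1000, 1, 4, 1, 4096) = 4112384 bytes. The per-CPU interrupt stacks add very little, and each task now needs only a single page rather than two contiguous ones.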
As far as I know, the Linux kernel used the thread's kernel stack as the ISR stack before 2.6.32, and has used a separate stack since 2.6.32. If this is wrong, please correct me.
Could you tell me when the ISR stack is set up/created, and when it is destroyed, if it ever is? Or tell me the source file name and line number? Thanks in advance.
Updated at Oct 17 2014:
There are several kinds of stack in Linux. Below are the 3 major ones (not all) that I know of.
User space process stack: each user space task has its own stack, which is created by mmap() when the task is created.
Kernel stack for a user space task: one for each user space task, created within do_fork()->copy_process()->dup_task_struct()->alloc_thread_info() and used for system_call.
Stack for hardware interrupts (top half): one for each CPU (after 2.6), defined in arch/x86/kernel/irq_32.c: DEFINE_PER_CPU(struct irq_stack *, hardirq_stack); do_IRQ() -> handle_irq() -> execute_on_irq_stack() switches to the interrupt stack.
Please let me know if these are correct or not.
Interrupt handlers have their own IRQ stacks. Two kinds of stack come into the picture for an interrupt handler:
Hardware IRQ stack.
Software IRQ stack.
In contrast to the regular kernel stack, which is allocated per process, the two additional stacks are allocated per CPU. Whenever a hardware interrupt occurs (or a softIRQ is processed), the kernel needs to switch to the appropriate stack. Historically, interrupt handlers did not receive their own stacks. Instead, they shared the stack of the running process that they interrupted. The kernel stack is two pages in size; typically, that is 8KB on 32-bit architectures and 16KB on 64-bit architectures. Because interrupt handlers share the stack in this setup, they must be exceptionally frugal with what data they allocate there. Of course, the kernel stack is limited to begin with, so all kernel code should be cautious.
Pointers to the additional stacks are provided in the following arrays:
arch/x86/kernel/irq_32.c
static union irq_ctx *hardirq_ctx[NR_CPUS] __read_mostly;
static union irq_ctx *softirq_ctx[NR_CPUS] __read_mostly;
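A simplified userspace model of how such per-CPU pointers can be used to decide whether a stack switch is needed. It assumes an 8KB stack and two CPUs, and it reduces union irq_ctx to just the stack area. The helpers on_irq_stack() and pick_irq_stack() are hypothetical and only illustrate the idea; the kernel's own switching code is written differently.

```c
#include <stdbool.h>
#include <stdint.h>

#define THREAD_SIZE 8192u   /* assumed: two 4KB pages */
#define NR_CPUS     2       /* assumed for this sketch */

/* Simplified stand-in for the kernel's union irq_ctx: stack area only. */
union irq_ctx {
    uint32_t stack[THREAD_SIZE / sizeof(uint32_t)];
};

static union irq_ctx hardirq_stacks[NR_CPUS];

/* One pointer per CPU, mirroring the arrays shown above. */
static union irq_ctx *hardirq_ctx[NR_CPUS] = {
    &hardirq_stacks[0],
    &hardirq_stacks[1],
};

/* Is the stack pointer sp already inside this CPU's hard IRQ stack? */
bool on_irq_stack(int cpu, uintptr_t sp)
{
    uintptr_t base = (uintptr_t)hardirq_ctx[cpu];
    return sp >= base && sp < base + sizeof(union irq_ctx);
}

/*
 * Pick the stack pointer the handler should run on. Stacks grow down,
 * so a fresh IRQ stack starts at its highest address. If we are already
 * on the IRQ stack (a nested interrupt), keep using the current pointer.
 */
uintptr_t pick_irq_stack(int cpu, uintptr_t sp)
{
    if (on_irq_stack(cpu, sp))
        return sp;
    return (uintptr_t)hardirq_ctx[cpu] + sizeof(union irq_ctx);
}
```

The point of the check in pick_irq_stack() is that a nested interrupt must not reset the stack pointer to the top of the IRQ stack, since that would overwrite the frames of the handler it interrupted.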
I am doing some research trying to find the code in the Linux kernel that implements interrupt handling; in particular, I am trying to find the code responsible for handling the system timer.
According to http://www.linux-tutorial.info/modules.php?name=MContent&pageid=86
The kernel treats interrupts very similarly to the way it treats exceptions: all the general purpose registers are pushed onto the system stack and a common interrupt handler is called. The current interrupt priority is saved and the new priority is loaded. This prevents interrupts at lower priority levels from interrupting the kernel while it handles this interrupt. Then the real interrupt handler is called.
I am looking for the code that pushes all of the general purpose registers on the stack, and the common interrupt handling code.
Pushing the general purpose registers onto the stack, at least, is architecture dependent, so I'm looking for the code that is associated with the x86 architecture. At the moment I'm looking at version 3.0.4 of the kernel source, but any version is probably fine. I've started by looking in kernel/irq/handle.c, but I don't see anything that looks like saving the registers; it just looks like it is calling the registered interrupt handler.
In the 3.0 tree, the 32-bit versions are in arch/x86/kernel/entry_32.S and the 64-bit versions in entry_64.S in the same directory. Search for the various ENTRY macros that mark kernel entry points.
I am looking for the code that pushes all of the general purpose registers on the stack
Hardware stores part of the current state (which includes some registers) before executing an interrupt handler. No code is involved in that step. When the interrupt exits, the hardware reads that state back from where it was stored.
Now, code inside the interrupt handler may read and write the saved copies of registers, causing different values to be restored as the interrupt exits. That's how a context switch works.
On x86, the hardware only saves those registers that it changes itself before the interrupt handler starts running (EFLAGS, CS and EIP, plus SS and ESP when the privilege level changes). On most embedded architectures, the hardware saves all registers. The reason for the difference is that x86 has a large number of registers, and saving and restoring any not modified by the interrupt handler would be a waste. So the interrupt handler is responsible for saving and restoring any registers it uses.
See the Intel® 64 and IA-32 Architectures Software Developer’s Manual, starting on page 6-15.
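To make the split between hardware-saved and handler-saved state concrete, here is a sketch of the stack layout in C, using the 64-bit case, where the CPU always pushes SS and RSP along with RFLAGS, CS and RIP. The struct names and the particular set and order of software-saved registers are assumptions for illustration. They are not the kernel's struct pt_regs.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * What the CPU itself pushes on interrupt entry in 64-bit mode, listed
 * from the lowest address upwards. Exceptions that carry an error code
 * push it just below rip (not modeled here).
 */
struct hw_frame {
    uint64_t rip;
    uint64_t cs;
    uint64_t rflags;
    uint64_t rsp;
    uint64_t ss;
};

/*
 * Registers the entry code has to save by itself before calling C code.
 * Hypothetical subset and order, for illustration only.
 */
struct sw_saved_regs {
    uint64_t r11, r10, r9, r8;
    uint64_t rax, rcx, rdx, rsi, rdi;
};

/*
 * Software pushes after the hardware, so its registers end up at lower
 * addresses, directly below the hardware frame.
 */
struct full_frame {
    struct sw_saved_regs regs;
    struct hw_frame      hw;
};
```

In this model the hardware frame is 40 bytes (five 8-byte slots), and everything below it on the stack is there only because entry code pushed it.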