How does Linux remember its Kernel Stack Pointer? - linux

I know that there are two types of stack in Linux : user stack for each user threads and Kernel Stack for kernel threads (but 1 process). The interruptions, more precisely, the interruption procedures, are the bridges between this two modes, kernel (0) and user (3). The interrupt vector table let the processor loads the right instruction address in the PC register, but how is the stack pointer register changed when it switches in kernel mode? Does the subroutines indicate where's the kernel stack just before its first instruction? Or does the processor uses two stack pointer registers (I really doubt it)?
How does the "return from interrupt" knows where to return? Is the PCB saved in kernel stack or elsewhere?
Please don't hesitate to rectify it anything I've said to be true is not.
Thanks a lot for your help.

Kernel mode stack in Linux kernel is stored in task_struct->stack. Where and how it comes from is totally up to the platform. Some platforms might not be saving it like above. But then you can use task_stack_page() to find the stack.
And
While entering the interrupt handler, PC is stored on kernel stack. While returning from Interrupt this PC is loaded back from the kernel stack.

Related

Stack of hardware interrupt top-half in Linux kernel?

I know Linux kernel take thread kernel stack as ISR stack before 2.6.32, after 2.6.32, kernel uses separated stack, if wrong, please correct me.
Would you tell me when the ISR stack is setup/crated, or destroy if there is. Or tell me the source file name and line number? Thanks in advance.
Updated at Oct 17 2014:
There are several kinds of stack in Linux. Below are 3 major(not all) that I know.
User space process stack, each user space task has its own stack,
this is created by mmap() when task is created.
Kernel stack for user space task, one for each user space task, this is
created within do_fork()->copy_process()->dup_task_struct()->alloc_thread_info() and used for system_call.
Stack for hardware interruption(top half), one for each CPU(after 2.6),
defined in arch/x86/kernel/irq_32.c: DEFINE_PER_CPU(struct irq_stack *, hardirq_stack); do_IRQ() -> handle_irq() ->
execute_on_irq_stack() switch the interrupt stack
Please let me know if these are correct or not.
For Interrupt handler there is IRQ stack. 2 kinds of stack comes into picture for interrupt handler:
Hardware IRQ Stack.
Software IRQ Stack.
In contrast to the regular kernel stack that is allocated per process, the two additional stacks are allocated per CPU. Whenever a hardware interrupt occurs (or a softIRQ is processed), the kernel needs to switch to the appropriate stack. Historically, interrupt handlers did not receive their own stacks. Instead, interrupt handlers would share the stack of the running process, they interrupted. The kernel stack is two pages in size; typically, that is 8KB on 32-bit architectures and 16KB on 64-bit architectures. Because in this setup interrupt handlers share the stack, they must be exceptionally frugal with what data they allocate there. Of course, the kernel stack is limited to begin with, so all kernel code should be cautious.
Pointers to the additional stacks are provided in the following array:
arch/x86/kernel/irq_32.c
static union irq_ctx *hardirq_ctx[NR_CPUS] __read_mostly;
static union irq_ctx *softirq_ctx[NR_CPUS] __read_mostly;

Does Linux have both a user-mode and a system-mode stack for each process it runs?

This is my assignment question. Now I understand that the difference between the user mode and kernel mode (I think that is system mode).
But my question is: How the process works in Linux? Does the system have both user mode and system mode stacks for each process it runs?
I believe that this question has already been answered here:
What is the difference between kernel stack and user stack?
kernel stack and user space stack
That is, a userspace process has only one stack, a pointer to which is defined in the second element of the task_struct in include/linux/sched.h (about line 1045 in 3.12).
There is a possibility of some confusion with the per-thread kernel stack, as noted in the above posts. In a sense, a process can have one or more stacks, userspace and kernel space, depending on the number of threads it has at any point in time. The connection between the per-thread kernel stack, the thread and the process task_struct is described in this lecture by Junfeng Yang.

Linux Stack Sizes

I'm looking for a good description of stacks within the linux kernel, but I'm finding it surprisingly difficult to find anything useful.
I know that stacks are limited to 4k for most systems, and 8k for others. I'm assuming that each kernel thread / bottom half has its own stack. I've also heard that if an interrupt goes off, it uses the current thread's stack, but I can't find any documentation on any of this. What I'm looking for is how the stacks are allocated, if there's any good debugging routines for them (I'm suspecting a stack overflow for a particular problem, and I'd like to know if its possible to compile the kernel to police stack sizes, etc).
The reason that documentation is scarce is that it's an area that's quite architecture-dependent. The code is really the best documentation - for example, the THREAD_SIZE macro defines the (architecture-dependent) per-thread kernel stack size.
The stacks are allocated in alloc_thread_stack_node(). The stack pointer in the struct task_struct is updated in dup_task_struct(), which is called as part of cloning a thread.
The kernel does check for kernel stack overflows, by placing a canary value STACK_END_MAGIC at the end of the stack. In the page fault handler, if a fault in kernel space occurs this canary is checked - see for example the x86 fault handler which prints the message Thread overran stack, or stack corrupted after the Oops message if the stack canary has been clobbered.
Of course this won't trigger on all stack overruns, only the ones that clobber the stack canary. However, you should always be able to tell from the Oops output if you've suffered a stack overrun - that's the case if the stack pointer is below task->stack.
You can determine the process stack size with the ulimit command. I get 8192 KiB on my system:
$ ulimit -s
8192
For processes, you can control the stack size of processes via ulimit command (-s option). For threads, the default stack size varies a lot, but you can control it via a call to pthread_attr_setstacksize() (assuming you are using pthreads).
As for the interrupt using the userland stack, I somewhat doubt it, as accessing userland memory is a kind of a hassle from the kernel, especially from an interrupt routine. But I don't know for sure.

Does there exist Kernel stack for each process ?

Does there exist a Kernel stack and a user-space stack for each user space process? If both stacks exist, there should be 2 stack pointers for each user space process right?
In Linux, each task (userspace or kernel thread) has a kernel stack of either 8kb or 4kb, depending on kernel configuration. There are indeed separate stack pointers, however, only one is present in the CPU at any given time; if userspace code is running, the kernel stack pointer to be used on exceptions or interrupts is specified by the task-state segment, and if kernel code is running, the user stack pointer is saved in the context structure located on the kernel stack.
Each user space thread(not just process) has its user space stack and kernel space stack. The kernel space has one stack percpu and the ISR has its seperate stack too.

kernel stack for linux process

Is the kernel stack for all process shared or there is a seperate kernel stack for each process? If it is seperate for each process where is this stack pointer stored? In task_struct ?
There is just one common kernel memory. In it each process has it's own task_struct + kernel stack (by default 8K).
In a context switch the old stack pointer is saved somewhere and the actual stack pointer is made to point to the top of the stack (or bottom depending on the hardware architecture) of the new process which is going to run.
This old article says that each process has its own kernel stack. See comments to why that seems to be a very good design.
I tried reading the current source to make sure, but since the kernel stack is "implicit", it's not visible in the task_struct. This is mentioned in the article.
This answer was edited to incorporate wisdom from comments. Thanks.
The book "Linux kernel Development" of Robert Love has a good explanation about process kernel stack.
And yes, each process has its own kernel stack and if I´m not wrong its pointer is stored on thread_info structure. But I´m not really sure about it, and the struct task_struct is stored on beginning or the end of process kernel stack, depending of CPU architecture.
Cheers.
Carlos Maiolino
I think each process has its own kernel mode stack. Driver is executing in the kernel mode, the process sometimes will be blocked while executing driver routine. and the operating system can schedule another process to run. The scheduled process can call driver routine again. If kernel stack is shared, 2 processes are using the kernel stack, things will be mixed up. I'm puzzled by this question for a long time. At first I think kernel stack is shared, some books say that. After I read the Linux kernel Development, and see some driver code, I begin to think the kernel stack is not shared.

Resources