Since you can enable escape analysis, there should also be an opcode that uses stack allocation instead of the heap, right? Or how would that work?
Related
In Linux, when a running program attempts to use more stack space than the limit (a stack overflow), that usually results in a "segmentation fault" error and the execution is aborted.
Is it guaranteed that exceeding the stack space limit will always cause a segmentation fault error? Or could it happen that the program continues to run, possibly with some erroneous behavior due to data having been corrupted?
Another way of putting this: if a program misbehaves by producing wrong results but without a crash, can the cause still be a stack overflow?
Edit: to clarify, this question is not about "stack buffer overflow"; it is about stack overflow, i.e. the situation where the stack space used by the program exceeds the stack size limit (the limit that in Linux is given by ulimit -s).
A stack overflow turning into an access violation requires memory management hardware of some sort. Without hardware-assisted memory protection, an overgrown stack will collide with some other memory allocation, causing mutual corruption.
On demand-paged virtual memory operating systems, the growth limit of the stack is protected by a guard page: a page of virtual memory which is reserved (will not be allocated to anything) and marked "not present" so that accessing it generates a violation. A guard page is only so many bytes wide; a single large stack adjustment can still accidentally jump right over the guard page and land in some unrelated writable memory (such as a mapping belonging to a heap allocation), where havoc will be wreaked without necessarily triggering any memory access violation.
In the C language we can easily cause large stack increments by declaring large, uninitialized, non-static, block-scoped arrays, like char array[8192]; (twice as large as a 4096-byte guard page). Using features like alloca or C99 variable-length arrays, we can do this dynamically: we can write a program which reads an integer value as a run-time input and grows the stack by that much, as in the sketch below.
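Here is a minimal sketch of that idea (assuming Linux, a 4096-byte guard page, and the usual ulimit -s stack limit; names are made up for illustration): the run-time input n controls how far the stack grows in a single step, and a large enough n can jump clean over the guard page into unrelated writable memory.

    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>

    static void grow(size_t n)
    {
        char vla[n];          /* C99 VLA: one stack adjustment of n bytes */
        memset(vla, 'x', n);  /* touch the memory; any fault happens here */
        printf("survived a %zu-byte stack jump at %p\n", n, (void *)vla);
    }

    int main(int argc, char **argv)
    {
        size_t n = (argc > 1) ? strtoull(argv[1], NULL, 0) : 8192;
        grow(n);
        return 0;
    }

Run it with increasingly large arguments and the failure mode changes: moderate sizes fault cleanly on the guard page, while very large jumps may land in other writable mappings and corrupt them silently.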
I debugged a problem many years ago whereby third-party code had debug logging macros, inside whose expansions there was a temporary array like char print_buf[8192] that was used for formatting messages. This was used in a multi-threaded application with many threads, whose stacks were reduced to just 64 kilobytes in size. Thanks to this print_buf, a thread's overflown stack leaped right past the guard page and landed in another thread's stack, corrupting its local variables and causing the proverbial "hilarity to ensue".
I'm reading Professional Assembly Language by Richard Blum, and I am confused about an inconsistency in the book. I am wondering what exactly the program stack's growth direction is.
This is the picture from page 312, which seems to suggest that the program stack grows up.
But when I reached page 322, I saw another version, which suggests that the program stack grows down.
The book is not inconsistent; each drawing shows higher addresses at the top.
The first drawing illustrates a stack that grows downward. The caller pushes parameters onto the stack, then calls the new function. The act of calling pushes the return address onto the stack. The callee then pushes the current value of the base pointer onto the stack, copies the stack pointer into the base pointer, and decrements the stack pointer to make room for the callee's local variables.
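To make that sequence concrete, here is a tiny caller/callee pair annotated with what a traditional 32-bit x86 compiler would typically place on the downward-growing stack at each step (offsets and register names are illustrative; none of this is guaranteed by the C standard):

    int callee(int a, int b)
    {
        /* prologue: the old %ebp is pushed, %esp is copied into %ebp,
           and %esp is decremented to make room for 'local' */
        int local = a + b;
        return local;
    }

    int caller(void)
    {
        /* the arguments are pushed (right to left), then the 'call'
           instruction pushes the return address of the next statement */
        return callee(1, 2);
    }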
Some background:
The meaning of the stack pointer and the direction of stack growth differ between processors. On TMS Piccolo controllers the stack grows upward, so a "PUSH" increments the stack pointer. The stack pointer may point to the value last pushed or to the location where the next value to be pushed will be written. The ARM processor allows all 4 possible combinations for the stack, so there must be a convention on how to use the stack pointer.
On x86 processors:
On x86 processors the stack ALWAYS grows downwards, so a "PUSH" instruction will decrement the stack pointer; the stack pointer always points to the last value pushed.
The first picture shows you that the addresses above the stack pointer (address > stack pointer) already contain values. If you store more values to the stack, they are stored to locations below the stack pointer (the next value will be stored to address -16(%ebp)). This means that the picture from page 312 also shows a down-growing stack.
-- Edit --
If a processor has a "PUSH" instruction, the direction of stack growth is given by the CPU. For CPUs that do not have a "PUSH" instruction (like PowerPC, or ARM without ARM-Thumb code), the operating system has to define the direction of stack growth.
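If you want to observe the direction empirically on your own machine, a small sketch like the following prints the addresses of local variables across nested calls (comparing pointers to distinct objects is technically unspecified in C, so treat it as a diagnostic toy, not portable code):

    #include <stdio.h>

    static void callee(int depth, char *caller_local)
    {
        char my_local;
        printf("depth %d: local at %p (%s the caller's local)\n",
               depth, (void *)&my_local,
               &my_local < caller_local ? "below" : "above");
        if (depth < 3)
            callee(depth + 1, &my_local);
    }

    int main(void)
    {
        char top;
        callee(1, &top);
        return 0;
    }

On x86/Linux each nested call should report its local "below" the caller's, i.e. a downward-growing stack.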
Stack growth direction varies with OS, CPU architecture, and probably a number of other things.
The most common layout has the stack start at the top of memory and grow down, while the heap starts at the bottom and grows up. Sometimes it's the other way around; e.g., MacOS prior to OSX put the stack just above the code area, growing up, while the heap started at the top of memory and grew down.
An even more compelling definition of the stack growth direction is (if the processor has one) the interrupt stack. Some architectures (like PowerPC) don't really have a HW stack at all. Then the system designer can decide which way to implement the stack: pre-incrementing, post-incrementing, pre-decrementing, or post-decrementing.
In PPC, calls use the link register, and the next call overwrites it if the return address is not saved programmatically.
PPC interrupts use 2 special registers - the "return address" and the machine status. That's because instructions can be "restarted" after an interrupt - a way to handle interrupts in a pipelined architecture.
pre-incrementing: the stack pointer is incremented before the store in a push - the stack pointer points to the last used item. Seen in a few of the stranger 8-bit architectures (some Forth processors and the like).
post-incrementing: the store is done before the stack pointer is incremented - the stack pointer points to the first free stack element.
pre- and post-decrementing: similar to the above, but the stack grows downwards (more common).
Most common is post-decrementing.
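As a toy illustration of two of these conventions (the names and layout are mine, not from any particular ABI), the following uses one array as "memory", with a downward pre-decrementing stack at one end and an upward post-incrementing stack at the other:

    #include <stdio.h>

    #define WORDS 8
    static int mem[WORDS];

    /* pre-decrementing (x86-style): decrement first, then store;
       sp always indexes the last value pushed */
    static void push_pre_dec(int *sp, int value)  { mem[--*sp] = value; }

    /* post-incrementing: store first, then increment;
       sp always indexes the first free slot */
    static void push_post_inc(int *sp, int value) { mem[(*sp)++] = value; }

    int main(void)
    {
        int sp_down = WORDS;  /* grows toward lower indices */
        int sp_up   = 0;      /* grows toward higher indices */

        push_pre_dec(&sp_down, 42);
        push_post_inc(&sp_up, 42);
        printf("downward sp=%d (last item), upward sp=%d (first free slot)\n",
               sp_down, sp_up);
        return 0;
    }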
I have defined an upward-growing stack in xv6 (which had a downward-growing stack) and want to know how to put a guard page between the stack and the heap. Is there any specific system call I can make use of? Also, how can I ensure that that one page of address space always lies between the stack and the heap?
So you know exactly where your stack starts growing up from? In that case, why don't you just leave one page free and start from the next page onwards? You might also need to allocate and poison the memory with some data so that corruption can be detected, the way some memory-overrun detection tools work. Or you might need to set some custom flag on that page so that when it is accessed you can check the flag and fault if it is found inappropriate.
Did I get your question correctly, btw?
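xv6 itself has no mprotect-style system call you can use from user space, so the guard page has to be set up in the kernel when the stack is mapped, by leaving the page unmapped or stripping its user-access permission in the page table. For comparison, here is what the same idea looks like as a Linux userland sketch (sizes and names are illustrative only): one page between the two regions is made inaccessible, so any touch of it faults immediately.

    #include <stdio.h>
    #include <sys/mman.h>
    #include <unistd.h>

    int main(void)
    {
        long page = sysconf(_SC_PAGESIZE);

        /* three contiguous pages: "heap" page, guard page, "stack" page */
        char *region = mmap(NULL, 3 * page, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (region == MAP_FAILED) { perror("mmap"); return 1; }

        /* make the middle page inaccessible - this is the guard page */
        if (mprotect(region + page, page, PROT_NONE) != 0) {
            perror("mprotect");
            return 1;
        }

        region[0] = 'h';         /* fine: on the "heap" side */
        region[2 * page] = 's';  /* fine: on the "stack" side */
        region[page] = 'x';      /* SIGSEGV: lands on the guard page */
        return 0;
    }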
What does "stack hog" means when talking about Linux kernel?
I read this term in some Linux kernel books (Professional Linux Kernel Architecture by Wolfgang Mauerer), but what exactly does "stack hog" mean? Thanks.
"Stack hog" is an informal name used to describe functions that use significant amounts of automatic storage (AKA "the stack"). What exactly counts as "hogging" varies by the execution environment: in general, kernel-level functions have tighter limits on the stack space - just a few kilobytes, so a function considered a "stack hog" in kernel mode may become a "good citizen" in user mode.
A common reason for a function to become a stack hog is allocating buffers or other arrays in automatic memory. This is more convenient, because you do not need to remember to free the memory or check the result of the allocation, and you may also save some CPU cycles on the allocation itself. The downside is the possibility of overflowing the stack, which results in a panic for kernel-level code. That is why a common remedy for "stack hogging" is moving some of your buffers into dynamic memory.
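As a rough illustration (generic kernel-style C; the function names and the buffer size are made up), the first version below is a stack hog, keeping 4 KiB in automatic storage on a kernel stack of only a few kilobytes, while the second applies the usual remedy of moving the buffer into dynamically allocated memory:

    #include <linux/errno.h>
    #include <linux/slab.h>      /* kmalloc(), kfree(), GFP_KERNEL */
    #include <linux/string.h>    /* memset() */

    /* stack hog: 4 KiB of automatic storage on the kernel stack */
    static int fill_report_hog(void)
    {
        char buf[4096];
        memset(buf, 0, sizeof(buf));
        /* ... format a report into buf ... */
        return 0;
    }

    /* remedy: tiny stack footprint, at the cost of checking the
       allocation and remembering to free it afterwards */
    static int fill_report(void)
    {
        char *buf = kmalloc(4096, GFP_KERNEL);

        if (!buf)
            return -ENOMEM;
        memset(buf, 0, 4096);
        /* ... format a report into buf ... */
        kfree(buf);
        return 0;
    }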
The Linux kernel uses 4K stacks. Using an inordinate amount of that small space is considered being a hog. If you are "lazy" and allocate a buffer on the stack, or have a function with a large number of parameters, that is being a hog.
The stack must hold any sequence of calls needed to service a system call as well as any interrupt handlers that may be called. So it is very important to conserve stack space.
I'm looking for a good description of stacks within the Linux kernel, but I'm finding it surprisingly difficult to find anything useful.
I know that stacks are limited to 4K for most systems, and 8K for others. I'm assuming that each kernel thread / bottom half has its own stack. I've also heard that if an interrupt goes off, it uses the current thread's stack, but I can't find any documentation on any of this. What I'm looking for is how the stacks are allocated, and whether there are any good debugging routines for them (I suspect a stack overflow for a particular problem, and I'd like to know if it's possible to compile the kernel to police stack sizes, etc.).
The reason that documentation is scarce is that it's an area that's quite architecture-dependent. The code is really the best documentation - for example, the THREAD_SIZE macro defines the (architecture-dependent) per-thread kernel stack size.
The stacks are allocated in alloc_thread_stack_node(). The stack pointer in the struct task_struct is updated in dup_task_struct(), which is called as part of cloning a thread.
The kernel does check for kernel stack overflows, by placing a canary value STACK_END_MAGIC at the end of the stack. If a fault in kernel space occurs, the page fault handler checks this canary - see for example the x86 fault handler, which prints the message "Thread overran stack, or stack corrupted" after the Oops message if the stack canary has been clobbered.
Of course this won't trigger on all stack overruns, only the ones that clobber the stack canary. However, you should always be able to tell from the Oops output if you've suffered a stack overrun - that's the case if the stack pointer is below task->stack.
You can determine the process stack size with the ulimit command. I get 8192 KiB on my system:
$ ulimit -s
8192
For processes, you can control the stack size via the ulimit command (the -s option). For threads, the default stack size varies a lot, but you can control it via a call to pthread_attr_setstacksize() (assuming you are using pthreads), as in the sketch below.
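A minimal pthreads sketch (error handling abbreviated) that gives one thread a 1 MiB stack instead of the platform default:

    #include <pthread.h>
    #include <stdio.h>

    static void *worker(void *arg)
    {
        (void)arg;
        puts("running with a custom stack size");
        return NULL;
    }

    int main(void)
    {
        pthread_attr_t attr;
        pthread_t tid;

        pthread_attr_init(&attr);
        /* must be at least PTHREAD_STACK_MIN; 1 MiB here */
        pthread_attr_setstacksize(&attr, 1024 * 1024);

        pthread_create(&tid, &attr, worker, NULL);
        pthread_join(tid, NULL);
        pthread_attr_destroy(&attr);
        return 0;
    }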
As for the interrupt using the userland stack, I somewhat doubt it, as accessing userland memory is kind of a hassle from the kernel, especially from an interrupt routine. But I don't know for sure.