In Linux, when a running program attempts to use more stack space than the limit (a stack overflow), that usually results in a "segmentation fault" error and the execution is aborted.
Is it guaranteed that exceeding the stack space limit will always cause a segmentation fault error? Or could it happen that the program continues to run, possibly with some erroneous behavior due to data having been corrupted?
Another way of putting this: if a program misbehaves by producing wrong results but without a crash, can the cause still be a stack overflow?
Edit: to clarify, this question is not about "stack buffer overflow", it is about stack overflow, when the stack space used by the program exceeds the stack size limit (the limit that is in Linux given by ulimit -s).
A stack overflow turning into an access violation requires memory management hardware of some sort. Without hardware-assisted memory protection, an overgrown stack will collide with some other memory allocation, causing mutual corruption.
On demand-paged virtual memory operating systems, the upper limit of the stack is protected by a guard page: a page of virtual memory which is reserved (will not be allocated to anything) and marked "not present" so that accessing it generates a violation. A guard page is only so many bytes wide; a stack pointer can still accidentally increment over the guard page and land in some unrelated writable memory (such as a mapping belonging to a heap allocation) where havoc will be wreaked without necessarily triggering any memory access violation.
In the C language we can easily cause large stack increments by declaring large, uninitialized non-static block-scoped arrays, like char array[8192]; // (twice as large as a 4096 byte guard page). Using features like alloca or C99 variable-length arrays, we can do this dynamically: we can write a program which reads an integer value as a run-time input, and increments the stack by that much.
I debugged a problem many years ago whereby third-party code had debug logging macros, inside whose expansions there was a temporay array like char print_buf[8192] that was used for formatting messages. This was used in a multi-threaded application with many threads, whose stacks were reduced to just 64 kilobytes in size. Thanks to this print_buf, a thread's overflown stack leaped right past the guard page, and landed in another thread's stack, corrupting its local variables, causing the proverbial "hilarity to ensue".
Related
I'm doing computations with huge arrays and for some of this computations I need an increased stack size! Is there any downside of setting the stack size to unlimited (ulimit -s unlimited) in my ~/.bashrc?
The program is written in fortran(F77 & F90) and parallelized with MPI. Some of my arrays have more than 2E7 entries and when I use a small number of cores with MPI it crashes with segmentation fault.
The array size stays the same through the whole computation therefore I setted them to fixes value:
real :: p(200,200,400)
integer :: ib,ie,jb,je,kb,ke
...
ib=1;ie=199
jb=2;je=198
kb=2;ke=398
call SOLVE_POI_EQ(rank,p(ib:ie,jb:je,kb:ke),R)
Setting the stacksize to unlimited likely won't help you. You are allocating a chunk of 64MB on the stack, and likely don't fill it from the top, but from the bottom.
This is important, because the OS grows the stack as you go. Whenever it detects a page-fault right below the stack segment, it will assume that you need more space, and silently insert a new page. The size of this trigger-region within your address-space is limited, though, and I doubt that its larger than 64 MB. Since you index variables are likely placed below your array on the stack, accessing them already does the 64 MB jump that kills your process.
Just make your array allocatable, add the corresponding allocate() statement, and you should be fine.
Stack size is never really unlimited, so you would still have some failures. And your code still won't be portable to Linux systems with smaller (or normal-sized) stacks.
BTW, you should explain which kind of programs are you running, show some source code.
If coding in C++, using standard containers should help a lot (regarding actual stack consumption). For example, a local (stack allocated) std::vector<int> v(10000); (instead of int v[10000];) has its data allocated on the heap (and deallocated by the destructor when you exit from the block defining it)
It would be much better to improve your programs to avoid excessive stack consumption. The need of a lot of stack space is really a bug that you should try to correct. A typical rule of thumb is to have call frames smaller than a few kilobytes (so allocate any larger data on the heap).
You might consider also using the Boehm conservative garbage collector: you would use GC_MALLOC instead of malloc (and you would heap allocate large data structure using GC_MALLOC) but you won't have to bother to free your (GC-heap allcoated) data.
I'm working on pthread in android NDK that process video data through network.
I meet problem that is 'stack corruption detected : aborted'. So I set -fstack-check in application.mk, and FATAL SIGNAL 11 blabla.. again.
My conclusion in this problem is about stack size.
When I use window thread, its stack size sets 1kb as default, increase automatically.
but, I don't know about pthread.
Is pthread increase stack size automatically?
p.s. I attached this thread to JavaVM.
On stock Android, pthread stacks are allocated at 1MB by default. (Because of the way the system works, only the parts of the stack that are actually touched get physical pages, so for many threads there's actually only a few KB in use.)
The "stack corruption" message indicates that something has trashed the stack, not that you've run off the end. One way to encounter this is to write off the end of a stack-allocated array. It might be useful to decode the stack trace you get in the log file and see what method it's in when it fails.
I'm looking for a good description of stacks within the linux kernel, but I'm finding it surprisingly difficult to find anything useful.
I know that stacks are limited to 4k for most systems, and 8k for others. I'm assuming that each kernel thread / bottom half has its own stack. I've also heard that if an interrupt goes off, it uses the current thread's stack, but I can't find any documentation on any of this. What I'm looking for is how the stacks are allocated, if there's any good debugging routines for them (I'm suspecting a stack overflow for a particular problem, and I'd like to know if its possible to compile the kernel to police stack sizes, etc).
The reason that documentation is scarce is that it's an area that's quite architecture-dependent. The code is really the best documentation - for example, the THREAD_SIZE macro defines the (architecture-dependent) per-thread kernel stack size.
The stacks are allocated in alloc_thread_stack_node(). The stack pointer in the struct task_struct is updated in dup_task_struct(), which is called as part of cloning a thread.
The kernel does check for kernel stack overflows, by placing a canary value STACK_END_MAGIC at the end of the stack. In the page fault handler, if a fault in kernel space occurs this canary is checked - see for example the x86 fault handler which prints the message Thread overran stack, or stack corrupted after the Oops message if the stack canary has been clobbered.
Of course this won't trigger on all stack overruns, only the ones that clobber the stack canary. However, you should always be able to tell from the Oops output if you've suffered a stack overrun - that's the case if the stack pointer is below task->stack.
You can determine the process stack size with the ulimit command. I get 8192 KiB on my system:
$ ulimit -s
8192
For processes, you can control the stack size of processes via ulimit command (-s option). For threads, the default stack size varies a lot, but you can control it via a call to pthread_attr_setstacksize() (assuming you are using pthreads).
As for the interrupt using the userland stack, I somewhat doubt it, as accessing userland memory is a kind of a hassle from the kernel, especially from an interrupt routine. But I don't know for sure.
Given a double-free error (reported by valgrind), is there a way to find out where the memory was allocated? Valgrind only tells me the location of the deallocation site (i.e. the call to free()), but I would like to know where the memory was allocated.
To get Valgrind keep tracks of allocation stack traces, you have to use options:
--track-origins=yes --keep-stacktraces=alloc-and-free
Valgrind will then report allocation stack under Block was alloc'd at section, just after Address ... inside a block of size x free'd alert.
In case your application is large, --error-limit=no --num-callers=40 options may be useful too.
The first check I would do is verifying that the error is indeed due to a double-free error. Sometimes, running a program (including with valgrind) can show a double-free error while in reality, it's a memory corruption problem (for example a memory overflow).
The best way to check is to apply the advice detailed in the answers : How to track down a double free or corruption error in C++ with gdb.
First of all, you can try to compile your program with flags fsanitize=address -g. This will instrument the memory of the program at runtime to keep track of all allocations, detect overflows, etc.
In any case, if the problem is indeed a double-free, the error message should contain all the necessary information for you to debug the problem.
What is the maximum stack size allowed for a thread in C#.NET 2.0? Also, does this value depend on the version of the CLR and/or the bitness (32 or 64) of the underlying OS?
I have looked at the following resources msdn1 and msdn2
public Thread(
ThreadStart start,
int maxStackSize
)
The only information I can see is that the default size is 1 megabytes and in the above method, if maxStackSize is '0' the default maximum stack size specified in the header for the executable will be used, what's the maximum value that we can change the value in the header upto? Also is it advisable to do so? Thanks.
For the record, this fits Raymond Chen's category of "if you need to know then you are doing something wrong".
The default stack size for threads running 64-bit code is 4 megabytes, 1 megabyte for 32-bit code. While the Thread constructor lets you pass a integer value up to int.MaxValue, you'll never get that on a 32-bit machine. The stack must fit in an available hole in the virtual memory address space, that usually tops out at ~600 MB early in the process lifetime. Rapidly getting smaller as you allocate memory and fragment the address space.
Allocating more than the default is quite unnecessary. You might contemplate doing this when you have a heavily recursive method that blows the stack. Don't, fix the algorithm or you'll blow it anyway when the job gets bigger.
The smallest stack that .NET lets you choose is 250 KB. It silently rounds it up if you pass a value that's smaller. Necessary because both the jitter and the garbage collector need stack space to get their job done. Again, doing so should be quite unnecessary. If you contemplate doing so because you have a lot of threads and consume all virtual memory with their stacks then you have too many threads. A StackOverflowException is one of the nastiest runtime exceptions you can get. Process death is immediate and untrappable.
The stack size for the main thread is determined by an option in the EXE header. The compiler doesn't have an option to change it, you have to use editbin.exe /stack to patch the .exe header.
I am unaware of what the maximum is, but MSDN speaks to whether you should do it or not:
Avoid using this constructor overload. The default stack size used by the Thread(ThreadStart) constructor overload is the recommended stack size for threads. If a thread has memory problems, the most likely cause is programming error, such as infinite recursion.
I have never had a StackOverflow occur in C# which was not due to infinite recursion. If there truly was a case where recursion went to that depth, I would consider replacing it with iteration.