I have defined an upward stack in xv6 (which had a downward stack) and want to know how I put a guard page between the stack and the heap. Is there any specific system call I can make use of? Also how can I maintain that one page address space to always lie between stack and heap?
so you know exactly from where your stack start growing up? In that case, why don't you just leave one page and just start from the next page onwards. And you might need to allocate and poison the memory with some data so that it could be detected. Like the way some of these memory overrun detection tools work. or you might need to set some custom flag to that page so that while trying to access them, you can check the flag and fault if found inappropriate.
Did I get your question correctly, btw?
Related
In Linux, when a running program attempts to use more stack space than the limit (a stack overflow), that usually results in a "segmentation fault" error and the execution is aborted.
Is it guaranteed that exceeding the stack space limit will always cause a segmentation fault error? Or could it happen that the program continues to run, possibly with some erroneous behavior due to data having been corrupted?
Another way of putting this: if a program misbehaves by producing wrong results but without a crash, can the cause still be a stack overflow?
Edit: to clarify, this question is not about "stack buffer overflow", it is about stack overflow, when the stack space used by the program exceeds the stack size limit (the limit that is in Linux given by ulimit -s).
A stack overflow turning into an access violation requires memory management hardware of some sort. Without hardware-assisted memory protection, an overgrown stack will collide with some other memory allocation, causing mutual corruption.
On demand-paged virtual memory operating systems, the upper limit of the stack is protected by a guard page: a page of virtual memory which is reserved (will not be allocated to anything) and marked "not present" so that accessing it generates a violation. A guard page is only so many bytes wide; a stack pointer can still accidentally increment over the guard page and land in some unrelated writable memory (such as a mapping belonging to a heap allocation) where havoc will be wreaked without necessarily triggering any memory access violation.
In the C language we can easily cause large stack increments by declaring large, uninitialized non-static block-scoped arrays, like char array[8192]; // (twice as large as a 4096 byte guard page). Using features like alloca or C99 variable-length arrays, we can do this dynamically: we can write a program which reads an integer value as a run-time input, and increments the stack by that much.
I debugged a problem many years ago whereby third-party code had debug logging macros, inside whose expansions there was a temporay array like char print_buf[8192] that was used for formatting messages. This was used in a multi-threaded application with many threads, whose stacks were reduced to just 64 kilobytes in size. Thanks to this print_buf, a thread's overflown stack leaped right past the guard page, and landed in another thread's stack, corrupting its local variables, causing the proverbial "hilarity to ensue".
I'm working on pthread in android NDK that process video data through network.
I meet problem that is 'stack corruption detected : aborted'. So I set -fstack-check in application.mk, and FATAL SIGNAL 11 blabla.. again.
My conclusion in this problem is about stack size.
When I use window thread, its stack size sets 1kb as default, increase automatically.
but, I don't know about pthread.
Is pthread increase stack size automatically?
p.s. I attached this thread to JavaVM.
On stock Android, pthread stacks are allocated at 1MB by default. (Because of the way the system works, only the parts of the stack that are actually touched get physical pages, so for many threads there's actually only a few KB in use.)
The "stack corruption" message indicates that something has trashed the stack, not that you've run off the end. One way to encounter this is to write off the end of a stack-allocated array. It might be useful to decode the stack trace you get in the log file and see what method it's in when it fails.
I recently ran into a bug with the "linux stack" and the "linux stack size". I came across a blog directing me to try
ulimit -a
to see what the limit for my box was, and it was set to 8192kb which seems to be the default.
What is the "linux stack"? How does it work, what does it store, what does it do?
The short answer is:
When programs on your linux box run, they add and remove data from the stack on a regular basis as the programs function. The stack size, referes to how much space is allocated in memory for the stack. If you increase the stack size, that allows the program to increase the number of routines that can be called. Each time a function is called, data can be added to the stack (stacked on top of the last routines data.)
Unless the program is a very complex, or designed for a special purpose, a stack size of 8192kb is normally fine. Some programs like graphics processing programs require you to increase the size of the stack to function. As they may store a lot of data on the stack.
Feel free to increase the stack size for those applications, its not a problem. To do so, use
ulimit -s bytes
BTW, What is a StackOverflowError?
Note: this question relates to stack overflows (think infinite recursion), NOT buffer overflows.
If I write a program that is correct, but it accepts an input from the Internet that determines the level of recursion in a recursive function that it calls, is that potentially sufficient to allow someone to compromise the machine?
I know someone might be able to crash the process by causing a stack overflow, but could they inject code? Or does the c runtime detect the stack overflow condition and abort cleanly?
Just curious...
Rapid Refresher
First off, you need to understand that the fundamental units of protection in modern OSes are the process and the memory page. Processes are memory protection domains; they are the level at which an OS enforces security policy, and they thus correspond strongly with a running program. (Where they don't, it's either because the program is running in multiple processes or because the program is being shared in some kind of framework; the latter case has the potential to be “security-interesting” but that's 'nother story.) Virtual memory pages are the level at which the hardware applies security rules; every page in a process's memory has attributes that determine what the process can do with the page: whether it can read the page, whether it can write to it, and whether it can execute program code on it (though the third attribute is rather more rarely used than perhaps it should be). Compiled program code is mapped into memory into pages that are both readable and executable, but not writable, whereas the stack should be readable and writable, but not executable. Most memory pages are not readable, writable or executable at all; the OS only lets a process use as many pages as it explicitly asks for, and that's what memory allocation libraries (malloc() et al.) manage for you.
Analysis
Provided each stack frame is smaller than a memory page[1] so that, as the program advances through the stack, it writes to each page, the OS (i.e., the privileged part of the runtime) can at least in principle detect stack overflows reliably and terminate the program if that occurs. Basically, all that happens to do this detection is that there is a page that the program cannot write to at the end of the stack; if the program tries to write to it, the memory management hardware traps it and the OS gets a chance to intervene.
The potential problems with this come if the OS can be tricked into not setting such a page or if the stack frames can become so large and sparsely written to that the guard page is jumped over. (Keeping more guard pages would help prevent the second case with little cost; forcing variable-sized stack allocations – e.g., alloca() – to always write to the space they allocate before returning control to the program, and so detect a smashed stack, would prevent the first with some cost in terms of speed, though the writes could be reasonably sparse to keep the cost fairly small.)
Consequences
What are the consequences of this? Well, the OS has to do the right thing with memory management. (#Michael's link illustrates what can happen when it gets that wrong.) But also it is dangerous to let an attacker determine memory allocation sizes where you don't force a write to the whole allocation immediately; alloca and C99 variable-sized arrays are a particular threat. Moreover, I would be more suspicious of C++ code as that tends to do a lot more stack-based memory allocation; it might be OK, but there's a greater potential for things to go wrong.
Personally, I prefer to keep stack sizes and stack-frame sizes small anyway and do all variable-sized allocations on the heap. In part, this is a legacy of working on some types of embedded system and with code which uses very large numbers of threads, but it does make protecting against stack overflow attacks much simpler; the OS can reliably trap them and all the attacker has then is a denial-of-service (annoying, but rarely fatal). I don't know whether this is a solution for all programmers.
[1] Typical page sizes: 4kB on 32-bit systems, 16kB on 64-bit systems. Check your system documentation for what it is in your environment.
Most systems (like Windows) exit when the stack is overflowed. I don't think you are likely to see a security issue here. At least, not an elevation of privilege security issue. You could get some denial of service issues.
There is no universally correct answer... on some system the stack might grow down or up to overwrite other memory that the program's using (or another program, or the OS), but on any well designed, vaguely security-conscious OS (Linux, any common UNIX variant, even Windows) there will be no rights escalation potential. On some systems, with stack size checks disabled, the address space might approach or exceed the free virtual memory size, allowing memory exhaustion to negatively affect or even bring down the entire machine rather than just the process, but on good OSes by default there's a limit on that (e.g. Linux's limit / ulimit commands).
Worth mentioning that it's typically pretty easy to use a counter to put an arbitrary but generous limit of recursive depth too: you can use a static local variable if single-threaded, or a final parameter (conveniently defaulted to 0 if your language allows it, else have an outer caller provide 0 the first time).
Yes it is. There are algorithms to avoid recursiveness. For example in case arithmetic expressions the reverse polish notation enable you to avoid recursiveness. The main idea behind is to alter the original expression. There could be some algorythm that can help you as well.
One other problem with stack overflow, that if error handling is not appropiate, it can cause anything. To explain it for example in Java StackOverflowError is an error, and it is caught if someone catches Throwable, which is a common mistake. So error handling is a key question in case of stack overflow.
Yes, it is. Availability is an important aspect of security, that is mostly overlooked.
Don't fall into that pit.
edit
As an example of poorly understood security-consciousness in modern OSs, take a look at a relatively newly discovered vulnerability that nobody yet patched completely. There are countless other examples of privilege escalation vulnerabilities that OS developers have written off as denial of service attacks.
I am developing a multithread modular application using C programming language and NPTL 2.6. For each plugin, a POSIX thread is created. The problem is each thread has its own stack area, since default stack size depends on user's choice, this may results in huge memory consumption in some cases.
To prevent unnecessary memory usage I used something similar to this to change stack size before creating each thread:
pthread_attr_t attr;
pthread_attr_init (&attr);
pthread_attr_getstacksize(&attr, &st1);
if(pthread_attr_setstacksize (&attr, MODULE_THREAD_SIZE) != 0) perror("Stack ERR");
pthread_attr_getstacksize(&attr, &st2);
printf("OLD:%d, NEW:%d - MIN: %d\n", st1, st2, PTHREAD_STACK_MIN);
pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
/* "this" is static data structure that stores plugin related data */
pthread_create(&this->runner, &attr, (void *)(void *)this->run, NULL);
EDIT I: pthread_create() section added.
This did not work work as I expected, the stack size reported by pthread_attr_getstacksize() is changed but total memory usage of the application (from ps/top/pmap output) did not changed:
OLD:10485760, NEW:65536 - MIN: 16384
When I use ulimit -s MY_STACK_SIZE_LIMIT before starting application I achieve the expected result.
My questions are:
1-) Is there any portable(between UNIX variants) way to change (default)thread stack size after starting application(before creating thread of course)?
2-) Is it possible to use same stack area for every thread?
3-) Is it possible completely disable stack for threads without much pain?
Answers for #2 and #3 are no and no. Each thread needs a stack (where else do your local variables and return addresses go?) and they need to be unique per-thread (otherwise threads would overwrite each other's local variables and return addresses, making everybody crash).
As for #1... the set stack size call is precisely the answer for this. I suggest you figure out an acceptable size to create your threads with, and set it.
As for why things don't look right to you in top.... top is a notorious liar about memory usage. :-) Is stuff actually failing to be allocated or getting OOM-killed? Are thread creations failing? Is performance suffering and paging to disk increasing? If the answer to these questions is no, then I don't think there's much to worry about.
Update based on some comments below and above:
First, 16KB is still pretty big for something that you say doesn't need much stack space. If you really want to go small, I would be tempted to say 4096 or 8192 on x86 Linux. Second, yes you can set your CPU's stack pointer to something else.. But when you malloc() or mmap(), that's going to take up space. I don't know how you think it's going to help to set the stack pointer to something else. That said, if you really feel strongly that the thread that calls main() has too big of a stack (I would say that is slightly crazy) and that pthread_attr_setstacksize() doesn't let you get small enough (?), then maybe you can look into non-portable stuff like creating threads by calling the clone() syscall and specifying stacks based on the main thread's stack pointer, or a buffer from elsewhere, or whatever. But you're still going to need a stack for each thread and I have a feeling top is still going to disappoint you. Maybe your expectations are a little high.
I have seen this problem as well. It is unclear how the stacks are accounted for but the "extra" space is counted against your total VM and if you run up against your process boundary you are in trouble (even though you aren't using the space). It seems dependent on what version of Linux you are running (even within the 2.6 family), and whether you are 32 bit or 64 bit.