I have a file.hsc in my stack project that would ordinarily need to be pre-processed via hsc2hs file.hsc.
Running stack build instead causes the file to be read as a normal Haskell file, without preprocessing.
Question: Is there a way to use stack and hsc2hs together? Ideally, one would just run stack build and everything would just "work".
I am learning about stack frames, and I wish to know if we can write a function which shows how the stack gets corrupted. I wish to see an example in C/C++, not assembly language.
If we avoid out-of-bounds array indexing and bad reads/writes through indexed addresses, is there still a possibility that the stack gets corrupted? Any quick samples?
Thanks.
You may be interested in the excellent article "Smashing the stack for fun and profit" (http://insecure.org/stf/smashstack.html), which explains how to overwrite the return address stored on the stack and execute additional code.
This is a classic.
Good luck!
I have device driver code in Linux. Its execution involves a lot of functions and different call flows.
For debugging, I want to know the call stack at some points in the code.
E.g.
let's say
A calls B, B calls C, then in function C, at the line where I want to know the stack, it should print something like
A-->B-->C
Is it possible to do this?
Let me know your answers.
The dump_stack() function will be helpful: call it at the point of interest and the kernel prints the current call trace to the kernel log.
Sample usage:
http://lxr.free-electrons.com/source/sound/soc/codecs/tpa6130a2.c#L393
Context: I'm building a programming language (called Lima), and I want to know what options there are to have the system keep track of the stack such that I can generate proper stack traces (with the right line-numbers from the original source). Note that this is not meant to be a duplicate of this related but limited question: How do stack traces get generated?
My fundamental question is: Does the program need to make an update as to what line number it is on between every line executed?
It seems to me that the unfortunate answer here is yes.
I'm also wondering whether I can leverage anything in the environment I'm compiling to for stack traces. Right now I'm compiling the language to JavaScript (and running in Rhino), but I'm looking for a general answer as to whether it's even theoretically possible for the underlying environment to help you in any way here.
If the underlying system supports stack traces, can you make a static mapping from that system's line numbers to yours?
My understanding is that the stack has the return address stored as each subroutine call is made. That address is used in a symbol lookup when generating the stack trace. For a scripting language I guess you'd have to obtain the file and line number / line position when making a subroutine call and put it on the stack. I'm guessing that scripting languages would construct a hash table to look up this information, to keep the actual stack more compact.
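To make the idea concrete, here is a minimal sketch of a separate "shadow" stack that records the call site only when a subroutine call is made, rather than after every line. All names here (CallSite, push_call, pop_call, format_trace, the prog.lima file name) are made up for illustration, and a real interpreter would still need to know the current line of the innermost frame at the point where the error is raised.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* One record per active call in the interpreted program (hypothetical layout). */
typedef struct {
    const char *function;  /* name of the function being called */
    const char *file;      /* source file containing the call site */
    int line;              /* source line of the call site */
} CallSite;

#define MAX_DEPTH 256

static CallSite shadow_stack[MAX_DEPTH];
static int shadow_depth = 0;

/* Record the call site when the interpreter performs a subroutine call. */
static void push_call(const char *function, const char *file, int line) {
    assert(shadow_depth < MAX_DEPTH);
    shadow_stack[shadow_depth++] = (CallSite){ function, file, line };
}

/* Drop the record when the subroutine returns. */
static void pop_call(void) {
    assert(shadow_depth > 0);
    shadow_depth--;
}

/* Write the trace into buf, innermost call first; returns the length written. */
static size_t format_trace(char *buf, size_t size) {
    assert(size > 0);
    size_t used = 0;
    buf[0] = '\0';
    for (int i = shadow_depth - 1; i >= 0 && used < size; i--) {
        int n = snprintf(buf + used, size - used, "at %s (%s:%d)\n",
                         shadow_stack[i].function, shadow_stack[i].file,
                         shadow_stack[i].line);
        if (n < 0) break;
        used += (size_t)n;
    }
    /* snprintf truncates safely; report at most size - 1 characters. */
    return used < size ? used : size - 1;
}
```

The interpreter would call push_call just before entering a function and pop_call right after it returns, so the per-line cost is zero and the per-call cost is one small record.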
In languages with automatic garbage collection like Haskell or Go, how can the garbage collector find out which values stored on the stack are pointers to memory and which are just numbers? If the garbage collector just scans the stack and assumes all addresses to be references to objects, a lot of objects might get incorrectly marked as reachable.
Obviously, one could add a value to the top of each stack frame that described how many of the next values are pointers, but wouldn't that cost a lot of performance?
How is it done in reality?
Some collectors assume everything on the stack is a potential pointer (like Boehm GC). This turns out to be not as bad as one might expect, but is clearly suboptimal. More often in managed languages, some extra tagging information is left with the stack to help the collector figure out where the pointers are.
Remember that in most compiled languages, the layout of a stack frame is the same every time you enter a function, therefore it is not that hard to ensure that you tag your data in the right way.
The "bitmap" approach is one way of doing this. Each bit of the bitmap corresponds to one word on the stack. If the bit is a 1 then the location on the stack is a pointer, and if it is a 0 then the location is just a number from the point of view of the collector (or something along those lines). The exceptionally well-written GHC runtime and calling conventions use a one-word layout for most functions, such that a few bits communicate the size of the stack frame, with the rest serving as the bitmap. Larger stack frames need a multi-word structure, but the idea is the same.
The point is that the overhead is low, since the layout information is computed at compile time, and then included in the stack every time a function is called.
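As a rough illustration of the one-word bitmap idea, here is a small sketch in C. The field widths (6 size bits, 58 bitmap bits), the "1 means pointer" convention, and the names make_descriptor and pointer_slots are assumptions chosen for the example; they are not GHC's actual layout.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical one-word frame descriptor:
 *   low 6 bits  : number of payload words in the frame (0..58)
 *   bits 6..63  : bitmap, bit i set => payload word i holds a pointer */
#define SIZE_BITS   6
#define BITMAP_BITS (64 - SIZE_BITS)
#define SIZE_MASK   ((UINT64_C(1) << SIZE_BITS) - 1)

/* The compiler would emit this constant once per function. */
static uint64_t make_descriptor(unsigned size, uint64_t pointer_bitmap) {
    assert(size <= BITMAP_BITS);
    assert((pointer_bitmap >> BITMAP_BITS) == 0);
    return (pointer_bitmap << SIZE_BITS) | size;
}

/* The collector decodes the descriptor and writes the indices of the
 * pointer-holding words into out; returns how many there are. */
static unsigned pointer_slots(uint64_t descriptor, unsigned *out) {
    unsigned size = (unsigned)(descriptor & SIZE_MASK);
    uint64_t bitmap = descriptor >> SIZE_BITS;
    unsigned found = 0;
    for (unsigned i = 0; i < size; i++) {
        if ((bitmap >> i) & 1) {
            out[found++] = i;
        }
    }
    return found;
}
```

For a frame of four words where words 0 and 2 are pointers, make_descriptor(4, 0x5) yields the descriptor, and pointer_slots reports indices 0 and 2. The only run-time cost per call is pushing that one precomputed word.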
An even simpler approach is "pointer first", where all the pointers are located at the beginning of the stack. You only need to include a length prior to the pointers, or a special "end" word after them, to tell which words are pointers given this layout.
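A sketch of the "pointer first" layout with a leading length word might look like the following. The frame layout and the name collect_roots are hypothetical and only meant to show how little the collector needs to know.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical frame layout:
 *   frame[0]        : n, the number of pointer words
 *   frame[1..n]     : pointers
 *   frame[n+1..]    : raw, non-pointer words
 * Copies the pointer words into roots_out and returns n. */
static size_t collect_roots(const uintptr_t *frame, size_t frame_words,
                            uintptr_t *roots_out) {
    assert(frame_words >= 1);
    size_t n = (size_t)frame[0];
    assert(n <= frame_words - 1);
    for (size_t i = 0; i < n; i++) {
        roots_out[i] = frame[1 + i];
    }
    return n;
}
```

The collector never looks at the words after the first n, so raw integers stored there cannot be mistaken for references.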
Interestingly, trying to get this management information onto the stack produces a host of problems related to interop with C. For example, it is suboptimal to compile high-level languages to C, since even though C is portable, it is hard to carry this kind of information. Optimizing compilers designed for C-like languages (GCC, LLVM) may restructure the stack frame, producing problems, so the GHC LLVM backend uses its own "stack" rather than the LLVM stack, which costs it some optimizations. Similarly, the boundary between C code and "managed" code needs to be constructed carefully to keep from confusing the GC.
For this reason, when you create a new thread on the JVM you actually create two stacks (one for Java, one for C).
The Haskell stack uses a single word of memory in each stack frame describing (with a bitmap) which of the values in that stack frame are pointers and which are not. For details, see the "Layout of the stack" article and the "Bitmap layout" article from the GHC Commentary.
To be fair, a single word of memory really isn't much cost, all things considered. You can think of it as just adding a single variable to each method; that's not all that bad.
There exist GCs that assume that every bit pattern that is the address of something the GC is managing is in fact a pointer (and so don't release the something). This can actually work pretty well, because pointers are usually bigger than small common integers, and usually have to be aligned. But yes, this can cause collection of some objects to be delayed. The Boehm collector for C works this way, because it's library-based and so doesn't get any specific help from the compiler.
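The test such a conservative collector applies to each stack word can be sketched roughly as below. This is only an illustration of the idea (range check plus alignment check); HeapRange, looks_like_pointer and count_candidates are made-up names, and the real Boehm collector is considerably more sophisticated.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Address range managed by the collector: [start, end). Hypothetical type. */
typedef struct {
    uintptr_t start;
    uintptr_t end;
} HeapRange;

/* A word is treated as a pointer if it falls inside the managed heap and
 * is suitably aligned. alignment must be a power of two. */
static int looks_like_pointer(uintptr_t word, HeapRange heap, uintptr_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return word >= heap.start && word < heap.end
        && (word & (alignment - 1)) == 0;
}

/* Count the stack words the collector would have to treat as roots. */
static size_t count_candidates(const uintptr_t *stack, size_t n,
                               HeapRange heap, uintptr_t alignment) {
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (looks_like_pointer(stack[i], heap, alignment)) {
            count++;
        }
    }
    return count;
}
```

A small integer such as 42 fails the range check and a misaligned value fails the alignment check, but an integer that happens to equal an aligned heap address passes, which is exactly the case where an object is kept alive longer than necessary.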
There are also GCs that are more tightly coupled to the language they're used in, and actually know the structure of the objects in memory. I've never read up specifically on stack frame handling, but you could record information to help the GC if the compiler and GC are designed to work together. One trick would be putting all the pointer references together and using one word per stack frame to record how many there are, which is not such a huge overhead. If you can work out what function corresponds to each stack frame without adding a word saying so, then you could have a per-function "stack frame layout map" compiled in. Another option would be to use tagged words, where you set the low-order bit of words that are not pointers to 1, which (due to address alignment) is never needed for pointers, so you can tell them apart. That means you have to shift unboxed values in order to use them, though.
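The tagged-word option can be sketched like this. The helper names (tag_int, untag_int, tag_pointer, is_pointer) are invented for the example, and the code assumes pointers are at least 2-byte aligned and that right-shifting a negative signed value is an arithmetic shift, which mainstream compilers provide but the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <stdint.h>

/* Non-pointers carry a 1 in the low bit; aligned pointers always have 0 there. */
static int is_pointer(uintptr_t word) {
    return (word & 1u) == 0;
}

/* Store an integer: shift left by one and set the tag bit.
 * The integer loses one bit of range. */
static uintptr_t tag_int(intptr_t value) {
    return ((uintptr_t)value << 1) | 1u;
}

/* Recover the integer by shifting the tag bit back out.
 * Assumes arithmetic right shift for negative values. */
static intptr_t untag_int(uintptr_t word) {
    assert(!is_pointer(word));
    return (intptr_t)word >> 1;
}

/* Pointers are stored unchanged, provided the low bit is free. */
static uintptr_t tag_pointer(const void *p) {
    uintptr_t word = (uintptr_t)p;
    assert((word & 1u) == 0);
    return word;
}
```

With this scheme the collector needs no per-frame metadata at all: it checks the low bit of each word. The price is the extra shift on every arithmetic operation on unboxed integers.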
It's important to realize that GHC maintains its own stack and does not use the C stack (other than for FFI calls). There's no portable way to access all of the contents of the C stack (for instance, on SPARC some of it is hidden away in register windows), so GHC maintains a stack where it has full control. Once you maintain your own stack you can pick any scheme to distinguish pointers from non-pointers on the stack (like using a bitmap).