Correctly dynamic loading PIEs - linux

Many discussions like this and this have warned us with examples that trying to dlopen a PIE could never be correct. The reasons are various: copy relocations, TLS, etc.
However, these problems can be circumvented if we loose the restriction. This question showed us compiling with fPIC can eliminate copy relocation, and TLS seems to work alright.
This brings up the question about how far we are from correctly dynamic loading a PIE. I agree with the idea again in link 1:
Bottom line: this was never designed to work, and you just happened to not step on many of the land-mines, so you thought it is working, when in fact you were exercising undefined behavior.
But I'm more interesting about WHY we could not do that, instead of another failing example.
More specifically, users could write their own runtime dynamic linker as this comment suggest, which could make some strong assumptions or compromises just for this purpose. Yet this requires extremely broad knowledge on compiling, linking and loading, some of which are known to be poorly documented.
So again, how do users correctly dynamic load PIEs, or at least how can they try to find a way to do that(or not to do that)?

But I'm more interesting about WHY we could not do that, instead of another failing example.
Because the designers of GLIBC didn't intend to allow for this to happen and don't consider this to be a valid use case.
More specifically, users could write their own runtime dynamic linker
Absolutely. You are free to design your own libc and the dynamic loader to allow for this use case. That requirement will add some complexity, but there is no fundamental reason it can't be done.
You may also find an existing alternate libc implementation which doesn't have this restriction (either because it has been designed in, or because the designers forgot to enforce it, as was the case with GLIBC before this patch).
how do users correctly dynamic load PIEs
They don't.
how can they try to find a way to do that(or not to do that)?
The usual solution is to "not do that", and in fact the need to "do that" seems to be very esoteric.
Why do you need to dlopen a PIE executable in the first place?

Related

Relation between MSVC Compiler & linker option for COMDAT folding

This question has some answers on SO but mine is slightly different. Before marking as duplicate, please give it a shot.
MSVC has always provided the /Gy compiler option to enable identical functions to be folded into COMDAT sections. At the same time, the linker also provides the /OPT:ICF option. Is my understanding right that these two options must be used in conjunction? That is, while the former packages functions into COMDAT, the latter eliminates redundant COMDATs. Is that correct?
If yes, then either we use both or turn off both?
Answer from someone who communicated with me off-line. Helped me understand these options a lot better.
===================================
That is essentially true. Suppose we talk just C, or C++ but with no member functions. Without /Gy, the compiler creates object files that are in some sense irreducible. If the linker wants just one function from the object, it gets them all. This is specially a consideration in programming for libraries, such that if you mean to be kind to the library's users, you should write your library as lots of small object files, typically one non-static function per object, so that the user of the library doesn't bloat from having to carry code that actually never executes.
With /Gy, the compiler creates object files that have COMDATs. Each function is in its own COMDAT, which is to some extent a mini-object. If the linker wants just one function from the object, it can pick out just that one. The linker's /OPT switch gives you some control over what the linker does with this selectivity - but without /Gy there's nothing to select.
Or very little. It's at least conceivable that the linker could, for instance, fold functions that are each the whole of the code in an object file and happen to have identical code. It's certainly conceivable that the linker could eliminate a whole object file that contains nothing that's referenced. After all, it does this with object files in libraries. The rule in practice, however, used to be that if you add a non-COMDAT object file to the linker's command line, then you're saying you want that in the binary even if unreferenced. The difference between what's conceivable and what's done is typically huge.
Best, then, to stick with the quick answer. The linker options benefit from being able to separate functions (and variables) from inside each object file, but the separation depends on the code and data to have been organised into COMDATs, which is the compiler's work.
===================================
As answered by Raymond Chen in Jan 2013
As explained in the documentation for /Gy, function-level linking
allows functions to be discardable during the "unused function" pass,
if you ask for it via /OPT:REF. It does not alter the actual classical
model for linking. The flag name is misleading. It's not "perform
function-level linking". It merely enables it by telling the linker
where functions begin and end. And it's not so much function-level
linking as it is function-level unlinking. -Raymond
(This snippet might make more sense with some further context:here are the posts about classical linking model:1, 2
So in a nutshell - yes. If you activate one switch without the other, there would be no observable impact.

Why is "cabal build" so slow compared with "make"?

If I have a package with several executables, which I initially build using cabal build. Now I change one file that impacts just one executable, cabal seems to take about a second or two to examine each executable to see if it's impacted or not. On the other hand, make, given an equivalent number of executables and source files, will determine in a fraction of a second what needs to be recompiled. Why the huge difference? Is there a reason, cabal can't just build its own version of a makefile and go from there?
Disclaimer: I'm not familiar enough with Haskell or make internals to give technical specifics, but some web searching does offer some insight that lines up with my proposal (trying to avoid eliciting opinions by providing references). Also, I'm assuming your makefile is calling ghc, as cabal apparently would.
Proposal: I believe there could be several key reasons, but the main one is that make is written in C, whereas cabal is written in Haskell. This would be coupled with superior dependency checking from make (although I'm not sure how to prove this without looking at the source code). Other supporting reasons, as found on the web:
cabal tries to do a lot more than simply compiling, e.g. appears to take steps with regard to packaging (https://www.haskell.org/cabal/)
cabal is written in haskell, although the run time is written in C (https://en.wikipedia.org/wiki/Glasgow_Haskell_Compiler)
Again, not being overly familiar with make internals, make may simply have a faster dependency checking mechanism, thereby better tracking these changes. I point this out because from the OP it sounds like there is a significant enough difference to where cabal may be doing a blanket check against all dependencies. I suspect this would be the primary reason for the speed difference, if true.
At any rate, these are open source and can be downloaded from their respective sites (haskell.org/cabal/ and savannah.gnu.org/projects/make/) allowing anyone to examine specifics of the implementations.
It is also likely one could see a lot of variance in speed based upon the switches passed to the compilers in use.
HTH at least point you in the right direction.

How to inspect Haskell bytecode

I am trying to figure out a bug (a serious performance downgrade). Unfortunately, I wasn't able to figure out why by going back many different versions of my code.
I am suspecting it could be some modifications to libraries that I've updated, not to mention in the meanwhile I've updated to GHC 7.6 from 7.4 (and if anybody knows if some laziness behavior has changed I would greatly appreciate it!).
I have an older executable of this code that does not have this bug and thus I wonder if there are any tools to tell me the library versions I was linking to from before? Like if it can figure out the symbols, etc.
GHC creates executables, which are notoriously hard to understand... On my Linux box I can view the assembly code by typing in
objdump -d <executable filename>
but I get back over 100K lines of code from just a simple "Hello, World!" program written in Haskell.
If you happen to have the GHC .hi files, you can get some information about the executable by typing in
ghc --show-iface <hi filename>
This won't give you the assembly code, but you can get some extra information that may prove useful.
As I mentioned in the comment above, on Linux you can use "ldd" to see what C-system libraries you used in the compile, but that is also probably less than useful.
You can try to use a disassembler, but those are generally written to disassemble to C, not anything higher level and certainly not Haskell. That being said, GHC compiles to C as an intermediary (at least it used to; has that changed?), so you might be able to learn something.
Personally I often find view system calls in action much more interesting than viewing pure assembly. On my Linux box, I can view all system calls by running using strace (use Wireshark for the network traffic equivalent):
strace <program executable>
This also will generate a lot of data, so it might only be useful if you know of some specific place where direct real world communication (i.e., changes to a file on the hard disk drive) goes wrong.
In all honesty, you are probably better off just debugging the problem from source, although, depending on the actual problem, some of these techniques may help you pinpoint something.
Most of these tools have Mac and Windows equivalents.
Since much has changed in the last 9 years, and apparently this is still the first result a search engine gives on this question (like for me, again), an updated answer is in order:
First of all, yes, while Haskell does not specify a bytecode format, bytecode is also just a kind of machine code, for a virtual machine. So for the rest of the answer I will treat them as the same thing. The “Core“ as well as the LLVM intermediate language, or even WASM could be considered equivalent too.
Secondly, if your old binary is statically linked, then of course, no matter the format your program is in, no symbols will be available to check out. Because that is what linking does. Even with bytecode, and even with just classic static #include in simple languages. So your old binary will be no good, no matter what. And given the optimisations compilers do, a classic decompiler will very likely never be able to figure out what optimised bits used to be partially what libraries. Especially with stream fusion and such “magic”.
Third, you can do the things you asked with a modern Haskell program. But you need to have your binaries compiled with -dynamic and -rdynamic, So not only the C-calling-convention libraries (e.g. .so), and the Haskell libraries, but also the runtime itself is dynamically loaded. That way you end up with a very small binary, consisting of only your actual code, dynamic linking instructions, and the exact data about what libraries and runtime were used to build it. And since the runtime is compiler-dependent, you will know the compiler too. So it would give you everything you need, but only if you compiled it right. (I recommend using such dynamic linking by default in any case as it saves memory.)
The last factor that one might forget, is that even the exact same compiler version might behave vastly differently, depending on what IT was compiled with. (E.g. if somebody put a backdoor in the very first version of GHC, and all GHCs after that were compiled with that first GHC, and nobody ever checked, then that backdoor could still be in the code today, with no traces in any source or libraries whatsoever. … Or for a less extreme case, that version of GHC your old binary was built with might have been compiled with different architecture options, leading to it putting more optimised instructions into the binaries it compiles for unless told to cross-compile.)
Finally, of course, you can profile even compiled binaries, by profiling their system calls. This will give you clues about which part of the code acted differently and how. (E.g. if you notice that your new binary floods the system with some slow system calls where the old one just used a single fast one. A classic OpenGL example would be using fast display lists versus slow direct calls to draw triangles. Or using a different sorting algorithm, or having switched to a different kind of data structure that fits your work load badly and thrashes a lot of memory.)

How are the Haddock module fields Portability, Stability and Maintainer used?

In lots of Haddock-generated module documentation (e.g. Prelude), a small box in the top-right can be seen, containing portability, stability and maintainer information:
From looking at the source code to such modules and experimentation, I confirmed that this information is generated from lines like the following in the module description:
-- Maintainer : libraries#haskell.org
-- Stability : stable
-- Portability : portable
There are several strange things about this:
The fields only seem to work in this order — any fields put out of order are simply treat as part of the module description itself. This is despite the fact that the order in the source file is the opposite of the order in the generated documentation!
I have been unable to find any official documentation of these fields. There is a Cabal package property named stability, the example values of which match the values I've seen in the equivalent Haddock fields, but beyond that, I've found nothing.
So: How are these fields intended to be used, and are they documented anywhere?
In particular, I'd like to know:
The full list of commonly-used values for Portability and Stability. This HaskellWiki page has a list, but I'd like to know where this list originated from.
The criteria for deciding whether a module is portable or non-portable. In particular, the package I would like the answers to these questions for, acme-strfry, is an FFI binding to strfry, a function only available in glibc. Is the package non-portable, because it only works on glibc systems, or portable, because it does not use any Haskell language extensions? The common usage seems to imply the latter.
Why a specific order of fields is required in the source file, and why it's the opposite of the ordering in the generated documentation.
Oh, I thought those fields were from the cabal package description. They don't seem to be documented at all on Haddock's docs. I've found this, which doesn't really answer your question but:
http://trac.haskell.org/haddock/ticket/71
So if it's freeform anyway, why not just write "non-portable (depends on glibc)"? I've seen even "portable (depends on ghc)", which is odd. I also wonder what happens with modules that were non-portable due to non-Haskell98 extension Foo, after Foo was added to Haskell 2010.
Note that the Cabal documenation you link to also says stability is freeform. Of course, even if haddock or cabal were to define what are the acceptable values, it'd still be up to the maintainer to subjectively select one.
About the specific order, you should probably just ask at the haddock mailing list, or check the source and file a bug.
PS: strfry is an invaluable contribution to the Haskell community, but it should be pure and portable, don't you think?
Ah yes, one of the more obscure and crufty features of Haddock.
As best as I can tell, it's just an undocumented hack. There's no sane reason why the order of the fields should matter, but it does. The specific choice of formatting (i.e., as a special form inside the module comment rather than as a separate block of some kind) isn't the best either. My guess is that somebody wanted to quickly add this feature one day, so they hacked up something minimal but functioning, and left it at that. (Without bothering to document it.)
Personally, I just don't bother with these fields at all. The information is available from Cabal, so I don't bother duplicating it in Haddock as well. Perhaps some day Cabal will pass this information to Haddock automatically...

Code obfuscation usage in various languages

I recently learned about code obfuscation. Its nice thing to do, when you have spare time, but I have different question. Why to do it?
First, there are languages in which I am sure its great thing - interpreted ones, like php, JavaScript and much more. There it seems like a good and more secure thing.
Second, there are languages where this seems to have no real effect for me - all the native code compiled languages. Take C for example. when compiled, all the variable names, function names, most of obfuscation techniques go away. If some can make it into native code, it would be things like recursion instead of for cycles and so, but disassembled code will anyway have instead of names some disassembler-generated identifiers, right?
And last category are languages I am not quite sure about. And that's the main reason I ask. These languages would be Java, C# (.NET),and the last Silverlight used in WP7. I ask because I read some article that state that on WP7 apps, code obfuscation helps preventing code from hacking. But I always thought of byte-code as being very similar to standard assembler codes, therefore again not having any information about real pre-compilation variable names, function names, etc. So, where is the truth?
Do it if you want, but don't expect any determined person to be scared away by it. There exist de-obfuscators, people can read obfuscated code as well (just as there are people who can read optimized assembly and reconstruct the original C code). Code obfuscation just gives you a false sense of security and might deter a person who is just curious (instead of deterring those who are serious about stealing your code). All it gives you is a false sense of security but no real one. Schneier aptly names this "security theater".
Yes, many modern languages that retain more information about the source can be obfuscated better than those that are compiled right to machine code. For the latter the compiler already does quite a good job with optimization. Your notion of bytecode being akin to traditional assembler is slightly wrong here, though. Especially .NET bytecode retains enough metadata to reconstruct the original source almost exactly (see Reflector). What isn't retained there are the names of local variables and arguments to methods. But you still need and retain the method and class names.
Another issue you should be aware of: If you give your customers an obfuscated executable and your program crashes, make sure you have a way of getting the real stacktrace back instead of the obfuscated one. Saying "Sorry, I cannot determine the root cause of why my program killed hours of your work since I chose to obfuscate it" isn't going to cut it, I guess :-)
Obfuscation is a common technique for mobile applications where you have hardware restrictions. Obfuscated code tends to have shorter identifiers and therefore smaller binaries.

Resources