Could the compilation of the compiler affect the compiled programs?

My question probably sounds weird, but my point is: I have to compile a program using GCC. If I compile GCC itself from source, will I get a slight performance edge in software compiled with the freshly built GCC? What should I expect?

You won't get any faster programs out of a compiler built with optimizing flags. Since a program is the compiler's output, and optimizations don't change the output for a correct program, the generated programs stay the same; only the compiler itself runs faster.
You might, however, profit from newly available options if your distributor ships an incomplete compiler. Look through the GCC manual for any options you want to enable (like certain target architecture variants), and if you can't enable them in your current compiler build, there might be potential in a custom-built compiler. However, it is unlikely to be worth it.
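If you want to check this yourself, here is a minimal sketch: compile the same file with the distro compiler and with your custom build (the /opt/gcc-custom prefix and test.c are hypothetical) and diff the generated assembly. Assuming both builds are the same GCC version and configuration, the output should match apart from the version string in the .ident directive:
gcc -O2 -S -o distro.s test.c
/opt/gcc-custom/bin/gcc -O2 -S -o custom.s test.c
diff distro.s custom.s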

Not unless you're building a newer version of GCC, or enabling CLooG, Graphite, etc.

The performance difference is usually nothing, or negligible.
In very rare cases you can see a noticeable difference, but it is not always an improvement; degradation is possible too.

How to inspect Haskell bytecode

I am trying to track down a bug (a serious performance regression). Unfortunately, I wasn't able to figure out why by going back through many different versions of my code.
I suspect it could be some modifications to libraries I've updated, not to mention that in the meanwhile I've updated to GHC 7.6 from 7.4 (and if anybody knows whether some laziness behaviour has changed, I would greatly appreciate it!).
I have an older executable of this code that does not have the bug, and thus I wonder if there are any tools that can tell me which library versions I was linking against before - for example, by figuring out the symbols, etc.
GHC creates executables, which are notoriously hard to understand... On my Linux box I can view the assembly code by typing in
objdump -d <executable filename>
but I get back over 100K lines of code from just a simple "Hello, World!" program written in Haskell.
If you happen to have the GHC .hi files, you can get some information about the executable by typing in
ghc --show-iface <hi filename>
This won't give you the assembly code, but you can get some extra information that may prove useful.
As I mentioned in the comment above, on Linux you can use ldd to see which C system libraries were linked in, but that is also probably less than useful:
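ldd <executable filename>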
You can try to use a decompiler, but those are generally written to decompile to C, not anything higher level and certainly not Haskell. That said, GHC compiles via C as an intermediate step (at least it used to; has that changed?), so you might be able to learn something.
Personally, I often find viewing system calls in action much more interesting than viewing pure assembly. On my Linux box, I can view all system calls by running strace (use Wireshark for the network traffic equivalent):
strace <program executable>
This will also generate a lot of data, so it might only be useful if you know of some specific place where direct real-world interaction (e.g., changes to a file on disk) goes wrong.
In all honesty, you are probably better off just debugging the problem from source, although, depending on the actual problem, some of these techniques may help you pinpoint something.
Most of these tools have Mac and Windows equivalents.
Since much has changed in the last 9 years, and apparently this is still the first result a search engine gives for this question (as it was for me, again), an updated answer is in order:
First of all, yes, while Haskell does not specify a bytecode format, bytecode is just a kind of machine code for a virtual machine, so for the rest of this answer I will treat them as the same thing. GHC's "Core", the LLVM intermediate language, or even WASM could be considered equivalent too.
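(As an aside: if you want to inspect GHC's Core for your own program rather than the final assembly, GHC can dump it during compilation; these are real GHC flags, the module name is just an example:
ghc -O2 -fforce-recomp -ddump-simpl Main.hs
Core is far more readable than objdump output, since it is still recognisably functional code.)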
Secondly, if your old binary is statically linked, then no matter the format your program is in, no symbols will be available to check out, because that is what static linking does. This holds even with bytecode, and even with the classic static #include in simple languages. So your old binary will be no good, no matter what. And given the optimisations compilers do, a classic decompiler will very likely never be able to figure out which optimised bits originally came from which libraries, especially with stream fusion and similar "magic".
Third, you can do the things you asked with a modern Haskell program, but you need to have your binaries compiled with -dynamic and -rdynamic, so that not only the C-calling-convention libraries (e.g. .so files) and the Haskell libraries, but also the runtime itself, are dynamically loaded. That way you end up with a very small binary, consisting of only your actual code, dynamic linking instructions, and exact data about which libraries and runtime were used to build it. And since the runtime is compiler-dependent, you will know the compiler too. So it gives you everything you need, but only if you compiled it right. (I recommend using such dynamic linking by default in any case, as it saves memory.)
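A minimal sketch of what that looks like (module and output names are hypothetical, and exact flag support depends on your GHC and platform):
ghc -dynamic Main.hs -o main
ldd main
With -dynamic, ldd lists the Haskell package libraries (libHS*.so) and the RTS itself, with the GHC version embedded in their file names - exactly the provenance information the question asks for.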
The last factor that one might forget is that even the exact same compiler version might behave vastly differently, depending on what IT was compiled with. (E.g. if somebody put a backdoor in the very first version of GHC, and all GHCs after that were compiled with that first GHC, and nobody ever checked, then that backdoor could still be in the code today, with no traces in any source or libraries whatsoever. … Or, for a less extreme case, the version of GHC your old binary was built with might have been compiled with different architecture options, leading it to put more optimised instructions into the binaries it compiles, unless told to cross-compile.)
Finally, of course, you can profile even compiled binaries, by profiling their system calls. This will give you clues about which part of the code acted differently and how. (E.g. if you notice that your new binary floods the system with some slow system calls where the old one just used a single fast one. A classic OpenGL example would be using fast display lists versus slow direct calls to draw triangles. Or using a different sorting algorithm, or having switched to a different kind of data structure that fits your work load badly and thrashes a lot of memory.)
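For a quick comparison, strace can also aggregate counts and timings per system call (the binary names here are hypothetical):
strace -c ./prog-old
strace -c ./prog-new
If the new build makes millions of small read calls where the old one made a few large ones, you have found your lead.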

gccsense vs. clang_complete

I've been using omniCppComplete + ctags for a while, and want to further improve code completion.
According to the suggestions here, gccsense and clang_complete seem to be the alternatives. However, I am not sure which one is better. Any idea of their performance?
Thanks!
Update: After I tried clang_complete, I found the completion speed extremely slow and unacceptable.
I then tried it using libclang.dylib, which speeds things up a lot but still feels laggy.
I think I should stick with ctags for now.
You should probably use clang_complete, not gccsense.
The main point here is the architecture of the two. The idea behind both solutions is very similar: you can't get proper C++ completion without access to the compiler's internal information (the abstract syntax tree), while gcc doesn't provide sufficient interfaces for that. The implementations differ in how they access this info, though: gccsense is a kind of "hack" - a custom build of gcc that stores the necessary info for later use by the plugin - while clang_complete goes the other way, using an alternative compiler, clang, one of whose main design goals was precisely to make the AST easily accessible to external tools.
So, when using gccsense you'll need to compile your code with a custom gcc build, which is already a bit outdated (gccsense uses gcc 4.4) and will constantly need developer support in the future. By contrast, clang_complete doesn't depend so tightly on the clang compiler; it uses it as an external tool.
As for performance: again, clang was designed to be faster than gcc, and it is. clang_complete can be slightly slower on Windows than on MacOS/Linux; however, gccsense can't even be compiled for Windows at this time.
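To see how accessible clang's AST is in practice, you can ask it to dump the AST of any translation unit (a real clang invocation; the file name is just an example):
clang -fsyntax-only -Xclang -ast-dump file.cpp
Stock gcc ships nothing comparable, which is why gccsense has to patch the compiler itself.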
GCCsense can be built on Windows.
See my patch on gcc 4.5.2 here:
http://forums.codeblocks.org/index.php/topic,13812.msg94824.html#msg94824
I admit that gccsense is just a hack on top of gcc, but clang has had a much better design from the beginning.
I hope someone can improve gcc/gccsense.

Is assembler portable between Linux distros?

Is a program shipped in assembler format portable between Linux distributions (modulo CPU architecture differences)?
Here's the background to my question: I'm working on a new programming language (named Aklo), whose modus operandi will be the classic one of compiling to .s and feeding the result to the GNU assembler.
Obviously it would be nice ultimately to have the implementation written in itself, but I had resigned myself to maintaining it in C++ to solve the chicken-and-egg problem: suppose you download the compiler for the first time and it is itself written in Aklo - how do you compile it? As I understand it, different Linux distributions and other UNIX-like systems have different conventions for binary formats.
But it's just occurred to me, a solution might be to ship the .s file (well, one per CPU architecture): it's fair to assume you have or can install the GNU assembler. Of course I'd still need a bootstrap compiler, but that doesn't need to be fast; I can write it in Python.
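The intended scheme on the user's machine would then look something like this (file names hypothetical; gcc is used here merely as a convenient driver for the GNU assembler and linker):
gcc -c hello.s -o hello.o
gcc hello.o -o hello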
Is assembler portable in the way that binaries are not? Are there any other stumbling blocks I haven't thought of?
Added in response to one answer:
I had looked wistfully at LLVM, there is certainly a lot of good stuff there and it would make my life easier -- except that it would incur a dependency on the correct version of LLVM being installed. It wouldn't be so bad having that dependency on development machines, but in a world where it's common to ship programs as source, the same dependency would be incurred for every user of every program ever written in Aklo, and I decided that was too high a price to pay.
But if the solution of shipping compiled programs as assembler works... then that solves that problem, and I can use LLVM after all, which would be a big win.
So the question about the portability of assembler is considerably more important than I had first realized.
Conclusion: from answers here and on the LLVM mailing list http://lists.cs.uiuc.edu/pipermail/llvmdev/2010-January/028991.html it seems the bad news is the problem is unsolvable, but the good news is that means using LLVM makes it no worse, so I'm free to do so and obtain all the advantages thereof.
You might want to check out LLVM before going down this particular path. It might make your life a lot easier, as it provides a low level virtual machine that makes a lot of hard stuff just work and has been very popular.
At a very high level, the ABI consists of { instruction set, system calls, binary format, libraries }.
Distribution as .s may free you from the binary format. This is still rather pointless, though, because you are still fixed to a particular ISA and still need to use libraries and/or make system calls. Libraries vary from distribution to distribution (although this isn't really that bad, especially if you just use libc), and syscalls vary from OS to OS.
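To make the syscall point concrete, here is a small C sketch of why even architecture-matched assembly stays OS-specific: raw system call numbers are a kernel interface, not a CPU one.
#include <unistd.h>
#include <sys/syscall.h>

int main(void) {
    /* On x86-64 Linux, SYS_write expands to 1; other kernels (and even
       other Linux architectures) use different numbers, so assembly
       that hardcodes syscall numbers only runs on the OS it was
       generated for. */
    syscall(SYS_write, 1, "hi\n", 3);
    return 0;
}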
It's basically 20 years since I last bootstrapped a C compiler. At the level of compilers, the differences between Linux distributions are minimal.
The much more important reason for going with LLVM is cross-platform support; if you're not targeting some intermediate language, your compiler will be extremely difficult to retarget to different processors. And seeing as, on my laptop, I have compilers for x86, x86_64, two kinds of MIPS, PowerPC, ARM and AVR... you see where I'm going? I can compile multiple languages for most of those targets too (only C for AVR).

Is it possible to compile the Linux kernel with something other than gcc

I wonder if someone has managed to compile the Linux kernel with a compiler other than gcc, or if anyone has ever tried. The question may seem silly or academic, but it arose when I thought about answers to: Are C++ int operations atomic on the mips architecture
It seems that the atomicity of some operations depends not only on the CPU architecture, but also on the compiler used. So, I wonder whether any compiler other than gcc is even usable in the Linux world.
Linux explicitly depends on some gcc extensions, so any other compiler must be compatible with the needed extensions.
This is not a "no", since it's of course not impossible for a separate compiler vendor/developer to track gcc's extensions, just a data point that might help you search.
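For a flavour of what "depends on gcc extensions" means, here is a sketch modelled on the kernel's own min() macro; both typeof and statement expressions - the ({ ... }) form - are GNU C extensions that any alternative compiler would have to support:
/* Unlike a naive macro, this evaluates x and y exactly once. */
#define min(x, y) ({        \
    typeof(x) _x = (x);     \
    typeof(y) _y = (y);     \
    _x < _y ? _x : _y; })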
At some point tcc would process and run the Linux kernel source. So that would be a yes, I guess.
(Hat tip to ephemient in the comments.)
The LLVM developers are trying to compile it with clang. The meta-bug on compiling the Linux kernel with clang has more details (the dependency tree for that meta-bug shows how little seems to be left).
There have been some efforts (and patches) to compile an early version of the 2.6 kernel with icc.
Yup. I've done this. See [cfe-dev] Clang builds a working Linux Kernel (Boots to RL5 with SMP, networking and X, self hosts).
IBM's compiler was able to do it some Linux versions ago, but I'm not sure about now, nor am I sure how well IBM's compiler optimized the kernel. All I know is, they got it to build.
As Linux is self-hosting (with its own libc) and has been developed from the start with gcc (and gcc cross-compilers), it's sort of silly to use anything else.
I think the biggest obstacles are playing nice with the preprocessor macros and the optimization hints the kernel relies on (not even getting into a departure from gas), as GNU has basically written the book on the above, and extended it. Beyond that, Linux tunes its optimizations to work with gcc; for instance, don't get caught using 'volatile' in the kernel without a damn good reason. Using 'inline' and actually having the compiler agree is another challenge.
Linus is the first one to call GCC an &*#$ hole, which makes for a better compiler.
This is why we have the great GNU/Linux debate.
Many, many, many years ago, it was actually possible to compile the kernel with g++, and as far as I remember part of the motivation was that C++ had stronger type checking, not necessarily to have g++ produce object files. But as Neil Butterworth has pointed out, Linus is not particularly fond of C++, and there is zero chance that this will ever be possible again.
EKOPath 4 Compiler: not right now, but probably with some minor patches:
https://github.com/path64/repositories
http://www.pathscale.com/ekopath-compiler-suite
I am currently working on compiling the Linux kernel with Open64 for the MIPS architecture, and some others are working on making Open64 build it for the x86 architecture. At the moment the kernel partly runs, but there are still runtime failures.
As for the atomicity problem, at least I have not run into it, and I do not think it is really a problem. The reasons are:
The Linux kernel is a body of source code that already builds successfully with GCC, so if a compiler cannot build it, or the built kernel fails at run time, it is the compiler's problem.
If a compiler wants to successfully build the Linux kernel, it has to abide by the GNU C extensions, and those extensions give a clear description of what an atomic operation is, so such a compiler only needs to generate code according to that description.
My non-technical guess: The Linux Kernel can't currently (2009) be compiled with any compiler other than the GNU compiler, gcc.
I say this on the basis that I've heard Richard Stallman say, with some conviction, that Linux should be called GNU/Linux because the kernel is "only 1 part of the operating system", and I'm guessing he would not be able to say this if the kernel were not dependent on GNU (e.g. a tonne of embedded devices run a Linux OS without any GNU software).
As I said, just a guess, let me know if I'm wrong...

What is the difference between gcc optimization levels?

What is the difference between the different optimization levels in GCC? Assuming I don't care to have any debug hooks, why wouldn't I just use the highest level of optimization available to me? Does a higher level of optimization necessarily (i.e. provably) generate a faster program?
Yes, a higher level can sometimes mean a better-performing program. However, it can cause problems depending on your code. For example, optimizations that keep variables in registers or reorder memory accesses (enabled at -O1 and up) can break poorly written multithreaded programs by exposing latent race conditions: the optimizer is allowed to assume there is no concurrent access to non-volatile, non-atomic variables, and when that assumption doesn't match what you wrote, code that happened to work unoptimized can fail (see the sketch below).
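A minimal sketch of the kind of code that breaks (the bug is in the code, not the compiler: done is shared with another thread without any synchronization):
/* Written by another thread; not volatile, not atomic, no locks. */
int done = 0;

void wait_for_done(void) {
    /* Unoptimized, this re-reads done every iteration and eventually
       exits; at -O1 and up the compiler may hoist the load out of the
       loop, turning it into an infinite loop. */
    while (!done)
        ;
}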
And sometimes, the higher optimizations (-O3) add no reasonable benefit but a lot of extra size. Your own testing can determine if this size tradeoff makes a reasonable performance gain for your system.
As a final note, the GNU project compiles all of their programs at -O2 by default, and -O2 is fairly common elsewhere.
Generally optimization levels higher than -O2 (just -O3 for gcc but other compilers have higher ones) include optimizations that can increase the size of your code. This includes things like loop unrolling, lots of inlining, padding for alignment regardless of size, etc. Other compilers offer vectorization and inter-procedural optimization at levels higher than -O3, as well as certain optimizations that can improve speed a lot at the cost of correctness (e.g., using faster, less accurate math routines). Check the docs before you use these things.
As for performance, it's a tradeoff. In general, compiler designers try to tune these things so that they don't decrease the performance of your code, so -O3 will usually help (at least in my experience) but your mileage may vary. It's not always the case that really aggressive size-altering optimizations will improve performance (e.g. really aggressive inlining can get you cache pollution).
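When in doubt, measure both; a minimal sketch, where bench.c stands in for your own representative benchmark:
gcc -O2 bench.c -o bench-O2
gcc -O3 bench.c -o bench-O3
size bench-O2 bench-O3
time ./bench-O2
time ./bench-O3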
I found a web page containing some information about the different optimization levels. One thing I remember hearing somewhere is that optimization might actually break your program, and that can be an issue. But I'm not sure how much of an issue that is any longer; perhaps today's compilers are smart enough to handle those problems.
Sidenote:
It's quite hard to predict exactly what flags are turned on by the global -O directives on the gcc command line for different versions and platforms, and all documentation on the GCC site is likely to become outdated quickly or doesn't cover the compiler internals in enough detail.
Here is an easy way to check exactly what happens on your particular setup when you use one of the -O flags and other -f flags and/or combinations thereof:
Create an empty source file somewhere:
touch dummy.c
Run it through the compiler pass just as you normally would, with all the -O, -f and/or -m flags you would normally use, but adding -Q -v to the command line:
gcc -c -Q -v dummy.c
Inspect the generated output, perhaps saving it to compare between runs.
Change the command line to your liking, remove the generated object file via rm -f dummy.o, and re-run.
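On reasonably recent GCC releases there is also a more direct query that doesn't need a dummy file:
gcc -Q -O2 --help=optimizers
This prints every optimizer flag along with whether the given -O level enables it; substitute whatever -O and -f flags you want to inspect.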
Also, always keep in mind that, from a purist point of view, most non-trivial optimizations generate "broken" code (where broken is defined as deviating from the optimal path in corner cases), so choosing whether or not to enable a certain set of optimization mechanisms sometimes boils down to choosing the level of correctness for the compiler output. There always have been (and currently are) bugs in any compiler's optimizer - just check the GCC mailing list and Bugzilla for some samples. Compiler optimization should only be used after actually performing measurements, since:
gains from using a better algorithm will dwarf any gains from compiler optimization,
there is no point in optimizing code that will run every once in a blue moon,
if the optimizer introduces bugs, it's immaterial how fast your code runs.
