Is it possible to profile a Haskell program without prof libraries? - haskell

Is it possible to time profile a Haskell program without installing the profiling libraries?
When I pass the -prof option to ghc, I always get errors like this one:
src/MyPKG/FooBlah.lhs:7:7:
Could not find module `Data.Time.Calendar':
Perhaps you haven't installed the profiling libraries for package `time-1.1.4'?
Use -v to see a list of the files searched for.
I know that the solution is to install profiling versions of the libraries with cabal, but sometimes this is a pain in the ass (sorry for the bad language).
I think it should be possible to profile my program anyway, with calls that have no profiling symbols showing up as ???? or something like that in the output.

No, it's not possible. Building for profiling changes the data representation, and function calls take extra parameters to keep track of the profiling data.
You have to install the profiling libraries to use GHC's profiler, even if it's a pain in the rear.
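For what it's worth, the usual fix (exact flag spellings vary between cabal-install versions) is either to reinstall the offending package with profiling enabled:
cabal install --reinstall -p time
(-p is shorthand for --enable-library-profiling), or to set library-profiling: True in ~/.cabal/config so that every future install also builds the profiling variant of the library.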

Related

Customising Cabal libraries (I think?)

Perhaps it's just better to describe my problem.
I'm developing a Haskell library. But part of the library is written in C, and another part actually in raw LLVM. To actually get GHC to spit out the code I want I have to follow this process:
Run ghc -emit-llvm on both the code that uses the Haskell module and the "Main" module.
Run clang -emit-llvm on the C file
Now I've got three .ll files from above. I add the part of the library I've handwritten in raw LLVM and llvm-link these into one .ll file.
I then run LLVM's opt on the linked file.
Lastly, I feed the LLVM bitcode file back into GHC (which pleasantly accepts it) and it produces an executable.
This process (with appropriate optimisation settings of course) seems to be the only way I can inline code from C, removing the function call overhead. Since many of these C functions are very small this is significant.
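To make the steps above concrete, here is roughly what the pipeline looks like end to end. The file names (MyLib.hs, Main.hs, cbits.c, hand.ll) are made up, and the exact flags for getting .ll output from GHC depend on your GHC version:
ghc -O2 -emit-llvm MyLib.hs Main.hs
clang -O2 -S -emit-llvm cbits.c -o cbits.ll
llvm-link MyLib.ll Main.ll cbits.ll hand.ll -S -o whole.ll
opt -O3 whole.ll -S -o whole-opt.ll
ghc whole-opt.ll -o myprog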
Anyway, I want to be able to distribute the library, and for users to be able to use it as painlessly as possible whilst still gaining the optimisations from the process above. I understand it's going to be a bit more of a pain than an ordinary library (for example, you're forced to compile via LLVM), but making it as painless as possible is what I'm looking for advice on.
Any guidance will be appreciated, I don't expect a step by step answer because I think it will be complex, but just some ideas would be helpful.

Profiling Haskell code but excluding library profiling information

As we all know, when profiling Haskell applications, all dependencies have to be installed with profiling information. This is fine, but a problem arises with Haskell packages that have -auto-all in their .cabal files. It means that I will always see their profiling information, even when it is irrelevant to me.
Allow me to present an example where this is problematic. I am building a little game engine, and I do a bunch of work before my game loop loading textures and such with JuicyPixels. This isn't code that's interesting to profile - I'm interested in profiling the game loop itself. However, because JuicyPixels built itself with -auto-all, there doesn't seem to be a way to exclude this information from profiling. As a result, I end up with hundreds of profiling lines that are simply noise.
Is it possible to strip out all of JuicyPixels debugging information (or any library, in the general case)?
The comments suggest that this is a problem with the cabal file for JuicyPixels (and if this problem continues to happen in other libraries, then it is also their fault). I started a discussion on the Haskell Cafe (http://haskell.1045720.n5.nabble.com/ghc-prof-options-and-libraries-on-Hackage-td5756706.html), and will try and follow up on that.
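For reference, the kind of thing being discussed there is a ghc-prof-options field in a library's .cabal file, along these lines (illustrative only, not JuicyPixels' actual file):
library
  ghc-prof-options: -auto-all
With such a field present, every downstream profiling build picks up that library's cost centres whether you want them or not.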

How to inspect Haskell bytecode

I am trying to track down a bug (a serious performance regression). Unfortunately, I wasn't able to find the cause even by going back through many different versions of my code.
I suspect it could be caused by modifications to libraries that I've updated; in the meantime I've also moved from GHC 7.4 to 7.6 (and if anybody knows whether some laziness behaviour has changed, I would greatly appreciate hearing about it!).
I have an older executable of this code that does not have this bug, so I wonder whether there are any tools that can tell me which library versions I was linking against before, for example by recovering the symbols.
GHC creates executables, which are notoriously hard to understand... On my Linux box I can view the assembly code by typing in
objdump -d <executable filename>
but I get back over 100K lines of code from just a simple "Hello, World!" program written in Haskell.
If you happen to have the GHC .hi files, you can get some information about the executable by typing in
ghc --show-iface <hi filename>
This won't give you the assembly code, but you can get some extra information that may prove useful.
As I mentioned in the comment above, on Linux you can use "ldd" to see what C-system libraries you used in the compile, but that is also probably less than useful.
You can try to use a disassembler, but those are generally written to disassemble to C, not anything higher level and certainly not Haskell. That being said, GHC compiles to C as an intermediary (at least it used to; has that changed?), so you might be able to learn something.
Personally, I often find watching system calls in action much more interesting than viewing pure assembly. On my Linux box I can view all system calls by running the program under strace (use Wireshark for the network-traffic equivalent):
strace <program executable>
This also will generate a lot of data, so it might only be useful if you know of some specific place where direct real world communication (i.e., changes to a file on the hard disk drive) goes wrong.
In all honesty, you are probably better off just debugging the problem from source, although, depending on the actual problem, some of these techniques may help you pinpoint something.
Most of these tools have Mac and Windows equivalents.
Since much has changed in the last 9 years, and apparently this is still the first result a search engine gives on this question (like for me, again), an updated answer is in order:
First of all, yes: while Haskell does not specify a bytecode format, bytecode is just a kind of machine code for a virtual machine, so for the rest of the answer I will treat them as the same thing. GHC's Core, the LLVM intermediate language, or even WASM could be considered equivalent too.
Secondly, if your old binary is statically linked, then no matter what format your program is in, no symbols will be available to check out, because that is what linking does. The same holds for bytecode, and even for classic static #include in simple languages. So your old binary will be no good, no matter what. And given the optimisations compilers do, a classic decompiler will very likely never be able to figure out which optimised bits originally came from which libraries, especially with stream fusion and similar "magic".
Third, you can do the things you asked with a modern Haskell program, but you need to have your binaries compiled with -dynamic and -rdynamic, so that not only the C-calling-convention libraries (the .so files) and the Haskell libraries, but also the runtime itself, are dynamically linked. That way you end up with a very small binary consisting of only your actual code, dynamic linking instructions, and exact data about which libraries and runtime were used to build it. And since the runtime is compiler-dependent, you will know the compiler too. So it would give you everything you need, but only if you compiled it right. (I recommend using such dynamic linking by default in any case, as it saves memory.)
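A quick way to see this in practice is to build dynamically and then list the shared objects the binary depends on (output is illustrative, and the exact library names depend on your GHC and package versions):
ghc -dynamic -rdynamic Main.hs -o main
ldd main
The ldd output then names the individual libHS*-<version>.so files and the RTS shared library that the binary was linked against.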
The last factor that one might forget, is that even the exact same compiler version might behave vastly differently, depending on what IT was compiled with. (E.g. if somebody put a backdoor in the very first version of GHC, and all GHCs after that were compiled with that first GHC, and nobody ever checked, then that backdoor could still be in the code today, with no traces in any source or libraries whatsoever. … Or for a less extreme case, that version of GHC your old binary was built with might have been compiled with different architecture options, leading to it putting more optimised instructions into the binaries it compiles for unless told to cross-compile.)
Finally, of course, you can profile even compiled binaries, by profiling their system calls. This will give you clues about which part of the code acted differently and how. (E.g. if you notice that your new binary floods the system with some slow system calls where the old one just used a single fast one. A classic OpenGL example would be using fast display lists versus slow direct calls to draw triangles. Or using a different sorting algorithm, or having switched to a different kind of data structure that fits your work load badly and thrashes a lot of memory.)

Profile Haskell without installing profiling libraries for all dependencies

I wish to profile my program written in Haskell.
On compilation, I am told that I do not have profiling libraries for certain dependencies (e.g., criterion) installed and cabal aborts.
I have no interest in profiling parts of those dependencies; code called from main doesn't even use them.
How can I profile my application without installing profiling libraries I don't need and without removing all those dependencies?
A good way to circumvent having to compile everything with profiling is to use cabal sandbox. It allows you to set up a sandbox for one application only, and thereby you won't have to re-install your entire ~/.cabal prefix. You'll need a recent version of Cabal, so run cabal update && cabal install cabal-install first.
Once you initialise a sandbox, create a file cabal.config containing the necessary directives (in your case library-profiling: True; executable-profiling: True may also be handy).
A side-effect of this is that you can test your code with dependencies that need not be installed globally, for example, experimental versions, or outdated versions.
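A minimal sketch of that workflow (the commands are the cabal-install 1.18-era sandbox commands; adjust them for your version):
cd my-project
cabal sandbox init
echo "library-profiling: True" >> cabal.config
cabal install --only-dependencies
cabal configure --enable-executable-profiling
cabal build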
EDIT: by the way, I don't think you need to have profiling enabled for criterion to work. In any case, it works for me without profiling being enabled. Just write a Main module that contains main = defaultMain benchmarks, where benchmarks has type [Benchmark], i.e. a list of benchmarks that you've written.
You then compile that file (say we call it benchmarks.hs) with ghc --make -o bench benchmarks.hs, and run the program ./bench with the appropriate arguments (consult the criterion documentation for details; a good default is ./bench -o benchmarks.html, which will generate a nifty report similar to this one).
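A minimal sketch of such a module, where fib is just a stand-in for whatever you actually want to measure:
import Criterion.Main

-- Naive Fibonacci, used only as a stand-in for the code under test.
fib :: Int -> Integer
fib n
  | n < 2     = fromIntegral n
  | otherwise = fib (n - 1) + fib (n - 2)

benchmarks :: [Benchmark]
benchmarks =
  [ bench "fib 20" (whnf fib 20)
  , bench "fib 25" (whnf fib 25)
  ]

main :: IO ()
main = defaultMain benchmarks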
I had the same problem this week, and although I had recompiled everything by hand, I was instructed in the IRC channel to do the following:
Go to your cabal config file (if you don't know where it is, it is typically ~/.cabal/config).
Edit the line that enables library profiling (and while you are at it, enable documentation too).
Run cabal install world.
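Concretely, the relevant settings in ~/.cabal/config look roughly like this (option names may differ slightly between cabal-install versions):
library-profiling: True
documentation: True
After that, cabal install world rebuilds the packages listed in your world file with those settings.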
As mentioned in the question you refer to in your comment, a good way to solve this problem in the future is to enable profiling in the cabal configuration. This way all libraries are installed with profiling support. This might not be a satisfying solution but I guess many are opting for it.
If you are only interested in getting an impression of the memory usage of your program, you can generate a heap profile using -hT. More precisely, you have to compile the program with -rtsopts to enable RTS options, then execute it using +RTS -hT. The run produces a file with the extension .hp, which you can convert into a PostScript heap-profile graph using hp2ps. This should work without any profiling support, but note that I am too lazy to verify it, as I have installed all libraries with profiling support ; )
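In command form, that workflow looks roughly like this (myprog and Main.hs are placeholder names):
ghc -O2 -rtsopts Main.hs -o myprog
./myprog +RTS -hT -RTS
hp2ps -c myprog.hp
The last command writes myprog.ps with the heap-profile graph.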

gccsense vs. clang_complete

I've been using omniCppComplete + ctags for a while, and want to make a further improvement on the code completion.
According to the suggestion in here [1], gccsense and clang_complete seem to be alternatives. However, I am not sure which one is better. Any idea about their performance?
Thanks!
Update: After trying clang_complete, I found the completion speed unacceptably slow.
I then tried it using libclang.dylib, which speeds things up a lot but still feels laggy.
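(For reference, pointing clang_complete at the library rather than the clang binary is a matter of a couple of vimrc settings; the option names below are the ones I recall from the clang_complete docs, and the path is only an example:)
let g:clang_use_library = 1
let g:clang_library_path = '/usr/local/lib'   " directory containing libclang.dylib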
I think I should stick to ctags for now.
You should probably use clang_complete, not gccsense.
The main point here is the architecture of the two. The idea behind both solutions is very similar: you can't get proper C++ completion without access to the compiler's internal information (the abstract syntax tree), and gcc doesn't provide sufficient interfaces for that. The implementations differ in how they get at this information: gccsense is a kind of "hack" - a custom build of gcc capable of storing the necessary info and later providing it to the plugin - while clang_complete goes the other way by using an alternative compiler, clang, one of whose main design goals was precisely to make the AST easily accessible to external tools.
So, if you use gccsense you'll need to compile your code with a kind of custom gcc compiler, which is already a little bit outdated (gccsense uses gcc 4.4) and will constantly need the developer's support in future. By contrast, clang_complete doesn't depend so heavily on the clang compiler; it uses it as an external tool.
As for performance: again, clang was designed to be faster than gcc, and it is. clang_complete can be slightly slower on Windows than on MacOS/Linux; gccsense, however, can't even be compiled for Windows at the time of writing.
GCCsense can be built on Windows.
See my patch on gcc 4.5.2 here:
http://forums.codeblocks.org/index.php/topic,13812.msg94824.html#msg94824
I admit that gccsense is just a hack on top of gcc, but clang has had a much better design from the beginning.
I hope someone can improve gcc/gccsense.

Resources