Profiling Template Haskell - haskell

I have a TH-heavy file which takes around 30 seconds to compile. What are some techniques I can use to help debug the performance of my Template Haskell?

If I understand the compile flow of TH correctly, ordinary Haskell functions are executed at compile time while splicing. But you can, of course, also run them at runtime on your own.
For example, suppose you have something like $(foo x y ...) in your TH-heavy file. Create another file and call 'foo x y' there, but don't splice the result. Then you'll be able to profile 'foo' as usual. If the bottleneck is at the AST generation stage, you'll locate it. Don't forget to account for laziness: the generated AST has to be forced, or the work you want to measure never happens.
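A minimal sketch of that idea, assuming a hypothetical generator foo that only builds declarations (foo, mkDecl and the argument values are invented for illustration; the real function would come from your own library). It runs the Q action in IO with runQ and forces the result by pretty-printing it, so no splice and no TemplateHaskell extension is needed in this file:

```haskell
module Main (main) where

import Language.Haskell.TH

-- Hypothetical stand-in for one piece of generated code: value<i> = <i>
mkDecl :: Int -> Dec
mkDecl i =
  ValD (VarP (mkName ("value" ++ show i)))
       (NormalB (LitE (IntegerL (fromIntegral i))))
       []

-- Hypothetical stand-in for the expensive TH function from the question.
foo :: Int -> Int -> Q [Dec]
foo from to = pure (map mkDecl [from .. to])

main :: IO ()
main = do
  -- Run the generator at runtime instead of inside a splice.
  decs <- runQ (foo 1 5000)
  -- pprint walks the whole AST, so taking the length of its output
  -- forces the generated declarations despite laziness.
  putStrLn ("rendered characters: " ++ show (length (pprint decs)))
```

Built with something like ghc -prof -fprof-auto -rtsopts Main.hs and run as ./Main +RTS -p, this should write an ordinary Main.prof. Note that running Q in IO only supports part of the Q monad (for example, reify fails there), so this fits generators that do not inspect the surrounding program.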

As of GHC 8, this can be done with -fexternal-interpreter.
Compile the library defining the TH function with profiling enabled, then compile the code* which uses the TH function in a splice with GHC options -fexternal-interpreter -opti+RTS -opti-p. This should produce a file called ghc-iserv-prof.prof.
This approach has the advantage that you can use the full functionality of the Q monad.
* A benchmark suite in the same cabal project as the TH library (but in a different hs-source-dir) also works. It might even work with a TH function defined and used in the same library, but I think you'll be profiling interpreted code then.

Related

Watch window not working for F#

I had noticed some time ago that the "Watch" window in VS2012 for Web doesn't work for default functions in FSharp. For example, cos someValue doesn't work, neither does the workaround where let _cos = cos or let _cos x = cos x is inserted in the beginning of the function and _cos(someValue) is used. The error is something like "cos doesn't exist in the current context" or "_cos isn't valid in the current scope", among others.
Should I change some settings or is this an unexpected bug? Of course I can declare all the results I need to watch, but that's a bit of overhead and it is quite impractical. What can I do to fix this?
As mentioned in the referenced answer, the Watch and Immediate windows only support C#, so they cannot evaluate F# expressions and are not aware of the F# context (such as opened namespaces).
In summary, storing the result in a local variable (which is compiled to an ordinary local variable) is the best way to see the result.
More details:
In some cases, you can write C# code that corresponds to what you want to do in F#. This is probably only worth it for simple situations, when the corresponding C# is not too hard to write, but it can often be done.
For example to call cos 3.14, you need to write something like:
Microsoft.FSharp.Core.Operators.Cos(3.14)
If you find the cos function in the F# source code (it is right here, in prim-types.fsi), you can see that it comes with a CompiledName attribute that tells the compiler to compile it as a method named Cos (to follow .NET naming guidelines). It is defined in a module named Operators (see it here), which is annotated with AutoOpen, so you do not need to write open explicitly in F# code, but Operators is actually the name of the class that the F# compiler generates when compiling the code.

Runtime exception with Data Parallel Haskell / GHC 7.4.2

I'm trying to get some simple experiments with Data Parallel Haskell running, but I clearly have some options wrong. Even when I try something very simple like
sumP [:1.0,2.0:]
I get an exception
Exception indexParr: out of bounds parallel array index; idx = 0, arr len = 0
I assume I have something set up wrongly, but ...
I get this same error both when trying to use GHCi and when running an executable generated with GHC.
You might be running into some of the limitations specified by the DPH project status. Specifically
Major limitations include the inability to mix vectorised and
non-vectorised code in a single Haskell module, the need to use a
feature-deprived, special-purpose Prelude for vectorised code, and a
lack of optimisations (leading to poor performance in some cases).
If you're just looking to make use of regular data parallelism, you can probably get away with repa (which is also recommended by the DPH page).
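For comparison, here is a rough repa counterpart of the sumP expression from the question. This is only a sketch: it assumes the repa package is installed, and the names values and total are made up for illustration.

```haskell
module Main (main) where

import Data.Array.Repa

-- The two-element array from the question as an unboxed, one-dimensional
-- repa array. The shape (Z :. 2) states that it holds two elements.
values :: Array U DIM1 Double
values = fromListUnboxed (Z :. 2) [1.0, 2.0]

-- sumAllS reduces the whole array sequentially.
total :: Double
total = sumAllS values

main :: IO ()
main = print total
```

sumAllS is the sequential reduction; repa also provides a monadic sumAllP for parallel evaluation, which is closer in spirit to what DPH aims for.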

A good way to avoid "sharing"?

Suppose that someone would translate this simple Python code to Haskell:
def important_astrological_calculation(digits):
    # Get the first 1000000 digits of Pi!
    lucky_numbers = calculate_first_digits_of_pi(1000000)
    return digits in lucky_numbers
Haskell version:
importantAstrologicalCalculation digits =
    isInfixOf digits luckyNumbers
  where
    luckyNumbers = calculateFirstDigitsOfPi 1000000
After working with the Haskell version, the programmer is astonished to discover that his Haskell version "leaks" memory - after the first time his function is called, luckyNumbers never gets freed. That is troubling as the program includes some more similar functions and the memory consumed by all of them is significant.
Is there an easy and elegant way to make the program "forget" luckyNumbers?
In this case, your list of pi digits is a constant (a "constant applicative form", or CAF), and GHC will probably float it out, calculate it once, and share it amongst uses. If there are no references to the CAF, it will be garbage collected.
Now, in general, if you want something to be recalculated, turn it into a function (e.g. by adding a dummy () parameter) and enable -fno-full-laziness. Examples in the linked question on CAFs: How to make a CAF not a CAF in Haskell?
Three ways to solve this (based on this blog post)
Using INLINE pragmas
Add {-# INLINE luckyNumbers #-} and another for importantAstrologicalCalculation.
This will make separate calls be independent from each other, each using their own copy of the luckyNumbers which is iterated once and is immediately collected by the GC.
Pros:
Requires minimal changes to our code
Cons:
Fragile? kuribas wrote that "INLINE doesn't guarantee inlining, and it depends on optimization flags"
Machine code duplication. May create larger and potentially less efficient executables
Using the -fno-full-laziness GHC flag
Wrap luckyNumbers with a dummy lambda and use -fno-full-laziness:
{-# OPTIONS_GHC -fno-full-laziness #-}
luckyNumbers _ = calculateFirstDigitsOfPi 1000000
Without the flag, GHC may notice that the expression in luckyNumbers doesn't use its parameter and so it may float it out and share it.
Pros:
No machine code duplication: the implementation of luckyNumbers is shared without the resulting list being shared!
Cons:
Fragile? I fear that this solution might break if another module uses luckyNumbers and GHC decides to inline it, and this second module didn't enable -fno-full-laziness
Relies on GHC flags. These might change more easily than the language standard does
Requires modification to our code, including at all of luckyNumbers's call sites
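Put together, the dummy-parameter variant could look like the sketch below. calculateFirstDigitsOfPi is a hypothetical placeholder here (it just cycles a fixed pattern and does not compute real digits of pi), since the original function is not shown.

```haskell
{-# OPTIONS_GHC -fno-full-laziness #-}
module Main (main) where

import Data.List (isInfixOf)

-- Hypothetical placeholder for the real digit generator. It repeats a
-- fixed pattern, so these are NOT the actual digits of pi.
calculateFirstDigitsOfPi :: Int -> String
calculateFirstDigitsOfPi n = take n (cycle "3141592653")

-- The dummy () argument turns the constant into a function, so every call
-- builds a fresh list. The flag above stops GHC from floating the body
-- back out to a shared top-level value.
luckyNumbers :: () -> String
luckyNumbers _ = calculateFirstDigitsOfPi 1000000

importantAstrologicalCalculation :: String -> Bool
importantAstrologicalCalculation digits = digits `isInfixOf` luckyNumbers ()

main :: IO ()
main = do
  print (importantAstrologicalCalculation "1592")
  print (importantAstrologicalCalculation "77")
```

Whether the list is really rebuilt and collected on each call still depends on the optimizer, so it is worth confirming with a heap profile rather than taking it on trust.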
Functionalization
Alonzo Church famously discovered that data can be encoded in functions, and we can use it to avoid creating data structures that could be shared.
luckyNumbers can be turned into a function that folds over the digits of pi rather than a data structure.
Pros:
Solid. Little doubt that this will keep working in the face of various compiler optimizations
Cons:
More verbose
Non-standard. We're not using standard lists anymore, and those have a wealth of standard library functions supporting them which we may need to re-implement
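One possible shape for the functionalized version is sketched below. The names foldLuckyNumbers and digitAt are invented for illustration, and digitAt is a placeholder that cycles a fixed pattern instead of producing real digits of pi. The substring search is written directly against the fold, which also shows the "re-implement library functions" cost mentioned above.

```haskell
module Main (main) where

-- Hypothetical placeholder for a real digit generator: it cycles a fixed
-- pattern, so these are NOT the actual digits of pi.
digitAt :: Int -> Char
digitAt i = "3141592653" !! (i `mod` 10)

-- The digits expressed as a right fold instead of a list. Nothing here is
-- a data structure that could be retained between calls.
foldLuckyNumbers :: (Char -> r -> r) -> r -> r
foldLuckyNumbers step end = go 0
  where
    go i
      | i >= (1000000 :: Int) = end
      | otherwise             = step (digitAt i) (go (i + 1))

-- Naive substring search over the fold. The state passed along is the
-- list of pattern suffixes that are still waiting to be matched.
importantAstrologicalCalculation :: String -> Bool
importantAstrologicalCalculation digits
  | null digits = True
  | otherwise   = foldLuckyNumbers step (const False) []
  where
    step c continue partials =
      let advanced = [rest | (p : rest) <- digits : partials, p == c]
      in any null advanced || continue advanced

main :: IO ()
main = do
  print (importantAstrologicalCalculation "1592")
  print (importantAstrologicalCalculation "77")
```

Because the arguments passed to foldLuckyNumbers depend on digits, there is no constant subexpression for GHC to float out and share.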

Generating LLVM code for 'lambda', 'define'

So I now have a fairly complete LISP (Scheme) interpreter written in Haskell. Just for fun I want to try to have it compile down to LLVM. Most of the code generation seems pretty straightforward, but I'm at a loss as to how to generate code for a lambda expression (kind of important in Lisp ;) ) and how to manage the heap when I encounter a define expression.
How might I generate code for these expressions?
Note: I can generate code for the body of the lambda expression; what is confusing me is how to "put" that code somewhere and make it callable.
See Lennart's blog post: http://augustss.blogspot.com/2009/06/more-llvm-recently-someone-asked-me-on.html
Look at the compileFunction function. In particular, newFunction in the LLVM core: http://hackage.haskell.org/packages/archive/llvm/0.9.1.2/doc/html/LLVM-Core.html#g:23

Does "The whole language always available" hold in case of Clojure?

Ninth bullet point in Paul Graham's What Made Lisp Different says,
9. The whole language always available.
There is no real distinction between read-time, compile-time, and runtime. You can compile or run code while reading, read or run code while compiling, and read or compile code at runtime.
Running code at read-time lets users reprogram Lisp's syntax; running code at compile-time is the basis of macros; compiling at runtime is the basis of Lisp's use as an extension language in programs like Emacs; and reading at runtime enables programs to communicate using s-expressions, an idea recently reinvented as XML.
Does this last bullet point hold for Clojure?
You can mix runtime and compile-time freely in Clojure, although Common Lisp is still somewhat more flexible here (due to the presence of compiler macros and symbol macros and a fully supported macrolet; Clojure has an advantage in its cool approach to macro hygiene through automagic symbol resolution in syntax-quote). The reader is currently closed, so the free mixing of runtime, compile-time and read-time is not possible [1].
[1] Except through unsupported clever hacks.
It does hold:
(eval (read-string "(println \"Hello World!!\")"))
Hello World!!
nil
Just like Emacs, you can keep your program configuration in Clojure. One project I know of that uses Clojure this way is static, which lets you write your template as a Clojure vector along with arbitrary code that will be executed at read time.

Resources