Is it possible to emit raw source code with Template Haskell? - haskell

Say I have a String (or Text or whatever) containing valid Haskell code. Is there a way to convert it into a [Dec] with Template Haskell?
I'm pretty sure the AST doesn't directly go to GHC so there's going to be a printing and then a parsing stage anyways.
This would be great to have since it would allow different "backends" for TH. For example you could use the AST from haskell-src-exts which supports more Haskell syntax than TH does.

I'm pretty sure the AST doesn't directly go to GHC so there's going to be a printing and then a parsing stage anyways.
Why would you think that? That isn’t the case, the TH AST is converted to GHC’s internal AST directly; it never gets converted back to text at any point in that process. (If it did, that would be pretty strange.)
Still, it would be somewhat nice if Template Haskell exposed a way to parse Haskell source to expressions, types, and declarations, basically exposing the parsers behind various e, t, and d quoters that are built in to Template Haskell. Unfortunately, it does not, and I don’t believe there are currently any plans to change that.
Currently, you need to go through haskell-src-exts instead. This is somewhat less than ideal, since there are differences between haskell-src-exts’s parser and GHCs, but it’s as good as you’re currently going to get. To lessen the pain, there is a package called haskell-src-meta that bridges haskell-src-exts and template-haskell.
For your use case, you can use the parseDecs function from Language.Haskell.Meta.Parse, which has the type String -> Either String [Dec], which is what you’re looking for.

Related

Haskell 'showInt' not in scope: why not?

I'm trying to understand Philip Wadler's "Essence of Functional Programming", and I seem to be held back by his assertion that "No knowledge of Haskell is necessary to understand this paper." Maybe not, but his examples sure require some.
Specifically, I'm trying to understand his example interpreter. When I try to compile this using GHC, or load it using :load, it complains not in scope: showint. Perhaps you meant showInt. When I replace the token with showInt, it says Not in scope: showInt.
I'd really like to believe Dr. Wadler when he says all I need to know is contained in his paper.
I'd really like to get it working under GHCI, so I can try various expressions under the interpreter. I'm new to Haskell, and was duly warned about the opaqueness of its error messages, but this seems designed to perplex!
The showInt function is part of the Numeric module, so you have to import Numeric to have it in scope. I guess the typo hint system knows about modules you haven't imported.
showInt also doesn't return a string directly but instead a String -> String function. I think this functionality is used for showing things composed of multiple parts more efficiently, but here it'd just be a pain and your code won't typecheck as is.
Instead, you can replace showint with just show and let the compiler figure it out. show is Haskell's toString and is overloaded for any type it's reasonable to convert to a string.

How to build Abstract Syntax Trees from grammar specification in Haskell?

I'm working on a project which involves optimizing certain constructs in a very small subset of Java, formalized in BNF.
If I were to do this in Java, I would use a combination of JTB and JavaCC which builds an AST. Visitors are then used to manipulate the tree. But, given the vast libraries for parsing in Haskell (parsec, happy, alex etc), I'm a bit confused in chossing the appropriate library.
So, simply put, when a language is specified in BNF, which library offers the easiest means to build an AST? And what is the best way to go about modifying this tree in idiomatic Haskell?
Well in Haskell there are 2 main ways of parsing something, parse combinators or a parser generator. Since you already have a BNF I'd suggest the latter.
A good one is alex. GHC's parser IIRC is written using this so you'd be in good company.
Next you'll have a big honking stack of data declarations to parse into:
data JavaClass = {
className :: Name,
interfaces :: [Name],
contents :: [ClassContents],
...
}
data ClassContents = M Method
| F Field
| IC InnerClass
and for expressions and whatever else you need. Finally you'll combine these into something like
data TopLevel = JC JavaClass
| WhateverOtherForms
| YouWillParse
Once you have this you'll have the entire AST represented as one TopLevel or a list of them depending on how many you classes/files you parse.
To proceed from here depends on what you want to do. There are a number of libraries such as syb (scrap your boilerplate) that let you write very concise tree traversals and modifications. lens is also an option. At a minimum check out Data.Traversable and Data.Foldable.
To modify the tree, you can do something as simple as
ignoreInnerClasses :: JavaClass -> JavaClass
ignoreInnerContents c = c{contents = filter isClass $ contents c}
-- ^^^ that is called a record update
where isClass (IC _) = True
isClass _ = False
and then you could potentially use something like syb to write
everywhere (mkT ignoreInnerClass) toplevel
which will traverse everything and apply ignoreInnerClass to all JavaClasses. This is possible to do in lens and many other libraries too, but syb is very easy to read.
I've never used bnfc-meta (suggested by #phg), but I would strongly recommend you look into BNFC (on hackage: http://hackage.haskell.org/package/BNFC). The basic approach is that you write your grammar in an annotated BNF style, and it will automatically generate an AST, parser, and pretty-printer for the grammar.
How suitable BNFC is depends upon the complexity of your grammar. If it's not context-free, you'll likely have a difficult time making any progress (I did make some success hacking up context-sensitive extensions, but that code's likely bit-rotted by now). The other downside is that your AST will very directly reflect the grammar specification. But since you already have a BNF specification, adding the necessary annotations for BNFC should be rather straightforward, so it's probably the fastest way to get a usable AST. Even if you decide to go a different route, you might be able to take the generated data types as a starting point for a hand-written version.
Alex + Happy.
There are many approaches to modify/investigate the parsed terms (ASTs). The keyword to search for is "datatype-generic" programming. But beware: it is a complex topic ...
http://people.cs.uu.nl/andres/Rec/MutualRec.pdf
http://www.cs.uu.nl/wiki/GenericProgramming/Multirec
It has a generic implementation of the zipper available here:
http://hackage.haskell.org/packages/archive/zipper/0.3/doc/html/Generics-MultiRec-Zipper.html
Also checkout https://github.com/pascalh/Astview
You might also check out the Haskell Compiler Series which is nice as an introduction to using alex and happy to parse a subset of Java: http://bjbell.wordpress.com/haskell-compiler-series/.
Since your grammar can be expressed in BNF, it is in the class of grammars that are efficiently parseable with a shift-reduce parser (LALR grammars). Such efficient parsers can be generated by the parser generator yacc/bison (C,C++), or its Haskell equivalent "Happy".
That's why I would use "Happy" in your case. It takes grammar rules in BNF form and generates a parser from it directly. The resulting parser will accept the language that is described by your grammar rules and produce an AST (abstract syntax tree). The Happy user guide is quite nice and gets you started quickly:
http://www.haskell.org/happy/doc/html/
To transform the resulting AST, generic programming is a good idea. Here is a classical explanation on how to do this in Haskell in a practical fashion, from scratch:
http://research.microsoft.com/en-us/um/people/simonpj/papers/hmap/
I have used exactly this to build a compiler for a small domain specific language, and it was a simple and concise solution.

Dynamic loading of Haskell abstract syntax expression

Can we use GHC API or something else to load not text source modules, but AST expressions, similar to haskell-src-exts Exp type? This way we could save time for code generation and parsing.
I don't think the GHC API exposes an AST interface (could be wrong though), but Template Haskell does. If you build expressions using the Language.Haskell.TH Exp structure, you can create functions/declarations and make use of them by the $(someTHFunction) syntax.
A fairly major caveat is that TH only runs at compile time, so you would need to pre-generate everything. If you want to use TH at run-time, I think you'd need to pretty-print the template haskell AST, then use the GHC API on the resulting string.

What's the alternative to exceptions in a deep Haskell recursion?

I'm trying to learn Haskell by writing little programs... so I'm currently writing a lexer/parser for simple expressions. (Yes I could use Alex/Happy... but I want to learn the core language first).
My parser is essentially a set of recursive functions that build a Tree. In the case of syntax errors, I'd normally throw an exception (i.e. if I were writing in C#), but that seems to be discouraged in Haskell.
So what's the alternative? I don't really want to test for error states in every single bit of the parser. I want to either end up with a valid tree of nodes, or an error state with details.
Haskell provides the Maybe and Either types for this. Since you want to return an error state, Either seems to be what you want.
For a computation that might fail with details on why it failes, there's the type Either a b, e.g. Either ErrorDetails ParseTree, so your result could be Right theParseTree or Left ErrorDetails. You can pattern-match these constructors in your recursive function and if there's an error, you pass it on; otherwise you go on as usual.

Modifying the pretty printer from haskell-src-exts

The haskell-src-exts package has functions for pretty printing a Haskell AST. What I want to do is change its behavior on certain constructors, in my case the way SCC pragmas are printed. So everything else should be printed the default way, only SCCs are handled differently. Is it possible to do it without copying the source file and editing it, which is what I'm doing now?
Well, the library has done one thing right, using a type class for Pretty. The challenge then is how to select a different instance for the constructors you want to print differently. Ideally, you would just newtype the AST node you care about, and somehow substitute that into the AST.
Now, the problem here is that the Haskell AST exported by the library has its type structure fixed. It doesn't, e.g. use two-level types, which would let you substitute newtypes for parts of the tree. So you would have to redefine the type of the AST down to the node you wish to change the type of.

Resources