Is there a nice(r) way of writing this Template Haskell code involving singleton data types? - haskell

I've just started to use Template Haskell (I've finally got a use case, yay!) and now I'm cognitively stuck.
What I'm trying to do is generating a singleton datatype declaration of the form
data $V = $V deriving (Eq,Ord)
starting from a name V (hopefully starting with an uppercase character!). To be explicit, I'm trying to write a function declareSingleton of type String -> DecsQ (I should mention here I'm using GHC 7.6.1, template-haskell version 2.8.0.0) such that the splice
$(declareSingleton "Foo")
is the equivalent of
data Foo = Foo deriving (Eq,Ord)
I've got the following code working and doing what I want, but I'm not very happy with it:
declareSingleton :: String -> Q [Dec]
declareSingleton s = let n = mkName s in sequence [
dataD (cxt []) n [] [normalC n []] [''Eq,''Ord]
]
I was hoping to get something like the following to work:
declareSingleton :: String -> Q [Dec]
declareSingleton s = let n = mkName s in
[d| data $n = $n deriving (Eq,Ord) |]
I've tried, to no avail (but not exhaustively!), various combinations of $s, $v, $(conT v), v, 'v so I have to suppose my mental model of how Template Haskell works is too simplistic.
Am I missing something obvious here, am I confusing type names and constructor names in some essential way, and can I write declareSingleton in a nice(r) way?
If so, how; if not, why not?
(Side remark: the Template Haskell API changes rapidly, and I'm happy for that - I want this simple type to eventually implement a multi-parameter type class with an associated type family - but the churn the API is currently going through doesn't make it easy to search for tutorials! There's a huge difference how TH was implemented in 6.12.1 or 7.2 (when most of the existing tutorial were written) versus how it works nowadays...)

From the Template Haskell documentation:
A splice can occur in place of
an expression; the spliced expression must have type Q Exp
an type; the spliced expression must have type Q Typ
a list of top-level declarations; the spliced expression must have type Q [Dec]
So e.g. constructor names simply cannot be spliced in the current version of Template Haskell.
I don't think there's much you can do to simplify this use-case (barring constructing the whole declaration as a string and transforming it into a Dec via toDec in haskell-src-meta).
You might consider simply binding the different parts of the declaration to local variables. While more verbose, it makes the code a bit easier to read.
declareSingleton :: String -> Q [Dec]
declareSingleton s = return [DataD context name vars cons derives] where
context = []
name = mkName s
vars = []
cons = [NormalC name fields]
fields = []
derives = [''Eq, ''Ord]

a hack that parses an interpolated string rather than constructing the AST:
makeU1 :: String -> Q [Dec]
makeU1 = makeSingletonDeclaration >>> parseDecs >>> fromRight >>> return
makeSingletonDeclaration :: String -> String
makeSingletonDeclaration name = [qq|data {name} = {name} deriving (Show)|]
where:
parseDecs :: String -> Either String [Dec]
usage:
makeU1 "T"
main = print T
it needs these packages:
$ cabal install haskell-src-exts
$ cabal install haskell-src-meta
$ cabal install interpolatedstring-perl6
runnable script:
https://github.com/sboosali/haskell/blob/0794eb7a52cf4c658270e49fa4f0d9e53c4b0ebf/TemplateHaskell/Splice.hs

Related

Typed abstract syntax and DSL design in Haskell

I'm designing a DSL in Haskell and I would like to have an assignment operation. Something like this (the code below is just for explaining my problem in a limited context, I didn't have type checked Stmt type):
data Stmt = forall a . Assign String (Exp a) -- Assignment operation
| forall a. Decl String a -- Variable declaration
data Exp t where
EBool :: Bool -> Exp Bool
EInt :: Int -> Exp Int
EAdd :: Exp Int -> Exp Int -> Exp Int
ENot :: Exp Bool -> Exp Bool
In the previous code, I'm able to use a GADT to enforce type constraints on expressions. My problem is how can I enforce that the left hand side of an assignment is: 1) Defined, i.e., a variable must be declared before it is used and 2) The right hand side must have the same type of the left hand side variable?
I know that in a full dependently typed language, I could define statements indexed by some sort of typing context, that is, a list of defined variables and their type. I believe that this would solve my problem. But, I'm wondering if there is some way to achieve this in Haskell.
Any pointer to example code or articles is highly appreciated.
Given that my work focuses on related issues of scope and type safety being encoded at the type-level, I stumbled upon this old-ish question whilst googling around and thought I'd give it a try.
This post provides, I think, an answer quite close to the original specification. The whole thing is surprisingly short once you have the right setup.
First, I'll start with a sample program to give you an idea of what the end result looks like:
program :: Program
program = Program
$ Declare (Var :: Name "foo") (Of :: Type Int)
:> Assign (The (Var :: Name "foo")) (EInt 1)
:> Declare (Var :: Name "bar") (Of :: Type Bool)
:> increment (The (Var :: Name "foo"))
:> Assign (The (Var :: Name "bar")) (ENot $ EBool True)
:> Done
Scoping
In order to ensure that we may only assign values to variables which have been declared before, we need a notion of scope.
GHC.TypeLits provides us with type-level strings (called Symbol) so we can very-well use strings as variable names if we want. And because we want to ensure type safety, each variable declaration comes with a type annotation which we will store together with the variable name. Our type of scopes is therefore: [(Symbol, *)].
We can use a type family to test whether a given Symbol is in scope and return its associated type if that is the case:
type family HasSymbol (g :: [(Symbol,*)]) (s :: Symbol) :: Maybe * where
HasSymbol '[] s = 'Nothing
HasSymbol ('(s, a) ': g) s = 'Just a
HasSymbol ('(t, a) ': g) s = HasSymbol g s
From this definition we can define a notion of variable: a variable of type a in scope g is a symbol s such that HasSymbol g s returns 'Just a. This is what the ScopedSymbol data type represents by using an existential quantification to store the s.
data ScopedSymbol (g :: [(Symbol,*)]) (a :: *) = forall s.
(HasSymbol g s ~ 'Just a) => The (Name s)
data Name (s :: Symbol) = Var
Here I am purposefully abusing notations all over the place: The is the constructor for the type ScopedSymbol and Name is a Proxy type with a nicer name and constructor. This allows us to write such niceties as:
example :: ScopedSymbol ('("foo", Int) ': '("bar", Bool) ': '[]) Bool
example = The (Var :: Name "bar")
Statements
Now that we have a notion of scope and of well-typed variables in that scope, we can start considering the effects Statements should have. Given that new variables can be declared in a Statement, we need to find a way to propagate this information in the scope. The key hindsight is to have two indices: an input and an output scope.
To Declare a new variable together with its type will expand the current scope with the pair of the variable name and the corresponding type.
Assignments on the other hand do not modify the scope. They merely associate a ScopedSymbol to an expression of the corresponding type.
data Statement (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
Declare :: Name s -> Type a -> Statement g ('(s, a) ': g)
Assign :: ScopedSymbol g a -> Exp g a -> Statement g g
data Type (a :: *) = Of
Once again we have introduced a proxy type to have a nicer user-level syntax.
example' :: Statement '[] ('("foo", Int) ': '[])
example' = Declare (Var :: Name "foo") (Of :: Type Int)
example'' :: Statement ('("foo", Int) ': '[]) ('("foo", Int) ': '[])
example'' = Assign (The (Var :: Name "foo")) (EInt 1)
Statements can be chained in a scope-preserving way by defining the following GADT of type-aligned sequences:
infixr 5 :>
data Statements (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
Done :: Statements g g
(:>) :: Statement g h -> Statements h i -> Statements g i
Expressions
Expressions are mostly unchanged from your original definition except that they are now scoped and a new constructor EVar lets us dereference a previously-declared variable (using ScopedSymbol) giving us an expression of the appropriate type.
data Exp (g :: [(Symbol,*)]) (t :: *) where
EVar :: ScopedSymbol g a -> Exp g a
EBool :: Bool -> Exp g Bool
EInt :: Int -> Exp g Int
EAdd :: Exp g Int -> Exp g Int -> Exp g Int
ENot :: Exp g Bool -> Exp g Bool
Programs
A Program is quite simply a sequence of statements starting in the empty scope. We use, once more, an existential quantification to hide the scope we end up with.
data Program = forall h. Program (Statements '[] h)
It is obviously possible to write subroutines in Haskell and use them in your programs. In the example, I have the very simple increment which can be defined like so:
increment :: ScopedSymbol g Int -> Statement g g
increment v = Assign v (EAdd (EVar v) (EInt 1))
I have uploaded the whole code snippet together with the right LANGUAGE pragmas and the examples listed here in a self-contained gist. I haven't however included any comments there.
You should know that your goals are quite lofty. I don't think you will get very far treating your variables exactly as strings. I'd do something slightly more annoying to use, but more practical. Define a monad for your DSL, which I'll call M:
newtype M a = ...
data Exp a where
... as before ...
data Var a -- a typed variable
assign :: Var a -> Exp a -> M ()
declare :: String -> a -> M (Var a)
I'm not sure why you have Exp a for assignment and just a for declaration, but I reproduced that here. The String in declare is just for cosmetics, if you need it for code generation or error reporting or something -- the identity of the variable should really not be tied to that name. So it's usually used as
myFunc = do
foobar <- declare "foobar" 42
which is the annoying redundant bit. Haskell doesn't really have a good way around this (though depending on what you're doing with your DSL, you may not need the string at all).
As for the implementation, maybe something like
data Stmt = forall a. Assign (Var a) (Exp a)
| forall a. Declare (Var a) a
data Var a = Var String Integer -- string is auxiliary from before, integer
-- stores real identity.
For M, we need a unique supply of names and a list of statements to output.
newtype M a = M { runM :: WriterT [Stmt] (StateT Integer Identity a) }
deriving (Functor, Applicative, Monad)
Then the operations as usually fairly trivial.
assign v a = M $ tell [Assign v a]
declare name a = M $ do
ident <- lift get
lift . put $! ident + 1
let var = Var name ident
tell [Declare var a]
return var
I've made a fairly large DSL for code generation in another language using a fairly similar design, and it scales well. I find it a good idea to stay "near the ground", just doing solid modeling without using too many fancy type-level magical features, and accepting minor linguistic annoyances. That way Haskell's main strength -- it's ability to abstract -- can still be used for code in your DSL.
One drawback is that everything needs to be defined within a do block, which can be a hinderance to good organization as the amount of code grows. I'll steal declare to show a way around that:
declare :: String -> M a -> M a
used like
foo = declare "foo" $ do
-- actual function body
then your M can have as a component of its state a cache from names to variables, and the first time you use a declaration with a certain name you render it and put it in a variable (this will require a bit more sophisticated monoid than [Stmt] as the target of your Writer). Later times you just look up the variable. It does have a rather floppy dependence on uniqueness of names, unfortunately; an explicit model of namespaces can help with that but never eliminate it entirely.
After seeing all the code by #Cactus and the Haskell suggestions by #luqui, I've managed to got a solution close to what I want in Idris. The complete code is available at the following gist:
(https://gist.github.com/rodrigogribeiro/33356c62e36bff54831d)
Some little things I need to fix in the previous solution:
I don't know (yet) if Idris support integer literal overloading, what would be quite useful to build my DSL.
I've tried to define in DSL syntax a prefix operator for program variables, but it didn't worked as I like. I've got a solution (in the previous gist) that uses a keyword --- use --- for variable access.
I'll check this minor points with guys in Idris #freenode channel to see if these two points are possible.

How to avoid default return value when accessing a non-existent field with lenses?

I love Lens library and I love how it works, but sometimes it introduces so many problems, that I regret I ever started using it. Lets look at this simple example:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data Data = A { _x :: String, _y :: String }
| B { _x :: String }
makeLenses ''Data
main = do
let b = B "x"
print $ view y b
it outputs:
""
And now imagine - we've got a datatype and we refactor it - by changing some names. Instead of getting error (in runtime, like with normal accessors) that this name does not longer apply to particular data constructor, lenses use mempty from Monoid to create default object, so we get strange results instead of error. Debugging something like this is almost impossible.
Is there any way to fix this behaviour? I know there are some special operators to get the behaviour I want, but all "normal" looking functions from lenses are just horrible. Should I just override them with my custom module or is there any nicer method?
As a sidenote: I want to be able to read and set the arguments using lens syntax, but just remove the behaviour of automatic result creating when field is missing.
It sounds like you just want to recover the exception behavior. I vaguely recall that this is how view once worked. If so, I expect a reasonable choice was made with the change.
Normally I end up working with (^?) in the cases you are talking about:
> b ^? y
Nothing
If you want the exception behavior you can use ^?!
> b ^?! y
"*** Exception: (^?!): empty Fold
I prefer to use ^? to avoid partial functions and exceptions, similar to how it is commonly advised to stay away from head, last, !! and other partial functions.
Yes, I too have found it a bit odd that view works for Traversals by concatenating the targets. I think this is because of the instance Monoid m => Applicative (Const m). You can write your own view equivalent that doesn't have this behaviour by writing your own Const equivalent that doesn't have this instance.
Perhaps one workaround would be to provide a type signature for y, so know know exactly what it is. If you had this then your "pathological" use of view wouldn't compile.
data Data = A { _x :: String, _y' :: String }
| B { _x :: String }
makeLenses ''Data
y :: Lens' Data String
y = y'
You can do this by defining your own view1 operator. It doesn't exist in the lens package, but it's easy to define locally.
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
data Data = A { _x :: String, _y :: String }
| B { _x :: String }
makeLenses ''Data
newtype Get a b = Get { unGet :: a }
instance Functor (Get a) where
fmap _ (Get x) = Get x
view1 :: LensLike' (Get a) s a -> s -> a
view1 l = unGet . l Get
works :: Data -> String
works = view1 x
-- fails :: Data -> String
-- fails = view1 y
-- Bug.hs:23:15:
-- No instance for (Control.Applicative.Applicative (Get String))
-- arising from a use of ‘y’

How to return a polymorphic type in Haskell based on the results of string parsing?

TL;DR:
How can I write a function which is polymorphic in its return type? I'm working on an exercise where the task is to write a function which is capable of analyzing a String and, depending on its contents, generate either a Vector [Int], Vector [Char] or Vector [String].
Longer version:
Here are a few examples of how the intended function would behave:
The string "1 2\n3 4" would generate a Vector [Int] that's made up of two lists: [1,2] and [3,4].
The string "'t' 'i' 'c'\n't' 'a' 'c'\n't' 'o' 'e'" would generate a Vector [Char] (i.e., made up of the lists "tic", "tac" and "toe").
The string "\"hello\" \"world\"\n\"monad\" \"party\"" would generate a Vector [String] (i.e., ["hello","world"] and ["monad","party"]).
Error-checking/exception handling is not a concern for this particular exercise. At this stage, all testing is done purely, i.e., this isn't in the realm of the IO monad.
What I have so far:
I have a function (and new datatype) which is capable of classifying a string. I also have functions (one for each Int, Char and String) which can convert the string into the necessary Vector.
My question: how can I combine these three conversion functions into a single function?
What I've tried:
(It obviously doesn't typecheck if I stuff the three conversion
functions into a single function (i.e., using a case..of structure
to pattern match on VectorType of the string.
I tried making a Vectorable class and defining a separate instance for each type; I quickly realized that this approach only works if the functions' arguments vary by type. In our case, the the type of the argument doesn't vary (i.e., it's always a String).
My code:
A few comments
Parsing: the mySplitter object and the mySplit function handle the parsing. It's admittedly a crude parser based on the Splitter type and the split function from Data.List.Split.Internals.
Classifying: The classify function is capable of determining the final VectorType based on the string.
Converting: The toVectorNumber, toVectorChar and toVectorString functions are able to convert a string to type Vector [Int], Vector [Char] and Vector [String], respectively.
As a side note, I'm trying out CorePrelude based on a recommendation from a mentor. That's why you'll see me use the generalized versions of the normal Prelude functions.
Code:
import qualified Prelude
import CorePrelude
import Data.Foldable (concat, elem, any)
import Control.Monad (mfilter)
import Text.Read (read)
import Data.Char (isAlpha, isSpace)
import Data.List.Split (split)
import Data.List.Split.Internals (Splitter(..), DelimPolicy(..), CondensePolicy(..), EndPolicy(..), Delimiter(..))
import Data.Vector ()
import qualified Data.Vector as V
data VectorType = Number | Character | TextString deriving (Show)
mySplitter :: [Char] -> Splitter Char
mySplitter elts = Splitter { delimiter = Delimiter [(`elem` elts)]
, delimPolicy = Drop
, condensePolicy = Condense
, initBlankPolicy = DropBlank
, finalBlankPolicy = DropBlank }
mySplit :: [Char]-> [Char]-> [[Char]]
mySplit delims = split (mySplitter delims)
classify :: String -> VectorType
classify xs
| '\"' `elem` cs = TextString
| hasAlpha cs = Character
| otherwise = Number
where
cs = concat $ split (mySplitter "\n") xs
hasAlpha = any isAlpha . mfilter (/=' ')
toRows :: [Char] -> [[Char]]
toRows = mySplit "\n"
toVectorChar :: [Char] -> Vector [Char]
toVectorChar = let toChar = concat . mySplit " \'"
in V.fromList . fmap (toChar) . toRows
toVectorNumber :: [Char] -> Vector [Int]
toVectorNumber = let toNumber = fmap (\x -> read x :: Int) . mySplit " "
in V.fromList . fmap toNumber . toRows
toVectorString :: [Char] -> Vector [[Char]]
toVectorString = let toString = mfilter (/= " ") . mySplit "\""
in V.fromList . fmap toString . toRows
You can't.
Covariant polymorphism is not supported in Haskell, and wouldn't be useful if it were.
That's basically all there is to answer. Now as to why this is so.
It's no good "returning a polymorphic value" like OO languages so like to do, because the only reason to return any value at all is to use it in other functions. Now, in OO languages you don't have functions but methods that come with the object, so it's quite easy to "return different types": each will have its suitable methods built-in, and they can per instance vary. (Whether that's a good idea is another question.)
But in Haskell, the functions come from elsewhere. They don't know about implementation changes for a particular instance, so the only way such functions can safely be defined is to know every possible implementation. But if your return type is really polymorphic, that's not possible, because polymorphism is an "open" concept (it allows new implementation varieties to be added any time later).
Instead, Haskell has a very convenient and totally safe mechanism of describing a closed set of "instances" – you've actually used it yourself already! ADTs.
data PolyVector = NumbersVector (Vector [Int])
| CharsVector (Vector [Char])
| StringsVector (Vector [String])
That's the return type you want. The function won't be polymorphic as such, it'll simply return a more versatile type.
If you insist it should be polymorphic
Now... actually, Haskell does have a way to sort-of deal with "polymorphic returns". As in OO when you declare that you return a subclass of a specified class. Well, you can't "return a class" at all in Haskell, you can only return types. But those can be made to express "any instance of...". It's called existential quantification.
{-# LANGUAGE GADTs #-}
data PolyVector' where
PolyVector :: YourVElemClass e => Vector [e] -> PolyVector'
class YourVElemClass where
...?
instance YourVElemClass Int
instance YourVElemClass Char
instance YourVElemClass String
I don't know if that looks intriguing to you. Truth is, it's much more complicated and rather harder to use; you can't just just any of the possible results directly but can only make use of the elements through methods of YourVElemClass. GADTs can in some applications be extremely useful, but these usually involve classes with very deep mathematical motivation. YourVElemClass doesn't seem to have such a motivation, so you'll be much better off with a simple ADT alternative, than existential quantification.
There's a famous rant against existentials by Luke Palmer (note he uses another syntax, existential-specific, which I consider obsolete, as GADTs are strictly more general).
Easy, use an sum type!
data ParsedVector = NumberVector (Vector [Int]) | CharacterVector (Vector [Char]) | TextString (Vector [String]) deriving (Show)
parse :: [Char] -> ParsedVector
parse cs = case classify cs of
Number -> NumberVector $ toVectorNumber cs
Character -> CharacterVector $ toVectorChar cs
TextString -> TextStringVector $ toVectorString cs

How to include code in different places during compilations in Haskell?

Quasi-quotes allow generating AST code during compilations, but it inserts generated code at the place where Quasi-quote was written. Is it possible in any way to insert the compile-time generated code elsewhere? For example in specific module files which are different from the one where QQ was written? It would depend on hard-coded module structure, but that's fine.
If that's not possible with QQ but anyone knows a different way of achieving it, I am open for suggestions.
To answer this, it's helpful to know what a quasi-quoter is. From the GHC Documentation, a quasi-quoter is a value of
data QuasiQuoter = QuasiQuoter { quoteExp :: String -> Q Exp,
quotePat :: String -> Q Pat,
quoteType :: String -> Q Type,
quoteDec :: String -> Q [Dec] }
That is, it's a parser from an arbitrary String to one or more of ExpQ, PatQ, TypeQ, and DecQ, which are Template Haskell representations of expressions, patterns, types, and declarations respectively.
When you use a quasi-quote, GHC applies the parser to the String to create a ExpQ (or other type), then splices in the resulting template haskell expression to produce an actual value.
It sounds like what you're asking to do is separate the quasiquote parsing and splicing, so that you have access to the TH expression. Then you can import that expression into another module and splice it there yourself.
Knowing the type of a quasi-quoter, it's readily apparent this is possible. Normally you use a QQ as
-- file Expr.hs
eval :: Expr -> Integer
expr = QuasiQuoter { quoteExp = parseExprExp, quotePat = parseExprPat }
-- file Foo.hs
import Expr
myInt = eval [expr|1 + 2|]
Instead, you can extract the parser yourself, get a TH expression, and splice it later:
-- file Foo.hs
import Expr
-- run the QQ parser
myInt_TH :: ExpQ
myInt_TH = quoteExp expr "1 + 2"
-- file Bar.hs
import Foo.hs
-- run the TH splice
myInt = $(myInt_TH)
Of course if you're writing all this yourself, you can skip the quasi-quotes and use a parser and Template Haskell directly. It's pretty much the same thing either way.

Data constructor in template haskell

I'm trying to create the ring Z/n (like normal arithmetic, but modulo some integer). An example instance is Z4:
instance Additive.C Z4 where
zero = Z4 0
(Z4 x) + (Z4 y) = Z4 $ (x + y) `mod` 4
And so on for the ring. I'd like to be able to quickly generate these things, and I think the way to do it is with template haskell. Ideally I'd like to just go $(makeZ 4) and have it spit out the code for Z4 like I defined above.
I'm having a lot of trouble with this though. When I do genData n = [d| data $n = $n Integer] I get "parse error in data/newtype declaration". It does work if I don't use variables though: [d| data Z5 = Z5 Integer |], which must mean that I'm doing something weird with the variables. I'm not sure what though; I tried constructing them via newName and that didn't seem to work either.
Can anyone help me with what's going on here?
The Template Haskell documentation lists the things you are allowed to splice.
A splice can occur in place of
an expression; the spliced expression must have type Q Exp
an type; the spliced expression must have type Q Typ
a list of top-level declarations; the spliced expression must have type Q [Dec]
In both occurrences of $n, however, you're trying to splice a name.
This means you can't do this using quotations and splices. You'll have to build declaration using the various combinators available in the Language.Haskell.TH module.
I think this should be equivalent to what you're trying to do.
genData :: Name -> Q [Dec]
genData n = fmap (:[]) $ dataD (cxt []) n []
[normalC n [strictType notStrict [t| Integer |]]] []
Yep, it's a bit ugly, but there you go. To use this, call it with a fresh name, e.g.
$(genData (mkName "Z5"))

Resources