Template Haskell data declarations that derive Show - haskell

The following doesn't compile:
import Language.Haskell.TH
makeAlpha n = [d| data Alpha = Alpha $(conT n) deriving (Show, Read) |]
I can't make out what the error means at all:
Can't derive instances where the instance context mentions
type variables that are not data type parameters
Offending constraint: Show t_d
When deriving the instance for (Show Alpha)
In the Template Haskell quotation
[d| data Alpha = Alpha $(conT n) deriving (Show, Read) |]
In the expression:
[d| data Alpha = Alpha $(conT n) deriving (Show, Read) |]
Is it possible to do derivations like this?

This problem arises because TH quotes are type checked when they are compiled, with splices replaced by variables. This is usually a good idea, because it allows many kinds of problems to be detected before the splice is run, but in some cases this can make the compiler wrongfully reject a splice that would generate valid code.
In this case, this means that the compiler tries to check this code:
data Alpha = Alpha t deriving (Show, Read)
This doesn't work because the derived Show and Read instances need to use Show and Read for t, but since t is not a type parameter of Alpha, it cannot add the necessary constraints. Of course, when this splice is run, t is replaced by a concrete type, so the appropriate instances will be available without the need for any constraints, so this is one of the cases where the compiler is being over-cautious.
The workaround is to not use quoting, but instead use TH combinators, which are not subject to these extra checks. It's messy, but it works:
makeAlpha n = sequence [dataD (cxt []) alpha []
[normalC alpha [strictType notStrict (conT n)]] [''Show, ''Read]]
where alpha = mkName "Alpha"
There has been some talk about relaxing the checks done on quotes, but for now you'll just have to deal with it.

Related

How do you model "metadata" in Haskell?

I'm writing a parser in Haskell (mostly just to learn). I have a working tokenizer and parser and I want to add line numbers when giving an error message. I have this type:
data Token = Lambda
| Dot
| LParen
| RParen
| Ident String
Back in OO land, I would just create a Metadata object that holds the token's position in the source code. So I could try this:
data Metadata = Pos String Int Int
Then, I could change Token to
data Token = Lambda Metadata
| Dot Metadata
| LParen Metadata
| RParen Metadata
| Ident String Metadata
However, my parser is written using pattern matching on the tokens. So now, all my pattern matching is broken because I need to also account for the Metadata. So that doesn't seem ideal. 99% of the time, I don't care about the Metadata.
So what's the "right" way to do what I want to do?
There’s a wide array of approaches to the design of syntax representations in Haskell, but I can offer some recommendations and reasoning.
It’s advisable to keep metadata annotations out of the Token type, so that it sticks to a single responsibility. If a Token represents just a token, its derived instances for Eq and so on will work as expected without needing to worry about when to ignore the annotation.
Thankfully, the alternatives are simple in this case. One option is to move the annotation info to a separate wrapper type.
-- An #'Anno' a# is a value of type #a# annotated with some 'Metadata'.
data Anno a = Anno { annotation :: Metadata, item :: a }
deriving
( Eq
, Ord
, Show
-- …
)
Now the tokeniser can return a sequence of annotated tokens, i.e. [Annotated Token]. You still need to update the use sites, but the changes are now much simpler. And you can ignore annotations in various ways:
-- Positional matching
f1 (Anno _meta (Ident name)) = …
-- Record matching
f2 Anno { item = Ident name } = …
-- With ‘NamedFieldPuns’
f3 Anno { item } = …
-- 'U'nannotated value; with ‘PatternSynonyms’
pattern U :: a -> Anno a
pattern U x <- Anno _meta x
f4 (U LParen) = …
You can deannotate a sequence of tokens with fmap item to reuse existing code that doesn’t care about location info. And since Anno is a type of kind Type -> Type, GHC can also derive Foldable, Functor, and Traversable for it, making it easy to operate on the annotated item with e.g. fmap and traverse.
This is the preferable approach for Token, but for a parsed AST containing annotations, you may want to make the annotation type a parameter of the AST type, for example:
data Expr a = Add a (Expr a) (Expr a) | Literal a Int
deriving (Eq, Foldable, Functor, Ord, Show, Traversable)
Then you can use Expr Metadata for an annotated term, or Expr () for an unannotated one. To compare terms for equality, such as in unit tests, you can use the Functor instance to strip out the annotations, e.g. void expr1 == void expr2, where void is equivalent to fmap (\ _meta -> ()) here.
In a larger codebase, if there’s a lot of code depending on a data type and you really want to avoid updating it all at once, you can wrap the old type in a module that exports a pattern synonym for each of the old constructors. This lets you gradually update the old code before deleting the adapter module.
Culturally, it’s typical in a self-contained Haskell codebase to simply make breaking changes, and let the compiler tell you everywhere that needs to be updated, since it’s so easy to do extensive refactoring with high assurance that it’s correct. We’re more concerned with backward compatibility when it comes to published library code, since that actually affects other people.

Type design for the AST of my language remembering token locations

I wrote a parser and evaluator for a simple programming language. Here is a simplified version of the types for the AST:
data Value = IntV Int | FloatV Float | BoolV Bool
data Expr = IfE Value [Expr] | VarDefE String Value
type Program = [Expr]
I want error messages to tell the line and column of the source code in which the error occured. For example, if the value in an If expression is not a boolean, I want the evaluator to show an error saying "expected boolean at line x, column y", with x and y referring to the location of the value.
So, what I need to do is redefine the previous types so that they can store the relevant locations of different things. One option would be to add a location to each constructor for expressions, like so:
type Location = (Int, Int)
data Expr = IfE Value [Expr] Location | VarDef String Value Location
This clearly isn't optimal, because I have to add those Location fields to every possible expression, and if for example a value contained other values, I would need to add locations to that value too:
{-
this would turn into FunctionCall String [Value] [Location],
with one location for each value in the function call
-}
data Value = ... | FunctionCall String [Value]
I came up with another solution, which allows me to add locations to everything:
data Located a = Located Location a
type LocatedExpr = Located Expr
type LocatedValue = Located Value
data Value = IntV Int | FloatV Float | BoolV Bool | FunctionCall String [LocatedValue]
data Expr = IfE LocatedValue [LocatedExpr] | VarDef String LocatedValue
data Program = [LocatedExpr]
However I don't like this that much. First of all, it clutters the definition of the evaluator and pattern matching has an extra layer every time. Also, I don't think saying that a function call takes located values as arguments is quite right. Function calls should take values as arguments, and locations should be metadata that doesn't interfere with the evaluator.
I need help redefining my types so that the solution is as clean as possible. Maybe there is a language extension or a design pattern I don't know about that could be helpful.
There are many ways to annotate an AST! This is half of what’s known as the AST typing problem, the other half being how you manage an AST that changes over the course of compilation. The problem isn’t exactly “solved”: all of the solutions have tradeoffs, and which one to pick depends on your expected use cases. I’ll go over a few that you might like to investigate at the end.
Whichever method you choose for organising the actual data types, if it makes pattern-matching ugly or unwieldy, the natural solution is PatternSynonyms.
Considering your first example:
{-# Language PatternSynonyms #-}
type Location = (Int, Int)
data Expr
= LocatedIf Value [Expr] Location
| LocatedVarDef String Value Location
-- Unidirectional pattern synonyms which ignore the location:
pattern If :: Value -> [Expr] -> Expr
pattern If val exprs <- LocatedIf val exprs _loc
pattern VarDef :: String -> Value -> Expr
pattern VarDef name expr <- LocatedVarDef name expr _loc
-- Inform GHC that matching ‘If’ and ‘VarDef’ is just as good
-- as matching ‘LocatedIf’ and ‘LocatedVarDef’.
{-# Complete If, VarDef #-}
This may be sufficiently tidy for your purposes already. But here are a few more tips that I find helpful.
Put annotations first: when adding an annotation type to an AST directly, I often prefer to place it as the first parameter of each constructor, so that it can be conveniently partially applied.
data LocatedExpr
= LocatedIf Location Value [Expr]
| LocatedVarDef Location String Value
If the annotation is a location, then this also makes it more convenient to obtain when writing certain kinds of parsers, along the lines of AnnotatedIf <$> (getSourceLocation <* ifKeyword) <*> value <*> many expr in a parser combinator library.
Parameterise your annotations: I often make the annotation type into a type parameter, so that GHC can derive some useful classes for me:
{-# Language
DeriveFoldable,
DeriveFunctor,
DeriveTraversable #-}
data AnnotatedExpr a
= AnnotatedIf a Value [Expr]
| AnnotatedVarDef a String Value
deriving (Functor, Foldable, Traversable)
type LocatedExpr = AnnotatedExpr Location
-- Get the annotation of an expression.
-- (Total as long as every constructor is annotated.)
exprAnnotation :: AnnotatedExpr a -> a
exprAnnotation = head
-- Update annotations purely.
mapAnnotations
:: (a -> b)
-> AnnotatedExpr a -> AnnotatedExpr b
mapAnnotations = fmap
-- traverse, foldMap, &c.
If you want “doesn’t interfere”, use polymorphism: you can enforce that the evaluator can’t inspect the annotation type by being polymorphic over it. Pattern synonyms still let you match on these expressions conveniently:
pattern If :: Value -> [AnnotatedExpr a] -> AnnotatedExpr a
pattern If val exprs <- AnnotatedIf _anno val exprs
-- …
eval :: AnnotatedExpr a -> Value
eval expr = case expr of
If val exprs -> -- …
VarDef name expr -> -- …
Unannotated terms aren’t your enemy: a term without source locations is no good for error reporting, but I think it’s still a good idea to make the pattern synonyms bidirectional for the convenience of constructing unannotated terms with a unit () annotation. (Or something equivalent, if you use e.g. Maybe Location as the annotation type.)
The reason is that this is quite convenient for writing unit tests, where you want to check the output, but want to use Eq instead of pattern matching, and don’t want to have to compare all the source locations in tests that aren’t concerned with them. Using the derived classes, void :: (Functor f) => f a -> f () strips out all the annotations on an AST.
import Control.Monad (void)
type BareExpr = AnnotatedExpr ()
-- One way to define bidirectional synonyms, so e.g.
-- ‘If’ can be used as either a pattern or a constructor.
pattern If :: Value -> [BareExpr] -> BareExpr
pattern If val exprs = AnnotatedIf () val exprs
-- …
stripAnnotations :: AnnotatedExpr a -> BareExpr
stripAnnotations = void
Equivalently, you could use GADTs / ExistentialQuantification to say data AnyExpr where { AnyExpr :: AnnotatedExpr a -> AnyExpr } / data AnyExpr = forall a. AnyExpr (AnnotatedExpr a); that way, the annotations have exactly as much information as (), but you don’t need to fmap over the entire tree with void in order to strip it, just apply the AnyExpr constructor to hide the type.
Finally, here are some brief introductions to a few AST typing solutions.
Annotate each AST node with a tag (e.g. a unique ID), then store all metadata like source locations, types, and whatever else, separately from the AST:
import Data.IntMap (IntMap)
-- More sophisticated/stronglier-typed tags are possible.
newtype Tag = Tag Int
newtype TagMap a = TagMap (IntMap a)
data Expr
= If !Tag Value [Expr]
| VarDef !Tag String Expr
type Span = (Location, Location)
type SourceMap = TagMap Span
type CommentMap = TagMap (Span, String)
parse
:: String -- Input
-> Either ParseError
( Expr -- Parsed expression
, SourceMap -- Source locations of tags
, CommentMap -- Sideband for comments
-- …
)
The advantage is that you can very easily mix in arbitrary new types of annotations anywhere, without affecting the AST itself, and avoid rewriting the AST just to change annotations. You can think of the tree and annotation tables as a kind of database, where the tags are the “foreign keys” relating them. A downside is that you must be careful to maintain these tags when you do rewrite the AST.
I don’t know if this approach has an established name; I think of it as just “tagging” or a “tagged AST”.
recursion-schemes and/or Data Types à la CartePDF: separate out the “recursive” part of an annotated expression tree from the “annotation” part, and use Fix to tie them back together, with Compose (or Cofree) to add annotations in the middle.
data ExprF e
= IfF Value [e]
| VarDefF String e
-- …
deriving (Foldable, Functor, Traversable, …)
-- Unannotated: Expr ~ ExprF (ExprF (ExprF (…)))
type Expr = Fix ExprF
-- With a location at each recursive step:
--
-- LocatedExpr ~ Located (ExprF (Located (ExprF (…))))
type LocatedExpr = Fix (Compose Located ExprF)
data Located a = Located Location a
deriving (Foldable, Functor, Traversable, …)
-- or: type Located = (,) Location
A distinct advantage is that you get a bunch of nice traversal stuff like cata for free-ish, so you can avoid having to write manual traversals over your AST over and over. A downside is that it adds some pattern clutter to clean up, as does the “à la carte” approach, but they do offer a lot of flexibility.
Trees That GrowPDF is overkill for just source locations, but in a serious compiler it’s quite helpful. If you expect to have more than one annotation type (such as inferred types or other analysis results) or an AST that changes over time, then you add a type parameter for the “compilation phase” (parsed, renamed, typechecked, desugared, &c.) and select field types or enable & disable constructors based on that index.
A really unfortunate downside of this is that you often have to rewrite the tree even in places nothing has changed, because everything depends on the “phase”. An alternative that I use is to add one type parameter for each type of phase or annotation that can vary independently, e.g. data Expr annotation termVarName typeVarName, and abstract over that with type and pattern synonyms. This lets you update indices independently and still use classes like Functor and Bitraversable.

Arithmetic for values inside a maybe, inside a record or How to write the instance for (Num (Maybe a))

Given I have the following:
data Tick = Tick { _price :: Int } deriving (Eq, Ord)
type High = Maybe Tick
type Low = Maybe Tick
data Candle = Candle
{ _open :: Open
, _high :: High
, _low :: Low
, _close :: Close
} deriving (Eq, Ord)
makeLenses ''Candle
When I create a Candle:
candle = ...
Then I wish to use the calculate the difference between high and low:
axis = high - low
Results in:
• No instance for (Num High) arising from a use of ‘-’
• In the expression: high - low
In an equation for ‘axis’: axis = high - low
I have tried creating an instance like this:
instance Num High where
(-) (Just a) (Just b) = Just (a - b)
But that results in the error:
• Illegal instance declaration for ‘Num High’
(All instance types must be of the form (T t1 ... tn)
where T is not a synonym.
Use TypeSynonymInstances if you want to disable this.)
• In the instance declaration for ‘Num High’
So my question is, what should the instance for (Num High) look like to allow my calculation?
Arghhhh... no! Don't do that! Why do you think this is a good idea?
High is not a number type, because it also contains the non-number value Nothing&star;. (Even any such reasoning aside, it's basically never a good idea to define an instance for a type and class that are both in an established standard library, most certainly not base. Such an instance will surprise everybody who knows these libraries and suddenly runs into unexpected instances! Also, if the instance did make sense, the GHC people would probably have already defined it.)
What the Maybe does is, it forces you to think about the case where there is no number, i.e. where either upper or lower boundary aren't given. I don't know what behaviour you want for this case, but plausible would be to immediately make the axis-result Nothing too then. This is readily accomplished using Maybe's Applicative instance:
axis :: Maybe Int
axis = (-) <$> candle^.high <*> candle^.low
...provided that Tick itself has a Num instance, which might not be very prudent either† but at least plausible. If it doesn't have a Num instance, you need to unwrap the Tick constructor:
axis = Tick <$> ((-) <$> (_price <$> candle^.high) <*> (_price <$> candle^.low)
That said, if you insist to define that instance (as you shouldn't), then it's easy enough to do. For one thing, you could just unravel the type synonym:
instance Num (Maybe Tick) where
Just (Tick a) - Just (Tick b) = Just . Tick $ a - b
_ - _ = Nothing
But that's not even necessary; as the compiler already hinted you can directly give instances via type synonyms, it's just not Haskell98. Just enable the extension
{-# LANGUAGE TypeSynonymInstances #-}
or
{-# LANGUAGE FlexibleInstances #-}
and then
instance Num High where
...
is accepted, for better or for worse.
&star;
Note that just using Applicative for defining a Num instance leads to trouble: you want - and + to be cancellative, i.e. (a - b) + b ≡ a. But with the straightforward a+b = (+)<$>a<*>b (which can also be written (+)=liftA2(+)) you would have
(Just 1 - Nothing) + Nothing = Nothing
‡ Just 1
...mind, much the same behaviour is already exhibited by the floating-point instances
Prelude> (1 - 1/0) + 1/0
NaN
but at least there it's understood that infinities and NaN are corner cases where everything gets a bit weird, whereas for a Maybe Int you'd expect the Nothing to be a pretty normal, “it's just not here” value.
†
A better instance is probably
import Data.AdditiveGroup
instance AdditiveGroup Tick
...which works without any explicit implementation, provided you add Generic to the data definition:
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
data Tick = Tick { _price :: Int } deriving (Eq, Ord)
With that defined, you can then do
axis = (^-^) <$> candle^.high <*> candle^.low

What does (..) mean?

I'm trying to learn Haskell.
I'm reading the code on here[1]. I just copy and past some part of the code from lines:46 and 298-300.
Question: What does (..) mean?
I Hoggled it but I got no result.
module Pos.Core.Types(
-- something here
SharedSeed (..) -- what does this mean?
) where
newtype SharedSeed = SharedSeed
{ getSharedSeed :: ByteString
} deriving (Show, Eq, Ord, Generic, NFData, Typeable)
[1] https://github.com/input-output-hk/cardano-sl/blob/master/core/Pos/Core/Types.hs
The syntax of import/export lists has not much to do with the syntax of Haskell itself. It's just a comma-separated listing of everything you want to export from your module. Now, there's a problem there because Haskell really has two languages with symbols that may have the same name. This is particularly common with newtypes like the one in your example: you have a type-level name SharedSeed :: *, and also a value-level name (data constructor) SharedSeed :: ByteString -> SharedSeed.
This only happens with uppercase names, because lowercase at type level are always local type variables. Thus the convention in export lists that uppercase names refer to types.
But just exporting the type does not allow users to actually construct values of that type. That's often prudent: not all internal-representation values might make legal values of the newtype (see Bartek's example), so then it's better to only export a safe smart constructor instead of the unsafe data constructor.
But other times, you do want to make the data constructor available, certainly for multi-constructor types like Maybe. To that end, export lists have three syntaxes you can use:
module Pos.Core.Types(
SharedSeed (SharedSeed) will export the constructor SharedSeed. In this case that's of course the only constructor anyway, but if there were other constructors they would not be exported with this syntax.
SharedSeed (..) will export all constructors. Again, in this case that's only SharedSeed, but for e.g. Maybe it would export both Nothing and Just.
pattern SharedSeed will export ShareSeed as a standalone pattern (independent of the export of the ShareSeed type). This requires the -XPatternSynonyms extension.
)
It means "export all constructors and record fields for this data type".
When writing a module export list, there's three4 ways you can export a data type:
module ModuleD(
D, -- just the type, no constructors
D(..), -- the type and all its constructors
D(DA) -- the type and the specific constructor
) where
data D = DA A | DB B
If you don't export any constructors, the type, well, can't be constructed, at least directly. This is useful if you e.g. want to enforce some runtime invariants on the data type:
module Even (evenInt, toInt) where
newtype EvenInt = EvenInt Int deriving (Show, Eq)
evenInt :: Int -> Maybe EvenInt
evenInt x = if x `mod` 2 == 0 then Just x else Nothing
toInt :: EvenInt -> Int
toInt (EvenInt x) = x
The caller code can now use this type, but only in the allowed manner:
x = evenInt 2
putStrLn $ if isJust x then show . toInt . fromJust $ x else "Not even!"
As a side note, toInt is usually implemented indirectly via the record syntax for convenience:
data EvenInt = EvenInt { toInt :: Int }
4 See #leftaroundabout's answer

How can I make my type an instance of Arbitrary?

I have the following data and function
data Foo = A | B deriving (Show)
foolist :: Maybe Foo -> [Foo]
foolist Nothing = [A]
foolist (Just x) = [x]
prop_foolist x = (length (foolist x)) == 1
when running quickCheck prop_foolist, ghc tells me that Foo needs to be an instance of Arbitrary.
No instance for (Arbitrary Foo) arising from a use of ‘quickCheck’
In the expression: quickCheck prop_foolist
In an equation for ‘it’: it = quickCheck prop_foolist
I tried data Foo = A | B deriving (Show, Arbitrary), but this results in
Can't make a derived instance of ‘Arbitrary Foo’:
‘Arbitrary’ is not a derivable class
Try enabling DeriveAnyClass
In the data declaration for ‘Foo’
However, I can't figure out how to enble DeriveAnyClass. I just wanted to use quickcheck with my simple function! The possible values of x is Nothing, Just A and Just B. Surely this should be possible to test?
There are two reasonable approaches:
Reuse an existing instance
If there's another instance that looks similar, you can use it. The Gen type is an instance of Functor, Applicative, and even Monad, so you can easily build generators from other ones. This is probably the most important general technique for writing Arbitrary instances. Most complex instances will be built up from one or more simpler ones.
boolToFoo :: Bool -> Foo
boolToFoo False = A
boolToFoo True = B
instance Arbitrary Foo where
arbitrary = boolToFoo <$> arbitrary
In this case, Foo can't be "shrunk" to subparts in any meaningful way, so the default trivial implementation of shrink will work fine. If it were a more interesting type, you could have used some analogue of
shrink = map boolToFoo . shrink . fooToBool
Use the pieces available in Test.QuickCheck.Arbitrary and/or Test.QuickCheck.Gen
In this case, it's pretty easy to just put together the pieces:
import Test.QuickCheck.Arbitrary
data Foo = A | B
deriving (Show,Enum,Bounded)
instance Arbitrary Foo where
arbitrary = arbitraryBoundedEnum
As mentioned, the default shrink implementation would be fine in this case. In the case of a recursive type, you'd likely want to add
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
and then derive Generic for your type and use
instance Arbitrary ... where
...
shrink = genericShrink
As the documentation warns, genericShrink does not respect any internal validity conditions you may wish to impose, so some care may be required in some cases.
You asked about DeriveAnyClass. If you wanted that, you'd add
{-# LANGUAGE DeriveAnyClass #-}
to the top of your file. But you don't want that. You certainly don't want it here, anyway. It only works for classes that have a full complement of defaults based on Generics, typically using the DefaultSignatures extension. In this case, there is no default arbitrary :: Generic a => Gen a line in the Arbitrary class definition, and arbitrary is mandatory. So an instance of Arbitrary produced by DeriveAnyClass will produce a runtime error as soon as QuickCheck tries to call its arbitrary method.

Resources