How can I use Template Haskell to build structures polymorphically? - haskell

I can write an instance
-- In Data.Sequence.Internal
instance Lift a => Lift (Seq a) where
...
letting users lift fully realized sequences into splices. But suppose I want something a bit different, to build functions for creating sequences?
sequenceCode :: Quote m => Seq (Code m a) -> Code m (Seq a)
sequenceCode = ???
The idea would be that I'd be able to write something like
triple :: a -> a -> a -> Seq a
triple a b c = $$(sequenceCode (fromList [[|| a ||], [|| b ||], [|| c ||]]))
and have that function build its sequence directly with the underlying sequence constructors, rather than having to build and convert a list at run-time.
It's not very hard to write something like sequenceCode directly for sequences, using their internals (look below the jump). But, as the name suggests, sequenceCode looks a lot like sequence. Is there a way to generalize it? A moment's reflection shows that Traversable is insufficient. Would it be possible to do something with the Generic1 class in staged generics? I made a few attempts, but I don't understand that package well enough to know the right place to start. Would it be possible even just using plain old GHC generics? I'm beginning to suspect so, but I haven't tried yet and it will surely be hairy.
Here's the code for a Data.Sequence version:
{-# language TemplateHaskellQuotes #-}
import Data.Sequence.Internal
import qualified Language.Haskell.TH.Syntax as TH
class Functor t => SequenceCode t where
traverseCode :: TH.Quote m => (a -> TH.Code m b) -> t a -> TH.Code m (t b)
traverseCode f = sequenceCode . fmap f
sequenceCode :: TH.Quote m => t (TH.Code m a) -> TH.Code m (t a)
sequenceCode = traverseCode id
instance SequenceCode Seq where
sequenceCode (Seq t) = [|| Seq $$(traverseCode sequenceCode t) ||]
instance SequenceCode Elem where
sequenceCode (Elem t) = [|| Elem $$t ||]
instance SequenceCode FingerTree where
sequenceCode (Deep s pr m sf) =
[|| Deep s $$(sequenceCode pr) $$(traverseCode sequenceCode m) $$(sequenceCode sf) ||]
sequenceCode (Single a) = [|| Single $$a ||]
sequenceCode EmptyT = [|| EmptyT ||]
instance SequenceCode Digit where
sequenceCode (One a) = [|| One $$a ||]
sequenceCode (Two a b) = [|| Two $$a $$b ||]
sequenceCode (Three a b c) = [|| Three $$a $$b $$c ||]
sequenceCode (Four a b c d) = [|| Four $$a $$b $$c $$d ||]
instance SequenceCode Node where
sequenceCode (Node2 s x y) = [|| Node2 s $$x $$y ||]
sequenceCode (Node3 s x y z) = [|| Node3 s $$x $$y $$z ||]
Then in another module, we can define triple as above:
triple :: a -> a -> a -> Seq a
triple a b c = $$(sequenceCode (fromList [[|| a ||], [|| b ||], [|| c ||]]))
When I compile this with -ddump-splices (or -ddump-ds), I can verify that the sequence is built directly rather than using fromList.

I've uploaded a package that does this.
It turns out that GHC.Generics is sufficient. However, I will actually use linear-generics instead, because it has a more general version of Generic1. The idea is that by examining the generic representation of a value, we can build up all the information we need to produce a Template Haskell code for it. It's all quite low-level! First, some throat-clearing:
{-# language TemplateHaskellQuotes #-}
{-# language FlexibleContexts #-}
{-# language FlexibleInstances #-}
{-# language ScopedTypeVariables #-}
{-# language DataKinds #-}
{-# language TypeOperators #-}
{-# language EmptyCase #-}
{-# language DefaultSignatures #-}
module Language.Haskell.TH.TraverseCode
( TraverseCode (..)
, sequenceCode
, genericTraverseCode
, genericSequenceCode
) where
import Generics.Linear
import Language.Haskell.TH.Syntax
(Code, Lift (..), Exp (..), Quote, Name)
import qualified Language.Haskell.TH.Syntax as TH
import Language.Haskell.TH.Lib (conE)
import Data.Kind (Type)
-- for instances
import qualified Data.Functor.Product as FProd
import qualified Data.Functor.Sum as FSum
import Data.Functor.Identity
import qualified Data.Sequence.Internal as Seq
import Data.Coerce
Now we'll get into the meat of things:
class TraverseCode t where
traverseCode :: Quote m => (a -> Code m b) -> t a -> Code m (t b)
default traverseCode
:: (Quote m, GTraverseCode (Rep1 t), Generic1 t)
=> (a -> Code m b) -> t a -> Code m (t b)
traverseCode = genericTraverseCode
sequenceCode
:: (TraverseCode t, Quote m)
=> t (Code m a) -> Code m (t a)
sequenceCode = traverseCode id
genericSequenceCode
:: (Quote m, GTraverseCode (Rep1 t), Generic1 t)
=> t (Code m a) -> Code m (t a)
genericSequenceCode = TH.unsafeCodeCoerce . gtraverseCode id . from1
genericTraverseCode
:: (Quote m, GTraverseCode (Rep1 t), Generic1 t)
=> (a -> Code m b) -> t a -> Code m (t b)
genericTraverseCode f = TH.unsafeCodeCoerce . gtraverseCode f . from1
class GTraverseCode f where
gtraverseCode :: Quote m => (a -> Code m b) -> f a -> m Exp
Why do we use untyped Template Haskell here? Simple: it's pretty easy to build the expressions we need, but working out how to make types useful for the sub-expressions would be tricky. So then, of course, we need generic instances. We'll work our way down step by step, from the outside in, gathering info along the way.
First we look at the type stuff:
instance (Datatype c, GTraverseCodeCon f)
=> GTraverseCode (D1 c f) where
gtraverseCode f m#(M1 x) = gtraverseCodeCon pkg modl f x
where
pkg = packageName m
modl = moduleName m
This gets us the names GHC uses for the package and module.
Next we look at the constructor stuff:
class GTraverseCodeCon f where
gtraverseCodeCon :: Quote m => String -> String -> (a -> Code m b) -> f a -> m Exp
instance GTraverseCodeCon V1 where
gtraverseCodeCon _pkg _modl _f x = case x of
instance (GTraverseCodeCon f, GTraverseCodeCon g)
=> GTraverseCodeCon (f :+: g) where
gtraverseCodeCon pkg modl f (L1 x) = gtraverseCodeCon pkg modl f x
gtraverseCodeCon pkg modl f (R1 y) = gtraverseCodeCon pkg modl f y
instance (Constructor c, GTraverseCodeFields f)
=> GTraverseCodeCon (C1 c f) where
gtraverseCodeCon pkg modl f m#(M1 x) = gtraverseCodeFields (conE conN) f x
where
conBase = conName m
conN :: Name
conN = TH.mkNameG_d pkg modl conBase
The interesting case is when we reach an actual constructor (C1). Here we grab the (unqualified) name of the constructor from the Constructor instance, and combine it with the package and module names to get the Template Haskell Name of the constructor, from which we can build an expression referring to it. This expression gets passed on down to the lowest level, where we deal with fields. The rest is basically a left fold over those fields.
class GTraverseCodeFields f where
gtraverseCodeFields :: Quote m => m Exp -> (a -> Code m b) -> f a -> m Exp
instance GTraverseCodeFields f => GTraverseCodeFields (S1 c f) where
gtraverseCodeFields c f (M1 x) = gtraverseCodeFields c f x
instance (GTraverseCodeFields f, GTraverseCodeFields g)
=> GTraverseCodeFields (f :*: g) where
gtraverseCodeFields c f (x :*: y) =
gtraverseCodeFields (gtraverseCodeFields c f x) f y
instance Lift p => GTraverseCodeFields (K1 i p) where
gtraverseCodeFields c _f (K1 x) = [| $c x |]
instance GTraverseCodeFields Par1 where
gtraverseCodeFields cc f (Par1 ca) =
[| $cc $(TH.unTypeCode (f ca)) |]
instance GTraverseCodeFields U1 where
gtraverseCodeFields cc _f U1 = cc
-- Note: this instance is *different* from the one that we'd
-- write if we were using GHC.Generics, because composition works
-- differently in Generics.Linear.
instance (GTraverseCodeFields f, TraverseCode g) => GTraverseCodeFields (f :.: g) where
gtraverseCodeFields cc f (Comp1 x) =
gtraverseCodeFields cc (traverseCode f) x
Now we can write all sorts of instances:
instance TraverseCode Maybe
instance TraverseCode Identity
instance TraverseCode []
instance TH.Lift a => TraverseCode (Either a)
instance TH.Lift a => TraverseCode ((,) a)
instance (TraverseCode f, TraverseCode g)
=> TraverseCode (FProd.Product f g)
instance (TraverseCode f, TraverseCode g)
=> TraverseCode (FSum.Sum f g)
instance TraverseCode V1
-- The Elem instance isn't needed for the Seq instance
instance TraverseCode Seq.Elem
instance TraverseCode Seq.Digit
instance TraverseCode Seq.Node
instance TraverseCode Seq.FingerTree
For the Seq instance I was after, we need to write something by hand, because Seq isn't an instance of Generic1 (and we don't want it to be). Additionally, we don't really want the derived instance. Using a bit of coercion magic, and knowing a little something about how zipWith and replicate work on sequences, we can minimize the size of the splice and the number of types GHC has to deal with once it's compiled to Core.
instance TraverseCode Seq.Seq where
-- Stick a single coercion on the outside, instead of having a bunch
-- of `Elem` constructors on the inside.
traverseCode f s = [|| coerceFT $$(traverseCode f ft') ||]
where
-- Use zipWith to make the tree representing the sequence
-- nice and shallow.
ft' = coerceSeq (Seq.zipWith (flip const) (Seq.replicate (Seq.length s) ()) s)
coerceFT :: Seq.FingerTree a -> Seq.Seq a
coerceFT = coerce
coerceSeq :: Seq.Seq a -> Seq.FingerTree a
coerceSeq = coerce

Related

How to test Monad instance for custom StateT?

I'm learning Monad Transformers, and one of the exercises asks to implement the Monad instance for StateT.
I want to test that my implementation admits to the Monad laws using the validity package, which is like the checkers package.
Problem is, my Arbitrary instance doesn't compile. I saw this question, but it doesn't quite do what I want because the test basically duplicates the implementation and doesn't check the laws.
There's also this question, but it's unanswered, and I've already figured out how to test Monad Transformers not involving functions (like MaybeT).
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE InstanceSigs #-}
module Ch11.MonadT (StT (..)) where
import Control.Monad.Trans.State (StateT (..))
newtype StT s m a = StT (s -> m (a, s))
deriving
(Functor, Applicative)
via StateT s m
instance (Monad m) => Monad (StT s m) where
return :: a -> StT s m a
return = pure
(>>=) :: StT s m a -> (a -> StT s m b) -> StT s m b
StT x >>= f = StT $ \s -> do
(k, s') <- x s
let StT y = f k
y s'
(>>) :: StT s m a -> StT s m b -> StT s m b
(>>) = (*>)
My test:
{-# LANGUAGE DerivingVia #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
{-# LANGUAGE TypeApplications #-}
module Ch11.MonadTSpec (spec) where
import Ch11.MonadT (StT (..))
import Test.Hspec
import Test.QuickCheck
import Test.Validity.Monad
spec :: Spec
spec = do
monadSpecOnArbitrary #(StTArbit Int [] Int)
-- create wrapper to avoid orphan instance error
newtype StTArbit s m a = StTArbit (StT s m a)
deriving
(Functor, Applicative, Monad)
instance (Arbitrary s, Function s, Arbitrary1 m, Arbitrary a) => Arbitrary (StTArbit s m a) where
arbitrary = do
f <- arbitrary :: Fun s (m (a, s))
StTArbit . StT <$> f
Error:
• Couldn't match type: (a0, s0)
with: s -> m (a, s)
Expected: Gen (s -> m (a, s))
Actual: Gen (a0, s0)
• In the second argument of ‘(<$>)’, namely ‘f’
In a stmt of a 'do' block: StTArbit . StT <$> f
OP here, this is what I ended up doing.
-- https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/explicit_forall.html
{-# LANGUAGE ExplicitForAll #-}
-- https://ghc.gitlab.haskell.org/ghc/doc/users_guide/exts/type_applications.html
{-# LANGUAGE TypeApplications #-}
module Ch11.MonadTSpec (spec) where
import Ch11.MonadT (StT (..), runStT)
import Data.Function as F
import Test.Hspec
import Test.Hspec.QuickCheck
import Test.QuickCheck
spec :: Spec
spec = do
describe "Monad (StT Int [])" $ do
describe "satisfies Monad laws" $ do
-- the types are in the same order as in `forall`
prop "right identity law" (prop_monadRightId #Int #Int #[])
prop "left identity law" (prop_monadLeftId #Int #Int #Int #[])
prop "associative law" (prop_monadAssoc #Int #Int #Int #Int #[])
{- HLINT ignore -}
{-
the types in `forall` are specified in the order of dependency.
since `m` needs `a` and `s`, those appear before `m` in the list.
-}
-- (x >>= return) == x
prop_monadRightId ::
forall a s m.
(Monad m, Eq (m (a, s)), Show (m (a, s))) =>
s ->
Fun s (m (a, s)) ->
Property
prop_monadRightId s f = ((===) `F.on` go) (m >>= return) m
where
m = StT $ applyFun f
go st = runStT st s
-- (return x >>= f) == (f x)
prop_monadLeftId ::
forall a b s m.
(Monad m, Eq (m (b, s)), Show (m (b, s))) =>
a ->
s ->
Fun (a, s) (m (b, s)) ->
Property
prop_monadLeftId a s f = ((===) `F.on` go) (return a >>= h) m
where
g = applyFun2 f
m = StT $ g a
h = StT . g
go st = runStT st s
-- ((x >>= f) >>= g) == (x >>= (\x' -> f x' >>= g))
prop_monadAssoc ::
forall a b c s m.
(Monad m, Eq (m (b, s)), Show (m (b, s)), Eq (m (c, s)), Show (m (c, s))) =>
s ->
Fun s (m (a, s)) ->
Fun (a, s) (m (b, s)) ->
Fun (b, s) (m (c, s)) ->
Property
prop_monadAssoc s h f g =
((===) `F.on` go)
((m >>= f') >>= g')
(m >>= (\x -> f' x >>= g'))
where
m = StT $ applyFun h
f' = StT . applyFun2 f
g' = StT . applyFun2 g
go st = runStT st s
I think you want pure, not (<$>). (But I haven't checked with my local compiler, so I'm not sure.) You probably also have to turn your Fun into an actual function.
arbitrary = do
f <- arbitrary
pure (StTArbit . StT . applyFun $ f)
I'd also point out that there's not much point to making a newtype here. I guess it avoids an orphan instance warning? But you've defined the type you're writing an instance for yourself, presumably even in the same package, so it seems pretty benign; if it's part of a separate cabal component that people can't depend on, like a test suite, even more so.

Making QualifiedDo and ApplicativeDo work together when nesting applicative functors

I want to define deeply nested compositions of applicative functors. For example something like this:
{-# LANGUAGE TypeOperators #-}
import Control.Monad.Trans.Cont
import Control.Arrow (Kleisli (..))
import Data.Aeson
import Data.Aeson.Types
import Data.Functor.Compose
import Data.Functor
type Configurator = Kleisli Parser Value
type Allocator = ContT () IO
type Validator = Either String
someConfigurator :: Configurator Int
someConfigurator = undefined
someAllocator :: Allocator Char
someAllocator = undefined
-- the nested functor composition. left-associated
type Phases = Configurator `Compose` Allocator `Compose` Validator
data Foo = Foo Int Char
-- I want to streamline writing this, without spamming the Compose constructor
fooPhases :: Phases Foo
fooPhases = _
To streamline the syntax for creating the fooPhases value, I though of (ab)using QualifiedDo:
module Bind where
import Data.Functor
import Data.Functor.Compose
(>>=) :: Functor f => f a -> (a -> g b) -> Compose f g b
(>>=) f k = bindPhase f k
(>>) :: Functor f => f a -> g b -> Compose f g b
(>>) f g = Compose $ f <&> \_ -> g
fail :: MonadFail m => String -> m a
fail = Prelude.fail
bindPhase :: Functor f => f a -> (a -> g b) -> Compose f g b
bindPhase f k = Compose (f <&> k)
Somewhat to my surprise, it worked:
{-# LANGUAGE QualifiedDo #-}
import qualified Bind
fooPhases :: Phases Foo
fooPhases = Bind.do
i <- someConfigurator
c <- someAllocator
pure (Foo i c)
Alas, when I add applicative-like functions to the Bind module
return :: Applicative f => a -> f a
return = Prelude.pure
pure :: Applicative f => a -> f a
pure = Prelude.pure
fmap :: Functor f => (a -> b) -> f a -> f b
fmap = Prelude.fmap
join :: f (g a) -> Compose f g a
join = Compose
(<*>) :: (Applicative f, Applicative g) => f (a -> b) -> g a -> Compose f g b
(<*>) f g = Compose $ f <&> \z -> Prelude.fmap (z $) g
and then enable ApplicativeDo in Main, I start to get errors like the following:
* Couldn't match type: Compose (Kleisli Parser Value) (ContT () IO)
with: Kleisli Parser Value
Expected: Configurator (Compose Allocator Validator Foo)
Actual: Compose
(Kleisli Parser Value)
(ContT () IO)
(Compose Allocator Validator Foo)
Is there a way to use my Bind.do when both QualifiedDo and ApplicativeDo are enabled in Main?
To make this easier to reason about, first manually desugar fooPhases each way:
fooPhasesMonad =
someConfigurator Bind.>>= \i ->
someAllocator Bind.>>= \c ->
pure (Foo i c)
fooPhasesApplicative = Bind.fmap Foo someConfigurator Bind.<*> someAllocator
If you check their types in GHCi, you'll see that fooPhasesMonad has the type you want (as expected, since it works), but fooPhasesApplicative has type (Configurator `Compose` Allocator) Foo.
The first problem is that Bind.fmap f m isn't equivalent to m Bind.>>= (pure . f). In particular, the latter produces an extra layer of Compose but the former does not. When you use ApplicativeDo, using the former instead means you end up with just (Configurator `Compose` Allocator) instead of (Configurator `Compose` Allocator `Compose` Validator), which is the cause of your type error. To fix it, replace your definition of Bind.fmap with this one:
fmap :: (Functor f, Applicative g) => (a -> b) -> f a -> Compose f g b
fmap f k = bindPhase k (Prelude.pure . f)
The "monads" of your do-notation fail all of the monad laws, though (even the types of the results can't be right), so some rewrites that you take for granted aren't still valid. In particular, you'll still get an error unless you settle for your types being composed like this instead:
type Phases = (Configurator `Compose` Validator) `Compose` Allocator

Could not deduce (Applicative f0) when Applicative f is given

I'm having trouble understanding why there are two independent f and f0 Applicative constraints in code below (requires reducers package).
import Data.Semigroup.Applicative
import Data.Semigroup.Reducer
import Data.Semigroup
-- | Foo
--
-- >>> getMax $ foo ["abc","a","hello",""]
-- 5
foo :: [String] -> (Max Int)
foo = foldReduce . map (length)
bar :: (Foldable f, Monoid m, Reducer e m) => f e -> m
bar = foldReduce
m :: Max Int
m = unit (2 :: Int)
apm :: (Applicative f) => Ap f (Max Int)
apm = unit $ pure (2 :: Int) -- ambiguous Applicative!
I think that I need to somehow tell that I want f0 ~ f where f0 is independently inferred by use of pure.
I tried to simplify:
u :: (Applicative f, Monoid m) => e -> Ap f m
u = undefined
m :: (Applicative f) => Ap f (Max Int)
m = u $ (pure (2 :: Int))
It will compile once e is changed to f e so that contexts can "unify". But I don't know how to unify with reducer context.
My goal is to foldReduce with Applicative Semigroup (if that's possible at all) so that length will be replaced with effectful version.
The standard solution is to use ScopedTypeVariables to announce that the f in the signature for apm and the f you want pure to produce are the same f. So:
{-# LANGUAGE ScopedTypeVariables #-}
import Data.Semigroup.Applicative
import Data.Semigroup.Reducer
import Data.Semigroup
apm :: forall f. Applicative f => Ap f (Max Int)
apm = unit (pure 2 :: f Int)
Don't lose the forall; it's a requirement of the extension to bring f into scope in the definition's body.

Is it possible to derive recursion principles generically?

In Idris, there's some magical machinery to automatically create (dependent) eliminators for user-defined types. I'm wondering if it's possible to do something (perhaps less dependent) with Haskell types. For instance, given
data Foo a = No | Yes a | Perhaps (Foo a)
I want to generate
foo :: b -> (a -> b) -> (b -> b) -> Foo a -> b
foo b _ _ No = b
foo _ f _ (Yes a) = f a
foo b f g (Perhaps c) = g (foo b f g x)
I'm pretty weak on polyvariadic functions and generics, so I could use a bit of help getting started.
Here's a start of doing this using GHC Generics. Adding some code to reassociate the (:+:) would make this nicer. A few more instances are required and this probably has ergonomic problems.
EDIT: Bah, I got lazy and fell back to a data family to get injectivity for my type equality dispatch. This mildly changes the interface. I suspect with enough trickery, and/or using injective type families this can be done without a data family or overlapping instances.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where
import Data.Function (fix)
import GHC.Generics
data Foo a = No | Yes | Perhaps (Foo a) | Extra a Int Bool
deriving (Show, Generic1)
data Bar a = Bar (Maybe a)
deriving (Show, Generic1)
gcata :: (GCata (f a) (Rep1 f a), Generic1 f) => Alg (f a) (Rep1 f a) r -> f a -> r
gcata f = fix(\w -> gcata' w f . from1)
ex' :: Show a => Foo a -> String
ex' = gcata (("No","Yes"),(\(Rec s) -> "Perhaps ("++s++")", \a i b -> "Extra ("++show a++") ("++show i++") ("++show b++")"))
ex1 = ex' (Perhaps (Perhaps Yes) :: Foo Int)
ex2 = ex' (Perhaps (Perhaps (Extra 'a' 2 True)) :: Foo Char)
ex3 :: Foo a -> Foo a
ex3 = gcata ((No, Yes), (Perhaps . unRec, Extra))
ex4 = gcata (\(K m) -> show m) (Bar (Just 3))
class GCata rec f where
type Alg (rec :: *) (f :: *) (r :: *) :: *
gcata' :: (rec -> r) -> Alg rec f r -> f -> r
instance (GCata rec (f p)) => GCata rec (M1 i c f p) where
type Alg rec (M1 i c f p) r = Alg rec (f p) r
gcata' w f (M1 x) = gcata' w f x
instance (GCata rec (f p), GCata rec (g p)) => GCata rec ((f :+: g) p) where
type Alg rec ((f :+: g) p) r = (Alg rec (f p) r, Alg rec (g p) r)
gcata' w (l,_) (L1 x) = gcata' w l x
gcata' w (_,r) (R1 x) = gcata' w r x
instance GCata rec (U1 p) where
type Alg rec (U1 p) r = r
gcata' _ f U1 = f
instance (Project rec (f p), GCata rec (g p)) => GCata rec ((f :*: g) p) where
type Alg rec ((f :*: g) p) r = Prj rec (f p) r -> Alg rec (g p) r
gcata' w f (x :*: y) = gcata' w (f (prj w x)) y
class Project rec f where
type Prj (rec :: *) (f :: *) (r :: *) :: *
prj :: (rec -> r) -> f -> Prj rec f r
instance (Project rec (f p)) => Project rec (M1 i c f p) where
type Prj rec (M1 i c f p) r = Prj rec (f p) r
prj w (M1 x) = prj w x
instance Project rec (K1 i c p) where
type Prj rec (K1 i c p) r = c
prj _ (K1 x) = x
instance (RecIfEq (TEq rec (f p)) rec (f p)) => Project rec (Rec1 f p) where
type Prj rec (Rec1 f p) r = Tgt (TEq rec (f p)) rec (f p) r
prj w (Rec1 x) = recIfEq w x
instance Project rec (Par1 p) where
type Prj rec (Par1 p) r = p
prj _ (Par1 x) = x
instance GCata rec (K1 i c p) where
type Alg rec (K1 i c p) r = c -> r
gcata' _ f (K1 x) = f x
instance GCata rec (Par1 p) where
type Alg rec (Par1 p) r = p -> r
gcata' _ f (Par1 x) = f x
instance (Project rec (Rec1 f p)) => GCata rec (Rec1 f p) where
type Alg rec (Rec1 f p) r = Prj rec (Rec1 f p) r -> r
gcata' w f = f . prj w
data HTrue; data HFalse
type family TEq x y where
TEq x x = HTrue
TEq x y = HFalse
class RecIfEq b rec t where
data Tgt b rec t r :: *
recIfEq :: (rec -> r) -> t -> Tgt b rec t r
instance RecIfEq HTrue rec rec where
newtype Tgt HTrue rec rec r = Rec { unRec :: r }
recIfEq w = Rec . w
instance RecIfEq HFalse rec t where
newtype Tgt HFalse rec t r = K { unK :: t }
recIfEq _ = K
As pigworker remarked in the question comments, using the default Generic representation leads to great ugliness, since we don't have prior information about recursion in our type, and we have to dig out recursive occurrences by manually checking for type equality. I'd like to present here alternative solutions with explicit f-algebra-style recursion. For this, we need an alternative generic Rep. Sadly, this means we can't easily tap into GHC.Generics, but I hope this will be edifying nonetheless.
In my first solution I aim for a presentation that is as simple as possible within current GHC capabilities. The second solution is a TypeApplication-heavy GHC 8-based one with more sophisticated types.
Starting out as usual:
{-# language
TypeOperators, DataKinds, PolyKinds,
RankNTypes, EmptyCase, ScopedTypeVariables,
DeriveFunctor, StandaloneDeriving, GADTs,
TypeFamilies, FlexibleContexts, FlexibleInstances #-}
My generic representation is a fixpoint of a sum-of-products. It slightly extends the basic model of generics-sop, which is also a sum-of-products but not functorial and therefore ill-equipped for recursive algorithms. I think SOP is overall a much better practical representation than arbitrarily nested types; you can find extended arguments as to why this is the case in the paper. In short, SOP removes unnecessary nesting information and lets us separate metadata from basic data.
But before anything else, we should decide on a code for generic types. In vanilla GHC.Generics there isn't a well-defined kind of codes, as the type constructors of sums, products etc. form an ad-hoc type-level grammar, and we can dispatch on them using type classes. We adhere more closely to usual presentations in dependently typed generics, and use explicit codes, interpretations and functions. Our codes shall be of kind:
[[Maybe *]]
The outer list encodes a sum of constructors, with each inner [Maybe *] encoding a constructor. A Just * is just a constructor field, while Nothing denotes a recursive field. For example, the code of [Int] is ['[], [Just Int, Nothing]].
type Rep a = Fix (SOP (Code a))
class Generic a where
type Code a :: [[Maybe *]]
to :: a -> Rep a
from :: Rep a -> a
data NP (ts :: [Maybe *]) (k :: *) where
Nil :: NP '[] k
(:>) :: t -> NP ts k -> NP (Just t ': ts) k
Rec :: k -> NP ts k -> NP (Nothing ': ts) k
infixr 5 :>
data SOP (code :: [[Maybe *]]) (k :: *) where
Z :: NP ts k -> SOP (ts ': code) k
S :: SOP code k -> SOP (ts ': code) k
Note that NP has different constructors for recursive and non-recursive fields. This is quite important, because we want codes to be unambiguously reflected in the type indices. In other words, we would like NP to also act as a singleton for [Maybe *] (although we remain parametric in * for good reasons).
We use a k parameter in the definitions to leave a hole for recursion. We set up recursion as usual, leaving the Functor instances to GHC:
deriving instance Functor (SOP code)
deriving instance Functor (NP code)
newtype Fix f = In {out :: f (Fix f)}
cata :: Functor f => (f a -> a) -> Fix f -> a
cata phi = go where go = phi . fmap go . out
We have two type families:
type family CurryNP (ts :: [Maybe *]) (r :: *) :: * where
CurryNP '[] r = r
CurryNP (Just t ': ts) r = t -> CurryNP ts r
CurryNP (Nothing ': ts) r = r -> CurryNP ts r
type family Alg (code :: [[Maybe *]]) (r :: *) :: * where
Alg '[] r = ()
Alg (ts ': tss) r = (CurryNP ts r, Alg tss r)
CurryNP ts r curries NP ts with result type r, and it also plugs in r in the recursive occurrences.
Alg code r computes the type of an algebra on SOP code r. It tuples together the eliminators for the individual constructors. Here we use plain nested tuples, but of course HList-s would be adequate too. We could also reuse NP here as a HList, but I find that too kludgy.
All that's left is to implement the functions:
uncurryNP :: CurryNP ts a -> NP ts a -> a
uncurryNP f Nil = f
uncurryNP f (x :> xs) = uncurryNP (f x) xs
uncurryNP f (Rec k xs) = uncurryNP (f k) xs
algSOP :: Alg code a -> SOP code a -> a
algSOP fs (Z np) = uncurryNP (fst fs) np
algSOP fs (S sop) = algSOP (snd fs) sop
gcata :: Generic a => Alg (Code a) r -> a -> r
gcata f = cata (algSOP f) . to
The key point here is that we have to convert the curried eliminators in Alg into a "proper" SOP code a -> a algebra, since that is the form that can be directly used in cata.
Let's define some sugar and instances:
(<:) :: a -> b -> (a, b)
(<:) = (,)
infixr 5 <:
instance Generic (Fix (SOP code)) where
type Code (Fix (SOP code)) = code
to = id
from = id
instance Generic [a] where
type Code [a] = ['[], [Just a, Nothing]]
to = foldr (\x xs -> In (S (Z (x :> Rec xs Nil)))) (In (Z Nil))
from = gcata ([] <: (:) <: ()) -- note the use of "Generic (Rep [a])"
Example:
> gcata (0 <: (+) <: ()) [0..10]
55
Full code.
However, it would be nicer if we had currying and didn't have to use HList-s or tuples to store eliminators. The most convenient way is to have the same order of arguments as in standard library folds, such as foldr or maybe. In this case the return type of gcata is given by a type family that computes from the generic code of a type.
type family CurryNP (ts :: [Maybe *]) (r :: *) :: * where
CurryNP '[] r = r
CurryNP (Just t ': ts) r = t -> CurryNP ts r
CurryNP (Nothing ': ts) r = r -> CurryNP ts r
type family Fold' code a r where
Fold' '[] a r = r
Fold' (ts ': tss) a r = CurryNP ts a -> Fold' tss a r
type Fold a r = Fold' (Code a) r (a -> r)
gcata :: forall a r. Generic a => Fold a r
This gcata is highly (fully) ambiguous. We need either explicit application or Proxy, and I opted for the former, incurring a GHC 8 dependence. However, once we supply an a type, the result type reduces, and we can easily curry:
> :t gcata #[_]
gcata #[_] :: Generic [t] => r -> (t -> r -> r) -> [t] -> r
> :t gcata #[_] 0
gcata #[_] 0 :: Num t1 => (t -> t1 -> t1) -> [t] -> t1
> gcata #[_] 0 (+) [0..10]
55
I used above a partial type signature in [_]. We can also create a shorthand for this:
gcata1 :: forall f a r. Generic (f a) => Fold (f a) r
gcata1 = gcata #(f a) #r
Which can be used as gcata1 #[].
I'd rather not elaborate the implementation of the above gcata here. It's not much longer than the simple version, but the gcata implementation is pretty hairy (embarrassingly, it's responsible for my delayed answer). Right now I couldn't explain it very well, since I wrote it with Agda aid, which entails plenty of automatic search and type tetris.
As has been said in the comments and other answers, it's best to start from a generic representation that has access to the recursive positions.
One library that works with such a representation is multirec (another is compdata):
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE GADTs, TypeFamilies, MultiParamTypeClasses, RankNTypes #-}
module FooFold where
import Generics.MultiRec.FoldAlgK
import Generics.MultiRec.TH
data Foo a = No | Yes a | Perhaps (Foo a)
data FooF :: * -> * -> * where
Foo :: FooF a (Foo a)
deriveAll ''FooF
foldFoo :: (r, (a -> r, r -> r)) -> Foo a -> r
foldFoo phi = fold (const phi) Foo
The FoldAlgK module provides a fold with a single result type and computes the algebra type as a nested pair. It would be relatively easy to additionally curry that. There are some other variants offered by the library.

Datatype-generic programming and the mysterious gdmXXX

I'm using datatype-generic programming for a class called Generic that contains a method called get. If my end user defines a type and forgets to add deriving Generic, and calls put, they will see an error message such as this:
No instance for (ALife.Creatur.Genetics.Code.BRGCWord8.GGene
(GHC.Generics.Rep ClassifierGene))
arising from a use of `ALife.Creatur.Genetics.Code.BRGCWord8.$gdmput'
I can tell users how to fix the error, but I am curious about this $gdmput. I assume it's a function or symbol that's automatically generated, but by what? Is it the use of the DefaultSignatures pragma, or the DeriveGeneric pragma? I read a few papers about datatype-generic programming, but did not see any reference to gdmXXX symbols.
Here's the definition of the Generic class.
{-# LANGUAGE TypeFamilies, FlexibleContexts, FlexibleInstances,
DefaultSignatures, DeriveGeneric, TypeOperators #-}
. . .
-- | A class representing anything which is represented in, and
-- determined by, an agent's genome.
-- This might include traits, parameters, "organs" (components of
-- agents), or even entire agents.
-- Instances of this class can be thought of as genes, i.e.,
-- instructions for building an agent.
class Genetic g where
-- | Writes a gene to a sequence.
put :: g -> Writer ()
default put :: (Generic g, GGenetic (Rep g)) => g -> Writer ()
put = gput . from
-- | Reads the next gene in a sequence.
get :: Reader (Either [String] g)
default get :: (Generic g, GGenetic (Rep g)) => Reader (Either [String] g)
get = do
a <- gget
return $ fmap to a
getWithDefault :: g -> Reader g
getWithDefault d = fmap (fromEither d) get
class GGenetic f where
gput :: f a -> Writer ()
gget :: Reader (Either [String] (f a))
-- | Unit: used for constructors without arguments
instance GGenetic U1 where
gput U1 = return ()
gget = return (Right U1)
-- | Constants, additional parameters and recursion of kind *
instance (GGenetic a, GGenetic b) => GGenetic (a :*: b) where
gput (a :*: b) = gput a >> gput b
gget = do
a <- gget
b <- gget
return $ (:*:) <$> a <*> b
-- | Meta-information (constructor names, etc.)
instance (GGenetic a, GGenetic b) => GGenetic (a :+: b) where
gput (L1 x) = putRawWord16 0 >> gput x
gput (R1 x) = putRawWord16 1 >> gput x
gget = do
a <- getRawWord16
case a of
Right x -> do
if even x -- Only care about the last bit
then fmap (fmap L1) gget
else fmap (fmap R1) gget
Left s -> return $ Left s
-- | Sums: encode choice between constructors
instance (GGenetic a) => GGenetic (M1 i c a) where
gput (M1 x) = gput x
gget = fmap (fmap M1) gget
-- | Products: encode multiple arguments to constructors
instance (Genetic a) => GGenetic (K1 i a) where
gput (K1 x) = put x
gget = do
a <- get
return $ fmap K1 a
The $gdm comes from DefaultSignatures. Here's a minimal example that produces a similar error message
{-# LANGUAGE DefaultSignatures #-}
data NoInstances = NoInstances
class Display a where
display :: a -> String
default display :: Show a => a -> String
display = show
instance Display NoInstances
The error message produced is
defaultsignatures.hs:11:10:
No instance for (Show NoInstances)
arising from a use of `Main.$gdmdisplay'
In the expression: Main.$gdmdisplay
In an equation for `display': display = Main.$gdmdisplay
In the instance declaration for `Display NoInstances'

Resources