Is it possible to re-implement deriving of Enum type class using GHC
generics?
At first, it looks easy:
data Foo -- representation without metadata (wrong):
= Foo -- L1 U1
| Bar -- R1 (L1 U1)
| Baz -- R1 (R1 (L1 U1))
| Quux -- R1 (R1 (R1 U1))
deriving (Show, Eq, Generic)
-- Rep Foo p = (U1 :+: (U1 :+: (U1 :+: U1))) p
instance Enum Foo where
toEnum = undefined -- FIXME
fromEnum = gfromEnum . from
class GEnum f where
gfromEnum :: f p -> Int
instance GEnum U1 where
gfromEnum U1 = 0
instance GEnum f => GEnum (M1 i t f) where
gfromEnum (M1 x) = gfromEnum x
instance (GEnum x, GEnum y) => GEnum (x :+: y) where
gfromEnum (L1 x) = gfromEnum x
gfromEnum (R1 y) = 1 + gfromEnum y
However, this is not going to work:
λ> fromEnum Foo
0
λ> fromEnum Bar
1
λ> fromEnum Baz
1
λ> fromEnum Quux
2
This is because we can't rely on how arguments of (:+:) are grouped. In
this case it seems they are nested like this:
((U1 :+: U1) :+: (U1 :+: U1)) p
So, is it possible to derive Enum using Generics? If yes, how?
GHC derives Generic such that the L and R variants form a tree where the leaves are in Enum order. Consider the following example (with trimmed output):
ghci> data D = A | B | C | D | E deriving (Generic)
ghci> from A
L1 (L1 U1)
ghci> from B
L1 (R1 U1)
ghci> from C
R1 (L1 U1)
ghci> from D
R1 (R1 (L1 U1))
ghci> from E
R1 (R1 (R1 U1)))
Notice that if you arranged these as a tree, toEnum `map` [1..] is going to be the left to right traversal of the leaves. With that intuition, we'll start by defining a GLeaves class which counts the number of leaves that a generic type (not a value!) has in its tree.
{-# LANGUAGE ScopedTypeVariables, PolyKinds, TypeApplications, TypeOperators,
DefaultSignatures, FlexibleContexts, TypeFamilies #-}
import GHC.Generics
import Data.Proxy
class GLeaves f where
-- | Counts the number of "leaves" (i.e. U1's) in the type `f`
gSize :: Proxy f -> Int
instance GLeaves U1 where
gSize _ = 1
instance GLeaves x => GLeaves (M1 i t x) where
gSize _ = gSize (Proxy :: Proxy x)
instance (GLeaves x, GLeaves y) => GLeaves (x :+: y) where
gSize _ = gSize (Proxy :: Proxy x) + gSize (Proxy :: Proxy y)
Now, we are in shape to define GEnum. As is usual with this setup, we define our class Enum' and have default signatures that rely on GEnum.
class Enum' a where
toEnum' :: Int -> a
fromEnum' :: a -> Int
default toEnum' :: (Generic a, GEnum (Rep a)) => Int -> a
toEnum' = to . gToEnum
default fromEnum' :: (Generic a, GEnum (Rep a)) => a -> Int
fromEnum' = gFromEnum . from
class GEnum f where
gFromEnum :: f p -> Int
gToEnum :: Int -> f p
Finally, we get to the good stuff. For U1 and M1, gFromEnum and gToEnum are both straightforward. For :+:, gFromEnum needs to find all of the leaves to the left of it, so if it is the right subtree we add the size of the left subtree (and if it is the left subtree we add nothing). Similarly, gToEnum, checks whether it belong in the left or right subtree by checking if it is smaller than the number of leaves in the left subtree.
instance GEnum U1 where
gFromEnum U1 = 0
gToEnum n = if n == 0 then U1 else error "Outside enumeration range"
instance GEnum f => GEnum (M1 i t f) where
gFromEnum (M1 x) = gFromEnum x
gToEnum n = M1 (gToEnum n)
instance (GLeaves x, GEnum x, GEnum y) => GEnum (x :+: y) where
gFromEnum (L1 x) = gFromEnum x
gFromEnum (R1 y) = gSize (Proxy :: Proxy x) + gFromEnum y
gToEnum n = let s = gSize (Proxy :: Proxy x)
in if n < s then L1 (gToEnum n) else R1 (gToEnum (n - s))
Finally, you can test this in GHCi:
ghci> :set -XDeriveAnyClass -XDeriveGeneric
ghci> data D = A | B | C | D | E deriving (Show, Generic, Enum, Enum')
ghci> toEnum `map` [0 .. 4] :: [D]
[A,B,C,D,E]
ghci> toEnum' `map` [0 .. 4] :: [D]
[A,B,C,D,E]
ghci> fromEnum `map` [A .. E] :: [Int]
[A,B,C,D,E]
ghci> fromEnum' `map` [A .. E] :: [Int]
[A,B,C,D,E]
Performance
You may be thinking to yourself: this is super inefficient! We end up recalculating a bunch of sizes over and over - the worst case performance is at least O(n^2). The catch is that (hopefully), GHC will be able to optimize/inline the hell out of our specific Enum' instances until there is nothing left of the initial Generic structure.
Enum is one of many examples that are slightly awkward to write using the standard GHC Generics representation, because a lot of the structure of datatypes is left implicit (e.g. how sum and product constructors are nested, and where metadata occurs).
With generics-sop, you can (re-)define generic Enum instances in a slightly more straight-forward way:
{-# LANGUAGE ConstraintKinds, DataKinds, DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts, GADTs, PolyKinds #-}
{-# LANGUAGE TypeApplications, TypeOperators #-}
import Generics.SOP
import qualified GHC.Generics as GHC
We define a type synonym that captures what it means to be an enumeration type:
type IsEnumType a = All ((~) '[]) (Code a)
(Unfortunately, the (~) '[] construction triggers a bug in GHC 8.0.1, but it works fine in GHC 8.0.2.) In generics-sop, the code of a datatype is a type-level list of lists. The outer list contains an element for each constructor, the inner lists contain the types of the constructor arguments. The IsEnumType constraint says that all of the inner lists have to be empty, which means that none of the constructors must have any arguments.
gfromEnum :: (Generic a, IsEnumType a) => a -> Int
gfromEnum = conIndex . unSOP . from
where
conIndex :: NS f xs -> Int
conIndex (Z _) = 0
conIndex (S i) = 1 + conIndex i
The function from turns a value into a sum-of-products representation, and unSOP strips the outer constructor. We then have a sum structure to traverse. The NS datatype representing n-ary sums has constructors Z and S that indicate exactly which constructor is being used, so we can simply traverse and count.
gtoEnum :: (Generic a, IsEnumType a) => Int -> a
gtoEnum i =
to (SOP (apInjs_NP (hcpure (Proxy # ((~) '[])) Nil) !! i))
Here, the apInjs_NP (hcpure ...) call produces the representations of
empty constructor applications for all constructors of the datatype. Unlike the gfromEnum function, this actually makes use of the IsEnumType constraint to be type correct (because we rely on the fact that none of the constructors take any arguments). We then selected the i-th constructor out of the list and turn it back from the generic representation to the actual type by applying first SOP and then to.
To apply it to your sample type, you have to instantiate it to both GHC's and generic-sop's Generic classes (or you can use TH for this, too):
data Foo = Foo | Bar | Baz | Quux
deriving (Show, Eq, GHC.Generic)
instance Generic Foo
Then you can test it:
GHCi> gfromEnum Baz
2
GHCi> gtoEnum 2 :: Foo
Baz
If you want, you can make gfromEnum and gtoEnum the default definitions for an Enum-like class, just as with GHC Generics.
Related
Are there examples of generics-sop that mimic SYB's everywhere/mkT behavior?
What I'm attempting to do, but not seeing how to do it successfully, is replace the everywhere (mkT fixupSymbol) in main with an equivalent Generics.SOP construction, i.e., use Generics.SOP to recurse into the product (I (AbsAddr value)) and replace it with (I (SymAddr label)).
I could pass the symbol table to gformatOperands, polluting the formatOperands signature. That seems suboptimal.
Without fixupSymbol, the output would look like:
LD B, 0x0000
LD C, 0x1234
CALL 0x4567
Resolving addresses to symbolic labels:
gensop % stack ghci
Using main module: 1. Package `gensop' component exe:gensop with main-is file: <...>/Main.hs
gensop-0.1: configure (exe)
Configuring gensop-0.1...
gensop-0.1: initial-build-steps (exe)
Configuring GHCi with the following packages: gensop
GHCi, version 8.6.3: http://www.haskell.org/ghc/ :? for help
[1 of 1] Compiling Main ( <...>/Main.hs, interpreted )
*Main> main
LD B, 0x0000
LD C, label1
CALL label2
*Main>
Cut down version of code:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE DeriveDataTypeable #-}
module Main where
import Data.Data
import Data.Foldable (foldl)
import Data.Word (Word8, Word16)
import Data.Text (Text)
import qualified Data.Text as T
import qualified Data.Text.IO as T
import Text.Printf
import Generics.SOP
import Generics.SOP.TH (deriveGeneric)
import Data.Generics.Aliases (mkT)
import Data.Generics.Schemes (everywhere)
import Data.HashMap.Strict (HashMap)
import qualified Data.HashMap.Strict as H
import Data.Sequence (Seq, (|>))
import qualified Data.Sequence as Seq
type Z80addr = Word16
type Z80word = Word8
class Z80operand x where
formatOperand :: x -> Text
main :: IO()
main = mapM_ T.putStrLn (foldl printIns Seq.empty $ everywhere (mkT fixupSymbol) insnSeq)
-- -------------------------------------------------^ Does this have a Generics.SOP equivalent?
where
printIns accum ins = accum |> T.concat ([mnemonic, gFormatOperands] <*> [ins])
mnemonic (LD _) = "LD "
mnemonic (CALL _) = "CALL "
-- Generics.SOP: Fairly straightforward
gFormatOperands {-elt-} =
T.intercalate ", " . hcollapse . hcmap disOperandProxy (mapIK formatOperand) . from {-elt-}
where
disOperandProxy = Proxy :: Proxy Z80operand
-- Translate an absolute address, generally hidden inside an instruction operand, into a symbolic address
-- if present in the symbol table.
fixupSymbol addr#(AbsAddr absAddr) = maybe addr SymAddr (absAddr `H.lookup` symtab)
fixupSymbol other = other
insnSeq :: Seq Z80instruction
insnSeq = Seq.singleton (LD (Reg8Imm B 0x0))
|> (LD (Reg8Indirect C (AbsAddr 0x1234)))
|> (CALL (AbsAddr 0x4567))
symtab :: HashMap Z80addr Text
symtab = H.fromList [ (0x1234, "label1"), (0x4567, "label2")]
-- | Symbolic and absolute addresses. Absolute addresses can be translated into symbolic
-- labels.
data SymAbsAddr = AbsAddr Z80addr | SymAddr Text
deriving (Eq, Ord, Typeable, Data)
data Z80reg8 = A | B | C
deriving (Eq, Ord, Typeable, Data)
-- | Cut down version of the Z80 instruction set
data Z80instruction = LD OperLD | CALL SymAbsAddr
deriving (Eq, Ord, Typeable, Data)
-- | Load operands
data OperLD = Reg8Imm Z80reg8 Z80word | Reg8Indirect Z80reg8 SymAbsAddr
deriving (Eq, Ord, Typeable, Data)
$(deriveGeneric ''SymAbsAddr)
$(deriveGeneric ''Z80reg8)
$(deriveGeneric ''Z80instruction)
$(deriveGeneric ''OperLD)
instance Z80operand Z80word where
formatOperand word = T.pack $ printf "0x%04x" word
instance Z80operand SymAbsAddr where
formatOperand (AbsAddr addr) = T.pack $ printf "0x04x" addr
formatOperand (SymAddr label) = label
instance Z80operand Z80reg8 where
formatOperand A = "A"
formatOperand B = "B"
formatOperand C = "C"
instance Z80operand OperLD where
formatOperand (Reg8Imm reg imm) = T.concat [formatOperand reg, ", ", formatOperand imm]
formatOperand (Reg8Indirect reg addr) = T.concat [formatOperand reg, ", ", formatOperand addr]
The gensop.cabal file:
cabal-version: >= 1.12
name: gensop
version: 0.1
build-type: Simple
author: scooter-me-fecit
description: No description.
license: GPL-3
executable gensop
default-language: Haskell2010
main-is: Main.hs
build-depends:
base,
containers,
bytestring,
generics-sop,
syb,
text,
unordered-containers
default-extensions:
OverloadedStrings,
FlexibleInstances
ghc-options: -Wall
generics-sop provides no equivalents for recursive traversal schemes, such as these functions. If you need to deal with recursion in this library, the possible solution is to implement them. Although, defining such functions in SOP is related to some difficulties because it has a core generic view on data that doesn't distinguish recursive nodes from leaves. Recursion in this setting can be managed using closed type families (CTF) and some type class machinery. Closed type families allow you:
to implement the type-safe cast, which is needed for defining mkT,
to resolve the cases of recursive and non-recursive nodes—different
instances of a type class—which otherwise overlap. (Another option
is using pragmas for overlapping instances, a recent GHC feature;
there is, however, some bias about overlapping instances in the
Haskell community, so this solution is often considered as
undesired.)
Using CTF for treating recursion has been described in an unpublished paper “Handling Recursion in Generic Programming Using Closed Type Families”, which uses the generics-sop library as a case study; it provides examples of defining recursive schemes in SOP.
SYB's everywhere supports families of mutually recursive datatypes. The following implementation allows to specify them as type-level lists.
{-# LANGUAGE DeriveGeneric, TypeFamilies, DataKinds,
TypeApplications, ScopedTypeVariables, MultiParamTypeClasses,
ConstraintKinds, FlexibleContexts, AllowAmbiguousTypes,
FlexibleInstances, UndecidableInstances,
UndecidableSuperClasses, TypeOperators, RankNTypes #-}
import Generics.SOP
import Generics.SOP.NS
import GHC.Exts (Constraint)
import Data.Type.Equality
type family Equal a x :: Bool where
Equal a a = 'True
Equal _ _ = 'False
class DecideEq (eq :: Bool) (a :: *) (b :: *) where
decideEq :: Maybe (b :~: a)
instance a ~ b => DecideEq True a b where
decideEq = Just Refl
instance DecideEq False a b where
decideEq = Nothing
type ProofCast a b = DecideEq (Equal a b) a b
castEq :: forall a b. ProofCast a b => b -> Maybe a
castEq t = (\d -> castWith d t) <$> decideEq #(Equal a b)
type Transform a b = (Generic a, Generic b, ProofCast a b, ProofCast b a)
mkT :: Transform a b => (a -> a) -> b -> b
mkT f x = maybe x id $ castEq =<< f <$> castEq x
type family In (a :: *) (fam :: [*]) :: Bool where
In a ([a] ': fam) = 'True
In [a] (a ': fam) = 'True
In a (a ': fam) = 'True
In a (_ ': fam) = In a fam
In _ '[] = 'False
class CaseEverywhere' (inFam :: Bool) (c :: * -> Constraint)
(fam :: [*]) (x :: *) (y :: *) where
caseEverywhere' :: (forall b . c b => b -> b) -> I x -> I y
instance c x => CaseEverywhere' 'False c fam x x where
caseEverywhere' f = I . f . unI
instance (c x, Everywhere x c fam) => CaseEverywhere' 'True c fam x x where
caseEverywhere' f = I . f . everywhere #fam #c f . unI
class CaseEverywhere' (In x fam) c fam x y => CaseEverywhere c fam x y
instance CaseEverywhere' (In x fam) c fam x y => CaseEverywhere c fam x y
caseEverywhere :: forall c fam x y . CaseEverywhere c fam x y
=> (forall b . c b => b -> b) -> I x -> I y
caseEverywhere = caseEverywhere' #(In x fam) #c #fam
type Everywhere a c fam =
(Generic a, AllZip2 (CaseEverywhere c fam) (Code a) (Code a))
everywhere :: forall fam c a . Everywhere a c fam
=> (forall b . c b => b -> b) -> a -> a
everywhere f = to . everywhere_SOP . from
where
everywhere_SOP = trans_SOP (Proxy #(CaseEverywhere c fam)) $
caseEverywhere #c #fam f
Usage
First, this can be examined with a small-scale example taken from the SYB paper. The implemented SOP-based everywhere, as compared to SYB's one, additionally takes two type arguments, passed through the explicit type application. The first one specifies a family of mutually recursive datatypes as a type list. The traversal will treat as recursive only those nodes whose types are specified in that list. The second argument is needed to provide a compiler with a ‘proof’ object for type-cast. The T synonym for the Transform constraint serves to allow its partial application.
data Company = C [Dept]
data Dept = D Name Manager [SubUnit]
data SubUnit = PU Employee | DU Dept
data Employee = E Person Salary
data Person = P Name Address
data Salary = S Float
type Manager = Employee
type Name = String
type Address = String
class Transform a b => T a b
instance Transform a b => T a b
type CompanyF = '[Company, Dept, SubUnit, Employee]
increase :: Float -> Company -> Company
increase k = everywhere #CompanyF #(T Salary) (mkT (incS k))
incS :: Float -> Salary -> Salary
incS k (Sal s) = Sal (s * (1 + k))
The defined everywhere / mkT functions are ready for using in your code, but it misses some Generic instances. To apply everywhere to insnSeq, you need a Generic (Seq Z80instruction) instance. Yet you can't obtain it, because the Data.Sequence module doesn't export the internal representation of it. A possible fix is applying fmap to the sequence. So now you can write:
{-# LANGUAGE TypeApplications #-}
...
type Z80 = '[SymAbsAddr, Z80reg8, Z80instruction, OperLD]
main :: IO()
main = mapM_ T.putStrLn (foldl printIns Seq.empty $
fmap (everywhere #Z80 #(T SymAbsAddr) (mkT fixupSymbol)) insnSeq)
You should provide the Generic instances for all types of nodes that this traverses, recursive and non-recursive. So next, this demands the Generic instances for Word8, Word16, and Text. While the Generic Text instance can be generated via deriveGeneric, the others can't, because of their special GHC representation. So you'll have to do it manually; this definition is straightforward:
$(deriveGeneric ''Text)
instance Generic Word8 where
type Code Word8 = '[ '[Word8]]
from x = SOP (Z (I x :* Nil))
to (SOP ((Z (I x :* Nil)))) = x
instance Generic Word16 where
type Code Word16 = '[ '[Word16]]
from x = SOP (Z (I x :* Nil))
to (SOP ((Z (I x :* Nil)))) = x
This code is boilerplate, but the newest GHC extension DerivingVia could nicely simplify this, reducing the second definition. Hopefully, this useful feature will be improved with possibilities for standalone deriving, so it will be possible to say instead:
deriving via Word8 instance Generic Word16
The whole code now works well, and main yields the expected result.
Given the type X = X Int Int, I want to define a function toX :: [String] -> X which constructs an X during runtime with generics.
This is easy when I just write it down like this:
toX :: [String] -> X
toX (x:[y]) = to (M1 (M1 (M1 (K1 $ read x) :*: (M1 (K1 $ read y)))))
But I don't know how to do it recursive (in case we have a lot more than two parameters). My first try was something like this:
toX xs = to (M1 (M1 (toX' xs)))
toX' (x:[]) = M1 (K1 x)
toX' (x:xs) = M1 (K1 x) :*: (toX' xs)
which (of course) fails with a type error. Looking at the type of (:*:) confuses me even more: (:*:) :: f p -> g p -> (:*:) f g p. I have absolutely no idea what this type is supposed to mean and how to proceed from here.
Any hints?
#!/usr/bin/env stack
{- stack --resolver lts-8.4 runghc-}
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics
data X = X Int Int deriving (Generic, Show)
main :: IO ()
main = do
print $ toXeasy ["2","4"]
-- print $ toX ["2","4"]
toXeasy :: [String] -> X
toXeasy (x:[y]) = to (M1 (M1 (M1 (K1 $ read x) :*: (M1 (K1 $ read y)))))
--toX :: [String] -> X
--toX xs = to (M1 (M1 (toX' xs)))
--toX' (x:[]) = M1 (K1 x)
--toX' (x:xs) = M1 (K1 x) :*: (toX' xs)
This defines a function readFields :: [String] -> Maybe X for any Generic data type X which has only one constructor (with at least one field).
readFields is defined using a generic version gReadFields which works with generic representations (i.e., types constructed using type constructors that appear in GHC.Generics: M1, (:*:), K1...).
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeOperators #-}
module A where
import GHC.Generics
import Control.Monad.Trans.State
import Text.Read
data X = X Int Int deriving (Generic, Show)
main = print (readFields ["14", "41"] :: Maybe X)
readFields :: (Generic a, GReadableFields (Rep a)) => [String] -> Maybe a
readFields xs = fmap to (evalStateT gReadFields xs)
class GReadableFields f where
gReadFields :: StateT [String] Maybe (f p)
instance GReadableFields f => GReadableFields (M1 i c f) where
gReadFields = fmap M1 gReadFields
-- When your type is a large product, you cannot assume that
-- the generic product structure formed using `(:*:)` is list-
-- like (field1 :*: (field2 :*: (field3 ...)), so it is not
-- clear how to split the input list of strings to read each
-- component. For that reason we use `State`. Another possible way
-- is to compute the number of fields of the two operands `f` and `g`.
instance (GReadableFields f, GReadableFields g) => GReadableFields (f :*: g) where
gReadFields = do
f <- gReadFields
g <- gReadFields
return (f :*: g)
instance Read c => GReadableFields (K1 i c) where
gReadFields = StateT $ \(x : xs) -> do
c <- readMaybe x
return (K1 c, xs)
Just for fun, here is a way of achieving a similar results which does not use generics. The user has to provide a constructor (or a function), and a type class takes care of filling all its arguments with values read from the list of strings.
{-# LANGUAGE FlexibleInstances, TypeFamilies #-}
module A where
data X = X Int Int deriving Show
main = print (readFields X ["14", "41"])
type family Result a where
Result (a -> b) = Result b
Result a = a
class ReadableFields a where
readFields :: a -> [String] -> Maybe (Result a)
instance {-# OVERLAPPING #-} (ReadableFields b, Read a) => ReadableFields (a -> b) where
readFields f (x : xs) = do
a <- readMaybe x
readFields (f a) xs
readFields _ _ = Nothing
instance (Result a ~ a) => ReadableFields a where
readFields a _ = Just a
EDIT
That use of Generic is straightforward enough that the underlying pattern is packaged in one-liner.
{-# LANGUAGE FlexibleContexts #-}
import Generics.OneLiner
import Control.Monad.Trans.State
import Text.Read
Define an action to read a single field. It is important that there is an instance Applicative (StateT [String] Maybe) so that it can be composed.
-- Takes a string from the state and reads it out.
readM :: Read a => StateT [String] Maybe a
readM = StateT readM'
where
readM' (x : xs) | Just a <- readMaybe x = Just (a, xs)
readM' _ = Nothing
This is now a one-liner, using createA from the one-liner library.
readFields xs = evalStateT (createA (For :: For Read) readM) xs
main = print (readFields ["14", "42"] :: Maybe (Int, Int))
Here is a solution using generics-sop:
{-# LANGUAGE DataKinds, TypeFamilies, FlexibleContexts, TypeApplications #-}
{-# LANGUAGE TemplateHaskell #-}
module ReadFields where
import Data.Maybe
import Generics.SOP
import Generics.SOP.TH
readFields ::
(Generic a, Code a ~ '[ xs ], All Read xs) => [String] -> Maybe a
readFields xs =
to . SOP . Z . hcmap (Proxy #Read) (I . read . unK) <$> fromList xs
data X = X Int Int
deriving Show
deriveGeneric ''X
Testing:
GHCi> readFields #X ["3", "4"]
Just (X 3 4)
GHCi> readFields #X ["3"]
Nothing
So I've used syb for a long time, and often have functions like
friendlyNames :: Data a => a -> a
friendlyNames = everywhere (mkT (\(Name x _) -> Name x NameS))
What is the equivalent of this using GHC.Generics, assuming Generic a?
This might be the wrong problem to solve with GHC.Generics, but here's now you'd do it!
{-# Language TypeOperators #-}
{-# Language DeriveGeneric #-}
{-# Language DefaultSignatures #-}
{-# Language FlexibleContexts #-}
module Demo where
import GHC.Generics
import Language.Haskell.TH
import Language.Haskell.TH.Syntax
data Record = Record { field0 :: Int, field1 :: Maybe Record, field2 :: Name } deriving Generic
instance FriendlyNames Record -- body omitted and derived with GHC.Generics
instance FriendlyNames a => FriendlyNames (Maybe a)
instance FriendlyNames Int where friendlyNames = id -- no-op
------------------------------------------------------------------------
-- | Class for types that can be made friendly
class FriendlyNames a where
friendlyNames :: a -> a
default friendlyNames :: (GFriendlyNames (Rep a), Generic a) => a -> a
friendlyNames = to . gfriendlyNames . from
-- | Replaces the second component of a name with 'NameS'
instance FriendlyNames Name where
friendlyNames (Name x _) = Name x NameS
------------------------------------------------------------------------
-- | Class for generic structures that can have names made friendly
class GFriendlyNames f where
gfriendlyNames :: f p -> f p
-- | Case for metadata (type constructor, data constructor, field selector)
instance GFriendlyNames f => GFriendlyNames (M1 i c f) where
gfriendlyNames (M1 x) = M1 (gfriendlyNames x)
-- | Case for product types
instance (GFriendlyNames f, GFriendlyNames g) => GFriendlyNames (f :*: g) where
gfriendlyNames (x :*: y) = gfriendlyNames x :*: gfriendlyNames y
-- | Case for sum types
instance (GFriendlyNames f, GFriendlyNames g) => GFriendlyNames (f :+: g) where
gfriendlyNames (L1 x) = L1 (gfriendlyNames x)
gfriendlyNames (R1 y) = R1 (gfriendlyNames y)
-- | Case for datatypes without any data constructors (why not?)
instance GFriendlyNames V1 where
gfriendlyNames v1 = v1 `seq` error "gfriendlyNames.V1"
-- | Case for datatypes without any fields
instance GFriendlyNames U1 where
gfriendlyNames U1 = U1
-- | Case for data constructor fields
instance FriendlyNames a => GFriendlyNames (K1 i a) where
gfriendlyNames (K1 x) = K1 (friendlyNames x)
The GHC.Generics approach is more suited to situations where this kind of complexity can be written once and hidden away in a library. While the SYB approach relies on runtime checks, observe the GHC core that is generated for a friendlyNames that makes Record values friendly
-- RHS size: {terms: 14, types: 18, coercions: 0}
recordFriendlyNames
recordFriendlyNames =
\ w_s63w ->
case w_s63w of _ { Record ww1_s63z ww2_s63A ww3_s63B ->
case $recordFriendlyNames ww1_s63z ww2_s63A ww3_s63B
of _ { (# ww5_s63H, ww6_s63I, ww7_s63J #) ->
Record ww5_s63H ww6_s63I ww7_s63J
}
}
-- RHS size: {terms: 19, types: 19, coercions: 0}
$recordFriendlyNames
$recordFriendlyNames =
\ ww_s63z ww1_s63A ww2_s63B ->
(# ww_s63z,
case ww1_s63A of _ {
Nothing -> Nothing;
Just g1_a601 -> Just (recordFriendlyNames g1_a601)
},
case ww2_s63B of _ { Name x_a3Z3 ds_d5Z5 -> Name x_a3Z3 NameS } #)
Well, I finally have a satisfying answer to this question. The guts of it are taken from glguy's answer above, but I'll add some wrappers and explanation that helped me connect the dots. I will also make it more generic so it corresponds more closely with the tools provide by Data.Data.
The everywhere function will apply a function to every occurence of some Typeable type b within the argument value, which is represented as type a. The Typeable instance is used to determine when a ~ b during the recursion. Note that because everywhere is a method of class Everywhere and a default instance is provided, it will accept any type that satisfies the class constraints
{-# LANGUAGE UndecidableInstances #-}
import Data.Typeable (cast, Typeable)
import GHC.Generics
import Data.Ratio (Ratio)
import Data.Word (Word8)
class (Typeable b, Typeable a) => Everywhere b a where
everywhere :: (b -> b) -> a -> a
Here is the basic instance of Everywhere, it can be applied to any type which satisfies its constraints, in particular GEverywhere which is defined below for any instance of Generic. The OVERLAPPABLE lets us supply instances for additional types that are not instances of Generic.
instance {-# OVERLAPPABLE #-} (Typeable b, Typeable a, Generic a, GEverywhere b (Rep a))
=> Everywhere b a where
everywhere f = to . geverywhere f . from
Now we write a class GEverywhere which includes the instances that cover the type representation. Ultimately, the job of this code is to recurse on the values of the fields inside this value.
class GEverywhere b f where
geverywhere :: (b -> b) -> f p -> f p
instance GEverywhere b f => GEverywhere b (M1 i c f) where
geverywhere f (M1 x) = M1 (geverywhere f x)
instance (GEverywhere b f, GEverywhere b g) => GEverywhere b (f :*: g) where
geverywhere f (x :*: y) = geverywhere f x :*: geverywhere f y
instance (GEverywhere b f, GEverywhere b g) => GEverywhere b (f :+: g) where
geverywhere f (L1 x) = L1 (geverywhere f x)
geverywhere f (R1 y) = R1 (geverywhere f y)
instance GEverywhere b V1 where geverywhere _ v1 =
v1 `seq` error "geverywhere.V1"
instance GEverywhere b U1 where geverywhere _ U1 = U1
This final instance is where the subtype is encountered. We check whether it is the type we are looking for using the cast function from Data.Typeable:
instance Everywhere b a => GEverywhere b (K1 i a) where
geverywhere f (K1 x) =
case cast x :: Maybe b of
Nothing -> K1 (everywhere f x)
Just x' -> case cast (f x') :: Maybe a of
-- This should never happen - we got here because a ~ b
Nothing -> K1 (everywhere f x)
Just x'' -> K1 x''
Finally, there may be primitive types that occur within the types we are interested in that have no Generic instances.
instance (Typeable b, Typeable a) => Everywhere b (Ratio a) where everywhere _ r = r
instance (Typeable b) => Everywhere b Char where everywhere _ r = r
instance (Typeable b) => Everywhere b Integer where everywhere _ r = r
instance (Typeable b) => Everywhere b Word8 where everywhere _ r = r
instance (Typeable b) => Everywhere b Int where everywhere _ r = r
That's it, now we can use everywhere to do generic modification:
λ> everywhere (succ :: Char -> Char) ("abc", 123)
("bcd",123)
λ> everywhere #Int succ ("abc", 123 :: Int)
("abc",124)
I would like to write an implementation of
instance (GMySerialize a, GMySerialize b) => GMySerialize (a :+: b)
Where GMySerialize is defined as:
class GMySerialize f where
gtoMyS :: f a -> MySerialize
gfromMyS :: MySerialize -> Maybe (f a)
That will, for any sum type consisting solely of nullary data constructors (such as data MyType = A | B | C | D | E | f), convert it to and from MySerializeInt, where MySerializeInt is a constructor for MySerialize that takes one int parameter.
I started out with
instance (GMySerialize a, GMySerialize b) => GMySerialize (a :+: b) where
gtoMyS (L1 x) = MySerializeInt (0 + rest)
where rest = case gtoMyS x of
MySerializeInt n -> n
MySerializeNil -> 0
err -> error $ show err
gtoMyS (R1 x) = MySerializeInt (1 + rest)
where rest = case gtoMyS x of
MySerializeInt n -> n
MySerializeNil -> 0
err -> error $ show err
But realised that's horribly wrong, and am not sure how to fix it. How is it wrong? As an example, the following produce the same integer, but they should not as they represent different constructors:
M1 {unM1 = L1 (R1 (M1 {unM1 = U1}))}
M1 {unM1 = R1 (L1 (M1 {unM1 = U1}))}
I'm also unsure how I'd go about writing the gfromMyS instances even if I got gtoMyS working.
To phrase it another way, what I'm looking to do has an equivalent effect to writing a Template Haskell function that generates:
instance MySerialize t where
toMyS x = MySerializeInt (toEnum x)
fromMyS (MySerializeInt n) -> Just (fromEnum n)
fromMyS _ -> Nothing
For every single t where t is sum types with only nullary constructors that implement Enum.
The trick is to make another class that counts the number of constructors
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE DefaultSignatures #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
import Data.Functor ((<$>))
import Data.Tagged
import GHC.Generics
class GNumConstructors (f :: * -> *) where
-- Is this close enough to CAF to get memoed in the dictionary?
gnumConstructors :: Tagged f Int
instance GNumConstructors (M1 C c f) where
gnumConstructors = Tagged 1
instance (GNumConstructors a, GNumConstructors b) => GNumConstructors (a :+: b) where
gnumConstructors = Tagged $ unTagged (gnumConstructors :: Tagged a Int) + unTagged (gnumConstructors :: Tagged b Int)
Then you can easily divide up the integers between those on the left side (less than the number of possibilities on the left) and those on the right side (any larger numbers).
type MyS = Int
class GMySerialize f where
gtoMyS :: f a -> MyS
gfromMyS :: MyS -> Maybe (f a)
instance (GNumConstructors a, GMySerialize a, GMySerialize b) => GMySerialize (a :+: b) where
gtoMyS (L1 l) = gtoMyS l
gtoMyS (R1 r) = unTagged (gnumConstructors :: Tagged a Int) + gtoMyS r
gfromMyS x = if x < unTagged (gnumConstructors :: Tagged a Int)
then L1 <$> gfromMyS x
else R1 <$> gfromMyS (x - unTagged (gnumConstructors :: Tagged a Int))
Any individual constructor is represented by 0 and we peek straight through metadata.
instance GMySerialize U1 where
gtoMyS U1 = 0
gfromMyS 0 = Just U1
gfromMyS _ = Nothing
instance GMySerialize f => GMySerialize (M1 i c f) where
gtoMyS (M1 a) = gtoMyS a
gfromMyS ms = M1 <$> gfromMyS ms
Combined with a MySerialize class we can flesh out a complete example for MyType and test it
class MySerialize a where
toMyS :: a -> MyS
fromMyS :: MyS -> Maybe a
default toMyS :: (Generic a, GMySerialize (Rep a)) => a -> MyS
toMyS a = gtoMyS $ from a
default fromMyS :: (Generic a, GMySerialize (Rep a)) => MyS -> Maybe a
fromMyS a = to <$> gfromMyS a
data MyType = A | B | C | D | E | F
deriving (Generic, Show)
instance MySerialize MyType
main = do
print . map toMyS $ [A, B, C, D, E, F]
print . map (fromMyS :: MyS -> Maybe MyType) $ [-1, 0, 1, 2, 3, 4, 5, 6]
A through F are mapped to the numbers 0 through 5. Reading in those numbers reproduces A through F. Trying to read in a number outside that range produces Nothing.
[0,1,2,3,4,5]
[Nothing,Just A,Just B,Just C,Just D,Just E,Just F,Nothing]
I'm using datatype-generic programming for a class called Generic that contains a method called get. If my end user defines a type and forgets to add deriving Generic, and calls put, they will see an error message such as this:
No instance for (ALife.Creatur.Genetics.Code.BRGCWord8.GGene
(GHC.Generics.Rep ClassifierGene))
arising from a use of `ALife.Creatur.Genetics.Code.BRGCWord8.$gdmput'
I can tell users how to fix the error, but I am curious about this $gdmput. I assume it's a function or symbol that's automatically generated, but by what? Is it the use of the DefaultSignatures pragma, or the DeriveGeneric pragma? I read a few papers about datatype-generic programming, but did not see any reference to gdmXXX symbols.
Here's the definition of the Generic class.
{-# LANGUAGE TypeFamilies, FlexibleContexts, FlexibleInstances,
DefaultSignatures, DeriveGeneric, TypeOperators #-}
. . .
-- | A class representing anything which is represented in, and
-- determined by, an agent's genome.
-- This might include traits, parameters, "organs" (components of
-- agents), or even entire agents.
-- Instances of this class can be thought of as genes, i.e.,
-- instructions for building an agent.
class Genetic g where
-- | Writes a gene to a sequence.
put :: g -> Writer ()
default put :: (Generic g, GGenetic (Rep g)) => g -> Writer ()
put = gput . from
-- | Reads the next gene in a sequence.
get :: Reader (Either [String] g)
default get :: (Generic g, GGenetic (Rep g)) => Reader (Either [String] g)
get = do
a <- gget
return $ fmap to a
getWithDefault :: g -> Reader g
getWithDefault d = fmap (fromEither d) get
class GGenetic f where
gput :: f a -> Writer ()
gget :: Reader (Either [String] (f a))
-- | Unit: used for constructors without arguments
instance GGenetic U1 where
gput U1 = return ()
gget = return (Right U1)
-- | Constants, additional parameters and recursion of kind *
instance (GGenetic a, GGenetic b) => GGenetic (a :*: b) where
gput (a :*: b) = gput a >> gput b
gget = do
a <- gget
b <- gget
return $ (:*:) <$> a <*> b
-- | Meta-information (constructor names, etc.)
instance (GGenetic a, GGenetic b) => GGenetic (a :+: b) where
gput (L1 x) = putRawWord16 0 >> gput x
gput (R1 x) = putRawWord16 1 >> gput x
gget = do
a <- getRawWord16
case a of
Right x -> do
if even x -- Only care about the last bit
then fmap (fmap L1) gget
else fmap (fmap R1) gget
Left s -> return $ Left s
-- | Sums: encode choice between constructors
instance (GGenetic a) => GGenetic (M1 i c a) where
gput (M1 x) = gput x
gget = fmap (fmap M1) gget
-- | Products: encode multiple arguments to constructors
instance (Genetic a) => GGenetic (K1 i a) where
gput (K1 x) = put x
gget = do
a <- get
return $ fmap K1 a
The $gdm comes from DefaultSignatures. Here's a minimal example that produces a similar error message
{-# LANGUAGE DefaultSignatures #-}
data NoInstances = NoInstances
class Display a where
display :: a -> String
default display :: Show a => a -> String
display = show
instance Display NoInstances
The error message produced is
defaultsignatures.hs:11:10:
No instance for (Show NoInstances)
arising from a use of `Main.$gdmdisplay'
In the expression: Main.$gdmdisplay
In an equation for `display': display = Main.$gdmdisplay
In the instance declaration for `Display NoInstances'