Shortcut fusion for triemaps

Shortcut fusion for triemaps - haskell

This problem arose when attempting to fuse away intermediate triemaps in Haskell.
Consider the trie for Peano natural numbers:
data Nat = Zero | Succ Nat
data ExpoNat a = ExpoNat (Maybe a) (ExpoNat a)
| NoExpoNat
We can easily define a fold on ExpoNat (it is essentially a list) and use foldr/build (a.k.a. finally tagless) to fuse away intermediate occurrencess of ExpoNat:
{-# NOINLINE fold #-}
fold :: (Maybe a -> b -> b) -> b -> ExpoNat a -> b
fold f z (ExpoNat x y) = f x (fold f z y)
fold f z NoExpoNat = z
{-# NOINLINE build #-}
build :: (forall b. (Maybe a -> b -> b) -> b -> b) -> ExpoNat a
build f = f ExpoNat NoExpoNat
{-# RULES "fold/build" forall f n (g :: forall b. (Maybe a -> b -> b) -> b -> b). fold f n (build g) = g f n #-}
As an example, we take match and appl from "Is there a way to generalize this TrieMap code?" and compose them such that ExpoNat is fused away. (Note that we must "strengthen the induction hypothesis" in appl.)
{-# INLINE match #-}
match :: Nat -> ExpoNat ()
match n = build $ \f z ->
let go Zero = f (Just ()) z
go (Succ n) = f Nothing (go n)
in go n
{-# INLINE appl #-}
appl :: ExpoNat a -> (Nat -> Maybe a)
appl
= fold (\f z -> \n ->
case n of Zero -> f
Succ n' -> z n')
(\n -> Nothing)
applmatch :: Nat -> Nat -> Maybe ()
applmatch x = appl (match x)
The fusion can be verified by inspecting Core with -ddump-simpl.
Now we would like to do the same for Tree.
data Tree = Leaf | Node Tree Tree
data TreeMap a
= TreeMap {
tm_leaf :: Maybe a,
tm_node :: TreeMap (TreeMap a)
}
| EmptyTreeMap
We are in trouble: TreeMap is a non-regular data type, and so it is not obvious how to write its corresponding fold/build pair.
Haskell Programming with Nested Types: A Principled Approach seems to have the answer (see the Bush type) but 4:30 AM seems to be too late for me to get it working. How is one supposed to write hfmap? Have there been further developments since?
A similar variant of this question has been asked in What's the type of a catamorphism (fold) for non-regular recursive types?

I worked on it some more and I now have working fusion, without using the generic gadgets from the paper.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PolyKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DeriveFunctor #-}
{-# LANGUAGE DeriveFoldable #-}
{-# LANGUAGE DeriveTraversable #-}
module Tree where
data Tree = Leaf | Node Tree Tree
deriving (Show)
data ExpoTree a = ExpoTree (Maybe a) (ExpoTree (ExpoTree a))
| NoExpoTree
deriving (Show, Functor)
I derived most of the specialized types by taking the generic construction and then inlining type definitions until I bottomed out. I've kept the generic construction in here for ease of comparison.
data HExpoTree f a = HExpoTree (Maybe a) (f (f a))
| HNoExpoTree
type g ~> h = forall a. g a -> h a
class HFunctor f where
ffmap :: Functor g => (a -> b) -> f g a -> f g b
hfmap :: (Functor g, Functor h) => (g ~> h) -> (f g ~> f h)
instance HFunctor HExpoTree where
ffmap f HNoExpoTree = HNoExpoTree
ffmap f (HExpoTree x y) = HExpoTree (fmap f x) (fmap (fmap f) y)
hfmap f HNoExpoTree = HNoExpoTree
hfmap f (HExpoTree x y) = HExpoTree x (f (fmap f y))
type Alg f g = f g ~> g
newtype Mu f a = In { unIn :: f (Mu f) a }
instance HFunctor f => Functor (Mu f) where
fmap f (In r) = In (ffmap f r)
hfold :: (HFunctor f, Functor g) => Alg f g -> (Mu f ~> g)
hfold m (In u) = m (hfmap (hfold m) u)
An Alg ExpoTreeH g can be decomposed into a product of two natural transformations:
type ExpoTreeAlg g = forall a. Maybe a -> g (g a) -> g a
type NoExpoTreeAlg g = forall a. g a
{-# NOINLINE fold #-}
fold :: Functor g => ExpoTreeAlg g -> NoExpoTreeAlg g -> ExpoTree a -> g a
fold f z NoExpoTree = z
fold f z (ExpoTree x y) = f x (fold f z (fmap (fold f z) y))
The natural transformation here c ~> x is very interesting, and turns out to be quite necessary. Here's the build translation:
hbuild :: HFunctor f => (forall x. Alg f x -> (c ~> x)) -> (c ~> Mu f)
hbuild g = g In
newtype I :: (* -> *) where
I :: x -> I x
deriving (Show, Eq, Functor, Foldable, Traversable)
-- Needs to be a newtype, otherwise RULE firer gets bamboozled
newtype ExpoTreeBuilder c = ETP {runETP :: (forall x. Functor x
=> (forall a. Maybe a -> x (x a) -> x a)
-> (forall a. x a)
-> (forall a. c a -> x a)
)}
{-# NOINLINE build #-}
build :: ExpoTreeBuilder c -> forall a. c a -> ExpoTree a
build g = runETP g ExpoTree NoExpoTree
The newtype for the builder function is needed, because GHC 8.0 doesn't know how to fire the RULE without.
Now, the shortcut fusion rule:
{-# RULES "ExpoTree fold/build"
forall (g :: ExpoTreeBuilder c) c (f :: ExpoTreeAlg g) (n :: NoExpoTreeAlg g).
fold f n (build g c) = runETP g f n c #-}
Implementation of 'match' with 'build':
{-# INLINE match #-}
match :: Tree -> ExpoTree ()
match n = build (match_mk n) (I ())
where
match_mk :: Tree -> ExpoTreeBuilder I
match_mk Leaf = ETP $ \ f z (I c) -> f (Just c) z
match_mk (Node x y) = ETP $ \ f z c ->
-- NB: This fmap is bad for performance
f Nothing (fmap (const (runETP (match_mk y) f z c)) (runETP (match_mk x) f z c))
Implementation of 'appl' with 'fold' (we need to define a custom functor to define the return type.)
newtype PFunTree a = PFunTree { runPFunTree :: Tree -> Maybe a }
deriving (Functor)
{-# INLINE appl #-}
appl :: ExpoTree a -> PFunTree a
appl = fold appl_expoTree appl_noExpoTree
where
appl_expoTree :: ExpoTreeAlg PFunTree
appl_expoTree = \z f -> PFunTree $ \n ->
case n of Leaf -> z
Node n1 n2 -> runPFunTree f n1 >>= flip runPFunTree n2
appl_noExpoTree :: NoExpoTreeAlg PFunTree
appl_noExpoTree = PFunTree $ \n -> Nothing
Putting it all together:
applmatch :: Tree -> Tree -> Maybe ()
applmatch x = runPFunTree (appl (match x))
We can once again inspect the core with -ddump-simpl. Unfortunately, while we have successfully fused away the TrieMap data structure, we are left with suboptimal code due to the fmap in match. Eliminating this inefficiency is left to future work.

The paper appears to draw a parallel between ExpoNat a as a recursive Type and Tree as a recursive type constructor (Type -> Type).
newtype Fix f = Fix (f ( Fix f))
newtype HFix h a = HFix (h (HFix h) a)
Fix f represents the least fixed point of the endofunctor on the category of types and functions, f :: Type -> Type; HFix h represents the least fixed point of the endofunctor h on a category of functors and natural transformations, h :: (Type -> Type) -> (Type -> Type).
-- x ~ Fix (ExpoNatF a) ~ ExpoNat
data ExpoNatF a x = ExpoNatF (Maybe a) x | NoExpoNatF
fmap :: (x -> y) -> ExpoNatF a x -> ExpoNatF a y
fmap f (ExpoNatF u v) = ExpoNatF u (f v)
fmap _ NoExpoNatF = NoExpoNatF
-- f ~ HFix TreeMapH ~ TreeMap
data TreeMapH f a = TreeMapH (Maybe a) (f (f a)) | EmptyTreeMapH
hfmap :: (f ~> g) -> (TreeMapH f ~> TreeMapH g)
hfmap f (TreeMapH u v) = TreeMapH u ((fmap . fmap) f v)
hfmap _ EmptyTreeMapH = EmptyTreeMapH
-- (~>) is the type of natural transformations
type f ~> g = forall a. f a -> g a
Endofunctors give rise to algebras.
type Alg f a = f a -> a
type HAlg h f = h f ~> f
fold, or cata maps any algebra to a morphism (function|natural transformation).
cata :: Alg f a -> Fix f -> a
hcata :: HAlg h f -> (HFix h ~> h)
build constructs a value from its Church encoding.
type Church f = forall a. Alg f a -> a
type HChurch h = forall f. HAlg h f ~> f
build :: Church f -> Fix f
hbuild :: HChurch h -> HFix h a
-- The paper actually has a slightly different type for Church encodings, derived from the categorical view, but I'm pretty sure they're equivalent
build/fold fusion is summarized by one equation.
cata alg ( build f) = f alg
hcata alg (hbuild f) = f alg

Related

Is this type a valid "rank-2 bifunctor"?

The rank2classes package provides a version of Functor for which the mapped functions seem to be natural transformations between type constructors.
Following that idea, here's a rank-2 version of Bifunctor:
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE StandaloneKindSignatures #-}
import Data.Kind
type Rank2Bifunctor :: ((Type -> Type) -> (Type -> Type) -> Type) -> Constraint
class Rank2Bifunctor b where
rank2bimap ::
(forall x. p x -> p' x) -> (forall x. q x -> q' x) -> b p q -> b p' q'
It seems clear that this type Foo can be given a Rank2Bifunctor instance:
data Foo f g = Foo (f Int) (g Int)
instance Rank2Bifunctor Foo where
rank2bimap tleft tright (Foo f g) = Foo (tleft f) (tright g)
But what about this Bar type which nests f and g:
data Bar f g = Bar (f (g Int))
For starters, it seems that we would need to require Functor p in the signature of rank2bimap, to be able to transform the g inside the f.
Is Bar a valid "rank-2 bifunctor"?

Indeed that's not an instance of your Bifunctor, since the lack of constraint allows you to pick a contravariant functor for f and then rank2bimap would amount roughly to implement fmap for f:
rank2bimap id :: (g ~> g') -> Bar f g -> Bar f g' -- covariance of f, kinda (since Bar f g = f (g Int))
type f ~> f' = (forall x. f x -> f' x)
If you add the requirement that f and g (optional here) be functors, then you do get a bifunctor:
rank2bimap :: (Functor f, Functor g) => (f ~> f') -> (g ~> g') -> Bar f g -> Bar f' g'
In particular, to prove the bifunctor laws, you will need the free theorem of f ~> f', assuming f and f' are functors, then n :: f ~> f' satisfies that for all phi, fmap phi . n = n . fmap phi.

LiquidHaskell: Functor Law

I am encoding (as an assumption) the functor identity law like so:
...
import qualified Prelude ()
...
{-# class Functor f where
fmap :: forall a b. (a -> b) -> (f a -> f b) #-}
class Functor f where
fmap :: (a -> b) -> (f a -> f b)
-- identity function
{-# id :: forall a . x : a -> {y : a | x = y} #-}
id :: forall a . a -> a
id x = x
-- functor law: mapping with identity is identity
{-# assume
fmap_id :: forall f a . Functor f => x : f a ->
{fmap id x = id x} #-}
fmap_id :: Functor f => f a -> ()
fmap_id _ = ()
I can't see anything wrong with this formulation, yet I get this error from LH:
src/Category/Functor/LH.hs:45:16: error:
• Illegal type specification for `Category.Functor.LH.fmap_id`
Category.Functor.LH.fmap_id :: forall f a .
(Functor<[]> f) =>
x:f a -> {VV : () | Category.Functor.LH.fmap Category.Functor.LH.id x == Category.Functor.LH.id x}
Sort Error in Refinement: {VV : Tuple | Category.Functor.LH.fmap Category.Functor.LH.id x == Category.Functor.LH.id x}
Unbound symbol Category.Functor.LH.fmap --- perhaps you meant: Category.Functor.LH.C:Functor ?
•
|
45 | fmap_id :: forall f a . Functor f => x : f a ->
|
What am I doing wrong? My goal is to formulate this point-free with extensionality, but at least this point-wise formulation should work first.
Configuration:
GHC version: 8.10.1
Cabal version: 3.2.0.0
Liquid Haskell version: 0.8.10.2

Support for typeclasses is not yet supported in the official release of Liquid Haskell (although it is almost ready to merge). For now, you can use this fork which implements typeclass support. After recursively cloning the repository, you can install the forked version with:
pushd liquid-benchmark/liquidhaskell
stack install
popd
We define Functor and its laws (in the VFunctor subclass) in liquid-base/liquid-base/src/Data/Functor/Classes.hs as follows. Notice that you can specify refinements on the typeclass methods directly.
class Functor f where
{-# fmap :: forall a b. (a -> b) -> f a -> f b #-}
fmap :: (a -> b) -> f a -> f b
(<$) :: a -> f b -> f a
class Functor m => VFunctor m where
{-# lawFunctorId :: forall a . x:m a -> {fmap id x == id x} #-}
lawFunctorId :: m a -> ()
{-# lawFunctorComposition :: forall a b c . f:(b -> c) -> g:(a -> b) -> x:m a -> { fmap (compose f g) x == compose (fmap f) (fmap g) x } #-}
lawFunctorComposition :: forall a b c. (b -> c) -> (a -> b) -> m a -> ()
You can run the forked version of Liquid Haskell on the typeclass definitions with:
liquid --typeclass -i liquid-base/liquid-base/src/ liquid-base/liquid-base/src/Data/Functor/Classes.hs
We create a verified Maybe instance in liquid-base/liquid-base/src/Data/Maybe/Functor.hs with:
instance Functor Maybe where
fmap _ Nothing = Nothing
fmap f (Just x) = Just (f x)
_ <$ Nothing = Nothing
x <$ (Just _) = Just x
instance VFunctor Maybe where
lawFunctorId Nothing = ()
lawFunctorId (Just _) = ()
lawFunctorComposition f g Nothing = ()
lawFunctorComposition f g (Just x) = ()
You can run Liquid Haskell on the Maybe instance to verify that it satisfies the required laws:
liquid --typeclass -i liquid-base/liquid-base/src/ liquid-base/liquid-base/src/Data/Maybe/Functor.hs

How to define an instance of Data.Foldable.Constrained?

I've successfully defined Category, Functor, Semigroup, Monoid constrained. Now I'm stuck with Data.Foldable.Constrained. More precisely, I seem to have correctly defined the unconstrained functions fldl and fldMp, but I can't get them to be accepted as Foldable.Constrained instances.
My definition attempt is inserted as a comment.
{-# LANGUAGE OverloadedLists, GADTs, TypeFamilies, ConstraintKinds,
FlexibleInstances, MultiParamTypeClasses, StandaloneDeriving, TypeApplications #-}
import Prelude ()
import Control.Category.Constrained.Prelude
import qualified Control.Category.Hask as Hask
-- import Data.Constraint.Trivial
import Data.Foldable.Constrained
import Data.Map as M
import Data.Set as S
import qualified Data.Foldable as FL
main :: IO ()
main = print $ fmap (constrained #Ord (+1))
$ RMS ([(1,[11,21]),(2,[31,41])])
data RelationMS a b where
IdRMS :: RelationMS a a
RMS :: Map a (Set b) -> RelationMS a b
deriving instance (Show a, Show b) => Show (RelationMS a b)
instance Category RelationMS where
type Object RelationMS o = Ord o
id = IdRMS
RMS mp2 . RMS mp1
| M.null mp2 || M.null mp1 = RMS M.empty
| otherwise = RMS $ M.foldrWithKey
(\k s acc -> M.insert k (S.foldr (\x acc2 -> case M.lookup x mp2 of
Nothing -> acc2
Just s2 -> S.union s2 acc2
) S.empty s
) acc
) M.empty mp1
(°) :: (Object k a, Object k b, Object k c, Category k) => k a b -> k b c -> k a c
r1 ° r2 = r2 . r1
instance (Ord a, Ord b) => Semigroup (RelationMS a b) where
RMS r1 <> RMS r2 = RMS $ M.foldrWithKey (\k s acc -> M.insertWith S.union k s acc) r1 r2
instance (Ord a, Ord b) => Monoid (RelationMS a b) where
mempty = RMS M.empty
mappend = (<>)
instance Functor (RelationMS a) (ConstrainedCategory (->) Ord) Hask where
fmap (ConstrainedMorphism f) = ConstrainedMorphism $
\(RMS r) -> RMS $ M.map (S.map f) r
fldl :: (a -> Set b -> a) -> a -> RelationMS k b -> a
fldl f acc (RMS r) = M.foldl f acc r
fldMp :: Monoid b1 => (Set b2 -> b1) -> RelationMS k b2 -> b1
fldMp m (RMS r) = M.foldr (mappend . m) mempty r
-- instance Foldable (RelationMS a) (ConstrainedCategory (->) Ord) Hask where
-- foldMap f (RMS r)
-- | M.null r = mempty
-- | otherwise = FL.foldMap f r
-- ffoldl f = uncurry $ M.foldl (curry f)

You need FL.foldMap (FL.foldMap f) r in your definition so that you fold over the Map and the Set.
However, there's a critical error in your Functor instance; your fmap is partial. It's not defined on IdRMS.
I suggest using -Wall to have the compiler warn you about such issues.
The problem comes down to you need to be able to represent relations with finite and infinite domains. IdRMS :: RelationRMS a a can already be used to represent some relations of infinite domain, it isn't powerful enough to represent a relation like fmap (\x -> [x]) IdRMS.
One approach is to use Map a (Set b) for finite relations and a -> Set b for infinite relations.
data Relation a b where
Fin :: Map a (Set b) -> Relation a b
Inf :: (a -> Set b) -> Relation a b
image :: Relation a b -> a -> Set b
image (Fin f) a = M.findWithDefault (S.empty) a f
image (Inf f) a = f a
This changes the category instance accordingly:
instance Category Relation where
type Object Relation a = Ord a
id = Inf S.singleton
f . Fin g = Fin $ M.mapMaybe (nonEmptySet . concatMapSet (image f)) g
f . Inf g = Inf $ concatMapSet (image f) . g
nonEmptySet :: Set a -> Maybe (Set a)
nonEmptySet | S.null s = Nothing
| otherwise = Just s
concatMapSet :: Ord b => (a -> Set b) -> Set a -> Set b
concatMapSet f = S.unions . fmap f . S.toList
And now you can define a total Functor instance:
instance Functor (Relation a) (Ord ⊢ (->)) Hask where
fmap (ConstrainedMorphism f) = ConstrainedMorphism $ \case -- using {-# LANGUAGE LambdaCase #-}
Fin g -> Fin $ fmap (S.map f) g
Inf g -> Inf $ fmap (S.map f) g
But a new issue raises its head when defining the Foldable instance:
instance Foldable (Relation a) (Ord ⊢ (->)) Hask where
foldMap (ConstrainedMorphism f) = ConstrainedMorphism $ \case
Fin g -> Prelude.foldMap (Prelude.foldMap f) g
Inf g -> -- uh oh...problem!
We have f :: b -> m and g :: a -> Set b. Monoid m gives us append :: m -> m -> m, and we know Ord a, but in order to generate all the b values in the image of the relation, we need all the possible a values!
One way you could try to salvage this is to use Bounded and Enum as additional constraints on the relation's domain. Then you could try to enumerate all the possible a values with [minBound..maxBound] (this may not be list every value for all types; I'm not sure if that's a law for Bounded and Enum).
instance (Enum a, Bounded a) => Foldable (Relation a) (Ord ⊢ (->)) Hask where
foldMap (ConstrainedMorphism f) = ConstrainedMorphism $ \case
Fin g -> Prelude.foldMap (Prelude.foldMap f) g
Inf g -> Prelude.foldMap (Prelude.foldMap f . g) [minBound .. maxBound]

Is it possible to derive recursion principles generically?

In Idris, there's some magical machinery to automatically create (dependent) eliminators for user-defined types. I'm wondering if it's possible to do something (perhaps less dependent) with Haskell types. For instance, given
data Foo a = No | Yes a | Perhaps (Foo a)
I want to generate
foo :: b -> (a -> b) -> (b -> b) -> Foo a -> b
foo b _ _ No = b
foo _ f _ (Yes a) = f a
foo b f g (Perhaps c) = g (foo b f g x)
I'm pretty weak on polyvariadic functions and generics, so I could use a bit of help getting started.

Here's a start of doing this using GHC Generics. Adding some code to reassociate the (:+:) would make this nicer. A few more instances are required and this probably has ergonomic problems.
EDIT: Bah, I got lazy and fell back to a data family to get injectivity for my type equality dispatch. This mildly changes the interface. I suspect with enough trickery, and/or using injective type families this can be done without a data family or overlapping instances.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE EmptyDataDecls #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}
module Main where
import Data.Function (fix)
import GHC.Generics
data Foo a = No | Yes | Perhaps (Foo a) | Extra a Int Bool
deriving (Show, Generic1)
data Bar a = Bar (Maybe a)
deriving (Show, Generic1)
gcata :: (GCata (f a) (Rep1 f a), Generic1 f) => Alg (f a) (Rep1 f a) r -> f a -> r
gcata f = fix(\w -> gcata' w f . from1)
ex' :: Show a => Foo a -> String
ex' = gcata (("No","Yes"),(\(Rec s) -> "Perhaps ("++s++")", \a i b -> "Extra ("++show a++") ("++show i++") ("++show b++")"))
ex1 = ex' (Perhaps (Perhaps Yes) :: Foo Int)
ex2 = ex' (Perhaps (Perhaps (Extra 'a' 2 True)) :: Foo Char)
ex3 :: Foo a -> Foo a
ex3 = gcata ((No, Yes), (Perhaps . unRec, Extra))
ex4 = gcata (\(K m) -> show m) (Bar (Just 3))
class GCata rec f where
type Alg (rec :: *) (f :: *) (r :: *) :: *
gcata' :: (rec -> r) -> Alg rec f r -> f -> r
instance (GCata rec (f p)) => GCata rec (M1 i c f p) where
type Alg rec (M1 i c f p) r = Alg rec (f p) r
gcata' w f (M1 x) = gcata' w f x
instance (GCata rec (f p), GCata rec (g p)) => GCata rec ((f :+: g) p) where
type Alg rec ((f :+: g) p) r = (Alg rec (f p) r, Alg rec (g p) r)
gcata' w (l,_) (L1 x) = gcata' w l x
gcata' w (_,r) (R1 x) = gcata' w r x
instance GCata rec (U1 p) where
type Alg rec (U1 p) r = r
gcata' _ f U1 = f
instance (Project rec (f p), GCata rec (g p)) => GCata rec ((f :*: g) p) where
type Alg rec ((f :*: g) p) r = Prj rec (f p) r -> Alg rec (g p) r
gcata' w f (x :*: y) = gcata' w (f (prj w x)) y
class Project rec f where
type Prj (rec :: *) (f :: *) (r :: *) :: *
prj :: (rec -> r) -> f -> Prj rec f r
instance (Project rec (f p)) => Project rec (M1 i c f p) where
type Prj rec (M1 i c f p) r = Prj rec (f p) r
prj w (M1 x) = prj w x
instance Project rec (K1 i c p) where
type Prj rec (K1 i c p) r = c
prj _ (K1 x) = x
instance (RecIfEq (TEq rec (f p)) rec (f p)) => Project rec (Rec1 f p) where
type Prj rec (Rec1 f p) r = Tgt (TEq rec (f p)) rec (f p) r
prj w (Rec1 x) = recIfEq w x
instance Project rec (Par1 p) where
type Prj rec (Par1 p) r = p
prj _ (Par1 x) = x
instance GCata rec (K1 i c p) where
type Alg rec (K1 i c p) r = c -> r
gcata' _ f (K1 x) = f x
instance GCata rec (Par1 p) where
type Alg rec (Par1 p) r = p -> r
gcata' _ f (Par1 x) = f x
instance (Project rec (Rec1 f p)) => GCata rec (Rec1 f p) where
type Alg rec (Rec1 f p) r = Prj rec (Rec1 f p) r -> r
gcata' w f = f . prj w
data HTrue; data HFalse
type family TEq x y where
TEq x x = HTrue
TEq x y = HFalse
class RecIfEq b rec t where
data Tgt b rec t r :: *
recIfEq :: (rec -> r) -> t -> Tgt b rec t r
instance RecIfEq HTrue rec rec where
newtype Tgt HTrue rec rec r = Rec { unRec :: r }
recIfEq w = Rec . w
instance RecIfEq HFalse rec t where
newtype Tgt HFalse rec t r = K { unK :: t }
recIfEq _ = K

As pigworker remarked in the question comments, using the default Generic representation leads to great ugliness, since we don't have prior information about recursion in our type, and we have to dig out recursive occurrences by manually checking for type equality. I'd like to present here alternative solutions with explicit f-algebra-style recursion. For this, we need an alternative generic Rep. Sadly, this means we can't easily tap into GHC.Generics, but I hope this will be edifying nonetheless.
In my first solution I aim for a presentation that is as simple as possible within current GHC capabilities. The second solution is a TypeApplication-heavy GHC 8-based one with more sophisticated types.
Starting out as usual:
{-# language
TypeOperators, DataKinds, PolyKinds,
RankNTypes, EmptyCase, ScopedTypeVariables,
DeriveFunctor, StandaloneDeriving, GADTs,
TypeFamilies, FlexibleContexts, FlexibleInstances #-}
My generic representation is a fixpoint of a sum-of-products. It slightly extends the basic model of generics-sop, which is also a sum-of-products but not functorial and therefore ill-equipped for recursive algorithms. I think SOP is overall a much better practical representation than arbitrarily nested types; you can find extended arguments as to why this is the case in the paper. In short, SOP removes unnecessary nesting information and lets us separate metadata from basic data.
But before anything else, we should decide on a code for generic types. In vanilla GHC.Generics there isn't a well-defined kind of codes, as the type constructors of sums, products etc. form an ad-hoc type-level grammar, and we can dispatch on them using type classes. We adhere more closely to usual presentations in dependently typed generics, and use explicit codes, interpretations and functions. Our codes shall be of kind:
[[Maybe *]]
The outer list encodes a sum of constructors, with each inner [Maybe *] encoding a constructor. A Just * is just a constructor field, while Nothing denotes a recursive field. For example, the code of [Int] is ['[], [Just Int, Nothing]].
type Rep a = Fix (SOP (Code a))
class Generic a where
type Code a :: [[Maybe *]]
to :: a -> Rep a
from :: Rep a -> a
data NP (ts :: [Maybe *]) (k :: *) where
Nil :: NP '[] k
(:>) :: t -> NP ts k -> NP (Just t ': ts) k
Rec :: k -> NP ts k -> NP (Nothing ': ts) k
infixr 5 :>
data SOP (code :: [[Maybe *]]) (k :: *) where
Z :: NP ts k -> SOP (ts ': code) k
S :: SOP code k -> SOP (ts ': code) k
Note that NP has different constructors for recursive and non-recursive fields. This is quite important, because we want codes to be unambiguously reflected in the type indices. In other words, we would like NP to also act as a singleton for [Maybe *] (although we remain parametric in * for good reasons).
We use a k parameter in the definitions to leave a hole for recursion. We set up recursion as usual, leaving the Functor instances to GHC:
deriving instance Functor (SOP code)
deriving instance Functor (NP code)
newtype Fix f = In {out :: f (Fix f)}
cata :: Functor f => (f a -> a) -> Fix f -> a
cata phi = go where go = phi . fmap go . out
We have two type families:
type family CurryNP (ts :: [Maybe *]) (r :: *) :: * where
CurryNP '[] r = r
CurryNP (Just t ': ts) r = t -> CurryNP ts r
CurryNP (Nothing ': ts) r = r -> CurryNP ts r
type family Alg (code :: [[Maybe *]]) (r :: *) :: * where
Alg '[] r = ()
Alg (ts ': tss) r = (CurryNP ts r, Alg tss r)
CurryNP ts r curries NP ts with result type r, and it also plugs in r in the recursive occurrences.
Alg code r computes the type of an algebra on SOP code r. It tuples together the eliminators for the individual constructors. Here we use plain nested tuples, but of course HList-s would be adequate too. We could also reuse NP here as a HList, but I find that too kludgy.
All that's left is to implement the functions:
uncurryNP :: CurryNP ts a -> NP ts a -> a
uncurryNP f Nil = f
uncurryNP f (x :> xs) = uncurryNP (f x) xs
uncurryNP f (Rec k xs) = uncurryNP (f k) xs
algSOP :: Alg code a -> SOP code a -> a
algSOP fs (Z np) = uncurryNP (fst fs) np
algSOP fs (S sop) = algSOP (snd fs) sop
gcata :: Generic a => Alg (Code a) r -> a -> r
gcata f = cata (algSOP f) . to
The key point here is that we have to convert the curried eliminators in Alg into a "proper" SOP code a -> a algebra, since that is the form that can be directly used in cata.
Let's define some sugar and instances:
(<:) :: a -> b -> (a, b)
(<:) = (,)
infixr 5 <:
instance Generic (Fix (SOP code)) where
type Code (Fix (SOP code)) = code
to = id
from = id
instance Generic [a] where
type Code [a] = ['[], [Just a, Nothing]]
to = foldr (\x xs -> In (S (Z (x :> Rec xs Nil)))) (In (Z Nil))
from = gcata ([] <: (:) <: ()) -- note the use of "Generic (Rep [a])"
Example:
> gcata (0 <: (+) <: ()) [0..10]
55
Full code.
However, it would be nicer if we had currying and didn't have to use HList-s or tuples to store eliminators. The most convenient way is to have the same order of arguments as in standard library folds, such as foldr or maybe. In this case the return type of gcata is given by a type family that computes from the generic code of a type.
type family CurryNP (ts :: [Maybe *]) (r :: *) :: * where
CurryNP '[] r = r
CurryNP (Just t ': ts) r = t -> CurryNP ts r
CurryNP (Nothing ': ts) r = r -> CurryNP ts r
type family Fold' code a r where
Fold' '[] a r = r
Fold' (ts ': tss) a r = CurryNP ts a -> Fold' tss a r
type Fold a r = Fold' (Code a) r (a -> r)
gcata :: forall a r. Generic a => Fold a r
This gcata is highly (fully) ambiguous. We need either explicit application or Proxy, and I opted for the former, incurring a GHC 8 dependence. However, once we supply an a type, the result type reduces, and we can easily curry:
> :t gcata #[_]
gcata #[_] :: Generic [t] => r -> (t -> r -> r) -> [t] -> r
> :t gcata #[_] 0
gcata #[_] 0 :: Num t1 => (t -> t1 -> t1) -> [t] -> t1
> gcata #[_] 0 (+) [0..10]
55
I used above a partial type signature in [_]. We can also create a shorthand for this:
gcata1 :: forall f a r. Generic (f a) => Fold (f a) r
gcata1 = gcata #(f a) #r
Which can be used as gcata1 #[].
I'd rather not elaborate the implementation of the above gcata here. It's not much longer than the simple version, but the gcata implementation is pretty hairy (embarrassingly, it's responsible for my delayed answer). Right now I couldn't explain it very well, since I wrote it with Agda aid, which entails plenty of automatic search and type tetris.

As has been said in the comments and other answers, it's best to start from a generic representation that has access to the recursive positions.
One library that works with such a representation is multirec (another is compdata):
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE GADTs, TypeFamilies, MultiParamTypeClasses, RankNTypes #-}
module FooFold where
import Generics.MultiRec.FoldAlgK
import Generics.MultiRec.TH
data Foo a = No | Yes a | Perhaps (Foo a)
data FooF :: * -> * -> * where
Foo :: FooF a (Foo a)
deriveAll ''FooF
foldFoo :: (r, (a -> r, r -> r)) -> Foo a -> r
foldFoo phi = fold (const phi) Foo
The FoldAlgK module provides a fold with a single result type and computes the algebra type as a nested pair. It would be relatively easy to additionally curry that. There are some other variants offered by the library.

How do I give a Functor instance to a datatype built for general recursion schemes?

I have a recursive datatype which has a Functor instance:
data Expr1 a
= Val1 a
| Add1 (Expr1 a) (Expr1 a)
deriving (Eq, Show, Functor)
Now, I'm interested in modifying this datatype to support general recursion schemes, as they are described in this tutorial and this Hackage package. I managed to get the catamorphism to work:
newtype Fix f = Fix {unFix :: f (Fix f)}
data ExprF a r
= Val a
| Add r r
deriving (Eq, Show, Functor)
type Expr2 a = Fix (ExprF a)
cata :: Functor f => (f a -> a) -> Fix f -> a
cata f = f . fmap (cata f) . unFix
eval :: Expr2 Int -> Int
eval = cata $ \case
Val n -> n
Add x y -> x + y
main :: IO ()
main =
print $ eval
(Fix (Add (Fix (Val 1)) (Fix (Val 2))))
But now I can't figure out how to give Expr2 the same functor instance that the original Expr had. It seems there is a kind mismatch when trying to define the functor instance:
instance Functor (Fix (ExprF a)) where
fmap = undefined
Kind mis-match
The first argument of `Functor' should have kind `* -> *',
but `Fix (ExprF a)' has kind `*'
In the instance declaration for `Functor (Fix (ExprF a))'
How do I write a Functor instance for Expr2?
I thought about wrapping Expr2 in a newtype with newtype Expr2 a = Expr2 (Fix (ExprF a)) but then this newtype needs to be unwrapped to be passed to cata, which I don't like very much. I also don't know if it would be possible to automatically derive the Expr2 functor instance like I did with Expr1.

This is an old sore for me. The crucial point is that your ExprF is functorial in both its parameters. So if we had
class Bifunctor b where
bimap :: (x1 -> y1) -> (x2 -> y2) -> b x1 x2 -> b y1 y2
then you could define (or imagine a machine defining for you)
instance Bifunctor ExprF where
bimap k1 k2 (Val a) = Val (k1 a)
bimap k1 k2 (Add x y) = Add (k2 x) (k2 y)
and now you can have
newtype Fix2 b a = MkFix2 (b a (Fix2 b a))
accompanied by
map1cata2 :: Bifunctor b => (a -> a') -> (b a' t -> t) -> Fix2 b a -> t
map1cata2 e f (MkFix2 bar) = f (bimap e (map1cata2 e f) bar)
which in turn gives you that when you take a fixpoint in one of the parameters, what's left is still functorial in the other
instance Bifunctor b => Functor (Fix2 b) where
fmap k = map1cata2 k MkFix2
and you sort of get what you wanted. But your Bifunctor instance isn't going to be built by magic. And it's a bit annoying that you need a different fixpoint operator and a whole new kind of functor. The trouble is that you now have two sorts of substructure: "values" and "subexpressions".
And here's the turn. There is a notion of functor which is closed under fixpoints. Turn on the kitchen sink (especially DataKinds) and
type s :-> t = forall x. s x -> t x
class FunctorIx (f :: (i -> *) -> (o -> *)) where
mapIx :: (s :-> t) -> f s :-> f t
Note that "elements" come in a kind indexed over i and "structures" in a kind indexed over some other o. We take i-preserving functions on elements to o preserving functions on structures. Crucially, i and o can be different.
The magic words are "1, 2, 4, 8, time to exponentiate!". A type of kind * can easily be turned into a trivially indexed GADT of kind () -> *. And two types can be rolled together to make a GADT of kind Either () () -> *. That means we can roll both sorts of substructure together. In general, we have a kind of type level either.
data Case :: (a -> *) -> (b -> *) -> Either a b -> * where
CL :: f a -> Case f g (Left a)
CR :: g b -> Case f g (Right b)
equipped with its notion of "map"
mapCase :: (f :-> f') -> (g :-> g') -> Case f g :-> Case f' g'
mapCase ff gg (CL fx) = CL (ff fx)
mapCase ff gg (CR gx) = CR (gg gx)
So we can refunctor our bifactors as Either-indexed FunctorIx instances.
And now we can take the fixpoint of any node structure f which has places for either elements p or subnodes. It's just the same deal we had above.
newtype FixIx (f :: (Either i o -> *) -> (o -> *))
(p :: i -> *)
(b :: o)
= MkFixIx (f (Case p (FixIx f p)) b)
mapCata :: forall f p q t. FunctorIx f =>
(p :-> q) -> (f (Case q t) :-> t) -> FixIx f p :-> t
mapCata e f (MkFixIx node) = f (mapIx (mapCase e (mapCata e f)) node)
But now, we get the fact that FunctorIx is closed under FixIx.
instance FunctorIx f => FunctorIx (FixIx f) where
mapIx f = mapCata f MkFixIx
Functors on indexed sets (with the extra freedom to vary the index) can be very precise and very powerful. They enjoy many more convenient closure properties than Functors do. I don't suppose they'll catch on.

I wonder if you might be better off using the Free type:
data Free f a
= Pure a
| Wrap (f (Free f a))
deriving Functor
data ExprF r
= Add r r
deriving Functor
This has the added benefit that there are quite a few libraries that work on free monads already, so maybe they'll save you some work.

Nothing wrong with pigworker's answer, but maybe you can use a simpler one as a stepping-stone:
{-# LANGUAGE DeriveFunctor, ScopedTypeVariables #-}
import Prelude hiding (map)
newtype Fix f = Fix { unFix :: f (Fix f) }
-- This is the catamorphism function you hopefully know and love
-- already. Generalizes 'foldr'.
cata :: Functor f => (f r -> r) -> Fix f -> r
cata phi = phi . fmap (cata phi) . unFix
-- The 'Bifunctor' class. You can find this in Hackage, so if you
-- want to use this just use it from there.
--
-- Minimal definition: either 'bimap' or both 'first' and 'second'.
class Bifunctor f where
bimap :: (a -> c) -> (b -> d) -> f a b -> f c d
bimap f g = first f . second g
first :: (a -> c) -> f a b -> f c b
first f = bimap f id
second :: (b -> d) -> f a b -> f a d
second g = bimap id g
-- The generic map function. I wrote this out with
-- ScopedTypeVariables to make it easier to read...
map :: forall f a b. (Functor (f a), Bifunctor f) =>
(a -> b) -> Fix (f a) -> Fix (f b)
map f = cata phi
where phi :: f a (Fix (f b)) -> Fix (f b)
phi = Fix . first f
Now your expression language works like this:
-- This is the base (bi)functor for your expression type.
data ExprF a r = Val a
| Add r r
deriving (Eq, Show, Functor)
instance Bifunctor ExprF where
bimap f g (Val a) = Val (f a)
bimap f g (Add l r) = Add (g l) (g r)
newtype Expr a = Expr (Fix (ExprF a))
instance Functor Expr where
fmap f (Expr exprF) = Expr (map f exprF)
EDIT: Here's a link to the bifunctors package in Hackage.

The keyword type is used only as a synonymous of an existing type, maybe this is what you are looking for
newtype Expr2 a r = In { out :: (ExprF a r)} deriving Functor

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string

Shortcut fusion for triemaps - haskell

Related

Is this type a valid "rank-2 bifunctor"?

LiquidHaskell: Functor Law

How to define an instance of Data.Foldable.Constrained?

Is it possible to derive recursion principles generically?

How do I give a Functor instance to a datatype built for general recursion schemes?

Categories

Resources