Using Comonad Fix Combinators - haskell

So I've been experimenting with fixed points lately and have finally struggled through regular fixed points enough to discover some uses; now I'm moving on to comonadic fixed points and I'm afraid I've gotten stuck.
Here are a few examples of what I've tried and what has/hasn't worked:
{-# language DeriveFunctor #-}
{-# language FlexibleInstances #-}
module WFix where
import Control.Comonad
import Control.Comonad.Cofree
import Control.Monad.Fix
So I started with Löb's theorem as a list: each element of the list is a function
which takes the end result to compute its answer. This lets me do 'spreadsheet'
calculations where values can depend on other values.
spreadSheetFix :: [Int]
spreadSheetFix = fix $ \result ->
  [ length result
  , (result !! 0) * 10
  , (result !! 1) + 1
  , sum (take 3 result)
  ]
Okay, so I have basic fix working, time to move on to the comonad types!
Here are a few simple comonads to use for examples:
data Stream a = S a (Stream a)
  deriving (Eq, Show, Functor)

next :: Stream a -> Stream a
next (S _ s) = s

instance Comonad Stream where
  extract (S a _) = a
  duplicate s@(S _ r) = S s (duplicate r)

instance ComonadApply Stream where
  (S f fs) <@> (S a as) = S (f a) (fs <@> as)
data Tape a = Tape [a] a [a]
  deriving (Show, Eq, Functor)

moveLeft, moveRight :: Tape a -> Tape a
moveLeft w@(Tape [] _ _) = w
moveLeft (Tape (l:ls) a rs) = Tape ls l (a:rs)

moveRight w@(Tape _ _ []) = w
moveRight (Tape ls a (r:rs)) = Tape (a:ls) r rs

instance Comonad Tape where
  extract (Tape _ a _) = a
  duplicate w@(Tape l _ r) = Tape lefts w rights
    where
      lefts = zipWith const (tail $ iterate moveLeft w) l
      rights = zipWith const (tail $ iterate moveRight w) r

instance ComonadApply Tape where
  Tape l f r <@> Tape l' a r' = Tape (zipWith ($) l l') (f a) (zipWith ($) r r')
Okay, so the following combinators come from Control.Comonad:

wfix :: Comonad w => w (w a -> a) -> a
wfix w = extract w (extend wfix w)

cfix :: Comonad w => (w a -> a) -> w a
cfix f = fix (extend f)

kfix :: ComonadApply w => w (w a -> a) -> w a
kfix w = fix $ \u -> w <@> duplicate u
I started with trying out wfix:
streamWFix :: Int
streamWFix = wfix st
  where
    incNext = succ . extract . next
    st = S incNext (S incNext (S (const 0) st))
> streamWFix
-- 2
This one seems to work by repeatedly applying the focused w a -> a to w until reaching
a resolution (const 0 in this case); that makes sense. We can also do this
with a Tape:
selfReferentialWFix :: Int
selfReferentialWFix = wfix $ Tape [const 10] ((+5) . extract . moveLeft) []
-- selfReferentialWFix == 15
OK, I think I get that one, but with the next ones I'm stuck:
I don't seem to have an intuition for what cfix is supposed to do.
Even the simplest possible thing I could think of spins forever
when I evaluate it; even trying to extract the first element of the stream
using getOne fails.
getOne :: Stream a -> a
getOne (S a _) = a
simpleCFix :: Stream Int
simpleCFix = cfix go
  where
    go _ = 0
Similarly with kfix; even simple tries don't seem to terminate.
My understanding of kfix was that the function in each 'slot' gets
passed a copy of the evaluated comonad focused on that spot; is that the case?
I tried using 'getOne' on this:
streamKFix :: Stream Int
streamKFix = kfix st
  where
    go _ = 0
    st = S go st
Here's a finite attempt using Tape which also fails to run:
tapeKFix :: Tape Int
tapeKFix = kfix $ Tape [] (const 0) []
So, down to my question: could someone please offer some runnable (non-trivial)
examples of using cfix and kfix, and explain how they work? I plan to eventually use kfix for a "Conway's
Game of Life" style experiment; am I correct in thinking that kfix would be useful
in working with neighbourhoods around a given cell?
Feel free to ask
any clarifying questions and help me expand my knowledge and intuition of fix!
Thanks!

The ComonadApply and Comonad instances for Tape are insufficiently lazy to be used with kfix.
duplicate for Tape requires that you prove the tape exists before it can conclude that the result is a Tape:
instance Comonad Tape where
  extract (Tape _ a _) = a
  duplicate w@(Tape l _ r) = Tape lefts w rights
  --          ^              ^
  --          matches a Tape |
  --     before determining that the result is a Tape
<@> checks that both arguments are Tapes before it can conclude that the result is a Tape:
instance ComonadApply Tape where
  Tape l f r <@> Tape l' a r' = Tape (zipWith ($) l l') (f a) (zipWith ($) r r')
  -- ^           ^               ^
  -- matches two Tapes           |
  --        before determining that the result is a Tape
Combined, there's no way for kfix (Tape _ _ _) to ever produce a Tape:

kfix w            = fix $ \u -> w            <@> duplicate u
kfix (Tape _ _ _) = fix $ \u -> (Tape _ _ _) <@> duplicate u
kfix (Tape _ _ _) = fix $ \u -> (Tape _ _ _) <@> case u of (Tape _ _ _) -> ...
--                         ^                                    |
--                         ------------- <<loop>> --------------
You can fix this by making duplicate, <@>, or both more productive by using irrefutable patterns. A pattern ~(Tape l a r) matches even if the Tape constructor hasn't been produced yet. Here's how you'd use it to make duplicate productive:
instance Comonad Tape where
  extract (Tape _ a _) = a
  duplicate w@(~(Tape l _ r)) = Tape lefts w rights
    where
      lefts = zipWith const (tail $ iterate moveLeft w) l
      rights = zipWith const (tail $ iterate moveRight w) r
Irrefutable pattern matches are equivalent to using selector functions to extract values. For duplicate it's equivalent to writing:
left (Tape l _ _) = l
right (Tape _ _ r) = r

instance Comonad Tape where
  extract (Tape _ a _) = a
  duplicate w = Tape lefts w rights
    where
      l = left w
      r = right w
      ...
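Putting the lazy duplicate together with the question's definitions, here is a runnable sketch that needs only base: extract, duplicate, <@> and kfix are written as plain functions instead of comonad class methods, so no extra package is required. With the irrefutable pattern in duplicate, the question's tapeKFix example now terminates:

```haskell
import Data.Function (fix)

data Tape a = Tape [a] a [a]

moveLeft, moveRight :: Tape a -> Tape a
moveLeft w@(Tape [] _ _)     = w
moveLeft (Tape (l:ls) a rs)  = Tape ls l (a:rs)
moveRight w@(Tape _ _ [])    = w
moveRight (Tape ls a (r:rs)) = Tape (a:ls) r rs

extract :: Tape a -> a
extract (Tape _ a _) = a

-- The irrefutable pattern produces the outer Tape constructor
-- before inspecting the argument, making duplicate productive.
duplicate :: Tape a -> Tape (Tape a)
duplicate w@(~(Tape l _ r)) = Tape lefts w rights
  where
    lefts  = zipWith const (tail (iterate moveLeft w)) l
    rights = zipWith const (tail (iterate moveRight w)) r

-- Lazy in its second argument, so it never forces the fixed point too early.
(<@>) :: Tape (a -> b) -> Tape a -> Tape b
Tape l f r <@> ~(Tape l' a r') =
  Tape (zipWith ($) l l') (f a) (zipWith ($) r r')

kfix :: Tape (Tape a -> a) -> Tape a
kfix w = fix $ \u -> w <@> duplicate u

main :: IO ()
main = print (extract (kfix (Tape [] (const 0) [])))  -- prints 0
```

The focused cell evaluates to 0 without ever needing the whole fixed point to be forced, which is exactly what the strict instances prevented.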

Related

Haskell Function for checking if element is in Tree, returning Depth

I am currently doing an assignment for a class in which I have to implement a function which checks if an element is in a tree.
It is supposed to return Nothing when the element is not in the tree and Just (depth at which it was found) when it is.
An example:
sample1:
      1
    3   2
  7 5  6 4

- contains 6 sample1 returns Just 2
- contains 1 sample1 returns Just 0
- contains 2 sample1 returns Just 1
- contains 8 sample1 returns Nothing
Here is what we are given:
Heap functional data structure:
module Fdata.Heap where
-- A signature for min-heaps
data Heap e t = Heap {
  empty :: t e,
  insert :: e -> t e -> t e,
  findMin :: t e -> Maybe e,
  deleteMin :: t e -> Maybe (t e),
  merge :: t e -> t e -> t e,
  contains :: e -> t e -> Maybe Int
}
An implementation of self-adjusting heaps:
import Fdata.Heap
import Fdata.Tree
-- An implementation of self-adjusting heaps
heap :: (Eq e, Ord e) => Heap e Tree
heap = Heap {
  empty = Empty,
  insert = \x t -> merge' (Node x Empty Empty) t,
  findMin = \t -> case t of
    Empty -> Nothing
    (Node x _ _) -> Just x,
  deleteMin = \t -> case t of
    Empty -> Nothing
    (Node _ l r) -> Just (merge' r l),
  merge = \l r -> case (l, r) of
    (Empty, t) -> t
    (t, Empty) -> t
    (t1@(Node x1 l1 r1), t2@(Node x2 l2 r2)) ->
      if x1 <= x2
        then Node x1 (merge' t2 r1) l1
        else Node x2 (merge' t1 r2) l2,
  contains = \x t -> case (x,t) of
    (x,Empty)-> Nothing
    (x,tx@(Node x1 l1 r1) ->
      |x==x1 = Just 0
      |x>x1 = (1+ (contains x l)
      |x<x1 = (1+ (contains x r)
  }
  where
    merge' = merge heap
The tree implementation
module Fdata.Tree where

import Fdata.Heap

data Tree x
  = Empty
  | Node x (Tree x) (Tree x)
  deriving (Eq, Show)

leaf x = Node x Empty Empty

-- Convert a list to a heap
list2heap :: Heap x t -> [x] -> t x
list2heap i = foldl f z
  where
    f = flip $ insert i
    z = empty i

-- Convert a heap to a list
heap2list :: Heap x t -> t x -> [x]
heap2list i t
  = case (findMin i t, deleteMin i t) of
      (Nothing, Nothing) -> []
      (Just x, Just t') -> x : heap2list i t'
I am supposed to implement the contains function in the implementation for self-adjusting heaps.
I am not allowed to use any helper functions and I am supposed to use the maybe function.
My current implementation:
contains = \x t -> case (x,t) of
  (x,Empty) -> Nothing
  (x,tx@(Node x1 l1 r1))
    |x==x1 -> Just 0
    |x>x1 -> (1+ (contains x l1)
    |x<x1 -> (1+ (contains x r1)
This does not work, since I get a parse error on input |.
I really don't know how to fix this, since I did use 4 spaces instead of tabs, and according to https://wiki.haskell.org/Case
the syntax is correct...
I once managed to fix this, but then I got a type error about (1+ (contains x l)), so that probably is not correct either.
Any hint would be appreciated.
EDIT:
Thanks to everyone who answered!
Really appreciate that everyone took the time to explain their answers in great detail.
First of all:
there were some smaller mistakes, as pointed out by some of you in the comments:
I missed one closing parenthesis, and accidentally named one argument l1 and another r1 but afterwards used r and l.
Fixed both mistakes.
Someone wrote that I do not need to use a lambda function. The problem is when I use something like:
contains _ Empty = Nothing
I get the error:
parse Error on input '_'.
However, lambda functions do not give me any errors about the input arguments.
Currently the only function that works without any errors is:
contains = \e t -> case (e,t) of
  (_,Empty) -> Nothing
  (e, Node x t1 t2) ->
    if e == head (heap2list heap (Node x t1 t2))
      then Just 0
      else if fmap (+1) (contains heap e t1) == Nothing
        then fmap (+1) (contains heap e t2)
        else fmap (+1) (contains heap e t1)
Found at:
Counting/Getting "Level" of a hierarchical data
Found by: Krom
One way of structuring contains :: Eq a => a -> Tree a -> Maybe Integer is to first label each element in your tree with its depth, using something like this, then fold the tree to find the element you're looking for, pulling its depth out with it. You can do this without very much code!
Jumping right in where this answer left off, here's contains.
contains :: Eq a => a -> Tree a -> Maybe Integer
contains x = fmap fst . find ((== x) . snd) . labelDepths
That's the whole function! This is classic functional programming style: rather than hand-crank a bespoke recursive tree traversal function, I've structured the code as a pipeline of reusable operations. In Haskell, such pipelines are constructed using the composition operator (.) and are read from right to left. The result of labelDepths is passed to find ((== x) . snd), whose result is then passed to fmap fst.
labelDepths :: Tree a -> Tree (Integer, a), which I've explained in detail in the answer I linked above, attaches an Integer depth to each element of the input tree.
find :: Foldable t => (a -> Bool) -> t a -> Maybe a is a standard function which extracts the first element of a container (like a tree, or a list) that satisfies a predicate. In this instance, the Foldable structure in question is a Tree, so t ~ Tree and find :: (a -> Bool) -> Tree a -> Maybe a. The predicate I've given to find is ((== x) . snd), which returns True if the second element of its input tuple equals x: find ((== x) . snd) :: Tree (Integer, a) -> Maybe (Integer, a). find works by folding the input structure - testing its elements one at a time until it finds one that matches the predicate. The order in which elements are processed is defined by the container's Foldable instance, of which more below.
fmap :: Functor f => (a -> b) -> f a -> f b is another standard function. It applies a mapping function uniformly to each element of a container, transforming its elements from type a to type b. This time the container in question is the return value of find, which is a Maybe, so fmap :: (a -> b) -> Maybe a -> Maybe b. The mapping function I've supplied is fst, which extracts the first element of a tuple: fmap fst :: Maybe (Integer, a) -> Maybe Integer.
So putting it all together, you can see that this is a fairly direct implementation of my English description of the process above. First we label every element in the tree with its depth, then we find an element which matches the item we're looking for, then we extract the depth with which the element was previously labelled.
I mentioned above that Tree is a Foldable container. In fact, this isn't the case quite yet - there's no instance of Foldable for Tree. The easiest way to get a Foldable instance for Tree is to turn on the DeriveFoldable GHC extension and utter the magic words deriving Foldable.
{-# LANGUAGE DeriveFoldable #-}
data Tree x = Empty | Node x (Tree x) (Tree x) deriving Foldable
This automatically-implemented instance of Foldable will perform a preorder traversal, processing the tree in a top-down fashion. (x is considered to be "to the left of" l and r in the expression Node x l r.) You can adjust the derived traversal order by adjusting the layout of the Node constructor.
That said, I'm guessing that this is an assignment and you're not allowed to modify the definition of Tree or apply any language extensions. So you'll need to hand-write your own instance of Foldable, following the template at the bottom of this post. Here's an implementation of foldr which performs a preorder traversal.
instance Foldable Tree where
  foldr f z Empty = z
  foldr f z (Node x l r) = f x (foldr f (foldr f z r) l)
The Node case is the interesting one. We fold the tree from right to left (since this is a foldr) and from bottom to top. First we fold the right subtree, placing z at the rightmost leaf. Then we use the aggregated result of the right subtree as the seed for folding the left subtree. Finally we use the result of folding all of the Node's children as the aggregator to apply to f x.
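As a quick sanity check of that traversal order, here is the instance run on a small tree (a self-contained sketch using the Tree layout from this answer, where the element comes before the subtrees):

```haskell
import Data.Foldable (toList)

data Tree x = Empty | Node x (Tree x) (Tree x)

-- Preorder traversal: the node's own element first, then the left
-- subtree, then the right subtree.
instance Foldable Tree where
  foldr _ z Empty = z
  foldr f z (Node x l r) = f x (foldr f (foldr f z r) l)

main :: IO ()
main = do
  let leaf x = Node x Empty Empty
      t = Node 'b' (leaf 'a') (leaf 'c')
  print (toList t)  -- "bac"
```

toList is derived from foldr, so it makes the fold order directly visible: the focus 'b' is emitted before either subtree.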
Hopefully you didn't find this answer too advanced! (Happy to answer any questions you have.) While the other answers do a good job of showcasing how to write recursive tree traversal functions, I really wanted to give you a glimpse of the real power of functional programming. When you think at a higher level - breaking down a problem into its component parts, structuring operations as pipelines, and learning to spot common patterns like zipping, folding and mapping - you can be very productive and solve problems with very little code.
An instance of Foldable for a binary tree
To instantiate Foldable you need to provide a definition for at least foldMap or foldr.
data Tree a = Leaf
| Node (Tree a) a (Tree a)
instance Foldable Tree where
  foldMap f Leaf = mempty
  foldMap f (Node l x r) = foldMap f l `mappend` f x `mappend` foldMap f r

  foldr f acc Leaf = acc
  foldr f acc (Node l x r) = foldr f (f x (foldr f acc r)) l
This implementation performs an in-order traversal of the tree.
ghci> let myTree = Node (Node Leaf 'a' Leaf) 'b' (Node Leaf 'c' Leaf)
-- +--'b'--+
-- | |
-- +-'a'-+ +-'c'-+
-- | | | |
-- * * * *
ghci> toList myTree
"abc"
The DeriveFoldable extension allows GHC to generate Foldable instances based on the structure of the type. We can vary the order of the machine-written traversal by adjusting the layout of the Node constructor.
data Inorder a = ILeaf
               | INode (Inorder a) a (Inorder a) -- as before
               deriving Foldable

data Preorder a = PrLeaf
                | PrNode a (Preorder a) (Preorder a)
                deriving Foldable

data Postorder a = PoLeaf
                 | PoNode (Postorder a) (Postorder a) a
                 deriving Foldable

-- injections from the earlier Tree type
inorder :: Tree a -> Inorder a
inorder Leaf = ILeaf
inorder (Node l x r) = INode (inorder l) x (inorder r)

preorder :: Tree a -> Preorder a
preorder Leaf = PrLeaf
preorder (Node l x r) = PrNode x (preorder l) (preorder r)

postorder :: Tree a -> Postorder a
postorder Leaf = PoLeaf
postorder (Node l x r) = PoNode (postorder l) (postorder r) x
ghci> toList (inorder myTree)
"abc"
ghci> toList (preorder myTree)
"bac"
ghci> toList (postorder myTree)
"acb"
This function doesn't need to be a lambda:
contains x t =
Adding x to the case serves no purpose, since you only match it back to x. You can instead use pattern matching in the function head:
contains _ Empty = Nothing
The Node case has three sub-cases, where the value being searched for is less-than, greater-than, or equal to the value in the Node. If you order them that way, you get a symmetry from the less-than and greater-than tests, and can handle the equal case with an otherwise.
When recursing, you are going to get a Maybe Int, to which you want to add one. You can't do that directly, because the Int is inside the Maybe. Normally you would lift the addition, but I suspect that this is where the required call to maybe should go (however unnatural it may seem):
contains x (Node x' l r)
  | x < x'    = maybe Nothing (Just . (+1)) $ contains x l
  | x > x'    = maybe Nothing (Just . (+1)) $ contains x r
  | otherwise = Just 0
Instead of using maybe, the (+1) could have been lifted into the Maybe with fmap (or <$>):
... = fmap (+1) $ contains ...
Using maybe is unnatural because it has to explicitly pass the Nothing, and also re-wrap the Just.
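The two spellings behave identically on both a hit and a miss; a quick check:

```haskell
main :: IO ()
main = do
  -- both add one inside Just ...
  print (maybe Nothing (Just . (+1)) (Just (2 :: Int)))      -- Just 3
  print (fmap (+1) (Just (2 :: Int)))                        -- Just 3
  -- ... and both pass Nothing through unchanged
  print (maybe Nothing (Just . (+1)) (Nothing :: Maybe Int)) -- Nothing
  print (fmap (+1) (Nothing :: Maybe Int))                   -- Nothing
```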
This does not work, since I get a parse error on input |
Your previous line misses a closing parenthesis.
I got a type error about (1+ (contains x l)), so this probably is not correct.
The idea is totally correct, the issue is that contains x l returns a Maybe Int instead of an Int so you cannot directly add to that. You can only add to the result when it's a Just. There's a helper function that does exactly that, do something to Justs and keep Nothings: fmap (from Functor).
contains = \x t -> case (x,t) of
  (x,Empty) -> Nothing
  (x,tx@(Node x1 l1 r1))
    | x == x1 -> Just 0
    | x > x1  -> fmap (1+) (contains heap x l1)
    | x < x1  -> fmap (1+) (contains heap x r1)
Btw, I'd write this as
contains x Empty = Nothing
contains x (Node v l r) =
  if x == v
    then Just 0
    else fmap (+1) $ contains x $ if x > v then l else r

Compute an (infinite) tree from fixpoint operator using delay modality

Here is a functional programming puzzle involving loop-tying and infinite data structures. There is a bit of background, so hang tight.
The setting. Let us define a data type representing recursive data types:
type Var = String

data STerm = SMu Var STerm
           | SVar Var
           | SArrow STerm STerm
           | SBottom
           | STop
  deriving (Show)
i.e. t ::= μα. t | α | t → t | ⊥ | ⊤. Note that ⊥ denotes the type with no inhabitants, while ⊤ denotes the type with all inhabitants. Note that (μα. α) = ⊥, as μ is a least fixpoint operator.
We can interpret a recursive data type as an infinite tree, arising from repeatedly unfolding μα. t to t[α ↦ μα. t]. (For a formal description of this process, see http://lucacardelli.name/Papers/SRT.pdf) In Haskell, we can define a type of lazy trees, which don't have μ-binders or variables:
data LTerm = LArrow LTerm LTerm
           | LBottom
           | LTop
  deriving (Show)
and, in ordinary Haskell, a conversion function from one to the other:
convL :: [(Var, LTerm)] -> STerm -> LTerm
convL _ STop = LTop
convL _ SBottom = LBottom
convL ctx (SArrow t1 t2) = LArrow (convL ctx t1) (convL ctx t2)
convL ctx (SVar v)
  | Just l <- lookup v ctx = l
  | otherwise = error "unbound variable"
convL ctx (SMu v t) = fix (\r -> convL ((v, r) : ctx) t)
However, there is a problem with this function: it's not productive. If you run convL [] (SMu "x" (SVar "x")) you will infinite loop. We would rather get LBottom in this case. An interesting exercise is to directly fix this function so that it is productive; however, in this question I want to solve the problem differently.
Productivity with the delay modality. When we build cyclic data structures as above, we need to make sure we do not use the results of our computations before we have constructed them. The delay modality is a way of ensuring that we write productive (non infinite looping) programs. The basic idea is simple: if the type Int means that I have an integer today, I define a type constructor D, so that D Int means that I have a value of type Int tomorrow. D is a Functor and an Applicative (but NOT a monad.)
-- D is abstract; you are not allowed to pattern match on it
newtype D a = D a
  deriving (Show)

instance Functor D where
  fmap f (D a) = D (f a)

instance Applicative D where
  D f <*> D a = D (f a)
  pure x = D x
With D, we define a fixpoint operator: it says that to construct a value of a, you can have access to the a you are constructing, as long as you only use it tomorrow.
fixD :: (D a -> a) -> a
fixD f = f (D (fixD f))
For example, a Stream consists both of a value a I have today, and a stream Stream a which I have to produce tomorrow.
data Stream a = Cons a (D (Stream a))
Using fixD, I can define a map function on streams which is guaranteed to be productive, since the recursive call to map is only used to produced values that are needed tomorrow.
instance Functor Stream where
  fmap f = fixD $ \go (Cons x xs) -> Cons (f x) (go <*> xs)
The problem. Here is a variant of LTerm with an explicit delay modality.
data Term = Arrow (D Term) (D Term)
          | Bottom
          | Top
  deriving (Show)
Using fixD (no non-structurally recursive references allowed), how do I write a function conv :: STerm -> Term (or conv :: STerm -> D Term)?
A particularly interesting test case is SMu "x" (SArrow STop (SMu "y" (SVar "x"))); there should be no Bottoms in the resulting structure!
Update. I accidentally ruled out structural recursion on STerm, which was not the intent of the question; I've reworded to remove that restriction.
Do you intend to forbid just the unrestricted recursion (fix) in the SMu case of convL, or also the structural recursion in the SArrow case?
I don’t think this has a solution without structural recursion on STerm, because then we would have to be productive even on an infinite STerm such as:
foldr (\n -> SMu ("x" ++ show n)) undefined [0..] -- μα. μβ. μγ. μδ. …
To do this with structural recursion on STerm, it seems the trick is to store Either Term (D Term) in the context. When we pass through an Arrow and produce a D, we can convert all the Rights to Lefts.
type Ctx = [(Var, Either Term (D Term))]

dCtx :: Ctx -> D Ctx
dCtx = traverse (traverse (fmap Left . either pure id))

conv :: STerm -> Ctx -> Term
conv STop _ = Top
conv SBottom _ = Bottom
conv (SArrow t1 t2) ctx = Arrow (fmap (conv t1) (dCtx ctx)) (fmap (conv t2) (dCtx ctx))
conv (SVar v) ctx = case lookup v ctx of
  Nothing -> error "unbound variable"
  Just (Left t) -> t
  Just (Right _) -> Bottom
conv (SMu v t) ctx = fixD (\dr -> conv t ((v, Right dr) : ctx))
My intuition is that the context should contain only delayed terms. That way, conv ctx (SMu v t) will be equivalent to fixD (\r -> conv ((v, r) : ctx) t), as in the original convL.
If this is the case, then you need a general way to include delayed terms in your data structure, instead of just allowing them in arrows:
data Term = Arrow Term Term
          | Bottom
          | Top
          | Next (D Term)
A first attempt at conv gives us:
conv :: [(Var, D Term)] -> STerm -> Term
conv _ STop = Top
conv _ SBottom = Bottom
conv ctx (SArrow t1 t2) = Arrow (conv ctx t1) (conv ctx t2)
conv ctx (SVar v)
  | Just l <- lookup v ctx = Next l
  | otherwise = error "unbound variable"
conv ctx (SMu v t) = fixD (\r -> conv ((v, r) : ctx) t)
However, this uses unguarded recursive calls to conv. If you want to avoid that, you can wrap all the fixD recursive calls in a Next.
conv :: [(Var, D Term)] -> STerm -> Term
conv = fixD step where
  step _ _ STop = Top
  step _ _ SBottom = Bottom
  step d ctx (SArrow t1 t2) = Arrow (Next $ d <*> pure ctx <*> pure t1)
                                    (Next $ d <*> pure ctx <*> pure t2)
  step d ctx (SVar v)
    | Just l <- lookup v ctx = Next l
    | otherwise = error "unbound variable"
  step d ctx (SMu v t) = fixD (\r ->
    Next $ d <*> pure ((v, r) : ctx) <*> pure t)
I'm not sure if this is exactly what you are looking for, because conv [] (SMu "x" (SArrow SBottom (SMu "y" (SVar "x")))) does still have Bottoms in the resulting structure. What is the type you want to get out?

Is there any type-algebra function mapping an ADT to the set of elements of that ADT?

For some simple ADTs, we can obtain the ADT of sets of that ADT as data Set = Full Set … | Empty Set …, where each constructor carries N recursive Set fields and N is the number of non-terminal constructors of that ADT. For a more solid example, take Nat:
data Nat = Succ Nat | Zero
We can obtain a type for sets of nats as follows:
data NatSet = Full NatSet | Empty NatSet
So, for example,
empty :: NatSet
empty = Empty empty
insert :: Nat -> NatSet -> NatSet
insert Zero (Empty natverse) = Full natverse
insert Zero (Full natverse) = Full natverse
insert (Succ nat) (Empty natverse) = Empty (insert nat natverse)
insert (Succ nat) (Full natverse) = Full (insert nat natverse)
member :: Nat -> NatSet -> Bool
member Zero (Full natverse) = True
member Zero (Empty natverse) = False
member (Succ nat) (Empty natverse) = member nat natverse
member (Succ nat) (Full natverse) = member nat natverse
main = do
  let set = foldr insert empty [Zero, Succ (Succ Zero), Succ (Succ (Succ (Succ (Succ Zero))))]
  print $ member Zero set
  print $ member (Succ Zero) set
  print $ member (Succ (Succ Zero)) set
  print $ member (Succ (Succ (Succ Zero))) set
  print $ member (Succ (Succ (Succ (Succ Zero)))) set
  print $ member (Succ (Succ (Succ (Succ (Succ Zero))))) set
For other types, it is equally simple:
data Bin = A Bin | B Bin | C
data BinSet = Empty BinSet BinSet | Full BinSet BinSet
But what about types that branch? It isn't obvious to me:
data Tree = Branch Tree Tree | Tip
data TreeSet = ???
Is there any type-algebraic argument that maps ADTs to the ADTs of sets of that type?
Let's take another look at your set type.
data NatSet = Full NatSet | Empty NatSet
There's always another NatSet inside this one; the two branches of the sum type indicate whether the current Nat is present in the set. This is a structural representation of sets. As you observed, the structure of the set depends on the structure of the values.
It's equivalent to a stream of Booleans: we test for membership by indexing into the stream.
data NatSet = NatSet Bool NatSet
empty = NatSet False empty
insert Z (NatSet _ bs) = NatSet True bs
insert (S n) (NatSet b bs) = NatSet b (insert n bs)
(NatSet b _) `contains` Z = b
(NatSet _ bs) `contains` (S n) = bs `contains` n
Based on the insight that set membership is like indexing into a collection of Booleans, let's pursue a generic implementation of sets of values of types formed as the fixed point of a polynomial functor (what's the problem?).
First things first. As you observed, the set's structure depends on the type of the things inside it. Here's the class of things which can be elements of a set,
class El a where
  data Set a
  empty :: Set a
  full :: Set a
  insert :: a -> Set a -> Set a
  remove :: a -> Set a -> Set a
  contains :: Set a -> a -> Bool
For a first example, I'll modify NatSet from above to suit this format.
instance El Nat where
  data Set Nat = NatSet Bool (Set Nat)
  empty = NatSet False empty
  full = NatSet True full
  insert Z (NatSet _ bs) = NatSet True bs
  insert (S n) (NatSet b bs) = NatSet b (insert n bs)
  remove Z (NatSet _ bs) = NatSet False bs
  remove (S n) (NatSet b bs) = NatSet b (remove n bs)
  (NatSet b _) `contains` Z = b
  (NatSet _ bs) `contains` (S n) = bs `contains` n
Another simple El instance which we'll need later is (). A set of ()s is either full or empty, with nothing in between.
instance El () where
  newtype Set () = UnitSet Bool
  empty = UnitSet False
  full = UnitSet True
  insert () _ = UnitSet True
  remove () _ = UnitSet False
  (UnitSet b) `contains` () = b
The fixed point of a functor f,
newtype Fix f = Fix (f (Fix f))
is an instance of El whenever f is an instance of the following El1 class.
class El1 f where
  data Set1 f a
  empty1 :: El a => Set1 f (Set a)
  full1 :: El a => Set1 f (Set a)
  insert1 :: El a => f a -> Set1 f (Set a) -> Set1 f (Set a)
  remove1 :: El a => f a -> Set1 f (Set a) -> Set1 f (Set a)
  contains1 :: El a => Set1 f (Set a) -> f a -> Bool
instance El1 f => El (Fix f) where
  newtype Set (Fix f) = FixSet (Set1 f (Set (Fix f)))
  empty = FixSet empty1
  full = FixSet full1
  insert (Fix x) (FixSet s) = FixSet (insert1 x s)
  remove (Fix x) (FixSet s) = FixSet (remove1 x s)
  (FixSet s) `contains` (Fix x) = s `contains1` x
As usual we'll be composing bits of functors into bigger functors, before taking the resulting functor's fixed point to produce a concrete type.
newtype I a = I a
newtype K c a = K c
data (f :* g) a = f a :* g a
data (f :+ g) a = L (f a) | R (g a)
type N = K ()
For example,
type Nat = Fix (N :+ N)
type Tree = Fix (N :+ (I :* I))
Let's make sets of these things.
I (for identity) is the location in the polynomial where Fix will plug in the recursive substructure. We can just delegate its implementation of El1 to Fix's.
instance El1 I where
  newtype Set1 I a = ISet1 a
  empty1 = ISet1 empty
  full1 = ISet1 full
  insert1 (I x) (ISet1 s) = ISet1 (insert x s)
  remove1 (I x) (ISet1 s) = ISet1 (remove x s)
  (ISet1 s) `contains1` (I x) = s `contains` x
The constant functor K c features no recursive substructure, but it does feature a value of type c. If c can be put in a set, K c can be too.
instance El c => El1 (K c) where
  newtype Set1 (K c) a = KSet1 (Set c)
  empty1 = KSet1 empty
  full1 = KSet1 full
  insert1 (K x) (KSet1 s) = KSet1 (insert x s)
  remove1 (K x) (KSet1 s) = KSet1 (remove x s)
  (KSet1 s) `contains1` (K x) = s `contains` x
Note that this definition makes Set1 N isomorphic to Bool.
For sums, let's use our intuition that testing for membership is like indexing. When you index into a tuple, you choose between the left-hand and right-hand members of the tuple.
instance (El1 f, El1 g) => El1 (f :+ g) where
  data Set1 (f :+ g) a = SumSet1 (Set1 f a) (Set1 g a)
  empty1 = SumSet1 empty1 empty1
  full1 = SumSet1 full1 full1
  insert1 (L x) (SumSet1 l r) = SumSet1 (insert1 x l) r
  insert1 (R y) (SumSet1 l r) = SumSet1 l (insert1 y r)
  remove1 (L x) (SumSet1 l r) = SumSet1 (remove1 x l) r
  remove1 (R y) (SumSet1 l r) = SumSet1 l (remove1 y r)
  (SumSet1 l r) `contains1` (L x) = l `contains1` x
  (SumSet1 l r) `contains1` (R y) = r `contains1` y
This is now enough to do sets of natural numbers. With appropriate instances of Show, you can do this:
ghci> let z = Fix (L (K ()))
ghci> let s n = Fix (R (I n))
ghci> insert (s z) empty `contains` z
False
ghci> insert (s z) empty `contains` (s z)
True
ghci> empty :: Set (Fix (N :+ I))
FixSet (SumSet1 (KSet1 (UnitSet False)) (ISet1 (FixSet ( -- to infinity and beyond
I still haven't answered your original question, which was how should this work for product types? I can think of a few strategies, but none of them actually work.
We could make Set1 (f :* g) a a sum type. This has a pleasing symmetry: sets of sums are products, and sets of products are sums. In the context of the indexing idea, this is like saying "in order to get a Bool out of the set, you have to give an index to handle each of the two possible cases", rather like Either a b's elimination rule (a -> c) -> (b -> c) -> Either a b -> c. However, you get stuck when you try to come up with meaningful values for empty1 and full1:
instance (El1 f, El1 g) => El1 (f :* g) where
  data Set1 (f :* g) a = LSet1 (Set1 f a) | RSet1 (Set1 g a)
  insert1 (l :* r) (LSet1 x) = LSet1 (insert1 l x)
  insert1 (l :* r) (RSet1 y) = RSet1 (insert1 r y)
  remove1 (l :* r) (LSet1 x) = LSet1 (remove1 l x)
  remove1 (l :* r) (RSet1 y) = RSet1 (remove1 r y)
  (LSet1 x) `contains1` (l :* r) = x `contains1` l
  (RSet1 y) `contains1` (l :* r) = y `contains1` r
  empty1 = _
  full1 = _
You could try adding hacky Empty and Full constructors to Set1 (f :* g) but then you'd struggle to implement insert1 and remove1.
You could interpret a set of a product type as a pair of the sets of the two halves of the product. An item is in the set if both of its halves are in their respective sets. Like a sort of generalised intersection.
instance (El1 f, El1 g) => El1 (f :* g) where
  data Set1 (f :* g) a = ProdSet1 (Set1 f a) (Set1 g a)
  insert1 (l :* r) (ProdSet1 s t) = ProdSet1 (insert1 l s) (insert1 r t)
  remove1 (l :* r) (ProdSet1 s t) = ProdSet1 (remove1 l s) (remove1 r t)
  (ProdSet1 s t) `contains1` (l :* r) = (s `contains1` l) && (t `contains1` r)
  empty1 = ProdSet1 empty1 empty1
  full1 = ProdSet1 full1 full1
But this implementation doesn't work correctly. Observe:
ghci> let tip = Fix (L (K ()))
ghci> let fork l r = Fix (R (I l :* I r))
ghci> let s1 = insert (fork tip tip) empty
ghci> let s2 = remove (fork tip (fork tip tip)) s1
ghci> s2 `contains` (fork tip tip)
False
Removing fork tip (fork tip tip) also removed fork tip tip. tip got removed from the left-hand half of the set, which meant any tree whose left-hand branch is tip got removed with it. We removed more items than we intended. (However, if you don't need a remove operation on your sets, this implementation works - though that's just another disappointing asymmetry.) You could implement contains with || instead of && but then you'll end up inserting more items than you intended.
Finally, I also thought about treating a set of products as a set of sets.
data Set1 (f :* g) a = ProdSet1 (Set1 f (Set1 g a))
This doesn't work - try implementing any of El1's methods and you'll get stuck right away.
So at the end of the day your intuition was right. Product types are the problem. There's no meaningful structural representation of sets for product types.
Of course, this is all academic really. Fixed points of polynomial functors have a well-defined notion of equality and ordering, so you can do what everyone else does and use Data.Set.Set to represent sets, even for product types. It won't have such good asymptotics, though: equality and ordering tests on such values are O(n) (where n is the size of the value), making membership operations on such sets O(n log m) (where m is the size of the set) because the set itself is represented as a balanced tree. Contrast with our generic structural sets, where membership operations are O(n).
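For example (a sketch using a concrete first-order tree rather than the Fix encoding, with derived Eq and Ord standing in for generic instances), Data.Set.Set behaves correctly even for product types:

```haskell
import qualified Data.Set as Set

-- A concrete version of the binary-tree fixed point from the
-- ghci session above, with derived ordering.
data Tree = Tip | Fork Tree Tree
  deriving (Eq, Ord, Show)

main :: IO ()
main = do
  let s1 = Set.insert (Fork Tip Tip) Set.empty
      s2 = Set.delete (Fork Tip (Fork Tip Tip)) s1
  -- the structural ProdSet1 attempt answered False here
  print (Set.member (Fork Tip Tip) s2)  -- True
```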

Indexing into containers: the mathematical underpinnings

When you want to pull an element out of a data structure, you have to give its index. But the meaning of index depends on the data structure itself.
class Indexed f where
    type Ix f
    (!) :: f a -> Ix f -> Maybe a  -- indices can be out of bounds
For example...
Elements in a list have numeric positions.
data Nat = Z | S Nat

instance Indexed [] where
    type Ix [] = Nat
    []     ! _     = Nothing
    (x:_)  ! Z     = Just x
    (_:xs) ! (S n) = xs ! n
Elements in a binary tree are identified by a sequence of directions.
data Tree a = Leaf | Node (Tree a) a (Tree a)
data TreeIx = Stop | GoL TreeIx | GoR TreeIx  -- equivalently [Bool]

instance Indexed Tree where
    type Ix Tree = TreeIx
    Leaf       ! _     = Nothing
    Node l x r ! Stop  = Just x
    Node l x r ! GoL i = l ! i
    Node l x r ! GoR j = r ! j
Looking for something in a rose tree entails stepping down the levels one at a time by selecting a tree from the forest at each level.
data Rose a = Rose a [Rose a]  -- I don't even like rosé
data RoseIx = Top | Down Nat RoseIx  -- equivalently [Nat]

instance Indexed Rose where
    type Ix Rose = RoseIx
    Rose x ts ! Top      = Just x
    Rose x ts ! Down i j = ts ! i >>= (! j)
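The list and rose-tree instances above can be collected into one runnable file with a few sample lookups (only the code shown in the question, plus a main to exercise it):

```haskell
{-# LANGUAGE TypeFamilies #-}

data Nat = Z | S Nat

class Indexed f where
  type Ix f
  (!) :: f a -> Ix f -> Maybe a  -- indices can be out of bounds

instance Indexed [] where
  type Ix [] = Nat
  []     ! _     = Nothing
  (x:_)  ! Z     = Just x
  (_:xs) ! (S n) = xs ! n

data Rose a = Rose a [Rose a]
data RoseIx = Top | Down Nat RoseIx

instance Indexed Rose where
  type Ix Rose = RoseIx
  Rose x _  ! Top      = Just x
  Rose _ ts ! Down i j = ts ! i >>= (! j)  -- reuses the list instance

main :: IO ()
main = do
  print ("hello" ! S (S Z))        -- the 'l' at position 2
  let r = Rose 'a' [Rose 'b' [], Rose 'c' []]
  print (r ! Down (S Z) Top)       -- the 'c' one level down
  print (r ! Down (S (S Z)) Top)   -- out of bounds
```

Note how the rose-tree lookup composes: the Down case first indexes into the forest with the list instance, then continues into the chosen subtree via >>=.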
It seems that the index of a product type is a sum (telling you which arm of the product to look at), the index of an element is the unit type, and the index of a nested type is a product (telling you where to look in the nested type). Sums seem to be the only ones which aren't somehow linked to the derivative. The index of a sum is also a sum - it tells you which part of the sum the user is hoping to find, and if that expectation is violated you're left with a handful of Nothing.
In fact I had some success implementing ! generically for functors defined as the fixed point of a polynomial bifunctor. I won't go into detail, but Fix f can be made an instance of Indexed when f is an instance of Indexed2...
class Indexed2 f where
    type IxA f
    type IxB f
    ixA :: f a b -> IxA f -> Maybe a
    ixB :: f a b -> IxB f -> Maybe b
... and it turns out you can define an instance of Indexed2 for each of the bifunctor building blocks.
But what's really going on? What is the underlying relationship between a functor and its index? How does it relate to the functor's derivative? Does one need to understand the theory of containers (which I don't, really) to answer this question?
It seems like the index into the type is an index into the set of constructors, followed by an index into the product representing that constructor. This can be implemented quite naturally with e.g. generics-sop.
First you need a datatype to represent possible indices into a single element of the product. This could be an index pointing to an element of type a,
or an index pointing to something of type g b - which requires an index pointing into g and an index pointing to an element of type a in b. This is encoded with the following type:
import Generics.SOP
data ArgIx f x x' where
    Here  :: ArgIx f x x
    There :: (Generic (g x')) => Ix g -> ArgIx f x x' -> ArgIx f x (g x')
newtype Ix f = ...
The index itself is just a sum (implemented by NS for n-ary sum) of sums over the generic representation of the type (choice of constructor, choice of constructor element):
newtype Ix f = MkIx (forall x . NS (NS (ArgIx f x)) (Code (f x)))
You can write smart constructors for various indices:
listIx :: Natural -> Ix []
listIx 0 = MkIx $ S $ Z $ Z Here
listIx k = MkIx $ S $ Z $ S $ Z $ There (listIx (k-1)) Here

treeIx :: [Bool] -> Ix Tree
treeIx [] = MkIx $ S $ Z $ S $ Z Here
treeIx (b:bs) =
    case b of
        True  -> MkIx $ S $ Z $ Z $ There (treeIx bs) Here
        False -> MkIx $ S $ Z $ S $ S $ Z $ There (treeIx bs) Here

roseIx :: [Natural] -> Ix Rose
roseIx []     = MkIx $ Z $ Z Here
roseIx (k:ks) = MkIx $ Z $ S $ Z $ There (listIx k) (There (roseIx ks) Here)
Note that e.g. in the list case, you cannot construct a (non-bottom) index pointing to the [] constructor - likewise for Tree's Leaf, or for constructor fields whose type is neither a nor something containing values of type a. The quantification over x in MkIx prevents the construction of bad things, like an index pointing to the first Int in data X x = X Int x where x is instantiated to Int.
The implementation of the index function is fairly straightforward, even if the types are scary:
(!) :: (Generic (f x)) => f x -> Ix f -> Maybe x
(!) arg (MkIx ix) = go (unSOP $ from arg) ix where
    atIx :: a -> ArgIx f x a -> Maybe x
    atIx a Here            = Just a
    atIx a (There ix0 ix1) = a ! ix0 >>= flip atIx ix1

    go :: (All SListI xss) => NS (NP I) xss -> NS (NS (ArgIx f x)) xss -> Maybe x
    go (Z a) (Z b)  = hcollapse $ hzipWith (\(I x) -> K . atIx x) a b
    go (S x) (S x') = go x x'
    go Z{} S{} = Nothing
    go S{} Z{} = Nothing
The go function compares the constructor pointed to by the index and the actual constructor used by the type. If the constructors don't match, the indexing returns Nothing. If they do, the actual indexing is done - which is trivial in the case that the index points exactly Here, and in the case of some substructure, both indexing operations must succeed one after the other, which is handled by >>=.
And a simple test:
ghci> map (("hello" !) . listIx) [0..5]
[Just 'h',Just 'e',Just 'l',Just 'l',Just 'o',Nothing]

What benefits do I get from creating an instance of Comonad

In my application, I'm trying to implement an animation system. In this system, animations are represented as a cyclic list of frames:
data CyclicList a = CL a [a]
We can (inefficiently) advance the animation as follows:
advance :: CyclicList a -> CyclicList a
advance (CL x []) = CL x []
advance (CL x (z:zs)) = CL z (zs ++ [x])
Now, I'm pretty sure that this data type is a comonad:
instance Functor CyclicList where
    fmap f (CL x xs) = CL (f x) (map f xs)

cyclicFromList :: [a] -> CyclicList a
cyclicFromList [] = error "Cyclic list must have one element!"
cyclicFromList (x:xs) = CL x xs

cyclicLength :: CyclicList a -> Int
cyclicLength (CL _ xs) = length xs + 1

listCycles :: CyclicList a -> [CyclicList a]
listCycles cl = let
    helper 0 _   = []
    helper n cl' = cl' : (helper (n-1) $ advance cl')
  in helper (cyclicLength cl) cl

instance Comonad CyclicList where
    extract (CL x _) = x
    duplicate = cyclicFromList . listCycles
The question I have is: what kind of benefits do I get (if any) from using the comonad instance?
The advantage of providing a type class instance or implementing an interface is that code written against that type class or interface can use your code without any modification.
What programs can be written in terms of Comonad? A Comonad provides a way to both inspect the value at the current location (without observing its neighbors) using extract and a way to observe the neighborhood of every location with duplicate or extend. Without any additional functions, this isn't terribly useful. However, if we also require other functions along with the Comonad instance, we can write programs that depend on both local data and data from elsewhere. For example, if we require functions that allow us to change location, such as your advance, we can write programs that depend only on the local structure of the data, not on the data structure itself.
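As a small warm-up before the concrete example, here is a self-contained sketch of extend in action on the question's CyclicList, with a hand-rolled Comonad class so it doesn't require the comonad package; every position is replaced by a function of its own neighborhood:

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Hand-rolled stand-in for Control.Comonad, enough for this sketch.
class Functor w => Comonad w where
  extract   :: w a -> a
  duplicate :: w a -> w (w a)
  extend    :: (w a -> b) -> w a -> w b
  extend f = fmap f . duplicate

data CyclicList a = CL a [a] deriving (Show, Functor)

advance :: CyclicList a -> CyclicList a
advance (CL x [])     = CL x []
advance (CL x (z:zs)) = CL z (zs ++ [x])

cyclicLength :: CyclicList a -> Int
cyclicLength (CL _ xs) = length xs + 1

listCycles :: CyclicList a -> [CyclicList a]
listCycles cl = take (cyclicLength cl) (iterate advance cl)

instance Comonad CyclicList where
  extract (CL x _) = x
  duplicate cl = case listCycles cl of (c:cs) -> CL c cs

-- Each cell becomes the sum of itself and its right neighbour.
pairSum :: CyclicList Int -> Int
pairSum (CL x (y:_)) = x + y
pairSum (CL x [])    = x + x  -- a singleton is its own neighbour

main :: IO ()
main = print (extend pairSum (CL 1 [2, 3]))  -- CL 3 [5,4]
```

Each element of the result is computed from the rotation of the cycle focused at that element, which is exactly the "neighborhood of every location" that duplicate provides.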
For a concrete example, consider a cellular automata program written in terms of Comonad and the following Bidirectional class:
class Bidirectional c where
    forward  :: c a -> Maybe (c a)
    backward :: c a -> Maybe (c a)
The program could use this, together with Comonad, to extract data stored in a cell and explore the cells forward and backward of the current cell. It can use duplicate to capture the neighborhood of each cell and fmap to inspect that neighborhood. This combination, fmap f . duplicate, is extend f.
Here is such a program. rule' is only interesting to the example; it implements cellular automaton rules on a neighborhood consisting of just the left, current, and right values. rule extracts that neighborhood data using the Bidirectional class and runs the rule on each neighborhood. slice pulls out even larger neighborhoods so that we can display them easily. simulate runs the simulation, displaying these larger neighborhoods for each generation.
rule' :: Word8 -> Bool -> Bool -> Bool -> Bool
rule' x l m r = testBit x ((if l then 4 else 0) .|. (if m then 2 else 0) .|. (if r then 1 else 0))

rule :: (Comonad w, Bidirectional w) => Word8 -> w Bool -> w Bool
rule x = extend go
    where
        go w = rule' x (maybe False extract . backward $ w) (extract w) (maybe False extract . forward $ w)

slice :: (Comonad w, Bidirectional w) => Int -> Int -> a -> w a -> [a]
slice l r a w = sliceL l w (extract w : sliceR r w)
    where
        sliceR r w | r > 0 = case (forward w) of
            Nothing -> take r (repeat a)
            Just w' -> extract w' : sliceR (r-1) w'
        sliceR _ _ = []
        sliceL l w r | l > 0 = case (backward w) of
            Nothing -> take l (repeat a) ++ r
            Just w' -> sliceL (l-1) w' (extract w':r)
        sliceL _ _ r = r

simulate :: (Comonad w, Bidirectional w) => (w Bool -> w Bool) -> Int -> Int -> Int -> w Bool -> IO ()
simulate f l r x w = mapM_ putStrLn . map (map (\x -> if x then '1' else '0') . slice l r False) . take x . iterate f $ w
This program might have been intended to work with the following Bidirectional Comonad, a Zipper on a list.
data Zipper a = Zipper {
    heads :: [a],
    here  :: a,
    tail  :: [a]
} deriving Functor
instance Bidirectional Zipper where
    forward (Zipper _ _ []    ) = Nothing
    forward (Zipper l h (r:rs)) = Just $ Zipper (h:l) r rs
    backward (Zipper []     _ _) = Nothing
    backward (Zipper (l:ls) h r) = Just $ Zipper ls l (h:r)
instance Comonad Zipper where
    extract = here
    duplicate (Zipper l h r) = Zipper (goL (h:r) l) (Zipper l h r) (goR (h:l) r)
        where
            goL r []    = []
            goL r (h:l) = Zipper l h r : goL (h:r) l
            goR l []    = []
            goR l (h:r) = Zipper l h r : goR (h:l) r
But it will also work with a CyclicList Bidirectional Comonad.
data CyclicList a = CL a (Seq a)
    deriving (Show, Eq, Functor)

instance Bidirectional CyclicList where
    forward (CL x xs) = Just $ case viewl xs of
        EmptyL    -> CL x xs
        x' :< xs' -> CL x' (xs' |> x)
    backward (CL x xs) = Just $ case viewr xs of
        EmptyR    -> CL x xs
        xs' :> x' -> CL x' (x <| xs')

instance Comonad CyclicList where
    extract (CL x _) = x
    duplicate (CL x xs) = CL (CL x xs) (go (singleton x) xs)
        where
            go old new = case viewl new of
                EmptyL    -> empty
                x' :< xs' -> CL x' (xs' >< old) <| go (old |> x') xs'
We can reuse simulate with either data structure. The CyclicList has a more interesting output, because, instead of bumping into a wall, it wraps back around to interact with itself.
{-# LANGUAGE DeriveFunctor #-}

import Control.Comonad
import Data.Sequence hiding (take)
import Data.Bits
import Data.Word

main = do
    putStrLn "10 + 1 + 10 Zipper"
    simulate (rule 110) 10 10 30 $ Zipper (take 10 . repeat $ False) True (take 10 . repeat $ False)
    putStrLn "10 + 1 + 10 Cyclic"
    simulate (rule 110) 10 10 30 $ CL True (fromList (take 20 . repeat $ False))
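As a quick standalone sanity check of the Seq-based cyclic moves (a sketch needing only the containers package; the Maybe wrapper from Bidirectional is dropped here, since a cyclic list always has a next element):

```haskell
import Data.Sequence (Seq, ViewL(..), ViewR(..), fromList, viewl, viewr, (|>), (<|))

data CyclicList a = CL a (Seq a) deriving (Show, Eq)

-- forward/backward as in the instance above, without the Just wrapper
forward, backward :: CyclicList a -> CyclicList a
forward (CL x xs) = case viewl xs of
  EmptyL    -> CL x xs
  x' :< xs' -> CL x' (xs' |> x)
backward (CL x xs) = case viewr xs of
  EmptyR    -> CL x xs
  xs' :> x' -> CL x' (x <| xs')

main :: IO ()
main = do
  let c = CL (1 :: Int) (fromList [2, 3])
  print (forward c)                  -- focus moves to 2
  print (backward c)                 -- focus wraps around to 3
  print (backward (forward c) == c)  -- the two moves are inverses
```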

Resources