I'm trying to build a list at the type level, but I'm having some trouble figuring out how to enforce constraints.
My base code is:
data Foo z q = Foo1 (z q)
             | Foo2 (z q)

class Qux q -- where ...
class Baz z -- where ...

class Bar a where -- a has kind * -> *
  type BCtx a q :: Constraint -- using ConstraintKinds to allow constraints on the concrete type
  f :: (BCtx a q) => a q -> a q -> a q
  g :: (BCtx a q, BCtx a q') => a q -> a q'

instance (Baz z) => Bar (Foo z) where
  type BCtx (Foo z) q = (Num (z q), Qux q) -- for example
  f (Foo1 x) (Foo1 y) = Foo1 $ x+y -- these functions need access to the type q to do arithmetic mod q
  f (Foo1 x) (Foo2 y) = Foo2 $ x-y
  -- ...
You can think of the qs above representing prime powers. I would also like to represent composite numbers using a type list of qis. I'm imagining something like:
data QList qi qs = QCons qi qs
                 | QNil
with the data
data FList c q = FNil
               | FCons (c (Head q)) (FList c (Tail q))
where (Head q) should correspond to qi and (Tail q) should correspond to qs. Note that the q parameter for FList is NOT (necessarily) a (Qux q); it is a list of (Qux qi) types. (I don't want to flesh out anything more about this list, since it's one of the design problems I'm posing.) I would like to work "modulus-wise" on the FList:
instance (Bar c) => Bar (FList c) where
  type BCtx (FList c) q = () -- Anything I put here is not enough
  f (FCons x xs) (FCons y ys) = FCons (f x y) (f xs ys)
  -- the left call to `f` calls a concrete instance, the right call to `f` is a recursive call on the rest of the list
  -- ...
Compiling these code snippets together in GHC results in (modulo transcription, abstraction, and typing errors):
Could not deduce (BCtx c (Head q), BCtx c (Tail q))
and then
Could not deduce (BCtx c (Head (Tail q)), BCtx c (Tail (Tail q)))
etc.
I see why I'm getting this error, but not how to fix it.
Concretely, I'm expecting an FList c q type where c~Foo z and q~QCons q1 (QCons q2 QNil), and of course my list will satisfy all of the BCtx constraints at every level.
I'm not sure that fixing those particular errors will result in compiling code, but it is a start. The entire Bar class is basically fixed (the Constraint kind is required, and the instances of Bar must have kind * -> *). I don't believe I can use existential types to create a list of generic objects because I need access to the qi parameter. I am willing to change the type of FList and QList to allow me to work modulus-wise on a collection of Bars.
Thanks for your time!
To handle type lists, it's necessary to discriminate empty from nonempty lists and handle them separately. The 'Could not deduce' errors in your code occur because your instance assumes a nonempty list, when in fact the list may or may not be empty. Here is a solution using the extensions TypeFamilies, TypeOperators, DataKinds, and GADTs.
With DataKinds, type lists are predefined. They have kind [*], but they'll be used in a context where kind * is expected, so a wrapper type is needed to embed them:
data InjList (qs :: [*])
Using type lists, FList is defined as
data FList c q where
  FNil :: FList c (InjList '[])
  FCons :: c h -> FList c (InjList t) -> FList c (InjList (h ': t))
It's defined as a GADT to express how it's only possible to construct FLists over the type InjList q' for some type-list q'. For instance, the term FCons [True] FNil has type FList [] (InjList (Bool ': '[])). On the other hand, since Bool isn't of the form InjList q', there are no terms (except ⊥) of type FList [] Bool. By pattern matching on an FList, a function can verify that it has been given a non-⊥ argument, and further determine whether it's been passed an empty type list.
An instance of Bar for FLists has to handle nil lists and cons lists separately. A nil list has an empty context.
A cons list has components for the head and tail of the list. This is expressed by pattern matching on the type-list in the associated type instance of BCtx. The function f examines its argument to verify that it's not ⊥ and to decide whether it's an empty list.
instance (Bar c) => Bar (FList c) where
  -- Empty context for '[]
  type BCtx (FList c) (InjList '[]) = ()
  -- Context includes components for head and tail of list
  type BCtx (FList c) (InjList (h ': t)) = (BCtx c h, BCtx (FList c) (InjList t))
  f FNil FNil = FNil
  f (FCons x xs) (FCons y ys) = FCons (f x y) (f xs ys)
We can load the code into GHCi to verify that it works:
instance Bar [] where
  type BCtx [] q = Num q
  f xs ys = zipWith (+) xs ys

instance Show (FList c (InjList '[])) where
  show FNil = "FNil"

instance (Show (c h), Show (FList c (InjList t))) => Show (FList c (InjList (h ': t))) where
  show (FCons h t) = "FCons (" ++ show h ++ ") (" ++ show t ++ ")"
$ ghci
> :load Test
> f (FCons [1,2] FNil) (FCons [3,4] FNil)
FCons ([4,6]) (FNil)
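The same instance handles longer type lists as well; a follow-up check in the same session (the output shown is what I'd expect from the instances above, so worth re-running locally):

> f (FCons [1,2] (FCons [10,20] FNil)) (FCons [3,4] (FCons [30,40] FNil))
FCons ([4,6]) (FCons ([40,60]) (FNil))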
Related
I'm looking for a way to transform a list into an n-tuple with one list for each of the n constructors in a disjoint union. The standard library defines a similar function specifically for Eithers:
partitionEithers :: [Either a b] -> ([a], [b])
I'm looking for techniques for solving the generalized problem with the following requirements:
convenient to write
as little boilerplate as possible
processes the list in a single pass
datatype-generics, metaprogramming, existing libraries etc are all permitted
Example
Here is an example specification with two proposed solutions:
partitionSum :: [MySum] -> ([A], [B], [C], [D])
data MySum
  = CaseA A
  | CaseB B
  | CaseC C
  | CaseD D

data A = A deriving Show
data B = B deriving Show
data C = C deriving Show
data D = D deriving Show
-- expect "([A,A],[B,B,B],[],[D])"
test :: IO ()
test = print . partitionSum $
  [CaseD D, CaseB B, CaseA A, CaseA A, CaseB B, CaseB B]
First attempt: n list comprehensions that traverse the list n times.
partitionSum1 :: [MySum] -> ([A], [B], [C], [D])
partitionSum1 xs =
  ( [a | CaseA a <- xs]
  , [b | CaseB b <- xs]
  , [c | CaseC c <- xs]
  , [d | CaseD d <- xs]
  )
Second attempt: a single traversal of the input list. I have to manually thread the state through the fold which makes the solution a little repetitive and annoying to write.
partitionSum2 :: [MySum] -> ([A], [B], [C], [D])
partitionSum2 = foldr f ([], [], [], [])
  where
    f x (as, bs, cs, ds) =
      case x of
        CaseA a -> (a : as, bs, cs, ds)
        CaseB b -> (as, b : bs, cs, ds)
        CaseC c -> (as, bs, c : cs, ds)
        CaseD d -> (as, bs, cs, d : ds)
In addition to the Representable answer:
A thing that came to me from seeing foldr f ([], [], [], []) was to define a monoid where the nil case is mempty
{-# LANGUAGE DerivingVia #-}
..
import GHC.Generics (Generically(..), ..)

type Classify :: Type
data Classify = C [A] [B] [C] [D]
  deriving stock Generic
  deriving (Semigroup, Monoid)
    via Generically Classify

-- mempty = C [] [] [] []
-- C as bs cs ds <> C as1 bs1 cs1 ds1 = C (as ++ as1) (bs ++ bs1) (cs ++ cs1) (ds ++ ds1)
Generically will be exported from GHC.Generics in the future. It defines Classify as a semigroup and monoid through generic pointwise lifting.
With this, all you need is a classifier function that turns a MySum into a Classify, and you can define partition in terms of foldMap:
classify :: MySum -> Classify
classify = \case
  CaseA a -> C [a] [] [] []
  CaseB b -> C [] [b] [] []
  CaseC c -> C [] [] [c] []
  CaseD d -> C [] [] [] [d]
partition :: Foldable f => f MySum -> Classify
partition = foldMap classify
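For instance, assuming Classify also derives Show (not shown above), running it over the original test input should give:

> partition [CaseD D, CaseB B, CaseA A, CaseA A, CaseB B, CaseB B]
C [A,A] [B,B,B] [] [D]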
As your function is a transformation from sums to products, there's a fairly simple implementation using generics-sop. This is a library which enhances GHC's generics with more specialized types that make induction on algebraic types (i.e. sums of products) simpler.
First, a prelude:
{-# LANGUAGE DeriveGeneric, StandaloneDeriving #-}
import Generics.SOP hiding ((:.:))
import qualified GHC.Generics as GHC
import GHC.Generics ((:.:)(..))
partitionSum :: (Generic t) => [t] -> NP ([] :.: NP I) (Code t)
This is the method you want to write. Let's examine its type.
the single argument is a list of some generic type. Pretty straightforward. Note here that Generic is the one from generics-sop, not from GHC
the returned value is an n-ary product (n-tuple) where each element is a list composed with NP I (itself an n-ary product, because generally, algebraic datatype constructors might have more than one field)
Code t is the sum-of-products type representation of t. It's a list of lists of types, e.g. Code (Either a b) ~ '[ '[a], '[b] ]. The generic value representation of t is SOP I (Code t) - a sum of products over the "code".
To implement this, we can convert each t to its generic representation, then fold over the resulting list:
partitionSum = partitionSumGeneric . map from
partitionSumGeneric :: SListI xss => [SOP I xss] -> NP ([] :.: NP I) xss
partitionSumGeneric = foldr (\(SOP x) -> classifyGeneric x) emptyClassifier
partitionSumGeneric is pretty much the same as partitionSum, but operates on generic representations of values.
Now for the interesting part. Let's begin with the base case of our fold. This should contain empty lists in every position. generics-sop provides a handy mechanism for generating a product type with a uniform value in each position:
emptyClassifier :: SListI xs => NP ([] :.: NP I) xs
emptyClassifier = hpure (Comp1 [])
The recursive case is as follows: if the value has a tag at index k, add that value to the list at index k in the accumulator. We can do this with simultaneous recursion on both the sum type (it's generic now, so a value of type NS (NP I) xss - a sum of products) and the accumulator.
classifyGeneric :: NS (NP I) xss -> NP ([] :.: NP I) xss -> NP ([] :.: NP I) xss
classifyGeneric (Z x) (Comp1 l :* ls) = (Comp1 $ x : l) :* ls
classifyGeneric (S xs) ( l :* ls) = l :* classifyGeneric xs ls
Your example with some added data to make it a bit more interesting:
data MySum
  = CaseA A
  | CaseB B
  | CaseC C
  | CaseD D

-- All that's needed for `partitionSum' to work with your type
deriving instance GHC.Generic MySum
instance Generic MySum

data A = A Int deriving Show
data B = B String Int deriving Show
data C = C deriving Show
data D = D Integer deriving Show
test = partitionSum $
  [CaseD $ D 0, CaseB $ B "x" 1, CaseA $ A 2, CaseA $ A 3, CaseB $ B "y" 4, CaseB $ B "z" 5]
the result is:
Comp1 {unComp1 = [I (A 2) :* Nil,I (A 3) :* Nil]} :* Comp1 {unComp1 = [I (B "x" 1) :* Nil,I (B "y" 4) :* Nil,I (B "z" 5) :* Nil]} :* Comp1 {unComp1 = []} :* Comp1 {unComp1 = [I (D 0) :* Nil]} :* Nil
Starting with a concrete instance of my question, we all know (and love) the Monad type class:
class ... => Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
  ...
Consider the following would-be instance, where we modify the standard list/"nondeterminism" instance using nub to retain only one copy of each "outcome":
newtype DistinctList a = DL { dL :: [a] }

instance Monad DistinctList where
  return = DL . return
  x >>= f = DL . nub $ (dL x) >>= (dL . f)
...Do you spot the error? The problem is that nub :: Eq a => [a] -> [a] and so x >>= f is only defined under the condition f :: Eq b => a -> DistinctList b, whereas the compiler demands f :: a -> DistinctList b. Is there some way I can proceed anyway?
Stepping back, suppose I have a would-be instance that is only defined under some condition on the parametric type's variable. I understand that this is generally not allowed because other code written with the type class cannot be guaranteed to supply parameter values that obey the condition. But are there circumstances where this still can be carried out? If so, how?
Here is an adaptation of the technique applied in set-monad to your case.
Note there is, as there must be, some "cheating". The structure includes extra value constructors to represent "return" and "bind". These act as suspended computations that need to be run. The Eq constraint lives only on the run function, while the constructors that create the "suspension" are Eq-free.
{-# LANGUAGE GADTs #-}
import qualified Data.List as L
import qualified Data.Functor as F
import qualified Control.Applicative as A
import Control.Monad
-- for reference, the bind operation to be implemented
-- bind operation requires Eq
dlbind :: Eq b => [a] -> (a -> [b]) -> [b]
dlbind xs f = L.nub $ xs >>= f
-- data structure comes with incorporated return and bind
-- `Prim xs` wraps a list into a DL
data DL a where
  Prim :: [a] -> DL a
  Return :: a -> DL a
  Bind :: DL a -> (a -> DL b) -> DL b
-- converts a DL to a list
run :: Eq a => DL a -> [a]
run (Prim xs) = xs
run (Return x) = [x]
run (Bind (Prim xs) f) = L.nub $ concatMap (run . f) xs
run (Bind (Return x) f) = run (f x)
run (Bind (Bind ma f) g) = run (Bind ma (\a -> Bind (f a) g))
-- lifting of Eq and Show instance
-- Note: you probably should provide a different instance
-- one where eq doesn't depend on the position of the elements
-- otherwise you break functor laws (and everything else)
instance (Eq a) => Eq (DL a) where
  dxs == dys = run dxs == run dys

-- this "cheats", i.e. it will convert to lists in order to show.
-- executing returns and binds in the process
instance (Show a, Eq a) => Show (DL a) where
  show = show . run

-- uses the monad instance
instance F.Functor DL where
  fmap = liftM

-- uses the monad instance
instance A.Applicative DL where
  pure = return
  (<*>) = ap

-- builds the DL using Return and Bind constructors
instance Monad DL where
  return = Return
  (>>=) = Bind
-- examples with bind for a "normal list" and a "distinct list"
list = [1,2,3,4] >>= (\x -> [x `mod` 2, x `mod` 3])
dlist = (Prim [1,2,3,4]) >>= (\x -> Prim [x `mod` 2, x `mod` 3])
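Loading the above into GHCi, the two examples show the effect of the deferred nub (output as I'd expect from run, i.e. dlist is the deduplicated version of list):

> list
[1,1,0,2,1,0,0,1]
> dlist
[1,0,2]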
And here is a dirty hack to make it more efficient, addressing the points raised below about evaluation of bind.
{-# LANGUAGE GADTs #-}
import qualified Data.List as L
import qualified Data.Set as S
import qualified Data.Functor as F
import qualified Control.Applicative as A
import Control.Monad
dlbind xs f = L.nub $ xs >>= f
data DL a where
  Prim :: Eq a => [a] -> DL a
  Return :: a -> DL a
  Bind :: DL b -> (b -> DL a) -> DL a
  -- Fail :: DL a -- could be added to clear failure chains
run :: Eq a => DL a -> [a]
run (Prim xs) = xs
run (Return x) = [x]
run b@(Bind _ _) =
  case foldChain b of
    (Bind (Prim xs) f) -> L.nub $ concatMap (run . f) xs
    (Bind (Return a) f) -> run (f a)
    (Bind (Bind ma f) g) -> run (Bind ma (\a -> Bind (f a) g))
-- fold a chain ((( ... >>= f) >>= g) >>= h
foldChain :: DL u -> DL u
foldChain (Bind b2 g) = stepChain $ Bind (foldChain b2) g
foldChain dxs = dxs
-- simplify (Prim _ >>= f) >>= g
-- if (f x = Prim _)
-- then reduce to (Prim _ >>= g)
-- else preserve (Prim _ >>= f) >>= g
stepChain :: DL u -> DL u
stepChain b@(Bind (Bind (Prim xs) f) g) =
  let dys = map f xs
      pms = [Prim ys | Prim ys <- dys]
      ret = [Return ys | Return ys <- dys]
      bnd = [Bind ys f | Bind ys f <- dys]
  in case (pms, ret, bnd) of
       -- ([],[],[]) -> Fail -- could clear failure
       (dxs@(Prim ys:_),[],[]) -> let Prim xs = joinPrims dxs (Prim $ mkEmpty ys)
                                  in Bind (Prim $ L.nub xs) g
       _ -> b
stepChain dxs = dxs
-- empty list with type via proxy
mkEmpty :: proxy a -> [a]
mkEmpty proxy = []
-- concatenate Prims into one Prim
joinPrims [] dys = dys
joinPrims (Prim zs : dzs) dys = let Prim xs = joinPrims dzs dys in Prim (zs ++ xs)
instance (Ord a) => Eq (DL a) where
  dxs == dys = run dxs == run dys

instance (Ord a) => Ord (DL a) where
  compare dxs dys = compare (run dxs) (run dys)

instance (Show a, Eq a) => Show (DL a) where
  show = show . run

instance F.Functor DL where
  fmap = liftM

instance A.Applicative DL where
  pure = return
  (<*>) = ap

instance Monad DL where
  return = Return
  (>>=) = Bind
-- cheating here, Prim is needed for efficiency
return' x = Prim [x]
s = [1,2,3,4] >>= (\x -> [x `mod` 2, x `mod` 3])
t = (Prim [1,2,3,4]) >>= (\x -> Prim [x `mod` 2, x `mod` 3])
r' = ((Prim [1..1000]) >>= (\x -> return' 1)) >>= (\x -> Prim [1..1000])
If your type could be a Monad, then it would need to work in functions that are parameterized across all monads, or across all applicatives. But it can't, because people store all kinds of weird things in their monads. Most notably, functions are very often stored as the value in an applicative context. For example, consider:
pairs :: Applicative f => f a -> f b -> f (a, b)
pairs xs ys = (,) <$> xs <*> ys
Even though a and b are both Eq, in order to combine them into an (a, b) pair, we needed to first fmap a function into xs, briefly producing a value of type f (b -> (a, b)). If we let f be your DL monad, we see that this can't work, because this function type has no Eq instance.
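To make that intermediate value explicit, here is a sketch of pairs with the fmap spelled out (pairsExpanded and step are names introduced here purely for illustration):

pairsExpanded :: Applicative f => f a -> f b -> f (a, b)
pairsExpanded xs ys =
  let step = fmap (,) xs   -- step :: f (b -> (a, b)): a context holding functions
  in  step <*> ys          -- an Eq-constrained (<*>) would need Eq (b -> (a, b)), which doesn't exist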
Since pairs is guaranteed to work for all Applicatives, and it does not work for your type, we can be sure your type is not Applicative. And since all Monads are also Applicative, we can conclude that your type cannot possibly be made an instance of Monad: it would violate the laws.
In Haskell, I recently found the following function useful:
listCase :: (a -> [a] -> b) -> [a] -> [b]
listCase f [] = []
listCase f (x:xs) = f x xs : listCase f xs
I used it to generate sliding windows of size 3 from a list, like this:
*Main> listCase (\_ -> take 3) [1..5]
[[2,3,4],[3,4,5],[4,5],[5],[]]
Is there a more general recursion scheme which captures this pattern? More specifically, one that allows you to generate some structure of results by repeatedly breaking data into a "head" and a "tail"?
What you are asking for is a comonad. This may sound scarier than monad, but is a simpler concept (YMMV).
Comonads are Functors with additional structure:
class Functor w => Comonad w where
  extract :: w a -> a
  duplicate :: w a -> w (w a)
  extend :: (w a -> b) -> w a -> w b
(extend and duplicate can be defined in terms of each other)
and laws similar to the monad laws:
extract . duplicate = id
fmap extract . duplicate = id
duplicate . duplicate = fmap duplicate . duplicate
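For reference, the standard mutual definitions, which the class above leaves implicit:

extend f = fmap f . duplicate
duplicate = extend id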
Specifically, an argument of type (a -> [a] -> b) amounts to a function on non-empty lists of a, given as a head and a tail. The usual list type [a] is not a comonad, but non-empty lists are:
data NE a = T a | a :. NE a deriving Functor

instance Comonad NE where
  extract (T x) = x
  extract (x :. _) = x
  duplicate z@(T _) = T z
  duplicate z@(_ :. xs) = z :. duplicate xs
The comonad laws essentially force this instance for non-empty lists (there is actually one other lawful choice).
Your function then becomes
extend (take 3 . drop 1 . toList)
Where toList :: NE a -> [a] is obvious.
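Spelled out, that "obvious" toList is just:

toList :: NE a -> [a]
toList (T x) = [x]
toList (x :. xs) = x : toList xs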
This is worse than the original, but extend can be written as =>> which is simpler if applied repeatedly.
For further information, you may start at What is the Comonad typeclass in Haskell?.
This looks like a special case of a (jargon here but it can help with googling) paramorphism, a generalisation of primitive recursion to all initial algebras.
Reimplementing ListCase
Let's have a look at how to reimplement your function using such a combinator. First we define the notion of paramorphism: a recursion principle where not only the result of the recursive call is available but also the entire substructure this call was performed on:
The type of paraList tells me that in the (:) case, I will have access to the head, the tail and the value of the recursive call on the tail and that I need to provide a value for the base case.
module ListCase where

paraList :: (a -> [a] -> b -> b) -- cons
         -> b                    -- nil
         -> [a] -> b             -- resulting function on lists
paraList c n []       = n
paraList c n (x : xs) = c x xs $ paraList c n xs
We can now give an alternative definition of listCase:
listCase' :: (a -> [a] -> b) -> [a] -> [b]
listCase' c = paraList (\ x xs tl -> c x xs : tl) []
Considering the general case
In the general case, we are interested in building a definition of paramorphism for all data structures defined as the fixpoint of a (strictly positive) functor. We use the traditional fixpoint operator:
newtype Fix f = Fix { unFix :: f (Fix f) }
This builds an inductive structure layer by layer. The layers have an f shape, which may be better grasped by recalling the definition of List using this formalism. A layer is either Nothing (we're done!) or Just (head, tail):
newtype ListF a as = ListF { unListF :: Maybe (a, as) }
type List a = Fix (ListF a)
nil :: List a
nil = Fix $ ListF $ Nothing
cons :: a -> List a -> List a
cons = curry $ Fix . ListF . Just
Now that we have this general framework, we can define para generically for all Fix f where f is a functor:
para :: Functor f => (f (Fix f, b) -> b) -> Fix f -> b
para alg = alg . fmap (\ rec -> (rec, para alg rec)) . unFix
Of course, ListF a is a functor. Meaning we could use para to reimplement paraList and listCase.
instance Functor (ListF a) where fmap f = ListF . fmap (fmap f) . unListF
paraList' :: (a -> List a -> b -> b) -> b -> List a -> b
paraList' c n = para $ maybe n (\ (a, (as, b)) -> c a as b) . unListF
listCase'' :: (a -> List a -> b) -> List a -> List b
listCase'' c = paraList' (\ x xs tl -> cons (c x xs) tl) nil
You can implement a simple bijection toList, fromList to test it if you want. I could not be bothered to reimplement take so it's pretty ugly:
toList :: [a] -> List a
toList = foldr cons nil
fromList :: List a -> [a]
fromList = paraList' (\ x _ tl -> x : tl) []
*ListCase> fmap fromList . fromList . listCase'' (\ _ as -> toList $ take 3 $ fromList as) . toList $ [1..5]
[[2,3,4],[3,4,5],[4,5],[5],[]]
I have a type synonym type Entity = ([Feature], Body) for whatever Feature and Body mean. Objects of Entity type are to be grouped together:
type Bunch = [Entity]
and the assumption, crucial for the algorithm working with Bunch, is that any two entities in the same bunch have the same number of features.
If I were to implement this constraint in an OOP language, I would add the corresponding check to the method encapsulating the addition of entities into a bunch.
Is there a better way to do it in Haskell? Preferably, on the definition level. (If the definition of Entity also needs to be changed, no problem.)
Using type-level length annotations
So here's the deal. Haskell does have type-level natural numbers, and you can attach them to types as "phantom type" parameters. However you do it, the types will look like this:
data Z
data S n
data LAList x len = LAList [x] -- length-annotated list
Then you can add some construction functions for convenience:
lalist1 :: x -> LAList x (S Z)
lalist1 x = LAList [x]
lalist2 :: x -> x -> LAList x (S (S Z))
lalist2 x y = LAList [x, y]
-- ...
And then you've got more generic methods:
(~:) :: x -> LAList x n -> LAList x (S n)
x ~: LAList xs = LAList (x : xs)
infixr 5 ~:
nil :: LAList x Z
nil = LAList []
lahead :: LAList x (S n) -> x
lahead (LAList xs) = head xs
latail :: LAList x (S n) -> LAList x n
latail (LAList xs) = tail xs
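A quick sketch of how these combinators fit together (example and firstOfTwo are names made up here for illustration):

example :: LAList Int (S (S Z))
example = 1 ~: 2 ~: nil        -- statically known to contain exactly two elements

firstOfTwo :: Int
firstOfTwo = lahead example    -- accepted: the length index S (S Z) matches S n
-- lahead nil                  -- rejected at compile time: Z doesn't match S n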
Haskell's built-in list type, by itself, doesn't carry any of this length information, because doing so complicates things. You may be interested in the Data.FixedList package for a somewhat different approach, too. Basically every approach is going to start off looking a little weird, with some data types that have no constructors, but it starts to look normal after a little bit.
You might also be able to get a typeclass so that all of the lalist1, lalist2 operators above can be replaced with
class FixedLength t where
  la :: t x -> LAList x n
but you will probably need the -XTypeSynonymInstances flag to do this, as you want to do something like
type Pair x = (x, x)

instance FixedLength Pair where
  la :: Pair x -> LAList x (S (S Z))
  la (a, b) = LAList [a, b]
(it's a kind mismatch when you go from (a, b) to Pair a).
Using runtime checking
You can very easily take a different approach and encapsulate all of this as a runtime error or explicitly model the error in your code:
-- this may change if you change your definition of the Bunch type
features :: Entity -> [Feature]
features = fst
-- we also assume a runBunch :: [Entity] -> Something function
-- that you're trying to run on this Bunch.
allTheSame :: (Eq x) => [x] -> Bool
allTheSame (x : xs) = all (x ==) xs
allTheSame [] = True
permissiveBunch :: [Entity] -> Maybe Something
permissiveBunch es
  | allTheSame (map (length . features) es) = Just (runBunch es)
  | otherwise = Nothing

strictBunch :: [Entity] -> Something
strictBunch es
  | allTheSame (map (length . features) es) = runBunch es
  | otherwise = error ("runBunch requires all feature lists to be the same length; saw instead " ++ show (map (length . features) es))
Then your runBunch can just assume that all the lengths are the same, since that is explicitly checked above. You can get around pattern-matching weirdnesses with, say, the zip :: [a] -> [b] -> [(a, b)] function from the Prelude, if you need to pair up the features next to each other. (The issue being avoided here is that if you pattern-match two lists in parallel, writing only runBunch' (x:xs) (y:ys) and runBunch' [] [], Haskell warns that there are 2 patterns you've not considered in the match.)
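For example, a minimal sketch of that zip-based pairing (pairFeatures is a name introduced here, not something from the question):

-- Pair up corresponding features of two entities; sensible once the
-- length check in permissiveBunch/strictBunch has already passed.
pairFeatures :: Entity -> Entity -> [(Feature, Feature)]
pairFeatures e1 e2 = zip (features e1) (features e2)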
Using tuples and type classes
One final way to do it which is a compromise between the two (but makes for pretty good Haskell code) involves making Entity parametrized over all features:
type Entity x = (x, Body)
and then including a function which can zip different entities of different lengths together:
class ZippableFeatures z where
  fzip :: z -> z -> [(Feature, Feature)]

instance ZippableFeatures () where
  fzip () () = []

instance ZippableFeatures Feature where
  fzip f1 f2 = [(f1, f2)]

instance ZippableFeatures (Feature, Feature) where
  fzip (a1, a2) (b1, b2) = [(a1, b1), (a2, b2)]
Then you can use tuples for your feature lists, as long as they don't get any larger than the maximum tuple length (which is 15 on my GHC). If you go larger than that, of course, you can always define your own data types, but it's not going to be as general as type-annotated lists.
If you do this, your type signature for runBunch will simply look like:
runBunch :: (ZippableFeatures z) => [Entity z] -> Something
When you run it on things with the wrong number of features you'll get compiler errors that it can't unify the type (a, b) with (a, b, c).
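As a concrete illustration, here is one hedged sketch of such a runBunch, picking neighbouring-pair comparison purely as a stand-in for whatever Something really is:

runBunch :: ZippableFeatures z => [Entity z] -> [[(Feature, Feature)]]
runBunch es = zipWith fzip feats (drop 1 feats)  -- fzip each entity's features with its neighbour's
  where feats = map fst es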
There are various ways to enforce length constraints like that; here's one:
{-# LANGUAGE DataKinds, KindSignatures, GADTs, TypeFamilies #-}
import Prelude hiding (foldr)
import Data.Foldable
import Data.Monoid
import Data.Traversable
import Control.Applicative
data Feature -- Whatever that really is
data Body -- Whatever that really is
data Nat = Z | S Nat -- Natural numbers
type family Plus (m::Nat) (n::Nat) where -- Type level natural number addition
  Plus Z n = n
  Plus (S m) n = S (Plus m n)

data LList (n :: Nat) a where -- Lists tagged with their length at the type level
  Nil :: LList Z a
  Cons :: a -> LList n a -> LList (S n) a
Some functions on these lists:
llHead :: LList (S n) a -> a
llHead (Cons x _) = x
llTail :: LList (S n) a -> LList n a
llTail (Cons _ xs) = xs
llAppend :: LList m a -> LList n a -> LList (Plus m n) a
llAppend Nil ys = ys
llAppend (Cons x xs) ys = Cons x (llAppend xs ys)
data Entity n = Entity (LList n Feature) Body
data Bunch where
  Bunch :: [Entity n] -> Bunch
Some instances:
instance Functor (LList n) where
  fmap f Nil = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)

instance Foldable (LList n) where
  foldMap f Nil = mempty
  foldMap f (Cons x xs) = f x `mappend` foldMap f xs

instance Traversable (LList n) where
  traverse f Nil = pure Nil
  traverse f (Cons x xs) = Cons <$> f x <*> traverse f xs
And so on. Note that n in the definition of Bunch is existential. It can be anything, and what it actually is doesn't affect the type—all bunches have the same type. This limits what you can do with bunches to a certain extent. Alternatively, you can tag the bunch with the length of its feature lists. It all depends what you need to do with this stuff in the end.
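A minimal sketch of that alternative (TaggedBunch is a name invented here), making the shared feature-list length visible in the bunch's type:

-- Every entity in a TaggedBunch n has exactly n features.
data TaggedBunch (n :: Nat) = TaggedBunch [Entity n]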
I'm trying to implement a kind of zipper for length-indexed lists which would return each item of the list paired with a list where that element is removed. E.g. for ordinary lists:
zipper :: [a] -> [(a, [a])]
zipper = go [] where
  go _ [] = []
  go prev (x:xs) = (x, prev ++ xs) : go (prev ++ [x]) xs
So that
> zipper [1..5]
[(1,[2,3,4,5]), (2,[1,3,4,5]), (3,[1,2,4,5]), (4,[1,2,3,5]), (5,[1,2,3,4])]
My current attempt at implementing the same thing for length-indexed lists:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeFamilies #-}
data Nat = Zero | Succ Nat
type One = Succ Zero
type family (+) (a :: Nat) (b :: Nat) :: Nat
type instance (+) Zero n = n
type instance (+) (Succ n) m = Succ (n + m)
data List :: Nat -> * -> * where
  Nil :: List Zero a
  Cons :: a -> List size a -> List (Succ size) a
single :: a -> List One a
single a = Cons a Nil
cat :: List a i -> List b i -> List (a + b) i
cat Nil ys = ys
cat (Cons x xs) ys = Cons x (xs `cat` ys)
zipper :: List (Succ n) a -> List (Succ n) (a, List n a)
zipper = go Nil where
  go :: (p + Zero) ~ p
     => List p a -> List (Succ q) a -> List (Succ q) (a, List (p + q) a)
  go prev (Cons x Nil) = single (x, prev)
  go prev (Cons x xs) = (x, prev `cat` xs) `Cons` go (prev `cat` single x) xs
This feels like it should be rather straightforward, but as there doesn't seem to be any way to convey to GHC that e.g. + is commutative and associative or that zero is the identity, I'm running into lots of problems where the type checker (understandably) complains that it cannot determine that a + b ~ b + a or that a + Zero ~ a.
Do I need to add some sort of proof objects (data Refl a b where Refl :: Refl a a et al.) or is there some way to make this work with just adding more explicit type signatures?
Alignment
Dependently typed programming is like doing two jigsaws which some rogue has glued together. Less metaphorically, we express simultaneous computations at the value level and at the type level, and we must ensure their compatibility. Of course, we are each our own rogue, so if we can arrange for the jigsaws to be glued in alignment, we shall have an easier time of it. When you see proof obligations for type repair, you might be tempted to ask
Do I need to add some sort of proof objects (data Refl a b where Refl :: Refl a a et al.) or is there some way to make this work with just adding more explicit type signatures?
But you might first consider in what way the value- and type-level computations are out of alignment, and whether there is any hope to bring them closer.
A Solution
The question here is how to compute the vector (length-indexed list) of selections from a vector. So we'd like something with type
List (Succ n) a -> List (Succ n) (a, List n a)
where the element in each input position gets decorated with the one-shorter vector of its siblings. The proposed method is to scan left-to-right, accumulating the elder siblings in a list which grows on the right, then concatenate with the younger siblings at each position. Growing lists on the right is always a worry, especially when the Succ for the length is aligned to the Cons on the left. The need for concatenation necessitates type-level addition, but the arithmetic resulting from right-ended activity is out of alignment with the computation rules for addition. I'll get back to this style in a bit, but let's try thinking it out again.
Before we get into any accumulator-based solution, let's just try bog standard structural recursion. We have the "one" case and the "more" case.
picks (Cons x xs@Nil)        = Cons (x, xs) Nil
picks (Cons x xs@(Cons _ _)) = Cons (x, xs) (undefined (picks xs))
In both cases, we put the first decomposition at the front. In the second case, we have checked that the tail is nonempty, so we can ask for its selections. We have
x :: a
xs :: List (Succ n) a
picks xs :: List (Succ n) (a, List n a)
and we want
Cons (x, xs) (undefined (picks xs)) :: List (Succ (Succ n)) (a, List (Succ n) a)
undefined (picks xs) :: List (Succ n) (a, List (Succ n) a)
so the undefined needs to be a function which grows all the sibling lists by reattaching x at the left end (and left-endedness is good). So, I define the Functor instance for List n
instance Functor (List n) where
  fmap f Nil = Nil
  fmap f (Cons x xs) = Cons (f x) (fmap f xs)
and I curse the Prelude and
import Control.Arrow((***))
so that I can write
picks (Cons x xs@Nil)        = Cons (x, xs) Nil
picks (Cons x xs@(Cons _ _)) = Cons (x, xs) (fmap (id *** Cons x) (picks xs))
which does the job with not a hint of addition, let alone a proof about it.
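A quick sanity check of this version (a sketch; it assumes picks carries the signature List (Succ n) a -> List (Succ n) (a, List n a) stated earlier, and pickExample is a name made up here):

pickExample :: List (Succ (Succ (Succ Zero))) (Int, List (Succ (Succ Zero)) Int)
pickExample = picks (Cons 1 (Cons 2 (Cons 3 Nil)))
-- = Cons (1, Cons 2 (Cons 3 Nil))
--     (Cons (2, Cons 1 (Cons 3 Nil))
--       (Cons (3, Cons 1 (Cons 2 Nil)) Nil))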
Variations
I got annoyed about doing the same thing in both lines, so I tried to wriggle out of it:
picks :: m ~ Succ n => List m a -> List m (a, List n a) -- DOESN'T TYPECHECK
picks Nil = Nil
picks (Cons x xs) = Cons (x, xs) (fmap (id *** (Cons x)) (picks xs))
But GHC solves the constraint aggressively and refuses to allow Nil as a pattern. And it's correct to do so: we really shouldn't be computing in a situation where we know statically that Zero ~ Succ n, as we can easily construct some segfaulting thing. The trouble is just that I put my constraint in a place with too global a scope.
Instead, I can declare a wrapper for the result type.
data Pick :: Nat -> * -> * where
  Pick :: {unpick :: (a, List n a)} -> Pick (Succ n) a
The Succ n return index means the nonemptiness constraint is local to a Pick. A helper function does the left-end extension,
pCons :: a -> Pick n a -> Pick (Succ n) a
pCons b (Pick (a, as)) = Pick (a, Cons b as)
leaving us with
picks' :: List m a -> List m (Pick m a)
picks' Nil = Nil
picks' (Cons x xs) = Cons (Pick (x, xs)) (fmap (pCons x) (picks' xs))
and if we want
picks = fmap unpick . picks'
That's perhaps overkill, but it might be worth it if we want to separate older and younger siblings, splitting lists in three, like this:
data Pick3 :: Nat -> * -> * where
  Pick3 :: List m a -> a -> List n a -> Pick3 (Succ (m + n)) a
pCons3 :: a -> Pick3 n a -> Pick3 (Succ n) a
pCons3 b (Pick3 bs x as) = Pick3 (Cons b bs) x as
picks3 :: List m a -> List m (Pick3 m a)
picks3 Nil = Nil
picks3 (Cons x xs) = Cons (Pick3 Nil x xs) (fmap (pCons3 x) (picks3 xs))
Again, all the action is left-ended, so we're fitting nicely with the computational behaviour of +.
Accumulating
If we want to keep the style of the original attempt, accumulating the elder siblings as we go, we could do worse than to keep them zipper-style, storing the closest element in the most accessible place. That is, we can store the elder siblings in reverse order, so that at each step we need only Cons, rather than concatenating. When we want to build the full sibling list in each place, we need to use reverse-concatenation (really, plugging a sublist into a list zipper). You can type revCat easily for vectors if you deploy the abacus-style addition:
type family (+/) (a :: Nat) (b :: Nat) :: Nat
type instance (+/) Zero n = n
type instance (+/) (Succ m) n = m +/ Succ n
That's the addition which is in alignment with the value-level computation in revCat, defined thus:
revCat :: List m a -> List n a -> List (m +/ n) a
revCat Nil ys = ys
revCat (Cons x xs) ys = revCat xs (Cons x ys)
We acquire a zipperized go version
picksr :: List (Succ n) a -> List (Succ n) (a, List n a)
picksr = go Nil where
  go :: List p a -> List (Succ q) a -> List (Succ q) (a, List (p +/ q) a)
  go p (Cons x xs@Nil)        = Cons (x, revCat p xs) Nil
  go p (Cons x xs@(Cons _ _)) = Cons (x, revCat p xs) (go (Cons x p) xs)
and nobody proved anything.
Conclusion
Leopold Kronecker should have said
God made the natural numbers to perplex us: all the rest is the work of man.
One Succ looks very like another, so it is very easy to write down expressions which give the size of things in a way which is out of alignment with their structure. Of course, we can and should (and are about to) equip GHC's constraint solver with improved kit for type-level numerical reasoning. But before that kicks in, it's worth just conspiring to align the Conses with the Succs.