Type-level constraints in Haskell - haskell

I'm trying to encode term algebra as a data type in Haskell for further use in an algorithm. By term algebra I mean a set of terms which are either variables or functions applied to other terms. The functions with zero arguments are constants (but that's actually does not matter here).
Firstly, one would need the following GHC language extensions to replicate my code:
{-# LANGUAGE GADTs, DataKinds,
TypeApplications,
TypeFamilies,
TypeOperators,
StandaloneKindSignatures,
UndecidableInstances#-}
and the following imports:
import qualified GHC.TypeLits as GTL
import Data.Kind
The direct way to encode terms (the first one I took):
data Term where
Var :: String -> Term
Func :: String -> Integer -> [Term] -> Term
where by String I want to encode the name, by Integer the arity and by [Terms] the list of arguments of a function.
Then I want to be sure that the list of terms as arguments have the same length as an arity.
The first idea is to use smart constructors, but I would like to encode such constraints on a type level. So the second idea would be to create type-level naturals, lists of a specified length and the data type where these numbers coincide:
data Z = Z
data S a = S a
data List n a where
Nil :: List Z a
Cons :: a -> List m a -> List (S m) a
data WWTerm where
WWVar :: String -> WWTerm
WWFunc :: String -> m -> List m WWTerm -> WWTerm
My question here is the following: is there a way to impose type-level constraints using ordinary lists at the same time via 1) creating special type-families, 2) creating special type classes, or 3) via constraints in data types?
Regarding my attempts, I wrote the following:
type MyLength :: [Type] -> GTL.Nat
type family MyLength xs where
MyLength '[] = 0
MyLength (x':xs) = 1 GTL.+ (MyLength xs)
data QFunc n l where
QFunc :: (MyLength l ~ n) => String -> (n :: GTL.Nat) -> l -> QFunc n l
Unfortunately, this part of code doesn't compile for the following reasons:
Expected a type, but ‘n :: Nat’ has kind ‘Nat’
...
Expected a type, but ‘l’ has kind ‘[*]’
...
Are there any thoughts on how to approach my goal?

Related

Matrix dimensionality checks at compile time

I was wondering whether is was possible to use the matrix type from Data.Matrix to construct another type with which it becomes possible to perform dimensionality checking at compile time.
E.g. I want to be able to write a function like:
mmult :: Matrix' r c -> Matrix' c r -> Matrix' r r
mmult = ...
However, I don't see how to do this since the arguments to a type constructor Matrix' would have to be types and not integer constants.
I don't see how to do this since the arguments to a type constructor Matrix' would have to be types and not integer constants
They do need to be type-level values, but not necessarily types. “Type-level” basically just means known at compile-time, but this also contains stuff that isn't really types. Types are in particular the type-level values of kind Type, but you can also have type-level strings or, indeed, natural numbers.
{-# LANGUAGE DataKinds, KindSignatures #-}
import GHC.TypeLits
import Data.Matrix
newtype Matrix' (n :: Nat) (m :: Nat) a
= StaMat {getStaticSizeMatrix :: Matrix a}
mmult :: Num a => Matrix' n m a -> Matrix' l n a -> Matrix' l m a
mmult (StaMat f) (StaMat g) = StaMat $ multStd f g
I would remark that matrices are only a special case of a much more general mathematical concept, that of linear maps between vector spaces. And since vector spaces can be seen as particular types, it actually makes a lot of sense to not use mere integers as the type-level tags, but the actual spaces. What you have then is a category, and it allows you to deal with both dynamic- and static size matrix/vector types, and can even be generalised to completely different spaces like infinite-dimensional Hilbert spaces.
{-# LANGUAGE GADTs #-}
newtype StaVect (n :: Nat) a
= StaVect {getStaticSizeVect :: Vector a}
data LinMap v w where
StaMat :: Matrix a -> LinMap (StaVec n a) (StaVec m a)
-- ...Add more constructors for mappings between other sorts of vector spaces...
linCompo :: LinMap v w -> LinMap u v -> LinMap u w
linCompo (StaMat f) (StaMat g) = StaMat $ multStd f g
The linearmap-category package pursues this direction.

Can a Haskell type constructor have non-type parameters?

A type constructor produces a type given a type. For example, the Maybe constructor
data Maybe a = Nothing | Just a
could be a given a concrete type, like Char, and give a concrete type, like Maybe Char. In terms of kinds, one has
GHCI> :k Maybe
Maybe :: * -> *
My question: Is it possible to define a type constructor that yields a concrete type given a Char, say? Put another way, is it possible to mix kinds and types in the type signature of a type constructor? Something like
GHCI> :k my_type
my_type :: Char -> * -> *
Can a Haskell type constructor have non-type parameters?
Let's unpack what you mean by type parameter. The word type has (at least) two potential meanings: do you mean type in the narrow sense of things of kind *, or in the broader sense of things at the type level? We can't (yet) use values in types, but modern GHC features a very rich kind language, allowing us to use a wide range of things other than concrete types as type parameters.
Higher-Kinded Types
Type constructors in Haskell have always admitted non-* parameters. For example, the encoding of the fixed point of a functor works in plain old Haskell 98:
newtype Fix f = Fix { unFix :: f (Fix f) }
ghci> :k Fix
Fix :: (* -> *) -> *
Fix is parameterised by a functor of kind * -> *, not a type of kind *.
Beyond * and ->
The DataKinds extension enriches GHC's kind system with user-declared kinds, so kinds may be built of pieces other than * and ->. It works by promoting all data declarations to the kind level. That is to say, a data declaration like
data Nat = Z | S Nat -- natural numbers
introduces a kind Nat and type constructors Z :: Nat and S :: Nat -> Nat, as well as the usual type and value constructors. This allows you to write datatypes parameterised by type-level data, such as the customary vector type, which is a linked list indexed by its length.
data Vec n a where
Nil :: Vec Z a
(:>) :: a -> Vec n a -> Vec (S n) a
ghci> :k Vec
Vec :: Nat -> * -> *
There's a related extension called ConstraintKinds, which frees constraints like Ord a from the yoke of the "fat arrow" =>, allowing them to roam across the landscape of the type system as nature intended. Kmett has used this power to build a category of constraints, with the newtype (:-) :: Constraint -> Constraint -> * denoting "entailment": a value of type c :- d is a proof that if c holds then d also holds. For example, we can prove that Ord a implies Eq [a] for all a:
ordToEqList :: Ord a :- Eq [a]
ordToEqList = Sub Dict
Life after forall
However, Haskell currently maintains a strict separation between the type level and the value level. Things at the type level are always erased before the program runs, (almost) always inferrable, invisible in expressions, and (dependently) quantified by forall. If your application requires something more flexible, such as dependent quantification over runtime data, then you have to manually simulate it using a singleton encoding.
For example, the specification of split says it chops a vector at a certain length according to its (runtime!) argument. The type of the output vector depends on the value of split's argument. We'd like to write this...
split :: (n :: Nat) -> Vec (n :+: m) a -> (Vec n a, Vec m a)
... where I'm using the type function (:+:) :: Nat -> Nat -> Nat, which stands for addition of type-level naturals, to ensure that the input vector is at least as long as n...
type family n :+: m where
Z :+: m = m
S n :+: m = S (n :+: m)
... but Haskell won't allow that declaration of split! There aren't any values of type Z or S n; only types of kind * contain values. We can't access n at runtime directly, but we can use a GADT which we can pattern-match on to learn what the type-level n is:
data Natty n where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
ghci> :k Natty
Natty :: Nat -> *
Natty is called a singleton, because for a given (well-defined) n there is only one (well-defined) value of type Natty n. We can use Natty n as a run-time stand-in for n.
split :: Natty n -> Vec (n :+: m) a -> (Vec n a, Vec m a)
split Zy xs = (Nil, xs)
split (Sy n) (x :> xs) =
let (ys, zs) = split n xs
in (x :> ys, zs)
Anyway, the point is that values - runtime data - can't appear in types. It's pretty tedious to duplicate the definition of Nat in singleton form (and things get worse if you want the compiler to infer such values); dependently-typed languages like Agda, Idris, or a future Haskell escape the tyranny of strictly separating types from values and give us a range of expressive quantifiers. You're able to use an honest-to-goodness Nat as split's runtime argument and mention its value dependently in the return type.
#pigworker has written extensively about the unsuitability of Haskell's strict separation between types and values for modern dependently-typed programming. See, for example, the Hasochism paper, or his talk on the unexamined assumptions that have been drummed into us by four decades of Hindley-Milner-style programming.
Dependent Kinds
Finally, for what it's worth, with TypeInType modern GHC unifies types and kinds, allowing us to talk about kind variables using the same tools that we use to talk about type variables. In a previous post about session types I made use of TypeInType to define a kind for tagged type-level sequences of types:
infixr 5 :!, :?
data Session = Type :! Session -- Type is a synonym for *
| Type :? Session
| E
I'd recommend #Benjamin Hodgson's answer and the references he gives to see how to make this sort of thing useful. But, to answer your question more directly, using several extensions (DataKinds, KindSignatures, and GADTs), you can define types that are parameterized on (certain) concrete types.
For example, here's one parameterized on the concrete Bool datatype:
{-# LANGUAGE DataKinds, KindSignatures, GADTs #-}
{-# LANGUAGE FlexibleInstances #-}
module FlaggedType where
-- The single quotes below are optional. They serve to notify
-- GHC that we are using the type-level constructors lifted from
-- data constructors rather than types of the same name (and are
-- only necessary where there's some kind of ambiguity otherwise).
data Flagged :: Bool -> * -> * where
Truish :: a -> Flagged 'True a
Falsish :: a -> Flagged 'False a
-- separate instances, just as if they were different types
-- (which they are)
instance (Show a) => Show (Flagged 'False a) where
show (Falsish x) = show x
instance (Show a) => Show (Flagged 'True a) where
show (Truish x) = show x ++ "*"
-- these lists have types as indicated
x = [Truish 1, Truish 2, Truish 3] -- :: Flagged 'True Integer
y = [Falsish "a", Falsish "b", Falsish "c"] -- :: Flagged 'False String
-- this won't typecheck: it's just like [1,2,"abc"]
z = [Truish 1, Truish 2, Falsish 3] -- won't typecheck
Note that this isn't much different from defining two completely separate types:
data FlaggedTrue a = Truish a
data FlaggedFalse a = Falsish a
In fact, I'm hard pressed to think of any advantage Flagged has over defining two separate types, except if you have a bar bet with someone that you can write useful Haskell code without type classes. For example, you can write:
getInt :: Flagged a Int -> Int
getInt (Truish z) = z -- same polymorphic function...
getInt (Falsish z) = z -- ...defined on two separate types
Maybe someone else can think of some other advantages.
Anyway, I believe that parameterizing types with concrete values really only becomes useful when the concrete type is sufficient "rich" that you can use it to leverage the type checker, as in Benjamin's examples.
As #user2407038 noted, most interesting primitive types, like Ints, Chars, Strings and so on can't be used this way. Interestingly enough, though, you can use literal positive integers and strings as type parameters, but they are treated as Nats and Symbols (as defined in GHC.TypeLits) respectively.
So something like this is possible:
import GHC.TypeLits
data Tagged :: Symbol -> Nat -> * -> * where
One :: a -> Tagged "one" 1 a
Two :: a -> Tagged "two" 2 a
Three :: a -> Tagged "three" 3 a
Look at using Generalized Algebraic Data Types (GADTS), which enable you to define concrete outputs based on input type, e.g.
data CustomMaybe a where
MaybeChar :: Maybe a -> CustomMaybe Char
MaybeString :: Maybe a > CustomMaybe String
MaybeBool :: Maybe a -> CustomMaybe Bool
exampleFunction :: CustomMaybe a -> a
exampleFunction (MaybeChar maybe) = 'e'
exampleFunction (MaybeString maybe) = True //Compile error
main = do
print $ exampleFunction (MaybeChar $ Just 10)
To a similar effect, RankNTypes can allow the implementation of similar behaviour:
exampleFunctionOne :: a -> a
exampleFunctionOne el = el
type PolyType = forall a. a -> a
exampleFuntionTwo :: PolyType -> Int
exampleFunctionTwo func = func 20
exampleFunctionTwo func = func "Hello" --Compiler error, PolyType being forced to return 'Int'
main = do
print $ exampleFunctionTwo exampleFunctionOne
The PolyType definition allows you to insert the polymorphic function within exampleFunctionTwo and force its output to be 'Int'.
No. Haskell doesn't have dependent types (yet). See https://typesandkinds.wordpress.com/2016/07/24/dependent-types-in-haskell-progress-report/ for some discussion of when it may.
In the meantime, you can get behavior like this in Agda, Idris, and Cayenne.

What can type families do that multi param type classes and functional dependencies cannot

I have played around with TypeFamilies, FunctionalDependencies, and MultiParamTypeClasses. And it seems to me as though TypeFamilies doesn't add any concrete functionality over the other two. (But not vice versa). But I know type families are pretty well liked so I feel like I am missing something:
"open" relation between types, such as a conversion function, which does not seem possible with TypeFamilies. Done with MultiParamTypeClasses:
class Convert a b where
convert :: a -> b
instance Convert Foo Bar where
convert = foo2Bar
instance Convert Foo Baz where
convert = foo2Baz
instance Convert Bar Baz where
convert = bar2Baz
Surjective relation between types, such as a sort of type safe pseudo-duck typing mechanism, that would normally be done with a standard type family. Done with MultiParamTypeClasses and FunctionalDependencies:
class HasLength a b | a -> b where
getLength :: a -> b
instance HasLength [a] Int where
getLength = length
instance HasLength (Set a) Int where
getLength = S.size
instance HasLength Event DateDiff where
getLength = dateDiff (start event) (end event)
Bijective relation between types, such as for an unboxed container, which could be done through TypeFamilies with a data family, although then you have to declare a new data type for every contained type, such as with a newtype. Either that or with an injective type family, which I think is not available prior to GHC 8. Done with MultiParamTypeClasses and FunctionalDependencies:
class Unboxed a b | a -> b, b -> a where
toList :: a -> [b]
fromList :: [b] -> a
instance Unboxed FooVector Foo where
toList = fooVector2List
fromList = list2FooVector
instance Unboxed BarVector Bar where
toList = barVector2List
fromList = list2BarVector
And lastly a surjective relations between two types and a third type, such as python2 or java style division function, which can be done with TypeFamilies by also using MultiParamTypeClasses. Done with MultiParamTypeClasses and FunctionalDependencies:
class Divide a b c | a b -> c where
divide :: a -> b -> c
instance Divide Int Int Int where
divide = div
instance Divide Int Double Double where
divide = (/) . fromIntegral
instance Divide Double Int Double where
divide = (. fromIntegral) . (/)
instance Divide Double Double Double where
divide = (/)
One other thing I should also add is that it seems like FunctionalDependencies and MultiParamTypeClasses are also quite a bit more concise (for the examples above anyway) as you only have to write the type once, and you don't have to come up with a dummy type name which you then have to type for every instance like you do with TypeFamilies:
instance FooBar LongTypeName LongerTypeName where
FooBarResult LongTypeName LongerTypeName = LongestTypeName
fooBar = someFunction
vs:
instance FooBar LongTypeName LongerTypeName LongestTypeName where
fooBar = someFunction
So unless I am convinced otherwise it really seems like I should just not bother with TypeFamilies and use solely FunctionalDependencies and MultiParamTypeClasses. Because as far as I can tell it will make my code more concise, more consistent (one less extension to care about), and will also give me more flexibility such as with open type relationships or bijective relations (potentially the latter is solver by GHC 8).
Here's an example of where TypeFamilies really shines compared to MultiParamClasses with FunctionalDependencies. In fact, I challenge you to come up with an equivalent MultiParamClasses solution, even one that uses FlexibleInstances, OverlappingInstance, etc.
Consider the problem of type level substitution (I ran across a specific variant of this in Quipper in QData.hs). Essentially what you want to do is recursively substitute one type for another. For example, I want to be able to
substitute Int for Bool in Either [Int] String and get Either [Bool] String,
substitute [Int] for Bool in Either [Int] String and get Either Bool String,
substitute [Int] for [Bool] in Either [Int] String and get Either [Bool] String.
All in all, I want the usual notion of type level substitution. With a closed type family, I can do this for any types (albeit I need an extra line for each higher-kinded type constructor - I stopped at * -> * -> * -> * -> *).
{-# LANGUAGE TypeFamilies #-}
-- Subsitute type `x` for type `y` in type `a`
type family Substitute x y a where
Substitute x y x = y
Substitute x y (k a b c d) = k (Substitute x y a) (Substitute x y b) (Substitute x y c) (Substitute x y d)
Substitute x y (k a b c) = k (Substitute x y a) (Substitute x y b) (Substitute x y c)
Substitute x y (k a b) = k (Substitute x y a) (Substitute x y b)
Substitute x y (k a) = k (Substitute x y a)
Substitute x y a = a
And trying at ghci I get the desired output:
> :t undefined :: Substitute Int Bool (Either [Int] String)
undefined :: Either [Bool] [Char]
> :t undefined :: Substitute [Int] Bool (Either [Int] String)
undefined :: Either Bool [Char]
> :t undefined :: Substitute [Int] [Bool] (Either [Int] String)
undefined :: Either [Bool] [Char]
With that said, maybe you should be asking yourself why am I using MultiParamClasses and not TypeFamilies. Of the examples you gave above, all except Convert translate to type families (albeit you will need an extra line per instance for the type declaration).
Then again, for Convert, I am not convinced it is a good idea to define such a thing. The natural extension to Convert would be instances such as
instance (Convert a b, Convert b c) => Convert a c where
convert = convert . convert
instance Convert a a where
convert = id
which are as unresolvable for GHC as they are elegant to write...
To be clear, I am not saying there are no uses of MultiParamClasses, just that when possible you should be using TypeFamilies - they let you think about type-level functions instead of just relations.
This old HaskellWiki page does an OK job of comparing the two.
EDIT
Some more contrasting and history I stumbled upon from augustss blog
Type families grew out of the need to have type classes with
associated types. The latter is not strictly necessary since it can be
emulated with multi-parameter type classes, but it gives a much nicer
notation in many cases. The same is true for type families; they can
also be emulated by multi-parameter type classes. But MPTC gives a
very logic programming style of doing type computation; whereas type
families (which are just type functions that can pattern match on the
arguments) is like functional programming.
Using closed type families
adds some extra strength that cannot be achieved by type classes. To
get the same power from type classes we would need to add closed type
classes. Which would be quite useful; this is what instance chains
gives you.
Functional dependencies only affect the process of constraint solving, while type families introduced the notion of non-syntactic type equality, represented in GHC's intermediate form by coercions. This means type families interact better with GADTs. See this question for the canonical example of how functional dependencies fail here.

Type level environment in Haskell

I'm trying to use some Haskell extensions to implement a simple DSL. A feature that I'd like is to have a type level context for variables.
I know that this kind of thing is common place in languages like Agda or Idris. But I'd like to know if is possible to achieve the same results in Haskell.
My idea to this is to use type level association lists. The code is as follows:
{-# LANGUAGE GADTs,
DataKinds,
PolyKinds,
TypeOperators,
TypeFamilies,
ScopedTypeVariables,
ConstraintKinds,
UndecidableInstances #-}
import Data.Proxy
import Data.Singletons.Prelude
import Data.Singletons.Prelude.List
import GHC.Exts
import GHC.TypeLits
type family In (s :: Symbol)(a :: *)(env :: [(Symbol, *)]) :: Constraint where
In x t '[] = ()
In x t ('(y,t) ': env) = (x ~ y , In x t env)
data Exp (env :: [(Symbol, *)]) (a :: *) where
Pure :: a -> Exp env a
Map :: (a -> b) -> Exp env a -> Exp env b
App :: Exp env (a -> b) -> Exp env a -> Exp env b
Set :: (KnownSymbol s, In s t env) => proxy s -> t -> Exp env ()
Get :: (KnownSymbol s, In s t env) => proxy s -> Exp env t
test :: Exp '[ '("a", Bool), '("b", Char) ] Char
test = Get (Proxy :: Proxy "b")
Type family In models a type level list membership constraint that is used to ensure that a variable can only be used if it is on a given environment env.
The problem is that GHC constraint solver isn't able to entail the constraint
In "b" Char [("a",Bool), ("b",Char)] for test function, giving the following error message:
Could not deduce (In "b" Char '['("a", Bool), '("b", Char)])
arising from a use of ‘Get’
In the expression: Get (Proxy :: Proxy "b")
In an equation for ‘test’: test = Get (Proxy :: Proxy "b")
Failed, modules loaded: Main.
I'm using GHC 7.10.3. Any tip on how can I solve this or an explanation of why this isn't possible is highly welcome.
Your In is not what you think - it's actually more like All. A value of type In x t xs is a proof that every element of the (type-level) list xs is equal to '(x,t).
The following GADT is a more usual proof of membership:
data Elem x xs where
Here :: Elem x (x ': xs)
There :: Elem x xs -> Elem y (x ': xs)
Elem is like a natural number but with more types: compare the shape of There (There Here) with that of S (S Z). You prove an item is in the list by giving its index.
For the purposes of writing a lambda calculus type checker, Elem is useful as a de Bruijn index into a (type-level) list of types.
data Ty = NatTy | Ty ~> Ty
data Term env ty where
Lit :: Nat -> Term env NatTy
Var :: Elem t env -> Term env t
Lam :: Term (u ': env) t -> Term env (u ~> t)
($$) :: Term env (u ~> t) -> Term env u -> Term env t
de Bruijn indices have great advantages for compiler writers: they're simple, you needn't worry about alpha-equivalence, and in this example you don't have to mess around with type-level lookup tables or Symbols. But no one in their right mind would ever program in a language with de Bruijn indices, even with a type checker to help. This makes them a good choice for an intermediate language in a compiler, to which you'd translate a surface language with explicit variable names.
Type-level environments and de Bruijn indices are both rather complicated, so you should ask yourself: who is the target audience for this language? (and: is a type-level environment worth the costs?) Is it an embedded DSL with expectations of familiarity, simplicity, and performance? If so I'd consider a deeper embedding using higher-order abstract syntax. Or is it to be used as an intermediate language, for whom the primary audience is yourself, the compiler writer? Then I'd use a library like Kmett's bound to take care of variable binding and suck up the possibility of type errors because they could happen anyway.
For more on type-level environments et al, see The View from the Left for (as far as I know) the first example of a statically-checked lambda calculus type-checker embedded in a dependently-typed programming language. Norell gives an Agda-flavoured exposition of the same program in the Agda tutorial and a series of lectures. See also this question which seems relevant to your use-case.
In generates impossible constraints:
In "b" Char '['("a", Bool), '("b", Char)] =
("b" ~ "a", ("b" ~ "b", ())
It's a conjunction and it has an impossible element "a" ~ "b", so it's impossible overall.
Moreover, the constraint doesn't tell us anything about Char which I assume is the looked-up value.
The simplest way would be directly using a lookup function:
type family Lookup (s :: Symbol) (env :: [(Symbol, *)]) :: * where
Lookup s ('(s , v) ': env) = v
Lookup s ('(s' , v) ': env) = Lookup s env
We can use Lookup k xs ~ val to put constraints on the looked-up types.
We could also return Maybe *. Indeed, the Lookup in Data.Singletons.Prelude.List does that, so that can be used too. However, on the type level we can often get by with partial functions, since type family applications without a matching case get stuck instead of throwing errors, so having a value of type Lookup k xs :: * is already sufficient evidence that k is really a key contained in xs.

Pattern matching on length using this GADT:

I've defined the following GADT:
data Vector v where
Zero :: Num a => Vector a
Scalar :: Num a => a -> Vector a
Vector :: Num a => [a] -> Vector [a]
TVector :: Num a => [a] -> Vector [a]
If it's not obvious, I'm trying to implement a simple vector space. All vector spaces need vector addition, so I want to implement this by making Vector and instance of Num. In a vector space, it doesn't make sense to add vectors of different lengths, and this is something I would like to enforce. One way I thought to do it would be using guards:
instance Num (Vector v) where
(Vector a) + (Vector b) | length a == length b =
Vector $ zipWith (+) a b
| otherwise =
error "Only add vectors with the same length."
There is nothing really wrong with this approach, but I feel like there has to be a way to do this with pattern matching. Perhaps one way to do it would be to define a new data type VectorLength, which would look something like this:
data Length l where
AnyLength :: Nat a => Length a
FixedLength :: Nat a -> Length a
Then, a length component could be added to the Vector data type, something like this:
data Vector (Length l) v where
Zero :: Num a => Vector AnyLength a
-- ...
Vector :: Num a => [a] -> Vector (length [a]) [a]
I know this isn't correct syntax, but this is the general idea I'm playing with. Finally, you could define addition to be
instance Num (Vector v) where
(Vector l a) + (Vector l b) = Vector $ zipWith (+) a b
Is such a thing possible, or is there any other way to use pattern matching for this purpose?
What you're looking for is something (in this instance confusingly) named a Vector as well. Generally, these are used in dependently typed languages where you'd write something like
data Vec (n :: Natural) a where
Nil :: Vec 0 a
Cons :: a -> Vec n a -> Vec (n + 1) a
But that's far from valid Haskell (or really any language). Some very recent extensions to GHC are beginning to enable this kind of expression but they're not there yet.
You might be interested in fixed-vector which does a best approximation of a fixed Vector available in relatively stable GHC. It uses a number of tricks between type families and continuations to create classes of fixed-size vectors.
Just to add to the example in the other answer - this nearly works already in GHC 7.6:
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
import GHC.TypeLits
data Vector (n :: Nat) a where
Nil :: Vector 0 a
Cons :: a -> Vector n a -> Vector (n + 1) a
That code compiles fine, it just doesn't work quite the way you'd hope. Let's check it out in ghci:
*Main> :t Nil
Nil :: Vector 0 a
Good so far...
*Main> :t Cons "foo" Nil
Cons "foo" Nil :: Vector (0 + 1) [Char]
Well, that's a little odd... Why does it say (0 + 1) instead of 1?
*Main> :t Cons "foo" Nil :: Vector 1 String
<interactive>:1:1:
Couldn't match type `0 + 1' with `1'
Expected type: Vector 1 String
Actual type: Vector (0 + 1) String
In the return type of a call of `Cons'
In the expression: Cons "foo" Nil :: Vector 1 String
Uh. Oops. That'd be why it says (0 + 1) instead of 1. It doesn't know that those are the same. This will be fixed (at least this case will) in GHC 7.8, which is due out... In a couple months, I think?

Resources