understanding type-level functions in haskell - haskell

I am trying to understand how the Plus and Times functions in the code below work. What I do not understand is:
How do you call these functions in ghci to operate on a vector?
Why do they have to operate on the type level? What type do they operate on?
How do they evaluate the result?
For plus, we have, Plus (S m) n = S (Plus m n), but how is (Plus m n) evaluated. Similarly, how is (Times n m) evaluated?
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, GADTs #-}
{-# LANGUAGE UndecidableInstances, StandaloneDeriving #-}
module Vector where
-- Natural numbers, values will be promoted to types
data Nat
= Z -- Zero
| S Nat -- Successor (+1)
data Vec (a :: *) :: Nat -> * where
-- Nil has zero length
Nil :: Vec a Z
-- Cons has length of the tail + 1
Cons :: a -> Vec a n -> Vec a (S n)
deriving instance Show a => Show (Vec a n)
-- head :: [a] -> a
hd :: Vec a (S n) -> a
hd (Cons x xs) = x
-- tail :: [a] -> [a]
tl :: Vec a (S n) -> Vec a n
tl (Cons x xs) = xs
-- map :: (a -> b) -> [a] -> [b]
vMap :: (a -> b) -> Vec a n -> Vec b n
vMap f Nil = Nil
vMap f (Cons x xs) = Cons (f x) (vMap f xs)
-- (++) :: [a] -> [a] -> [a]
vAppend :: Vec a n -> Vec a m -> Vec a (Plus n m)
vAppend Nil xs = xs
vAppend (Cons y ys) xs = Cons y (vAppend ys xs)
-- Type-level addition
type family Plus (x :: Nat) (y :: Nat) :: Nat where
Plus Z n = n
Plus (S m) n = S (Plus m n)
-- concat :: [[a]] -> [a]
vConcat :: Vec (Vec a n) m -> Vec a (Times m n)
vConcat Nil = Nil
vConcat (Cons xs xss) = xs `vAppend` vConcat xss
-- Type-level multiplication
type family Times (x :: Nat) (y :: Nat) :: Nat where
Times Z m = Z
Times (S n) m = Plus m (Times n m)
vFilter :: (a -> Bool) -> Vec a n -> [a]
vFilter p Nil = []
vFilter p (Cons x xs)
| p x = x : vFilter p xs
| otherwise = vFilter p xs

Let me first define a more convenient synonym for Cons:
infixr 5 #:
(#:) :: a -> Vec a n -> Vec a (S n)
(#:) = Cons
How do you call these functions in ghci to operate on a vector?
The value-level functions you can call from GHCi like any other value-level functions. That'll first of all invoke any type-level computations that are necessary, and then run the typechecked code like you would run other Haskell code.
*Vector> :set -XDataKinds
*Vector> let v = 4 #: 9 #: 13 #:Nil
*Vector> :t v
v :: Num a => Vec a ('S ('S ('S 'Z)))
*Vector> v
Cons 4 (Cons 9 (Cons 13 Nil))
*Vector> let w = 7 #: 8 #: 6 #:Nil
*Vector> :t vAppend v w
vAppend v w :: Num a => Vec a ('S ('S ('S ('S ('S ('S 'Z))))))
*Vector> vAppend v w
Cons 4 (Cons 9 (Cons 13 (Cons 7 (Cons 8 (Cons 6 Nil)))))
To evaluate type families as type-level functions by themselves, use GHCi's :kind! command:
*Vector> :kind! Plus ('S 'Z) ('S ('S ('S 'Z)))
Plus ('S 'Z) ('S ('S ('S 'Z))) :: Nat
= 'S ('S ('S ('S 'Z)))
Why do they have to operate on the type level? What type do they operate on?
The purpose, in this example, of having them operate on the type level is that any length-mismatch errors should be caught by the compiler, not result in runtime errors. For example, you may want to write a function accepting two lists that are supposed to have the same length. This is then unsafe:
foo :: [a] -> [a] -> String
but this expresses exactly that the lengths must be identical:
foo' :: Vec a n -> Vec a n -> String
A concrete example is a zip. Prelude.zip does allow lists of different lengths but this means the longer list is basically trimmed to the length of the shorter, which may result in unexpected behaviour. With the Vector version
vZip :: Vec a n -> Vec b n -> Vec (a,b) n
vZip Nil Nil = Nil
vZip (Cons x xs) (Cons y ys) = Cons (x,y) $ vZip xs ys
this can't happen:
*Vector> vZip v w
Cons (4,7) (Cons (9,8) (Cons (13,6) Nil))
*Vector> vZip (vAppend v v) w
<interactive>:22:7: error:
• Couldn't match type ‘'S ('S ('S 'Z))’ with ‘'Z’
Expected type: Vec a ('S ('S ('S 'Z)))
Actual type: Vec a (Plus ('S ('S ('S 'Z))) ('S ('S ('S 'Z))))
• In the first argument of ‘vZip’, namely ‘(vAppend v v)’
In the expression: vZip (vAppend v v) w
In an equation for ‘it’: it = vZip (vAppend v v) w
Note that even if this expression were written somewhere deep down in a big program, you would immediately get the error at compile-time, instead of runtime problems sometime later on.
How do they evaluate the result?
It depends, and you don't really need to care, but the important bit is that the compiler weeds out anything that could go wrong† and immediately gives you an error message. If all the type calculations are provably right, then the compiler hard-wires the code corresponding to those types into your program, erases the types themselves, and the runtime works essentially like in a Haskell program with plain old lists.
†Sometimes unfortunately it will also give you an error message when the code is actually correct&safe, but the compiler can't prove it.

Related

Transform list in type fixed length in haskell

With the answer to this question I was able to define a list in witch the length is part of the type:
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
data Nat = Z | S Nat
data ListF (n :: Nat) a where
Nil :: (ListF 'Z a)
Cons :: a -> ListF m a -> ListF ('S m) a
I would like be able to convert a normal list in to these kind of list. If a make these:
toListF :: [a] -> ListF n a
toListF [] = Nil
toListF (x:xs) = Cons x (toListF xs)
It does't type-check, because for [a] -> ListF n a to type-check, the function should return any Nat that the caller needs:
ListF.hs:11:14: error:
• Couldn't match type ‘n’ with ‘'Z’
‘n’ is a rigid type variable bound by
the type signature for:
toListF :: forall a (n :: Nat). [a] -> ListF n a
at ListF.hs:10:1-27
Expected type: ListF n a
Actual type: ListF 'Z a
• In the expression: Nil
In an equation for ‘toListF’: toListF [] = Nil
• Relevant bindings include
toListF :: [a] -> ListF n a (bound at ListF.hs:11:1)
|
11 | toListF [] = Nil
| ^^^
Failed, no modules loaded.
The logical type for toListF I think it wold be something like exists n. [a] -> ListF n a or [a] -> (exists n. ListF n a), but of course those are not valid haskell types.
It is possible to do what I am trying to do in haskell? And how?
There are existential types in Haskell.
{-# LANGUAGE DataKinds, GADTs, KindSignatures, RankNTypes #-}
data Nat = Z | S Nat
data ListF (n :: Nat) a where
Nil :: (ListF 'Z a)
Cons :: a -> ListF m a -> ListF ('S m) a
data SomeListF a = forall n . AList (ListF n a)
You cam convert from a regular list to SomeListF:
fromList :: [a] -> SomeListF a
fromList [] = AList Nil
fromList (x:xs) = someCons x (fromList xs) where
someCons x (AList zs) = AList (Cons x zs)
You can also recover ListF from SomeListF, but only in a restricted scope. The n in forall n cannot escape, so you cannot have something like
toListF :: SomeListF a -> ListF n a
but you can have this:
withSomeList :: (forall n . ListF n a -> b) -> SomeListF a -> b
withSomeList f (AList zs) = f zs
Inside the f argument, n is known and you can for example map your list and the length of the result is statically known to be the same as the length of the argument. Here's a silly example:
zipF :: ListF n a -> ListF n b -> ListF n (a, b)
zipF Nil Nil = Nil
zipF (Cons y ys) (Cons z zs) = Cons (y, z) (zipF ys zs)
mapF :: (a->b) -> ListF n a -> ListF n b
mapF _ Nil = Nil
mapF f (Cons z zs) = Cons (f z) (mapF f zs)
zipMapF :: (a->b) -> ListF n a -> ListF n (a, b)
zipMapF f zs = zipF zs (mapF f zs)
zipMapAny :: (a->b) -> ListF n a -> SomeListF (a, b)
zipMapAny f zs = AList (zipMapF f zs)
nums = fromList [1,2,3,4,5]
numsAndSquares = withSomeList (zipMapAny (\x -> x * x)) nums
zipMapAny "knows" that the length of all the lists inside it is the same, but it cannot leak that length to the result. You cannot have for example withSomeList (zipMapF (\x -> x * x)) nums because n would escape.

Using haskell's singletons, how can I write `fromList :: [a] -> Vec a n`?

As part of my journey in understanding singletons I have tried to bridge the gap between compile time safety, and lifting runtime values into that dependent type safety.
I think though that a minimal example of "runtime" values is a function that takes an unbounded list, and converts it to a size-indexed vector. The following skeleton provides length-indexed vectors, but I can't quite determine how to write fromList.
I have considered making the function take a size parameter, but I suspect it's possible to keep that implicit.
{-# LANGUAGE GADTs #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE TemplateHaskell #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE TypeInType #-}
{-# LANGUAGE UndecidableInstances #-}
import Data.Singletons
import Data.Singletons.TH
$(singletons
[d|
data Nat = Z | S Nat deriving (Show)
|])
data Vec a n where
Nil :: Vec a Z
Cons :: a -> Vec a n -> Vec a (S n)
instance Show a => Show (Vec a n) where
show Nil = "Nil"
show (Cons x xs) = show x ++ " :< " ++ show xs
fromListExplicit :: forall (n :: Nat) a. SNat n -> [a] -> Vec a n
fromListExplicit SZ _ = Nil
fromListExplicit (SS n) (x : xs) = Cons x (fromListExplicit n xs)
ex1 = fromListExplicit (SS (SS (SS SZ))) [1..99]
-- 1 :< 2 :< 3 :< Nil
fromListImplicit :: (?????) => [a] -> Vec a n
fromListImplicit = ?????
main :: IO ()
main = do
xs <- readLn :: IO [Int]
print $ fromListImplicit xs
This is not possible using Haskell because Haskell does not yet have full dependent types (although GHC might in the future). Notice that
fromList :: [a] -> Vec a n
Has both a and n quantified universally, which means that the user should be able to pick their n and get back a Vec of the right size. That makes no sense! The trick is that n is not really for the user to choose - it has to be the length of the input list. (For the same reason, fromList :: Integer -> [a] -> Vec a n would not be any more useful - the size hint has to be something type-level.)
Looking to a dependently typed language like Idris, you can define
fromList : (l : List elem) -> Vec (length l) elem
And in fact they define this in the standard library.
So, what can you do? Short of saying that Vec has the length equal to the size of the input list (which requires lifting "length of the input list" to the type level), you can say it has some length.
data SomeVec a where { SomeVec :: Vec a n -> SomeVec a }
list2SomeVec :: [a] -> SomeVec a
list2SomeVec [] = SomeVec Nil
list2SomeVec (x:xs) = case list2SomeVec xs of
SomeVec ys -> SomeVec (x `Cons` ys)
That isn't spectacularly useful, but it is better than nothing.
You can do this with an existentially quantified type variable, as in #Alec’s answer, or equivalently by rewriting in continuation-passing style. The trick is to give fromList a continuation (function) that’s polymorphic in the size of the Vec; then, within the continuation, you have access to a type variable representing the size:
data Vec n a where
Nil :: Vec Z a
Cons :: a -> Vec n a -> Vec (S n) a
deriving instance (Show a) => Show (Vec n a)
fromList :: [a] -> (forall n. Vec n a -> r) -> r
fromList [] k = k Nil
fromList (x : xs) k = fromList xs $ \ xs' -> k (Cons x xs')
-- fromList [1, 2, 3] show == "Cons 1 (Cons 2 (Cons 3 Nil))"
You can’t know the actual value of n, because it’s not available at compile time.
If you replace your Nat with the one from GHC.TypeLits, I think you can get a KnownNat constraint for n by creating a SomeNat from the runtime length using fromJust (someNatVal (fromIntegral (length xs))), then get back the actual length value at runtime with natVal. I’m not really familiar with how to do it, and it might require the ghc-typelits-natnormalise plugin, but it might be a starting point.

Using the type system to check length of output vs. input list

Suppose a list L, with length n, is interleaved in list J, with length n + 1.
We'd like to know, for each element of J, which of its neighbors from L is the greater.
The following function takes L as its input, and produces a list K, also of length
n + 1, such that the ith element of K is the desired neighbor of the ith element of J.
aux [] prev acc = prev:acc
aux (hd:tl) prev acc = aux tl hd ((max hd prev):acc)
expand row = reverse (aux row 0 [])
I can prove to myself, informally, that the length of the result of this function (which I
originally wrote in Ocaml) is one greater than the length of the input. But I
hopped over to Haskell (a new language for me) because I got interested in being
able to prove via the type system that this invariant holds. With the help
of this previous answer, I was
able to get as far as the following:
{-# LANGUAGE GADTs, TypeOperators, TypeFamilies #-}
data Z
data S n
type family (:+:) a b :: *
type instance (:+:) Z n = n
type instance (:+:) (S m) n = S (m :+: n)
-- A List of length 'n' holding values of type 'a'
data List a n where
Nil :: List a Z
Cons :: a -> List a m -> List a (S m)
aux :: List a n -> a -> List a m -> List a (n :+: (S m))
aux Nil prev acc = Cons prev acc
aux (Cons hd tl) prev acc = aux tl hd (Cons (max hd prev) acc)
However, the last line produces the following error:
* Could not deduce: (m1 :+: S (S m)) ~ S (m1 :+: S m)
from the context: n ~ S m1
bound by a pattern with constructor:
Cons :: forall a m. a -> List a m -> List a (S m),
in an equation for `aux'
at pyramid.hs:23:6-15
Expected type: List a (n :+: S m)
Actual type: List a (m1 :+: S (S m))
* In the expression: aux tl hd (Cons (max hd prev) acc)
In an equation for `aux':
aux (Cons hd tl) prev acc = aux tl hd (Cons (max hd prev) acc)
* Relevant bindings include
acc :: List a m (bound at pyramid.hs:23:23)
tl :: List a m1 (bound at pyramid.hs:23:14)
aux :: List a n -> a -> List a m -> List a (n :+: S m)
(bound at pyramid.hs:22:1)
It seems that what I need to do is teach the compiler that (x :+: (S y)) ~ S (x :+: y). Is this possible?
Alternatively, are there better tools for this problem than the type system?
First, some imports and language extensions:
{-# LANGUAGE GADTs, TypeInType, RankNTypes, TypeOperators, TypeFamilies, TypeApplications, AllowAmbiguousTypes #-}
import Data.Type.Equality
We now have DataKinds (or TypeInType) which allows us to promote any data to the type level (with its own kind), so the type level naturals really deserve to be defined as a regular data (heck, this is exactly the motivating examples the previous link to the GHC docs give!). Nothing changes with your List type, but (:+:) really should be a closed type family (now over things of kind Nat).
-- A natural number type (that can be promoted to the type level)
data Nat = Z | S Nat
-- A List of length 'n' holding values of type 'a'
data List a n where
Nil :: List a Z
Cons :: a -> List a m -> List a (S m)
type family (+) (a :: Nat) (b :: Nat) :: Nat where
Z + n = n
S m + n = S (m + n)
Now, in order to make the proofs work for aux, it is useful to define singleton types for natural numbers.
-- A singleton type for `Nat`
data SNat n where
SZero :: SNat Z
SSucc :: SNat n -> SNat (S n)
-- Utility for taking the predecessor of an `SNat`
sub1 :: SNat (S n) -> SNat n
sub1 (SSucc x) = x
-- Find the size of a list
size :: List a n -> SNat n
size Nil = SZero
size (Cons _ xs) = SSucc (size xs)
Now, we are in shape to start proving some stuff. From Data.Type.Equality, a :~: b represents a proof that a ~ b. We need to prove one simple thing about arithmetic.
-- Proof that n + (S m) == S (n + m)
plusSucc :: SNat n -> SNat m -> (n + S m) :~: S (n + m)
plusSucc SZero _ = Refl
plusSucc (SSucc n) m = gcastWith (plusSucc n m) Refl
Finally, we can use gcastWith to use this proof in aux. Oh and you were missing the Ord a constraint. :)
aux :: Ord a => List a n -> a -> List a m -> List a (n + S m)
aux Nil prev acc = Cons prev acc
aux (Cons hd tl) prev acc = gcastWith (plusSucc (size tl) (SSucc (size acc)))
aux tl hd (Cons (max hd prev) acc)
-- append to a list
(|>) :: List a n -> a -> List a (S n)
Nil |> y = Cons y Nil
(Cons x xs) |> y = Cons x (xs |> y)
-- reverse 'List'
rev :: List a n -> List a n
rev Nil = Nil
rev (Cons x xs) = rev xs |> x
Let me know if this answers your question - getting started with this sort of thing involves a lot of new stuff.

Total real-time persistent queues

Okasaki describes persistent real-time queues which can be realized in Haskell using the type
data Queue a = forall x . Queue
{ front :: [a]
, rear :: [a]
, schedule :: [x]
}
where incremental rotations maintain the invariant
length schedule = length front - length rear
More details
If you're familiar with the queues involved, you can skip this section.
The rotation function looks like
rotate :: [a] -> [a] -> [a] -> [a]
rotate [] (y : _) a = y : a
rotate (x : xs) (y : ys) a =
x : rotate xs ys (y : a)
and it's called by a smart constructor
exec :: [a] -> [a] -> [x] -> Queue a
exec f r (_ : s) = Queue f r s
exec f r [] = Queue f' [] f' where
f' = rotate f r []
after each queue operation. The smart constructor is always called when length s = length f - length r + 1, ensuring that the pattern match in rotate will succeed.
The problem
I hate partial functions! I'd love to find a way to express the structural invariant in the types. The usual dependent vector seems a likely choice:
data Nat = Z | S Nat
data Vec n a where
Nil :: Vec 'Z a
Cons :: a -> Vec n a -> Vec ('S n) a
and then (perhaps)
data Queue a = forall x rl sl . Queue
{ front :: Vec (sl :+ rl) a
, rear :: Vec rl a
, schedule :: Vec sl x
}
The trouble is that I haven't been able to figure out how to juggle the types. It seems extremely likely that some amount of unsafeCoerce will be needed to make this efficient. However, I haven't been able to come up with an approach that's even vaguely manageable. Is it possible to do this nicely in Haskell?
Here is what I got:
open import Function
open import Data.Nat.Base
open import Data.Vec
grotate : ∀ {n m} {A : Set}
-> (B : ℕ -> Set)
-> (∀ {n} -> A -> B n -> B (suc n))
-> Vec A n
-> Vec A (suc n + m)
-> B m
-> B (suc n + m)
grotate B cons [] (y ∷ ys) a = cons y a
grotate B cons (x ∷ xs) (y ∷ ys) a = grotate (B ∘ suc) cons xs ys (cons y a)
rotate : ∀ {n m} {A : Set} -> Vec A n -> Vec A (suc n + m) -> Vec A m -> Vec A (suc n + m)
rotate = grotate (Vec _) _∷_
record Queue (A : Set) : Set₁ where
constructor queue
field
{X} : Set
{n m} : ℕ
front : Vec A (n + m)
rear : Vec A m
schedule : Vec X n
open import Relation.Binary.PropositionalEquality
open import Data.Nat.Properties.Simple
exec : ∀ {m n A} -> Vec A (n + m) -> Vec A (suc m) -> Vec A n -> Queue A
exec {m} {suc n} f r (_ ∷ s) = queue (subst (Vec _) (sym (+-suc n m)) f) r s
exec {m} f r [] = queue (with-zero f') [] f' where
with-zero = subst (Vec _ ∘ suc) (sym (+-right-identity m))
without-zero = subst (Vec _ ∘ suc) (+-right-identity m)
f' = without-zero (rotate f (with-zero r) [])
rotate is defined in terms of grotate for the same reason reverse is defined in terms of foldl (or enumerate in terms of genumerate): because Vec A (suc n + m) is not definitionally Vec A (n + suc m), while (B ∘ suc) m is definitionally B (suc m).
exec has the same implementation as you provided (modulo those substs), but I'm not sure about the types: is it OK that r must be non-empty?
The other answer is super clever (please take a moment to upvote it), but as someone not familiar with Agda, how this would be implemented in Haskell was not obvious to me. Here's a full Haskell version. We'll need a whole slew of extensions, as well as Data.Type.Equality (since we will need to do some limited amount of type-proofs).
{-# LANGUAGE GADTs, ScopedTypeVariables,RankNTypes,
TypeInType, TypeFamilies, TypeOperators #-}
import Data.Type.Equality
Defining Nat, Vec, and Queue
Next, we define the usual type-level natural numbers (this looks like just a regular data definition, but because we have TypeInType enabled, it will get automatically promoted when we use it in a type) and a type function (a type family) for addition. Note that although there are multiple ways of defining +, our choice here will impact what follows. We'll also define the usual Vec which is very much like a list except that it encodes its length in the phantom type n. With that, we can go ahead and define the type of our queue.
data Nat = Z | S Nat
type family n + m where
Z + m = m
S n + m = S (n + m)
data Vec a n where
Nil :: Vec a Z
(:::) :: a -> Vec a n -> Vec a (S n)
data Queue a where
Queue :: { front :: Vec a (n + m)
, rear :: Vec a m
, schedule :: Vec x n } -> Queue a
Defining rotate
Now, things start to get hairier. We want to define a function rotate that has type rotate :: Vec a n -> Vec a (S n + m) -> Vec a m -> Vec a (S n + m), but you quickly run into a variety of proof related problems with just defining this recursively. The solution is instead to define a slightly more general grotate, which can be defined recursively, and for which rotate is a special case.
The point of Bump is to circumvent the fact that there is no such thing as type level composition in Haskell. There is no way of writing things an operator like (∘) such that (S ∘ S) x is S (S x). The workaround is to continuously wrap/unwrap with Bump/lower.
newtype Bump p n = Bump { lower :: p (S n) }
grotate :: forall p n m a.
(forall n. a -> p n -> p (S n)) ->
Vec a n ->
Vec a (S n + m) ->
p m ->
p (S n + m)
grotate cons Nil (y ::: _) zs = cons y zs
grotate cons (x ::: xs) (y ::: ys) zs = lower (grotate consS xs ys (Bump (cons y zs)))
where
consS :: forall n. a -> Bump p n -> Bump p (S n)
consS = \a -> Bump . cons a . lower
rotate :: Vec a n -> Vec a (S n + m) -> Vec a m -> Vec a (S n + m)
rotate = grotate (:::)
We need explicit foralls here to make it very clear which type variables are getting captured and which aren't, as well as to denote higher-rank types.
Singleton natural numbers SNat
Before we proceed to exec, we set up some machinery that will allow us to prove some type-level arithmetic claims (which we need to get exec to typecheck). We start by making an SNat type (which is a singleton type corresponding to Nat). SNat reflects its value in a phantom type variable.
data SNat n where
SZero :: SNat Z
SSucc :: SNat n -> SNat (S n)
We can then make a couple useful functions to do things with SNat.
sub1 :: SNat (S n) -> SNat n
sub1 (SSucc x) = x
size :: Vec a n -> SNat n
size Nil = SZero
size (_ ::: xs) = SSucc (size xs)
Finally, we are prepared to prove some arithmetic, namely that n + S m ~ S (n + m) and n + Z ~ n.
plusSucc :: (SNat n) -> (SNat m) -> (n + S m) :~: S (n + m)
plusSucc SZero _ = Refl
plusSucc (SSucc n) m = gcastWith (plusSucc n m) Refl
plusZero :: SNat n -> (n + Z) :~: n
plusZero SZero = Refl
plusZero (SSucc n) = gcastWith (plusZero n) Refl
Defining exec
Now that we have rotate, we can define exec. This definition looks almost identical to the one in the question (with lists), except annotated with gcastWith <some-proof>.
exec :: Vec a (n + m) -> Vec a (S m) -> Vec a n -> Queue a
exec f r (_ ::: s) = gcastWith (plusSucc (size s) (sub1 (size r))) $ Queue f r s
exec f r Nil = gcastWith (plusZero (sub1 (size r))) $
let f' = rotate f r Nil in (Queue f' Nil f')
It is probably worth noting that we can get some stuff for free by using singletons. With the right extensions enabled, the following more readable code
import Data.Singletons.TH
singletons [d|
data Nat = Z | S Nat
(+) :: Nat -> Nat -> Nat
Z + n = n
S m + n = S (m + n)
|]
defines, Nat, the type family :+ (equivalent to my +), and the singleton type SNat (with constructors SZ and SS equivalent to my SZero and SSucc) all in one.

How do I build a list with a dependently-typed length?

Dipping my toe into the waters of dependent types, I had a crack at the canonical "list with statically-typed length" example.
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
-- a kind declaration
data Nat = Z | S Nat
data SafeList :: (Nat -> * -> *) where
Nil :: SafeList Z a
Cons :: a -> SafeList n a -> SafeList (S n) a
-- the type signature ensures that the input list has at least one element
safeHead :: SafeList (S n) a -> a
safeHead (Cons x xs) = x
This seems to work:
ghci> :t Cons 5 (Cons 3 Nil)
Cons 5 (Cons 3 Nil) :: Num a => SafeList ('S ('S 'Z)) a
ghci> safeHead (Cons 'x' (Cons 'c' Nil))
'x'
ghci> safeHead Nil
Couldn't match type 'Z with 'S n0
Expected type: SafeList ('S n0) a0
Actual type: SafeList 'Z a0
In the first argument of `safeHead', namely `Nil'
In the expression: safeHead Nil
In an equation for `it': it = safeHead Nil
However, in order for this data-type to be actually useful, I should be able to build it from run-time data for which you don't know the length at compile time. My naïve attempt:
fromList :: [a] -> SafeList n a
fromList = foldr Cons Nil
This fails to compile, with the type error:
Couldn't match type 'Z with 'S n
Expected type: a -> SafeList n a -> SafeList n a
Actual type: a -> SafeList n a -> SafeList ('S n) a
In the first argument of `foldr', namely `Cons'
In the expression: foldr Cons Nil
In an equation for `fromList': fromList = foldr Cons Nil
I understand why this is happening: the return type of Cons is different for each iteration of the fold - that's the whole point! But I can't see a way around it, probably because I've not read deeply enough into the subject. (I can't imagine all this effort is being put into a type system that is impossible to use in practice!)
So: How can I build this sort of dependently-typed data from 'normal' simply-typed data?
Following #luqui's advice I was able to make fromList compile:
data ASafeList a where
ASafeList :: SafeList n a -> ASafeList a
fromList :: [a] -> ASafeList a
fromList = foldr f (ASafeList Nil)
where f x (ASafeList xs) = ASafeList (Cons x xs)
Here's my attempt to unpack the ASafeList and use it:
getSafeHead :: [a] -> a
getSafeHead xs = case fromList xs of ASafeList ys -> safeHead ys
This causes another type error:
Couldn't match type `n' with 'S n0
`n' is a rigid type variable bound by
a pattern with constructor
ASafeList :: forall a (n :: Nat). SafeList n a -> ASafeList a,
in a case alternative
at SafeList.hs:33:22
Expected type: SafeList ('S n0) a
Actual type: SafeList n a
In the first argument of `safeHead', namely `ys'
In the expression: safeHead ys
In a case alternative: ASafeList ys -> safeHead ys
Again, intuitively it makes sense that this would fail to compile. I can call fromList with an empty list, so the compiler has no guarantee that I'll be able to call safeHead on the resulting SafeList. This lack of knowledge is roughly what the existential ASafeList captures.
Can this problem be solved? I feel like I might have walked down a logical dead-end.
Never throw anything away.
If you're going to take the trouble to crank along a list to make a length-indexed list (known in the literature as a "vector"), you may as well remember its length.
So, we have
data Nat = Z | S Nat
data Vec :: Nat -> * -> * where -- old habits die hard
VNil :: Vec Z a
VCons :: a -> Vec n a -> Vec (S n) a
but we can also give a run time representation to static lengths. Richard Eisenberg's "Singletons" package will do this for you, but the basic idea is to give a type of run time representations for static numbers.
data Natty :: Nat -> * where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
Crucially, if we have a value of type Natty n, then we can interrogate that value to find out what n is.
Hasochists know that run time representability is often so boring that even a machine can manage it, so we hide it inside a type class
class NATTY (n :: Nat) where
natty :: Natty n
instance NATTY Z where
natty = Zy
instance NATTY n => NATTY (S n) where
natty = Sy natty
Now we can give a slightly more informative existential treatment of the length you get from your lists.
data LenList :: * -> * where
LenList :: NATTY n => Vec n a -> LenList a
lenList :: [a] -> LenList a
lenList [] = LenList VNil
lenList (x : xs) = case lenList xs of LenList ys -> LenList (VCons x ys)
You get the same code as the length-destroying version, but you can grab a run time representation of the length anytime you like, and you don't need to crawl along the vector to get it.
Of course, if you want the length to be a Nat, it's still a pain that you instead have a Natty n for some n.
It's a mistake to clutter one's pockets.
Edit I thought I'd add a little, to address the "safe head" usage issue.
First, let me add an unpacker for LenList which gives you the number in your hand.
unLenList :: LenList a -> (forall n. Natty n -> Vec n a -> t) -> t
unLenList (LenList xs) k = k natty xs
And now suppose I define
vhead :: Vec (S n) a -> a
vhead (VCons a _) = a
enforcing the safety property. If I have a run time representation of the length of a vector, I can look at it to see if vhead applies.
headOrBust :: LenList a -> Maybe a
headOrBust lla = unLenList lla $ \ n xs -> case n of
Zy -> Nothing
Sy _ -> Just (vhead xs)
So you look at one thing, and in doing so, learn about another.
In
fromList :: [a] -> SafeList n a
n is universally quantified -- i.e. this signature is claiming that we should be able to build a SafeList of any length from the list. Instead you want to quantify existentially, which can only be done by defining a new data type:
data ASafeList a where
ASafeList :: SafeList n a -> ASafeList a
Then your signature should be
fromList :: [a] -> ASafeList a
You can use it by pattern matching on ASafeList
useList :: ASafeList a -> ...
useList (ASafeList xs) = ...
and in the body, xs will be a SafeList n a type with an unknown (rigid) n. You will probably have to add more operations to use it in any nontrivial way.
If you want to use dependently typed functions on runtime data, then you need to ensure, that this data doesn't violate encoded in type signatures laws. It's easier to understand this by an example. Here is our setup:
data Nat = Z | S Nat
data Natty (n :: Nat) where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
data Vec :: * -> Nat -> * where
VNil :: Vec a Z
VCons :: a -> Vec a n -> Vec a (S n)
We can write some simple functions on Vec:
vhead :: Vec a (S n) -> a
vhead (VCons x xs) = x
vtoList :: Vec a n -> [a]
vtoList VNil = []
vtoList (VCons x xs) = x : vtoList xs
vlength :: Vec a n -> Natty n
vlength VNil = Zy
vlength (VCons x xs) = Sy (vlength xs)
For writing the canonical example of the lookup function we need the concept of finite sets. They are usually defined as
data Fin :: Nat -> where
FZ :: Fin (S n)
FS :: Fin n -> Fin (S n)
Fin n represents all numbers less than n.
But just like there is a type level equivalent of Nats — Nattys, there is a type level equivalent of Fins. But now we can incorporate value level and type level Fins:
data Finny :: Nat -> Nat -> * where
FZ :: Finny (S n) Z
FS :: Finny n m -> Finny (S n) (S m)
The first Nat is an upper bound of a Finny. And the second Nat corresponds to an actual value of a Finny. I.e. it must be equal to toNatFinny i, where
toNatFinny :: Finny n m -> Nat
toNatFinny FZ = Z
toNatFinny (FS i) = S (toNatFinny i)
Defining the lookup function is now straightforward:
vlookup :: Finny n m -> Vec a n -> a
vlookup FZ (VCons x xs) = x
vlookup (FS i) (VCons x xs) = vlookup i xs
And some tests:
print $ vlookup FZ (VCons 1 (VCons 2 (VCons 3 VNil))) -- 1
print $ vlookup (FS FZ) (VCons 1 (VCons 2 (VCons 3 VNil))) -- 2
print $ vlookup (FS (FS (FS FZ))) (VCons 1 (VCons 2 (VCons 3 VNil))) -- compile-time error
That was simple, but what about the take function? It's not harder:
type Finny0 n = Finny (S n)
vtake :: Finny0 n m -> Vec a n -> Vec a m
vtake FZ _ = VNil
vtake (FS i) (VCons x xs) = VCons x (vtake i xs)
We need Finny0 instead of Finny, because lookup requires a Vec to be non-empty, so if there is a value of type Finny n m, then n = S n' for some n'. But vtake FZ VNil is perfectly valid, so we need to relax this restriction. So Finny0 n represents all numbers less or equal n.
But what about runtime data?
vfromList :: [a] -> (forall n. Vec a n -> b) -> b
vfromList [] f = f VNil
vfromList (x:xs) f = vfromList xs (f . VCons x)
I.e. "give me a list and a function, that accepts a Vec of arbitrary length, and I'll apply the latter to the former". vfromList xs returns a continuation (i.e. something of type (a -> r) -> r) modulo higher-rank types. Let's try it:
vmhead :: Vec a n -> Maybe a
vmhead VNil = Nothing
vmhead (VCons x xs) = Just x
main = do
print $ vfromList ([] :: [Int]) vmhead -- Nothing
print $ vfromList [1..5] vmhead -- Just 1
Works. But aren't we just repeat ourself? Why vmhead, when there is vhead already? Should we rewrite all safe functions in an unsafe way to make is possible to use them on runtime data? That would be silly.
All we need is to ensure, that all invariants hold. Let's try this principle on the vtake function:
fromIntFinny :: Int -> (forall n m. Finny n m -> b) -> b
fromIntFinny 0 f = f FZ
fromIntFinny n f = fromIntFinny (n - 1) (f . FS)
main = do
xs <- readLn :: IO [Int]
i <- read <$> getLine
putStrLn $
fromIntFinny i $ \i' ->
vfromList xs $ \xs' ->
undefined -- what's here?
fromIntFinny is just like vfromList. It's instructive to see, what the types are:
i' :: Finny n m
xs' :: Vec a p
But vtake has this type: Finny0 n m -> Vec a n -> Vec a m. So we need to coerce i', so that it would be of type Finny0 p m. And also toNatFinny i' must be equal to toNatFinny coerced_i'. But this coercion is not possible in general, since if S p < n, then there are elements in Finny n m, that are not in Finny (S p) m, since S p and n are upper bounds.
coerceFinnyBy :: Finny n m -> Natty p -> Maybe (Finny0 p m)
coerceFinnyBy FZ p = Just FZ
coerceFinnyBy (FS i) (Sy p) = fmap FS $ i `coerceFinnyBy` p
coerceFinnyBy _ _ = Nothing
That's why there is Maybe here.
main = do
xs <- readLn :: IO [Int]
i <- read <$> getLine
putStrLn $
fromIntFinny i $ \i' ->
vfromList xs $ \xs' ->
case i' `coerceFinnyBy` vlength xs' of
Nothing -> "What should I do with this input?"
Just i'' -> show $ vtoList $ vtake i'' xs'
In the Nothing case a number, that was read from the input, is bigger, than the length of a list. In the Just case a number is less or equal to the length of a list and coerced to the appropriate type, so vtake i'' xs' is well-typed.
This works, but we introduced the coerceFinnyBy function, that looks rather ad hoc. Decidable "less or equal" relation would be the appropriate alternative:
data (:<=) :: Nat -> Nat -> * where
Z_le_Z :: Z :<= m -- forall n, 0 <= n
S_le_S :: n :<= m -> S n :<= S m -- forall n m, n <= m -> S n <= S m
type n :< m = S n :<= m
(<=?) :: Natty n -> Natty m -> Either (m :< n) (n :<= m) -- forall n m, n <= m || m < n
Zy <=? m = Right Z_le_Z
Sy n <=? Zy = Left (S_le_S Z_le_Z)
Sy n <=? Sy m = either (Left . S_le_S) (Right . S_le_S) $ n <=? m
And a safe injecting function:
inject0Le :: Finny0 n p -> n :<= m -> Finny0 m p
inject0Le FZ _ = FZ
inject0Le (FS i) (S_le_S le) = FS (inject0Le i le)
I.e. if n is an upper bound for some number and n <= m, then m is an upper bound for this number too. And another one:
injectLe0 :: Finny n p -> n :<= m -> Finny0 m p
injectLe0 FZ (S_le_S le) = FZ
injectLe0 (FS i) (S_le_S le) = FS (injectLe0 i le)
The code now looks like this:
getUpperBound :: Finny n m -> Natty n
getUpperBound = undefined
main = do
xs <- readLn :: IO [Int]
i <- read <$> getLine
putStrLn $
fromIntFinny i $ \i' ->
vfromList xs $ \xs' ->
case getUpperBound i' <=? vlength xs' of
Left _ -> "What should I do with this input?"
Right le -> show $ vtoList $ vtake (injectLe0 i' le) xs'
It compiles, but what definition should getUpperBound have? Well, you can't define it. A n in Finny n m lives only at the type level, you can't extract it or get somehow. If we can't perform "downcast", we can perform "upcast":
fromIntNatty :: Int -> (forall n. Natty n -> b) -> b
fromIntNatty 0 f = f Zy
fromIntNatty n f = fromIntNatty (n - 1) (f . Sy)
fromNattyFinny0 :: Natty n -> (forall m. Finny0 n m -> b) -> b
fromNattyFinny0 Zy f = f FZ
fromNattyFinny0 (Sy n) f = fromNattyFinny0 n (f . FS)
For comparison:
fromIntFinny :: Int -> (forall n m. Finny n m -> b) -> b
fromIntFinny 0 f = f FZ
fromIntFinny n f = fromIntFinny (n - 1) (f . FS)
So a continuation in fromIntFinny is universally quantified over the n and m variables, while a continuation in fromNattyFinny0 is universally quantified over just m. And fromNattyFinny0 receives a Natty n instead of Int.
There is Finny0 n m instead of Finny n m, because FZ is an element of forall n m. Finny n m, while FZ is not necessarily an element of forall m. Finny n m for some n, specifically FZ is not an element of forall m. Finny 0 m (so this type is uninhabited).
After all, we can join fromIntNatty and fromNattyFinny0 together:
fromIntNattyFinny0 :: Int -> (forall n m. Natty n -> Finny0 n m -> b) -> b
fromIntNattyFinny0 n f = fromIntNatty n $ \n' -> fromNattyFinny0 n' (f n')
Achieving the same result, as in the #pigworker's answer:
unLenList :: LenList a -> (forall n. Natty n -> Vec n a -> t) -> t
unLenList (LenList xs) k = k natty xs
Some tests:
main = do
xs <- readLn :: IO [Int]
ns <- read <$> getLine
forM_ ns $ \n -> putStrLn $
fromIntNattyFinny0 n $ \n' i' ->
vfromList xs $ \xs' ->
case n' <=? vlength xs' of
Left _ -> "What should I do with this input?"
Right le -> show $ vtoList $ vtake (inject0Le i' le) xs'
for
[1,2,3,4,5,6]
[0,2,5,6,7,10]
returns
[]
[1,2]
[1,2,3,4,5]
[1,2,3,4,5,6]
What should I do with this input?
What should I do with this input?
The code: http://ideone.com/3GX0hd
EDIT
Well, you can't define it. A n in Finny n m lives only at the type
level, you can't extract it or get somehow.
That's not true. Having SingI n => Finny n m -> ..., we can get n as fromSing sing.

Resources