Okasaki describes persistent real-time queues which can be realized in Haskell using the type
data Queue a = forall x . Queue
{ front :: [a]
, rear :: [a]
, schedule :: [x]
}
where incremental rotations maintain the invariant
length schedule = length front - length rear
More details
If you're familiar with the queues involved, you can skip this section.
The rotation function looks like
rotate :: [a] -> [a] -> [a] -> [a]
rotate [] (y : _) a = y : a
rotate (x : xs) (y : ys) a =
x : rotate xs ys (y : a)
and it's called by a smart constructor
exec :: [a] -> [a] -> [x] -> Queue a
exec f r (_ : s) = Queue f r s
exec f r [] = Queue f' [] f' where
f' = rotate f r []
after each queue operation. The smart constructor is always called when length s = length f - length r + 1, ensuring that the pattern match in rotate will succeed.
The problem
I hate partial functions! I'd love to find a way to express the structural invariant in the types. The usual dependent vector seems a likely choice:
data Nat = Z | S Nat
data Vec n a where
Nil :: Vec 'Z a
Cons :: a -> Vec n a -> Vec ('S n) a
and then (perhaps)
data Queue a = forall x rl sl . Queue
{ front :: Vec (sl :+ rl) a
, rear :: Vec rl a
, schedule :: Vec sl x
}
The trouble is that I haven't been able to figure out how to juggle the types. It seems extremely likely that some amount of unsafeCoerce will be needed to make this efficient. However, I haven't been able to come up with an approach that's even vaguely manageable. Is it possible to do this nicely in Haskell?
Here is what I got:
open import Function
open import Data.Nat.Base
open import Data.Vec
grotate : ∀ {n m} {A : Set}
-> (B : ℕ -> Set)
-> (∀ {n} -> A -> B n -> B (suc n))
-> Vec A n
-> Vec A (suc n + m)
-> B m
-> B (suc n + m)
grotate B cons [] (y ∷ ys) a = cons y a
grotate B cons (x ∷ xs) (y ∷ ys) a = grotate (B ∘ suc) cons xs ys (cons y a)
rotate : ∀ {n m} {A : Set} -> Vec A n -> Vec A (suc n + m) -> Vec A m -> Vec A (suc n + m)
rotate = grotate (Vec _) _∷_
record Queue (A : Set) : Set₁ where
constructor queue
field
{X} : Set
{n m} : ℕ
front : Vec A (n + m)
rear : Vec A m
schedule : Vec X n
open import Relation.Binary.PropositionalEquality
open import Data.Nat.Properties.Simple
exec : ∀ {m n A} -> Vec A (n + m) -> Vec A (suc m) -> Vec A n -> Queue A
exec {m} {suc n} f r (_ ∷ s) = queue (subst (Vec _) (sym (+-suc n m)) f) r s
exec {m} f r [] = queue (with-zero f') [] f' where
with-zero = subst (Vec _ ∘ suc) (sym (+-right-identity m))
without-zero = subst (Vec _ ∘ suc) (+-right-identity m)
f' = without-zero (rotate f (with-zero r) [])
rotate is defined in terms of grotate for the same reason reverse is defined in terms of foldl (or enumerate in terms of genumerate): because Vec A (suc n + m) is not definitionally Vec A (n + suc m), while (B ∘ suc) m is definitionally B (suc m).
exec has the same implementation as you provided (modulo those substs), but I'm not sure about the types: is it OK that r must be non-empty?
The other answer is super clever (please take a moment to upvote it), but as someone not familiar with Agda, how this would be implemented in Haskell was not obvious to me. Here's a full Haskell version. We'll need a whole slew of extensions, as well as Data.Type.Equality (since we will need to do some limited amount of type-proofs).
{-# LANGUAGE GADTs, ScopedTypeVariables,RankNTypes,
TypeInType, TypeFamilies, TypeOperators #-}
import Data.Type.Equality
Defining Nat, Vec, and Queue
Next, we define the usual type-level natural numbers (this looks like just a regular data definition, but because we have TypeInType enabled, it will get automatically promoted when we use it in a type) and a type function (a type family) for addition. Note that although there are multiple ways of defining +, our choice here will impact what follows. We'll also define the usual Vec which is very much like a list except that it encodes its length in the phantom type n. With that, we can go ahead and define the type of our queue.
data Nat = Z | S Nat
type family n + m where
Z + m = m
S n + m = S (n + m)
data Vec a n where
Nil :: Vec a Z
(:::) :: a -> Vec a n -> Vec a (S n)
data Queue a where
Queue :: { front :: Vec a (n + m)
, rear :: Vec a m
, schedule :: Vec x n } -> Queue a
Defining rotate
Now, things start to get hairier. We want to define a function rotate that has type rotate :: Vec a n -> Vec a (S n + m) -> Vec a m -> Vec a (S n + m), but you quickly run into a variety of proof related problems with just defining this recursively. The solution is instead to define a slightly more general grotate, which can be defined recursively, and for which rotate is a special case.
The point of Bump is to circumvent the fact that there is no such thing as type level composition in Haskell. There is no way of writing things an operator like (∘) such that (S ∘ S) x is S (S x). The workaround is to continuously wrap/unwrap with Bump/lower.
newtype Bump p n = Bump { lower :: p (S n) }
grotate :: forall p n m a.
(forall n. a -> p n -> p (S n)) ->
Vec a n ->
Vec a (S n + m) ->
p m ->
p (S n + m)
grotate cons Nil (y ::: _) zs = cons y zs
grotate cons (x ::: xs) (y ::: ys) zs = lower (grotate consS xs ys (Bump (cons y zs)))
where
consS :: forall n. a -> Bump p n -> Bump p (S n)
consS = \a -> Bump . cons a . lower
rotate :: Vec a n -> Vec a (S n + m) -> Vec a m -> Vec a (S n + m)
rotate = grotate (:::)
We need explicit foralls here to make it very clear which type variables are getting captured and which aren't, as well as to denote higher-rank types.
Singleton natural numbers SNat
Before we proceed to exec, we set up some machinery that will allow us to prove some type-level arithmetic claims (which we need to get exec to typecheck). We start by making an SNat type (which is a singleton type corresponding to Nat). SNat reflects its value in a phantom type variable.
data SNat n where
SZero :: SNat Z
SSucc :: SNat n -> SNat (S n)
We can then make a couple useful functions to do things with SNat.
sub1 :: SNat (S n) -> SNat n
sub1 (SSucc x) = x
size :: Vec a n -> SNat n
size Nil = SZero
size (_ ::: xs) = SSucc (size xs)
Finally, we are prepared to prove some arithmetic, namely that n + S m ~ S (n + m) and n + Z ~ n.
plusSucc :: (SNat n) -> (SNat m) -> (n + S m) :~: S (n + m)
plusSucc SZero _ = Refl
plusSucc (SSucc n) m = gcastWith (plusSucc n m) Refl
plusZero :: SNat n -> (n + Z) :~: n
plusZero SZero = Refl
plusZero (SSucc n) = gcastWith (plusZero n) Refl
Defining exec
Now that we have rotate, we can define exec. This definition looks almost identical to the one in the question (with lists), except annotated with gcastWith <some-proof>.
exec :: Vec a (n + m) -> Vec a (S m) -> Vec a n -> Queue a
exec f r (_ ::: s) = gcastWith (plusSucc (size s) (sub1 (size r))) $ Queue f r s
exec f r Nil = gcastWith (plusZero (sub1 (size r))) $
let f' = rotate f r Nil in (Queue f' Nil f')
It is probably worth noting that we can get some stuff for free by using singletons. With the right extensions enabled, the following more readable code
import Data.Singletons.TH
singletons [d|
data Nat = Z | S Nat
(+) :: Nat -> Nat -> Nat
Z + n = n
S m + n = S (m + n)
|]
defines, Nat, the type family :+ (equivalent to my +), and the singleton type SNat (with constructors SZ and SS equivalent to my SZero and SSucc) all in one.
Related
I am trying to understand how the Plus and Times functions in the code below work. What I do not understand is:
How do you call these functions in ghci to operate on a vector?
Why do they have to operate on the type level? What type do they operate on?
How do they evaluate the result?
For plus, we have, Plus (S m) n = S (Plus m n), but how is (Plus m n) evaluated. Similarly, how is (Times n m) evaluated?
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, GADTs #-}
{-# LANGUAGE UndecidableInstances, StandaloneDeriving #-}
module Vector where
-- Natural numbers, values will be promoted to types
data Nat
= Z -- Zero
| S Nat -- Successor (+1)
data Vec (a :: *) :: Nat -> * where
-- Nil has zero length
Nil :: Vec a Z
-- Cons has length of the tail + 1
Cons :: a -> Vec a n -> Vec a (S n)
deriving instance Show a => Show (Vec a n)
-- head :: [a] -> a
hd :: Vec a (S n) -> a
hd (Cons x xs) = x
-- tail :: [a] -> [a]
tl :: Vec a (S n) -> Vec a n
tl (Cons x xs) = xs
-- map :: (a -> b) -> [a] -> [b]
vMap :: (a -> b) -> Vec a n -> Vec b n
vMap f Nil = Nil
vMap f (Cons x xs) = Cons (f x) (vMap f xs)
-- (++) :: [a] -> [a] -> [a]
vAppend :: Vec a n -> Vec a m -> Vec a (Plus n m)
vAppend Nil xs = xs
vAppend (Cons y ys) xs = Cons y (vAppend ys xs)
-- Type-level addition
type family Plus (x :: Nat) (y :: Nat) :: Nat where
Plus Z n = n
Plus (S m) n = S (Plus m n)
-- concat :: [[a]] -> [a]
vConcat :: Vec (Vec a n) m -> Vec a (Times m n)
vConcat Nil = Nil
vConcat (Cons xs xss) = xs `vAppend` vConcat xss
-- Type-level multiplication
type family Times (x :: Nat) (y :: Nat) :: Nat where
Times Z m = Z
Times (S n) m = Plus m (Times n m)
vFilter :: (a -> Bool) -> Vec a n -> [a]
vFilter p Nil = []
vFilter p (Cons x xs)
| p x = x : vFilter p xs
| otherwise = vFilter p xs
Let me first define a more convenient synonym for Cons:
infixr 5 #:
(#:) :: a -> Vec a n -> Vec a (S n)
(#:) = Cons
How do you call these functions in ghci to operate on a vector?
The value-level functions you can call from GHCi like any other value-level functions. That'll first of all invoke any type-level computations that are necessary, and then run the typechecked code like you would run other Haskell code.
*Vector> :set -XDataKinds
*Vector> let v = 4 #: 9 #: 13 #:Nil
*Vector> :t v
v :: Num a => Vec a ('S ('S ('S 'Z)))
*Vector> v
Cons 4 (Cons 9 (Cons 13 Nil))
*Vector> let w = 7 #: 8 #: 6 #:Nil
*Vector> :t vAppend v w
vAppend v w :: Num a => Vec a ('S ('S ('S ('S ('S ('S 'Z))))))
*Vector> vAppend v w
Cons 4 (Cons 9 (Cons 13 (Cons 7 (Cons 8 (Cons 6 Nil)))))
To evaluate type families as type-level functions by themselves, use GHCi's :kind! command:
*Vector> :kind! Plus ('S 'Z) ('S ('S ('S 'Z)))
Plus ('S 'Z) ('S ('S ('S 'Z))) :: Nat
= 'S ('S ('S ('S 'Z)))
Why do they have to operate on the type level? What type do they operate on?
The purpose, in this example, of having them operate on the type level is that any length-mismatch errors should be caught by the compiler, not result in runtime errors. For example, you may want to write a function accepting two lists that are supposed to have the same length. This is then unsafe:
foo :: [a] -> [a] -> String
but this expresses exactly that the lengths must be identical:
foo' :: Vec a n -> Vec a n -> String
A concrete example is a zip. Prelude.zip does allow lists of different lengths but this means the longer list is basically trimmed to the length of the shorter, which may result in unexpected behaviour. With the Vector version
vZip :: Vec a n -> Vec b n -> Vec (a,b) n
vZip Nil Nil = Nil
vZip (Cons x xs) (Cons y ys) = Cons (x,y) $ vZip xs ys
this can't happen:
*Vector> vZip v w
Cons (4,7) (Cons (9,8) (Cons (13,6) Nil))
*Vector> vZip (vAppend v v) w
<interactive>:22:7: error:
• Couldn't match type ‘'S ('S ('S 'Z))’ with ‘'Z’
Expected type: Vec a ('S ('S ('S 'Z)))
Actual type: Vec a (Plus ('S ('S ('S 'Z))) ('S ('S ('S 'Z))))
• In the first argument of ‘vZip’, namely ‘(vAppend v v)’
In the expression: vZip (vAppend v v) w
In an equation for ‘it’: it = vZip (vAppend v v) w
Note that even if this expression were written somewhere deep down in a big program, you would immediately get the error at compile-time, instead of runtime problems sometime later on.
How do they evaluate the result?
It depends, and you don't really need to care, but the important bit is that the compiler weeds out anything that could go wrong† and immediately gives you an error message. If all the type calculations are provably right, then the compiler hard-wires the code corresponding to those types into your program, erases the types themselves, and the runtime works essentially like in a Haskell program with plain old lists.
†Sometimes unfortunately it will also give you an error message when the code is actually correct&safe, but the compiler can't prove it.
Suppose a list L, with length n, is interleaved in list J, with length n + 1.
We'd like to know, for each element of J, which of its neighbors from L is the greater.
The following function takes L as its input, and produces a list K, also of length
n + 1, such that the ith element of K is the desired neighbor of the ith element of J.
aux [] prev acc = prev:acc
aux (hd:tl) prev acc = aux tl hd ((max hd prev):acc)
expand row = reverse (aux row 0 [])
I can prove to myself, informally, that the length of the result of this function (which I
originally wrote in Ocaml) is one greater than the length of the input. But I
hopped over to Haskell (a new language for me) because I got interested in being
able to prove via the type system that this invariant holds. With the help
of this previous answer, I was
able to get as far as the following:
{-# LANGUAGE GADTs, TypeOperators, TypeFamilies #-}
data Z
data S n
type family (:+:) a b :: *
type instance (:+:) Z n = n
type instance (:+:) (S m) n = S (m :+: n)
-- A List of length 'n' holding values of type 'a'
data List a n where
Nil :: List a Z
Cons :: a -> List a m -> List a (S m)
aux :: List a n -> a -> List a m -> List a (n :+: (S m))
aux Nil prev acc = Cons prev acc
aux (Cons hd tl) prev acc = aux tl hd (Cons (max hd prev) acc)
However, the last line produces the following error:
* Could not deduce: (m1 :+: S (S m)) ~ S (m1 :+: S m)
from the context: n ~ S m1
bound by a pattern with constructor:
Cons :: forall a m. a -> List a m -> List a (S m),
in an equation for `aux'
at pyramid.hs:23:6-15
Expected type: List a (n :+: S m)
Actual type: List a (m1 :+: S (S m))
* In the expression: aux tl hd (Cons (max hd prev) acc)
In an equation for `aux':
aux (Cons hd tl) prev acc = aux tl hd (Cons (max hd prev) acc)
* Relevant bindings include
acc :: List a m (bound at pyramid.hs:23:23)
tl :: List a m1 (bound at pyramid.hs:23:14)
aux :: List a n -> a -> List a m -> List a (n :+: S m)
(bound at pyramid.hs:22:1)
It seems that what I need to do is teach the compiler that (x :+: (S y)) ~ S (x :+: y). Is this possible?
Alternatively, are there better tools for this problem than the type system?
First, some imports and language extensions:
{-# LANGUAGE GADTs, TypeInType, RankNTypes, TypeOperators, TypeFamilies, TypeApplications, AllowAmbiguousTypes #-}
import Data.Type.Equality
We now have DataKinds (or TypeInType) which allows us to promote any data to the type level (with its own kind), so the type level naturals really deserve to be defined as a regular data (heck, this is exactly the motivating examples the previous link to the GHC docs give!). Nothing changes with your List type, but (:+:) really should be a closed type family (now over things of kind Nat).
-- A natural number type (that can be promoted to the type level)
data Nat = Z | S Nat
-- A List of length 'n' holding values of type 'a'
data List a n where
Nil :: List a Z
Cons :: a -> List a m -> List a (S m)
type family (+) (a :: Nat) (b :: Nat) :: Nat where
Z + n = n
S m + n = S (m + n)
Now, in order to make the proofs work for aux, it is useful to define singleton types for natural numbers.
-- A singleton type for `Nat`
data SNat n where
SZero :: SNat Z
SSucc :: SNat n -> SNat (S n)
-- Utility for taking the predecessor of an `SNat`
sub1 :: SNat (S n) -> SNat n
sub1 (SSucc x) = x
-- Find the size of a list
size :: List a n -> SNat n
size Nil = SZero
size (Cons _ xs) = SSucc (size xs)
Now, we are in shape to start proving some stuff. From Data.Type.Equality, a :~: b represents a proof that a ~ b. We need to prove one simple thing about arithmetic.
-- Proof that n + (S m) == S (n + m)
plusSucc :: SNat n -> SNat m -> (n + S m) :~: S (n + m)
plusSucc SZero _ = Refl
plusSucc (SSucc n) m = gcastWith (plusSucc n m) Refl
Finally, we can use gcastWith to use this proof in aux. Oh and you were missing the Ord a constraint. :)
aux :: Ord a => List a n -> a -> List a m -> List a (n + S m)
aux Nil prev acc = Cons prev acc
aux (Cons hd tl) prev acc = gcastWith (plusSucc (size tl) (SSucc (size acc)))
aux tl hd (Cons (max hd prev) acc)
-- append to a list
(|>) :: List a n -> a -> List a (S n)
Nil |> y = Cons y Nil
(Cons x xs) |> y = Cons x (xs |> y)
-- reverse 'List'
rev :: List a n -> List a n
rev Nil = Nil
rev (Cons x xs) = rev xs |> x
Let me know if this answers your question - getting started with this sort of thing involves a lot of new stuff.
Dipping my toe into the waters of dependent types, I had a crack at the canonical "list with statically-typed length" example.
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
-- a kind declaration
data Nat = Z | S Nat
data SafeList :: (Nat -> * -> *) where
Nil :: SafeList Z a
Cons :: a -> SafeList n a -> SafeList (S n) a
-- the type signature ensures that the input list has at least one element
safeHead :: SafeList (S n) a -> a
safeHead (Cons x xs) = x
This seems to work:
ghci> :t Cons 5 (Cons 3 Nil)
Cons 5 (Cons 3 Nil) :: Num a => SafeList ('S ('S 'Z)) a
ghci> safeHead (Cons 'x' (Cons 'c' Nil))
'x'
ghci> safeHead Nil
Couldn't match type 'Z with 'S n0
Expected type: SafeList ('S n0) a0
Actual type: SafeList 'Z a0
In the first argument of `safeHead', namely `Nil'
In the expression: safeHead Nil
In an equation for `it': it = safeHead Nil
However, in order for this data-type to be actually useful, I should be able to build it from run-time data for which you don't know the length at compile time. My naïve attempt:
fromList :: [a] -> SafeList n a
fromList = foldr Cons Nil
This fails to compile, with the type error:
Couldn't match type 'Z with 'S n
Expected type: a -> SafeList n a -> SafeList n a
Actual type: a -> SafeList n a -> SafeList ('S n) a
In the first argument of `foldr', namely `Cons'
In the expression: foldr Cons Nil
In an equation for `fromList': fromList = foldr Cons Nil
I understand why this is happening: the return type of Cons is different for each iteration of the fold - that's the whole point! But I can't see a way around it, probably because I've not read deeply enough into the subject. (I can't imagine all this effort is being put into a type system that is impossible to use in practice!)
So: How can I build this sort of dependently-typed data from 'normal' simply-typed data?
Following #luqui's advice I was able to make fromList compile:
data ASafeList a where
ASafeList :: SafeList n a -> ASafeList a
fromList :: [a] -> ASafeList a
fromList = foldr f (ASafeList Nil)
where f x (ASafeList xs) = ASafeList (Cons x xs)
Here's my attempt to unpack the ASafeList and use it:
getSafeHead :: [a] -> a
getSafeHead xs = case fromList xs of ASafeList ys -> safeHead ys
This causes another type error:
Couldn't match type `n' with 'S n0
`n' is a rigid type variable bound by
a pattern with constructor
ASafeList :: forall a (n :: Nat). SafeList n a -> ASafeList a,
in a case alternative
at SafeList.hs:33:22
Expected type: SafeList ('S n0) a
Actual type: SafeList n a
In the first argument of `safeHead', namely `ys'
In the expression: safeHead ys
In a case alternative: ASafeList ys -> safeHead ys
Again, intuitively it makes sense that this would fail to compile. I can call fromList with an empty list, so the compiler has no guarantee that I'll be able to call safeHead on the resulting SafeList. This lack of knowledge is roughly what the existential ASafeList captures.
Can this problem be solved? I feel like I might have walked down a logical dead-end.
Never throw anything away.
If you're going to take the trouble to crank along a list to make a length-indexed list (known in the literature as a "vector"), you may as well remember its length.
So, we have
data Nat = Z | S Nat
data Vec :: Nat -> * -> * where -- old habits die hard
VNil :: Vec Z a
VCons :: a -> Vec n a -> Vec (S n) a
but we can also give a run time representation to static lengths. Richard Eisenberg's "Singletons" package will do this for you, but the basic idea is to give a type of run time representations for static numbers.
data Natty :: Nat -> * where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
Crucially, if we have a value of type Natty n, then we can interrogate that value to find out what n is.
Hasochists know that run time representability is often so boring that even a machine can manage it, so we hide it inside a type class
class NATTY (n :: Nat) where
natty :: Natty n
instance NATTY Z where
natty = Zy
instance NATTY n => NATTY (S n) where
natty = Sy natty
Now we can give a slightly more informative existential treatment of the length you get from your lists.
data LenList :: * -> * where
LenList :: NATTY n => Vec n a -> LenList a
lenList :: [a] -> LenList a
lenList [] = LenList VNil
lenList (x : xs) = case lenList xs of LenList ys -> LenList (VCons x ys)
You get the same code as the length-destroying version, but you can grab a run time representation of the length anytime you like, and you don't need to crawl along the vector to get it.
Of course, if you want the length to be a Nat, it's still a pain that you instead have a Natty n for some n.
It's a mistake to clutter one's pockets.
Edit I thought I'd add a little, to address the "safe head" usage issue.
First, let me add an unpacker for LenList which gives you the number in your hand.
unLenList :: LenList a -> (forall n. Natty n -> Vec n a -> t) -> t
unLenList (LenList xs) k = k natty xs
And now suppose I define
vhead :: Vec (S n) a -> a
vhead (VCons a _) = a
enforcing the safety property. If I have a run time representation of the length of a vector, I can look at it to see if vhead applies.
headOrBust :: LenList a -> Maybe a
headOrBust lla = unLenList lla $ \ n xs -> case n of
Zy -> Nothing
Sy _ -> Just (vhead xs)
So you look at one thing, and in doing so, learn about another.
In
fromList :: [a] -> SafeList n a
n is universally quantified -- i.e. this signature is claiming that we should be able to build a SafeList of any length from the list. Instead you want to quantify existentially, which can only be done by defining a new data type:
data ASafeList a where
ASafeList :: SafeList n a -> ASafeList a
Then your signature should be
fromList :: [a] -> ASafeList a
You can use it by pattern matching on ASafeList
useList :: ASafeList a -> ...
useList (ASafeList xs) = ...
and in the body, xs will be a SafeList n a type with an unknown (rigid) n. You will probably have to add more operations to use it in any nontrivial way.
If you want to use dependently typed functions on runtime data, then you need to ensure, that this data doesn't violate encoded in type signatures laws. It's easier to understand this by an example. Here is our setup:
data Nat = Z | S Nat
data Natty (n :: Nat) where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
data Vec :: * -> Nat -> * where
VNil :: Vec a Z
VCons :: a -> Vec a n -> Vec a (S n)
We can write some simple functions on Vec:
vhead :: Vec a (S n) -> a
vhead (VCons x xs) = x
vtoList :: Vec a n -> [a]
vtoList VNil = []
vtoList (VCons x xs) = x : vtoList xs
vlength :: Vec a n -> Natty n
vlength VNil = Zy
vlength (VCons x xs) = Sy (vlength xs)
For writing the canonical example of the lookup function we need the concept of finite sets. They are usually defined as
data Fin :: Nat -> where
FZ :: Fin (S n)
FS :: Fin n -> Fin (S n)
Fin n represents all numbers less than n.
But just like there is a type level equivalent of Nats — Nattys, there is a type level equivalent of Fins. But now we can incorporate value level and type level Fins:
data Finny :: Nat -> Nat -> * where
FZ :: Finny (S n) Z
FS :: Finny n m -> Finny (S n) (S m)
The first Nat is an upper bound of a Finny. And the second Nat corresponds to an actual value of a Finny. I.e. it must be equal to toNatFinny i, where
toNatFinny :: Finny n m -> Nat
toNatFinny FZ = Z
toNatFinny (FS i) = S (toNatFinny i)
Defining the lookup function is now straightforward:
vlookup :: Finny n m -> Vec a n -> a
vlookup FZ (VCons x xs) = x
vlookup (FS i) (VCons x xs) = vlookup i xs
And some tests:
print $ vlookup FZ (VCons 1 (VCons 2 (VCons 3 VNil))) -- 1
print $ vlookup (FS FZ) (VCons 1 (VCons 2 (VCons 3 VNil))) -- 2
print $ vlookup (FS (FS (FS FZ))) (VCons 1 (VCons 2 (VCons 3 VNil))) -- compile-time error
That was simple, but what about the take function? It's not harder:
type Finny0 n = Finny (S n)
vtake :: Finny0 n m -> Vec a n -> Vec a m
vtake FZ _ = VNil
vtake (FS i) (VCons x xs) = VCons x (vtake i xs)
We need Finny0 instead of Finny, because lookup requires a Vec to be non-empty, so if there is a value of type Finny n m, then n = S n' for some n'. But vtake FZ VNil is perfectly valid, so we need to relax this restriction. So Finny0 n represents all numbers less or equal n.
But what about runtime data?
vfromList :: [a] -> (forall n. Vec a n -> b) -> b
vfromList [] f = f VNil
vfromList (x:xs) f = vfromList xs (f . VCons x)
I.e. "give me a list and a function, that accepts a Vec of arbitrary length, and I'll apply the latter to the former". vfromList xs returns a continuation (i.e. something of type (a -> r) -> r) modulo higher-rank types. Let's try it:
vmhead :: Vec a n -> Maybe a
vmhead VNil = Nothing
vmhead (VCons x xs) = Just x
main = do
print $ vfromList ([] :: [Int]) vmhead -- Nothing
print $ vfromList [1..5] vmhead -- Just 1
Works. But aren't we just repeat ourself? Why vmhead, when there is vhead already? Should we rewrite all safe functions in an unsafe way to make is possible to use them on runtime data? That would be silly.
All we need is to ensure, that all invariants hold. Let's try this principle on the vtake function:
fromIntFinny :: Int -> (forall n m. Finny n m -> b) -> b
fromIntFinny 0 f = f FZ
fromIntFinny n f = fromIntFinny (n - 1) (f . FS)
main = do
xs <- readLn :: IO [Int]
i <- read <$> getLine
putStrLn $
fromIntFinny i $ \i' ->
vfromList xs $ \xs' ->
undefined -- what's here?
fromIntFinny is just like vfromList. It's instructive to see, what the types are:
i' :: Finny n m
xs' :: Vec a p
But vtake has this type: Finny0 n m -> Vec a n -> Vec a m. So we need to coerce i', so that it would be of type Finny0 p m. And also toNatFinny i' must be equal to toNatFinny coerced_i'. But this coercion is not possible in general, since if S p < n, then there are elements in Finny n m, that are not in Finny (S p) m, since S p and n are upper bounds.
coerceFinnyBy :: Finny n m -> Natty p -> Maybe (Finny0 p m)
coerceFinnyBy FZ p = Just FZ
coerceFinnyBy (FS i) (Sy p) = fmap FS $ i `coerceFinnyBy` p
coerceFinnyBy _ _ = Nothing
That's why there is Maybe here.
main = do
xs <- readLn :: IO [Int]
i <- read <$> getLine
putStrLn $
fromIntFinny i $ \i' ->
vfromList xs $ \xs' ->
case i' `coerceFinnyBy` vlength xs' of
Nothing -> "What should I do with this input?"
Just i'' -> show $ vtoList $ vtake i'' xs'
In the Nothing case a number, that was read from the input, is bigger, than the length of a list. In the Just case a number is less or equal to the length of a list and coerced to the appropriate type, so vtake i'' xs' is well-typed.
This works, but we introduced the coerceFinnyBy function, that looks rather ad hoc. Decidable "less or equal" relation would be the appropriate alternative:
data (:<=) :: Nat -> Nat -> * where
Z_le_Z :: Z :<= m -- forall n, 0 <= n
S_le_S :: n :<= m -> S n :<= S m -- forall n m, n <= m -> S n <= S m
type n :< m = S n :<= m
(<=?) :: Natty n -> Natty m -> Either (m :< n) (n :<= m) -- forall n m, n <= m || m < n
Zy <=? m = Right Z_le_Z
Sy n <=? Zy = Left (S_le_S Z_le_Z)
Sy n <=? Sy m = either (Left . S_le_S) (Right . S_le_S) $ n <=? m
And a safe injecting function:
inject0Le :: Finny0 n p -> n :<= m -> Finny0 m p
inject0Le FZ _ = FZ
inject0Le (FS i) (S_le_S le) = FS (inject0Le i le)
I.e. if n is an upper bound for some number and n <= m, then m is an upper bound for this number too. And another one:
injectLe0 :: Finny n p -> n :<= m -> Finny0 m p
injectLe0 FZ (S_le_S le) = FZ
injectLe0 (FS i) (S_le_S le) = FS (injectLe0 i le)
The code now looks like this:
getUpperBound :: Finny n m -> Natty n
getUpperBound = undefined
main = do
xs <- readLn :: IO [Int]
i <- read <$> getLine
putStrLn $
fromIntFinny i $ \i' ->
vfromList xs $ \xs' ->
case getUpperBound i' <=? vlength xs' of
Left _ -> "What should I do with this input?"
Right le -> show $ vtoList $ vtake (injectLe0 i' le) xs'
It compiles, but what definition should getUpperBound have? Well, you can't define it. A n in Finny n m lives only at the type level, you can't extract it or get somehow. If we can't perform "downcast", we can perform "upcast":
fromIntNatty :: Int -> (forall n. Natty n -> b) -> b
fromIntNatty 0 f = f Zy
fromIntNatty n f = fromIntNatty (n - 1) (f . Sy)
fromNattyFinny0 :: Natty n -> (forall m. Finny0 n m -> b) -> b
fromNattyFinny0 Zy f = f FZ
fromNattyFinny0 (Sy n) f = fromNattyFinny0 n (f . FS)
For comparison:
fromIntFinny :: Int -> (forall n m. Finny n m -> b) -> b
fromIntFinny 0 f = f FZ
fromIntFinny n f = fromIntFinny (n - 1) (f . FS)
So a continuation in fromIntFinny is universally quantified over the n and m variables, while a continuation in fromNattyFinny0 is universally quantified over just m. And fromNattyFinny0 receives a Natty n instead of Int.
There is Finny0 n m instead of Finny n m, because FZ is an element of forall n m. Finny n m, while FZ is not necessarily an element of forall m. Finny n m for some n, specifically FZ is not an element of forall m. Finny 0 m (so this type is uninhabited).
After all, we can join fromIntNatty and fromNattyFinny0 together:
fromIntNattyFinny0 :: Int -> (forall n m. Natty n -> Finny0 n m -> b) -> b
fromIntNattyFinny0 n f = fromIntNatty n $ \n' -> fromNattyFinny0 n' (f n')
Achieving the same result, as in the #pigworker's answer:
unLenList :: LenList a -> (forall n. Natty n -> Vec n a -> t) -> t
unLenList (LenList xs) k = k natty xs
Some tests:
main = do
xs <- readLn :: IO [Int]
ns <- read <$> getLine
forM_ ns $ \n -> putStrLn $
fromIntNattyFinny0 n $ \n' i' ->
vfromList xs $ \xs' ->
case n' <=? vlength xs' of
Left _ -> "What should I do with this input?"
Right le -> show $ vtoList $ vtake (inject0Le i' le) xs'
for
[1,2,3,4,5,6]
[0,2,5,6,7,10]
returns
[]
[1,2]
[1,2,3,4,5]
[1,2,3,4,5,6]
What should I do with this input?
What should I do with this input?
The code: http://ideone.com/3GX0hd
EDIT
Well, you can't define it. A n in Finny n m lives only at the type
level, you can't extract it or get somehow.
That's not true. Having SingI n => Finny n m -> ..., we can get n as fromSing sing.
I wrote an Agda-function prefixApp which applies a Vector-Function to a prefix of a vector:
split : {A : Set}{m n : Nat} -> Vec A (n + m) -> (Vec A n) * (Vec A m)
split {_} {_} {zero} xs = ( [] , xs )
split {_} {_} {suc _} (x :: xs) with split xs
... | ( ys , zs ) = ( (x :: ys) , zs )
prefixApp : {A : Set}{n m k : Nat} -> (Vec A n -> Vec A m) -> Vec A (n + k) -> Vec A (m + k)
prefixApp f xs with split xs
... | ( ys , zs ) = f ys ++ zs
I like the fact, that prefixApp can be used without explicitly providing a length argument, e.g.
gate : Vec Bool 4 -> Vec Bool 3
gate = prefixApp xorV
(where xorV : Vec Bool 2 -> Vec Bool 1 is the Vector-Xor-Function)
Unfortunately, I dont know how to write a postfixApp-function which can be used without explicitly providing a length argument. My function definition so far looks like this:
postfixApp : {A : Set}{n m k : Nat} -> (Vec A n -> Vec A m) -> Vec A (k + n) -> Vec A (k + m)
postfixApp {_} {_} {_} {k} f xs with split {_} {_} {k} xs
... | ( ys , zs ) = ys ++ (f zs)
It seems, however, that postfixApp always needs a length argument. E.g.
gate : Vec Bool 4 -> Vec Bool 3
gate = postfixApp {k = 2} xorV
Does anyone know, how to eliminate this asymmetry, i.e. how to write a function postfixApp which works without an explicit length argument. I guess, I need another split-function?
With your prefixApp, you have
prefixApp : {A : Set}{n m k : Nat} -> (Vec A n -> Vec A m) -> Vec A (n + k) -> Vec A (m + k)
and you pass it a function Vec Bool 2 -> Vec Bool 1, so it knows that n = 2 and m = 1 by simple unification. Then, because addition is defined by recursion on the left arguments, the remainder of the function type reduces from Vec A (2 + k) -> Vec A (1 + k) to Vec A (suc (suc k)) -> Vec A (suc k). Agda can then apply straight-up unification (expanding the number literals) of:
Vec A (suc (suc k)) -> Vec A (suc k)
Vec Bool (suc (suc (suc (suc zero)))) -> Vec Bool (suc (suc (suc zero)))
to infer that k = 2.
Looking at the other one:
postfixApp : {A : Set}{n m k : Nat} -> (Vec A n -> Vec A m) -> Vec A (k + n) -> Vec A (k + m)
The only difference is that the known quantities that your xorV forces n and m to be 2 and 1, but this only makes the remainder of your function type into Vec A (k + 2) -> Vec A (k + 1). This type does not reduce further, because addition is defined by recursion on the first argument, k, which is unknown at this point. You then try to unify k + 2 with 4 and k + 1 with 3, and Agda spits out yellow. "But clearly k = 2," you say! You know that because you know math, and can apply subtraction and other simple principles, but Agda does not know that. _+_ is just another function to it, and unifying arbitrary function applications is hard. What if I asked you to unify (2 + x) * (2 + y) with 697, for example? Should the typechecker be expected to factor the number and complain that there isn't a unique factorization? I guess since multiplication is commutative there generally won't be unless you restrict the sides, but should Agda know that multiplication is commutative?
Anyway, so Agda only knows how to do unification, which basically matches "structural" quantities to each other. Data constructors have this structural quality to them, as do type constructors, so those can all be unified unambiguously. When it comes to anything fancier than that, you run into the "higher-order unification" problem, which can't be solved in general. Agda implements a fancy algorithm called Miller pattern unification, that lets it solve some restricted sorts of fancier situations, but there are some things it just can't do, and your kind of function application is one of them.
If you look in the standard library, you'll find that most cases in which a type involves an addition of naturals, one of the addends (the left one) will generally not be implicit, unless another argument specifies it completely (as is the case in your prefixApp).
As far as what to do about it, there isn't much to tackle the problem in general. After a while, you develop a sense for what Agda can infer and what it can't, and then stop making the uninferrable arguments implicit. You can define a "symmetric" version of _+_, but it ends up just being equally painful to work with both sides of it, so I don't recommend that either.
Actually, it's possible to define this function with almost the same type.
postfixApp : {A : Set}{n m k : ℕ} -> (Vec A n -> Vec A m) -> Vec A (n + k) -> Vec A (k + m)
postfixApp f xs with splitAt' (reverse xs)
... | ys , zs = reverse zs ++ f (reverse ys)
test-func : Vec Bool 3 -> Vec Bool 2
test-func (x1 ∷ x2 ∷ x3 ∷ []) = (x1 ∧ x2) ∷ (x2 ∨ x3) ∷ []
test : postfixApp test-func (false ∷ false ∷ true ∷ false ∷ true ∷ [])
≡ false ∷ false ∷ false ∷ true ∷ []
test = refl
The whole code: http://lpaste.net/107176
I have the following definition of fixed-length-vectors using ghcs extensions GADTs, TypeOperators and DataKinds:
data Vec n a where
T :: Vec VZero a
(:.) :: a -> Vec n a -> Vec (VSucc n) a
infixr 3 :.
data VNat = VZero | VSucc VNat -- ... promoting Kind VNat
type T1 = VSucc VZero
type T2 = VSucc T1
and the following defiition of a TypeOperator :+:
type family (n::VNat) :+ (m::VNat) :: VNat
type instance VZero :+ n = n
type instance VSucc n :+ m = VSucc (n :+ m)
For my whole intented library to make sense, I need to apply a fixed-length-vector-function of type (Vec n b)->(Vec m b) to the inial part of a longer vector Vec (n:+k) b. Let's call that function prefixApp. It should have type
prefixApp :: ((Vec n b)->(Vec m b)) -> (Vec (n:+k) b) -> (Vec (m:+k) b)
Here's an example application with the fixed-length-vector-function change2 defined like this:
change2 :: Vec T2 a -> Vec T2 a
change2 (x :. y :. T) = (y :. x :. T)
prefixApp should be able to apply change2 to the prefix of any vector of length >=2, e.g.
Vector> prefixApp change2 (1 :. 2 :. 3 :. 4:. T)
(2 :. 1 :. 3 :. 4 :. T)
Has anyone any idea how to implement prefixApp?
(The problem is, that a part of the type of the fixed-length-vector-function has to be used to grab the prefix of the right size...)
Edit:
Daniel Wagners (very clever!) solution seems to have worked with some release candidate of ghc 7.6 (not an official release!). IMHO it shouldnt work, however, for 2 reasons:
The type-declaration for prefixApp lacks an VNum m in the context (for prepend (f b) to typecheck correctly.
Even more problematic: ghc 7.4.2 does not assume the TypeOperator :+ to be injective in its first argument (nor the second, but thats not essential here), which leads to a type error: from the type-declaration, we know that vec must have type Vec (n:+k) a and the type-checker infers for the expression split vec on the right-hand side of the definition a type of Vec (n:+k0) a. But: the type-checker cannot infer that k ~ k0 (since there is no assurance that :+ is injective).
Does anyone know a solution to this second issue? How can I declare :+ to be injective in its first argument and/or how can I avoid running into this issue at all?
Here is a version where split is not in a type class. Here we build a singleton type for natural numbers (SN), which enables to pattern match on `n' in the definition of split'.
This extra argument can then be hidden by the use of a type class (ToSN).
The type Tag is used to manually specify the non-inferred arguments.
(this answer has been co-authored with Daniel Gustafsson)
Here is the code:
{-# LANGUAGE TypeFamilies, TypeOperators, DataKinds, GADTs, ScopedTypeVariables, FlexibleContexts #-}
module Vec where
data VNat = VZero | VSucc VNat -- ... promoting Kind VNat
data Vec n a where
T :: Vec VZero a
(:.) :: a -> Vec n a -> Vec (VSucc n) a·
infixr 3 :.
type T1 = VSucc VZero
type T2 = VSucc T1
data Tag (n::VNat) = Tag
data SN (n::VNat) where
Z :: SN VZero
S :: SN n -> SN (VSucc n)
class ToSN (n::VNat) where
toSN :: SN n
instance ToSN VZero where
toSN = Z
instance ToSN n => ToSN (VSucc n) where
toSN = S toSN
type family (n::VNat) :+ (m::VNat) :: VNat
type instance VZero :+ n = n
type instance VSucc n :+ m = VSucc (n :+ m)
split' :: SN n -> Tag m -> Vec (n :+ m) a -> (Vec n a, Vec m a)
split' Z _ xs = (T , xs)
split' (S n) _ (x :. xs) = let (as , bs) = split' n Tag xs in (x :. as , bs)
split :: ToSN n => Tag m -> Vec (n :+ m) a -> (Vec n a, Vec m a)
split = split' toSN
append :: Vec n a -> Vec m a -> Vec (n :+ m) a
append T ys = ys
append (x :. xs) ys = x :. append xs ys
prefixChange :: forall a m n k. ToSN n => (Vec n a -> Vec m a) -> Vec (n :+ k) a -> Vec (m :+ k) a
prefixChange f xs = let (as , bs) = split (Tag :: Tag k) xs in append (f as) bs
Make a class:
class VNum (n::VNat) where
split :: Vec (n:+m) a -> (Vec n a, Vec m a)
prepend :: Vec n a -> Vec m a -> Vec (n:+m) a
instance VNum VZero where
split v = (T, v)
prepend _ v = v
instance VNum n => VNum (VSucc n) where
split (x :. xs) = case split xs of (b, e) -> (x :. b, e)
prepend (x :. xs) v = x :. prepend xs v
prefixApp :: VNum n => (Vec n a -> Vec m a) -> (Vec (n:+k) a -> (Vec (m:+k) a))
prefixApp f vec = case split vec of (b, e) -> prepend (f b) e
If you can live with a slightly different type of prefixApp:
{-# LANGUAGE GADTs, TypeOperators, DataKinds, TypeFamilies #-}
import qualified Data.Foldable as F
data VNat = VZero | VSucc VNat -- ... promoting Kind VNat
type T1 = VSucc VZero
type T2 = VSucc T1
type T3 = VSucc T2
type family (n :: VNat) :+ (m :: VNat) :: VNat
type instance VZero :+ n = n
type instance VSucc n :+ m = VSucc (n :+ m)
type family (n :: VNat) :- (m :: VNat) :: VNat
type instance n :- VZero = n
type instance VSucc n :- VSucc m = n :- m
data Vec n a where
T :: Vec VZero a
(:.) :: a -> Vec n a -> Vec (VSucc n) a
infixr 3 :.
-- Just to define Show for Vec
instance F.Foldable (Vec n) where
foldr _ b T = b
foldr f b (a :. as) = a `f` F.foldr f b as
instance Show a => Show (Vec n a) where
show = show . F.foldr (:) []
class Splitable (n::VNat) where
split :: Vec k b -> (Vec n b, Vec (k:-n) b)
instance Splitable VZero where
split r = (T,r)
instance Splitable n => Splitable (VSucc n) where
split (x :. xs) =
let (xs' , rs) = split xs
in ((x :. xs') , rs)
append :: Vec n a -> Vec m a -> Vec (n:+m) a
append T r = r
append (l :. ls) r = l :. append ls r
prefixApp :: Splitable n => (Vec n b -> Vec m b) -> Vec k b -> Vec (m:+(k:-n)) b
prefixApp f v = let (v',rs) = split v in append (f v') rs
-- A test
inp :: Vec (T2 :+ T3) Int
inp = 1 :. 2 :. 3 :. 4:. 5 :. T
change2 :: Vec T2 a -> Vec T2 a
change2 (x :. y :. T) = (y :. x :. T)
test = prefixApp change2 inp -- -> [2,1,3,4,5]
In fact, your original signature can also be used (with augmented context):
prefixApp :: (Splitable n, (m :+ k) ~ (m :+ ((n :+ k) :- n))) =>
((Vec n b)->(Vec m b)) -> (Vec (n:+k) b) -> (Vec (m:+k) b)
prefixApp f v = let (v',rs) = split v in append (f v') rs
Works in 7.4.1
Upd: Just for fun, the solution in Agda:
data Nat : Set where
zero : Nat
succ : Nat -> Nat
_+_ : Nat -> Nat -> Nat
zero + r = r
succ n + r = succ (n + r)
data _*_ (A B : Set) : Set where
_,_ : A -> B -> A * B
data Vec (A : Set) : Nat -> Set where
[] : Vec A zero
_::_ : {n : Nat} -> A -> Vec A n -> Vec A (succ n)
split : {A : Set}{k n : Nat} -> Vec A (n + k) -> (Vec A n) * (Vec A k)
split {_} {_} {zero} v = ([] , v)
split {_} {_} {succ _} (h :: t) with split t
... | (l , r) = ((h :: l) , r)
append : {A : Set}{n m : Nat} -> Vec A n -> Vec A m -> Vec A (n + m)
append [] r = r
append (h :: t) r with append t r
... | tr = h :: tr
prefixApp : {A : Set}{n m k : Nat} -> (Vec A n -> Vec A m) -> Vec A (n + k) -> Vec A (m + k)
prefixApp f v with split v
... | (l , r) = append (f l) r