Dependent types: Crossing the type/kind (run-time/compile-time) barrier - haskell

While venturing deeper into Haskell's dependent types I keep bumping into problems with the strict separation of run-time and compile-time checks.
Let me demonstrate the issue(s) with a little example of integers with a given bit-length.
{-# LANGUAGE DataKinds, GADTs, TypeFamilies, TypeOperators #-}
import GHC.TypeNats
import Data.Bits
data Sign = Signed | Unsigned
data Int_ (s::Sign) (n::Nat) where
  Int_  :: Integer -> Int_ Signed n
  UInt_ :: Integer -> Int_ Unsigned n
It's a type Int_ s n with sign s and bit-length n.
Some useful class instances would look like this:
instance (KnownNat n) => Show (Int_ s n) where
  -- Verilog style
  show i@(Int_ v)  = (show.natVal) i <> "'sd" <> show v
  show i@(UInt_ v) = (show.natVal) i <> "'ud" <> show v
instance (KnownNat n) => Eq (Int_ s n) where
  (Int_ u)  == (Int_ v)  = u==v
  (UInt_ u) == (UInt_ v) = u==v
Smart constructors that check bounds are
int8 :: Integer -> Int_ Signed 8
int8 i = if i<2^7 && i>=(-2^7) then Int_ i else error "int8 Overflow"
uint8 :: Integer -> Int_ Unsigned 8
uint8 i = if i>=0 && i<2^8 then UInt_ i else error "uint8 Overflow"
These smart constructors are already not so nice because the sanity check against overflow happens at run-time.
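For illustration, a rough GHCi session with these definitions would look something like this (exact output may differ slightly, e.g. GHC also prints a call stack for the exception):
*Main> uint8 200
8'ud200
*Main> int8 200
*** Exception: int8 Overflow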
Problem 1: I have no clue how to write a smart constructor that would stop me from creating silly ints at compile-time. Moreover I'd need two types of constructors, one with the run-time check and one with the compile-time check.
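For what it's worth, here is one possible sketch of a compile-time-checked variant: the value is passed as a type-level literal, so the bound becomes a constraint the compiler checks. The name uint8C is made up for illustration, and on top of the extensions above it needs ScopedTypeVariables, TypeApplications, AllowAmbiguousTypes and Data.Proxy:
uint8C :: forall v. (KnownNat v, v <= 255) => Int_ Unsigned 8
uint8C = UInt_ (fromIntegral (natVal (Proxy @v)))   -- Proxy from Data.Proxy

-- ok  = uint8C @200   -- accepted
-- bad = uint8C @300   -- rejected at compile time: 300 <= 255 does not hold
Note that this does not remove the duplication: you still end up with one constructor per checking strategy.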
This 'duplication' problem, that is needing one set of operations with run-time checks and another set of operations for compile-time checks is quite common.
For example, a concatenation operation would look like this:
(.++) :: (KnownNat m, KnownNat n) => Int_ Unsigned m
      -> Int_ Unsigned n -> Int_ Unsigned (m+n)
i@(UInt_ x) .++ j@(UInt_ y) = UInt_ (x*2^(natVal j)+y)
and here's a shift operation:
(.>>) :: Int_ s m -> Int -> Int_ s m
i@(UInt_ x) .>> n = UInt_ (x `quot` (2 ^ n))
i@(Int_ x)  .>> n = Int_  (x `quot` (2 ^ n))
It's a sensible definition, but it would also be sensible to let .>> return a truncated Int_ s k (for the left shift it would in fact be the more sensible thing to do). I.e.
(.>>) :: Int_ s m -> Nat n -> Int_ s (m-n) -- not proper syntax
It clearly isn't as simple as that: m-n can't be negative - coming back to the bounds check problem above.
Problem 2, a fundamental one, I believe, is that sometimes I know the n to shift by at compile-time, and sometimes I don't. When I do know the n at compile-time I'd like to have the compile-time checks. When n is dynamic (determined at run-time) I obviously can't have them. Is there a way to set this up in order to mix-and-match, so to speak, i.e. avoid the duplication of all the logic, one at the value-level and one at the type level, plus the glue between the two?
My hunch is that I need some type class that introduces polymorphism over Int and some Nat sort of thing, but I'm totally unsure where this is going.
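One possible starting point along those lines (a sketch only, all names made up for illustration) is a class that overloads the shift amount over both a run-time Int and a compile-time Nat carried by a Proxy:
class ShiftAmount a where
  shiftAmount :: a -> Int

instance ShiftAmount Int where
  shiftAmount = id

instance KnownNat n => ShiftAmount (Proxy n) where   -- Proxy from Data.Proxy
  shiftAmount = fromIntegral . natVal

-- (.>>:) :: ShiftAmount a => Int_ s m -> a -> Int_ s m
-- i .>>: n = i .>> shiftAmount n
This unifies the two entry points at the value level, but it does not by itself give the type-level result index (m-n); that part still needs the kind of type-level arithmetic discussed above.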

Related

Besides as-pattern, what else can @ mean in Haskell?

I am studying Haskell currently and trying to understand a project that uses Haskell to implement cryptographic algorithms. After reading Learn You a Haskell for Great Good online, I began to understand the code in that project. Then I found I am stuck at the following code with the "@" symbol:
-- | Generate an @n@-dimensional secret key over @rq@.
genKey :: forall rq rnd n . (MonadRandom rnd, Random rq, Reflects n Int)
       => rnd (PRFKey n rq)
genKey = fmap Key $ randomMtx 1 $ value @n
Here the randomMtx is defined as follows:
-- | A random matrix having a given number of rows and columns.
randomMtx :: (MonadRandom rnd, Random a) => Int -> Int -> rnd (Matrix a)
randomMtx r c = M.fromList r c <$> replicateM (r*c) getRandom
And PRFKey is defined below:
-- | A PRF secret key of dimension @n@ over ring @a@.
newtype PRFKey n a = Key { key :: Matrix a }
All information sources I can find say that @ is the as-pattern, but this piece of code is apparently not that case. I have checked the online tutorial, blogs and even the Haskell 2010 language report at https://www.haskell.org/definition/haskell2010.pdf. There is simply no answer to this question.
More code snippets can be found in this project using @ in this way too:
-- | Generate public parameters (\( \mathbf{A}_0 \) and \(
-- \mathbf{A}_1 \)) for @n@-dimensional secret keys over a ring @rq@
-- for gadget indicated by @gad@.
genParams :: forall gad rq rnd n .
  (MonadRandom rnd, Random rq, Reflects n Int, Gadget gad rq)
  => rnd (PRFParams n gad rq)
genParams = let len = length $ gadget @gad @rq
                n   = value @n
            in Params <$> (randomMtx n (n*len)) <*> (randomMtx n (n*len))
I deeply appreciate any help on this.
That @n is an advanced feature of modern Haskell, which is usually not covered by tutorials like LYAH, nor can it be found in the Report.
It's called a type application and is a GHC language extension. To understand it, consider this simple polymorphic function
dup :: forall a . a -> (a, a)
dup x = (x, x)
Intuitively calling dup works as follows:
the caller chooses a type a
the caller chooses a value x of the previously chosen type a
dup then answers with a value of type (a,a)
In a sense, dup takes two arguments: the type a and the value x :: a. However, GHC is usually able to infer the type a (e.g. from x, or from the context where we are using dup), so we usually pass only one argument to dup, namely x. For instance, we have
dup True :: (Bool, Bool)
dup "hello" :: (String, String)
...
Now, what if we want to pass a explicitly? Well, in that case we can turn on the TypeApplications extension, and write
dup @Bool True :: (Bool, Bool)
dup @String "hello" :: (String, String)
...
Note the @... arguments carrying types (not values). Those exist at compile time only -- at runtime the argument does not exist.
Why do we want that? Well, sometimes there is no x around, and we want to prod the compiler to choose the right a. E.g.
dup @Bool :: Bool -> (Bool, Bool)
dup @String :: String -> (String, String)
...
Type applications are often useful in combination with some other extensions which make type inference unfeasible for GHC, like ambiguous types or type families. I won't discuss those, but you can simply understand that sometimes you really need to help the compiler, especially when using powerful type-level features.
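As one tiny aside, a classic example of an ambiguous type being fixed by a type application (needs TypeApplications; not code from the library in question):
roundTrip :: String -> String
roundTrip = show . read @Int   -- without @Int, GHC cannot pick the type 'read' returns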
Now, about your specific case. I don't have all the details, I don't know the library, but it's very likely that your n represents a kind of natural-number value at the type level. Here we are diving in rather advanced extensions, like the above-mentioned ones plus DataKinds, maybe GADTs, and some typeclass machinery. While I can't explain everything, hopefully I can provide some basic insight. Intuitively,
foo :: forall n . some type using n
takes as argument @n, a kind-of compile-time natural, which is not passed at runtime. Instead,
foo :: forall n . C n => some type using n
takes @n (compile-time), together with a proof that n satisfies constraint C n. The latter is a run-time argument, which might expose the actual value of n. Indeed, in your case, I guess you have something vaguely resembling
value :: forall n . Reflects n Int => Int
which essentially allows the code to bring the type-level natural to the term level, accessing the "type" as a "value". (The above type is considered an "ambiguous" one, by the way -- you really need @n to disambiguate.)
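For comparison, here is the same pattern written with the standard KnownNat machinery; this is only an analogy to what the library's Reflects class presumably does, not its actual code:
{-# LANGUAGE DataKinds, ScopedTypeVariables, TypeApplications, AllowAmbiguousTypes #-}
import GHC.TypeLits (KnownNat, natVal)
import Data.Proxy (Proxy (..))

intVal :: forall n. KnownNat n => Int   -- ambiguous in n, just like 'value'
intVal = fromIntegral (natVal (Proxy @n))

-- intVal @5 == 5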
Finally: why should one want to pass n at the type level if we then later on convert that to the term level? Wouldn't it be easier to simply write functions like
foo :: Int -> ...
foo n ... = ... use n
instead of the more cumbersome
foo :: forall n . Reflects n Int => ...
foo ... = ... use (value @n)
The honest answer is: yes, it would be easier. However, having n at the type level allows the compiler to perform more static checks. For instance, you might want a type to represent "integers modulo n", and allow adding those. Having
data Mod = Mod Int -- Int modulo some n
foo :: Int -> Mod -> Mod -> Mod
foo n (Mod x) (Mod y) = Mod ((x+y) `mod` n)
works, but there is no check that x and y are of the same modulus. We might add apples and oranges, if we are not careful. We could instead write
data Mod n = Mod Int -- Int modulo n
foo :: Int -> Mod n -> Mod n -> Mod n
foo n (Mod x) (Mod y) = Mod ((x+y) `mod` n)
which is better, but still allows calling foo 5 x y even when n is not 5. Not good. Instead,
data Mod n = Mod Int -- Int modulo n
-- a lot of type machinery omitted here
foo :: forall n . SomeConstraint n => Mod n -> Mod n -> Mod n
foo (Mod x) (Mod y) = Mod ((x+y) `mod` (value @n))
prevents things from going wrong. The compiler statically checks everything. The code is harder to use, yes, but in a sense making it harder to use is the whole point: we want to make it impossible for the user to try adding something of the wrong modulus.
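To make that last sketch concrete, here is a runnable version using KnownNat in place of the unspecified SomeConstraint (a minimal sketch, not the library's actual code):
{-# LANGUAGE DataKinds, KindSignatures, ScopedTypeVariables, TypeApplications #-}
import GHC.TypeLits (Nat, KnownNat, natVal)
import Data.Proxy (Proxy (..))

newtype Mod (n :: Nat) = Mod Int deriving Show

addMod :: forall n. KnownNat n => Mod n -> Mod n -> Mod n
addMod (Mod x) (Mod y) = Mod ((x + y) `mod` fromIntegral (natVal (Proxy @n)))

-- addMod (Mod 3 :: Mod 5) (Mod 4)            ==>  Mod 2
-- addMod (Mod 3 :: Mod 5) (Mod 4 :: Mod 7)   -- rejected: 5 and 7 don't match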
Concluding: these are very advanced extensions. If you're a beginner, you will need to slowly progress towards these techniques. Don't be discouraged if you can't grasp them after only a short study; it does take some time. Make a small step at a time, solve some exercises for each feature to understand the point of it. And you'll always have StackOverflow when you are stuck :-)

Access GADT constraint from evaluation level

I am trying to make use of a GADT parameter at runtime, assuming that I have used the DataKinds extension to allow promoting data to types, i.e. having
data Num = Zero | Succ Num
data Something (len :: Num) where
  Some :: Something len
I would like to have a function
toNum :: Something len -> Num
that, for any Some :: Something n, will return n:
toNum (s :: Something n) = n
Which is invalid in Haskell. Is it possible to do so?
In Haskell this is impossible, since types are erased at runtime. That is, when the program runs, there is no information in memory about the value of the index len in the type.
To overcome this issue, we need to force Haskell to keep in memory that value, at runtime. This is usually done using a singleton auxiliary type:
data Num = Zero | Succ Num
data SNum (n :: Num) where
  SZero :: SNum 'Zero
  SSucc :: SNum n -> SNum ('Succ n)
data Something (len :: Num) where
  Some :: SNum len -> Something len
Using this you can easily write
sToNum :: SNum n -> Num
sToNum SZero = Zero
sToNum (SSucc n) = Succ (sToNum n)
and then
toNum :: Something len -> Num
toNum (Some n) = sToNum n
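A small usage example (with DataKinds enabled for the promoted constructors):
example :: Something ('Succ 'Zero)
example = Some (SSucc SZero)

-- toNum example == Succ Zero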
If you look up "haskell singletons" you should find several examples. There's even a singletons library to partially automate this.
If/when "dependent Haskell" is released, we will have less cumbersome tools at our disposal. Currently, singletons work, but they are sometimes a hassle. Still, for the moment, we have to use them.

Easy function gives compile error on conversion from Int to Double

Why does this easy function which computes the distance between 2 integer points in the plane not compile?
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) = sqrt ((x - u) ^ 2 + (y - v) ^ 2)
I get the error Couldn't match expected type ‘Double’ with actual type ‘Int’.
It is frustrating that such an easy mathematical function consumes so much of my time. Any explanation of why this goes wrong, and the most elegant way to fix it, is appreciated.
This is my solution to overcome the problem
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) =
  let xd = fromIntegral x :: Double
      yd = fromIntegral y :: Double
      ud = fromIntegral u :: Double
      vd = fromIntegral v :: Double
  in sqrt ((xd - ud) ^ 2 + (yd - vd) ^ 2)
but there must be a more elegant way.
Most languages only do type inference (if any) “in the direction of data flow”. E.g., if you start with a value 2 in Java or Python, that'll be an int. You calculate something like 2 + 4, and the + operator infers from the integer arguments that the result is also int. In dynamic languages this is the only way that's possible at all (because the types are only an “associated property” of values). In static languages like C++, the inference step is only done once at compile time, but it's still done largely “as if the types were associated properties of values”.
Not so in Haskell. Like other Hindley-Milner languages, it has a type system that works completely independent of any runtime data flow directions. It can still do forward-inference ((2::Int) + (4::Int) is unambiguously of type Int), but it's only a special case – types can just as well be inferred in the “reverse direction”, i.e. if you write (x + y) :: Int the compiler is able to infer that both x and y must have type Int as well.
This reverse-polymorphism enables many nice tricks – example:
Prelude Debug.SimpleReflect> 2 + 4 :: Expr
2 + 4
Prelude Debug.SimpleReflect> 7^3 :: Expr
7 * 7 * 7
...but it only works if the language never does implicit conversions, not even in “safe†, obvious cases” like Int -> Integer.
Usually, the type checker automatically infers the most sensible type. For your original implementation, the checker would infer the type
distance :: Floating a => (a, a) -> (a, a) -> a
and that – or perhaps the specialised version
distance :: (Double,Double) -> (Double,Double) -> Double
is a much more sensible type than your (Int, Int) -> ... attempt, because the Euclidean distance actually makes no sense on a discrete grid (you'd want something like a taxicab distance there).
What you'd actually want is distance from the vector-space package. It is more general and works not only on 2-tuples but on any suitable space.
†Int -> Double is actually not a safe conversion – try float(1000000000000000001) in Python! So even without Hindley-Milner, this is not really a very smart thing to do implicitly.
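The analogous experiment in GHCi (illustrative; a Double has only 53 bits of mantissa, so the trailing 1 is silently lost):
Prelude> fromIntegral (1000000000000000001 :: Integer) :: Double
1.0e18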
SOLVED: now I have this
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) = sqrt (fromIntegral ((x - u) ^ 2 + (y - v) ^ 2))
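Another equivalent spelling, for comparison: because the result type is Double, inference flows backwards into the local bindings, so no per-variable annotations are needed.
distance :: (Int, Int) -> (Int, Int) -> Double
distance (x, y) (u, v) = sqrt (dx*dx + dy*dy)
  where
    dx = fromIntegral (x - u)
    dy = fromIntegral (y - v)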

Haskell type family instance with type constraints

I am trying to represent expressions with type families, but I cannot seem to figure out how to write the constraints that I want, and I'm starting to feel like it's just not possible. Here is my code:
class Evaluable c where
  type Return c :: *
  evaluate :: c -> Return c
data Negate n = Negate n
instance (Evaluable n, Return n ~ Int) => Evaluable (Negate n) where
  type Return (Negate n) = Return n
  evaluate (Negate n) = negate (evaluate n)
This all compiles fine, but it doesn't express exactly what I want. In the constraints of the Negate instance of Evaluable, I say that the return type of the expression inside Negate must be an Int (with Return n ~ Int) so that I can call negate on it, but that is too restrictive. The return type actually only needs to be an instance of the Num type class which has the negate function. That way Doubles, Integers, or any other instance of Num could also be negated and not just Ints. But I can't just write
Return n ~ Num
instead because Num is a type class and Return n is a type. I also cannot put
Num (Return n)
instead because Return n is a type not a type variable.
Is what I'm trying to do even possible with Haskell? If not, should it be, or am I misunderstanding some theory behind it? I feel like Java could add a constraint like this. Let me know if this question could be clearer.
Edit: Thanks guys, the responses are helping and are getting at what I suspected. It appears that the type checker isn't able to handle what I'd like to do without UndecidableInstances, so my question is, is what I'd like to express really undecidable? It is to the Haskell compiler, but is it in general? i.e. could a constraint even exist that means "check that Return n is an instance of Num" which is decidable to a more advanced type checker?
Actually, you can do exactly what you mentioned:
{-# LANGUAGE TypeFamilies, FlexibleContexts, UndecidableInstances #-}
class Evaluable c where
  type Return c :: *
  evaluate :: c -> Return c
data Negate n = Negate n
instance (Evaluable n, Num (Return n)) => Evaluable (Negate n) where
  type Return (Negate n) = Return n
  evaluate (Negate n) = negate (evaluate n)
Return n certainly is a type, which can be an instance of a class just like Int can. Your confusion might be about what can be the argument of a constraint. The answer is "anything with the correct kind". The kind of Int is *, as is the kind of Return n. Num has kind * -> Constraint, so anything of kind * can be its argument. It is perfectly legal (though vacuous) to write Num Int as a constraint, in the same way that Num (a :: *) is legal.
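For instance, adding a couple of base instances shows the Num (Return n) constraint at work (illustrative instances, not part of the question's code):
instance Evaluable Int where
  type Return Int = Int
  evaluate = id

instance Evaluable Double where
  type Return Double = Double
  evaluate = id

-- evaluate (Negate (Negate (3.5 :: Double))) == 3.5
-- evaluate (Negate True)   -- rejected: Bool has no Evaluable instance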
To complement Eric's answer, let me suggest one possible alternative: using a functional dependency instead of a type family:
class EvaluableFD r c | c -> r where
  evaluate :: c -> r
data Negate n = Negate n
instance (EvaluableFD r n, Num r) => EvaluableFD r (Negate n) where
  evaluate (Negate n) = negate (evaluate n)
This makes it a bit easier to talk about the result type, I think. For instance, you can write
foo :: EvaluableFD Int a => Negate a -> Int
foo x = evaluate x + 12
You can also use ConstraintKinds to apply this partially (which is why I put the arguments in that funny-looking order):
type GivesInt = EvaluableFD Int
You could do this with your class as well, but it would be more annoying:
type GivesInt x = (Evaluable x, Return x ~ Int)
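Either way, the synonym reads nicely in signatures (needs ConstraintKinds); for example, a hypothetical helper:
sumOfTwo :: (GivesInt a, GivesInt b) => a -> b -> Int
sumOfTwo x y = evaluate x + evaluate y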

Pattern matching on length using this GADT:

I've defined the following GADT:
data Vector v where
  Zero    :: Num a => Vector a
  Scalar  :: Num a => a -> Vector a
  Vector  :: Num a => [a] -> Vector [a]
  TVector :: Num a => [a] -> Vector [a]
If it's not obvious, I'm trying to implement a simple vector space. All vector spaces need vector addition, so I want to implement this by making Vector an instance of Num. In a vector space, it doesn't make sense to add vectors of different lengths, and this is something I would like to enforce. One way I thought to do it would be using guards:
instance Num (Vector v) where
  (Vector a) + (Vector b)
    | length a == length b = Vector $ zipWith (+) a b
    | otherwise            = error "Only add vectors with the same length."
There is nothing really wrong with this approach, but I feel like there has to be a way to do this with pattern matching. Perhaps one way to do it would be to define a new data type VectorLength, which would look something like this:
data Length l where
  AnyLength :: Nat a => Length a
  FixedLength :: Nat a -> Length a
Then, a length component could be added to the Vector data type, something like this:
data Vector (Length l) v where
  Zero :: Num a => Vector AnyLength a
  -- ...
  Vector :: Num a => [a] -> Vector (length [a]) [a]
I know this isn't correct syntax, but this is the general idea I'm playing with. Finally, you could define addition to be
instance Num (Vector v) where
  (Vector l a) + (Vector l b) = Vector $ zipWith (+) a b
Is such a thing possible, or is there any other way to use pattern matching for this purpose?
What you're looking for is something (in this instance confusingly) named a Vector as well. Generally, these are used in dependently typed languages where you'd write something like
data Vec (n :: Natural) a where
  Nil :: Vec 0 a
  Cons :: a -> Vec n a -> Vec (n + 1) a
But that's far from valid Haskell (or really any language). Some very recent extensions to GHC are beginning to enable this kind of expression but they're not there yet.
You might be interested in fixed-vector which does a best approximation of a fixed Vector available in relatively stable GHC. It uses a number of tricks between type families and continuations to create classes of fixed-size vectors.
Just to add to the example in the other answer - this nearly works already in GHC 7.6:
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}
import GHC.TypeLits
data Vector (n :: Nat) a where
  Nil :: Vector 0 a
  Cons :: a -> Vector n a -> Vector (n + 1) a
That code compiles fine, it just doesn't work quite the way you'd hope. Let's check it out in ghci:
*Main> :t Nil
Nil :: Vector 0 a
Good so far...
*Main> :t Cons "foo" Nil
Cons "foo" Nil :: Vector (0 + 1) [Char]
Well, that's a little odd... Why does it say (0 + 1) instead of 1?
*Main> :t Cons "foo" Nil :: Vector 1 String
<interactive>:1:1:
Couldn't match type `0 + 1' with `1'
Expected type: Vector 1 String
Actual type: Vector (0 + 1) String
In the return type of a call of `Cons'
In the expression: Cons "foo" Nil :: Vector 1 String
Uh. Oops. That'd be why it says (0 + 1) instead of 1. It doesn't know that those are the same. This will be fixed (at least this case will) in GHC 7.8, which is due out... In a couple months, I think?
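For completeness, with a unary (Peano-style) length index the injectivity of the successor constructor is known to GHC, so the addition the question asks for can already be written with no runtime length check. A sketch, independent of the TypeLits version above:
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat          -- unary naturals, promoted to the kind level

data Vec (n :: Nat) a where
  Nil  :: Vec 'Z a
  Cons :: a -> Vec n a -> Vec ('S n) a

-- Only vectors of equal length can be added; the mismatched cases
-- simply cannot be written, so no runtime error branch is needed.
vadd :: Num a => Vec n a -> Vec n a -> Vec n a
vadd Nil         Nil         = Nil
vadd (Cons x xs) (Cons y ys) = Cons (x + y) (vadd xs ys)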
