Type checking with RankNTypes in Haskell

I'm trying to understand RankNTypes in Haskell and found this example:
check :: Eq b => (forall a. [a] -> b) -> [c] -> [d] -> Bool
check f l1 l2 = f l1 == f l2
(If my understanding is correct, this is equivalent to check :: forall b c d. Eq b => (forall a. [a] -> b) -> [c] -> [d] -> Bool.)
Ok, so far so good. Now, if the explicit forall a is removed, GHC produces the following errors:
Could not deduce (c ~ a)
from the context (Eq b)
[…]
Could not deduce (d ~ a)
from the context (Eq b)
[…]
When removing the nested forall, the type signature becomes
check :: forall a b c d. Eq b => ([a] -> b) -> [c] -> [d] -> Bool
It is easy to see why this fails type checking since l1 and l2 should have type [a] for us to pass them to f, but why isn't this the case when specifying f's type as (forall a. [a] ->b)? Is the fact that a is only bound inside the parens the full answer? I.e. the type checker will accept
[c] -> b ~ (forall a. [a] -> b)
[d] -> b ~ (forall a. [a] -> b)
(edit: Fixed. Thanks, Boyd!)
since a function of type (forall a. a -> b) can take any list?

When f = \xs -> ... is written with the explicit Rank2 quantification forall a. [a] -> b you can view this as a new function
f = Λa -> \xs -> ...
where Λ is a special lambda that takes a type argument to determine which specific type a it will use in the body of the function. This type argument is applied each time the function is called, just like how normal lambda bindings are applied on each call. This is how GHC handles forall internally.
In the explicitly forall'd version, f can be applied to different type arguments each time it is called so a can resolve to a different type each time, once for c and once for d.
In the version without the inner forall, this type application for a happens only once, when check is called. So every time f is called it must use the same a. Of course this fails since f is called on lists of different types.
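To make the two separate type applications visible, here is a minimal sketch of the rank-2 check in which f is instantiated explicitly at c and at d. The TypeApplications and ScopedTypeVariables extensions and the helper name sameLength are my additions for illustration, not part of the question.

```haskell
{-# LANGUAGE RankNTypes, ScopedTypeVariables, TypeApplications #-}

-- Rank-2 version from the question. The explicit applications f @c and
-- f @d play the role of applying the type-level lambda at each call site.
check :: forall b c d. Eq b => (forall a. [a] -> b) -> [c] -> [d] -> Bool
check f l1 l2 = f @c l1 == f @d l2

-- Hypothetical usage: length works for lists of any element type,
-- so it can be used at [Int] and at String within the same call.
sameLength :: Bool
sameLength = check length [1 :: Int, 2, 3] "abc"
```

Loaded in GHCi, sameLength should evaluate to True, since both lists have three elements. Dropping the inner forall from the signature makes the two applications impossible, because a would already be fixed by the caller.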

It is easy to see why this fails type checking since l1 and l2 should have type [a] for us to pass them to f, but why isn't this the case when specifying f's type as (forall a. [a] ->b)?
Because the type (forall a. [a] -> B) can be unified with [C] -> B and (separately) [D] -> B. However, the type [A] -> B cannot be unified with either [C] -> B or [D] -> B.
Is the fact that a is only bound inside the parens the full answer?
Basically. You have to choose a particular type for each type variable when you are "inside" a forall scope, but outside you can use the forall multiple times and choose a different particular type each time you do.
I.e. the type checker will accept
[c] ~ (forall a. a -> b)
[d] ~ (forall a. a -> b)
since a function of type (forall a. a -> b) can take any list?
Careful. You seem to have lost some "[]" characters there. Also, you are not quite getting the unification correct. The type checker will accept both:
[C] -> B ~ (forall a. [a] -> B)
[D] -> B ~ (forall a. [a] -> B)
It will not accept either:
[C] -> B ~ [A] -> B
[D] -> B ~ [A] -> B

You can rewrite universal quantification in a contravariant field with existential quantification in covariant fields (not legally in Haskell, but in principle).
check' :: exists c' d'. forall b c d. Eq b
=> ([c'] -> b) -> ([d'] -> b) -> [c] -> [d] -> Bool
It's obvious enough that this works: for c ~ C, d ~ D choose c' ~ C and d' ~ D as well, then the function is simply
check'' :: forall b . Eq b => ([C] -> b) -> ([D] -> b) -> [C] -> [D] -> Bool
Not sure if this answers your question, but it is one way to look at rank-2 types.
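In legal Haskell, the closest rank-1 counterpart of check'' is to pass the function twice, once per list type. The following is only a sketch of that idea; the names checkTwice and sameLengthTwice are hypothetical.

```haskell
-- Rank-1 workaround: the caller supplies one function per list type,
-- so each argument is monomorphic from the callee's point of view.
checkTwice :: Eq b => ([c] -> b) -> ([d] -> b) -> [c] -> [d] -> Bool
checkTwice f g l1 l2 = f l1 == g l2

-- Hypothetical usage: the same polymorphic function is passed twice and
-- is instantiated separately at [Int] and at String by the caller.
sameLengthTwice :: Bool
sameLengthTwice = checkTwice length length [1 :: Int, 2] "ab"
```

The cost of this design is that nothing forces the two function arguments to be the same function, which the rank-2 signature guarantees.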

Related

What is this GHC feature called? `forall` in type definitions

I learned that you can redefine ContT from transformers such that the r type parameter is made implicit (and may be specified explicitly using TypeApplications), viz.:
-- | Same as `ContT` but with the `r` made implicit
type ContT ::
forall (r :: Type).
(Type -> Type) ->
Type ->
Type
data ContT m a where
ContT ::
forall r m a.
{runContT :: (a -> m r) -> m r} ->
ContT @r m a
type ContVoid :: (Type -> Type) -> Type -> Type
type ContVoid = ContT @()
I hadn't realized this was possible in GHC. What is the larger feature called to refer to this way of defining a family of types with implicit type parameters, that is specified using forall in type definition (referring, in the example above, to the outer forall - rather than the inner forall which simply unifies the r)?
Nobody uses this (invisible dependent quantification) for this purpose (where the dependency is not used) but it is the same as giving a Type -> .. parameter, implicitly.
type EITHER :: forall (a :: Type) (b :: Type). Type
data EITHER where
LEFT :: a -> EITHER @a @b
RIGHT :: b -> EITHER @a @b
eITHER :: (a -> res) -> (b -> res) -> (EITHER @a @b -> res)
eITHER left right = \case
LEFT a -> left a
RIGHT b -> right b
You can also use "visible dependent quantification" where forall-> is the visible counterpart to forall., so forall (a :: Type) -> .. is properly like Type -> .. where a does not appear in ..:
type EITHER :: forall (a :: Type) -> forall (b :: Type) -> Type
data EITHER a b where
LEFT :: a -> EITHER a b
RIGHT :: b -> EITHER a b
eITHER :: (a -> res) -> (b -> res) -> (EITHER a b -> res)
eITHER left right = \case
LEFT a -> left a
RIGHT b -> right b

How to derive the type of an applicator applied to the identity function

I want to derive the type of the following contrived applicator applied to the identity function. To achieve this I probably have to unify the type portion of the first argument (a -> [b]) with the type of id:
ap :: (a -> [b]) -> a -> [b]
id :: a -> a
a -> [b]
a0 -> a0 -- fresh type vars
a ~ a0 -- mappings
[b] ~ a0
a0 -> a0 -- substitution
This is obviously wrong, since the expected type is [b] -> [b]. There is an ambiguity within the unification, because a0 cannot be equivalent to both a and [b], except when a ~ [b]. But what is the rule that tells me to substitute a with [b] and not the other way around, as I would have to do with ap :: ([a] -> b) -> [a] -> b, for example?
I know this is a very specific question, sorry. Hopefully it is not too confusing!
Ok, new answer because I now understand the question being asked! To restate the question:
Given
ap :: (a -> [b]) -> a -> [b]
id :: a -> a
Explain how the type of the expression ap id is derived.
Answer:
Rename variables:
ap :: (a -> [b]) -> a -> [b]
id :: a0 -> a0
Unify:
(a -> [b]) ~ (a0 -> a0)
Apply generativity a couple times, pulling the arguments from the (->) type constructor:
a ~ a0
[b] ~ a0
Apply symmetry/transitivity of type equality:
[b] ~ a
Substitute the most specific types known into the types of ap and id
ap :: ([b] -> [b]) -> [b] -> [b]
id :: [b] -> [b]
[b] is the most specific type known, because it provides some restriction. The value must be a list of something. The other two equivalent type expressions just mean any type at all. You can think of unification as a constraint-solving process. You find the maximal type that satisfies the constraints provided, which amount to "it's a list of something" for this case.
Now that the types are unified, the type of the function application is the type of the function's result:
ap id :: [b] -> [b]
I can see why the choice of [b] looks a little odd in this case, because only one of the three type expressions contributed factors to unify. There are more involved cases where constraints come from multiple places, though.
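The derivation can be checked by the compiler. In this sketch the body of ap is an assumption (the question only gives its type), and apId is a hypothetical name for the expression ap id.

```haskell
-- Assumed implementation: plain function application at the given type.
ap :: (a -> [b]) -> a -> [b]
ap f x = f x

-- GHC accepts this signature, which agrees with the derivation:
-- a and a0 are both replaced by [b].
apId :: [b] -> [b]
apId = ap id
```

Asking GHCi for :t ap id without the annotation should report the same type, up to the choice of variable name.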
Let's consider a more advanced case. This might involve some things you haven't seen before. If it does, I apologize for jumping straight to the deep end.
Given:
f1 :: (a -> b) -> f a -> f b
f2 :: p c d -> (e, c) -> (e, d)
Unify the types of f1 and f2.
Let's be really careful with this one. First up, rewrite all the types in terms of prefix application. Even the (->) types. This is going to be ugly:
f1 :: (->) ((->) a b) ((->) (f a) (f b))
f2 :: (->) (p c d) ((->) ((,) e c) ((,) e d))
Unify and apply generativity twice to top-level (->) type constructors:
((->) a b) ~ (p c d)
((->) (f a) (f b)) ~ ((->) ((,) e c) ((,) e d))
And, just keep unifying and applying generativity:
(->) ~ p
a ~ c
b ~ d
f a ~ (,) e c
f b ~ (,) e d
f ~ (,) e
Ok, we've built up a giant stack of constraints now. Choosing between a and c or b and d doesn't matter, as they're equivalently constrained. Let's choose letters closer to the beginning of the alphabet when it doesn't matter. (->) is more constrained than p, so it wins there, and (,) e is more constrained than f. Call it a winner too.
Then switch back to infix type constructors to make it pretty, and the unified type is:
(a -> b) -> (e, a) -> (e, b)
Notice how each of the two starting types contributed a constraint to the final unified type. f1 requires the p type in f2 to be more specific, and f2 required the f type in f1 to be more specific.
Overall, this is a super-mechanical process. It's also fiddly and requires precise tracking of what you know. There's a reason we mostly leave it to the compiler to handle this. It is absolutely useful in the cases when something goes wrong and you want to double-check the process yourself to see why the compiler is reporting an error, though.
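If you want the compiler to confirm the result, one option is to put two placeholders with the types of f1 and f2 into a list and annotate the list with the unified type. This is only a sketch: the undefined bodies are stand-ins, since only the types matter, and the name unifiedList is hypothetical.

```haskell
-- Placeholders carrying the two types from the exercise.
-- They are never evaluated; only their types are of interest.
f1 :: (a -> b) -> f a -> f b
f1 = undefined

f2 :: p c d -> (e, c) -> (e, d)
f2 = undefined

-- Both elements must have the same type, so GHC has to unify the two
-- signatures. The annotation is accepted with f ~ (,) e and p ~ (->).
unifiedList :: [(a -> b) -> (e, a) -> (e, b)]
unifiedList = [f1, f2]
```

Changing the annotation to something that is not an instance of the unified type, such as a list of (a -> b) -> [a] -> [b], should be rejected, because f2 forces the container to be a pair.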

Unifying c -> a -> b and (a -> b) -> c

What is the type inferred by a Haskell type synthesizer when unifying
the types c -> a -> b and (a -> b) -> c?
Can someone explain how I can solve it?
Thanks!
This seems to be some kind of exercise/homework so I will not spoil everything but give you some hints first:
the type c -> a -> b is actually c -> (a -> b)
so you have to unify c -> (a -> b) with (a -> b) -> c, that is:
c with a -> b (first part)
a -> b with c (second part)
Now what could the unified type be? (Try to get rid of c ;) )
PS: I am assuming you want those types a, b, .. to be the same
In other answers, we have seen how to perform the unification by hand, and how to ask ghci some limited unification questions when we do not need to connect type variables in the two types we want to unify. In this answer, I show how to use existing tooling to answer the question you asked as I understand you to intend it.
The trick is to use type-equality constraints to ask GHC to unify two types, then expose the results as a tuple type. The type equality constraint kicks off the unifier; when unification is done, the type variables in our tuple type will be simplified according to what was learned during unification.
Thus, your question looks like this, for example:
> :set -XTypeFamilies
> :{
| :t undefined -- a dummy implementation we don't actually care about
| :: ((a -> b) -> c) ~ (c -> a -> b) -- the unification problem
| => (a, b, c) -- how we ask our query (what are the values of a, b, and c after unification?)
| :}
<snip -- a copy of the above text>
:: (a, b, a -> b)
From this, we learn that for any types a and b, we can choose a ~ a, b ~ b, and c ~ a -> b as a solution to the unification problem. Here is another query you might wonder about: after unification, what is the simplified type of (a -> b) -> c? You could run the previous query, and substitute in a, b, and c by hand, or you could ask ghci:
> :t undefined :: ((a -> b) -> c) ~ (c -> a -> b) => (a -> b) -> c
undefined :: ((a -> b) -> c) ~ (c -> a -> b) => (a -> b) -> c
:: (a -> b) -> a -> b
The only thing I changed in this command is the "query" part. The result tells us that (a -> b) -> c becomes (a -> b) -> a -> b after unification. Note well that the a and b in the result type are not guaranteed to be exactly the same as the a and b in the query -- though probably in GHC that will always be the case.
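The same constraint can be used in a source file, where it acts as an assumption the definition is allowed to rely on. This is a sketch rather than a standard idiom; the names unified and applied are hypothetical, and TypeOperators is enabled only to keep newer GHC versions from warning about the use of ~.

```haskell
{-# LANGUAGE TypeFamilies, TypeOperators #-}

-- Under the equality constraint GHC learns c ~ (a -> b), so returning
-- the argument unchanged type checks even though the result type is c.
unified :: ((a -> b) -> c) ~ (c -> a -> b) => (a -> b) -> c
unified g = g

-- At a call site the constraint has to hold, which fixes c to a -> b.
applied :: Int
applied = unified (+ 1) 2
```

Without the constraint, the definition unified g = g is rejected, because nothing relates c to a -> b.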
Another quick trick worth mentioning is that you can use Proxy to turn an arbitrarily-kinded type variable into a * type for use in a tuple; thus, for example:
> :t undefined :: f a ~ (b -> c) => (a, b, c, f)
<interactive>:1:42:
Expecting one more argument to ‘f’
The fourth argument of a tuple should have kind ‘*’,
but ‘f’ has kind ‘* -> *’
In an expression type signature: f a ~ (b -> c) => (a, b, c, f)
In the expression: undefined :: f a ~ (b -> c) => (a, b, c, f)
> :m + Data.Proxy
> :t undefined :: f a ~ (b -> c) => (a, b, c, Proxy f)
undefined :: f a ~ (b -> c) => (a, b, c, Proxy f)
:: (c, b, c, Proxy ((->) b))
You can ask ghci
:t [undefined :: c -> a -> b, undefined :: (a -> b) -> c]
It will need to unify the types to figure out what type the elements of the list are. We can unify any number of types this way; even 0, try it!
The type variables on the left in c -> a -> b are distinct from the type variables on the right in (a -> b) -> c. GHC will rename type variables to keep them distinct, but it will try to preserve the original names. It does this by adding numbers to the end of the type variable names. The answer to this query includes some of the type variables a, a1, b, b1, c, and c1. If you don't want the type variables to be distinct, you can read off the answer ignoring the added numbers.
If you do want the type variables to be distinct, it can be a bit tricky to tell what ghc is doing because you don't know which type variables were renamed to what. In practical coding, this can be a problem when trying to understand type errors. In both cases there is a simple solution: rename the type variables with distinctive names yourself so that ghc doesn't need to rename them.
:t [undefined :: c1 -> a1 -> b1, undefined :: (a2 -> b2) -> c2]
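For reference, here is what I would expect that query to report, written as a definition that GHC can check. The name elems is hypothetical, and the variable names in the signature are chosen by hand to show which side each one came from.

```haskell
-- The annotations inside the list quantify their variables separately,
-- so the two element types share no variables before unification.
-- Unifying them gives c1 ~ (a2 -> b2) and c2 ~ (a1 -> b1).
elems :: [(a2 -> b2) -> a1 -> b1]
elems = [undefined :: c1 -> a1 -> b1, undefined :: (a2 -> b2) -> c2]
```

Note that a1 and a2 stay unrelated here, which is why this trick answers a different question from the one with shared variables.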
We're done with what vanilla Haskell can do, but you can get the compiler to answer questions more generally by using type equality constraints as described in Daniel Wagner's answer. The next section just describes why forall scoped types are not the general solution.
forall
Before reading this section you should think about whether it is possible to unify, for all c, c -> a -> b and (a -> b) -> c.
To the experienced haskeller, it might seem like you could keep the type variables from being distinct by introducing them in an explicit forall scope with the ScopedTypeVariables extension. I don't know an easy way to do this in ghci, but the following snippet with a hole† asks the compiler to unify a -> b and a -> b.
{-# LANGUAGE ScopedTypeVariables #-}
example1 :: forall a b. ()
example1 = (undefined :: _) [undefined :: a -> b, undefined :: a -> b]
The output seems to tell us that the list is a list of a -> b.
Found hole `_' with type: [a -> b] -> ()
If we try to use this for the example problem, it doesn't work.
example2 :: forall a b c. ()
example2 = (undefined :: _) [undefined :: c -> a -> b, undefined :: (a -> b) -> c]
The compiler politely tells us why†
Couldn't match type `c' with `a -> b'
It is not true that for all types c, c is a function. Some example types that aren't functions include Int, Bool, and IO a.
† I use (undefined :: _) instead of _ when asking what the type that goes in a hole is. If you just use _ ghc doesn't type check all of the expression. The compiler may lead you to believe a hole is possible to fill when it is in fact impossible. In the output for example2 there is also the following, extremely misleading line
Found hole `_' with type: [c -> a -> b] -> ()

What does forall on the right of a function arrow mean?

The topic of Section 7.12.5 of the GHC Users Guide is higher rank polymorphism. There are some example valid types, among others:
f4 :: Int -> (forall a.a->a)
Now I wonder what this type means. I think it is the same as:
f4' :: forall a. Int -> a -> a
If this is so, can we generally mentally move forall like the above (that appears right of the rightmost arrow) to the left assuming that no type variable with the same name occurs in the rest of the type (but this could be dealt with renaming, simply)?
For example, the following would still be correct, wouldn't it:
f5 :: Int -> (forall a. (forall b. b -> a) -> a)
f5' :: forall a. Int -> (forall b. b -> a) -> a
Would be thankful for an insightful answer.
Background: In this talk about lenses by SPJ, we have:
type Lens' s a = forall f. Functor f => (a -> f a) -> s -> f s
and then, when you compose them, you have such a lens type in the result.
Therefore I just wanted to know whether my intuition is correct, that the forall in the result doesn't really matter - it just appears "accidentally" because of the type synonym. Otherwise, there must be some difference between the types for f4, f4' and f5, f5' above I would want to learn about.
Here is a ghci session:
Prelude> let f5 :: Int -> (forall a. (forall b. b -> a) -> a); f5 i f = f i
Prelude> :t f5
f5 :: Int -> (forall b. b -> a) -> a
Prelude>
Looks like GHC agrees with me, at least in this case .....
In your example, the forall in f4 can be moved to the outermost level because Int -> (forall a. a -> a) and forall a. Int -> (a -> a) are essentially the same.
But we cannot apply your analogy to, for example, this Rank-2 type: (forall a. a -> a) -> (forall b. b -> b). This type is essentially the same as forall b. (forall a. a -> a) -> (b -> b). But moving the first forall out as well, giving forall a b. (a -> a) -> (b -> b), changes the semantics of the type.
f :: (forall a. a -> a) -> (forall b. b -> b) -- Rank 2
g :: forall a b. (a -> a) -> (b -> b) -- Rank 1
To see the difference, you can instantiate a in g to be Int and thus can pass a function of type Int -> Int to g. On the other hand the argument to f has to be universally quantified (specified by that forall before the function type) and should work for all types (Example of such a function is id).
Here is a good explanation of higher rank types.
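A small sketch of both points, with function bodies that are my own choice for illustration (the question and answer only give types). All definitions are written with their arguments spelled out, since some GHC versions do not convert between the two forms of f4 without eta-expansion.

```haskell
{-# LANGUAGE RankNTypes #-}

-- forall to the right of the arrow.
f4 :: Int -> (forall a. a -> a)
f4 _ x = x

-- The same function with the forall moved to the front.
f4' :: forall a. Int -> a -> a
f4' n x = f4 n x

-- Rank 2: the argument itself must be polymorphic.
f :: (forall a. a -> a) -> (forall b. b -> b)
f h = h

-- Rank 1: the caller picks a, so a monomorphic argument is fine.
g :: forall a b. (a -> a) -> (b -> b)
g _ y = y
```

With these definitions, g (+ (1 :: Int)) 'x' is accepted while f (+ (1 :: Int)) 'x' is rejected, because (+ 1) does not work at every type. f id 'x' is fine.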

Why can't the type of id be specialised to (forall a. a -> a) -> (forall b. b -> b)?

Take the humble identity function in Haskell,
id :: forall a. a -> a
Given that Haskell supposedly supports impredicative polymorphism, it seems reasonable that I should be able to "restrict" id to the type (forall a. a -> a) -> (forall b. b -> b) via type ascription. But this doesn't work:
Prelude> id :: (forall a. a -> a) -> (forall b. b -> b)
<interactive>:1:1:
Couldn't match expected type `b -> b'
with actual type `forall a. a -> a'
Expected type: (forall a. a -> a) -> b -> b
Actual type: (forall a. a -> a) -> forall a. a -> a
In the expression: id :: (forall a. a -> a) -> (forall b. b -> b)
In an equation for `it':
it = id :: (forall a. a -> a) -> (forall b. b -> b)
It's of course possible to define a new, restricted form of the identity function with the desired signature:
restrictedId :: (forall a. a -> a) -> (forall b. b -> b)
restrictedId x = x
However defining it in terms of the general id doesn't work:
restrictedId :: (forall a. a -> a) -> (forall b. b -> b)
restrictedId = id -- Similar error to above
So what's going on here? It seems like it might be related to difficulties with impredicativity, but enabling -XImpredicativeTypes makes no difference.
why is it expecting a type of (forall a. a -> a) -> b -> b
I think the type forall b.(forall a. a -> a) -> b -> b is equivalent to the type you gave. It is just a canonical representation of it, where the forall is shifted as much to the left as possible.
And the reason why it does not work is that the given type is actually more polymorphic than the type of id :: forall c. c -> c, which requires that argument and return types be equal. But the forall a in your type effectively forbids a to be unified with any other type.
You are absolutely correct that forall b. (forall a. a -> a) -> b -> b is not equivalent to (forall a. a -> a) -> (forall b. b -> b).
Unless annotated otherwise, type variables are quantified at the outermost level. So (a -> a) -> b -> b is shorthand for (forall a. (forall b. (a -> a) -> b -> b)). In System F, where type abstraction and application are made explicit, this describes a term like f = Λa. Λb. λx:(a -> a). λy:b. x y. Just to be clear for anyone not familiar with the notation, Λ is a lambda that takes a type as a parameter, unlike λ which takes a term as a parameter.
The caller of f first provides a type parameter a, then supplies a type parameter b, then supplies two values x and y that adhere to the chosen types. The important thing to note is the caller chooses a and b. So the caller can perform an application like f String Int length for example to produce a term String -> Int.
Using -XRankNTypes you can annotate a term by explicitly placing the universal quantifier; it doesn't have to be at the outermost level. Your restrictedId term with the type (forall a. a -> a) -> (forall b. b -> b) could be roughly exemplified in System F as g = λx:(forall a. a -> a). if (x Int 0, x Char 'd') > (0, 'e') then x else id. Notice how g, the callee, can apply x to both 0 and 'd' by instantiating it with a type first.
But in this case the caller cannot choose the type parameter like it did before with f. You'll note the applications x Int and x Char inside the lambda. This forces the caller to provide a polymorphic function, so a term like g length is not valid because length does not apply to Int or Char.
Another way to think about it is drawing the types of f and g as a tree. The tree for f has a universal quantifier as the root while the tree for g has an arrow as the root. To get to the arrow in f, the caller instantiates the two quantifiers. With g, it's already an arrow type and the caller cannot control the instantiation. This forces the caller to provide a polymorphic argument.
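In Haskell itself the type applications are inferred, so the idea that the callee picks the instantiation can be sketched in a few lines. The name applyBoth is hypothetical, and this is a simplified stand-in for the System F term above, not a translation of it.

```haskell
{-# LANGUAGE RankNTypes #-}

-- The callee instantiates x twice: once at Int and once at Char.
-- The caller therefore has to supply a function that works at every type.
applyBoth :: (forall a. a -> a) -> (Int, Char)
applyBoth x = (x 0, x 'd')
```

applyBoth id is accepted, whereas applyBoth length or applyBoth not is rejected, because neither argument has the type forall a. a -> a.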
Lastly, please forgive my contrived examples. Gabriel Scherer describes some more practical uses of higher-rank polymorphism in Moderately Practical uses of System F over ML. You might also consult chapters 23 and 30 of TAPL or skim the documentation for the compiler extensions to find more detail or better practical examples of higher-rank polymorphism.
I'm not an expert on impredicative types, so this is at once a potential answer and a try at learning something from comments.
It doesn't make sense to specialize
\/ a . a -> a (1)
to
(\/ a . a -> a) -> (\/ b . b -> b) (2)
and I don't think impredicative types are a reason to allow it. The quantifiers have the effect of making the types represented by the left and right side of (2) inequivalent sets in general. Yet the a -> a in (1) implies left and right side are equivalent sets.
E.g. you can concretize (2) to (int -> int) -> (string -> string). But by any system I know this is not a type represented by (1).
The error message looks like it results from an attempt by the Haskell type inferencer to unify the type of id
\/ a . a -> a
with the type you've given
\/ c . (c -> c) -> \/ d . (d -> d)
Here I'm uniquifying quantified variables for clarity.
The job of the type inferencer is to find a most general assignment for a, c, and d that causes the two expressions to be syntactically equal. It ultimately finds that it's required to unify c and d. Since they're separately quantified, it's at a dead end and quits.
You are perhaps asking the question because the basic type inferencer -- with an ascription (c -> c) -> (d -> d) -- would just plow ahead and set c == d. The resulting type would be
(c -> c) -> (c -> c)
which is just shorthand for
\/c . (c -> c) -> (c -> c)
This is provably the least most general type (type theoretic least upper bound) expression for the type of x = x where x is constrained to be a function with the same domain and co-domain.
The type of "restrictedId" as given is in a real sense excessively general. While it can never lead to a runtime type error, there are many types described by the expression you've given it - like the aforementioned (int -> int) -> (string -> string) - that are impossible operationally even though your type would allow them.
