Unifying c -> a -> b and (a -> b) -> c - haskell

What is the type inferred by a Haskell type synthesizer when unifying
the types c -> a -> b and (a -> b) -> c?
Can someone explain to me how I can solve it?
Thanks!

This seems to be some kind of exercise/homework so I will not spoil everything but give you some hints first:
the type c -> a -> b is actually c -> (a -> b)
so you have to unify c -> (a -> b) with (a -> b) -> c, that is:
c with a -> b (first part)
a -> b with c (second part)
now what could that be (try to get rid of c ;) )?
PS: I am assuming you want those types a, b, .. to be the same
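For the record, carrying the hints through by hand (spoiler ahead; this assumes the type variables are shared between the two types):
c -> (a -> b) ~ (a -> b) -> c
c ~ a -> b -- from the argument positions
a -> b ~ c -- from the result positions
Both constraints say the same thing, so substituting c := a -> b into either side gives the unified type (a -> b) -> a -> b, which the answers below confirm.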

In other answers, we have seen how to perform the unification by hand, and how to ask ghci some limited unification questions when we do not need to connect type variables in the two types we want to unify. In this answer, I show how to use existing tooling to answer the question you asked as I understand you to intend it.
The trick is to use type-equality constraints to ask GHC to unify two types, then expose the results as a tuple type. The type equality constraint kicks off the unifier; when unification is done, the type variables in our tuple type will be simplified according to what was learned during unification.
Thus, your question looks like this, for example:
> :set -XTypeFamilies
> :{
| :t undefined -- a dummy implementation we don't actually care about
| :: ((a -> b) -> c) ~ (c -> a -> b) -- the unification problem
| => (a, b, c) -- how we ask our query (what are the values of a, b, and c after unification?)
| :}
<snip -- a copy of the above text>
:: (a, b, a -> b)
From this, we learn that for any types a and b, we can choose a ~ a, b ~ b, and c ~ a -> b as a solution to the unification problem. Here is another query you might wonder about: after unification, what is the simplified type of (a -> b) -> c? You could run the previous query and substitute in a, b, and c by hand, or you could ask ghci:
> :t undefined :: ((a -> b) -> c) ~ (c -> a -> b) => (a -> b) -> c
undefined :: ((a -> b) -> c) ~ (c -> a -> b) => (a -> b) -> c
:: (a -> b) -> a -> b
The only thing I changed in this command is the "query" part. The result tells us that (a -> b) -> c becomes (a -> b) -> a -> b after unification. Note well that the a and b in the result type are not guaranteed to be exactly the same as the a and b in the query -- though probably in GHC that will always be the case.
Another quick trick worth mentioning is that you can use Proxy to turn an arbitrarily-kinded type variable into a * type for use in a tuple; thus, for example:
> :t undefined :: f a ~ (b -> c) => (a, b, c, f)
<interactive>:1:42:
Expecting one more argument to ‘f’
The fourth argument of a tuple should have kind ‘*’,
but ‘f’ has kind ‘* -> *’
In an expression type signature: f a ~ (b -> c) => (a, b, c, f)
In the expression: undefined :: f a ~ (b -> c) => (a, b, c, f)
> :m + Data.Proxy
> :t undefined :: f a ~ (b -> c) => (a, b, c, Proxy f)
undefined :: f a ~ (b -> c) => (a, b, c, Proxy f)
:: (c, b, c, Proxy ((->) b))
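As a small sanity check of the equality-constraint trick (my own sketch, not part of the original answer), the same constraint can be used in a source file; once GHC has solved c ~ (a -> b), a function of the constrained type can simply return its argument:
{-# LANGUAGE TypeFamilies #-}
-- After unification, (a -> b) -> c is really (a -> b) -> a -> b,
-- so this definition is accepted; it is just the identity on functions.
unified :: ((a -> b) -> c) ~ (c -> a -> b) => (a -> b) -> c
unified f = f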

You can ask ghci
:t [undefined :: c -> a -> b, undefined :: (a -> b) -> c]
It will need to unify the types to figure out what type the elements of the list are. We can unify any number of types this way; even 0, try it!
The type variables on the left in c -> a -> b are distinct from the type variables on the right in (a -> b) -> c. GHC will rename type variables to keep them distinct, but it will try to preserve the original names. It does this by adding numbers to the end of the type variable names. The answer to this query includes some of the type variables a, a1, b, b1, c, and c1. If you don't want the type variables to be distinct, you can read off the answer ignoring the added numbers.
If you do want the type variables to be distinct, it can be a bit tricky to tell what ghc is doing because you don't know which type variables were renamed to what. In practical coding, this can be a problem when trying to understand type errors. In both cases there is a simple solution: rename the type variables with distinctive names yourself so that ghc doesn't need to rename them.
:t [undefined :: c1 -> a1 -> b1, undefined :: (a2 -> b2) -> c2]
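With distinct names the answer can be read off directly. A sketch of the expected output (your GHC may print slightly different variable names):
> :t [undefined :: c1 -> a1 -> b1, undefined :: (a2 -> b2) -> c2]
[undefined :: c1 -> a1 -> b1, undefined :: (a2 -> b2) -> c2]
  :: [(a2 -> b2) -> a1 -> b1]
That is, c1 was solved to a2 -> b2 and c2 to a1 -> b1.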
We're done with what vanilla Haskell can do, but you can get the compiler to answer questions more generally by using type equality constraints as described in Daniel Wagner's answer. The next section just describes why forall scoped types are not the general solution.
forall
Before reading this section you should think about whether it is possible to unify, for all c, c -> a -> b and (a -> b) -> c.
To the experienced haskeller, it might seem like you could keep the type variables from being distinct by introducing them in an explicit forall scope with the ScopedTypeVariables extension. I don't know an easy way to do this in ghci, but the following snippet with a hole† asks the compiler to unify a -> b and a -> b.
{-# LANGUAGE ScopedTypeVariables #-}
example1 :: forall a b. ()
example1 = (undefined :: _) [undefined :: a -> b, undefined :: a -> b]
The output seems to tell us that the list is a list of a -> b.
Found hole `_' with type: [a -> b] -> ()
If we try to use this for the example problem, it doesn't work.
example2 :: forall a b c. ()
example2 = (undefined :: _) [undefined :: c -> a -> b, undefined :: (a -> b) -> c]
The compiler politely tells us why†
Couldn't match type `c' with `a -> b'
It is not true that for all types c, c is a function. Some example types that aren't functions include Int, Bool, and IO a.
† I use (undefined :: _) instead of _ when asking what type goes in a hole. If you just use _, ghc doesn't type check all of the expression. The compiler may lead you to believe a hole is possible to fill when it is in fact impossible. In the output for example2 there is also the following, extremely misleading line
Found hole `_' with type: [c -> a -> b] -> ()

Related

Understanding ST's quantification and phantom type [duplicate]

I found that I can say
{-# LANGUAGE RankNTypes #-}
f1 :: (forall b.b -> b) -> (forall c.c -> c)
f1 f = id f
(and HLint tells me I can do "Eta reduce" here), but
f2 :: (forall b.b -> b) -> (forall c.c -> c)
f2 = id
fails to compile:
Couldn't match expected type `c -> c'
with actual type `forall b. b -> b'
Expected type: (forall b. b -> b) -> c -> c
Actual type: (forall b. b -> b) -> forall b. b -> b
In the expression: id
In an equation for `f2': f2 = id
Actually I have a similar problem in a more complicated situation, but this is the simplest example I can think of. So either HLint fails to provide proper advice here, or the compiler should handle this situation. Which is it?
UPDATE
Another relevant question looks similar. However, although both answers are quite useful, neither satisfies me, since neither seems to touch the heart of the question.
For example, I am not even allowed to give id the proposed rank-2 type:
f2 :: (forall b.b -> b) -> (forall c.c -> c)
f2 = id :: (forall b.b -> b) -> (forall c.c -> c)
If the problem is just about type inference, an explicit type annotation should solve it (id has type a -> a, and it has been constrained to (forall b.b -> b) -> (forall c.c -> c). Therefore, to justify this use, (forall b.b -> b) must match (forall c.c -> c), and that is true). But the above example shows this is not the case. Thus, this IS a true exception to "eta reduce": you have to explicitly add parameters to both sides to convert a rank-1 typed value into a rank-2 typed value.
But why is there such a limitation? Why can't the compiler unify a rank-1 type and a rank-2 type automatically (forget about type inference; all types can be given by annotations)?
I'm not sure HLint is aware of RankNTypes at all, perhaps not.
Indeed, eta reduction is often impossible with that extension on. GHC can't just unify a -> a and (forall b.b -> b) -> (forall c.c -> c), otherwise it would completely mess up its type inference capability for Rank-1 code1. OTOH, it's not a problem to unify (forall b.b -> b) with the a argument; the result is confirmed to be (forall b.b -> b), which matches (forall c.c -> c).
1Consider map id [(+1), (*2)]. If id were allowed to have the type you're dealing with, the compiler could end up producing different instance choices for the polymorphic Num functions, which certainly shouldn't be possible. Or should it? I'm not sure, thinking about it...
At any rate, I'm pretty sure it's been proven that with RankNTypes, full type inference is not possible, so to keep inference working at least in the Rank-1 subset, GHC must usually default to the less polymorphic choice.
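To make the footnote concrete, here is a minimal sketch of that situation in ordinary Rank-1 code; id is instantiated at a single monomorphic type, which is exactly what plain Hindley-Milner inference produces:
fns :: [Integer -> Integer]
fns = map id [(+ 1), (* 2)] -- here id is used at type (Integer -> Integer) -> Integer -> Integer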

How to derive the type of an applicator applied to the identity function

I want to derive the type of the following contrived applicator applied to the identity function. To achieve this I probably have to unify the type portion of the first argument (a -> [b]) with the type of id:
ap :: (a -> [b]) -> a -> [b]
id :: a -> a
a -> [b]
a0 -> a0 -- fresh type vars
a ~ a0 -- mappings
[b] ~ a0
a0 -> a0 -- substitution
This is obviously wrong, since the expected type is [b] -> [b]. There is an ambiguity within the unification, because a0 cannot be equivalent to both a and [b] unless a ~ [b]. But what is the rule that tells me to substitute a with [b] and not the other way around, as I would have to do with ap :: ([a] -> b) -> [a] -> b for example?
I know this is a very specific question, sorry. Hopefully it is not too confusing!
Ok, new answer because I now understand the question being asked! To restate the question:
Given
ap :: (a -> [b]) -> a -> [b]
id :: a -> a
Explain how the type of the expression ap id is derived.
Answer:
Rename variables:
ap :: (a -> [b]) -> a -> [b]
id :: a0 -> a0
Unify:
(a -> [b]) ~ (a0 -> a0)
Apply generativity a couple times, pulling the arguments from the (->) type constructor:
a ~ a0
[b] ~ a0
Apply symmetry/transitivity of type equality:
[b] ~ a
Substitute the most specific types known into the types of ap and id
ap :: ([b] -> [b]) -> [b] -> [b]
id :: [b] -> [b]
[b] is the most specific type known, because it provides some restriction. The value must be a list of something. The other two equivalent type expressions just mean any type at all. You can think of unification as a constraint-solving process. You find the maximal type that satisfies the constraints provided, which amount to "it's a list of something" for this case.
Now that the types are unified, the type of the function application is the type of the function's result:
ap id :: [b] -> [b]
I can see why the choice of [b] looks a little odd in this case, because only one of the three type expressions contributed factors to unify. There are more involved cases where constraints come from multiple places, though.
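You can also let ghci confirm the result (a quick sketch; the variable names it prints may differ):
> :t (undefined :: (a -> [b]) -> a -> [b]) id
(undefined :: (a -> [b]) -> a -> [b]) id :: [b] -> [b]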
Let's consider a more advanced case. This might involve some things you haven't seen before. If it does, I apologize for jumping straight to the deep end.
Given:
f1 :: (a -> b) -> f a -> f b
f2 :: p c d -> (e, c) -> (e, d)
Unify the types of f1 and f2.
Let's be really careful with this one. First up, rewrite all the types in terms of prefix application. Even the (->) types. This is going to be ugly:
f1 :: (->) ((->) a b) ((->) (f a) (f b))
f2 :: (->) (p c d) ((->) ((,) e c) ((,) e d))
Unify and apply generativity twice to top-level (->) type constructors:
((->) a b) ~ (p c d)
((->) (f a) (f b)) ~ ((->) ((,) e c) ((,) e d))
And, just keep unifying and applying generativity:
(->) ~ p
a ~ c
b ~ d
f a ~ (,) e c
f b ~ (,) e d
f ~ (,) e
Ok, we've built up a giant stack of constraints now. Choosing between a and c or b and d doesn't matter, as they're equivalently constrained. Let's choose letters closer to the beginning of the alphabet when it doesn't matter. (->) is more constrained than p, so it wins there, and (,) e is more constrained than f. Call it a winner too.
Then switch back to infix type constructors to make it pretty, and the unified type is:
(a -> b) -> (e, a) -> (e, b)
Notice how each of the two starting types contributed a constraint to the final unified type. f1 requires the p type in f2 to be more specific, and f2 requires the f type in f1 to be more specific.
Overall, this is a super-mechanical process. It's also fiddly and requires precise tracking of what you know. There's a reason we mostly leave it to the compiler to handle this. It is absolutely useful in the cases when something goes wrong and you want to double-check the process yourself to see why the compiler is reporting an error, though.
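If you want the compiler to double-check the advanced case, the list trick from an earlier answer works here too (a sketch; variable names may differ):
> :t [undefined :: (a -> b) -> f a -> f b, undefined :: p c d -> (e, c) -> (e, d)]
[undefined :: (a -> b) -> f a -> f b, undefined :: p c d -> (e, c) -> (e, d)]
  :: [(a -> b) -> (e, a) -> (e, b)]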

Type checking with RankNTypes in Haskell

I'm trying to understand RankNTypes in Haskell and found this example:
check :: Eq b => (forall a. [a] -> b) -> [c] -> [d] -> Bool
check f l1 l2 = f l1 == f l2
(If my understanding is correct, this is equivalent to check :: forall b c d. Eq b => (forall a. [a] -> b) -> [c] -> [d] -> Bool.)
Ok, so far so good. Now, if the explicit forall a is removed, GHC produces the following errors:
Could not deduce (c ~ a)
from the context (Eq b)
[…]
Could not deduce (d ~ a)
from the context (Eq b)
[…]
When removing the nested forall, the type signature becomes
check :: forall a b c d. Eq b => ([a] -> b) -> [c] -> [d] -> Bool
It is easy to see why this fails type checking since l1 and l2 should have type [a] for us to pass them to f, but why isn't this the case when specifying f's type as (forall a. [a] -> b)? Is the fact that a is only bound inside the parens the full answer? I.e. the type checker will accept
[c] -> b ~ (forall a. [a] -> b)
[d] -> b ~ (forall a. [a] -> b)
(edit: Fixed. Thanks, Boyd!)
since a function of type (forall a. a -> b) can take any list?
When f = \xs -> ... is written with the explicit Rank2 quantification forall a. [a] -> b you can view this as a new function
f = Λa -> \xs -> ...
where Λ is a special lambda that takes a type argument to determine which specific type a it will use in the body of the function. This type argument is applied each time the function is called, just like how normal lambda bindings are applied on each call. This is how GHC handles forall internally.
In the explicitly forall'd version, f can be applied to different type arguments each time it is called so a can resolve to a different type each time, once for c and once for d.
In the version without the inner forall, this type application for a happens only once, when check is called. So every time f is called it must use the same a. Of course this fails since f is called on lists of different types.
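A small usage sketch (my example, not from the question) shows the difference in practice: because f keeps its own forall, check can use it at two different element types within one call:
{-# LANGUAGE RankNTypes #-}

check :: Eq b => (forall a. [a] -> b) -> [c] -> [d] -> Bool
check f l1 l2 = f l1 == f l2

lengthsMatch :: Bool
lengthsMatch = check length [1, 2, 3 :: Int] "abc" -- 3 == 3, so True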
It is easy to see why this fails type checking since l1 and l2 should have type [a] for us to pass them to f, but why isn't this the case when specifying f's type as (forall a. [a] ->b)?
Because the type (forall a. [a] -> B) can be unified with [C] -> B and (separately) [D] -> B. However, the type [A] -> B cannot be unified with either [C] -> B or [D] -> B.
Is the fact that a is only bound inside the parens the full answer?
Basically. You have to choose a particular type for each type variable when you are "inside" a forall scope, but outside you can use the forall multiple times and choose a different particular type each time you do.
I.e. the type checker will accept
[c] ~ (forall a. a -> b)
[d] ~ (forall a. a -> b)
since a function of type (forall a. a -> b) can take any list?
Careful. You seem to have lost some "[]" characters there. Also, you are not quite getting the unification correct. The type checker will accept both:
[C] -> B ~ (forall a. [a] -> B)
[D] -> B ~ (forall a. [a] -> B)
It will not accept either:
[C] -> B ~ [A] -> B
[D] -> B ~ [A] -> B
You can rewrite universal quantification in a contravariant position as existential quantification in covariant position (not legally in Haskell, but in principle).
check' :: exists c' d'. forall b c d. Eq b
=> ([c'] -> b) -> ([d'] -> b) -> [c] -> [d] -> Bool
It's obvious enough that this works: for c ~ C, d ~ D choose c' ~ C and d' ~ D as well, then the function is simply
check'' :: forall b . Eq b => ([C] -> b) -> ([D] -> b) -> [C] -> [D] -> Bool
Not sure if this answers your question, but it is one way to look at rank-2 types.

Why does this Haskell code compile?

Given:
uncurry :: (a -> b -> c) -> (a, b) -> c
id :: a -> a
Invoking uncurry id results in a function of type: (b -> c, b) -> c
How do we get this result?
How can you use id (a -> a) as the first parameter to uncurry, which requires a (a -> b -> c) function?
It's easier to understand if we look at it from the point of view of making the types work out: figuring out what we need to do to id's type to get it to fit the shape required by uncurry. Since we have:
id :: a -> a
we also have:
id :: (b -> c) -> (b -> c)
This can be seen by substituting b -> c for a in the original type of id, just as you might substitute Int instead when figuring out the type of id 42. We can then drop the parentheses on the right-hand side, since (->) is right-associative:
id :: (b -> c) -> b -> c
showing that id's type fits the form a -> b -> c, where a is b -> c. In other words, we can reshape id's type to fit the required form simply by specialising the general type it already has.
Another way to understand this is to see that uncurry ($) also has the type (b -> c, b) -> c. Comparing the definitions of id and ($):
id :: a -> a
id a = a
($) :: (a -> b) -> a -> b
($) f x = f x
we can make the latter definition more point-free:
($) f = f
at which point the fact that ($) is simply a specialisation of id to a more specific type becomes clear.
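A quick usage sketch in ghci shows the two behaving identically:
> uncurry id ((+ 1), 5)
6
> uncurry ($) ((+ 1), 5)
6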
How can you use id (a -> a) as the first parameter to uncurry, which requires a (a -> b -> c) function?
Actually, uncurry requires an (a -> (b -> c)) function. Can you spot the difference? :)
Omitting parentheses is evil (well, sometimes). It makes it impossible for a novice to decipher Haskell. Of course, after you've gathered some experience with the language, you feel like you don't need them at all anymore.
Here, it all becomes clear once we write out all the omitted parentheses back explicitly:
uncurry :: (a -> (b -> c)) -> ((a,b) -> c)
id :: a -> a
Now, writing uncurry id calls for a type unification of a1 -> a1 with a2 -> (b -> c). This is straightforward, a1 ~ a2 and a1 ~ (b -> c). Just mechanical stuff, no creative thinking involved here. So id in question actually has type a -> a where a ~ (b -> c), and so uncurry id has type (b -> c,b) -> c, by simple substitution of a ~ (b -> c) into (a,b) -> c. That is, it expects a pair of a b -> c function and a b value, and must produce a c value.
Since the types are most general (i.e. nothing is known about them, and so there's no specific functions to call that might do the trick in some special way), the only way to produce a c value here is to call the b -> c function with the b value as an argument. Naturally, that's what ($) does. So uncurry id == uncurry ($), although id is most certainly not ($).
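And ghci agrees with the mechanical derivation:
> :t uncurry id
uncurry id :: (b -> c, b) -> c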

Why can't the type of id be specialised to (forall a. a -> a) -> (forall b. b -> b)?

Take the humble identity function in Haskell,
id :: forall a. a -> a
Given that Haskell supposedly supports impredicative polymorphism, it seems reasonable that I should be able to "restrict" id to the type (forall a. a -> a) -> (forall b. b -> b) via type ascription. But this doesn't work:
Prelude> id :: (forall a. a -> a) -> (forall b. b -> b)
<interactive>:1:1:
Couldn't match expected type `b -> b'
with actual type `forall a. a -> a'
Expected type: (forall a. a -> a) -> b -> b
Actual type: (forall a. a -> a) -> forall a. a -> a
In the expression: id :: (forall a. a -> a) -> (forall b. b -> b)
In an equation for `it':
it = id :: (forall a. a -> a) -> (forall b. b -> b)
It's of course possible to define a new, restricted form of the identity function with the desired signature:
restrictedId :: (forall a. a -> a) -> (forall b. b -> b)
restrictedId x = x
However defining it in terms of the general id doesn't work:
restrictedId :: (forall a. a -> a) -> (forall b. b -> b)
restrictedId = id -- Similar error to above
So what's going on here? It seems like it might be related to difficulties with impredicativity, but enabling -XImpredicativeTypes makes no difference.
why is it expecting a type of (forall a. a -> a) -> b -> b
I think the type forall b.(forall a. a -> a) -> b -> b is equivalent to the type you gave. It is just a canonical representation of it, where the forall is shifted as much to the left as possible.
And the reason why it does not work is that the given type is actually more polymorphic than the type of id :: forall c. c -> c, which requires that argument and return types be equal. But the forall a in your type effectively forbids a from being unified with any other type.
You are absolutely correct that forall b. (forall a. a -> a) -> b -> b is not equivalent to (forall a. a -> a) -> (forall b. b -> b).
Unless annotated otherwise, type variables are quantified at the outermost level. So (a -> a) -> b -> b is shorthand for (forall a. (forall b. (a -> a) -> b -> b)). In System F, where type abstraction and application are made explicit, this describes a term like f = Λa. Λb. λx:(a -> a). λy:b. x y. Just to be clear for anyone not familiar with the notation, Λ is a lambda that takes a type as a parameter, unlike λ which takes a term as a parameter.
The caller of f first provides a type parameter a, then supplies a type parameter b, then supplies two values x and y that adhere to the chosen types. The important thing to note is the caller chooses a and b. So the caller can perform an application like f String Int length for example to produce a term String -> Int.
Using -XRankNTypes you can annotate a term by explicitly placing the universal quantifier, it doesn't have to be at the outermost level. Your restrictedId term with the type (forall a. a -> a) -> (forall b. b -> b) could be roughly exemplified in System F as g = λx:(forall a. a -> a). if (x Int 0, x Char 'd') > (0, 'e') then x else id. Notice how g, the callee, can apply x to both 0 and 'e' by instantiating it with a type first.
But in this case the caller cannot choose the type parameter like it did before with f. You'll note the applications x Int and x Char inside the lambda. This forces the caller to provide a polymorphic function, so a term like g length is not valid because length does not apply to Int or Char.
Another way to think about it is drawing the types of f and g as a tree. The tree for f has a universal quantifier as the root while the tree for g has an arrow as the root. To get to the arrow in f, the caller instantiates the two quantifiers. With g, it's already an arrow type and the caller cannot control the instantiation. This forces the caller to provide a polymorphic argument.
Lastly, please forgive my contrived examples. Gabriel Scherer describes some more practical uses of higher-rank polymorphism in Moderately Practical uses of System F over ML. You might also consult chapters 23 and 30 of TAPL or skim the documentation for the compiler extensions to find more detail or better practical examples of higher-rank polymorphism.
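To see the caller/callee distinction concretely, here is a small sketch (the not example is mine): restrictedId insists on a genuinely polymorphic argument, so it accepts id but rejects any function tied to a particular type:
{-# LANGUAGE RankNTypes #-}

restrictedId :: (forall a. a -> a) -> (forall b. b -> b)
restrictedId x = x

ok :: b -> b
ok = restrictedId id -- fine: id really does have type forall a. a -> a

-- bad = restrictedId not -- rejected: not :: Bool -> Bool is not polymorphic enough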
I'm not an expert on impredicative types, so this is at once a potential answer and a try at learning something from comments.
It doesn't make sense to specialize
\/ a . a -> a (1)
to
(\/ a . a -> a) -> (\/ b . b -> b) (2)
and I don't think impredicative types are a reason to allow it. The quantifiers have the effect of making the types represented by the left and right sides of (2) inequivalent sets in general. Yet the a -> a in (1) implies the left and right sides are equivalent sets.
E.g. you can concretize (2) to (int -> int) -> (string -> string). But by any system I know this is not a type represented by (1).
The error message looks like it results from an attempt by the Haskell type inferencer to unify the type of id
\/ a . a -> a
with the type you've given
\/ c . (c -> c) -> \/ d . (d -> d)
Here I'm uniquifying quantified variables for clarity.
The job of the type inferencer is to find a most general assignment for a, c, and d that causes the two expressions to be syntactically equal. It ultimately finds that it's required to unify c and d. Since they're separately quantified, it's at a dead end and quits.
You are perhaps asking the question because the basic type inferencer -- with an ascription (c -> c) -> (d -> d) -- would just plow ahead and set c == d. The resulting type would be
(c -> c) -> (c -> c)
which is just shorthand for
\/c . (c -> c) -> (c -> c)
This is provably the most general type (the type-theoretic least upper bound) for x = x where x is constrained to be a function with the same domain and co-domain.
The type of "restricedId" as given is in a real sense excessively general. While it can never lead to a runtime type error, there are many types described by the expression you've given it - like the aforementioned (int -> int) -> (string -> string) - that are impossible operationally even though your type would allow them.
