Code unexpectedly accepted by GHC/GHCi - haskell

I don't understand why this code should pass type-checking:
foo :: (Maybe a, Maybe b)
foo = let x = Nothing in (x,x)
Since each component is bound to the same variable x, I would expect the most general type for this expression to be (Maybe a, Maybe a). I get the same results if I use a where instead of a let. Am I missing something?

Briefly put, the type of x gets generalized by let. This is a key step in the Hindley-Milner type inference algorithm.
Concretely, let x = Nothing initially assigns x the type Maybe t, where t is a fresh type variable. Then, the type gets generalized, universally quantifying all its type variables (technically: except those in use elsewhere, but here we only have t). This causes x :: forall t. Maybe t. Note that this is exactly the same type as Nothing :: forall t. Maybe t.
Hence, each time we use x in our code, that refers to a potentially different type Maybe t, much like Nothing. Using (x, x) gets the same type as (Nothing, Nothing) for this reason.
Lambdas, by contrast, do not feature this generalization step. For comparison, (\x -> (x, x)) Nothing "only" has type forall t. (Maybe t, Maybe t), where both components are forced to be of the same type. Here x is again assigned type Maybe t, with t fresh, but it is not generalized. Then (x, x) is assigned type (Maybe t, Maybe t). Only at the top level do we generalize by adding forall t, but at that point it is too late to obtain a heterogeneous pair.
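A minimal sketch of the difference, for illustration only (the names pairLet and pairLam are made up here, not taken from the question). The let-bound x can be used at two different types, while the lambda-bound x must keep a single type throughout the body:

```haskell
-- x is let-bound, so its type is generalized to forall t. Maybe t
-- and each use may pick a different t.
pairLet :: (Maybe Int, Maybe Bool)
pairLet = let x = Nothing in (x, x)

-- x is lambda-bound, so it keeps one monomorphic type Maybe t;
-- both components must agree.
pairLam :: (Maybe Int, Maybe Int)
pairLam = (\x -> (x, x)) Nothing
```

Changing the signature of pairLam to (Maybe Int, Maybe Bool) should be rejected by the type checker, because the single t cannot be both Int and Bool.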

Related

I think there is a type mismatch in default definition in instance Applicative Maybe in Haskell

I am currently studying Haskell with Prof. Hutton's "Programming in Haskell", and I found something strange regarding the definition of Maybe as an instance of the class Applicative.
In GHC.Base, the instance Applicative Maybe is defined as follows:
instance Applicative Maybe where
    pure = Just
    Just f <*> m = fmap f m
    Nothing <*> _m = Nothing
It is the line which defines the value of Nothing <*> _ as Nothing that bothers me. Nothing is of type Maybe a, where the operator <*> actually requires f (a -> b) (in this case, Maybe (a -> b)) as its first argument's type. Therefore, this is a type mismatch, which Haskell should complain about. However, this is accepted as a default definition, and therefore Haskell does not complain about it where I think it should.
What am I missing?
The a in Maybe a is a type variable, and can be any type at all! So Nothing can have the type Maybe Int, or Maybe [x], or Maybe (p -> q), for example.
Don't get confused by the fact that the variable name a is used in two places. The a in the type of Nothing is a completely different variable from the a in the type of <*>, and just happens to have the same name!
(That's exactly the same as if you wrote f x = x + 5 and then elsewhere, g x = "Hello, " ++ x. The use of x in both places doesn't matter, because they are in different scopes. Same with the a in these types. Different scopes, so they are different variables.)
Let's make things clearer by relabeling a type variable:
Nothing :: Maybe x
The type Maybe x unifies with Maybe (a -> b), with x ~ (a -> b). That is, Nothing is a value that can be used as Maybe a for any a, including a function type. Thus it is a legal left-hand argument for <*> here.
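As a small illustration (the names below are hypothetical, chosen for this sketch), Nothing can be given a function-carrying Maybe type explicitly and then used on the left of <*>:

```haskell
-- Nothing instantiated at a function type: x ~ (Int -> Int).
noFunction :: Maybe (Int -> Int)
noFunction = Nothing

-- The Nothing <*> _m = Nothing equation applies here.
applied :: Maybe Int
applied = noFunction <*> Just 3

-- For contrast, the Just f <*> m = fmap f m equation.
appliedJust :: Maybe Int
appliedJust = Just (+ 1) <*> Just 3
```

Here applied evaluates to Nothing and appliedJust to Just 4.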

Confusing about Haskell type inference

I have just started learning Haskell. As Haskell is statically typed and has polymorphic type inference, the type of the identity function is
id :: a -> a
suggesting id can take any type as its parameter and return itself. It works fine when I try:
a = (id 1, id True)
I just suppose that at compile time, the first id is Num a => a -> a, and the second id is Bool -> Bool. When I try the following code, it gives an error:
foo f a b = (f a, f b)
result = foo id 1 True
It shows that the type of a must be the same as the type of b, since it works fine with
result = foo id 1 2
But isn't it true that the type of id's parameter can be polymorphic, so that a and b can have different types?
All right, this is a weird spooky corner of Haskell's type system. The problem here is that there are two ways to type your function foo.
-- rank 1
foo :: forall a b. (a -> b) -> a -> a -> (b, b)
foo f a b = (f a, f b)
-- rank 2
foo' :: (forall a. a -> a) -> a -> b -> (a, b)
foo' f a b = (f a, f b)
The second type is the one you want, but the first type is the one you're getting. The second type, as amalloy pointed out, is a rank-2 type (we're going to ignore what the two means but read the introduction in "Practical type inference for arbitrary-rank types" if you want a good explanation of ranks – don't be put off by the academic nature of the PDF file as the beginning is accessibly and clearly written).
We'll defer the definition of higher-ranked types for now and just say that the problem is that GHC is unable to infer the rank-2 type. To quote the paper:
Complete type inference is known to be undecidable for higher-rank (impredicative) type systems, but in practice programmers are more than willing to add type annotations to guide the type inference engine, and to document their code....
Kfoury and Wells show that typeability is decidable for rank ≤ 2, and undecidable for all ranks ≥ 3 (Kfoury & Wells, 1994). For the rank-2 fragment, the same paper gives a type inference algorithm. This inference algorithm is somewhat subtle, does not interact well with user-supplied type annotations, and has not, to our knowledge, been implemented in a production compiler.
Undecidable means there can be no algorithm that always leads to a correct yes-or-no decision. So there you have it: impossible to infer a rank-3-or-higher type, and it's too gosh-darn-hard to infer the rank-2 type.
Now, back to rank 2. The (forall a. a -> a) is what makes it rank-2. There's already an excellent Stack Overflow question about what the forall keyword means so I'll refer you to that, but basically it means you're able to call f a and f b in the expression (f a, f b) while having a and b be different types, which is what you wanted in the first place, before all this hot mess.
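A minimal sketch of the rank-2 version in a compilable form, assuming the RankNTypes extension is enabled. The name applyBoth is made up for this example, and the inner variable is renamed to x to avoid confusion with the outer a:

```haskell
{-# LANGUAGE RankNTypes #-}

-- The argument f must itself be polymorphic, so it can be applied
-- to an a and to a b inside the same body.
applyBoth :: (forall x. x -> x) -> a -> b -> (a, b)
applyBoth f a b = (f a, f b)

-- id is polymorphic enough to be passed as f.
result :: (Int, Bool)
result = applyBoth id 1 True
```

The signature is required here: without it, GHC falls back to the rank-1 type described above and rejects the call with arguments of two different types.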
One last thing: The reason you don't normally see foralls in GHCi is that any foralls on the very outer scope are left off. So forall a b. (a -> b) -> a -> a -> (b, b) is equivalent to (a -> b) -> a -> a -> (b, b).
Overall this is a pain point of the language that's poorly explained.
(Hat tip to @amalloy in the comments.)

Wrapping / Unwrapping Universally Quantified Types

I have imported a data type, X, defined as
data X a = X a
Locally, I have defined a universally quantified data type, Y
type Y = forall a. X a
Now I need to define two functions, toY and fromY. For fromY, this definition works fine:
fromY :: Y -> X a
fromY a = a
but if I try the same thing for toY, I get an error
Couldn't match type 'a' with 'a0'
'a' is a rigid type variable bound by the type signature for 'toY :: X a -> y'
'a0' is a rigid type variable bound by the type signature for 'toY :: X a -> X a0'
Expected type: X a0
Actual type: X a
If I understand correctly, the type signature for toY is expanding to forall a. X a -> (forall a0. X a0) because Y is defined as a synonym, rather than a newtype, and so the two as in the definitions don't match up.
But if this is the case, then why does fromY type-check successfully? And is there any way around this problem other than using unsafeCoerce?
You claim to define an existential type, but you do not.
type Y = forall a. X a
defines a universally quantified type. For something to have type Y, it must have type X a for every a. To make an existentially quantified type, you always need to use data, and I find the GADT syntax easier to understand than the traditional existential one.
data Y where
    Y :: forall a . X a -> Y
The forall there is actually optional, but I think clarifies things.
I'm too sleepy right now to work out your other questions, but I'll try again tomorrow if no one else does.
Remark:
This is more like a comment but I could not really put it there as it would have been unreadable; please forgive me this one time.
Aside from what dfeuer already told you, you might see (when you use his answer) that toY is now really easy to do but you might have trouble defining fromY – because you basically lose the type-information, so this will not work:
{-# LANGUAGE GADTs #-}
module ExTypes where
data X a = X a
data Y where
    Y :: X a -> Y
fromY :: Y -> X a
fromY (Y a) = a
Here you have two different as (one from the constructor Y and one from X a). Indeed, if you strip the type signature and try to compile just fromY (Y a) = a, the compiler will tell you that the type a escapes:
Couldn't match expected type `t' with actual type `X a'
because type variable `a' would escape its scope
This (rigid, skolem) type variable is bound by
a pattern with constructor
Y :: forall a. X a -> Y,
in an equation for `fromY'
I think the only thing you will have left now will be something like this:
useY :: (forall a . X a -> b) -> Y -> b
useY f (Y x) = f x
but this might prove not to be too useful.
The thing is that you normally should constrain the forall a there a bit more (with type-classes) to get any meaningful behavior – but of course I cannot help here.
This wiki article might be interesting for you on the details.
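To make the remark about type classes concrete, here is a sketch of an existential whose hidden type is constrained by Show. The names ShowY and describe are hypothetical and only serve the illustration; X is redefined locally so the snippet stands alone:

```haskell
{-# LANGUAGE GADTs #-}

data X a = X a

-- Hypothetical variant of Y: the hidden type must have a Show instance.
data ShowY where
  ShowY :: Show a => X a -> ShowY

-- The hidden type still cannot be recovered, but its Show instance
-- travels with the constructor and can be used after pattern matching.
describe :: ShowY -> String
describe (ShowY (X v)) = show v

-- Values of different underlying types can now share one list.
descriptions :: [String]
descriptions = map describe [ShowY (X (42 :: Int)), ShowY (X True)]
```

With this, descriptions evaluates to ["42","True"], which is about as much as one can do with the wrapped values without knowing their types.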

Why doesn't f=(+) need a type annotation?

I mean, for example,
f :: (Enum a) => a -> a --without this line, there would be an error
f = succ
It's because succ needs its parameter to be enumerable (succ :: (Enum a) => a -> a)
but for (+)
f = (+) --ok
Though (+)'s declaration is (+) :: (Num a) => a -> a -> a.
I mean, why don't I need to declare f as f :: (Num a) => a -> a -> a?
Because of defaulting. Num is a 'defaultable' type class, meaning that if you leave it un-constrained, the compiler will make a few intelligent guesses as to which type you meant to use it as. Try putting that definition in a module, then running
:t f
in ghci; it should tell you (IIRC) f :: Integer -> Integer -> Integer. The compiler didn't know which a you wanted to use, so it guessed Integer; and since that worked, it went with that guess.
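One way to observe the defaulting from inside a module, as a rough sketch (bigSum and shown are names invented here). The two checks rely on Integer not overflowing and on Integer results printing without a decimal point:

```haskell
-- No signature: the monomorphism restriction keeps f monomorphic, and
-- the leftover Num constraint is resolved by defaulting.
f = (+)

-- With Int this sum would wrap around to a negative number.
bigSum :: Bool
bigSum = f (2 ^ 62) (2 ^ 62) > 0

-- With Double this would print as "3.0".
shown :: String
shown = show (f 1 2)
```

Neither use pins down the type of f on its own, so the behaviour reflects whatever type the compiler picked by default.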
Why didn't it infer a polymorphic type for f? Because of the dreaded[1] monomorphism restriction. When the compiler sees
f = (+)
it thinks 'f is a value', which means it needs a single (monomorphic) type. Eta-expand the definition to
f x = (+) x
and you will get the polymorphic type
f :: Num a => a -> a -> a
and similarly if you eta-expand your first definition
f x = succ x
you don't need a type signature any more.
[1] Actual name from the GHC documentation!
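A short sketch of the two ways to keep these definitions polymorphic that the answer mentions: an explicit signature on the point-free binding, or eta-expansion. The names plus, plusEta and next are made up for the example:

```haskell
-- Point-free binding: the explicit signature keeps it polymorphic.
plus :: Num a => a -> a -> a
plus = (+)

-- Eta-expanded bindings: function bindings are not subject to the
-- monomorphism restriction, so polymorphic types are inferred.
plusEta x y = x + y
next x = succ x

-- Each definition is used at more than one type.
uses :: (Int, Double, Double, Char, Int)
uses = (plus 1 2, plus 1.5 2, plusEta 1.5 2, next 'a', next 41)
```

If plusEta or next had been given a single monomorphic type, the tuple above would not type-check.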
I mean, why don't I need to declare f as (+) :: (Num a) => a -> a -> a?
You do need to do that, if you declare the signature of f at all. But if you don't, the compiler will “guess” the signature itself. In this case this isn't all that remarkable, since it can basically just copy and paste the signature of (+). And that's precisely what it will do.
...or at least what it should do. It does, provided you have the -XNoMonomorphismRestriction flag on. Otherwise, well, the dreaded monomorphism restriction steps in because f's definition is of the shape ConstantApplicativeForm = Value; that makes the compiler dumb down the signature to the next best non-polymorphic type it can find, namely Integer -> Integer -> Integer. To prevent this, you should in fact supply the right signature by hand, for all top-level functions. That also prevents a lot of confusion, and many errors become way less confusing.
The monomorphism restriction is the reason
f = succ
won't work on its own: because it also has this CAF shape, the compiler does not try to infer the correct polymorphic type, but tries to find some concrete instantiation to make a monomorphic signature. But unlike Num, the Enum class does not offer a default instance.
Possible solutions, ordered by preference:
1. Always add signatures. You really should.
2. Enable -XNoMonomorphismRestriction.
3. Write your function definitions in the form f a = succ a, f a b = a+b. Because there are explicitly mentioned arguments, these don't qualify as CAF, so the monomorphism restriction won't kick in.
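The extension route can be sketched as follows, using a per-module pragma rather than a command-line flag. The tuple uses is just a made-up way to exercise the definitions:

```haskell
{-# LANGUAGE NoMonomorphismRestriction #-}

-- With the restriction switched off, these point-free bindings are
-- generalized like any other definition, even without signatures.
f = (+)
g = succ

-- f is used at Int and at Double, g at Char.
uses :: (Int, Double, Char)
uses = (f 1 2, f 1.5 2, g 'a')
```

Adding explicit signatures remains the more robust choice, since the code then no longer depends on which extensions are enabled.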
Haskell defaults otherwise unresolved Num constraints to Integer (the standard default list is (Integer, Double), tried in that order).

What Justification for the type of f x = f x in Haskell is there?

Haskell gives f x = f x the type of t1 -> t, but could someone explain why?
And, is it possible for any other, nonequivalent function to have this same type?
Okay, starting from the function definition f x = f x, let's step through and see what we can deduce about the type of f.
Start with a completely unspecified type variable, a. Can we deduce more than that? Yes, we observe that f is a function taking one argument, so we can change a into a function between two unknown type variables, which we'll call b -> c. Whatever type b stands for is the type of the argument x, and whatever type c stands for must be the type of the right-hand side of the definition.
What can we figure out about the right-hand side? Well, we have f, which is a recursive reference to the function we're defining, so its type is still b -> c, where both type variables are the same as for the definition of f. We also have x, which is a variable bound within the definition of f and has type b. Applying f to x type checks, because they're sharing the same unknown type b, and the result is c.
At this point everything fits together and with no other restrictions, we can make the type variables "official", resulting in a final type of b -> c where both variables are the usual, implicitly universally quantified type variables in Haskell.
In other words, f is a function that takes an argument of any type and returns a value of any type. How can it return any possible type? It can't, and we can observe that evaluating it produces only an infinite recursion.
For the same reason, any function with the same type will be "equivalent" in the sense of never returning when evaluated.
An even more direct version is to remove the argument entirely:
foo :: a
foo = foo
...which is also universally quantified and represents a value of any type. This is pretty much equivalent to undefined.
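The types discussed above can be written down and checked by the compiler. In this sketch, loop is a renamed copy of f, and stillFine is a hypothetical value that only mentions the diverging terms without ever forcing them:

```haskell
-- Argument and result types are unrelated, fully polymorphic variables.
loop :: b -> c
loop x = loop x

-- foo inhabits every type, but evaluating it never terminates.
foo :: a
foo = foo

-- Laziness lets us use these at any type as long as they are not forced:
-- const ignores its second argument, and length does not inspect elements.
stillFine :: Int
stillFine = const 7 (loop 'x' :: String) + length [foo :: Bool, foo]
```

Here stillFine evaluates to 9, whereas forcing loop 'x' or foo directly would never return.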
f x = undefined
has the (alpha) equivalent type f :: t -> a.
If you're curious, Haskell's type system is derived from Hindley–Milner. Informally, the typechecker starts off with the most permissive types for everything, and unifies the various constraints until what remains is consistent (or not). In this case, the most general type is f :: t1 -> t, and there are no additional constraints.
Compare to
f x = f (f x)
which has inferred type f :: t -> t, due to unifying the types of the argument of f on the LHS and the argument to the outer f on the RHS.
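For completeness, a sketch of that last definition with the reported type written out as a signature. The name selfFeed is made up, and unused only demonstrates that the definition compiles and can be mentioned lazily:

```haskell
-- The argument of the outer call is the result of the inner call, so
-- during inference the argument and result types are unified.
selfFeed :: t -> t
selfFeed x = selfFeed (selfFeed x)

-- Like f x = f x, selfFeed never returns, so it is only passed to const,
-- which discards it unevaluated.
unused :: Bool
unused = const True (selfFeed 'c')
```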
