strange existential type - haskell

http://www.iai.uni-bonn.de/~jv/mpc08.pdf - in this article I cannot understand the following declaration:
instance TreeLike CTree where
  ...

abs :: CTree a -> Tree a

improve :: (forall m. TreeLike m => m a) -> Tree a
improve m = abs m
What difference does (forall m. TreeLike m => m a) make? (I thought TreeLike m => m a would suffice here.)
Why is abs permitted here, if the m in m a can be any TreeLike, not just CTree?

That's a rank-2 type, not an existential type. What that type means is that the argument to improve must be polymorphic. You can't pass a value of type CTree a (for instance) to improve. It cannot be concrete in the type constructor at all; it explicitly must be polymorphic in the type constructor, with the constraint that the type constructor implements the TreeLike class.
For your second question, this allows the implementation of improve to pick whichever type for m it wants to - it's the implementation's choice, and it's hidden from the caller by the type system. The implementation happens to pick CTree for m in this case. That's totally fine. The trick is that the caller of improve doesn't get to use that information anywhere.
This has the practical result that the value cannot be constructed using details of any particular type - it has to be built using the functions in the TreeLike class instead. But the implementation gets to pick a specific type for it to work with, allowing it to use details of the representation internally.

Whether m can be "any TreeLike" depends on your perspective.
From the perspective of implementing improve, it's true--m can be any TreeLike, so it picks one that's convenient, and uses abs.
From the perspective of the argument m--which is to say, the perspective of whatever is applying improve to some argument--something rather like the opposite holds: m must in fact be able to be any TreeLike, not a single one that we choose.
Compare this to the type of numeric literals--something like (5 :: forall a. Num a => a) means that it's any Num instance we want it to be, but if a function expects an argument of type (forall a. Num a => a) it wants something that can be any Num instance it chooses. So we could give it a polymorphic 5 but not, say, the Integer 5.
You can, in many ways, think of polymorphic types as meaning that the function takes a type as an extra argument, which tells it what specific type we want to use for each type variable. So to see the difference between (forall m. TreeLike m => m a) -> Tree a and forall m. TreeLike m => m a -> Tree a you can read them as something like (M -> M a) -> Tree a vs. M -> M a -> Tree a.
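To make the contrast concrete, here is a minimal sketch (it uses a hypothetical Defaultable class of my own rather than the paper's TreeLike, and needs the RankNTypes extension) of a rank-2 function whose implementation picks the type constructor:
{-# LANGUAGE RankNTypes #-}

-- Hypothetical stand-in for TreeLike: any constructor with an "empty" value.
class Defaultable f where
  none :: f a

instance Defaultable Maybe where
  none = Nothing

instance Defaultable [] where
  none = []

-- Rank-2, like improve: the implementation instantiates f (here to Maybe),
-- so the argument must be polymorphic in f.
fromPoly :: (forall f. Defaultable f => f a) -> Maybe a
fromPoly x = x

-- Rank-1 for comparison: here the caller picks f, so the body could not
-- silently assume f is Maybe the way fromPoly does.
-- fromMono :: Defaultable f => f a -> Maybe a

ok :: Maybe Int
ok = fromPoly none                        -- accepted: none works at every Defaultable f
-- bad = fromPoly (Nothing :: Maybe Int)  -- rejected: this argument is already fixed to Maybe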

Related

How do I understand the set of valid inputs to a Haskell type constructor?

Warning: very beginner question.
I'm currently mired in the section on algebraic types in the Haskell book I'm reading, and I've come across the following example:
data Id a =
  MkId a deriving (Eq, Show)

idInt :: Id Integer
idInt = MkId 10

idIdentity :: Id (a -> a)
idIdentity = MkId $ \x -> x
OK, hold on. I don't fully understand the idIdentity example. The explanation in the book is that:
This is a little odd. The type Id takes an argument and the data constructor MkId takes an argument of the corresponding polymorphic type. So, in order to have a value of type Id Integer, we need to apply a -> Id a to an Integer value. This binds the a type variable to Integer and applies away the (->) in the type constructor, giving us Id Integer. We can also construct a MkId value that is an identity function by binding the a to a polymorphic function in both the type and the term level.
But wait. Why only fully polymorphic functions? My previous understanding was that a can be any type. But apparently a constrained polymorphic type doesn't work: (Num a) => a -> a won't work here, and the GHC error suggests that only completely polymorphic types or "qualified types" (not sure what those are) are valid:
f :: (Num a) => a -> a
f = undefined
idConsPoly :: Id (Num a) => a -> a
idConsPoly = MkId undefined
Illegal polymorphic or qualified type: Num a => a -> a
Perhaps you intended to use ImpredicativeTypes
In the type signature for ‘idIdentity’:
idIdentity :: Id (Num a => a -> a)
EDIT: I'm a bonehead. I wrote the type signature above incorrectly, as pointed out by @chepner in his answer below. This also resolves my confusion in the next sentence below...
In retrospect, this behavior makes sense because I haven't defined a Num instance for Id. But then what explains my being able to apply a type like Integer in idInt :: Id Integer?
So in general, I guess my question is: what exactly is the set of valid inputs to type constructors? Only fully polymorphic types? What are "qualified types" then? Etc...
You just have the type constructor in the wrong place. The following is fine:
idConsPoly :: Num a => Id (a -> a)
idConsPoly = MkId undefined
The type constructor Id here has kind * -> *, which means you can apply it to any type of kind * (which includes all "ordinary" types) to get back a new type of kind *. More generally, anything whose kind contains an arrow behaves like a function at the type level, and type constructors are just one example.
TypeProd is a ternary type constructor whose first two arguments have kind * -> *:
-- Based on :*: from Control.Compose
newtype TypeProd f g a = Prod { unProd :: (f a, g a) }
Either Int is a type expression of kind * -> *, but it is not itself a type constructor: it is the partial application of the type constructor Either to the nullary type constructor Int.
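For reference, a GHCi session (output approximate, from memory) that displays the kinds being discussed:
ghci> :kind Id
Id :: * -> *
ghci> :kind Either
Either :: * -> * -> *
ghci> :kind Either Int
Either Int :: * -> *
ghci> :kind TypeProd
TypeProd :: (* -> *) -> (* -> *) -> * -> *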
Also contributing to your confusion is that you've misinterpreted the error message from GHC. It means "Num a => a -> a is a polymorphic or qualified type, and therefore illegal (in this context)".
The last sentence that you quoted from the book is not very well worded, and maybe contributed to that misunderstanding. It's important to realize that in Id (a -> a) the argument a -> a is not a polymorphic type, but just an ordinary type that happens to mention a type variable. The thing which is polymorphic is idIdentity, which can have the type Id (a -> a) for any type a.
In standard Haskell polymorphism and qualification can only appear at the outermost level of the type in a type signature.
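Here is a small sketch of that last point, showing that idIdentity itself is polymorphic and each use site picks an ordinary type for a (atInt and applyId are names of my own):
data Id a = MkId a deriving (Eq, Show)

idIdentity :: Id (a -> a)
idIdentity = MkId (\x -> x)

atInt :: Id (Int -> Int)
atInt = idIdentity                 -- a is instantiated to Int at this use site

applyId :: Id (a -> b) -> a -> b
applyId (MkId f) = f

-- applyId atInt 3 == 3
-- applyId idIdentity "hi" == "hi" -- here a is instantiated to String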
Your type signature is almost correct. The following:
idConsPoly :: (Num a) => Id (a -> a)
should be right, though I have no GHC on my phone to test this.
Also, I think your question is quite broad, so I've deliberately answered only the concrete problem here.

Why is context reduction necessary?

I've just read this paper ("Type classes: exploring the design space" by Peyton Jones, Jones & Meijer), which explains some challenges with the early type class system of Haskell, and how to improve it.
Many of the issues that they raise are related to context reduction which is a way to reduce the set of constraints over instance and function declarations by following the "reverse entailment" relationship.
e.g. if you have somewhere instance (Ord a, Ord b) => Ord (a, b) ... then within contexts, Ord (a, b) gets reduced to {Ord a, Ord b} (reduction does not always shrink the number of constraints).
I did not understand from the paper why this reduction was necessary.
Well, I gathered it was used to perform some form of type checking. When you have your reduced set of constraint, you can check that there exist some instance that can satisfy them, otherwise it's an error. I'm not too sure what the added value of that is, since you would notice the problem at the use site, but okay.
But even if you have to do that check, why use the result of reduction inside inferred types? The paper points out it leads to unintuitive inferred types.
The paper is quite ancient (1997) but as far as I can tell, context reduction is still an ongoing concern. The Haskell 2010 spec does mention the inference behaviour I explain above (link).
So, why do it this way?
I don't know if this is The Reason, necessarily, but it might be considered A Reason: in early Haskell, type signatures were only permitted to have "simple" constraints, namely, a type class name applied to a type variable. Thus, for example, all of these were okay:
Ord a => a -> a -> Bool
Eq a => a -> a -> Bool
Graph gr => gr n e -> [n]
But none of these:
Ord (Tree a) => Tree a -> Tree a -> Bool
Eq (a -> b) => (a -> b) -> (a -> b) -> Bool
Graph Gr => Gr n e -> [n]
I think there was a feeling then -- and still today, as well -- that allowing the compiler to infer a type which one couldn't write manually would be a bit unfortunate. Context reduction was a way of turning the above signatures either into ones that could also be written by hand, or into an informative error. For example, since one might reasonably have
instance Ord a => Ord (Tree a)
in scope, we could turn the illegal signature Ord (Tree a) => ... into the legal signature Ord a => .... On the other hand, if we don't have any instance of Eq for functions in scope, one would report an error about the type which was inferred to require Eq (a -> b) in its context.
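As a small sketch of that reduction (with a Tree type of my own, not the one from any particular library):
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Eq, Ord)               -- yields instance Ord a => Ord (Tree a)

-- The body needs Eq (Tree a) and Ord (Tree a); context reduction turns those
-- into (Eq a, Ord a), which subsumption shrinks to the signature we would
-- have written by hand anyway:
sorted :: Ord a => [Tree a] -> Bool
sorted ts = and (zipWith (<=) ts (drop 1 ts))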
This has a couple of other benefits:
Intuitively pleasing. Many of the context reduction rules do not change whether the type is legal, but do reflect things humans would do when writing the type. I'm thinking here of the de-duplication and subsumption rules that let you turn, e.g. (Eq a, Eq a, Ord a) into just Ord a -- a transformation one definitely would want to do for readability.
This can frequently catch stupid errors; rather than inferring a type like Eq (Integer -> Integer) => Bool which can't be satisfied in a law-abiding way, one can report an error like Perhaps you did not apply a function to enough arguments?. Much friendlier!
It becomes the compiler's job to pinpoint what went wrong. Instead of inferring a complicated context like Eq (Tree (Grizwump a, [Flagle (Gr n e) (Gr n' e') c])) and complaining that the context is not satisfiable, it instead is forced to reduce this to the constituent constraints; it will instead complain that we couldn't determine Eq (Grizwump a) from the existing context -- a much more precise and actionable error.
I think this is indeed desirable in a dictionary passing implementation. In such an implementation, a "dictionary", that is, a tuple or record of functions, is passed as an implicit argument for every type class constraint in the type of the applied function.
Now, the question is simply when and how those dictionaries are created. Observe that for a simple type like Int, the dictionary for any type class that Int is an instance of will by necessity be a constant.
Not so in the case of parameterized types like lists, Maybe or tuples. It is clear that to show a tuple, for instance, the Show instances of the actual tuple elements need to be known. Hence such a polymorphic dictionary cannot be a constant.
It appears that the principle guiding the dictionary passing is such that only dictionaries for types that appear as type variables in the type of the applied function are passed. Or, to put it differently: no redundant information is replicated.
Consider this function:
f :: (Show a, Show b) => (a,b) -> Int
f ab = length (show ab)
The information that a tuple of showable components is itself showable is known statically, so a constraint like Show (a,b) need not appear when we already know (Show a, Show b).
An alternative implementation would be possible, though, where the caller would be responsible for creating and passing the dictionaries. This could work without context reduction, such that the type of f would look like:
f :: Show (a,b) => (a,b) -> Int
But this would mean that the code to create the tuple dictionary would have to be repeated on every call site. And it is easy to come up with examples where the number of necessary constraints actually increases, like in:
g :: (Show (a,a), Show (b,b), Show (a,b), Show (b,a)) => a -> b -> Int
g a b = maximum (map length [show (a,a), show (a,b), show (b,a), show (b,b)])
It is instructive to implement a type class/instance system with actual records that are explicitly passed. For example:
data Show' a = Show' { show' :: a -> String }

showInt :: Show' Int
showInt = Show' { show' = intshow }
  where
    intshow :: Int -> String
    intshow = show
Once you do this you will probably easily recognize the need for "context reduction".
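Continuing that sketch: the dictionary for a pair must be assembled from the dictionaries of its components, which is exactly the work context reduction lets the compiler do once, behind the scenes, instead of at every call site.
-- Build a Show' dictionary for pairs out of the component dictionaries.
showPair :: Show' a -> Show' b -> Show' (a, b)
showPair sa sb = Show' { show' = \(x, y) ->
  "(" ++ show' sa x ++ "," ++ show' sb y ++ ")" }

-- The explicit analogue of f :: (Show a, Show b) => (a, b) -> Int after
-- context reduction: only the component dictionaries are passed in.
f' :: Show' a -> Show' b -> (a, b) -> Int
f' sa sb ab = length (show' (showPair sa sb) ab)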

Relationship between Haskell's 'forall' and '=>'

I'm having trouble wrapping my mind around the relationship (and interactions) between Haskell's forall and => (and for that matter the . that often connects them).
For example
λ> :t (+)
λ> :t id
give
(+) :: forall a. Num a => a -> a -> a
id :: forall a. a -> a
and while I understand how these work in these specific cases, I'm not comfortable parsing the expressions (signatures?) forall a. Num a => or forall a. themselves into something meaningful, or that I can generally understand in more complex contexts.
What do forall a. Num a => and forall a. mean? Specifically, what is the roles played in each by forall, => and a?
(As another perspective, without invoking the "implicit dictionary passing" implementation of type classes):
forall a. in Haskell means "for every type a".1 It's introducing a type variable, and declaring that the rest of the type expression has to be valid whatever choice is made for a.
You usually don't see it in basic Haskell (without turning on any extensions in GHC), because it's not necessary; you just use type variables in your type signature, and GHC automatically assumes there are foralls introducing those variables at the start of the expression.
For example:
zip :: forall a. ( forall b. ( [a] -> [b] -> [(a, b)] ))
zip :: forall a. forall b. [a] -> [b] -> [(a, b)]
zip :: forall a b. [a] -> [b] -> [(a, b)]
zip :: [a] -> [b] -> [(a, b)]
The above are all the same; they just tell us that zip can be a way of zipping a list of a together with a list of b to make a list of (a, b) pairs, whatever choice we feel like making for a and b.
forall mainly comes into play with extensions, because then you can introduce type variables with scopes other than the default ones assumed by GHC if you don't explicitly write them.
Now, the constraints => type syntax can be read roughly as "these constraints imply this type", or "provided these constraints hold, you can use this type". It's used all the time, even in vanilla Haskell with no extensions, so it's important to understand what it means and how it works and not just copy and paste and hope.
The => arrow allows us to state a set of constraints on the variables in the rest of the type expression; it lets us put limitations on what choices can be made to introduce the type variable. You should read it first by ignoring everything left of the => arrow, and reading the right part on its own. This gives you the "shape" of the type. The stuff to the left of the => arrow tells you what kind of types you can use the rest of the type with.
An example:
(+) :: Num a => a -> a -> a
This means that (+) is exactly the same kind of thing as anything with a simpler type like a -> a -> a, except the Num a => is telling us that we're not free to choose just any type a. We can only choose a type for a when we know that it is a member of the Num type class (another, slightly more precise, way of saying "a is a member of Num" is "the constraint Num a holds").
Note that GHC is still assuming that there's an implicit forall a to introduce the type variable a here, so it really looks like:
(+) :: forall a. Num a => a -> a -> a
In which case you can read this off moderately easily as an English sentence once you know what forall a. and Num a => means: "For every type a, provided Num a holds, plus has the type a -> a -> a".
1 If you're familiar with formal logic at all, it's just an ASCII-friendly way of writing ∀a, a "universally quantified variable".
As the forall matter appears to be settled, I'll attempt to explain the => a bit. The things to the left of the => are arguments, much like ones to the left of a ->. But you don't apply these arguments manually, and they can only have specific types.
f :: Num a => a -> a
is a function that takes two arguments:
A Num a dictionary.
An a.
When you apply f, you just provide the a. GHC has to provide the Num a. If it's applied to a specific concrete type like Int, GHC knows Num Int and can supply it at the call site. Otherwise, it checks that Num a is provided by some outer context and uses that one. The great thing about Haskell's typeclass system is that it ensures that any two Num a dictionaries, however they are found, will be identical. So it doesn't matter where the dictionary comes from—it is sure to be the right one.
Further discussion
A lot of these things we're talking about aren't exactly part of Haskell so much as they're part of the way GHC interprets Haskell by translation to GHC core, AKA System FC, an extension of the very well-studied System F, AKA the Girard-Reynolds calculus. System FC is an explicitly typed polymorphic lambda calculus with algebraic datatypes, etc., but no type inference, no instance resolution, etc. After GHC checks the types in your Haskell code, it translates that code to System FC by a thoroughly mechanical process. It can do this confidently because the type checker "decorates" the code with all the information the desugarer needs to plumb all the dictionaries around. If you have a Haskell function that looks like
foo :: forall a . Num a => a -> a -> a
foo x y = x + y
then that will translate to something that looks like
foo :: forall a . Num a -> a -> a -> a
foo = /\ (a :: *) -> \ (d :: Num a) -> \ (x :: a) -> \ (y :: a) -> (+) @a d x y
The /\ is a type lambda -- it's just like a normal lambda except it takes a type variable. The @ represents application of a type to a function that takes one. The + is really just a record selector: it chooses the right field from the dictionary it's passed.
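For intuition, here is a hand-rolled sketch of that dictionary-passing shape, using a made-up NumDict record (much smaller than GHC's real Num dictionary):
-- A tiny stand-in for GHC's Num dictionary; the real one has more fields.
data NumDict a = NumDict
  { dictPlus  :: a -> a -> a
  , dictTimes :: a -> a -> a
  }

numDictInt :: NumDict Int
numDictInt = NumDict { dictPlus = (+), dictTimes = (*) }

-- Roughly what foo becomes: the dictionary is an ordinary first argument,
-- and (+) turns into a record selector applied to it.
fooDesugared :: NumDict a -> a -> a -> a
fooDesugared d x y = dictPlus d x y

-- fooDesugared numDictInt 2 3 == 5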
I suppose it helps if we add the implied parentheses:
(+) :: ∀ a . ( Num a => (a -> (a -> a)) )
id :: ∀ a . ( a -> a )
The ∀ always goes together with a . (dot). It's basically special syntax meaning "the names between ∀ and . are type variables that I want to introduce into the following scope"†
=> denotes what Idris calls an implicit function: Num a is a dictionary for the instance Num a, and such a dictionary is implicitly needed whenever you're adding numbers. But whether a is a type variable here that was previously introduced by some ∀, or a fixed type, doesn't really matter. You could also have
(+) :: Num Int => Int -> Int -> Int
That's just superfluous, because the compiler knows that Int is a Num instance and hence automatically (implicitly!) chooses the right dictionary.
Really, there's no particular relationship between ∀ and =>, they just happen to be used often together.
†Actually this is a type-level lambda. The type expression ∀ a . b behaves analogously to the value level expression \a -> b.

What are these explicit "forall"s doing?

What is the purpose of the foralls in this code?
class Monad m where
  (>>=) :: forall a b. m a -> (a -> m b) -> m b
  (>>)  :: forall a b. m a -> m b -> m b
  -- Explicit for-alls so that we know what order to
  -- give type arguments when desugaring
(some code omitted). This is from the code for Monads.
My background: I don't really understand forall or when Haskell has them implicitly.
Also, and it may not be significant, but GHCi allows me to omit the forall when giving >> a type:
Prelude> :t (>>) :: Monad m => m a -> m b -> m b
(>>) :: Monad m => m a -> m b -> m b
  :: (Monad m) => m a -> m b -> m b
(no error).
My background: I don't really understand forall or when Haskell has them implicitly.
Okay, consider the type of id, a -> a. What does a mean, and where does it come from? When you define a value, you can't just use arbitrary variables that aren't defined anywhere. You need a top-level definition, or a function argument, or a where clause, &c. In general, if you use a variable, it must be bound somewhere.
The same is true of type variables, and forall is one such way to bind a type variable. Anywhere you see a type variable that isn't explicitly bound (for example, class Foo a where ... binds a inside the class definition), it's implicitly bound by a forall.
So, the type of id is implicitly forall a. a -> a. What does this mean? Pretty much what it says. We can get a type a -> a for all possible types a, or from another perspective, if you pick any specific type you can get a type representing "functions from your chosen type to itself". The latter phrasing should sound a bit like defining a function, and as such you can think of forall as being similar to a lambda abstraction for types.
GHC uses various intermediate representations during compilation, and one of the transformations it applies is making the similarity to functions more direct: implicit foralls are made explicit, and anywhere a polymorphic value is used for a specific type, it is first applied to a type argument.
We can even write both foralls and lambdas as one expression. I'll abuse notation for a moment and replace forall a. with /\a => for visual consistency. In this style, we can define id = /\a => \(x::a) -> (x::a) or something similar. So, an expression like id True in your code would end up translated to something like id Bool True instead; just id True would no longer even make sense.
Just as you can reorder function arguments, you can likewise reorder the type arguments, subject only to the (rather obvious) restriction that type arguments must come before any value arguments of that type. Since implicit foralls are always the outermost layer, GHC could potentially choose any order it wanted when making them explicit. Under normal circumstances, this obviously doesn't matter.
I'm not sure exactly what's going on in this case, but based on the comments I would guess that the conversion to using explicit type arguments and the desugaring of do notation are, in some sense, not aware of each other, and therefore the order of type arguments is specified explicitly to ensure consistency. After all, if something is blindly applying two type arguments to an expression, it matters a great deal whether that expression's type is forall a b. m a -> m b -> m b or forall b a. m a -> m b -> m b!
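One way to actually observe that ordering today (an aside; the extension postdates this question) is TypeApplications, where type arguments are supplied in exactly the order of the foralls:
{-# LANGUAGE ScopedTypeVariables, TypeApplications #-}

pairUp :: forall a b. a -> b -> (a, b)
pairUp = (,)

swapped :: forall b a. a -> b -> (a, b)
swapped = (,)

ex1 :: (Int, Bool)
ex1 = pairUp @Int @Bool 1 True    -- first type argument fills a, second fills b

ex2 :: (Int, Bool)
ex2 = swapped @Bool @Int 1 True   -- same shape, but the quantifier order means
                                  -- Bool must be supplied first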

Polymorphic signature for non-polymorphic function: why not?

As an example, consider the trivial function
f :: (Integral b) => a -> b
f x = 3 :: Int
GHC complains that it cannot deduce (b ~ Int). The definition matches the signature in the sense that it returns something that is Integral (namely an Int). Why would/should GHC force me to use a more specific type signature?
Thanks
Type variables in Haskell are universally quantified, so Integral b => b doesn't just mean some Integral type, it means any Integral type. In other words, the caller gets to pick which concrete types should be used. Therefore, it is obviously a type error for the function to always return an Int when the type signature says I should be able to choose any Integral type, e.g. Integer or Word64.
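To see the "caller chooses" point in action, here is a correctly polymorphic variant (names of my own) together with two different instantiations of it:
-- Unlike the rejected f, this body works at every Integral b, because the
-- literal 3 is itself polymorphic (it is really fromInteger 3).
g :: Integral b => a -> b
g _ = 3

asInt :: Int
asInt = g "hello"       -- this caller picks b = Int

asInteger :: Integer
asInteger = g True      -- a different caller picks b = Integer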
There are extensions which allow you to use existentially quantified type variables, but they are more cumbersome to work with, since they require a wrapper type (in order to store the type class dictionary). Most of the time, it is best to avoid them. But if you did want to use existential types, it would look something like this:
{-# LANGUAGE ExistentialQuantification #-}
data SomeIntegral = forall a. Integral a => SomeIntegral a
f :: a -> SomeIntegral
f x = SomeIntegral (3 :: Int)
Code using this function would then have to be polymorphic enough to work with any Integral type. We also have to pattern match using case instead of let to keep GHC's brain from exploding.
> case f True of SomeIntegral x -> toInteger x
3
> :t toInteger
toInteger :: Integral a => a -> Integer
In the above example, you can think of x as having the type exists b. Integral b => b, i.e. some unknown Integral type.
The most general type of your function is
f :: a -> Int
With a type annotation, you can only demand that you want a more specific type, for example
f :: Bool -> Int
but you cannot declare a less specific type.
The Haskell type system does not allow you to make promises that are not warranted by your code.
As others have said, in Haskell if a function returns a result of type x, that means that the caller gets to decide what the actual type is. Not the function itself. In other words, the function must be able to return any possible type matching the signature.
This is different to most OOP languages, where a signature like this would mean that the function gets to choose what it returns. Apparently this confuses a few people...
