Why can't haskell infer this type? - haskell

I have a simple type class that I'm having trouble working around.
class Entity a where
eid :: a -> b
data Item = Item
{ itemId :: Int
, itemName :: String
}
instance Entity Item where
eid i = itemId i
this gives me a compile error
main.hs:12:11: error:
* Couldn't match expected type `b' with actual type `Int'
`b' is a rigid type variable bound by
the type signature for:
eid :: forall b. Item -> b
at main.hs:12:3-5
* In the expression: itemId i
In an equation for `eid': eid i = itemId i
In the instance declaration for `Entity Item'
* Relevant bindings include
eid :: Item -> b (bound at main.hs:12:3)
|
12 | eid i = itemId i
| ^^^^^^^^
I would think the compiler would be smart enough to know that for an Item I would like eid to return the Int type based on the info I've given but, I'm obviously missing something.

I would think the compiler would be smart enough to know that for an Item I would like eid to return the Int type based on the info I've given but, I'm obviously missing something.
Your signature eid :: a -> b promises that it can return a value for every type b. So it is not the instance that will decide what the return type is, it is the usage of the eid function that will determine this. So it means that one can use this as eid :: Entity a => a -> Integer for example, not per se Int.
If b fully depends on a, you can make use of functional dependencies [haskell-wiki]:
{-# LANGUAGE FunctionalDependencies, MultiParamTypeClasses #-}
class Entity a b | a -> b where
eid :: a -> b
instance Entity Item Int where
eid = itemId
You can also drop the | a -> b part if for a given a, there can be multiple bs.

In Haskell it is not you who can decide what concrete type b stands for, but instead the caller of your eid function. By providing your implementation eid i = itemId i you fix the return type to be the same what itemId returns – an Int. Now the caller of eid can’t choose the type for b anymore, so this is why you get the error. It’s not really a lack of inference.
In a pure language the only thing a function can see is its arguments (and global constants). Here it is an a that is visible, but not a b. This means there exists no non-evil implementation for your function. Your function must return any type b that I (the caller) choose.
This signature won’t work, so you’ll have to change it. For example you could ask for the caller to provide a b and have eid :: a -> b -> b. In this case I am allowed to choose the type for b but I also have the obligation to provide you with a value of that type. Now your pure function can see a b and is able to return one.
Alternatively you could also simply specify that you want to return an Int: eid :: a -> Int.
Or you say: „Hey, I want to have a function that I provide with an a and I want it to compute any b a caller may choose – so let him provide me a function that can 'extract' that b”: eid :: a -> (a -> b) -> b.
Now the caller will have the obligation to pass in something like itemId. This means you have delayed the decision about what function actually can produce that b up until the point where eid is getting called.

Related

Clarifying Data Constructor in Haskell

In the following:
data DataType a = Data a | Datum
I understand that Data Constructor are value level function. What we do above is defining their type. They can be function of multiple arity or const. That's fine. I'm ok with saying Datum construct Datum. What is not that explicit and clear to me here is somehow the difference between the constructor function and what it produce. Please let me know if i am getting it well:
1 - a) Basically writing Data a, is defining both a Data Structure and its Constructor function (as in scala or java usually the class and the constructor have the same name) ?
2 - b) So if i unpack and make an analogy. With Data a We are both defining a Structure(don't want to use class cause class imply a type already i think, but maybe we could) of object (Data Structure), the constructor function (Data Constructor/Value constructor), and the later return an object of that object Structure. Finally The type of that Structure of object is given by the Type constructor. An Object Structure in a sense is just a Tag surrounding a bunch value of some type. Is my understanding correct ?
3 - c) Can I formally Say:
Data Constructor that are Nullary represent constant values -> Return the the constant value itself of which the type is given by the Type Constructor at the definition site.
Data Constructor that takes an argument represent class of values, where class is a Tag ? -> Return an infinite number of object of that class, of which the type is given by the Type constructor at the definition site.
Another way of writing this:
data DataType a = Data a | Datum
Is with generalised algebraic data type (GADT) syntax, using the GADTSyntax extension, which lets us specify the types of the constructors explicitly:
{-# LANGUAGE GADTSyntax #-}
data DataType a where
Data :: a -> DataType a
Datum :: DataType a
(The GADTs extension would work too; it would also allow us to specify constructors with different type arguments in the result, like DataType Int vs. DataType Bool, but that’s a more advanced topic, and we don’t need that functionality here.)
These are exactly the types you would see in GHCi if you asked for the types of the constructor functions with :type / :t:
> :{
| data DataType a where
| Data :: a -> DataType a
| Datum :: DataType a
| :}
> :type Data
Data :: a -> DataType a
> :t Datum
Datum :: DataType a
With ExplicitForAll we can also specify the scope of the type variables explicitly, and make it clearer that the a in the data definition is a separate variable from the a in the constructor definitions by also giving them different names:
data DataType a where
Data :: forall b. b -> DataType b
Datum :: forall c. DataType c
Some more examples of this notation with standard prelude types:
data Either a b where
Left :: forall a b. a -> Either a b
Right :: forall a b. b -> Either a b
data Maybe a where
Nothing :: Maybe a
Just :: a -> Maybe a
data Bool where
False :: Bool
True :: Bool
data Ordering where
LT, EQ, GT :: Ordering -- Shorthand for repeated ‘:: Ordering’
I understand that Data Constructor are value level function. What we do above is defining their type. They can be function of multiple arity or const. That's fine. I'm ok with saying Datum construct Datum. What is not that explicit and clear to me here is somehow the difference between the constructor function and what it produce.
Datum and Data are both “constructors” of DataType a values; neither Datum nor Data is a type! These are just “tags” that select between the possible varieties of a DataType a value.
What is produced is always a value of type DataType a for a given a; the constructor selects which “shape” it takes.
A rough analogue of this is a union in languages like C or C++, plus an enumeration for the “tag”. In pseudocode:
enum Tag {
DataTag,
DatumTag,
}
// A single anonymous field.
struct DataFields<A> {
A field1;
}
// No fields.
struct DatumFields<A> {};
// A union of the possible field types.
union Fields<A> {
DataFields<A> data;
DatumFields<A> datum;
}
// A pair of a tag with the fields for that tag.
struct DataType<A> {
Tag tag;
Fields<A> fields;
}
The constructors are then just functions returning a value with the appropriate tag and fields. Pseudocode:
<A> DataType<A> newData(A x) {
DataType<A> result;
result.tag = DataTag;
result.fields.data.field1 = x;
return result;
}
<A> DataType<A> newDatum() {
DataType<A> result;
result.tag = DatumTag;
// No fields.
return result;
}
Unions are unsafe, since the tag and fields can get out of sync, but sum types are safe because they couple these together.
A pattern-match like this in Haskell:
case someDT of
Datum -> f
Data x -> g x
Is a combination of testing the tag and extracting the fields. Again, in pseudocode:
if (someDT.tag == DatumTag) {
f();
} else if (someDT.tag == DataTag) {
var x = someDT.fields.data.field1;
g(x);
}
Again this is coupled in Haskell to ensure that you can only ever access the fields if you have checked the tag by pattern-matching.
So, in answer to your questions:
1 - a) Basically writing Data a, is defining both a Data Structure and its Constructor function (as in scala or java usually the class and the constructor have the same name) ?
Data a in your original code is not defining a data structure, in that Data is not a separate type from DataType a, it’s just one of the possible tags that a DataType a value may have. Internally, a value of type DataType Int is one of the following:
The tag for Data (in GHC, a pointer to an “info table” for the constructor), and a reference to a value of type Int.
x = Data (1 :: Int) :: DataType Int
+----------+----------------+ +---------+----------------+
x ---->| Data tag | pointer to Int |---->| Int tag | unboxed Int# 1 |
+----------+----------------+ +---------+----------------+
The tag for Datum, and no other fields.
y = Datum :: DataType Int
+-----------+
y ----> | Datum tag |
+-----------+
In a language with unions, the size of a union is the maximum of all its alternatives, since the type must support representing any of the alternatives with mutation. In Haskell, since values are immutable, they don’t require any extra “padding” since they can’t be changed.
It’s a similar situation for standard data types, e.g., a product or sum type:
(x :: X, y :: Y) :: (X, Y)
+---------+--------------+--------------+
| (,) tag | pointer to X | pointer to Y |
+---------+--------------+--------------+
Left (m :: M) :: Either M N
+-----------+--------------+
| Left tag | pointer to M |
+-----------+--------------+
Right (n :: N) :: Either M N
+-----------+--------------+
| Right tag | pointer to N |
+-----------+--------------+
2 - b) So if i unpack and make an analogy. With Data a We are both defining a Structure(don't want to use class cause class imply a type already i think, but maybe we could) of object (Data Structure), the constructor function (Data Constructor/Value constructor), and the later return an object of that object Structure. Finally The type of that Structure of object is given by the Type constructor. An Object Structure in a sense is just a Tag surrounding a bunch value of some type. Is my understanding correct ?
This is sort of correct, but again, the constructors Data and Datum aren’t “data structures” by themselves. They’re just the names used to introduce (construct) and eliminate (match) values of type DataType a, for some type a that is chosen by the caller of the constructors to fill in the forall
data DataType a = Data a | Datum says:
If some term e has type T, then the term Data e has type DataType T
Inversely, if some value of type DataType T matches the pattern Data x, then x has type T in the scope of the match (case branch or function equation)
The term Datum has type DataType T for any type T
3 - c) Can I formally Say:
Data Constructor that are Nullary represent constant values -> Return the the constant value itself of which the type is given by the Type Constructor at the definition site.
Data Constructor that takes an argument represent class of values, where class is a Tag ? -> Return an infinite number of object of that class, of which the type is given by the Type constructor at the definition site.
Not exactly. A type constructor like DataType :: Type -> Type, Maybe :: Type -> Type, or Either :: Type -> Type -> Type, or [] :: Type -> Type (list), or a polymorphic data type, represents an “infinite” family of concrete types (Maybe Int, Maybe Char, Maybe (String -> String), …) but only in the same way that id :: forall a. a -> a represents an “infinite” family of functions (id :: Int -> Int, id :: Char -> Char, id :: String -> String, …).
That is, the type a here is a parameter filled in with an argument value given by the caller. Usually this is implicit, through type inference, but you can specify it explicitly with the TypeApplications extension:
-- Akin to: \ (a :: Type) -> \ (x :: a) -> x
id :: forall a. a -> a
id x = x
id #Int :: Int -> Int
id #Int 1 :: Int
Data :: forall a. a -> DataType a
Data #Char :: Char -> DataType Char
Data #Char 'x' :: DataType Char
The data constructors of each instantiation don’t really have anything to do with each other. There’s nothing in common between the instantiations Data :: Int -> DataType Int and Data :: Char -> DataType Char, apart from the fact that they share the same tag name.
Another way of thinking about this in Java terms is with the visitor pattern. DataType would be represented as a function that accepts a “DataType visitor”, and then the constructors don’t correspond to separate data types, they’re just the methods of the visitor which accept the fields and return some result. Writing the equivalent code in Java is a worthwhile exercise, but here it is in Haskell:
{-# LANGUAGE RankNTypes #-}
-- (Allows passing polymorphic functions as arguments.)
type DataType a
= forall r. -- A visitor with a generic result type
r -- With one “method” for the ‘Datum’ case (no fields)
-> (a -> r) -- And one for the ‘Data’ case (one field)
-> r -- Returning the result
newData :: a -> DataType a
newData field = \ _visitDatum visitData -> visitData field
newDatum :: DataType a
newDatum = \ visitDatum _visitData -> visitDatum
Pattern-matching is simply running the visitor:
matchDT :: DataType a -> b -> (a -> b) -> b
matchDT dt visitDatum visitData = dt visitDatum visitData
-- Or: matchDT dt = dt
-- Or: matchDT = id
-- case someDT of { Datum -> f; Data x -> g x }
-- f :: r
-- g :: a -> r
-- someDT :: DataType a
-- :: forall r. r -> (a -> r) -> r
someDT f (\ x -> g x)
Similarly, in Haskell, data constructors are just the ways of introducing and eliminating values of a user-defined type.
What is not that explicit and clear to me here is somehow the difference between the constructor function and what it produce
I'm having trouble following your question, but I think you are complicating things. I would suggest not thinking too deeply about the "constructor" terminology.
But hopefully the following helps:
Starting simple:
data DataType = Data Int | Datum
The above reads "Declare a new type named DataType, which has the possible values Datum or Data <some_number> (e.g. Data 42)"
So e.g. Datum is a value of type DataType.
Going back to your example with a type parameter, I want to point out what the syntax is doing:
data DataType a = Data a | Datum
^ ^ ^ These things appear in type signatures (type level)
^ ^ These things appear in code (value level stuff)
There's a bit of punning happening here. so in the data declaration you might see "Data Int" and this is mixing type-level and value-level stuff in a way that you wouldn't see in code. In code you'd see e.g. Data 42 or Data someVal.
I hope that helps a little...

MultiParamTypeClasses - Why is this type variable ambiguous?

Suppose I define a multi-parameter type class:
{-# LANGUAGE MultiParamTypeClasses, AllowAmbiguousTypes, FlexibleContexts, FlexibleInstances #-}
class Table a b c where
decrement :: a -> a
evalutate :: a -> b -> c
Then I define a function that uses decrement, for simplicity:
d = decrement
When I try to load this in ghci (version 8.6.3):
• Could not deduce (Table a b0 c0)
arising from a use of ‘decrement’
from the context: Table a b c
bound by the type signature for:
d :: forall a b c. Table a b c => a -> a
at Thing.hs:13:1-28
The type variables ‘b0’, ‘c0’ are ambiguous
Relevant bindings include d :: a -> a (bound at Thing.hs:14:1)
These potential instance exist:
instance Table (DummyTable a b) a b
This is confusing to me because the type of d is exactly the type of decrement, which is denoted in the class declaration.
I thought of the following workaround:
data Table a b = Table (a -> b) ((Table a b) -> (Table a b))
But this seems notationally inconvenient, and I also just wanted to know why I was getting this error message in the first place.
The problem is that, since decrement only requires the a type, there is no way to figure out which types b and c should be, even at the point where the function is called (thus solving the polymorphism into a specific type) - therefore, GHC would be unable to decide which instance to use.
For example: let's suppose you have two instances of Table: Table Int String Bool, and Table Int Bool Float; you call your function d in a context where it is supposed to map an Int to another Int - problem is, that matches both instances! (a is Int for both).
Notice how, if you make your function equal to evalutate:
d = evalutate
then the compiler accepts it. This is because, since evalutate depends on the three type parameters a, b, and c, the context at the call site would allow for non-ambiguous instance resolution - just check which are the types for a, b, and c at the place where it is called.
This is, of course, not usually a problem for single-parameter type classes - only one type to resolve; it is when we deal with multiple parameters that things get complicated...
One common solution is to use functional dependencies - make b and c depend on a:
class Table a b c | a -> b c where
decrement :: a -> a
evalutate :: a -> b -> c
This tells the compiler that, for every instance of Table for a given type a, there will be one, and only one, instance (b and c will be uniquely determined by a); so it will know that there won't be any ambiguities and accept your d = decrement happily.

How do I understand the set of valid inputs to a Haskell type constructor?

Warning: very beginner question.
I'm currently mired in the section on algebraic types in the Haskell book I'm reading, and I've come across the following example:
data Id a =
MkId a deriving (Eq, Show)
idInt :: Id Integer
idInt = MkId 10
idIdentity :: Id (a -> a)
idIdentity = MkId $ \x -> x
OK, hold on. I don't fully understand the idIdentity example. The explanation in the book is that:
This is a little odd. The type Id takes an argument and the data
constructor MkId takes an argument of the corresponding polymorphic
type. So, in order to have a value of type Id Integer, we need to
apply a -> Id a to an Integer value. This binds the a type variable to
Integer and applies away the (->) in the type constructor, giving us
Id Integer. We can also construct a MkId value that is an identity
function by binding the a to a polymorphic function in both the type
and the term level.
But wait. Why only fully polymorphic functions? My previous understanding was that a can be any type. But apparently constrained polymorphic type doesn't work: (Num a) => a -> a won't work here, and the GHC error suggests that only completely polymorphic types or "qualified types" (not sure what those are) are valid:
f :: (Num a) => a -> a
f = undefined
idConsPoly :: Id (Num a) => a -> a
idConsPoly = MkId undefined
Illegal polymorphic or qualified type: Num a => a -> a
Perhaps you intended to use ImpredicativeTypes
In the type signature for ‘idIdentity’:
idIdentity :: Id (Num a => a -> a)
EDIT: I'm a bonehead. I wrote the type signature below incorrectly, as pointed out by #chepner in his answer below. This also resolves my confusion in the next sentence below...
In retrospect, this behavior makes sense because I haven't defined a Num instance for Id. But then what explains me being able to apply a type like Integer in idInt :: Id Integer?
So in generality, I guess my question is: What specifically is the set of valid inputs to type constructors? Only fully polymorphic types? What are "qualified types" then? Etc...
You just have the type constructor in the wrong place. The following is fine:
idConsPoly :: Num a => Id (a -> a)
idConsPoly = MkId undefined
The type constructor Id here has kind * -> *, which means you can give it any value that has kind * (which includes all "ordinary" types) and returns a new value of kind *. In general, you are more concerned with arrow-kinded functions(?), of which type constructors are just one example.
TypeProd is a ternary type constructor whose first two arguments have kind * -> *:
-- Based on :*: from Control.Compose
newtype TypeProd f g a = Prod { unProd :: (f a, g a) }
Either Int is an expression whose value has kind * -> * but is not a type constructor, being the partial application of the type constructor Either to the nullary type constructor Int.
Also contributing to your confusion is that you've misinterpreted the error message from GHC. It means "Num a => a -> a is a polymorphic or qualified type, and therefore illegal (in this context)".
The last sentence that you quoted from the book is not very well worded, and maybe contributed to that misunderstanding. It's important to realize that in Id (a -> a) the argument a -> a is not a polymorphic type, but just an ordinary type that happens to mention a type variable. The thing which is polymorphic is idIdentity, which can have the type Id (a -> a) for any type a.
In standard Haskell polymorphism and qualification can only appear at the outermost level of the type in a type signature.
The type signature is almost correct
idConsPoly :: (Num a) => Id (a -> a)
Should be right, though i have no ghc on my phone to test this.
Also i think your question is quite broad, thus i deliberately answer only the concrete problem here.

Confused about Haskell polymorphic types

I have defined a function :
gen :: a -> b
So just trying to provide a simple implementation :
gen 2 = "test"
But throws error :
gen.hs:51:9:
Couldn't match expected type ‘b’ with actual type ‘[Char]’
‘b’ is a rigid type variable bound by
the type signature for gen :: a -> b at gen.hs:50:8
Relevant bindings include gen :: a -> b (bound at gen.hs:51:1)
In the expression: "test"
In an equation for ‘gen’: gen 2 = "test"
Failed, modules loaded: none.
So my function is not correct. Why is a not typed as Int and b not typed as String ?
This is a very common misunderstanding.
The key thing to understand is that if you have a variable in your type signature, then the caller gets to decide what type that is, not you!
So you cannot say "this function returns type x" and then just return a String; your function actually has to be able to return any possible type that the caller may ask for. If I ask your function to return an Int, it has to return an Int. If I ask it to return a Bool, it has to return a Bool.
Your function claims to be able to return any possible type, but actually it only ever returns String. So it doesn't do what the type signature claims it does. Hence, a compile-time error.
A lot of people apparently misunderstand this. In (say) Java, you can say "this function returns Object", and then your function can return anything it wants. So the function decides what type it returns. In Haskell, the caller gets to decide what type is returned, not the function.
Edit: Note that the type you're written, a -> b, is impossible. No function can ever have this type. There's no way a function can construct a value of type b out of thin air. The only way this can work is if some of the inputs also involve type b, or if b belongs to some kind of typeclass which allows value construction.
For example:
head :: [x] -> x
The return type here is x ("any possible type"), but the input type also mentions x, so this function is possible; you just have to return one of the values that was in the original list.
Similarly, gen :: a -> a is a perfectly valid function. But the only thing it can do is return it's input unchanged (i.e., what the id function does).
This property of type signatures telling you what a function does is a very useful and powerful property of Haskell.
gen :: a -> b does not mean "for some type a and some type b, foo must be of type a -> b", it means "for any type a and any type b, foo must be of type a -> b".
to motivate this: If the type checker sees something like let x :: Int = gen "hello", it sees that gen is used as String -> Int here and then looks at gen's type to see whether it can be used that way. The type is a -> b, which can be specialized to String -> Int, so the type checker decides that this is fine and allows this call. That is since the function is declared to have type a -> b, the type checker allows you to call the function with any type you want and allows you to use the result as any type you want.
However that clearly does not match the definition you gave the function. The function knows how to handle numbers as arguments - nothing else. And likewise it knows how to produce strings as its result - nothing else. So clearly it should not be possible to call the function with a string as its argument or to use the function's result as an Int. So since the type a -> b would allow that, it's clearly the wrong type for that function.
Your type signature gen :: a -> b is stating, that your function can work for any type a (and provide any type b the caller of the function demands).
Besides the fact that such a function is hard to come by, the line gen 2 = "test" tries to return a String which very well may not be what the caller demands.
Excellent answers. Given your profile, however, you seem to know Java, so I think it's valuable to connect this to Java as well.
Java offers two kinds of polymorphism:
Subtype polymorphism: e.g., every type is a subtype of java.lang.Object
Generic polymorphism: e.g., in the List<T> interface.
Haskell's type variables are a version of (2). Haskell doesn't really have a version of (1).
One way to think of generic polymorphism is in terms of templates (which is what C++ people call them): a type that has a type variable parameter is a template that can be specialized into a variety of monomorphic types. So for example, the interface List<T> is a template for constructing monomorphic interfaces like List<String>, List<List<String>> and so on, all of which have the same structure but differ only because the type variable T gets replaced uniformly throughout the signatures with the instantiation type.
The concept that "the caller chooses" that several responders have mentioned here is basically a friendly way of referring to instantiation. In Java, for example, the most common point where the type variable gets "chosen" is when an object is instantiated:
List<String> myList = new ArrayList<String>();
Second common point is that a subtype of a generic type may instantiate the supertype's variables:
class MyFunction implements Function<Integer, String> {
public String apply(Integer i) { ... }
}
Third one is methods that allow the caller to instantiate a variable that's not a parameter of its enclosing type:
/**
* Visitor-pattern style interface for a simple arithmetical language
* abstract syntax tree.
*/
interface Expression {
// The caller of `accept` implicitly chooses which type `R` is,
// by supplying a `Visitor<R>` with `R` instantiated to something
// of its choice.
<R> accept(Expression.Visitor<R> visitor);
static interface Visitor<R> {
R constant(int i);
R add(Expression a, Expression b);
R multiply(Expression a, Expression b);
}
}
In Haskell, instantiation is carried out implicitly by the type inference algorithm. In any expression where you use gen :: a -> b, type inference will infer what types need to be instantiated for a and b, given the context in which gen is used. So basically, "caller chooses" means that any code that uses gen controls the types to which a and b will be instantiated; if I write gen [()], then I'm implicitly instantiating a to [()]. The error here means that your type declaration says that gen [()] is allowed, but your equation gen 2 = "test" implies that it's not.
In Haskell, type variables are implicitly quantified, but we can make this explicit:
{-# LANGUAGE ScopedTypeVariables #-}
gen :: forall a b . a -> b
gen x = ????
The "forall" is really just a type level version of a lambda, often written Λ. So gen is a function taking three arguments: a type, bound to the name a, another type, bound to the name b, and a value of type a, bound to the name x. When your function is called, it is called with those three arguments. Consider a saner case:
fst :: (a,b) -> a
fst (x1,x2) = x1
This gets translated to
fst :: forall (a::*) (b::*) . (a,b) -> a
fst = /\ (a::*) -> /\ (b::*) -> \ (x::(a,b)) ->
case x of
(x1, x2) -> x1
where * is the type (often called a kind) of normal concrete types. If I call fst (3::Int, 'x'), that gets translated into
fst Int Char (3Int, 'x')
where I use 3Int to represent specifically the Int version of 3. We could then calculate it as follows:
fst Int Char (3Int, 'x')
=
(/\ (a::*) -> /\ (b::*) -> \(x::(a,b)) -> case x of (x1,x2) -> x1) Int Char (3Int, 'x')
=
(/\ (b::*) -> \(x::(Int,b)) -> case x of (x1,x2) -> x1) Char (3Int, 'x')
=
(\(x::(Int,Char)) -> case x of (x1,x2) -> x1) (3Int, x)
=
case (3Int,x) of (x1,x2) -> x1
=
3Int
Whatever types I pass in, as long as the value I pass in matches, the fst function will be able to produce something of the required type. If you try to do this for a->b, you will get stuck.

What is the purpose of Rank2Types?

I am not really proficient in Haskell, so this might be a very easy question.
What language limitation do Rank2Types solve? Don't functions in Haskell already support polymorphic arguments?
It's hard to understand higher-rank polymorphism unless you study System F directly, because Haskell is designed to hide the details of that from you in the interest of simplicity.
But basically, the rough idea is that polymorphic types don't really have the a -> b form that they do in Haskell; in reality, they look like this, always with explicit quantifiers:
id :: ∀a.a → a
id = Λt.λx:t.x
If you don't know the "∀" symbol, it's read as "for all"; ∀x.dog(x) means "for all x, x is a dog." "Λ" is capital lambda, used for abstracting over type parameters; what the second line says is that id is a function that takes a type t, and then returns a function that's parametrized by that type.
You see, in System F, you can't just apply a function like that id to a value right away; first you need to apply the Λ-function to a type in order to get a λ-function that you apply to a value. So for example:
(Λt.λx:t.x) Int 5 = (λx:Int.x) 5
= 5
Standard Haskell (i.e., Haskell 98 and 2010) simplifies this for you by not having any of these type quantifiers, capital lambdas and type applications, but behind the scenes GHC puts them in when it analyzes the program for compilation. (This is all compile-time stuff, I believe, with no runtime overhead.)
But Haskell's automatic handling of this means that it assumes that "∀" never appears on the left-hand branch of a function ("→") type. Rank2Types and RankNTypes turn off those restrictions and allow you to override Haskell's default rules for where to insert forall.
Why would you want to do this? Because the full, unrestricted System F is hella powerful, and it can do a lot of cool stuff. For example, type hiding and modularity can be implemented using higher-rank types. Take for example a plain old function of the following rank-1 type (to set the scene):
f :: ∀r.∀a.((a → r) → a → r) → r
To use f, the caller first must choose what types to use for r and a, then supply an argument of the resulting type. So you could pick r = Int and a = String:
f Int String :: ((String → Int) → String → Int) → Int
But now compare that to the following higher-rank type:
f' :: ∀r.(∀a.(a → r) → a → r) → r
How does a function of this type work? Well, to use it, first you specify which type to use for r. Say we pick Int:
f' Int :: (∀a.(a → Int) → a → Int) → Int
But now the ∀a is inside the function arrow, so you can't pick what type to use for a; you must apply f' Int to a Λ-function of the appropriate type. This means that the implementation of f' gets to pick what type to use for a, not the caller of f'. Without higher-rank types, on the contrary, the caller always picks the types.
What is this useful for? Well, for many things actually, but one idea is that you can use this to model things like object-oriented programming, where "objects" bundle some hidden data together with some methods that work on the hidden data. So for example, an object with two methods—one that returns an Int and another that returns a String, could be implemented with this type:
myObject :: ∀r.(∀a.(a → Int, a -> String) → a → r) → r
How does this work? The object is implemented as a function that has some internal data of hidden type a. To actually use the object, its clients pass in a "callback" function that the object will call with the two methods. For example:
myObject String (Λa. λ(length, name):(a → Int, a → String). λobjData:a. name objData)
Here we are, basically, invoking the object's second method, the one whose type is a → String for an unknown a. Well, unknown to myObject's clients; but these clients do know, from the signature, that they will be able to apply either of the two functions to it, and get either an Int or a String.
For an actual Haskell example, below is the code that I wrote when I taught myself RankNTypes. This implements a type called ShowBox which bundles together a value of some hidden type together with its Show class instance. Note that in the example at the bottom, I make a list of ShowBox whose first element was made from a number, and the second from a string. Since the types are hidden by using the higher-rank types, this doesn't violate type checking.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ImpredicativeTypes #-}
type ShowBox = forall b. (forall a. Show a => a -> b) -> b
mkShowBox :: Show a => a -> ShowBox
mkShowBox x = \k -> k x
-- | This is the key function for using a 'ShowBox'. You pass in
-- a function #k# that will be applied to the contents of the
-- ShowBox. But you don't pick the type of #k#'s argument--the
-- ShowBox does. However, it's restricted to picking a type that
-- implements #Show#, so you know that whatever type it picks, you
-- can use the 'show' function.
runShowBox :: forall b. (forall a. Show a => a -> b) -> ShowBox -> b
-- Expanded type:
--
-- runShowBox
-- :: forall b. (forall a. Show a => a -> b)
-- -> (forall b. (forall a. Show a => a -> b) -> b)
-- -> b
--
runShowBox k box = box k
example :: [ShowBox]
-- example :: [ShowBox] expands to this:
--
-- example :: [forall b. (forall a. Show a => a -> b) -> b]
--
-- Without the annotation the compiler infers the following, which
-- breaks in the definition of 'result' below:
--
-- example :: forall b. [(forall a. Show a => a -> b) -> b]
--
example = [mkShowBox 5, mkShowBox "foo"]
result :: [String]
result = map (runShowBox show) example
PS: for anybody reading this who's wondered how come ExistentialTypes in GHC uses forall, I believe the reason is because it's using this sort of technique behind the scenes.
Do not functions in Haskell already support polymorphic arguments?
They do, but only of rank 1. This means that while you can write a function that takes different types of arguments without this extension, you can't write a function that uses its argument as different types in the same invocation.
For example the following function can't be typed without this extension because g is used with different argument types in the definition of f:
f g = g 1 + g "lala"
Note that it's perfectly possible to pass a polymorphic function as an argument to another function. So something like map id ["a","b","c"] is perfectly legal. But the function may only use it as monomorphic. In the example map uses id as if it had type String -> String. And of course you can also pass a simple monomorphic function of the given type instead of id. Without rank2types there is no way for a function to require that its argument must be a polymorphic function and thus also no way to use it as a polymorphic function.
Luis Casillas's answer gives a lot of great info about what rank 2 types mean, but I'll just expand on one point he didn't cover. Requiring an argument to be polymorphic doesn't just allow it to be used with multiple types; it also restricts what that function can do with its argument(s) and how it can produce its result. That is, it gives the caller less flexibility. Why would you want to do that? I'll start with a simple example:
Suppose we have a data type
data Country = BigEnemy | MediumEnemy | PunyEnemy | TradePartner | Ally | BestAlly
and we want to write a function
f g = launchMissilesAt $ g [BigEnemy, MediumEnemy, PunyEnemy]
that takes a function that's supposed to choose one of the elements of the list it's given and return an IO action launching missiles at that target. We could give f a simple type:
f :: ([Country] -> Country) -> IO ()
The problem is that we could accidentally run
f (\_ -> BestAlly)
and then we'd be in big trouble! Giving f a rank 1 polymorphic type
f :: ([a] -> a) -> IO ()
doesn't help at all, because we choose the type a when we call f, and we just specialize it to Country and use our malicious \_ -> BestAlly again. The solution is to use a rank 2 type:
f :: (forall a . [a] -> a) -> IO ()
Now the function we pass in is required to be polymorphic, so \_ -> BestAlly won't type check! In fact, no function returning an element not in the list it is given will typecheck (although some functions that go into infinite loops or produce errors and therefore never return will do so).
The above is contrived, of course, but a variation on this technique is key to making the ST monad safe.
Higher-rank types aren't as exotic as the other answers have made out. Believe it or not, many object-oriented languages (including Java and C#!) feature them. (Of course, no one in those communities knows them by the scary-sounding name "higher-rank types".)
The example I'm going to give is a textbook implementation of the Visitor pattern, which I use all the time in my daily work. This answer is not intended as an introduction to the visitor pattern; that knowledge is readily available elsewhere.
In this fatuous imaginary HR application, we wish to operate on employees who may be full-time permanent staff or temporary contractors. My preferred variant of the Visitor pattern (and indeed the one which is relevant to RankNTypes) parameterises the visitor's return type.
interface IEmployeeVisitor<T>
{
T Visit(PermanentEmployee e);
T Visit(Contractor c);
}
class XmlVisitor : IEmployeeVisitor<string> { /* ... */ }
class PaymentCalculator : IEmployeeVisitor<int> { /* ... */ }
The point is that a number of visitors with different return types can all operate on the same data. This means IEmployee must express no opinion as to what T ought to be.
interface IEmployee
{
T Accept<T>(IEmployeeVisitor<T> v);
}
class PermanentEmployee : IEmployee
{
// ...
public T Accept<T>(IEmployeeVisitor<T> v)
{
return v.Visit(this);
}
}
class Contractor : IEmployee
{
// ...
public T Accept<T>(IEmployeeVisitor<T> v)
{
return v.Visit(this);
}
}
I wish to draw your attention to the types. Observe that IEmployeeVisitor universally quantifies its return type, whereas IEmployee quantifies it inside its Accept method - that is to say, at a higher rank. Translating clunkily from C# to Haskell:
data IEmployeeVisitor r = IEmployeeVisitor {
visitPermanent :: PermanentEmployee -> r,
visitContractor :: Contractor -> r
}
newtype IEmployee = IEmployee {
accept :: forall r. IEmployeeVisitor r -> r
}
So there you have it. Higher-rank types show up in C# when you write types containing generic methods.
For those familiar with object oriented languages, a higher-rank function is simply a generic function that expects as its argument another generic function.
E.g. in TypeScript you could write:
type WithId<T> = T & { id: number }
type Identifier = <T>(obj: T) => WithId<T>
type Identify = <TObj>(obj: TObj, f: Identifier) => WithId<TObj>
See how the generic function type Identify demands a generic function of the type Identifier? This makes Identify a higher-rank function.

Resources