Haskell function returning existential type - haskell

Is it possible to write a Haskell function that yields a parameterised type where the exact type parameter is hidden? I.e. something like f :: T -> (exists a. U a)? The obvious attempt:
{-# LANGUAGE ExistentialQuantification #-}
data D a = D a
data Wrap = forall a. Wrap (D a)
unwrap :: Wrap -> D a
unwrap (Wrap d) = d
fails to compile with:
Couldn't match type `a1' with `a'
`a1' is a rigid type variable bound by
a pattern with constructor
Wrap :: forall a. D a -> Wrap,
in an equation for `unwrap'
at test.hs:8:9
`a' is a rigid type variable bound by
the type signature for unwrap :: Wrap -> D a at test.hs:7:11
Expected type: D a
Actual type: D a1
In the expression: d
In an equation for `unwrap': unwrap (Wrap d) = d
I know this is a contrived example, but I'm curious if there is a way to convince GHC that I do not care for the exact type with which D is parameterised, without introducing another existential wrapper type for the result of unwrap.
To clarify, I do want type safety, but also would like to be able to apply a function dToString :: D a -> String that does not care about a (e.g. because it just extracts a String field from D) to the result of unwrap. I realise there are other ways of achieving it (e.g. defining wrapToString (Wrap d) = dToString d) but I'm more interested in whether there is a fundamental reason why such hiding under existential is not permitted.

Yes, you can, but not in a straightforward way.
{-# LANGUAGE ExistentialQuantification #-}
{-# LANGUAGE RankNTypes #-}
data D a = D a
data Wrap = forall a. Wrap (D a)
unwrap :: Wrap -> forall r. (forall a. D a -> r) -> r
unwrap (Wrap x) k = k x
test :: D a -> IO ()
test (D a) = putStrLn "Got a D something"
main = unwrap (Wrap (D 5)) test
You cannot return a D something_unknown from your function, but you can extract it and immediately pass it to another function that accepts D a, as shown.

Yes, you can convince GHC that you do not care for the exact type with which D is parameterised. Just, it's a horrible idea.
{-# LANGUAGE GADTs #-}
import Unsafe.Coerce
data D a = D a deriving (Show)
data Wrap where -- this GADT is equivalent to your `ExistentialQuantification` version
Wrap :: D a -> Wrap
unwrap :: Wrap -> D a
unwrap (Wrap (D a)) = D (unsafeCoerce a)
main = print (unwrap (Wrap $ D "bla") :: D Integer)
This is what happens when I execute that simple program:
and so on, until memory consumption brings down the system.
Types are important! If you circumvent the type system, you circumvent any predictability of your program (i.e. anything can happen, including thermonuclear war or the famous demons flying out of your nose).
Now, evidently you thought that types somehow work differently. In dynamic languages such as Python, and also to a degree in OO languages like Java, a type is in a sense a property that a value can have. So, (reference-) values don't just carry around the information needed to distinguish different values of a single type, but also information to distinguish different (sub-)types. That's in many senses rather inefficient – it's a major reason why Python is so slow and Java needs such a huge VM.
In Haskell, types don't exist at runtime. A function never knows what type the values have it's working with. Only because the compiler knows all about the types it will have, the function doesn't need any such knowledge – the compiler has already hard-coded it! (That is, unless you circumvent it with unsafeCoerce, which as I demonstrated is as unsafe as it sounds.)
If you do want to attach the type as a “property” to a value, you need to do it explicitly, and that's what those existential wrappers are there for. However, there are usually better ways to do it in a functional language. What's really the application you wanted this for?
Perhaps it's also helpful to recall what a signature with polymorphic result means. unwrap :: Wrap -> D a doesn't mean “the result is some D a... and the caller better don't care for the a used”. That would be the case in Java, but it would be rather useless in Haskell because there's nothing you can do with a value of unknown type.
Instead it means: for whatever type a the caller requests, this function is able to supply a suitable D a value. Of course this is tough to deliver – without extra information it's just as impossible as doing anything with a value of given unknown type. But if there are already a values in the arguments, or a is somehow constrained to a type class (e.g. fromInteger :: Num a => Integer -> a, then it's quite possible and very useful.
To obtain a String field – independent of the a parameter – you can just operate directly on the wrapped value:
data D a = D
{ dLabel :: String
, dValue :: a
}
data Wrap where Wrap :: D a -> Wrap
labelFromWrap :: Wrap -> String
labelFromWrap (Wrap (D l _)) = l
To write such functions on Wrap more generically (with any “ label accesor that doesn't care about a”), use Rank2-polymorphism as shown in n.m.'s answer.

Related

Subset algebraic data type, or type-level set, in Haskell

Suppose you have a large number of types and a large number of functions that each return "subsets" of these types.
Let's use a small example to make the situation more explicit. Here's a simple algebraic data type:
data T = A | B | C
and there are two functions f, g that return a T
f :: T
g :: T
For the situation at hand, assume it is important that f can only return a A or B and g can only return a B or C.
I would like to encode this in the type system. Here are a few reasons/circumstances why this might be desirable:
Let the functions f and g have a more informative signature than just ::T
Enforce that implementations of f and g do not accidentally return a forbidden type that users of the implementation then accidentally use
Allow code reuse, e.g. when helper functions are involved that only operate on subsets of type T
Avoid boilerplate code (see below)
Make refactoring (much!) easier
One way to do this is to split up the algebraic datatype and wrap the individual types as needed:
data A = A
data B = B
data C = C
data Retf = RetfA A | RetfB B
data Retg = RetgB B | RetgC C
f :: Retf
g :: Retg
This works, and is easy to understand, but carries a lot of boilerplate for frequent unwrapping of the return types Retf and Retg.
I don't see polymorphism being of any help, here.
So, probably, this is a case for dependent types. It's not really a type-level list, rather a type-level set, but I've never seen a type-level set.
The goal, in the end, is to encode the domain knowledge via the types, so that compile-time checks are available, without having excessive boilerplate. (The boilerplate gets really annoying when there are lots of types and lots of functions.)
Define an auxiliary sum type (to be used as a data kind) where each branch corresponds to a version of your main type:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE DataKinds #-}
import Data.Kind
import Data.Void
import GHC.TypeLits
data Version = AllEnabled | SomeDisabled
Then define a type family that maps the version and the constructor name (given as a type-level Symbol) to the type () if that branch is allowed, and to the empty type Void if it's disallowed.
type Enabled :: Version -> Symbol -> Type
type family Enabled v ctor where
Enabled SomeDisabled "C" = Void
Enabled _ _ = ()
Then define your type as follows:
type T :: Version -> Type
data T v = A !(Enabled v "A")
| B !(Enabled v "B")
| C !(Enabled v "C")
(The strictness annotations are there to help the exhaustivity checker.)
Typeclass instances can be derived, but separately for each version:
deriving instance Show (T AllEnabled)
deriving instance Eq (T AllEnabled)
deriving instance Show (T SomeDisabled)
deriving instance Eq (T SomeDisabled)
Here's an example of use:
noC :: T SomeDisabled
noC = A ()
main :: IO ()
main = print $ case noC of
A _ -> "A"
B _ -> "B"
-- this doesn't give a warning with -Wincomplete-patterns
This solution makes pattern-matching and construction more cumbersome, because those () are always there.
A variation is to have one type family per branch (as in Trees that Grow) instead of a two-parameter type family.
I tried to achieve something like this in the past, but without much success -- I was not too satisfied with my solution.
Still, one can use GADTs to encode this constraint:
data TagA = IsA | NotA
data TagC = IsC | NotC
data T (ta :: TagA) (tc :: TagC) where
A :: T 'IsA 'NotC
B :: T 'NotA 'NotC
C :: T 'NotA 'IsC
-- existential wrappers
data TnotC where TnotC :: T ta 'NotC -> TnotC
data TnotA where TnotA :: T 'NotA tc -> TnotA
f :: TnotC
g :: TnotA
This however gets boring fast, because of the wrapping/unwrapping of the exponentials. Consumer functions are more convenient since we can write
giveMeNotAnA :: T 'NotA tc -> Int
to require anything but an A. Producer functions instead need to use existentials.
In a type with many constructors, it also gets inconvenient since we have to use a GADT with many tags/parameters. Maybe this can be streamlined with some clever typeclass machinery.
Giving each individual value its own type scales extremely badly, and is quite unnecessarily fine-grained.
What you probably want is just restrict the types by some property on their values. In e.g. Coq, that would be a subset type:
Inductive T: Type :=
| A
| B
| C.
Definition Retf: Type := { x: T | x<>C }.
Definition Retg: Type := { x: T | x<>A }.
Well, Haskell has no way of expressing such value constraints, but that doesn't stop you from creating types that conceptually fulfill them. Just use newtypes:
newtype Retf = Retf { getRetf :: T }
mkRetf :: T -> Maybe Retf
mkRetf C = Nothing
mkRetf x = Retf x
newtype Retg = Retg { getRetg :: T }
mkRetg :: ...
Then in the implementation of f, you match for the final result of mkRetf and raise an error if it's Nothing. That way, an implementation mistake that makes it give a C will unfortunately not give a compilation error, but at least a runtime error from within the function that's actually at fault, rather than somewhere further down the line.
An alternative that might be ideal for you is Liquid Haskell, which does support subset types. I can't say too much about it, but it's supposedly pretty good (and will in new GHC versions have direct support).

Overlapping instances - how to circumvent them

I am a beginner at Haskell, so please be indulgent. For reasons that are not important here, I am trying to define a operator <^> that takes a function and an argument and returns the value of the function by the argument, irrespective of which of the function and the argument came first. In short, I would like to be able to write the following:
foo :: Int -> Int
foo x = x * x
arg :: Int
arg = 2
foo <^> arg -- valid, returns 4
arg <^> foo -- valid, returns 4
I have tried to accomplish that through type families, as follows:
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, TypeFamilies, TypeOperators #-}
class Combine t1 t2 where
type Output t1 t2 :: *
(<^>) :: t1 -> t2 -> Output t1 t2
instance Combine (a->b) a where
type Output (a->b) a = b
f <^> x = f x
instance Combine a (a->b) where
type Output a a->b = b
x <^> f = f x
On this code, GHC throws a Conflicting family instance declarations. My guess is that the overlap GHC complains about occurs when type a->b and type a are the same. I don't know Haskell well enough, but I suspect that with recursive type definitions, one may be able to construct such a situation. I have a couple of questions:
Since this is a rather remote scenario that will never occur in my application (in particular not with foo and arg above), I was wondering if there was a way of specifying a dummy default instance to use in case of overlap? I have tried the different OVERLAPS and OVERLAPPING flags, but they didn't have any effect.
If not, is there a better way of achieving what I want?
Thanks!
This is a bad idea, in my view, but I'll play along.
A possible solution is to switch to functional dependencies. Usually I tend to avoid fundeps in favor of type families, but here they make the instances compile in a simple way.
class Combine t1 t2 r | t1 t2 -> r where
(<^>) :: t1 -> t2 -> r
instance Combine (a->b) a b where
f <^> x = f x
instance Combine a (a->b) b where
x <^> f = f x
Note that this class will likely cause problems during type inference if we use polymorphic functions. This is because, with polymorphic functions, the code can easily become ambiguous.
For instance id <^> id could pick any of the two instances. Above, melpomene already reported const <^> id being ambiguous as well.
The following is weakly related, but I want to share it anyway:
What about type families instead? I tried to experiment a bit, and I just discovered a limitation which I did not know. Consider the closed type family
type family Output a b where
Output (a->b) a = b
Output a (a->b) = b
The code above compiles, but then the type Output a (a->b) is stuck. The second equation does not get applied, as if the first one could potentially match.
Usually, I can understand this in some other scenarios, but here unifying
Output (a' -> b') b' ~ Output a (a -> b)
seems to fail since we would need a ~ (a' -> b') ~ (a' -> a -> b) which is impossible, with finite types. For some reason, GHC does not use this argument (does it pretend infinite types exist in this check? why?)
Anyway, this makes replacing fundeps with type families harder than it could be, it seems. I have no idea about why GHC accepts the fundeps code I posted, yet refuses the OP's code which is essentially the same thing, except using type families.
#chi is close; an approach using either FunDeps or Closed Type Families is possible. But the Combine instances are potentially ambiguous/unifiable just as much as the CTF Output equations.
When chi says the FunDep code is accepted, that's only half-true: GHC plain leads you down the garden path. It will accept the instances but then you find you can't use them/you get weird error messages. See the Users Guide at "potential for overlap".
If you're looking to resolve a potentially ambiguous Combine constraint, you might get an error suggesting you try IncoherentInstances (or INCOHERENT pragma). Don't do that. You have a genuinely incoherent problem; all that will do is defer the problem to somewhere else. It's always possible to avoid Incoherent -- providing you can rejig your instances (as follows) and they're not locked away in libraries.
Notice that because of the potential ambiguity, another Haskell compiler (Hugs) doesn't let you write Combine like that. It has a more correct implementation of Haskell's (not-well-stated) rules.
The answer is to use a sort of overlap where one instance is strictly more specific. You must first decide which you way you want to prefer in case of ambiguity. I'll choose function prefixed to argument:
{-# LANGUAGE UndecidableInstances, TypeFamilies #-}
instance {-# OVERLAPPING #-} (r ~ b)
=> Combine (a->b) a r where ...
instance {-# OVERLAPPABLE #-} (Combine2 t1 t2 r)
=> Combine t1 t2 r where
(<^>) = revApp
class Combine2 t1 t2 r | t1 t2 -> r where
revApp :: t1 -> t2 -> r
instance (b ~ r) => Combine2 a (a->b) r where
revApp x f = f x
Notice that the OVERLAPPABLE instance for Combine has bare tyvars, it's a catch-all so it's always matchable. All the compiler has to do is decide whether some wanted constraint is of the form of the OVERLAPPING instance.
The Combine2 constraint on the OVERLAPPABLE instance is no smaller than the head, so you need UndecidableInstances. Also beware that deferring to Combine2 will mean that if the compiler still can't resolve, you're likely to get puzzling error messages.
Talking of bare tyvars/"always matchable", I've used an additional trick to make the compiler work really hard to improve the types: There's bare r in the head of the instance, with an Equality type improvement constraint (b ~ r) =>. To use the ~, you need to switch on TypeFamilies even though you're not writing any type families.
A CTF approach would be similar. You need a catch-all equation on Output that calls an auxiliary type function. Again you need UndecidableInstances.

How to define a function inside haskell newtype?

I am trying to decipher the record syntax in haskell for newtype and my understanding breaks when there is a function inside newtype. Consider this simple example
newtype C a b = C { getC :: (a -> b) -> a }
As per my reasoning C is a type which accepts a function and a parameter in it's constructor.
so,
let d1 = C $ (2 *) 3
:t d1 also gives
d1 :: Num ((a -> b) -> a) => C a b
Again to check this I do :t getC d1, which shows this
getC d1 :: Num ((a -> b) -> a) => (a -> b) -> a
Why the error if I try getC d1? getC should return the function and it's parameter or at least apply the parameter.
I can't have newtype C a b = C { getC :: (a->b)->b } deriving (Show), because this won't make sense!
It's always good to emphasise that Haskell has two completely separate namespaces, the type language and the value language. In your case, there's
A type constructor C :: Type -> Type -> Type, which lives in the type language. It takes two types a, b (of kind Type) and maps them to a type C a b (also of kind Type)†.
A value constructor C :: ((a->b) -> a) -> C a b, which lives in the value language. It takes a function f (of type (a->b) -> a) and maps it to a value C f (of type C a b).
Perhaps it would be less confusing if you had
newtype CT a b = CV ((a->b) -> a)
but because for a newtype there is always exactly one value constructor (and exactly one type constructor) it makes sense to name them the same.
CV is a value constructor that accepts one function, full stop. That function will have signature (a->b) -> a, i.e. its argument is also a function, but as far as CT is concerned this doesn't really matter.
Really, it's kind of wrong that data and newtype declarations use a = symbol, because it doesn't mean the things on the left and right are “the same” – can't, because they don't even belong to the same language. There's an alternative syntax which expresses the relation better:
{-# LANGUAGE GADTs #-}
import Data.Kind
data CT :: Type -> Type -> Type where
CV :: ((a->b) -> a) -> CT a b
As for that value you tried to construct
let d1 = CV $ (\x->(2*x)) 3
here you did not pass “a function and a parameter” to CV. What you actually did‡ was, you applied the function \x->2*x to the value 3 (might as well have written 6) and passed that number to CV. But as I said, CV expects a function. What then happens is, GHC tries to interpret 6 as a function, which gives the bogus constraint Num ((a->b) -> a). What that means is: “if (a->b)->a is a number type, then...”. Of course it isn't a number type, so the rest doesn't make sense either.
†It may seem redundant to talk of “types of kind Type”. Actually, when talking about “types” we often mean “entities in the type-level language”. These have kinds (“type-level types”) of which Type (the kind of (lifted) value-level values) is the most prominent, but not the only one – you can also have type-level numbers and type-level functions – C is indeed one.Note that Type was historically written *, but this notation is deprecated because it's inconsistent (confusion with multiplication operator).
‡This is because $ has the lowest precedence, i.e. the expression CV $ (\x->(2*x)) 3 is actually parsed as CV ((\x->(2*x)) 3), or equivalently let y = 2*3 in CV y.
As per my reasoning C is a type which accepts a function and a parameter
How so? The constructor has only one argument.
Newtypes always have a single constructor with exactly one argument.
The type C, otoh, has two type parameters. But that has nothing to do with the number of arguments you can apply to the constructor.

Associated Parameter Restriction using Functional Dependency

The function f below, for a given type 'a', takes a parameter of type 'c'. For different types 'a', 'c' is restricted in different ways. Concretely, when 'a' is any Integral type, 'c' should be allowed to be any 'Real' type. When 'a' is Float, 'c' can ONLY be Float.
One attempt is:
{-# LANGUAGE
MultiParamTypeClasses,
FlexibleInstances,
FunctionalDependencies,
UndecidableInstances #-}
class AllowedParamType a c | a -> c
class Foo a where
f :: (AllowedParamType a c) => c -> a
fIntegral :: (Integral a, Real c) => c -> a
fIntegral = error "implementation elided"
instance (Integral i, AllowedParamType i d, Real d) => Foo i where
f = fIntegral
For some reason, GHC 7.4.1 complains that it "could not deduce (Real c) arising from a use of fIntegral". It seems to me that the functional dependency should allow this deduction. In the instance, a is unified with i, so by the functional dependency, d should be unified with c, which in the instance is declared to be 'Real'. What am I missing here?
Functional dependencies aside, will this approach be expressive enough to enforce the restrictions above, or is there a better way? We are only working with a few different values for 'a', so there will be instances like:
instance (Integral i, Real c) => AllowedParamType i c
instance AllowedParamType Float Float
Thanks
A possibly better way, is to use constraint kinds and type families (GHC extensions, requires GHC 7.4, I think). This allows you to specify the constraint as part of the class instance.
{-# LANGUAGE ConstraintKinds, TypeFamilies, FlexibleInstances, UndecidableInstances #-}
import GHC.Exts (Constraint)
class Foo a where
type ParamConstraint a b :: Constraint
f :: ParamConstraint a b => b -> a
instance Integral i => Foo i where
type ParamConstraint i b = Real b
f = fIntegral
EDIT: Upon further experimentation, there are some subtleties that mean that this doesn't work as expected, specifically, type ParamConstraint i b = Real b is too general. I don't know a solution (or if one exists) right now.
OK, this one's been nagging at me. given the wide variety of instances,
let's go the whole hog and get rid of any relationship between the
source and target type other than the presence of an instance:
{-# LANGUAGE OverlappingInstances, FlexibleInstances,TypeSynonymInstances,MultiParamTypeClasses #-}
class Foo a b where f :: a -> b
Now we can match up pairs of types with an f between them however we like, for example:
instance Foo Int Int where f = (+1)
instance Foo Int Integer where f = toInteger.((7::Int) -)
instance Foo Integer Int where f = fromInteger.(^ (2::Integer))
instance Foo Integer Integer where f = (*100)
instance Foo Char Char where f = id
instance Foo Char String where f = (:[]) -- requires TypeSynonymInstances
instance (Foo a b,Functor f) => Foo (f a) (f b) where f = fmap f -- requires FlexibleInstances
instance Foo Float Int where f = round
instance Foo Integer Char where f n = head $ show n
This does mean a lot of explicit type annotation to avoid No instance for... and Ambiguous type error messages.
For example, you can't do main = print (f 6), but you can do main = print (f (6::Int)::Int)
You could list all of the instances with the standard types that you want,
which could lead to an awful lot of repetition, our you could light the blue touchpaper and do:
instance Integral i => Foo Double i where f = round -- requires FlexibleInstances
instance Real r => Foo Integer r where f = fromInteger -- requires FlexibleInstances
Beware: this does not mean "Hey, if you've got an integral type i,
you can have an instance Foo Double i for free using this handy round function",
it means: "every time you have any type i, it's definitely an instance
Foo Double i. By the way, I'm using round for this, so unless your type i is Integral,
we're going to fall out." That's a big issue for the Foo Integer Char instance, for example.
This can easily break your other instances, so if you now type f (5::Integer) :: Integer you get
Overlapping instances for Foo Integer Integer
arising from a use of `f'
Matching instances:
instance Foo Integer Integer
instance Real r => Foo Integer r
You can change your pragmas to include OverlappingInstances:
{-# LANGUAGE OverlappingInstances, FlexibleInstances,TypeSynonymInstances,MultiParamTypeClasses #-}
So now f (5::Integer) :: Integer returns 500, so clearly it's using the more specific Foo Integer Integer instance.
I think this sort of approach might work for you, defining many instances by hand, carefully considering when to go completely wild
making instances out of standard type classes. (Alternatively, there aren't all that many standard types, and as we all know, notMany choose 2 = notIntractablyMany, so you could just list them all.)
Here's a suggestion to solve a more general problem, not yours specifically (I need more detail yet first - I promise to check later). I'm writing it in case other people are searching for a solution to a similar problem to you, I certainly was in the past, before I discovered SO. SO is especially great when it helps you try a radically new approach.
I used to have the work habit:
Introduce a multi-parameter type class (Types hanging out all over the place, so...)
Introduce functional dependencies (Should tidy it up but then I end up needing...)
Add FlexibleInstances (Alarm bells start ringing. There's a reason the compiler has this off by default...)
Add UndecidableInstances (GHC is telling you you're on your own, because it's not convinced it's up to the challenge you're setting it.)
Everything blows up. Refactor somehow.
Then I discovered the joys of type families (functional programming for types (hooray) - multi-parameter type classes are (a bit like) logic programming for types). My workflow changed to:
Introduce a type class including an associated type, i.e. replace
class MyProblematicClass a b | a -> b where
thing :: a -> b
thang :: b -> a -> b
with
class MyJustWorksClass a where
type Thing a :: * -- Thing a is a type (*), not a type constructor (* -> *)
thing :: a -> Thing a
thang :: Thing a -> a -> Thing a
Nervously add FlexibleInstances. Nothing goes wrong at all.
Sometimes fix things by using constraints like (MyJustWorksClass j,j~a)=> instead of (MyJustWorksClass a)=> or (Show t,t ~ Thing a,...)=> instead of (Show (Thing a),...) => to help ghc out. (~ essentially means 'is the same type as')
Nervously add FlexibleContexts. Nothing goes wrong at all.
Everything works.
The reason "Nothing goes wrong at all" is that ghc calculates the type Thing a using my type function Thang rather than trying to deduce it using a merely a bunch of assertions that there's a function there and it ought to be able to work it out.
Give it a go! Read Fun with Type Functions before reading the manual!

What is the purpose of Rank2Types?

I am not really proficient in Haskell, so this might be a very easy question.
What language limitation do Rank2Types solve? Don't functions in Haskell already support polymorphic arguments?
It's hard to understand higher-rank polymorphism unless you study System F directly, because Haskell is designed to hide the details of that from you in the interest of simplicity.
But basically, the rough idea is that polymorphic types don't really have the a -> b form that they do in Haskell; in reality, they look like this, always with explicit quantifiers:
id :: ∀a.a → a
id = Λt.λx:t.x
If you don't know the "∀" symbol, it's read as "for all"; ∀x.dog(x) means "for all x, x is a dog." "Λ" is capital lambda, used for abstracting over type parameters; what the second line says is that id is a function that takes a type t, and then returns a function that's parametrized by that type.
You see, in System F, you can't just apply a function like that id to a value right away; first you need to apply the Λ-function to a type in order to get a λ-function that you apply to a value. So for example:
(Λt.λx:t.x) Int 5 = (λx:Int.x) 5
= 5
Standard Haskell (i.e., Haskell 98 and 2010) simplifies this for you by not having any of these type quantifiers, capital lambdas and type applications, but behind the scenes GHC puts them in when it analyzes the program for compilation. (This is all compile-time stuff, I believe, with no runtime overhead.)
But Haskell's automatic handling of this means that it assumes that "∀" never appears on the left-hand branch of a function ("→") type. Rank2Types and RankNTypes turn off those restrictions and allow you to override Haskell's default rules for where to insert forall.
Why would you want to do this? Because the full, unrestricted System F is hella powerful, and it can do a lot of cool stuff. For example, type hiding and modularity can be implemented using higher-rank types. Take for example a plain old function of the following rank-1 type (to set the scene):
f :: ∀r.∀a.((a → r) → a → r) → r
To use f, the caller first must choose what types to use for r and a, then supply an argument of the resulting type. So you could pick r = Int and a = String:
f Int String :: ((String → Int) → String → Int) → Int
But now compare that to the following higher-rank type:
f' :: ∀r.(∀a.(a → r) → a → r) → r
How does a function of this type work? Well, to use it, first you specify which type to use for r. Say we pick Int:
f' Int :: (∀a.(a → Int) → a → Int) → Int
But now the ∀a is inside the function arrow, so you can't pick what type to use for a; you must apply f' Int to a Λ-function of the appropriate type. This means that the implementation of f' gets to pick what type to use for a, not the caller of f'. Without higher-rank types, on the contrary, the caller always picks the types.
What is this useful for? Well, for many things actually, but one idea is that you can use this to model things like object-oriented programming, where "objects" bundle some hidden data together with some methods that work on the hidden data. So for example, an object with two methods—one that returns an Int and another that returns a String, could be implemented with this type:
myObject :: ∀r.(∀a.(a → Int, a -> String) → a → r) → r
How does this work? The object is implemented as a function that has some internal data of hidden type a. To actually use the object, its clients pass in a "callback" function that the object will call with the two methods. For example:
myObject String (Λa. λ(length, name):(a → Int, a → String). λobjData:a. name objData)
Here we are, basically, invoking the object's second method, the one whose type is a → String for an unknown a. Well, unknown to myObject's clients; but these clients do know, from the signature, that they will be able to apply either of the two functions to it, and get either an Int or a String.
For an actual Haskell example, below is the code that I wrote when I taught myself RankNTypes. This implements a type called ShowBox which bundles together a value of some hidden type together with its Show class instance. Note that in the example at the bottom, I make a list of ShowBox whose first element was made from a number, and the second from a string. Since the types are hidden by using the higher-rank types, this doesn't violate type checking.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ImpredicativeTypes #-}
type ShowBox = forall b. (forall a. Show a => a -> b) -> b
mkShowBox :: Show a => a -> ShowBox
mkShowBox x = \k -> k x
-- | This is the key function for using a 'ShowBox'. You pass in
-- a function #k# that will be applied to the contents of the
-- ShowBox. But you don't pick the type of #k#'s argument--the
-- ShowBox does. However, it's restricted to picking a type that
-- implements #Show#, so you know that whatever type it picks, you
-- can use the 'show' function.
runShowBox :: forall b. (forall a. Show a => a -> b) -> ShowBox -> b
-- Expanded type:
--
-- runShowBox
-- :: forall b. (forall a. Show a => a -> b)
-- -> (forall b. (forall a. Show a => a -> b) -> b)
-- -> b
--
runShowBox k box = box k
example :: [ShowBox]
-- example :: [ShowBox] expands to this:
--
-- example :: [forall b. (forall a. Show a => a -> b) -> b]
--
-- Without the annotation the compiler infers the following, which
-- breaks in the definition of 'result' below:
--
-- example :: forall b. [(forall a. Show a => a -> b) -> b]
--
example = [mkShowBox 5, mkShowBox "foo"]
result :: [String]
result = map (runShowBox show) example
PS: for anybody reading this who's wondered how come ExistentialTypes in GHC uses forall, I believe the reason is because it's using this sort of technique behind the scenes.
Do not functions in Haskell already support polymorphic arguments?
They do, but only of rank 1. This means that while you can write a function that takes different types of arguments without this extension, you can't write a function that uses its argument as different types in the same invocation.
For example the following function can't be typed without this extension because g is used with different argument types in the definition of f:
f g = g 1 + g "lala"
Note that it's perfectly possible to pass a polymorphic function as an argument to another function. So something like map id ["a","b","c"] is perfectly legal. But the function may only use it as monomorphic. In the example map uses id as if it had type String -> String. And of course you can also pass a simple monomorphic function of the given type instead of id. Without rank2types there is no way for a function to require that its argument must be a polymorphic function and thus also no way to use it as a polymorphic function.
Luis Casillas's answer gives a lot of great info about what rank 2 types mean, but I'll just expand on one point he didn't cover. Requiring an argument to be polymorphic doesn't just allow it to be used with multiple types; it also restricts what that function can do with its argument(s) and how it can produce its result. That is, it gives the caller less flexibility. Why would you want to do that? I'll start with a simple example:
Suppose we have a data type
data Country = BigEnemy | MediumEnemy | PunyEnemy | TradePartner | Ally | BestAlly
and we want to write a function
f g = launchMissilesAt $ g [BigEnemy, MediumEnemy, PunyEnemy]
that takes a function that's supposed to choose one of the elements of the list it's given and return an IO action launching missiles at that target. We could give f a simple type:
f :: ([Country] -> Country) -> IO ()
The problem is that we could accidentally run
f (\_ -> BestAlly)
and then we'd be in big trouble! Giving f a rank 1 polymorphic type
f :: ([a] -> a) -> IO ()
doesn't help at all, because we choose the type a when we call f, and we just specialize it to Country and use our malicious \_ -> BestAlly again. The solution is to use a rank 2 type:
f :: (forall a . [a] -> a) -> IO ()
Now the function we pass in is required to be polymorphic, so \_ -> BestAlly won't type check! In fact, no function returning an element not in the list it is given will typecheck (although some functions that go into infinite loops or produce errors and therefore never return will do so).
The above is contrived, of course, but a variation on this technique is key to making the ST monad safe.
Higher-rank types aren't as exotic as the other answers have made out. Believe it or not, many object-oriented languages (including Java and C#!) feature them. (Of course, no one in those communities knows them by the scary-sounding name "higher-rank types".)
The example I'm going to give is a textbook implementation of the Visitor pattern, which I use all the time in my daily work. This answer is not intended as an introduction to the visitor pattern; that knowledge is readily available elsewhere.
In this fatuous imaginary HR application, we wish to operate on employees who may be full-time permanent staff or temporary contractors. My preferred variant of the Visitor pattern (and indeed the one which is relevant to RankNTypes) parameterises the visitor's return type.
interface IEmployeeVisitor<T>
{
T Visit(PermanentEmployee e);
T Visit(Contractor c);
}
class XmlVisitor : IEmployeeVisitor<string> { /* ... */ }
class PaymentCalculator : IEmployeeVisitor<int> { /* ... */ }
The point is that a number of visitors with different return types can all operate on the same data. This means IEmployee must express no opinion as to what T ought to be.
interface IEmployee
{
T Accept<T>(IEmployeeVisitor<T> v);
}
class PermanentEmployee : IEmployee
{
// ...
public T Accept<T>(IEmployeeVisitor<T> v)
{
return v.Visit(this);
}
}
class Contractor : IEmployee
{
// ...
public T Accept<T>(IEmployeeVisitor<T> v)
{
return v.Visit(this);
}
}
I wish to draw your attention to the types. Observe that IEmployeeVisitor universally quantifies its return type, whereas IEmployee quantifies it inside its Accept method - that is to say, at a higher rank. Translating clunkily from C# to Haskell:
data IEmployeeVisitor r = IEmployeeVisitor {
visitPermanent :: PermanentEmployee -> r,
visitContractor :: Contractor -> r
}
newtype IEmployee = IEmployee {
accept :: forall r. IEmployeeVisitor r -> r
}
So there you have it. Higher-rank types show up in C# when you write types containing generic methods.
For those familiar with object oriented languages, a higher-rank function is simply a generic function that expects as its argument another generic function.
E.g. in TypeScript you could write:
type WithId<T> = T & { id: number }
type Identifier = <T>(obj: T) => WithId<T>
type Identify = <TObj>(obj: TObj, f: Identifier) => WithId<TObj>
See how the generic function type Identify demands a generic function of the type Identifier? This makes Identify a higher-rank function.

Resources