Enforce a class constraint in a type class that is not captured in the type signature of the implementing type - haskell

I am trying to use a typeclass that enforces a constraint on the type returned by one of the functions it defines. But the return type of the function does not capture the constraint in its type variable. I would like to know what is wrong with the code or what is the correct way to encode it.
A sample code is given below:
data State a = State {
  uniform :: a
}

class Renderable a where
  render :: (Uniform b) => Int -> a -> State b

library :: (Uniform a) => a -> IO ()
-- some implementation

draw :: (Renderable a) => a -> IO ()
draw renderable = do
  let state = render 0 renderable
  _ <- library (uniform state)
In the above snippet, the render function tries to enforce that the uniform field in State adheres to the class constraint Uniform. When I compile the code, I get the following error:
Could not deduce (Uniform a5) arising from a use of ‘draw’
  from the context: (Renderable r, Uniform a)
    bound by the type signature for:
      draw :: forall r a.
              (Renderable r, Uniform a) =>
              Int -> Renderable r -> IO ()
Thinking about it, I can sort of understand that, since the type of draw mentions only Renderable and Renderable does not carry a Uniform parameter in its head, the compiler cannot verify the flow completely. But I am wondering: while checking the type signature of draw, why can't the compiler ignore the issue and rely on the fact that any type implementing Renderable will definitely have to provide a value for uniform as part of State, so that type correctness could be verified at the implementation site rather than at the usage site?
PS: This is a snippet extracted from OpenGL code; Uniform and library are OpenGL terminology.

Here is a technique for you. I wrote about this many years ago (in a slightly different context, but the idea is the same) and I still stand by it.
First, the framing. If we write out the signature of render explicitly, we have:
render :: forall b. Uniform b => Int -> a -> State b
That is, the caller of render chooses the type b. It seems to me that your intention is more like this pseudo-Haskell*:
render :: exists b. (Uniform b) & Int -> a -> State b
In which the callee gets to choose the type. That is, different implementations of render may choose different types b to return, so long as they are uniform.
This might be a fine way to phrase it, except that Haskell does not support existential quantification directly. You can make a wrapper data type to simulate it
data SomeUniform where
  SomeUniform :: Uniform a => a -> SomeUniform
making your signature
render :: Int -> a -> SomeUniform
which I think has the properties you are looking for. However the SomeUniform type and the Uniform typeclass are very likely superfluous. You said in the comments that the Uniform typeclass looks like this:
class Uniform a where
  library :: a -> IO ()
Let's consider this question: say we have a SomeUniform, that is, a value x of some type a about which we know nothing except that it is an instance of the Uniform typeclass. What can we possibly do with x? There is only one way to get any information out of x, and that is to call library on it. So in essence the only thing the SomeUniform type is doing is carrying around a library method to be called later. This whole existential/typeclass setup is kind of pointless; we would be better served by collapsing it down to a simple data type:
data Uniform = Uniform { library :: IO () }
and your render method becomes:
render :: Int -> a -> Uniform
It's so beautifully unfancy, isn't it? If there were more methods in the Uniform typeclass, they would become additional fields of this data type (whose types may be functions, which can take some getting used to). Where you had types and instances of the typeclass, e.g.
data Thingy = Thingy String
-- note the constructor type Thingy :: String -> Thingy

instance Uniform Thingy where
  library (Thingy s) = putStrLn $ "thingy " ++ s
you can now also be rid of the data type and just use a function in place of the constructor
thingy :: String -> Uniform
thingy s = Uniform { library = putStrLn $ "thingy " ++ s }
(If you can't get rid of the data type for other reasons, you can provide a conversion function uniformThingy :: Thingy -> Uniform instead.)
The principle here is, you may replace an existential type with the collection of its observations, and it's usually pretty nice if you do.
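To tie the pieces together, here is a minimal, self-contained sketch of what the question's Renderable/draw code might look like after this collapse (Square and its instance are made up purely for illustration; they are not from the question):

data Uniform = Uniform { library :: IO () }

class Renderable a where
  render :: Int -> a -> Uniform

-- A hypothetical renderable type, invented for the example.
data Square = Square Double

instance Renderable Square where
  render n (Square s) =
    Uniform { library = putStrLn ("square of size " ++ show s ++ ", pass " ++ show n) }

draw :: Renderable a => a -> IO ()
draw renderable = library (render 0 renderable)

Note that draw no longer needs to mention Uniform at all; the only thing it can do with the result of render is run its library action, which is exactly the point.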
* My pseudo-Haskell & is dual to =>, playing essentially the same role but for existentially quantified dictionaries. c => t means that once the caller provides the dictionary c, the type t is returned, whereas c & t means that the callee provides both the dictionary c and the type t.

It appears that you're expecting to be able to define render to return a different distinct type for each implementation of Renderable, as long as that type is Uniform:
instance Renderable Foo where
  render _ _ = State True

instance Renderable Bar where
  render _ _ = State "mothman"

instance Renderable Baz where
  render _ _ = State 19
So if render is called with a Foo, it will return a State Bool, but if it's called with a Bar it will return a State String (assuming both Bool and String are Uniform). This is not how it works, and you'll get a type mismatch error if you try instantiating like this.
render :: (Uniform b) => Int -> a -> State b means that a Uniform b => State b is returned. If this is your type signature, your implementation can be no more and no less specific: it must be able to return a value of ANY type Uniform b => State b. If it cannot, then code that requests a return value of a specific type won't get the right type, and things will break in ways that the type system SHOULD be able to prevent.
Let's look at a different example:
class Collection t where
  size :: Num i => t a -> i
Assume someone wants to run this size function, and get the result as a Double. They can do that, because any implementation of size must be able to return any type of class Num, so the caller can always specify which type they want. If you were allowed to write an implementation that always returned an Integer, this would no longer be possible.
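To make that concrete, here is a small sketch (the list instance is my own illustration, and the class is repeated for completeness): because the implementation has to stay polymorphic in the numeric result, the caller is free to demand a Double.

class Collection t where
  size :: Num i => t a -> i

instance Collection [] where
  -- must produce ANY Num, so convert from the concrete Int that length gives
  size = fromIntegral . length

asDouble :: Double
asDouble = size [1, 2, 3 :: Int]   -- the caller chooses i ~ Double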
I think to do what you're trying to do, you'd need something like FunctionalDependencies. With this, your class can be something like:
class Uniform b => Renderable a b | a -> b where
  render :: Int -> a -> State b
The "| a -> b" tells the type checker that the type b should be decided based on the type a provided by the caller. This disallows the caller from choosing their own b, which means the implementation should force a more specific type. Note that now you need to specify both a and b in your instances, so:
instance Renderable Foo Bool where ...
instance Renderable Bar String where ...
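Putting that together, here is a minimal self-contained sketch of the FunctionalDependencies approach (Foo, Bool, and the library method are stand-ins suggested by the question, not a definitive design):

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies #-}

newtype State a = State { uniform :: a }

class Uniform a where
  library :: a -> IO ()

instance Uniform Bool where
  library = print

class Uniform b => Renderable a b | a -> b where
  render :: Int -> a -> State b

data Foo = Foo

instance Renderable Foo Bool where
  render _ _ = State True

-- The fundep lets b stay out of draw's argument types without ambiguity.
draw :: Renderable a b => a -> IO ()
draw renderable = library (uniform (render 0 renderable))

main :: IO ()
main = draw Foo   -- prints True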
I'm certain there are other valid approaches to this problem, as well.

Related

Right way to set constraint on function arguments

I have a function, and I want to restrict the types that can be passed to it. Let's say, only types that are cacheable. I can enumerate such types with a type family like:
type family Cacheable a :: Bool where
  Cacheable X = 'True
  Cacheable _ = 'False
And add such a constraint to my function:
myFunc :: forall a. (Cacheable a ~ 'True) => ....
but this constraint is a little bit redundant: I can remove it from the function's signature and nothing changes. Nothing forces it to be in the signature.
Another approach is to create some typeclass whose usage in the body of myFunc will force me to add this constraint to myFunc's signature:
class Cacheable a where
  ensureCacheable :: ()
  ensureCacheable = ()

instance Cacheable X

myFunc :: forall a. (Cacheable a) => ...
myFunc =
  let _ = ensureCacheable @a
  in ...
but it looks a little bit funny.
What is the right/canonical way to do it in Haskell?
Let's suppose that Cacheable cannot have any meaningful methods; think of it as a classifier. Another name could be IsQuery (vs IsCommand): queries are cacheable, commands are not.
For the sake of discussion, let's give the two approaches distinct names:
type family CacheableTF a :: Bool where
  CacheableTF X = 'True
  CacheableTF _ = 'False

class CacheableC a
instance CacheableC X
What's the difference between these? Two aspects:
CacheableTF a expresses, in classical logic, the fact that a is a cacheable type. It's just a Boolean, after all. Thus you can arbitrarily stack negations on top of it: (CacheableTF a ~ 'False) => .... is a constraint just as valid as the 'True version.
By contrast, CacheableC a is constructive: it's a proposition, a promise that anything which has this in its context will be able to access the methods of the type class. (Of course, in your example the method was pretty useless, but even then you could still build other functions on top of it.) Unlike with CacheableTF, you can't really use the negation of CacheableC at all. This aspect is directly reflected in the kinds:
CacheableTF :: Type -> Bool
CacheableC :: Type -> Constraint
CacheableTF is closed-world, CacheableC is open-world. If you want this to be an interface where people can later on make their own types cacheable as well, you need CacheableC. But this power is one of the reasons why class constraints can't be negated: just because the compiler can't find any instance while compiling one module, doesn't mean the type won't have an instance by the time the complete program is linked together.
If you want to make a decision based on whether or not the type is cacheable, you need CacheableTF. However in practice, you'd typically also need some methods detailing how to cache it, in the True case, not just trivial methods like in your CacheableC. This can be accomplished by a more general class that covers both the cacheable- and non-cacheable cases:
class DecideCache a where
  type IsCacheable a :: Bool
  howToActuallyGoAboutCachingIt
    :: (IsCacheable a ~ 'True) => SomeCompl -> Icated -> Stora -> GeMethod
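A hedged sketch of how such a class might be wired up and used (X, the Cache type, and the store method are placeholders invented for illustration):

{-# LANGUAGE TypeFamilies, DataKinds #-}

data X = X                      -- the example cacheable type from the question
newtype Cache = Cache [String]  -- hypothetical cache store

class DecideCache a where
  type IsCacheable a :: Bool
  -- Only callable when the associated type says the type is cacheable.
  store :: (IsCacheable a ~ 'True) => a -> Cache -> Cache

instance DecideCache X where
  type IsCacheable X = 'True
  store _ (Cache entries) = Cache ("X" : entries)

-- myFunc can only be applied at types whose IsCacheable is 'True,
-- and the constraint is genuinely needed by the call to store.
myFunc :: (DecideCache a, IsCacheable a ~ 'True) => a -> Cache -> Cache
myFunc = store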

Clarification on Existential Types in Haskell

I am trying to understand Existential types in Haskell and came across a PDF http://www.ii.uni.wroc.pl/~dabi/courses/ZPF15/rlasocha/prezentacja.pdf
Please correct the understanding I have so far:
Existential types do not seem to be interested in the type they contain; pattern matching on them only tells us that there exists some type, and we can't find out which one unless we use Typeable or Data.
We use them when we want to hide types (e.g. for heterogeneous lists) or when we don't really know the types at compile time.
GADTs provide a clearer and better syntax for coding with existential types by providing implicit foralls.
My Doubts
On page 20 of the PDF above, it is said of the code below that it is impossible for a function to demand a specific Buffer. Why is that? When I am writing a function I know exactly what kind of buffer I am going to use, even though I may not know what data I am going to put into it.
What's wrong with having :: Worker MemoryBuffer Int? If they really want to abstract over the buffer they could use a sum type data Buffer = MemoryBuffer | NetBuffer | RandomBuffer and a type like :: Worker Buffer Int.
data Worker x = forall b. Buffer b => Worker {buffer :: b, input :: x}
data MemoryBuffer = MemoryBuffer
memoryWorker = Worker MemoryBuffer (1 :: Int)
memoryWorker :: Worker Int
Since Haskell is a full-type-erasure language like C, how does it know at runtime which function to call? Do we maintain some information and pass around a huge v-table of functions, with the right one looked up at runtime? If so, what sort of information is stored?
GADTs provide a clearer and better syntax for coding with existential types by providing implicit foralls.
I think there's general agreement that the GADT syntax is better. I wouldn't say that it's because GADTs provide implicit foralls, but rather because the original syntax, enabled with the ExistentialQuantification extension, is potentially confusing/misleading. That syntax, of course, looks like:
data SomeType = forall a. SomeType a
or with a constraint:
data SomeShowableType = forall a. Show a => SomeShowableType a
and I think the consensus is that the use of the keyword forall here allows the type to be easily confused with the completely different type:
data AnyType = AnyType (forall a. a) -- need RankNTypes extension
A better syntax might have used a separate exists keyword, so you'd write:
data SomeType = SomeType (exists a. a) -- not valid GHC syntax
The GADT syntax, whether used with implicit or explicit forall, is more uniform across these types, and seems to be easier to understand. Even with an explicit forall, the following definition gets across the idea that you can take a value of any type a and put it inside a monomorphic SomeType':
data SomeType' where
  SomeType' :: forall a. (a -> SomeType') -- parentheses optional
and it's easy to see and understand the difference between that type and:
data AnyType' where
  AnyType' :: (forall a. a) -> AnyType'
Existential types do not seem to be interested in the type they contain; pattern matching on them only tells us that there exists some type, and we can't find out which one unless we use Typeable or Data.
We use them when we want to hide types (e.g. for heterogeneous lists) or when we don't really know the types at compile time.
I guess these aren't too far off, though you don't have to use Typeable or Data to use existential types. I think it would be more accurate to say an existential type provides a well-typed "box" around an unspecified type. The box does "hide" the type in a sense, which allows you to make a heterogeneous list of such boxes, ignoring the types they contain. It turns out that an unconstrained existential, like SomeType' above, is pretty useless, but a constrained type:
data SomeShowableType' where
  SomeShowableType' :: forall a. (Show a) => a -> SomeShowableType'
allows you to pattern match to peek inside the "box" and make the type class facilities available:
showIt :: SomeShowableType' -> String
showIt (SomeShowableType' x) = show x
Note that this works for any type class, not just Typeable or Data.
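For instance, building on the definitions above, the box is what makes a heterogeneous list possible (a small sketch; the ghci output shown is what I'd expect, not copied from the original):

mixed :: [SomeShowableType']
mixed = [ SomeShowableType' (42 :: Int)
        , SomeShowableType' "hello"
        , SomeShowableType' True
        ]

-- ghci> map showIt mixed
-- ["42","\"hello\"","True"]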
With regard to your confusion about page 20 of the slide deck, the author is saying that it's impossible for a function that takes an existential Worker to demand a Worker having a particular Buffer instance. You can write a function to create a Worker using a particular type of Buffer, like MemoryBuffer:
class Buffer b where
  output :: String -> b -> IO ()

data Worker x = forall b. Buffer b => Worker {buffer :: b, input :: x}

data MemoryBuffer = MemoryBuffer
instance Buffer MemoryBuffer

memoryWorker = Worker MemoryBuffer (1 :: Int)
memoryWorker :: Worker Int
but if you write a function that takes a Worker as argument, it can only use the general Buffer type class facilities (e.g., the function output):
doWork :: Worker Int -> IO ()
doWork (Worker b x) = output (show x) b
It can't try to demand that b be a particular type of buffer, even via pattern matching:
doWorkBroken :: Worker Int -> IO ()
doWorkBroken (Worker b x) = case b of
  MemoryBuffer -> error "try this" -- type error
  _            -> error "try that"
Finally, runtime information about existential types is made available through implicit "dictionary" arguments for the typeclasses that are involved. The Worker type above, in addition to having fields for the buffer and input, also has an invisible implicit field that points to the Buffer dictionary (somewhat like a v-table, though it's hardly huge, as it just contains a pointer to the appropriate output function).
Internally, the type class Buffer is represented as a data type with function fields, and instances are "dictionaries" of this type:
data Buffer' b = Buffer' { output' :: String -> b -> IO () }
dBuffer_MemoryBuffer :: Buffer' MemoryBuffer
dBuffer_MemoryBuffer = Buffer' { output' = undefined }
The existential type has a hidden field for this dictionary:
data Worker' x = forall b. Worker' { dBuffer :: Buffer' b, buffer' :: b, input' :: x }
and a function like doWork that operates on existential Worker' values is implemented as:
doWork' :: Worker' Int -> IO ()
doWork' (Worker' dBuf b x) = output' dBuf (show x) b
For a type class with only one function, the dictionary is actually optimized to a newtype, so in this example, the existential Worker type includes a hidden field that consists of a function pointer to the output function for the buffer, and that's the only runtime information needed by doWork.
On page 20 of the PDF above, it is said of the code below that it is impossible for a function to demand a specific Buffer. Why is that?
Because Worker, as defined, takes only one type argument, the type of the "input" field (type variable x). E.g. Worker Int is a type. The type variable b, by contrast, is not a parameter of Worker but a sort of "local variable", so to speak. It cannot be passed in, as in Worker Int String -- that would trigger a type error.
If we instead defined:
data Worker x b = Worker {buffer :: b, input :: x}
then Worker Int String would work, but the type is no longer existential -- we now always have to pass the buffer type as well.
Since Haskell is a full-type-erasure language like C, how does it know at runtime which function to call? Do we maintain some information and pass around a huge v-table of functions, with the right one looked up at runtime? If so, what sort of information is stored?
This is roughly correct. Briefly put, each time you apply constructor Worker, GHC infers the b type from the arguments of Worker, and then searches for an instance Buffer b. If that is found, GHC includes an additional pointer to the instance in the object. In its simplest form, this is not too different from the "pointer to vtable" which is added to each object in OOP when virtual functions are present.
In the general case, it can be much more complex, though. The compiler might use a different representation and add more pointers instead of a single one (say, directly adding the pointers to all the instance methods), if that speeds up code. Also, sometimes the compiler needs to use multiple instances to satisfy a constraint. E.g., if we need to store the instance for Eq [Int] ... then there is not one but two: one for Int and one for lists, and the two need to be combined (at run time, barring optimizations).
It is hard to guess exactly what GHC does in each case: that depends on a ton of optimizations which might or might not trigger.
You could try googling for the "dictionary based" implementation of type classes to see more about what's going on. You can also ask GHC to print the internal optimized Core with -ddump-simpl and observe the dictionaries being constructed, stored, and passed around. I have to warn you: Core is rather low level, and can be hard to read at first.

Resolving an ambiguous type variable

I have these two functions:
load :: Asset a => Reference -> IO (Maybe a)
send :: Asset a => a -> IO ()
The Asset class look like this:
class (Typeable a, ToJSON a, FromJSON a) => Asset a where
  ref :: a -> Reference
  ...
The first reads an asset from disk, and the second transmits a JSON representation to a WebSocket. In isolation they work fine, but when I combine them the compiler cannot deduce what concrete type a should be. (Could not deduce (Asset a0) arising from a use of 'load')
This makes sense, I have not given a concrete type and both load and send are polymorphic. Somehow the compiler has to decide which version of send (and by extension what version of toJSON) to use.
I can determine at run time what the concrete type of a is. This information is actually encoded both in the data on the disk and the Reference type, but I do not know for sure at compile time as the type checker is being run.
Is there a way to pass the correct type at run time and still keep the type checker happy?
Additional Information
The definition of Reference
data Reference = Ref {
    assetType  :: String
  , assetIndex :: Int
  } deriving (Eq, Ord, Show, Generic)
References are derived by parsing a request from a WebSocket as follows where Parser comes from the Parsec library.
reference :: Parser Reference
reference = do
  t <- string "User"
       <|> string "Port"
       <|> string "Model"
       <|> ...
  char '-'
  i <- int
  return Ref { assetType = t, assetIndex = i }
If I added a type parameter to Reference I simply push my problem back into the parser. I still need to turn a string that I do not know at compile time into a type to make this work.
You can't make a function that turns string data into values of different types depending on what is in the string. That's simply impossible. You need to rearrange things so that your return-type doesn't depend on the string contents.
Your type for load, Asset a => Reference -> IO (Maybe a) says "pick any a (where Asset a) you like and give me a Reference, and I'll give you back an IO action that produces Maybe a". The caller picks the type they expect to be loaded by the reference; the contents of the file do not influence which type is loaded. But you don't want it to be chosen by the caller, you want it to be chosen by what's stored on disk, so the type signature simply doesn't express the operation you actually want. That's your real problem; the ambiguous type variable when combining load and send would be easily resolved (with a type signature or TypeApplications) if load and send were individually correct and combining them was the only problem.
Basically you can't just have load return a polymorphic type, because if it does then the caller gets to (must) decide what type it returns. There are two ways to avoid this that are more or less equivalent: return an existential wrapper, or use rank-2 types and add a polymorphic handler function (continuation) as a parameter.
Using an existential wrapper (requires GADTs extension), it looks something like this:
data SomeAsset where
  SomeAsset :: Asset a => a -> SomeAsset

load :: Reference -> IO (Maybe SomeAsset)
Notice load is no longer polymorphic. You get a SomeAsset that (as far as the type checker is concerned) could contain any type that has an Asset instance. Internally, load can use whatever logic it wants, split into multiple branches, and come up with different types of asset on different branches; provided each branch ends by wrapping the asset value with the SomeAsset constructor, all of the branches return the same type.
To send it, you would use something like (ignoring that I'm not handling Nothing):
loadAndSend :: Reference -> IO ()
loadAndSend ref = do
  Just someAsset <- load ref
  case someAsset of
    SomeAsset asset -> send asset
The SomeAsset wrapper guarantees that Asset holds for its wrapped value, so you can unwrap them and call any Asset-polymorphic function on the result. However you can never do anything with the value that depends on the specific type in any other way1, which is why you have to keep it wrapped up and case match on it all the time; if the case expression results in a type that depends on the contained type (such as case someAsset of SomeAsset a -> a) the compiler will not accept your code.
The other way is to instead use RankNTypes and give load a type like this:
load :: (forall a. Asset a => a -> r) -> Reference -> IO (Maybe r)
Here load doesn't return a value representing the loaded asset at all. What it does instead is take a polymorphic function as an argument; the function works on any Asset and returns a type r (that was chosen by load's caller), so again load can internally branch however it wants and construct differently-typed assets in the different branches. The different asset types can all be passed to the handler, so the handler can be called in every branch.
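Here's a sketch of what that continuation-passing version might look like end to end (the Reference and Asset definitions are simplified stand-ins for the question's, and the body of load is elided):

{-# LANGUAGE RankNTypes #-}

-- Simplified stand-ins for the question's types, just so the sketch is self-contained.
data Reference = Ref { assetType :: String, assetIndex :: Int }

class Asset a where
  ref :: a -> Reference

send :: Asset a => a -> IO ()
send _ = return ()   -- stand-in for the real WebSocket send

-- Continuation-passing load: the caller supplies a handler that works for ANY
-- Asset, so load itself gets to choose the concrete type on each branch.
load :: (forall a. Asset a => a -> r) -> Reference -> IO (Maybe r)
load handler theRef = case assetType theRef of
  -- each branch may parse a different asset type and apply the handler to it
  _ -> return Nothing   -- elided: the real parsing/loading logic

loadAndSendCPS :: Reference -> IO ()
loadAndSendCPS theRef = do
  result <- load send theRef        -- here r is instantiated to IO ()
  maybe (return ()) id result       -- run the produced action, if any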
My preference is often to use the SomeAsset approach, but then to also use RankNTypes and define a helper function like:
withSomeAsset :: (forall a. Asset a => a -> r) -> (SomeAsset -> r)
withSomeAsset f (SomeAsset a) = f a
This avoids having to restructure your code into continuation-passing style, but takes away the heavy case syntax everywhere you need to use a SomeAsset:
loadAndSend :: Reference -> IO ()
loadAndSend ref = do
  Just asset <- load ref
  withSomeAsset send asset
Or even add:
sendSome = withSomeAsset send
Daniel Wagner suggested adding the type parameter to Reference, which the OP objected to on the grounds that it simply moves the same problem to the point where the references are constructed. If the references contain data representing which type of asset they refer to, then I would strongly recommend taking Daniel's advice, and using the concepts described in this answer to address that problem at the reference-constructing level. Giving Reference a type parameter prevents mixing up references to the wrong types of assets where you do know the type.
And if you do significant processing with references and assets of the same type, then having the type parameter in your workhorse code can catch easy mistakes mixing them up, even if you usually wrap the type away existentially at the outer levels of the code.
1 Technically your Asset implies Typeable, so you can test it for specific types and then return those.
Sure, make Reference store the type.
data Reference a where
  UserRef  :: Int -> Reference User
  PortRef  :: Int -> Reference Port
  ModelRef :: Int -> Reference Model

load :: Asset a => Reference a -> IO (Maybe a)
send :: Asset a => a -> IO ()
If necessary, you can still recover the strong points of your original Reference type by existentially boxing it.
data SomeAsset f where SomeAsset :: Asset a => f a -> SomeAsset f

reference :: Parser (SomeAsset Reference)
reference = asum
  [ string "User"  *> go UserRef
  , string "Port"  *> go PortRef
  , string "Model" *> go ModelRef
  ]
  where
  go :: Asset a => (Int -> Reference a) -> Parser (SomeAsset Reference)
  go constructor = SomeAsset . constructor <$ char '-' <*> int

loadAndSend :: SomeAsset Reference -> IO ()
loadAndSend (SomeAsset reference) = load reference >>= traverse_ send
After reviewing the answers from Daniel Wagner and Ben, I ultimately resolved my issue using a combination of the two which I place here in hopes it will aid others.
First, per Daniel Wagner's answer, I added a phantom type to Reference:
data Reference a = Ref {
    assetType  :: String
  , assetIndex :: Int
  } deriving (Eq, Ord, Show, Generic)
I chose not to use GADT constructors and to keep the string assetType, as I frequently send references over the wire and/or parse them from incoming text, and I felt there were too many places in the code where I needed a generic reference. For those cases, I fill in the phantom type with Void:
{-# LANGUAGE EmptyDataDecls #-}

data Void

-- make this reference generic
voidRef :: Reference a -> Reference Void

-- Note: the first argument is used only for its type, so it can be undefined.
castRef :: a -> Reference b -> Reference a
With this, the load type signature becomes load :: Asset a => Reference a -> IO (Maybe a), so the Asset always matches the type of the Reference. (Yay type safety!)
That still doesn't address how to load a generic reference. For those cases, I wrote some new code using the second half of Ben's answer. By wrapping the asset in SomeAsset, I can return a single type, which keeps the type checker happy.
{-# LANGUAGE GADTs #-}

import Data.Aeson (encode)

loadGenericAsset :: Reference Void -> IO SomeAsset
loadGenericAsset ref =
  case assetType ref of
    "User" -> Some <$> load (castRef (undefined :: User) ref)
    "Port" -> Some <$> load (castRef (undefined :: Port) ref)
    [etc...]

send :: SomeAsset -> IO ()
send (Some a) = writeToUser (encode a)

data SomeAsset where
  Some :: Asset a => a -> SomeAsset

Practical applications of Rank 2 polymorphism?

I'm covering polymorphism and I'm trying to see the practical uses of such a feature.
My basic understanding of Rank 2 is:
type MyType = ∀ a. a -> a
subFunction :: a -> a
subFunction el = el
mainFunction :: MyType -> Int
mainFunction func = func 3
I understand that this allows the user to use a polymorphic function (subFunction) inside mainFunction and strictly specify its output (Int). This seems very similar to GADTs:
data Example a where
  ExampleInt  :: Int -> Example Int
  ExampleBool :: Bool -> Example Bool
1) Given the above, is my understanding of Rank 2 polymorphism correct?
2) What are the general situations where Rank 2 polymorphism can be used, as opposed to GADTs, for example?
If you pass a polymorphic function as an argument to a Rank2-polymorphic function, you're essentially passing not just one function but a whole family of functions – for all possible types that fulfill the constraints.
Typically, those forall quantifiers come with a class constraint. For example, I might wish to do number arithmetic with two different types simultaneously (for comparing precision or whatever).
data FloatCompare = FloatCompare {
    singlePrecision :: Float
  , doublePrecision :: Double
  }
Now I might want to modify those numbers through some maths operation. Something like
modifyFloat :: (Num -> Num) -> FloatCompare -> FloatCompare
But Num is not a type, only a type class. I could of course pass a function that would modify any particular number type, but I couldn't use that to modify both a Float and a Double value, at least not without some ugly (and possibly lossy) converting back and forth.
Solution: Rank-2 polymorphism!
modifyFloat :: (∀ n . Num n => n -> n) -> FloatCompare -> FloatCompare
modifyFloat f (FloatCompare single double)
  = FloatCompare (f single) (f double)
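For example, building on the definitions above (a small sketch; the result values are what I'd expect, not quoted from the answer), the same polymorphic function is applied once at Float and once at Double:

scaled :: FloatCompare
scaled = modifyFloat (\x -> x * 2 + 1) (FloatCompare 3.5 3.5)
-- scaled == FloatCompare { singlePrecision = 8.0, doublePrecision = 8.0 }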
The best single example of how this is useful in practice is probably lenses. A lens is a “smart accessor function” to a field in some larger data structure. It allows you to access fields, update them, gather results... while at the same time composing in a very simple way. How it works: Rank2-polymorphism; every lens is polymorphic, with the different instantiations corresponding to the “getter” / “setter” aspects, respectively.
The go-to example of an application of rank-2 types is runST as Benjamin Hodgson mentioned in the comments. This is a rather good example and there are a variety of examples using the same trick. For example, branding to maintain abstract data type invariants across multiple types, avoiding confusion of differentials in ad, a region-based version of ST.
But I'd actually like to talk about how Haskell programmers are implicitly using rank-2 types all the time. Every type class whose methods have universally quantified types desugars to a dictionary with a field with a rank-2 type. In practice, this is virtually always a higher-kinded type class* like Functor or Monad. I'll use a simplified version of Alternative as an example. The class declaration is:
class Alternative f where
  empty :: f a
  (<|>) :: f a -> f a -> f a
The dictionary representing this class would be:
data AlternativeDict f = AlternativeDict {
    empty :: forall a. f a,
    (<|>) :: forall a. f a -> f a -> f a }
Sometimes such an encoding is nice as it allows one to use different "instances" for the same type, perhaps only locally. For example, Maybe has two obvious instances of Alternative depending on whether Just a <|> Just b is Just a or Just b. Languages without type classes, such as Scala, do indeed use this encoding.
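Here is a sketch of those two Maybe "instances" written as explicit dictionaries (field names are changed from the snippet above so they don't clash with the Prelude, and the helper firstHit is my own illustration):

{-# LANGUAGE RankNTypes #-}

data AlternativeDict f = AlternativeDict
  { emptyD :: forall a. f a
  , orD    :: forall a. f a -> f a -> f a
  }

leftOr :: Maybe a -> Maybe a -> Maybe a
leftOr (Just x) _ = Just x
leftOr Nothing  y = y

-- Two different "instances" for the same type, which the class system would forbid.
leftBiased, rightBiased :: AlternativeDict Maybe
leftBiased  = AlternativeDict { emptyD = Nothing, orD = leftOr }
rightBiased = AlternativeDict { emptyD = Nothing, orD = flip leftOr }

-- A function parameterised by whichever dictionary the caller prefers.
firstHit :: AlternativeDict f -> [f a] -> f a
firstHit dict = foldr (orD dict) (emptyD dict)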
To connect to leftaroundabout's reference to lenses, you can view the hierarchy there as a hierarchy of type classes and the lens combinators as simply tools for explicitly building the relevant type class dictionaries. Of course, the reason it isn't actually a hierarchy of type classes is that we usually will have multiple "instances" for the same type. E.g. _head and _head . _tail are both "instances" of Traversal' s a.
* A higher-kinded type class doesn't necessarily lead to this, and it can also happen for a type class whose parameter has kind *. For example:
-- Higher-kinded but doesn't require universal quantification.
class Sum c where
  sum :: c Int -> Int

-- Not higher-kinded but does require universal quantification.
class Length l where
  length :: [a] -> l
If you are using modules in Haskell, you are already using Rank-2 types. Theoretically speaking, modules are records with rank-2 type properties.
For example, the Foo module below in Haskell ...
module Foo (id) where

id :: forall a. a -> a
id x = x

-- and, in another module:
import qualified Foo

main = do
  putStrLn (Foo.id "hello")
  return ()
... can actually be thought of as a record, as follows:
data FooType = FooType {
    id :: forall a. a -> a
  }

foo :: FooType
foo = FooType {
    id = \x -> x
  }
P.S. (unrelated to this question): from a language-design perspective, if you are going to support a module system, then you might as well support higher-rank types (i.e. allow arbitrary quantification of type variables at any level) to reduce duplication of effort (i.e. type checking a module should be almost the same as type checking a record with higher-rank types).

What is the purpose of Rank2Types?

I am not really proficient in Haskell, so this might be a very easy question.
What language limitation do Rank2Types solve? Don't functions in Haskell already support polymorphic arguments?
It's hard to understand higher-rank polymorphism unless you study System F directly, because Haskell is designed to hide the details of that from you in the interest of simplicity.
But basically, the rough idea is that polymorphic types don't really have the a -> b form that they do in Haskell; in reality, they look like this, always with explicit quantifiers:
id :: ∀a.a → a
id = Λt.λx:t.x
If you don't know the "∀" symbol, it's read as "for all"; ∀x.dog(x) means "for all x, x is a dog." "Λ" is capital lambda, used for abstracting over type parameters; what the second line says is that id is a function that takes a type t, and then returns a function that's parametrized by that type.
You see, in System F, you can't just apply a function like that id to a value right away; first you need to apply the Λ-function to a type in order to get a λ-function that you apply to a value. So for example:
(Λt.λx:t.x) Int 5 = (λx:Int.x) 5
= 5
Standard Haskell (i.e., Haskell 98 and 2010) simplifies this for you by not having any of these type quantifiers, capital lambdas and type applications, but behind the scenes GHC puts them in when it analyzes the program for compilation. (This is all compile-time stuff, I believe, with no runtime overhead.)
But Haskell's automatic handling of this means that it assumes that "∀" never appears on the left-hand branch of a function ("→") type. Rank2Types and RankNTypes turn off those restrictions and allow you to override Haskell's default rules for where to insert forall.
Why would you want to do this? Because the full, unrestricted System F is hella powerful, and it can do a lot of cool stuff. For example, type hiding and modularity can be implemented using higher-rank types. Take for example a plain old function of the following rank-1 type (to set the scene):
f :: ∀r.∀a.((a → r) → a → r) → r
To use f, the caller first must choose what types to use for r and a, then supply an argument of the resulting type. So you could pick r = Int and a = String:
f Int String :: ((String → Int) → String → Int) → Int
But now compare that to the following higher-rank type:
f' :: ∀r.(∀a.(a → r) → a → r) → r
How does a function of this type work? Well, to use it, first you specify which type to use for r. Say we pick Int:
f' Int :: (∀a.(a → Int) → a → Int) → Int
But now the ∀a is inside the function arrow, so you can't pick what type to use for a; you must apply f' Int to a Λ-function of the appropriate type. This means that the implementation of f' gets to pick what type to use for a, not the caller of f'. Without higher-rank types, on the contrary, the caller always picks the types.
What is this useful for? Well, for many things actually, but one idea is that you can use this to model things like object-oriented programming, where "objects" bundle some hidden data together with some methods that work on the hidden data. So for example, an object with two methods, one that returns an Int and another that returns a String, could be implemented with this type:
myObject :: ∀r.(∀a.(a → Int, a → String) → a → r) → r
How does this work? The object is implemented as a function that has some internal data of hidden type a. To actually use the object, its clients pass in a "callback" function that the object will call with the two methods. For example:
myObject String (Λa. λ(length, name):(a → Int, a → String). λobjData:a. name objData)
Here we are, basically, invoking the object's second method, the one whose type is a → String for an unknown a. Well, unknown to myObject's clients; but these clients do know, from the signature, that they will be able to apply either of the two functions to it, and get either an Int or a String.
For an actual Haskell example, below is the code that I wrote when I taught myself RankNTypes. This implements a type called ShowBox which bundles together a value of some hidden type together with its Show class instance. Note that in the example at the bottom, I make a list of ShowBox whose first element was made from a number, and the second from a string. Since the types are hidden by using the higher-rank types, this doesn't violate type checking.
{-# LANGUAGE RankNTypes #-}
{-# LANGUAGE ImpredicativeTypes #-}
type ShowBox = forall b. (forall a. Show a => a -> b) -> b
mkShowBox :: Show a => a -> ShowBox
mkShowBox x = \k -> k x
-- | This is the key function for using a 'ShowBox'. You pass in
-- a function #k# that will be applied to the contents of the
-- ShowBox. But you don't pick the type of #k#'s argument--the
-- ShowBox does. However, it's restricted to picking a type that
-- implements #Show#, so you know that whatever type it picks, you
-- can use the 'show' function.
runShowBox :: forall b. (forall a. Show a => a -> b) -> ShowBox -> b
-- Expanded type:
--
-- runShowBox
-- :: forall b. (forall a. Show a => a -> b)
-- -> (forall b. (forall a. Show a => a -> b) -> b)
-- -> b
--
runShowBox k box = box k
example :: [ShowBox]
-- example :: [ShowBox] expands to this:
--
-- example :: [forall b. (forall a. Show a => a -> b) -> b]
--
-- Without the annotation the compiler infers the following, which
-- breaks in the definition of 'result' below:
--
-- example :: forall b. [(forall a. Show a => a -> b) -> b]
--
example = [mkShowBox 5, mkShowBox "foo"]
result :: [String]
result = map (runShowBox show) example
PS: for anybody reading this who's wondered how come ExistentialTypes in GHC uses forall, I believe the reason is because it's using this sort of technique behind the scenes.
Do not functions in Haskell already support polymorphic arguments?
They do, but only of rank 1. This means that while you can write a function that takes different types of arguments without this extension, you can't write a function that uses its argument as different types in the same invocation.
For example the following function can't be typed without this extension because g is used with different argument types in the definition of f:
f g = g 1 + g "lala"
Note that it's perfectly possible to pass a polymorphic function as an argument to another function. So something like map id ["a","b","c"] is perfectly legal. But the function may only use it monomorphically. In the example, map uses id as if it had type String -> String. And of course you can also pass a simple monomorphic function of the given type instead of id. Without rank2types there is no way for a function to require that its argument be a polymorphic function, and thus also no way to use it as a polymorphic function.
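For completeness, here is one way the example above can be made to type check with the extension; the concrete signature (a Show constraint and an Int result, so the two calls can be added) is my own choice, not something from the original answer:

{-# LANGUAGE RankNTypes #-}

-- g is used at Int and at String within the same invocation,
-- so its argument type must be universally quantified.
f :: (forall a. Show a => a -> Int) -> Int
f g = g (1 :: Int) + g "lala"

main :: IO ()
main = print (f (length . show))   -- length of the shown value; prints 7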
Luis Casillas's answer gives a lot of great info about what rank 2 types mean, but I'll just expand on one point he didn't cover. Requiring an argument to be polymorphic doesn't just allow it to be used with multiple types; it also restricts what that function can do with its argument(s) and how it can produce its result. That is, it gives the caller less flexibility. Why would you want to do that? I'll start with a simple example:
Suppose we have a data type
data Country = BigEnemy | MediumEnemy | PunyEnemy | TradePartner | Ally | BestAlly
and we want to write a function
f g = launchMissilesAt $ g [BigEnemy, MediumEnemy, PunyEnemy]
that takes a function that's supposed to choose one of the elements of the list it's given, and returns an IO action launching missiles at that target. We could give f a simple type:
f :: ([Country] -> Country) -> IO ()
The problem is that we could accidentally run
f (\_ -> BestAlly)
and then we'd be in big trouble! Giving f a rank 1 polymorphic type
f :: ([a] -> a) -> IO ()
doesn't help at all, because we choose the type a when we call f, and we just specialize it to Country and use our malicious \_ -> BestAlly again. The solution is to use a rank 2 type:
f :: (forall a . [a] -> a) -> IO ()
Now the function we pass in is required to be polymorphic, so \_ -> BestAlly won't type check! In fact, no function returning an element not in the list it is given will typecheck (although some functions that go into infinite loops or produce errors and therefore never return will do so).
The above is contrived, of course, but a variation on this technique is key to making the ST monad safe.
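A minimal runnable sketch of the example (launchMissilesAt is a stub I've made up so the code compiles):

{-# LANGUAGE RankNTypes #-}

data Country = BigEnemy | MediumEnemy | PunyEnemy | TradePartner | Ally | BestAlly
  deriving Show

-- Stub standing in for the dangerous action.
launchMissilesAt :: Country -> IO ()
launchMissilesAt c = putStrLn ("Launching at " ++ show c)

f :: (forall a. [a] -> a) -> IO ()
f g = launchMissilesAt $ g [BigEnemy, MediumEnemy, PunyEnemy]

main :: IO ()
main = f head                     -- fine: head works for every element type
-- main = f (\_ -> BestAlly)      -- rejected: not polymorphic enough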
Higher-rank types aren't as exotic as the other answers have made out. Believe it or not, many object-oriented languages (including Java and C#!) feature them. (Of course, no one in those communities knows them by the scary-sounding name "higher-rank types".)
The example I'm going to give is a textbook implementation of the Visitor pattern, which I use all the time in my daily work. This answer is not intended as an introduction to the visitor pattern; that knowledge is readily available elsewhere.
In this fatuous imaginary HR application, we wish to operate on employees who may be full-time permanent staff or temporary contractors. My preferred variant of the Visitor pattern (and indeed the one which is relevant to RankNTypes) parameterises the visitor's return type.
interface IEmployeeVisitor<T>
{
    T Visit(PermanentEmployee e);
    T Visit(Contractor c);
}

class XmlVisitor : IEmployeeVisitor<string> { /* ... */ }
class PaymentCalculator : IEmployeeVisitor<int> { /* ... */ }
The point is that a number of visitors with different return types can all operate on the same data. This means IEmployee must express no opinion as to what T ought to be.
interface IEmployee
{
    T Accept<T>(IEmployeeVisitor<T> v);
}

class PermanentEmployee : IEmployee
{
    // ...
    public T Accept<T>(IEmployeeVisitor<T> v)
    {
        return v.Visit(this);
    }
}

class Contractor : IEmployee
{
    // ...
    public T Accept<T>(IEmployeeVisitor<T> v)
    {
        return v.Visit(this);
    }
}
I wish to draw your attention to the types. Observe that IEmployeeVisitor universally quantifies its return type, whereas IEmployee quantifies it inside its Accept method - that is to say, at a higher rank. Translating clunkily from C# to Haskell:
data IEmployeeVisitor r = IEmployeeVisitor {
    visitPermanent  :: PermanentEmployee -> r,
    visitContractor :: Contractor -> r
  }

newtype IEmployee = IEmployee {
    accept :: forall r. IEmployeeVisitor r -> r
  }
So there you have it. Higher-rank types show up in C# when you write types containing generic methods.
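To round off the Haskell translation, here is a hedged sketch of how the pieces fit together at the value level (the PermanentEmployee and Contractor stubs, the smart constructors, and the describe visitor are my own illustrations, not part of the original answer):

{-# LANGUAGE RankNTypes #-}

data PermanentEmployee = PermanentEmployee { permName :: String }   -- stub
data Contractor        = Contractor        { dayRate  :: Int }      -- stub

data IEmployeeVisitor r = IEmployeeVisitor
  { visitPermanent  :: PermanentEmployee -> r
  , visitContractor :: Contractor -> r
  }

newtype IEmployee = IEmployee { accept :: forall r. IEmployeeVisitor r -> r }

-- Smart constructors play the role of the C# classes' Accept methods.
permanent :: PermanentEmployee -> IEmployee
permanent e = IEmployee (\v -> visitPermanent v e)

contractor :: Contractor -> IEmployee
contractor c = IEmployee (\v -> visitContractor v c)

-- A visitor with return type String, in the spirit of the XmlVisitor.
describe :: IEmployeeVisitor String
describe = IEmployeeVisitor
  { visitPermanent  = \e -> "permanent: " ++ permName e
  , visitContractor = \c -> "contractor at " ++ show (dayRate c)
  }

staff :: [IEmployee]
staff = [permanent (PermanentEmployee "Ada"), contractor (Contractor 500)]

-- ghci> map (\e -> accept e describe) staff
-- ["permanent: Ada","contractor at 500"]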
For those familiar with object oriented languages, a higher-rank function is simply a generic function that expects as its argument another generic function.
E.g. in TypeScript you could write:
type WithId<T> = T & { id: number }
type Identifier = <T>(obj: T) => WithId<T>
type Identify = <TObj>(obj: TObj, f: Identifier) => WithId<TObj>
See how the generic function type Identify demands a generic function of the type Identifier? This makes Identify a higher-rank function.
