What is a contravariant functor? - haskell

The type blows my mind:
class Contravariant (f :: * -> *) where
  contramap :: (a -> b) -> f b -> f a
Then I read this, but contrary to the title, I wasn't any more enlightened.
Can someone please give an explanation of what a contravariant functor is and some examples?

From a programmer's point of view the essence of functor-ness is being able to easily adapt things. What I mean by "adapt" here is that if I have an f a and I need an f b, I'd like an adaptor that will fit my f a in my f b-shaped hole.
It seems intuitive that if I can turn an a into a b, that I might be able to turn a f a into an f b. And indeed that's the pattern that Haskell's Functor class embodies; if I supply an a -> b function then fmap lets me adapt f a things into f b things, without worrying about whatever f involves.1
Of course we're talking about parameterised types like list-of-x [x], Maybe y, or IO z here, and the thing we get to change with our adaptors is the x, y, or z in those. If we want the flexibility to get an adaptor from any possible function a -> b then of course the thing we're adapting has to be equally applicable to any possible type.
What is less intuitive (at first) is that there are some types which can be adapted almost exactly the same way as functory ones, only they're "backwards"; for these, if we want to adapt an f a to fill a need for an f b we actually need to supply a b -> a function, not an a -> b one!
My favourite concrete example is actually the function type a -> r (a for argument, r for result); all of this abstract nonsense makes perfect sense when applied to functions (and if you've done any substantial programming you've almost certainly used these concepts without knowing the terminology or how widely-applicable they are), and the two notions are so obviously dual to each other in this context.
It's fairly well known that a -> r is a functor in r. This makes sense; if I've got an a -> r and I need an a -> s, then I could use an r -> s function to adapt my original function simply by post-processing the result.2
If, on the other hand, I have an a -> r function and what I need is a b -> r, then again it's clear that I can address my need by pre-processing arguments before passing them to the original function. But what do I pre-process them with? The original function is a black box; no matter what I do it's always expecting a inputs. So I need to turn my b values into the a values it expects: my pre-processing adaptor needs a b -> a function.
What we've just seen is that the function type a -> r is a covariant functor in r, and a contravariant functor in a. I think of this as saying we can adapt a function's result, and the result type "changes with" the adaptor r -> s, while when we adapt a function's argument the argument type changes "in the opposite direction" to the adaptor.
Interestingly, the implementation of the function-result fmap and the function-argument contramap are almost exactly the same thing: just function composition (the . operator)! The only difference is on which side you compose the adaptor function:3
fmap :: (r -> s) -> (a -> r) -> (a -> s)
fmap adaptor f = adaptor . f
fmap adaptor = (adaptor .)
fmap = (.)
contramap' :: (b -> a) -> (a -> r) -> (b -> r)
contramap' adaptor f = f . adaptor
contramap' adaptor = (. adaptor)
contramap' = flip (.)
I consider the second definition from each block the most insightful; (covariantly) mapping over a function's result is composition on the left (post-composition if we want to take a "this-happens-after-that" view), while contravariantly mapping over a function's argument is composition on the right (pre-composition).
This intuition generalises pretty well; if an f x structure can give us values of type x (just like an a -> r function gives us r values, at least potentially), it might be a covariant Functor in x, and we could use an x -> y function to adapt it into being an f y. But if an f x structure receives values of type x from us (again, like an a -> r function's argument of type a), then it might be a Contravariant functor and we'd need to use a y -> x function to adapt it to being an f y.
I find it interesting to reflect that this "sources are covariant, destinations are contravariant" intuition reverses when you're thinking from the perspective of an implementer of the source/destination rather than a caller. If I'm trying to implement an f x that receives x values I can "adapt my own interface" so I get to work with y values instead (while still presenting the "receives x values" interface to my callers) by using an x -> y function. Usually we don't think this way around; even as the implementer of the f x I think about adapting the things I'm calling rather than "adapting my caller's interface to me". But it's another perspective you can take.
The only semi-real-world use I've made of Contravariant (as opposed to implicitly using the contravariance of functions in their arguments by using composition-on-the-right, which is very common) was for a type Serialiser a that could serialise a values. Serialiser had to be a Contravariant rather than a Functor; given I can serialise Foos, I can also serialise Bars if I can turn a Bar into a Foo.4 But when you realise that Serialiser a is basically a -> ByteString it becomes obvious; I'm just repeating a special case of the a -> r example.
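A minimal sketch of that Serialiser idea (the concrete names Serialiser, runSerialiser, Foo and Bar here are illustrative, not the original code):
import Data.ByteString.Char8 (ByteString, pack)
import Data.Functor.Contravariant (Contravariant (..))
newtype Serialiser a = Serialiser { runSerialiser :: a -> ByteString }
instance Contravariant Serialiser where
  -- to serialise b values, first turn them into a values, then serialise those
  contramap f (Serialiser s) = Serialiser (s . f)
newtype Foo = Foo Int
newtype Bar = Bar Int
serialiseFoo :: Serialiser Foo
serialiseFoo = Serialiser (\(Foo n) -> pack (show n))
serialiseBar :: Serialiser Bar
serialiseBar = contramap (\(Bar n) -> Foo n) serialiseFoo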
In pure functional programming, there's not very much use in having something that "receives values" without it also giving something back, so all the contravariant functors tend to look like functions; on the other hand, nearly any straightforward data structure that can contain values of an arbitrary type will be a covariant functor in that type parameter. This is why Functor stole the good name early and is used all over the place (well, that and the fact that Functor was recognised as a fundamental part of Monad, which was already in wide use before Functor was defined as a class in Haskell).
In imperative OO I believe contravariant functors may be significantly more common (but not abstracted over with a unified framework like Contravariant), although it's also very easy to have mutability and side effects mean that a parameterised type just couldn't be a functor at all (commonly: your standard container of a that is both readable and writable is both an emitter and a sink of a, and rather than meaning it's both covariant and contravariant it turns out that means it's neither).
1 The Functor instance of each individual f says how to apply arbitrary functions to the particular form of that f, without worrying about the particular types f is being applied to; a nice separation of concerns.
2 This functor is also a monad, equivalent to the Reader monad. I'm not going to go beyond functors in detail here, but given the rest of my post an obvious question would be "is the a -> r type also some sort of contravariant monad in a then?". Contravariance doesn't apply to monads unfortunately (see Are there contravariant monads?), but there is a contravariant analogue of Applicative: https://hackage.haskell.org/package/contravariant-1.4/docs/Data-Functor-Contravariant-Divisible.html
3 Note that my contramap' here doesn't match the actual contramap from Contravariant as implemented in Haskell; you can't make a -> r an actual instance of Contravariant in Haskell code simply because the a is not the last type parameter of (->). Conceptually it works perfectly well, and you can always use a newtype wrapper to swap the type parameters and make that an instance (the contravariant package defines the Op type for exactly this purpose).
4 At least for a definition of "serialise" that doesn't necessarily include being able to reconstruct the Bar later, since it would serialise a Bar identically to the Foo it mapped to, with no way to include any information about what the mapping was.

First of all #haoformayor's answer is excellent so consider this more an addendum than a full answer.
Definition
One way I like to think about functors (co- and contravariant) is in terms of diagrams; the definitions are reflected in the ones below. (I am abbreviating contramap with cmap.)
covariant
f a ─── fmap φ ───▶ f b
  ▲                   ▲
  │                   │
  │                   │
  a ─────── φ ──────▶ b
contravariant
g a ◀─── cmap φ ─── g b
  ▲                   ▲
  │                   │
  │                   │
  a ─────── φ ──────▶ b
Note that the only change between those two definitions is the direction of the arrow on top (well, and the names, so I can refer to them as two different things).
Example
The example I always have in mind when speaking about these is functions - an example of f would be type F a = forall r. r -> a (the argument type r is arbitrary but fixed), or in other words all functions with a common input.
As always, the instance for the (covariant) Functor is just fmap ψ φ = ψ . φ.
The (contravariant) functor is all functions with a common result - type G a = forall r. a -> r; here the Contravariant instance would be
cmap ψ φ = φ . ψ
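Since forall r. r -> a isn't something you can put in an instance head directly, a small sketch with newtype wrappers (the names From and To are mine) makes the two instances above concrete:
import Data.Functor.Contravariant (Contravariant (..))
newtype From r a = From (r -> a)   -- "all functions with a common input r" (the F a above)
newtype To r a = To (a -> r)       -- "all functions with a common result r" (the G a above)
instance Functor (From r) where
  fmap psi (From phi) = From (psi . phi)    -- post-compose: adapt the result
instance Contravariant (To r) where
  contramap psi (To phi) = To (phi . psi)   -- pre-compose: adapt the argument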
But what the hell does this mean
φ :: a -> b and ψ :: b -> c
usually, therefore, (ψ . φ) x = ψ (φ x), or x ↦ y = φ x and y ↦ ψ y, makes sense; what is omitted in the statement for cmap is that here
φ :: a -> b but ψ :: c -> a
so ψ cannot take the result of φ, but it can transform its arguments into something φ can use - therefore x ↦ y = ψ x and y ↦ φ y is the only correct choice.
This is reflected in the following diagrams, but here we have abstracted over the example of functions with common source/target - to something that has the property of being covariant/contravariant, which is a thing you often see in mathematics and/or haskell.
covariant
f a ─── fmap φ ───▶ f b ─── fmap ψ ───▶ f c
  ▲                   ▲                   ▲
  │                   │                   │
  │                   │                   │
  a ─────── φ ──────▶ b ─────── ψ ──────▶ c
contravariant
g a ◀─── cmap φ ─── g b ◀─── cmap ψ ─── g c
  ▲                   ▲                   ▲
  │                   │                   │
  │                   │                   │
  a ─────── φ ──────▶ b ─────── ψ ──────▶ c
Remark:
In mathematics you usually require a law to call something a functor.
covariant
  a                          f a
  │ ╲                          │ ╲
φ │  ╲ ψ.φ      ══▷     fmap φ │  ╲ fmap (ψ.φ)
  ▼   ◀                        ▼   ◀
  b ──▶ c                    f b ────▶ f c
     ψ                           fmap ψ
contravariant
  a                          f a
  │ ╲                          ▲   ▶
φ │  ╲ ψ.φ      ══▷     cmap φ │  ╲ cmap (ψ.φ)
  ▼   ◀                        │   ╲
  b ──▶ c                    f b ◀─── f c
     ψ                           cmap ψ
which is equivalent to saying
fmap ψ . fmap φ = fmap (ψ.φ)
whereas
cmap φ . cmap ψ = cmap (ψ.φ)

First, a note about our friend, the Functor class
You can think of Functor f as an assertion that a never appears in the "negative position". This is an esoteric term for this idea: Notice that in the following datatypes the a appears to act as a "result" variable.
newtype IO a = IO (World -> (World, a))
newtype Identity a = Identity a
newtype List a = List (forall r. r -> (a -> List a -> r) -> r)
In each of these examples a appears in a positive position. In some sense the a for each type represents the "result" of a function. It might help to think of a in the second example as () -> a. And it might help to remember that the third example is equivalent to data List a = Nil | Cons a (List a). In callbacks like a -> List a -> r the a appears in a negative position, but the callback itself is in a negative position, so negative and negative multiply to be positive.
This scheme for signing the parameters of a function is elaborated in this wonderful blog post.
Now note that each of these types admit a Functor. That is no mistake! Functors are meant to model the idea of categorical covariant functors, which "preserve the order of the arrows" i.e. f a -> f b as opposed to f b -> f a. In Haskell, types where a never appears in a negative position always admit Functor. We say these types are covariant on a.
To put it another way, one could validly rename the Functor class to be Covariant. They are one and the same idea.
The reason this idea is worded so strangely with the word "never" is that a can appear in both a positive and negative location, in which case we say the type is invariant on a. a can also never appear (such as a phantom type), in which case we say the type is both covariant and contravariant on a – bivariant.
Back to Contravariant
So for types where a never appears in the positive position we say the type is contravariant in a. Every such type Foo a will admit an instance Contravariant Foo. Here are some examples, taken from the contravariant package:
data Void a (a is phantom)
data Unit a = Unit (a is phantom again)
newtype Const constant a = Const constant
newtype WriteOnlyStateVariable a = WriteOnlyStateVariable (a -> IO ())
newtype Predicate a = Predicate (a -> Bool)
newtype Equivalence a = Equivalence (a -> a -> Bool)
In these examples a is either bivariant or merely contravariant. a either never appears or is negative (in these contrived examples a always appears before an arrow so determining this is extra-simple). As a result, each of these types admit an instance Contravariant.
A more intuitive exercise would be to squint at these types (which exhibit contravariance) and then squint at the types above (which exhibit covariance) and see if you can intuit a difference in the semantic meaning of a. Maybe that is helpful, or maybe it is just still abstruse sleight of hand.
When might these be practically useful? Suppose, for example, we want to partition a list of cookies by what kind of chips they have. We have a chipEquality :: Chip -> Chip -> Bool. To obtain a Cookie -> Cookie -> Bool, we simply evaluate getEquivalence . contramap cookie2chip . Equivalence $ chipEquality.
Pretty verbose! But solving the problem of newtype-induced verbosity will have to be another question...
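Spelled out as a sketch, with made-up Cookie and Chip types (the field accessor in the contravariant package is getEquivalence):
import Data.Functor.Contravariant (Equivalence (..), contramap)
data Chip = Chocolate | Butterscotch deriving (Eq)
newtype Cookie = Cookie { cookie2chip :: Chip }
chipEquality :: Chip -> Chip -> Bool
chipEquality = (==)
-- adapt a Chip equivalence into a Cookie equivalence by pre-processing with cookie2chip
sameChips :: Cookie -> Cookie -> Bool
sameChips = getEquivalence . contramap cookie2chip . Equivalence $ chipEquality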
Other resources (add links here as you find them)
24 Days of Hackage: contravariant
Covariance, contravariance, and positive and negative positions
I love profunctors
Talk: Fun with Profunctors: I cannot overstate how great this talk is

I know this answer won't be as deeply academic as the other ones, but it's simply based on the common implementations of contravariant you'll come across.
First, a tip: Don't read the contramap function type using the same mental metaphor for f as you do when reading the good ol' Functor's fmap.
You know how you think:
"a thing that contains (or produces) an t"
...when you read a type like f t?
Well, you need to stop doing that, in this case.
The Contravariant functor is "the dual" of the classic functor so, when you see f a in contramap, you should think the "dual" metaphor:
f t is a thing that CONSUMES a t
Now contramap's type should start to make sense:
contramap :: (a -> b) -> f b ...
...pause right there, and the type is perfectly sensible:
A function that "produces" a b.
A thing that "consumes" a b.
First argument cooks the b. Second argument eats the b.
Makes sense, right?
Now finish writing the type:
contramap :: (a -> b) -> f b -> f a
So in the end this thing must yield a "consumer of a".
Well, surely we can build that, given that our first argument is a function that takes an a as input.
A function (a -> b) should be a good building block for building a "consumer of a".
So contramap basically lets you create a new "consumer", like this (warning: made-up symbols incoming):
(takes a as input / produces b as output) ~~> (consumer of b)
On the left of my made-up symbol: The first argument to contramap (i.e. (a -> b)).
On the right: The second argument (i.e. f b).
The whole thing glued together: The final output of contramap (a thing that knows how to consume an a, i.e. f a).
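A tiny concrete sketch of this consumer reading, using Predicate from Data.Functor.Contravariant (my example, not from the answer):
import Data.Functor.Contravariant (Predicate (..), contramap)
-- a Predicate a "consumes" an a and answers Bool
hasEvenLength :: Predicate String
hasEvenLength = contramap length (Predicate even)
-- length "cooks" an Int out of the String; Predicate even "eats" that Int
-- getPredicate hasEvenLength "haskell" == False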

Another view on the topic, limited to functions seen as contravariant functors. (See also this.)
Functions as containers (therefore Functors) of their result
A function f of type a -> b can be thought of as containing a value of type b, which we get access to when we feed a value of type a to f.
Now, things which are containers of other things can be made Functors, in the sense that we can apply a function g to their content, via applying fmap g to the functor itself.
Therefore, f, which is of type a -> b can be seen as a functor in b, i.e. (->) a can be made a Functor. To do so, we need to define fmap: fmapping a function g on the "content" of f essentially means applying g on whatever f returns (once it's fed with an input of type a, obviously), which means that fmap g f = \x -> g (f x) or, more concisely, fmap g f = g . f.
fmapping on a -> b functions = post-processing their result of type b
As a last thought: a function of type a -> b is a functor in b because we can post-process it by means of a function b -> c (where c is just another type).
contramapping on a -> b functions = pre-processing an input to get a
But what if we want to use a function g (of type c -> a) to pre-process a value of some type c to obtain the value of type a that we want to feed to f?
Well, it's clear that in this case, we want g to act before f, i.e. we are looking for f . g.
And we want f . g to be the "implementation" of the concept of "mapping g on f". In other words we want whichmap g f = f . g.
Guess what? whichmap is actually the implementation of contramap for functions! And contramap is what you have to implement to make some type an instance of the Contravariant functor typeclass.
Well, not really (-> b)...
Actually there isn't exactly an instance of Contravariant mirroring instance Functor ((->) r), I believe simply because instance Contravariant (-> r) / instance Contravariant (flip (->) r) are invalid syntax; so another type is created, via
newtype Op a b = Op { getOp :: b -> a }
and this is made an instance of Contravariant:
instance Contravariant (Op a) where
  contramap f g = Op (getOp g . f)
The last two chunks of code are taken from Hackage.
The example at the top of this page is very illuminating too.
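And a quick usage sketch of Op's contramap (my own example):
import Data.Functor.Contravariant (Op (..), contramap)
strLen :: Op Int String            -- Op Int String is just String -> Int
strLen = Op length
digitCount :: Op Int Int
digitCount = contramap show strLen -- measure an Int by the length of its show form
-- getOp digitCount 12345 == 5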

Related

Free theorem for fmap

Consider the following wrapper:
newtype F a = Wrap { unwrap :: Int }
I want to disprove (as an exercise to wrap my head around this interesting post) that there's a legitimate Functor F instance which allows us to apply functions of Int -> Int type to the actual contents and to ignore all other functions (i.e. fmap nonIntInt = id).
I believe this should be done with a free theorem for fmap (which I read here):
for given f, g, h and k, such that g . f = k . h: $map g . fmap f = fmap k . $map h, where $map is the natural map for the given constructor.
What defines a natural map? Am I right to assume that it is a simple flip const for F?
As far as I get it: $map f is what we denote as Ff in category theory. Thus, in a categorical sense, we simply want something along the lines of the following diagram to commute:
Yet, I do not know what to put instead of ???s (that is, what functor do we apply to get such a diagram and how do we denote this almost-fmap?).
So, what is a natural map in general, and for F? What is the proper diagram for fmap's free theorem?
Where am I going with this?
Consider:
f = const 42
g = id
h = const ()
k () = 42
It is easy to see that g . f is k . h. And yet, the non-existent fmap will execute only f, not k, giving different results. If my intuition about the naturality is correct, such a proof would work. That's what I am trying to figure out.
#leftaroundabout proposed a simpler piece of proof: fmap show . fmap (+1) alters the contents, unlike fmap $ show . (+1). It is a nice piece of proof, and yet I would still like to work with free theorems as an exercise.
So we are entertaining a function m :: forall a b . (a->b) -> F a -> F b such that (among other things)
m (1 +) (Wrap x) = (Wrap (1+x))
m (show) (Wrap x) = (Wrap x)
There are two somewhat related questions here.
Can a well-behaved fmap do this?
Can a parametric function do this?
The answer to both questions is "no".
A well-behaved fmap can't do this because fmap has to obey the axioms of Functor. Whether our environment is parametric or not is irrelevant. The axiom of Functor says that for all functions a and b, fmap (a . b) = fmap a . fmap b must hold, and this fails for a = show and b = (1 +). So m cannot be a well-behaved fmap.
A parametric function can't do this because that is what the parametricity theorem says. When viewing types as relations between terms, related functions take related arguments to related results. It is easy to see that m fails parametricity, but it is slightly easier to look at m' :: forall a b. (a -> b) -> (Int -> Int) (the two can be trivially converted to each other). (1 +) is related to show because m' is polymorphic in its argument, so different values of the argument can be related by any relation. Functions are relations, and there exists a function that sends (1 +) to show. However, the result type of m' has no type variables, so it corresponds to the constant relation (its values are only related to themselves). Since every value, including m', is related to itself, it follows that all parametric functions m' :: forall a b. (a -> b) -> (Int -> Int) must obey m' f = m' g, i.e. they must ignore their first argument. Which is intuitively obvious, since there is nothing to apply it to.
One can in fact deduce the first statement from the second by observing that a well-behaved fmap must be parametric. So even if the language allows non-parametricity, fmap cannot make any non-trivial use of it.

Is there such thing as a bidistributive? What function do I need here?

I have code (in C# actually, but this question has nothing to do with C# specifically, so I will speak of all my types in Haskell-speak) where I am working inside of an Either a b. I then bind a function with a signature that in Haskell-speak is b -> (c, d), after which I want to pull c to the outside and default it in the left case, i.e. I want (c, Either a d). Now this pattern occurred many times in one particular service I was writing, so I pulled out a method to do it. However, it bothers me whenever I just "make up" a method like this without understanding the correct theoretical underpinnings. In other words, what abstraction are we dealing with here?
I had a similar situation in some F# code where my pair and my either were reversed: (a, b) -> (b -> Either c d) -> Either c (a, d). I asked a friend what this was and he turned me on to traverse which made me very happy even though I have to make horrifically monomorphic implementations in F# due to the lack of typeclasses. (I wish I could remap my F1 in Visual Studio to Hackage; it is one of my primary resources for writing .NET code). The problem though is that traverse is:
class (Functor t, Foldable t) => Traversable t where
  traverse :: Applicative f => (a -> f b) -> t a -> f (t b)
Which means it works great when you start with a pair and want to "bind" an either to it, but does not work when you start with an either and want to end up with a pair, because pair is not an Applicative.
However I thought about my first case more, the one that is not traverse, and realize that "defaulting c in the left case" can just be done with mapping over the left case, which changes the problem to having this shape: Either (c, a) (c, d) -> (c, Either a d) which I recognize as the pattern that we see in arithmetic with multiplication and addition: a(b + c) = ab + ac. I also remembered that the same pattern exists in Boolean algebra and in set theory (if memory serves, A intersect (B union C) = (A intersect B) union (A intersect C)). Clearly there is some abstract algebraic structure here. However, memory does not serve, and I could not remember what it was called. A little poking around on Wikipedia quickly solved this: these are the distributive laws. And joy, oh joy, Kmett has given us distribute:
class Functor g => Distributive g where
  distribute :: Functor f => f (g a) -> g (f a)
It even has a cotraverse because it is dual to Traversable! Lovely!! However, I noticed that there is no (,) instance. Uh oh. Because, yeah, where does the "default c value" come into all this? Then I realized, uh oh, perhaps I need something like a bidistributive based on a bifunctor? Perhaps dual to bitraversable? Conceptually:
class Bifunctor g => Bidistributive g where
  bidistribute :: Bifunctor f => f (g a b) (g a c) -> g a (f b c)
This seems to be the structure of the distributive law I am talking about. I can't find such a thing in Haskell which doesn't matter to me in and of itself since I am actually writing C#. However, the thing that is important to me is to not be coming up with bogus abstractions, and yet to recognize as many lawful abstractions in my code as possible, whether they are expressed as such or not, for my own understanding.
I currently have a .InsideOut(<default>) function (extension method) in my C# code (what a hack, right!). Would I be totally off-base to create a (yes, sadly monomorphic) .Bidistribute(...) function (extension method) to replace it and map the "default" for the left case into the left case before invoking it (or just recognize the "bidistributive" character of "inside out")?
bidistribute can't be implemented as such. Consider the trivial example
data Biconst c a b = Biconst c
instance Bifunctor (Biconst c) where
  bimap _ _ (Biconst c) = Biconst c
Then we'd have the specialisation
bidistribute :: Biconst () (Void, ()) (Void, ()) -> (Void, Biconst () () ())
bidistribute (Biconst ()) = ( ????, Biconst () )
There's clearly no way to fill in the gap, which would need to have type Void.
Actually, I think you really need Either there (or something isomorphic to it) rather than an arbitrary bifunctor. Then your function is just
uncozipL :: Functor f => Either (f a) (f b) -> f (Either a b)
uncozipL (Left l) = Left <$> l
uncozipL (Right r) = Right <$> r
It's defined in adjunctions (found using Hoogle).
Based on #leftaroundabout's tip-off to look at adjunctions, in addition to the uncozipL that he mentions in his answer, if we defer "defaulting the first value of the pair in the left case of the either", we can also solve this with unzipR:
unzipR :: Functor u => u (a, b) -> (u a, u b)
Then it would still be necessary to map over the first element in the pair and pull out the value with something like either (const "default") id. The interesting thing about this is that if you use uncozipL, you need to know that one of the things is a pair. If you use unzipR, you need to know that one is an either. In neither case do you use an abstract bifunctor.
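Both routes can be spelled out for the original (c, Either a d) shape; here is a self-contained sketch (defining the two combinators locally instead of importing adjunctions, and using insideOut as a stand-in name for the original operation):
import Data.Bifunctor (first)
uncozipL :: Functor f => Either (f a) (f b) -> f (Either a b)
uncozipL (Left l)  = Left  <$> l
uncozipL (Right r) = Right <$> r
unzipR :: Functor u => u (a, b) -> (u a, u b)
unzipR u = (fst <$> u, snd <$> u)
-- Route 1: default first, then distribute (here f is the pair functor (,) c).
insideOut1 :: c -> Either a (c, d) -> (c, Either a d)
insideOut1 def = uncozipL . first (\a -> (def, a))
-- Route 2: distribute first, then default the left case of the first component.
insideOut2 :: c -> Either a (c, d) -> (c, Either a d)
insideOut2 def = first (either (const def) id) . unzipR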
Further, it seems that the pattern or abstraction that I'm looking for is a distributive lattice. Wikipedia says:
A lattice (L,∨,∧) is distributive if the following additional identity holds for all x, y, and z in L:
x ∧ (y ∨ z) = (x ∧ y) ∨ (x ∧ z).
which is exactly the property I have observed occurring in many different places.

Polymorphic reasoning

I am learning Haskell, and on the internet I found this paper by Philip Wadler.
I read it and did not understand it at all, but it somehow connects to polymorphic functions.
For example:
polyfunc :: a -> a -> a
It is a function polymorphic in the type a.
What is the free theorem in connection of the example polyfunc?
I feel like if I actually understood that paper then any code I wrote would be coauthored by God.
My best guess for this problem though is that all polyfunc can do is either always return the first argument or always return the second argument. So there are actually only two implementations of polyfunc,
polyfuncA a _ = a
polyfuncB _ b = b
The paper gives you a way to prove that claim.
This is a very important concept. For example, I've been involved in data quality research previously. This free theorem says that there is no function which can select the better of two arbitrary pieces of data. We have to know something more. It's actually a no-brainer that I was surprised to find some people willing to overlook.
I've never really understood the algorithm laid out in that paper either, so I thought I would try to figure it out.
(1) Type of function in question
f :: a -> a -> a
(2) Rephrasing as a relation
f : ∀X. X -> X -> X
(3) By parametricity
(f, f) ∈ ∀X. X -> X -> X
(4) By definition of ∀ on relations
for all Q : A <=> A',
(fA, fA') ∈ Q -> Q -> Q
(5) Applying definition of -> on relations to the first -> in (4)
for all Q : A <=> A',
for all (x, x') ∈ Q,
(fA x, fA' x') ∈ Q -> Q
(6) Applying definition of -> on relations to (5)
for all Q : A <=> A',
for all (x, x') ∈ Q,
for all (y, y') ∈ Q,
(fA x y, fA' x' y') ∈ Q
At this point I was done expanding the relational definition, but wasn't sure how to get this back from relations into terms of functions and types, so I went and found a webapp that will automatically derive the free theorem for a type. I won't spoil (yet) what result it gives, but looking at it did help me figure out the next step in my proof.
The next step is to get back into function-land from relation-land, by noting that Q can be the type of any function at all and this will still hold.
(7) Specializing Q to a function g :: p -> q
for all p, q
for all g :: p -> q
where g x = x'
and g y = y'
g (f x y) = f x' y'
(8) By definitions of x' and y'
for all p, q
for all g :: p -> q
g (f x y) = f (g x) (g y)
That looks true, right? It is equivalent to use g to transform both elements and then let f choose between them, or to let f choose an element and then transform it with g. By parametricity, f can't change whether it chooses the left or right element based on anything that g does.
Of course the claim given in trevor cook's answer is true: f must either always choose its first argument, or always choose its second. I'm not sure whether the free theorem I derived is equivalent to that, or is a weaker version of it.
And incidentally, this is a special case of something that is already covered explicitly in the paper. It gives the theorem for
k :: x -> y -> x
which of course is the same as your function f, where x ~ a and y ~ a. The result that it gives is the same as the one I described:
for all a, b, x, y
a (k x y) = k (a x) (b y)
if we choose b=a to make the two results equivalent.
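The derived theorem is easy to sanity-check in code; a small sketch (RankNTypes is only needed so we can pass a polymorphic f around):
{-# LANGUAGE RankNTypes #-}
polyfuncA, polyfuncB :: a -> a -> a
polyfuncA a _ = a
polyfuncB _ b = b
-- g (f x y) == f (g x) (g y) for every parametric f and any g whatsoever
freeTheoremHolds :: Eq q => (forall c. c -> c -> c) -> (p -> q) -> p -> p -> Bool
freeTheoremHolds f g x y = g (f x y) == f (g x) (g y)
-- e.g. freeTheoremHolds polyfuncA show (1 :: Int) 2 == True
--      freeTheoremHolds polyfuncB show (1 :: Int) 2 == True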
Wadler's "Theorems for free" (TFF) paper is not a good reference for learning about relational parametricity. The TFF paper is focused on abstract theory but, if you need a practical algorithm, the paper does not actually give it to you. The explanations in TFF have a number of important omissions that will confuse anyone who does not already know a great deal about relational parametricity.
The first thing TFF does not explain is that a function will satisfy a "free theorem" only if the code of the function is fully parametric (restricted in a number of ways).
"Fully parametric" code is a function with type parameters, whose arguments are typed using only those type parameters, and whose code is purely functional and does not try to examine at run time what types are assigned to the type parameters. The code must treat all types as totally unknown, arbitrary type parameters. The code must work with all types in the same way.
With these restrictions, the code will satisfy a certain law, in most cases this will be the "naturality" law, but in some cases the law will be more complicated. The paper "Theorems for free" shows many examples of such laws but does not explain a general algorithm for deriving those theorems.
To prove that those laws always hold, one uses the technique of relational parametricity. This is a complicated and powerful technique where one replaces functions (viewed as many-to-one binary relations) by arbitrary (many-to-many) binary relations and then reformulates the naturality law in terms of relations. The result is a "relational naturality law". At the end, one replaces relations again by functions and tries to derive an equation.
I recently recorded a tutorial about relational parametricity, with code examples in Scala. https://www.youtube.com/watch?v=Jf2VFB90Q0s&list=PLcoadSpY7rHUO0I1zdcbu9EeByYbwPSQ6
My tutorial does not follow Wadler's TFF paper but instead explains a simple and straightforward approach focused on practical results: how to derive the free theorem and reason about relations effectively. In this approach, it becomes easier to derive the "free theorem" for a given type and also to prove the "parametricity theorem": fully parametric functions will always satisfy one "free theorem" per type parameter.
Of course, for practical usage you don't necessarily need to go through the proof of the parametricity theorem, but you do need to be able to write the free theorem itself.
The key element of my tutorial is the idea of "lifting a relation to a type constructor". If you have a relation r: A <=> B and a type constructor F A then you can lift r to a relation of type F A <=> F B.
This lifting operation is denoted by rmap_F. The relation rmap_F r has type F A <=> F B.
The lifting operation rmap_F is defined by induction on the type structure of F. The details of that definition are somewhat technical (and are not adequately explained in the TFF paper!). The most important step in learning about relational parametricity is to understand the practical technique for lifting relations to type constructors. This is explained in my tutorial, and it's too long to write here. The definition explains how to lift r to a trivial type constructor F A = C where C is a fixed type, to F A = A, to F A = Either (G A) (H A), to F A = (G A, H A), to F A = G A -> H A, etc.
The definition of rmap is analogous to the functor lifting fmap that lifts a function of type A -> B to a function of type F A -> F B. However, the functor lifting works only for covariant F, while the relational lifting works for any F, even if it is neither covariant nor contravariant, such as F A = A -> A -> A. This is the crucial feature that shows why relational technique is useful at all.
Let us apply the relational technique to the type forall A. A -> A -> A.
We define F A = A -> A -> A. We take an arbitrary fully parametric function t of type forall A. F A. The relational naturality law says: for any relation r: A <=> B between any types A and B the function t must be in the relation (t, t) ∈ rmap_F r.
Now we need to do two things: 1) select r to be the graph relation of some function f: A -> B, denoted r = graph f, and 2) use the definition of rmap_F to compute rmap_F (graph f) explicitly.
The definition of rmap_F gives:
(t1, t2) ∈ rmap_F r ===
for all a1: A, a2: A, b1: B, b2: B,
if (a1, b1) ∈ r and (a2, b2) ∈ r
then (t1 a1 a2, t2 b1 b2) ∈ r
Translating this with r = graph f, we get:
(a1, b1) ∈ r === b1 = f a1
(a2, b2) ∈ r === b2 = f a2
(t a1 a2, t b1 b2) ∈ r === t b1 b2 = f (t a1 a2)
So, we obtain the following law:
for all a1: A, a2: A,
t (f a1) (f a2) = f (t a1 a2)
This is actually a naturality law. This is the "free theorem" satisfied by t.

Curry Howard correspondence and equality

A while ago I read that the function type a -> b corresponds to the relation a ≤ b, or is it a ≥ b? This makes sense to me because two types are isomorphic if we have a bijection between them (i.e. (a ≈ b) ≡ (a -> b, b -> a)). Similarly, (a = b) ≡ (a ≤ b) ∧ (a ≥ b).
I know that this is not the Curry-Howard-Lambek correspondence (i.e. the correspondence between type theory, logic and category theory). It's the correspondence between type theory and something else. I want to learn more about this correspondence. Could somebody point me in the right direction?
I know that this doesn't seem like a programming question but it is related to programming and I'm hoping that some functional programmer knows more about it and can point me in the right direction.
Every pre-ordered set forms a category. Let (S, «) be a pre-ordered set. Define a category C whose objects are the elements of S and with Hom(a, b) inhabited by (a, b) if a « b and uninhabited otherwise. Define composition the only way you possibly can. The category laws follow immediately from the transitivity and reflexivity of the pre-order.
A lattice, in particular, will form a category admitting finite products and coproducts. A bounded lattice will form one with initial and final objects.
Types and functions in a sufficiently well-behaved functional language also form a category with finite products and coproducts, and initial and final objects. So if you squint out to a categorical blur, these things will start to look vaguely similar.
(This is more a comment than an answer, but I need more space.)
The type a -> b corresponds to a <= b. This is useful, e.g., to speak about fixed points at the type level, which are needed to properly define recursive types (lists, trees, ...).
Recall how recursion is solved, without categories. In domain theory, given a function f :: a -> a we look for a least x satisfying f x = x (least fixed point). This turns out to also be the least x satisfying f x <= x (least prefixed point). We then get the induction principle
f y <= y ==> fix f <= y
which basically states that, if we have any prefixed point y, then the least (pre)fixed point fix f must be less than y -- indeed, it is the least!
Now, let's sprinkle some category powder over that. Implication becomes an arrow ->, and <= also becomes ->. We get
(f y -> y) -> fix f -> y
Looks familiar, where did I see that...? Ah!
newtype Fix f = Fix { unFix :: f (Fix f) }
cata :: Functor f => (f y -> y) -> Fix f -> y
cata g = g . fmap (cata g) . unFix
Hence, the general eliminator/catamorphism cata is just a category-empowered version of the good old induction principle.
Note how domain points y are now objects in our category (i.e. types). Also, functions f must be applicable to y, so these are not morphisms in our category (which would be function values :: A -> B, from some type to some type), but correspond to functors in the category of types (mapping types to types :: * -> *).
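To make the catamorphism above concrete, here is a small usage sketch (ListF and sumAlg are my own illustration; Fix and cata are repeated from the answer so the snippet stands alone):
{-# LANGUAGE DeriveFunctor #-}
newtype Fix f = Fix { unFix :: f (Fix f) }
cata :: Functor f => (f y -> y) -> Fix f -> y
cata g = g . fmap (cata g) . unFix
data ListF a r = NilF | ConsF a r deriving (Functor)
sumAlg :: ListF Int Int -> Int
sumAlg NilF          = 0
sumAlg (ConsF x acc) = x + acc
sumList :: Fix (ListF Int) -> Int
sumList = cata sumAlg
-- sumList (Fix (ConsF 1 (Fix (ConsF 2 (Fix NilF))))) == 3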

How much is applicative really about applying, rather than "combining"?

For an uncertainty-propagating Approximate type, I'd like to have instances for Functor through Monad. This however doesn't work because I need a vector space structure on the contained types, so they must actually be restricted versions of the classes. As there still doesn't seem to be a standard library for those (or is there? please point me. There's rmonad, but it uses * rather than Constraint as the context kind, which seems just outdated to me), I wrote my own version for the time being.
It all works easily for Functor
class CFunctor f where
  type CFunctorCtxt f a :: Constraint
  cfmap :: (CFunctorCtxt f a, CFunctorCtxt f b) => (a -> b) -> f a -> f b

instance CFunctor Approximate where
  type CFunctorCtxt Approximate a = FScalarBasisSpace a
  f `cfmap` Approximate v us = Approximate v' us'
    where v' = f v
          us' = ...
but a direct translation of Applicative, like
class CFunctor f => CApplicative' f where
  type CApplicative'Ctxt f a :: Constraint
  cpure' :: (CApplicative'Ctxt f a) => a -> f a
  (#<*>#) :: ( CApplicative'Ctxt f a
             , CApplicative'Ctxt f (a->b)
             , CApplicative'Ctxt f b ) => f (a->b) -> f a -> f b
is not possible because functions a->b do not have the necessary vector space structure* FScalarBasisSpace.
What does work, however, is to change the definition of the restricted applicative class:
class CFunctor f => CApplicative f where
  type CApplicativeCtxt f a :: Constraint
  cpure :: CApplicativeCtxt f a => a -> f a
  cliftA2 :: ( CApplicativeCtxt f a
             , CApplicativeCtxt f b
             , CApplicativeCtxt f c ) => (a->b->c) -> f a -> f b -> f c
and then defining <*># rather than cliftA2 as a free function
(<*>#) = cliftA2 ($)
instead of a method. Without the constraint, that's completely equivalent (in fact, many Applicative instances go this way anyway), but in this case it's actually better: (<*>#) still has the constraint on a->b which Approximate can't fulfill, but that doesn't hurt the applicative instance, and I can still do useful stuff like
ghci> cliftA2 (\x y -> (x+y)/x^2) (3±0.2) (5±0.3) :: Approximate Double
0.8888888888888888 +/- 0.10301238090045711
I reckon the situation would be essentially the same for many other uses of CApplicative, for instance the Set example that's already given in the original blog post on constraint kinds.
So my question:
is <*> more fundamental than liftA2?
Again, in the unconstrained case they're equivalent anyway. I actually have found liftA2 easier to understand, but in Haskell it's probably just more natural to think about passing "containers of functions" rather than containers of objects and some "global" operation to combine them. And <*> directly induces all the liftAμ for μ ∊ ℕ, not just liftA2; doing that from liftA2 only doesn't really work.
But then, these constrained classes seem to make quite a point for liftA2. In particular, it allows CApplicative instances for all CMonads, which does not work when <*># is the base method. And I think we all agree that Applicative should always be more general than Monad.
What would the category theorists say to all of this? And is there a way to get the general liftAμ without a->b needing to fulfill the associated constraint?
*Linear functions of that type actually do have the vector space structure, but I definitely can't restrict myself to those.
As I understand it (as a non-category-theorist), the fundamental operation is zip :: f a -> f b -> f (a, b) (mapping a pair of effectful computations to an effectful computation resulting in a pair).
You can then define
fx <*> fy = uncurry ($) <$> zip fx fy
liftA2 g fx fy = uncurry g <$> zip fx fy
See this post by Edward Yang, which I found via the Typeclassopedia.
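Conversely, in the unconstrained case zip, <*> and liftA2 are all interdefinable, which is why the choice of primitive only starts to matter once constraints enter the picture; a quick sketch against the standard Applicative class:
import Control.Applicative (liftA2)
zipA :: Applicative f => f a -> f b -> f (a, b)
zipA = liftA2 (,)                     -- zip from liftA2
liftA2' :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2' g fx fy = g <$> fx <*> fy     -- liftA2 from <*>
ap' :: Applicative f => f (a -> b) -> f a -> f b
ap' = liftA2 ($)                      -- <*> from liftA2, just like (<*>#) = cliftA2 ($) above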

Resources