For any Applicative instance, once <*> is written, pure is uniquely determined. Suppose you have pure1 and pure2, both of which obey the laws. Then
pure2 f <*> pure1 y = pure1 ($ y) <*> pure2 f -- interchange for pure1
pure2 id <*> pure1 y = pure1 ($ y) <*> pure2 id -- specialize f to id
pure1 y = pure1 ($ y) <*> pure2 id -- identity for pure2
pure1 y = fmap ($ y) (pure2 id) -- applicative/fmap law for pure1
pure1 y = pure2 ($ y) <*> pure2 id -- applicative/fmap law for pure2
pure1 y = pure2 y -- homomorphism law
But using the fmap law this way feels like a cheat. Is there a way to avoid that without resorting to appeals to parametricity?
The laws as given in the current documentation do rely on parametricity to connect to fmap.
Without parametricity, we lose that connection, as we cannot even prove the uniqueness of fmap, and indeed there are models/extensions of System F where fmap is not unique.
A simple example of breaking parametricity is to add type-case (pattern-matching on types). This lets you define the following twist, which inspects the type of its argument and flips any booleans it finds:
twist :: forall a. a -> a
twist #Bool = not
twist #(a -> b) = \f -> (\x -> twist #b (f (twist #a x)))
twist #a = id -- default case
It has the type of polymorphic identity, but it is not the identity function.
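For intuition, here is a rough, hedged approximation in actual Haskell using Typeable (it only handles the Bool case, and the name twistApprox is just illustrative); note that the Typeable constraint changes the type, which is precisely how Haskell keeps parametricity intact:
import Data.Maybe (fromMaybe)
import Data.Typeable (Typeable, cast)

-- Negate the argument if it happens to be a Bool, otherwise return it unchanged.
twistApprox :: Typeable a => a -> a
twistApprox x = fromMaybe x (cast =<< (not <$> cast x))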
You can then define the following "twisted identity" functor, where pure applies twist to its argument:
newtype I a = I { runI :: a }
pure :: a -> I a
pure = I . twist
(<*>) :: I (a -> b) -> I a -> I b -- The usual, no twist
(<*>) (I f) (I x) = I (f x)
A key property of twist is that twist . twist = id. This allows it to cancel out with itself when composing values embedded using pure, thus guaranteeing the four laws of Control.Applicative. (Proof sketch in Coq)
This twisted functor also yields a different definition of fmap, namely \f u -> pure f <*> u. Unfolded definition:
fmap :: (a -> b) -> I a -> I b
fmap f (I x) = I (twist (f (twist x)))
This does not contradict duplode's answer, which translates the usual argument for the uniqueness of the identity of a monoid to the setting of monoidal functors, but it undermines its approach. The issue is that this view assumes you already have a given functor and that the monoidal structure is compatible with it. In particular, the law fmap f u = pure f <*> u is implied by defining pure as \x -> fmap (const x) funit (and (<*>) accordingly). That argument breaks down if you haven't fixed fmap in the first place, because then you have no coherence laws to rely on.
Let's switch to the monoidal functor presentation of applicative:
funit :: F ()
fzip :: (F a, F b) -> F (a, b)
fzip (funit, v) ~ v
fzip (u, funit) ~ u
fzip (u, fzip (v, w)) ~ fzip (fzip (u, v), w)
If we specialise a and b to () (and look past the tuple isomorphisms), the laws tell us that funit and fzip specify a monoid. Since the identity of a monoid is unique, funit is unique. For the usual Applicative class, pure can then be recovered as...
pure a = fmap (const a) funit
... which is just as unique. While in principle it makes sense to try to extend this reasoning to at least some functors that aren't fully parametric, doing so might require compromises in a lot of places:
The availability of () as an object, to define funit and state the monoidal functor laws;
The uniqueness of the map function used to define pure (see also Li-yao Xia's answer), or, failing that, a sensible way to somehow uniquely specify a fmap . const analogue;
The availability of function types as objects, for the sake of stating the Applicative laws in terms of (<*>).
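To make the presentation above concrete, here is a hedged sketch of it as a Haskell class, with pure and an analogue of (<*>) recovered as described; the names Monoidal, pureM, and apM are only illustrative:
-- The monoidal presentation, assuming a Functor is already given.
class Functor f => Monoidal f where
  funit :: f ()
  fzip  :: (f a, f b) -> f (a, b)

-- pure recovered as above.
pureM :: Monoidal f => a -> f a
pureM a = fmap (const a) funit

-- (<*>) recovered accordingly.
apM :: Monoidal f => f (a -> b) -> f a -> f b
apM ff fa = fmap (\(g, a) -> g a) (fzip (ff, fa))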
According to Haskell's library documentation, every instance of the Applicative class must satisfy the four rules:
identity: pure id <*> v = v
composition: pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
homomorphism: pure f <*> pure x = pure (f x)
interchange: u <*> pure y = pure ($ y) <*> u
It then says that as a consequence of these rules, the underlying Functor instance will satisfy fmap f x = pure f <*> x. But since the method fmap does not even appear in the above equations, how exactly does this property follow from them?
Update: I've substantially expanded the answer. I hope it helps.
"Short" answer:
For any functor F, there is a "free theorem" (see below) for the type:
(a -> b) -> F a -> F b
This theorem states that for any (total) function, say foo, with this type, the following will be true for any functions f, f', g, and h, with appropriate matching types:
If f' . g = h . f, then foo f' . fmap g = fmap h . foo f.
Note that it is not at all obvious why this should be true.
Anyway, if you set f = id and g = id and use the functor law fmap id = id, this theorem simplifies to:
For all h, we have foo h = fmap h . foo id.
Now, if F is also an applicative, then the function:
foo :: (a -> b) -> F a -> F b
foo f x = pure f <*> x
has the right type, so it satisfies the theorem. Therefore, for all h, we have:
pure h <*> x
-- by definition of foo
= foo h x
-- by the specialized version of the theorem
= (fmap h . foo id) x
-- by definition of the operator (.)
= fmap h (foo id x)
-- by the definition of foo
= fmap h (pure id <*> x)
-- by the identity law for applicatives
= fmap h x
In other words, the identity law for applicatives implies the relation:
pure h <*> x = fmap h x
It is unfortunate that the documentation does not include some explanation or at least acknowledgement of this extremely non-obvious fact.
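As a quick sanity check of the derived law pure h <*> x = fmap h x for a concrete functor, here is a minimal, hedged example using only the standard Maybe instances (checkDerivedLaw is an illustrative name):
-- Both cases should evaluate to True.
checkDerivedLaw :: Bool
checkDerivedLaw =
  (pure (+1) <*> Just 2) == fmap (+1) (Just 2)
    && (pure (+1) <*> Nothing) == fmap (+1) (Nothing :: Maybe Int)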
Longer answer:
Originally, the documentation listed the four laws (identity, composition, homomorphism, and interchange), plus two additional laws for *> and <* and then simply stated:
The Functor instance should satisfy
fmap f x = pure f <*> x
The wording above was replaced with the new text:
As a consequence of these laws, the Functor instance for f will satisfy
fmap f x = pure f <*> x
as part of commit 92b562403 in February 2011 in response to a suggestion made by Russell O'Connor on the libraries list.
Russell pointed out that this rule was actually implied by the other applicative laws. Originally, he offered the following proof (the link in the post is broken, but I found a copy on archive.org). He pointed out that the function:
possibleFmap :: Applicative f => (a -> b) -> f a -> f b
possibleFmap f x = pure f <*> x
satisfies the Functor laws for fmap:
pure id <*> x = x {- Identity Law -}
pure (f . g) <*> x
= {- infix to prefix -}
pure ((.) f g) <*> x
= {- 2 applications of homomorphism law -}
pure (.) <*> pure f <*> pure g <*> x
= {- composition law -}
pure f <*> (pure g <*> x)
and then reasoned that:
So, \f x -> pure f <*> x satisfies the laws of a functor.
Since there is at most one functor instance per data type,
(\f x -> pure f <*> x) = fmap.
A key part of this proof is that there is only one possible functor instance (i.e., only one way of defining fmap) per data type.
When asked about this, he gave the following proof of the uniqueness of fmap.
Suppose we have a functor f and another function
foo :: (a -> b) -> f a -> f b
Then as a consequence of the free theorem for foo, for any f :: a -> b
and any g :: b -> c:
foo (g . f) = fmap g . foo f
In particular, if foo id = id, then
foo g = foo (g . id) = fmap g . foo id = fmap g . id = fmap g
Obviously, this depends critically on the "consequence of the free theorem for foo". Later, Russell realized that the free theorem could be used directly, together with the identity law for applicatives, to prove the needed law. That's what I've summarized in my "short answer" above.
Free Theorems...
So what about this "free theorem" business?
The concept of free theorems comes from a paper by Wadler, "Theorems for Free". Here's a Stack Overflow question that links to the paper and some other resources. Understanding the theory "for real" is hard, but you can think about it intuitively. Let's pick a specific functor, like Maybe. Suppose we had a function with the following type;
foo :: (a -> b) -> Maybe a -> Maybe b
foo f x = ...
Note that, no matter how complex and convoluted the implementation of foo is, that same implementation needs to work for all types a and b. It doesn't know anything about a, so it can't do anything with values of type a, other than apply the function f, and that just gives it a b value. It doesn't know anything about b either, so it can't do anything with a b value, except maybe return Just someBValue. Critically, this means that the structure of the computation performed by foo -- what it does with the input value x, whether and when it decides to apply f, etc. -- is entirely determined by whether x is Nothing or Just ....
Think about this for a bit: foo can inspect x to see if it's Nothing or Just someA. But, if it's Just someA, it can't learn anything about the value someA: it can't use it as-is because it doesn't understand the type a, and it can't do anything with f someA, because it doesn't understand the type b. So, if x is Just someA, the function foo can only act on its Just-ness, not on the underlying value someA.
This has a striking consequence. If we were to use a function g to change the input values out from under foo f x by writing:
foo f' (fmap g x)
because fmap g doesn't change x's Nothing-ness or Just-ness, this change has no effect on the structure of foo's computation. It behaves the same way, processing the Nothing or Just ... value in the same way, applying f' in exactly the same circumstances and at exactly the same time that it previously applied f, etc.
This means that, as long as we've arranged things so that f' acting on the g-transformed value gives the same answer as an h-transformed version of f acting on the original value -- in other words if we have:
f' . g = h . f
then we can trick foo into processing our g-transformed input in exactly the same way it would have processed the original input, as long as we account for the input change after foo has finished running by applying h to the output:
foo f' (fmap g x) = fmap h (foo f x)
I don't know whether or not that's convincing, but that's how we get the free theorem:
If f' . g = h . f then foo f' . fmap g = fmap h . foo f.
It basically says that because we can transform the input in a way that foo won't notice (because of its polymorphic type), the answer is the same whether we transform the input and run foo, or run foo first and transform its output instead.
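To make this concrete, here is a small, hedged check of the theorem for Maybe; the names foo, f, g, f', and h are only illustrative, and foo is deliberately not fmap to emphasise that the theorem holds for any total function of that type:
-- Up to parametricity there are essentially two total functions of this type
-- for Maybe: fmap itself, and the function that always returns Nothing.
-- We pick the latter.
foo :: (a -> b) -> Maybe a -> Maybe b
foo _ _ = Nothing

f, g, f', h :: Int -> Int
f  = (+ 1)
g  = (* 2)
f' = (+ 2)   -- chosen so that f' . g = h . f, since 2*x + 2 == 2*(x + 1)
h  = (* 2)

-- Should be True for every input x.
checkFreeTheorem :: Maybe Int -> Bool
checkFreeTheorem x = (foo f' . fmap g) x == (fmap h . foo f) x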
I know that the Applicative class is described in category theory as a "lax monoidal functor", but I've never heard the term "lax" before, and the nLab page on lax functors has a bunch of stuff I don't recognize at all, re: bicategories and things that I didn't know we cared about in Haskell. If it is actually about bicategories, can someone give me a plebeian view of what that means? Otherwise, what is "lax" doing in this name?
Let's switch to the monoidal view of Applicative:
unit :: () -> f ()
mult :: (f s, f t) -> f (s, t)
pure :: x -> f x
pure x = fmap (const x) (unit ())
(<*>) :: f (s -> t) -> f s -> f t
ff <*> fs = fmap (uncurry ($)) (mult (ff, fs))
For a strong monoidal functor, unit and mult must be isomorphisms. The impact of "lax" is to drop that requirement.
E.g., (up to the usual naivete) (->) a is strong-monoidal, but [] is only lax-monoidal.
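Conversely, given any Applicative in the usual presentation, the monoidal operations can be recovered; a hedged sketch (the primed names are just to avoid clashing with the definitions above):
-- Recovering the monoidal operations from pure and <*>.
unit' :: Applicative f => () -> f ()
unit' () = pure ()

mult' :: Applicative f => (f s, f t) -> f (s, t)
mult' (fs, ft) = (,) <$> fs <*> ft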
For the Alternative instance of [], (<|>) = (++). So I regarded (<|>) as a kind of splicing operator, which leads to a seemingly almost-universal container converter:
-- (<|>) = generalization of (++)
(<|) :: Alternative f => x -> f x -> f x
x <| xs = pure x <|> xs
conv :: (Foldable t, Alternative f) => t x -> f x
conv = foldr (<|) empty
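For example (a quick GHCi check, assuming the definitions above; note that the result type has to be given explicitly, which is the ambiguity concern raised in the first edit below):
ghci> conv (Just 1) :: [Int]
[1]
ghci> conv "abc" :: Maybe Char
Just 'a'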
Indeed, I was able to generalize all functions from Data.List; here are some:
-- fmap = generalization of map
reverse :: (Foldable t, Alternative f) => t a -> f a
reverse = getAlt . getDual . foldMap (Dual . Alt . pure)
-- asum = generalization of concat
asumMap :: (Foldable t, Alternative f) => (a -> f b) -> t a -> f b -- generalization of concatMap
asumMap f = getAlt . foldMap (Alt . f)
splitAt :: (Foldable t, Alternative f, Alternative g) => Int -> t a -> (f a, g a)
splitAt n xs =
  let step (i, z1, z2) x
        | 0 < i     = (i - 1, z1 . (x <|), z2)
        | otherwise = (0, z1, z2 . (x <|))
      (_, fs, gs) = foldl' step (n, id, id) xs
  in (fs empty, gs empty)
Also, this analogy makes some interesting new instances, such as a working applicative functor for functor sums (Data.Functor.Sum):
instance (Foldable f, Applicative f, Alternative g) => Applicative (Sum f g) where
  pure = InL . pure
  InL f <*> InL x = InL (f <*> x)
  InL f <*> InR x = InR (conv f <*> x)
  InR f <*> InL x = InR (f <*> conv x)
  InR f <*> InR x = InR (f <*> x)

instance (Foldable f, Applicative f, Alternative g) => Alternative (Sum f g) where
  empty = InR empty
  InL x <|> _     = InL x
  InR _ <|> InL y = InL y
  InR x <|> InR y = InR (x <|> y)
Is it actually a good idea to generalize all these functions and make new instances with this analogy, especially for list operations?
EDIT: I'm especially concerned about the ambiguous return type. For normal list operations, the return type is deducible from the argument types, but for the "universal" versions it is not: the return type must be specified explicitly. Is this problem severe enough to regard the analogy as dangerous? (Or is there any other problem?)
EDIT 2: If I'm understanding the behavior of foldl' correctly, the time complexity of the universal splitAt (shown above) must be Θ(length xs), as foldl' is strict in every element, right? If yes, that must be a problem, as it's inferior to the regular version's Θ(min n (length xs)).
It is not always a good idea to make functions as polymorphic as theoretically possible, in particular not function arguments. As a rule of thumb: make function results as polymorphic as possible. (Often, the arguments will then already contain some type variables that are used in the result.) Only if you have a particular reason, also give the arguments extra polymorphism.
The reason being: if everything is polymorphic, the compiler has no hints as to what concrete types to choose. Polymorphic results/values are usually ok, because these will generally be bound directly or indirectly to some top-level definition which has an explicit signature, but polymorphic arguments will often only be filled with literals (number literals are polymorphic in Haskell, and strings/lists can be too) or other polymorphic values, so you end up having to type out lots of explicit local signatures, which tends to be more awkward than having to occasionally toss in an explicit conversion function because something is not polymorphic enough.
This idea with Foldable -> Alternative specifically has another problem: the Alternative class is rather frowned upon, having no very solid mathematical backing. It's basically the class of applicative functors whose every instantiation gives rise to a Monoid. Well, that can also be expressed directly, by demanding Monoid itself. The "universal container conversion function" thus already exists: it is foldMap pure.
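For reference, here is a hedged sketch of that Monoid-based conversion (conv' is just an illustrative name):
-- The target only needs a Monoid instance on f a, not an Alternative instance.
conv' :: (Foldable t, Applicative f, Monoid (f a)) => t a -> f a
conv' = foldMap pure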
Indeed it does:
λ :i Applicative
class Functor f => Applicative (f :: * -> *) where
At the same time:
fmap f x = pure f <*> x
— by the laws of Applicative we can define fmap from pure & <*>.
I don't get why I should tediously define fmap every time I want an Applicative if, really, fmap can be automatically set up in terms of pure and <*>.
I gather it would be necessary if pure or <*> were somehow dependent on the definition of fmap but I fail to see why they have to.
While fmap can be derived from pure and <*>, it is generally not the most efficient approach. Compare:
fmap :: (a -> b) -> Maybe a -> Maybe b
fmap f Nothing = Nothing
fmap f (Just x) = Just (f x)
with the work done using Applicative tools:
fmap :: (a -> b) -> Maybe a -> Maybe b
-- inlining pure and <*> in: fmap f x = pure f <*> x
fmap f x = case (Just f) of
  Nothing -> Nothing
  Just f' -> case x of
    Nothing -> Nothing
    Just x' -> Just (f' x')
Pointlessly wrapping something up in a constructor just to do a pattern-match against it.
So, clearly it is useful to be able to define fmap independently of the Applicative functions. That could be done by making a single typeclass with all three functions, using a default implementation for fmap that you could override. However, there are types that make good Functor instances but not good Applicative instances, so you may need to implement just one. Thus, two typeclasses.
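A hedged sketch of what that single-class design might look like (hypothetical names, not the actual Prelude classes):
-- One class holding all three methods, with fmap given a default definition.
class MyApplicative f where
  myPure :: a -> f a
  myAp   :: f (a -> b) -> f a -> f b
  myFmap :: (a -> b) -> f a -> f b
  myFmap g x = myPure g `myAp` x   -- default, overridable per instance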
And since there are no types with Applicative instances but without Functor instances, you should be able to treat an Applicative as though it were a Functor, if you like; hence the extension relationship between the two.
However, if you tire of implementing Functor, you can (in most cases) ask GHC to derive the only possible implementation of Functor for you, with
{-# LANGUAGE DeriveFunctor #-}
data Boring a = Boring a deriving Functor
While there are proposals to make this easier (https://ghc.haskell.org/trac/ghc/wiki/IntrinsicSuperclasses), the "default instances" problem itself is very difficult.
One challenge is how to deal with common superclasses:
fmap f x = pure f <*> x -- using Applicative
fmap f x = runIdentity (traverse (Identity . f) x) -- using Traversable
fmap f x = x >>= (return . f) -- using Monad
Which one to pick?
So the best we can do now is to provide fmapDefault (as Data.Traversable does); or use pure f <*> x; or fmapRep from Data.Functor.Rep when applicable.
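For example, here is the documented way to use fmapDefault (and foldMapDefault) from Data.Traversable, sketched for a hypothetical Pair type:
import Data.Traversable (fmapDefault, foldMapDefault)

data Pair a = Pair a a

-- Functor and Foldable are recovered from the hand-written Traversable instance.
instance Functor Pair where
  fmap = fmapDefault

instance Foldable Pair where
  foldMap = foldMapDefault

instance Traversable Pair where
  traverse f (Pair x y) = Pair <$> f x <*> f y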
Applicative Programming with Effects, the paper from McBride and Paterson, presents the interchange law:
u <*> pure x = pure (\f -> f x) <*> u
In order to try to understand it, I attempted the following example - to represent the left-hand side.
ghci> Just (+10) <*> pure 5
Just 15
How could I write this example using the right-hand side?
Also, if u is an f (a -> b) where f is an Applicative, then what's the function on the right-hand side: pure (\f -> f x) ...?
It would be written as
pure (\f -> f 5) <*> Just (+10)
Or even
pure ($ 5) <*> Just (+10)
Both are equivalent in this case. Quite literally, you're wrapping, with pure, a function that takes another function as its argument and applies x to it. You provide f as the contents of the Just, which in this case is (+10). When you see the lambda syntax (\f -> f x), it's being very literal: this is the lambda used in the definition.
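A quick GHCi check that the rewritten right-hand side gives the same result as the original expression (using the standard Maybe instances):
ghci> pure ($ 5) <*> Just (+10)
Just 15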
The point this law makes is about the preservation of exponentials by the Applicative Functor: what is an exponential in the origin is also an exponential in the image of the category.
Please observe that the actual action of the Applicative Functor is a transformation of the following kind: strength :: (f a, f b) -> f (a, b); then ap or <*> is just fmap eval over the result, or, written fully, ap = curry $ fmap (uncurry ($)) . strength.
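A hedged sketch of this in compilable Haskell (strength' and ap' are illustrative names, defined here for any Applicative):
strength' :: Applicative f => (f a, f b) -> f (a, b)
strength' (fa, fb) = (,) <$> fa <*> fb

-- <*> recovered as fmap of uncurried application over the strength.
ap' :: Applicative f => f (a -> b) -> f a -> f b
ap' = curry (fmap (uncurry ($)) . strength')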
This law then says that since in the origin g $ x == ($ x) $ g, lifting ($), x, and ($ x) should preserve the equality. Notice that "normal" Functors preserve the equality only if g is lifted, too, but Applicative Functors preserve this equality for any object of type f (a -> b) in place of g. This way the whole type f (a -> b) behaves like f a -> f b, whereas for "normal" Functors it only needs to behave like f a -> f b for images of the arrows in the origin (to make the diagrams commute and fulfill the promises of the Functor).
As to representing the right-hand side of the law, you've already been advised to take it literally: pure ($ 5) <*> Just (+10).