Applicative functor evaluation is not clear to me - haskell

I am currently reading Learn You a Haskell for Great Good! and am stumbling on the explanation for the evaluation of a certain code block. I've read the explanations several times and am starting to doubt if even the author understands what this piece of code is doing.
ghci> (+) <$> (+3) <*> (*100) $ 5
508
An applicative functor applies a function in some context to a value in some context to get some result in some context. I have spent a few hours studying this code block and have come up with a few explanations for how this expression is evaluated, and none of them are satisfactory. I understand that (5+3)+(5*100) is 508, but the problem is getting to this expression. Does anyone have a clear explanation for this piece of code?

The other two answers have given the detail of how this is calculated - but I thought I might chime in with a more "intuitive" answer to explain how, without going through a detailed calculation, one can "see" that the result must be 508.
As you implied, every Applicative (in fact, even every Functor) can be viewed as a particular kind of "context" which holds values of a given type. As simple examples:
Maybe a is a context in which a value of type a might exist, but might not (usually the result of a computation which may fail for some reason)
[a] is a context which can hold zero or more values of type a, with no upper limit on the number - representing all possible outcomes of a particular computation
IO a is a context in which a value of type a is available as a result of interacting with "the outside world" in some way. (OK that one isn't so simple...)
And, relevant to this example:
r -> a is a context in which a value of type a is available, but its particular value is not yet known, because it depends on some (as yet unknown) value of type r.
The Applicative methods can be very well understood on the basis of values in such contexts. pure embeds an "ordinary value" in a "default context" in which it behaves as closely as possible in that context to a "context-free" one. I won't go through this for each of the 4 examples above (most of them are very obvious), but I will note that for functions, pure = const - that is, a "pure value" a is represented by the function which always produces a no matter what the source value.
Rather than dwell on how <*> can best be described using the "context" metaphor though, I want to dwell on the particular expression:
f <$> a <*> b
where f is a function between 2 "pure values" and a and b are "values in a context". This expression in fact has a synonym as a function: liftA2. Although using the liftA2 function is generally considered less idiomatic than the "applicative style" using <$> and <*>, the name emphasises that the idea is to "lift" a function on "ordinary values" to one on "values in a context". And when thought of like this, I think it is usually very intuitive what this does, given a particular "context" (i.e. a particular Applicative instance).
So the expression:
(+) <$> a <*> b
for values a and b of type say f Int for an Applicative f, behaves as follows for different instances f:
if f = Maybe, then the result, if a and b are both Just values, is to add up the underlying values and wrap them in a Just. If either a or b is Nothing, then the whole expression is Nothing.
if f = [] (the list instance) then the above expression is a list containing all sums of the form a' + b' where a' is in a and b' is in b.
if f = IO, then the above expression is an IO action that performs all the I/O effects of a followed by those of b, and results in the sum of the Ints produced by those two actions.
So what, finally, does it do if f is the function instance? Since a and b are both functions describing how to get a given Int given an arbitrary (Int) input, it is natural that lifting the (+) function over them should be the function that, given an input, gets the result of both the a and b functions, and then adds the results.
And that is, of course, what it does - and the explicit route by which it does that has been very ably mapped out by the other answers. But the reason why it works out like that - indeed, the very reason we have the instance that f <*> g = \x -> f x (g x), which might otherwise seem rather arbitrary (although in actual fact it's one of the very few things, if not the only thing, that will type-check), is so that the instance matches the semantics of "values which depend on some as-yet-unknown other value, according to the given function". And in general, I would say it's often better to think "at a high level" like this than to be forced to go down to the low-level details of exactly how computations are performed. (Although I certainly don't want to downplay the importance of also being able to do the latter.)
[Actually, from a philosophical point of view, it might be more accurate to say that the definition is as it is just because it's the "natural" definition that type-checks, and that it's just happy coincidence that the instance then takes on such a nice "meaning". Mathematics is of course full of just such happy "coincidences" which turn out to have very deep reasons behind them.]
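To make the "lifting" picture concrete, here is a small sketch that evaluates the same expression, (+) <$> a <*> b, in three of the contexts listed above. The helper name addIn and the sample values are made up for illustration; they are not part of any library.

```haskell
import Control.Applicative (liftA2)

-- Lift (+) over two "values in a context"; works for any Applicative f.
-- (addIn is a hypothetical helper name used only in this sketch.)
addIn :: Applicative f => f Int -> f Int -> f Int
addIn a b = (+) <$> a <*> b

main :: IO ()
main = do
  print (addIn (Just 3) (Just 4))    -- Just 7
  print (addIn (Just 3) Nothing)     -- Nothing
  print (addIn [1, 2] [10, 20])      -- [11,21,12,22]
  print (addIn (+3) (*100) 5)        -- 508
  print (liftA2 (+) (+3) (*100) 5)   -- 508
```

The last two lines are the function instance: both (+3) and (*100) are "Ints that depend on an unknown input", and the lifted (+) adds them once the input 5 arrives.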

It is using the applicative instance for functions. Your code
(+) <$> (+3) <*> (*100) $ 5
is evaluated as
( (\a->\b->a+b) <$> (\c->c+3) <*> (\d->d*100) ) 5 -- f <$> g
( (\x -> (\a->\b->a+b) ((\c->c+3) x)) <*> (\d->d*100) ) 5 -- \x -> f (g x)
( (\x -> (\a->\b->a+b) (x+3)) <*> (\d->d*100) ) 5
( (\x -> \b -> (x+3)+b) <*> (\d->d*100) ) 5
( (\x->\b->(x+3)+b) <*> (\d->d*100) ) 5 -- f <*> g
(\y -> ((\x->\b->(x+3)+b) y) ((\d->d*100) y)) 5 -- \y -> (f y) (g y)
(\y -> (\b->(y+3)+b) (y*100)) 5
(\y -> (y+3)+(y*100)) 5
(5+3)+(5*100)
where <$> is fmap or just function composition ., and <*> is ap if you know how it behaves on monads.

Let us first take a look how fmap and (<*>) are defined for a function:
instance Functor ((->) r) where
    fmap = (.)
instance Applicative ((->) a) where
    pure = const
    (<*>) f g x = f x (g x)
    liftA2 q f g x = q (f x) (g x)
The expression we aim to evaluate is:
(+) <$> (+3) <*> (*100) $ 5
or more verbose:
((+) <$> (+3)) <*> (*100) $ 5
If we thus evaluate (<$>), which is an infix synonym for fmap, we thus see that this is equal to:
(+) . (+3)
so that means our expression is equivalent to:
((+) . (+3)) <*> (*100) $ 5
Next we can apply the sequential application. Here f is thus equal to (+) . (+3) and g is (*100). This thus means that we construct a function that looks like:
\x -> ((+) . (+3)) x ((*100) x)
We can now simplify this and rewrite this into:
\x -> ((+) (x+3)) ((*100) x)
and then rewrite it to:
\x -> (+) (x+3) ((*100) x)
We thus have constructed a function that looks like:
\x -> (x+3) + 100 * x
or simpler:
\x -> 101 * x + 3
If we then calculate:
(\x -> 101*x + 3) 5
then we of course obtain:
101 * 5 + 3
and thus:
505 + 3
which is the expected:
508
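The rewriting steps in this answer can be checked mechanically. The sketch below uses hand-written copies of fmap and (<*>) for functions (fmapFn and apFn are hypothetical names, chosen so as not to clash with the real instances) and compares the three forms of the function on a range of inputs. Agreement on samples is a sanity check, not a proof.

```haskell
-- Hand-written copies of fmap and (<*>) for functions, so each step
-- can be checked without relying on the library instances.
fmapFn :: (a -> b) -> (r -> a) -> (r -> b)
fmapFn f g = f . g

apFn :: (r -> a -> b) -> (r -> a) -> (r -> b)
apFn f g = \x -> f x (g x)

step1, step2, step3 :: Int -> Int
step1 = ((+) `fmapFn` (+3)) `apFn` (*100)   -- the original expression
step2 = \x -> (x + 3) + 100 * x             -- after inlining both operators
step3 = \x -> 101 * x + 3                   -- after arithmetic simplification

main :: IO ()
main = do
  print (map step1 [0, 5, 10])   -- [3,508,1013]
  print (all (\x -> step1 x == step2 x && step2 x == step3 x) [-100 .. 100])   -- True
```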

For any applicative,
a <$> b <*> c = liftA2 a b c
For functions,
liftA2 a b c x
= a (b x) (c x) -- by definition;
= (a . b) x (c x)
= ((a <$> b) <*> c) x
Thus
(+) <$> (+3) <*> (*100) $ 5
=
liftA2 (+) (+3) (*100) 5
=
(+) ((+3) 5) ((*100) 5)
=
(5+3) + (5*100)
(the long version of this answer follows.)
Pure math has no time. Pure Haskell has no time. Speaking in verbs ("applicative functor applies" etc.) can be confusing ("applies... when?...").
Instead, (<*>) is a combinator which combines a "computation" (denoted by an applicative functor) carrying a function (in the context of that type of computations) and a "computation" of the same type, carrying a value (in like context), into one combined "computation" that carries out the application of that function to that value (in such context).
"Computation" is used to contrast it with a pure Haskell "calculations" (after Philip Wadler's "Calculating is better than Scheming" paper, itself referring to David Turner's Kent Recursive Calculator language, one of predecessors of Miranda, the (main) predecessor of Haskell).
"Computations" might or might not be pure themselves, that's an orthogonal issue. But mainly what it means, is that "computations" embody a generalized function call protocol. They might "do" something in addition to / as part of / carrying out the application of a function to its argument. Or in types,
( $ ) :: (a -> b) -> a -> b
(<$>) :: (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
(=<<) :: (a -> f b) -> f a -> f b
With functions, the context is application (another one), and to recover the value -- be it a function or an argument -- the application to a common argument is to be performed.
(bear with me, we're almost there).
The pattern a <$> b <*> c is also expressible as liftA2 a b c. And so, the "functions" applicative functor "computation" type is defined by
liftA2 h x y s = let x' = x s      -- embellished application of h to x and y
                     y' = y s in   -- in context of functions, or Reader
                 h x' y'
-- liftA2 h x y = let x' = x       -- non-embellished application, or Identity
--                    y' = y in
--                h x' y'
-- liftA2 h x y s = let (x',s') = x s       -- embellished application of h to x and y
--                      (y',s'') = y s' in  -- in context of
--                  (h x' y', s'')          -- state-passing computations, or State
-- liftA2 h x y = let (x',w) = x        -- embellished application of h to x and y
--                    (y',w') = y in    -- in context of
--                (h x' y', w++w')      -- logging computations, or Writer
-- liftA2 h x y = [h x' y' |       -- embellished application of h to x and y
--                   x' <- x,      -- in context of
--                   y' <- y ]     -- nondeterministic computations, or List
-- ( and for Monads we define `liftBind h x k =` and replace `y` with `k x'`
-- in the bodies of the above combinators; then liftA2 becomes liftBind: )
-- liftA2 :: (a -> b -> c) -> f a -> f b -> f c
-- liftBind :: (a -> b -> c) -> f a -> (a -> f b) -> f c
-- (>>=) = liftBind (\a b -> b) :: f a -> (a -> f b) -> f b
And in fact all the above snippets can be just written with ApplicativeDo as liftA2 h x y = do { x' <- x ; y' <- y ; pure (h x' y') } or even more intuitively as
liftA2 h x y = [h x' y' | x' <- x, y' <- y], with Monad Comprehensions, since all the above computation types are monads as well as applicative functors. This shows by the way that (<*>) = liftA2 ($), which one might find illuminating as well.
Indeed,
> :t let liftA2 h x y r = h (x r) (y r) in liftA2
:: (a -> b -> c) -> (t -> a) -> (t -> b) -> (t -> c)
> :t liftA2 -- the built-in one
liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
i.e. the types match when we take f a ~ (t -> a) ~ (->) t a, i.e. f ~ (->) t.
And so, we're already there:
(+) <$> (+3) <*> (*100) $ 5
=
liftA2 (+) (+3) (*100) 5
=
(+) ((+3) 5) ((*100) 5)
=
(+) (5+3) (5*100)
=
(5+3) + (5*100)
It's just how liftA2 is defined for this type, Applicative ((->) t) => ...:
instance Applicative ((->) t) where
    pure x t = x
    liftA2 h x y t = h (x t) (y t)
There's no need to define (<*>). The source code says:
Minimal complete definition
pure, ((<*>) | liftA2)
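As a sketch of what that minimal definition allows, here is an instance that defines only pure and liftA2 and leaves (<*>) to the class default. It uses a wrapper type, Fn, which is a made-up name so the instance does not clash with the one already in base; it also assumes a base version where liftA2 is a method of Applicative (4.10 or later, as far as I know).

```haskell
import Control.Applicative (liftA2)

-- A wrapper around functions, so we can write our own instances.
-- (Fn and runFn are hypothetical names used only in this sketch.)
newtype Fn t a = Fn { runFn :: t -> a }

instance Functor (Fn t) where
  fmap f (Fn g) = Fn (f . g)

-- Only pure and liftA2 are given; (<*>) falls back to its default,
-- which the class defines in terms of liftA2.
instance Applicative (Fn t) where
  pure x = Fn (\_ -> x)
  liftA2 h (Fn x) (Fn y) = Fn (\t -> h (x t) (y t))

main :: IO ()
main = print (runFn ((+) <$> Fn (+3) <*> Fn (*100)) (5 :: Int))   -- 508
```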
So now you've been wanting to ask for a long time, why is it that a <$> b <*> c is equivalent to liftA2 a b c?
The short answer is, it just is. One can be defined in terms of the other -- i.e. (<*>) can be defined via liftA2,
g <*> x = liftA2 id g x -- i.e. (<*>) = liftA2 id = liftA2 ($)
-- (g <*> x) t = liftA2 id g x t
-- = id (g t) (x t)
-- = (id . g) t (x t) -- = (id <$> g <*> x) t
-- = g t (x t)
(which is exactly as it is defined in the source),
and it is a law that every Applicative Functor must follow, that h <$> g = pure h <*> g.
Lastly,
liftA2 h g x == pure h <*> g <*> x
-- h g x == (h g) x
because <*> associates to the left: it is infixl 4 <*>.
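The equivalences claimed in this answer can be spot-checked with a short program. The sketch below writes liftA2 with plain do-notation (which needs Monad rather than the ApplicativeDo extension mentioned above), recovers (<*>) as liftA2 id, and compares each against the library versions on a few inputs. liftM2' and ap' are hypothetical names for this sketch.

```haskell
import Control.Applicative (liftA2)

-- liftA2 written with do-notation; this needs Monad, which is a stronger
-- requirement than Applicative.
liftM2' :: Monad m => (a -> b -> c) -> m a -> m b -> m c
liftM2' h x y = do
  x' <- x
  y' <- y
  pure (h x' y')

-- (<*>) recovered from liftA2.
ap' :: Applicative f => f (a -> b) -> f a -> f b
ap' = liftA2 id

main :: IO ()
main = do
  -- Each line prints True.
  print (liftM2' (+) (+3) (*100) 5 == liftA2 (+) (+3) (*100) 5)
  print (liftM2' (+) [1, 2] [10, 20] == liftA2 (+) [1, 2] [10, 20])
  print (ap' (Just (+1)) (Just 41) == (Just (+1) <*> Just 41))
  print ((pure (+) <*> (+3) <*> (*100)) 5 == liftA2 (+) (+3) (*100) 5)
```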

Related

Natural map derivation algorithm

This Reddit post by Edward Kmett provides a constructive definition of a natural map, the one from the free theorem for fmap (which I read in yet another Edward Kmett's post):
For given f, g, h and k, such that f . g = h . k: $map f . fmap g = fmap h . $map k, where $map is the natural map for the given constructor.
I do not fully understand the algorithm. Let us approach it step-by-step:
We can define a "natural map" by induction over any particular concrete choice of F you give.
Ultimately any such ADT is made out of sums, products, (->)'s, 1s, 0s, a's, invocations of other
functors, etc.
Consider:
data Smth a = A a a a | B a (Maybe a) | U | Z Void deriving ...
No arrows. Let us see how fmap (which I reckon to be the natural choice for any ADT without (->)s in it) would operate here:
instance Functor Smth where
    fmap xy (A x x1 x2) = A (xy x) (xy x1) (xy x2)
    fmap xy (B x xPlus1) = B (xy x) (fmap xy xPlus1)
    -- One can pattern-match 'xPlus1' as 'Just x1' and 'Nothing'.
    -- From my point of view, 'fmap' suits better here. Reasons below.
    fmap _xy U = U
    fmap _xy (Z z) = absurd z
Which seems natural. To put this more formally, we apply xy to every x, apply fmap xy to every T x, where T is a Functor, we leave every unit unchanged, and we pass every Void onto absurd. This works for recursive definitions too!
data Lst a = Unit | Prepend a (Lst a) deriving ...
instance Functor Lst where
    fmap xy Unit = Unit
    fmap xy (Prepend x lstX) = Prepend (xy x) (fmap xy lstX)
And for the non-inductive types (detailed explanation in this answer under the linked post):
data Graph a = Node a [Graph a]
instance Functor Graph where
    fmap xy (Node x children) = Node (xy x) (fmap (fmap xy) children)
This part is clear.
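One way to sanity-check that the hand-written instance really is the "natural" one is to compare it with what the compiler derives. The sketch below does that for Graph; Graph', Node', toDerived and sample are hypothetical names introduced only for the comparison.

```haskell
{-# LANGUAGE DeriveFunctor #-}

-- Hand-written instance, following the rule described above.
data Graph a = Node a [Graph a]
  deriving (Eq, Show)

instance Functor Graph where
  fmap xy (Node x children) = Node (xy x) (fmap (fmap xy) children)

-- The same shape with a compiler-derived instance, for comparison.
data Graph' a = Node' a [Graph' a]
  deriving (Eq, Show, Functor)

toDerived :: Graph a -> Graph' a
toDerived (Node x cs) = Node' x (map toDerived cs)

sample :: Graph Int
sample = Node 1 [Node 2 [], Node 3 [Node 4 []]]

main :: IO ()
main = do
  print (fmap (* 10) sample)
  -- Mapping then converting agrees with converting then mapping.
  print (toDerived (fmap (* 10) sample) == fmap (* 10) (toDerived sample))   -- True
```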
When we allow (->) we now have the first thing that mixes variance up. The left-hand type argument of (->) is in contravariant position, the right-hand side is in covariant position. So you need to track the final type variable through the entire ADT and see if it occurs in positive and/or negative position.
Now we include (->)s. Let us try to keep this induction going:
We somehow derived natural maps for T a and S a. Thus, we want to consider the following:
data T2S a = T2S (T a -> S a)
instance ?Class? T2S where
?map? ?? (T2S tx2sx) = T2S $ \ ty -> ???
And I believe this to be the point where we start choosing. We have the following options:
(Phantom) a occurs neither in T nor in S. a in T2S is phantom, thus, we can implement both fmap and contramap as const phantom.
(Covariant) a occurs in S a and does not occur in T a. Thus, this is something along the lines of ReaderT with T a (which does not actually depend on a) as environment, which substitutes ?Class? with Functor, ?map? with fmap, ?? with xy, and ??? with:
let tx = phantom ty
sx = tx2sx tx
sy = fmap xy sx
in sy
(Contravariant) a occurs in T a and does not occur in S a. I do not see a way to make this thing a covariant functor, so we implement a Contravariant instance here, which substitutes ?map? with contramap, ?? with yx, ??? with:
let tx = fmap yx ty
sx = tx2sx tx
sy = phantom sx
in sy
(Invariant) a occurs both in T a and S a. We can no longer use phantom, which came in quite handy. There is a module Data.Functor.Invariant by Edward Kmett. It provides the following class with a map:
class Invariant f where
invmap :: (a -> b) -> (b -> a) -> f a -> f b
-- and some generic stuff...
And yet, I cannot see a way to turn this into something we can plug into the free theorem for fmap - the type requires an additional function-argument, which we can't brush off as id or something. Anyway, we put invmap instead of ?map?, xy yx instead of ??, and the following instead of ???:
let tx = fmap yx ty
sx = tx2sx tx
sy = fmap xy sx
in sy
So, is my understanding of such an algorithm correct? If so, how are we to properly process the Invariant case?
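For reference, the Invariant case can be written out for the simplest possible choice, T a = a and S a = a, so that the two fmaps in the let-block above reduce to plain applications of yx and xy. This is only a sketch: Invariant' and invmap' are local stand-ins for the class quoted above (so nothing outside base is needed), and runT2S is a made-up helper.

```haskell
-- A local stand-in for the class quoted above.
class Invariant' f where
  invmap' :: (a -> b) -> (b -> a) -> f a -> f b

-- T a = a and S a = a, so 'a' occurs on both sides of the arrow.
newtype T2S a = T2S (a -> a)

instance Invariant' T2S where
  invmap' xy yx (T2S tx2sx) = T2S $ \ty ->
    let tx = yx ty        -- go back to the old argument type
        sx = tx2sx tx     -- run the wrapped function
        sy = xy sx        -- go forward to the new result type
    in  sy

runT2S :: T2S a -> a -> a
runT2S (T2S f) = f

main :: IO ()
main = do
  -- Transport "add one" on Int to a function on String via show/read.
  let succOnStrings = invmap' show read (T2S (+ (1 :: Int)))
  putStrLn (runT2S succOnStrings "41")   -- 42
```

Both directions of the conversion are needed here, which is exactly why neither fmap nor contramap alone can do the job.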
I think your algorithm is too complex, because you are trying to write one algorithm. Writing two algorithms instead makes things much simpler. One algorithm will build the natural fmap, and the other will build the natural contramap. BUT both algorithms need to be nondeterministic in the following sense: there will be types where they cannot succeed, and so do not return an implementation; and there will be types where there are multiple ways they can succeed, but they're all equivalent.
To start, let's carefully define what it means to be a parameterized type. Here's the different kinds of parameterized types we can have:
F ::= F + F'
| F * F'
| F -> F'
| F . F'
| Id
| Const X
In Const X, the X ranges over all concrete, non-parameterized types, like Int and Bool and so forth. And here's their interpretation, i.e. the concrete type they are isomorphic to once given a parameter:
[[F + F']] a = Either ([[F]] a) ([[F']] a)
[[F * F']] a = ([[F]] a, [[F']] a)
[[F -> F']] a = [[F]] a -> [[F']] a
[[F . F']] a = [[F]] ([[F']] a)
[[Id]] a = a
[[Const X]] a = X
Now we can give our two algorithms. The first bit you've already written yourself:
fmap #(F + F') f (Left x) = Left (fmap #F f x)
fmap #(F + F') f (Right x) = Right (fmap #F' f x)
fmap #(F * F') f (x, y) = (fmap #F f x, fmap #F' f y)
fmap #(Id) f x = f x
fmap #(Const X) f x = x
These correspond to the clauses you gave in your first instance. Then, in your [Graph a] example, you gave a clause corresponding to this:
fmap #(F . F') f x = fmap #F (fmap #F' f) x
That's fine, but this is also the first moment where we get some nondeterminism. One way to make this a functor is indeed nested fmaps; but another way is nested contramaps.
fmap #(F . F') f x = contramap #F (contramap #F' f) x
If both clauses are possible, then there are no Ids in either F or F', so both instances will return x unchanged.
The only thing left now is the arrow case, the one you ask about. But it turns out it's very easy in this formalism, there is only one choice:
fmap #(F -> F') f x = fmap #F' f . x . contramap #F f
That's the whole algorithm, in full detail, for defining the natural fmap. ...except one detail, which is the algorithm for the natural contramap. But hopefully if you followed all of the above, you can reproduce that algorithm yourself. I encourage you to give it a shot, then check your answers against mine below.
contramap #(F + F') f (Left x) = Left (contramap #F f x)
contramap #(F + F') f (Right x) = Right (contramap #F' f x)
contramap #(F * F') f (x, y) = (contramap #F f x, contramap #F' f y)
contramap #(F -> F') f x = contramap #F' f . x . fmap #F f
contramap #(F . F') f x = contramap #F (fmap #F' f) x
-- OR
contramap #(F . F') f x = fmap #F (contramap #F' f) x
-- contramap #(Id) fails
contramap #(Const X) f x = x
One thing of interest to me personally: it turns out that contramap #(Id) is the only leaf case that fails. All further failures are inductive failures ultimately deriving from this one -- a fact I had never thought of before! (The dual statement is that it turns out that fmap #(Id) is the only leaf case that actually uses its first function argument.)
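Here is a small sketch of the arrow clause in action, using the Contravariant class from Data.Functor.Contravariant (which I believe is in base since 4.12). The types Test and Quant and the helper forAll are made-up names: Test a is the shape Id -> Const Bool, where only contramap succeeds, and Quant a is Test -> Const Bool, where a sits to the left of two arrows and the natural fmap succeeds again.

```haskell
import Data.Functor.Contravariant (Contravariant (..))

-- F = Id -> Const Bool: 'a' is to the left of one arrow.
newtype Test a = Test { runTest :: a -> Bool }

instance Contravariant Test where
  -- contramap #(Id -> Const Bool) f x = x . f
  contramap f (Test p) = Test (p . f)

-- F -> F' with F = Test and F' = Const Bool.
newtype Quant a = Quant { runQuant :: Test a -> Bool }

instance Functor Quant where
  -- fmap #(F -> F') f x = fmap #F' f . x . contramap #F f,
  -- where fmap #(Const Bool) f is the identity.
  fmap f (Quant x) = Quant (x . contramap f)

-- "The test holds for every element of the list."
forAll :: [a] -> Quant a
forAll xs = Quant (\t -> all (runTest t) xs)

main :: IO ()
main = do
  print (runQuant (forAll [1, 2, 3 :: Int]) (Test even))              -- False
  print (runQuant (fmap (* 2) (forAll [1, 2, 3 :: Int])) (Test even)) -- True
```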

What does <*> do in addRecip x y = fmap (+) (recipMay x) <*> recipMay y?

addRecip :: Double -> Double -> Maybe Double
addRecip x y = fmap (+) (recipMay x) <*> recipMay y
  where
    recipMay a | a == 0 = Nothing
               | otherwise = Just (1 / a)
I look up some explanation for <*>.
<*> takes a functor that contains a function taking an a and returning a b, and a functor that contains an a, and it returns a functor that contains a b. So <*> kind of extract the function from a functor and applies it to an arguments also inside a functor, and finally returns the result into a functor
This is an example:
fs <*> xs = [f x | f <- fs, x <- xs]
But in my case, it seems a bit different. The elements in recipMay x are not functions.
<*> Applies an applicative value to another. It's a richer counterpart to regular function application. The applicative values are decorated in some way, for example, it can be optional whether there's any value as you would perceive it (for Maybe, which is your case), or there can be very many values (for List).
The application of one applicative value to the other therefore has some special behaviour. For lists, a <*> b applies each member of a to each member of b making a huge list of all combinations whilst for Maybe (which is your case) a <*> b gives Just (a' b') if a and b are (Just a') and (Just b') and gives Nothing if either or both a and b are Nothing - for Maybe, in summary, it's function application for optional values where the result is absent if any value involved is absent.
There are some rules to how <*> is implemented which means that you can always view this as [apply a "contained function" to a "contained value"] and as long as you do all your work in the contained domain (using <$>, <*>, pure, >>=, <|>, etc) then you can think of it as the same as regular function application, but when you come to "extract" values you get to see the added richness.
The (<*>) :: Applicative f => f (a -> b) -> f a -> f b comes from the Applicative typeclass. An Applicative is (quoting the documentation) "A functor with application.". You can think of a Functor as a collection (although there are other types that are functors but not collections, such as functions).
If we see a functor as a collection, then the (<*>) operator takes two of these collections. The first collection stores functions of type a -> b, and the second is a collection of as. The result is then a collection (the same type of collection) of bs, obtained by applying every function in the first collection to every element in the second collection.
So for a list it looks like:
(<*>) :: [a -> b] -> [a] -> [b]
(<*>) fs xs = [fi xj | fi <- fs, xj <- xs]
A Maybe is also some sort of collection: it either contains no elements (the Nothing case), or one element (the Just x case with x the element). You can thus see a Maybe as a collection with "multiplicity" 0..1.
In case one of the two operands is a Nothing (or both), then the result is a Nothing as well, since if there is no function, or no element, there is no "result" of a function application. Only in case both operands are Justs (so Just f and Just x), we can perform function application (so Just (f x)):
(<*>) :: Maybe (a -> b) -> Maybe a -> Maybe b
(<*>) (Just f) (Just x) = Just (f x)
(<*>) _ _ = Nothing
In this specific case, we can analyze the use:
addRecip :: Double -> Double -> Maybe Double
addRecip x y = (fmap (+) (recipMay x)) <*> recipMay y
  where
    recipMay a | a == 0 = Nothing
               | otherwise = Just (1 / a)
We thus see two operands: fmap (+) (recipMay x) and recipMay y. If x is 0, the first operand is Nothing, and if y is 0, the second operand is Nothing, since in each case the corresponding recipMay is Nothing.
We thus could write it like:
addRecip :: Double -> Double -> Maybe Double
addRecip x y | x == 0 = Nothing
             | y == 0 = Nothing
             | otherwise = Just ((1/x) + (1/y))
But in the above we thus repeat the == 0, and 1/ logic twice.
Here the functor is Maybe. That <*> will return Nothing if either argument is Nothing (i.e., it involved a division by zero)
Nothing <*> _ = Nothing
_ <*> Nothing = Nothing
In the remaining case, it just applies the wrapped function:
Just f <*> Just x = Just (f x)
Also note that
fmap (+) (recipMay x) <*> recipMay y
is a slightly unusual notation. Usually that's written as
(+) <$> recipMay x <*> recipMay y
which is completely equivalent, since fmap is written as the infix <$>, but arguably more readable.
Here, fmap (+) (recipMay x) (or (+) <$> recipMay x) means
if x == 0
then Nothing
else Just (\a -> 1/x + a)
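Putting the pieces together, a runnable version of the function in the applicative style, with a few sample calls (the inputs are arbitrary), might look like this:

```haskell
addRecip :: Double -> Double -> Maybe Double
addRecip x y = (+) <$> recipMay x <*> recipMay y
  where
    -- Reciprocal that reports division by zero as Nothing.
    recipMay a
      | a == 0    = Nothing
      | otherwise = Just (1 / a)

main :: IO ()
main = do
  print (addRecip 2 4)   -- Just 0.75
  print (addRecip 0 4)   -- Nothing
  print (addRecip 2 0)   -- Nothing
```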

How one parameter is used by two function simultaneously

I am reading about applicative functors and found such line:
(+) <$> (+3) <*> (*100) $ 5
It outputs 508.
How can 5 be used by (+3) and (*100) at the same time?
Why don't we need to pass two 5's as parameters, like:
(+) <$> (+3) <*> (*100) $ 5 5
In the (->) a applicative instance, we find:
instance Applicative ((->) a) where
    pure = const
    (<*>) f g x = f x (g x)
    liftA2 q f g x = q (f x) (g x)
So, x is passed to both f and g by definition.
Here is a way to unbox it. We start with
e = ((+) <$> (+3)) <*> (*100)
(note that I left out the $ 5). The Applicative Functor whose <$> and <*> we are using here is the Function type (->) (partially applied to, I guess, Integer). Here, the meaning of <$> and <*> is as follows:
f <$> g = \y -> f (g y)
g <*> h = \x -> g x (h x)
We can plug that in into the term in the first line and get
e = \x -> (\y -> (+) ((+3) y)) x ((*100) x)
There are a few simplifications that we can do to this term:
e = \x -> (x+3) + (x*100)
So if this function is the value of (+) <$> (+3) <*> (*100), then it should no longer be surprising that applying this to 5 gives
e 5 = (5+3) + (5*100) = 508
The thing is, you first have to understand how a function can be a functor. Think of a function as a container which reveals its content only when you feed it a parameter. In other words, we only get to know its content when this applicative functor (function) is invoked with a parameter. So the type of this function would be, say, r -> a, which involves two different types. However, a functor can only act on a single type, so the function functor is a partially applied type, just like the functor instance of the Either type. We are interested in the output of the function, so our applicative functor becomes (->) r in prefix notation. Remembering that <$> is the infix form of fmap,
(+3) <$> (*2) $ 4 results in 11. 4 is applied to our functor (*2) and the result (which is the value in the applicative functor context) gets mapped with (+3).
However in our case we are fmaping (+) to (+3). To make it clearer lets rephrase the functions in lambda form.
(+) = \w x -> w + x and (+3) = \y -> y + 3.
then (+) <$> (+3) by partially applying \y -> y + 3 in the place of w our fmap applied applicative functor becomes \y x -> (y + 3) + x.
Now here comes the applicative operator <*>. As mentioned in previous answers, it is defined as g <*> h = \x -> g x (h x): it takes a two-parameter function g and supplies g's second parameter with the result of its second operand, the function h. Now our operation looks like
(\y x -> (y + 3) + x) <*> (*100) which can be rephrased as;
(\y x -> (y + 3) + x) <*> (\z -> z*100) which means now we have to partially apply \z -> z*100 to x and our function becomes \y z -> (y + 3) + (z*100).
Finally the applicative operator returns us a function which takes a single parameter and applies it to both parameters of the above two parameter function. So
\x -> (\y z -> (y + 3) + (z*100)) x x
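The final step, feeding one argument to both parameters, can be tried out directly. In this sketch, combined is the two-parameter function reached above and dup is a made-up helper that duplicates its argument; neither name comes from a library.

```haskell
-- The two-parameter function reached in the answer above.
combined :: Int -> Int -> Int
combined y z = (y + 3) + (z * 100)

-- Feed one argument to both parameters.
dup :: (a -> a -> b) -> a -> b
dup f x = f x x

main :: IO ()
main = do
  print (dup combined 5)                 -- 508
  print (((+) <$> (+3) <*> (*100)) 5)    -- 508
  -- The two agree on a range of sample inputs.
  print (all (\x -> dup combined x == ((+) <$> (+3) <*> (*100)) x) [-50 .. 50])   -- True
```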

Applicative Laws for the ((->) r) type

I'm trying to check that the Applicative laws hold for the function type ((->) r), and here's what I have so far:
-- Identity
pure (id) <*> v = v
-- Starting with the LHS
pure (id) <*> v
const id <*> v
(\x -> const id x (g x))
(\x -> id (g x))
(\x -> g x)
g x
v
-- Homomorphism
pure f <*> pure x = pure (f x)
-- Starting with the LHS
pure f <*> pure x
const f <*> const x
(\y -> const f y (const x y))
(\y -> f (x))
(\_ -> f x)
pure (f x)
Did I perform the steps for the first two laws correctly?
I'm struggling with the interchange & composition laws. For interchange, so far I have the following:
-- Interchange
u <*> pure y = pure ($y) <*> u
-- Starting with the LHS
u <*> pure y
u <*> const y
(\x -> g x (const y x))
(\x -> g x y)
-- I'm not sure how to proceed beyond this point.
I would appreciate any help for the steps to verify the Interchange & Composition applicative laws for the ((->) r) type. For reference, the Composition applicative law is as follows:
pure (.) <*> u <*> v <*> w = u <*> (v <*> w)
I think in your "Identity" proof, you should replace g with v everywhere (otherwise what is g and where did it come from?). Similarly, in your "Interchange" proof, things look okay so far, but the g that magically appears should just be u. To continue that proof, you could start reducing the RHS and verify that it also produces \x -> u x y.
Composition is more of the same: plug in the definitions of pure and (<*>) on both sides, then start calculating on both sides. You'll soon come to some bare lambdas that will be easy to prove equivalent.
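Following that recipe, both sides of the two remaining laws reduce to the same lambda. The reductions are written out as comments in the sketch below, followed by a pointwise check on sample inputs. The functions u, v, w and the helper agreeOn are arbitrary choices for illustration, and agreement on samples is only a sanity check; the equational reasoning in the comments is the actual argument.

```haskell
-- Interchange, both sides reduced with pure = const and
-- (f <*> g) x = f x (g x):
--   u <*> pure y      = \x -> u x (const y x)       = \x -> u x y
--   pure ($ y) <*> u  = \x -> const ($ y) x (u x)   = \x -> ($ y) (u x)
--                                                   = \x -> u x y
--
-- Composition:
--   pure (.) <*> u <*> v <*> w
--     = \x -> ((pure (.) <*> u <*> v) x) (w x)
--     = \x -> ((pure (.) <*> u) x (v x)) (w x)
--     = \x -> ((.) (u x) (v x)) (w x)
--     = \x -> u x (v x (w x))
--   u <*> (v <*> w)
--     = \x -> u x ((v <*> w) x)
--     = \x -> u x (v x (w x))

-- Sample functions, chosen arbitrarily for this sketch.
u :: Int -> Int -> Int
u x a = x * a + 1

v :: Int -> Int -> Int
v x a = a - x

w :: Int -> Int
w x = 2 * x

-- Functions cannot be compared with (==), so compare results on samples.
agreeOn :: Eq b => [a] -> (a -> b) -> (a -> b) -> Bool
agreeOn xs f g = all (\x -> f x == g x) xs

main :: IO ()
main = do
  let xs = [-10 .. 10] :: [Int]
  print (agreeOn xs (pure id <*> w) w)                                -- identity
  print (agreeOn xs (pure w <*> pure 3) (pure (w 3)))                 -- homomorphism
  print (agreeOn xs (u <*> pure 7) (pure ($ 7) <*> u))                -- interchange
  print (agreeOn xs (pure (.) <*> u <*> v <*> w) (u <*> (v <*> w)))   -- composition
```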

Y Combinator in Haskell

Is it possible to write the Y Combinator in Haskell?
It seems like it would have an infinitely recursive type.
Y :: f -> b -> c
where f :: (f -> b -> c)
or something. Even a simple slightly factored factorial
factMaker _ 0 = 1
factMaker fn n = n * ((fn fn) (n - 1))
{- to be called as
(factMaker factMaker) 5
-}
fails with "Occurs check: cannot construct the infinite type: t = t -> t2 -> t1"
(The Y combinator looks like this
(define Y
(lambda (X)
((lambda (procedure)
(X (lambda (arg) ((procedure procedure) arg))))
(lambda (procedure)
(X (lambda (arg) ((procedure procedure) arg)))))))
in scheme)
Or, more succinctly as
(λ (f) ((λ (x) (f (λ (a) ((x x) a))))
(λ (x) (f (λ (a) ((x x) a))))))
For the applicative order
And
(λ (f) ((λ (x) (f (x x)))
(λ (x) (f (x x)))))
Which is just an eta contraction away, for the lazy version.
If you prefer short variable names.
Here's a non-recursive definition of the y-combinator in haskell:
newtype Mu a = Mu (Mu a -> a)
y f = (\h -> h $ Mu h) (\x -> f . (\(Mu g) -> g) x $ x)
hat tip
The Y combinator can't be typed using Hindley-Milner types, the polymorphic lambda calculus on which Haskell's type system is based. You can prove this by appeal to the rules of the type system.
I don't know if it's possible to type the Y combinator by giving it a higher-rank type. It would surprise me, but I don't have a proof that it's not possible. (The key would be to identify a suitably polymorphic type for the lambda-bound x.)
If you want a fixed-point operator in Haskell, you can define one very easily because in Haskell, let-binding has fixed-point semantics:
fix :: (a -> a) -> a
fix f = f (fix f)
You can use this in the usual way to define functions and even some finite or infinite data structures.
It is also possible to use functions on recursive types to implement fixed points.
If you're interested in programming with fixed points, you want to read Bruce McAdam's technical report That About Wraps it Up.
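As a minimal sketch of "the usual way", here is fix (imported from Data.Function, where base provides it with the same type) used to define a recursive function and an infinite list. The names factorial and ones are just for this example.

```haskell
import Data.Function (fix)

-- Factorial with the recursion factored out: 'self' stands for the
-- function being defined.
factorial :: Integer -> Integer
factorial = fix (\self n -> if n <= 0 then 1 else n * self (n - 1))

-- An infinite data structure defined the same way.
ones :: [Int]
ones = fix (1 :)

main :: IO ()
main = do
  print (factorial 5)    -- 120
  print (take 3 ones)    -- [1,1,1]
```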
The canonical definition of the Y combinator is as follows:
y = \f -> (\x -> f (x x)) (\x -> f (x x))
But it doesn't type check in Haskell because of the x x, since it would require an infinite type:
x :: a -> b -- x is a function
x :: a -- x is applied to x
--------------------------------
a = a -> b -- infinite type
If the type system were to allow such recursive types, it would make type checking undecidable (prone to infinite loops).
But the Y combinator will work if you force it to typecheck, e.g. by using unsafeCoerce :: a -> b:
import Unsafe.Coerce
y :: (a -> a) -> a
y = \f -> (\x -> f (unsafeCoerce x x)) (\x -> f (unsafeCoerce x x))
main = putStrLn $ y ("circular reasoning works because " ++)
This is unsafe (obviously). rampion's answer demonstrates a safer way to write a fixpoint combinator in Haskell without using recursion.
Oh, this wiki page and this Stack Overflow answer seem to answer my question. I will write up more of an explanation later.
Now, I've found something interesting about that Mu type. Consider S = Mu Bool.
data S = S (S -> Bool)
If one treats S as a set and that equals sign as isomorphism, then the equation becomes
S ⇋ S -> Bool ⇋ Powerset(S)
So S is the set of sets that are isomorphic to their powerset!
But we know from Cantor's diagonal argument that the cardinality of Powerset(S) is always strictly greater than the cardinality of S, so they are never isomorphic.
I think this is why you can now define a fixed point operator, even though you can't without one.
Just to make rampion's code more readable:
-- Mu :: (Mu a -> a) -> Mu a
newtype Mu a = Mu (Mu a -> a)
w :: (Mu a -> a) -> a
w h = h (Mu h)
y :: (a -> a) -> a
y f = w (\(Mu x) -> f (w x))
-- y f = f (y f)
in which w stands for the omega combinator w = \x -> x x, and y stands for the y combinator y = \f -> w (f . w).
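To see this version do some work, here is a sketch that uses it to define factorial without any explicit recursion at the value level. The NOINLINE pragma is my own precaution, not part of the answer above: as far as I know, GHC's inliner can fail to terminate on this kind of self-referential newtype when optimisation is on, and keeping w out of line is the usual workaround.

```haskell
newtype Mu a = Mu (Mu a -> a)

w :: (Mu a -> a) -> a
w h = h (Mu h)
{-# NOINLINE w #-}  -- assumption: guards against the optimiser looping on Mu

y :: (a -> a) -> a
y f = w (\(Mu x) -> f (w x))

-- 'self' stands for the function being defined; all recursion goes through y.
factorial :: Integer -> Integer
factorial = y (\self n -> if n <= 0 then 1 else n * self (n - 1))

main :: IO ()
main = print (factorial 5)   -- 120
```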

Resources