How do you implement monoid interface for this tree in haskell?

Please excuse the terminology, my mind is still bending.
The tree:
data Ftree a = Empty | Leaf a | Branch ( Ftree a ) ( Ftree a )
deriving ( Show )
I have a few questions:
If Ftree could not be Empty, would it no longer be a Monoid, since there would be no identity value?
How would you implement mappend with this tree? Can you just arbitrarily graft two trees together willy nilly?
For binary search trees, would you have to introspect some of the elements in both trees to make sure the result of mappend is still a BST?
For the record, some other stuff Ftree could do here:
instance Functor Ftree where
fmap g Empty = Empty
fmap g ( Leaf a ) = Leaf ( g a )
fmap g ( Branch tl tr ) = Branch ( fmap g tl ) ( fmap g tr )
instance Monad Ftree where
return = Leaf
Empty >>= g = Empty
Leaf a >>= g = g a
Branch lt rt >>= g = Branch ( lt >>= g ) ( rt >>= g )

There are three answers to your question: one captious, one unhelpful, and one abstract.
The captious answer
instance Monoid (Ftree a) where
mempty = Empty
mappend = Branch
This is an instance of the Monoid type class, but does not satisfy any of the required properties.
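To see concretely why grafting with Branch is not lawful, here is a minimal sketch. It assumes an extra Eq instance is derived for Ftree (the question only derives Show), and the names graft, leftIdentityHolds and associativityHolds are made up for this illustration.

```haskell
-- The tree from the question, with Eq added so results can be compared.
data Ftree a = Empty | Leaf a | Branch (Ftree a) (Ftree a)
  deriving (Show, Eq)

-- The "captious" combination: graft both trees under a new Branch.
graft :: Ftree a -> Ftree a -> Ftree a
graft = Branch

-- Left identity would require: graft Empty t == t.
leftIdentityHolds :: Bool
leftIdentityHolds = graft Empty (Leaf 'x') == Leaf 'x'

-- Associativity would require both groupings to build the same tree.
associativityHolds :: Bool
associativityHolds =
  graft (Leaf 1) (graft (Leaf 2) (Leaf 3))
    == graft (graft (Leaf 1) (Leaf 2)) (Leaf (3 :: Int))
```

Both values evaluate to False under structural equality. The laws would only hold under a coarser notion of equality, for example one that compares the left-to-right sequence of leaves.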
The unhelpful answer
What Monoid do you want? Just asking for a monoid instance without further information is like asking for a solution without giving the problem. Sometimes there is a natural monoid instance (e.g. for lists) or there is only one (e.g. for (), disregarding questions of definedness). I don’t think either is the case here.
BTW: There would be an interesting monoid instance if your tree would have data at internal nodes that combines two trees recursively...
The abstract answer
Since you gave a Monad Ftree instance, there is a generic way to get a Monoid instance:
instance (Monoid a, Monad f) => Monoid (f a) where
mempty = return mempty
mappend f g = f >>= (\x -> (mappend x) `fmap` g)
Let's check that this is a Monoid. I write <> for mappend. We assume that the Monad laws hold (I did not check that for your definition). At this point, recall the Monad laws written in do-notation.
Our mappend, written in do-notation, is:
mappend f g = do
x <- f
y <- g
return (x <> y)
So we can verify the monoid laws now:
Left identity
mappend mempty g
≡ -- Definition of mappend
do
x <- mempty
y <- g
return (x <> y)
≡ -- Definition of mempty
do
x <- return mempty
y <- g
return (x <> y)
≡ -- Monad law
do
y <- g
return (mempty <> y)
≡ -- Underlying monoid laws
do
y <- g
return y
≡ -- Monad law
g
Right identity
mappend f mempty
≡ -- Definition of mappend
do
x <- f
y <- mempty
return (x <> y)
≡ -- Monad law
do
x <- f
return (x <> mempty)
≡ -- Underlying monoid laws
do
x <- f
return x
≡ -- Monad law
f
And finally the important associativity law
mappend f (mappend g h)
≡ -- Definition of mappend
do
x <- f
y <- do
x' <- g
y' <- h
return (x' <> y')
return (x <> y)
≡ -- Monad law
do
x <- f
x' <- g
y' <- h
y <- return (x' <> y')
return (x <> y)
≡ -- Monad law
do
x <- f
x' <- g
y' <- h
return (x <> (x' <> y'))
≡ -- Underlying monoid law
do
x <- f
x' <- g
y' <- h
return ((x <> x') <> y')
≡ -- Monad law
do
x <- f
x' <- g
z <- return (x <> x')
y' <- h
return (z <> y')
≡ -- Monad law
do
z <- do
x <- f
x' <- g
return (x <> x')
y' <- h
return (z <> y')
≡ -- Definition of mappend
mappend (mappend f g) h
So for every (proper) Monad (and even for every applicative functor, as Jake McArthur pointed out on #haskell), there is a Monoid instance. It may or may not be the one that you are looking for.
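As a sketch of the applicative version of this construction: the newtype Lifted below is a hypothetical wrapper invented for this example (it avoids the overlapping instance head `f a`), and it uses the Semigroup/Monoid split of current base. As far as I know, Data.Monoid ships a similar wrapper called Ap.

```haskell
import Control.Applicative (liftA2)

-- Hypothetical wrapper: combine the values inside an applicative context.
newtype Lifted f a = Lifted { runLifted :: f a }

instance (Applicative f, Semigroup a) => Semigroup (Lifted f a) where
  Lifted x <> Lifted y = Lifted (liftA2 (<>) x y)

instance (Applicative f, Monoid a) => Monoid (Lifted f a) where
  mempty = Lifted (pure mempty)

-- In Maybe, the contents are appended if both sides are Just.
maybeExample :: Maybe String
maybeExample = runLifted (Lifted (Just "ab") <> Lifted (Just "cd") <> mempty)
-- Just "abcd"

-- In the list applicative, every combination is appended.
listExample :: [String]
listExample = runLifted (Lifted ["a", "b"] <> Lifted ["x", "y"])
-- ["ax","ay","bx","by"]
```

Giving Ftree an Applicative instance consistent with its Monad instance would let the same wrapper work for it, though the resulting monoid may not be the one you want.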

Related

Is the implementation of `<*>` based on `fmap` special to Maybe applicative or can it be generalized to other applicatives?

In Maybe applicative, <*> can be implemented based on fmap. Is it incidental, or can it be generalized to other applicative(s)?
(<*>) :: Maybe (a -> b) -> Maybe a -> Maybe b
Nothing <*> _ = Nothing
(Just g) <*> mx = fmap g mx
Thanks.
See also In applicative, how can `<*>` be represented in terms of `fmap_i, i=0,1,2,...`?

It cannot be generalized. A Functor instance is unique:
instance Functor [] where
fmap = map
but there can be multiple valid Applicative instances for the same type constructor.
-- "Canonical" instance: [f, g] <*> [x, y] == [f x, f y, g x, g y]
instance Applicative [] where
pure x = [x]
[] <*> _ = []
(f:fs) <*> xs = fmap f xs ++ (fs <*> xs)
-- Zip instance: [f, g] <*> [x, y] == [f x, g y]
instance Applicative [] where
pure x = repeat x
(f:fs) <*> (x:xs) = f x : (fs <*> xs)
_ <*> _ = []
In the latter, we neither want to apply any single function from the left argument to all elements of the right, nor apply all the functions on the left to any single element on the right, making fmap useless.
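Both behaviours can be tried side by side without defining clashing instances, because base provides the zipping one through the ZipList newtype in Control.Applicative. A small sketch (the names cartesian and zipped are made up here):

```haskell
import Control.Applicative (ZipList (..))

-- Canonical list instance: every function is applied to every element.
cartesian :: [Int]
cartesian = [(+ 1), (* 10)] <*> [1, 2]
-- [2,3,10,20]

-- Zip instance: functions and arguments are paired up positionally.
zipped :: [Int]
zipped = getZipList (ZipList [(+ 1), (* 10)] <*> ZipList [1, 2])
-- [2,20]
```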

Applicative functor evaluation is not clear to me

I am currently reading Learn You a Haskell for Great Good! and am stumbling on the explanation for the evaluation of a certain code block. I've read the explanations several times and am starting to doubt if even the author understands what this piece of code is doing.
ghci> (+) <$> (+3) <*> (*100) $ 5
508
An applicative functor applies a function in some context to a value in some context to get some result in some context. I have spent a few hours studying this code block and have come up with a few explanations for how this expression is evaluated, and none of them are satisfactory. I understand that (5+3)+(5*100) is 508, but the problem is getting to this expression. Does anyone have a clear explanation for this piece of code?

The other two answers have given the detail of how this is calculated - but I thought I might chime in with a more "intuitive" answer to explain how, without going through a detailed calculation, one can "see" that the result must be 508.
As you implied, every Applicative (in fact, even every Functor) can be viewed as a particular kind of "context" which holds values of a given type. As simple examples:
Maybe a is a context in which a value of type a might exist, but might not (usually the result of a computation which may fail for some reason)
[a] is a context which can hold zero or more values of type a, with no upper limit on the number - representing all possible outcomes of a particular computation
IO a is a context in which a value of type a is available as a result of interacting with "the outside world" in some way. (OK that one isn't so simple...)
And, relevant to this example:
r -> a is a context in which a value of type a is available, but its particular value is not yet known, because it depends on some (as yet unknown) value of type r.
The Applicative methods can be very well understood on the basis of values in such contexts. pure embeds an "ordinary value" in a "default context" in which it behaves as closely as possible in that context to a "context-free" one. I won't go through this for each of the 4 examples above (most of them are very obvious), but I will note that for functions, pure = const - that is, a "pure value" a is represented by the function which always produces a no matter what the source value.
Rather than dwell on how <*> can best be described using the "context" metaphor though, I want to dwell on the particular expression:
f <$> a <*> b
where f is a function between 2 "pure values" and a and b are "values in a context". This expression in fact has a synonym as a function: liftA2. Although using the liftA2 function is generally considered less idiomatic than the "applicative style" using <$> and <*>, the name emphasises that the idea is to "lift" a function on "ordinary values" to one on "values in a context". And when thought of like this, I think it is usually very intuitive what this does, given a particular "context" (i.e. a particular Applicative instance).
So the expression:
(+) <$> a <*> b
for values a and b of type say f Int for an Applicative f, behaves as follows for different instances f:
if f = Maybe, then the result, if a and b are both Just values, is to add up the underlying values and wrap them in a Just. If either a or b is Nothing, then the whole expression is Nothing.
if f = [] (the list instance) then the above expression is a list containing all sums of the form a' + b' where a' is in a and b' is in b.
if f = IO, then the above expression is an IO action that performs all the I/O effects of a followed by those of b, and results in the sum of the Ints produced by those two actions.
So what, finally, does it do if f is the function instance? Since a and b are both functions describing how to get a given Int given an arbitrary (Int) input, it is natural that lifting the (+) function over them should be the function that, given an input, gets the result of both the a and b functions, and then adds the results.
And that is, of course, what it does - and the explicit route by which it does that has been very ably mapped out by the other answers. But the reason why it works out like that - indeed, the very reason we have the instance that f <*> g = \x -> f x (g x), which might otherwise seem rather arbitrary (although in actual fact it's one of the very few things, if not the only thing, that will type-check), is so that the instance matches the semantics of "values which depend on some as-yet-unknown other value, according to the given function". And in general, I would say it's often better to think "at a high level" like this than to be forced to go down to the low-level details of exactly how computations are performed. (Although I certainly don't want to downplay the importance of also being able to do the latter.)
[Actually, from a philosophical point of view, it might be more accurate to say that the definition is as it is just because it's the "natural" definition that type-checks, and that it's just happy coincidence that the instance then takes on such a nice "meaning". Mathematics is of course full of just such happy "coincidences" which turn out to have very deep reasons behind them.]
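The cases described above can be checked directly. This is only an illustrative sketch; the binding names are invented for the example.

```haskell
-- Maybe: both operands present, so the sum is wrapped in Just.
maybeSum :: Maybe Int
maybeSum = (+) <$> Just 3 <*> Just 4
-- Just 7

-- Maybe: one operand missing, so the whole expression is Nothing.
maybeMissing :: Maybe Int
maybeMissing = (+) <$> Just 3 <*> Nothing
-- Nothing

-- List: all sums a' + b' with a' from the first list and b' from the second.
listSums :: [Int]
listSums = (+) <$> [1, 2] <*> [10, 20]
-- [11,21,12,22]

-- Functions: feed the same input to both functions, then add the results.
functionSum :: Int
functionSum = ((+) <$> (+ 3) <*> (* 100)) 5
-- 508
```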

It is using the applicative instance for functions. Your code
(+) <$> (+3) <*> (*100) $ 5
is evaluated as
( (\a->\b->a+b) <$> (\c->c+3) <*> (\d->d*100) ) 5 -- f <$> g
( (\x -> (\a->\b->a+b) ((\c->c+3) x)) <*> (\d->d*100) ) 5 -- \x -> f (g x)
( (\x -> (\a->\b->a+b) (x+3)) <*> (\d->d*100) ) 5
( (\x -> \b -> (x+3)+b) <*> (\d->d*100) ) 5
( (\x->\b->(x+3)+b) <*> (\d->d*100) ) 5 -- f <*> g
(\y -> ((\x->\b->(x+3)+b) y) ((\d->d*100) y)) 5 -- \y -> (f y) (g y)
(\y -> (\b->(y+3)+b) (y*100)) 5
(\y -> (y+3)+(y*100)) 5
(5+3)+(5*100)
where <$> is fmap or just function composition ., and <*> is ap if you know how it behaves on monads.

Let us first take a look how fmap and (<*>) are defined for a function:
instance Functor ((->) r) where
fmap = (.)
instance Applicative ((->) a) where
pure = const
(<*>) f g x = f x (g x)
liftA2 q f g x = q (f x) (g x)
The expression we aim to evaluate is:
(+) <$> (+3) <*> (*100) $ 5
or more verbose:
((+) <$> (+3)) <*> (*100) $ 5
Evaluating (<$>), which is an infix synonym for fmap, we see that (+) <$> (+3) is equal to:
(+) . (+3)
so that means our expression is equivalent to:
((+) . (+3)) <*> (*100) $ 5
Next we can apply the sequential application (<*>). Here f is (+) . (+3) and g is (*100), so we construct a function that looks like:
\x -> ((+) . (+3)) x ((*100) x)
We can now simplify this and rewrite this into:
\x -> ((+) (x+3)) ((*100) x)
and then rewrite it to:
\x -> (+) (x+3) ((*100) x)
We thus have constructed a function that looks like:
\x -> (x+3) + 100 * x
or simpler:
\x -> 101 * x + 3
If we then calculate:
(\x -> 101*x + 3) 5
then we of course obtain:
101 * 5 + 3
and thus:
505 + 3
which is the expected:
508
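The rewriting steps above can be sanity-checked by comparing the original expression with the expanded and simplified forms on a range of inputs. This is a spot check rather than a proof, and the names are made up for the sketch.

```haskell
-- The expression from the question, as a function of its input.
original :: Int -> Int
original = (+) <$> (+ 3) <*> (* 100)

-- The form reached after unfolding (<$>) and (<*>).
expanded :: Int -> Int
expanded x = (x + 3) + 100 * x

-- The arithmetically simplified form.
simplified :: Int -> Int
simplified x = 101 * x + 3

-- All three agree on every sampled input.
agree :: Bool
agree = all (\x -> original x == expanded x && expanded x == simplified x) [-100 .. 100]
```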

For any applicative,
a <$> b <*> c = liftA2 a b c
For functions,
liftA2 a b c x
= a (b x) (c x) -- by definition;
= (a . b) x (c x)
= ((a <$> b) <*> c) x
Thus
(+) <$> (+3) <*> (*100) $ 5
=
liftA2 (+) (+3) (*100) 5
=
(+) ((+3) 5) ((*100) 5)
=
(5+3) + (5*100)
(the long version of this answer follows.)
Pure math has no time. Pure Haskell has no time. Speaking in verbs ("applicative functor applies" etc.) can be confusing ("applies... when?...").
Instead, (<*>) is a combinator which combines a "computation" (denoted by an applicative functor) carrying a function (in the context of that type of computations) and a "computation" of the same type, carrying a value (in like context), into one combined "computation" that carries out the application of that function to that value (in such context).
"Computation" is used to contrast it with pure Haskell "calculations" (after Philip Wadler's "Calculating is better than Scheming" paper, itself referring to David Turner's Kent Recursive Calculator language, one of the predecessors of Miranda, the (main) predecessor of Haskell).
"Computations" might or might not be pure themselves, that's an orthogonal issue. But mainly what it means, is that "computations" embody a generalized function call protocol. They might "do" something in addition to / as part of / carrying out the application of a function to its argument. Or in types,
( $ ) :: (a -> b) -> a -> b
(<$>) :: (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
(=<<) :: (a -> f b) -> f a -> f b
With functions, the context is application (another one), and to recover the value -- be it a function or an argument -- the application to a common argument is to be performed.
(bear with me, we're almost there).
The pattern a <$> b <*> c is also expressible as liftA2 a b c. And so, the "functions" applicative functor "computation" type is defined by
liftA2 h x y s = let x' = x s -- embellished application of h to x and y
y' = y s in -- in context of functions, or Reader
h x' y'
-- liftA2 h x y = let x' = x -- non-embellished application, or Identity
-- y' = y in
-- h x' y'
-- liftA2 h x y s = let (x',s') = x s -- embellished application of h to x and y
-- (y',s'') = y s' in -- in context of
-- (h x' y', s'') -- state-passing computations, or State
-- liftA2 h x y = let (x',w) = x -- embellished application of h to x and y
-- (y',w') = y in -- in context of
-- (h x' y', w++w') -- logging computations, or Writer
-- liftA2 h x y = [h x' y' | -- embellished application of h to x and y
-- x' <- x, -- in context of
-- y' <- y ] -- nondeterministic computations, or List
-- ( and for Monads we define `liftBind h x k =` and replace `y` with `k x'`
-- in the bodies of the above combinators; then liftA2 becomes liftBind: )
-- liftA2 :: (a -> b -> c) -> f a -> f b -> f c
-- liftBind :: (a -> b -> c) -> f a -> (a -> f b) -> f c
-- (>>=) = liftBind (\a b -> b) :: f a -> (a -> f b) -> f b
And in fact all the above snippets can be just written with ApplicativeDo as liftA2 h x y = do { x' <- x ; y' <- y ; pure (h x' y') } or even more intuitively as
liftA2 h x y = [h x' y' | x' <- x, y' <- y], with Monad Comprehensions, since all the above computation types are monads as well as applicative functors. This shows by the way that (<*>) = liftA2 ($), which one might find illuminating as well.
Indeed,
> :t let liftA2 h x y r = h (x r) (y r) in liftA2
:: (a -> b -> c) -> (t -> a) -> (t -> b) -> (t -> c)
> :t liftA2 -- the built-in one
liftA2 :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
i.e. the types match when we take f a ~ (t -> a) ~ (->) t a, i.e. f ~ (->) t.
And so, we're already there:
(+) <$> (+3) <*> (*100) $ 5
=
liftA2 (+) (+3) (*100) 5
=
(+) ((+3) 5) ((*100) 5)
=
(+) (5+3) (5*100)
=
(5+3) + (5*100)
It's just how liftA2 is defined for this type, Applicative ((->) t) => ...:
instance Applicative ((->) t) where
pure x t = x
liftA2 h x y t = h (x t) (y t)
There's no need to define (<*>). The source code says:
Minimal complete definition
pure, ((<*>) | liftA2)
So now you've been wanting to ask for a long time, why is it that a <$> b <*> c is equivalent to liftA2 a b c?
The short answer is, it just is. One can be defined in terms of the other -- i.e. (<*>) can be defined via liftA2,
g <*> x = liftA2 id g x -- i.e. (<*>) = liftA2 id = liftA2 ($)
-- (g <*> x) t = liftA2 id g x t
-- = id (g t) (x t)
-- = (id . g) t (x t) -- = (id <$> g <*> x) t
-- = g t (x t)
(which is exactly as it is defined in the source),
and it is a law that every Applicative Functor must follow, that h <$> g = pure h <*> g.
Lastly,
liftA2 h g x == pure h <*> g <*> x
-- h g x == (h g) x
because <*> associates to the left: it is infixl 4 <*>.
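The mutual definability can be written down as two ordinary functions. The names apViaLiftA2 and liftA2ViaAp are hypothetical, chosen for this sketch so that they do not clash with the real class methods.

```haskell
import Control.Applicative (liftA2)

-- (<*>) recovered from liftA2: apply the carried function to the carried value.
apViaLiftA2 :: Applicative f => f (a -> b) -> f a -> f b
apViaLiftA2 = liftA2 id

-- liftA2 recovered from pure and (<*>), relying on left associativity.
liftA2ViaAp :: Applicative f => (a -> b -> c) -> f a -> f b -> f c
liftA2ViaAp h x y = pure h <*> x <*> y

-- Maybe instance: behaves like Just (+ 1) <*> Just 2.
maybeCheck :: Maybe Int
maybeCheck = apViaLiftA2 (Just (+ 1)) (Just 2)
-- Just 3

-- Function instance: the expression from the question.
functionCheck :: Int
functionCheck = liftA2ViaAp (+) (+ 3) (* 100) 5
-- 508
```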

Is `data PoE a = Empty | Pair a a` a monad?

This question comes from this answer in
example of a functor that is Applicative but not a Monad:
It is claimed that the
data PoE a = Empty | Pair a a deriving (Functor,Eq)
cannot have a monad instance, but I fail to see that with:
instance Applicative PoE where
pure x = Pair x x
Pair f g <*> Pair x y = Pair (f x) (g y)
_ <*> _ = Empty
instance Monad PoE where
Empty >>= _ = Empty
Pair x y >>= f = case (f x, f y) of
(Pair x' _,Pair _ y') -> Pair x' y'
_ -> Empty
The actual reason why I believe this to be a monad is that it is isomorphic to Maybe (Pair a) with Pair a = P a a. They are both monads, both traversables so their composition should form a monad, too. Oh, I just found out not always.
Which counter-example fails which monad law? (And how can I find that out systematically?)
edit: I did not expect such an interest in this question. Now I have to make up my mind if I accept the best example or the best answer to the "systematically" part.
Meanwhile, I want to visualize how join works for the simpler Pair a = P a a:
P
________/ \________
/ \
P P
/ \ / \
1 2 3 4
It always takes the outer path, yielding P 1 4, more commonly known as the diagonal in a matrix representation. For monad associativity I need three dimensions, so a tree visualization works better. Taken from chi's answer, this is the failing example for join, and how I can comprehend it.
Pair
_________/\_________
/ \
Pair Pair
/\ /\
/ \ / \
Pair Empty Empty Pair
/\ /\
1 2 3 4
For join . fmap join you collapse the lower levels first; for join . join you collapse from the root.

Apparently, it is not a monad. One of the monad "join" laws is
join . join = join . fmap join
Hence, according to the law above, these two outputs should be equal, but they are not.
main :: IO ()
main = do
let x = Pair (Pair (Pair 1 2) Empty) (Pair Empty (Pair 7 8))
print (join . join $ x)
-- output: Pair 1 8
print (join . fmap join $ x)
-- output: Empty
The problem is that
join x = Pair (Pair 1 2) (Pair 7 8)
fmap join x = Pair Empty Empty
Performing an additional join on those does not make them equal.
how to find that out systematically?
join . join has type m (m (m a)) -> m (m a), so I started with a triple-nested Pair-of-Pair-of-Pair, using numbers 1..8. That worked fine. Then, I tried to insert some Empty inside, and quickly found the counterexample above.
This approach was possible since a m (m (m Int)) only contains a finite amount of integers inside, and we only have constructors Pair and Empty to try.
For these checks, I find the join law easier to test than, say, associativity of >>=.
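The counterexample can also be reproduced without going through a Monad instance at all. In this sketch, joinPoE is a hypothetical standalone function that spells out what join would be for the >>= given in the question.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data PoE a = Empty | Pair a a
  deriving (Functor, Eq, Show)

-- What join would be for the question's (>>=): keep the outer diagonal,
-- and give up as soon as either component on that diagonal is Empty.
joinPoE :: PoE (PoE a) -> PoE a
joinPoE (Pair (Pair x _) (Pair _ y)) = Pair x y
joinPoE _ = Empty

-- The triple-nested value from the answer above.
nested :: PoE (PoE (PoE Int))
nested = Pair (Pair (Pair 1 2) Empty) (Pair Empty (Pair 7 8))

-- join . join: collapse from the root first.
outerFirst :: PoE Int
outerFirst = joinPoE (joinPoE nested)
-- Pair 1 8

-- join . fmap join: collapse the inner layers first.
innerFirst :: PoE Int
innerFirst = joinPoE (fmap joinPoE nested)
-- Empty
```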

QuickCheck immediately finds a counterexample to associativity.
{-# LANGUAGE DeriveFunctor #-}
import Test.QuickCheck
data PoE a = Empty | Pair a a deriving (Functor,Eq, Show)
instance Applicative PoE where
pure x = Pair x x
Pair f g <*> Pair x y = Pair (f x) (g y)
_ <*> _ = Empty
instance Monad PoE where
Empty >>= _ = Empty
Pair x y >>= f = case (f x, f y) of
(Pair x' _,Pair _ y') -> Pair x' y'
_ -> Empty
instance Arbitrary a => Arbitrary (PoE a) where
arbitrary = oneof [pure Empty, Pair <$> arbitrary <*> arbitrary]
prop_assoc :: PoE Bool -> (Bool -> PoE Bool) -> (Bool -> PoE Bool) -> Property
prop_assoc m k h =
((m >>= k) >>= h) === (m >>= (\a -> k a >>= h))
main = do
quickCheck $ \m (Fn k) (Fn h) -> prop_assoc m k h
Output:
*** Failed! Falsifiable (after 35 tests and 3 shrinks):
Pair True False
{False->Pair False False, True->Pair False True, _->Empty}
{False->Pair False True, _->Empty}
Pair False True /= Empty

Since you are interested in how to do it systematically, here's how I found a counterexample with quickcheck:
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad ((>=>))
import Test.QuickCheck
-- <your code>
Defining an arbitrary instance to generate random PoEs.
instance (Arbitrary a) => Arbitrary (PoE a) where
arbitrary = do
emptyq <- arbitrary
if emptyq
then return Empty
else Pair <$> arbitrary <*> arbitrary
And tests for the monad laws:
prop_right_id m = (m >>= return) == m
where
_types = (m :: PoE Int)
prop_left_id fun x = (return x >>= f) == f x
where
_types = fun :: Fun Int (PoE Int)
f = applyFun fun
prop_assoc fun gun hun x = (f >=> (g >=> h)) x == ((f >=> g) >=> h) x
where
_types = (fun :: Fun Int (PoE Int),
gun :: Fun Int (PoE Int),
hun :: Fun Int (PoE Int),
x :: Int)
f = applyFun fun
g = applyFun gun
h = applyFun hun
I don't get any failures for the identity laws, but prop_assoc does generate a counterexample:
ghci> quickCheck prop_assoc
*** Failed! Falsifiable (after 7 tests and 36 shrinks):
{6->Pair 1 (-1), _->Empty}
{-1->Pair (-3) (-4), 1->Pair (-1) (-2), _->Empty}
{-3->Empty, _->Pair (-2) (-4)}
6
While this is not terribly helpful for understanding why the failure occurs, it does give you a place to start. If we look carefully, we see that we are passing (-3) and (-2) to the third function; (-3) maps to Empty and (-2) maps to a Pair, so we can't defer to the laws of either of the two monads PoE is composed of.

This kind of potential Monad instance can be concisely described as "taking the diagonal". It is easier to see why if we use the join presentation. Here is join for the Pair type you mention:
join (P (P a00 a01) (P a10 a11)) = P a00 a11
Taking the diagonal, however, is only guaranteed to give a lawful join for fixed length (or infinite) lists. That's because of the associativity law:
join . join = join . fmap join
If the n-th list in a list of lists doesn't have an n-th element, it will lead to the diagonal being trimmed: it will end before its n-th element. join . join takes the outer diagonal (of a list of lists of lists) first, while join . fmap join takes the inner diagonals first. It may be possible for an insufficiently long innermost list which is not in the outer diagonal to trim join . fmap join, but it can't possibly affect join . join. (This would be easier to show with a picture instead of words.)
Your PoE is a list-like type that doesn't have fixed length (the length is either zero or two). It turns out that taking its diagonal doesn't give us a monad, as the potential issue discussed above actually gets in the way (as illustrated in chi's answer).
Additional notes:
This is precisely the reason ZipList is not a monad: the zippy behaviour amounts to taking the diagonal.
Infinite lists are isomorphic to functions from the naturals, and fixed length lists are isomorphic to functions from the naturals up to an appropriate value. This means you can get a Monad instance for them out of the instance for functions -- and the instance you get, again, amounts to taking the diagonal.
Once upon a time I got confused about this exact issue.
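For the fixed-length case, the diagonal join can be spot-checked on a fully populated 2x2x2 value. This is a sketch with made-up names (joinP, cube), and a single sample is of course not a proof of the law.

```haskell
-- Fixed-length pair: always exactly two elements.
data P a = P a a
  deriving (Eq, Show)

instance Functor P where
  fmap f (P x y) = P (f x) (f y)

-- Taking the diagonal of a 2x2 "matrix".
joinP :: P (P a) -> P a
joinP (P (P a00 _) (P _ a11)) = P a00 a11

-- A fully populated pair of pairs of pairs.
cube :: P (P (P Int))
cube = P (P (P 1 2) (P 3 4)) (P (P 5 6) (P 7 8))

-- Both orders of collapsing pick out the main diagonal, P 1 8.
lawHoldsOnCube :: Bool
lawHoldsOnCube = joinP (joinP cube) == joinP (fmap joinP cube)
```

Because every layer always has exactly two elements, nothing can trim the diagonal here, which is the property PoE lacks.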

(Posting this as a separate answer, as it has little overlap with my other one.)
The actual reason why I believe this to be a monad is that it is isomorphic to Maybe (Pair a) with Pair a = P a a. They are both monads, both traversables so their composition should form a monad, too. Oh, I just found out not always.
The conditions for the composition of monads m-over-n with n traversable are:
-- Using TypeApplications notation to make the layers easier to track.
sequenceA @n @m . pure @n = fmap @m (pure @n)
sequenceA @n @m . fmap @n (join @m)
= join @m . fmap @m (sequenceA @n @m) . sequenceA @n @m
sequenceA @n @m . join @n
= fmap @m (join @n) . sequenceA @n @m . fmap @n (sequenceA @n @m)
(There is also sequenceA @n @m . fmap @n (pure @m) = pure @m, but that always holds.)
In our case, we have m ~ Maybe and n ~ Pair. The relevant method definitions for Pair would be:
fmap f (P x y) = P (f x) (f y)
pure x = P x x
P f g <*> P x y = P (f x) (g y)
join (P (P a00 a01) (P a10 a11)) = P a00 a11 -- Let's pretend join is a method.
sequenceA (P x y) = P <$> x <*> y
Let's check the third property:
sequenceA @n @m . join @n
= fmap @m (join @n) . sequenceA @n @m . fmap @n (sequenceA @n @m)
-- LHS
sequenceA . join $ P (P a00 a01) (P a10 a11)
sequenceA $ P a00 a11
P <$> a00 <*> a11 -- Maybe (Pair a)
-- RHS
fmap join . sequenceA . fmap sequenceA $ P (P a00 a01) (P a10 a11)
fmap join . sequenceA $ P (P <$> a00 <*> a01) (P <$> a10 <*> a11)
fmap join $ P <$> (P <$> a00 <*> a01) <*> (P <$> a10 <*> a11)
fmap join $ (\x y z w -> P (P x y) (P z w)) <$> a00 <*> a01 <*> a10 <*> a11
(\x _ _ w -> P x w) <$> a00 <*> a01 <*> a10 <*> a11 -- Maybe (Pair a)
These are clearly not the same: while any a values will be drawn exclusively from a00 and a11, the effects of a01 and a10 are ignored in the left-hand side, but not in the right-hand side (in other words, if a01 or a10 are Nothing, the RHS will be Nothing, but the LHS won't necessarily be so). The LHS corresponds exactly to the vanishing Empty in chi's answer, and the RHS corresponds to the inner diagonal trimming described in my other answer.
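A concrete instance of the mismatch, as a sketch: take a01 to be Nothing and everything else Just. The type P and the helper joinP are redefined here so the snippet is self-contained, and the derived Traversable instance is assumed to match the sequenceA definition given above.

```haskell
{-# LANGUAGE DeriveTraversable #-}

-- n ~ P, with Functor/Foldable/Traversable derived.
data P a = P a a
  deriving (Eq, Show, Functor, Foldable, Traversable)

-- join for P: take the diagonal.
joinP :: P (P a) -> P a
joinP (P (P a00 _) (P _ a11)) = P a00 a11

-- m ~ Maybe; only a01 (off the diagonal) is Nothing.
sample :: P (P (Maybe Int))
sample = P (P (Just 1) Nothing) (P (Just 3) (Just 4))

-- LHS of the third condition: sequenceA . join
lhs :: Maybe (P Int)
lhs = sequenceA (joinP sample)
-- Just (P 1 4)

-- RHS of the third condition: fmap join . sequenceA . fmap sequenceA
rhs :: Maybe (P Int)
rhs = fmap joinP (sequenceA (fmap sequenceA sample))
-- Nothing
```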
P.S.: I forgot to show that the would-be instance we are talking about here is the same one being discussed in the question:
join' :: m (n (m (n a))) -> m (n a)
join' = fmap @m (join @n) . join @m . fmap @m (sequenceA @n @m)
With m ~ Maybe and n ~ Pair, we have:
join' :: Maybe (Pair (Maybe (Pair a))) -> Maybe (Pair a)
join' = fmap @Maybe (join @Pair) . join @Maybe . fmap @Maybe (sequenceA @Pair @Maybe)
join @Maybe . fmap @Maybe (sequenceA @Pair @Maybe) means the join' will result in Nothing unless there are no Nothings anywhere:
join' = \case
Just (P (Just (P a00 a01)) (Just (P a10 a11))) -> _
_ -> Nothing
Working out the non-Nothing case is straightforward:
fmap join . join . fmap sequenceA $ Just (P (Just (P a00 a01)) (Just (P a10 a11)))
fmap join . join $ Just (Just (P (P a00 a01) (P a10 a11)))
fmap join $ Just (P (P a00 a01) (P a10 a11))
Just (P a00 a11)
Therefore...
join' = \case
Just (P (Just (P a00 _)) (Just (P _ a11))) -> Just (P a00 a11)
_ -> Nothing
... which is essentially the same as:
join = \case
Pair (Pair a00 _) (Pair _ a11) -> Pair a00 a11
_ -> Empty

Relationship between fmap and bind

After looking up the Control.Monad documentation, I'm confused about
this passage:
The above laws imply:
fmap f xs = xs >>= return . f
How do they imply that?

Control.Applicative says
As a consequence of these laws, the Functor instance for f will satisfy
fmap f x = pure f <*> x
The relationship between Applicative and Monad says
pure = return
(<*>) = ap
ap says
return f `ap` x1 `ap` ... `ap` xn
is equivalent to
liftMn f x1 x2 ... xn
Therefore
fmap f x = pure f <*> x
= return f `ap` x
= liftM f x
= do { v <- x; return (f v) }
= x >>= return . f

Functor instances are unique, in the sense that if F is a Functor and you have a function foobar :: (a -> b) -> F a -> F b such that foobar id = id (that is, it follows the first functor law) then foobar = fmap. Now, consider this function:
liftM :: Monad f => (a -> b) -> f a -> f b
liftM f xs = xs >>= return . f
What is liftM id xs, then?
liftM id xs
xs >>= return . id
-- id does nothing, so...
xs >>= return
-- By the second monad law...
xs
liftM id xs = xs; that is, liftM id = id. Therefore, liftM = fmap; or, in other words...
fmap f xs = xs >>= return . f
epheriment's answer, which routes through the Applicative laws, is also a valid way of reaching this conclusion.
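The conclusion is easy to spot-check on a couple of standard monads. In this sketch, fmapViaBind is a hypothetical name for the right-hand side of the equation.

```haskell
-- The right-hand side of the implied law, as a standalone function.
fmapViaBind :: Monad m => (a -> b) -> m a -> m b
fmapViaBind f xs = xs >>= return . f

-- Agreement with fmap on the list monad.
agreesOnList :: Bool
agreesOnList = fmapViaBind (* 2) [1, 2, 3] == fmap (* 2) [1, 2, 3 :: Int]

-- Agreement with fmap on Maybe, for both constructors.
agreesOnJust :: Bool
agreesOnJust = fmapViaBind show (Just (5 :: Int)) == fmap show (Just 5)

agreesOnNothing :: Bool
agreesOnNothing = fmapViaBind show (Nothing :: Maybe Int) == fmap show Nothing
```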

Applicative Laws for the ((->) r) type

I'm trying to check that the Applicative laws hold for the function type ((->) r), and here's what I have so far:
-- Identity
pure (id) <*> v = v
-- Starting with the LHS
pure (id) <*> v
const id <*> v
(\x -> const id x (g x))
(\x -> id (g x))
(\x -> g x)
g x
v
-- Homomorphism
pure f <*> pure x = pure (f x)
-- Starting with the LHS
pure f <*> pure x
const f <*> const x
(\y -> const f y (const x y))
(\y -> f (x))
(\_ -> f x)
pure (f x)
Did I perform the steps for the first two laws correctly?
I'm struggling with the interchange & composition laws. For interchange, so far I have the following:
-- Interchange
u <*> pure y = pure ($y) <*> u
-- Starting with the LHS
u <*> pure y
u <*> const y
(\x -> g x (const y x))
(\x -> g x y)
-- I'm not sure how to proceed beyond this point.
I would appreciate any help for the steps to verify the Interchange & Composition applicative laws for the ((->) r) type. For reference, the Composition applicative law is as follows:
pure (.) <*> u <*> v <*> w = u <*> (v <*> w)

I think in your "Identity" proof, you should replace g with v everywhere (otherwise what is g and where did it come from?). Similarly, in your "Interchange" proof, things look okay so far, but the g that magically appears should just be u. To continue that proof, you could start reducing the RHS and verify that it also produces \x -> u x y.
Composition is more of the same: plug in the definitions of pure and (<*>) on both sides, then start calculating on both sides. You'll soon come to some bare lambdas that will be easy to prove equivalent.
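Before (or after) doing the equational proof, both sides of each law can be compared pointwise on sample inputs. This sketch specialises everything to Int and uses made-up names; it is a sanity check, not a substitute for the proof.

```haskell
-- Interchange: u <*> pure y = pure ($ y) <*> u, with r ~ Int.
interchangeLhs, interchangeRhs :: (Int -> Int -> Int) -> Int -> Int -> Int
interchangeLhs u y = u <*> pure y
interchangeRhs u y = pure ($ y) <*> u

-- Composition: pure (.) <*> u <*> v <*> w = u <*> (v <*> w), with r ~ Int.
compositionLhs, compositionRhs
  :: (Int -> Int -> Int) -> (Int -> Int -> Int) -> (Int -> Int) -> Int -> Int
compositionLhs u v w = pure (.) <*> u <*> v <*> w
compositionRhs u v w = u <*> (v <*> w)

sampleInputs :: [Int]
sampleInputs = [-5 .. 5]

-- Both sides reduce to \r -> u r y; here u = (-), so r - y.
interchangeAgrees :: Bool
interchangeAgrees =
  and [ interchangeLhs (-) y r == interchangeRhs (-) y r
      | y <- sampleInputs, r <- sampleInputs ]

-- Both sides reduce to \r -> u r (v r (w r)).
compositionAgrees :: Bool
compositionAgrees =
  and [ compositionLhs (+) (*) (subtract 1) r == compositionRhs (+) (*) (subtract 1) r
      | r <- sampleInputs ]
```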
