What can the Reader monad do that applicative functions cannot?

Having read http://learnyouahaskell.com/functors-applicative-functors-and-monoids#applicative-functors , I can provide an example of the use of functions as applicative functors:
Let's say f is a function of 4 arguments and fa, fb, fc, fd are all functions that take a single argument. Then, if I'm not mistaken, this applicative expression:
f <$> fa <*> fb <*> fc <*> fd $ x
Means the same as this non-fancy expression:
f (fa x) (fb x) (fc x) (fd x)
Ugh. It took me quite a bit of time to understand why this is the case, but - with the help of a sheet of paper with my notes - I should be able to prove it.
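For instance, here is a quick concrete check of that equivalence (the functions are made up for illustration):

f :: Int -> Int -> Int -> Int -> Int
f a b c d = a + b + c + d

fa, fb, fc, fd :: Int -> Int
fa = (+ 1)
fb = (* 2)
fc = subtract 3
fd = (`div` 2)

-- ghci> f <$> fa <*> fb <*> fc <*> fd $ 10
-- 43
-- ghci> f (fa 10) (fb 10) (fc 10) (fd 10)
-- 43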
Then I read http://learnyouahaskell.com/for-a-few-monads-more#reader . And we're back at this stuff again, this time in the monadic syntax:
do
  a <- fa
  b <- fb
  c <- fc
  d <- fd
  return (f a b c d)
While another A4 sheet of notes was needed for me to prove this, I'm now pretty confident that this, again, means the same:
f (fa x) (fb x) (fc x) (fd x)
I'm confused. Why? What's the use of this?
Or, to be more precise: This seems to me to just duplicate the functionality of functions as applicatives, but with a more verbose syntax.
So, could you give me an example of what the Reader monad can do that functions as applicatives cannot?
Actually, I would also like to ask what the use of either of these two is - applicative functions or the Reader monad. Being able to apply the same argument to four functions (fa, fb, fc, fd) without repeating that argument four times does reduce some repetitiveness, but I'm not sure this minute improvement justifies this level of complexity, so I must be missing something prominent, I think; but that is worthy of a separate question.

The monadic version lets you add additional logic between the calls to the functions found in the context, or even decide not to call them at all.
do
  a <- fa
  if a == 3
    then return (f a 1 1 1)
    else do
      b <- fb
      c <- fc
      d <- fd
      return (f a b c d)
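Specialized to the function monad from the question, the conditional block above amounts to this sketch (f, fa, fb, fc, fd as in the question):

result x =
  let a = fa x
  in if a == 3
       then f a 1 1 1
       else f a (fb x) (fc x) (fd x)

Whether fb, fc and fd are consulted at all now depends on the result of fa - exactly what a fixed applicative chain cannot express.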
In your original do expression, it's true that you aren't doing anything that the Applicative instance couldn't do, and in fact, the compiler can determine that. If you use the ApplicativeDo extension, then
do
  a <- fa
  b <- fb
  c <- fc
  d <- fd
  return (f a b c d)
would indeed desugar to f <$> fa <*> fb <*> fc <*> fd instead of fa >>= \a -> fb >>= \b -> fc >>= \c -> fd >>= \d -> return (f a b c d).
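For instance, under ApplicativeDo a block like the one above needs only an Applicative constraint (a minimal sketch with invented names):

{-# LANGUAGE ApplicativeDo #-}

combine :: Applicative f => (a -> b -> r) -> f a -> f b -> f r
combine g fa fb = do
  a <- fa
  b <- fb
  pure (g a b)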
This all holds for other types as well, for example
Maybe:
f <$> Just 3 <*> Just 5
== Just (f 3 5)
== do
     x <- Just 3
     y <- Just 5
     return (f x y)
[]:
f <$> [1,2] <*> [3,4]
== [f 1 3, f 1 4, f 2 3, f 2 4]
== do
     x <- [1,2]
     y <- [3,4]
     return (f x y)

Before getting to your main question about Reader, I will start with a few remarks about applicative-versus-monad in general. While this applicative style expression...
g <$> fa <*> fb
... is indeed equivalent to this do-block...
do
  x <- fa
  y <- fb
  return (g x y)
... switching from Applicative to Monad makes it possible to make decisions about which computations to perform based on results of other computations, or, in other words, to have effects that depend on previous results (see also chepner's answer):
do
  x <- fa
  y <- if x >= 0 then fb else fc
  return (g x y)
While Monad is more powerful than Applicative, I suggest not thinking of it as if one were more useful than the other. Firstly, because there are applicative functors that aren't monads; secondly, because not using more power than you actually need tends to make things simpler overall. (In addition, such simplicity can sometimes bring tangible benefits, such as an easier time dealing with concurrency.)
A parenthetical note: when it comes to applicative-versus-monad, Reader is a special case, in that the Applicative and Monad instances happen to be equivalent. For the function functor (that is, ((->) r), which is Reader r without the newtype wrapper), we have m >>= f = flip f <*> m. That means that if we take the second do-block I wrote just above (or the analogous one in chepner's answer, etc.) and assume the monad being used is Reader, we can translate it into applicative style.
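To see why that identity holds, one can write out the ((->) r) definitions of both operators; a quick derivation:

-- For functions:  (m >>= f) x = f (m x) x  and  (g <*> m) x = g x (m x)
-- Therefore:
--   (flip f <*> m) x  =  flip f x (m x)  =  f (m x) x  =  (m >>= f) x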
Still, with Reader ultimately being such a simple thing, why should we even bother with any of the above in this specific case? Here are a few suggestions.
To begin with, Haskellers are often wary of the bare function functor, ((->) r), and quite understandably so: it can easily lead to unnecessarily cryptic code when compared to "non-fancy expression[s]" in which functions are applied directly. Still, in a few select cases it can be handy to use. For a tiny example, consider these two functions from Data.Char:
isUpper :: Char -> Bool
isDigit :: Char -> Bool
Now let's say we want to write a function that checks if a character is either an upper case letter or an ASCII digit. The straightforward thing to do is something along the lines of:
\c -> isUpper c || isDigit c
Using the applicative style, though, we can write it immediately in terms of the two functions -- or, I'm inclined to say, the two properties -- without having to note where the eventual argument goes:
(||) <$> isUpper <*> isDigit
With an example as tiny as this one, whether to write it in this way is not a big deal, and largely up to taste -- I quite like it; others can't stand it. The point, though, is that sometimes we aren't particularly concerned about a certain value being a function, because we happen to be thinking of it as something else -- in this case, as a property -- and the fact it is ultimately a function can appear to us as a mere implementation detail.
A quite compelling example of this perspective shift involves application-wide configuration parameters: if every single function across some layer of your program takes some Config value as an argument, chances are you will find it more comfortable treating its availability as a background assumption, rather than passing it around explicitly everywhere. It turns out that is the main use case for the reader monad.
In any case, your suspicions about the usefulness of Reader are somewhat vindicated in at least one manner. It turns out that Reader itself, the functions-but-wrapped-in-a-fancy-newtype functor, isn't actually used all that often in the wild. What is extremely common are monadic stacks that incorporate the functionality of Reader, typically through the means of ReaderT and/or the MonadReader class. Discussing monad transformers at length would be a digression too far for the space of this answer, so I will just note that you can work with, for example, ReaderT r IO much like you would with Reader r, except that you can also slip in IO computations along the way. It is not unusual to see some variant of ReaderT over IO as the core type of the outer layer of a Haskell application.
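To make that concrete, here is a minimal sketch of a ReaderT-over-IO setup using the mtl package; the Config type and its fields are invented for the example:

import Control.Monad (when)
import Control.Monad.Reader (ReaderT, runReaderT, ask, liftIO)

data Config = Config { verbose :: Bool, greeting :: String }

greet :: String -> ReaderT Config IO ()
greet name = do
  cfg <- ask                                        -- the Config is ambient
  when (verbose cfg) $
    liftIO (putStrLn (greeting cfg ++ ", " ++ name))

main :: IO ()
main = runReaderT (greet "world") (Config True "Hello")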
On a final note, you might find it interesting to see what join from Control.Monad does for the function functor, and then work out why that makes sense. (A solution can be found in this Q&A.)

Related

Combining bind and return

Consider:
x `f` y = x >>= (return . y)
This function f seems very similar to <$> and to flip liftM, but <$> doesn't seem to work, and I'd have to define an infix operator for flip liftM to make it look nice - and I'm presuming one already exists?
Is there a function like the one I've described, and what is it?
It is flip liftM, but not <$>. It's also almost exactly the same as flip <$>, but the latter is for the Functor typeclass, not Monad. (In the latest standard libraries the relationship between Functor and Monad is not yet reflected in the typeclass hierarchy, but it will be).
If you want to find where this is defined, you go to FP Complete's Hoogle, enter the type you are looking for
Functor f => f a -> (a -> b) -> f b
and discover it is defined in lens.
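(As an aside, newer versions of base ship this operator directly: Data.Functor exports (<&>), which is flip fmap.)

import Data.Functor ((<&>))   -- in base since 4.11

half :: Maybe Double
half = Just 3 <&> (/ 2)       -- Just 1.5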
Your function
x `f` y = x >>= (return . y)
is equivalent to flip fmap, so if you don't mind swapping the order, you can import Data.Functor, define fmap and write it as
y <$> x
(There's no need to wait for Functor to be a superclass of Monad; you can go ahead today and define it.)
This has nice precedence so you can write stuff like
munge = Just . remove bits . add things <$> operation 1
>>= increase something <$> operation 2
instead of
munge' = do
  thing1 <- operation 1
  let thing2 = Just . remove bits . add things $ thing1
  thing3 <- operation 2
  return . increase something $ thing3
but even nicer, if you import Control.Applicative instead (which also exports <$>), you can combine multiple things, for example:
addLine = (+) <$> readLn <*> readLn >>= print
instead of
addLine' = do
  one <- readLn
  two <- readLn
  print (one + two)
Future-proofing your code
If the Functor-Applicative-proposal goes ahead, you'll have to make all your Monads Applicatives (and hence Functors). You may as well start now.
If your Monad isn't already an Applicative, you can define pure = return and
mf <*> mx = do
  f <- mf
  x <- mx
  return (f x)
If it's not a Functor, you can define
fmap f mx = do
  x <- mx
  return (f x)
The proposal suggests using (<*>) = ap and fmap = liftM, both from Control.Monad, but the definitions above are easy too, and you may well find it even easier in your own Monad.
Data.Generics.Serialization.Standard exports (>>$) which is defined as flip liftM. Not exactly a general-purpose module to depend upon, but you can if you want to. I've seen similar definitions in other application-specific modules. This is an indication that no general-purpose module defines such a function.
The least painful solution is probably to define your own, at least until the big Monad hierarchy overhaul happens.

What advantage does Monad give us over an Applicative?

I've read this article, but didn't understand the last section.
The author says that Monad gives us context sensitivity, but it's possible to achieve the same result using only an Applicative instance:
let maybeAge = (\futureYear birthYear ->
                  if futureYear < birthYear
                    then yearDiff birthYear futureYear
                    else yearDiff futureYear birthYear)
               <$> readMay futureYearString
               <*> readMay birthYearString
It's uglier for sure without do-syntax, but besides that I don't see why we need Monad. Can anyone clear this up for me?
Here's a couple of functions that use the Monad interface.
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM c x y = c >>= \z -> if z then x else y
whileM :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileM p step x = ifM (p x) (step x >>= whileM p step) (return x)
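Here is a small usage sketch (the loop body is made up): in IO, whileM can keep doubling a number until it passes a bound, interleaving output with the test:

main :: IO ()
main = do
  final <- whileM (\x -> pure (x < 100)) (\x -> print x >> pure (x * 2)) 1
  print final   -- prints 1, 2, 4, 8, 16, 32, 64 and finally the result 128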
You can't implement them with the Applicative interface. But for the sake of enlightenment, let's try and see where things go wrong. How about..
import Control.Applicative
ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c x y = (\c' x' y' -> if c' then x' else y') <$> c <*> x <*> y
Looks good! It has the right type, it must be the same thing! Let's just check to make sure..
*Main> ifM (Just True) (Just 1) (Just 2)
Just 1
*Main> ifM (Just True) (Just 1) (Nothing)
Just 1
*Main> ifA (Just True) (Just 1) (Just 2)
Just 1
*Main> ifA (Just True) (Just 1) (Nothing)
Nothing
And there's your first hint at the difference. You can't write a function using just the Applicative interface that replicates ifM.
If you divide this up into thinking about values of the form f a as being about "effects" and "results" (both of which are very fuzzy approximate terms that are the best terms available, but not very good), you can improve your understanding here. In the case of values of type Maybe a, the "effect" is success or failure, as a computation. The "result" is a value of type a that might be present when the computation completes. (The meanings of these terms depends heavily on the concrete type, so don't think this is a valid description of anything other than Maybe as a type.)
Given that setting, we can look at the difference in a bit more depth. The Applicative interface allows the "result" control flow to be dynamic, but it requires the "effect" control flow to be static. If your expression involves 3 computations that can fail, the failure of any one of them causes the failure of the whole computation. The Monad interface is more flexible. It allows the "effect" control flow to depend on the "result" values. ifM chooses which argument's "effects" to include in its own "effects" based on its first argument. This is the huge fundamental difference between ifA and ifM.
There's something even more serious going on with whileM. Let's try to make whileA and see what happens.
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <*> step x) (pure x)
Well... what happens is a compile error. (<*>) doesn't have the right type there. whileA p step has the type a -> f a and step x has the type f a. (<*>) isn't the right shape to fit them together. For it to work, the function type would need to be f (a -> a).
You can try lots more things - but you'll eventually find that whileA has no implementation that works anything even close to the way whileM does. I mean, you can implement the type, but there's just no way to make it both loop and terminate.
Making it work requires either join or (>>=). (Well, or one of the many equivalents of one of those.) And those are the extra things you get out of the Monad interface.
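For illustration, here is one sketch of the loop rebuilt with fmap and join only, no (>>=) in sight:

import Control.Monad (join)

whileJ :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileJ p step x = join (pick <$> p x)
  where
    pick True  = join (whileJ p step <$> step x)   -- flatten the m (m a)
    pick False = return x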
With monads, subsequent effects can depend on previous values. For example, you can have:
main = do
  b <- readLn :: IO Bool
  if b
    then fireMissiles
    else return ()
You can't do that with Applicatives - the result value of one effectful computation can't determine what effect will follow.
Somewhat related:
Why can applicative functors have side effects, but functors can't?
Good examples of Not a Functor/Functor/Applicative/Monad?
As Stephen Tetley said in a comment, that example doesn't actually use context-sensitivity. One way to think about context-sensitivity is that it lets us choose which actions to take depending on monadic values. Applicative computations must always have the same "shape", in a certain sense, regardless of the values involved; monadic computations need not. I personally think this is easier to understand with a concrete example, so let's look at one. Here are two versions of a simple program which asks you to enter a password, checks that you entered the right one, and prints out a response depending on whether or not you did.
import Control.Applicative
checkPasswordM :: IO ()
checkPasswordM = do putStrLn "What's the password?"
                    pass <- getLine
                    if pass == "swordfish"
                      then putStrLn "Correct. The secret answer is 42."
                      else putStrLn "INTRUDER ALERT! INTRUDER ALERT!"

checkPasswordA :: IO ()
checkPasswordA = if' . (== "swordfish")
                   <$> (putStrLn "What's the password?" *> getLine)
                   <*> putStrLn "Correct. The secret answer is 42."
                   <*> putStrLn "INTRUDER ALERT! INTRUDER ALERT!"

if' :: Bool -> a -> a -> a
if' True  t _ = t
if' False _ f = f
Let's load this into GHCi and check what happens with the monadic version:
*Main> checkPasswordM
What's the password?
swordfish
Correct. The secret answer is 42.
*Main> checkPasswordM
What's the password?
zvbxrpl
INTRUDER ALERT! INTRUDER ALERT!
So far, so good. But if we use the applicative version:
*Main> checkPasswordA
What's the password?
hunter2
Correct. The secret answer is 42.
INTRUDER ALERT! INTRUDER ALERT!
We entered the wrong password, but we still got the secret! And an intruder alert! This is because <$> and <*>, or equivalently liftAn/liftMn, always execute the effects of all their arguments. The applicative version translates, in do notation, to
do pass  <- putStrLn "What's the password?" *> getLine
   unit1 <- putStrLn "Correct. The secret answer is 42."
   unit2 <- putStrLn "INTRUDER ALERT! INTRUDER ALERT!"
   pure $ if' (pass == "swordfish") unit1 unit2
And it should be clear why this has the wrong behavior. In fact, every use of applicative functors is equivalent to monadic code of the form
do val1 <- app1
   val2 <- app2
   ...
   valN <- appN
   pure $ f val1 val2 ... valN
(where some of the appI are allowed to be of the form pure xI). And equivalently, any monadic code in that form can be rewritten as
f <$> app1 <*> app2 <*> ... <*> appN
or equivalently as
liftAN f app1 app2 ... appN
To think about this, consider Applicative's methods:
pure :: a -> f a
(<$>) :: (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
And then consider what Monad adds:
(=<<) :: (a -> m b) -> m a -> m b
join :: m (m a) -> m a
(Remember that you only need one of those.)
Handwaving a lot, if you think about it, the only way we can put together the applicative functions is to construct chains of the form f <$> app1 <*> ... <*> appN, and possibly nest those chains (e.g., f <$> (g <$> x <*> y) <*> z). However, (=<<) (or (>>=)) allows us to take a value and produce different monadic computations depending on that value, that could be constructed on the fly. This is what we use to decide whether to compute "print out the secret", or compute "print out an intruder alert", and why we can't make that decision with applicative functors alone; none of the types for applicative functions allow you to consume a plain value.
You can think about join in concert with fmap in a similar way: as I mentioned in a comment, you can do something like
checkPasswordFn :: String -> IO ()
checkPasswordFn pass = if pass == "swordfish"
                         then putStrLn "Correct. The secret answer is 42."
                         else putStrLn "INTRUDER ALERT! INTRUDER ALERT!"

checkPasswordA' :: IO (IO ())
checkPasswordA' = checkPasswordFn <$> (putStrLn "What's the password?" *> getLine)
This is what happens when we want to pick a different computation depending on the value, but only have applicative functionality available to us. We can pick two different computations to return, but they're wrapped inside the outer layer of the applicative functor. To actually use the computation we've picked, we need join:
checkPasswordM' :: IO ()
checkPasswordM' = join checkPasswordA'
And this does the same thing as the previous monadic version (as long as we import Control.Monad first, to get join):
*Main> checkPasswordM'
What's the password?
12345
INTRUDER ALERT! INTRUDER ALERT!
On the other hand, here's a practical example of the Applicative/Monad divide where Applicatives have an advantage: error handling! We clearly have a Monad implementation of Either that carries along errors, but it always terminates early.
Left e1 >> Left e2 === Left e1
You can think of this as an effect of intermingling values and contexts. Since (>>=) will try to pass the result of the Either e a value to a function like a -> Either e b, it must fail immediately if the input Either is Left.
Applicatives only pass their values to the final pure computation after running all of the effects. This means they can delay accessing the values for longer and we can write this.
{-# LANGUAGE DeriveFunctor #-}

data AllErrors e a = Error e | Pure a deriving (Functor)

instance Monoid e => Applicative (AllErrors e) where
  pure = Pure
  (Pure f)   <*> (Pure x)   = Pure (f x)
  (Error e)  <*> (Pure _)   = Error e
  (Pure _)   <*> (Error e)  = Error e
  -- This is the non-Monadic case: both errors are kept and combined
  (Error e1) <*> (Error e2) = Error (e1 <> e2)
It's impossible to write a Monad instance for AllErrors such that ap matches (<*>) because (<*>) takes advantage of running both the first and second contexts before using any values in order to get both errors and (<>) them together. Monadic (>>=) and join can only access contexts interwoven with their values. That's why Either's Applicative instance is left-biased, so that it can also have a harmonious Monad instance.
> Left "a" <*> Left "b"
Left "a"
> Error "a" <*> Error "b"
Error "ab"
With Applicative, the sequence of effectful actions to be performed is fixed at compile-time. With Monad, it can be varied at run-time based on the results of effects.
For example, with an Applicative parser, the sequence of parsing actions is fixed for all time. That means that you can potentially perform "optimisations" on it. On the other hand, I can write a Monadic parser which parses a BNF grammar description, dynamically constructs a parser for that grammar, and then runs that parser over the rest of the input. Every time you run this parser, it potentially constructs a brand new parser to parse the second portion of the input. Applicative has no hope of doing such a thing - and there is no chance of performing compile-time optimisations on a parser that doesn't exist yet...
As you can see, sometimes the "limitation" of Applicative is actually beneficial - and sometimes the extra power offered by Monad is required to get the job done. This is why we have both.
If you try to convert the type signature of Monad's bind and Applicative <*> to natural language, you will find that:
bind: I will give you the contained value and you will return me a new packaged value.
<*>: You give me a packaged function that accepts a contained value and returns a value, and I will use it to create a new packaged value based on my rules.
Now, as you can see from the above descriptions, bind gives you more control compared to <*>.
If you work with Applicatives, the "shape" of the result is already determined by the "shape" of the input, e.g. if you call [f,g,h] <*> [a,b,c,d,e], your result will be a list of 15 elements, regardless which values the variables have. You don't have this guarantee/limitation with monads. Consider [x,y,z] >>= join replicate: For [0,0,0] you'll get the result [], for [1,2,3] the result [1,2,2,3,3,3].
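Both claims are easy to check in GHCi:

> length ([(+ 1), (* 2), subtract 3] <*> [1, 2, 3, 4, 5])
15
> import Control.Monad (join)
> [0, 0, 0] >>= join replicate
[]
> [1, 2, 3] >>= join replicate
[1,2,2,3,3,3]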
Now that the ApplicativeDo extension has become a pretty common thing, the difference between Monad and Applicative can be illustrated with a simple code snippet.
With Monad you can do
do
  r1 <- act1
  if r1
    then act2
    else act3
but having only an Applicative do-block, you can't use if on things you've pulled out with <-.

Monads with Join() instead of Bind()

Monads are usually explained in terms of return and bind. However, I gather you can also implement bind in terms of join (and fmap?).
In programming languages lacking first-class functions, bind is excruciatingly awkward to use. join, on the other hand, looks quite easy.
I'm not completely sure I understand how join works, however. Obviously, it has the [Haskell] type
join :: Monad m => m (m x) -> m x
For the list monad, this is trivially and obviously concat. But for a general monad, what, operationally, does this method actually do? I see what it does to the type signatures, but I'm trying to figure out how I'd write something like this in, say, Java or similar.
(Actually, that's easy: I wouldn't. Because generics is broken. ;-) But in principle the question still stands...)
Oops. It looks like this has been asked before:
Monad join function
Could somebody sketch out some implementations of common monads using return, fmap and join? (I.e., not mentioning >>= at all.) I think perhaps that might help it to sink in to my dumb brain...
Without plumbing the depths of metaphor, might I suggest to read a typical monad m as "strategy to produce a", so the type m value is a first class "strategy to produce a value". Different notions of computation or external interaction require different types of strategy, but the general notion requires some regular structure to make sense:
if you already have a value, then you have a strategy to produce a value (return :: v -> m v) consisting of nothing other than producing the value that you have;
if you have a function which transforms one sort of value into another, you can lift it to strategies (fmap :: (v -> u) -> m v -> m u) just by waiting for the strategy to deliver its value, then transforming it;
if you have a strategy to produce a strategy to produce a value, then you can construct a strategy to produce a value (join :: m (m v) -> m v) which follows the outer strategy until it produces the inner strategy, then follows that inner strategy all the way to a value.
Let's have an example: leaf-labelled binary trees...
data Tree v = Leaf v | Node (Tree v) (Tree v)
...represent strategies to produce stuff by tossing a coin. If the strategy is Leaf v, there's your v; if the strategy is Node h t, you toss a coin and continue by strategy h if the coin shows "heads", t if it's "tails".
instance Monad Tree where
  return = Leaf
A strategy-producing strategy is a tree with tree-labelled leaves: in place of each such leaf, we can just graft in the tree which labels it...
join (Leaf tree) = tree
join (Node h t) = Node (join h) (join t)
...and of course we have fmap which just relabels leaves.
instance Functor Tree where
  fmap f (Leaf x)   = Leaf (f x)
  fmap f (Node h t) = Node (fmap f h) (fmap f t)
Here's a strategy to produce a strategy to produce an Int.
Toss a coin: if it's "heads", toss another coin to decide between two strategies (producing, respectively, "toss a coin for producing 0 or producing 1" or "produce 2"); if it's "tails" produce a third ("toss a coin for producing 3 or tossing a coin for 4 or 5").
That clearly joins up to make a strategy producing an Int.
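Written out as a value of the Tree type above, that description reads (a sketch):

tt :: Tree (Tree Int)
tt = Node (Node (Leaf (Node (Leaf 0) (Leaf 1)))   -- heads, heads
                (Leaf (Leaf 2)))                  -- heads, tails
          (Leaf (Node (Leaf 3)                    -- tails
                      (Node (Leaf 4) (Leaf 5))))

-- join tt = Node (Node (Node (Leaf 0) (Leaf 1)) (Leaf 2))
--                (Node (Leaf 3) (Node (Leaf 4) (Leaf 5)))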
What we're making use of is the fact that a "strategy to produce a value" can itself be seen as a value. In Haskell, the embedding of strategies as values is silent, but in English, I use quotation marks to distinguish using a strategy from just talking about it. The join operator expresses the strategy "somehow produce then follow a strategy", or "if you are told a strategy, you may then use it".
(Meta. I'm not sure whether this "strategy" approach is a suitably generic way to think about monads and the value/computation distinction, or whether it's just another crummy metaphor. I do find leaf-labelled tree-like types a useful source of intuition, which is perhaps not a surprise as they're the free monads, with just enough structure to be monads at all, but no more.)
PS The type of "bind"
(>>=) :: m v -> (v -> m w) -> m w
says "if you have a strategy to produce a v, and for each v a follow-on strategy to produce a w, then you have a strategy to produce a w". How can we capture that in terms of join?
mv >>= v2mw = join (fmap v2mw mv)
We can relabel our v-producing strategy by v2mw, producing instead of each v value the w-producing strategy which follows on from it — ready to join!
join = concat -- []
join f = \x -> f x x -- (e ->)
join f = \s -> let (f', s') = f s in f' s' -- State
join (Just (Just a)) = Just a; join _ = Nothing -- Maybe
join (Identity (Identity a)) = Identity a -- Identity
join (Right (Right a)) = Right a; join (Right (Left e)) = Left e;
join (Left e) = Left e -- Either
join ((a, m), m') = (a, m' `mappend` m) -- Writer
-- N.B. there is a non-newtype-wrapped Monad instance for tuples that
-- behaves like the Writer instance, but with the tuple order swapped
join f = \k -> f (\f' -> f' k) -- Cont
Calling fmap (f :: a -> m b) (x :: m a) produces values (y :: m (m b)), so it is a very natural thing to use join to get back values (z :: m b).
Then bind is defined simply as bind ma f = join (fmap f ma), thus achieving the Kleisli compositionality of functions of the (:: a -> m b) variety, which is what it is really all about:
ma `bind` (f >=> g)  =  (ma `bind` f) `bind` g            -- bind = (>>=)
                     =  (`bind` g) . (`bind` f) $ ma
                     =  join . fmap g . join . fmap f $ ma
And so, with flip bind = (=<<), we have
((g <=< f) =<<) = (g =<<) . (f =<<) = join . (g <$>) . join . (f <$>)
OK, so it's not really good form to answer your own question, but I'm going to note down my thinking in case it enlightens anybody else. (I doubt it...)
If a monad can be thought of as a "container", then both return and join have pretty obvious semantics. return generates a 1-element container, and join turns a container of containers into a single container. Nothing hard about that.
So let us focus on monads which are more naturally thought of as "actions". In that case, m x is some sort of action which yields a value of type x when you "execute" it. return x does nothing special, and then yields x. fmap f takes an action that yields an x, and constructs an action that computes x and then applies f to it, and returns the result. So far, so good.
It's fairly obvious that if f itself generates an action, then what you end up with is m (m x). That is, an action that computes another action. In a way, that's maybe even simpler to wrap your mind around than the >>= function which takes an action and a "function that produces an action" and so on.
So, logically speaking, it seems join would run the first action, take the action it produces, and then run that. (Or rather, join would return an action that does what I just described, if you want to split hairs.)
That seems to be the central idea. To implement join, you want to run an action, which then gives you another action, and then you run that. (Whatever "run" happens to mean for this particular monad.)
Given this insight, I can take a stab at writing some join implementations:
join Nothing = Nothing
join (Just mx) = mx
If the outer action is Nothing, return Nothing, else return the inner action. Then again, Maybe is more of a container than an action, so let's try something else...
newtype Reader s x = Reader (s -> x)
join (Reader f) = Reader (\ s -> let Reader g = f s in g s)
That was... painless. A Reader is really just a function that takes a global state and only then returns its result. So to unstack, you apply the global state to the outer action, which returns a new Reader. You then apply the state to this inner function as well.
In a way, it's perhaps easier than the usual way:
Reader f >>= g = Reader (\ s -> let Reader h = g (f s) in h s)
Now, which one is the reader function, and which one is the function that computes the next reader...?
Now let's try the good old State monad. Here every function takes an initial state as input but also returns a new state along with its output.
data State s x = State (s -> (s, x))
join (State f) = State (\ s0 -> let (s1, State g) = f s0 in g s1)
That wasn't too hard. It's basically run followed by run.
I'm going to stop typing now. Feel free to point out all the glitches and typos in my examples... :-/
I've found many explanations of monads that say "you don't have to know anything about category theory, really, just think of monads as burritos / space suits / whatever".
Really, the article that demystified monads for me just said what categories were, described monads (including join and bind) in terms of categories, and didn't bother with any bogus metaphors:
http://en.wikibooks.org/wiki/Haskell/Category_theory
I think the article is very readable without much math knowledge required.
Asking what a type signature in Haskell does is rather like asking what an interface in Java does.
It, in some literal sense, "doesn't". (Though, of course, you will typically have some sort of purpose associated with it, that's mostly in your mind, and mostly not in the implementation.)
In both cases you are declaring legal sequences of symbols in the language which will be used in later definitions.
Of course, in Java, I suppose you could say that an interface corresponds to a type signature which is going to be implemented literally in the VM. You can get some polymorphism this way -- you can define a name that accepts an interface, and you can provide a different definition for the name which accepts a different interface. Something similar happens in Haskell, where you can provide a declaration for a name which accepts one type and then another declaration for that name which treats a different type.
This is Monad explained in one picture: the two functions in the green category are not composable; when they are mapped to the blue category with join . fmap, they become composable (strictly speaking, the two are one category). Monad is about turning a function of type T -> Monad<U> into a function of type Monad<T> -> Monad<U>.

Explanation of Monad laws

From a gentle introduction to Haskell, there are the following monad laws. Can anyone intuitively explain what they mean?
return a >>= k = k a
m >>= return = m
xs >>= return . f = fmap f xs
m >>= (\x -> k x >>= h) = (m >>= k) >>= h
Here is my attempted explanation:
1. We expect the return function to wrap a so that its monadic nature is trivial. When we bind it to a function, there are no monadic effects; it should just pass a to the function.
2. The unwrapped output of m is passed to return, which rewraps it. The monadic nature remains the same. So it is the same as the original monad.
3. The unwrapped value is passed to f and then rewrapped. The monadic nature remains the same. This is the behavior expected when we transform a normal function into a monadic function.
4. I don't have an explanation for this law. It does say that the monad must be "almost associative", though.
Your descriptions seem pretty good. Generally people speak of three monad laws, which you have as 1, 2, and 4. Your third law is slightly different, and I'll get to that later.
For the three monad laws, I find it much easier to get an intuitive understanding of what they mean when they're re-written using Kleisli composition:
-- defined in Control.Monad
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
mf >=> n = \x -> mf x >>= n
Now the laws can be written as:
1) return >=> mf = mf -- left identity
2) mf >=> return = mf -- right identity
4) (f >=> g) >=> h = f >=> (g >=> h) -- associativity
1) Left Identity Law - returning a value doesn't change the value and doesn't do anything in the monad.
2) Right Identity Law - returning a value doesn't change the value and doesn't do anything in the monad.
4) Associativity - monadic composition is associative (I like KennyTM's answer for this)
The two identity laws basically say the same thing, but they're both necessary because return should have identity behavior on both sides of the bind operator.
Now for the third law. This law essentially says that both the Functor instance and your Monad instance behave the same way when lifting a function into the monad, and that neither does anything monadic. If I'm not mistaken, it's the case that when a monad obeys the other three laws and the Functor instance obeys the functor laws, then this statement will always be true.
A lot of this comes from the Haskell Wiki. The Typeclassopedia is a good reference too.
No disagreements with the other answers, but it might help to think of the monad laws as actually describing two sets of properties. As John says, the third law you mention is slightly different, but here's how the others can be split apart:
Functions that you bind to a monad compose just like regular functions.
As in John's answer, what's called a Kleisli arrow for a monad is a function with type a -> m b. Think of return as id and (<=<) as (.), and the monad laws are the translations of these:
id . f is equivalent to f
f . id is equivalent to f
(f . g) . h is equivalent to f . (g . h)
Sequences of monadic effects append like lists.
For the most part, you can think of the extra monadic structure as a sequence of extra behaviors associated with a monadic value; e.g. Maybe being "give up" for Nothing and "keep going" for Just. Combining two monadic actions then essentially concatenates the sequences of behaviors they held.
In this sense, return is again an identity--the null action, akin to an empty list of behaviors--and (>=>) is concatenation. So, the monad laws are translations of these:
[] ++ xs is equivalent to xs
xs ++ [] is equivalent to xs
(xs ++ ys) ++ zs is equivalent to xs ++ (ys ++ zs)
These three laws describe a ridiculously common pattern, which Haskell unfortunately can't quite express in full generality. If you're interested, Control.Category gives a generalization of "things that look like function composition", while Data.Monoid generalizes the latter case where no type parameters are involved.
In terms of do notation, rule 4 means we can add an extra do block to group a sequence of monadic operations.
do                      do
  x <- m                  y <- do
  y <- k x      <=>         x <- m
  h y                       k x
                          h y
This allows functions that return a monadic value to work properly.
The first three laws say that "return" only wraps a value and does nothing else. So you can eliminate "return" calls without changing the semantics.
The last law is associativity for bind. It means that you take something like:
do
  x <- foo
  bar x
  z <- baz
and turn it into
do
  do
    x <- foo
    bar x
  z <- baz
without changing the meaning. Of course you wouldn't do exactly this, but you might want to put the inner "do" clause in an "if" statement and want it to mean the same when the "if" is true.
Sometimes monads don't exactly follow these laws, particularly when some kind of bottom value occurs. That's OK as long as it's documented and is "morally correct" (i.e. the laws are followed for non-bottom values, or the results are considered equivalent in some other way).

How do you use Control.Applicative to write cleaner Haskell?

In a recent answer to a style question, I wrote
main = untilM (isCorrect 42) (read `liftM` getLine)
and
isCorrect num guess =
  case compare num guess of
    EQ -> putStrLn "You Win!" >> return True
    ...
Martijn helpfully suggested alternatives:
main = untilM (isCorrect 42) (read <$> getLine)
EQ -> True <$ putStrLn "You Win!"
Which common patterns in Haskell code can be made clearer using abstractions from Control.Applicative? What are helpful rules of thumb to keep in mind for using Control.Applicative effectively?
There is a lot to say in answer to your question, however, since you asked, I will offer this "rule of thumb."
If you are using do-notation and your generated values[1] are not used in the expressions that you are sequencing[2], then that code can transform to an Applicative style. Similarly, if you use one or more of the generated values in an expression that is sequenced, then you must use Monad and Applicative is not strong enough to achieve the same code.
For example, let us look at the following code:
do a <- e1
   b <- e2
   c <- e3
   return (f a b c)
We see that in none of the expressions to the right of <- do any of the generated values (a, b, c) appear. Therefore, we can transform it to using Applicative code. Here is one possible transformation:
f <$> e1 <*> e2 <*> e3
and another:
liftA3 f e1 e2 e3
On the other hand, take this piece of code for example:
do a <- e1
   b <- e2 a
   c <- e3
   return (f b c)
This code cannot use Applicative[3] because the generated value a is used later in an expression in the comprehension. This must use Monad to get to its result -- attempt to factor it into Applicative to get a feel for why.
There are some further interesting and useful details on this subject, however, I just intended to give you this rule of thumb whereby you can skim over a do-comprehension and determine pretty quickly if it can be factored into Applicative style code.
[1] Those that appear to the left of <-.
[2] Expressions that appear to the right of <-.
[3] strictly speaking, parts of it could, by factoring out e2 a.
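For the curious, the partial factoring mentioned in footnote [3] looks something like this sketch - the dependency on a forces one bind, but the rest stays applicative:

f <$> (e1 >>= e2) <*> e3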
Basically, monads are also applicative functors [1]. So, whenever you find yourself using liftM, liftM2, etc., you could chain the computation together using <*>. In some sense, you can think of applicative functors as analogous to functions. A pure function f can be lifted by doing f <$> x <*> y <*> z.
Compared to monads, applicative functors cannot run their arguments selectively. The side effects of all the arguments will take place.
import Control.Applicative

ifte condition trueClause falseClause = do
  c <- condition
  if c then trueClause else falseClause

x = ifte (return True) (putStrLn "True") (putStrLn "False")

ifte' condition trueClause falseClause =
  if condition then trueClause else falseClause

y = ifte' <$> pure True <*> putStrLn "True" <*> putStrLn "False"
x only outputs True, whereas y outputs True and False sequentially.
[1] The Typeclassopedia. Highly recommended.
[2] http://www.soi.city.ac.uk/~ross/papers/Applicative.html. Although this is an academic paper, it's not hard to follow.
[3] http://learnyouahaskell.com/functors-applicative-functors-and-monoids#applicative-functors. Explains the deal very well.
[4] http://book.realworldhaskell.org/read/using-parsec.html#id652399. Shows how the monadic Parsec library can also be used in an applicative way.
See The basics of applicative functors, put to practical work by Bryan O'Sullivan.
