In a recent answer to a style question, I wrote
main = untilM (isCorrect 42) (read `liftM` getLine)
and
isCorrect num guess =
  case compare num guess of
    EQ -> putStrLn "You Win!" >> return True
    ...
Martijn helpfully suggested alternatives:
main = untilM (isCorrect 42) (read <$> getLine)
EQ -> True <$ putStrLn "You Win!"
Which common patterns in Haskell code can be made clearer using abstractions from Control.Applicative? What are helpful rules of thumb to keep in mind for using Control.Applicative effectively?
There is a lot to say in answer to your question, but since you asked, I will offer this rule of thumb.
If you are using do-notation and your generated values[1] are not used in the expressions that you are sequencing[2], then that code can be transformed to Applicative style. Conversely, if you use one or more of the generated values in an expression that is sequenced, then you must use Monad; Applicative is not strong enough to express the same code.
For example, let us look at the following code:
do a <- e1
   b <- e2
   c <- e3
   return (f a b c)
We see that in none of the expressions to the right of <- do any of the generated values (a, b, c) appear. Therefore, we can transform it to using Applicative code. Here is one possible transformation:
f <$> e1 <*> e2 <*> e3
and another:
liftA3 f e1 e2 e3
On the other hand, take this piece of code for example:
do a <- e1
   b <- e2 a
   c <- e3
   return (f b c)
This code cannot use Applicative[3] because the generated value a is used later in an expression in the comprehension. This must use Monad to get to its result -- attempt to factor it into Applicative to get a feel for why.
There are further interesting and useful details on this subject, but I only intended to give you this rule of thumb, with which you can skim a do-comprehension and determine pretty quickly whether it can be factored into Applicative-style code.
[1] Those that appear to the left of <-.
[2] Expressions that appear to the right of <-.
[3] strictly speaking, parts of it could, by factoring out e2 a.
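To make the rule of thumb concrete, here is a minimal sketch using Maybe. The values e1, e2, e3 and the function f are hypothetical stand-ins chosen for illustration; the point is only that the three spellings agree when no bound variable appears to the right of a <-.

```haskell
import Control.Applicative (liftA3)

-- Hypothetical stand-ins for e1, e2, e3 and f from the text.
e1, e2, e3 :: Maybe Int
e1 = Just 1
e2 = Just 2
e3 = Just 3

f :: Int -> Int -> Int -> Int
f a b c = a + b * c

-- Monadic spelling: no bound variable is used to the right of a <-.
viaDo :: Maybe Int
viaDo = do
  a <- e1
  b <- e2
  c <- e3
  return (f a b c)

-- The same computation in Applicative style.
viaOperators :: Maybe Int
viaOperators = f <$> e1 <*> e2 <*> e3

viaLiftA3 :: Maybe Int
viaLiftA3 = liftA3 f e1 e2 e3
```

All three evaluate to Just 7 for these sample values.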
Basically, monads are also applicative functors [1]. So, whenever you find yourself using liftM, liftM2, etc., you could chain the computation together using <*>. In some sense, you can think of applicative functors as analogous to functions. A pure function f can be lifted by doing f <$> x <*> y <*> z.
Compared to monads, applicative functors cannot run their arguments selectively: the side effects of all the arguments will take place.
import Control.Applicative
ifte condition trueClause falseClause = do
  c <- condition
  if c then trueClause else falseClause
x = ifte (return True) (putStrLn "True") (putStrLn "False")
ifte' condition trueClause falseClause =
  if condition then trueClause else falseClause
y = ifte' <$> (pure True) <*> (putStrLn "True") <*> (putStrLn "False")
x only outputs True, whereas y outputs True and False sequentially.
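Since IO output is awkward to inspect, here is a sketch of the same comparison with the effects made visible as data. It swaps IO for the writer-like pair ([String], a), whose Applicative and Monad instances ship with base; the helper say is a hypothetical name standing in for putStrLn.

```haskell
-- Assumption: ([String], a) stands in for IO so the "output" can be inspected.
say :: String -> ([String], ())
say s = ([s], ())

-- Monadic if-then-else: only the chosen branch contributes effects.
ifte :: Monad m => m Bool -> m a -> m a -> m a
ifte condition trueClause falseClause = do
  c <- condition
  if c then trueClause else falseClause

-- Pure if-then-else, to be lifted with <$> and <*>.
ifte' :: Bool -> a -> a -> a
ifte' condition trueClause falseClause =
  if condition then trueClause else falseClause

x, y :: ([String], ())
x = ifte (pure True) (say "True") (say "False")
y = ifte' <$> pure True <*> say "True" <*> say "False"
```

Here fst x is ["True"], while fst y is ["True", "False"]: the applicative version accumulates the effects of both branches before the pure function picks a result.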
[1] The Typeclassopedia. Highly recommended.
[2] http://www.soi.city.ac.uk/~ross/papers/Applicative.html. Although this is an academic paper, it's not hard to follow.
[3] http://learnyouahaskell.com/functors-applicative-functors-and-monoids#applicative-functors. Explains the deal very well.
[4] http://book.realworldhaskell.org/read/using-parsec.html#id652399. Shows how the monadic Parsec library can also be used in an applicative way.
See The basics of applicative functors, put to practical work by Bryan O'Sullivan.
Having read http://learnyouahaskell.com/functors-applicative-functors-and-monoids#applicative-functors , I can provide an example of the use of functions as applicative functors:
Let's say f is a function of 4 arguments and fa, fb, fc, fd are all functions that take a single argument. Then, if I'm not mistaken, this applicative expression:
f <$> fa <*> fb <*> fc <*> fd $ x
Means the same as this non-fancy expression:
f (fa x) (fb x) (fc x) (fd x)
Ugh. Took me quite a bit of time to understand why this is the case, but - with the help of a sheet of paper with my notes - I should be able to prove this.
Then I read http://learnyouahaskell.com/for-a-few-monads-more#reader . And we're back at this stuff again, this time in the monadic syntax:
do
  a <- fa
  b <- fb
  c <- fc
  d <- fd
  return (f a b c d)
While another A4 sheet of notes was needed for me to prove this, I'm now pretty confident that this, again, means the same:
f (fa x) (fb x) (fc x) (fd x)
I'm confused. Why? What's the use of this?
Or, to be more precise: This seems to me to just duplicate the functionality of functions as applicatives, but with a more verbose syntax.
So, could you give me an example of what the Reader monad can do that functions as applicatives cannot?
Actually, I would also like to ask what's the use of any of these two: applicative functions OR the Reader monad - because while being able to apply the same argument to four functions (fa, fb, fc, fd) without repeating this argument four times does reduce some repetitiveness, I'm not sure if this minute improvement justifies this level of complexity; so I must be missing something prominent, I think; but this is worthy of a separate question
The monadic version lets you add additional logic between the calls to the functions found in the context, or even decide not to call them at all.
do
  a <- fa
  if a == 3
    then return (f a 1 1 1)
    else do
      b <- fb
      c <- fc
      d <- fd
      return (f a b c d)
In your original do expression, it's true that you aren't doing anything that the Applicative instance couldn't do, and in fact, the compiler can determine that. If you use the ApplicativeDo extension, then
do
  a <- fa
  b <- fb
  c <- fc
  d <- fd
  return (f a b c d)
would indeed desugar to f <$> fa <*> fb <*> fc <*> fd instead of fa >>= \a -> fb >>= \b -> fc >>= \c -> fd >>= \d -> return (f a b c d).
This all holds for other types as well, for example
Maybe:
f <$> (Just 3) <*> (Just 5)
== Just (f 3 5)
== do
     x <- Just 3
     y <- Just 5
     return (f x y)
[]:
f <$> [1,2] <*> [3,4]
== [f 1 3, f 1 4, f 2 3, f 2 4]
== do
     x <- [1,2]
     y <- [3,4]
     return (f x y)
Before getting to your main question about Reader, I will start with a few remarks about applicative-versus-monad in general. While this applicative style expression...
g <$> fa <*> fb
... is indeed equivalent to this do-block...
do
  x <- fa
  y <- fb
  return (g x y)
... switching from Applicative to Monad makes it possible to make decisions about which computations to perform based on results of other computations, or, in other words, to have effects that depend on previous results (see also chepner's answer):
do
  x <- fa
  y <- if x >= 0 then fb else fc
  return (g x y)
While Monad is more powerful than Applicative, I suggest not thinking of it as if one were more useful than the other. Firstly, because there are applicative functors that aren't monads; secondly, because not using more power than you actually need tends to make things simpler overall. (In addition, such simplicity can sometimes bring tangible benefits, such as an easier time dealing with concurrency.)
A parenthetical note: when it comes to applicative-versus-monad, Reader is a special case, in that the Applicative and Monad instances happen to be equivalent. For the function functor (that is, ((->) r), which is Reader r without the newtype wrapper), we have m >>= f = flip f <*> m. That means if we take the second do-block I wrote just above (or the analogous one in chepner's answer, etc.) and assume the monad being used is Reader, we can translate it into applicative style.
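As a rough check of the identity m >>= f = flip f <*> m, here is a sketch that writes the branching do-block for the bare function monad and then translates it by hand. The concrete definitions of fa, fb, fc and g are hypothetical sample functions, not anything from the question.

```haskell
-- Hypothetical sample functions, all reading the same Int "environment".
fa, fb, fc :: Int -> Int
fa = subtract 5
fb = (* 2)
fc = negate

g :: Int -> Int -> Int
g = (+)

-- Monadic version: which function supplies y depends on x.
viaMonad :: Int -> Int
viaMonad = do
  x <- fa
  y <- if x >= 0 then fb else fc
  return (g x y)

-- The same computation with (>>=) replaced by flip k <*> fa.
viaApplicative :: Int -> Int
viaApplicative = flip k <*> fa
  where
    -- k is the continuation of the do-block after binding x.
    k x = g x <$> (if x >= 0 then fb else fc)
```

For instance, viaMonad 10 and viaApplicative 10 both give 25. This only works because, for functions, the "effect" is merely reading the shared argument, so there is nothing to skip.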
Still, with Reader ultimately being such a simple thing, why should we even bother with any of the above in this specific case? Here go a few suggestions.
To begin with, Haskellers are often wary of the bare function functor, ((->) r), and quite understandably so: it can easily lead to unnecessarily cryptic code when compared to "non-fancy expression[s]" in which functions are applied directly. Still, in a few select cases it can be handy to use. For a tiny example, consider these two functions from Data.Char:
isUpper :: Char -> Bool
isDigit :: Char -> Bool
Now let's say we want to write a function that checks if a character is either an upper case letter or an ASCII digit. The straightforward thing to do is something along the lines of:
\c -> isUpper c || isDigit c
Using the applicative style, though, we can write it immediately in terms of the two functions -- or, I'm inclined to say, the two properties -- without having to note where the eventual argument goes:
(||) <$> isUpper <*> isDigit
With an example as tiny as this one, whether to write it in this way is not a big deal, and largely up to taste -- I quite like it; others can't stand it. The point, though, is that sometimes we aren't particularly concerned about a certain value being a function, because we happen to be thinking of it as something else -- in this case, as a property -- and the fact it is ultimately a function can appear to us as a mere implementation detail.
A quite compelling example of this perspective shift involves application-wide configuration parameters: if every single function across some layer of your program takes some Config value as an argument, chances are you will find it more comfortable treating its availability as a background assumption, rather than passing it around explicitly everywhere. It turns out that is the main use case for the reader monad.
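A minimal sketch of that idea, using the bare function monad rather than the Reader newtype so that it needs nothing beyond base. The Config record and its fields are hypothetical, made up for this illustration.

```haskell
-- Hypothetical application-wide configuration.
data Config = Config
  { userName :: String
  , verbose  :: Bool
  }

-- The Config argument is never mentioned; the field accessors are
-- themselves "computations" that read from it.
greeting :: Config -> String
greeting = do
  n <- userName
  v <- verbose
  return (if v then "Hello, " ++ n ++ "!" else n)

-- Computations that read the same Config combine without threading it by hand.
banner :: Config -> String
banner = (\n g -> n ++ ": " ++ g) <$> userName <*> greeting
```

For example, greeting (Config "Ann" True) gives "Hello, Ann!". With the real Reader type the code looks much the same, except that the environment is fetched with ask or asks and the computation is run with runReader.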
In any case, your suspicions about the usefulness of Reader are somewhat vindicated in at least one manner. It turns out that Reader itself, the functions-but-wrapped-in-a-fancy-newtype functor, isn't actually used all that often in the wild. What is extremely common are monadic stacks that incorporate the functionality of Reader, typically through the means of ReaderT and/or the MonadReader class. Discussing monad transformers at length would be a digression too far for the space of this answer, so I will just note that you can work with, for example, ReaderT r IO much like you would with Reader r, except that you can also slip in IO computations along the way. It is not unusual to see some variant of ReaderT over IO as the core type of the outer layer of a Haskell application.
On a final note, you might find it interesting to see what join from Control.Monad does for the function functor, and then work out why that makes sense. (A solution can be found in this Q&A.)
Consider:
x `f` y = x >>= (return . y)
This function f seems very similar to <$> and flip liftM but <$> doesn't seem to work and I'd have to define an infix operator for flip liftM to make it look nice and I'm presuming one already exists?
Is there a function like what I've described and what is it?
It is flip liftM, but not <$>. It's also almost exactly the same as flip (<$>), but the latter is for the Functor typeclass, not Monad. (In the latest standard libraries the relationship between Functor and Monad is not yet reflected in the typeclass hierarchy, but it will be.)
If you want to find where this is defined, you go to FP Complete's Hoogle, enter the type you are looking for
Functor f => f a -> (a -> b) -> f b
and discover it is defined in lens.
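For readers on a newer toolchain: an operator of this type, (<&>), has also been exported from Data.Functor since base 4.11, so a lens dependency is not needed just for this. A small sketch, with hypothetical example values:

```haskell
import Data.Functor ((<&>))

-- The function from the question, written out with (>>=) and return.
viaBind :: Maybe Int
viaBind = Just 20 >>= (return . (+ 1))

-- The same thing with flipped fmap from Data.Functor.
viaFlippedFmap :: Maybe Int
viaFlippedFmap = Just 20 <&> (+ 1)
```

Both evaluate to Just 21.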
Your function
x `f` y = x >>= (return . y)
is equivalent to flip fmap, so if you don't mind swapping the order, you can import Data.Functor, define fmap and write it as
y <$> x
(There's no need to wait for Functor to be a superclass of Monad; you can go ahead today and define it.)
This has nice precedence so you can write stuff like
munge = Just . remove bits . add things <$> operation 1
>>= increase something <$> operation 2
instead of
munge' = do
  thing1 <- operation 1
  let thing2 = Just . remove bits . add things $ thing1
  thing3 <- operation 2
  return . increase something $ thing3
but even nicer, if you import Control.Applicative instead (which also exports <$>), you can combine multiple things, for example:
addLine = (+) <$> readLine <*> readLine >>= print
instead of
addLine' = do
  one <- readLine
  two <- readLine
  print (one + two)
Future-proofing your code
If the Functor-Applicative-proposal goes ahead, you'll have to make all your Monads Applicatives (and hence Functors). You may as well start now.
If your Monad isn't already an Applicative, you can define pure = return and
mf <*> mx = do
  f <- mf
  x <- mx
  return (f x)
If it's not a Functor, you can define
fmap f mx = do
  x <- mx
  return (f x)
The proposal suggests using (<*>) = ap and fmap = liftM, both from Control.Monad, but the definitions above are easy too, and you may well find it even easier in your own Monad.
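Here is a sketch of the ap/liftM route for a small hand-rolled monad. The Counter type and tick are hypothetical, invented for the example; only liftM and ap come from Control.Monad.

```haskell
import Control.Monad (ap, liftM)

-- A hypothetical state-like monad that threads an Int counter.
newtype Counter a = Counter { runCounter :: Int -> (a, Int) }

-- Functor and Applicative are derived mechanically from the Monad instance.
instance Functor Counter where
  fmap = liftM

instance Applicative Counter where
  pure x = Counter (\n -> (x, n))
  (<*>) = ap

instance Monad Counter where
  Counter m >>= k = Counter $ \n ->
    let (a, n') = m n
    in runCounter (k a) n'

-- Return the current counter value and increment it.
tick :: Counter Int
tick = Counter (\n -> (n, n + 1))
```

With these instances, runCounter ((,) <$> tick <*> tick) 0 yields ((0, 1), 2), so the applicative operators sequence effects in the same order as the do-block definitions above.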
Data.Generics.Serialization.Standard exports (>>$) which is defined as flip liftM. Not exactly a general-purpose module to depend upon, but you can if you want to. I've seen similar definitions in other application-specific modules. This is an indication that no general-purpose module defines such a function.
The least painful solution is probably to define your own, at least until the big Monad hierarchy overhaul happens.
I've read this article, but didn't understand last section.
The author says that Monad gives us context sensitivity, but it's possible to achieve the same result using only an Applicative instance:
let maybeAge = (\futureYear birthYear -> if futureYear < birthYear
then yearDiff birthYear futureYear
else yearDiff futureYear birthYear) <$> (readMay futureYearString) <*> (readMay birthYearString)
It's uglier for sure without do-syntax, but beside that I don't see why we need Monad. Can anyone clear this up for me?
Here's a couple of functions that use the Monad interface.
ifM :: Monad m => m Bool -> m a -> m a -> m a
ifM c x y = c >>= \z -> if z then x else y
whileM :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileM p step x = ifM (p x) (step x >>= whileM p step) (return x)
You can't implement them with the Applicative interface. But for the sake of enlightenment, let's try and see where things go wrong. How about..
import Control.Applicative
ifA :: Applicative f => f Bool -> f a -> f a -> f a
ifA c x y = (\c' x' y' -> if c' then x' else y') <$> c <*> x <*> y
Looks good! It has the right type, it must be the same thing! Let's just check to make sure..
*Main> ifM (Just True) (Just 1) (Just 2)
Just 1
*Main> ifM (Just True) (Just 1) (Nothing)
Just 1
*Main> ifA (Just True) (Just 1) (Just 2)
Just 1
*Main> ifA (Just True) (Just 1) (Nothing)
Nothing
And there's your first hint at the difference. You can't write a function using just the Applicative interface that replicates ifM.
If you divide this up into thinking about values of the form f a as being about "effects" and "results" (both of which are very fuzzy approximate terms that are the best terms available, but not very good), you can improve your understanding here. In the case of values of type Maybe a, the "effect" is success or failure, as a computation. The "result" is a value of type a that might be present when the computation completes. (The meanings of these terms depend heavily on the concrete type, so don't think this is a valid description of anything other than Maybe as a type.)
Given that setting, we can look at the difference in a bit more depth. The Applicative interface allows the "result" control flow to be dynamic, but it requires the "effect" control flow to be static. If your expression involves 3 computations that can fail, the failure of any one of them causes the failure of the whole computation. The Monad interface is more flexible. It allows the "effect" control flow to depend on the "result" values. ifM chooses which argument's "effects" to include in its own "effects" based on its first argument. This is the huge fundamental difference between ifA and ifM.
There's something even more serious going on with whileM. Let's try to make whileA and see what happens.
whileA :: Applicative f => (a -> f Bool) -> (a -> f a) -> a -> f a
whileA p step x = ifA (p x) (whileA p step <*> step x) (pure x)
Well.. What happens is a compile error. (<*>) doesn't have the right type there. whileA p step has the type a -> f a and step x has the type f a. (<*>) isn't the right shape to fit them together. For it to work, the function type would need to be f (a -> a).
You can try lots more things - but you'll eventually find that whileA has no implementation that works anything even close to the way whileM does. I mean, you can implement the type, but there's just no way to make it both loop and terminate.
Making it work requires either join or (>>=). (Well, or one of the many equivalents of one of those.) And those are the extra things you get out of the Monad interface.
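To illustrate the join route, here is a sketch that builds both functions from fmap plus join instead of (>>=). The names ifJ and whileJ are hypothetical, chosen to avoid clashing with ifM and whileM above.

```haskell
import Control.Monad (join)

-- fmap picks an action based on the Bool, producing m (m a);
-- join then runs only the action that was picked.
ifJ :: Monad m => m Bool -> m a -> m a -> m a
ifJ c x y = join ((\b -> if b then x else y) <$> c)

-- The loop needs join for the same reason: the rest of the loop is an
-- action computed from the result of step x.
whileJ :: Monad m => (a -> m Bool) -> (a -> m a) -> a -> m a
whileJ p step x = ifJ (p x) (join (whileJ p step <$> step x)) (return x)
```

In GHCi, ifJ (Just True) (Just 1) Nothing gives Just 1, matching ifM, and whileJ (\x -> Just (x < 100)) (\x -> Just (x * 2)) 1 gives Just 128.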
With monads, subsequent effects can depend on previous values. For example, you can have:
main = do
  b <- readLn :: IO Bool
  if b
    then fireMissiles
    else return ()
You can't do that with Applicatives - the result value of one effectful computation can't determine what effect will follow.
Somewhat related:
Why can applicative functors have side effects, but functors can't?
Good examples of Not a Functor/Functor/Applicative/Monad?
As Stephen Tetley said in a comment, that example doesn't actually use context-sensitivity. One way to think about context-sensitivity is that it lets us choose which actions to take depending on monadic values. Applicative computations must always have the same "shape", in a certain sense, regardless of the values involved; monadic computations need not. I personally think this is easier to understand with a concrete example, so let's look at one. Here are two versions of a simple program which ask you to enter a password, check that you entered the right one, and print out a response depending on whether or not you did.
import Control.Applicative
checkPasswordM :: IO ()
checkPasswordM = do putStrLn "What's the password?"
                    pass <- getLine
                    if pass == "swordfish"
                      then putStrLn "Correct. The secret answer is 42."
                      else putStrLn "INTRUDER ALERT! INTRUDER ALERT!"
checkPasswordA :: IO ()
checkPasswordA = if' . (== "swordfish")
                 <$> (putStrLn "What's the password?" *> getLine)
                 <*> putStrLn "Correct. The secret answer is 42."
                 <*> putStrLn "INTRUDER ALERT! INTRUDER ALERT!"
if' :: Bool -> a -> a -> a
if' True t _ = t
if' False _ f = f
Let's load this into GHCi and check what happens with the monadic version:
*Main> checkPasswordM
What's the password?
swordfish
Correct. The secret answer is 42.
*Main> checkPasswordM
What's the password?
zvbxrpl
INTRUDER ALERT! INTRUDER ALERT!
So far, so good. But if we use the applicative version:
*Main> checkPasswordA
What's the password?
hunter2
Correct. The secret answer is 42.
INTRUDER ALERT! INTRUDER ALERT!
We entered the wrong password, but we still got the secret! And an intruder alert! This is because <$> and <*>, or equivalently liftAn/liftMn, always execute the effects of all their arguments. The applicative version translates, in do notation, to
do pass  <- putStrLn "What's the password?" *> getLine
   unit1 <- putStrLn "Correct. The secret answer is 42."
   unit2 <- putStrLn "INTRUDER ALERT! INTRUDER ALERT!"
   pure $ if' (pass == "swordfish") unit1 unit2
And it should be clear why this has the wrong behavior. In fact, every use of applicative functors is equivalent to monadic code of the form
do val1 <- app1
   val2 <- app2
   ...
   valN <- appN
   pure $ f val1 val2 ... valN
(where some of the appI are allowed to be of the form pure xI). And equivalently, any monadic code in that form can be rewritten as
f <$> app1 <*> app2 <*> ... <*> appN
or equivalently as
liftAN f app1 app2 ... appN
To think about this, consider Applicative's methods:
pure :: a -> f a
(<$>) :: (a -> b) -> f a -> f b
(<*>) :: f (a -> b) -> f a -> f b
And then consider what Monad adds:
(=<<) :: (a -> m b) -> m a -> m b
join :: m (m a) -> m a
(Remember that you only need one of those.)
Handwaving a lot, if you think about it, the only way we can put together the applicative functions is to construct chains of the form f <$> app1 <*> ... <*> appN, and possibly nest those chains (e.g., f <$> (g <$> x <*> y) <*> z). However, (=<<) (or (>>=)) allows us to take a value and produce different monadic computations depending on that value, that could be constructed on the fly. This is what we use to decide whether to compute "print out the secret", or compute "print out an intruder alert", and why we can't make that decision with applicative functors alone; none of the types for applicative functions allow you to consume a plain value.
You can think about join in concert with fmap in a similar way: as I mentioned in a comment, you can do something like
checkPasswordFn :: String -> IO ()
checkPasswordFn pass = if pass == "swordfish"
                         then putStrLn "Correct. The secret answer is 42."
                         else putStrLn "INTRUDER ALERT! INTRUDER ALERT!"
checkPasswordA' :: IO (IO ())
checkPasswordA' = checkPasswordFn <$> (putStrLn "What's the password?" *> getLine)
This is what happens when we want to pick a different computation depending on the value, but only have applicative functionality available to us. We can pick two different computations to return, but they're wrapped inside the outer layer of the applicative functor. To actually use the computation we've picked, we need join:
checkPasswordM' :: IO ()
checkPasswordM' = join checkPasswordA'
And this does the same thing as the previous monadic version (as long as we import Control.Monad first, to get join):
*Main> checkPasswordM'
What's the password?
12345
INTRUDER ALERT! INTRUDER ALERT!
On the other hand, here's a practical example of the Applicative/Monad divide where Applicatives have an advantage: error handling! We clearly have a Monad implementation of Either that carries along errors, but it always terminates early.
Left e1 >> Left e2 === Left e1
You can think of this as an effect of intermingling values and contexts. Since (>>=) will try to pass the result of the Either e a value to a function like a -> Either e b, it must fail immediately if the input Either is Left.
Applicatives only pass their values to the final pure computation after running all of the effects. This means they can delay accessing the values for longer and we can write this.
data AllErrors e a = Error e | Pure a deriving (Functor)
instance Monoid e => Applicative (AllErrors e) where
  pure = Pure
  (Pure f) <*> (Pure x) = Pure (f x)
  (Error e) <*> (Pure _) = Error e
  (Pure _) <*> (Error e) = Error e
  -- This is the non-Monadic case
  (Error e1) <*> (Error e2) = Error (e1 <> e2)
It's impossible to write a Monad instance for AllErrors such that ap matches (<*>) because (<*>) takes advantage of running both the first and second contexts before using any values in order to get both errors and (<>) them together. Monadic (>>=) and (join) can only access contexts interwoven with their values. That's why Either's Applicative instance is left-biased, so that it can also have a harmonious Monad instance.
> Left "a" <*> Left "b"
Left "a"
> Error "a" <*> Error "b"
Error "ab"
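As a usage sketch, here is AllErrors applied to form-style validation. The type and instance are repeated so the block stands alone (with a hand-written Functor instance instead of the derived one); the validators checkName and checkAge and the Person type are hypothetical.

```haskell
data AllErrors e a = Error e | Pure a
  deriving (Eq, Show)

instance Functor (AllErrors e) where
  fmap _ (Error e) = Error e
  fmap f (Pure a)  = Pure (f a)

instance Monoid e => Applicative (AllErrors e) where
  pure = Pure
  Pure f   <*> Pure x   = Pure (f x)
  Error e  <*> Pure _   = Error e
  Pure _   <*> Error e  = Error e
  Error e1 <*> Error e2 = Error (e1 <> e2)  -- both errors are kept

-- Hypothetical validators that report problems as a list of messages.
checkName :: String -> AllErrors [String] String
checkName n
  | null n    = Error ["name is empty"]
  | otherwise = Pure n

checkAge :: Int -> AllErrors [String] Int
checkAge a
  | a < 0     = Error ["age is negative"]
  | otherwise = Pure a

data Person = Person String Int
  deriving (Eq, Show)

-- Every validator runs; all failures are collected.
mkPerson :: String -> Int -> AllErrors [String] Person
mkPerson n a = Person <$> checkName n <*> checkAge a
```

Here mkPerson "" (-1) gives Error ["name is empty","age is negative"], whereas the analogous Either code would stop at the first Left.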
With Applicative, the sequence of effectful actions to be performed is fixed at compile-time. With Monad, it can be varied at run-time based on the results of effects.
For example, with an Applicative parser, the sequence of parsing actions is fixed for all time. That means that you can potentially perform "optimisations" on it. On the other hand, I can write a Monadic parser which parses a BNF grammar description, dynamically constructs a parser for that grammar, and then runs that parser over the rest of the input. Every time you run this parser, it potentially constructs a brand new parser to parse the second portion of the input. Applicative has no hope of doing such a thing - and there is no chance of performing compile-time optimisations on a parser that doesn't exist yet...
As you can see, sometimes the "limitation" of Applicative is actually beneficial - and sometimes the extra power offered by Monad is required to get the job done. This is why we have both.
If you try to convert the type signature of Monad's bind and Applicative <*> to natural language, you will find that:
bind: I will give you the contained value and you will return me a new packaged value.
<*>: You give me a packaged function that accepts a contained value and returns a value, and I will use it to create a new packaged value based on my rules.
Now as you can see from the above description, bind gives you more control as compared to <*>
If you work with Applicatives, the "shape" of the result is already determined by the "shape" of the input, e.g. if you call [f,g,h] <*> [a,b,c,d,e], your result will be a list of 15 elements, regardless which values the variables have. You don't have this guarantee/limitation with monads. Consider [x,y,z] >>= join replicate: For [0,0,0] you'll get the result [], for [1,2,3] the result [1,2,2,3,3,3].
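The two list claims can be checked with a small sketch; the concrete functions and numbers below are arbitrary placeholders for f, g, h and a through e.

```haskell
import Control.Monad (join)

-- Three functions applied to five values: always 3 * 5 = 15 results,
-- whatever the functions and values are.
applicativeShape :: [Int]
applicativeShape = [(+ 1), (* 2), negate] <*> [10, 20, 30, 40, 50]

-- With (>>=), the length of the result depends on the values themselves.
-- For functions, join replicate n is replicate n n.
expand :: [Int] -> [Int]
expand xs = xs >>= join replicate
```

length applicativeShape is 15, expand [0,0,0] is [], and expand [1,2,3] is [1,2,2,3,3,3].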
Now that the ApplicativeDo extension has become fairly common, the difference between Monad and Applicative can be illustrated with a simple code snippet.
With Monad you can do
do
  r1 <- act1
  if r1
    then act2
    else act3
but having only Applicative do-block, you can't use if on things you've pulled out with <-.
The Monad class defines a >> method, which sequences two monadic actions:
(>>) :: Monad m => m a -> m b -> m b
The binding operator >>= has a flipped-argument equivalent, =<<; as do the monadic function composition ('fish') operators >=> and <=<. There doesn't seem to be a <<, though (after a few minutes of Hoogling). Why is this?
Edit: I know it's not a big deal. I just like the way certain lines of code look with the left-pointing operators. x <- doSomething =<< doSomethingElse just looks nicer, with the arrows all going the same way, than x <- doSomethingElse >>= doSomething.
To the best of my knowledge there is no good reason. Note, that your Monad should also be an instance of Applicative, so you can use <* and *> instead as your sequencing tools.
Here's an alternative answer, as a similar question was recently asked and marked a duplicate. It turns out that it's not at all clear what the definition of (<<) should be! While this issue was alluded to in the comments on the older answer, I don't think it was made entirely clear that there's a significant problem here.
Obviously, the two reasonable possibilities for a definition are:
(<<) :: Monad m => m a -> m b -> m a
p << q = do {x <- p; q; return x} -- definition #1
p << q = do {q; p} -- definition #2
By analogy with the applicative operators (<*) and (*>), it's clear that the new (<<) operator should preserve the ordering of side effects from left to right and only have the effect of switching which action's return value is used, so definition #1 is obviously the correct one. This has the desirable property that << and <* will be synonymous for (well-behaved) monads, just as >> and *> are synonymous, so no surprises.
Of course, by analogy with =<< and >>=, it's clear that flipping the direction of the greater than signs should have the effect of flipping the arguments, so definition #2 is obviously the correct one. This has the desirable property that a pipeline of monadic operations:
u >>= v >>= w >> x >>= y
can be reversed by flipping the operators:
y =<< x << w =<< v =<< u
This also preserves the identities for the Kleisli operators:
(f >=> g) x === f x >>= g
(f <=< g) x === f =<< g x
which certainly look like they ought to hold.
Anyway, I don't know if this was the original reason (<<) was left out. (Probably not, as the decision would have predated the introduction of applicative operators, so people would have assumed "definition #2" as the only possibility), but I'm pretty sure it would be a sticking point now, as the different behavior of (<<) and (<*) would be quite unexpected given the close association people expect between applicative and monad operations.
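To see the two candidates disagree, here is a sketch that gives them distinct hypothetical operator names, (<<!) for definition #1 and (<<~) for definition #2, and runs them in the writer-like pair monad ([String], a) from base, so the order of effects shows up in the log.

```haskell
-- Definition #1: keep left-to-right effects, return the left result.
(<<!) :: Monad m => m a -> m b -> m a
p <<! q = do { x <- p; _ <- q; return x }

-- Definition #2: simply (>>) with its arguments flipped.
(<<~) :: Monad m => m a -> m b -> m a
p <<~ q = q >> p

-- Two sample actions that each log their own name.
p, q :: ([String], Int)
p = (["p"], 1)
q = (["q"], 2)
```

p <<! q evaluates to (["p","q"],1), the same as p <* q, while p <<~ q evaluates to (["q","p"],1): same result value, opposite effect order.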
From a gentle introduction to Haskell, there are the following monad laws. Can anyone intuitively explain what they mean?
return a >>= k = k a
m >>= return = m
xs >>= return . f = fmap f xs
m >>= (\x -> k x >>= h) = (m >>= k) >>= h
Here is my attempted explanation:
We expect the return function to wrap a so that its monadic nature is trivial. When we bind it to a function, there are no monadic effects, it should just pass a to the function.
The unwrapped output of m is passed to return that rewraps it. The monadic nature remains the same. So it is the same as the original monad.
The unwrapped value is passed to f then rewrapped. The monadic nature remains the same. This is the behavior expected when we transform a normal function into a monadic function.
I don't have an explanation for this law. This does say that the monad must be "almost associative" though.
Your descriptions seem pretty good. Generally people speak of three monad laws, which you have as 1, 2, and 4. Your third law is slightly different, and I'll get to that later.
For the three monad laws, I find it much easier to get an intuitive understanding of what they mean when they're re-written using Kleisli composition:
-- defined in Control.Monad
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
mf >=> n = \x -> mf x >>= n
Now the laws can be written as:
1) return >=> mf = mf -- left identity
2) mf >=> return = mf -- right identity
4) (f >=> g) >=> h = f >=> (g >=> h) -- associativity
1) Left Identity Law - returning a value doesn't change the value and doesn't do anything in the monad.
2) Right Identity Law - returning a value doesn't change the value and doesn't do anything in the monad.
4) Associativity - monadic composition is associative (I like KennyTM's answer for this)
The two identity laws basically say the same thing, but they're both necessary because return should have identity behavior on both sides of the bind operator.
Now for the third law. This law essentially says that both the Functor instance and your Monad instance behave the same way when lifting a function into the monad, and that neither does anything monadic. If I'm not mistaken, it's the case that when a monad obeys the other three laws and the Functor instance obeys the functor laws, then this statement will always be true.
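The laws can be spot-checked for a concrete monad. This is only a sketch for Maybe with two hypothetical Kleisli arrows, halve and decrement; checking a handful of inputs is a sanity test, not a proof.

```haskell
import Control.Monad ((>=>))

-- Hypothetical partial functions used as Kleisli arrows.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

decrement :: Int -> Maybe Int
decrement n
  | n > 0     = Just (n - 1)
  | otherwise = Nothing

leftIdentity, rightIdentity, associativity, fmapLaw :: Int -> Bool
leftIdentity x  = (return >=> halve) x == halve x
rightIdentity x = (halve >=> return) x == halve x
associativity x =
  ((halve >=> decrement) >=> halve) x == (halve >=> (decrement >=> halve)) x
-- The "third law" from the question: binding to return . f is fmap f.
fmapLaw x = (halve x >>= return . (+ 1)) == fmap (+ 1) (halve x)
```

Each property returns True for every Int, for example all associativity [-5 .. 20].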
A lot of this comes from the Haskell Wiki. The Typeclassopedia is a good reference too.
No disagreements with the other answers, but it might help to think of the monad laws as actually describing two sets of properties. As John says, the third law you mention is slightly different, but here's how the others can be split apart:
Functions that you bind to a monad compose just like regular functions.
As in John's answer, what's called a Kleisli arrow for a monad is a function with type a -> m b. Think of return as id and (<=<) as (.), and the monad laws are the translations of these:
id . f is equivalent to f
f . id is equivalent to f
(f . g) . h is equivalent to f . (g . h)
Sequences of monadic effects append like lists.
For the most part, you can think of the extra monadic structure as a sequence of extra behaviors associated with a monadic value; e.g. Maybe being "give up" for Nothing and "keep going" for Just. Combining two monadic actions then essentially concatenates the sequences of behaviors they held.
In this sense, return is again an identity--the null action, akin to an empty list of behaviors--and (>=>) is concatenation. So, the monad laws are translations of these:
[] ++ xs is equivalent to xs
xs ++ [] is equivalent to xs
(xs ++ ys) ++ zs is equivalent to xs ++ (ys ++ zs)
These three laws describe a ridiculously common pattern, which Haskell unfortunately can't quite express in full generality. If you're interested, Control.Category gives a generalization of "things that look like function composition", while Data.Monoid generalizes the latter case where no type parameters are involved.
In terms of do notation, rule 4 means we can add an extra do block to group a sequence of monadic operations.
do                     do
                         y <- do
  x <- m                        x <- m
  y <- k x      <=>             k x
  h y                    h y
This allows functions that return a monadic value to work properly.
The first three laws say that "return" only wraps a value and does nothing else. So you can eliminate "return" calls without changing the semantics.
The last law is associativity for bind. It means that you take something like:
do
  x <- foo
  bar x
  z <- baz
and turn it into
do
  do
    x <- foo
    bar x
  z <- baz
without changing the meaning. Of course you wouldn't do exactly this, but you might want to put the inner "do" clause in an "if" statement and want it to mean the same when the "if" is true.
Sometimes monads don't exactly follow these laws, particularly when some kind of bottom value occurs. That's OK as long as it's documented and is "morally correct" (i.e. the laws are followed for non-bottom values, or the results are considered equivalent in some other way).