join is defined along with bind to flatten a nested data structure into a single structure.
From the type system's view, (+) 7 :: Num a => a -> a could be considered a Functor, and (+) :: Num a => a -> a -> a a Functor of Functors. How can one get some intuition about this instead of just relying on the type system? Why does join (+) 7 === 14?
Even though it is possible to get the final result by manually stepping through the function-binding process, it would be great if some intuition were given.
This is from the NICTA exercises.
-- | Binds a function on the reader ((->) t).
--
-- >>> ((*) =<< (+10)) 7
-- 119
instance Bind ((->) t) where
  (=<<) ::
    (a -> ((->) t b))
    -> ((->) t a)
    -> ((->) t b)
  (f =<< a) t =
    f (a t) t
-- | Flattens a combined structure to a single structure.
--
-- >>> join (+) 7
-- 14
join ::
  Bind f =>
  f (f a)
  -> f a
join f =
  id =<< f
*Course.State> :t join (+)
join (+) :: Num a => a -> a
*Course.State> :t join
join :: Bind f => f (f a) -> f a
*Course.State> :t (+)
(+) :: Num a => a -> a -> a
how to get some intuition about it instead of just relying on the type system?
I'd rather say that relying on the type system is a great way to build a specific sort of intuition. The type of join is:
join :: Monad m => m (m a) -> m a
Specialised to (->) r, it becomes:
(r -> (r -> a)) -> (r -> a)
Now let's try to define join for functions:
-- join :: (r -> (r -> a)) -> (r -> a)
join f = -- etc.
We know the result must be a r -> a function:
join f = \x -> -- etc.
However, we do not know anything at all about what the r and a types are, and therefore we know nothing in particular about f :: r -> (r -> a) and x :: r. Our ignorance means there is literally just one thing we can do with them: passing x as an argument, both to f and to f x:
join f = \x -> f x x
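Plugging in the question's example, the mechanical expansion (using the NICTA definitions of join and (=<<) quoted in the question) goes like this:
join (+) 7
= (id =<< (+)) 7   -- definition of join
= id ((+) 7) 7     -- definition of (=<<) for ((->) t)
= ((+) 7) 7
= 7 + 7
= 14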
Therefore, join for functions passes the same argument twice because that is the only possible implementation. Of course, that implementation is only a proper monadic join because it follows the monad laws:
join . fmap join = join . join
join . fmap return = id
join . return = id
Verifying that might be another nice exercise.
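As a sketch of that exercise for the function instance, where fmap is (.) and join f x = f x x, the first law can be checked pointwise; both sides reduce to \g x -> g x x x:
import Control.Monad (join)

-- A pointwise check of join . fmap join == join . join for ((->) r).
lhs, rhs :: (r -> r -> r -> a) -> r -> a
lhs = join . fmap join  -- \g x -> join (g x) x = \g x -> g x x x
rhs = join . join       -- \g x -> (join g) x x = \g x -> g x x x

-- e.g. lhs (\a b c -> a + b + c) 2 == 6 == rhs (\a b c -> a + b + c) 2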
Going along with the traditional analogy of a monad as a context for computation, join is a method of combining contexts. Let's start with your example, join (+) 7. Using a function as a monad implies the reader monad. (+ 1) is a reader monad that takes the environment and adds one to it. Thus, (+) is a reader monad within a reader monad: the outer reader takes the environment n and returns a reader of the form (n +), which would then take a new environment. join simply collapses the two layers into one, so that you provide the environment once and it is applied as both arguments: join (+) === \x -> (+) x x.
Now, more in general, let's look at some other examples. The Maybe monad represents potential failure. A value of Nothing is a failed computation, whereas a Just x is a success. A Maybe within a Maybe is a computation that could fail twice. A value of Just (Just x) is obviously a success, so joining that produces Just x. A Nothing or a Just Nothing indicates failure at some point, so joining the possible failure should indicate that the computation failed, i.e. Nothing.
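A quick GHCi check of those cases (join is in Control.Monad):
>>> import Control.Monad (join)
>>> join (Just (Just 3))
Just 3
>>> join (Just Nothing)
Nothing
>>> join (Nothing :: Maybe (Maybe Int))
Nothing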
A similar analogy can be made for the list monad (for which join is merely concat), for the writer monad (which uses the monoidal operator <> to combine the output values in question), or any other monad.
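For instance (a small sketch; the Writer example assumes the transformers package's Control.Monad.Trans.Writer):
import Control.Monad (join)
import Control.Monad.Trans.Writer (writer, runWriter)

-- For lists, join is concat:
flatList :: [Int]
flatList = join [[1, 2], [3]]  -- [1,2,3]

-- For Writer, join combines the two logs with <>:
flatLog :: (Bool, String)
flatLog = runWriter (join (writer (writer (True, "inner"), "outer ")))
-- (True, "outer inner")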
join is a fundamental property of monads and is the operation that makes them significantly stronger than a functor or an applicative functor. Functors can be mapped over, applicatives can be sequenced, monads can be combined. Categorically, a monad is often defined in terms of join and return. It just so happens that in Haskell we find it more convenient to define it in terms of return, (>>=), and fmap, but the two definitions have been proven equivalent.
An intuition about join is that it squashes two containers into one, e.g.
join [[1]] => [1]
join (Just (Just 1)) => Just 1
join (a Christmas tree decorated with small Christmas trees) => a Christmas tree
etc.
Now, how can you join functions? In fact, functions can be seen as containers.
Look at a hash table, for example. You give it a key and you get a value (or not). It's a function key -> value (or, if you prefer, key -> Maybe value).
So how would you join 2 hash maps?
Let's say I have (in Python style) h = {"a": {"a": 1, "b": 2}, "b": {"a": 10, "b": 20}}. How can I join it, or, if you prefer, flatten it?
Given "a", which value should I get? h["a"] gives me {"a": 1, "b": 2}. The only thing I can do with it is to look up "a" again in this new value, which gives me 1.
Therefore join h equals {"a": 1, "b": 20}.
It's the same for a function.
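To make that concrete, here is a hedged sketch where the nested hash above is modelled as a plain Haskell function (the name h is carried over from the example; the encoding is mine):
import Control.Monad (join)

-- The nested table h = {"a": {"a": 1, "b": 2}, "b": {"a": 10, "b": 20}}
-- modelled as a function of functions:
h :: String -> (String -> Int)
h "a" = \k -> if k == "a" then 1 else 2
h _   = \k -> if k == "a" then 10 else 20

-- join h looks the key up twice: in the outer table, then in the inner one.
-- join h "a" == 1, join h "b" == 20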
For example:
Maybe (Maybe Bool) -> Maybe Bool
Just (Just True) -----> Just True
Just (Just False) ----> Just False
Just (Nothing) -------> Nothing
Nothing --------------> ?
It would map Nothing to Nothing. What is the mathematical theory or theorem underlying it?
If related to category theory, what part is it related to?
Is there a mathematical theory related to the Behavior of join of State, Writer, Reader, etc?
Is there any guarantee that m (m a) -> m a is safe?
edited:
Since the result type a of m a -> a would forget the structure, the result type m a of m (m a) -> m a is used instead, so that the effect of the outer m (...) is compounded with the effect of the inner m rather than forgotten.
Strictly speaking, the two pieces of information (effects) are compounded into one just before the nested structure disappears; the nested structure itself no longer exists afterwards.
I thought it was important to guarantee that there is no problem in doing so. Is it up to the programmer, without any special rules or theory?
The compounding doesn't look natural to me; it looks artificial.
Sorry for the vague question, thanks for all the comments.
The Mathematical definition of a monad, with translation to Haskell:
A couple of incidental preliminary notes
I'm going to assume you're familiar with Functors in Haskell. If not I'm tempted to direct you to my explanation on this other question here. I'm not going to explain category theory to you except by translating it into Haskell as best I can.
Identity functor vs Identity Functor instance.
Note: Firstly let me point out that the identity functor in mathematics does nothing, whereas the Identity functor in Haskell adds a newtype wrapper. Whenever we use the mathematical identity functor on a type a we should just get a back, so I won't be using the Identity functor instance.
Natural transformations
Secondly, note that a natural transformation between two functors (either of which may be the identity) is, in Haskell, a polymorphic function e between two types built by (possibly mathematical-identity) Functor instances, for example [a] -> Maybe a or (Int,a) -> Either String a, such that e . fmap f == fmap f . e.
So safeLast :: [a] -> Maybe a is a natural transformation, because safeLast (map f xs) == fmap f (safeLast xs), and even
rejectSomeSmallNumbers :: (Int,a) -> Either String a
rejectSomeSmallNumbers (i,a) = case i of
0 -> Left "Way too small!"
1 -> Left "Too small!"
2 -> Left "Two is small."
3 -> Left "Three is small, too."
_ -> Right a
is a natural transformation because rejectSomeSmallNumbers . fmap f == fmap f . rejectSomeSmallNumbers :: (Int,a) -> Either String b.
A natural transformation can use as much information as it likes about the two functors it connects (eg (,) Int and Either String) but it can't use any information about the type a any more than the functors can. It shouldn't be possible to write a polymorphic function between two valid functor types that's not a natural transformation. See this answer for more information.
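For concreteness, here is a plausible definition of safeLast (assumed here, since the original doesn't spell it out) together with its naturality check:
safeLast :: [a] -> Maybe a
safeLast []     = Nothing
safeLast [x]    = Just x
safeLast (_:xs) = safeLast xs

-- Naturality: safeLast (map f xs) == fmap f (safeLast xs)
-- e.g. safeLast (map (+1) [1,2,3]) == Just 4 == fmap (+1) (safeLast [1,2,3])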
What is a monad according to Maths and Haskell?
Let H be a category (let Hask be the kind of all Haskell types together with function types, functions, etc etc).
A monad on H is (a monad in Hask is)
an endofunctor M : H -> H
a type constructor m: * -> * which has a Functor instance with fmap :: (a -> b) -> (m a -> m b)
and a natural transformation eta : 1_H -> M
a polymorphic function from a -> m a, called pure, defined in the Applicative instance
a natural transformation mu : M^2 -> M
a polymorphic function from m (m a) -> m a, called join that is defined in Control.Monad
such that the following two rules hold:
mu . M mu == mu . mu M as natural transformations M^3 -> M
join . fmap join == join . join :: m (m (m a)) -> m a
mu . M eta == mu . eta M == 1_H as natural transformations M -> M
join . fmap pure == join . pure == id :: m a -> m a
What do these two rules mean?
Just to give you a handle on what these two conditions are saying, here they are when we're using lists:
(join . fmap join) [xss, yss, zss]
== join [join xss, join yss, join zss]
== join (join [xss, yss, zss])
and
join (fmap pure xs)
== join [[x] | x <- xs]
== xs
== id xs
== join [xs]
== join (pure xs)
(Fun fact: join isn't part of the Haskell Monad class definition. I have a perhaps unreliable memory that it used to be, but in Control.Monad it's defined as join x = x >>= id and, as commented there, it could be defined as join bss = do { bs <- bss; bs }.)
What does this mean for the Maybe monad in your example?
Well firstly, because join is polymorphic (mu is a natural transformation), it can't use any information about the type a in Maybe a, so we couldn't for example make it so that join (Just (Just False)) = Just True or join (Just Nothing) = Just False because we can only use values that are already in the Maybe a we're given:
join :: Maybe (Maybe a) -> Maybe a
join Nothing = Nothing -- can't provide Just a because we have no a
join (Just Nothing) = Nothing -- same reason
join (Just (Just a)) =
-- two choices: we could do the obviously correct `Just a` or collapse everything with `Nothing`.
pure :: a -> Maybe a
pure a =
-- two choices: we could do the obviously correct `Just a` or collapse everything with `Nothing`.
What stops us doing the crazy Nothing thing?
Let's look at the two rules, specialising to Maybe and to the Just branches, since all the Nothings are inevitably Nothing because of polymorphism.
(join . fmap join) (Just maybemaybe)
== join (Just (join maybemaybe))
== join (join (Just maybemaybe)) -- required by the rule
That one works if we put Just a in the definition, or if we put Nothing, too.
In the second rule:
join (fmap pure (Just a))
== join (Just (pure a))
== join (pure (Just a))
== id (Just a) -- by the rule
== Just a
Well that forces pure to be Just, and at the same time forces join (Just (Just a)) to give us Just a.
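Putting the two constraints together, the laws leave only one lawful choice (filling in the holes from the sketch above):
join :: Maybe (Maybe a) -> Maybe a
join Nothing         = Nothing
join (Just Nothing)  = Nothing
join (Just (Just a)) = Just a

pure :: a -> Maybe a
pure = Just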
Reader
Let's ditch the newtype wrapping to make the laws easier to talk about.
type Reader input a = input -> a
We'd need
join :: Reader input (Reader input a) -> Reader input a
join (make_an_a_maker :: input -> (input -> a)) :: input -> a
join make_an_a_maker input = (make_an_a_maker input) input
There isn't anything else we can do without using undefined or similar.
So what stops you making crazy join functions?
Most of the time, the fact that you're making a polymorphic function, some of the time because you want to do the obviously correct thing and it works, and the rest of the time because you chose to follow the rules.
Not-relevant nerd note:
I prefer to think of Monads as type constructors m so that Kleisli composition is associative, with the unit being pure:
(>=>) :: (a -> m b)
      -> (b -> m c)
      -> (a -> m c)
(first >=> second) a = do
b <- first a
c <- second b
return c
or if you prefer
(first >=> second) a =
first a >>= \b -> second b
so the laws are
(one >=> two) >=> three == one >=> (two >=> three) and
k >=> pure == pure >=> k == k
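A concrete instance of the associativity law in the Maybe monad (half and dec are illustrative names of my own):
import Control.Monad ((>=>))

half :: Int -> Maybe Int
half n = if even n then Just (n `div` 2) else Nothing

dec :: Int -> Maybe Int
dec n = if n > 0 then Just (n - 1) else Nothing

-- ((half >=> dec) >=> half) 10 == Just 2 == (half >=> (dec >=> half)) 10
-- and both sides send 8 to Nothing, since half 8 = Just 4, dec 4 = Just 3,
-- and 3 is odd.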
I think your confusion is over the fact that Nothing is not a single value. It is a polymorphic value whose type can be specialized in any number of ways, yielding distinct values depending on how a is fixed:
> :set -XTypeApplications
> :t Nothing
Nothing :: Maybe a
> :t Nothing #Int
Nothing #Int :: Maybe Int
> :t Nothing #Bool
Nothing #Bool :: Maybe Bool
> :t Nothing #(Maybe Bool)
Nothing #(Maybe Bool) :: Maybe (Maybe Bool)
Similarly, join :: Monad m => m (m a) -> m a can be specialized:
> :t join #Maybe
join #Maybe :: Maybe (Maybe a) -> Maybe a
> :t join #Maybe #Bool
join #Maybe #Bool :: Maybe (Maybe Bool) -> Maybe Bool
Maybe (Maybe Bool) has four values:
Just (Just True)
Just (Just False)
Just Nothing
Nothing
Maybe Bool has three values:
Just True
Just False
Nothing
join :: Maybe (Maybe Bool) -> Maybe Bool is not an injection; it maps two different values of type Maybe (Maybe Bool) to the same value of type Maybe Bool:
join (Just (Just True)) == Just True
join (Just (Just False)) == Just False
join (Just Nothing) == Nothing
join Nothing == Nothing
Both Just Nothing :: Maybe (Maybe Bool) and Nothing :: Maybe (Maybe Bool) are mapped to Nothing :: Maybe Bool.
The Monad typeclass can be defined in terms of return and (>>=). However, if we already have a Functor instance for some type constructor f, then this definition is sort of 'more than we need' in that (>>=) and return could be used to implement fmap so we're not making use of the Functor instance we assumed.
In contrast, defining return and join seems like a more 'minimal'/less redundant way to make f a Monad. This way, the Functor constraint is essential because fmap cannot be written in terms of these operations. (Note join is not necessarily the only minimal way to go from Functor to Monad: I think (>=>) works as well.)
Similarly, Applicative can be defined in terms of pure and (<*>), but this definition again does not take advantage of the Functor constraint since these operations are enough to define fmap.
However, Applicative f can also be defined using unit :: f () and (>*<) :: f a -> f b -> f (a, b). These operations are not enough to define fmap so I would say in some sense this is a more minimal way to go from Functor to Applicative.
Is there a characterization of Monad as fmap, unit, (>*<), and some other operator which is minimal in that none of these functions can be derived from the others?
(>>=) does not work, since it can implement a >*< b = a >>= (\ x -> b >>= \ y -> pure (x, y)) where pure x = fmap (const x) unit.
Nor does join since m >>= k = join (fmap k m) so (>*<) can be implemented as above.
I suspect (>=>) fails similarly.
I have something, I think. It's far from elegant, but maybe it's enough to get you unstuck, at least. I started with join :: m (m a) -> ??? and asked "what could it produce that would require (<*>) to get back to m a?", which I found to be a fruitful line of thought that probably has more spoils.
If you introduce a new type T which can only be constructed inside the monad:
t :: m T
Then you could define a join-like operation which requires such a T:
joinT :: m (m a) -> m (T -> a)
The only way we can produce the T we need to get to the sweet, sweet a inside is by using t, and then we have to combine that with the result of joinT somehow. There are two basic operations that can combine two ms into one: (<*>) and joinT -- fmap is no help. joinT is not going to work, because we'll just need yet another T to use its result, so (<*>) is the only option, meaning that (<*>) can't be defined in terms of joinT.
You could roll that all up into an existential, if you prefer.
joinT :: (forall t. m t -> (m (m a) -> m (t -> a)) -> r) -> r
I got the impression that (>>=) (used by Haskell) and join (preferred by mathematicians) are "equal" since one can write one in terms of the other:
import Control.Monad (join)
join x = x >>= id
x >>= f = join (fmap f x)
Additionally every monad is a functor since bind can be used to replace fmap:
fmap f x = x >>= (return . f)
I have the following questions:
Is there a (non-recursive) definition of fmap in terms of join? (fmap f x = join $ fmap (return . f) x follows from the equations above but is recursive.)
Is "every monad is a functor" a conclusion when using bind (in the definition of a monad), but an assumption when using join?
Is bind more "powerful" than join? And what would "more powerful" mean?
A monad can be either defined in terms of:
return :: a -> m a
bind :: m a -> (a -> m b) -> m b
or alternatively in terms of:
return :: a -> m a
fmap :: (a -> b) -> m a -> m b
join :: m (m a) -> m a
To your questions:
No, we cannot define fmap in terms of join, since otherwise we could remove fmap from the second list above.
No, "every monad is a functor" is a statement about monads in general, regardless whether you define your specific monad in terms of bind or in terms of join and fmap. It is easier to understand the statement if you see the second definition, but that's it.
Yes, bind is more "powerful" than join. It is exactly as "powerful" as join and fmap combined, if you mean with "powerful" that it has the capacity to define a monad (always in combination with return).
For an intuition, see e.g. this answer – bind allows you to combine or chain strategies/plans/computations (that are in a context) together. As an example, let's use the Maybe context (or Maybe monad):
λ: let plusOne x = Just (x + 1)
λ: Just 3 >>= plusOne
Just 4
fmap also lets you chain computations in a context together, but at the cost of increasing the nesting with every step.[1]
λ: fmap plusOne (Just 3)
Just (Just 4)
That's why you need join: to squash two levels of nesting into one. Remember:
join :: m (m a) -> m a
Having only the squashing step doesn't get you very far. You also need fmap to have a monad – and return, which is Just in the example above.
[1]: fmap and (>>=) don't take their two arguments in the same order, but don't let that confuse you.
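To see the equivalence x >>= f = join (fmap f x) at work on the example above (a small sketch):
import Control.Monad (join)

plusOne :: Int -> Maybe Int
plusOne x = Just (x + 1)

viaBind, viaJoin :: Maybe Int
viaBind = Just 3 >>= plusOne            -- Just 4
viaJoin = join (fmap plusOne (Just 3))  -- Just 4: fmap nests, join squashes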
Is there a [definition] of fmap in terms of join?
No, there isn't. That can be demonstrated by attempting to do it. Suppose we are given an arbitrary type constructor T, and functions:
returnT :: a -> T a
joinT :: T (T a) -> T a
From this data alone, we want to define:
fmapT :: (a -> b) -> T a -> T b
So let's sketch it:
fmapT :: (a -> b) -> T a -> T b
fmapT f ta = tb
where
tb = undefined -- tb :: T b
We need to get a value of type T b somehow. ta :: T a on its own won't do, so we need functions that produce T b values. The only two candidates are joinT and returnT. joinT doesn't help:
fmapT :: (a -> b) -> T a -> T b
fmapT f ta = joinT ttb
where
ttb = undefined -- ttb :: T (T b)
It just kicks the can down the road, as needing a T (T b) value under these circumstances is no improvement.
We might try returnT instead:
fmapT :: (a -> b) -> T a -> T b
fmapT f ta = returnT b
where
b = undefined -- b :: b
Now we need a b value. The only thing that can give us one is f:
fmapT :: (a -> b) -> T a -> T b
fmapT f ta = returnT (f a)
where
a = undefined -- a :: a
And now we are stuck: nothing can give us an a. We have exhausted all available possibilities, so fmapT cannot be defined in such terms.
A digression: it wouldn't suffice to cheat by using a function like this:
extractT :: T a -> a
With an extractT, we might try a = extractT ta, leading to:
fmapT :: (a -> b) -> T a -> T b
fmapT f ta = returnT (f (extractT ta))
It is not enough, however, for fmapT to have the right type: it must also follow the functor laws. In particular, fmapT id = id should hold. With this definition, fmapT id is returnT . extractT, which, in general, is not id (most functors which are instances of both Monad and Comonad serve as examples).
Is "every monad is a functor" a conclusion when using bind (in the definition of a monad), but an assumption when using join?
"Every monad is a functor" is an assumption, or, more precisely, part of the definition of monad. To pick an arbitrary illustration, here is Emily Riehl, Category Theory In Context, p. 154:
Definition 5.1.1. A monad on a category C consists of
an endofunctor T : C → C,
a unit natural transformation η : 1_C ⇒ T, and
a multiplication natural transformation μ : T² ⇒ T,
so that the following diagrams commute in C^C: [diagrams of the monad laws]
A monad, therefore, involves an endofunctor by definition. For a Haskell type constructor T that instantiates Monad, the object mapping of that endofunctor is T itself, and the morphism mapping is its fmap. That T will be a Functor instance, and therefore will have an fmap, is, in contemporary Haskell, guaranteed by Applicative (and, by extension, Functor) being a superclass of Monad.
Is that the whole story, though? As far as Haskell is concerned, we know that liftM exists, and also that in a not-so-distant past Functor was not a superclass of Monad. Are those two facts mere Haskellisms? Not quite. In the classic paper Notions of computation and monads, Eugenio Moggi unearths the following definition (p. 3):
Definition 1.2 ([Man76]) A Kleisli triple over a category C is a triple (T, η, _*), where T : Obj(C) → Obj(C), η_A : A → T A for A ∈ Obj(C), f* : T A → T B for f : A → T B, and the following equations hold:
η_A* = id_(T A)
η_A ; f* = f for f : A → T B
f* ; g* = (f ; g*)* for f : A → T B and g : B → T C
The important detail here is that T is presented as merely an object mapping in the category C, and not as an endofunctor in C. Working in the Hask category, that amounts to taking a type constructor T without presupposing it is a Functor instance. In code, we might write that as:
class KleisliTriple t where
return :: a -> t a
(=<<) :: (a -> t b) -> t a -> t b
-- (return =<<) = id
-- (f =<<) . return = f
-- (g =<<) . (f =<<) = ((g =<<) . f =<<)
Flipped bind aside, that is the pre-AMP definition of Monad in Haskell. Unsurprisingly, Moggi's paper doesn't take long to show that "there is a one-to-one correspondence between Kleisli triples and monads" (p. 5), establishing along the way that T can be extended to an endofunctor (in Haskell, that step amounts to defining the morphism mapping liftM f m = return . f =<< m, and then showing it follows the functor laws).
All in all, if you write lawful definitions of return and (>>=) without presupposing fmap, you indeed get a lawful implementation of Functor as a consequence. "There is a one-to-one correspondence between Kleisli triples and monads" is a consequence of the definition of Kleisli triple, while "a monad involves an endofunctor" is part of the definition of monad. It is tempting to consider whether it would be more accurate to describe what Haskellers did when writing Monad instances as "setting up a Kleisli triple" rather than "setting up a monad", but I will refrain out of fear of getting mired in terminological pedantry -- and in any case, now that Functor is a superclass of Monad there is no practical reason to worry about that.
Is bind more "powerful" than join? And what would "more powerful" mean?
Trick question!
Taken at face value, the answer would be yes, to the extent that, together with return, (>>=) makes it possible to implement fmap (via liftM, as noted above), while join doesn't. However, I don't feel it is worthwhile to insist on this distinction. Why so? Because of the monad laws. Just like it doesn't make sense to talk about a lawful (>>=) without presupposing return, it doesn't make sense to talk about a lawful join without presupposing return and fmap.
One might get the impression that I am giving too much weight to the laws by using them to tie Monad and Functor in this way. It is true that there are cases of laws that involve two classes, and that only apply to types which instantiate them both. Foldable provides a good example of that: we can find the following law in the Traversable documentation:
The superclass instances should satisfy the following: [...]
In the Foldable instance, foldMap should be equivalent to traversal with a constant applicative functor (foldMapDefault).
That this specific law doesn't always apply is not a problem, because we don't need it to characterise what Foldable is (alternatives include "a Foldable is a container from which we can extract some sequence of elements", and "a Foldable is a container that can be converted to the free monoid on its element type"). With the monad laws, though, it isn't like that: the meaning of the class is inextricably bound to all three of the monad laws.
I'm looking for the following function:
Applicative f => f (f a) -> f a
Hoogle shows me join:
>:t join
join :: Monad m => m (m a) -> m a
Is there a function that matches my desired signature?
To expand a bit on Carl's answer: if there were such a thing as join, but for applicatives:
class Applicative f => ApplicativeWithJoin f where
join' :: f (f a) -> f a
Then you would automatically have a monad:
instance ApplicativeWithJoin m => Monad m where
return = pure
x >>= f = join' (f <$> x)
There is no such function. join is explicitly what Applicative lacks and Monad has.
To expand on SingleNegationElimination's answer:
Applicative's <*> allows you to combine effects and the values inside them, and <$> lets you manipulate a value inside, but you can't make an effect depend on the value of a previous computation.
Monads, on the other hand, allow an effect to be determined by the result of the previous computation, as witnessed by >>=.
With any Applicative you can first use <$> to convert a value of type a inside f a into some f b, so you'll get f (f b). But without join, the inner f b is just another value; there is no way to combine it with the outer layer that is actually executed. Adding join makes this possible, giving you the full power of Monad.
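A small sketch of that limit (check is an illustrative name of mine): with fmap you can build the nested value, but Applicative gives you no way to collapse it.
-- An effect that depends on an earlier result:
check :: Int -> Maybe Int
check n = if n > 0 then Just n else Nothing

nested :: Maybe (Maybe Int)
nested = fmap check (Just 3)  -- Just (Just 3): the inner effect is not yet run

-- Applicative alone cannot collapse 'nested'; Monad's join can:
-- join nested == Just 3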
I am learning to program in the functional language of Haskell and I came across Monads when studying parsers. I had never heard of them before and so I did some extra studying to find out what they are.
Everywhere I look in order to learn this topic just confuses me. I can't really find a simple definition of what a Monad is and how to use them. "A monad is a way to structure computations in terms of values and sequences of computations using those values" - eh???
Can someone please provide a simple definition of what a Monad is in Haskell, the laws associated with them and give an example?
Note: I know how to use the do syntax as I have had a look at I/O actions and functions with side-effects.
Intuition
A rough intuition would be that a Monad is a particular kind of container (Functor) for which you have two operations available: a wrapping operation return that takes a single element into a container, and an operation join that merges a container of containers into a single container.
return :: Monad m => a -> m a
join :: Monad m => m (m a) -> m a
So for the Monad Maybe you have:
return :: a -> Maybe a
return x = Just x
join :: Maybe (Maybe a) -> Maybe a
join (Just (Just x)) = Just x
join (Just Nothing) = Nothing
join Nothing = Nothing
Likewise for the Monad [ ] these operations are defined to be:
return :: a -> [a]
return x = [x]
join :: [[a]] -> [a]
join xs = concat xs
The standard mathematical definition of Monad is based on these return and join operators. However in Haskell the definition of the class Monad substitutes a bind operator for join.
Monads in Haskell
In functional programming languages these special containers are typically used to denote effectful computations. The type Maybe a represents a computation that may or may not succeed, and the type [a] a computation that is non-deterministic. In particular, we're interested in functions with effects, i.e. those with types a -> m b for some Monad m. And we need to be able to compose them. This can be done using either a monadic composition operator or a bind operator.
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> a -> m c
(>>=) :: Monad m => m a -> (a -> m b) -> m b
In Haskell the latter is the standard one. Note that its type is very similar to the type of the application operator (but with flipped arguments):
(>>=) :: Monad m => m a -> (a -> m b) -> m b
flip ($) :: a -> (a -> b) -> b
It takes an effectful function f :: a -> m b and a computation mx :: m a returning values of type a, and performs the application mx >>= f. So how do we do this with Monads? Containers (Functors) can be mapped, and in this case the result is a computation within a computation which can then be flattened:
fmap f mx :: m (m b)
join (fmap f mx) :: m b
So we have:
(mx >>= f) = join (fmap f mx) :: m b
To see this working in practice, consider a simple example with lists (non-deterministic functions). Suppose you have a list of possible results mx = [1,2,3] and a non-deterministic function f x = [x-1, x*2]. To calculate mx >>= f you begin by mapping mx with f and then you merge the results:
fmap f mx = [[0,2],[1,4],[2,6]]
join [[0,2],[1,4],[2,6]] = [0,2,1,4,2,6]
Since in Haskell the bind operator (>>=) is more important than join, for efficiency reasons the latter is defined in terms of the former, and not the other way around.
join mx = mx >>= id
Also, since the bind operator can itself be used to define a mapping operation, Monads were historically not required to be instances of the class Functor (since GHC 7.10, Functor is a superclass of Monad). The operation equivalent to fmap is called liftM in the Monad library.
liftM f mx = mx >>= \x -> return (f x)
So the actual definition for the Maybe Monad becomes:
return :: a -> Maybe a
return x = Just x
(>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b
Nothing >>= f = Nothing
Just x >>= f = f x
And for the Monad [ ]:
return :: a -> [a]
return x = [x]
(>>=) :: [a] -> (a -> [b]) -> [b]
xs >>= f = concat (map f xs)
= concatMap f xs -- same as above but more efficient
When designing your own Monads you may find it easier, instead of trying to directly define (>>=), to split the problem into parts and figure out how to map and join your structures. Having map and join can also be useful to verify that your Monad is well defined, in the sense that it satisfies the required laws.
Monad Laws
Your Monad should be a Functor, so the mapping operation should satisfy:
fmap id = id
fmap g . fmap f = fmap (g . f)
The laws for return and join are:
join . return = id
join . fmap return = id
join . join = join . fmap join
The first two laws specify that merging undoes wrapping. If you wrap a container in another one, join gives you back the original. If you map the contents of a container with a wrapping operation, join again gives you back what you initially had. The last law is the associativity of join. If you have three layers of containers you get the same result by merging from the inside or the outside.
Again you can work with bind instead of join and fmap. You get fewer but (arguably) more complicated laws:
return a >>= f = f a
m >>= return = m
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
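Spot-checking those three laws in the Maybe monad (f and g are illustrative definitions of mine):
f, g :: Int -> Maybe Int
f x = Just (x * 2)
g x = if x < 100 then Just (x + 1) else Nothing

-- Left identity:  return 3 >>= f       == f 3 == Just 6
-- Right identity: Just 3 >>= return    == Just 3
-- Associativity:  (Just 3 >>= f) >>= g == Just 3 >>= (\x -> f x >>= g) == Just 7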
A monad in Haskell is something that has two operations defined:
(>>=) :: Monad m => m a -> (a -> m b) -> m b -- also called bind
return :: Monad m => a -> m a
These two operations need to satisfy certain laws that really might just confuse you at this point, if you don't have a knack for mathy ideas. Conceptually, you use bind to operate on values on a monadic level and return to create monadic values from "trivial" ones. For instance,
getLine :: IO String,
so you cannot directly modify and putStrLn this String -- because it's not a String but an IO String!
Well, we have an IO Monad handy, so not to worry. All we have to do is use bind to do what we want. Let's see what bind looks like in the IO Monad:
(>>=) :: IO a -> (a -> IO b) -> IO b
And if we place getLine at the left hand side of bind, we can make it more specific yet.
(>>=) :: IO String -> (String -> IO b) -> IO b
Okay, so getLine >>= putStrLn . (++ ". No problem after all!") would print the entered line with the extra content added. The right hand side is a function that takes a String and produces an IO () - that wasn't hard at all! We just go by the types.
There are Monads defined for a lot of different types, for instance Maybe and [a], and they behave conceptually in the same way.
Just 2 >>= return . (+2) would yield Just 4, as you might expect. Note that we had to use return here, because otherwise the function on the right hand side would not match the return type m b, but just b, which would be a type error. It worked in the case of putStrLn because it already produces an IO something, which was exactly what our type needed to match. (Spoiler: Expressions of shape foo >>= return . bar are silly, because every Monad is a Functor. Can you figure out what that means?)
I personally think that this is as far as intuition will get you on the topic of monads, and if you want to dive deeper, you really do need to dive into the theory. I liked getting the hang of just using them first. You can look up the source for various Monad instances, for instance the list ([]) Monad or the Maybe Monad, on Hoogle and get a bit smarter about the exact implementations. Once you feel comfortable with that, have a go at the actual monad laws and try to gain a more theoretical understanding of them!
The Typeclassopedia has a section about Monad (but do read the preceding sections about Functor and Applicative first).