Are Monad instances uniquely determined by their Applicative instances? [duplicate] - haskell

As described in this question and its answers, Functor instances are uniquely determined, if they exist.
For lists, there are two well-known Applicative instances: [] and ZipList. So Applicative isn't unique (see also Can GHC derive Functor and Applicative instances for a monad transformer? and Why is there no -XDeriveApplicative extension?). However, ZipList needs infinite lists, as its pure repeats a given element indefinitely.
Are there other, perhaps better examples of data structures that have at least two Applicative instances?
Are there any such examples that only involve finite data structures? That is, if Haskell's type system hypothetically distinguished inductive and coinductive data types, would it be possible to uniquely determine Applicative?
Going further, if we could extend both [] and ZipList to a Monad, we'd have an example where a monad isn't uniquely determined by the data type and its Functor. Alas, ZipList has a Monad instance only if we restrict ourselves to infinite lists (streams).
And return for [] creates a single-element list, so it requires finite lists. Therefore:
Are Monad instances uniquely determined by the data type? Or is there an example of a data type that can have two distinct Monad instances?
If there is an example with two or more distinct instances, an obvious question arises: must/can they have the same Applicative instance?
Are Monad instances uniquely determined by the Applicative instance, or is there an example of an Applicative that can have two distinct Monad instances?
Is there an example of a data type with two distinct Monad instances, each having a different Applicative super-instance?
And finally we can ask the same question for Alternative/MonadPlus. This is complicated by the fact that there are two distinct sets of MonadPlus laws. Assuming we accept one of the sets of laws (and for Applicative we accept right/left distributivity/absorption, see also this question),
is Alternative uniquely determined by Applicative, and MonadPlus by Monad, or are there any counter-examples?
If any of the above are unique, I'd be interested in knowing why, to have a hint of a proof. If not, a counter-example.

First, since Monoids are not unique, neither are Writer Monads or Applicatives. Consider
data M a = M Int a
then you can give it Applicative and Monad instances isomorphic to either of:
Writer (Sum Int)
Writer (Product Int)
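Here is a minimal sketch (my own, not from the original answer) of the additive version; the multiplicative one just swaps 0 for 1 and (+) for (*):
data M a = M Int a
instance Functor M where
  fmap f (M m x) = M m (f x)
-- isomorphic to Writer (Sum Int): the Ints are added together
instance Applicative M where
  pure = M 0
  M m f <*> M n x = M (m + n) (f x)
instance Monad M where
  M m x >>= f = let M n y = f x in M (m + n) y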
Given a Monoid instance for a type s, another isomorphic pair with different Applicative/Monad instances is:
ReaderT s (Writer s)
State s
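To see concretely that the same underlying functor s -> (a, s) carries both structures, here is a hedged sketch (the newtype names RW and St are mine) of the two Applicative instances:
newtype RW s a = RW (s -> (a, s))   -- plays the role of ReaderT s (Writer s)
newtype St s a = St (s -> (a, s))   -- plays the role of State s
instance Functor (RW s) where
  fmap f (RW g) = RW (\s -> let (a, w) = g s in (f a, w))
instance Monoid s => Applicative (RW s) where
  pure a = RW (\_ -> (a, mempty))
  RW ff <*> RW fx = RW (\s ->
    let (f, w1) = ff s          -- both actions read the same environment s
        (x, w2) = fx s
    in  (f x, w1 <> w2))        -- and their outputs are combined with the monoid
instance Functor (St s) where
  fmap f (St g) = St (\s -> let (a, s') = g s in (f a, s'))
instance Applicative (St s) where
  pure a = St (\s -> (a, s))
  St ff <*> St fx = St (\s ->
    let (f, s')  = ff s         -- here the state is threaded through instead
        (x, s'') = fx s'
    in  (f x, s''))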
As for having one Applicative instance extend to two different Monads, I cannot remember any example. However, back when I tried to convince myself completely about whether ZipList really cannot be made a Monad, I found the following pretty strong restriction that holds for any Monad:
join (fmap (\x -> fmap (\y -> f x y) ys) xs) = f <$> xs <*> ys
That doesn't give join for all values though: in the case of lists the restricted values are the ones where all elements have the same length, i.e. lists of lists with "rectangular" shape.
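For instance (a worked example of my own), with the usual list Applicative:
xs :: [Int]
xs = [1, 2]
ys :: String
ys = "ab"
rect :: [[(Int, Char)]]
rect = fmap (\x -> fmap (\y -> (x, y)) ys) xs
-- rect == [[(1,'a'),(1,'b')],[(2,'a'),(2,'b')]]  -- every inner list has length 2
-- so any Monad agreeing with this Applicative must have
-- join rect == (,) <$> xs <*> ys == [(1,'a'),(1,'b'),(2,'a'),(2,'b')]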
(For Reader monads, where the "shape" of monadic values doesn't vary, these are in fact all the m (m x) values, so those do have unique extension. EDIT: Come to think of it, Either, Maybe and Writer also have only "rectangular" m (m x) values, so their extension from Applicative to Monad is also unique.)
I wouldn't be surprised if an Applicative with two Monads exists, though.
For Alternative/MonadPlus, I cannot recall any law that, for instances using the Left Distribution law instead of Left Catch, would prevent you from simply swapping (<|>) with flip (<|>). I don't know if there's a less trivial variation.
ADDENDUM: I suddenly remembered I had found an example of an Applicative with two Monads. Namely, finite lists. There's the usual Monad [] instance, but you can then replace its join by the following function (essentially making empty lists "infectious"):
ljoin :: [[a]] -> [a]
ljoin xs
  | any null xs = []
  | otherwise   = concat xs
(Alas, the lists need to be finite because otherwise the null check will never finish, and that would ruin the join . fmap return == id monad law.)
This has the same value as join/concat on rectangular lists of lists, so will give the same Applicative. As I recall, it turns out that the first two monad laws are automatic from that, and you just need to check ljoin . ljoin == ljoin . fmap ljoin.
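A quick sanity check of that claim (my own ghci session, not part of the original answer):
ghci> ljoin [[1,2],[3,4]]       -- rectangular: agrees with concat
[1,2,3,4]
ghci> ljoin [[1,2],[],[3,4]]    -- an empty inner list is "infectious"
[]
ghci> concat [[1,2],[],[3,4]]
[1,2,3,4]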

Given that every Applicative has a Backwards counterpart,
newtype Backwards f x = Backwards {backwards :: f x}
instance Applicative f => Applicative (Backwards f) where
pure x = Backwards (pure x)
Backwards ff <*> Backwards fs = Backwards (flip ($) <$> fs <*> ff)
it's unusual for Applicative to be uniquely determined, just as (and this is very far from unrelated) many sets extend to monoids in multiple ways.
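For instance (my own illustration), over lists the two effect orders are observably different, so Backwards [] really is a second Applicative on the same functor:
ghci> [(+1),(+2)] <*> [10,20]
[11,21,12,22]
ghci> backwards (Backwards [(+1),(+2)] <*> Backwards [10,20])
[11,12,21,22]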
In this answer, I set the exercise of finding at least four distinct valid Applicative instances for nonempty lists: I won't spoil it here, but I will give a big hint on how to hunt.
Meanwhile, in some wonderful recent work (which I saw at a summer school a few months ago), Tarmo Uustalu showed a rather neat way to get a handle on this problem, at least when the underlying functor is a container, in the sense of Abbott, Altenkirch and Ghani.
Warning: Dependent types ahead!
What is a container? If you have dependent types to hand, you can present container-like functors F uniformly, as being determined by two components
a set of shapes, S : Set
an S-indexed set of positions, P : S -> Set
Up to isomorphism, container data structures in F X are given by the dependent pair of some shape s : S, and some function e : P s -> X, which tells you the element located at each position. That is, we define the extension of a container
(S <| P) X = (s : S) * (P s -> X)
(which, by the way, looks a lot like a generalized power series if you read -> as reversed exponentiation). The triangle is supposed to remind you of a tree node sideways, with an element s : S labelling the apex, and the baseline representing the position set P s. We say that some functor is a container if it is isomorphic to some S <| P.
In Haskell, you can easily take S = F (), but constructing P can take quite a bit of type-hackery. But that is something you can try at home. You'll find that containers are closed under all the usual polynomial type-forming operations, as well as identity,
Id ~= () <| \ _ -> ()
composition, where a whole shape is made from just one outer shape and an inner shape for each outer position,
(S0 <| P0) . (S1 <| P1) ~= ((S0 <| P0) S1) <| \ (s0, e0) -> (p0 : P0 s0) * P1 (e0 p0)
and some other things, notably the tensor, where there is one outer and one inner shape (so "outer" and "inner" are interchangeable)
(S0 <| P0) (X) (S1 <| P1) = ((S0, S1) <| \ (s0, s1) -> (P0 s0, P1 s1))
so that F (X) G means "F-structures of G-structures-all-the-same-shape", e.g., [] (X) [] means rectangular lists-of-lists. But I digress.
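As a hedged take on that "try at home" exercise (all names mine): lists arise as the container Nat <| Fin, which can be transcribed into GHC Haskell with a little of the promised type-hackery:
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
data Nat = Z | S Nat
data SNat (n :: Nat) where          -- singleton shapes: lengths
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)
data Fin (n :: Nat) where           -- positions: indices below the length
  FZ :: Fin ('S n)
  FS :: Fin n -> Fin ('S n)
data List x where                   -- the extension (Nat <| Fin) x
  List :: SNat n -> (Fin n -> x) -> List x
toList :: List x -> [x]
toList (List SZ     _) = []
toList (List (SS n) e) = e FZ : toList (List n (e . FS))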
Polymorphic functions between containers
Every polymorphic function
m : forall X. (S0 <| P0) X -> (S1 <| P1) X
can be implemented by a container morphism, constructed from two components in a very particular way.
a function f : S0 -> S1 mapping input shapes to output shapes;
a function g : (s0 : S0) -> P1 (f s0) -> P0 s0 mapping output positions to input positions.
Our polymorphic function is then
\ (s0, e0) -> (f s0, e0 . g s0)
where the output shape is computed from the input shape, then the output positions are filled up by picking elements from input positions.
(If you're Peter Hancock, you have a whole other metaphor for what's going on. Shapes are Commands; Positions are Responses; a container morphism is a device driver, translating commands one way, then responses the other.)
Every container morphism gives you a polymorphic function, but the reverse is also true. Given such an m, we may take
(f s, g s) = m (s, id)
That is, we have a representation theorem, saying that every polymorphic function between two containers is given by such an f, g-pair.
What about Applicative? We kind of got a bit lost along the way, building all this machinery. But it has been worth it. When the underlying functors for monads and applicatives are containers, the polymorphic functions pure and <*>, return and join must be representable by the relevant notion of container morphism.
Let's take applicatives first, using their monoidal presentation. We need
unit : () -> (S <| P) ()
mult : forall X, Y. ((S <| P) X, (S <| P) Y) -> (S <| P) (X, Y)
The left-to-right maps for shapes require us to deliver
unitS : () -> S
multS : (S, S) -> S
so it looks like we might need a monoid. And when you check the applicative laws, you find we need exactly a monoid. Equipping a container with applicative structure is exactly refining the monoid structures on its shapes with suitable position-respecting operations. There's nothing to do for unit (because there is no choice of source position), but for mult, we need that whenever
multS (s0, s1) = s
we have
multP (s0, s1) : P s -> (P s0, P s1)
satisfying appropriate identity and associativity conditions. If we switch to Hancock's interpretation, we're defining a monoid (skip, semicolon) for commands, where there is no way to look at the response to the first command before choosing the second, like commands are a deck of punch cards. We have to be able to chop up responses to combined commands into the individual responses to the individual commands.
So, every monoid on the shapes gives us a potential applicative structure. For lists, shapes are numbers (lengths), and there are a great many monoids from which to choose. Even if shapes live in Bool, we have quite a bit of choice.
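To make that concrete, here is a hedged worked example (names mine) of the shape and position data behind the standard list Applicative: the monoid on shapes is multiplication of lengths (pure has length 1), and an output position is split by div/mod:
unitS :: Int
unitS = 1
multS :: (Int, Int) -> Int
multS (m, n) = m * n
-- an output position p < m * n corresponds to the pair
-- (p `div` n, p `mod` n): which left element, which right element
multP :: (Int, Int) -> Int -> (Int, Int)
multP (_, n) p = p `divMod` n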
What about Monad? Meanwhile, for monads M with M ~= S <| P, we need
return : Id -> M
join : M . M -> M
Looking at shapes first, that means we need a sort-of lopsided monoid.
return_f : () -> S
join_f : (S <| P) S -> S -- (s : S, P s -> S) -> S
It's lopsided because we get a bunch of shapes on the right, not just one. If we switch to Hancock's interpretation, we're defining a kind of sequential composition for commands, where we do let the second command be chosen on the basis of the first response, like we're interacting at a teletype. More geometrically, we're explaining how to glom two layers of a tree into one. It would be very surprising if such compositions were unique.
Again, for the positions, we have to map single output positions to pairs in a coherent way. This is trickier for monads: we first choose an outer position (response), then we have to choose an inner position (response) appropriate to the shape (command) found at the first position (chosen after the first response).
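Concretely again, a hedged worked example (names mine) for the usual list Monad: the lopsided monoid on shapes sums the inner lengths, and an output position is located by walking past whole inner lists:
returnS :: Int
returnS = 1
joinS :: (Int, Int -> Int) -> Int
joinS (s, e) = sum [ e i | i <- [0 .. s - 1] ]
-- an output position p picks out the i-th inner list and a position
-- within it, found by subtracting off the earlier inner lengths
joinP :: (Int, Int -> Int) -> Int -> (Int, Int)
joinP (_, e) = go 0
  where
    go i p
      | p < e i   = (i, p)
      | otherwise = go (i + 1) (p - e i)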
I'd love to link to Tarmo's work for the details, but it doesn't seem to have hit the streets yet. He has actually used this analysis to enumerate all possible monad structures for several choices of underlying container. I'm looking forward to the paper!
Edit. By way of doing honour to the other answer, I should observe that when everywhere P s = (), then (S <| P) X ~= (S, X) and the monad/applicative structures coincide exactly with each other and with the monoid structures on S. That is, for writer monads, we need only choose the shape-level operations, because there is exactly one position for a value in every case.

Related

To what extent are Applicative/Monad instances uniquely determined?


Monads: Determining if an arbitrary transformation is possible

There are quite a few questions here about whether or not certain transformations of types that involve Monads are possible.
For instance, it's possible to make a function of type f :: Monad m => [m a] -> m [a], but impossible to make a function of type g :: Monad m => m [a] -> [m a] as a proper antifunction to the former (i.e., f . g = id).
I want to understand what rules one can use to determine if a function of that type can or cannot be constructed, and why these types cannot be constructed if they disobey these rules.
The way that I've always thought about monads is that a value of type Monad m => m a is some program of type m that executes and produces an a. The monad laws reinforce this notion by thinking of composition of these programs as "do thing one then do thing two", and produce some sort of combination of the results.
Right unit: Taking a program and just returning its value should be the same as just running the original program.
m >>= return = m
Left unit: If you create a simple program that just returns a value, and then pass that value to a function that creates a new program, then the resulting program should just be as if you called the function on the value.
return x >>= f = f x
Associativity: If you execute a program m, feed its result into a function f that produces another program, and then feed that result into a third function g that also produces a program, then this is identical to creating a new function that returns a program based on feeding the result of f into g, and feeding the result of m into it.
(m >>= f) >>= g = m >>= (\x -> f x >>= g)
Using this intuition about a "program that creates a value", we can come to some conclusions about what the functions in your examples mean.
Monad m => [m a] -> m [a] It is hard to deviate from the intuitive definition of what this function should do: execute each program in sequence and collect the results. This produces another program that produces a list of results.
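A minimal sketch of that intuitive definition (essentially the Prelude's sequence, spelled out with bind):
collect :: Monad m => [m a] -> m [a]
collect []       = return []
collect (m : ms) = m >>= \x -> collect ms >>= \xs -> return (x : xs)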
Monad m => m [a] -> [m a] This doesn't really have a clear intuitive definition, since it's a program that produces a list. You can't create a list without getting access to the resulting values, which in this case means executing a program. Certain monads that have a clear way to extract a value from a program, and provide some variant of m a -> a (like the State monad), can have sane implementations of some function like this. It would have to be application-specific, though. Other monads, like IO, you cannot escape from.
Monad m => (a -> m b) -> m (a -> b) This also doesn't really have a clear intuitive definition. Here you have a function f that produces a program of type m b, but you're trying to return a function of type m (a -> b). Based on the a, f creates completely different programs with different executing semantics. You cannot encompass these variations in a single program of type m (a -> b), even if you can provide a proper mapping of a -> b in the program's resulting value.
This intuition doesn't really encompass the idea behind monads completely. For example, the monadic context of a list doesn't really behave like a program.
Something easy to remember is: "you can't escape from a Monad" (it's kind of designed for that). Transforming m [a] to [m a] is a form of escape, so you can't.
On the other hand you can easily create a Monad from something (using return) so traversing ([m a] -> m [a]) is usually possible.
If you take a look at the "Monad laws", a monad only constrains you to define a composition function but not a reverse function.
In the first example you can compose the list elements.
In the second one, Monad m => m [a] -> [m a], you cannot split an action into multiple actions (action composition is not reversible).
Example:
Let's say you have to read 2 values.
s1 <- action
s2 <- action
Doing so, the result s2 of the second action depends on the side effect made by s1.
You can bind these 2 actions into 1 action to be executed in the same order, but you cannot split them and execute the action producing s2 without s1 having made the side effect needed by the second one.
Not really an answer, and much too informal for my liking, but nevertheless I have a few interesting observations that won't fit into a comment. First, let's consider this function you refer to:
f :: Monad m => [m a] -> m [a]
This signature is in fact stronger than it needs to be. The current generalization of this is the sequenceA function from Data.Traversable:
sequenceA :: (Traversable t, Applicative f) => t (f a) -> f (t a)
...which doesn't need the full power of Monad, and can work with any Traversable and not just lists.
Second: the fact that Traversable only requires Applicative is I think really significant to this question, because applicative computations have a "list-like" structure. Every applicative computation can be rewritten to have the form f <$> a1 <*> ... <*> an for some f. Or, informally, every applicative computation can be seen as a list of actions a1, ... an (heterogeneous on the result type, homogeneous in the functor type), plus an n-place function to combine their results.
If we look at sequenceA through this lens, all it does is choose an f built out of the appropriate nested number of list constructors:
sequenceA [a1, ..., an] == f <$> a1 <*> ... <*> an
where f v1 ... vn = v1 : ... : vn : []
Now, I haven't had the chance to try and prove this yet, but my conjectures would be the following:
Mathematically speaking at least, sequenceA has a left inverse in free applicative functors. If you have a Functor f => [FreeA f a] and you sequenceA it, what you get is a list-like structure that contains those computations and a combining function that makes a list out of their results. I suspect however that it's not possible to write such a function in Haskell (unSequenceFreeA :: (Traversable t, Functor f) => FreeA f (t a) -> Maybe (t (FreeA f a))), because you can't pattern match on the structure of the combining function in the FreeA to tell that it's of the form f v1 ... vn = v1 : ... : vn : [].
sequenceA doesn't have a right inverse in a free applicative, however, because the combining function that produces a list out of the results from the a1, ... an actions may do anything; for example, return a constant list of arbitrary length (unrelated to the computations that the free applicative value performs).
Once you move to non-free applicative functors, there will no longer be a left inverse for sequenceA, because the non-free applicative functor's equations translate into cases where you can no longer tell apart which of two t (f a) "action lists" was the source for a given f (t a) "list-producing action."

Combining the state monad with the costate comonad

How to combine the state monad S -> (A, S) with the costate comonad (E->A, E)?
I tried with both obvious combinations S -> ((E->A, E), S) and (E->S->(A, S), E) but then in either case I do not know how to define the operations (return, extract, ... and so on) for the combination.
Combining two monads O and I yields a monad if either O or I is copointed, i.e. has an extract method. Every comonad is copointed. If both O and I are copointed, then you have two different "natural" ways to obtain a monad, which are presumably not equivalent.
You have:
unit_O :: a -> O a
join_O :: O (O a) -> O a
unit_I :: a -> I a
join_I :: I (I a) -> I a
Here I've added _O and _I suffixes for clarity; in actual Haskell code, they would not be there, since the type checker figures this out on its own.
Your goal is to show that the composition O (I a) is a monad. Let's assume that O is copointed, i.e. that there is a function extract_O :: O a -> a.
Then we have:
unit :: a -> O (I a)
unit = unit_O . unit_I
join :: O (I (O (I a))) -> O (I a)
The problem, of course, is in implementing join. We follow this strategy:
fmap over the outer O
use extract_O to get rid of the inner O
use join_I to combine the two I monads
This leads us to
join = fmap_O $ join_I . fmap_I extract_O
To make this work, you'll also need to define
newtype MCompose o i a = MCompose (o (i a))
and add the respective type constructors and deconstructors into the definitions above.
The other alternative uses extract_I instead of extract_O. This version is even simpler:
join = join_O . fmap_O extract_I
This defines a new monad. I assume you can define a new comonad in the same way, but I haven't attempted this.
As the other answer demonstrates, both of the combinations S -> ((E->A, E), S) and (E->S->(A, S), E)
have Monad and Comonad instances simultaneously. In fact, giving a Monad/Comonad instance is
equivalent to giving a
monoid structure to resp. its points ∀r.r->f(r) or its copoints ∀r.f(r)->r, at least in a classical,
non-constructive sense (I don't know the constructive answer). This fact suggests that actually a
Functor f has a very good chance that it can
be both Monad and Comonad, provided its points and copoints are non-trivial.
The real question, however, is whether the Monad/Comonad instances constructed as such do have natural
computational/categorical meanings. In this particular case I would say "no", because you don't seem to have
a priori knowledge about how to compose them in a way that suit your computational needs.
The standard categorical way to compose two (co)monads is via adjunctions. Let me summarize your situation:
Fₑ Fₛ
--> -->
Hask ⊣ Hask ⊣ Hask
<-- <--
Gₑ Gₛ
Fₜ(a) = (a,t)
Gₜ(a) = (t->a)
Proof of Fₜ ⊣ Gₜ:
Fₜ(x) -> y ≃
(x,t) -> y ≃
x -> (t->y) ≃
x -> Gₜ(y)
Now you can see that the state monad (s->(a,s)) ≃ (s->a,s->s) is the composition GₛFₛ and the costate comonad
is FₑGₑ. This adjunction says that Hask can be interpreted as a model of the (co)state (co)algebras.
Now, 'adjunctions compose.' For example,
FₛFₑ(x) -> y ≃
Fₑ(x) -> Gₛ(y) ≃
x -> GₑGₛ(y)
So FₛFₑ ⊣ GₑGₛ. This gives a pair of a monad and a comonad, namely
T(a) = GₑGₛFₛFₑ(a)
= GₑGₛFₛ(a,e)
= GₑGₛ(a,e,s)
= Gₑ(s->(a,e,s))
= e->s->(a,e,s)
= ((e,s)->a, (e,s)->(e,s))
G(a) = FₛFₑGₑGₛ(a)
= FₛFₑGₑ(s->a)
= FₛFₑ(e->s->a)
= Fₛ(e->s->a,e)
= (e->s->a,e,s)
= ((e,s)->a, (e,s))
T is simply the state monad with the state (e,s), G is the costate comonad with the costate (e,s), so
these do have very natural meanings.
Composing adjunctions is a natural, frequent mathematical operation. For example, a geometric morphism
between topoi (a kind of Cartesian Closed Category which admits complex (non-free) constructions at the 'type level') is defined as a pair of adjoint functors, only requiring its left adjoint to be left exact (i.e. to preserve finite limits). If those topoi are sheaves on topological spaces,
composing the adjunctions simply corresponds to composing (unique) continuous base change maps (in the opposite direction), having a very natural meaning.
On the other hand, composing monads/comonads directly seems to be a very rare practice in Mathematics.
This is because often a (co)monad is thought of as a carrier of an (co)algebraic theory, rather than as a
model. In this interpretation the corresponding adjunctions are the models, not the monad. The problem
is that composing two theories requires another theory, a theory about how to compose them. For example,
imagine composing two theories of monoids. Then you may get at least two new theories,
namely the theory of lists of lists, or ring-like algebras where two kinds of binary operations distribute.
Neither is a priori better/more natural than the other. This is the meaning of "monads don't compose"; it doesn't say the composition cannot be a monad, but it does say you will need another theory of how to compose them.
In contrast, composing adjunctions naturally results in another adjunction simply because by doing so you are
implicitly specifying the rules of composing two given theories. So by taking the monad of the composed adjunction you get the theory that also specifies the rules of composition.

Monads with Join() instead of Bind()

Monads are usually explained in terms of return and bind. However, I gather you can also implement bind in terms of join (and fmap?)
In programming languages lacking first-class functions, bind is excruciatingly awkward to use. join, on the other hand, looks quite easy.
I'm not completely sure I understand how join works, however. Obviously, it has the [Haskell] type
join :: Monad m => m (m x) -> m x
For the list monad, this is trivially and obviously concat. But for a general monad, what, operationally, does this method actually do? I see what it does to the type signatures, but I'm trying to figure out how I'd write something like this in, say, Java or similar.
(Actually, that's easy: I wouldn't. Because generics is broken. ;-) But in principle the question still stands...)
Oops. It looks like this has been asked before:
Monad join function
Could somebody sketch out some implementations of common monads using return, fmap and join? (I.e., not mentioning >>= at all.) I think perhaps that might help it to sink in to my dumb brain...
Without plumbing the depths of metaphor, might I suggest to read a typical monad m as "strategy to produce a", so the type m value is a first class "strategy to produce a value". Different notions of computation or external interaction require different types of strategy, but the general notion requires some regular structure to make sense:
if you already have a value, then you have a strategy to produce a value (return :: v -> m v) consisting of nothing other than producing the value that you have;
if you have a function which transforms one sort of value into another, you can lift it to strategies (fmap :: (v -> u) -> m v -> m u) just by waiting for the strategy to deliver its value, then transforming it;
if you have a strategy to produce a strategy to produce a value, then you can construct a strategy to produce a value (join :: m (m v) -> m v) which follows the outer strategy until it produces the inner strategy, then follows that inner strategy all the way to a value.
Let's have an example: leaf-labelled binary trees...
data Tree v = Leaf v | Node (Tree v) (Tree v)
...represent strategies to produce stuff by tossing a coin. If the strategy is Leaf v, there's your v; if the strategy is Node h t, you toss a coin and continue by strategy h if the coin shows "heads", t if it's "tails".
instance Monad Tree where
return = Leaf
A strategy-producing strategy is a tree with tree-labelled leaves: in place of each such leaf, we can just graft in the tree which labels it...
join (Leaf tree) = tree
join (Node h t) = Node (join h) (join t)
...and of course we have fmap which just relabels leaves.
instance Functor Tree where
fmap f (Leaf x) = Leaf (f x)
fmap f (Node h t) = Node (fmap f h) (fmap f t)
Here's a strategy to produce a strategy to produce an Int.
Toss a coin: if it's "heads", toss another coin to decide between two strategies (producing, respectively, "toss a coin for producing 0 or producing 1" or "produce 2"); if it's "tails" produce a third ("toss a coin for producing 3 or tossing a coin for 4 or 5").
That clearly joins up to make a strategy producing an Int.
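Here is a hedged transcription of that example (names mine) so the grafting can be seen literally:
example :: Tree (Tree Int)
example =
  Node                                  -- toss a coin
    (Node                               -- heads: toss again...
       (Leaf (Node (Leaf 0) (Leaf 1)))  -- ...for "toss a coin for 0 or 1"
       (Leaf (Leaf 2)))                 -- ...or "produce 2"
    (Leaf (Node (Leaf 3)                -- tails: "toss for 3, or toss again for 4 or 5"
                (Node (Leaf 4) (Leaf 5))))
-- join example
--   == Node (Node (Node (Leaf 0) (Leaf 1)) (Leaf 2))
--           (Node (Leaf 3) (Node (Leaf 4) (Leaf 5)))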
What we're making use of is the fact that a "strategy to produce a value" can itself be seen as a value. In Haskell, the embedding of strategies as values is silent, but in English, I use quotation marks to distinguish using a strategy from just talking about it. The join operator expresses the strategy "somehow produce then follow a strategy", or "if you are told a strategy, you may then use it".
(Meta. I'm not sure whether this "strategy" approach is a suitably generic way to think about monads and the value/computation distinction, or whether it's just another crummy metaphor. I do find leaf-labelled tree-like types a useful source of intuition, which is perhaps not a surprise as they're the free monads, with just enough structure to be monads at all, but no more.)
PS The type of "bind"
(>>=) :: m v -> (v -> m w) -> m w
says "if you have a strategy to produce a v, and for each v a follow-on strategy to produce a w, then you have a strategy to produce a w". How can we capture that in terms of join?
mv >>= v2mw = join (fmap v2mw mv)
We can relabel our v-producing strategy by v2mw, producing instead of each v value the w-producing strategy which follows on from it — ready to join!
join = concat -- []
join f = \x -> f x x -- (e ->)
join f = \s -> let (f', s') = f s in f' s' -- State
join (Just (Just a)) = Just a; join _ = Nothing -- Maybe
join (Identity (Identity a)) = Identity a -- Identity
join (Right (Right a)) = Right a; join (Right (Left e)) = Left e;
join (Left e) = Left e -- Either
join ((a, m), m') = (a, m' `mappend` m) -- Writer
-- N.B. there is a non-newtype-wrapped Monad instance for tuples that
-- behaves like the Writer instance, but with the tuple order swapped
join f = \k -> f (\f' -> f' k) -- Cont
Calling fmap (f :: a -> m b) (x :: m a) produces values (y :: m (m b)), so it is a very natural thing to use join to get back values (z :: m b).
Then bind is defined simply as bind ma f = join (fmap f ma), thus achieving the Kleisli compositionality of functions of the (:: a -> m b) variety, which is what it is really all about:
ma `bind` (f >=> g) = (ma `bind` f) `bind` g -- bind = (>>=)
= (`bind` g) . (`bind` f) $ ma
= join . fmap g . join . fmap f $ ma
And so, with flip bind = (=<<), we have
((g <=< f) =<<) = (g =<<) . (f =<<) = join . (g <$>) . join . (f <$>)
OK, so it's not really good form to answer your own question, but I'm going to note down my thinking in case it enlightens anybody else. (I doubt it...)
If a monad can be thought of as a "container", then both return and join have pretty obvious semantics. return generates a 1-element container, and join turns a container of containers into a single container. Nothing hard about that.
So let us focus on monads which are more naturally thought of as "actions". In that case, m x is some sort of action which yields a value of type x when you "execute" it. return x does nothing special, and then yields x. fmap f takes an action that yields an x, and constructs an action that computes x and then applies f to it, and returns the result. So far, so good.
It's fairly obvious that if f itself generates an action, then what you end up with is m (m x). That is, an action that computes another action. In a way, that's maybe even simpler to wrap your mind around than the >>= function which takes an action and a "function that produces an action" and so on.
So, logically speaking, it seems join would run the first action, take the action it produces, and then run that. (Or rather, join would return an action that does what I just described, if you want to split hairs.)
That seems to be the central idea. To implement join, you want to run an action, which then gives you another action, and then you run that. (Whatever "run" happens to mean for this particular monad.)
Given this insight, I can take a stab at writing some join implementations:
join Nothing = Nothing
join (Just mx) = mx
If the outer action is Nothing, return Nothing, else return the inner action. Then again, Maybe is more of a container than an action, so let's try something else...
newtype Reader s x = Reader (s -> x)
join (Reader f) = Reader (\ s -> let Reader g = f s in g s)
That was... painless. A Reader is really just a function that takes a global state and only then returns its result. So to unstack, you apply the global state to the outer action, which returns a new Reader. You then apply the state to this inner function as well.
In a way, it's perhaps easier than the usual way:
Reader f >>= g = Reader (\ s -> let Reader h = g (f s) in h s)
Now, which one is the reader function, and which one is the function that computes the next reader...?
Now let's try the good old State monad. Here every function takes an initial state as input but also returns a new state along with its output.
data State s x = State (s -> (s, x))
join (State f) = State (\ s0 -> let (s1, State g) = f s0 in g s1)
That wasn't too hard. It's basically run followed by run.
I'm going to stop typing now. Feel free to point out all the glitches and typos in my examples... :-/
I've found many explanations of monads that say "you don't have to know anything about category theory, really, just think of monads as burritos / space suits / whatever".
Really, the article that demystified monads for me just said what categories were, described monads (including join and bind) in terms of categories, and didn't bother with any bogus metaphors:
http://en.wikibooks.org/wiki/Haskell/Category_theory
I think the article is very readable without much math knowledge required.
Asking what a type signature in Haskell does is rather like asking what an interface in Java does.
It, in some literal sense, "doesn't". (Though, of course, you will typically have some sort of purpose associated with it, that's mostly in your mind, and mostly not in the implementation.)
In both cases you are declaring legal sequences of symbols in the language which will be used in later definitions.
Of course, in Java, I suppose you could say that an interface corresponds to a type signature which is going to be implemented literally in the VM. You can get some polymorphism this way -- you can define a name that accepts an interface, and you can provide a different definition for the name which accepts a different interface. Something similar happens in Haskell, where you can provide a declaration for a name which accepts one type and then another declaration for that name which treats a different type.
This is Monad explained in one picture. The two functions in the green category are not composable; when mapped to the blue category with join . fmap (strictly speaking, they are one category), they become composable. Monad is about turning a function of type T -> Monad<U> into a function of type Monad<T> -> Monad<U>.

Monads as adjunctions

I've been reading about monads in category theory. One definition of monads uses a pair of adjoint functors. A monad is defined by a round-trip using those functors. Apparently adjunctions are very important in category theory, but I haven't seen any explanation of Haskell monads in terms of adjoint functors. Has anyone given it a thought?
Edit: Just for fun, I'm going to do this right. Original answer preserved below
The current adjunction code for category-extras now is in the adjunctions package: http://hackage.haskell.org/package/adjunctions
I'm just going to work through the state monad explicitly and simply. This code uses Data.Functor.Compose from the transformers package, but is otherwise self-contained.
An adjunction between f (D -> C) and g (C -> D), written f -| g, can be characterized in a number of ways. We'll use the counit/unit (epsilon/eta) description, which gives two natural transformations (morphisms between functors).
class (Functor f, Functor g) => Adjoint f g where
counit :: f (g a) -> a
unit :: a -> g (f a)
Note that the "a" in counit is really the identity functor in C, and the "a" in unit is really the identity functor in D.
We can also recover the hom-set adjunction definition from the counit/unit definition.
phiLeft :: Adjoint f g => (f a -> b) -> (a -> g b)
phiLeft f = fmap f . unit
phiRight :: Adjoint f g => (a -> g b) -> (f a -> b)
phiRight f = counit . fmap f
In any case, we can now define a Monad from our unit/counit adjunction like so:
instance Adjoint f g => Monad (Compose g f) where
return x = Compose $ unit x
x >>= f = Compose . fmap counit . getCompose $ fmap (getCompose . f) x
Now we can implement the classic adjunction between (a,) and (a ->):
instance Adjoint ((,) a) ((->) a) where
-- counit :: (a,a -> b) -> b
counit (x, f) = f x
-- unit :: b -> (a -> (a,b))
unit x = \y -> (y, x)
And now a type synonym
type State s = Compose ((->) s) ((,) s)
And if we load this up in ghci, we can confirm that State is precisely our classic state monad. Note that we can take the opposite composition and get the Costate Comonad (aka the store comonad).
There are a bunch of other adjunctions we can make into monads in this fashion (such as (Bool,) Pair), but they're sort of strange monads. Unfortunately we can't do the adjunctions that induce Reader and Writer directly in Haskell in a pleasant way. We can do Cont, but as copumpkin describes, that requires an adjunction from an opposite category, so it actually uses a different "form" of the "Adjoint" typeclass that reverses some arrows. That form is also implemented in a different module in the adjunctions package.
This material is covered in a different way by Derek Elkins' article in The Monad Reader 13 -- Calculating Monads with Category Theory: http://www.haskell.org/wikiupload/8/85/TMR-Issue13.pdf
Also, Hinze's recent Kan Extensions for Program Optimization paper walks through the construction of the list monad from the adjunction between Mon and Set: http://www.cs.ox.ac.uk/ralf.hinze/Kan.pdf
Old answer:
Two references.
1) Category-extras delivers, as always, with a representation of adjunctions and how monads arise from them. As usual, it's good to think with, but pretty light on documentation: http://hackage.haskell.org/packages/archive/category-extras/0.53.5/doc/html/Control-Functor-Adjunction.html
2) Haskell-Cafe also delivers with a promising but brief discussion on the role of adjunction. Some of which may help in interpreting category-extras: http://www.haskell.org/pipermail/haskell-cafe/2007-December/036328.html
Derek Elkins was showing me recently over dinner how the Cont Monad arises from composing the (_ -> k) contravariant functor with itself, since it happens to be self-adjoint. That's how you get (a -> k) -> k out of it. Its counit, however, leads to double negation elimination, which can't be written in Haskell.
For some Agda code that illustrates and proves this, please see http://hpaste.org/68257.
This is an old thread, but I found the question interesting,
so I did some calculations myself. Hopefully Bartosz is still there
and might read this.
In fact, the Eilenberg-Moore construction does give a very clear picture in this case.
(I will use CWM notation with Haskell-like syntax)
Let T be the list monad < T,eta,mu > (eta = return and mu = concat)
and consider a T-algebra h:T a -> a.
(Note that T a = [a] is a free monoid <[a],[],(++)>, that is, identity [] and multiplication (++).)
By definition, h must satisfy h . T h == h . mu a and h . eta a == id.
Now, some easy diagram chasing proves that h actually induces a monoid structure on a (defined by x * y = h [x, y]),
and that h becomes a monoid homomorphism for this structure.
Conversely, any monoid structure < a, a0, * > defined in Haskell naturally defines a T-algebra
in this way: h = foldr ( * ) a0, a function that 'replaces' (:) with ( * ) and maps [] to a0, the identity.
So, in this case, the category of T-algebras is just the category of monoid structures definable in Haskell, HaskMon.
(Please check that the morphisms in T-algebras are actually monoid homomorphisms.)
It also characterizes lists as universal objects in HaskMon, just like free products in Grp, polynomial rings in CRng, etc.
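A hedged Haskell transcription of that correspondence (names mine):
type Algebra a = [a] -> a          -- a T-algebra structure map for the list monad
fromMonoid :: Monoid a => Algebra a
fromMonoid = foldr (<>) mempty     -- 'replace' (:) with (<>) and [] with mempty
-- conversely, a lawful algebra h induces a monoid:
--   x <> y = h [x, y]       mempty = h []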
The adjunction corresponding to the above construction is < F, G, eta, epsilon >
where
F : Hask -> HaskMon, which takes a type a to the 'free monoid generated by a', that is, [a],
G : HaskMon -> Hask, the forgetful functor (forget the multiplication),
eta : 1 -> GF, the natural transformation defined by \x :: a -> [x],
epsilon : FG -> 1, the natural transformation defined by the folding function above
(the 'canonical surjection' from a free monoid to its quotient monoid)
Next, there is another 'Kleisli category' and the corresponding adjunction.
You can check that it is just the category of Haskell types with morphisms a -> T b,
where its compositions are given by the so-called 'Kleisli composition' (>=>).
A typical Haskell programmer will find this category more familiar.
Finally, as is illustrated in CWM, the category of T-algebras
(resp. the Kleisli category) becomes the terminal (resp. initial) object in the category
of adjunctions that define the list monad T, in a suitable sense.
I suggest doing a similar calculation for the binary tree functor T a = L a | B (T a) (T a) to check your understanding.
I've found a standard construction of adjoint functors for any monad by Eilenberg-Moore, but I'm not sure if it adds any insight to the problem. The second category in the construction is a category of T-algebras. A T-algebra adds a "product" to the initial category.
So how would it work for a list monad? The functor in the list monad consists of a type constructor, e.g., Int->[Int] and a mapping of functions (e.g., standard application of map to lists). An algebra adds a mapping from lists to elements. One example would be adding (or multiplying) all the elements of a list of integers. The functor F takes any type, e.g., Int, and maps it into the algebra defined on the lists of Int, where the product is defined by monadic join (or vice versa, join is defined as the product). The forgetful functor G takes an algebra and forgets the product. The pair F, G, of adjoint functors is then used to construct the monad in the usual way.
I must say I'm none the wiser.
If you are interested, here are some thoughts of a non-expert
on the role of monads and adjunctions in programming languages:
First of all, there exists for a given monad T a unique adjunction to the Kleisli category of T.
In Haskell, the use of monads is primarily confined to operations in this category
(which is essentially a category of free algebras, no quotients).
In fact, all one can do with a Haskell Monad is to compose some Kleisli morphisms of
type a->T b through the use of do expressions, (>>=), etc., to create a new
morphism. In this context, the role of monads is restricted to just the economy
of notation. One exploits associativity of morphisms to be able to write (say) [0,1,2]
instead of (Cons 0 (Cons 1 (Cons 2 Nil))), that is, you can write a sequence as a sequence,
not as a tree.
Even the use of IO monads is non-essential, for the current Haskell type system is powerful
enough to realize data encapsulation (existential types).
This is my answer to your original question,
but I'm curious what Haskell experts have to say about this.
On the other hand, as we have noted, there's also a 1-1 correspondence between monads and
adjunctions to (T-)algebras. Adjoints, in MacLane's terms, are 'a way
to express equivalences of categories.'
In a typical setting of adjunctions <F,G>:X->A where F is some sort
of 'free algebra generator' and G a 'forgetful functor', the corresponding monad
will (through the use of T-algebras) describe how (and when) the algebraic structure of A is constructed on the objects of X.
In the case of Hask and the list monad T, the structure which T introduces is that
of a monoid, and this can help us to establish properties (including correctness) of code through algebraic
methods that the theory of monoids provides. For example, the function foldr (*) e :: [a] -> a can
readily be seen as an associative operation as long as <a, (*), e> is a monoid,
a fact which could be exploited by the compiler to optimize the computation (e.g. by parallelism).
Another application is to identify and classify 'recursion patterns' in functional programming using categorical
methods in the hope of (partially) disposing of 'the goto of functional programming', Y (the arbitrary recursion combinator).
Apparently, this kind of application is one of the primary motivations of the creators of Category Theory (MacLane, Eilenberg, etc.),
namely, to establish natural equivalence of categories, and transfer a well-known method in one category
to another (e.g. homological methods to topological spaces, algebraic methods to programming, etc.).
Here, adjoints and monads are indispensable tools to exploit this connection of categories.
(Incidentally, the notion of monads (and its dual, comonads) is so general that one can even go so far as to define 'cohomologies' of
Haskell types. But I have not given it a thought yet.)
As for the non-deterministic functions you mentioned, I have much less to say...
But note that if an adjunction <F,G> : Hask -> A for some category A defines the list monad T,
there must be a unique 'comparison functor' K : A -> HaskMon (the category of monoids definable in Haskell), see CWM.
This means, in effect, that your category of interest must be a category of monoids in some restricted form (e.g. it may lack some quotients but not free algebras) in order to define the list monad.
Finally, some remarks:
The binary tree functor I mentioned in my last posting easily generalizes to an arbitrary data type
T a1 .. an = T1 T11 .. T1m | ....
Namely, any data type in Haskell naturally defines a monad (together with the corresponding category of algebras and the Kleisli category),
which is just the result of any data constructor in Haskell being total.
This is another reason why I consider Haskell's Monad class to be not much more than syntactic sugar
(which is pretty important in practice, of course).
