haskell - the type of join function - haskell

data M a = M a deriving (Show)
unitM a = M a
bindM (M a) f = f a
joinM :: M (M a) -> M a
joinM m = m `bindM` id
joinM' :: M a -> a
joinM' m = m `bindM` id
Note that joinM (M 0) will fail to type check, whereas joinM' (M 0) will be fine.
My question: why is joinM defined as M (M a) -> M a but not as M a -> a?
From my understanding,
unitM puts the value a into the monad M a
joinM gets the value a from the monad M a
So joinM should really work on any monad, i.e., not necessarily nested ones such as M (M a), right?

The point of monads is that you can't get a value out of them. If join had type m a -> a then the IO monad would be perfectly useless, since you could just extract the values freely. The point of monads is that you can chain computations together (>>= can be defined in terms of join, provided you have return and fmap) and put values into a monadic context, but you can't (in general) get them out.
In your specific case, you've defined what is essentially the identity monad. In that case, it's easy to extract the value; you just strip away the layer of M and move on with your life. But that's not true for general monads, so we restrict the type of join so that more things can be monads.
Your bindM is not of the correct type, by the way. The general type of >>= is
(>>=) :: Monad m => m a -> (a -> m b) -> m b
Your function has type
bindM :: M a -> (a -> b) -> b
Notice that your type is more general. Hence, again, in your specific case, you can get away with being looser on the requirements of joinM, whereas monads in general cannot. Try giving bindM an explicit type signature of M a -> (a -> M b) -> M b and then see if both of your join functions still typecheck.
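A minimal sketch of that experiment, reusing the names from the question. The only change is the explicit signature on bindM; the Eq instance is added here purely so results can be compared.

```haskell
data M a = M a deriving (Show, Eq)

unitM :: a -> M a
unitM = M

-- bindM with the signature a monadic bind is expected to have.
bindM :: M a -> (a -> M b) -> M b
bindM (M a) f = f a

-- Still typechecks: id is used at type M a -> M a.
joinM :: M (M a) -> M a
joinM m = m `bindM` id

-- No longer typechecks once bindM is constrained,
-- because id would have to have type a -> M b:
-- joinM' :: M a -> a
-- joinM' m = m `bindM` id
```

Uncommenting joinM' should make the compiler reject the module, which is the point of the exercise.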

Given a type constructor M :: * -> *, and a type a, consider the following sequence of types
a, M a, M (M a), M (M (M a)), ...
If we have polymorphic functions return :: b -> M b and extract :: M b -> b (your alternative join), we can convert a value of any type above to any other type above. Indeed, we can add and remove M as we like using these two functions, choosing the type b suitably. Put more casually, we can move both right and left along this sequence of types.
In a monad, instead, we can move to the right without limit (using return). We can also move to the left almost everywhere, the important exception being that we cannot move from M a to a. This is realized by join :: M (M c) -> M c, which has the type of extract :: M b -> b restricted to the case b = M c. So essentially we can move left (as with extract), but only when we end up in a type that has at least one M, hence no further left than M a.
As Carl mentions above in the comments this restriction makes it possible to have more monads. For instance, if M = [] is the list monad, we can properly implement return and join but not extract.
return :: a -> [a]
return x = [x]
join :: [[a]] -> [a]
join xss = concat xss
By contrast, extract :: [a] -> a cannot be a total function, since extract [] :: a would be well typed, yet it would have to produce a value of type a from the empty list. It is a well-known theoretical result that no total expression can have the polymorphic type ... :: a. We can have undefined :: a, fromJust Nothing :: a, or head [] :: a, but none of these is total, and each will raise an error when evaluated.
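The list case can be sketched as follows. The names unitList, joinList and extractMaybe are made up for this illustration; the last one shows the closest total alternative to extract, which has to admit in its type that there may be nothing to return.

```haskell
import Data.Maybe (listToMaybe)

-- Moving right: add one layer of [].
unitList :: a -> [a]
unitList x = [x]

-- Moving left, but only from [[a]] to [a].
joinList :: [[a]] -> [a]
joinList = concat

-- A total function cannot have type [a] -> a, so the result is wrapped
-- in Maybe to account for the empty list.
extractMaybe :: [a] -> Maybe a
extractMaybe = listToMaybe
```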

Related

Why is it bind's argument's responsibility to unit its value?

The typical monad bind function has the following signature:
m a -> (a -> m b) -> m b
As I understand it (and I might well be wrong,) the function (a -> m b) is just a mapping function from one structure a to another b. Assuming that is correct begs the question why bind's signature is not simply:
m a -> (a -> b) -> m b
Given that unit is part of a monad's definition; why give the function (a -> m b) the responsibility to call unit on whatever value b it produced – wouldn't it be more sensible to make it part of bind?
A function like m a -> (a -> b) -> m b would be equivalent to fmap :: (a -> b) -> f a -> f b. All fmap can do is change the values inside the action; it can't perform new actions. With m a -> (a -> m b) -> m b, you can "run" the m a, feed that value into (a -> m b), then return a new effect of m b. Without this, you would only ever be able to have one effect in your program. You couldn't have two print statements in sequence, connect to a network and then download a URL, or respond to user input; you would only be able to transform the value returned from each primitive operation. It's this operation that allows monads to be more powerful than either functors or applicatives.
Another detail here is that you aren't necessarily just wrapping a value with unit; that m b could represent an action, not just the return of a value. For example, where is the call to return in the action putStrLn :: String -> IO ()? This function's signature is compatible with the second argument to >>=, with a ~ String, m ~ IO and b ~ (), but there is no call to return anywhere in its body. The point of >>= is to sequence two actions together, not just to wrap values in a context.
Because m a -> (a -> b) -> m b is just fmap, which a Monad has, being a Functor.
What a Monad adds to a Functor is the ability to join (or squash) a nested Monad into a simple one.
For example, a list of lists can be squashed into a simple list: [[1,2], [3]] becomes [1,2,3].
If you replace b with m b in the fmap signature you end up with
m a -> (a -> m b) -> m (m b)
With a normal functor, you are stuck with your double layer of container (m (m b)). With a Monad,
using the join function, you can squash the m (m b) to m b. So bind is in fact fmap followed by join: m >>= f = join (fmap f m).
In fact, join and fmap can be written using only bind (and return), so in practice it's easier to define the single function bind instead of the two functions join and fmap, even though it is often simpler to write the latter two.
So basically, bind is a mix of fmap and join.
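Both directions can be written out as a short sketch. The names bindViaJoin, joinViaBind and fmapViaBind are invented for this example; join is the one from Control.Monad.

```haskell
import Control.Monad (join)

-- bind written with fmap and join
bindViaJoin :: Monad m => m a -> (a -> m b) -> m b
bindViaJoin ma f = join (fmap f ma)

-- join recovered from bind
joinViaBind :: Monad m => m (m a) -> m a
joinViaBind mma = mma >>= id

-- fmap recovered from bind and return
fmapViaBind :: Monad m => (a -> b) -> m a -> m b
fmapViaBind f ma = ma >>= (return . f)
```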
As I understand it (and I might well be wrong,) the function (a -> m b) is just a mapping function from one structure a to another b
You're quite right about this – if you change the word "mapping" to morphism. For functions a -> m b are morphisms of the monad's Kleisli category. In that light, the characteristic feature of monads is that you can compose Kleislis in the same way you can compose functions:
type Kleisli m a b = a -> m b -- `Control.Arrow` has this as a `newtype` with `Category` instance.
-- compare (.) :: (b->c) -> (a->b) -> a->c
(<=<) :: Monad m => Kleisli m b c -> Kleisli m a b -> Kleisli m a c
(f<=<g) x = f =<< g x
Also, you can use ordinary functions as Kleislis:
(return.) :: Monad m => (a->b) -> Kleisli m a b
However, Kleislis are strictly more powerful than functions. E.g. for m ≡ IO, they are basically functions which can have side-effects, which as you know ordinary Haskell functions can't. So you can't turn a Kleisli back into a function – and if >>= accepted an a->b rather than a Kleisli m a b, but all you had was a Kleisli, there would be no way to use it.
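As a small illustration of Kleisli composition, here is a sketch in the Maybe monad. The functions safeRecip, safeSqrt, recipThenSqrt and double are hypothetical examples, not library functions; (<=<) comes from Control.Monad.

```haskell
import Control.Monad ((<=<))

safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Kleisli composition: run safeRecip first, then safeSqrt.
recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = safeSqrt <=< safeRecip

-- An ordinary function turned into a Kleisli arrow with (return .).
double :: Double -> Maybe Double
double = return . (* 2)
```

Going the other way, from safeRecip to a plain Double -> Double, would mean inventing a result for the input 0, which is why a Kleisli arrow cannot in general be turned back into a function.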
A function of type a -> m b has potentially many more capabilities than one of type a -> b followed by return (or as you call it, "unit"). In fact no "effectful" operation can be expressed in the latter form.
Another take on this: any useful monad will have a number of operations specific to it, beyond just the ones that come from the monadic interface. For example, the IO monad has getLine :: IO String. Consider this very simple program:
main :: IO ()
main = do name <- prompt "What's your name?"
          putStrLn ("Hello " ++ name ++ "!")
prompt :: String -> IO String
prompt str = do putStrLn str
                getLine
Note that the type of prompt fits the a -> m b mold, but it doesn't use return (a.k.a. unit) anywhere. This is because it uses getLine :: IO String, an opaque operation provided by the IO monad and which cannot be defined in terms of return and >>=.
Think of it this way: ultimately, Monad is never something you use on its own; it's an interface for plugging together things that are extrinsic to it, like getLine and putStrLn.

Explanation of partial application - join

Why does partial application of functions with different signatures work?
Take Control.Monad.join as an example:
GHCi> :t (=<<)
(=<<) :: Monad m => (a -> m b) -> m a -> m b
GHCi> :t id
id :: a -> a
GHCi> :t (=<<) id
(=<<) id :: Monad m => m (m b) -> m b
Why does it accept id :: a -> a in place of the (a -> m b) argument, when they are obviously different?
=<<'s type signature says that the first argument is a function from an a (anything) to a monad of b.
Well, m b counts as anything, right? So we can just substitute in m b for every a:
(=<<) :: Monad m => (m b -> m b) -> m (m b) -> m b
id's type says that it is a function from anything to the same anything. So if we sub in m b (not forgetting the monad constraint), we get:
id :: Monad m => m b -> m b
Then you can see that the types match.
Some useful concepts to use here:
Any type with a variable a can be converted into a different type by replacing every instance of a with any other type t. So if you have the type a -> b -> c, you can obtain the type a -> d -> c or the type a -> b -> Int by replacing b with d or c with Int respectively.
Any two types that can be converted to each other by replacement are equivalent. For example, a -> b and c -> d are equivalent (a ~ c, b ~ d).
If a type t can be converted to a type t', but t' can't be converted back to t, then we say that t' is a specialization of t. For example, a -> a is a specialization of a -> b.
Now, with these very useful concepts, the answer to your question is very simple: even if the function's "native" types don't match exactly, they are compatible because they can be rewritten or specialized to get that exact match. Matt Fenwick's answer shows specializations that do it for this case.
It tries to unify a with m b, and simply decides that a must be m b, so the type of (=<<) (under the assumption a ~ m b) is Monad m => (m b -> m b) -> m (m b) -> m b, and once you apply it to id, you are left with Monad m => m (m b) -> m b.
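One way to convince yourself is to write the specialized types down and let the compiler check them. This is only a sketch; bindSpecialised, idSpecialised and joinFromBind are names made up for the purpose.

```haskell
-- (=<<) with a replaced by m b throughout.
bindSpecialised :: Monad m => (m b -> m b) -> m (m b) -> m b
bindSpecialised = (=<<)

-- id with a replaced by m b.
idSpecialised :: m b -> m b
idSpecialised = id

-- Applying one to the other leaves exactly the type of join.
joinFromBind :: Monad m => m (m b) -> m b
joinFromBind = bindSpecialised idSpecialised
```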

why can't a function take monadic value and return another monadic value?

Let's say that we have two monadic functions:
f :: a -> m b
g :: b -> m c
h :: a -> m c
The bind function is defined as
(>>=) :: m a -> (a -> m b) -> m b
My question is why can not we do something like below. Declare a function which would take a monadic value and returns another monadic value?
f :: a -> m b
g :: m b -> m c
h :: a -> m c
The bind function is defined as
(>>=) :: m a -> (m a -> m b) -> m b
What is it in Haskell that restricts a function from taking a monadic value as its argument?
EDIT: I think I did not make my question clear. The point is, when you are composing functions using the bind operator, why is the second argument to bind a function which takes a non-monadic value (b)? Why can't it take a monadic value (m b) and give back m c? Is it that, when you are dealing with monads, the functions you compose will always have the following types?
f :: a -> m b
g :: b -> m c
h :: a -> m c
and h = f 'compose' g
I am trying to learn monads and this is something I am not able to understand.
A key ability of Monad is to "look inside" the m a type and see an a; but a key restriction of Monad is that it must be possible for monads to be "inescapable," i.e., the Monad typeclass operations should not be sufficient to write a function of type Monad m => m a -> a. (>>=) :: Monad m => m a -> (a -> m b) -> m b gives you exactly this ability.
But there's more than one way to achieve that. The Monad class could be defined like this:
class Functor f where
  fmap :: (a -> b) -> f a -> f b
class Functor m => Monad m where
  return :: a -> m a
  join :: m (m a) -> m a
You ask why we could not have a Monad m => m a -> (m a -> m b) -> m b function. Well, given f :: a -> b, fmap f :: m a -> m b is basically that. But fmap by itself doesn't give you the ability to "look inside" a Monad m => m a yet not be able to escape from it. However, join and fmap together give you that ability. (>>=) can be written generically with fmap and join:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
ma >>= f = join (fmap f ma)
In fact this is a common trick for defining a Monad instance when you're having trouble coming up with a definition for (>>=)—write the join function for your would-be monad, then use the generic definition of (>>=).
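As a sketch of that trick, take a binary tree with values at the leaves. The Tree type and joinTree are invented for this example; the Applicative instance is filled in with ap from Control.Monad because current GHC requires one.

```haskell
import Control.Monad (ap)

data Tree a = Leaf a | Node (Tree a) (Tree a)
  deriving (Show, Eq)

instance Functor Tree where
  fmap f (Leaf a)   = Leaf (f a)
  fmap f (Node l r) = Node (fmap f l) (fmap f r)

-- join for trees: graft each inner tree in place of the leaf that held it.
joinTree :: Tree (Tree a) -> Tree a
joinTree (Leaf t)   = t
joinTree (Node l r) = Node (joinTree l) (joinTree r)

instance Applicative Tree where
  pure  = Leaf
  (<*>) = ap

instance Monad Tree where
  -- the generic definition: map, then flatten
  t >>= f = joinTree (fmap f t)
```

Writing joinTree only needs a case analysis on the outer tree, which is often easier to get right than writing (>>=) directly.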
Well, that answers the "does it have to be the way it is" part of the question with a "no." But, why is it the way it is?
I can't speak for the designers of Haskell, but I like to think of it this way: in Haskell monadic programming, the basic building blocks are actions like these:
getLine :: IO String
putStrLn :: String -> IO ()
More generally, these basic building blocks have types that look like Monad m => m a, Monad m => a -> m b, Monad m => a -> b -> m c, ..., Monad m => a -> b -> ... -> m z. People informally call these actions. Monad m => m a is a no-argument action, Monad m => a -> m b is a one-argument action, and so on.
Well, (>>=) :: Monad m => m a -> (a -> m b) -> m b is basically the simplest function that "connects" two actions. getLine >>= putStrLn is the action that first executes getLine, and then executes putStrLn passing it the result that was obtained from executing getLine. If you had fmap and join and not >>= you'd have to write this:
join (fmap putStrLn getLine)
Even more generally, (>>=) embodies a notion much like a "pipeline" of actions, and as such is the more useful operator for using monads as a kind of programming language.
Final thing: make sure you are aware of the Control.Monad module. While return and (>>=) are the basic functions for monads, there are endless higher-level functions that you can define using those two, and that module gathers a few dozen of the more common ones. Your code should not be forced into a straitjacket by (>>=); it's a crucial building block that's useful both on its own and as a component for larger building blocks.
why can not we do something like below. Declare a function which would take a monadic value and returns another monadic value?
f :: a -> m b
g :: m b -> m c
h :: a -> m c
Am I to understand that you wish to write the following?
compose :: (a -> m b) -> (m b -> m c) -> (a -> m c)
compose f g = h
  where h = ???
It turns out that this is just regular function composition, but with the arguments in the opposite order
(.) :: (y -> z) -> (x -> y) -> (x -> z)
(g . f) = \x -> g (f x)
Let's choose to specialize (.) with the types x = a, y = m b, and z = m c
(.) :: (m b -> m c) -> (a -> m b) -> (a -> m c)
Now flip the order of the inputs, and you get the desired compose function
compose :: (a -> m b) -> (m b -> m c) -> (a -> m c)
compose = flip (.)
Notice that we haven't even mentioned monads anywhere here. This works perfectly well for any type constructor m, whether it is a monad or not.
Now let's consider your other question. Suppose we want to write the following:
composeM :: (a -> m b) -> (b -> m c) -> (a -> m c)
Stop. Hoogle time. Hoogling for that type signature, we find there is an exact match! It is >=> from Control.Monad, but notice that for this function, m must be a monad.
Now the question is why. What makes this composition different from the other one such that this one requires m to be a Monad, while the other does not? Well, the answer to that question lies at the heart of understanding what the Monad abstraction is all about, so I'll leave a more detailed answer to the various internet resources that speak about the subject. Suffice it to say that there is no way to write composeM without knowing something about m. Go ahead, try it. You just can't write it without some additional knowledge about what m is, and the additional knowledge necessary to write this function just happens to be that m has the structure of a Monad.
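For readers who have tried and want to compare notes, here is one possible sketch of both compositions. composeM and composeLib are illustrative names; (>=>) is the function from Control.Monad.

```haskell
import Control.Monad ((>=>))

-- Plain composition with flipped arguments: no constraint on m is needed.
compose :: (a -> m b) -> (m b -> m c) -> (a -> m c)
compose = flip (.)

-- Kleisli composition: g wants a b, but f x only offers an m b.
-- Bind is what bridges that gap, hence the Monad constraint.
composeM :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
composeM f g = \x -> f x >>= g

-- The library version, for comparison.
composeLib :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
composeLib = (>=>)
```

Removing the Monad m constraint from composeM should make it fail to compile, since nothing else in scope can take the b out of f x.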
Let me paraphrase your question a little bit:
why don't we use functions of type g :: m a -> m b with Monads?
The answer is, we do already, with Functors. There's nothing especially "monadic" about fmap f :: Functor m => m a -> m b where f :: a -> b. Monads are Functors; we get such functions just by using good old fmap:
class Functor f where
  fmap :: (a -> b) -> f a -> f b
If you have a function f :: m a -> m b and a monadic value x :: m a, you can simply apply f x. You don't need any special monadic operator for that, just function application. But a function such as f can never "see" a value of type a.
Monadic composition of functions is a much stronger concept, and functions of type a -> m b are the core of monadic computations. If you have a monadic value x :: m a, you cannot "get into it" to retrieve some value of type a. But, if you have a function f :: a -> m b that operates on values of type a, you can compose the value with the function using >>= to get x >>= f :: m b. The point is, f "sees" a value of type a and can work with it (but it cannot return it, it can only return another monadic value). This is the benefit of >>= and each monad is required to provide its proper implementation.
To compare the two concepts:
If you have g :: m a -> m b, you can compose it with return to get g . return :: a -> m b (and then work with >>=), but
not vice versa. Without >>=, there is in general no way of creating a function of type m a -> m b from a function of type a -> m b; that conversion is exactly what >>= provides.
So composing functions of types like a -> m b is a strictly stronger concept than composing functions of types like m a -> m b.
For example: The list monad represents computations that can give a variable number of answers, including 0 answers (you can view it as non-deterministic computations). The key elements of computing within list monad are functions of type a -> [b]. They take some input and produce a variable number of answers. Composition of these functions takes the results from the first one, applies the second function to each of the results, and merges it into a single list of all possible answers.
Functions of type [a] -> [b] would be different: They'd represent computations that take multiple inputs and produce multiple answers. They can be combined too, but we get something less strong than the original concept.
Perhaps an even more distinctive example is the IO monad. If you called getChar :: IO Char and used only functions of type IO a -> IO b, you'd never be able to work with the character that was read. But >>= allows you to combine such a value with a function of type a -> IO b that can "see" the character and do something with it.
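The list case described above can be made concrete with a small sketch. The functions neighbours, positiveOnly and positiveNeighbours are hypothetical examples.

```haskell
-- Each function takes one input and may produce several answers, or none.
neighbours :: Int -> [Int]
neighbours n = [n - 1, n + 1]

positiveOnly :: Int -> [Int]
positiveOnly n
  | n > 0     = [n]
  | otherwise = []

-- Bind applies positiveOnly to every result of neighbours
-- and merges the answers into a single list.
positiveNeighbours :: Int -> [Int]
positiveNeighbours n = neighbours n >>= positiveOnly
```

Here positiveOnly gets to inspect each individual Int, which a function of type [Int] -> [Int] could only do by taking the list apart itself.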
As others have pointed out, there is nothing that restricts a function from taking a monadic value as an argument. The bind function itself takes one, but the function that is given to bind does not.
I think you can make this understandable to yourself with the "Monad is a Container" metaphor. A good example for this is Maybe. While we know how to unwrap a value from the Maybe container, we do not know how to do it for every monad, and in some monads (like IO) it is entirely impossible.
The idea is now that the Monad does this behind the scenes in a way you don't have to know about. For example, you indeed need to work with a value that was returned in the IO monad, but you cannot unwrap it, hence the function that does this needs to be in the IO monad itself.
I like to think of a monad as a recipe for constructing a program with a specific context. The power that a monad provides is the ability to, at any stage within your constructed program, branch depending upon the previous value. The usual >>= function was chosen as being the most generally useful interface to this branching ability.
As an example, the Maybe monad provides a program that may fail at some stage (the context is the failure state). Consider this pseudo-Haskell example:
-- take a computation that produces an Int. If the current Int is even, add 1.
incrIfEven :: Monad m => m Int -> m Int
incrIfEven anInt =
  let ourInt = currentStateOf anInt
  in if even ourInt then return (ourInt+1) else return ourInt
In order to branch based on the current result of a computation, we need to be able to access that current result. The above pseudo-code would work if we had access to currentStateOf :: m a -> a, but that isn't generally possible with monads. Instead we write our decision to branch as a function of type a -> m b. Since the a isn't in a monad in this function, we can treat it like a regular value, which is much easier to work with.
incrIfEvenReal :: Monad m => m Int -> m Int
incrIfEvenReal anInt = anInt >>= branch
  where branch ourInt = if even ourInt then return (ourInt+1) else return ourInt
So the type of >>= is really for ease of programming, but there are a few alternatives that are sometimes more useful. Notably the function Control.Monad.join, which when combined with fmap gives exactly the same power as >>= (either can be defined in terms of the other).
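To see that incrIfEvenReal really is generic in the monad, here is a sketch that runs it in Maybe and in the list monad. The definition is repeated so the block stands alone; inMaybe, inFailure and inList are names chosen for this example.

```haskell
incrIfEvenReal :: Monad m => m Int -> m Int
incrIfEvenReal anInt = anInt >>= branch
  where
    branch ourInt = if even ourInt then return (ourInt + 1) else return ourInt

-- In Maybe, a present value is inspected and possibly incremented.
inMaybe :: Maybe Int
inMaybe = incrIfEvenReal (Just 4)      -- Just 5

-- A failed computation stays failed; branch is never called.
inFailure :: Maybe Int
inFailure = incrIfEvenReal Nothing     -- Nothing

-- In the list monad, branch runs once per element.
inList :: [Int]
inList = incrIfEvenReal [1, 2, 3, 4]   -- [1,3,3,5]
```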
The reason (>>=)'s second argument does not take a monad as input is because there is no need to bind such a function at all. Just apply it:
m :: m a
f :: a -> m b
g :: m b -> m c
h :: c -> m b
(g (m >>= f)) >>= h
You don't need (>>=) for g at all.
The function can take a monadic value if it wants. But it is not forced to do so.
Consider the following contrived definitions, using the list monad and functions from Data.Char:
import Data.Char (chr, toLower, toUpper)
m :: [[Int]]
m = [[71,72,73], [107,106,105,104]]
f :: [Int] -> [Char]
f mx = do
  g <- [toUpper, id, toLower]
  x <- mx
  return (g $ chr x)
You can certainly run m >>= f; the result will have type [Char].
(It's important here that m :: [[Int]] and not m :: [Int]. >>= always "strips off" one monadic layer from its first argument. If you don't want that to happen, do f m instead of m >>= f.)
As others have mentioned, nothing restricts such functions from being written.
There is, in fact, a large family of functions of type m a -> (m a -> m b) -> m b:
f :: Monad m => Int -> m a -> (m a -> m b) -> m b
f n m mf = replicateM_ n m >> mf m
which unfolds to
f 0 m mf = mf m
f 1 m mf = m >> mf m
f 2 m mf = m >> m >> mf m
... etc. ...
(Note the base case: when n is 0, it's simply normal functional application.)
But what does this function do? It performs a monadic action multiple times, finally throwing away all the results, and returning the application of mf to m.
Useful sometimes, but hardly generally useful, especially compared to >>=.
A quick Hoogle search doesn't turn up any results; perhaps a telling result.
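For experimenting, the family can be written as a compilable sketch. The name repeatThen is made up; replicateM_ comes from Control.Monad, and the repetitions are sequenced with >> because their results are discarded.

```haskell
import Control.Monad (replicateM_)

-- Run the action n times for its effects only,
-- then hand the action itself to mf.
repeatThen :: Monad m => Int -> m a -> (m a -> m b) -> m b
repeatThen n m mf = replicateM_ n m >> mf m
```

In the Maybe monad, for instance, a single repetition of Nothing is enough to make the whole result Nothing, whereas zero repetitions leave the outcome entirely up to mf.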

What Haskell type system magic allows for the definition of join?

The join utility function is defined as:
join :: (Monad m) => m (m a) -> m a
join x = x >>= id
Given that the type of >>= is Monad m => m a -> (a -> m b) -> m b and id is a -> a, how can that function also be typed as a -> m b as it must be in the definition above? What are m and b in this case?
The a's in the types for >>= and id aren't necessarily the same a's, so let's restate the types like this:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
id :: c -> c
So we can conclude that c is the same as a after all, at least when id is the second argument to >>=... and also that c is the same as m b. So a is the same as m b. In other words:
(>>= id) :: Monad m => m (m b) -> m b
dave4420 hits it, but I think the following remarks might still be useful.
There are rules that you can use to validly "rewrite" a type into another type that's compatible with the original. These rules involve replacing all occurrences of a type variable with some other type:
If you have id :: a -> a, you can replace a with c and get id :: c -> c. This latter type can also be rewritten to the original id :: a -> a, which means that these two types are equivalent. As a general rule, if you replace all instances of a type variable with another type variable that occurs nowhere in the original, you get an equivalent type.
You can replace all occurrences of a type variable with a concrete type. I.e., if you have id :: a -> a, you can rewrite that to id :: Int -> Int. The latter, however, can't be rewritten back to the original, so in this case you're specializing the type.
More generally than the second rule, you can replace all occurrences of a type variable with any type, concrete or variable. So for example, if you have f :: a -> m b, you can replace all occurrences of a with m b and get f :: m b -> m b. Since this one can't be undone either, it's also a specialization.
That last example shows how id can be used as the second argument of >>=. So the answer to your question is that we can rewrite and derive types as follows:
1. (>>=) :: m a -> (a -> m b) -> m b (premise)
2. id :: a -> a (premise)
3. (>>=) :: m (m b) -> (m b -> m b) -> m b (replace a with m b in #1)
4. id :: m b -> m b (replace a with m b in #2)
.
.
.
n. (>>= id) :: m (m b) -> m b (indirectly from #3 and #4)

Monad "bind" function question

If I define the "bind" function like this:
(>>=) :: M a -> (a -> M' b) -> M' b
Will this definition help me if I want the result to be of a new Monad type, or I should use same Monad but with b in the same Monad box as before?
As I've mentioned in the comment, I don't think such an operation can be safely defined for general monads (e.g. M = IO, M' = Maybe).
However, if the M is safely convertible to M', then this bind can be defined as:
convert :: M1 a -> M2 a
...
(>>=*) :: M1 a -> (a -> M2 b) -> M2 b
x >>=* f = convert x >>= f
And conversely,
convert x = x >>=* return
Some such safe conversions are maybeToList (Maybe → []), listToMaybe ([] → Maybe), stToIO (ST RealWorld → IO), ... Note that there isn't a generic convert method that works for every pair of monads.
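Taking maybeToList as the conversion, the construction above can be sketched concretely. bindMaybeList and convertBack are names invented for this example.

```haskell
import Data.Maybe (maybeToList)

-- Bind across two monads, routed through the conversion Maybe -> [].
bindMaybeList :: Maybe a -> (a -> [b]) -> [b]
bindMaybeList x f = maybeToList x >>= f

-- The conversion recovered from the mixed bind, as in
-- convert x = x >>=* return.
convertBack :: Maybe a -> [a]
convertBack x = x `bindMaybeList` return
```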
Not only will that definition not help, but it will seriously confuse future readers of your code, since it will break all expectations of use for it.
For instance, are both M and M' supposed to be Monads? If so, then how are they defined? Remember: the definition of >>= is part of the definition of Monad, and is used everywhere to define other Monad-using functions - every function besides return and fail themselves.
Also, do you get to choose which M and M' you use, or does the computer? If so, then how do you choose? Does it work for any two Monad instances, or is there some subset of Monad that you want - or does the choice of M determine the choice of M'?
It's possible to make a function like what you've written, but it surely is a lot more complicated than >>=, and it would be misleading, cruel, and potentially disastrous to try to cram your function into >>='s clothes.
This can be a complicated thing to do, but it is doable in some contexts. Basically, if they are monads you can see inside (such as Maybe or a monad you've written) then you can define such an operation.
One thing which is sometimes quite handy (in GHC) is to replace the Monad class with one of your own. If you define return, >>=, fail yourself and enable the RebindableSyntax extension, you'll still be able to use do notation. Here's an example that may be like what you want:
class Compose s t where
  type Comp s t
class Monad m where
  return :: a -> m s a
  fail :: String -> m s a
  (>>=) :: (Compose s t) => m s a -> (a -> m t b) -> m (Comp s t) b
  (>>) :: (Compose s t) => m s a -> m t b -> m (Comp s t) b
  m >> m' = m >>= \_ -> m'
You can then control which types can be sequenced using the bind operator based on which instances of Compose you define. Naturally you'll often want Comp s s = s, but you can also use this to define all sorts of crazy things.
For instance, perhaps you have some operations in your monad which absolutely cannot be followed by any other operations. Want to enforce that statically? Define an empty datatype data Terminal and provide no instances of Compose Terminal t.
This approach is not good for transposing from (say) Maybe to IO, but it can be used to carry along some type-level data about what you're doing.
If you really do want to change monads, you can modify the class definitions above into something like
class Compose m n where
  type Comp m n :: * -> *
  (>>=*) :: m a -> (a -> n b) -> (Comp m n) b
class Monad m where
  return :: a -> m a
  fail :: String -> m a
  (>>=) :: Compose m n => m a -> (a -> n b) -> (Comp m n) b
  m >>= f = m >>=* f
  (>>) :: Compose m n => m a -> (n b) -> (Comp m n) b
  m >> n = m >>=* \_ -> n
I've used the former style to useful ends, though I imagine that this latter idea may also be useful in certain contexts.
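A cut-down, compilable sketch of the second idea is shown below. It keeps only the Compose class and leaves the standard Monad class alone. The two instances, and the choice that mixing Maybe and [] yields [], are assumptions made for illustration.

```haskell
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, TypeFamilies #-}
import Data.Kind (Type)
import Data.Maybe (maybeToList)

-- Comp m n names the monad that results from sequencing m, then n.
class Compose m n where
  type Comp m n :: Type -> Type
  (>>=*) :: m a -> (a -> n b) -> Comp m n b

-- Maybe followed by a list: zero or one input, many outputs.
instance Compose Maybe [] where
  type Comp Maybe [] = []
  x >>=* f = maybeToList x >>= f

-- A list followed by Maybe: keep the outputs that succeeded.
instance Compose [] Maybe where
  type Comp [] Maybe = []
  xs >>=* f = xs >>= (maybeToList . f)
```

Each pair of monads needs its own hand-written instance, which reflects the earlier remark that no generic conversion between arbitrary monads exists.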
You may want to look at this sample from Oleg: http://okmij.org/ftp/Computation/monads.html#param-monad

Resources