Wrapping a function's arguments - haskell

I am trying to figure out the mechanism to transform a generic function with n arguments into a function where each of the n argument is wrapped inside a context.
A simple example in pseudo-Haskell, where the wrapped type is tuple:
wrap :: (a1 -> a2 -> a3 -> ...) -> Context c -> ((c, a1) -> (c, a2) -> (c, a3) -> ...)
The container which holds the contexts is suggested by the alias Context c, for which I'm yet to figure out a reasonable form. For example it could be an n-tuple, a list of cs or even a composition of type (c -> c -> c -> ...).
It would most likely have to do with some ingenious use of applicative functors or monads, but I currently have no idea where to start attacking the problem. Any lead would be deeply appreciated.
Edit. Resolution
As the comments below suggest, the behaviour can be described as "give me some generic function with n arguments and [some contexts] and attach a context to each argument of the function". This resulting function will be applied applicative-style to n structures S a using some operators which are implemented to make use of these contexts, something like:
(wrap f contexts) §> s1 <§> s2 <§> s3 <§> ....
As an example, I got something running with fixed-size vectors arguments like FSVec c a. So I was wondering whether there is a method to pass a generic function of type fi :: [a1] -> [a2] -> ... and some lengths and "zip" them to get fo :: FSVec c a1 -> FSVec c a2 -> .... And even more, can it be done in a generic way, i.e. apply any arbitrary constructor, not necessarily one which uses type-level (e.g. a tuple constructor as in the example above)?
Alternate solution
Of course, I can always "zip" the contexts on the structures themselves, having to only reimplement (§>) and (<§>) to act accordingly and saving me a lot of trouble:
f §> (c1, s1) <§> (c2, s2) <§> (c3, s3) <§> ...
Still, I would like to "hide" these contexts as to separate concerns. Apart from just being a matter of style, it is actually a good mental exercise to explore the power of applicative functors and applicative-like functions.

Related

What is the difference between functor and monad Intuitively

I have learned the definitions of functor and monad, But I am still unable to figure out the difference between them except for the definitions. I have read some answers to this problem, In What is the difference between a Functor and a Monad? one comment say
A functor takes a pure function (and a functorial value) whereas a monad takes a Kleisli arrow, i.e. a function that returns a monad (and a monadic value). Hence you can chain two monads and the second monad can depend on the result of the previous one. You cannot do this with functors.
This comment is interesting and gives me a read of their difference. But I still have some questions.
why functor cannot use the result of previous one? since fmap :: (a -> b) -> f a -> f b, when I currying fmap with a pure function, I can get a f a -> f b function,f b depends on f a, Does the result mean data inside the functor?
In category theory I can understand the comment since I cannot get the element in category theory, but In Haskell I find out that I can use the result of functor since Haskell can remember the data constructor, Does Haskell prevent me from understanding this comment? Should I understand this in pure category theory?
When using fmap :: Functor f => (a -> b) -> f a -> f b, the argument function can never depend on any of the structure from the functor f itself; how can it, when all it ever gets passed are a values, nothing to do with f? When you use fmap to get an f a -> f b function, the code that is responsible for building the final f b value is the code of fmap itself, not the code of the a -> b function being mapped. The function being mapped might not even be called at all (e.g. if you're using the Maybe functor and the f a value is Nothing, there are no a values inside that for fmap to pass to the a -> b function, and it will never be called).
But fmap is fully polymorphic in a and b, meaning the code of fmap doesn't know what those types are and cannot assume anything about them. So fmap has to build the final f b value in a way that only depends on the functor structure added by f; the code of fmap cannot look at any a values and make decisions about what to do. In fact really the only thing a lawful fmap can do is return a value with exactly the same structure as the input f a value, only with any as that were inside replaced by the b value returned by the a -> b function. That's why you can just deriving Functor any data type; there's at most one way to make any given data type a Functor, and the compiler can just do it for you.
But the monad =<< function1 has this type: Monad m => (a -> m b) -> m a -> m b. Here the mapped function has type a -> m b. That means the final m b structure returned at the end isn't purely determined by the code of =<< (at least not necessarily). The a -> m b function knows what monad is being used and returns some monadic structure in its m b value, not just a raw b. The mapped function still doesn't receive any monadic structure, so its intermediate m b result can't depend on that; any dependency the final m b value has on the monadic structure in the original m a has to be determined by the code for =<< (which again, cannot make any decisions based on a values itself). But the mapped function (if it is called) does get to contribute some monadic structure, and that can depend on the a values (because the mapped function doesn't have to be completely polymorphic in a; it can inspect a values and make decisions based on them).
This means that ultimately any part of the final output m b might depend on any part of the input (if we don't know what function was mapped, or how the =<< function works internally). This is quite different from the functor case, where even without knowing anything about the specific functor or the mapped function we know that the final output will always look like "a copy of the input with as replaced by bs" (and specifically which b replaces each a is determined by the mapped function).
It's possibly worth noting: almost everything I've said here is a direct consequence of what these types mean:
Functor f => (a -> b) -> f a -> f b
Monad m => (a -> m a) -> m a -> m b
These aren't special properties of functors and monads you have to remember, it's just a consequence of those operations having those types. (The functor and monad laws express things that the types don't, but I've not needed to invoke those to explain this difference)
Once you are really deeply used to thinking about types the Haskell way, you can figure this out for yourself just by looking at the types. I didn't of course; I had it explained to me (a dozen times) when I was learning, and I've remembered it since then. But now I can figure out similar things about new APIs I've never seen before, without any tutorials or explanations.
1 I'm using the reversed =<< operator rather than the traditional >>= operator merely because it lines up better with the fmap / <$> operator.
A monad is a functor, or rather a functor along with two additional operations. (The original(?) name for a monad was a triple.)
It helps to look at the original mathematical definition, which provided an operation unit :: Monad m => a -> m a, but instead of (>>=) :: Monad m => m a -> (a -> m b) -> m b, defined an operation called join :: Monad m => m (m a) -> m a, which lets you "remove" a layer of lifting. The two formulations are equivalent, as join can be defined in terms of (>>=)
join ma = ma >>= id
and vice versa:
ma >>= f = join $ fmap f ma
The last definition gives a different interpretation of a monad: it's a functor that you can "compress"; chaining comes for free by mapping a function over an already lifted value, then combining the two layers of lifting back to a single layer.
unit is required in either case because there's no way to get a lifted value in the first place with just fmap.

Understanding the Haskell type of this expression

I was doing a Haskell types exercise and this one has stumped me. The provided expression is:
f2 f g h = h.g.f
And the type for f2 is apparently:
f2 :: (a -> b1) -> (b1 -> b) -> (b -> c) -> a -> c
This seems overly complicated for such a short expression. Can someone explain why this makes sense as the type?
Why do you think it's complicated? It's just a function that takes three functions (of matching types) and returns a new one.
I've renamed b1 to b and updated other names, for consistency. So here is a slightly edited version:
f2 :: (a -> b) -> (b -> c) -> (c -> d) -> a -> d
Here, f2 takes f which has type a -> b. Meaning, it's a function, with some input of some type a (a can be just anything, no restrictions here) and some return value of type b.
Then f2 takes g which input's type must match f's output, so it's b -> c. It cannot be, like, e -> c (where e is something different from b), or the code won't make sense as it won't be possible to compose f . g.
And exactly the same goes for the third argument, h, with type c -> d.
The result of function composition (what f2 returns) is a new function that takes in whatever f does and returns whatever h returns, so it's a -> d. With this we've covered the whole type definition.
Basically, it's a relatively long definition, but I think it's of a very simple nature.
I find these “determine the type of such and such expression” generally a bit backwards. Types should always come first: you want to program a solution to some task, you formulate the problem description as a type signature. Then you go ahead and actually write an implementation.
In this case, you'd start with the problem: I have three functions f, g and h, and a value of type that I can pass to f. Furthermore, the functions have pairwise matching result/argument type. Hence the signature
f2 :: (α -> β) -> (β -> γ) -> (γ -> δ) -> α -> δ
You could now go on and implement this in explicit-pointed form, i.e.
f2 f g h x = h (g (f x))
which is still quite brief. After all, it's quite a simple task!
But in Haskell you can make it even shorter, by using the standard composition operator .. The fact that the final implementation is so extremely short is basically just down to the fact that f2 does essentially the same thing as ., just twice. So this isn't more surprising than if you have a very complex task with a complicated type signature, but discover some library that contains a function which does almost that exact task. Obviously, invoking that ready-build function will give you a much shorter implementation than the task complexity would suggest, but the complexity is merely deferred to the library function.

Are there useful applications for the Divisible Type Class?

I've lately been working on an API in Elm where one of the main types is contravariant. So, I've googled around to see what one can do with contravariant types and found that the Contravariant package in Haskell defines the Divisible type class.
It is defined as follows:
class Contravariant f => Divisible f where
divide :: (a -> (b, c)) -> f b -> f c -> f a
conquer :: f a
It turns out that my particular type does suit the definition of the Divisible type class. While Elm does not support type classes, I do look at Haskell from time to time for some inspiration.
My question: Are there any practical uses for this type class? Are there known APIs out there in Haskell (or other languages) that benefit from this divide-conquer pattern? Are there any gotchas I should be aware of?
Thank you very much for your help.
One example:
Applicative is useful for parsing, because you can turn Applicative parsers of parts into a parser of wholes, needing only a pure function for combining the parts into a whole.
Divisible is useful for serializing (should we call this coparsing now?), because you can turn Divisible serializers of parts into a serializer of wholes, needing only a pure function for splitting the whole into parts.
I haven't actually seen a project that worked this way, but I'm (slowly) working on an Avro implementation for Haskell that does.
When I first came across Divisible I wanted it for divide, and had no idea what possible use conquer could be other than cheating (an f a out of nowhere, for any a?). But to make the Divisible laws check out for my serializers conquer became a "serializer" that encodes anything to zero bytes, which makes a lot of sense.
Here's a possible use case.
In streaming libraries, one can have fold-like constructs like the ones from the foldl package, that are fed a sequence of inputs and return a summary value when the sequence is exhausted.
These folds are contravariant on their inputs, and can be made Divisible. This means that if you have a stream of elements where each element can be somehow decomposed into b and c parts, and you also happen to have a fold that consumes bs and another fold that consumes cs, then you can build a fold that consumes the original stream.
The actual folds from foldl don't implement Divisible, but they could, using a newtype wrapper. In my process-streaming package I have a fold-like type that does implement Divisible.
divide requires the return values of the constituent folds to be of the same type, and that type must be an instance of Monoid. If the folds return different, unrelated monoids, a workaround is to put each return value in a separate field of a tuple, leaving the other field as mempty. This works because a tuple of monoids is itself a Monoid.
I'll examine the example of the core data types in Fritz Henglein's generalized radix sort techniques as implemented by Edward Kmett in the discrimination package.
While there's a great deal going on there, it largely focuses around a type like this
data Group a = Group (forall b . [(a, b)] -> [[b]])
If you have a value of type Group a you essentially must have an equivalence relationship on a because if I give you an association between as and some type b completely unknown to you then you can give me "groupings" of b.
groupId :: Group a -> [a] -> [[a]]
groupId (Group grouper) = grouper . map (\a -> (a, a))
You can see this as a core type for writing a utility library of groupings. For instance, we might want to know that if we can Group a and Group b then we can Group (a, b) (more on this in a second). Henglein's core idea is that if you can start with some basic Groups on integers—we can write very fast Group Int32 implementations via radix sort—and then use combinators to extend them over all types then you will have generalized radix sort to algebraic data types.
So how might we build our combinator library?
Well, f :: Group a -> Group b -> Group (a, b) is pretty important in that it lets us make groups of product-like types. Normally, we'd get this from Applicative and liftA2 but Group, you'll notice, is Contravaiant, not a Functor.
So instead we use Divisible
divided :: Group a -> Group b -> Group (a, b)
Notice that this arises in a strange way from
divide :: (a -> (b, c)) -> Group b -> Group c -> Group a
as it has the typical "reversed arrow" character of contravariant things. We can now understand things like divide and conquer in terms of their interpretation on Group.
Divide says that if I want to build a strategy for equating as using strategies for equating bs and cs, I can do the following for any type x
Take your partial relation [(a, x)] and map over it with a function f :: a -> (b, c), and a little tuple manipulation, to get a new relation [(b, (c, x))].
Use my Group b to discriminate [(b, (c, x))] into [[(c, x)]]
Use my Group c to discriminate each [(c, x)] into [[x]] giving me [[[x]]]
Flatten the inner layers to get [[x]] like we need
instance Divisible Group where
conquer = Group $ return . fmap snd
divide k (Group l) (Group r) = Group $ \xs ->
-- a bit more cleverly done here...
l [ (b, (c, d)) | (a,d) <- xs, let (b, c) = k a] >>= r
We also get interpretations of the more tricky Decidable refinement of Divisible
class Divisible f => Decidable f where
lose :: (a -> Void) -> f a
choose :: (a -> Either b c) -> f b -> f c -> f a
instance Decidable Group where
lose :: (a -> Void) -> Group a
choose :: (a -> Either b c) -> Group b -> Group c -> Group a
These read as saying that for any type a of which we can guarantee there are no values (we cannot produce values of Void by any means, a function a -> Void is a means of producing Void given a, thus we must not be able to produce values of a by any means either!) then we immediately get a grouping of zero values
lose _ = Group (\_ -> [])
We also can go a similar game as to divide above except instead of sequencing our use of the input discriminators, we alternate.
Using these techniques we build up a library of "Groupable" things, namely Grouping
class Grouping a where
grouping :: Group a
and note that nearly all the definitions arise from the basic definition atop groupingNat which uses fast monadic vector manipuations to achieve an efficient radix sort.

Usefulness of "function arrows associate to the right"?

Reading http://www.seas.upenn.edu/~cis194/spring13/lectures/04-higher-order.html it states
In particular, note that function arrows associate to the right, that
is, W -> X -> Y -> Z is equivalent to W -> (X -> (Y -> Z)). We can
always add or remove parentheses around the rightmost top-level arrow
in a type.
Function arrows associate to the right but as function application associates to the left then what is usefulness of this information ? I feel I'm not understanding something as to me it is a meaningless point that function arrows associate to the right. As function application always associates to the left then this the only associativity I should be concerned with ?
Function arrows associate to the right but [...] what is usefulness of this information?
If you see a type signature like, for example, f : String -> Int -> Bool you need to know the associativity of the function arrow to understand what the type of f really is:
if the arrow associates to the left, then the type means (String -> Int) -> Bool, that is, f takes a function as argument and returns a boolean.
if the arrow associates to the right, then the type means String -> (Int -> Bool), that is, f takes a string as argument and returns a function.
That's a big difference, and if you want to use f, you need to know which one it is. Since the function arrow associates to the right, you know that it has to be the second option: f takes a string and returns a function.
Function arrows associate to the right [...] function application associates to the left
These two choices work well together. For example, we can call the f from above as f "answer" 42 which really means (f "answer") 42. So we are passing the string "answer" to f which returns a function. And then we're passing the number 42 to that function, which returns a boolean. In effect, we're almost using f as a function with two arguments.
This is the standard way of writing functions with two (or more) arguments in Haskell, so it is a very common use case. Because of the associativity of function application and of the function arrow, we can write this common use case without parentheses.
When defining a two-argument curried function, we usually write something like this:
f :: a -> b -> c
f x y = ...
If the arrow did not associate to the right, the above type would instead have to be spelled out as a -> (b -> c). So the usefulness of ->'s associativity is that it saves us from writing too many parentheses when declaring function types.
If an operator # is 'right associative', it means this:
a # b # c # d = a # (b # (c # d))
... for any number of arguments. It behaves like foldr
This means that:
a -> b -> c -> d = a -> (b -> (c -> d))
Note: a -> (b -> (c -> d)) =/= ((a -> b) -> c) -> d ! This is very important.
What this tells us is that, say, foldr:
λ> :t foldr
foldr :: (a -> b -> b) -> b -> [a] -> b
Takes a function of type (a -> b -> b), and then returns... a function that takes a b, and then returns... a function that takes a [a], and then returns... a b. This means that we can apply functions like this
f a b c
because
f a b c = ((f a) b) c
and f will return two functions each time an argument is given.
Essentially, this isn't very useful as such, but is important information for when we want to interpret and call function types.
However, in functions like (++), associativity matters. If (++) were left associative, it would be very slow, so it's right associative.
Early functional language Lisp suffered from excessively nested parenthesis (which make code (or even text (if you do not mind to consider a broader context)) difficult to read. With time functional language designers opted to make functional code easy to read and write for pros even at cost of confusing rookies with less uniform rules.
In functional code,
function type declaration like (String -> Int) -> Bool are much more rare than functions like String -> (Int -> Bool), because functions that return functions are trade mark of functional style. Thus associating arrows to right helps reduce parentheses number (on overage, you might need to map a function to a primitive type). For function applications it is vise-versa.
The main purposes is convenience, because partial function application goes from left to right.
Every time you partially apply a function to a set of values, the remaining type has to be valid.
You can think of arrow types as a queue of types, where the queue itself is a type. During partial function application, you dequeue as many types from the queue as the number of arguments, yielding whatever remains of the queue. The resulting queue is still a valid type.
This is why types associate to the right. If types associate to the left, it will behave like a stack, and you won't be able to partially apply it the same way without leaving "holes" or undefined domains. For instance, say you have the following function:
foo :: a -> b -> c -> d
If Haskell types were left-associative, then passing a single parameter to foo would yield the following invalid type:
((? -> b) -> c) -> d
You will then be forced to circumvent it by adding parentheses, which could hamper readability.

Monad theory and Haskell

Most tutorials seem to give a lot of examples of monads (IO, state, list and so on) and then expect the reader to be able to abstract the overall principle and then they mention category theory. I don't tend to learn very well by trying generalise from examples and I would like to understand from a theoretical point of view why this pattern is so important.
Judging from this thread:
Can anyone explain Monads?
this is a common problem, and I've tried looking at most of the tutorials suggested (except the Brian Beck videos which won't play on my linux machine):
Does anyone know of a tutorial that starts from category theory and explains IO, state, list monads in those terms? the following is my unsuccessful attempt to do so:
As I understand it a monad consists of a triple: an endo-functor and two natural transformations.
The functor is usually shown with the type:
(a -> b) -> (m a -> m b)
I included the second bracket just to emphasise the symmetry.
But, this is an endofunctor, so shouldn't the domain and codomain be the same like this?:
(a -> b) -> (a -> b)
I think the answer is that the domain and codomain both have a type of:
(a -> b) | (m a -> m b) | (m m a -> m m b) and so on ...
But I'm not really sure if that works or fits in with the definition of the functor given?
When we move on to the natural transformation it gets even worse. If I understand correctly a natural transformation is a second order functor (with certain rules) that is a functor from one functor to another one. So since we have defined the functor above the general type of the natural transformations would be:
((a -> b) -> (m a -> m b)) -> ((a -> b) -> (m a -> m b))
But the actual natural transformations we are using have type:
a -> m a
m a -> (a ->m b) -> m b
Are these subsets of the general form above? and why are they natural transformations?
Martin
A quick disclaimer: I'm a little shaky on category theory in general, while I get the impression you have at least some familiarity with it. Hopefully I won't make too much of a hash of this...
Does anyone know of a tutorial that starts from category theory and explains IO, state, list monads in those terms?
First of all, ignore IO for now, it's full of dark magic. It works as a model of imperative computations for the same reasons that State works for modelling stateful computations, but unlike the latter IO is a black box with no way to deduce the monadic structure from the outside.
The functor is usually shown with the type: (a -> b) -> (m a -> m b) I included the second bracket just to emphasise the symmetry.
But, this is an endofunctor, so shouldn't the domain and codomain be the same like this?:
I suspect you are misinterpreting how type variables in Haskell relate to the category theory concepts.
First of all, yes, that specifies an endofunctor, on the category of Haskell types. A type variable such as a is not anything in this category, however; it's a variable that is (implicitly) universally quantified over all objects in the category. Thus, the type (a -> b) -> (a -> b) describes only endofunctors that map every object to itself.
Type constructors describe an injective function on objects, where the elements of the constructor's codomain cannot be described by any means except as an application of the type constructor. Even if two type constructors produce isomorphic results, the resulting types remain distinct. Note that type constructors are not, in the general case, functors.
The type variable m in the functor signature, then, represents a one-argument type constructor. Out of context this would normally be read as universal quantification, but that's incorrect in this case since no such function can exist. Rather, the type class definition binds m and allows the definition of such functions for specific type constructors.
The resulting function, then, says that for any type constructor m which has fmap defined, for any two objects a and b and a morphism between them, we can find a morphism between the types given by applying m to a and b.
Note that while the above does, of course, define an endofunctor on Hask, it is not even remotely general enough to describe all such endofunctors.
But the actual natural transformations we are using have type:
a -> m a
m a -> (a ->m b) -> m b
Are these subsets of the general form above? and why are they natural transformations?
Well, no, they aren't. A natural transformation is roughly a function (not a functor) between functors. The two natural transformations that specify a monad M look like I -> M where I is the identity functor, and M ∘ M -> M where ∘ is functor composition. In Haskell, we have no good way of working directly with either a true identity functor or with functor composition. Instead, we discard the identity functor to get just (Functor m) => a -> m a for the first, and write out nested type constructor application as (Functor m) => m (m a) -> m a for the second.
The first of these is obviously return; the second is a function called join, which is not part of the type class. However, join can be written in terms of (>>=), and the latter is more often useful in day-to-day programming.
As far as specific monads go, if you want a more mathematical description, here's a quick sketch of an example:
For some fixed type S, consider two functors F and G where F(x) = (S, x) and G(x) = S -> x (It should hopefully be obvious that these are indeed valid functors).
These functors are also adjoints; consider natural transformations unit :: x -> G (F x) and counit :: F (G x) -> x. Expanding the definitions gives us unit :: x -> (S -> (S, x)) and counit :: (S, S -> x) -> x. The types suggest uncurried function application and tuple construction; feel free to verify that those work as expected.
An adjunction gives rise to a monad by composition of the functors, so taking G ∘ F and expanding the definition, we get G (F x) = S -> (S, x), which is the definition of the State monad. The unit for the adjunction is of course return; and you should be able to use counit to define join.
This page does exactly that. I think your main confusion is that the class doesn't really make the Type a functor, but it defines a functor from the category of Haskell types into the category of that type.
Following the notation of the link, assuming F is a Haskell Functor, it means that there is a functor from the category of Hask to the category of F.
Roughly speaking, Haskell does its category theory all in just one category, whose objects are Haskell types and whose arrows are functions between these types. It's definitely not a general-purpose language for modelling category theory.
A (mathematical) functor is an operation turning things in one category into things in another, possibly entirely different, category. An endofunctor is then a functor which happens to have the same source and target categories. In Haskell, a functor is an operation turning things in the category of Haskell types into other things also in the category of Haskell types, so it is always an endofunctor.
[If you're following the mathematical literature, technically, the operation '(a->b)->(m a -> m b)' is just the arrow part of the endofunctor m, and 'm' is the object part]
When Haskellers talk about working 'in a monad' they really mean working in the Kleisli category of the monad. The Kleisli category of a monad is a thoroughly confusing beast at first, and normally needs at least two colours of ink to give a good explanation, so take the following attempt for what it is and check out some references (unfortunately Wikipedia is useless here for all but the straight definitions).
Suppose you have a monad 'm' on the category C of Haskell types. Its Kleisli category Kl(m) has the same objects as C, namely Haskell types, but an arrow a ~(f)~> b in Kl(m) is an arrow a -(f)-> mb in C. (I've used a squiggly line in my Kleisli arrow to distinguish the two). To reiterate: the objects and arrows of the Kl(C) are also objects and arrows of C but the arrows point to different objects in Kl(C) than in C. If this doesn't strike you as odd, read it again more carefully!
Concretely, consider the Maybe monad. Its Kleisli category is just the collection of Haskell types, and its arrows a ~(f)~> b are functions a -(f)-> Maybe b. Or consider the (State s) monad whose arrows a ~(f)~> b are functions a -(f)-> (State s b) == a -(f)-> (s->(s,b)). In any case, you're always writing a squiggly arrow as a shorthand for doing something to the type of the codomain of your functions.
[Note that State is not a monad, because the kind of State is * -> * -> *, so you need to supply one of the type parameters to turn it into a mathematical monad.]
So far so good, hopefully, but suppose you want to compose arrows a ~(f)~> b and b ~(g)~> c. These are really Haskell functions a -(f)-> mb and b -(g)-> mc which you cannot compose because the types don't match. The mathematical solution is to use the 'multiplication' natural transformation u:mm->m of the monad as follows: a ~(f)~> b ~(g)~> c == a -(f)-> mb -(mg)-> mmc -(u_c)-> mc to get an arrow a->mc which is a Kleisli arrow a ~(f;g)~> c as required.
Perhaps a concrete example helps here. In the Maybe monad, you cannot compose functions f : a -> Maybe b and g : b -> Maybe c directly, but by lifting g to
Maybe_g :: Maybe b -> Maybe (Maybe c)
Maybe_g Nothing = Nothing
Maybe_g (Just a) = Just (g a)
and using the 'obvious'
u :: Maybe (Maybe c) -> Maybe c
u Nothing = Nothing
u (Just Nothing) = Nothing
u (Just (Just c)) = Just c
you can form the composition u . Maybe_g . f which is the function a -> Maybe c that you wanted.
In the (State s) monad, it's similar but messier: Given two monadic functions a ~(f)~> b and b ~(g)~> c which are really a -(f)-> (s->(s,b)) and b -(g)-> (s->(s,c)) under the hood, you compose them by lifting g into
State_s_g :: (s->(s,b)) -> (s->(s,(s->(s,c))))
State_s_g p s1 = let (s2, b) = p s1 in (s2, g b)
then you apply the 'multiplication' natural transformation u, which is
u :: (s->(s,(s->(s,c)))) -> (s->(s,c))
u p1 s1 = let (s2, p2) = p1 s1 in p2 s2
which (sort of) plugs the final state of f into the initial state of g.
In Haskell, this turns out to be a bit of an unnatural way to work so instead there's the (>>=) function which basically does the same thing as u but in a way that makes it easier to implement and use. This is important: (>>=) is not the natural transformation 'u'. You can define each in terms of the other, so they're equivalent, but they're not the same thing. The Haskell version of 'u' is written join.
The other thing missing from this definition of Kleisli categories is the identity on each object: a ~(1_a)~> a which is really a -(n_a)-> ma where n is the 'unit' natural transformation. This is written return in Haskell, and doesn't seem to cause as much confusion.
I learned category theory before I came to Haskell, and I too have had difficulty with the mismatch between what mathematicians call a monad and what they look like in Haskell. It's no easier from the other direction!
Not sure I understand what was the question but yes, you are right, monad in Haskell is defined as a triple:
m :: * -> * -- this is endofunctor from haskell types to haskell types!
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
but common definition from category theory is another triple:
m :: * -> *
return :: a -> m a
join :: m (m a) -> m a
It is slightly confusing but it's not so hard to show that these two definitions are equal.
To do that we need to define join in terms of (>>=) (and vice versa).
First step:
join :: m (m a) -> m a
join x = ?
This gives us x :: m (m a).
All we can do with something that have type m _ is to aply (>>=) to it:
(x >>=) :: (m a -> m b) -> m b
Now we need something as a second argument for (>>=), and also,
from the type of join we have constraint (x >>= y) :: ma.
So y here will have type y :: ma -> ma and id :: a -> a fits it very well:
join x = x >>= id
The other way
(>>=) :: ma -> (a -> mb) -> m b
(>>=) x y = ?
Where x :: m a and y :: a -> m b.
To get m b from x and y we need something of type a.
Unfortunately, we can't extract a from m a. But we can substitute it for something else (remember, monad is a functor also):
fmap :: (a -> b) -> m a -> m b
fmap y x :: m (m b)
And it's perfectly fits as argument for join: (>>=) x y = join (fmap y x).
The best way to look at monads and computational effects is to start with where Haskell got the notion of monads for computational effects from, and then look at Haskell after you understand that. See this paper in particular: Notions of Computation and Monads, by E. Moggi.
See also Moggi's earlier paper which shows how monads work for the lambda calculus alone: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.26.2787
The fact that monads capture substitution, among other things (http://blog.sigfpe.com/2009/12/where-do-monads-come-from.html), and substitution is key to the lambda calculus, should give a good clue as to why they have so much expressive power.
While monads originally came from category theory, this doesn't mean that category theory is the only abstract context in which you can view them. A different viewpoint is given by operational semantics. For an introduction, have a look at my Operational Monad Tutorial.
One way to look at IO is to consider it as a strange kind of state monad. Remember that the state monad looks like:
data State s a = State (s -> (s, a))
where the "s" argument is the data type you want to thread through the computation. Also, this version of "State" doesn't have "get" and "put" actions and we don't export the constructor.
Now imagine a type
data RealWorld = RealWorld ......
This has no real definition, but notionally a value of type RealWorld holds the state of the entire universe outside the computer. Of course we can never have a value of type RealWorld, but you can imagine something like:
getChar :: RealWorld -> (RealWorld, Char)
In other words the "getChar" function takes a state of the universe before the keyboard button has been pressed, and returns the key pressed plus the state of the universe after the key has been pressed. Of course the problem is that the previous state of the world is still available to be referenced, which can't happen in reality.
But now we write this:
type IO = State RealWorld
getChar :: IO Char
Notionally, all we have done is wrap the previous version of "getChar" as a state action. But by doing this we can no longer access the "RealWorld" values because they are wrapped up inside the State monad.
So when a Haskell program wants to change a lightbulb it takes hold of the bulb and applies a "rotate" function to the RealWorld value inside IO.
For me, so far, the explanation that comes closest to tie together monads in category theory and monads in Haskell is that monads are a monid whose objects have the type a->m b. I can see that these objects are very close to an endofunctor and so the composition of such functions are related to an imperative sequence of program statements. Also functions which return IO functions are valid in pure functional code until the inner function is called from outside.
This id element is 'a -> m a' which fits in very well but the multiplication element is function composition which should be:
(>=>) :: Monad m => (a -> m b) -> (b -> m c) -> (a -> m c)
This is not quite function composition, but close enough (I think to get true function composition we need a complementary function which turns m b back into a, then we get function composition if we apply these in pairs?), I'm not quite sure how to get from that to this:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
I've got a feeling I may have seen an explanation of this in all the stuff that I read, without understanding its significance the first time through, so I will do some re-reading to try to (re)find an explanation of this.
The other thing I would like to do is link together all the different category theory explanations: endofunctor+2 natural transformations, Kleisli category, a monoid whose objects are monoids and so on. For me the thing that seems to link all these explanations is that they are two level. That is, normally we treat category objects as black-boxes where we imply their properties from their outside interactions, but here there seems to be a need to go one level inside the objects to see what’s going on? We can explain monads without this but only if we accept apparently arbitrary constructions.
Martin
See this question: is chaining operations the only thing that the monad class solves?
In it, I explain my vision that we must differentiate between the Monad class and individual types that solve individual problems. The Monad class, by itself, only solve the important problem of "chaining operations with choice" and mades this solution available to types being instance of it (by means of "inheritance").
On the other hand, if a given type that solves a given problem faces the problem of "chaining operations with choice" then, it should be made an instance (inherit) of the Monad class.
The fact is that problems not get solved merely by being a Monad. It would be like saying that "wheels" solve many problems, but actually "wheels" only solve a problem, and things with wheels solve many different problems.

Resources