Pass more than 1 parameter to monad - haskell

I'm learning Haskell and making up some examples. I'm not sure why the second example doesn't work
foo :: Int -> Int -> Maybe Int
foo 0 0 = Nothing
foo a b = Just $ a + b
bar :: Int -> Maybe Int
bar 0 = Nothing
bar a = Just $ a + 1
-- This works
Just 4 >>= bar
-- Why doesn't this work?
(Just 4 Just 4) >>= foo
-- This works
do
  a <- Just 3
  b <- Just 4
  foo a b

As the comment says, (Just 4 Just 4) tries to apply the constructor Just to 3 arguments when it only takes one. So, I will assume that you wanted something like (Just 4, Just 4), and want it to work like your final example.
The type of the "bind" operator is (>>=) :: Monad m => m a -> (a -> m b) -> m b. This means that the function expected after the operator takes only one argument, not two. So, again, the ultimate reason it doesn't work is that your function takes the wrong number of arguments. (Partial application means that you don't have to provide all the arguments at once, but it sounds like you're expecting some other piece of data to be magically routed to the missing argument...)
Desugaring your do example to >>= form translates as:
Just 3 >>= \a -> Just 4 >>= \b -> foo a b
To make this a little clearer, I'll parenthesize the lambdas:
Just 3 >>= ( \a -> Just 4 >>= (\b -> foo a b) )
That makes it easier to see that you can simplify the inner lambda:
Just 3 >>= ( \a -> Just 4 >>= foo a )
So, it's possible after all to route the missing data to the extra argument! But, you do have to work out the routing yourself...
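To check that the three spellings above really are the same computation, here is a small sketch that reuses foo from the question; the names viaDo, viaBind and viaPartial are made up for illustration:

```haskell
foo :: Int -> Int -> Maybe Int
foo 0 0 = Nothing
foo a b = Just (a + b)

-- do-notation, as in the question
viaDo :: Maybe Int
viaDo = do
  a <- Just 3
  b <- Just 4
  foo a b

-- the same computation with explicit binds
viaBind :: Maybe Int
viaBind = Just 3 >>= \a -> Just 4 >>= \b -> foo a b

-- inner lambda replaced by the partial application `foo a`
viaPartial :: Maybe Int
viaPartial = Just 3 >>= \a -> Just 4 >>= foo a
```

All three evaluate to Just 7. In each case the function to the right of >>= takes exactly one argument; the second argument of foo is supplied by the enclosing lambda.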
There's nothing particularly magical about Haskell functions; they just tend to be more particular about how they're called than functions in dynamic languages. The biggest "magic" here is that the type checker can often tell when you're not using them correctly.
And (as the other answer notes) there is nothing magical about >>= -- it's just another function, and in order to understand how to use it, you need to take a look at its type.

It doesn't work because >>= is a perfectly normal operator (and operators are perfectly normal functions).
You seem to be thinking of >>= as special syntax for getting values out of the monadic value on its left and feeding it to the function on the right. It is not special syntax; rather >>= itself is a function that gets applied to the values on its left and its right (and then computes a result as you expect).
However, that means that the left and right arguments must be valid expressions for things that could exist as ordinary values; things you could simply bind to variables with var = <expr> syntax. Just 4 >>= bar works because (among other requirements) Just 4 on its own is a valid expression of type Maybe Int and bar is a valid expression of type Int -> Maybe Int. Just 4 Just 4 >>= foo doesn't work because Just 4 Just 4 is not a correct expression (what would its type be?); it's interpreted as applying Just to the 3 separate arguments 4, Just, and 4, whereas you want it to be two separate values Just 4 and Just 4. But even if you could get the compiler to interpret something there as two separate values, there's no way for >>= to be passed two separate values as its left argument; it's expecting (in this usage) a single value of type Maybe Int.
If you have a function like foo that needs two arguments and you want to source those arguments from values that are in a monadic context, then you can't just apply >>=; you need to write code that does that (like your final example with the do block; there are many other ways to do something equivalent).

The other answers described why this doesn't work. But IMO it's quite reasonable that you want this, and indeed Just 3 >>= \x -> Just 4 >>= \y -> foo x y is a bit of a silly solution to the task. Basically, the x and y values are independent of each other, yet you're fetching them sequentially, in a way that the complete y calculation could in principle depend on the value of x.
Monads aren't really the right abstraction here; they're too strong. To get x and y non-sequentially, you can use the Applicative interface. The form that most Haskellers prefer nowadays (I think) is
foo <$> Just 3 <*> Just 4
You can read this as “zip the effectful values Just 3 and Just 4 together to a single action with two values, then apply foo over those values”.
...Actually that's not really how it works though, and for me that was super confusing when I first learned about applicatives. Namely, the above expression is in fact parsed as
(foo <$> Just 3) <*> Just 4
which again looks sequential. But it's not; what's going on here is only a currying/laziness trick to pass multiple values through the applicative value without having to group them into a suitable tuple. The code that literally works the way I explained it would be
uncurry foo <$> ((,)<$>Just 3<*>Just 4)
Here, (,)<$>Just 3<*>Just 4 evaluates to Just (3,4). Then fmapping foo over that needs to be done in uncurried form, so the two arguments are accepted as a tuple. It's structurally clear, yet awkward because we're working against Haskell's curried style.
(Mathematically, this tupling is what's conceptually happening though: generally speaking, you're working in a monoidal category. Some other incarnations of applicative functors have such a tupling combinator as their underlying interface, instead of <*>; e.g. >*< from the invertible package.)
The trick with foo<$>Just 3<*>Just 4 is that instead of building a tuple, we start by partially applying foo to the 3 result. This doesn't actually require anything applicative/monadic yet – we're basically just transforming the contained value – in general: values – from 3 to foo 3, without touching their context. You may consider this a purely symbolic operation. Note that the type is Maybe (Int -> Maybe Int) at this point.
Then you use the <*> combinator to zip both of the Maybe contexts together, and simultaneously apply the foo 3 partially-evaluated function to its second argument.
I personally like this form, which is also equivalent:
liftA2 foo (Just 3) (Just 4)
We're not finished yet though: all the above suggestions give a result of type Maybe (Maybe Int). To flatten that into a Maybe Int, that's where you actually need the monad interface. One option is
join $ foo <$> Just 3 <*> Just 4
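To see the nesting and the flattening side by side, here is a sketch using foo from the question. bind2 is a hypothetical helper name chosen for this example, not a library function:

```haskell
import Control.Applicative (liftA2)
import Control.Monad (join)

foo :: Int -> Int -> Maybe Int
foo 0 0 = Nothing
foo a b = Just (a + b)

-- Applicative alone leaves a nested Maybe, because foo itself returns a Maybe.
nested :: Maybe (Maybe Int)
nested = foo <$> Just 3 <*> Just 4

-- join collapses the two layers into one.
flattened :: Maybe Int
flattened = join nested

-- bind2 (hypothetical name) packages the pattern for any two-argument
-- monadic function.
bind2 :: Monad m => (a -> b -> m c) -> m a -> m b -> m c
bind2 f ma mb = join (liftA2 f ma mb)
```

Here nested is Just (Just 7) and flattened is Just 7, while bind2 foo (Just 0) (Just 0) and bind2 foo Nothing (Just 4) both give Nothing.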

How to return a pure value from an impure method

I know it must sound trivial but I was wondering how you can unwrap a value from a functor and return it as pure value?
I have tried:
f::IO a->a
f x=(x>>=)
f= >>=
What should I place in the right side? I can't use return since it will wrap it back again.
It's a frequently asked question: How do I extract 'the' value from my monad, not only in Haskell, but in other languages as well. I have a theory about why this question keeps popping up, so I'll try to answer according to that; I hope it helps.
Containers of single values
You can think of a functor (and therefore also a monad) as a container of values. This is most palpable with the (redundant) Identity functor:
Prelude Control.Monad.Identity> Identity 42
Identity 42
This is nothing but a wrapper around a value, in this case 42. For this particular container, you can extract the value, because it's guaranteed to be there:
Prelude Control.Monad.Identity> runIdentity $ Identity 42
42
While Identity seems fairly useless, you can find other functors that seem to wrap a single value. In F#, for example, you'll often encounter containers like Async<'a> or Lazy<'a>, which are used to represent asynchronous or lazy computations (Haskell doesn't need the latter, because it's lazy by default).
You'll find lots of other single-value containers in Haskell, such as Sum, Product, Last, First, Max, Min, etc. Common to all of those is that they wrap a single value, which means that you can extract the value.
I think that when people first encounter functors and monads, they tend to think of the concept of a data container in this way: as a container of a single value.
Containers of optional values
Unfortunately, some common monads in Haskell seem to support that idea. For example, Maybe is a data container as well, but one that can contain zero or one value. You can, unfortunately, still extract the value if it's there:
Prelude Data.Maybe> fromJust $ Just 42
42
The problem with this is that fromJust isn't total, so it'll crash if you call it with a Nothing value:
Prelude Data.Maybe> fromJust Nothing
*** Exception: Maybe.fromJust: Nothing
You can see the same sort of problem with Either. Although I'm not aware of a built-in partial function to extract a Right value, you can easily write one with pattern matching (if you ignore the compiler warning):
extractRight :: Either l r -> r
extractRight (Right x) = x
Again, it works in the 'happy path' scenario, but can just as easily crash:
Prelude> extractRight $ Right 42
42
Prelude> extractRight $ Left "foo"
*** Exception: <interactive>:12:1-26: Non-exhaustive patterns in function extractRight
Still, since functions like fromJust exist, I suppose it tricks people new to the concept of functors and monads into thinking about them as data containers from which you can extract a value.
When you encounter something like IO Int for the first time, then, I can understand why you'd be tempted to think of it as a container of a single value. In a sense, it is, but in another sense, it isn't.
Containers of multiple values
Even with lists, you can (attempt to) extract 'the' value from a list:
Prelude> head [42..1337]
42
Still, it could fail:
Prelude> head []
*** Exception: Prelude.head: empty list
At this point, however, it should be clear that attempting to extract 'the' value from any arbitrary functor is nonsense. A list is a functor, but it contains an arbitrary number of values, including zero and infinitely many.
What you can always do, though, is to write functions that take a 'contained' value as input and return another value as output. Here's an arbitrary example of such a function:
countAndMultiply :: Foldable t => (t a, Int) -> Int
countAndMultiply (xs, factor) = length xs * factor
While you can't 'extract the value' out of a list, you can apply your function to each of the values in a list:
Prelude> fmap countAndMultiply [("foo", 2), ("bar", 3), ("corge", 2)]
[6,9,10]
Since IO is a functor, you can do the same with it as well:
Prelude> foo = return ("foo", 2) :: IO (String, Int)
Prelude> :t foo
foo :: IO (String, Int)
Prelude> fmap countAndMultiply foo
6
The point is that you don't extract a value from a functor, you step into the functor.
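As a compiled-program counterpart to the GHCi session, here is a minimal sketch. fetchPair, computed and report are hypothetical names, and fetchPair merely stands in for a real impure source such as a file or the network:

```haskell
countAndMultiply :: Foldable t => (t a, Int) -> Int
countAndMultiply (xs, factor) = length xs * factor

-- A stand-in for some impure source of data (hypothetical).
fetchPair :: IO (String, Int)
fetchPair = return ("foo", 2)

-- The pure function is applied inside IO; the result stays in IO.
computed :: IO Int
computed = fmap countAndMultiply fetchPair

-- The value is only ever consumed from within another IO action.
report :: IO String
report = do
  n <- computed
  return ("result: " ++ show n)
```

Nothing here has type IO a -> a: the Int is named n only inside the do block, and whatever is built from it is again an IO value.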
Monad
Sometimes, the function you apply to a functor returns a value that's already wrapped in the same data container. As an example, you may have a function that splits a string over a particular character. To keep things simple, let's just look at the built-in function words that splits a string into words:
Prelude> words "foo bar"
["foo","bar"]
If you have a list of strings, and apply words to each, you'll get a nested list:
Prelude> fmap words ["foo bar", "baz qux"]
[["foo","bar"],["baz","qux"]]
The result is a nested data container, in this case a list of lists. You can flatten it with join:
Prelude Control.Monad> join $ fmap words ["foo bar", "baz qux"]
["foo","bar","baz","qux"]
This is the original definition of a monad: it's a functor that you can flatten. In modern Haskell, Monad is defined by bind (>>=), from which one can derive join, but it's also possible to derive >>= from join.
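The two derivations can be written down directly. This is only a sketch; bindViaJoin and joinViaBind are made-up names:

```haskell
import Control.Monad (join)

-- (>>=) written in terms of fmap and join
bindViaJoin :: Monad m => m a -> (a -> m b) -> m b
bindViaJoin m f = join (fmap f m)

-- join written in terms of (>>=)
joinViaBind :: Monad m => m (m a) -> m a
joinViaBind mm = mm >>= id
```

For example, bindViaJoin ["foo bar", "baz qux"] words gives ["foo","bar","baz","qux"], the same as the join $ fmap words ... expression above.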
IO as all values
At this point, you may be wondering: what does that have to do with IO? Isn't IO a a container of a single value of the type a?
Not really. One interpretation of IO is that it's a container that holds an arbitrary value of the type a. According to that interpretation, it's analogous to the many-worlds interpretation of quantum mechanics. IO a is the superposition of all possible values of the type a.
In Schrödinger's original thought experiment, the cat in the box is both alive and dead until observed. That's two possible states superimposed. If we think about a variable called catIsAlive, it would be equivalent to the superposition of True and False. So, you can think of IO Bool as a set of possible values {True, False} that will only collapse into a single value when observed.
Likewise, IO Word8 can be interpreted as a superposition of the set of all possible Word8 values, i.e. {0, 1, 2, ..., 255}, IO Int as the superposition of all possible Int values, IO String as all possible String values (i.e. an infinite set), and so on.
So how do you observe the value, then?
You don't extract it, you work within the data container. You can, as shown above, fmap and join over it. So, you can write your application as pure functions that you then compose with impure values with fmap, >>=, join, and so on.
It is trivial, so this will be a long answer. In short, the problem lies in the signature: IO a -> a is not a type that you can properly implement in Haskell. This has less to do with IO being a functor than with the fact that IO is special.
For some functors you can recover the pure value. For instance a partially applied pair, (,) a, is a functor. We unwrap the value via snd.
snd :: (a,b) -> b
snd (_,b) = b
So this is a functor that we can unwrap to a pure value, but this really has nothing to do with being a functor. It has more to do with pairs belonging to a different Category Theoretic concept, Comonad, with:
extract :: Comonad w => w a -> a
Any Comonad will be a functor for which you can recover the pure value.
Many (non-comonadic) functors have--let's say "evaluators"--which allow something like what is being asked. For instance, we can evaluate a Maybe with fromMaybe :: a -> Maybe a -> a. By providing a default d, fromMaybe d has the desired type, Maybe a -> a. Another useful example from State, evalState :: State s a -> s -> a, has its arguments reversed but the concept is the same; given the monadic value, State s a, and an initial state, s, we unwrap the pure value, a.
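A few such "evaluators" from base, as a sketch (State is left out because it lives outside base; the names on the left are made up for illustration):

```haskell
import Data.Functor.Identity (Identity (..))
import Data.Maybe (fromMaybe)

-- Identity always holds exactly one value, so extraction is total.
fromIdentity :: Int
fromIdentity = runIdentity (Identity 42)

-- The pair functor ((,) a) always has a second component.
fromPair :: Int
fromPair = snd ("label", 42)

-- Maybe needs a default to cover the Nothing case.
withDefault :: Maybe Int -> Int
withDefault = fromMaybe 0
```

Each of these works only because the specific type offers a way out; none of them is available through Functor or Monad alone.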
Finally to the specifics of IO. No "evaluator" for IO is provided in the Haskell language or libraries. We might consider running the program itself an evaluator--much in the same vein of evalState. But if that's a valid conceptual move, then it should only help to convince you that there is no sane way to unwrap from IO--any program written is just the IO a input to its evaluator function.
Instead, what you are forced to do--by design--is to work within the IO monad. For instance, if you have a pure function, f :: a -> b, you apply it within the IO context via, fmap f :: IO a -> IO b
TL;DR You can't get a pure value out of the IO monad. Apply pure functions within the IO context, for instance by fmap

Memoizing multiplication

My application multiplies vectors after a (costly) conversion using an FFT. As a result, when I write
f :: (Num a) => a -> [a] -> [a]
f c xs = map (c*) xs
I only want to compute the FFT of c once, rather than for every element of xs. There really isn't any need to store the FFT of c for the entire program, just in the local scope.
I attempted to define my Num instance like:
data Foo = Scalar c
| Vec Bool v -- the bool indicates which domain v is in
instance Num Foo where
(*) (Scalar c) = \x -> case x of
Scalar d -> Scalar (c*d)
Vec b v-> Vec b $ map (c*) v
(*) v1 = let Vec True v = fft v1
in \x -> case x of
Scalar d -> Vec True $ map (c*) v
v2 -> Vec True $ zipWith (*) v (fft v2)
Then, in an application, I call a function similar to f (which works on arbitrary Nums) where c=Vec False v, and I expected that this would be just as fast as if I hack f to:
g :: Foo -> [Foo] -> [Foo]
g c xs = let c' = fft c
in map (c'*) xs
The function g makes the memoization of fft c occur, and is much faster than calling f (no matter how I define (*)). I don't understand what is going wrong with f. Is it my definition of (*) in the Num instance? Does it have something to do with f working over all Nums, and GHC therefore being unable to figure out how to partially compute (*)?
Note: I checked the core output for my Num instance, and (*) is indeed represented as nested lambdas with the FFT conversion in the top level lambda. So it looks like this is at least capable of being memoized. I have also tried both judicious and reckless use of bang patterns to attempt to force evaluation to no effect.
As a side note, even if I can figure out how to make (*) memoize its first argument, there is still another problem with how it is defined: A programmer wanting to use the Foo data type has to know about this memoization capability. If she wrote
map (*c) xs
no memoization would occur. (It must be written as (map (c*) xs)) Now that I think about it, I'm not entirely sure how GHC would rewrite the (*c) version since I have curried (*). But I did a quick test to verify that both (*c) and (c*) work as expected: (c*) makes c the first arg to *, while (*c) makes c the second arg to *. So the problem is that it is not obvious how one should write the multiplication to ensure memoization. Is this just an inherent downside to the infix notation (and the implicit assumption that the arguments to * are symmetric)?
The second, less pressing issue is the case where we map (v*) onto a list of scalars. In this case, (hopefully) the fft of v would be computed and stored, even though it is unnecessary since the other multiplicand is a scalar. Is there any way around this?
Thanks
I believe the stable-memo package could solve your problem. It memoizes values not by equality but by reference identity:
Whereas most memo combinators memoize based on equality, stable-memo does it based on whether the exact same argument has been passed to the function before (that is, is the same argument in memory).
And it automatically drops memoized values when their keys are garbage collected:
stable-memo doesn't retain the keys it has seen so far, which allows them to be garbage collected if they will no longer be used. Finalizers are put in place to remove the corresponding entries from the memo table if this happens.
So if you define something like
fft = memo fft'
where fft' = ... -- your old definition
you'll get pretty much what you need: Calling map (c *) xs will memoize the computation of fft inside the first call to (*) and it gets reused on subsequent calls to (c *). And if c is garbage collected, so is fft' c.
See also this answer to How to add fields that only cache something to ADT?
I can see two problems that might prevent memoization:
First, f has an overloaded type and works for all Num instances. So f cannot use memoization unless it is either specialized (which usually requires a SPECIALIZE pragma) or inlined (which may happen automatically, but is more reliable with an INLINE pragma).
Second, the definition of (*) for Foo performs pattern matching on the first argument, but f multiplies with an unknown c. So within f, even if specialized, no memoization can occur. Once again, it very much depends on f being inlined, and a concrete argument for c to be supplied, so that inlining can actually appear.
So I think it'd help to see how exactly you're calling f. Note that if f is defined using two arguments, it has to be given two arguments, otherwise it cannot be inlined. It would furthermore help to see the actual definition of Foo, as the one you are giving mentions c and v which aren't in scope.
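For reference, here is the sharing pattern in isolation, on a monomorphic toy example. This is only a sketch: expensive is a hypothetical stand-in for the FFT conversion, and all the names are invented. It shows the shape that g in the question has, not what GHC will do to the overloaded f:

```haskell
-- Hypothetical stand-in for the costly conversion (the FFT in the question).
expensive :: Int -> Int
expensive c = sum [1 .. c]

-- The conversion is bound outside the lambda, so a single partial
-- application `mulShared c` can reuse it for every later argument.
mulShared :: Int -> Int -> Int
mulShared c = let c' = expensive c in \x -> c' * x

-- The partial application is named once and then mapped over the list,
-- so `expensive c` is evaluated at most once per call to scaleAll.
scaleAll :: Int -> [Int] -> [Int]
scaleAll c xs = let m = mulShared c in map m xs
```

With these definitions scaleAll 4 [1, 2, 3] is [10, 20, 30]. Whether the same sharing survives when the multiplication goes through a Num dictionary inside a polymorphic f depends on specialization and inlining, as described above.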

About value in context (applied in Monad)

I have a small question about value in context.
Take Just 'a', so the value in context of type Maybe in this case is 'a'
Take [3], so value in context of type [a] in this case is 3
And if you apply the monad for [3] like this: [3] >>= \x -> [x+3], it means you assign x with value 3. It's ok.
But now, take [3,2], so what is the value in the context of type [a]?. And it's so strange that if you apply monad for it like this:
[3,4] >>= \x -> x+3
It got the correct answer [6,7], but actually we don't understand what is x in this case. You can answer, ah x is 3 and then 4, and x feeds the function 2 times and concat as Monad does: concat (map f xs) like this:
[3,4] >>= concat (map f x)
So in this case, [3,4] will be assigned to the x. It means wrong, because [3,4] is not a value. Monad is wrong.
I think your problem is focusing too much on the values. A monad is a type constructor, and as such not concerned with how many and what kinds of values there are, but only the context.
A Maybe a can be an a, or nothing. Easy, and you correctly observed that.
An Either String a is either some a, or alternatively some information in form of a String (e.g. why the calculation of a failed).
Finally, [a] is an unknown number of as (or none at all), that may have resulted from an ambiguous computation, or one giving multiple results (like a quadratic equation).
Now, for the interpretation of (>>=), it is helpful to know that the essential property of a monad (how it is defined by category theorists) is
join :: m (m a) -> m a.
Together with fmap, (>>=) can be written in terms of join.
What join means is the following: A context, put in the same context again, still has the same resulting behavior (for this monad).
This is quite obvious for Maybe (Maybe a): Something can essentially be Just (Just x), or Nothing, or Just Nothing, which provides the same information as Nothing. So, instead of using Maybe (Maybe a), you could just have Maybe a and you wouldn't lose any information. That's what join does: it converts to the "easier" context.
[[a]] is somewhat more difficult, but not much. You essentially have multiple/ambiguous results out of multiple/ambiguous results. A good example is the roots of a fourth-degree polynomial, found by solving a quadratic equation. You first get two solutions, and out of each you can find two others, resulting in four roots.
But the point is, it doesn't matter if you speak of an ambiguous ambiguous result, or just an ambiguous result. You could just always use the context "ambiguous", and transform multiple levels with join.
And here comes what (>>=) does for lists: it applies ambiguous functions to ambiguous values:
squareRoots :: Complex -> [Complex]
fourthRoots num = squareRoots num >>= squareRoots
can be rewritten as
fourthRoots num = join $ squareRoots `fmap` (squareRoots num)
-- [1,-1,i,-i] <- [[1,-1],[i,-i]] <- [1,-1] <- 1
since all you have to do is to find all possible results for each possible value.
This is why join is concat for lists, and in fact
m >>= f == join (fmap f m)
must hold in any monad.
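A runnable variant of the roots example, as a sketch. To stay within base it uses real square roots over Double instead of complex numbers, which is an assumption of this example (so negative inputs simply have no roots):

```haskell
import Control.Monad (join)

-- Real square roots only (an assumption made for this sketch):
-- negative inputs have none, zero has one, positive inputs have two.
squareRoots :: Double -> [Double]
squareRoots x
  | x < 0     = []
  | x == 0    = [0]
  | otherwise = [r, negate r]
  where
    r = sqrt x

-- Ambiguous function applied to an ambiguous value, via (>>=).
fourthRoots :: Double -> [Double]
fourthRoots num = squareRoots num >>= squareRoots

-- The same thing spelled with fmap and join.
fourthRoots' :: Double -> [Double]
fourthRoots' num = join (fmap squareRoots (squareRoots num))
```

For 16, the inner step gives [4, -4]; 4 contributes [2, -2] and -4 contributes nothing, so both versions return [2.0, -2.0].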
A similar interpretation can be given to IO. A computation with side-effects, which can also have side-effects (IO (IO a)), is in essence just something with side-effects.
You have to take the word "context" quite broadly.
A common way of interpreting a list of values is that it represents an indeterminate value, so [3,4] represents a value which is three or four, but we don't know which (perhaps we just know it's a solution of x^2 - 7x + 12 = 0).
If we then apply f to that, we know it's 6 or 7 but we still don't know which.
Another example of an indeterminate value that you're more used to is 3. It could mean 3::Int or 3::Integer or even sometimes 3.0::Double. It feels easier because there's only one symbol representing the indeterminate value, whereas in a list, all the possibilities are listed (!).
If you write
asum = do
  x <- [10,20]
  y <- [1,2]
  return (x+y)
You'll get a list with four possible answers: [11,12,21,22]
That's one for each of the possible ways you could add x and y.
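The same computation can be spelled three ways, which may make it clearer what x and y range over. The names below are made up for this sketch:

```haskell
-- do-notation in the list monad
sumsDo :: [Int]
sumsDo = do
  x <- [10, 20]
  y <- [1, 2]
  return (x + y)

-- the do block desugared into explicit binds
sumsBind :: [Int]
sumsBind = [10, 20] >>= \x -> [1, 2] >>= \y -> return (x + y)

-- the equivalent list comprehension
sumsComprehension :: [Int]
sumsComprehension = [x + y | x <- [10, 20], y <- [1, 2]]
```

All three produce [11,12,21,22]; in each, x is a single number at a time, never the whole list.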
It is not the values that are in the context, it's the types.
Just 'a' :: Maybe Char --- Char is in a Maybe context.
[3, 2] :: [Int] --- Int is in a [] context.
Whether there is one, none or many of the a in the m a is beside the point.
Edit: Consider the type of (>>=) :: Monad m => m a -> (a -> m b) -> m b.
You give the example Just 3 >>= (\x->Just(4+x)). But consider Nothing >>= (\x->Just(4+x)). There is no value in the context. But the type is in the context all the same.
It doesn't make sense to think of x as necessarily being a single value. x has a single type. If we are dealing with the Identity monad, then x will be a single value, yes. If we are in the Maybe monad, x may be a single value, or it may never be a value at all. If we are in the list monad, x may be a single value, or not be a value at all, or be various different values... but what it is not is the list of all those different values.
Your other example --- [2, 3] >>= (\x -> x + 3) --- [2, 3] is not passed to the function. [2, 3] + 3 would have a type error. 2 is passed to the function. And so is 3. The function is invoked twice, gives results for both those inputs, and the results are combined by the >>= operator. [2, 3] is not passed to the function.
"context" is one of my favorite ways to think about monads. But you've got a slight misconception.
Take Just 'a', so the value in context of type Maybe in this case is 'a'
Not quite. You keep saying the value in context, but there is not always a value "inside" a context, or if there is, then it is not necessarily the only value. It all depends on which context we are talking about.
The Maybe context is the context of "nullability", or potential absence. There might be something there, or there might be Nothing. There is no value "inside" of Nothing. So the maybe context might have a value inside, or it might not. If I give you a Maybe Foo, then you cannot assume that there is a Foo. Rather, you must assume that it is a Foo inside the context where there might actually be Nothing instead. You might say that something of type Maybe Foo is a nullable Foo.
Take [3], so value in context of type [a] in this case is 3
Again, not quite right. A list represents a nondeterministic context. We're not quite sure what "the value" is supposed to be, or if there is one at all. In the case of a singleton list, such as [3], then yes, there is just one. But one way to think about the list [3,4] is as some unobservable value which we are not quite sure what it is, but we are certain that it is 3 or that it is 4. You might say that something of type [Foo] is a nondeterministic Foo.
[3,4] >>= \x -> x+3
This is a type error; not quite sure what you meant by this.
So in this case, [3,4] will be assigned to the x. It means wrong, because [3,4] is not a value. Monad is wrong.
You totally lost me here. Each instance of Monad has its own implementation of >>= which defines the context that it represents. For lists, the definition is
(xs >>= f) = (concat (map f xs))
You may want to learn about Functor and Applicative operations, which are related to the idea of Monad, and might help clear some confusion.
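Here is a version of the example from the question that does type-check, as a sketch. bindList is a hypothetical name that spells out the list definition of >>= quoted above:

```haskell
-- bindList (hypothetical name) mirrors the list instance of (>>=).
bindList :: [a] -> (a -> [b]) -> [b]
bindList xs f = concat (map f xs)

-- The function must return a list, so the result is wrapped in [..].
shifted :: [Int]
shifted = [3, 4] >>= \x -> [x + 3]

-- A function may also return several results (or none) per element.
expanded :: [Int]
expanded = [3, 4] `bindList` \x -> [x, x * 10]
```

shifted is [6,7] and expanded is [3,30,4,40]. The lambda is called once with 3 and once with 4, and the resulting lists are concatenated.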

What are the benefits of currying?

I don't think I quite understand currying, since I'm unable to see any massive benefit it could provide. Perhaps someone could enlighten me with an example demonstrating why it is so useful. Does it truly have benefits and applications, or is it just an over-appreciated concept?
(There is a slight difference between currying and partial application, although they're closely related; since they're often mixed together, I'll deal with both terms.)
The place where I realized the benefits first was when I saw sliced operators:
incElems = map (+1)
--non-curried equivalent: incElems = (\elems -> map (\i -> (+) 1 i) elems)
IMO, this is totally easy to read. Now, if the type of (+) were (Int,Int) -> Int *, which is the uncurried version, it would (counter-intuitively) result in an error -- but curried, it works as expected, and has type [Int] -> [Int].
You mentioned C# lambdas in a comment. In C#, you could have written incElems like so, given a function plus:
var incElems = xs => xs.Select(x => plus(1,x))
If you're used to point-free style, you'll see that the x here is redundant. Logically, that code could be reduced to
var incElems = xs => xs.Select(curry(plus)(1))
which is awful due to the lack of automatic partial application with C# lambdas. And that's the crucial point to decide where currying is actually useful: mostly when it happens implicitly. For me, map (+1) is the easiest to read, then comes .Select(x => plus(1,x)), and the version with curry should probably be avoided, if there is no really good reason.
Now, the benefits sum up to shorter, more readable and less cluttered code -- unless point-free style is abused (I do love (.).(.), but it is... special).
Also, lambda calculus would be impossible without curried functions, since it has only single-argument (but therefore higher-order) functions.
* Of course it's actually in Num, but it's more readable like this for the moment.
Update: how currying actually works.
Look at the type of plus in C#:
int plus(int a, int b) {..}
You have to give it a tuple of values -- not in C# terms, but mathematically speaking; you can't just leave out the second value. In Haskell terms, that's
plus :: (Int,Int) -> Int,
which could be used like
incElem = map (\x -> plus (1, x)) -- equal to .Select (x => plus (1, x))
That's way too many characters to type. Suppose you want to do this more often in the future. Here's a little helper:
curry f = \x -> (\y -> f (x,y))
plus' = curry plus
which gives
incElem = map (plus' 1)
Let's apply this to a concrete value.
incElem [1]
= (map (plus' 1)) [1]
= [plus' 1 1]
= [(curry plus) 1 1]
= [(\x -> (\y -> plus (x,y))) 1 1]
= [plus (1,1)]
= [2]
Here you can see curry at work. It turns a standard haskell style function application (plus' 1 1) into a call to a "tupled" function -- or, viewed at a higher level, transforms the "tupled" into the "untupled" version.
Fortunately, most of the time, you don't have to worry about this, as there is automatic partial application.
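The helper above already exists in the Prelude as curry, together with its inverse uncurry. A small sketch (plus and plus' follow the names used above; sums is made up):

```haskell
-- An uncurried function: it takes one argument, which is a pair.
plus :: (Int, Int) -> Int
plus (a, b) = a + b

-- Prelude's curry turns it into the usual one-argument-at-a-time form.
plus' :: Int -> Int -> Int
plus' = curry plus

incElems :: [Int] -> [Int]
incElems = map (plus' 1)

-- uncurry goes the other way, e.g. to consume a list of pairs.
sums :: [(Int, Int)] -> [Int]
sums = map (uncurry (+))
```

incElems [1, 2, 3] gives [2,3,4], and sums [(1, 2), (3, 4)] gives [3,7].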
It's not the best thing since sliced bread, but if you're using lambdas anyway, it's easier to use higher-order functions without using lambda syntax. Compare:
map (max 4) [0,6,9,3] --[4,6,9,4]
map (\i -> max 4 i) [0,6,9,3] --[4,6,9,4]
These kinds of constructs come up often enough when you're using functional programming, that it's a nice shortcut to have and lets you think about the problem from a slightly higher level--you're mapping against the "max 4" function, not some random function that happens to be defined as (\i -> max 4 i). It lets you start to think in higher levels of indirection more easily:
let numOr4 = map $ max 4
let numOr4' = (\xs -> map (\i -> max 4 i) xs)
numOr4 [0,6,9,3] --ends up being [4,6,9,4] either way;
--which do you think is easier to understand?
That said, it's not a panacea; sometimes your function's parameters will be in the wrong order for what you're trying to do with currying, so you'll have to resort to a lambda anyway. However, once you get used to this style, you start to learn how to design your functions to work well with it, and once those neurons start to connect inside your brain, previously complicated constructs can start to seem obvious in comparison.
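When the parameters are in the wrong order, operator sections and flip often avoid the lambda. A sketch with made-up function names:

```haskell
-- Operator sections fix either side of a binary operator.
halves :: [Int] -> [Int]
halves = map (`div` 2)        -- fixes the second argument of div

fromHundred :: [Int] -> [Int]
fromHundred = map (100 -)     -- fixes the first argument of (-)

-- padTo takes the width first, which suits "pad everything to width 5".
padTo :: Int -> String -> String
padTo n s = s ++ replicate (n - length s) ' '

padAllTo5 :: [String] -> [String]
padAllTo5 = map (padTo 5)

-- flip swaps the two arguments when the other one should be fixed.
widths :: String -> [Int] -> [String]
widths s = map (flip padTo s)
```

For instance, halves [4, 7] is [2,3] and widths "ab" [2, 4] is ["ab","ab  "].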
One benefit of currying is that it allows partial application of functions without the need of any special syntax/operator. A simple example:
mapLength = map length
mapLength ["ab", "cde", "f"]
>>> [2, 3, 1]
mapLength ["x", "yz", "www"]
>>> [1, 2, 3]
map :: (a -> b) -> [a] -> [b]
length :: [a] -> Int
mapLength :: [[a]] -> [Int]
The map function can be considered to have type (a -> b) -> ([a] -> [b]) because of currying, so when length is applied as its first argument, it yields the function mapLength of type [[a]] -> [Int].
Currying has the convenience features mentioned in other answers, but it also often serves to simplify reasoning about the language or to implement some code much more easily than would otherwise be possible. For example, currying means that any function at all has a type that's compatible with a -> b. If you write some code whose type involves a -> b, that code can be made to work with any function at all, no matter how many arguments it takes.
The best known example of this is the Applicative class:
class Functor f => Applicative f where
  pure :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
And an example use:
-- All possible products of numbers taken from [1..5] and [1..10]
example = pure (*) <*> [1..5] <*> [1..10]
In this context, pure and <*> adapt any function of type a -> b to work with lists of type [a]. Because of partial application, this means you can also adapt functions of type a -> b -> c to work with [a] and [b], or a -> b -> c -> d with [a], [b] and [c], and so on.
The reason this works is because a -> b -> c is the same thing as a -> (b -> c):
(+) :: Num a => a -> a -> a
pure (+) :: (Applicative f, Num a) => f (a -> a -> a)
[1..5], [1..10] :: Num a => [a]
pure (+) <*> [1..5] :: Num a => [a -> a]
pure (+) <*> [1..5] <*> [1..10] :: Num a => [a]
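The same chain extends to more arguments and to other Applicative instances without any new machinery. A small sketch (volumes, sumMaybe and noSum are illustrative names, not from any library):

```haskell
-- A three-argument function applied across three lists:
-- one result for every combination of one element from each list.
volumes :: [Int]
volumes = pure (\x y z -> x * y * z) <*> [1, 2] <*> [3] <*> [4, 5]
-- volumes == [12, 15, 24, 30]

-- The same code shape with Maybe instead of lists.
sumMaybe :: Maybe Int
sumMaybe = pure (+) <*> Just 1 <*> Just 2
-- sumMaybe == Just 3

-- Any Nothing along the way makes the whole result Nothing.
noSum :: Maybe Int
noSum = pure (+) <*> Just 1 <*> Nothing
-- noSum == Nothing
```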
Another, different use of currying is that Haskell allows you to partially apply type constructors. E.g., if you have this type:
data Foo a b = Foo a b
...it actually makes sense to write Foo a in many contexts, for example:
instance Functor (Foo a) where
  fmap f (Foo a b) = Foo a (f b)
I.e., Foo is a two-parameter type constructor with kind * -> * -> *; Foo a, the partial application of Foo to just one type, is a type constructor with kind * -> *. Functor is a type class that can only be instantiated for type constructors of kind * -> *. Since Foo a is of this kind, you can make a Functor instance for it.
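Put together as something you can load into GHCi, with one use of fmap added (lengthInSecondSlot is just an illustrative name):

```haskell
data Foo a b = Foo a b
  deriving (Eq, Show)

-- Foo a has kind * -> *, so it can be a Functor;
-- fmap only touches the second slot.
instance Functor (Foo a) where
  fmap f (Foo a b) = Foo a (f b)

-- The first slot (a Char here) is left alone.
lengthInSecondSlot :: Foo Char Int
lengthInSecondSlot = fmap length (Foo 'x' "hello")
-- lengthInSecondSlot == Foo 'x' 5
```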
The "no-currying" form of partial application works like this:
We have a function f : (A ✕ B) → C
We'd like to apply it partially to some a : A
To do this, we build a closure out of a and f (we don't evaluate f at all, for the time being)
Then some time later, we receive the second argument b : B
Now that we have both the A and B argument, we can evaluate f in its original form...
So we recall a from the closure, and evaluate f(a,b).
A bit complicated, isn't it?
When f is curried in the first place, it's rather simpler:
We have a function f : A → B → C
We'd like to apply it partially to some a : A – which we can just do: f a
Then some time later, we receive the second argument b : B
We apply the already evaluated f a to b.
So far so nice, but more important than being simple, this also gives us extra possibilities for implementing our function: we may be able to do some calculations as soon as the a argument is received, and these calculations won't need to be done later, even if the function is evaluated with multiple different b arguments!
To give an example, consider this audio filter, an infinite impulse response filter. It works like this: for each audio sample, you feed an "accumulator function" (f) with some state parameter (in this case, a simple number, 0 at the beginning) and the audio sample. The function then does some magic, and spits out the new internal state¹ and the output sample.
Now here's the crucial bit – what kind of magic the function does depends on the coefficient² λ, which is not quite a constant: it depends both on what cutoff frequency we'd like the filter to have (this governs "how the filter will sound") and on what sample rate we're processing in. Unfortunately, the calculation of λ is a bit more complicated (lp1stCoeff $ 2*pi * (νᵥ ~*% δs)) than the rest of the magic, so we wouldn't like having to do this for every single sample, all over again. Quite annoying, because νᵥ and δs are almost constant: they change very seldom, certainly not at each audio sample.
But currying saves the day! We simply calculate λ as soon as we have the necessary parameters. Then, at each of the many many audio samples to come, we only need to perform the remaining, very easy magic: yⱼ = yⱼ₋₁ + λ ⋅ (xⱼ - yⱼ₋₁). So we're being efficient, and still keeping a nice safe referentially transparent purely-functional interface.
¹ Note that this kind of state-passing can generally be done more nicely with the State or ST monad; that's just not particularly beneficial in this example.
² Yes, this is a lambda symbol. I hope I'm not confusing anybody – fortunately, in Haskell it's clear that lambda functions are written with \, not with λ.
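A rough sketch of that structure, under stated assumptions: lowPass, runFilter and filter1k are made-up names, and the coefficient formula is a common textbook choice for a one-pole filter that I'm substituting here, not the lp1stCoeff from the original code.

```haskell
-- Hypothetical one-pole low-pass filter. The coefficient formula is an
-- assumption (a common textbook choice), not the original lp1stCoeff.
lowPass :: Double -> Double -> (Double -> Double -> (Double, Double))
lowPass cutoff sampleRate = step
  where
    -- Depends only on the first two arguments, so it can be shared by
    -- every call of the returned step function.
    lambda = 1 - exp (negate (2 * pi * cutoff / sampleRate))
    -- step oldState sample = (newState, outputSample)
    step y x = let y' = y + lambda * (x - y) in (y', y')

-- Thread the state through a list of samples, starting from state 0.
runFilter :: (Double -> Double -> (Double, Double)) -> [Double] -> [Double]
runFilter step = go 0
  where
    go _ [] = []
    go y (x : xs) = let (y', out) = step y x in out : go y' xs

-- Partial application: cutoff and sample rate are supplied once,
-- the samples arrive later.
filter1k :: [Double] -> [Double]
filter1k = runFilter (lowPass 1000 44100)
```

Whether lambda really is evaluated only once is ultimately up to the compiler, but writing the function this way is what makes the sharing possible in the first place.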
It's somewhat dubious to ask what the benefits of currying are without specifying the context in which you're asking the question:
In some cases, like functional languages, currying will merely be seen as something that has a more local change, where you could replace things with explicit tupled domains. However, this isn't to say that currying is useless in these languages. In some sense, programming with curried functions makes you "feel" like you're programming in a more functional style, because you more typically face situations where you're dealing with higher order functions. Certainly, most of the time, you will "fill in" all of the arguments to a function, but in the cases where you want to use the function in its partially applied form, this is a bit simpler to do in curried form. We typically tell our beginning programmers to use this when learning a functional language just because it feels like better style and reminds them they're programming in more than just C. Having things like curry and uncurry also helps with certain conveniences within functional programming languages; I can think of arrows within Haskell as a specific example of where you would use curry and uncurry a bit to apply things to different pieces of an arrow, etc...
In some cases, when you want to think about more than functional programs, you can present currying / uncurrying as a way to state the elimination and introduction rules for "and" in constructive logic, which provides a connection to a more elegant motivation for why it exists.
In some cases, for example, in Coq, using curried functions versus tupled functions can produce different induction schemes, which may be easier or harder to work with, depending on your applications.
I used to think that currying was simple syntax sugar that saves you a bit of typing. For example, instead of writing
(\ x -> x + 1)
I can merely write
(+1)
The latter is instantly more readable, and less typing to boot.
So if it's just a convenient short cut, why all the fuss?
Well, it turns out that because function types are curried, you can write code which is polymorphic in the number of arguments a function has.
For example, the QuickCheck framework lets you test functions by feeding them randomly-generated test data. It works on any function whose input type can be auto-generated. But, because of currying, the authors were able to rig it so this works with any number of arguments. Were functions not curried, there would be a different testing function for each number of arguments - and that would just be tedious.
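To show the trick in miniature, here is a toy sketch. It is not QuickCheck's actual implementation: Sample and Checkable are made-up classes, and instead of random data it just tries a small fixed set of values for each argument.

```haskell
-- Types for which we can produce a few test values (made-up class).
class Sample a where
  samples :: [a]

instance Sample Bool where
  samples = [False, True]

instance Sample Int where
  samples = [-2 .. 2]

-- Things that can be checked: a Bool, or a function from a sampleable
-- argument to something that can itself be checked.
class Checkable p where
  holds :: p -> Bool

instance Checkable Bool where
  holds = id

-- Because a -> b -> Bool is a -> (b -> Bool), this one instance
-- covers functions of any number of arguments.
instance (Sample a, Checkable p) => Checkable (a -> p) where
  holds f = all (holds . f) samples

-- A two-argument property and a one-argument property.
addCommutes :: Int -> Int -> Bool
addCommutes x y = x + y == y + x

notInvolution :: Bool -> Bool
notInvolution b = not (not b) == b
```

Both holds addCommutes and holds notInvolution go through the same two instances; no separate holds2, holds3, ... is needed.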

Why monads? How does it resolve side-effects?

I am learning Haskell and trying to understand Monads. I have two questions:
From what I understand, Monad is just another typeclass that declares ways to interact with data inside "containers", including Maybe, List, and IO. It seems clever and clean to implement these 3 things with one concept, but really, the point is so there can be clean error handling in a chain of functions, containers, and side effects. Is this a correct interpretation?
How exactly is the problem of side-effects solved? With this concept of containers, the language essentially says anything inside the containers is non-deterministic (such as i/o). Because lists and IOs are both containers, lists are equivalence-classed with IO, even though values inside lists seem pretty deterministic to me. So what is deterministic and what has side-effects? I can't wrap my head around the idea that a basic value is deterministic, until you stick it in a container (which is no more special than the same value with some other values next to it, e.g. Nothing) and it can now be random.
Can someone explain how, intuitively, Haskell gets away with changing state with inputs and output? I'm not seeing the magic here.
The point is so there can be clean error handling in a chain of functions, containers, and side effects. Is this a correct interpretation?
Not really. You've mentioned a lot of concepts that people cite when trying to explain monads, including side effects, error handling and non-determinism, but it sounds like you've gotten the incorrect sense that all of these concepts apply to all monads. But there's one concept you mentioned that does: chaining.
There are two different flavors of this, so I'll explain it two different ways: one without side effects, and one with side effects.
No Side Effects:
Take the following example:
addM :: (Monad m, Num a) => m a -> m a -> m a
addM ma mb = do
  a <- ma
  b <- mb
  return (a + b)
This function adds two numbers, with the twist that they are wrapped in some monad. Which monad? Doesn't matter! In all cases, that special do syntax de-sugars to the following:
addM ma mb =
  ma >>= \a ->
  mb >>= \b ->
  return (a + b)
... or, with operator precedence made explicit:
ma >>= (\a -> mb >>= (\b -> return (a + b)))
Now you can really see that this is a chain of little functions, all composed together, and its behavior will depend on how >>= and return are defined for each monad. If you're familiar with polymorphism in object-oriented languages, this is essentially the same thing: one common interface with multiple implementations. It's slightly more mind-bending than your average OOP interface, since the interface represents a computation policy rather than, say, an animal or a shape or something.
Okay, let's see some examples of how addM behaves across different monads. The Identity monad is a decent place to start, since its definition is trivial:
instance Monad Identity where
  return a = Identity a -- create an Identity value
  (Identity a) >>= f = f a -- apply f to a
So what happens when we say:
addM (Identity 1) (Identity 2)
Expanding this, step by step:
(Identity 1) >>= (\a -> (Identity 2) >>= (\b -> return (a + b)))
(\a -> (Identity 2) >>= (\b -> return (a + b))) 1
(Identity 2) >>= (\b -> return (1 + b))
(\b -> return (1 + b)) 2
return (1 + 2)
Identity 3
Great. Now, since you mentioned clean error handling, let's look at the Maybe monad. Its definition is only slightly trickier than Identity:
instance Monad Maybe where
  return a = Just a -- same as Identity monad!
  (Just a) >>= f = f a -- same as Identity monad again!
  Nothing >>= _ = Nothing -- the only real difference from Identity
So you can imagine that if we say addM (Just 1) (Just 2) we'll get Just 3. But for grins, let's expand addM Nothing (Just 1) instead:
Nothing >>= (\a -> (Just 1) >>= (\b -> return (a + b)))
Nothing
Or the other way around, addM (Just 1) Nothing:
(Just 1) >>= (\a -> Nothing >>= (\b -> return (a + b)))
(\a -> Nothing >>= (\b -> return (a + b))) 1
Nothing >>= (\b -> return (1 + b))
Nothing
So the Maybe monad's definition of >>= was tweaked to account for failure. When a function is applied to a Maybe value using >>=, it only runs if the value is a Just; a Nothing is passed straight through and the rest of the chain is skipped.
Okay, so you mentioned non-determinism. Yes, the list monad can be thought of as modeling non-determinism in a sense... It's a little weird, but think of the list as representing alternative possible values: [1, 2, 3] is not a collection, it's a single non-deterministic number that could be either one, two or three. That sounds dumb, but it starts to make some sense when you think about how >>= is defined for lists: it applies the given function to each possible value. So addM [1, 2] [3, 4] is actually going to compute all possible sums of those two non-deterministic values: [4, 5, 5, 6].
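If you want to try all three yourself, here is the same addM in a form you can load into GHCi. Identity comes from Data.Functor.Identity in base; the viaIdentity, viaMaybe and viaList names are just for this sketch.

```haskell
import Data.Functor.Identity (Identity (..))

addM :: (Monad m, Num a) => m a -> m a -> m a
addM ma mb = do
  a <- ma
  b <- mb
  return (a + b)

-- Identity: plain function application.
viaIdentity :: Identity Int
viaIdentity = addM (Identity 1) (Identity 2)
-- viaIdentity == Identity 3

-- Maybe: a Nothing on either side wins.
viaMaybe :: [Maybe Int]
viaMaybe = [addM (Just 1) (Just 2), addM Nothing (Just 1), addM (Just 1) Nothing]
-- viaMaybe == [Just 3, Nothing, Nothing]

-- List: every combination of the alternatives.
viaList :: [Int]
viaList = addM [1, 2] [3, 4]
-- viaList == [4, 5, 5, 6]
```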
Okay, now to address your second question...
Side Effects:
Let's say you apply addM to two values in the IO monad, like:
addM (return 1 :: IO Int) (return 2 :: IO Int)
You don't get anything special, just 3 in the IO monad. addM does not read or write any mutable state, so it's kind of no fun. Same goes for the State or ST monads. No fun. So let's use a different function:
fireTheMissiles :: IO Int -- returns the number of casualties
Clearly the world will be different each time missiles are fired. Clearly. Now let's say you're trying to write some totally innocuous, side effect free, non-missile-firing code. Perhaps you're trying once again to add two numbers, but this time without any monads flying around:
add :: Num a => a -> a -> a
add a b = a + b
and all of a sudden your hand slips, and you accidentally typo:
add a b = a + b + fireTheMissiles
An honest mistake, really. The keys were so close together. Fortunately, because fireTheMissiles was of type IO Int rather than simply Int, the compiler is able to avert disaster.
Okay, totally contrived example, but the point is that in the case of IO, ST and friends, the type system keeps effects isolated to some specific context. It doesn't magically eliminate side effects, making code referentially transparent that shouldn't be, but it does make it clear at compile time what scope the effects are limited to.
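As a sketch of what the compiler does accept: countCasualties below is a harmless, hypothetical stand-in for fireTheMissiles, and addCasualties is a made-up name. The effect is allowed, but it has to show up in the result type.

```haskell
add :: Num a => a -> a -> a
add a b = a + b

-- Hypothetical, harmless stand-in for fireTheMissiles.
countCasualties :: IO Int
countCasualties = return 0

-- Using the action forces the result type to be IO Int, not Int.
addCasualties :: Int -> IO Int
addCasualties n = do
  c <- countCasualties
  return (add n c)

-- This version is rejected by the type checker if uncommented,
-- because an IO Int is not a number:
-- broken :: Int -> Int
-- broken n = add n countCasualties
```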
So getting back to the original point: what does this have to do with chaining or composition of functions? Well, in this case, it's just a handy way of expressing a sequence of effects:
fireTheMissilesTwice :: IO ()
fireTheMissilesTwice = do
  a <- fireTheMissiles
  print a
  b <- fireTheMissiles
  print b
Summary:
A monad represents some policy for chaining computations. Identity's policy is pure function composition, Maybe's policy is function composition with failure propagation, IO's policy is impure function composition and so on.
Let me start by pointing at the excellent "You could have invented monads" article. It illustrates how the Monad structure can naturally manifest while you are writing programs. But the tutorial doesn't mention IO, so I will have a stab here at extending the approach.
Let us start with what you probably have already seen - the container monad. Let's say we have:
f, g :: Int -> [Int]
One way of looking at this is that it gives us a number of possible outputs for every possible input. What if we want all possible outputs for the composition of both functions? Giving all possibilities we could get by applying the functions one after the other?
Well, there's a function for that:
fg x = concatMap g $ f x
If we put this more generally, we get
fg x = f x >>= g
xs >>= f = concatMap f xs
return x = [x]
Why would we want to wrap it like this? Well, writing our programs primarily using >>= and return gives us some nice properties - for example, we can be sure that it's relatively hard to "forget" solutions. We'd explicitly have to reintroduce it, say by adding another function skip. And also we now have a monad and can use all combinators from the monad library!
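For a concrete check that the two spellings agree, here is a small sketch with made-up definitions for f and g (the originals are left abstract above):

```haskell
-- Two example functions, each giving several possible outputs.
f, g :: Int -> [Int]
f x = [x, x + 1]
g y = [y * 10, y * 100]

-- Composition written directly with concatMap ...
fgConcat :: Int -> [Int]
fgConcat x = concatMap g (f x)

-- ... and with the list monad's >>=.
fgBind :: Int -> [Int]
fgBind x = f x >>= g
-- fgConcat 1 == fgBind 1 == [10, 100, 20, 200]
```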
Now, let us jump to your trickier example. Let's say the two functions are "side-effecting". That's not non-deterministic, it just means that in theory the whole world is both their input (as it can influence them) as well as their output (as the function can influence it). So we get something like:
f, g :: Int -> RealWorld# -> (Int, RealWorld#)
If we now want g to get the world that f left behind, we'd write:
fg x rw = let (y, rw') = f x rw
              (r, rw'') = g y rw'
          in (r, rw'')
Or generalized:
fg x = f x >>= g
x >>= f = \rw -> let (y, rw') = x rw
                     (r, rw'') = f y rw'
                 in (r, rw'')
return x = \rw -> (x, rw)
Now if the user can only use >>=, return and a few pre-defined IO values we get a nice property again: The user will never actually see the RealWorld# getting passed around! And that is a very good thing, as you aren't really interested in the details of where getLine gets its data from. And again we get all the nice high-level functions from the monad libraries.
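To play with the idea, here is a toy model. It is not how GHC represents IO: World is just a list of strings standing in for "everything that has been printed so far", and Action, bindA, returnA and say are names invented for this sketch.

```haskell
-- Toy "world": the log of everything output so far.
type World = [String]

-- An action takes the current world and returns a result plus the new world.
newtype Action a = Action { runAction :: World -> (a, World) }

-- Same shape as the >>= above: run x, hand its world on to f's action.
bindA :: Action a -> (a -> Action b) -> Action b
bindA x f = Action $ \w ->
  let (y, w') = runAction x w
  in runAction (f y) w'

returnA :: a -> Action a
returnA x = Action $ \w -> (x, w)

-- The only primitive that touches the world directly.
say :: String -> Action ()
say s = Action $ \w -> ((), w ++ [s])

-- User code never mentions World at all.
program :: Action Int
program =
  say "hello" `bindA` \_ ->
  say "world" `bindA` \_ ->
  returnA 42
-- runAction program [] == (42, ["hello", "world"])
```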
So the important things to take away:
The monad captures common patterns in your code, like "always pass all elements of container A to container B" or "pass this real-world-tag through". Often, once you realize that there is a monad in your program, complicated things become simple applications of the right monad combinator.
The monad allows you to completely hide the implementation from the user. It is an excellent encapsulation mechanism, be it for your own internal state or for how IO manages to squeeze non-purity into a pure program in a relatively safe way.
Appendix
In case someone is still scratching his head over RealWorld# as much as I did when I started: There's obviously more magic going on after all the monad abstraction has been removed. Then the compiler will make use of the fact that there can only ever be one "real world". That's good news and bad news:
It follows that the compiler must guarantee execution ordering between functions (which is what we were after!)
But it also means that actually passing the real world isn't necessary as there is only one we could possibly mean: The one that is current when the function gets executed!
Bottom line is that once execution order is fixed, RealWorld# simply gets optimized out. Therefore programs using the IO monad actually have zero runtime overhead. Also note that using RealWorld# is obviously only one possible way to put IO - but it happens to be the one GHC uses internally. The good thing about monads is that, again, the user really doesn't need to know.
You could see a given monad m as a set/family (or realm, domain, etc.) of actions (think of a C statement). The monad m defines the kind of (side-)effects that its actions may have:
with [] you can define actions which can fork their executions in different "independent parallel worlds";
with Either Foo you can define actions which can fail with errors of type Foo;
with IO you can define actions which can have side-effects on the "outside world" (access files, network, launch processes, do a HTTP GET ...);
you can have a monad whose effect is "randomness" (see package MonadRandom);
you can define a monad whose actions can make a move in a game (say chess, Go…) and receive move from an opponent but are not able to write to your filesystem or anything else.
Summary
If m is a monad, m a is an action which produces a result/output of type a.
The >> and >>= operators are used to create more complex actions out of simpler ones:
a >> b is a macro-action which does action a and then action b;
a >> a does action a and then action a again;
with >>= the second action can depend on the output of the first one.
The exact meaning of what an action is and what doing an action and then another one is depends on the monad: each monad defines an imperative sublanguage with some features/effects.
Simple sequencing (>>)
Let's say we have a given monad M and some actions incrementCounter, decrementCounter, readCounter:
instance Monad M where ...
-- Modify the counter and do not produce any result:
incrementCounter :: M ()
decrementCounter :: M ()
-- Get the current value of the counter
readCounter :: M Integer
Now we would like to do something interesting with those actions. The first thing we would like to do with those actions is to sequence them. As in say C, we would like to be able to do:
// This is C:
counter++;
counter++;
We define a "sequencing operator" >>. Using this operator we can write:
incrementCounter >> incrementCounter
What is the type of "incrementCounter >> incrementCounter"?
It is an action made of two smaller actions like in C you can write composed-statements from atomic statements :
// This is a macro statement made of several statements
{
counter++;
counter++;
}
// and we can use it anywhere we may use a statement:
if (condition) {
counter++;
counter++;
}
it can have the same kind of effects as its subactions;
it does not produce any output/result.
So we would like incrementCounter >> incrementCounter to be of type M (): a (macro-)action with the same kind of possible effects but without any output.
More generally, given two actions:
action1 :: M a
action2 :: M b
we define action1 >> action2 as the macro-action which is obtained by doing (whatever that means in our domain of action) action1 then action2, and which produces as output the result of the execution of the second action. The type of >> is:
(>>) :: M a -> M b -> M b
or more generally:
(>>) :: (Monad m) => m a -> m b -> m b
We can define bigger sequence of actions from simpler ones:
action1 >> action2 >> action3 >> action4
Input and outputs (>>=)
We would like to be able to increment by something other than 1 at a time:
incrementBy 5
We want to provide some input to our actions. In order to do this we define a function incrementBy taking an Integer (the same type that readCounter produces) and producing an action:
incrementBy :: Integer -> M ()
Now we can write things like:
incrementCounter >> readCounter >> incrementBy 5
But we have no way to feed the output of readCounter into incrementBy. In order to do this, a slightly more powerful version of our sequencing operator is needed. The >>= operator can feed the output of a given action as input to the next action. We can write:
readCounter >>= incrementBy
It is an action which executes the readCounter action, feeds its output into the incrementBy function and then executes the resulting action.
The type of >>= is:
(>>=) :: Monad m => m a -> (a -> m b) -> m b
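The monad M has been left abstract so far. One possible way to fill it in, purely as an assumption for illustration, is a function from the current counter value to a result and a new counter value (counterDemo is a made-up name too):

```haskell
-- Assumed representation: counter in, (result, new counter) out.
newtype M a = M { runM :: Integer -> (a, Integer) }

instance Functor M where
  fmap f (M g) = M $ \s -> let (a, s') = g s in (f a, s')

instance Applicative M where
  pure a = M $ \s -> (a, s)
  M f <*> M g = M $ \s ->
    let (h, s') = f s
        (a, s'') = g s'
    in (h a, s'')

instance Monad M where
  -- Run the first action, then feed its output to k and run the result.
  M g >>= k = M $ \s -> let (a, s') = g s in runM (k a) s'

incrementCounter :: M ()
incrementCounter = M $ \s -> ((), s + 1)

readCounter :: M Integer
readCounter = M $ \s -> (s, s)

incrementBy :: Integer -> M ()
incrementBy n = M $ \s -> ((), s + n)

-- Increment twice, then add the current value to itself, then read.
counterDemo :: M Integer
counterDemo =
  incrementCounter >> incrementCounter >> readCounter >>= incrementBy >> readCounter
-- runM counterDemo 0 == (4, 4)
```

Since >> and >>= are both left-associative with the same precedence, the chain in counterDemo is read left to right.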
A (partial) example
Let's say I have a Prompt monad which can only display information (text) to the user and ask the user for information:
-- We don't have access to the internal structure of the Prompt monad
module Prompt (Prompt(), echo, prompt) where
-- Opaque
data Prompt a = ...
instance Monad Prompt where ...
-- Display a line to the CLI:
echo :: String -> Prompt ()
-- Ask a question to the user:
prompt :: String -> Prompt String
Let's try to define a promptBoolean message action which asks a question and produces a boolean value.
We use the prompt (message ++ "[y/n]") action and feed its output to a function f:
f "y" should be an action which does nothing but produce True as output;
f "n" should be an action which does nothing but produce False as output;
anything else should restart the action (do the action again);
promptBoolean would look like this:
-- Incomplete version, some bits are missing:
promptBoolean :: String -> Prompt Bool
promptBoolean message = prompt (message ++ "[y/n]") >>= f
  where f result = if result == "y"
                     then ???? -- We need here an action which does nothing but produce `True` as output
                     else if result == "n"
                       then ???? -- We need here an action which does nothing but produce `False` as output
                       else echo "Input not recognised, try again." >> promptBoolean message
Producing a value without effect (return)
In order to fill the missing bits in our promptBoolean function, we need a way to represent dummy actions without any side effect but which only outputs a given value:
-- "return 5" is an action which does nothing but outputs 5
return :: (Monad m) => a -> m a
and we can now write our promptBoolean function:
promptBoolean :: String -> Prompt Bool
promptBoolean message = prompt (message ++ "[y/n]") >>= f
  where f result = if result == "y"
                     then return True
                     else if result == "n"
                       then return False
                       else echo "Input not recognised, try again." >> promptBoolean message
By composing those two simple actions (promptBoolean, echo) we can define any kind of dialogue between the user and your program (the actions of the program are deterministic as our monad does not have a "randomness effect").
promptInt :: String -> Prompt Int
promptInt = ... -- similar
-- Classic "guess a number game/dialogue"
guess :: Int -> Prompt ()
guess n = promptInt "Guess:" >>= f
  where f m = if m == n
                then echo "Found"
                else (if m > n
                        then echo "Too big"
                        else echo "Too small") >> guess n
The operations of a monad
A Monad is a set of actions which can be composed with the return and >>= operators:
>>= for action composition;
return for producing a value without any (side-)effect.
These two operators are the minimal operators needed to define a Monad.
In Haskell, the >> operator is needed as well but it can in fact be derived from >>=:
(>>) :: Monad m => m a -> m b -> m b
a >> b = a >>= f
  where f x = b
In Haskell, an extra fail operator has traditionally been needed as well, but this is really a hack (recent versions of GHC have moved it out of Monad into a separate MonadFail class).
This is the traditional Haskell definition of a Monad:
class Monad m where
  return :: a -> m a
  (>>=) :: m a -> (a -> m b) -> m b
  (>>) :: m a -> m b -> m b -- can be derived from (>>=)
  fail :: String -> m a -- mostly a hack
Actions are first-class
One great thing about monads is that actions are first-class. You can store them in a variable, and you can define functions which take actions as input and produce other actions as output. For example, we can define a while operator:
-- while x y : does action y while action x outputs True
while :: (Monad m) => m Bool -> m a -> m ()
while x y = x >>= f
  where f True = y >> while x y
        f False = return ()
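To see while do something, here is a small sketch that uses it in IO with a mutable counter from Data.IORef. The countdownSteps function is made up for this illustration; any monad with some notion of changing state would do.

```haskell
import Data.IORef (modifyIORef, newIORef, readIORef)

-- while cond body: does action body while action cond outputs True
while :: Monad m => m Bool -> m a -> m ()
while cond body = cond >>= go
  where
    go True = body >> while cond body
    go False = return ()

-- Count down from n to 0 and report how many times the body ran.
countdownSteps :: Int -> IO Int
countdownSteps n = do
  remaining <- newIORef n
  steps <- newIORef (0 :: Int)
  -- Both the condition and the body are ordinary values passed to while.
  while ((> 0) <$> readIORef remaining) $ do
    modifyIORef remaining (subtract 1)
    modifyIORef steps (+ 1)
  readIORef steps
```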
Summary
A Monad is a set of actions in some domain. The monad/domain defines the kind of "effects" which are possible. The >> and >>= operators represent sequencing of actions, and monadic expressions may be used to represent any kind of "imperative (sub)program" in your (functional) Haskell program.
The great things are that:
you can design your own Monad which supports the features and effects that you want
see Prompt for an example of a "dialogue only subprogram",
see Rand for an example of "sampling only subprogram";
you can write your own control structures (while, throw, catch or more exotic ones) as functions taking actions and composing them in some way to produce bigger macro-actions.
MonadRandom
A good way of understanding monads is the MonadRandom package. The Rand monad is made of actions whose output can be random (the effect is randomness). An action in this monad is some kind of random variable (or more exactly a sampling process):
-- Sample an Int from some distribution
action :: Rand Int
Using Rand to do some sampling/random algorithms is quite interesting because you have random variables as first class values:
-- Estimate mean by sampling nsamples times the random variable x
sampleMean :: Real a => Int -> m a -> m a
sampleMean n x = ...
In this setting, the sequence function from Prelude,
sequence :: Monad m => [m a] -> m [a]
becomes
sequence :: [Rand a] -> Rand [a]
It creates a random variable obtained by sampling independently from a list of random variables.
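MonadRandom is a separate package and may not be installed, so as a stand-in here is what the very same sequence does in two monads that ship with the Prelude (allPresent, oneMissing and combos are illustrative names). Only the meaning of "do all of these actions and collect the outputs" changes from monad to monad.

```haskell
-- Maybe: succeed only if every action succeeds.
allPresent :: Maybe [Int]
allPresent = sequence [Just 1, Just 2, Just 3]
-- allPresent == Just [1, 2, 3]

oneMissing :: Maybe [Int]
oneMissing = sequence [Just 1, Nothing, Just 3]
-- oneMissing == Nothing

-- List: every way of picking one alternative from each list.
combos :: [[Int]]
combos = sequence [[1, 2], [3], [4, 5]]
-- combos == [[1, 3, 4], [1, 3, 5], [2, 3, 4], [2, 3, 5]]
```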
There are three main observations concerning the IO monad:
1) You can't get values out of it. Other types like Maybe might allow you to extract values, but neither the monad class interface itself nor the IO data type allows it.
2) "Inside" IO is not only the real value but also that "RealWorld" thing. This dummy value is used to enforce the chaining of actions by the type system: If you have two independent calculations, the use of >>= makes the second calculation dependent on the first.
3) Assume a non-deterministic thing like random :: () -> Int, which isn't allowed in Haskell. If you change the signature to random :: Blubb -> (Blubb, Int), it is allowed, if you make sure that nobody ever can use a Blubb twice: Because in that case all inputs are "different", it is no problem that the outputs are different as well.
Now we can use fact 1): Nobody can get something out of IO, so we can use the RealWorld dummy hidden in IO to serve as a Blubb. There is only one IO in the whole application (the one we get from main), and it takes care of proper sequencing, as we have seen in 2). Problem solved.
One thing that often helps me to understand the nature of something is to examine it in the most trivial way possible. That way, I'm not getting distracted by potentially unrelated concepts. With that in mind, I think it may be helpful to understand the nature of the Identity Monad, as it's the most trivial implementation of a Monad possible (I think).
What is interesting about the Identity Monad? I think it is that it allows me to express the idea of evaluating expressions in a context defined by other expressions. And to me, that is the essence of every Monad I've encountered (so far).
If you already had a lot of exposure to 'mainstream' programming languages before learning Haskell (like I did), then this doesn't seem very interesting at all. After all, in a mainstream programming language, statements are executed in sequence, one after the other (excepting control-flow constructs, of course). And naturally, we can assume that every statement is evaluated in the context of all previously executed statements and that those previously executed statements may alter the environment and the behavior of the currently executing statement.
All of that is pretty much a foreign concept in a functional, lazy language like Haskell. The order in which computations are evaluated in Haskell is well-defined, but sometimes hard to predict, and even harder to control. And for many kinds of problems, that's just fine. But other sorts of problems (e.g. IO) are hard to solve without some convenient way to establish an implicit order and context between the computations in your program.
As far as side-effects go, specifically, often they can be transformed (via a Monad) into simple state-passing, which is perfectly legal in a pure functional language. Some Monads don't seem to be of that nature, however. Monads such as the IO Monad or the ST monad literally perform side-effecting actions. There are many ways to think about this, but one way that I think about it is that just because my computations must exist in a world without side-effects, the Monad may not. As such, the Monad is free to establish a context for my computation to execute that is based on side-effects defined by other computations.
Finally, I must disclaim that I am definitely not a Haskell expert. As such, please understand that everything I've said is pretty much my own thoughts on this subject and I may very well disown them later when I understand Monads more fully.
the point is so there can be clean error handling in a chain of functions, containers, and side effects
More or less.
how exactly is the problem of side-effects solved?
A value in the I/O monad, i.e. one of type IO a, should be interpreted as a program. p >> q on IO values can then be interpreted as the operator that combines two programs into one that first executes p, then q. The other monad operators have similar interpretations. By assigning a program to the name main, you declare to the compiler that that is the program that has to be executed by its output object code.
As for the list monad, it's not really related to the I/O monad except in a very abstract mathematical sense. The IO monad gives deterministic computation with side effects, while the list monad gives non-deterministic (but not random!) backtracking search, somewhat similar to Prolog's modus operandi.
With this concept of containers, the language essentially says anything inside the containers is non-deterministic
No. Haskell is deterministic. If you ask for integer addition 2+2 you will always get 4.
"Nondeterministic" is only a metaphor, a way of thinking. Everything is deterministic under the hood. If you have this code:
do x <- [4,5]
   y <- [0,1]
   return (x+y)
it is roughly equivalent to Python code
l = []
for x in [4,5]:
    for y in [0,1]:
        l.append(x+y)
You see nondeterminism here? No, it's deterministic construction of a list. Run it twice, you'll get the same numbers in the same order.
You can describe it this way: Choose arbitrary x from [4,5]. Choose arbitrary y from [0,1]. Return x+y. Collect all possible results.
That way seems to involve nondeterminism, but it's only a nested loop (list comprehension). There is no "real" nondeterminism here, it's simulated by checking all possibilities. Nondeterminism is an illusion. The code only appears to be nondeterministic.
This code using State monad:
do put 0
   x <- get
   put (x+2)
   y <- get
   return (y+3)
gives 5 and seems to involve changing state. As with lists it's an illusion. There are no "variables" that change (as in imperative languages). Everything is immutable under the hood.
You can describe the code this way: put 0 to a variable. Read the value of a variable to x. Put (x+2) to the variable. Read the variable to y, and return y+3.
That way seems to involve state, but it's only composing functions passing additional parameter. There is no "real" mutability here, it's simulated by composition. Mutability is an illusion. The code only appears to be using it.
Haskell does it this way: you've got functions
a -> s -> (b,s)
A function of this type takes the old value of the state and returns the new value (together with a result). It does not involve mutability or changing variables. It's a function in the mathematical sense.
For example, the function "put" takes a new value for the state, ignores the current state and returns the new state:
put x _ = ((), x)
Just like you can compose two normal functions
a -> b
b -> c
into
a -> c
using (.) operator you can compose "state" transformers
a -> s -> (b,s)
b -> s -> (c,s)
into a single function
a -> s -> (c,s)
Try writing the composition operator yourself. This is what really happens: there are no "side effects", only arguments being passed to functions.
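If you want to check your attempt, one possible answer is sketched below. The names composeS, get and pipeline are assumptions made for this example (put is the definition given above); this is not how any library defines them.

```haskell
-- Compose two state transformers: run f, then feed its result
-- and its output state into g.
composeS :: (a -> s -> (b, s)) -> (b -> s -> (c, s)) -> (a -> s -> (c, s))
composeS f g = \a s0 ->
  let (b, s1) = f a s0
  in  g b s1

-- put: replace the state, ignoring the current one.
put :: s -> s -> ((), s)
put x _ = ((), x)

-- get: return the current state as the result, leaving it unchanged.
get :: () -> s -> (s, s)
get () s = (s, s)

-- The State example from above, built purely by composition.
pipeline :: Int -> Int -> (Int, Int)
pipeline =
  put `composeS` get `composeS` (\x -> put (x + 2))
      `composeS` get `composeS` (\y s -> (y + 3, s))
```

Here pipeline 0 s evaluates to (5, 2) for any starting state s, matching the result of the do block.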
From what I understand, Monad is just another typeclass that declares ways to interact with data [...]
...providing an interface common to all those types which have an instance. This can then be used to provide generic definitions which work across all monadic types.
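As a small sketch of such a generic definition (the name addM is made up for illustration), the function below is written once against the Monad interface and then works unchanged for Maybe, lists, IO and any other instance.

```haskell
-- Written once, against the Monad interface only.
addM :: Monad m => m Int -> m Int -> m Int
addM mx my = do
  x <- mx
  y <- my
  return (x + y)
```

With Maybe, addM (Just 3) (Just 4) is Just 7 and addM Nothing (Just 1) is Nothing; with lists, addM [4,5] [0,1] is [4,5,5,6].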
It seems clever and clean to implement these 3 things with one concept [...]
...the only three things that are implemented are the instances for those three types (list, Maybe and IO) - the types themselves are defined independently elsewhere.
[...] but really, the point is so there can be clean error handling in a chain of functions, containers, and side effects.
Not just error handling: consider ST, for example. Without the monadic interface, you would have to pass the encapsulated state around directly and correctly, which is a tiresome task.
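For a feel of what the monadic interface buys you with ST, here is a minimal sketch using runST and STRef from base. The function name sumST is made up for illustration.

```haskell
import Control.Monad.ST (runST)
import Data.STRef (newSTRef, modifySTRef', readSTRef)

-- Sums a list using a local mutable reference. The reference cannot
-- escape runST, so from the outside sumST is an ordinary pure function.
sumST :: [Int] -> Int
sumST xs = runST $ do
  ref <- newSTRef 0
  mapM_ (\x -> modifySTRef' ref (+ x)) xs
  readSTRef ref
```

The do block never mentions the underlying state explicitly; >>= threads it through each step, so sumST [1..10] is simply 55.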
How exactly is the problem of side-effects solved?
Short answer: Haskell manages them by using types to indicate their presence.
Can someone explain how, intuitively, Haskell gets away with changing state with inputs and output?
"Intuitively"...like what's available over here? Let's try a simple direct comparison instead:
From How to Declare an Imperative by Philip Wadler:
(* page 26 *)
type 'a io = unit -> 'a
infix >>=
val >>= : 'a io * ('a -> 'b io) -> 'b io
fun m >>= k = fn () => let
                         val x = m ()
                         val y = k x ()
                       in
                         y
                       end
val return : 'a -> 'a io
fun return x = fn () => x
val putc : char -> unit io
fun putc c = fn () => putcML c
val getc : char io
val getc = fn () => getcML ()
fun getcML () =
    valOf(TextIO.input1(TextIO.stdIn))
(* page 25 *)
fun putcML c =
    TextIO.output1(TextIO.stdOut,c)
Based on these two answers of mine, this is my Haskell translation:
type IO a = OI -> a
(>>=) :: IO a -> (a -> IO b) -> IO b
m >>= k = \ u -> let !(u1, u2) = partOI u in
                 let !x = m u1 in
                 let !y = k x u2 in
                 y
return :: a -> IO a
return x = \ u -> let !_ = partOI u in x
putc :: Char -> IO ()
putc c = \ u -> putcOI c u
getc :: IO Char
getc = \ u -> getcOI u
-- primitives
data OI
partOI :: OI -> (OI, OI)
putcOI :: Char -> OI -> ()
getcOI :: OI -> Char
Now remember that short answer about side-effects?
Haskell manages them by using types to indicate their presence.
Data.Char.chr :: Int -> Char   -- no side effects
getChar       :: IO Char       -- side effects at
{-            :: OI -> Char -} --   work: beware!
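The same distinction shows up in everyday code. In the sketch below, letter and nextTicket are hypothetical names; the first has a plain function type and always returns the same result for the same argument, while the second carries IO in its type because its result depends on, and changes, a mutable reference.

```haskell
import Data.Char (chr)
import Data.IORef (IORef, readIORef, writeIORef)

-- No IO in the type: no side effects, same input gives same output.
letter :: Int -> Char
letter n = chr (n + 65)

-- IO in the type: reads and updates a counter, so successive
-- runs of the same action yield different results.
nextTicket :: IORef Int -> IO Int
nextTicket ref = do
  n <- readIORef ref
  writeIORef ref (n + 1)
  return n
```

A caller can tell from the signatures alone which of the two may have effects; nextTicket can only be run from within other IO code.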

Resources