What are the benefits of currying? - haskell

I don't think I quite understand currying, since I'm unable to see any massive benefit it could provide. Perhaps someone could enlighten me with an example demonstrating why it is so useful. Does it truly have benefits and applications, or is it just an over-appreciated concept?

(There is a slight difference between currying and partial application, although they're closely related; since they're often mixed together, I'll deal with both terms.)
The place where I realized the benefits first was when I saw sliced operators:
incElems = map (+1)
--non-curried equivalent: incElems = (\elems -> map (\i -> (+) 1 i) elems)
IMO, this is totally easy to read. Now, if the type of (+) was (Int,Int) -> Int *, which is the uncurried version, it would (counter-intuitively) result in an error -- but curryied, it works as expected, and has type [Int] -> [Int].
You mentioned C# lambdas in a comment. In C#, you could have written incElems like so, given a function plus:
var incElems = xs => xs.Select(x => plus(1,x))
If you're used to point-free style, you'll see that the x here is redundant. Logically, that code could be reduced to
var incElems = xs => xs.Select(curry(plus)(1))
which is awful due to the lack of automatic partial application with C# lambdas. And that's the crucial point to decide where currying is actually useful: mostly when it happens implicitly. For me, map (+1) is the easiest to read, then comes .Select(x => plus(1,x)), and the version with curry should probably be avoided, if there is no really good reason.
Now, if readable, the benefits sum up to shorter, more readable and less cluttered code -- unless there is some abuse of point-free style done is with it (I do love (.).(.), but it is... special)
Also, lambda calculus would get impossible without using curried functions, since it has only one-valued (but therefor higher-order) functions.
* Of course it actually in Num, but it's more readable like this for the moment.
Update: how currying actually works.
Look at the type of plus in C#:
int plus(int a, int b) {..}
You have to give it a tuple of values -- not in C# terms, but mathematically spoken; you can't just leave out the second value. In haskell terms, that's
plus :: (Int,Int) -> Int,
which could be used like
incElem = map (\x -> plus (1, x)) -- equal to .Select (x => plus (1, x))
That's way too much characters to type. Suppose you'd want to do this more often in the future. Here's a little helper:
curry f = \x -> (\y -> f (x,y))
plus' = curry plus
which gives
incElem = map (plus' 1)
Let's apply this to a concrete value.
incElem [1]
= (map (plus' 1)) [1]
= [plus' 1 1]
= [(curry plus) 1 1]
= [(\x -> (\y -> plus (x,y))) 1 1]
= [plus (1,1)]
= [2]
Here you can see curry at work. It turns a standard haskell style function application (plus' 1 1) into a call to a "tupled" function -- or, viewed at a higher level, transforms the "tupled" into the "untupled" version.
Fortunately, most of the time, you don't have to worry about this, as there is automatic partial application.

It's not the best thing since sliced bread, but if you're using lambdas anyway, it's easier to use higher-order functions without using lambda syntax. Compare:
map (max 4) [0,6,9,3] --[4,6,9,4]
map (\i -> max 4 i) [0,6,9,3] --[4,6,9,4]
These kinds of constructs come up often enough when you're using functional programming, that it's a nice shortcut to have and lets you think about the problem from a slightly higher level--you're mapping against the "max 4" function, not some random function that happens to be defined as (\i -> max 4 i). It lets you start to think in higher levels of indirection more easily:
let numOr4 = map $ max 4
let numOr4' = (\xs -> map (\i -> max 4 i) xs)
numOr4 [0,6,9,3] --ends up being [4,6,9,4] either way;
--which do you think is easier to understand?
That said, it's not a panacea; sometimes your function's parameters will be the wrong order for what you're trying to do with currying, so you'll have to resort to a lambda anyway. However, once you get used to this style, you start to learn how to design your functions to work well with it, and once those neurons starts to connect inside your brain, previously complicated constructs can start to seem obvious in comparison.

One benefit of currying is that it allows partial application of functions without the need of any special syntax/operator. A simple example:
mapLength = map length
mapLength ["ab", "cde", "f"]
>>> [2, 3, 1]
mapLength ["x", "yz", "www"]
>>> [1, 2, 3]
map :: (a -> b) -> [a] -> [b]
length :: [a] -> Int
mapLength :: [[a]] -> [Int]
The map function can be considered to have type (a -> b) -> ([a] -> [b]) because of currying, so when length is applied as its first argument, it yields the function mapLength of type [[a]] -> [Int].

Currying has the convenience features mentioned in other answers, but it also often serves to simplify reasoning about the language or to implement some code much easier than it could be otherwise. For example, currying means that any function at all has a type that's compatible with a ->b. If you write some code whose type involves a -> b, that code can be made work with any function at all, no matter how many arguments it takes.
The best known example of this is the Applicative class:
class Functor f => Applicative f where
pure :: a -> f a
(<*>) :: f (a -> b) -> f a -> f b
And an example use:
-- All possible products of numbers taken from [1..5] and [1..10]
example = pure (*) <*> [1..5] <*> [1..10]
In this context, pure and <*> adapt any function of type a -> b to work with lists of type [a]. Because of partial application, this means you can also adapt functions of type a -> b -> c to work with [a] and [b], or a -> b -> c -> d with [a], [b] and [c], and so on.
The reason this works is because a -> b -> c is the same thing as a -> (b -> c):
(+) :: Num a => a -> a -> a
pure (+) :: (Applicative f, Num a) => f (a -> a -> a)
[1..5], [1..10] :: Num a => [a]
pure (+) <*> [1..5] :: Num a => [a -> a]
pure (+) <*> [1..5] <*> [1..10] :: Num a => [a]
Another, different use of currying is that Haskell allows you to partially apply type constructors. E.g., if you have this type:
data Foo a b = Foo a b
...it actually makes sense to write Foo a in many contexts, for example:
instance Functor (Foo a) where
fmap f (Foo a b) = Foo a (f b)
I.e., Foo is a two-parameter type constructor with kind * -> * -> *; Foo a, the partial application of Foo to just one type, is a type constructor with kind * -> *. Functor is a type class that can only be instantiated for type constrcutors of kind * -> *. Since Foo a is of this kind, you can make a Functor instance for it.

The "no-currying" form of partial application works like this:
We have a function f : (A ✕ B) → C
We'd like to apply it partially to some a : A
To do this, we build a closure out of a and f (we don't evaluate f at all, for the time being)
Then some time later, we receive the second argument b : B
Now that we have both the A and B argument, we can evaluate f in its original form...
So we recall a from the closure, and evaluate f(a,b).
A bit complicated, isn't it?
When f is curried in the first place, it's rather simpler:
We have a function f : A → B → C
We'd like to apply it partially to some a : A – which we can just do: f a
Then some time later, we receive the second argument b : B
We apply the already evaluated f a to b.
So far so nice, but more important than being simple, this also gives us extra possibilities for implementing our function: we may be able to do some calculations as soon as the a argument is received, and these calculations won't need to be done later, even if the function is evaluated with multiple different b arguments!
To give an example, consider this audio filter, an infinite impulse response filter. It works like this: for each audio sample, you feed an "accumulator function" (f) with some state parameter (in this case, a simple number, 0 at the beginning) and the audio sample. The function then does some magic, and spits out the new internal state1 and the output sample.
Now here's the crucial bit – what kind of magic the function does depends on the coefficient2 λ, which is not quite a constant: it depends both on what cutoff frequency we'd like the filter to have (this governs "how the filter will sound") and on what sample rate we're processing in. Unfortunately, the calculation of λ is a bit more complicated (lp1stCoeff $ 2*pi * (νᵥ ~*% δs) than the rest of the magic, so we wouldn't like having to do this for every single sample, all over again. Quite annoying, because νᵥ and δs are almost constant: they change very seldom, certainly not at each audio sample.
But currying saves the day! We simply calculate λ as soon as we have the necessary parameters. Then, at each of the many many audio samples to come, we only need to perform the remaining, very easy magic: yⱼ = yⱼ₁ + λ ⋅ (xⱼ - yⱼ₁). So we're being efficient, and still keeping a nice safe referentially transparent purely-functional interface.
1 Note that this kind of state-passing can generally be done more nicely with the State or ST monad, that's just not particularly beneficial in this example
2 Yes, this is a lambda symbol. I hope I'm not confusing anybody – fortunately, in Haskell it's clear that lambda functions are written with \, not with λ.

It's somewhat dubious to ask what the benefits of currying are without specifying the context in which you're asking the question:
In some cases, like functional languages, currying will merely be seen as something that has a more local change, where you could replace things with explicit tupled domains. However, this isn't to say that currying is useless in these languages. In some sense, programming with curried functions make you "feel" like you're programming in a more functional style, because you more typically face situations where you're dealing with higher order functions. Certainly, most of the time, you will "fill in" all of the arguments to a function, but in the cases where you want to use the function in its partially applied form, this is a bit simpler to do in curried form. We typically tell our beginning programmers to use this when learning a functional language just because it feels like better style and reminds them they're programming in more than just C. Having things like curry and uncurry also help for certain conveniences within functional programming languages too, I can think of arrows within Haskell as a specific example of where you would use curry and uncurry a bit to apply things to different pieces of an arrow, etc...
In some cases, you want to think about more than functional programs, you can present currying / uncurrying as a way to state the elimination and introduction rules for and in constructive logic, which provides a connection to a more elegant motivation for why it exists.
In some cases, for example, in Coq, using curried functions versus tupled functions can produce different induction schemes, which may be easier or harder to work with, depending on your applications.

I used to think that currying was simple syntax sugar that saves you a bit of typing. For example, instead of writing
(\ x -> x + 1)
I can merely write
(+1)
The latter is instantly more readable, and less typing to boot.
So if it's just a convenient short cut, why all the fuss?
Well, it turns out that because function types are curried, you can write code which is polymorphic in the number of arguments a function has.
For example, the QuickCheck framework lets you test functions by feeding them randomly-generated test data. It works on any function who's input type can be auto-generated. But, because of currying, the authors were able to rig it so this works with any number of arguments. Were functions not curried, there would be a different testing function for each number of arguments - and that would just be tedious.

Related

Understanding currying and HOFs

I am currently studying functional programming and it's most important feature : Higher Order Functions.
It's not as crystal clear as I'd like currently and therefore I'd like to understand perfectly how HOFs work.
Considering this function
{- Curried addition. -}
plusc :: Num a => a -> (a -> a)
plusc = (+)
To what extent can we say that this function uses currying and is a HOF ?
EDIT : Basically, I don't understand how the definition of the function stands for an addition (parameters, associativity, etc )
I wouldn't personally call plusc a HOF, because its arguments aren't functions. A way to spot an obvious HOF is to look for a parens in the signature that aren't at the leftmost side:
{- equivalent signature -}
plusc :: Num a => a -> a -> a
When we remove optional parens, it's obvious that the function isn't a HOF that takes functions, but it's curried.
Note: Since every curried function can return a function, though, we might say that after partially applying it, it returns a function, and as such operates on functions - so it is a HOF. I don't think this is particularly helpful way of describing/learning the concept, but I suppose the definition would span both parameters and results.
An uncurried version would simply group its arguments:
plusUnc :: Num a => (a, a) -> a
Now a HOF might take such a function and turn it into some other one:
imu :: Num a => (a -> a -> a) -> (a -> a -> a)
imu f = \a b -> f a b
Note: The lambda impl could obviously be simplified, I spelled it out just for illustration.
Note that f is the "lower" order function that's being passed into imu. To use it:
imuPlus = imu plusc -- a function is being passed
imuPlus 1 2 -- == 3
Note: since we're mixing both concepts (and you asked for both), imu is also curried. An uncurried version could look like this:
imuUnc :: ((a -> a -> a), (a, a)) -> a
Now it is a HOF (it has a function in the parameters), but it doesn't return a function, which differs from the examples above.
It's just much easier to use when it's curried, though, mostly because of partial application.

Example of deep understanding of currying

Reading https://wiki.haskell.org/Currying
it states :
Much of the time, currying can be ignored by the new programmer. The
major advantage of considering all functions as curried is
theoretical: formal proofs are easier when all functions are treated
uniformly (one argument in, one result out). Having said that, there
are Haskell idioms and techniques for which you need to understand
currying.
What is a Haskell technique/idiom that a deeper understanding of currying is required ?
Partial function application isn't really a distinct feature of Haskell; it is just a consequence of curried functions.
map :: (a -> b) -> [a] -> [b]
In a language like Python, map always takes two arguments: a function of type a -> b and a list of type [a]
map(f, [x, y, z]) == [f(x), f(y), f(z)]
This requires you to pretend that the -> syntax is just for show, and that the -> between (a -> b) and [a] is not really the same as the one between [a] -> [b]. However, that is not the case; it's the exact same operator, and it is right-associative. The type of map can be explicitly parenthesized as
map :: (a -> b) -> ([a] -> [b])
and suddenly it seems much less interesting that you might give only one argument (the function) to map and get back a new function of type [a] -> [b]. That is all partial function application is: taking advantage of the fact that all functions are curried.
In fact, you never really give more than one argument to a function. To go along with -> being right-associative, function application is left-associative, meaning a "multi-argument" call like
map f [1,2,3]
is really two function applications, which becomes clearer if we parenthesize it.
(map f) [1,2,3]
map is first "partially" applied to one argument f, which returns a new function. This function is then applied to [1,2,3] to get the final result.

Why do we need monads?

In my humble opinion the answers to the famous question "What is a monad?", especially the most voted ones, try to explain what is a monad without clearly explaining why monads are really necessary. Can they be explained as the solution to a problem?
Why do we need monads?
We want to program only using functions. ("functional programming (FP)" after all).
Then, we have a first big problem. This is a program:
f(x) = 2 * x
g(x,y) = x / y
How can we say what is to be executed first? How can we form an ordered sequence of functions (i.e. a program) using no more than functions?
Solution: compose functions. If you want first g and then f, just write f(g(x,y)). This way, "the program" is a function as well: main = f(g(x,y)). OK, but ...
More problems: some functions might fail (i.e. g(2,0), divide by 0). We have no "exceptions" in FP (an exception is not a function). How do we solve it?
Solution: Let's allow functions to return two kind of things: instead of having g : Real,Real -> Real (function from two reals into a real), let's allow g : Real,Real -> Real | Nothing (function from two reals into (real or nothing)).
But functions should (to be simpler) return only one thing.
Solution: let's create a new type of data to be returned, a "boxing type" that encloses maybe a real or be simply nothing. Hence, we can have g : Real,Real -> Maybe Real. OK, but ...
What happens now to f(g(x,y))? f is not ready to consume a Maybe Real. And, we don't want to change every function we could connect with g to consume a Maybe Real.
Solution: let's have a special function to "connect"/"compose"/"link" functions. That way, we can, behind the scenes, adapt the output of one function to feed the following one.
In our case: g >>= f (connect/compose g to f). We want >>= to get g's output, inspect it and, in case it is Nothing just don't call f and return Nothing; or on the contrary, extract the boxed Real and feed f with it. (This algorithm is just the implementation of >>= for the Maybe type). Also note that >>= must be written only once per "boxing type" (different box, different adapting algorithm).
Many other problems arise which can be solved using this same pattern: 1. Use a "box" to codify/store different meanings/values, and have functions like g that return those "boxed values". 2. Have a composer/linker g >>= f to help connecting g's output to f's input, so we don't have to change any f at all.
Remarkable problems that can be solved using this technique are:
having a global state that every function in the sequence of functions ("the program") can share: solution StateMonad.
We don't like "impure functions": functions that yield different output for same input. Therefore, let's mark those functions, making them to return a tagged/boxed value: IO monad.
Total happiness!
The answer is, of course, "We don't". As with all abstractions, it isn't necessary.
Haskell does not need a monad abstraction. It isn't necessary for performing IO in a pure language. The IO type takes care of that just fine by itself. The existing monadic desugaring of do blocks could be replaced with desugaring to bindIO, returnIO, and failIO as defined in the GHC.Base module. (It's not a documented module on hackage, so I'll have to point at its source for documentation.) So no, there's no need for the monad abstraction.
So if it's not needed, why does it exist? Because it was found that many patterns of computation form monadic structures. Abstraction of a structure allows for writing code that works across all instances of that structure. To put it more concisely - code reuse.
In functional languages, the most powerful tool found for code reuse has been composition of functions. The good old (.) :: (b -> c) -> (a -> b) -> (a -> c) operator is exceedingly powerful. It makes it easy to write tiny functions and glue them together with minimal syntactic or semantic overhead.
But there are cases when the types don't work out quite right. What do you do when you have foo :: (b -> Maybe c) and bar :: (a -> Maybe b)? foo . bar doesn't typecheck, because b and Maybe b aren't the same type.
But... it's almost right. You just want a bit of leeway. You want to be able to treat Maybe b as if it were basically b. It's a poor idea to just flat-out treat them as the same type, though. That's more or less the same thing as null pointers, which Tony Hoare famously called the billion-dollar mistake. So if you can't treat them as the same type, maybe you can find a way to extend the composition mechanism (.) provides.
In that case, it's important to really examine the theory underlying (.). Fortunately, someone has already done this for us. It turns out that the combination of (.) and id form a mathematical construct known as a category. But there are other ways to form categories. A Kleisli category, for instance, allows the objects being composed to be augmented a bit. A Kleisli category for Maybe would consist of (.) :: (b -> Maybe c) -> (a -> Maybe b) -> (a -> Maybe c) and id :: a -> Maybe a. That is, the objects in the category augment the (->) with a Maybe, so (a -> b) becomes (a -> Maybe b).
And suddenly, we've extended the power of composition to things that the traditional (.) operation doesn't work on. This is a source of new abstraction power. Kleisli categories work with more types than just Maybe. They work with every type that can assemble a proper category, obeying the category laws.
Left identity: id . f = f
Right identity: f . id = f
Associativity: f . (g . h) = (f . g) . h
As long as you can prove that your type obeys those three laws, you can turn it into a Kleisli category. And what's the big deal about that? Well, it turns out that monads are exactly the same thing as Kleisli categories. Monad's return is the same as Kleisli id. Monad's (>>=) isn't identical to Kleisli (.), but it turns out to be very easy to write each in terms of the other. And the category laws are the same as the monad laws, when you translate them across the difference between (>>=) and (.).
So why go through all this bother? Why have a Monad abstraction in the language? As I alluded to above, it enables code reuse. It even enables code reuse along two different dimensions.
The first dimension of code reuse comes directly from the presence of the abstraction. You can write code that works across all instances of the abstraction. There's the entire monad-loops package consisting of loops that work with any instance of Monad.
The second dimension is indirect, but it follows from the existence of composition. When composition is easy, it's natural to write code in small, reusable chunks. This is the same way having the (.) operator for functions encourages writing small, reusable functions.
So why does the abstraction exist? Because it's proven to be a tool that enables more composition in code, resulting in creating reusable code and encouraging the creation of more reusable code. Code reuse is one of the holy grails of programming. The monad abstraction exists because it moves us a little bit towards that holy grail.
Benjamin Pierce said in TAPL
A type system can be regarded as calculating a kind of static
approximation to the run-time behaviours of the terms in a program.
That's why a language equipped with a powerful type system is strictly more expressive, than a poorly typed language. You can think about monads in the same way.
As #Carl and sigfpe point, you can equip a datatype with all operations you want without resorting to monads, typeclasses or whatever other abstract stuff. However monads allow you not only to write reusable code, but also to abstract away all redundant detailes.
As an example, let's say we want to filter a list. The simplest way is to use the filter function: filter (> 3) [1..10], which equals [4,5,6,7,8,9,10].
A slightly more complicated version of filter, that also passes an accumulator from left to right, is
swap (x, y) = (y, x)
(.*) = (.) . (.)
filterAccum :: (a -> b -> (Bool, a)) -> a -> [b] -> [b]
filterAccum f a xs = [x | (x, True) <- zip xs $ snd $ mapAccumL (swap .* f) a xs]
To get all i, such that i <= 10, sum [1..i] > 4, sum [1..i] < 25, we can write
filterAccum (\a x -> let a' = a + x in (a' > 4 && a' < 25, a')) 0 [1..10]
which equals [3,4,5,6].
Or we can redefine the nub function, that removes duplicate elements from a list, in terms of filterAccum:
nub' = filterAccum (\a x -> (x `notElem` a, x:a)) []
nub' [1,2,4,5,4,3,1,8,9,4] equals [1,2,4,5,3,8,9]. A list is passed as an accumulator here. The code works, because it's possible to leave the list monad, so the whole computation stays pure (notElem doesn't use >>= actually, but it could). However it's not possible to safely leave the IO monad (i.e. you cannot execute an IO action and return a pure value — the value always will be wrapped in the IO monad). Another example is mutable arrays: after you have leaved the ST monad, where a mutable array live, you cannot update the array in constant time anymore. So we need a monadic filtering from the Control.Monad module:
filterM :: (Monad m) => (a -> m Bool) -> [a] -> m [a]
filterM _ [] = return []
filterM p (x:xs) = do
flg <- p x
ys <- filterM p xs
return (if flg then x:ys else ys)
filterM executes a monadic action for all elements from a list, yielding elements, for which the monadic action returns True.
A filtering example with an array:
nub' xs = runST $ do
arr <- newArray (1, 9) True :: ST s (STUArray s Int Bool)
let p i = readArray arr i <* writeArray arr i False
filterM p xs
main = print $ nub' [1,2,4,5,4,3,1,8,9,4]
prints [1,2,4,5,3,8,9] as expected.
And a version with the IO monad, which asks what elements to return:
main = filterM p [1,2,4,5] >>= print where
p i = putStrLn ("return " ++ show i ++ "?") *> readLn
E.g.
return 1? -- output
True -- input
return 2?
False
return 4?
False
return 5?
True
[1,5] -- output
And as a final illustration, filterAccum can be defined in terms of filterM:
filterAccum f a xs = evalState (filterM (state . flip f) xs) a
with the StateT monad, that is used under the hood, being just an ordinary datatype.
This example illustrates, that monads not only allow you to abstract computational context and write clean reusable code (due to the composability of monads, as #Carl explains), but also to treat user-defined datatypes and built-in primitives uniformly.
I don't think IO should be seen as a particularly outstanding monad, but it's certainly one of the more astounding ones for beginners, so I'll use it for my explanation.
Naïvely building an IO system for Haskell
The simplest conceivable IO system for a purely-functional language (and in fact the one Haskell started out with) is this:
main₀ :: String -> String
main₀ _ = "Hello World"
With lazyness, that simple signature is enough to actually build interactive terminal programs – very limited, though. Most frustrating is that we can only output text. What if we added some more exciting output possibilities?
data Output = TxtOutput String
| Beep Frequency
main₁ :: String -> [Output]
main₁ _ = [ TxtOutput "Hello World"
-- , Beep 440 -- for debugging
]
cute, but of course a much more realistic “alterative output” would be writing to a file. But then you'd also want some way to read from files. Any chance?
Well, when we take our main₁ program and simply pipe a file to the process (using operating system facilities), we have essentially implemented file-reading. If we could trigger that file-reading from within the Haskell language...
readFile :: Filepath -> (String -> [Output]) -> [Output]
This would use an “interactive program” String->[Output], feed it a string obtained from a file, and yield a non-interactive program that simply executes the given one.
There's one problem here: we don't really have a notion of when the file is read. The [Output] list sure gives a nice order to the outputs, but we don't get an order for when the inputs will be done.
Solution: make input-events also items in the list of things to do.
data IO₀ = TxtOut String
| TxtIn (String -> [Output])
| FileWrite FilePath String
| FileRead FilePath (String -> [Output])
| Beep Double
main₂ :: String -> [IO₀]
main₂ _ = [ FileRead "/dev/null" $ \_ ->
[TxtOutput "Hello World"]
]
Ok, now you may spot an imbalance: you can read a file and make output dependent on it, but you can't use the file contents to decide to e.g. also read another file. Obvious solution: make the result of the input-events also something of type IO, not just Output. That sure includes simple text output, but also allows reading additional files etc..
data IO₁ = TxtOut String
| TxtIn (String -> [IO₁])
| FileWrite FilePath String
| FileRead FilePath (String -> [IO₁])
| Beep Double
main₃ :: String -> [IO₁]
main₃ _ = [ TxtIn $ \_ ->
[TxtOut "Hello World"]
]
That would now actually allow you to express any file operation you might want in a program (though perhaps not with good performance), but it's somewhat overcomplicated:
main₃ yields a whole list of actions. Why don't we simply use the signature :: IO₁, which has this as a special case?
The lists don't really give a reliable overview of program flow anymore: most subsequent computations will only be “announced” as the result of some input operation. So we might as well ditch the list structure, and simply cons a “and then do” to each output operation.
data IO₂ = TxtOut String IO₂
| TxtIn (String -> IO₂)
| Terminate
main₄ :: IO₂
main₄ = TxtIn $ \_ ->
TxtOut "Hello World"
Terminate
Not too bad!
So what has all of this to do with monads?
In practice, you wouldn't want to use plain constructors to define all your programs. There would need to be a good couple of such fundamental constructors, yet for most higher-level stuff we would like to write a function with some nice high-level signature. It turns out most of these would look quite similar: accept some kind of meaningfully-typed value, and yield an IO action as the result.
getTime :: (UTCTime -> IO₂) -> IO₂
randomRIO :: Random r => (r,r) -> (r -> IO₂) -> IO₂
findFile :: RegEx -> (Maybe FilePath -> IO₂) -> IO₂
There's evidently a pattern here, and we'd better write it as
type IO₃ a = (a -> IO₂) -> IO₂ -- If this reminds you of continuation-passing
-- style, you're right.
getTime :: IO₃ UTCTime
randomRIO :: Random r => (r,r) -> IO₃ r
findFile :: RegEx -> IO₃ (Maybe FilePath)
Now that starts to look familiar, but we're still only dealing with thinly-disguised plain functions under the hood, and that's risky: each “value-action” has the responsibility of actually passing on the resulting action of any contained function (else the control flow of the entire program is easily disrupted by one ill-behaved action in the middle). We'd better make that requirement explicit. Well, it turns out those are the monad laws, though I'm not sure we can really formulate them without the standard bind/join operators.
At any rate, we've now reached a formulation of IO that has a proper monad instance:
data IO₄ a = TxtOut String (IO₄ a)
| TxtIn (String -> IO₄ a)
| TerminateWith a
txtOut :: String -> IO₄ ()
txtOut s = TxtOut s $ TerminateWith ()
txtIn :: IO₄ String
txtIn = TxtIn $ TerminateWith
instance Functor IO₄ where
fmap f (TerminateWith a) = TerminateWith $ f a
fmap f (TxtIn g) = TxtIn $ fmap f . g
fmap f (TxtOut s c) = TxtOut s $ fmap f c
instance Applicative IO₄ where
pure = TerminateWith
(<*>) = ap
instance Monad IO₄ where
TerminateWith x >>= f = f x
TxtOut s c >>= f = TxtOut s $ c >>= f
TxtIn g >>= f = TxtIn $ (>>=f) . g
Obviously this is not an efficient implementation of IO, but it's in principle usable.
Monads serve basically to compose functions together in a chain. Period.
Now the way they compose differs across the existing monads, thus resulting in different behaviors (e.g., to simulate mutable state in the state monad).
The confusion about monads is that being so general, i.e., a mechanism to compose functions, they can be used for many things, thus leading people to believe that monads are about state, about IO, etc, when they are only about "composing functions".
Now, one interesting thing about monads, is that the result of the composition is always of type "M a", that is, a value inside an envelope tagged with "M". This feature happens to be really nice to implement, for example, a clear separation between pure from impure code: declare all impure actions as functions of type "IO a" and provide no function, when defining the IO monad, to take out the "a" value from inside the "IO a". The result is that no function can be pure and at the same time take out a value from an "IO a", because there is no way to take such value while staying pure (the function must be inside the "IO" monad to use such value). (NOTE: well, nothing is perfect, so the "IO straitjacket" can be broken using "unsafePerformIO : IO a -> a" thus polluting what was supposed to be a pure function, but this should be used very sparingly and when you really know to be not introducing any impure code with side-effects.
Monads are just a convenient framework for solving a class of recurring problems. First, monads must be functors (i.e. must support mapping without looking at the elements (or their type)), they must also bring a binding (or chaining) operation and a way to create a monadic value from an element type (return). Finally, bind and return must satisfy two equations (left and right identities), also called the monad laws. (Alternatively one could define monads to have a flattening operation instead of binding.)
The list monad is commonly used to deal with non-determinism. The bind operation selects one element of the list (intuitively all of them in parallel worlds), lets the programmer to do some computation with them, and then combines the results in all worlds to single list (by concatenating, or flattening, a nested list). Here is how one would define a permutation function in the monadic framework of Haskell:
perm [e] = [[e]]
perm l = do (leader, index) <- zip l [0 :: Int ..]
let shortened = take index l ++ drop (index + 1) l
trailer <- perm shortened
return (leader : trailer)
Here is an example repl session:
*Main> perm "a"
["a"]
*Main> perm "ab"
["ab","ba"]
*Main> perm ""
[]
*Main> perm "abc"
["abc","acb","bac","bca","cab","cba"]
It should be noted that the list monad is in no way a side effecting computation. A mathematical structure being a monad (i.e. conforming to the above mentioned interfaces and laws) does not imply side effects, though side-effecting phenomena often nicely fit into the monadic framework.
You need monads if you have a type constructor and functions that returns values of that type family. Eventually, you would like to combine these kind of functions together. These are the three key elements to answer why.
Let me elaborate. You have Int, String and Real and functions of type Int -> String, String -> Real and so on. You can combine these functions easily, ending with Int -> Real. Life is good.
Then, one day, you need to create a new family of types. It could be because you need to consider the possibility of returning no value (Maybe), returning an error (Either), multiple results (List) and so on.
Notice that Maybe is a type constructor. It takes a type, like Int and returns a new type Maybe Int. First thing to remember, no type constructor, no monad.
Of course, you want to use your type constructor in your code, and soon you end with functions like Int -> Maybe String and String -> Maybe Float. Now, you can't easily combine your functions. Life is not good anymore.
And here's when monads come to the rescue. They allow you to combine that kind of functions again. You just need to change the composition . for >==.
Why do we need monadic types?
Since it was the quandary of I/O and its observable effects in nonstrict languages like Haskell that brought the monadic interface to such prominence:
[...] monads are used to address the more general problem of computations (involving state, input/output, backtracking, ...) returning values: they do not solve any input/output-problems directly but rather provide an elegant and flexible abstraction of many solutions to related problems. [...] For instance, no less than three different input/output-schemes are used to solve these basic problems in Imperative functional programming, the paper which originally proposed `a new model, based on monads, for performing input/output in a non-strict, purely functional language'. [...]
[Such] input/output-schemes merely provide frameworks in which side-effecting operations can safely be used with a guaranteed order of execution and without affecting the properties of the purely functional parts of the language.
Claus Reinke (pages 96-97 of 210).
(emphasis by me.)
[...] When we write effectful code – monads or no monads – we have to constantly keep in mind the context of expressions we pass around.
The fact that monadic code ‘desugars’ (is implementable in terms of) side-effect-free code is irrelevant. When we use monadic notation, we program within that notation – without considering what this notation desugars into. Thinking of the desugared code breaks the monadic abstraction. A side-effect-free, applicative code is normally compiled to (that is, desugars into) C or machine code. If the desugaring argument has any force, it may be applied just as well to the applicative code, leading to the conclusion that it all boils down to the machine code and hence all programming is imperative.
[...] From the personal experience, I have noticed that the mistakes I make when writing monadic code are exactly the mistakes I made when programming in C. Actually, monadic mistakes tend to be worse, because monadic notation (compared to that of a typical imperative language) is ungainly and obscuring.
Oleg Kiselyov (page 21 of 26).
The most difficult construct for students to understand is the monad. I introduce IO without mentioning monads.
Olaf Chitil.
More generally:
Still, today, over 25 years after the introduction of the concept of monads to the world of functional programming, beginning functional programmers struggle to grasp the concept of monads. This struggle is exemplified by the numerous blog posts about the effort of trying to learn about monads. From our own experience we notice that even at university level, bachelor level students often struggle to comprehend monads and consistently score poorly on monad-related exam questions.
Considering that the concept of monads is not likely to disappear from the functional programming landscape any time soon, it is vital that we, as the functional programming community, somehow overcome the problems novices encounter when first studying monads.
Tim Steenvoorden, Jurriën Stutterheim, Erik Barendsen and Rinus Plasmeijer.
If only there was another way to specify "a guaranteed order of execution" in Haskell, while keeping the ability to separate regular Haskell definitions from those involved in I/O (and its observable effects) - translating this variation of Philip Wadler's echo:
val echoML : unit -> unit
fun echoML () = let val c = getcML () in
if c = #"\n" then
()
else
let val _ = putcML c in
echoML ()
end
fun putcML c = TextIO.output1(TextIO.stdOut,c);
fun getcML () = valOf(TextIO.input1(TextIO.stdIn));
...could then be as simple as:
echo :: OI -> ()
echo u = let !(u1:u2:u3:_) = partsOI u in
let !c = getChar u1 in
if c == '\n' then
()
else
let !_ = putChar c u2 in
echo u3
where:
data OI -- abstract
foreign import ccall "primPartOI" partOI :: OI -> (OI, OI)
⋮
foreign import ccall "primGetCharOI" getChar :: OI -> Char
foreign import ccall "primPutCharOI" putChar :: Char -> OI -> ()
⋮
and:
partsOI :: OI -> [OI]
partsOI u = let !(u1, u2) = partOI u in u1 : partsOI u2
How would this work? At run-time, Main.main receives an initial OI pseudo-data value as an argument:
module Main(main) where
main :: OI -> ()
⋮
...from which other OI values are produced, using partOI or partsOI. All you have to do is ensure each new OI value is used at most once, in each call to an OI-based definition, foreign or otherwise. In return, you get back a plain ordinary result - it isn't e.g. paired with some odd abstract state, or requires the use of a callback continuation, etc.
Using OI, instead of the unit type () like Standard ML does, means we can avoid always having to use the monadic interface:
Once you're in the IO monad, you're stuck there forever, and are reduced to Algol-style imperative programming.
Robert Harper.
But if you really do need it:
type IO a = OI -> a
unitIO :: a -> IO a
unitIO x = \ u -> let !_ = partOI u in x
bindIO :: IO a -> (a -> IO b) -> IO b
bindIO m k = \ u -> let !(u1, u2) = partOI u in
let !x = m u1 in
let !y = k x u2 in
y
⋮
So, monadic types aren't always needed - there are other interfaces out there:
LML had a fully fledged implementation of oracles running of a multi-processor (a Sequent Symmetry) back in ca 1989. The description in the Fudgets thesis refers to this implementation. It was fairly pleasant to work with and quite practical.
[...]
These days everything is done with monads so other solutions are sometimes forgotten.
Lennart Augustsson (2006).
Wait a moment: since it so closely resembles Standard ML's direct use of effects, is this approach and its use of pseudo-data referentially transparent?
Absolutely - just find a suitable definition of "referential transparency"; there's plenty to choose from...

Removing common haskell piping boilerplate

I have some pretty common Haskell boilerplate that shows up in a lot of places. It looks something like this (when instantiating classes):
a <= b = (modify a) <= (modify b)
like this (with normal functions):
fn x y z = fn (foo x) (foo y) (foo z)
and sometimes even with tuples, as in:
mod (x, y) = (alt x, alt y)
It seems like there should be an easy way to reduce all of this boilerplate and not have to repeat myself quite so much. (These are simple examples, but it does get annoying). I imagine that there are abstractions created for removing such boilerplate, but I'm not sure what they're called nor where to look. Can any haskellites point me in the right direction?
For the (<=) case, consider defining compare instead; you can then use Data.Ord.comparing, like so:
instance Ord Foo where
compare = comparing modify
Note that comparing can simply be defined as comparing f = compare `on` f, using Data.Function.on.
For your fn example, it's not clear. There's no way to simplify this type of definition in general. However, I don't think the boilerplate is too bad in this instance.
For mod:
mod = alt *** alt
using Control.Arrow.(***) — read a b c as b -> c in the type signature; arrows are just a general abstraction (like functors or monads) of which functions are an instance. You might like to define both = join (***) (which is itself shorthand for both f = f *** f); I know at least one other person who uses this alias, and I think it should be in Control.Arrow proper.
So, in general, the answer is: combinators, combinators, combinators! This ties directly in with point-free style. It can be overdone, but when the combinators exist for your situation, such code can not only be cleaner and shorter, it can be easier to read: you only have to learn an abstraction once, and can then apply it everywhere when reading code.
I suggest using Hoogle to find these combinators; when you think you see a general pattern underlying a definition, try abstracting out what you think the common parts are, taking the type of the result, and searching for it on Hoogle. You might find a combinator that does just what you want.
So, for instance, for your mod case, you could abstract out alt, yielding \f (a,b) -> (f a, f b), then search for its type, (a -> b) -> (a, a) -> (b, b) — there's an exact match, but it's in the fgl graph library, which you don't want to depend on. Still, you can see how the ability to search by type can be very valuable indeed!
There's also a command-line version of Hoogle with GHCi integration; see its HaskellWiki page for more information.
(There's also Hayoo, which searches the entirety of Hackage, but is slightly less clever with types; which one you use is up to personal preference.)
For some of the boilerplate, Data.Function.on can be helpful, although in these examples, it doesn't gain much
instance Ord Foo where
(<=) = (<=) `on` modify -- maybe
mod = uncurry ((,) `on` alt) -- Not really

Typed FP: Tuple Arguments and Curriable Arguments

In statically typed functional programming languages, like Standard ML, F#, OCaml and Haskell, a function will usually be written with the parameters separated from each other and from the function name simply by whitespace:
let add a b =
a + b
The type here being "int -> (int -> int)", i.e. a function that takes an int and returns a function which its turn takes and int and which finally returns an int. Thus currying becomes possible.
It's also possible to define a similar function that takes a tuple as an argument:
let add(a, b) =
a + b
The type becomes "(int * int) -> int" in this case.
From the point of view of language design, is there any reason why one could not simply identify these two type patterns in the type algebra? In other words, so that "(a * b) -> c" reduces to "a -> (b -> c)", allowing both variants to be curried with equal ease.
I assume this question must have cropped up when languages like the four I mentioned were designed. So does anyone know any reason or research indicating why all four of these languages chose not to "unify" these two type patterns?
I think the consensus today is to handle multiple arguments by currying (the a -> b -> c form) and to reserve tuples for when you genuinely want values of tuple type (in lists and so on). This consensus is respected by every statically typed functional language since Standard ML, which (purely as a matter of convention) uses tuples for functions that take multiple arguments.
Why is this so? Standard ML is the oldest of these languages, and when people were first writing ML compilers, it was not known how to handle curried arguments efficiently. (At the root of the problem is the fact that any curried function could be partially applied by some other code you haven't seen yet, and you have to compile it with that possibility in mind.) Since Standard ML was designed, compilers have improved, and you can read about the state of the art in an excellent paper by Simon Marlow and Simon Peyton Jones.
LISP, which is the oldest functional language of them all, has a concrete syntax which is extremely unfriendly to currying and partial application. Hrmph.
At least one reason not to conflate curried and uncurried functional types is that it would be very confusing when tuples are used as returned values (a convenient way in these typed languages to return multiple values). In the conflated type system, how can function remain nicely composable? Would a -> (a * a) also transform to a -> a -> a? If so, are (a * a) -> a and a -> (a * a) the same type? If not, how would you compose a -> (a * a) with (a * a) -> a?
You propose an interesting solution to the composition issue. However, I don't find it satisfying because it doesn't mesh well with partial application, which is really a key convenience of curried functions. Let me attempt to illustrate the problem in Haskell:
apply f a b = f a b
vecSum (a1,a2) (b1,b2) = (a1+b1,a2+b2)
Now, perhaps your solution could understand map (vecSum (1,1)) [(0,1)], but what about the equivalent map (apply vecSum (1,1)) [(0,1)]? It becomes complicated! Does your fullest unpacking mean that the (1,1) is unpacked with apply's a & b arguments? That's not the intent... and in any case, reasoning becomes complicated.
In short, I think it would be very hard to conflate curried and uncurried functions while (1) preserving the semantics of code valid under the old system and (2) providing a reasonable intuition and algorithm for the conflated system. It's an interesting problem, though.
Possibly because it is useful to have a tuple as a separate type, and it is good to keep different types separate?
In Haskell at least, it is quite easy to go from one form to the other:
Prelude> let add1 a b = a+b
Prelude> let add2 (a,b) = a+b
Prelude> :t (uncurry add1)
(uncurry add1) :: (Num a) => (a, a) -> a
Prelude> :t (curry add2)
(curry add2) :: (Num a) => a -> a -> a
so uncurry add1 is same as add2 and curry add2 is the same as add1. I guess similar things are possible in the other languages as well. What greater gains would there be to automatically currying every function defined on a tuple? (Since that's what you seem to be asking.)
Expanding on the comments under namin's good answer:
So assume a function which has type 'a -> ('a * 'a):
let gimme_tuple(a : int) =
(a*2, a*3)
Then assume a function which has type ('a * 'a) -> 'b:
let add(a : int, b) =
a + b
Then composition (assuming the conflation that I propose) wouldn't pose any problem as far as I can see:
let foo = add(gimme_tuple(5))
// foo gets value 5*2 + 5*3 = 25
But then you could conceive of a polymorphic function that takes the place of add in the last code snippet, for instance a little function that just takes a 2-tuple and makes a list of the two elements:
let gimme_list(a, b) =
[a, b]
This would have the type ('a * 'a) -> ('a list). Composition now would be problematic. Consider:
let bar = gimme_list(gimme_tuple(5))
Would bar now have the value [10, 15] : int list or would bar be a function of type (int * int) -> ((int * int) list), which would eventually return a list whose head would be the tuple (10, 15)? For this to work, I posited in a comment to namin's answer that one would need an additional rule in the type system that the binding of actual to formal parameters be the "fullest possible", i.e. that the system should attempt a non-partial binding first and only try a partial binding if a full binding is unattainable. In our example, it would mean that we get the value [10, 15] because a full binding is possible in this case.
Is such a concept of "fullest possible" inherently meaningless? I don't know, but I can't immediately see a reason it would be.
One problem is of course if you want the second interpretation of the last code snippet, then you'd need to go jump through an extra hoop (typically an anonymous function) to get that.
Tuples are not present in these languages simply to be used as function parameters. They are a really convenient way to represent structured data, e.g., a 2D point (int * int), a list element ('a * 'a list), or a tree node ('a * 'a tree * 'a tree). Remember also that structures are just syntactic sugar for tuples. I.e.,
type info =
{ name : string;
age : int;
address : string; }
is a convenient way of handling a (string * int * string) tuple.
There's no fundamental reason you need tuples in a programming language (you can build up tuples in the lambda calculus, just as you can booleans and integers—indeed using currying*), but they are convenient and useful.
* E.g.,
tuple a b = λf.f a b
fst x = x (λa.λb.a)
snd x = x (λa.λb.b)
curry f = λa.λb.f (λg.g a b)
uncurry f = λx.x f

Resources