Understanding `ap` in a point-free function in Haskell

I am able to understand the basics of point-free functions in Haskell:
addOne x = 1 + x
As we see x on both sides of the equation, we simplify it:
addOne = (+ 1)
Incredibly, it turns out that functions where the same argument is used twice in different places can also be written point-free!
Let me take as a basic example the average function written as:
average xs = realToFrac (sum xs) / genericLength xs
It may seem impossible to simplify xs, but http://pointfree.io/ comes out with:
average = ap ((/) . realToFrac . sum) genericLength
That works.
As far as I understand, this states that average is the same as calling ap on two functions: the composition (/) . realToFrac . sum, and genericLength.
Unfortunately the ap function makes no sense whatsoever to me, the docs http://hackage.haskell.org/package/base-4.8.1.0/docs/Control-Monad.html#v:ap state:
ap :: Monad m => m (a -> b) -> m a -> m b
In many situations, the liftM operations can be replaced by uses of ap,
which promotes function application.
return f `ap` x1 `ap` ... `ap` xn
is equivalent to
liftMn f x1 x2 ... xn
But writing:
let average = liftM2 ((/) . realToFrac . sum) genericLength
does not work (it gives a very long type error message; ask and I'll include it), so I do not understand what the docs are trying to say.
How does the expression ap ((/) . realToFrac . sum) genericLength work? Could you explain ap in simpler terms than the docs?

Any lambda term can be rewritten to an equivalent term that uses just a set of suitable combinators and no lambda abstractions. This process is called abstraction elimination. During the process you want to remove lambda abstractions from the inside out. So at one step you have λx.M where M is already free of lambda abstractions, and you want to get rid of x.
If M is x, you replace λx.x with id (id is usually denoted by I in combinatory logic).
If M doesn't contain x, you replace the term with const M (const is usually denoted by K in combinatory logic).
If M is PQ, that is the term is λx.PQ, you want to "push" x inside both parts of the function application so that you can recursively process both parts. This is accomplished by using the S combinator defined as λfgx.(fx)(gx), that is, it takes two functions and passes x to both of them, applying the results together. You can easily verify that λx.PQ is equivalent to S(λx.P)(λx.Q), and we can recursively process both subterms.
As described in the other answers, the S combinator is available in Haskell as ap (or <*>) specialized to the reader monad.
The appearance of the reader monad isn't accidental: solving the task of replacing λx.M with an equivalent function is basically lifting M :: a to the reader monad r -> a (actually the reader Applicative part is enough), where r is the type of x. If we revise the process above:
The only case that is actually connected with the reader monad is when M is x. Then we "lift" x to id, to get rid of the variable. The other cases below are just mechanical applications of lifting an expression to an applicative functor:
In the other case, λx.M where M doesn't contain x, it's just lifting M to the reader applicative, which is pure M. Indeed, for (->) r, pure is equivalent to const.
In the last case, <*> :: f (a -> b) -> f a -> f b is function application lifted to a monad/applicative. And this is exactly what we do: We lift both parts P and Q to the reader applicative and then use <*> to bind them together.
The process can be further improved by adding more combinators, which allows the resulting term to be shorter. Most often, combinators B and C are used, which in Haskell correspond to the functions (.) and flip. And again, (.) is just fmap/<$> for the reader applicative. (I'm not aware of a built-in function for expressing flip, but it could be viewed as the specialization of f (a -> b) -> a -> f b to the reader applicative.)
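To make this concrete, here is a minimal sketch of the three combinators at reader types; the lowercase names i, k and s are mine, while id, const and (<*>) are the standard functions they coincide with:
i :: r -> r
i = id                                     -- I: handles λx.x

k :: a -> (r -> a)
k = const                                  -- K: handles λx.M when M doesn't mention x

s :: (r -> a -> b) -> (r -> a) -> (r -> b)
s = (<*>)                                  -- S: handles λx.PQ = S (λx.P) (λx.Q)

-- Eliminating xs from the question's average:
--   λxs. ((/) . realToFrac . sum) xs (genericLength xs)
--   =    s ((/) . realToFrac . sum) genericLength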
Some time ago I wrote a short article about this: The Monad Reader Issue 17, The Reader Monad and Abstraction Elimination.

When the monad m is (->) a, as in your case, you can define ap as follows:
ap f g = \x -> f x (g x)
We can see that this indeed "works" in your pointfree example.
average = ap ((/) . realToFrac . sum) genericLength
average = \x -> ((/) . realToFrac . sum) x (genericLength x)
average = \x -> (/) (realToFrac (sum x)) (genericLength x)
average = \x -> realToFrac (sum x) / genericLength x
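As a quick sanity check, here's a hypothetical GHCi session (the two imports are the ones the example needs):
ghci> import Control.Monad (ap)
ghci> import Data.List (genericLength)
ghci> let average = ap ((/) . realToFrac . sum) genericLength
ghci> average [1,2,3,4] :: Double
2.5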
We can also derive ap from the general law
ap f g = do ff <- f ; gg <- g ; return (ff gg)
that is, desugaring the do-notation
ap f g = f >>= \ff -> g >>= \gg -> return (ff gg)
If we substitute the definitions of the monad methods
m >>= f = \x -> f (m x) x
return x = \_ -> x
we get the previous definition of ap back (for our specific monad (->) a). Indeed:
ap f g
= f >>= \ff -> g >>= \gg -> return (ff gg)
= f >>= \ff -> g >>= \gg -> \_ -> ff gg
= f >>= \ff -> g >>= \gg _ -> ff gg
= f >>= \ff -> \x -> (\gg _ -> ff gg) (g x) x
= f >>= \ff -> \x -> (\_ -> ff (g x)) x
= f >>= \ff -> \x -> ff (g x)
= f >>= \ff x -> ff (g x)
= \y -> (\ff x -> ff (g x)) (f y) y
= \y -> (\x -> f y (g x)) y
= \y -> f y (g y)
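As a quick numeric spot-check (hypothetical GHCi session) that the library ap agrees with the hand-rolled \x -> f x (g x) for this monad:
ghci> import Control.Monad (ap)
ghci> ap (+) (*2) 5                      -- \x -> x + (x * 2)
15
ghci> (\f g x -> f x (g x)) (+) (*2) 5
15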

The Simple Bit: fixing liftM2
The problem in the original example is that ap works a bit differently from the liftM functions. ap takes a function wrapped up in a monad, and applies it to an argument wrapped up in a monad. But the liftMn functions take a "normal" function (one which is not wrapped up in a monad) and apply it to argument(s) that are wrapped up in monads.
I'll explain more about what that means below, but the upshot is that if you want to use liftM2, then you have to pull (/) out and make it a separate argument at the beginning. (So in this case (/) is the "normal" function.)
let average = liftM2 ((/) . realToFrac . sum) genericLength -- does not work
let average = liftM2 (/) (realToFrac . sum) genericLength -- works
As posted in the original question, calling liftM2 should involve three arguments: liftM2 f x1 x2. Here the f is (/), x1 is (realToFrac . sum) and x2 is genericLength.
The version posted in the question (the one which doesn't work) was trying to call liftM2 with only two arguments.
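For completeness, here is the fixed version in a hypothetical GHCi session:
ghci> import Control.Monad (liftM2)
ghci> import Data.List (genericLength)
ghci> let average = liftM2 (/) (realToFrac . sum) genericLength
ghci> average [1,2,3,4] :: Double
2.5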
The explanation
I'll build this up in a few stages. I'll start with some specific values, and build up to a function that can take any set of values. Jump to the last section for the TL;DR.
In this example, let's assume the list of numbers is [1,2,3,4]. The sum of these numbers is 10, and the length of the list is 4. The average is 10/4 or 2.5.
To shoe-horn this into the right form for ap, we're going to break this into a function, an input, and a result.
ourFunction = (10/) -- "divide 10 by"
ourInput = 4
ourResult = 2.5
Three kinds of Function Application
ap and liftM both involve monads. At this point in the explanation, you can think of a monad as something that a value can be "wrapped up in". I'll give a better definition below.
Normal function application applies a normal function to a normal input. liftM applies a normal function to an input wrapped in a monad, and ap applies a function wrapped in a monad to an input wrapped in a monad.
(10/) 4 -- returns 2.5
liftM (10/) monad(4) -- returns monad(2.5)
ap monad(10/) monad(4) -- returns monad(2.5)
(Note that this is pseudocode. monad(4) is not actually valid Haskell).
(Note that liftM is a different function from liftM2, which was used earlier. liftM takes a function and only one argument, which is a better fit for the pattern I'm describing.)
In the average function defined above, the monads were functions, but "functions-as-monads" can be hard to talk about, so I'll start with simpler examples.
So what's a monad?
A better description of a monad is "something which contains a value, or produces a value, or which you can somehow extract a value from, but which also has something more complicated going on."
That's a really vague description, but it kind of has to be, because the "something more complicated" can be a lot of different things.
Monads can be confusing, but the point of them is that when you use monad operations (like ap and liftM) they will take care of the "something more complicated" for you, so you can just concentrate on the values.
That's probably still not very clear, so let's do some examples:
The Maybe monad
ap (Just (10/)) (Just 4) -- result is (Just 2.5)
One of the simplest monads is 'Maybe'. The value is whatever is contained inside a Just. So if we call ap and give it (Just ourFunction) and (Just ourInput) then we get back (Just ourResult).
The "something more complicated" is the fact that there might not be a value there at all, and you have to allow for the Nothing case.
As mentioned, the point of using a function like ap is that it takes care of these extra complications for us. With the Maybe monad, ap handles this by returning Nothing if either the Maybe-function or the Maybe-input were Nothing.
ap (Just (10/)) Nothing -- result is Nothing
ap Nothing (Just 4) -- result is Nothing
The List Monad
ap [(10/)] [4] -- result is [2.5]
With the list monad, the value is whatever is inside the list. So ap [ourFunction] [ourInput] returns [ourResult].
The "something more complicated" is that there may be more than one thing inside the list (or exactly one thing, or nothing at all).
With lists, that means ap takes a list of zero or more functions, and a list of zero or more inputs. It handles that by returning a list of zero or more results: one result for every possible combination of function and input.
ap [(10/), (100/)] [5,4,2] -- result is [2.0, 2.5, 5.0, 20.0, 25.0, 50.0]
Functions as Monads
A function like genericLength is considered a Monad because it has a value (the function's output), and it has a "something more complicated" (the fact that you have to supply an input before you can get the value).
This is where it gets a little confusing, because we're dealing with multiple functions, multiple inputs, and multiple results. It is all well defined, it's just hard to describe, so we have to be careful with our terminology.
Let's start with the list [1,2,3,4], and call that our "original input". That's the list we're trying to find the average of. It's the xs argument in the original average function.
If we give our original input ([1,2,3,4]) to genericLength then we get a value of 4.
Our other function is ((/) . realToFrac . sum). It takes our list [1,2,3,4] and finds the sum (10), turns that into a fractional value, and then feeds it as the first argument to (/). The result is an incomplete division function that is waiting for another argument. I.e. it takes [1,2,3,4] as an input, and produces (10/) as its output.
This all fits with the way ap is defined for functions. With functions, ap takes two things. The first is a function that reads the original input and produces a new function. The second is a function that reads the original input and produces a new input. The final result is a function that takes the original input, and returns the same thing you would get if you applied the new function to the new input.
You might have to read that a few times to make sense of it. Alternatively, here it is in pseudocode:
average = ap (functionThatTakes [1,2,3,4] and returns "(10/)")
             (functionThatTakes [1,2,3,4] and returns " 4 ")
-- which means:
average =    (functionThatTakes [1,2,3,4] and returns "2.5")
If you compare this to the simpler examples above, you'll see that it still has our function (10/), our input 4 and our result 2.5. And each of them is once again wrapped up in the "something more complicated". In this case, the "something more complicated" is the "function that takes [1,2,3,4] and returns...".
Of course, since they're functions, they don't have to take [1,2,3,4] as their input. If they took a different list of integers (eg [1,2,3,4,5]) then we would get different results (e.g. new function: (15/), new input 5 and new value 3).
Other examples
minPlusMax = ap ((+) . minimum) maximum
-- a function that adds the minimum element of a list to the maximum element
upperAndLower = ap ((,) . toUpper) toLower
-- a function that takes a Char and returns a tuple, with the upper case and lower case versions of a character
These could all also be defined using liftM2.
average = liftM2 (/) sum genericLength
minPlusMax = liftM2 (+) minimum maximum
upperAndLower = liftM2 (,) toUpper toLower
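A hypothetical GHCi session checking these (toUpper and toLower come from Data.Char):
ghci> import Control.Monad (ap)
ghci> import Data.Char (toUpper, toLower)
ghci> ap ((+) . minimum) maximum [3,1,4,1,5]
6
ghci> ap ((,) . toUpper) toLower 'q'
('Q','q')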


Understanding this assignment?

In order to refresh my 20-year-old experience with Haskell, I am walking through https://en.wikibooks.org/wiki/Write_Yourself_a_Scheme_in_48_Hours/Adding_Variables_and_Assignment and at one point the following line is introduced to apply op to all the params. This is to implement e.g. (+ 1 2 3 4)
numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op
I do not understand the syntax, and the explanation in the text is a bit vague.
I understand what foldl1 does and how to dot functions (unpackNum is a helper function), but using Monads and the >>= operator leaves me a bit confused.
How is this to be read?
Essentially,
mapM unpackNum params >>= return . Number . foldl1 op
is made of two parts.
mapM unpackNum params means: take the list of parameters params. On each item, apply unpackNum: this will produce an Integer wrapped inside the ThrowsError monad. So, it's not exactly a plain Integer, since it has a chance to error out. Anyway, using unpackNum on each item either successfully produces all Integers, or throws some error. In the first case, we produce a new list of type [Integer], in the second one we (unsurprisingly) throw the error. So, the resulting type for this part is ThrowsError [Integer].
The second part is ... >>= return . Number . foldl1 op. Here >>= means: if the first part threw some error, the whole expression throws that error as well. If the first part succeeded in producing [Integer], then proceed with foldl1 op, wrap the result as a Number, and finally use return to inject this value as a successful computation.
Overall there are monadic computations, but you should not think about those too much. The monadic stuff here is only propagating the errors around, or storing plain values if the computation is successful. With a bit of experience, one can concentrate on the successful values only, and let mapM, >>=, and return handle the error cases.
By the way, note that while the book uses code like action >>= return . f, this is arguably a bad style. One can use, to the same effect, fmap f action or f <$> action, which is a more direct way to express the same computation. E.g.
Number . foldl1 op <$> mapM unpackNum params
which is very close to a non-monadic code which ignores the error cases
-- this would work if there was no monad around, and errors did not have to be handled
Number . foldl1 op $ map unpackNum params
Your question is about syntax, so I'll just talk about how to parse that expression. Haskell's syntax is pretty simple. Informally:
identifiers separated by spaces are function application (the first identifier applied to the rest)
except identifiers that use non-alphanumeric characters (e.g. >>=, or .) are infix (i.e. their first argument is to the left of the identifier)
the first type of function application above (non-infix) binds more tightly than the second
operators can associate either to the left or right, and have different precedence (defined with an infix... declaration)
So only knowing this, if I see:
mapM unpackNum params >>= return . Number . foldl1 op
To begin with I know that it must be parsed like
(mapM unpackNum params) >>= return . Number . (foldl1 op)
To go further we need to inspect the fixity/precedence of the two operators we see in this expression:
Prelude> :info (.)
(.) :: (b -> c) -> (a -> b) -> a -> c -- Defined in ‘GHC.Base’
infixr 9 .
Prelude> :info (>>=)
class Applicative m => Monad (m :: * -> *) where
(>>=) :: m a -> (a -> m b) -> m b
...
-- Defined in ‘GHC.Base’
infixl 1 >>=
(.) has a higher precedence (9 vs 1 for >>=), so its arguments will bind more tightly (i.e. we parenthesize them first). But how do we know which of these is correct?
(mapM unpackNum params) >>= ((return . Number) . (foldl1 op))
(mapM unpackNum params) >>= (return . (Number . (foldl1 op)))
...? Because (.) was declared infixr it associates to the right, meaning the second parse above is correct.
As Will Ness points out in the comments, (.) is associative (like e.g. addition) so both of these happen to be semantically equivalent.
With a little experience with a library (or the Prelude in this case) you learn to parse expressions with operators correctly without thinking too much.
If after doing this exercise you want to understand what a function does or how it works, then you can click through to the source of the functions you're interested in and replace occurrences of left-hand sides with right-hand sides (i.e. inline the bodies of the functions and terms). Obviously you can do this in your head or in an editor.
You could "sugar this up" with a more beginner-friendly syntax, with the do notation. Your function, numericBinop op params = mapM unpackNum params >>= return . Number . foldl1 op would become:
numericBinop op params = do
    x <- mapM unpackNum params     -- "<-" translates to ">>=", the bind operator
    return . Number $ foldl1 op x
Well, now the most mysterious part is the mapM function: mapM f is sequence . fmap f. It simply takes a function, fmaps it over the container, and flips the type, in this case from [ThrowsError Integer] to ThrowsError [Integer], while preserving any errors (side effects) that may arise during the flipping; in other words, if the 'flipping' caused any error, it would be represented in the result.
Not the simplest example, and you probably would be better off seeing how mapM (fmap (+1)) [Just 2, Just 3] differs from mapM (fmap (+1)) [Just 2, Nothing]. For more insights look into Traversable typeclass.
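For reference, here is what those two calls evaluate to (a sketch; notice how a single Nothing wipes out the whole result):
ghci> mapM (fmap (+1)) [Just 2, Just 3]
Just [3,4]
ghci> mapM (fmap (+1)) [Just 2, Nothing]
Nothing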
After that, you bind the [Integer] out of the ThrowsError monad and feed it to the function that takes care of doing the computation on the list, resulting in a single Integer, which in turn you need to re-embed into the ThrowsError monad with the return function after you wrap it into a Number.
If you still have trouble understanding monads, I suggest you take a look at the still relevant LYAH chapter that will gently introduce you to monads
>>= builds a computation that may fail at either end: its left argument can be an empty monad, in which case it does not even happen, otherwise its result can be empty as well. It has type
(>>=) :: m a -> (a -> m b) -> m b
See, its arguments are: value(s) immersed into monad and a function that accepts a pure value and returns an immersed result. This operator is a monadic version of what is known as flatMap in Scala, for instance; in Haskell, its particular implementation for lists is known as concatMap. If you have a list l, then l >>= f works as follows: for each element of l, f is applied to this element and returns a list; and all those resultant lists are concatenated to produce result.
Consider a code in Java:
try {
    function1();
    function2();
}
catch (Exception e) {
}
What happens when function2 is called? See, after the call to function1 the program is probably in a valid state, so function2() is an operator that transforms this current state into some different next one. But the call to function1() could result in an exception thrown, so the control would immediately transfer to the catch-block—this can be regarded as null state, so there's nothing to apply function2 to. In other words, we have the following possible control paths:
[S0] --- function1() --> [S1] --- function2() --> [S2]
[S0] --- function1() --> [] --> catch
(For simplicity, exceptions thrown from function2 are not considered in this diagram.)
So, either [S1] is a (non-empty) valid machine state, and function2 transforms it further to a (non-empty) valid [S2], or it is empty, and thus function2() is a no-op and never run. This can be summarized in pseudo-code as
S2 <- [S0] >>= function1 >>= function2
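The same control flow written with Haskell's Maybe monad; function1, function2 and the starting state here are hypothetical stand-ins for the Java calls above:
function1, function2 :: Int -> Maybe Int
function1 s = if s >= 0 then Just (s + 1) else Nothing   -- may "fail", like throwing
function2 s = Just (s * 2)                               -- always succeeds

-- Just 0    >>= function1 >>= function2   gives Just 2
-- Just (-1) >>= function1 >>= function2   gives Nothing; function2 never runs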
First, syntax. Whitespace is application, semantically:
f x = f $ x -- "call" f with an argument x
so your expression is actually
numericBinop op params = ((mapM unpackNum) params) >>= return . Number . (foldl1 op)
Next, the operators are built from non-alphanumerical characters, without any whitespace. Here, there's . and >>=. Running :i (.) and :i (>>=) at GHCi reveals their fixity specs are infixr 9 . and infixl 1 >>=. 9 is above 1 so . binds stronger than >>=; thus
= ((mapM unpackNum) params) >>= (return . Number . (foldl1 op))
infixr 9 . means . associates to the right; thus, finally, it is
= ((mapM unpackNum) params) >>= (return . (Number . (foldl1 op)))
The (.) is defined as (f . g) x = f (g x), thus (f . (g . h)) x = f ((g . h) x) = f (g (h x)) = (f . g) (h x) = ((f . g) . h) x; by eta-reduction we have
(f . (g . h)) = ((f . g) . h)
thus (.) is associative, and so parenthesization is optional. We'll drop the explicit parens with the "whitespace" application from now on as well. Thus we have
numericBinop op params = (mapM unpackNum params) >>=
                         (\ x -> return (Number (foldl1 op x)))   -- '\' is for 'λ'ambda
Monadic sequences are easier written with do, and the above is equivalent to
= do
  { x <- mapM unpackNum params       -- { ; } are optional, IF all 'do'
  ; return (Number (foldl1 op x))    -- lines are indented at the same level
  }
Next, mapM can be defined as
mapM f []     = return []
mapM f (x:xs) = do { x  <- f x ;
                     xs <- mapM f xs ;
                     return (x : xs) }
and the Monad Laws demand that
do { r <- do { x ;             do { x ;
               y } ;    ===         r <- y ;
     foo r                          foo r
   }                              }
(you can find an overview of do notation e.g. in this recent answer of mine); thus,
numericBinop op [a, b, ..., z] =
    do {
         a <- unpackNum a ;
         b <- unpackNum b ;
         ...........
         z <- unpackNum z ;
         return (Number (foldl1 op [a, b, ..., z]))
       }
(you may have noticed my use of x <- x bindings -- we can use the same variable name on both sides of <-, because monadic bindings are not recursive -- thus introducing shadowing.)
This is now clearer, hopefully.
But, I said "first, syntax". So now, the meaning of it. By same Monad Laws,
numericBinop op [a, b, ..., y, z] =
    do {
         xs <- do { a <- unpackNum a ;
                    b <- unpackNum b ;
                    ...........
                    y <- unpackNum y ;
                    return [a, b, ..., y] } ;
         z <- unpackNum z ;
         return (Number (op (foldl1 op xs) z))
       }
thus, we need only understand the sequencing of two "computations", c and d,
do { a <- c ; b <- d ; return (foo a b) }
  =
c >>= (\ a ->
  d >>= (\ b ->
    return (foo a b) ))
for a particular monad involved, which is determined by the bind (>>=) operator's implementation for a given monad.
Monads are EDSLs for generalized function composition. The sequencing of computations involves not only the explicit expressions appearing in the do sequence, but also the implicit effects peculiar to the particular monad in question, performed in principled and consistent manner behind the scenes. Which is the whole point to having monads in the first place (well, one of the main points, at least).
Here the monad involved appears to concern itself with the possibility of failure, and early bail-outs in the event that failure indeed happens.
So, with the do code we write the essence of what we intend to happen, and the possibility of intermittent failure is automatically taken care of, for us, behind the scenes.
In other words, if one of unpackNum computations fails, so will the whole of the combined computation fail, without attempting any of the subsequent unpackNum sub-computations. But if all of them succeed, so will the combined computation.
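To see this bail-out concretely: the book's ThrowsError is built on Either (if I recall the wikibook correctly), so the same behaviour shows up with plain Either String, where unpack is a hypothetical stand-in for unpackNum:
ghci> let unpack x = if x >= 0 then Right x else Left ("bad: " ++ show x)
ghci> mapM unpack [1,2,3]
Right [1,2,3]
ghci> mapM unpack [1,-2,3]
Left "bad: -2"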

Partial application of functions and currying, how to make a better code instead of a lot of maps?

I am a beginner at Haskell and I am trying to grasp it.
I am having the following problem:
I have a function that gets 5 parameters, let's say
f x y w z a = x - y - w - z - a
And I would like to apply it while only changing the variable x from 1 to 10 whereas y, w, z and a will always be the same. The implementation I achieved was the following but I think there must be a better way.
Let's say I would like to use:
x from 1 to 10
y = 1
w = 2
z = 3
a = 4
According to this, I managed to apply the function as follows:
map ($ 4) $ map ($ 3) $ map ($ 2) $ map ($ 1) (map f [1..10])
I think there must be a better way to apply a lot of missing parameters to partially applied functions without having to use too many maps.
All the suggestions so far are good. Here's another, which might seem a bit weird at first, but turns out to be quite handy in lots of other situations.
Some type-forming operators, like [], which is the operator which maps a type of elements, e.g. Int to the type of lists of those elements, [Int], have the property of being Applicative. For lists, that means there is some way, denoted by the operator, <*>, pronounced "apply", to turn lists of functions and lists of arguments into lists of results.
(<*>) :: [s -> t] -> [s] -> [t] -- one instance of the general type of <*>
rather than your ordinary application, given by a blank space, or a $
($) :: (s -> t) -> s -> t
The upshot is that we can do ordinary functional programming with lists of things instead of things: we sometimes call it "programming in the list idiom". The only other ingredient is that, to cope with the situation when some of our components are individual things, we need an extra gadget
pure :: x -> [x] -- again, one instance of the general scheme
which wraps a thing up as a list, to be compatible with <*>. That is, pure moves an ordinary value into the applicative idiom.
For lists, pure just makes a singleton list and <*> produces the result of every pairwise application of one of the functions to one of the arguments. In particular
pure f <*> [1..10] :: [Int -> Int -> Int -> Int -> Int]
is a list of functions (just like map f [1..10]) which can be used with <*> again. The rest of your arguments for f are not listy, so you need to pure them.
pure f <*> [1..10] <*> pure 1 <*> pure 2 <*> pure 3 <*> pure 4
For lists, this gives
[f] <*> [1..10] <*> [1] <*> [2] <*> [3] <*> [4]
i.e. the list of ways to make an application from the f, one of the [1..10], the 1, the 2, the 3 and the 4.
The opening pure f <*> s is so common, it's abbreviated f <$> s, so
f <$> [1..10] <*> [1] <*> [2] <*> [3] <*> [4]
is what would typically be written. If you can filter out the <$>, pure and <*> noise, it kind of looks like the application you had in mind. The extra punctuation is only necessary because Haskell can't tell the difference between a listy computation of a bunch of functions or arguments and a non-listy computation of what's intended as a single value but happens to be a list. At least, however, the components are in the order you started with, so you see more easily what's going on.
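Checking the idiom in a hypothetical GHCi session, using the f from the question:
ghci> let f x y w z a = x - y - w - z - a
ghci> f <$> [1..10] <*> [1] <*> [2] <*> [3] <*> [4]
[-9,-8,-7,-6,-5,-4,-3,-2,-1,0]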
Esoterica. (1) in my (not very) private dialect of Haskell, the above would be
(|f [1..10] (|1|) (|2|) (|3|) (|4|)|)
where each idiom bracket, (|f a1 a2 ... an|) represents the application of a pure function to zero or more arguments which live in the idiom. It's just a way to write
pure f <*> a1 <*> a2 ... <*> an
Idris has idiom brackets, but Haskell hasn't added them. Yet.
(2) In languages with algebraic effects, the idiom of nondeterministic computation is not the same thing (to the typechecker) as the data type of lists, although you can easily convert between the two. The program becomes
f (range 1 10) 1 2 3 4
where range nondeterministically chooses a value between the given lower and upper bounds. So, nondeterminism is treated as a local side-effect, not a data structure, enabling operations for failure and choice. You can wrap nondeterministic computations in a handler which gives meanings to those operations, and one such handler might generate the list of all solutions. That's to say, the extra notation to explain what's going on is pushed to the boundary, rather than peppered through the entire interior, like those <*> and pure.
Managing the boundaries of things rather than their interiors is one of the few good ideas our species has managed to have. But at least we can have it over and over again. It's why we farm instead of hunting. It's why we prefer static type checking to dynamic tag checking. And so on...
Others have shown ways you can do it, but I think it's useful to show how to transform your version into something a little nicer. You wrote
map ($ 4) $ map ($ 3) $ map ($ 2) $ map ($ 1) (map f [1..10])
map obeys two fundamental laws:
map id = id. That is, if you map the identity function over any list, you'll get back the same list.
For any f and g, map f . map g = map (f . g). That is, mapping over a list with one function and then another one is the same as mapping over it with the composition of those two functions.
The second map law is the one we want to apply here.
map ($ 4) $ map ($ 3) $ map ($ 2) $ map ($ 1) (map f [1..10])
=
map ($ 4) . map ($ 3) . map ($ 2) . map ($ 1) . map f $ [1..10]
=
map (($ 4) . ($ 3) . ($ 2) . ($ 1) . f) [1..10]
What does ($ a) . ($ b) do? \x -> ($ a) $ ($ b) x = \x -> ($ a) $ x b = \x -> x b a. What about ($ a) . ($ b) . ($ c)? That's (\x -> x b a) . ($ c) = \y -> (\x -> x b a) $ ($ c) y = \y -> y c b a. The pattern now should be clear: ($ a) . ($ b) ... ($ y) = \z -> z y x ... c b a. So ultimately, we get
map ((\z -> z 1 2 3 4) . f) [1..10]
=
map (\w -> (\z -> z 1 2 3 4) (f w)) [1..10]
=
map (\w -> f w 1 2 3 4) [1..10]
=
map (\x -> ($ 4) $ ($ 3) $ ($ 2) $ ($ 1) $ f x) [1..10]
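And a quick check (hypothetical GHCi session) that the fused form agrees with the original chain of maps:
ghci> let f x y w z a = x - y - w - z - a
ghci> map (\w -> f w 1 2 3 4) [1..10] == (map ($ 4) $ map ($ 3) $ map ($ 2) $ map ($ 1) (map f [1..10]))
True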
In addition to what the other answers say, it might be a good idea to reorder the parameters of your function, especially since x is the parameter that you vary over like that:
f y w z a x = x - y - w - z - a
If you make it so that the x parameter comes last, you can just write
map (f 1 2 3 4) [1..10]
This won't work in all circumstances of course, but it is good to see what parameters are more likely to vary over a series of calls and put them towards the end of the argument list and parameters that tend to stay the same towards the start. When you do this, currying and partial application will usually help you out more than they would otherwise.
Assuming you don't mind variables you simply define a new function that takes x and calls f. If you don't have a function definition there (you can generally use let or where) you can use a lambda instead.
f' x = f x 1 2 3 4
Or with a lambda
\x -> f x 1 2 3 4
Currying won't do you any good here, because the argument you want to vary (enumerate) isn't the last one. Instead, try something like this.
map (\x -> f x 1 2 3 4) [1..10]
The general solution in this situation is a lambda:
\x -> f x 1 2 3 4
however, if you're seeing yourself doing this very often, with the same argument, it would make sense to move that argument to be the last argument instead:
\x -> f 1 2 3 4 x
in which case currying applies perfectly well and you can just replace the above expression with
f 1 2 3 4
so in turn you could write:
map (f 1 2 3 4) [1..10]

Composing a chain of 2-argument functions

So I have a list of functions of two arguments, of the type [a -> a -> a].
I want to write a function which will take the list and compose them into a chain of functions which takes length+1 arguments composed on the left. For example if I have [f,g,h] all of types [a -> a -> a] I need to write a function which gives:
chain [f,g,h] = \a b c d -> f ( g ( h a b ) c ) d
Also if it helps, the functions are commutative in their arguments ( i.e. f x y = f y x for all x y ).
I can do this inside of a list comprehension given that I know the number of functions in question; it would be almost exactly like the definition. It's the stretch from a fixed number of functions to a dynamic number that has me stumped.
This is what I have so far:
f xs = f' xs
  where
    f' []     = id
    f' (x:xs) = \z -> x (f' xs) z
I think the logic is along the right path, it just doesn't type-check.
Thanks in advance!
The comment from n.m. is correct--this can't be done in any conventional way, because the result's type depends on the length of the input list. You need a much fancier type system to make that work. You could compromise in Haskell by using a list that encodes its length in the type, but that's painful and awkward.
Instead, since your arguments are all of the same type, you'd be much better served by creating a function that takes a list of values instead of multiple arguments. So the type you want is something like this: chain :: [a -> a -> a] -> [a] -> a
There are several ways to write such a function. Conceptually you want to start from the front of the argument list and the end of the function list, then apply the first function to the first argument to get something of type a -> a. From there, apply that function to the next argument, then apply the next function to the result, removing one element from each list and giving you a new function of type a -> a.
You'll need to handle the case where the list lengths don't match up correctly, as well. There's no way around that, other than the aforementioned type-encoded lengths and the hassle associated with such.
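For illustration, here is one minimal sketch of that list-based version. The fold and the use of zip are my choices; note that zip silently truncates on a length mismatch rather than reporting it, which is a simplification, not a recommendation:
chain :: [a -> a -> a] -> [a] -> a
chain _  []     = error "chain: needs at least one argument"
chain fs (x:xs) = foldl (\acc (f, y) -> f acc y) x (zip (reverse fs) xs)

-- chain [f,g,h] [a,b,c,d]  ==  f (g (h a b) c) d
-- ghci> chain [(+), (*)] [2, 3, 4]    -- (2 * 3) + 4
-- 10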
I wonder whether your "have a list of functions" requirement is a real requirement or a workaround? I was faced with the same problem, but in my case the set of functions was small and known at compile time. To be more precise, my task was to zip 4 lists with xor. And all I wanted was a compact notation to compose 3 binary functions. What I used is a small helper:
-- Binary Function Chain
bfc :: (c -> d) -> (a -> b -> c) -> a -> b -> d
bfc f g = \a b -> f (g a b)
For example:
ghci> ((+) `bfc` (*)) 5 3 2 -- (5 * 3) + 2
17
ghci> ((+) `bfc` (*) `bfc` (-)) 5 3 2 1 -- ((5 - 3) * 2) + 1
5
ghci> zipWith3 ((+) `bfc` (+)) [1,2] [3,4] [5,6]
[9,12]
ghci> getZipList $ (xor `bfc` xor `bfc` xor) <$> ZipList [1,2] <*> ZipList [3,4] <*> ZipList [5,6] <*> ZipList [7,8]
[0,8]
That doesn't answer the original question as it is, but hopefully it can still be helpful, since it covers pretty much what the question's subject line is about.

How does currying work?

I'm very new to Haskell and FP in general. I've read many of the writings that describe what currying is, but I haven't found an explanation to how it actually works.
Here is a function: (+) :: a -> (a -> a)
If I do (+) 4 7, the function takes 4 and returns a function that takes 7 and returns 11. But what happens to 4? What does that first function do with 4? What does (a -> a) do with 7?
Things get more confusing when I think about a more complicated function:
max' :: Int -> (Int -> Int)
max' m n | m > n     = m
         | otherwise = n
what does (Int -> Int) compare its parameter to? It only takes one parameter, but it needs two to do m > n.
Understanding higher-order functions
Haskell, as a functional language, supports higher-order functions (HOFs). In mathematics HOFs are called functionals, but you don't need any mathematics to understand them. In usual imperative programming, like in Java, functions can accept values, like integers and strings, do something with them, and return back a value of some other type.
But what if functions themselves were no different from values, and you could accept a function as an argument or return it from another function? f a b c = a + b - c is a boring function: it sums a and b and then subtracts c. But the function could be more interesting if we could generalize it: what if we sometimes wanted to sum a and b, but sometimes multiply? Or divide by c instead of subtracting?
Remember, (+) is just a function of 2 numbers that returns a number; there's nothing special about it, so any function of 2 numbers that returns a number could be in its place. Writing g a b c = a * b - c, h a b c = a + b / c and so on just doesn't cut it for us; we need a general solution, we are programmers after all! Here is how it is done in Haskell:
let f g h a b c = a `g` b `h` c in f (*) (/) 2 3 4 -- returns 1.5
And you can return functions too. Below we create a function that accepts a function and an argument and returns another function, which accepts a parameter and returns a result.
let g f n = (\m -> m `f` n); f = g (+) 2 in f 10 -- returns 12
A (\m -> m `f` n) construct is an anonymous function of 1 argument m that applies f to that m and n. Basically, when we call g (+) 2 we create a function of one argument, that just adds 2 to whatever it receives. So let f = g (+) 2 in f 10 equals 12 and let f = g (*) 5 in f 5 equals 25.
(See also my explanation of HOFs using Scheme as an example.)
Understanding currying
Currying is a technique that transforms a function of several arguments to a function of 1 argument that returns a function of 1 argument that returns a function of 1 argument... until it returns a value. It's easier than it sounds, for example we have a function of 2 arguments, like (+).
Now imagine that you could give it only 1 argument, and it would return a function. You could use this function later to add this 1st argument, now encased in this new function, to something else. E.g.:
f n = (\m -> n - m)
g = f 10
g 8 -- would return 2
g 4 -- would return 6
Guess what, Haskell curries all functions by default. Technically speaking, there are no functions of multiple arguments in Haskell, only functions of one argument, some of which may return new functions of one argument.
It's evident from the types. Write :t (++) in interpreter, where (++) is a function that concatenates 2 strings together, it will return (++) :: [a] -> [a] -> [a]. The type is not [a],[a] -> [a], but [a] -> [a] -> [a], meaning that (++) accepts one list and returns a function of type [a] -> [a]. This new function can accept yet another list, and it will finally return a new list of type [a].
That's why function application syntax in Haskell has no parentheses and commas, compare Haskell's f a b c with Python's or Java's f(a, b, c). It's not some weird aesthetic decision, in Haskell function application goes from left to right, so f a b c is actually (((f a) b) c), which makes complete sense, once you know that f is curried by default.
In types, however, the association is from right to left, so [a] -> [a] -> [a] is equivalent to [a] -> ([a] -> [a]). They are the same thing in Haskell, Haskell treats them exactly the same. Which makes sense, because when you apply only one argument, you get back a function of type [a] -> [a].
On the other hand, check the type of map: (a -> b) -> [a] -> [b], it receives a function as its first argument, and that's why it has parentheses.
To really hammer down the concept of currying, try to find the types of the following expressions in the interpreter:
(+)
(+) 2
(+) 2 3
map
map (\x -> head x)
map (\x -> head x) ["conscience", "do", "cost"]
map head
map head ["conscience", "do", "cost"]
Partial application and sections
Now that you understand HOFs and currying, Haskell gives you some syntax to make code shorter. When you call a function with 1 or multiple arguments to get back a function that still accepts arguments, it's called partial application.
You understand already that instead of creating anonymous functions you can just partially apply a function, so instead of writing (\x -> replicate 3 x) you can just write (replicate 3). But what if you want to have a divide (/) operator instead of replicate? For infix functions Haskell allows you to partially apply it using either of arguments.
This is called sections: (2/) is equivalent to (\x -> 2 / x) and (/2) is equivalent to (\x -> x / 2). With backticks you can take a section of any binary function: (2`elem`) is equivalent to (\xs -> 2 `elem` xs).
But remember, any function is curried by default in Haskell and therefore always accepts one argument, so sections can actually be used with any function: let (+^) be some weird function that sums 4 arguments; then let (+^) a b c d = a + b + c + d in (2+^) 3 4 5 returns 14.
Compositions
Other handy tools for writing concise and flexible code are the composition and application operators. The composition operator (.) chains functions together. The application operator ($) just applies the function on its left side to the argument on its right side, so f $ x is equivalent to f x. However, ($) has the lowest precedence of all operators, so we can use it to get rid of parentheses: f (g x y) is equivalent to f $ g x y.
It is also helpful when we need to apply multiple functions to the same argument: map ($2) [(2+), (10-), (20/)] would yield [4,8,10]. (f . g . h) (x + y + z), f (g (h (x + y + z))), f $ g $ h $ x + y + z and f . g . h $ x + y + z are equivalent, but (.) and ($) are different things, so read Haskell: difference between . (dot) and $ (dollar sign) and parts from Learn You a Haskell to understand the difference.
You can think of it like this: the function stores the argument and returns a new function that just demands the other argument(s). The new function already knows the first argument, as it is stored together with the function. This is handled internally by the compiler. If you want to know how this works exactly, you may be interested in this page, although it may be a bit complicated if you are new to Haskell.
If a function call is fully saturated (so all arguments are passed at the same time), most compilers use an ordinary calling scheme, like in C.
Does this help?
max' = \m -> \n -> if (m > n)
                   then m
                   else n
Written as lambdas: max' is bound to a lambda that, given some m, itself returns a lambda, which returns the value.
Hence max' 4 is
max' 4 = \n -> if (4 > n)
               then 4
               else n
Something that may help is to think about how you could implement curry as a higher order function if Haskell didn't have built in support for it. Here is a Haskell implementation that works for a function on two arguments.
curry :: (a -> b -> c) -> a -> (b -> c)
curry f a = \b -> f a b
Now you can pass curry a function on two arguments and the first argument and it will return a function on one argument (this is an example of a closure.)
In ghci:
Prelude> let curry f a = \b -> f a b
Prelude> let g = curry (+) 5
Prelude> g 10
15
Prelude> g 15
20
Prelude>
Fortunately we don't have to do this in Haskell (you do in Lisp if you want currying) because support is built into the language.
If you come from C-like languages, their syntax might help you to understand it. For example in PHP the add function could be implemented as such:
function add($a) {
    return function($b) use($a) {
        return $a + $b;
    };
}
Haskell is based on the lambda calculus. Internally, what happens is that everything gets converted into a function. So your compiler evaluates (+) as follows:
(+) :: Num a => a -> a -> a
(+) = \x -> (\y -> x + y)
That is, (+) :: a -> a -> a is essentially the same as (+) :: a -> (a -> a). Hope this helps.

CPS in curried languages

How does CPS in curried languages like lambda calculus or OCaml even make sense? Technically, all functions have one argument. So say we have a CPS version of addition in one such language:
cps-add k n m = k ((+) n m)
And we call it like
(cps-add random-continuation 1 2)
This is then the same as:
(((cps-add random-continuation) 1) 2)
I already see two calls there that aren't tail calls, and in reality a complexly nested expression: (cps-add random-continuation) returns a value, namely a function that consumes a number, and then returns a function which consumes another number and then delivers the sum of both to that random-continuation. But we can't work around this value-returning by simply translating this into CPS again, because we can only give each function one argument. We need at least two to make room for the continuation and the 'actual' argument.
Or am I missing something completely?
Since you've tagged this with Haskell, I'll answer in that regard: In Haskell, the equivalent of doing a CPS transform is working in the Cont monad, which transforms a value x into a higher-order function that takes one argument and applies it to x.
So, to start with, here's 1 + 2 in regular Haskell: (1 + 2). And here it is in the continuation monad:
contAdd x y = do x' <- x
                 y' <- y
                 return $ x' + y'
...not terribly informative. To see what's going on, let's disassemble the monad. First, removing the do notation:
contAdd x y = x >>= (\x' -> y >>= (\y' -> return $ x' + y'))
The return function lifts a value into the monad, and in this case is implemented as \x k -> k x, or using an infix operator section as \x -> ($ x).
contAdd x y = x >>= (\x' -> y >>= (\y' -> ($ x' + y')))
The (>>=) operator (read "bind") chains together computations in the monad, and in this case is implemented as \m f k -> m (\x -> f x k). Changing the bind function to prefix form and substituting in the lambda, plus some renaming for clarity:
contAdd x y = (\m1 f1 k1 -> m1 (\a1 -> f1 a1 k1)) x (\x' -> (\m2 f2 k2 -> m2 (\a2 -> f2 a2 k2)) y (\y' -> ($ x' + y')))
Reducing some function applications:
contAdd x y = (\k1 -> x (\a1 -> (\x' -> (\k2 -> y (\a2 -> (\y' -> ($ x' + y')) a2 k2))) a1 k1))
contAdd x y = (\k1 -> x (\a1 -> y (\a2 -> ($ a1 + a2) k1)))
And a bit of final rearranging and renaming:
contAdd x y = \k -> x (\x' -> y (\y' -> k $ x' + y'))
In other words: The arguments to the function have been changed from numbers, into functions that take a number and return the final result of the entire expression, just as you'd expect.
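A quick sanity check of the final form in a hypothetical GHCi session, where ($ 1) plays the role of return 1 (the number 1 lifted into the continuation monad) and id is the final continuation:
ghci> let contAdd x y = \k -> x (\x' -> y (\y' -> k (x' + y')))
ghci> contAdd ($ 1) ($ 2) id
3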
Edit: A commenter points out that contAdd itself still takes two arguments in curried style. This is sensible because it doesn't use the continuation directly, but not necessary. To do otherwise, you'd need to first break the function apart between the arguments:
contAdd x = x >>= (\x' -> return (\y -> y >>= (\y' -> return $ x' + y')))
And then use it like this:
foo = do f <- contAdd (return 1)
         r <- f (return 2)
         return r
Note that this is really no different from the earlier version; it's simply packaging the result of each partial application as taking a continuation, not just the final result. Since functions are first-class values, there's no significant difference between a CPS expression holding a number vs. one holding a function.
Keep in mind that I'm writing things out in very verbose fashion here to make explicit all the steps where something is in continuation-passing style.
Addendum: You may notice that the final expression looks very similar to the de-sugared version of the monadic expression. This is not a coincidence, as the inward-nesting nature of monadic expressions that lets them change the structure of the computation based on previous values is closely related to continuation-passing style; in both cases, you have in some sense reified a notion of causality.
Short answer: of course it makes sense; you can apply a CPS transform directly. You will just have lots of cruft, because each argument will have, as you noticed, its own attached continuation.
In your example, I will consider that there is a +(x,y) uncurried primitive, and that you're asking what is the translation of
let add x y = +(x,y)
(This add faithfully represents OCaml's (+) operator.)
add is syntactically equivalent to
let add = fun x -> (fun y -> +(x, y))
So you apply a CPS transform¹ and get
let add_cps = fun x kx -> kx (fun y ky -> ky +(x,y))
If you want translated code that looks more like something you could have willingly written, you can devise a finer transformation that actually considers known-arity functions as non-curried functions, and treats all parameters as a whole (as you have in non-curried languages, and as functional compilers already do for obvious performance reasons).
¹: I wrote "a CPS transform" because there is no "one true CPS translation". Different translations have been devised, producing more or less continuation-related garbage. The formal CPS translations are usually defined directly on lambda-calculus, so I suppose you're having a less formal, more hand-made CPS transform in mind.
The good properties of CPS (as a style that program respect, and not a specific transformation into this style) are that the order of evaluation is completely explicit, and that all calls are tail-calls. As long as you respect those, you're relatively free in what you can do. Handling curryfied functions specifically is thus perfectly fine.
Remark: Your (cps-add k 1 2) version can also be considered tail-recursive if you assume the compiler detects and optimizes that cps-add actually always takes 3 arguments, and doesn't build intermediate closures. That may seem far-fetched, but it's the exact same assumption we use when reasoning about tail calls in non-CPS programs in those languages.
Yes, technically all functions can be decomposed into functions of one argument; however, when you use CPS, all you are saying is that at a certain point of the computation, the continuation should be run.
Using your example, let's have a look. To make things a little easier, let's deconstruct cps-add into its normal form, where it is a function only taking one argument.
(cps-add k) -> n -> m = k ((+) n m)
Note at this point that the continuation, k, is not being evaluated (Could this be the point of confusion for you?).
Here we have a function cps-add that receives a continuation k as an argument and then returns a function that takes another argument, n.
((cps-add k) n) -> m = k ((+) n m)
Now we have a function that takes an argument, m.
So I suppose what I am trying to point out is that currying does not get in the way of CPS style programming. Hope that helps in some way.
