Can any recursive definition be rewritten using foldr? - haskell

Say I have a general recursive definition in haskell like this:
foo a0 a1 ... = base_case
foo b0 b1 ...
| cond1 = recursive_case_1
| cond2 = recursive_case_2
...
Can it always be rewritten using foldr? Can this be proved?

If we interpret your question literally, the answer is trivially yes: any value can be written as const value foldr, which "uses" foldr, as @DanielWagner pointed out in a comment.
A more interesting question is whether we can instead forbid general recursion from Haskell, and "recurse" only through the eliminators/catamorphisms associated to each user-defined data type, which are the natural generalization of foldr to inductively defined data types. This is, essentially, (higher-order) primitive recursion.
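For concreteness, here is a small sketch (my own illustration, not from the original answer) of what such an eliminator looks like for a user-defined data type: foldNat plays for Nat exactly the role foldr plays for lists.
-- A user-defined inductive type and its eliminator (catamorphism).
data Nat = Zero | Succ Nat

-- foldNat is the one recursion principle the fragment provides for Nat,
-- just as foldr is for lists.
foldNat :: r -> (r -> r) -> Nat -> r
foldNat z _ Zero     = z
foldNat z s (Succ n) = s (foldNat z s n)

-- Primitive-recursive definitions are then compositions of eliminators:
plus :: Nat -> Nat -> Nat
plus m n = foldNat n Succ m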
Under this restriction, we can only compose terminating functions (the eliminators) together, which means we can no longer define non-terminating functions.
As a first example, we lose the trivial recursion
f x = f x
-- or even
a = a
since, as said, the language becomes total.
More interestingly, the general fixed point operator is lost.
fix :: (a -> a) -> a
fix f = f (fix f)
A more intriguing question is: what about the total functions we can express in Haskell? We do lose all the non-total functions, but do we lose any of the total ones?
Computability theory states that, since the language becomes total (no more non-termination), we lose expressiveness even on the total fragment.
The proof is a standard diagonalization argument. Fix any enumeration of programs in the total fragment so that we can speak of "the i-th program".
Then, let eval i x be the result of running the i-th program on the natural x as input (for simplicity, assume this is well typed, and that the result is a natural). Note that, since the language is total, a result must exist. Moreover, eval can be implemented in the unrestricted Haskell language, since we can write an interpreter of Haskell in Haskell (left as an exercise :-P), and it would work just as well for the fragment. Then, we simply take
f n = succ $ eval n n
The above is a total function (a composition of total functions) which can be expressed in Haskell, but not in the fragment. Indeed, otherwise there would be a program to compute it, say the i-th program. In such case we would have
eval i x = f x
for all x. But then,
eval i i = f i = succ $ eval i i
which is impossible -- contradiction. QED.

In type theory, it is indeed the case that you can elaborate all definitions by dependent pattern-matching into ones only using eliminators (a more strongly-typed version of folds, the generalisation of lists' foldr).
See e.g. Eliminating Dependent Pattern Matching (pdf)

Related

Sharing vs. non-sharing fixed-point combinator

This is the usual definition of the fixed-point combinator in Haskell:
fix :: (a -> a) -> a
fix f = let x = f x in x
On https://wiki.haskell.org/Prime_numbers, they define a different fixed-point combinator:
_Y :: (t -> t) -> t
_Y g = g (_Y g) -- multistage, non-sharing, g (g (g (g ...)))
-- g (let x = g x in x) -- two g stages, sharing
_Y is a non-sharing fixpoint combinator, here arranging for a recursive "telescoping" multistage primes production (a tower of producers).
What exactly does this mean? What is "sharing" vs. "non-sharing" in that context? How does _Y differ from fix?
"Sharing" means f x re-uses the x that it creates; but with _Y g = g . g . g . g . ..., each g calculates its output anew (cf. this and this).
In that context, the sharing version has much worse memory usage and leads to a space leak.1
The definition of _Y mirrors the usual lambda calculus definition's effect for the Y combinator, which emulates recursion by duplication, while true recursion refers to the same (hence, shared) entity.
In
x = f x
(_Y g) = g (_Y g)
both xs refer to the same entity, but each of the (_Y g)s refers to an equivalent, but separate, entity. That's the intention, anyway.
Of course thanks to referential transparency there's no guarantee in Haskell the language for any of this. But GHC the compiler does behave this way.
_Y g is a common sub-expression, and a compiler could "eliminate" it by giving it a name and reusing that named entity, subverting its whole purpose. That's why GHC has the -fno-cse flag ("no common sub-expression elimination"), which prevents this explicitly. It used to be that you had to use this flag to get the desired behaviour here, but not anymore: the more recent GHC versions (for several years now) are no longer as aggressive about common sub-expression elimination.
disclaimer: I'm the author of that part of the page you're referring to. I was hoping for the back-and-forth that's usual on wiki pages, but it never came, so my work didn't get reviewed that way. Either no one bothered, or it is passable (lacking major errors). The wiki seems to have been largely abandoned for many years now.
1 The g function involved,
(3:) . minus [5,7..] . foldr (\(x:xs) -> (x:) . union xs) []
     . map (\p -> [p*p, p*p + 2*p..])
produces an increasing stream of all odd primes given an increasing stream of all odd primes. To produce a prime N in value, it consumes its input stream up to the first prime above sqrt(N) in value, at least. Thus the production points are given roughly by repeated squaring, and there are ~ log (log N) of such g functions in total in the chain (or "tower") of these primes producers, each immediately garbage collectible, the lowest one producing its primes given just the first odd prime, 3, known a priori.
And with the two-staged _Y2 g = g x where { x = g x } there would be only two of them in the chain, but only the top one would be immediately garbage collectible, as discussed at the referenced link above.
_Y is translated to the following STG:
_Y f = let x = _Y f in f x
fix is translated identically to the Haskell source:
fix f = let x = f x in x
So fix f sets up a recursive thunk x and returns it, while _Y is a recursive function, and importantly it’s not tail-recursive. Forcing _Y f enters f, passing a new call to _Y f as an argument, so each recursive call sets up a new thunk; forcing the x returned by fix f enters f, passing x itself as an argument, so each recursive call is into the same thunk—this is what’s meant by “sharing”.
The sharing version usually has better memory usage, and also lets the GHC RTS detect some kinds of infinite loop. When a thunk is forced, before evaluation starts, it’s replaced with a “black hole”; if at any point during evaluation of a thunk a black hole is reached from the same thread, then we know we have an infinite loop and can throw an exception (which you may have seen displayed as Exception: <<loop>>).
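To see the sharing difference in a self-contained toy (my example, not from the wiki page): both combinators below produce an infinite list of ones, but fix builds one cyclic cell that refers to itself, while _Y allocates a fresh cell for every element demanded.
fix :: (a -> a) -> a
fix f = let x = f x in x   -- sharing: the result refers to itself

_Y :: (t -> t) -> t
_Y g = g (_Y g)            -- non-sharing: a fresh call at every step

-- With fix, "ones" is a single cons cell whose tail points back to itself,
-- so it lives in constant space no matter how much of it you consume.
ones :: [Int]
ones = fix (1 :)

-- With _Y, the tail is a brand new (1 : _Y (1:)) thunk every time, so
-- taking n elements builds n distinct cons cells instead of reusing one.
ones' :: [Int]
ones' = _Y (1 :)

main :: IO ()
main = print (take 5 ones, take 5 ones')   -- ([1,1,1,1,1],[1,1,1,1,1])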
I think you already received excellent answers, from a GHC/Haskell perspective. I just wanted to chime in and add a few historical/theoretical notes.
The correspondence between unfolding and cyclic views of recursion is rigorously studied in Hasegawa's PhD thesis: https://www.springer.com/us/book/9781447112211
(Here's a shorter paper that you can read without paying Springer: https://link.springer.com/content/pdf/10.1007%2F3-540-62688-3_37.pdf)
Hasegawa assumes a traced monoidal category, a requirement that is much less stringent than the usual PCPO assumption of domain theory, which forms the basis of how we think about Haskell in general. What Hasegawa showed was that one can define these "sharing" fixed point operators in such a setting, and established that they correspond to the usual unfolding view of fixed points from Church's lambda-calculus. That is, there is no way to tell them apart by making them produce different answers.
Hasegawa's correspondence holds for what's known as central arrows; i.e., when there are no "effects" involved. Later on, Benton and Hyland extended this work and showed that the correspondence holds for certain cases when the underlying arrow can perform "mild" monadic effects as well: https://pdfs.semanticscholar.org/7b5c/8ed42a65dbd37355088df9dde122efc9653d.pdf
Unfortunately, Benton and Hyland only allow effects that are quite "mild": Effects like the state and environment monads fit the bill, but not general effects like exceptions, lists, or IO. (The fixed point operators for these effectful computations are known as mfix in Haskell, with the type signature (a -> m a) -> m a, and they form the basis of the recursive-do notation.)
It's still an open question how to extend this work to cover arbitrary monadic effects. Though it doesn't seem to be receiving much attention these days. (Would make a great PhD topic for those interested in the correspondence between lambda-calculus, monadic effects, and graph-based computations.)

How does return statement work in Haskell? [duplicate]

This question already has answers here:
What's so special about 'return' keyword
(3 answers)
Closed 5 years ago.
Consider these functions
f1 :: Maybe Int
f1 = return 1
f2 :: [Int]
f2 = return 1
Both have the same statement return 1, but the results are different: f1 gives the value Just 1 and f2 gives the value [1].
It looks like Haskell invokes two different versions of return based on the return type. I'd like to know more about this kind of function invocation. Is there a name for this feature in programming languages?
This is a long meandering answer!
As you've probably seen from the comments and Thomas's excellent (but very technical) answer, you've asked a very hard question. Well done!
Rather than try to explain the technical answer I've tried to give you a broad overview of what Haskell does behind the scenes without diving into technical detail. Hopefully it will help you to get a big picture view of what's going on.
return is an example of type inference.
Most modern languages have some notion of polymorphism. For example, var x = 1 + 1 will set x equal to 2. In a statically typed language 2 will usually be an int. If you say var y = 1.0 + 1.0 then y will be a float. The operator + (which is just a function with a special syntax) is polymorphic: there is an int version and a float version, and the language picks one based on the arguments.
Most imperative languages, especially object-oriented languages, can only do this kind of type deduction one way. Every variable has a fixed type. When you call a function it looks at the types of the arguments and chooses a version of that function that fits those types (or complains if it can't find one).
When you assign the result of a function to a variable, the variable already has a type, and if it doesn't agree with the type of the return value you get an error.
So in an imperative language the "flow" of type deduction follows time in your program: deduce the type of a variable, do something with it, and deduce the type of the result. In a dynamically typed language (such as Python or JavaScript) the type of a variable is not assigned until the value of the variable is computed (which is why there don't seem to be types). In a statically typed language the types are worked out ahead of time (by the compiler), but the logic is the same: the compiler works out what the types of variables are going to be by following the logic of the program in the same way the program runs.
In Haskell the type inference also follows the logic of the program. Being Haskell, it does so in a very mathematically pure way (based on System F). The language of types (that is, the rules by which types are deduced) is similar to Haskell itself.
Now remember Haskell is a lazy language. It doesn't work out the value of anything until it needs it. That's why it makes sense in Haskell to have infinite data structures. It never occurs to Haskell that a data structure is infinite because it doesn't bother to work it out until it needs to.
Now all that lazy magic happens at the type level too. In the same way that Haskell doesn't work out what the value of an expression is until it really needs to, Haskell doesn't work out what the type of an expression is until it really needs to.
Consider this function
func (x : y : rest) = (x,y) : func rest
func _ = []
If you ask Haskell for the type of this function, it has a look at the definition, sees [] and : and deduces that it's working with lists. But it never needs to look at the types of x and y; it just knows that they have to be the same because they come from the same input list. So it deduces the type of the function as [a] -> [(a, a)], where a is a type that it hasn't bothered to work out yet.
So far no magic. But it's useful to notice the difference between this idea and how it would be done in an OO language. Haskell doesn't convert the arguments to Object, do its thing and then convert back. Haskell just hasn't been asked explicitly what the type of the list is, so it doesn't care.
Now try typing the following into ghci
maxBound - length ""
maxBound : "Hello"
Now what just happened!? maxBound must be a Char because I put it on the front of a string, and it must be an integer because I subtracted a length (which is 0) from it and got a number. Plus the two values are clearly very different.
So what is the type of maxBound? Let's ask ghci!
:type maxBound
maxBound :: Bounded a => a
Aaargh! What does that mean? Basically it means that Haskell hasn't bothered to work out exactly what a is, but it has to be Bounded. If you type :info Bounded you get three useful lines
class Bounded a where
  minBound :: a
  maxBound :: a
and a lot of less useful lines
So if a is Bounded there are values minBound and maxBound of type a.
In fact, under the hood a Bounded constraint is just a value: it's a record (a "dictionary") with fields minBound and maxBound. Because it's a value, Haskell doesn't look at it until it really needs to.
So I appear to have meandered somewhere in the region of the answer to your question. Before we move on to return (which, as you may have noticed from the comments, is a wonderfully complex beast), let's look at read.
ghci again
read "42" + 7
read "'H'" : "ello"
length (read "[1,2,3]")
and hopefully you won't be too surprised to find that there are definitions
read :: Read a => String -> a
class Read a where
  read :: String -> a
so Read a is just a record containing a single value, which is a function String -> a. It's very tempting to assume that there is one read function which looks at a string, works out what type is contained in the string, and returns that type. But it does the opposite: it completely ignores the string until the result is needed. When the value is needed, Haskell first works out what type it's expecting; once it's done that, it goes and gets the appropriate version of the read function and combines it with the string.
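You can watch this "work out the expected type first, then pick read" order by pinning the expected type yourself; a small illustration (my own, with made-up names):
-- The same string read at two different expected types picks two
-- different Read instances:
n :: Int
n = read "42"        -- Int instance of Read; n is 42

x :: Double
x = read "42"        -- Double instance of Read; x is 42.0

-- With no expected type at all there is nothing to pick by:
-- ambiguous = read "42"   -- rejected: ambiguous type variable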
now consider something slightly more complex
readList :: Read a => [String] -> [a]
readList strs = map read strs
under the hood readList actually takes two arguments
readList' :: Read a -> [String] -> [a]
readList' {read = f} strs = map f strs
Again, as Haskell is lazy, it only bothers looking at the arguments when it needs to find out the return value; at that point it knows what a is, so the compiler can go and find the right version of Read. Until then it doesn't care.
Hopefully that's given you a bit of an idea of what's happening and why Haskell can "overload" on the return type. But it's important to remember it's not overloading in the conventional sense. Every function has only one definition. It's just that one of the arguments is a bag of functions. readList' doesn't ever know what types it's dealing with. It just knows it gets a function String -> a and some Strings; to do the application it just passes the arguments to map. map in turn doesn't even know it gets strings. When you get deeper into Haskell it becomes very important that functions don't know very much about the types they're dealing with.
Now let's look at return.
Remember how I said that the type system in Haskell was very similar to Haskell itself. Remember that in Haskell functions are just ordinary values.
Does this mean I can have a type that takes a type as an argument and returns another type? Of course it does!
You've seen some type functions Maybe takes a type a and returns another type which can either be Just a or Nothing. [] takes a type a and returns a list of as. Type functions in Haskell are usually containers. For example I could define a type function BinaryTree which stores a load of a's in a tree like structure. There are of course lots of much stranger ones.
So, if these type functions are similar to ordinary types I can have a typeclass that contains type functions. One such typeclass is Monad
class Monad m where
  return :: a -> m a
  (>>=)  :: m a -> (a -> m b) -> m b
so here m is some type function. If I want to define Monad for m I need to define return and the scary looking operator below it (which is called bind)
As others have pointed out the return is a really misleading name for a fairly boring function. The team that designed Haskell have since realised their mistake and they're genuinely sorry about it. return is just an ordinary function that takes an argument and returns a Monad with that type in it. (You never asked what a Monad actually is so I'm not going to tell you)
Let's define Monad for m = Maybe!
First I need to define return. What should return x be? Remember I'm only allowed to define the function once, so I can't look at x because I don't know what type it is. I could always return Nothing, but that seems a waste of a perfectly good function. Let's define return x = Just x because that's literally the only other thing I can do.
What about the scary bind thing? What can we say about x >>= f? Well, x is a Maybe a of some unknown type a, and f is a function that takes an a and returns a Maybe b. Somehow I need to combine these to get a Maybe b.
So I need to define Nothing >>= f. I can't call f because it needs an argument of type a and I don't have a value of type a; I don't even know what a is. I've only got one choice, which is to define
Nothing >>= f = Nothing
What about Just x >>= f? Well, I know x is of type a and f takes an a as an argument, so I can apply f to x. And f x already has type Maybe b, which is exactly what I need, so ...
Just x >>= f = f x
So I've got a Monad! what if m is List? well I can follow a similar sort of logic and define
return x = [x]
[] >>= f = []
(x : xs) >>= f = f x ++ (xs >>= f)
Hooray another Monad! It's a nice exercise to go through the steps and convince yourself that there's no other sensible way of defining this.
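If you want a quick sanity check, the real library instances behave exactly as derived above (a small example of mine, with made-up names):
-- The standard instances agree with the definitions worked out above:
checkMaybe :: Maybe Int
checkMaybe = Just 3 >>= \x -> Just (x + 1)    -- Just 4

checkList :: [Int]
checkList = [1, 2, 3] >>= \x -> [x, x * 10]   -- [1,10,2,20,3,30]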
So what happens when I call return 1?
Nothing!
Haskell's lazy, remember. The thunk return 1 (technical term) just sits there until someone needs the value. As soon as Haskell needs the value, it knows what type the value should be. In particular it can deduce that m is List. Now that it knows that, Haskell can find the instance of Monad for List. As soon as it does that, it has access to the correct version of return.
So finally Haskell is ready to call return, which in this case returns [1]!
The return function is from the Monad class:
class Applicative m => Monad (m :: * -> *) where
  ...
  return :: a -> m a
So return takes any value of type a and results in a value of type m a. The monad, m, as you've observed is polymorphic using the Haskell type class Monad for ad hoc polymorphism.
At this point you probably realize return is not a good, intuitive name. It's not even a built-in function or a statement like in many other languages. In fact a better-named and identically-operating function exists: pure. In almost all cases return = pure.
That is, the function return is the same as the function pure (from the Applicative class) - I often think to myself "this monadic value is purely the underlying a" and I try to use pure instead of return if there isn't already a convention in the codebase.
You can use return (or pure) for any type that is an instance of Monad. This includes the Maybe monad to get a value of type Maybe a:
instance Monad Maybe where
  ...
  return = pure -- which is from Applicative
  ...
instance Applicative Maybe where
  pure = Just
Or for the list monad to get a value of [a]:
instance Applicative [] where
  {-# INLINE pure #-}
  pure x = [x]
Or, as a more complex example, Aeson's parse monad to get a value of type Parser a:
instance Applicative Parser where
  pure a = Parser $ \_path _kf ks -> ks a

How do we overcome the compile time and runtime gap when programming in a Dependently Typed Language?

I'm told that in a dependent type system, "types" and "values" are mixed, and we can treat both of them as "terms".
But there is something I can't understand: in a strongly typed programming language without dependent types (like Haskell), types are decided (inferred or checked) at compile time, but values are decided (computed or input) at runtime.
I think there must be a gap between these two stages. Just think: if a value is interactively read from STDIN, how can we refer to this value in a type, which must be decided ahead of time?
e.g. There is a natural number n and a list of natural numbers xs (which contains n elements) which I need to read from STDIN. How can I put them into a data structure Vect n Nat?
Suppose we input n :: Int at runtime from STDIN. We then read n strings, and store them into vn :: Vect n String (pretend for the moment this can be done).
Similarly, we can read m :: Int and vm :: Vect m String. Finally, we concatenate the two vectors: vn ++ vm (simplifying a bit here). This can be type checked, and will have type Vect (n+m) String.
Now it is true that the type checker runs at compile time, before the values n,m are known, and also before vn,vm are known. But this does not matter: we can still reason symbolically on the unknowns n,m and argue that vn ++ vm has that type, involving n+m, even if we do not yet know what n+m actually is.
It is not that different from doing math, where we manipulate symbolic expressions involving unknown variables according to some rules, even if we do not know the values of the variables. We don't need to know what number is n to see that n+n = 2*n.
Similarly, the type checker can type check
-- pseudocode
readNStrings :: (n :: Int) -> IO (Vect n String)
readNStrings O = return Vect.empty
readNStrings (S p) = do
  s <- getLine
  vp <- readNStrings p
  return (Vect.cons s vp)
(Well, actually some more help from the programmer could be needed to typecheck this, since it involves dependent matching and recursion. But I'll neglect this.)
Importantly, the type checker can check that without knowing what n is.
Note that the same issue actually already arises with polymorphic functions.
fst :: forall a b. (a, b) -> a
fst (x, y) = x
test1 = fst @Int @Float (2, 3.5)
test2 = fst @String @Bool ("hi!", True)
...
One might wonder "how can the typechecker check fst without knowing what types a and b will be at runtime?". Again, by reasoning symbolically.
With type arguments this is arguably more obvious, since we usually run programs after type erasure, unlike value parameters like our n :: Int above, which cannot be erased. Still, there is some similarity between universally quantifying over types and over Int.
It seems to me that there are two questions here:
Given that some values are unknown during compile-time (e.g., values read from STDIN), how can we make use of them in types? (Note that chi has already given an excellent answer to this.)
Some operations (e.g., getLine) seem to make absolutely no sense at compile-time; how could we possibly talk about them in types?
The answer to (1), as chi has said, is symbolic or abstract reasoning. You can read in a number n, and then have a procedure that builds a Vect n Nat by reading from the command line n times, making use of arithmetic properties such as the fact that 1+(n-1) = n for nonzero natural numbers.
The answer to (2) is a bit more subtle. Naively, you might want to say "this function returns a vector of length n, where n is read from the command line". There are two types you might try to give this (apologies if I'm getting Haskell notation wrong)
unsafePerformIO (do n <- getLine; return (IO (Vect (read n :: Int) Nat)))
or (in pseudo-Coq notation, since I'm not sure what Haskell's notation for existential types is)
IO (exists n, Vect n Nat)
These two types can actually both be made sense of, and say different things. The first type, to me, says "at compile time, read n from the command line, and return a function which, at runtime, gives a vector of length n by performing IO". The second type says "at runtime, perform IO to get a natural number n and a vector of length n".
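In today's GHC Haskell, the second reading can be sketched with an existential wrapper. This is only a sketch of mine, assuming the DataKinds and GADTs extensions; SomeVect and readLines are made-up names, not a standard API:
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}

data Nat = Z | S Nat

data Vect (n :: Nat) a where
  VNil  :: Vect 'Z a
  VCons :: a -> Vect n a -> Vect ('S n) a

-- "exists n. Vect n a", packed into an ordinary Haskell type
data SomeVect a where
  SomeVect :: Vect n a -> SomeVect a

-- Read that many lines; n only comes into existence at runtime, yet every
-- VCons is still length-checked by the compiler.
readLines :: Int -> IO (SomeVect String)
readLines k
  | k <= 0    = pure (SomeVect VNil)
  | otherwise = do
      s  <- getLine
      sv <- readLines (k - 1)
      case sv of
        SomeVect v -> pure (SomeVect (VCons s v))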
The way I like looking at this is that all side effects (other than, perhaps, non-termination) are monad transformers, and there is only one monad: the "real-world" monad. Monad transformers work just as well at the type level as at the term level; the one thing which is special is run :: M a -> a which executes the monad (or stack of monad transformers) in the "real world". There are two points in time at which you can invoke run: one is at compile time, where you invoke any instance of run which shows up at the type level. Another is at runtime, where you invoke any instance of run which shows up at the value level. Note that run only makes sense if you specify an evaluation order; if your language does not specify whether it is call-by-value or call-by-name (or call-by-push-value or call-by-need or call-by-something-else), you can get incoherencies when you try to compute a type.

what is the meaning of "let x = x in x" and "data Float#" in GHC.Prim in Haskell

I looked at the module GHC.Prim and found that all the data types in it seem to be defined as data Float# with no constructors (nothing like = A | B), and all the functions in GHC.Prim are defined as gtFloat# = let x = x in x.
My question is whether these definitions make sense, and what they mean.
I checked the header of GHC.Prim, shown below:
{-
This is a generated file (generated by genprimopcode).
It is not code to actually be used. Its only purpose is to be
consumed by haddock.
-}
I guess this may be related to my questions; could someone please explain it to me?
It's magic :)
These are the "primitive operators and operations". They are hardwired into the compiler, hence there are no data constructors for the primitives, and all the functions are bottom, since they are necessarily not expressible in pure Haskell.
(Bottom represents a "hole" in a Haskell program; an infinite loop or undefined are examples of bottom.)
To put it another way
These data declarations/functions exist to provide access to the raw compiler internals. GHC.Prim exists to export these primitives; it doesn't actually implement them or anything (e.g. its code isn't actually used). All of that is done in the compiler.
It's meant for code that needs to be extremely optimized. If you think you might need it, some useful reading about the primitives in GHC
A brief expansion of jozefg's answer ...
Primops are precisely those operations that are supplied by the runtime because they can't be defined within the language (or shouldn't be, for reasons of efficiency, say). The true purpose of GHC.Prim is not to define anything, but merely to export some operations so that Haddock can document their existence.
The construction let x = x in x is used at this point in GHC's codebase because the value undefined has not yet been, um, "defined". (That waits until the Prelude.) But notice that the circular let construction, just like undefined, is both syntactically correct and can have any type. That is, it's an infinite loop with the semantics of ⊥, just as undefined is.
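For instance (a tiny example of mine), this typechecks at any type you like and diverges when forced, exactly as undefined does semantically:
-- A "definition-free" bottom: it typechecks at every type, just like
-- undefined, but without referring to anything defined elsewhere.
loop :: a
loop = let x = x in x

-- loop :: Int, loop :: String, loop :: IO () all typecheck;
-- forcing any of them diverges (GHC's RTS may report <<loop>>).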
... and an aside
Also note that in general the Haskell expression let x = z in y means "change the variable x to the expression z wherever x occurs in the expression y". If you're familiar with the lambda calculus, you should recognize this as the reduction rule for the application of the lambda abstraction \x -> y to the term z. So is the Haskell expression let x = x in x nothing more than some syntax on top of the pure lambda calculus? Let's take a look.
First, we need to account for the recursiveness of Haskell's let expressions. The lambda calculus does not admit recursive definitions, but given a primitive fixed-point operator fix,1 we can encode recursiveness explicitly. For example, the Haskell expression let x = x in x has the same meaning as (fix \r x -> r x) z.2 (I've renamed the x on the right side of the application to z to emphasize that it has no implicit relation to the x inside the lambda).
Applying the usual definition of a fixed-point operator, fix f = f (fix f), our translation of let x = x in x reduces (or rather doesn't) like this:
(fix \r x -> r x) z ==>
(\s y -> s y) (fix \r x -> r x) z ==>
(\y -> (fix \r x -> r x) y) z ==>
(fix \r x -> r x) z ==> ...
So at this point in the development of the language, we've introduced the semantics of ⊥ from the foundation of the (typed) lambda calculus with a built-in fixed-point operator. Lovely!
We need a primitive fixed-point operation (that is, one that is built into the language) because it's impossible to define a fixed-point combinator in the simply typed lambda calculus and its close cousins. (The definition of fix in Haskell's Prelude doesn't contradict this—it's defined recursively, but we need a fixed-point operator to implement recursion.)
If you haven't seen this before, you should read up on fixed-point recursion in the lambda calculus. A text on the lambda calculus is best (there are some free ones online), but some Googling should get you going. The basic idea is that we can convert a recursive definition into a non-recursive one by abstracting over the recursive call, then use a fixed-point combinator to pass our function (lambda abstraction) to itself. The base-case of a well-defined recursive definition corresponds to a fixed point of our function, so the function executes, calling itself over and over again until it hits a fixed point, at which point the function returns its result. Pretty damn neat, huh?

What are the benefits of currying?

I don't think I quite understand currying, since I'm unable to see any massive benefit it could provide. Perhaps someone could enlighten me with an example demonstrating why it is so useful. Does it truly have benefits and applications, or is it just an over-appreciated concept?
(There is a slight difference between currying and partial application, although they're closely related; since they're often mixed together, I'll deal with both terms.)
The place where I first realized the benefits was when I saw operator sections:
incElems = map (+1)
--non-curried equivalent: incElems = (\elems -> map (\i -> (+) 1 i) elems)
IMO, this is totally easy to read. Now, if the type of (+) were (Int,Int) -> Int *, which is the uncurried version, it would (counter-intuitively) result in an error -- but curried, it works as expected, and has type [Int] -> [Int].
You mentioned C# lambdas in a comment. In C#, you could have written incElems like so, given a function plus:
var incElems = xs => xs.Select(x => plus(1,x))
If you're used to point-free style, you'll see that the x here is redundant. Logically, that code could be reduced to
var incElems = xs => xs.Select(curry(plus)(1))
which is awful due to the lack of automatic partial application with C# lambdas. And that's the crucial point to decide where currying is actually useful: mostly when it happens implicitly. For me, map (+1) is the easiest to read, then comes .Select(x => plus(1,x)), and the version with curry should probably be avoided, if there is no really good reason.
Now, if readable, the benefits sum up to shorter, more readable and less cluttered code -- unless some abuse of point-free style is done with it (I do love (.).(.), but it is... special).
Also, the lambda calculus would be unworkable without curried functions, since it has only single-argument (but therefore higher-order) functions.
* Of course (+) is actually polymorphic over Num, but it's more readable like this for the moment.
Update: how currying actually works.
Look at the type of plus in C#:
int plus(int a, int b) {..}
You have to give it a tuple of values -- not in C# terms, but mathematically speaking; you can't just leave out the second value. In Haskell terms, that's
plus :: (Int,Int) -> Int
which could be used like
incElem = map (\x -> plus (1, x)) -- equal to .Select (x => plus (1, x))
That's way too many characters to type. Suppose you want to do this more often in the future. Here's a little helper:
curry f = \x -> (\y -> f (x,y))
plus' = curry plus
which gives
incElem = map (plus' 1)
Let's apply this to a concrete value.
incElem [1]
= (map (plus' 1)) [1]
= [plus' 1 1]
= [(curry plus) 1 1]
= [(\x -> (\y -> plus (x,y))) 1 1]
= [plus (1,1)]
= [2]
Here you can see curry at work. It turns a standard Haskell-style function application (plus' 1 1) into a call to a "tupled" function -- or, viewed at a higher level, transforms the "tupled" version into the "untupled" one.
Fortunately, most of the time, you don't have to worry about this, as there is automatic partial application.
It's not the best thing since sliced bread, but if you're using lambdas anyway, it's easier to use higher-order functions without using lambda syntax. Compare:
map (max 4) [0,6,9,3] --[4,6,9,4]
map (\i -> max 4 i) [0,6,9,3] --[4,6,9,4]
These kinds of constructs come up often enough when you're using functional programming, that it's a nice shortcut to have and lets you think about the problem from a slightly higher level--you're mapping against the "max 4" function, not some random function that happens to be defined as (\i -> max 4 i). It lets you start to think in higher levels of indirection more easily:
let numOr4 = map $ max 4
let numOr4' = (\xs -> map (\i -> max 4 i) xs)
numOr4 [0,6,9,3] --ends up being [4,6,9,4] either way;
--which do you think is easier to understand?
That said, it's not a panacea; sometimes your function's parameters will be in the wrong order for what you're trying to do with currying, so you'll have to resort to a lambda anyway. However, once you get used to this style, you start to learn how to design your functions to work well with it, and once those neurons start to connect inside your brain, previously complicated constructs can start to seem obvious in comparison.
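For example (my illustration): when the argument you want to fix isn't the first one, a section or flip often still saves you the lambda.
-- We want to fix elem's *second* argument, so plain partial application
-- doesn't fit; a section or flip still avoids spelling out the lambda.
keepSmall, keepSmall', keepSmall'' :: [Int] -> [Int]
keepSmall   = filter (`elem` [1, 2, 3])           -- backtick section
keepSmall'  = filter (flip elem [1, 2, 3])        -- same idea via flip
keepSmall'' = filter (\x -> x `elem` [1, 2, 3])   -- the explicit lambda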
One benefit of currying is that it allows partial application of functions without the need of any special syntax/operator. A simple example:
mapLength = map length
mapLength ["ab", "cde", "f"]
>>> [2, 3, 1]
mapLength ["x", "yz", "www"]
>>> [1, 2, 3]
map :: (a -> b) -> [a] -> [b]
length :: [a] -> Int
mapLength :: [[a]] -> [Int]
The map function can be considered to have type (a -> b) -> ([a] -> [b]) because of currying, so when length is applied as its first argument, it yields the function mapLength of type [[a]] -> [Int].
Currying has the convenience features mentioned in other answers, but it also often serves to simplify reasoning about the language, or to implement some code much more easily than would otherwise be possible. For example, currying means that any function at all has a type that's compatible with a -> b. If you write some code whose type involves a -> b, that code can be made to work with any function at all, no matter how many arguments it takes.
The best known example of this is the Applicative class:
class Functor f => Applicative f where
  pure  :: a -> f a
  (<*>) :: f (a -> b) -> f a -> f b
And an example use:
-- All possible products of numbers taken from [1..5] and [1..10]
example = pure (*) <*> [1..5] <*> [1..10]
In this context, pure and <*> adapt any function of type a -> b to work with lists of type [a]. Because of partial application, this means you can also adapt functions of type a -> b -> c to work with [a] and [b], or a -> b -> c -> d with [a], [b] and [c], and so on.
The reason this works is because a -> b -> c is the same thing as a -> (b -> c):
(+) :: Num a => a -> a -> a
pure (+) :: (Applicative f, Num a) => f (a -> a -> a)
[1..5], [1..10] :: Num a => [a]
pure (+) <*> [1..5] :: Num a => [a -> a]
pure (+) <*> [1..5] <*> [1..10] :: Num a => [a]
Another, different use of currying is that Haskell allows you to partially apply type constructors. E.g., if you have this type:
data Foo a b = Foo a b
...it actually makes sense to write Foo a in many contexts, for example:
instance Functor (Foo a) where
  fmap f (Foo a b) = Foo a (f b)
I.e., Foo is a two-parameter type constructor with kind * -> * -> *; Foo a, the partial application of Foo to just one type, is a type constructor with kind * -> *. Functor is a type class that can only be instantiated for type constructors of kind * -> *. Since Foo a is of this kind, you can make a Functor instance for it.
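A quick use of that instance (my example): fmap can only touch the second field, because the first one is already fixed by the partial application Foo a.
-- Repeating the declarations from above so this snippet stands alone:
data Foo a b = Foo a b deriving Show

instance Functor (Foo a) where
  fmap f (Foo a b) = Foo a (f b)

example :: Foo String Int
example = fmap (+ 1) (Foo "label" 41)   -- Foo "label" 42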
The "no-currying" form of partial application works like this:
We have a function f : (A ✕ B) → C
We'd like to apply it partially to some a : A
To do this, we build a closure out of a and f (we don't evaluate f at all, for the time being)
Then some time later, we receive the second argument b : B
Now that we have both the A and B argument, we can evaluate f in its original form...
So we recall a from the closure, and evaluate f(a,b).
A bit complicated, isn't it?
When f is curried in the first place, it's rather simpler:
We have a function f : A → B → C
We'd like to apply it partially to some a : A – which we can just do: f a
Then some time later, we receive the second argument b : B
We apply the already evaluated f a to b.
So far so nice, but more important than being simple, this also gives us extra possibilities for implementing our function: we may be able to do some calculations as soon as the a argument is received, and these calculations won't need to be done later, even if the function is evaluated with multiple different b arguments!
To give an example, consider this audio filter, an infinite impulse response filter. It works like this: for each audio sample, you feed an "accumulator function" (f) with some state parameter (in this case, a simple number, 0 at the beginning) and the audio sample. The function then does some magic, and spits out the new internal state1 and the output sample.
Now here's the crucial bit – what kind of magic the function does depends on the coefficient2 λ, which is not quite a constant: it depends both on what cutoff frequency we'd like the filter to have (this governs "how the filter will sound") and on what sample rate we're processing at. Unfortunately, the calculation of λ is a bit more complicated (lp1stCoeff $ 2*pi * (νᵥ ~*% δs)) than the rest of the magic, so we wouldn't like having to do this for every single sample, all over again. Quite annoying, because νᵥ and δs are almost constant: they change very seldom, certainly not at each audio sample.
But currying saves the day! We simply calculate λ as soon as we have the necessary parameters. Then, at each of the many, many audio samples to come, we only need to perform the remaining, very easy magic: yⱼ = yⱼ₋₁ + λ ⋅ (xⱼ - yⱼ₋₁). So we're being efficient, and still keeping a nice, safe, referentially transparent, purely functional interface.
1 Note that this kind of state-passing can generally be done more nicely with the State or ST monad, that's just not particularly beneficial in this example
2 Yes, this is a lambda symbol. I hope I'm not confusing anybody – fortunately, in Haskell it's clear that lambda functions are written with \, not with λ.
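To make the payoff concrete, here is a stripped-down sketch of that shape (my code with a textbook one-pole coefficient, not the author's actual filter library): the curried outer function computes λ once, and the inner function it returns does only the cheap per-sample update.
-- A one-pole lowpass, shaped so the costly part runs once per
-- (cutoff, sampleRate) pair and only the cheap update runs per sample.
lowpass :: Double -> Double -> (Double -> Double -> (Double, Double))
lowpass cutoff sampleRate =
  let lambda = 1 - exp (-2 * pi * cutoff / sampleRate)  -- computed once
  in \y0 x ->                                           -- run per sample
       let y = y0 + lambda * (x - y0)
       in (y, y)   -- (new state, output sample)

-- step = lowpass 440 48000   -- lambda is now baked into step
-- Data.List.mapAccumL step 0 samples would then thread the state along.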
It's somewhat dubious to ask what the benefits of currying are without specifying the context in which you're asking the question:
In some cases, like functional languages, currying will merely be seen as a relatively local change, where you could replace things with explicit tupled domains. However, this isn't to say that currying is useless in these languages. In some sense, programming with curried functions makes you "feel" like you're programming in a more functional style, because you more typically face situations where you're dealing with higher-order functions. Certainly, most of the time, you will "fill in" all of the arguments to a function, but in the cases where you want to use the function in its partially applied form, this is a bit simpler to do in curried form. We typically tell our beginning programmers to use this when learning a functional language just because it feels like better style and reminds them they're programming in more than just C. Having things like curry and uncurry also helps with certain conveniences within functional programming languages; I can think of arrows within Haskell as a specific example of where you would use curry and uncurry a bit to apply things to different pieces of an arrow, etc...
In some cases, you want to think about more than functional programs, you can present currying / uncurrying as a way to state the elimination and introduction rules for and in constructive logic, which provides a connection to a more elegant motivation for why it exists.
In some cases, for example, in Coq, using curried functions versus tupled functions can produce different induction schemes, which may be easier or harder to work with, depending on your applications.
I used to think that currying was simple syntax sugar that saves you a bit of typing. For example, instead of writing
(\ x -> x + 1)
I can merely write
(+1)
The latter is instantly more readable, and less typing to boot.
So if it's just a convenient short cut, why all the fuss?
Well, it turns out that because function types are curried, you can write code which is polymorphic in the number of arguments a function has.
For example, the QuickCheck framework lets you test functions by feeding them randomly generated test data. It works on any function whose input types can be auto-generated. But, because of currying, the authors were able to rig it so this works with any number of arguments. Were functions not curried, there would have to be a different testing function for each number of arguments -- and that would just be tedious.
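For instance (my example, using the real Test.QuickCheck API), the very same quickCheck call accepts properties of one, two, or more arguments:
import Test.QuickCheck (quickCheck)

-- one argument
prop_reverseTwice :: [Int] -> Bool
prop_reverseTwice xs = reverse (reverse xs) == xs

-- two arguments; no separate "quickCheck2" is needed, thanks to currying
prop_plusCommutes :: Int -> Int -> Bool
prop_plusCommutes x y = x + y == y + x

main :: IO ()
main = do
  quickCheck prop_reverseTwice
  quickCheck prop_plusCommutes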

Resources