Does the WHNF reduction in Haskell happen at Compile time? - haskell

AFAIU the object file generated by the Haskell compiler should be machine code. So does that object file carry a representation of the original AST and reduce it at run time, or does this reduction happen at compile time, with only the final WHNF values converted to the corresponding machine code?
I understand that compilation time in the latter case would be a function of the time complexity of the program itself, which I think makes it less likely.
Can someone give a clear explanation of what happens at run time and what happens at compile time in the case of Haskell (GHC)?

A compiler could do its job by leaving all of the reduction to runtime. That is, the resulting executable could have a (large) data section, where the whole program AST is encoded, and a (small) text/code section with a generic WHNF reducer which operates on the AST.
Note that the above approach would work in any language. E.g. a Python compiler could also generate an executable file comprising AST data and a generic reducer.
The reducer would follow the so-called small-step semantics of the language, which is a very well known notion in computer science (more specifically, in programming languages theory).
However, the performance of such an approach would be quite poor.
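To make the idea concrete, here is a minimal, hypothetical Haskell sketch of that approach: a toy lambda-calculus AST plus a reducer that stops at weak head normal form. The names (Expr, whnf) and the naive substitution are illustrative only; nothing like this ships with GHC.
-- A toy, untyped lambda-calculus AST and a reducer that stops at WHNF.
data Expr
  = Var String
  | Lam String Expr
  | App Expr Expr
  | Lit Int
  deriving Show

-- Naive substitution (not capture-avoiding); fine for this illustration.
subst :: String -> Expr -> Expr -> Expr
subst x s (Var y)   = if x == y then s else Var y
subst x s (Lam y b) = if x == y then Lam y b else Lam y (subst x s b)
subst x s (App f a) = App (subst x s f) (subst x s a)
subst _ _ (Lit n)   = Lit n

-- Reduce only until the outermost constructor is a lambda or a literal:
-- that is weak head normal form; arguments are left untouched.
whnf :: Expr -> Expr
whnf (App f a) =
  case whnf f of
    Lam x body -> whnf (subst x a body)
    f'         -> App f' a          -- stuck term (free variable in head)
whnf e = e
For instance, whnf (App (Lam "x" (Var "x")) (Lit 1)) gives Lit 1, and the argument of an application is never reduced unless it reaches head position.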
Researchers in programming languages worked on finding better approaches, resulting in the definition of abstract machines. Essentially, an abstract machine is an algorithm for running a high-level program in a lower-level setting. Usually, it exploits a few data structures (e.g. stacks) to make the process more efficient. In the case of functional languages such as Haskell, well known abstract machines include:
Categorical Abstract Machine (Caml originally used this, I think)
Krivine Machine
SECD
Spineless Tagless G-machine (GHC uses this one)
The problem itself is far from being trivial. There has been, and I'd say there still is, research on making WHNF reduction more efficient.
Each Haskell definition, after GHC compilation, becomes a sequence of assembly instructions, which manipulate the state of the STG machine. There is no AST around, only code which manipulates data / closures / etc.
One could say that it is very important to use such advanced techniques to improve performance, coupled with heavy optimizations. A negative consequence, though, is that it becomes hard to understand the performance of the resulting code from the original one, since one needs to take into account how the abstract machine works (which is non trivial) and the optimizations (which are quite complex nowadays). To some lesser extent, this is also the case for heavily optimizing C or C++ compilers, where it becomes harder to know whether an optimization was triggered or not.
Ultimately, an experienced programmer (in Haskell, C, C++, or anything else) will come to understand the basic optimizations of their compiler, and the basic mechanisms of the abstract machine being used. However, this is not something which is easy to master, I think.
In the question, it is mentioned that WHNF reduction could be performed at compile time. This is only partially true, since the value of the variables originating from IO actions can not be known until runtime, so reduction can only happen at runtime when it involves those values. Further, performing reduction can also make performance worse! E.g.
let x = complex computation in x + x
-- vs
complex computation + complex computation
The latter is the result of reducing the former, but it duplicates work! Indeed, most abstract machines use a lazy reduction approach that causes x to be computed only once in such cases.
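A small experiment (illustrative only; Debug.Trace is used just to observe when the computation runs, in GHCi without optimizations) shows the sharing that the let binding buys:
import Debug.Trace (trace)

-- The "computing" message is printed once per evaluation of the traced expression.
shared :: Int
shared = let x = trace "computing" (sum [1 .. 1000000 :: Int]) in x + x

duplicated :: Int
duplicated = trace "computing" (sum [1 .. 1000000 :: Int])
           + trace "computing" (sum [1 .. 1000000 :: Int])
Forcing shared prints "computing" once, because the thunk bound to x is overwritten with its value after the first use; duplicated prints it twice.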

Related

Is there any link between functional programming and strong typing?

All of the "pure" functional languages are strongly typed. Is there any link between the two?
Non-trivial functional programming techniques make heavy use of first-class and higher-order functions. First-class functions are implemented as closures. Non-trivial use of first-class functions and closures is only sane when you have garbage collection. Efficient and reliable garbage collection requires memory safety (which I assume you mean by "strongly typed"). So there you go.
Purity doesn't really matter for that.
"Pure" functional languages are those which enforce referential transparency. The enforcement could be static (via the type system), or it could be dynamic (e.g. a runtime failure). I'm guessing you mean "statically typed" when you say "strongly typed"...
Since the community from which typed, pure functional programming emerged is separately interested in reducing runtime failures and making programming safer, adding purity without type enforcement -- such that runtime failure is still an option -- would be incongruous.
So it's no surprise you see types and effect typing going together with purity-by-default: it is all about reducing runtime failures.
Mercury (in which you can do functional programming, but is more of a pure logic programming language) actually has an explicit static purity system. Every predicate or function is statically known to be pure or impure (or semipure, but I'm not going to go into that in detail). Putting a call to an impure function inside a pure function (pure is the default) will result in an error detected at compile time.
It also has a static type system, in which the type of every expression/variable is statically known by the compiler, and type errors are detected at compile time. But the type system is completely independent of the purity system (in that you can have pure, impure, and semipure functions of any given type).
So we can imagine a different language with the same static purity system but in which the types of expressions/variables are not statically known, and may vary dynamically at runtime. One could even imagine such a language having "weak types" in the sense of PHP (i.e. the language will try to convert values such that operations that don't make sense on the value's type can actually be performed), or in the sense of C (i.e. you can convince the language to store values of one type in a variable the language will treat as if it were a different type).
One could also imagine a language in which the purity was not statically known but still enforced at runtime. The language would have to do something such as keeping track of whether it was in a pure call, and if so rejecting calls to impure primitive operations.
So in that sense, no there's no link between strong typing and pure programming.
However, languages which actually enforce purity (rather than merely encouraging it, as in Scala) have traditionally achieved this by static analysis. Indeed, one of the motivations for pure code is that it is much more susceptible to static analysis than code which is impure in arbitrary ways. A contrived example is that a function which takes a boolean argument and returns something can be known to return one of at most two results if it is pure; if it is not known to be pure then the language has to assume it might return something different at every single invocation. And if you're interested in doing static analysis of your code and you have this static analysis system for enforcing purity, you might as well make it enforce type safety as well. So there's just "not that much call" for languages which enforce purity but don't have strong static type systems. I'm not aware of any that actually exist (there's not all that many languages that enforce purity at all, as far as I know).

What is the practical use for laziness as a built-in language feature?

It's fairly obvious why a functional programming language that wants to be lazy needs to be pure. I'm looking at the reverse question: if a language wants to be pure, is there a big advantage in being lazy? One argument, made by one of the designers of Haskell, is that it removes temptation; maybe, but I'm trying to weigh up the more concrete advantages.
Given that you want to do functional programming, what are the use cases where built-in laziness lets you express things more clearly, simply or concisely?
Stated simply: Why is laziness so important that you'd want to build it into the language?
(I'm looking for use cases more oriented towards an application rather than a demo - I know you can do things like producing an infinite list of prime numbers by filtering an infinite list of natural numbers, but who writes that ten times 'fore lunch...)
"Nothing is evaluated until it is needed at another place" is a simplified metaphor which doesn't cover all aspects of lazy evaluation (e.g. it doesn't mention the strictness phenomena).
From a theoretical standpoint, there are three ways to go when designing a pure language (assuming it's based on some kind of lambda calculus and not on a more exotic evaluation model): strict, non-strict and total.
Each of them has its advantages and disadvantages, so you need to read corresponding research papers.
Total languages are the most pure of the three. In the other two, non-termination can be seen as a side effect, so strictness and totality analysers must be built to keep an implementation efficient. Both analyses are undecidable, so the analysers can never be complete.
However, the total languages are least expressive: it's impossible for a total language to be Turing complete. A frequent approach to get good enough expressiveness is to have a built-in proof system for well-founded recursion, which is not much easier to build than the analyzers for non-total languages.
From a practical standpoint, non-strict semantics lets you define control abstractions more easily, as control structures are essentially non-strict. Even in a strict language you still need some places with non-strict semantics: e.g. the if construct has non-strict semantics even in strict languages.
So if your language is strict, control structures are a special case. In contrast, a non-strict language can be uniformly non-strict - it has no inherent need for strict constructs.
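As a tiny illustration (if' is not a standard Prelude function, just a sketch), in Haskell a control structure like if/then/else can be written as an ordinary function, because the branch that is not taken is simply never forced:
-- An ordinary function that behaves like the built-in if/then/else.
-- Laziness guarantees the unselected branch is never evaluated.
if' :: Bool -> a -> a -> a
if' True  t _ = t
if' False _ e = e

-- Works even when the dead branch would diverge or crash:
safe :: Int
safe = if' True 42 (error "never evaluated")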
As for "who writes that ten times 'fore lunch" - anyone who uses Haskell for their projects does. I think developing a non-toy project in a language (a non-strict language in your case) is the best way to grasp its advantages and disadvantages.
Below are a few generic usecases for laziness illustrated by non-toy examples:
Cases when control flow is hard to predict. Think of attribute grammars: without laziness you have to perform a topological sort on attributes to resolve the dependencies, and re-sorting your code every time the dependency graph changes is not practical. In Haskell you can implement the attribute grammar formalism without an explicit sorting, and there are at least two actual implementations on Hackage. Attribute grammars have wide application in compiler construction.
The "generate and search" approach to solve many optimizaton problems. In a strict language you have to interleave generation and search, in Haskell you just compose separate generation and searching functions, and your code remains syntactically modular, but interleaved at runtime. Think of the traveling salesman problem (TSP), when you generate all possible tours and then search through them using a branch-and-bound algorithm. Note that branch an bound algorithms only inspects certain first cities of a tour, only the necessary parts of routes are generated. The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing.
Lazy code has non-modular control flow, so a single function can have many possible control flows depending on the environment it executes in. This phenomenon can be seen as a kind of 'control flow polymorphism', so lazy control flow abstractions are more generic than their strict counterparts, and a standard library of higher-order functions is much more useful in a lazy language. Think of Python generators, loops and list iterators: in Haskell, list functions cover all three use cases, with control flow adapting to different usage scenarios because of laziness. It is not limited to lists - think of Data.Arrow and iteratees, lazy and strict versions of the State monad etc. Also note that non-modular control flow is both an advantage and a disadvantage, as it makes reasoning about performance harder.
Lazy possibly infinite data structures are useful beyond toy examples. See works of Conal Elliott on memoizing higher order functions using tries. Infinite data structures appear as infinite search spaces (see 2), infinite loops and never-exhausting generators in Python sense (see 3).
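As promised above, here is the shape of the generate-and-search composition in Haskell, deliberately tiny and without the branch-and-bound pruning (tourLength, allTours and bestTour are made-up names; the distance matrix is assumed square):
import Data.List (permutations, minimumBy)
import Data.Ord (comparing)

type City = Int

-- Length of an (open) tour given a distance matrix.
tourLength :: [[Double]] -> [City] -> Double
tourLength dist tour = sum (zipWith (\a b -> dist !! a !! b) tour (tail tour))

-- Generation: every tour starting from city 0, produced lazily on demand.
allTours :: Int -> [[City]]
allTours n = map (0 :) (permutations [1 .. n - 1])

-- Search: written separately, yet laziness interleaves it with the generator,
-- so tours are produced one at a time as the search consumes them.
bestTour :: [[Double]] -> Int -> [City]
bestTour dist n = minimumBy (comparing (tourLength dist)) (allTours n)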
Mac OS X's Core Image is a good practical example of lazy evaluation.
Basically, Core Image lets you create a directed acyclic graph of image generators and filters. No evaluation actually takes place until the last step in the process: materialization. When you request to materialize a Core Image graph, the final image frame is propagated backwards through the graph's transformations, thus minimizing the quantity of actual pixel values that need to be evaluated.
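A rough Haskell analogue of that style (purely illustrative; this is not Core Image's API): represent an image as a function from coordinates to a pixel value, so composing filters only builds a description, and pixels are computed only for the region you finally materialize.
type Pixel = Double
type Image = (Int, Int) -> Pixel

checkerboard :: Image
checkerboard (x, y) = fromIntegral ((x + y) `mod` 2)

-- Filters compose without touching any pixel data.
brighten :: Pixel -> Image -> Image
brighten k img = \p -> min 1 (img p + k)

blurX :: Image -> Image
blurX img (x, y) = (img (x - 1, y) + img (x, y) + img (x + 1, y)) / 3

-- Only here are pixel values actually computed, and only for the requested region.
materialize :: Image -> [(Int, Int)] -> [Pixel]
materialize img region = map img region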
There's an extensive discussion of this point in Hughes's classic Why Functional Programming Matters. Therein, Hughes argues that laziness allows for improved modularity, using a number of accessible examples.

Will I develop good/bad habits because of lazy evaluation?

I'm looking to learn functional programming with either Haskell or F#.
Are there any programming habits (good or bad) that could form as a result of Haskell's lazy evaluation? I like the idea of Haskell's functional programming purity for the purposes of understanding functional programming. I'm just a bit worried about two things:
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
There are habits that you get into when programming in a lazy language that don't work in a strict language. Some of these seem so natural to Haskell programmers that they don't think of them as lazy evaluation. A couple of examples off the top of my head:
f x y = if x > y then .. a .. b .. else c
  where
    a = expensive
    b = expensive
    c = expensive
here we define a bunch of subexpressions in a where clause, with complete disregard for which of them will ever be evaluated. It doesn't matter: the compiler will ensure that no unnecessary work is performed at runtime. Non-strict semantics means that the compiler is able to do this. Whenever I write in a strict language I trip over this a lot.
Another example that springs to mind is "numbering things":
pairs = zip xs [1..]
here we just want to associate each element in a list with its index, and zipping with the infinite list [1..] is the natural way to do it in Haskell. How do you write this without an infinite list? Well, the fold isn't too readable
pairs = foldr (\x xs -> \n -> (x,n) : xs (n+1)) (const []) xs 1
or you could write it with explicit recursion (too verbose, doesn't fuse). There are several other ways to write it, none of which are as simple and clear as the zip.
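For comparison, one plausible shape for the explicit-recursion version mentioned above (a sketch; go is just a local helper name):
pairs :: [a] -> [(a, Int)]
pairs xs = go xs 1
  where
    go []       _ = []
    go (y : ys) n = (y, n) : go ys (n + 1)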
I'm sure there are many more. Laziness is surprisingly useful, when you get used to it.
You'll certainly learn about evaluation strategies. Non-strict evaluation strategies can be very powerful for particular kinds of programming problems, and once you're exposed to them, you may be frustrated that you can't use them in some language setting.
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
Right. You'll be a more rounded programmer. Abstractions that provide "delaying" mechanisms are fairly common now, so you'd be a worse programmer not to know them.
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
Lazy evaluation is an important part of the functional paradigm. It's not a requirement - you can program functionally with eager evaluation - but it's a tool that naturally fits functional programming.
You see people explicitly implement/invoke it (notably in the form of lazy sequences) in languages that don't make it the default; and while mixing it with imperative code requires caution, pure functional code allows safe use of laziness. And since laziness makes many constructs cleaner and more natural, it's a great fit!
(Disclaimer: no Haskell or F# experience)
To expand on Beni's answer: if we ignore operational aspects in terms of efficiency (and stick with a purely functional world for the moment), every terminating expression under eager evaluation is also terminating under non-strict evaluation, and the values of both (their denotations) coincide.
This is to say that lazy evaluation is strictly more expressive than eager evaluation. By allowing you to write more correct and useful expressions, it expands your "vocabulary" and ability to think functionally.
Here's one example of why:
A language can be lazy-by-default but with optional eagerness, or eager by default with optional laziness, but in fact it's been shown (cf. Okasaki, for example) that there are certain purely functional data structures which can only achieve certain orders of performance if implemented in a language that provides laziness, either optionally or by default.
Now when you do want to worry about efficiency, then the difference does matter, and sometimes you will want to be strict and sometimes you won't.
But worrying about strictness is a good thing, because very often the cleanest thing to do (and not only in a lazy-by-default language) is to use a thoughtful mix of lazy and eager evaluation, and thinking along these lines will be a good thing no matter which language you wind up using in the future.
Edit: Inspired by Simon's post, one additional point: many problems are most naturally thought about as traversals of infinite structures rather than basically recursive or iterative. (Although such traversals themselves will generally involve some sort of recursive call.) Even for finite structures, very often you only want to explore a small portion of a potentially large tree. Generally speaking, non-strict evaluation allows you to stop mixing up the operational issue of what the processor actually bothers to figure out with the semantic issue of the most natural way to represent the actual structure you're using.
Recently, I found myself doing Haskell-style programming in Python. I took over a monolithic function that extracted/computed/generated values and put them in a file sink, in one step.
I thought this was bad for understanding, reuse and testing. My plan was to separate value generation and value processing. In Haskell I would have generated a (lazy) list of those computed values in a pure function and would have done the post-processing in another (side-effect-bearing) function.
Knowing that non-lazy lists in Python can be expensive if they tend to get big, I thought about the closest Python solution. To me that was to use a generator for the value generation step.
The Python code got much better thanks to my lazy (pun intended) mindset.
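The Haskell version of that separation might look roughly like this (a hedged sketch with a made-up computation; thanks to laziness the whole list never needs to exist in memory at once):
-- Pure, lazy generation of the values...
computeValues :: [Int] -> [(Int, Int)]
computeValues = map (\x -> (x, x * x))

-- ...and a separate, side-effect-bearing consumer that streams them to a sink.
writeValues :: FilePath -> [(Int, Int)] -> IO ()
writeValues path vals = writeFile path (unlines (map show vals))

main :: IO ()
main = writeValues "out.txt" (computeValues [1 .. 1000000])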
I'd expect bad habits.
I saw one of my coworkers try to use (hand-coded) lazy evaluation in our .NET project. Unfortunately, the lazy evaluation hid a bug where remote invocations were attempted before the start of main executed, and thus outside the try/catch meant to handle the "Hey, I can't connect to the internet" case.
Basically, the laziness hid the fact that something really expensive was lurking behind a property read, and so made it look like a good idea to do inside the type initializer.
Contextual information missing.
Laziness (or more specifically, the assumed availability of purity and equational reasoning) is sometimes quite useful for specific problem domains, but not necessarily better in general. If you're talking about general-purpose language settings, relying on lazy evaluation rules by default is considered harmful.
Analysis
Any language that has functional combinations (or combinations of applicable terms; i.e. function call expressions, function-like macro invocations, FEXPRs, etc.) enforces rules on evaluation, implying an order for the different parts of the subcomputation therein. For convenience and simplicity of specification, a language usually specifies the rules in a flavor paired to a reduction strategy:
Strict evaluation, or applicative-order reduction, which evaluates all subexpressions first, before the subcomputation of the remaining evaluation of the whole combination.
Non-strict evaluation, or normal-order reduction, which does not necessarily evaluate every subexpression first.
The remaining subcomputation finally determines the result of the whole evaluation of the expression. (For program-defined constructs, this usually implies the substitution of the evaluated argument into something like a function body, and the subsequent evaluation of the result.)
Lazy evaluation, or the call-by-need strategy, is a typical concrete instance of non-strict evaluation. To make it practically usable, subexpression evaluations are required to be pure (side-effect-free), so that the reductions implementing the strategy have the Church-Rosser property whatever order of subexpression evaluation is actually adopted.
One significant merit of such a design is the availability of equational reasoning: users can encode the equality of expression evaluation in the program, and an optimizing implementation of the language can perform transformations depending directly on such constructs.
However, there are many serious problems behind such design.
Equational reasoning is not as important in practice as it seems at first glance.
The encoding is not a separate feature; it places specific requirements on other features to carry it. For a pure language it is even more difficult to encode these things elsewhere, so there is pressure to make the type system more expressive, and hence typing and typechecking more complicated.
Whether the compiler uses the equational reasoning directly encoded in the program or not is an implementation detail. Promoting its importance is more a matter of taste.
Syntactic equations are not powerful enough to encode semantic conditions like the cases of "unspecified behavior" in ISO C. Additional primitives are still needed to express the non-determinism of such semantic equivalence classes in order to make optimization techniques based on that equivalence possible.
It is computationally inefficient at the very basic level by default, and not easily amendable by the programmer.
There is no systematic way to reduce the cost of equations that the programmer knows are not required.
One significant cost comes from the clash between lazily evaluated combinations and proper tail recursion over those combinations.
The unpredictable accumulation of thunks used to memoize lazily evaluated expressions also causes trouble for the utilization of machine resources (e.g. registers and the cache memory); a concrete foldl sketch appears at the end of this Analysis section.
Purely functional languages like Haskell may declare that referential transparency is a Good Thing™. However, this is faulty in certain contexts.
There are semantic gaps in the terminology itself. Purity is not the only aspect of referential transparency; moreover, there are other kinds of such properties not readily provided by the evaluation strategy.
In general, referential transparency should not be a goal of programming. Instead, it is one optional way to implement composable program components. Composability is essentially about the expected invariance of a component's interface, and there are many ways to keep composability without the aid of any kind of referential transparency. Should the guarantee be enforced by the language rules? It depends. At the very least, it should not depend entirely on the language designers' point of view.
The lack of impure evaluation requires more syntactic noise to encode many constructs that are simply expressible with mutable state cells in traditional impure languages. The workarounds for these practical problems make the solutions more difficult and harder for humans to reason about.
For example, I/O operations are side-effectful, and thus not directly expressible in Haskell expressions under the usual non-strict evaluation rules; otherwise the order of effects would be non-deterministic.
To overcome this shortcoming, indirect conventional constructs like the IO monad were proposed to simulate the traditional imperative style. Such monadic constructs are essentially "indirect" in a sense similar to continuation-passing style, which is considerably low-level and difficult to read. Even though monads can be more "powerful" than continuations in expressiveness, they are not naturally more powerful than higher-level alternatives (like algebraic effect systems) when the lazy evaluation strategy is not enforced by default.
Besides the intuition problem above, the necessity of using monadic constructs is often difficult to prove formally (if it is provable at all). As a result, they are very easily abused (just like the design patterns of "OOP" languages derived from Simula). The related syntactic sugar, notably the famous do-notation, was abused for a few decades before this became well known in the Haskell community.
Simulating strict language constructs in languages like Haskell usually needs monadic constructs, while simulating non-strict constructs in strict languages is considerably simpler and easier to implement efficiently. For instance, there is SRFI-45.
The lazy evaluation strategy does not deal well with many other non-strict constructs.
For example, seq has to be compiler magic in GHC. It is not easily expressible in terms of other Haskell constructs without massive changes to the core Haskell language rules.
Although traditional strict languages also do not allow user programs to simulate the enforcement of evaluation order easily, so such sequential constructs are primitive there (examples: the C-like ; is primitive; the derivation of Scheme's begin relies on the primitive lambda, which in turn implies an implicit evaluation order on expressions), sequencing can be implemented by reusing the applicative-order rules without additional ad-hoc primitives, as in the derivation of the $sequence operator in the Kernel language.
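As a concrete illustration of the thunk-accumulation problem mentioned above, the standard lazy left fold builds a chain of unevaluated additions before computing anything (a small sketch; Data.List.foldl' is the usual remedy):
import Data.List (foldl')

-- foldl builds a thunk (((0 + 1) + 2) + ...) proportional to the list length
-- before any addition runs, which can exhaust memory on large inputs.
lazySum :: [Int] -> Int
lazySum = foldl (+) 0

-- foldl' forces the accumulator at each step and runs in constant space.
strictSum :: [Int] -> Int
strictSum = foldl' (+) 0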
Concerns about specific questions
Lazy evaluation is not a must for the "functional paradigm", though, as mentioned above, purely functional languages are likely to have a lazy evaluation strategy by default. The common property is the availability of first-class functions. Impure languages like Lisp and the ML family are considered "functional", and they use eager evaluation by default. Also note that the popularity of the "functional paradigm" came after the introduction of function-level programming. The latter is quite different, but still somewhat similar to "functional programming" in its treatment of first-classness.
As mentioned above, the ways to simulate laziness in eager languages are well known. Additionally, for pure programs, there may be no non-trivial semantic difference between call-by-need and normal-order reduction. Figuring out something that really only works in a lazy world is actually not easy. (Do you want to implement the language?) Just go ahead.
Conclusion
Be careful about the problem domain. Lazy evaluation may work well for specific scenarios. However, making it the default is likely to be a bad idea in general, because users (whoever uses the language to program, or to derive a new dialect based on the current language) will likely have few chances to ignore all of the problems it will cause.
Well, try to think of something that would work if lazily evaluated, that wouldn't if eagerly evaluated. The most common category of these would be lazy logical operator evaluation used to hide a "side effect". I'll use C#-ish language to explain, but functional languages would have similar analogs.
Take the simple C# lambda:
(a,b) => a==0 || ++b < 20
In a lazy-evaluated language, if a==0, the expression ++b < 20 is not evaluated (because the entire expression evaluates to true either way), which means that b is not incremented. In both imperative and functional languages, this behavior (and similar behavior of the AND operator) can be used to "hide" logic containing side effects that should not be executed:
(a,b) => a==0 && save(b)
"a" in this case may be the number of validation errors. If there were validation errors, the first half fails and the second half is not evaluated. If there were no validation errors, the second half is evaluated (which would include the side effect of trying to save b) and the result (apparently true or false) is returned to be evaluated. If either side evaluates to false, the lambda returns false indicating that b was not successfully saved. If this were evaluated "eagerly", we would try to save regardless of the value of "a", which would probably be bad if a nonzero "a" indicated that we shouldn't.
Side effects in functional languages are generally considered a no-no. However, there are few non-trivial programs that do not require at least one side effect; there's generally no other way to make a functional algorithm integrate with non-functional code, or with peripherals like a data store, display, network channel, etc.

What functional language techniques can be used in imperative languages?

Which techniques or paradigms normally associated with functional languages can productively be used in imperative languages as well?
e.g.:
Recursion can be problematic in languages without tail-call optimization, limiting its use to a narrow set of cases, so that's of limited usefulness
Map and filter have found their way into non-functional languages, even though they have a functional sort of feel to them
I happen to really like not having to worry about state in functional languages. If I were particularly stubborn I might write C programs without modifying variables, only encapsulating my state in variables passed to functions and in values returned from functions.
Even though functions aren't first class values, I can wrap one in an object in Java say, and pass that into another method. Like Functional programming, just less fun.
So, for veterans of functional programming, when you program in imperative languages, what ideas from FP have you applied successfully?
Pretty nearly all of them?
If you understand functional languages, you can write imperative programs that are "informed" by a functional style. That will lead you away from side effects, and toward programs in which reading the program text at any particular point is sufficient to let you really know what the meaning of the program is at that point.
Back at the Dawn of Time we used to worry about "coupling" and "cohesion". Learning an FP will lead you to write systems with optimal (minimal) coupling, and high cohesion.
Here are things that get in the way of doing FP in a non-FP language:
If the language doesn't support lambda/closures, and doesn't have any syntactic sugar to easily mostly hack it, you are dead in the water. You don't call map/filter without closures.
If the language is statically-typed and doesn't support generics, you are dead in the water. All the good FP stuff uses genericity.
If the language doesn't support tail-recursion, you are hindered. You can write implementations of e.g. 'map' iteratively; also often your data may not be too large and recursion will be ok.
If the language does not support algebraic data types and pattern-matching, you will be mildly hindered. It's just annoying not to have them once you've tasted them.
If the language cannot express type classes, well, oh well... you'll get by, but darn if that's not just the awesomest feature ever, but Haskell is the only remotely popular language with good support.
Not having first-class functions really puts a damper on writing functional programs, but there are a few things that you can do that don't require them. The first is to eschew mutable state - try to have most or all of your classes return new objects that represent the modified state instead of making the change internally. As an example, if you were writing a linked list with an add operation, you would want to return the new linked list from add as opposed to modifying the object.
While this may make your programs less efficient (due to the increased number of objects being created and destroyed) you will gain the ability to more easily debug the program because the state and operation of the objects becomes more predictable, not to mention the ability to nest function calls more deeply because they have state inputs and outputs.
I've successfully used higher-order functions a lot, especially the kind that are passed in rather than the kind that are returned. The kind that are returned can be a bit tedious but can be simulated.
All sorts of applicative data structures and recursive functions work well in imperative languages.
The things I miss the most:
Almost no imperative languages guarantee to optimize every tail call.
I know of no imperative language that supports case analysis by pattern matching.

What is "Total Functional Programming"?

Wikipedia has this to say:
Total functional programming (also known as strong functional programming, to be contrasted with ordinary, or weak functional programming) is a programming paradigm which restricts the range of programs to those which are provably terminating.
and
These restrictions mean that total functional programming is not Turing-complete. However, the set of algorithms which can be used is still huge. For example, any algorithm which has had an asymptotic upper bound calculated for it can be trivially transformed into a provably-terminating function by using the upper bound as an extra argument which is decremented upon each iteration or recursion.
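A small Haskell sketch of that "extra decreasing argument" trick (gcd is used only because a crude bound on its recursion depth is easy to state; a genuinely total language would tie the bound into its termination checker rather than returning Maybe):
-- Euclid's algorithm with explicit "fuel": the extra Int strictly decreases
-- on every recursive call, so termination is obvious by induction on it.
gcdFuel :: Int -> Int -> Int -> Maybe Int
gcdFuel _    a 0 = Just a
gcdFuel fuel a b
  | fuel <= 0 = Nothing                  -- cannot happen if the bound is right
  | otherwise = gcdFuel (fuel - 1) b (a `mod` b)

-- The second argument shrinks on every step, so (abs b + 1) calls suffice.
gcdTotal :: Int -> Int -> Maybe Int
gcdTotal a b = gcdFuel (abs b + 1) (abs a) (abs b)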
There is also a Lambda The Ultimate Post about a paper on Total Functional Programming.
I hadn't come across that until last week on a mailing list.
Are there any more resources, references or any example implementations that you know of?
If I understood that correctly, Total Functional Programming means just that: Programming with Total Functions. If I remember my math courses correctly, a Total Function is a function which is defined over its entire domain, a Partial Function is one which has "holes" in its definition.
Now, if you have a function which for some input value v goes into an infinite recursion or an infinite loop or in general doesn't terminate in some other fashion, then your function isn't defined for v, and thus partial, i.e. not total.
Total Functional Programming doesn't allow you to write such a function. All functions always return a result for all possible inputs; and the type checker ensures that this is the case.
My guess is that this vastly simplifies error handling: there aren't any.
The downside is already mentioned in your quote: it's not Turing-complete. E.g. an Operating System is essentially a giant infinite loop. Indeed, we do not want an Operating System to terminate, we call this behaviour a "crash" and yell at our computers about it!
While this is an old question, I think that none of the answers so far mention the real motivation for total functional programming, which is this:
If programs are proofs, and proofs are programs, then programs which have 'holes' don't make any sense as proofs, and introduce logical inconsistency.
Basically, if a proof is a program, an infinite loop can be used to prove anything. This is really bad, and provides much of the motivation for why we might want to program totally. Other answers tend not to account for the flip side of the paper: while the languages are technically not Turing complete, you can recover a lot of interesting programs by using co-inductive definitions and functions. We're very prone to think of inductive data, but codata serves an important purpose in these languages, where you can totally define a definition which is infinite (and when doing real computation which terminates, you will potentially use only a finite piece of this, or maybe not if you're writing an operating system!).
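A small Haskell sketch of the codata idea (illustrative only; Haskell does not actually check productivity the way a total language would): an infinite stream defined by guarded corecursion, of which any finite observation terminates.
-- Codata: a Stream has no "nil" case, so every value is necessarily infinite.
data Stream a = Cons a (Stream a)

-- Guarded corecursion: the recursive call sits directly under a constructor,
-- so producing any finite prefix terminates.
nats :: Stream Integer
nats = go 0
  where go n = Cons n (go (n + 1))

takeS :: Int -> Stream a -> [a]
takeS n (Cons x xs)
  | n <= 0    = []
  | otherwise = x : takeS (n - 1) xs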
It is also worth noting that most proof assistants work on this principle; Coq, for example.
Charity is another language that guarantees termination:
http://pll.cpsc.ucalgary.ca/charity1/www/home.html
Hume is a language with 4 levels. The outer level is Turing complete and the innermost layer guarantees termination:
http://www-fp.cs.st-andrews.ac.uk/hume/report/
