Good Haskell coding standards - haskell

Could someone provide a link to a good coding standard for Haskell? I've found this and this, but they are far from comprehensive. Not to mention that the HaskellWiki one includes such "gems" as "use classes with care" and "defining symbolic infix identifiers should be left to library writers only."

Really hard question. I hope your answers turn up something good. Meanwhile, here is a catalog of mistakes or other annoying things that I have found in beginners' code. There is some overlap with the Cal Tech style page that Kornel Kisielewicz points to. Some of my advice is every bit as vague and useless as the HaskellWiki "gems", but I hope at least it is better advice :-)
Format your code so it fits in 80 columns. (Advanced users may prefer 87 or 88; beyond that is pushing it.)
Don't forget that let bindings and where clauses create a mutually recursive nest of definitions, not a sequence of definitions.
Take advantage of where clauses, especially their ability to see function parameters that are already in scope (nice vague advice). If you are really grokking Haskell, your code should have a lot more where-bindings than let-bindings. Too many let-bindings is a sign of an unreconstructed ML programmer or Lisp programmer.
Avoid redundant parentheses. Some places where redundant parentheses are particularly offensive are
Around the condition in an if expression (brands you as an unreconstructed C programmer)
Around a function application which is itself the argument of an infix operator (Function application binds tighter than any infix operator. This fact should be burned into every Haskeller's brain, in much the same way that us dinosaurs had APL's right-to-left scan rule burned in.)
Put spaces around infix operators. Put a space following each comma in a tuple literal.
Prefer a space between a function and its argument, even if the argument is parenthesized.
Use the $ operator judiciously to cut down on parentheses. Be aware of the close relationship between $ and infix .:
f $ g $ h x == (f . g . h) x == f . g . h $ x
Don't overlook the built-in Maybe and Either types.
Never write if <expression> then True else False; the correct phrase is simply <expression>.
Don't use head or tail when you could use pattern matching (a short sketch follows this list).
Don't overlook function composition with the infix dot operator.
Use line breaks carefully. Line breaks can increase readability, but there is a tradeoff: Your editor may display only 40–50 lines at once. If you need to read and understand a large function all at once, you mustn't overuse line breaks.
Almost always prefer the -- comments which run to end of line over the {- ... -} comments. The braced comments may be appropriate for large headers—that's it.
Give each top-level function an explicit type signature.
When possible, align -- lines, = signs, and even parentheses and commas that occur in adjacent lines.
Influenced as I am by GHC central, I have a very mild preference to use camelCase for exported identifiers and short_name with underscores for local where-bound or let-bound variables.
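To make a few of these points concrete, here is a tiny sketch (the names describe and render are made up for illustration) that uses pattern matching instead of head and tail, a where-bound helper that sees a parameter already in scope, and $ to drop a pair of parentheses:
describe :: [Int] -> String
describe []       = "empty"
describe (x : xs) = render $ sum xs
  where
    -- render can see x, which is already in scope
    render rest = show x ++ " plus " ++ show rest ++ " more"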

Some good rules of thumb, imho:
Consult HLint to make sure you don't have redundant brackets and that your code isn't pointlessly point-full.
Avoid recreating existing library functions. Hoogle can help you find them.
Oftentimes an existing library function is more general than what you were about to write. For example, if you want Maybe (Maybe a) -> Maybe a, then join does that, among other things.
Argument naming and documentation is important sometimes.
For a function like replicate :: Int -> a -> [a], it's pretty obvious what each of the arguments does, from their types alone.
For a function that takes several arguments of the same type, like isPrefixOf :: (Eq a) => [a] -> [a] -> Bool, naming/documentation of arguments is more important.
If one function exists only to serve another function, and isn't otherwise useful, and/or it's hard to think of a good name for it, then it probably should live in its caller's where clause instead of at module scope.
DRY
Use Template-Haskell when appropriate.
Bundles of functions like zip3, zipWith3, zip4, zipWith4, etc. are very meh. Use Applicative style with ZipList instead (a sketch follows this list). You probably never really need functions like those.
Derive instances automatically. The derive package can help you derive instances for type-classes such as Functor (there is only one correct way to make a type an instance of Functor).
Code that is more general has several benefits:
It's more useful and reusable.
It is less prone to bugs because there are more constraints.
For example, if you want to program concat :: [[a]] -> [a], notice that it can be generalized to join :: Monad m => m (m a) -> m a. There is less room for error when programming join, because when programming concat you could reverse the lists by mistake, whereas in join there are very few things you can do at all.
When using the same stack of monad transformers in many places in your code, make a type synonym for it (another sketch follows this list). This will make the types shorter, more concise, and easier to modify in bulk.
Beware of "lazy IO". For example, readFile doesn't actually read the file's contents at the moment it is called; the contents are read lazily, only when the resulting string is demanded.
Avoid indenting so much that I can't find the code.
If your type is logically an instance of a type-class, make it an instance.
The instance can replace other interface functions you may have considered with familiar ones.
Note: If there is more than one logical instance, create newtype-wrappers for the instances.
Make the different instances consistent. It would have been very confusing/bad if the list Applicative behaved like ZipList.
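To illustrate the join and ZipList points above, here is a small sketch (flatten and zip3With are made-up names; everything else is from base):
import Control.Applicative (ZipList (..))
import Control.Monad (join)

-- join is the general form of Maybe (Maybe a) -> Maybe a
flatten :: Maybe (Maybe a) -> Maybe a
flatten = join

-- Applicative style over ZipList instead of reaching for zipWith3
zip3With :: (a -> b -> c -> d) -> [a] -> [b] -> [c] -> [d]
zip3With f xs ys zs = getZipList (f <$> ZipList xs <*> ZipList ys <*> ZipList zs)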
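And a sketch of the transformer type synonym and newtype-wrapper points (this assumes the mtl modules; AppConfig, AppState, App, sumAll and multiplyAll are invented names):
import Control.Monad.Except (ExceptT)
import Control.Monad.Reader (ReaderT)
import Control.Monad.State (StateT)
import Data.Monoid (Product (..), Sum (..))

data AppConfig = AppConfig
data AppState  = AppState

-- one synonym for the whole stack; if the stack changes, it changes in one place
type App a = ReaderT AppConfig (StateT AppState (ExceptT String IO)) a

-- Int has more than one sensible Monoid, so base wraps it in newtypes
sumAll, multiplyAll :: [Int] -> Int
sumAll      = getSum     . foldMap Sum
multiplyAll = getProduct . foldMap Product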

I like to try to organize functions as point-free style compositions as much as possible by doing things like:
func = boo . boppity . bippity . snd
  where boo     = ...
        boppity = ...
        bippity = ...
I like using ($) only to avoid nested parens or long parenthesized expressions
... I thought I had a few more in me, oh well

I'd suggest taking a look at this style checker.

I found a good markdown file covering almost every aspect of Haskell code style. It can be used as a cheat sheet. You can find it here: link

Related

Pre-defined infix operator list in Haskell?

Basically, I need to define an infix operator for flipped function composition, i.e. one that composes in the opposite order to g . f.
(#) :: (a -> b) -> (b -> c) -> a -> c
(#) = flip (.)
infixl 9 #
Here, temporarily, I have just chosen the symbol #, but I'm not certain this choice won't collide with other definitions.
The weird thing is somehow I could not find a Pre-defined infix operator list in Haskell.
I want the composition operator to be as concise as possible, hopefully a single character like ., and if possible I'd like to make it &. Is that OK?
Is there any guidance or best-practice tutorial? Please advise.
Related Q&A:
What characters are permitted for Haskell operators?
You don't see a list of Haskell built-in operators for the same reason you don't see a list of all built-in functions: they're everywhere. Some are in Prelude, some are in Control.Monad, etc., etc. Operators aren't special in Haskell; they're ordinary functions with neat syntax. And Haskellers are generally pretty operator-happy. Spend any time inside your favorite lens library and you'll find plenty of amusing-looking operators.
As for (#), it might be best to avoid it. I don't know of any built-in operators called that, but # can be a bit special when it comes to parsing. Specifically, a compiler extension (MagicHash) enables # at the end of ordinary identifiers, and GHC defines a lot of built-in (primitive) types following this practice. For instance, Int is the usual (boxed) integer type, whereas Int# is a primitive integer. Most people don't need to interface with this directly, but it is a use that # has in Haskell that would be confused slightly by the addition of your operator.
Your (#) operator is called (>>>) in Haskell, and it works on all Category instances, including functions. Its companion is (<<<), the generalization of (.) to all Category instances. If three characters is too long for you, I've seen it called (|>) in some other languages, but Haskell already uses that operator for something else. If you're not using Data.Sequence, you could use that operator. But personally, I'd just go with (>>>). Any Haskeller will recognize it pretty quickly.
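For what it's worth, a minimal sketch of that suggestion (addOne, double and addThenDouble are made-up names; (>>>) comes from Control.Category and works on plain functions):
import Control.Category ((>>>))

addOne :: Int -> Int
addOne = (+ 1)

double :: Int -> Int
double = (* 2)

-- addOne runs first, then double: for functions, (>>>) is flip (.)
addThenDouble :: Int -> Int
addThenDouble = addOne >>> double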

How to make a partial function?

I was thinking about how I could save myself from undefinition, and one idea I had was to enumerate all possible sources of partiality. At least I would know what to beware of. I have found three so far:
Incomplete pattern matches or guards.
Recursion. (Optionally excluding structural recursion on algebraic types.)
If a function is unsafe, any use of that function infects the user code. (Should I be saying "partiality is transitive"?)
I have heard of other ways to obtain a logical contradiction, for instance by using negative types, but I am not sure if anything of that sort applies to Haskell. There are many logical paradoxes out there, and some of them can be encoded in Haskell, but may it be true that any logical paradox requires the use of recursion, and is therefore covered by the point 2 above?
For instance, if it were proven that a Haskell expression free of recursion can always be evaluated to normal form, then the three points I give would be a complete list. I fuzzily remember seeing something like a proof of this in one of Simon Peyton Jones's books, but that was written some 30 years ago, so even if I remember correctly and it applied to a prototype Haskell back then, it may be false today, given how many language extensions we have now. Possibly some of them enable other ways to undefine a program?
And then, if it were so easy to detect expressions that cannot be partial, why do we not do that? How much easier life would be!
This is a partial answer (pun intended), where I'll only list a few arguably non-obvious ways one can achieve non-termination.
First, I'll confirm that negative-recursive types can indeed cause non termination. Indeed, it is known that allowing a recursive type such as
data R a = R (R a -> a)
allows one to define fix, and obtain non termination from there.
{-# LANGUAGE ScopedTypeVariables #-}
{-# OPTIONS -Wall #-}
data R a = R (R a -> a)
selfApply :: R a -> a
selfApply t@(R x) = x t
-- Church's fixed point combinator Y
-- fix f = (\x. f (x x))(\x. f (x x))
fix :: forall a. (a -> a) -> a
fix f = selfApply (R (\x -> f (selfApply x)))
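For instance (loop is just an illustrative name), this already gives a divergent term, even though no definition calls itself; the recursion is smuggled in through the negative-recursive type R:
loop :: a
loop = fix id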
Total languages like Coq or Agda prohibit this by requiring recursive types to use only strictly-positive recursion.
Another potential source of non-termination is that Haskell allows Type :: Type. As far as I can see, that makes it possible to encode System U in Haskell, where Girard's paradox can be used to cause a logical inconsistency, constructing a term of type Void. That term (as far as I understand) would be non terminating.
Girard's paradox is unfortunately rather complex to fully describe, and I have not completely studied it yet. I only know it is related to the so-called hypergame, a game where the first move is to choose a finite game to play. A finite game is one which causes every match to terminate after finitely many moves. The next moves after that would correspond to a match according to the chosen finite game at step one. Here's the paradox: since the chosen game must be finite, no matter what it is, the whole hypergame match will always terminate after a finite amount of moves. This makes hypergame itself a finite game, making the infinite sequence of moves "I choose hypergame, I choose hypergame, ..." a valid play, in turn proving that hypergame is not finite.
Apparently, this argument can be encoded in a rich enough pure type system like System U, and Type :: Type allows one to embed the same argument.

Function searching for applicable types for applying to Lensed pieces of Haskell AST

My primary question is: is there, within some Haskell AST, a way I can determine a list of the available declarations, and their types? I'm trying to build an editor that shows the user all the appropriate edits available, such as inserting functions and/or other declared values that can be used or inserted at any point. It'll also disallow syntax errors as well as type errors. (That is, it'll be a semantic structural editor; I'll also use the typechecker to make sure the edited pieces make sense in the language at hand, in this case Haskell.)
The second part of my question is: once I have that list, given a particular expression or function or focussed-on piece of AST (using Lens), how could I filter the list based on what could possibly replace or fit that particular focussed-on AST piece (whether by providing arguments to a function, or if it's a value, just "as-is"). Perhaps I need to add some concrete example here... something like: "Haskell, which declarations could possibly be applied (for functions) and/or placed into the hole at yay x y z = (x + y - z) * _?" then if there was an expression number2 :: Num a => a ; number2 = 23 it would put this in the list, as well as the functions available in the context, as well as those from Num itself such as (+) :: Num a => a -> a -> a, (*) :: Num a => a -> a -> a, and any other declarations that resulted in a type that would match such as Num a => a etc. etc.
More details follow:
I’ve done a fair bit of research into this area over quite a long time: looked at and used hint, Language.Haskell.Exts and Control.Lens a fair bit. Also had a look into Dynamic. Control.Lens is relevant for the second half of my question. I've also looked at quite a few projects along the way including Conal Elliott's "Semantic Editing Combinators", Paul Chiusano's Unison system and quite a few things in Clojure and Lisp as well.
So, I know I can get a list of the exports of a module with hint as [String], and I could coerce that to [Dynamic], I think (possibly?), but I'm not sure how I'd get sub-function declarations and their types. (Maybe I could take the declarations within that scope from the AST, put them in their own modules as a String, and pull them in by getting the top-level declarations with hint? That would work, but it feels hacky and cumbersome.)
I can use (:~:) from Data.Typeable to do "propositional equality" (i.e. typechecking?) on two terms, but what I actually need to do is see whether a term could be matched into a position in the source/AST (I'm using lenses and prisms to focus on those parts of the AST) given some number of arguments. Some kind of partial type-checking, or result type-checking? Because the thing I might be focussing on could very well be a function, and I might need to keep the same arity.
I feel like perhaps this is very similar to Idris' term-searching, though I haven't looked into the source for that and I'm not sure if that's something only possible in a dependently typed language.
Any help would be great.
Looks like I kind of answered my own questions, so I'm going to do so formally here.
The answer to the first part of my question can be found in the Reflection module of the hint library. I knew I could get a [String] list of these modules, but there's a function in there which has the type getModuleExports :: MonadInterpreter m => ModuleName -> m [ModuleElem], and it is most likely the sort of thing I'm after. This is because hint provides access to a large part of the GHC API. It also provides some lookup functions which I can then use to get the types of these top-level terms.
https://github.com/mvdan/hint/blob/master/src/Hint/Reflection.hs#L30
Also, Template Haskell provides some of the functionality I'm interested in, and I'll probably end up using quite a bit of that to build my functions, or at least a set of lenses for whatever syntax is being used by the code (/text) under consideration.
In terms of the second part of the question, I still don't have a particularly good answer, so my first attempt will be to use some String munging on the output of the lookup functions and see what I can do.
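For the record, here is a rough sketch of that first part, assuming hint's Language.Haskell.Interpreter API as described above (exportTypes is a made-up name, Prelude is just an arbitrary module to inspect, and operator names may need extra parenthesization before being passed to typeOf):
import Language.Haskell.Interpreter

-- list a module's exported plain values together with their inferred types
exportTypes :: ModuleName -> Interpreter [(String, String)]
exportTypes m = do
  setImports [m]
  elems <- getModuleExports m
  let ids = [i | Fun i <- elems]   -- ignore classes and data declarations here
  mapM (\i -> (,) i <$> typeOf i) ids

main :: IO ()
main = do
  result <- runInterpreter (exportTypes "Prelude")
  either print (mapM_ print) result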

Can compilers deduce/prove mathematically?

I'm starting to learn functional programming languages like Haskell and ML, and most of the exercises show off things like:
foldr (+) 0 [ 1 ..10]
which is equivalent to
sum = 0
for (i in [1..10])
    sum += i
So that leads me to think: why can't the compiler know that this is an arithmetic progression and use the O(1) formula to calculate it?
Especially for pure FP languages without side effects?
The same applies for
sum reverse list == sum list
Given a + b = b + a
and definition of reverse, can compilers/languages prove it automatically?
Compilers generally don't try to prove this kind of thing automatically, because it's hard to implement.
As well as adding the logic to the compiler to transform one fragment of code into another, you have to be very careful that it only tries to do it when it's actually safe - i.e. there are often lots of "side conditions" to worry about. For example, in your example above, someone might have written an instance of the type class Num (and hence the (+) operator) where a + b is not b + a.
However, GHC does have rewrite rules which you can add to your own source code and could be used to cover some relatively simple cases like the ones you list above, particularly if you're not too bothered about the side conditions.
For example, and I haven't tested this, you might use the following rule for one of your examples above:
{-# RULES
"sum/reverse" forall list . sum (reverse list) = sum list
#-}
Note the parentheses around reverse list - what you've written in your question actually means (sum reverse) list and wouldn't typecheck.
EDIT:
As you're looking for official sources and pointers to research, I've listed a few.
Obviously it's hard to prove a negative but the fact that no-one has given an example of a general-purpose compiler that does this kind of thing routinely is probably quite strong evidence in itself.
As others have pointed out, even simple arithmetic optimisations are surprisingly dangerous, particularly on floating point numbers, and compilers generally have flags to turn them off - for example Visual C++, gcc. Even integer arithmetic isn't always clear-cut and people occasionally have big arguments about how to deal with things like overflow.
As Joachim noted, integer variables in loops are one place where slightly more sophisticated optimisations are applied because there are actually significant wins to be had. Muchnick's book is probably the best general source on the topic but it's not that cheap. The wikipedia page on strength reduction is probably as good an introduction as any to one of the standard optimisations of this kind, and has some references to the relevant literature.
FFTW is an example of a library that does all kinds of mathematical optimization internally. Some of its code is generated by a customised compiler the authors wrote specifically for the purpose. It's worthwhile because the authors have domain-specific knowledge of optimizations that, in the specific context of the library, are both worth the effort and safe.
People sometimes use template metaprogramming to write "self-optimising libraries" that again might rely on arithmetic identities, see for example Blitz++. Todd Veldhuizen's PhD dissertation has a good overview.
If you descend into the realms of toy and academic compilers all sorts of things go. For example my own PhD dissertation is about writing inefficient functional programs along with little scripts that explain how to optimise them. Many of the examples (see Chapter 6) rely on applying arithmetic rules to justify the underlying optimisations.
Also, it's worth emphasising that the last few examples are of specialised optimisations being applied only to certain parts of the code (e.g. calls to specific libraries) where it is expected to be worthwhile. As other answers have pointed out, it's simply too expensive for a compiler to go searching for all possible places in an entire program where an optimisation might apply. The GHC rewrite rules that I mentioned above are a great example of a compiler exposing a generic mechanism for individual libraries to use in a way that's most appropriate for them.
The answer
No, compilers don’t do that kind of stuff.
One reason why
And for your examples, it would even be wrong: Since you did not give type annotations, the Haskell compiler will infer the most general type, which would be
foldr (+) 0 [ 1 ..10] :: Num a => a
and similar
(\list -> sum (reverse list)) :: Num a => [a] -> a
and the Num instance for the type that is being used might well not fulfil the mathematical laws required for the transformation you suggest. The compiler should, before everything else, avoid changing the meaning (i.e. the semantics) of your program.
More pragmatically: The cases where the compiler could detect such large-scale transformations rarely occur in practice, so it would not be worth it to implement them.
An exception
Notable exceptions are linear transformations in loops. Most compilers will rewrite
for (int i = 0; i < n; i++) {
    ... 200 + 4 * i ...
}
to
for (int i = 0, j = 200; i < n; i++, j += 4) {
    ... j ...
}
or something similar, as that pattern does often occur in code working on arrays.
The optimizations you have in mind will probably not be done even in the presence of monomorphic types, because there are so many possibilities and so much knowledge required. Consider this example:
sum list == sum (reverse list)
The compiler would need to know or take into account the following facts:
sum = foldl (+) 0
(+) is commutative
reverse list is a permutation of list
foldl x c l, where x is commutative and c is a constant, yields the same result for all permutations of l.
This all seems trivial. Sure, the compiler can most probably look up the definition of sum and inline it. It could be required that (+) be commutative, but remember that + is just another symbol without attached meaning to the compiler. The third point would require the compiler to prove some non-trivial properties about reverse.
But the point is:
You don't want the compiler to perform those calculations for each and every expression. Remember, to make this really useful, you'd have to build up a lot of knowledge about many, many standard functions and operators.
You still can't replace the expression above with True unless you can rule out the possibility that list or some list element is bottom. Usually, one cannot do this. You can't even do the following "trivial" optimization of f x == f x in all cases
f x `seq` True
For, consider
f x = (undefined :: Bool, x)
then
f x `seq` True ==> True
f x == f x ==> undefined
That being said, regarding your first example slightly modified for monomorphism:
f n = n * foldl (+) 0 [1..10] :: Int
it is imaginable to optimize the program by moving the expression out of its context and replace it with the name of a constant, like so:
const1 = foldl (+) 0 [1..10] :: Int
f n = n * const1
This is because the compiler can see that the expression must be constant.
What you're describing looks like super-compilation. In your case, if the expression had a monomorphic type like Int (as opposed to polymorphic Num a => a), the compiler could infer that the expression foldr (+) 0 [1 ..10] has no external dependencies, therefore it could be evaluated at compile time and replaced by 55. However, AFAIK no mainstream compiler currently does this kind of optimization.
(In functional programming "proving" is usually associated with something different. In languages with dependent types types are powerful enough to express complex proposition and then through the Curry-Howard correspondence programs become proofs of such propositions.)
As others have noted, it's unclear that your simplifications even hold in Haskell. For instance, I can define
newtype NInt = N Int
instance Num NInt where
  N a + _ = N a
  N b * _ = N b
  ...       -- etc.
and now sum . reverse :: Num a => [a] -> a does not equal sum :: Num a => [a] -> a, since I can specialize each to [NInt] -> NInt, where sum . reverse == sum clearly does not hold.
This is one general tension that exists around optimizing "complex" operations: you actually need quite a lot of information in order to successfully prove that it's okay to optimize something. This is why the syntax-level compiler optimizations which do exist are usually monomorphic and related to the structure of programs: it's usually such a simplified domain that there's "no way" for the optimization to go wrong. Even that is often unsafe, because the domain is never quite so simplified and well-known to the compiler.
As an example, a very popular "high-level" syntactic optimization is stream fusion. In this case the compiler is given enough information to know that stream fusion can occur and is basically safe, but even in this canonical example we have to skirt around notions of non-termination.
So what does it take to have \x -> sum [0..x] get replaced by \x -> x*(x + 1)/2? The compiler would need a theory of numbers and algebra built-in. This is not possible in Haskell or ML, but becomes possible in dependently typed languages like Coq, Agda, or Idris. There you could specify things like
revCommute :: (_+_ :: a -> a -> a)
           -> Commutative _+_
           -> foldr _+_ z (reverse as) == foldr _+_ z as
and then, theoretically, tell the compiler to rewrite according to revCommute. This would still be difficult and finicky, but at least we'd have enough information around. To be clear, I'm writing something very strange above, a dependent type. The type not only depends on the ability to introduce both a type and a name for the argument inline, but also the existence of the entire syntax of your language "at the type level".
There are a lot of differences between what I just wrote and what you'd do in Haskell, though. First, in order to form a basis where such promises can be taken seriously, we must throw away general recursion (and thus we already don't have to worry about questions of non-termination like stream fusion does). We also must have enough structure around to create something like the promise Commutative _+_; this likely depends upon there being an entire theory of operators and mathematics built into the language's standard library, or else you would need to create it yourself. Finally, the richness of the type system required to even express these kinds of theories adds a lot of complexity to the entire system and tosses out type inference as you know it today.
But, given all that structure, I'd never be able to create an obligation Commutative _+_ for the _+_ defined to work on NInts and so we could be certain that foldr (+) 0 . reverse == foldr (+) 0 actually does hold.
But now we'd need to tell the compiler how to actually perform that optimization. For stream-fusion, the compiler rules only kick in when we write something in exactly the right syntactic form to be "clearly" an optimization redex. The same kinds of restrictions would apply to our sum . reverse rule. In fact, already we're sunk because
foldr (+) 0 . reverse
foldr (+) 0 (reverse as)
don't match. They're "obviously" the same due to some rules we could prove about (.), but that means that now the compiler must invoke two built-in rules in order to perform our optimization.
At the end of the day, you need a very smart optimization search over the sets of known laws in order to achieve the kinds of automatic optimizations you're talking about.
So not only do we add a lot of complexity to the entire system, require a lot of base work to build-in some useful algebraic theories, and lose Turing completeness (which might not be the worst thing), we also only get a finicky promise that our rule would even fire unless we perform an exponentially painful search during compilation.
Blech.
The compromise that exists today tends to be that sometimes we have enough control over what's being written to be mostly certain that a certain obvious optimization can be performed. This is the regime of stream fusion and it requires a lot of hidden types, carefully written proofs, exploitations of parametricity, and hand-waving before it's something the community trusts enough to run on their code.
And it doesn't even always fire. For an example of battling that problem take a look at the source of Vector for all of the RULES pragmas that specify all of the common circumstances where Vector's stream-fusion optimizations should kick in.
All of this is not at all a critique of compiler optimizations or dependent type theories. Both are really incredible. Instead it's just an amplification of the tradeoffs involved in introducing such an optimization. It's not to be done lightly.
Fun fact: given two arbitrary formulas, do they both give the same output for the same inputs? The answer to this seemingly trivial question is not computable! In other words, it is mathematically impossible to write a computer program that always gives the correct answer in finite time.
Given this fact, it's perhaps not surprising that nobody has a compiler that can magically transform every possible computation into its most efficient form.
Also, isn't this the programmer's job? If you want the sum of an arithmetic sequence commonly enough that it's a performance bottleneck, why not just write some more efficient code yourself? Similarly, if you really want Fibonacci numbers (why?), use the O(1) algorithm.
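For the arithmetic-progression case that is a one-liner anyway (sumTo is an illustrative name; it agrees with sum [1..n] for non-negative n):
-- sum [1..n] without walking the list
sumTo :: Int -> Int
sumTo n = n * (n + 1) `div` 2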

What are super combinators and constant applicative forms?

I'm struggling with what Super Combinators are:
A supercombinator is either a constant, or a combinator which contains only supercombinators as subexpressions.
And also with what Constant Applicative Forms are:
Any super combinator which is not a lambda abstraction. This includes truly constant expressions such as 12, ((+) 1 2), [1,2,3] as well as partially applied functions such as ((+) 4). Note that this last example is equivalent under eta abstraction to \ x -> (+) 4 x which is not a CAF.
This is just not making any sense to me! Isn't ((+) 4) just as "truly constant" as 12? CAFs sound like values to my simple mind.
These Haskell wiki pages you reference are old, and I think unfortunately written. Particularly unfortunate is that they mix up CAFs and supercombinators. Supercombinators are interesting but unrelated to GHC. CAFs are still very much a part of GHC, and can be understood without reference to supercombinators.
So let's start with supercombinators. Combinators derive from combinatory logic, and, in the usage here, consist of functions which only apply the values passed in to one another in one or another form -- i.e. they combine their arguments. The most famous set of combinators are S, K, and I, which taken together are Turing-complete. Supercombinators, in this context, are functions built only of values passed in, combinators, and other supercombinators. Hence any supercombinator can be expanded, through substitution, into a plain old combinator.
Some compilers for functional languages (not GHC!) use combinators and supercombinators as intermediate steps in compilation. As with any similar compiler technology, the reason for doing this is to admit optimization analysis that is more easily performed in such a simplified, minimal language. One such core language built on supercombinators is Edwin Brady's epic.
Constant Applicative Forms are something else entirely. They're a bit more subtle, and have a few gotchas. The way to think of them is as an aspect of compiler implementation with no separate semantic meaning but with a potentially profound effect on runtime performance. The following may not be a perfect description of a CAF, but it tries to convey my intuition of what one is, since I haven't seen a really good description anywhere else to crib from. The clean "authoritative" description in the GHC Commentary Wiki reads as follows:
Constant Applicative Forms, or CAFs for short, are top-level values
defined in a program. Essentially, they are objects that are not
allocated dynamically at run-time but, instead, are part of the static
data of the program.
That's a good start. Pure, functional, lazy languages can be thought of in some sense as graph reduction machines. The first time you demand the value of a node, that forces its evaluation, which in turn can demand the values of subnodes, etc. Once a node is evaluated, the resultant value sticks around (although it does not have to stick around -- since this is a pure language we could always keep the subnodes live and recalculate with no semantic effect). A CAF is indeed just a value. But, in this context, it is a special kind of value -- one which the compiler can determine has a meaning entirely dependent on its subnodes. That is to say:
foo x = ...
  where thisIsACaf = [1..10::Int]
        thisIsNotACaf = [1..x::Int]
        thisIsAlsoNotACaf :: Num a => [a]
        thisIsAlsoNotACaf = [1..10] -- oops, polymorphic! the "num" dictionary is implicitly a parameter.
        thisCouldBeACaf = const [1..10::Int] x -- requires a sufficiently smart compiler
        thisAlsoCouldBeACaf _ = [1..10::Int] -- also requires a sufficiently smart compiler
So why do we care if things are CAFs? Basically because sometimes we really really don't want to recompute something (for example, a memotable!) and so want to make sure it is shared properly. Other times we really do want to recompute something (e.g. a huge boring easy to generate list -- such as the naturals -- which we're just walking over) and not have it stick around in memory forever. A combination of naming things and binding them under lets or writing them inline, etc. typically lets us specify these sorts of things in a natural, intuitive way. Occasionally, however, the compiler is smarter or dumber than we expect, and something we think should only be computed once is always recomputed, or something we don't want to hang on to gets lifted out as a CAF. Then, we need to think things through more carefully. See this discussion to get an idea about some of the trickiness involved: A good way to avoid "sharing"?
[By the way, I don't feel up to it, but anyone that wants to should feel free to take as much of this answer as they want to try and integrate it with the existing Haskell Wiki pages and improve/update them]
Matt is right in that the definition is confusing. It is even contradictory. A CAF is defined as:
Any super combinator which is not a lambda abstraction. This includes
truly constant expressions such as 12, ((+) 1 2), [1,2,3] as
well as partially applied functions such as ((+) 4).
Hence, ((+) 4) is seen as a CAF. But in the very next sentence we're told it is equivalent to something that is not a CAF:
this last example is equivalent under eta abstraction to \ x -> (+) 4 x which is not a CAF.
It would be cleaner to rule out partially applied functions on the ground that they are equivalent to lambda abstractions.
