Wikipedia has this to say:
Total functional programming (also
known as strong functional
programming, to be contrasted with
ordinary, or weak functional
programming) is a programming paradigm
which restricts the range of programs
to those which are provably
terminating.
and
These restrictions mean that total
functional programming is not
Turing-complete. However, the set of
algorithms which can be used is still
huge. For example, any algorithm which
has had an asymptotic upper bound
calculated for it can be trivially
transformed into a
provably-terminating function by using
the upper bound as an extra argument
which is decremented upon each
iteration or recursion.
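The "upper bound as an extra argument" trick from the quote can be sketched in Haskell as follows. This is only an illustration under assumptions: Haskell itself does not check totality, and the name collatzSteps and the choice of the Collatz iteration are made up for this example. In a total language, the checker would accept the definition because every recursive call decreases the fuel argument.

```haskell
-- Count Collatz steps until 1 is reached, but give up when the fuel runs out.
-- The fuel argument strictly decreases on every recursive call, so the
-- recursion is bounded even though no termination proof is known for the
-- unbounded version.
collatzSteps :: Int -> Integer -> Maybe Int
collatzSteps fuel n
  | n <= 1    = Just 0                                   -- done
  | fuel <= 0 = Nothing                                  -- bound exhausted
  | even n    = (+ 1) <$> collatzSteps (fuel - 1) (n `div` 2)
  | otherwise = (+ 1) <$> collatzSteps (fuel - 1) (3 * n + 1)
```

With enough fuel, collatzSteps 100 6 gives Just 8; with too little, collatzSteps 3 6 gives Nothing instead of looping.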
There is also a Lambda The Ultimate Post about a paper on Total Functional Programming.
I hadn't come across that until last week on a mailing list.
Are there any more resources, references or any example implementations that you know of?
If I understood that correctly, Total Functional Programming means just that: Programming with Total Functions. If I remember my math courses correctly, a Total Function is a function which is defined over its entire domain, a Partial Function is one which has "holes" in its definition.
Now, if you have a function which for some input value v goes into an infinite recursion or an infinite loop or in general doesn't terminate in some other fashion, then your function isn't defined for v, and thus partial, i.e. not total.
Total Functional Programming doesn't allow you to write such a function. All functions always return a result for all possible inputs; and the type checker ensures that this is the case.
My guess is that this vastly simplifies error handling: there is none to do.
The downside is already mentioned in your quote: it's not Turing-complete. For example, an operating system is essentially a giant infinite loop. Indeed, we do not want an operating system to terminate; we call that behaviour a "crash" and yell at our computers about it!
While this is an old question, I think that none of the answers so far mention the real motivation for total functional programming, which is this:
If programs are proofs, and proofs are programs, then programs which have 'holes' don't make any sense as proofs, and introduce logical inconsistency.
Basically, if a proof is a program, an infinite loop can be used to prove anything. This is really bad, and provides much of the motivation for why we might want to program totally. Other answers tend not to account for the flip side of the paper. While the languages are technically not Turing-complete, you can recover a lot of interesting programs by using co-inductive definitions and functions. We're very prone to think of inductive data, but codata serves an important purpose in these languages, where you can give a total definition of something infinite (and when doing real computation which terminates, you will potentially use only a finite piece of this, or maybe not if you're writing an operating system!).
It is also of note that most proof assistants, Coq for example, work on this principle.
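To make the codata idea a little more concrete, here is a small Haskell sketch. Haskell does not distinguish data from codata or check productivity, so this only imitates what a total language would verify; the names Stream, natsFrom and takeS are invented for the example.

```haskell
-- A stream has no "nil" case, so every value of this type is infinite.
data Stream a = Cons a (Stream a)

-- Corecursive definition: each step produces one constructor before
-- recursing, which is the "productivity" condition a total language checks.
natsFrom :: Integer -> Stream Integer
natsFrom n = Cons n (natsFrom (n + 1))

-- A terminating consumer: recursion is bounded by the counter, so only a
-- finite prefix of the infinite stream is ever built.
takeS :: Int -> Stream a -> [a]
takeS k (Cons x rest)
  | k <= 0    = []
  | otherwise = x : takeS (k - 1) rest
```

For instance, takeS 5 (natsFrom 0) yields [0,1,2,3,4], using only a finite piece of the infinite definition.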
Charity is another language that guarantees termination:
http://pll.cpsc.ucalgary.ca/charity1/www/home.html
Hume is a language with 4 levels. The outer level is Turing complete and the innermost layer guarantees termination:
http://www-fp.cs.st-andrews.ac.uk/hume/report/
Related
It's fairly obvious why a functional programming language that wants to be lazy needs to be pure. I'm looking at the reverse question: if a language wants to be pure, is there a big advantage in being lazy? One argument, made by one of the designers of Haskell, is that it removes temptation; maybe, but I'm trying to weigh up the more concrete advantages.
Given that you want to do functional programming, what are the use cases where built-in laziness lets you express things more clearly, simply or concisely?
Stated simply: Why is laziness so important that you'd want to build it into the language?
(I'm looking for use cases more oriented towards an application rather than a demo - I know you can do things like producing an infinite list of prime numbers by filtering an infinite list of natural numbers, but who writes that ten times 'fore lunch...)
"Nothing is evaluated until it is needed at another place" is a simplified metaphor which doesn't cover all aspects of lazy evaluation (e.g. it doesn't mention the strictness phenomena).
From theoretical standpoint, there are 3 ways to go when designing a pure language (of course if it's based on some kind of lambda calculus and not on more exotic evaluation models): strict, non-strict and total.
Each of them has its advantages and disadvantages, so you need to read corresponding research papers.
Total languages are the purest of the three. In the other two, non-termination can be seen as a side effect, so strictness and totality analysers must be built to keep an implementation efficient. Both analyses are undecidable, so the analysers can never be complete.
However, the total languages are least expressive: it's impossible for a total language to be Turing complete. A frequent approach to get good enough expressiveness is to have a built-in proof system for well-founded recursion, which is not much easier to build than the analyzers for non-total languages.
From a practical standpoint, non-strict semantics lets you define control abstractions more easily, as control structures are essentially non-strict. In a strict language you still need some places with non-strict semantics. E.g. the if construct has non-strict semantics even in strict languages.
So if your language is strict, control structures are a special case. In contrast, a non-strict language can be uniformly non-strict: it has no inherent need for strict constructs.
As for "who writes that ten times 'fore lunch": anyone who uses Haskell for their projects does. I think developing a non-toy project in a language (a non-strict language in your case) is the best way to grasp its advantages and disadvantages.
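As a small illustration of that point, a conditional can be written as an ordinary function in Haskell, with no special support from the language. The names myIf and safeDiv are hypothetical, chosen just for this sketch.

```haskell
-- An "if" defined as a plain function. Because arguments are passed
-- unevaluated, only the selected branch is ever computed.
myIf :: Bool -> a -> a -> a
myIf True  t _ = t
myIf False _ e = e

-- The division is never attempted when the divisor is zero, even though it
-- is written as an ordinary argument to myIf.
safeDiv :: Int -> Int -> Int
safeDiv x y = myIf (y == 0) 0 (x `div` y)
```

In a strict language the same definition would evaluate both branches before the call, so safeDiv 7 0 would fail with a division by zero; here it returns 0.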
Below are a few generic usecases for laziness illustrated by non-toy examples:
1. Cases when control flow is hard to predict. Think of attribute grammars, where without laziness you have to perform a topological sort on the attributes to resolve the dependencies. Re-sorting your code every time the dependency graph changes is not practical. In Haskell you can implement the attribute grammar formalism without an explicit sort, and there are at least two actual implementations on Hackage. Attribute grammars have wide application in compiler construction.
2. The "generate and search" approach to solving many optimization problems. In a strict language you have to interleave generation and search; in Haskell you just compose separate generation and searching functions, and your code remains syntactically modular but is interleaved at runtime. Think of the traveling salesman problem (TSP), where you generate all possible tours and then search through them using a branch-and-bound algorithm. Note that since a branch-and-bound algorithm inspects only the first few cities of a tour, only the necessary parts of the routes are generated. The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing.
3. Lazy code has non-modular control flow, so a single function can have many possible control flows depending on the environment it executes in. This phenomenon can be seen as a kind of 'control flow polymorphism', so lazy control flow abstractions are more generic than their strict counterparts, and a standard library of higher-order functions is much more useful in a lazy language. Think of Python generators, loops and list iterators: in Haskell, list functions cover all three use cases, with control flow adapting to different usage scenarios because of laziness. It is not limited to lists: think of Data.Arrow and iteratees, lazy and strict versions of the State monad, etc. Also note that non-modular control flow is both an advantage and a disadvantage, as it makes reasoning about performance harder.
4. Lazy, possibly infinite data structures are useful beyond toy examples. See the works of Conal Elliott on memoizing higher-order functions using tries. Infinite data structures appear as infinite search spaces (see 2), and as infinite loops and never-exhausting generators in the Python sense (see 3).
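The "generate and search" case can be sketched on a much smaller problem than the TSP. In the following Haskell example (an illustrative assumption, not taken from any of the implementations mentioned above), the generator describes an infinite search space and the search is a separate function; laziness interleaves the two at runtime. The names triples and firstWithPerimeterAbove are invented for the sketch.

```haskell
import Data.List (find)

-- Generator: all Pythagorean triples (a <= b < c), ordered by hypotenuse.
-- The list is infinite; nothing is computed until a consumer asks for it.
triples :: [(Integer, Integer, Integer)]
triples =
  [ (a, b, c)
  | c <- [1 ..]
  , b <- [1 .. c]
  , a <- [1 .. b]
  , a * a + b * b == c * c
  ]

-- Search: written independently of the generator. Generation stops as soon
-- as the first matching candidate has been produced.
firstWithPerimeterAbove :: Integer -> Maybe (Integer, Integer, Integer)
firstWithPerimeterAbove limit =
  find (\(a, b, c) -> a + b + c > limit) triples
```

Here take 3 triples gives [(3,4,5),(6,8,10),(5,12,13)], and firstWithPerimeterAbove 30 gives Just (9,12,15). If no candidate ever matched, the search would run forever, which is the usual price of an infinite search space.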
Mac OS X's Core Image is a good practical example of lazy evaluation.
Basically, Core Image lets you create a directed acyclic graph of image generators and filters. No evaluation actually takes place until the last step in the process: materialization. When you request to materialize a Core Image graph, the final image frame is propagated backwards through the graph's transformations, thus minimizing the quantity of actual pixel values that need to be evaluated.
There's an extensive discussion of this point in Hughes's classic Why Functional Programming Matters. Therein, Hughes argues that laziness allows for improved modularity, using a number of accessible examples.
I'm looking to learn functional programming with either Haskell or F#.
Are there any programming habits (good or bad) that could form as a result of Haskell's lazy evaluation? I like the idea of Haskell's functional programming purity for the purposes of understanding functional programming. I'm just a bit worried about two things:
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
There are habits that you get into when programming in a lazy language that don't work in a strict language. Some of these seem so natural to Haskell programmers that they don't think of them as lazy evaluation. A couple of examples off the top of my head:
f x y = if x > y then .. a .. b .. else c
  where
    a = expensive
    b = expensive
    c = expensive
here we define a bunch of subexpressions in a where clause, with complete disregard for which of them will ever be evaluated. It doesn't matter: the compiler will ensure that no unnecessary work is performed at runtime. Non-strict semantics means that the compiler is able to do this. Whenever I write in a strict language I trip over this a lot.
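A runnable variant of this pattern, with made-up bodies standing in for the "expensive" computations (the name pick and the error call are assumptions for illustration only):

```haskell
-- All three bindings are declared up front, but each one is evaluated only
-- if the branch that is actually taken refers to it.
pick :: Int -> Int -> Int
pick x y = if x > y then a * b else c
  where
    a = x - y                  -- used only in the "then" branch
    b = x + y                  -- used only in the "then" branch
    c = error "c was forced"   -- stands in for work that must not run here
```

Evaluating pick 3 1 returns 8 and never touches c. In a strict language, the equivalent local definitions would all be computed before the branch is chosen.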
Another example that springs to mind is "numbering things":
pairs = zip xs [1..]
here we just want to associate each element in a list with its index, and zipping with the infinite list [1..] is the natural way to do it in Haskell. How do you write this without an infinite list? Well, the fold isn't too readable
pairs = foldr (\x xs -> \n -> (x,n) : xs (n+1)) (const []) xs 1
or you could write it with explicit recursion (too verbose, doesn't fuse). There are several other ways to write it, none of which are as simple and clear as the zip.
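For comparison, here is a sketch of the zip version next to one possible explicit-recursion version. The function names pairsZip and pairsRec are invented for the example.

```haskell
-- Numbering with an infinite list: zip stops when the shorter list ends.
pairsZip :: [a] -> [(a, Int)]
pairsZip xs = zip xs [1 ..]

-- The same result with explicit recursion and a hand-threaded counter.
pairsRec :: [a] -> [(a, Int)]
pairsRec = go 1
  where
    go _ []       = []
    go n (y : ys) = (y, n) : go (n + 1) ys
```

Both give [('a',1),('b',2),('c',3)] for the input "abc"; the second just has to manage the counter by hand.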
I'm sure there are many more. Laziness is surprisingly useful, when you get used to it.
You'll certainly learn about evaluation strategies. Non-strict evaluation strategies can be very powerful for particular kinds of programming problems, and once you're exposed to them, you may be frustrated that you can't use them in some language setting.
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
Right. You'll be a more rounded programmer. Abstractions that provide "delaying" mechanisms are fairly common now, so you'd be a worse programmer not to know them.
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
Lazy evaluation is an important part of the functional paradigm. It's not a requirement - you can program functionally with eager evaluation - but it's a tool that naturally fits functional programming.
You see people explicitly implement/invoke it (notably in the form of lazy sequences) in languages that don't make it the default; and while mixing it with imperative code requires caution, pure functional code allows safe use of laziness. And since laziness makes many constructs cleaner and more natural, it's a great fit!
(Disclaimer: no Haskell or F# experience)
To expand on Beni's answer: if we ignore operational aspects in terms of efficiency (and stick with a purely functional world for the moment), every terminating expression under eager evaluation is also terminating under non-strict evaluation, and the values of both (their denotations) coincide.
This is to say that lazy evaluation is strictly more expressive than eager evaluation. By allowing you to write more correct and useful expressions, it expands your "vocabulary" and ability to think functionally.
Here's one example of why:
A language can be lazy by default with optional eagerness, or eager by default with optional laziness, but in fact it's been shown (cf. Okasaki, for example) that there are certain purely functional data structures which can only achieve certain orders of performance if implemented in a language that provides laziness, either optionally or by default.
Now when you do want to worry about efficiency, then the difference does matter, and sometimes you will want to be strict and sometimes you won't.
But worrying about strictness is a good thing, because very often the cleanest thing to do (and not only in a lazy-by-default language) is to use a thoughtful mix of lazy and eager evaluation, and thinking along these lines will be a good thing no matter which language you wind up using in the future.
Edit: Inspired by Simon's post, one additional point: many problems are most naturally thought about as traversals of infinite structures rather than basically recursive or iterative. (Although such traversals themselves will generally involve some sort of recursive call.) Even for finite structures, very often you only want to explore a small portion of a potentially large tree. Generally speaking, non-strict evaluation allows you to stop mixing up the operational issue of what the processor actually bothers to figure out with the semantic issue of the most natural way to represent the actual structure you're using.
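As a sketch of "explore only a small portion of a potentially infinite tree", the following Haskell fragment defines an infinite tree and a separate pruning function. The tree shape (node n has children 2n and 2n+1) and the names build, prune and flatten are assumptions chosen for the example.

```haskell
-- A rose tree: a value plus a list of subtrees.
data Tree a = Node a [Tree a]

-- An infinite tree; it is described completely, but built only on demand.
build :: Integer -> Tree Integer
build n = Node n [build (2 * n), build (2 * n + 1)]

-- Cut the tree off below a given depth. This is the only place that decides
-- how much of the infinite structure is ever computed.
prune :: Int -> Tree a -> Tree a
prune d (Node x kids)
  | d <= 0    = Node x []
  | otherwise = Node x (map (prune (d - 1)) kids)

-- Pre-order listing of a (finite) tree.
flatten :: Tree a -> [a]
flatten (Node x kids) = x : concatMap flatten kids
```

flatten (prune 2 (build 1)) gives [1,2,4,5,3,6,7]. The description of the structure (build) stays separate from the decision about how much of it to look at (prune).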
Recently, I found myself doing Haskell-style programming in Python. I took over a monolithic function that extracted/computed/generated values and put them in a file sink, all in one step.
I thought this was bad for understanding, reuse and testing. My plan was to separate value generation from value processing. In Haskell I would have generated a (lazy) list of those computed values in a pure function and done the post-processing in another (side-effect-bearing) function.
Knowing that non-lazy lists in Python can be expensive if they tend to get big, I thought about the closest Python solution. To me, that was to use a generator for the value generation step.
The Python code got much better thanks to my lazy (pun intended) mindset.
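The Haskell design described in this answer might look roughly like the sketch below. The actual values and file format are not given in the post, so computedValues (squares), renderLines and writeValues are purely hypothetical stand-ins.

```haskell
-- Pure producer: a lazy (here even infinite) list of computed values.
computedValues :: [Integer]
computedValues = map (\n -> n * n) [1 ..]

-- Pure post-processing step: easy to test without touching the file system.
renderLines :: [Integer] -> String
renderLines = unlines . map show

-- Side-effecting sink: the only part that performs I/O. It decides how many
-- values are needed; the producer computes no more than that.
writeValues :: FilePath -> Int -> IO ()
writeValues path count =
  writeFile path (renderLines (take count computedValues))
```

The pure parts can be checked in isolation, e.g. renderLines (take 3 computedValues) is "1\n4\n9\n", which is the testing and reuse benefit the answer is after.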
I'd expect bad habits.
I saw one of my coworkers try to use (hand-coded) lazy evaluation in our .NET project. Unfortunately, the lazy evaluation hid a bug: the code attempted remote invocations before main started executing, and thus outside the try/catch that handles the "Hey I can't connect to the internet" case.
Basically, the style of the code hid the fact that something really expensive was sitting behind a property read, and so made it look like a good idea to do it inside the type initializer.
Contextual information missing.
Laziness (or more specifically, the assumption of the availability of purity and equational reasoning) is sometimes quite useful for specific problem domains, but not necessarily better in general. If you're talking about general-purpose language settings, relying on lazy evaluation rules by default is considered harmful.
Analysis
Any language that has functional combination (or applicative term combination, i.e. function call expressions, function-like macro invocations, FEXPRs, etc.) enforces rules on evaluation, implying an order among the parts of the subcomputation therein. For convenience and simplicity of specification, a language usually specifies these rules in a flavor paired with a reduction strategy:
Strict evaluation, or applicative-order reduction, which evaluates all subexpressions first, before the remaining subcomputation of the whole combination.
Non-strict evaluation, or normal-order reduction, which does not necessarily evaluate every subexpression first.
The remaining subcomputation finally determines the result of the whole evaluation of the expression. (For program-defined constructs, this usually implies the substitution of the evaluated argument into something like a function body, and the subsequent evaluation of the result.)
Lazy evaluation, or the call-by-need strategy, is a typical concrete instance of non-strict evaluation. To make it practically usable, subexpression evaluations are required to be pure (side-effect-free), so the reductions implementing the strategy can have the Church-Rosser property whatever order of subexpression evaluation is actually adopted.
One significant merit of such a design is the availability of equational reasoning: users can encode the equality of expression evaluations in the program, and an optimizing implementation of the language can perform transformations that depend directly on such constructs.
However, there are many serious problems behind such design.
Equational reasoning is not as important in practice as it seems at first glance.
The encoding is not a separate feature. It has some specific requirements on the other features to carry the encoding. For a pure language, it is even more difficult to encode them elsewhere, so there is certain pressure to make the type system more expressive, hence more complicated typing and typechecking.
Whether the compiler uses the equational reasoning directly encoded in the program or not is an implementation detail. It is more of a taste of style to promote the importance.
Syntactic equations are not powerful enough to encode semantic conditions like cases of "unspecified behavior" in ISO C. Additional primitives are still needed to express the non-determinism of such semantic equivalence classes and so make optimization techniques based on such equivalence possible.
It is computationally inefficient at a very basic level by default, and not easily amended by the programmer.
There is no systematic way to reduce the cost of equations that the programmer knows are not required.
A significant instance is the clash between lazily evaluated combinations and proper tail recursion over the combinations.
The unpredictable abuse of thunks to memoize lazily evaluated expressions also hampers the utilization of machine resources (e.g. registers and cache memory).
Purely functional languages like Haskell may declare that referential transparency is a Good Thing(TM). However, this is faulty in certain contexts.
There are semantic gaps over the terminology itself. The purity is not the only aspect for the referential transparency; moreover, there are other kinds of such property not readily provided by the evaluation strategy.
In general, referential transparency should not be a goal of programming. Instead, it is an optional way to implement the composable components of programs. Composability is essentially about the expected invariance of the interfaces of the components. There are many ways to keep composability without the aid of any kind of referential transparency. Should the guarantee be enforced by the language rules? It depends. At least, it should not depend totally on the language designers' point of view.
The lack of impure evaluation requires more syntactic noise to encode many constructs that are simply expressible with mutable state cells in traditional impure languages. The workarounds for these practical problems make the solutions more difficult, and harder for humans to reason about.
For example, I/O operations are side-effectful, thus not directly expressible in Haskell expressions under the usual non-strict evaluation rules, otherwise the order of effects will be non-deterministic.
To overcome this shortcoming, indirect conventional constructs like the IO monad have been proposed to simulate the traditional imperative style. Such monadic constructs are in essence "indirect" in a sense similar to continuation-passing style, which is considerably low-level and difficult to read. Even though monads can be more "powerful" than continuations in expressiveness, they are not naturally more powerful than higher-level alternatives (like algebraic effect systems) when the lazy evaluation strategy is not enforced by default.
Besides the intuition problem above, the necessity of using monadic constructs is often difficult to prove formally (if it is possible at all). As a result, they are very easily abused (just like design patterns in "OOP" languages derived from Simula). The related syntactic sugar, notably the famous do-notation, was abused for a few decades before this became well known in the Haskell community.
Simulating strict language constructs in languages like Haskell usually needs monadic constructs, while simulating non-strict constructs in strict languages is considerably simpler and easier to implement efficiently. For instance, there is SRFI-45.
The lazy evaluation strategy does not deal well with many other non-strict constructs.
For example, seq has to be compiler magic in GHC. It is not easily expressible by other Haskell constructs without massive changes to the core Haskell language rules.
Traditional strict languages also do not allow user programs to simulate the enforcement of order easily, so such sequential constructs are primitive there too (examples: the C-like ; is primitive; the derivation of Scheme's begin relies on the primitive lambda, which in turn implies an implicit evaluation order on expressions). However, sequencing can be implemented by reusing the applicative-order rules without additional ad-hoc primitives, like the derivation of the $sequence operator in the Kernel language.
Concerns about specific questions
Lazy evaluation is not a must for the "functional paradigm", though as mentioned above, purely functional languages are likely to have the lazy evaluation strategy by default. The common property is the usability of first-class functions. Impure languages like the Lisp and ML families, which use eager evaluation by default, are considered "functional". Also note that the popularity of the "functional paradigm" came after the introduction of function-level programming. The latter is quite different, but still somewhat similar to "functional programming" in its treatment of first-classness.
As mentioned above, the ways to simulate laziness in eager languages are well known. Additionally, for pure programs, there may be no non-trivial semantic difference between call-by-need and normal-order reduction. Finding something that really only works in a lazy world is actually not easy. (Do you want to implement the language?) Just go ahead.
Conclusion
Pay attention to the problem domain. Lazy evaluation may work well for specific scenarios. However, making it the default is likely to be a bad idea in general, because users (whoever uses the language to program, or to derive a new dialect based on the current language) will likely have few chances to ignore all of the problems it causes.
Well, try to think of something that would work if lazily evaluated, that wouldn't if eagerly evaluated. The most common category of these would be lazy logical operator evaluation used to hide a "side effect". I'll use C#-ish language to explain, but functional languages would have similar analogs.
Take the simple C# lambda:
(a,b) => a==0 || ++b < 20
In a lazy-evaluated language, if a==0, the expression ++b < 20 is not evaluated (because the entire expression evaluates to true either way), which means that b is not incremented. In both imperative and functional languages, this behavior (and similar behavior of the AND operator) can be used to "hide" logic containing side effects that should not be executed:
(a,b) => a==0 && save(b)
"a" in this case may be the number of validation errors. If there were validation errors, the first half fails and the second half is not evaluated. If there were no validation errors, the second half is evaluated (which would include the side effect of trying to save b) and the result (apparently true or false) is returned to be evaluated. If either side evaluates to false, the lambda returns false indicating that b was not successfully saved. If this were evaluated "eagerly", we would try to save regardless of the value of "a", which would probably be bad if a nonzero "a" indicated that we shouldn't.
Side effects in functional languages are generally considered a no-no. However, there are few non-trivial programs that do not require at least one side effect; there's generally no other way to make a functional algorithm integrate with non-functional code, or with peripherals like a data store, display, network channel, etc.
OK, for those who have never encountered the term, a quine is a "self-replicating" computer program. To be more specific, one which - upon execution - produces a copy of its own source code as its only output.
The quines can, of course, be developed in many programming languages (but not all); but some languages are obviously more suited to producing quines than others (to clearly understand the somewhat subjective-sounding "more suited", look at a Haskell example vs. C example in the Wiki page - and I provide my more-objective definition below).
The question I have is, from programming language perspective, what language features (either theoretical design ones or syntax sugar) make the language more suitable/helpful for writing quines?
My definition of "more suitable" is "quines are easier to write" and "are shorter/more readable/less obfuscated". But you're welcome to add more criteria that are at least somewhat objective.
Please note that this question explicitly excludes degenerate cases, like a language which is designed to contain "print_a_quine" primitive.
I am not entirely sure, so correct me if anyone of you knows better.
I agree with both other answers, and would go further by explaining that a quine is this:
Y g
where Y is the Y combinator (or any other fixed-point combinator), which in lambda calculus means:
Y g = g(Y g)
Now, it is quite apparent that we need the code to be data, and g to be a function which will print its arguments.
So to summarize: to construct such quines we need functions, a printing function, a fixed-point combinator and a call-by-name evaluation strategy.
The smallest language that satisfies these conditions is, AFAIK, Zot from the Iota and Jot family.
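The "code as data plus a printing function" idea can be seen in a well-known one-line Haskell quine, shown as a comment below. Since a program wrapped in commentary is no longer a quine, the block itself only reconstructs the text that one-liner would print; the names template and quineSource are invented for this sketch.

```haskell
-- The complete one-line program would be:
--   main = putStr s >> print s where s = "main = putStr s >> print s where s = "
-- It prints the string once as code (putStr) and once as quoted data (print).

-- The part of the program text that precedes the string literal.
template :: String
template = "main = putStr s >> print s where s = "

-- What the one-liner outputs: the template, then the template again in
-- quoted form (show adds the quotes), then the newline emitted by print.
quineSource :: String
quineSource = template ++ show template ++ "\n"
```

The value quineSource is exactly the source line of the one-liner followed by a newline, which is what makes it a quine.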
Languages like the Io Programming Language and others allow the treating of code as data. In tree walking systems, this typically allows the language implementer to expose the abstract syntax tree as a first class citizen. In the case of Io, this is what it does. Being object oriented, the AST is modelled around Message objects, and a special sentinel is created to represent the currently executing message; this sentinel is called thisMessage. thisMessage is a full Message like any other, and responds to the print message, which prints it to the screen. As a result, the shortest quine I've ever been able to produce in any language, has come from Io and looks like this:
thisMessage print
Anyway, I just couldn't help but sharing this with you on this subject. The above certainly makes writing quines easy, but not doing it this way certainly doesn't preclude easily creating a quine.
I'm not sure if this is a useful answer from a practical point of view, but there is some relevant theory in computability theory. In particular, fixed points and Kleene's recursion theorem can be used for writing quines. Apparently, the theory can be used for writing a quine in Lisp (as the Wikipedia page shows).
I recently started studying functional programming using Haskell and came upon this article on the official Haskell wiki: How to read Haskell.
The article claims that short variable names such as x, xs, and f are fitting for Haskell code, because of conciseness and abstraction. In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply.
What are your thoughts on this?
In a functional programming paradigm, people usually construct abstractions not only top-down, but also bottom-up. That means you basically enhance the host language. In this kind of situation I see terse naming as appropriate. The Haskell language is already terse and expressive, so you should be kind of used to it.
However, when trying to model a certain domain, I don't believe succinct names are good, even when the function bodies are small. Domain knowledge should reflect in naming.
Just my opinion.
In response to your comment
I'll take two code snippets from Real World Haskell, both from chapter 3.
In the section named "A more controlled approach", the authors present a function that returns the second element of a list. Their final version is this:
tidySecond :: [a] -> Maybe a
tidySecond (_:x:_) = Just x
tidySecond _ = Nothing
The function is generic enough, due to the type parameter a and the fact we're acting on a built in type, so that we don't really care what the second element actually is. I believe x is enough in this case. Just like in a little mathematical equation.
On the other hand, in the section named "Introducing local variables", they're writing an example function that tries to model a small piece of the banking domain:
lend amount balance = let reserve    = 100
                          newBalance = balance - amount
                      in if balance < reserve
                         then Nothing
                         else Just newBalance
Using short variable names here is certainly not recommended. We actually do care what those amounts represent.
I think if the semantics of the arguments are clear within the context of the code then you can get away with short variable names. I often use these in C# lambdas for the same reason. However if it is ambiguous, you should be more explicit with naming.
map :: (a->b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
To someone who hasn't had any exposure to Haskell, that might seem like ugly, unmaintainable code. But most Haskell programmers will understand this right away. So it gets the job done.
var list = new int[] { 1, 2, 3, 4, 5 };
int countEven = list.Count(n => n % 2 == 0);
In that case, short variable name seems appropriate.
list.Aggregate(0, (total, value) => total + value);
But in this case it seems more appropriate to name the variables, because it isn't immediately apparent what the Aggregate is doing.
Basically, I believe not to worry too much about convention unless it's absolutely necessary to keep people from screwing up. If you have any choice in the matter, use what makes sense in the context (language, team, block of code) you are working, and will be understandable by someone else reading it hours, weeks or years later. Anything else is just time-wasting OCD.
I think scoping is the #1 reason for this. In imperative languages, dynamic variables, especially global ones need to be named properly, as they're used in several functions. With lexical scoping, it's clear what the symbol is bound to at compile time.
Immutability also contributes to this to some extent- in traditional languages like C/ C++/ Java, a variable can represent different data at different points in time. Therefore, it needs to be given a name to give the programmer an idea of its functionality.
Personally, I feel that features like first-class functions make symbol names pretty redundant. In traditional languages, it's easier to relate to a symbol; based on its usage, we can tell if it's data or a function.
I'm studying Haskell now, but I don't feel that its naming conventions are so very different. Of course, in Java you're unlikely to find names like xs. But it is easy to find names like x in some mathematical functions, i and j for counters, etc. I consider such names to be perfectly appropriate in the right context. xs in Haskell is appropriate only in generic functions over lists. There are a lot of them in Haskell, so this name is widespread. Java doesn't provide an easy way to handle such generic abstractions; that's why names for lists (and the lists themselves) are usually much more specific, e.g. lists or users.
I just attended a number of talks on Haskell with lots of code samples. As long as the code dealt with x, i and f, the naming didn't bother me. However, as soon as we got into heavy-duty list manipulation and the like, I found the names of three letters or so to be a lot less readable than I prefer.
To be fair a significant part of the naming followed a set of conventions, so I assume that once you get into the lingo it will be a little easier.
Fortunately, nothing prevents us from using meaningful names, but I don't agree that the language itself somehow makes three letter identifiers meaningful to the majority of people.
When in Rome, do as the Romans do
(Or as they say in my town: "Donde fueres, haz lo que vieres")
Anything that aids readability is a good thing - meaningful names are therefore a good thing in any language.
I use short variable names in many languages but they're reserved for things that aren't important in the overall meaning of the code or where the meaning is clear in the context.
I'd be careful how far I took the advice about Haskell names
My Haskell experience is only mediocre, so I dare to answer only the second, more general part of your question:
"In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply."
I suspect the answer is "yes", but my reasons for this opinion are limited to experience with just one functional language. Still, it may be interesting, because this language is an extremely minimalistic one, and thus theoretically very "pure", and it underlies a lot of practical functional languages.
I was curious how easy it is to write practical programs in an "extremely" minimalistic functional programming language like combinatory logic.
Of course, functional programming languages lack mutable variables, but combinatory logic goes one step further and lacks even formal parameters. It lacks any syntactic sugar, and it lacks any predefined datatypes, even booleans or numbers. Everything must be mimicked by combinators, and traced back to applications of just two basic combinators.
Despite such extreme minimalism, there are still practical methods for "programming" in combinatory logic in a neat and pleasant way. I have written a quine in it in a modular and reusable way, and it would not be nasty even to bootstrap a self-interpreter in it.
In summary, I noticed the following when using this extremely minimalistic functional programming language:
There is a need to invent a lot of auxiliary functions. In Haskell, there is a lot of syntactic sugar (pattern matching, formal parameters). You can write quite complicated functions in a few lines. But in combinatory logic, a task that could be expressed in Haskell by a single function must be split into well-chosen auxiliary functions. The burden of replacing Haskell's syntactic sugar is carried by these cleverly chosen auxiliary functions. As for your original question: it is worth inventing meaningful and catchy names for these legions of auxiliary functions, because they can be quite powerful and reusable in many further contexts, sometimes in unexpected ways.
Moreover, a programmer working in combinatory logic is not only forced to find catchy names for a bunch of cleverly chosen auxiliary functions, but is even forced to (re)invent whole new theories. For example, to mimic lists, the programmer has to represent them by their fold functions; basically, the programmer has to (re)invent catamorphisms, which are deep concepts from algebra and category theory.
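To make the "lists as their fold functions" idea above more concrete, here is a minimal sketch in Python rather than in combinatory logic itself. The names nil, cons, to_python_list, length and total are made up for this illustration and do not come from the answer; the only point is that a list can be represented purely as "the thing that knows how to fold itself".

```python
# Illustrative sketch: a list is represented by its own right fold,
# i.e. a function that takes a combining function and an initial value.
# All names here are hypothetical, chosen only for this example.

def nil(combine, initial):
    """The empty list: folding it just yields the initial value."""
    return initial


def cons(head, tail):
    """Prepend head to tail, where tail is itself a fold-encoded list."""
    def folded(combine, initial):
        return combine(head, tail(combine, initial))
    return folded


# Every list operation below is written only in terms of the fold.
def to_python_list(encoded):
    return encoded(lambda head, rest: [head] + rest, [])


def length(encoded):
    return encoded(lambda _head, count: count + 1, 0)


def total(encoded):
    return encoded(lambda head, acc: head + acc, 0)


numbers = cons(1, cons(2, cons(3, nil)))
```

Note how much of the work is done by small, named helpers such as length and total, which is roughly the situation the answer describes: good names for auxiliary functions matter even when there are no named data structures at all.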
I conjecture that several of the differences can be traced back to the fact that functional languages have a powerful "glue".
In Haskell, meaning is conveyed less with variable names than with types. Being purely functional has the advantage of being able to ask for the type of any expression, regardless of context.
I agree with a lot of the points made here about argument naming but a quick 'find on page' shows that no one has mentioned Tacit programming (aka pointfree / pointless). Whether this is easier to read may be debatable so it's up to you & your team, but definitely worth a thorough consideration.
No named arguments = No argument naming conventions.
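As a rough illustration of what tacit style looks like, here is a sketch in Python, which is not a language where point-free code is idiomatic. The compose helper and the function names are assumptions made for this example; the standard-library pieces used are functools.partial, functools.reduce and the operator module.

```python
from functools import partial, reduce
import operator


def compose(*functions):
    """Compose right to left: compose(f, g)(x) == f(g(x)).

    Hypothetical helper for this sketch; Python has no built-in compose.
    """
    def composed(value):
        return reduce(lambda acc, fn: fn(acc), reversed(functions), value)
    return composed


# Pointed style: the argument is named explicitly.
def count_even_pointed(numbers):
    return len([n for n in numbers if n % 2 == 0])


# Tacit style: no argument is ever named, functions are only glued together.
# A number is even when its lowest bit is 0, so (1 & n) is falsy.
is_even = compose(operator.not_, partial(operator.and_, 1))
count_even_tacit = compose(len, list, partial(filter, is_even))
```

Both versions count the even numbers in a sequence. Whether count_even_tacit is easier or harder to read than count_even_pointed is exactly the debatable part.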
Which techniques or paradigms normally associated with functional languages can productively be used in imperative languages as well?
e.g.:
Recursion can be problematic in languages without tail-call optimization, which limits its use to a narrow set of cases.
Map and filter have found their way into non-functional languages, even though they have a functional sort of feel to them.
I happen to really like not having to worry about state in functional languages. If I were particularly stubborn I might write C programs without modifying variables, only encapsulating my state in variables passed to functions and in values returned from functions.
Even though functions aren't first-class values, I can wrap one in an object in, say, Java, and pass that into another method. Like functional programming, just less fun.
So, for veterans of functional programming, when you program in imperative languages, what ideas from FP have you applied successfully?
Pretty nearly all of them?
If you understand functional languages, you can write imperative programs that are "informed" by a functional style. That will lead you away from side effects, and toward programs in which reading the program text at any particular point is sufficient to let you really know what the meaning of the program is at that point.
Back at the Dawn of Time we used to worry about "coupling" and "cohesion". Learning an FP language will lead you to write systems with optimal (minimal) coupling, and high cohesion.
Here are things that get in the way of doing FP in a non-FP language:
If the language doesn't support lambdas/closures, and doesn't have any syntactic sugar that lets you mostly fake them, you are dead in the water. You don't call map/filter without closures.
If the language is statically-typed and doesn't support generics, you are dead in the water. All the good FP stuff uses genericity.
If the language doesn't support tail-recursion, you are hindered. You can write implementations of e.g. 'map' iteratively; also often your data may not be too large and recursion will be ok.
If the language does not support algebraic data types and pattern-matching, you will be mildly hindered. It's just annoying not to have them once you've tasted them.
If the language cannot express type classes, well, oh well... you'll get by. Darn if that's not just the awesomest feature ever, though, and Haskell is the only remotely popular language with good support.
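The point above about writing 'map' iteratively when tail calls are not optimized can be sketched as follows. Python is used only as an example of a language whose reference implementation (CPython) does not eliminate tail calls; the function names map_recursive and map_iterative are made up for this illustration.

```python
def map_recursive(fn, items):
    """Textbook recursive map; uses one stack frame per element."""
    if not items:
        return []
    return [fn(items[0])] + map_recursive(fn, items[1:])


def map_iterative(fn, items):
    """Same result, but with a loop, so stack depth stays constant."""
    result = []
    for item in items:
        result.append(fn(item))
    return result


def double(n):
    return n * 2


small = [1, 2, 3]
large = list(range(100_000))

# For small inputs both versions agree, as the answer suggests.
same_on_small = map_recursive(double, small) == map_iterative(double, small)

# For large inputs only the iterative version is safe to rely on here.
doubled_large = map_iterative(double, large)
```

Callers see the same functional interface either way; the mutation of result is confined to the inside of map_iterative.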
Not having first-class functions really puts a damper on writing functional programs, but there are a few things that you can do that don't require them. The first is to eschew mutable state - try to have most or all of your classes return new objects that represent the modified state instead of making the change internally. As an example, if you were writing a linked list with an add operation, you would want to return the new linked list from add as opposed to modifying the object.
While this may make your programs less efficient (due to the increased number of objects being created and destroyed) you will gain the ability to more easily debug the program because the state and operation of the objects becomes more predictable, not to mention the ability to nest function calls more deeply because they have state inputs and outputs.
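The linked-list example described above might look roughly like this. It is a minimal sketch in Python, not taken from the answer; the class names Node and LinkedList and the helper to_list are assumptions for illustration. The add operation returns a new list and leaves the original untouched.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Node:
    """An immutable list cell."""
    value: Any
    next: Optional["Node"] = None


@dataclass(frozen=True)
class LinkedList:
    """An immutable singly linked list (hypothetical example class)."""
    head: Optional[Node] = None

    def add(self, value):
        """Return a NEW list with value at the front; self is unchanged."""
        return LinkedList(Node(value, self.head))

    def to_list(self):
        """Copy the elements into a plain Python list, front to back."""
        items = []
        node = self.head
        while node is not None:
            items.append(node.value)
            node = node.next
        return items


empty = LinkedList()
one = empty.add(1)
two = one.add(2)
```

Because nothing is ever modified, the new list can safely share its tail with the old one (two reuses the node that one points to), which reduces some of the allocation cost mentioned above.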
I've successfully used higher-order functions a lot, especially the kind that are passed in rather than the kind that are returned. The kind that are returned can be a bit tedious but can be simulated.
All sorts of applicative data structures and recursive functions work well in imperative languages.
The things I miss the most:
Almost no imperative languages guarantee to optimize every tail call.
I know of no imperative language that supports case analysis by pattern matching.