Are there any purely functional Schemes or Lisps?

I've played around with a few functional programming languages and really enjoy the s-expr syntax used by Lisps (Scheme in particular).
I also see the advantages of working in a purely functional language. Therefore:
Are there any purely functional Schemes (or Lisps in general)?

The new Racket language (formerly PLT Scheme) allows you to implement any semantics you like with s-expressions (or really any syntax). The base language is an eagerly evaluated, dynamically typed Scheme variant, but some notable languages built on top of it are a lazy Scheme and a functional reactive system called FrTime (pronounced "Father Time").
An easy way to make a purely functional language in Racket is to take the base language and not provide any procedures that mutate state. For example:
#lang racket/base
(provide (except-out (all-from-out racket/base) set! ...more here...))
makes up a language that has no set!.

I don't believe there are any purely functional Lisps, but Clojure is probably the closest.
Rich Hickey, the creator of Clojure:
Why did I write yet another programming language?
Basically because I wanted a Lisp for Functional Programming
designed for Concurrency and couldn't find one.
http://clojure.org/rationale
Clojure is functional, with immutable data types and variables, but you can get mutable behavior in some special cases or by dropping down to Java (Clojure runs on the JVM).
This is by design - another quote by Rich is
A purely functional programming
language is only good for heating your
computer.
See the presentation of Clojure for Lisp programmers.

Are there any purely functional Schemes (or Lisps in general)?
The ACL2 theorem prover is a pure Lisp. It is intended for theorem proving rather than general programming, though, and in particular it is limited to first-order programs. Within its niche it has been extremely successful.
Among other things, it won the 2005 ACM Software System Award.

Probably not, at least not as anything other than toys/proofs of concept. Note that even Haskell isn't 100% purely functional--it has secret escape hatches, and anything in IO is only "pure" in some torturous, hand-waving sense of the word.
So, that said, do you really need a purely functional language? You can write purely functional code in almost any language, with varying degrees of inconvenience and inefficiency.
Of course, languages that assume universal state-modification make it painful to keep things pure, so perhaps what you really want is a language that encourages immutability? In that case, you might find it worthwhile to take a look at Clojure's philosophy. And it's a Lisp, to boot!
As a final note, do realize that most of Haskell's "syntax" is thick layers of sugar. The underlying language is not much more than a typed lambda calculus, and nothing stops you from writing all your code that way. You may get funny looks from other Haskell programmers, though. There's also Liskell but I'm not sure what state it's in these days.
On a final, practical note: If you want to actually write code you intend to use, not just tinker with stuff for fun, you'll really want a clever compiler that knows how to work with pure code/immutable data structures.

inconsistent and non-extendable syntax
What is "inconsistency" here?
It is odd to base a language choice solely on syntax. After all, learning syntax will take a few hours; it is a tiny fraction of the investment required.
In comparison, important considerations like speed, typing discipline, portability, breadth of libraries, documentation and community, have far greater impact on whether you can be productive.
Ignoring all the flame bait, a quick google for immutable Scheme yields some results:
http://blog.plt-scheme.org/2007/11/getting-rid-of-set-car-and-set-cdr.html

30 years ago there was Lispkit Lisp.
Not sure how accessible it is today.
[That's one of the places where I learnt functional programming.]

There is Owl Lisp, a dialect of R5RS Scheme with all data structures made immutable and some additional pure data structures. It is not a large project, but seems to be actively developed and used by a small group of people (from what I can see on the website and git repository). There are also plans to include R7RS support and some sort of type inference. So while it is probably not ready for production use, this might be a fun thing to play with.

If you like Lisp's syntax then you can actually do similar things in Haskell:
let fibs = ((++) [1, 1] (zipWith (+) fibs (tail fibs)))
The let fibs = aside, you can always use s-expr syntax in Haskell expressions. This is because you can always add parentheses on the outside and it won't matter. This is the same code without the redundant parentheses:
let fibs = (++) [1, 1] (zipWith (+) fibs (tail fibs))
And here it is in "typical" Haskell style:
let fibs = [1, 1] ++ zipWith (+) fibs (tail fibs)

There are a couple of projects that aim to use Haskell underneath a lispy syntax. The older, deader, and more ponderous one is "Liskell". The newer, more alive, and lighter-weight one is hasp. I think you might find it worth a look.

Related

Will I develop good/bad habits because of lazy evaluation?

I'm looking to learn functional programming with either Haskell or F#.
Are there any programming habits (good or bad) that could form as a result of Haskell's lazy evaluation? I like the idea of Haskell's functional programming purity for the purposes of understanding functional programming. I'm just a bit worried about two things:
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.

There are habits that you get into when programming in a lazy language that don't work in a strict language. Some of these seem so natural to Haskell programmers that they don't think of them as lazy evaluation. A couple of examples off the top of my head:
f x y = if x > y then .. a .. b .. else c
  where
    a = expensive
    b = expensive
    c = expensive
Here we define a bunch of subexpressions in a where clause, with complete disregard for which of them will ever be evaluated. It doesn't matter: non-strict semantics means that only the bindings whose values are actually needed get evaluated, so no unnecessary work is performed at runtime. Whenever I write in a strict language I trip over this a lot.
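For comparison, in a strict language the same shape has to be spelled out by hand. A minimal Python sketch, assuming a hypothetical expensive function (not from the answer) and wrapping each binding in a zero-argument lambda so that only the branch actually taken pays for its computation:

```python
calls = []  # records which computations actually ran (illustration only)

def expensive(label):
    """Hypothetical stand-in for a costly computation."""
    calls.append(label)
    return len(label)

def f(x, y):
    # Each "where" binding becomes a thunk: nothing runs until it is called.
    a = lambda: expensive("a")
    b = lambda: expensive("bb")
    c = lambda: expensive("ccc")
    return a() + b() if x > y else c()

result = f(1, 2)  # takes the else branch, so only c is computed
```

Without the lambdas all three bindings would be computed on every call, which is exactly the trap described above.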
Another example that springs to mind is "numbering things":
pairs = zip xs [1..]
Here we just want to associate each element in a list with its index, and zipping with the infinite list [1..] is the natural way to do it in Haskell. How do you write this without an infinite list? Well, the fold isn't too readable:
pairs = foldr (\x xs -> \n -> (x,n) : xs (n+1)) (const []) xs 1
or you could write it with explicit recursion (too verbose, doesn't fuse). There are several other ways to write it, none of which are as simple and clear as the zip.
I'm sure there are many more. Laziness is surprisingly useful, when you get used to it.
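Strict languages that offer opt-in lazy sequences can get close to the zip version. A small Python sketch (the list xs is made up for the example): itertools.count plays the role of [1..], and zip stops as soon as the shorter input runs out.

```python
from itertools import count

xs = ["a", "b", "c"]

# count(1) is an infinite iterator; zip stops when the shorter input ends.
pairs = list(zip(xs, count(1)))

# enumerate is the built-in shortcut for the same pattern (index comes first).
indexed = list(enumerate(xs, start=1))
```

Note that this works only because count is an iterator; building the full list of indices up front would never terminate.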

You'll certainly learn about evaluation strategies. Non-strict evaluation strategies can be very powerful for particular kinds of programming problems, and once you're exposed to them, you may be frustrated that you can't use them in some language setting.
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
Right. You'll be a more rounded programmer. Abstractions that provide "delaying" mechanisms are fairly common now, so you'd be a worse programmer not to know them.

I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
Lazy evaluation is an important part of the functional paradigm. It's not a requirement - you can program functionally with eager evaluation - but it's a tool that naturally fits functional programming.
You see people explicitly implement/invoke it (notably in the form of lazy sequences) in languages that don't make it the default; and while mixing it with imperative code requires caution, pure functional code allows safe use of laziness. And since laziness makes many constructs cleaner and more natural, it's a great fit!
(Disclaimer: no Haskell or F# experience)

To expand on Beni's answer: if we ignore operational aspects in terms of efficiency (and stick with a purely functional world for the moment), every terminating expression under eager evaluation is also terminating under non-strict evaluation, and the values of both (their denotations) coincide.
This is to say that lazy evaluation is strictly more expressive than eager evaluation. By allowing you to write more correct and useful expressions, it expands your "vocabulary" and ability to think functionally.
Here's one example of why:
A language can be lazy by default with optional eagerness, or eager by default with optional laziness, but in fact it's been shown (cf. Okasaki, for example) that there are certain purely functional data structures which can only achieve certain orders of performance if implemented in a language that provides laziness, either optionally or by default.
Now when you do want to worry about efficiency, then the difference does matter, and sometimes you will want to be strict and sometimes you won't.
But worrying about strictness is a good thing, because very often the cleanest thing to do (and not only in a lazy-by-default language) is to use a thoughtful mix of lazy and eager evaluation, and thinking along these lines will be a good thing no matter which language you wind up using in the future.
Edit: Inspired by Simon's post, one additional point: many problems are most naturally thought about as traversals of infinite structures rather than basically recursive or iterative. (Although such traversals themselves will generally involve some sort of recursive call.) Even for finite structures, very often you only want to explore a small portion of a potentially large tree. Generally speaking, non-strict evaluation allows you to stop mixing up the operational issue of what the processor actually bothers to figure out with the semantic issue of the most natural way to represent the actual structure you're using.
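As a rough illustration of separating "what the structure is" from "how much of it gets computed", here is a Python sketch using a generator as the infinite structure. The name fibs is borrowed from the usual Haskell example; the rest is an assumption made for the illustration.

```python
from itertools import islice, takewhile

def fibs():
    """Yield the Fibonacci numbers 1, 1, 2, 3, 5, ... forever."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b

# The consumer decides how much of the infinite sequence is explored.
first_ten = list(islice(fibs(), 10))
below_100 = list(takewhile(lambda n: n < 100, fibs()))
```

The definition of fibs says nothing about where to stop; each caller picks its own cutoff.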

Recently, I found myself doing Haskell-style programming in Python. I took over a monolithic function that extracted/computed/generated values and put them in a file sink, all in one step.
I thought this was bad for understanding, reuse and testing. My plan was to separate value generation from value processing. In Haskell I would have generated a (lazy) list of those computed values in a pure function and done the post-processing in another (side-effect bearing) function.
Knowing that non-lazy lists in Python can be expensive if they get big, I thought about the closest Python equivalent. To me that was to use a generator for the value generation step.
The Python code got much better thanks to my lazy (pun intended) mindset.
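The original code is not shown, so the following is only a sketch of that split under made-up names (computed_values, write_values) and a made-up computation, with an in-memory StringIO standing in for the file sink:

```python
import io

def computed_values(raw_items):
    """Generation step: lazily yield one computed value per input item."""
    for item in raw_items:
        yield item * item  # hypothetical computation

def write_values(values, sink):
    """Side-effecting step: consume the values and write them to the sink."""
    written = 0
    for value in values:
        sink.write(f"{value}\n")
        written += 1
    return written

sink = io.StringIO()  # stands in for the real file
count_written = write_values(computed_values(range(1, 4)), sink)
```

The generator can now be tested without touching any file, and the values are still produced one at a time rather than as one big list.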

I'd expect bad habits.
I saw one of my coworkers try to use (hand-coded) lazy evaluation in our .NET project. Unfortunately, the lazy evaluation hid a bug: the code would attempt remote invocations before main started executing, and thus outside the try/catch that handles the "Hey, I can't connect to the internet" case.
Basically, the laziness hid the fact that something really expensive sat behind a property read, and so made it look like a good idea to do that read inside the type initializer.
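The .NET code is not available, but the same trap can be sketched in Python with functools.cached_property. RemoteConfig and its settings are hypothetical; the point is only that the failure surfaces wherever the property happens to be read first, not where the object is created.

```python
from functools import cached_property

class RemoteConfig:
    """Hypothetical class whose property hides a network call."""

    def __init__(self, online):
        self.online = online  # constructing the object does no remote work

    @cached_property
    def settings(self):
        # The expensive, failure-prone work happens on the first read.
        if not self.online:
            raise ConnectionError("cannot reach the server")
        return {"retries": 3}

config = RemoteConfig(online=False)  # succeeds: nothing has been evaluated yet

try:
    config.settings  # looks like a cheap attribute read, but fails here
    failed_at_read = False
except ConnectionError:
    failed_at_read = True
```

If that first read happens outside the error handling that was written for connection failures, the handler never gets a chance to run.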

Contextual information missing.
Laziness (or more specifically, the assumed availability of purity and equational reasoning) is sometimes quite useful for specific problem domains, but not necessarily better in general. In a general-purpose language setting, relying on lazy evaluation rules by default is considered harmful.
Analysis
Any language that has functional combinations (or applicable-term combinations, i.e. function call expressions, function-like macro invocations, FEXPRs, etc.) enforces rules on evaluation, which imply an order among the parts of the subcomputation involved. For convenience and for simplicity of the specification, a language usually specifies these rules in a flavor paired with a reduction strategy:
Strict evaluation, or applicative-order reduction, which evaluates all subexpressions first, before the remaining subcomputation of the whole combination.
Non-strict evaluation, or normal-order reduction, which does not necessarily evaluate every subexpression first.
The remaining subcomputation finally determines the result of evaluating the whole expression. (For program-defined constructs, this usually means substituting the evaluated arguments into something like a function body and then evaluating the result.)
Lazy evaluation, or the call-by-need strategy, is a typical concrete instance of non-strict evaluation. To make it practically usable, subexpression evaluations are required to be pure (side-effect-free), so that the reductions implementing the strategy have the Church-Rosser property whatever order of subexpression evaluation is actually adopted.
One significant merit of such a design is the availability of equational reasoning: users can encode equalities between expression evaluations in the program, and an optimizing implementation of the language can perform transformations that depend directly on such constructs.
However, there are many serious problems behind such design.
Equational reasoning is not as important in practice as it seems at first glance.
The encoding is not a separate feature. It places specific requirements on the other features that carry the encoding. For a pure language, it is even more difficult to encode the equations elsewhere, so there is pressure to make the type system more expressive, and hence typing and typechecking more complicated.
Whether the compiler uses the equational reasoning directly encoded in the program or not is an implementation detail. Promoting its importance is more a matter of taste and style.
Syntactic equations are not powerful enough to encode semantic conditions such as cases of "unspecified behavior" in ISO C. Additional primitives are still needed to express the non-determinism of such semantic equivalence classes, so that optimization techniques based on that equivalence become possible.
It is computationally inefficient at a very basic level by default, and the programmer cannot easily amend that.
There is no systematic way to reduce the cost of equations that the programmer knows are not required.
One significant cost comes from the clash between lazily evaluated combinations and proper tail recursion over those combinations.
The unpredictable use of thunks to memoize lazily evaluated expressions also causes trouble for the utilization of machine resources (e.g. registers and cache memory).
Purely functional languages like Haskell may declare that referential transparency is a Good Thing™. However, this is faulty in certain contexts.
There are semantic gaps in the terminology itself. Purity is not the only aspect of referential transparency; moreover, there are other kinds of this property that the evaluation strategy does not readily provide.
In general, referential transparency should not be a goal of programming. Instead, it is one optional way to implement composable program components. Composability is essentially about the expected invariants on the interfaces of the components, and there are many ways to keep composability without the aid of any kind of referential transparency. Should the guarantee be enforced by the language rules? It depends. At the least, it should not depend entirely on the language designers' point of view.
The lack of impure evaluation requires more syntactic noise to encode many constructs that are simply expressible with mutable state cells in traditional impure languages. The workarounds for these practical problems make solutions harder for humans to reason about.
For example, I/O operations have side effects, so they are not directly expressible in Haskell expressions under the usual non-strict evaluation rules; otherwise the order of effects would be non-deterministic.
To overcome this shortcoming, indirect conventional constructs such as the IO monad were proposed to simulate the traditional imperative style. Such monadic constructs are in essence "indirect" in a sense similar to continuation-passing style, which is considerably low-level and difficult to read. Even though monads can be more "powerful" than continuations in expressiveness, they are not naturally more powerful than higher-level alternatives (like algebraic effect systems) when the lazy evaluation strategy is not enforced by default.
Besides the intuition problem above, the necessity of using monadic constructs is often difficult to prove formally (if it is possible at all). As a result, they are very easily abused (just like the design patterns for "OOP" languages derived from Simula). The related syntactic sugar, notably the famous do-notation, was abused for a few decades before this became well known in the Haskell community.
Simulating strict language constructs in languages like Haskell usually needs monadic constructs, while simulating non-strict constructs in strict languages is considerably simpler and easier to implement efficiently. For instance, there is SRFI-45.
The lazy evaluation strategy does not deal well with many other non-strict constructs.
For example, seq has to be compiler magic in GHC. It is not easily expressible with other Haskell constructs without massive changes to the core Haskell language rules.
Traditional strict languages also do not allow user programs to simulate the enforcement of evaluation order easily, so such sequential constructs are primitive there as well (examples: the C-like ; is primitive; the derivation of Scheme's begin relies on the primitive lambda, which in turn implies an implicit evaluation order on expressions). However, sequencing can be implemented by reusing the applicative-order rules without additional ad-hoc primitives, as in the derivation of the $sequence operator in the Kernel language.
Concerns about specific questions
Lazy evaluation is not a must for the "functional paradigm", though, as mentioned above, purely functional languages are likely to have the lazy evaluation strategy by default. The common property is the usability of first-class functions. Impure languages like the Lisp and ML families are considered "functional", and they use eager evaluation by default. Also note that the popularity of the "functional paradigm" came after the introduction of function-level programming. The latter is quite different, but still somewhat similar to "functional programming" in its treatment of first-classness.
As mentioned above, the ways to simulate laziness in eager languages are well known. Additionally, for pure programs, there may be no non-trivial semantic difference between call-by-need and normal-order reduction. Figuring out something that really only works in a lazy world is actually not easy. (Do you want to implement the language?) Just go ahead.
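To make "simulating laziness in an eager language" concrete, here is a minimal Python sketch of a memoizing promise in the style of Scheme's delay/force. The class and method names are illustrative only, and this is the plain delay/force idea, not the additional machinery SRFI-45 defines for iterative lazy algorithms.

```python
class Promise:
    """Memoizing thunk in the spirit of delay/force (names are illustrative)."""

    _UNSET = object()

    def __init__(self, thunk):
        self._thunk = thunk
        self._value = Promise._UNSET

    def force(self):
        # Run the thunk at most once, then reuse the cached value.
        if self._value is Promise._UNSET:
            self._value = self._thunk()
            self._thunk = None  # drop the closure so it can be collected
        return self._value

evaluations = []

def compute():
    evaluations.append("ran")
    return 6 * 7

p = Promise(compute)       # nothing is evaluated yet
before = len(evaluations)
first = p.force()          # evaluates compute()
second = p.force()         # served from the cache
```

This gives call-by-need for the wrapped expression only; everything else in the program keeps its usual eager order.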
Conclusion
Pay attention to the problem domain. Lazy evaluation may work well for specific scenarios. However, making it the default is likely to be a bad idea in general, because users (whoever uses the language to program, or to derive a new dialect from it) will have few chances to avoid all of the problems it causes.

Well, try to think of something that would work if lazily evaluated, that wouldn't if eagerly evaluated. The most common category of these would be lazy logical operator evaluation used to hide a "side effect". I'll use C#-ish language to explain, but functional languages would have similar analogs.
Take the simple C# lambda:
(a,b) => a==0 || ++b < 20
C# evaluates || lazily (short-circuit evaluation): if a==0, the expression ++b < 20 is not evaluated (because the entire expression evaluates to true either way), which means that b is not incremented. In both imperative and functional languages, this behavior (and the similar behavior of the AND operator) can be used to "hide" logic containing side effects that should not be executed:
(a,b) => a==0 && save(b)
"a" in this case may be the number of validation errors. If there were validation errors, the first half fails and the second half is not evaluated. If there were no validation errors, the second half is evaluated (which would include the side effect of trying to save b) and the result (apparently true or false) is returned to be evaluated. If either side evaluates to false, the lambda returns false indicating that b was not successfully saved. If this were evaluated "eagerly", we would try to save regardless of the value of "a", which would probably be bad if a nonzero "a" indicated that we shouldn't.
Side effects in functional languages are generally considered a no-no. However, there are few non-trivial programs that do not require at least one side effect; there's generally no other way to make a functional algorithm integrate with non-functional code, or with peripherals like a data store, display, network channel, etc.
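The same contrast can be sketched in Python, whose and operator also short-circuits. The save function and the sample arguments are hypothetical; eager_and is an ordinary function added only to show what happens when both operands are evaluated before the call.

```python
saved = []

def save(item):
    """Hypothetical side-effecting save; returns True on success."""
    saved.append(item)
    return True

def validate_and_save(error_count, item):
    # `and` evaluates its right operand only if the left one is truthy.
    return error_count == 0 and save(item)

skipped = validate_and_save(2, "draft")  # save is never called
stored = validate_and_save(0, "final")   # save runs and returns True

def eager_and(left, right):
    """An ordinary function: both arguments are evaluated before the call."""
    return left and right

eager = eager_and(False, save("oops"))   # save runs even though left is False
```

Only the built-in operator protects the side effect; wrapping the same logic in a normal function call loses that protection.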

Language features helpful for writing quines (self-printing programs)?

OK, for those who have never encountered the term, a quine is a "self-replicating" computer program. To be more specific, one which - upon execution - produces a copy of its own source code as its only output.
The quines can, of course, be developed in many programming languages (but not all); but some languages are obviously more suited to producing quines than others (to clearly understand the somewhat subjective-sounding "more suited", look at a Haskell example vs. C example in the Wiki page - and I provide my more-objective definition below).
The question I have is, from programming language perspective, what language features (either theoretical design ones or syntax sugar) make the language more suitable/helpful for writing quines?
My definition of "more suitable" is "quines are easier to write" and "are shorter/more readable/less obfuscated". But you're welcome to add more criteria that are at least somewhat objective.
Please note that this question explicitly excludes degenerate cases, like a language which is designed to contain "print_a_quine" primitive.

I am not entirely sure, so correct me if anyone of you knows better.
I agree with both other answers, going further by explaining, that a quine is this:
Y g
where Y is a Y fixed-point combinator (or any other fixed-point combinator), which means in lambda calculus:
Y g = g(Y g)
now, it is quite apparent, that we need the code to be data and g be a function which will print its arguments.
So, to summarize: to construct such a quine we need functions, a printing function, a fixed-point combinator, and a call-by-name evaluation strategy.
The smallest language that satisfies these conditions is, AFAIK, Zot from the Iota and Jot family.
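One way to see the "code as data plus a printing function" recipe in a mainstream language is the classic Python %r quine. This is only a sketch and is not taken from any answer here: template, program, and run_and_capture are names made up for the illustration, and the quine is held in a string so that it can be run and compared with its own output.

```python
import contextlib
import io

# The data half of the quine: a format string that describes the whole program.
template = 's = %r\nprint(s %% s)'

# Feeding the template to itself produces the two-line quine source.
program = template % template + "\n"

def run_and_capture(source):
    """Execute source in a fresh namespace and return what it printed."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(source, {})
    return buffer.getvalue()

output = run_and_capture(program)
```

Here %r (which prints a string as a quoted literal) does the work that makes the language convenient for quines: it turns data back into code without any manual escaping.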

Languages like the Io Programming Language and others allow the treating of code as data. In tree walking systems, this typically allows the language implementer to expose the abstract syntax tree as a first class citizen. In the case of Io, this is what it does. Being object oriented, the AST is modelled around Message objects, and a special sentinel is created to represent the currently executing message; this sentinel is called thisMessage. thisMessage is a full Message like any other, and responds to the print message, which prints it to the screen. As a result, the shortest quine I've ever been able to produce in any language, has come from Io and looks like this:
thisMessage print
Anyway, I just couldn't help but sharing this with you on this subject. The above certainly makes writing quines easy, but not doing it this way certainly doesn't preclude easily creating a quine.

I'm not sure if this is a useful answer from a practical point of view, but there is some relevant theory in computability theory. In particular, fixed points and Kleene's recursion theorem can be used for writing quines. Apparently, the theory can be used for writing a quine in Lisp (as the Wikipedia page shows).

Is functional programming a subset of imperative programming?

One of the main characteristics of functional programming is the use of side-effectless functions. However, this can be done in an imperative language also. The same is true for recursion and lambda functions (for example C++0x). Therefore I wonder whether imperative programming languages are a superset of functional ones.

I can't really say whether they are subsets of one another. What I can say, though, is that (except for really esoteric languages) they are all Turing-complete, which means that in the end they're all equally powerful, but not necessarily equally expressive.

Generally speaking, no; functional programming is a subset of declarative programming (which includes logic programming languages, like Prolog). Many imperative languages borrow elements from functional programming languages, but simply having lambdas or referentially-transparent functions does not make an imperative language functional; functional programming is about more than just these elements.

It is possible to implement a certain programming paradigm in a language which doesn't support that paradigm natively. For example, it's possible to write object-oriented code in C even though it is not designed for this purpose.
Functional programming is a well developed programming paradigm of its own and is best learnt through languages like Haskell, LISP etc. And after you have learnt them well, even though you don't use these languages regularly, you may start using those principles in the day to day language you use on regular basis.
Some people may like to Google for Object oriented programming in C

A paradigm is a way of doing things, and there are two main programming paradigms: imperative and declarative. The fact that some languages allow mixing both paradigms doesn't mean that one is included in the other, but that the languages are multi-paradigm.
To clarify it a little bit more, let me continue with your analogy: if Lisp and OCaml (for example) are considered functional languages, and both of them allow imperative style... then should imperative be considered a subset of functional?

Most imperative languages don't have functions as first-class values, whereas most functional ones do. (C++ does too, via boost::function.)
By first-class, I mean that a value/variable can be of any type: an int, a bool, or a function from int->bool. It usually also includes closures or bound values, where you have the same function but some arguments are already filled in.
Those two are what functional programming is mostly about, IMHO.
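Both ideas can be sketched in a few lines of Python, a language that is not primarily functional. All names below (is_positive, make_adder, power, and so on) are invented for the illustration.

```python
from functools import partial

def is_positive(n):
    return n > 0

# A function stored in a variable and passed around like an int or a bool.
predicate = is_positive

def make_adder(k):
    # The returned function closes over k.
    def add(n):
        return n + k
    return add

add2 = make_adder(2)

def power(base, exponent):
    return base ** exponent

# Same function, but with one argument already filled in.
square = partial(power, exponent=2)

results = [predicate(-1), add2(40), square(9)]
```

Having these features available does not by itself make a language functional, but it shows what "functions as values" and "bound arguments" look like in practice.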

I think it might be helpful to draw a distinction between paradigm and language.
To me, paradigms represent "ways of thinking" (concepts and abstractions such as functions, objects, recursion), whereas languages offer "ways of doing" (syntax, variables, evaluations).
All true programming languages are equivalent in the sense that they are Turing-complete and able, in theory, to compute any Turing-computable function as well as simulate or be simulated by a universal Turing machine.
The interesting thing is how difficult it is to accomplish certain tasks in certain languages or paradigms, how appropriate the tool is to the task. Even Conway's Game of Life is Turing-complete, but that does not make me want to program with it.
Many languages support a number of paradigms. C++ was designed as an object-oriented extension for C, but it is possible to write purely procedural code in it.
Some languages borrow/acquire features from other languages or paradigms over time (just look at the evolution of Java).
A few languages, like Common Lisp, are impressively multi-paradigm languages. It is possible to write code that is functional, object oriented or procedural in Lisp. Arguably, aspect-orientation is already part of the common lisp object system, and therefore "nothing special". In Lisp, it is easy to extend the language itself to do whatever you need it to do, thus it is sometimes called the "programmable programming language". (I'll point out here that Lisp describes a family of languages of which Common Lisp is only one dialect).
I think it doesn't matter which of the terms, declarative, imperative, functional or procedural, is a subset of which. What matters more is to understand the tools and languages you're working with, and how those are different from other tools. Even more important is to understand the different ways of thinking that the paradigms represent, since those are your thought-tools. As with most other things in life, the more you understand, the more effective you become.

One way to look at it (not saying it's the right way 'cos I'm not a lang designer or theorist by any means) is that if the language is essentially converted to something else then that 'something else' must be the superset of the source. So bytecode is necessarily a superset of Java. .NET IL is a superset of C# and of F#. The functional constructs in C# (i.e. LINQ) are thus a subset of the imperative constructs of IL.
Since machine language is imperative, you could take the position that, therefore, all languages are imperative, because they are just abstractions useful for humans that are then boiled away by the compiler to procedural, imperative machine code.

Pattern matching like
f :: [Int] -> Int
f [] = 0
f (x:xs) = 1 + f xs
is one thing that is not available in imperative languages.
Also, constructs like curried functions:
add2 :: Int -> Int
add2 = (2 +)
are not available in most imperative languages.

Yes, functional programming is a subset of imperative programming, but...
Yes, because there is nothing in functional programming that you can't do in imperative programming (syntax differences notwithstanding). You can "do" functional programming in an imperative language.
But...
The things you can't do are the key features of functional programming.
By limiting what you can do,
you make certain mistakes impossible,
you enable features (such as program analysis, simpler concurrency, simpler testing, etc).
Other benefits of functional programming are more subjective.
You often hear these arguments.
State allows side effects. Side effects are bad. Functional programming has no state. You can't have side effects without state.
This is a dubious benefit.
First, only unintended side effects are bad. Proper programming practices, such as limiting modification of state to privileged code, alleviate side effect issues.
Second, functional programming only has no internal state. If the program has IO (accessing files, network, hardware) you have external state, thus the potential of side effects.
Functional programs are easier to debug.
In some ways yes, like knowing the explicit path to an exception.
But having state to examine is a debugging benefit that functional programming does not have.
Functional programs are easier to understand.
This is only true if you are fluent in functional programming and not fluent in imperative programming.
I for one, being more fluent in imperative programming, find this argument to be false.

Does functional programming mandate new naming conventions?

I recently started studying functional programming using Haskell and came upon this article on the official Haskell wiki: How to read Haskell.
The article claims that short variable names such as x, xs, and f are fitting for Haskell code, because of conciseness and abstraction. In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply.
What are your thoughts on this?

In a functional programming paradigm, people usually construct abstractions not only top-down, but also bottom-up. That means you basically enhance the host language. In these kinds of situations I see terse naming as appropriate. The Haskell language is already terse and expressive, so you should be kind of used to it.
However, when trying to model a certain domain, I don't believe succinct names are good, even when the function bodies are small. Domain knowledge should reflect in naming.
Just my opinion.
In response to your comment
I'll take two code snippets from Real World Haskell, both from chapter 3.
In the section named "A more controlled approach", the authors present a function that returns the second element of a list. Their final version is this:
tidySecond :: [a] -> Maybe a
tidySecond (_:x:_) = Just x
tidySecond _ = Nothing
The function is generic enough, due to the type parameter a and the fact that we're acting on a built-in type, that we don't really care what the second element actually is. I believe x is enough in this case, just like in a little mathematical equation.
On the other hand, in the section named "Introducing local variables", they're writing an example function that tries to model a small piece of the banking domain:
lend amount balance = let reserve    = 100
                          newBalance = balance - amount
                      in if balance < reserve
                         then Nothing
                         else Just newBalance
Using short variable names here is certainly not recommended. We actually do care what those amounts represent.

I think if the semantics of the arguments are clear within the context of the code then you can get away with short variable names. I often use these in C# lambdas for the same reason. However if it is ambiguous, you should be more explicit with naming.
map :: (a->b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
To someone who hasn't had any exposure to Haskell, that might seem like ugly, unmaintainable code. But most Haskell programmers will understand this right away. So it gets the job done.
var list = new int[] { 1, 2, 3, 4, 5 };
int countEven = list.Count(n => n % 2 == 0);
In that case, a short variable name seems appropriate.
list.Aggregate(0, (total, value) => total + value);
But in this case it seems more appropriate to name the variables, because it isn't immediately apparent what the Aggregate is doing.
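The same contrast can be sketched in Haskell, where foldl plays roughly the role that Aggregate plays in C#. The names sumTerse and sumNamed are made up for this illustration.

```haskell
-- Both definitions compute the same sum; only the lambda's
-- parameter names differ.

-- Terse names: the reader must already know how foldl threads
-- the accumulator through the list.
sumTerse :: [Int] -> Int
sumTerse = foldl (\a b -> a + b) 0

-- Descriptive names: the role of each argument is visible at a glance.
sumNamed :: [Int] -> Int
sumNamed = foldl (\total value -> total + value) 0
```

Which version reads better probably depends on how familiar the reader already is with folds.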
Basically, I believe you shouldn't worry too much about convention unless it's absolutely necessary to keep people from screwing up. If you have any choice in the matter, use what makes sense in the context (language, team, block of code) you are working in, and what will be understandable by someone else reading it hours, weeks or years later. Anything else is just time-wasting OCD.

I think scoping is the #1 reason for this. In imperative languages, dynamic variables, especially global ones, need to be named properly, as they're used in several functions. With lexical scoping, it's clear what the symbol is bound to at compile time.
Immutability also contributes to this to some extent: in traditional languages like C, C++ and Java, a variable can represent different data at different points in time. Therefore, it needs to be given a name that gives the programmer an idea of its functionality.
Personally, I feel that features like first-class functions make symbol names pretty redundant. In traditional languages, it's easier to relate to a symbol; based on its usage, we can tell if it's data or a function.

I'm studying Haskell now, but I don't feel that its naming conventions are so very different. Of course, in Java you'll hardly find names like xs. But it is easy to find names like x in some mathematical functions, i and j for counters, etc. I consider such names perfectly appropriate in the right context. xs in Haskell is appropriate only in generic functions over lists. There are a lot of them in Haskell, so this name is widespread. Java doesn't provide an easy way to handle such generic abstractions, which is why names for lists (and the lists themselves) are usually much more specific, e.g. lists or users.

I just attended a number of talks on Haskell with lots of code samples. As long as the code dealt with x, i and f, the naming didn't bother me. However, as soon as we got into heavy-duty list manipulation and the like, I found the names of three letters or so to be a lot less readable than I prefer.
To be fair a significant part of the naming followed a set of conventions, so I assume that once you get into the lingo it will be a little easier.
Fortunately, nothing prevents us from using meaningful names, but I don't agree that the language itself somehow makes three letter identifiers meaningful to the majority of people.

When in Rome, do as the Romans do
(Or as they say in my town: "Donde fueres, haz lo que vieres")

Anything that aids readability is a good thing - meaningful names are therefore a good thing in any language.
I use short variable names in many languages but they're reserved for things that aren't important in the overall meaning of the code or where the meaning is clear in the context.
I'd be careful how far I took the advice about Haskell names.

My Haskell experience is only modest, so I will try to answer only the second, more general part of your question:
"In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply."
I suspect the answer is "yes", but my reasons come from experience with just one functional language. Still, it may be interesting, because that language is extremely minimalistic, and therefore theoretically very "pure", and it underlies a lot of practical functional languages.
I was curious how easy it is to write practical programs in an "extremely" minimalistic functional programming language like combinatory logic.
Of course, functional programming languages lack mutable variables, but combinatory logic goes one step further and lacks even formal parameters. It lacks any syntactic sugar and any predefined datatypes, even booleans or numbers. Everything must be mimicked by combinators and traced back to applications of just two basic combinators.
Despite such extreme minimalism, there are still practical methods for "programming" in combinatory logic in a neat and pleasant way. I have written a quine in it in a modular and reusable way, and it would not be unpleasant even to bootstrap a self-interpreter in it.
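As a rough illustration, and assuming the two basic combinators meant here are the usual S and K, they can be written as ordinary Haskell functions. The definitions of i, true and false below are the standard textbook encodings, not taken from the answer itself.

```haskell
-- The two basic combinators, written as ordinary Haskell functions.
k :: a -> b -> a
k x _ = x

s :: (a -> b -> c) -> (a -> b) -> a -> c
s f g x = f x (g x)

-- Identity needs no new primitive: S K K x = K x (K x) = x.
i :: a -> a
i = s k k

-- Booleans mimicked by combinators: a "boolean" is a function that
-- selects one of its two arguments.
true :: a -> b -> a
true = k

false :: a -> b -> b
false = k i
```

Note that s and k themselves are defined with formal parameters here; only i, true and false are built purely by applying combinators to each other.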
In summary, I noticed the following while using this extremely minimalistic functional programming language:
There is a need to invent a lot of auxiliary functions. In Haskell, there is a lot of syntactic sugar (pattern matching, formal parameters), so you can write quite complicated functions in a few lines. In combinatory logic, a task that Haskell could express as a single function must be split into well-chosen auxiliary functions, which take over the burden of replacing Haskell's syntactic sugar. To answer your original question: it is worth inventing meaningful and catchy names for these legions of auxiliary functions, because they can be quite powerful and reusable in many further contexts, sometimes in unexpected ways.
Moreover, a programmer of combinatory logic is not only forced to find catchy names for a bunch of cleverly chosen auxiliary functions, but is even forced to (re)invent whole new theories. For example, to mimic lists, the programmer has to represent them by their fold functions; basically, they have to (re)invent catamorphisms, which are deep concepts from algebra and category theory.
I conjecture that several differences can be traced back to the fact that functional languages have a powerful "glue".
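The idea of mimicking lists with their fold functions can be sketched in Haskell rather than in raw combinators. This is only an approximation of what the answer describes: the names FoldList, nil, cons, toList and sumFold are invented for the example, and the RankNTypes extension is used so the encoding type-checks.

```haskell
{-# LANGUAGE RankNTypes #-}

-- A list represented by its own right fold: given a "cons" step and a
-- "nil" value, it produces the folded result.
newtype FoldList a = FoldList
  { runFold :: forall r. (a -> r -> r) -> r -> r }

-- The empty list ignores the step function and returns the "nil" value.
nil :: FoldList a
nil = FoldList (\_ z -> z)

-- Prepending an element applies the step function once more.
cons :: a -> FoldList a -> FoldList a
cons x xs = FoldList (\f z -> f x (runFold xs f z))

-- Converting back to an ordinary list is just folding with (:) and [].
toList :: FoldList a -> [a]
toList xs = runFold xs (:) []

-- Any other fold works the same way, for example summing.
sumFold :: Num a => FoldList a -> a
sumFold xs = runFold xs (+) 0
```

No list constructors appear in nil or cons; the list exists only as "what folding over it would do", which is the catamorphism view mentioned above.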

In Haskell, meaning is conveyed less with variable names than with types. Being purely functional has the advantage of being able to ask for the type of any expression, regardless of context.
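A small sketch of that point, using hypothetical domain types (Amount, Balance and withdraw are made up here). The type signature carries the meaning, so the short pattern variables remain readable:

```haskell
-- Hypothetical domain types: the newtypes carry the meaning that a
-- longer variable name would carry in another language.
newtype Amount  = Amount Int  deriving (Show, Eq)
newtype Balance = Balance Int deriving (Show, Eq)

-- With these types the short names a and b stay readable:
-- the signature already says which argument is which.
withdraw :: Amount -> Balance -> Maybe Balance
withdraw (Amount a) (Balance b)
  | a > b     = Nothing
  | otherwise = Just (Balance (b - a))
```

In GHCi, the :type command reports the type of any expression, for example :type withdraw (Amount 30).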

I agree with a lot of the points made here about argument naming but a quick 'find on page' shows that no one has mentioned Tacit programming (aka pointfree / pointless). Whether this is easier to read may be debatable so it's up to you & your team, but definitely worth a thorough consideration.
No named arguments = No argument naming conventions.
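For readers who have not seen the style, here is a minimal comparison. The function names are invented for the example.

```haskell
-- Pointful: the argument xs is named explicitly.
countEvenPointful :: [Int] -> Int
countEvenPointful xs = length (filter even xs)

-- Point-free (tacit): the same function written as a composition,
-- with no named argument at all.
countEven :: [Int] -> Int
countEven = length . filter even
```

Both versions behave identically; the point-free one simply leaves nothing to name.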

What technique in functional programming is difficult to learn but useful afterwards?

This question is of course inspired by Monads in Haskell.

Wrapping my head around continuation-passing style has helped my JavaScript coding a lot.
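A minimal sketch of the style, written in Haskell for consistency with the other examples on this page rather than in JavaScript. All function names are made up for the illustration.

```haskell
-- Direct style: the result is returned to the caller.
addDirect :: Int -> Int -> Int
addDirect x y = x + y

-- Continuation-passing style: instead of returning, each function
-- hands its result to an explicit continuation k.
addCps :: Int -> Int -> (Int -> r) -> r
addCps x y k = k (x + y)

squareCps :: Int -> (Int -> r) -> r
squareCps x k = k (x * x)

-- Computes (x + y)^2 by chaining continuations: the continuation
-- passed to addCps decides what happens with the sum.
sumSquaredCps :: Int -> Int -> (Int -> r) -> r
sumSquaredCps x y k = addCps x y (\total -> squareCps total k)
```

Passing id as the final continuation, as in sumSquaredCps 2 3 id, gets the plain value back out. Callback-based JavaScript APIs have the same shape.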

I would say First-class functions.
In computer science, a programming
language is said to support
first-class functions (or function
literals) if it treats functions as
first-class objects. Specifically,
this means that the language supports
constructing new functions during the
execution of a program, storing them
in data structures, passing them as
arguments to other functions, and
returning them as the values of other
functions. This concept doesn't cover
any means external to the language and
program (metaprogramming), such as
invoking a compiler or an eval
function to create a new function.
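The four capabilities listed in the quote can each be shown in a few lines of Haskell. The names makeAdder, pipeline and runPipeline are invented for this sketch.

```haskell
-- Constructing a new function at run time and returning it as a value.
makeAdder :: Int -> (Int -> Int)
makeAdder n = \x -> x + n

-- Storing functions in a data structure.
pipeline :: [Int -> Int]
pipeline = [makeAdder 1, (* 2), subtract 3]

-- Passing functions as arguments: apply each stored function in order,
-- feeding the result of one into the next.
runPipeline :: [a -> a] -> a -> a
runPipeline fs x = foldl (\acc f -> f acc) x fs
```

For example, runPipeline pipeline 5 adds 1, doubles, then subtracts 3.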

Do you want to measure the usefulness in connection with functional-programming itself or programming in general?
In general, the positive experience of functional programming doesn't result from particular techniques but from the way it changes your thinking -
Holding immutable data
Formulating declaratively (recursion, pattern-matching)
Treating functions as data
So I'd say that functional programming is the answer to your question itself.
But to give a more specific answer too, I'd vote for functional abstraction mechanisms like
monads
arrows
continuation-passing-style
zippers
higher-order-functions
generics + typeclasses.
As already said, they are very abstract things at first sight, but once you have understood them, they are extremely cool and valuable techniques for writing concise, less error-prone and, last but not least, highly reusable code.
Compare the following (Pseudocode):
// Concrete
def sumList(Data : List[Int]) = ...
// Generic
def sumGeneric[C : Collection[T], T : Num](Data : C) = ...
The latter might be somewhat unintuitive compared with the first definition, but it allows you to work with any collection and numeric type in general!
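One way to make the pseudocode concrete is the following Haskell sketch. It assumes that the pseudocode's Collection corresponds roughly to Haskell's Foldable class and its Num to Haskell's Num class; that mapping is an interpretation, not something the answer states.

```haskell
-- Concrete: works only for lists of Int.
sumList :: [Int] -> Int
sumList = foldr (+) 0

-- Generic: works for any Foldable container holding any Num type.
sumGeneric :: (Foldable t, Num a) => t a -> a
sumGeneric = foldr (+) 0
```

The bodies are identical; only the type signature decides how widely the function applies. sumGeneric accepts a list of Double or a Maybe Int just as well as a list of Int.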
All in all, many modern (mainstream) languages have discovered such benefits and introduced very functional features like lambda functions or LINQ. Having understood these techniques will also improve the code you write in these languages.

One from the "advanced" department: Programming with phantom types (sometimes also called indexed types). It's admittedly not a "standard" technique in functional programming but not entirely esoteric either, and it's something to keep your brain busy for a while (you asked for something difficult, right? ;)).
In a nutshell, it is about parameterizing types to encode and statically enforce certain properties at compile time. One of the standard examples is the vector append function that statically ensures that appending two vectors of length N and M returns a vector of length N+M; otherwise you get a compile-time error. Yes, there are more interesting applications.
These techniques are not quite as useful in C++ as they are in a proper functional programming language, but so far I've managed to sneak some of this stuff into all of my recent projects at work to a varying degree, most recently in a C++ EDSL context where it worked out really well. You don't necessarily have to encode fancy stuff; learning this helped me catch the situations where a few type tags can reduce the verbosity of an EDSL or allow a cleaner syntax, for example.
Admittedly, the usefulness is somewhat restricted by language support and what you're trying to achieve.
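A minimal sketch of the idea in Haskell. It is deliberately simpler than the length-indexed vector example, which needs type-level numbers, and is closer in spirit to dimensional analysis. The types Distance, Metres and Feet and both functions are hypothetical names chosen for this illustration.

```haskell
-- The type parameter 'unit' never appears on the right-hand side:
-- it is a phantom, present only for the type checker.
newtype Distance unit = Distance Double deriving (Show, Eq)

-- Empty types used purely as compile-time tags.
data Metres
data Feet

-- Addition is only allowed between distances that carry the same tag.
addDistance :: Distance u -> Distance u -> Distance u
addDistance (Distance a) (Distance b) = Distance (a + b)

-- An explicit conversion is the only way to change the tag.
feetToMetres :: Distance Feet -> Distance Metres
feetToMetres (Distance f) = Distance (f * 0.3048)

-- The following is rejected at compile time because the tags differ:
-- addDistance (Distance 1 :: Distance Metres) (Distance 1 :: Distance Feet)
```

The tags cost nothing at run time: a Distance Metres and a Distance Feet are both just a Double underneath.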
Some starters:
Generic and Indexed Type (slides with some brief applications overview)
Fun with Phantom Types
The Kennedy and Russo paper mentioned in the slides is Generalized Algebraic Data Types and Object Oriented Programming and puts some of this stuff into the context of C#/Java.
Chapter 3 in Dave Abrahams' book C++ Template Metaprogramming is available online as a sample chapter and uses these techniques in C++ for dimensional analysis.
A practical FP project using phantom types is HaskellDB.

I would say that Structural typing in OCaml is particularly rewarding.

Recursion. It can be difficult to wrap your head around at times.

The concept of higher-order functions, lambda functions and the power of generic algorithms that are easy to combine were very beneficial for me. I'm always excited when I see what I can do with a fold in Haskell.
Likewise my programming in C# has changed a lot (for the better, I hope) since I got into functional programming (Haskell specifically).
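To give a flavour of what a fold can do, here are a few familiar list functions rebuilt from foldr and foldl alone. The names ending in With are made up so they do not clash with the Prelude versions.

```haskell
-- Summing: combine elements with (+), starting from 0.
sumWith :: Num a => [a] -> a
sumWith = foldr (+) 0

-- Mapping: rebuild the list, transforming each element on the way.
mapWith :: (a -> b) -> [a] -> [b]
mapWith f = foldr (\x acc -> f x : acc) []

-- Filtering: rebuild the list, keeping only elements that satisfy p.
filterWith :: (a -> Bool) -> [a] -> [a]
filterWith p = foldr (\x acc -> if p x then x : acc else acc) []

-- Reversing: a left fold that pushes each element onto the front.
reverseWith :: [a] -> [a]
reverseWith = foldl (flip (:)) []
```

Each definition differs only in the combining function and the starting value, which is what makes folds easy to combine and reuse.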
