I am pretty much 90% sure that the title of this question is wrong however I have no idea what the right title would be (I will gladly edit the title if suggestions come along!).
When reading up on Haskell and the core principles of the language, you always find that it is a language "based on lambda expressions". I remember reading somewhere that this means that, at the end, the main function just gets "preprocessed" into one big lambda; everything gets inlined, and basically your entire code becomes one single, huge lambda expression.
My questions are:
Is what I said above true?
If the answer to question 1 is "yes", is there any... decompiler/partial compiler/preprocessor? I know there are tools that let you see the assembly code behind languages like C/C++ and Haskell, but is there anything I could use to explore the generated lambda expression?
This question is asked from a purely educational standpoint and not intended to seek a solution to a particular problem. I simply wish to learn more about a language I find extremely fascinating.
Let's make a distinction between the semantics of Haskell and the implementation of GHC. Mostly because we use different terms for language semantics than for assembly, but also because some other compiler might do things differently than GHC.
Every Haskell program defines main, which is an expression of type IO (). I don't like to call it a "lambda expression" because the type shows that it's not a function. The definition of main is some nested tree of function calls. Even the sequential lines in a do block are defined as calls to the functions (>>) and (>>=).
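For example, a minimal illustration of that desugaring (the exact translation rules are in the Haskell report):

main :: IO ()
main = do
  line <- getLine
  putStrLn line

-- desugars (roughly) to:

main' :: IO ()
main' = getLine >>= \line -> putStrLn line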
GHC uses heuristics to decide what to inline, to get the best runtime performance. It will usually inline small expressions that aren't recursive. I believe the runtime system maintains a callstack of functions currently being evaluated, not unlike the runtime result of compiling function calls in C or other imperative languages.
GHC provides many options for printing intermediate stages of compilation. I'm not sure which you will find interesting. Core is the lowest-level representation that feels like Haskell. Cmm (also called C--) is the highest-level representation that feels like assembly.
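For example, -ddump-simpl prints the Core produced after GHC's simplifier, and -ddump-cmm prints the Cmm; the GHC user's guide documents the full family of -ddump-* flags.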
I'm looking to learn functional programming with either Haskell or F#.
Are there any programming habits (good or bad) that could form as a result Haskell's lazy evaluation? I like the idea of Haskell's functional programming purity for the purposes of understanding functional programming. I'm just a bit worried about two things:
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
There are habits that you get into when programming in a lazy language that don't work in a strict language. Some of these seem so natural to Haskell programmers that they don't think of them as lazy evaluation. A couple of examples off the top of my head:
f x y = if x > y then .. a .. b .. else c
  where
    a = expensive
    b = expensive
    c = expensive
here we define a bunch of subexpressions in a where clause, with complete disregard for which of them will ever be evaluated. It doesn't matter: the compiler will ensure that no unnecessary work is performed at runtime. Non-strict semantics means that the compiler is able to do this. Whenever I write in a strict language I trip over this a lot.
Another example that springs to mind is "numbering things":
pairs = zip xs [1..]
here we just want to associate each element in a list with its index, and zipping with the infinite list [1..] is the natural way to do it in Haskell. How do you write this without an infinite list? Well, the fold isn't too readable
pairs = foldr (\x xs -> \n -> (x,n) : xs (n+1)) (const []) xs 1
or you could write it with explicit recursion (too verbose, doesn't fuse). There are several other ways to write it, none of which are as simple and clear as the zip.
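For comparison, the explicit-recursion version might be sketched like this (the helper name go is mine):

pairs = go 1 xs
  where
    go _ []       = []
    go n (y : ys) = (y, n) : go (n + 1) ys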
I'm sure there are many more. Laziness is surprisingly useful, when you get used to it.
You'll certainly learn about evaluation strategies. Non-strict evaluation strategies can be very powerful for particular kinds of programming problems, and once you're exposed to them, you may be frustrated that you can't use them in some language setting.
I may develop thought patterns that work in a lazy world but not in a normal order/eager evaluation world.
Right. You'll be a more rounded programmer. Abstractions that provide "delaying" mechanisms are fairly common now, so you'd be a worse programmer not to know them.
I may misinterpret lazy-evaluation-based features as being part of the "functional paradigm".
Lazy evaluation is an important part of the functional paradigm. It's not a requirement - you can program functionally with eager evaluation - but it's a tool that naturally fits functional programming.
You see people explicitly implement/invoke it (notably in the form of lazy sequences) in languages that don't make it the default; and while mixing it with imperative code requires caution, pure functional code allows safe use of laziness. And since laziness makes many constructs cleaner and more natural, it's a great fit!
(Disclaimer: no Haskell or F# experience)
To expand on Beni's answer: if we ignore operational aspects in terms of efficiency (and stick with a purely functional world for the moment), every terminating expression under eager evaluation is also terminating under non-strict evaluation, and the values of both (their denotations) coincide.
This is to say that lazy evaluation is strictly more expressive than eager evaluation. By allowing you to write more correct and useful expressions, it expands your "vocabulary" and ability to think functionally.
Here's one example of why:
A language can be lazy by default with optional eagerness, or eager by default with optional laziness, but in fact it's been shown (cf. Okasaki, for example) that there are certain purely functional data structures which can only achieve certain orders of performance if implemented in a language that provides laziness, either optionally or by default.
Now when you do want to worry about efficiency, then the difference does matter, and sometimes you will want to be strict and sometimes you won't.
But worrying about strictness is a good thing, because very often the cleanest thing to do (and not only in a lazy-by-default language) is to use a thoughtful mix of lazy and eager evaluation, and thinking along these lines will be a good thing no matter which language you wind up using in the future.
Edit: Inspired by Simon's post, one additional point: many problems are most naturally thought about as traversals of infinite structures rather than as fundamentally recursive or iterative. (Although such traversals themselves will generally involve some sort of recursive call.) Even for finite structures, very often you only want to explore a small portion of a potentially large tree. Generally speaking, non-strict evaluation lets you stop conflating the operational issue of what the processor actually bothers to compute with the semantic issue of the most natural way to represent the structure you're using.
Recently, I found myself doing Haskell-style programming in Python. I took over a monolithic function that extracted/computed/generated values and put them in a file sink, all in one step.
I thought this was bad for understanding, reuse and testing. My plan was to separate value generation and value processing. In Haskell I would have generated a (lazy) list of those computed values in a pure function and would have done the post-processing in another (side-effect bearing) function.
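A minimal sketch of that separation in Haskell (the names values and writeValues are mine, just for illustration):

-- A pure, lazily produced list of computed values.
values :: [Int]
values = map (\n -> n * n) [1 ..]

-- A separate, side-effect-bearing consumer that post-processes and writes.
writeValues :: FilePath -> Int -> IO ()
writeValues path count =
  writeFile path (unlines (map show (take count values)))

main :: IO ()
main = writeValues "out.txt" 1000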
Knowing that non-lazy lists in Python can be expensive if they tend to get big, I thought about the closest Python equivalent. To me that was to use a generator for the value generation step.
The Python code got much better thanks to my lazy (pun intended) mindset.
I'd expect bad habits.
I saw one of my coworkers try to use (hand-coded) lazy evaluation in our .NET project. Unfortunately, laziness hid a bug where remote invocations were attempted before the start of main executed, and thus outside the try/catch meant to handle the "Hey, I can't connect to the internet" case.
Basically, laziness hid the fact that something really expensive was sitting behind a property read, and so made it look like a good idea to do inside the type initializer.
Contextual information missing.
Laziness (or more specifically, the assumption of the availability of purity and equational reasoning) is sometimes quite useful for specific problem domains, but not necessarily better in general. If you're talking about general-purpose language settings, relying on the lazy evaluation rules by default is considered harmful.
Analysis
Any language that has functional combination (or, in more general terms, combination: function call expressions, function-like macro invocations, FEXPRs, etc.) enforces rules on evaluation, which imply an order among the different parts of the subcomputation involved. For convenience and simplicity of specification, a language usually specifies these rules in a flavor paired with a reduction strategy:
Strict evaluation, or applicative-order reduction, which evaluates all subexpressions first, before the remaining subcomputation of the whole combination.
Non-strict evaluation, or normal-order reduction, which does not necessarily evaluate every subexpression first.
The remaining subcomputation finally determines the result of the whole evaluation of the expression. (For program-defined constructs, this usually means substituting the (possibly evaluated) arguments into something like a function body, and then evaluating the result.)
Lazy evaluation, or the call-by-need strategy, is a typical concrete instance of non-strict evaluation. To make it practically usable, subexpression evaluations are required to be pure (side-effect-free), so that the reductions implementing the strategy have the Church-Rosser property whatever order of subexpression evaluation is actually adopted.
One significant merit of such a design is the availability of equational reasoning: users can encode the equality of expression evaluation in the program, and an optimizing implementation of the language can perform transformations that depend directly on such constructs.
However, there are many serious problems behind such design.
Equational reasoning is not as important in practice as it seems at first glance.
The encoding is not a separate feature. It places specific requirements on other features to carry the encoding. For a pure language, it is even more difficult to encode such things elsewhere, so there is pressure to make the type system more expressive, hence more complicated typing and type checking.
Whether or not the compiler uses the equational reasoning directly encoded in the program is an implementation detail. Promoting its importance is more a matter of stylistic taste.
Syntactic equations are not powerful enough to encode semantic conditions like the cases of "unspecified behavior" in ISO C. Expressing the non-determinism of such semantic equivalence classes still requires additional primitives, in order to make optimization techniques based on that equivalence possible.
It is computationally inefficient at the very basic level by default, and this is not easy for the programmer to remedy.
There is no systematic way to reduce the cost of equations that the programmer knows are not required.
One significant cost comes from the clash between lazily evaluated combinations and proper tail recursion over those combinations.
The unpredictable abuse of thunks to memoize lazily evaluated expressions also causes trouble in the utilization of machine resources (e.g. registers and cache memory).
Purely functional languages like Haskell may declare that referential transparency is a Good Thing™. However, this claim is faulty in certain contexts.
There are semantic gaps in the terminology itself. Purity is not the only aspect of referential transparency; moreover, there are other kinds of such properties not readily provided by the evaluation strategy.
In general, referential transparency should not be a goal of programming. Instead, it is one optional way to implement composable program components. Composability is essentially about the expected invariance of the components' interfaces. There are many ways to keep composability without the aid of any kind of referential transparency. Should the guarantee be enforced by the language rules? It depends. At the very least, it should not depend entirely on the language designers' point of view.
The lack of impure evaluation requires more syntactic noise to encode many constructs that are simply expressible with mutable state cells in traditional impure languages. The workarounds for these practical problems make solutions more complicated and harder for humans to reason about.
For example, I/O operations are side-effectful and thus not directly expressible in Haskell expressions under the usual non-strict evaluation rules; otherwise the order of effects would be non-deterministic.
To overcome this shortcoming, indirect conventional constructs like the IO monad were proposed to simulate the traditional imperative style. Such monadic constructs are in essence "indirect" in a sense similar to continuation-passing style, which is considerably low-level and difficult to read. Even though monads can be more "powerful" than continuations in expressiveness, they are not naturally more powerful than higher-level alternatives (like algebraic effect systems) when the lazy evaluation strategy is not enforced by default.
Besides the intuition problem above, the necessity of using monadic constructs is often difficult to prove formally (if it is possible at all). As a result, they are very easily abused (just like the design patterns in "OOP" languages derived from Simula). The related syntactic sugar, notably the famous do-notation, was abused for a few decades before this became well known in the Haskell community.
Simulating strict language constructs in languages like Haskell usually requires monadic constructs, while simulating non-strict constructs in strict languages is considerably simpler and easier to implement efficiently. For instance, there is SRFI-45.
The lazy evaluation strategy does not deal with many other non-strict constructs well.
For example, seq has to be compiler magic in GHC. It is not easily expressible in terms of other Haskell constructs without massive changes to the core Haskell language rules.
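For reference, a typical use of seq (forcing an accumulator to avoid building a chain of thunks) looks like this; the function name is mine:

sumStrict :: [Int] -> Int
sumStrict = go 0
  where
    go acc []       = acc
    go acc (x : xs) = let acc' = acc + x in acc' `seq` go acc' xs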
Although traditional strict languages also do not allow user programs to simulate the enforcement of evaluation order easily, so such sequential constructs are likewise primitive (for example, the C-like ; is primitive, and the derivation of Scheme's begin relies on the primitive lambda, which in turn implies an implicit evaluation order on expressions), sequencing can be implemented by reusing the applicative-order rules without additional ad-hoc primitives, as in the derivation of the $sequence operator in the Kernel language.
Concerns about specific questions
Lazy evaluation is not a must for the "functional paradigm", though, as mentioned above, purely functional languages are likely to have the lazy evaluation strategy by default. The common property is the availability of first-class functions. Impure languages like the Lisp and ML families, which use eager evaluation by default, are still considered "functional". Also note that the popularity of the "functional paradigm" came after the introduction of function-level programming. The latter is quite different, but still somewhat similar to "functional programming" in its treatment of first-classness.
As mentioned above, the ways to simulate laziness in eager languages are well known. Additionally, for pure programs, there may be no non-trivial semantic difference between call-by-need and normal-order reduction. Figuring out something that really only works in a lazy world is actually not easy. (Do you want to implement the language?) Just go ahead.
Conclusion
Be careful about the problem domain. Lazy evaluation may work well for specific scenarios. However, making it the default is likely to be a bad idea in general, because users (whether they use the language to program or to derive a new dialect based on it) will have few chances to ignore all of the problems it causes.
Well, try to think of something that would work if lazily evaluated, that wouldn't if eagerly evaluated. The most common category of these would be lazy logical operator evaluation used to hide a "side effect". I'll use C#-ish language to explain, but functional languages would have similar analogs.
Take the simple C# lambda:
(a,b) => a==0 || ++b < 20
In a lazy-evaluated language, if a==0, the expression ++b < 20 is not evaluated (because the entire expression evaluates to true either way), which means that b is not incremented. In both imperative and functional languages, this behavior (and similar behavior of the AND operator) can be used to "hide" logic containing side effects that should not be executed:
(a,b) => a==0 && save(b)
"a" in this case may be the number of validation errors. If there were validation errors, the first half fails and the second half is not evaluated. If there were no validation errors, the second half is evaluated (which would include the side effect of trying to save b) and the result (apparently true or false) is returned to be evaluated. If either side evaluates to false, the lambda returns false indicating that b was not successfully saved. If this were evaluated "eagerly", we would try to save regardless of the value of "a", which would probably be bad if a nonzero "a" indicated that we shouldn't.
Side effects in functional languages are generally considered a no-no. However, there are few non-trivial programs that do not require at least one side effect; there's generally no other way to make a functional algorithm integrate with non-functional code, or with peripherals like a data store, display, network channel, etc.
Haskell is generally referenced as an example of a purely functional language. How can this be justified given the existence of System.IO.Unsafe.unsafePerformIO ?
Edit: I thought with "purely functional" it was meant that it is impossible to introduce impure code into the functional part of the program.
The Languages We Call Haskell
unsafePerformIO is part of the Foreign Function Interface specification, not core Haskell 98 specification. It can be used to do local side effects that don't escape some scope, in order to expose a purely functional interface. That is, we use it to hide effects when the type checker can't do it for us (unlike the ST monad, which hides effects with a static guarantee).
To illustrate precisely the multiple languages that we call "Haskell", consider the image below. Each ring corresponds to a specific set of computational features, ordered by safety, and with area correlating to expressive power (i.e. the number of programs you can write if you have that feature).
The language known as Haskell 98 is specified right down in the middle, admitting total and partial functions. Agda (or Epigram), where only total functions are allowed, is even less expressive, but "more pure" and safer. Haskell as we use it today, meanwhile, includes everything out to the FFI, where unsafePerformIO lives. That is, you can write anything in modern Haskell, though if you use things from the outer rings, it will be harder to establish the safety and security guarantees made simple by the inner rings.
So, Haskell programs are not typically built from 100% referentially transparent code, however, it is the only moderately common language that is pure by default.
I thought with "purely functional" it was meant that it is impossible to introduce impure code...
The real answer is that unsafePerformIO is not part of Haskell, any more than say, the garbage collector or run-time system are part of Haskell. unsafePerformIO is there in the system so that the people who build the system can create a pure functional abstraction over very effectful hardware. All real languages have loopholes that make it possible for system builders to get things done in ways that are more effective than dropping down to C code or assembly code.
As to the broader picture of how side effects and I/O fit into Haskell via the IO monad, I think the easiest way to think of Haskell is that it is a pure language that describes effectful computations. When the computation described is main, the run-time system carries out those effects faithfully.
unsafePerformIO is a way to get effects in an unsafe manner; where "unsafe" means "safety must be guaranteed by the programmer"—nothing is checked by the compiler. If you are a savvy programmer and are willing to meet heavy proof obligations, you can use unsafePerformIO. But at that point you are not programming in Haskell any more; you are programming in an unsafe language that looks a lot like Haskell.
The language/implementation is purely functional. It includes a couple "escape hatches," which you don't have to use if you don't want to.
I don't think unsafePerformIO means that Haskell somehow becomes impure. You can create pure (referentially transparent) functions from impure functions.
Consider the skip list. In order for it to work well it requires access to an RNG, an impure function, but this doesn't make the data structure impure. If you add an item and then convert it to a list, the same list will be returned every time, given the item you add.
For this reason I think unsafePerformIO should be thought of as promisePureIO: a function that lets functions which have side effects, and would therefore be labelled impure by the type system, be acknowledged as referentially transparent by the type system.
I understand that you have to have a slightly weaker definition of pure for this to hold, though; i.e. pure functions are referentially transparent and are never called for the sake of a side effect (like print).
Unfortunately the language has to do some real world work, and this implies talking with the external environment.
The good thing is that you can (and should) limit the usage of this "out of style" code to few specific well documented portions of your program.
I have a feeling I'll be very unpopular for saying what I'm about to say, but felt I had to respond to some of the (in my opinion mis-) information presented here.
Although it's true that unsafePerformIO was officially added to the language as part of the FFI addendum, the reasons for this are largely historical rather than logical. It existed unofficially and was widely used long before Haskell ever had an FFI. It was never officially part of the main Haskell standard because, as you have observed, it was just too much of an embarrassment. I guess the hope was that it would just go away at some point in the future, somehow. Well that hasn't happened, nor will it in my opinion.
The development of the FFI addendum provided a convenient pretext for unsafePerformIO to get snuck into the official language standard, as it probably doesn't seem too bad there when compared to adding the capability to call foreign (i.e. C) code (where all bets are off regarding statically ensuring purity and type safety anyway). It was also jolly convenient to put it there for what are essentially political reasons. It fostered the myth that Haskell would be pure, if only it wasn't for all that dirty "badly designed" C, or "badly designed" operating systems, or "badly designed" hardware, or... whatever. It's certainly true that unsafePerformIO is regularly used with FFI-related code, but the reasons for this are often more to do with bad design of the FFI, and indeed of Haskell itself, than with bad design of whatever foreign thing Haskell is trying to interface to.
So as Norman Ramsey says, the official position came to be that it was OK to use unsafePerformIO provided certain proof obligations were satisfied by whoever used it (primarily that doing this doesn't invalidate important compiler transformations like inlining and common sub-expression elimination). So far so good, or so one might think. The real kicker is that these proof obligations cannot be satisfied by what is probably the single most common use case for unsafePerformIO, which by my estimate accounts for well over 50% of all the unsafePerformIOs out there in the wild. I'm talking about the appalling idiom known as the "unsafePerformIO hack", which is provably (in fact obviously) completely unsafe (in the presence of inlining and cse).
I don't really have the time, space or inclination to go into what the "unsafePerformIO hack" is or why it's needed in real IO libraries, but the bottom line is that folk who work on Haskells IO infrastructure are usually "stuck between a rock and a hard place". They can either provide an inherently safe API which has no safe implementation (in Haskell), or they can provide an inherently unsafe API which can be safely implemented, but what they can rarely do is provide safety in both API design and implementation. Judging by the depressing regularity with which the "unsafePerformIO hack" appears in real world code (including the Haskell standard libraries), it seems most choose the former option as the lesser of the two evils, and just hope that the compiler won't muck things up with inlining, cse or any other transformation.
I do wish all this was not so. Unfortunately, it is.
Safe Haskell, a recent extension of GHC, gives a new answer to this question. unsafePerformIO is a part of GHC Haskell, but not a part of the safe dialect.
unsafePerformIO should be used only to build referentially transparent functions; for example, memoization. In these cases, the author of a package marks it as "trustworthy". A safe module can import only safe and trustworthy modules; it cannot import unsafe modules.
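As a concrete illustration of the memoization case, here is a minimal sketch (the names cache and fibMemo are mine; the NOINLINE pragma is what keeps the shared IORef sound in the presence of inlining and CSE):

import Data.IORef
import System.IO.Unsafe (unsafePerformIO)

{-# NOINLINE cache #-}
cache :: IORef [(Int, Integer)]
cache = unsafePerformIO (newIORef [])

-- Looks pure from the outside: same argument, same result.
fibMemo :: Int -> Integer
fibMemo n = unsafePerformIO $ do
  seen <- readIORef cache
  case lookup n seen of
    Just v  -> pure v
    Nothing -> do
      let v = if n < 2 then toInteger n
                       else fibMemo (n - 1) + fibMemo (n - 2)
      modifyIORef' cache ((n, v) :)
      pure v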
For more information: GHC manual, Safe Haskell paper
Haskell is generally referenced as an example of a purely functional language. How can this be justified given the existence of System.IO.Unsafe.unsafePerformIO ?
Edit: I thought with "purely functional" it was meant that it is impossible to introduce impure code into the functional part of the program.
The IO monad is actually defined in Haskell, and you can in fact see its definition here. This monad does not exist to deal with impurities but rather to handle side effects. In any case, you could actually pattern match your way out of the IO monad, so the existence of unsafePerformIO shouldn't really be troubling to you.
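For reference, in GHC the definition is (roughly, ignoring pragmas and module details):

newtype IO a = IO (State# RealWorld -> (# State# RealWorld, a #))

That is, an IO action is itself a pure function that threads a token representing the state of the world.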
OK, for those who have never encountered the term, a quine is a "self-replicating" computer program. To be more specific, one which - upon execution - produces a copy of its own source code as its only output.
The quines can, of course, be developed in many programming languages (but not all); but some languages are obviously more suited to producing quines than others (to clearly understand the somewhat subjective-sounding "more suited", look at a Haskell example vs. C example in the Wiki page - and I provide my more-objective definition below).
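For concreteness, one well-known short Haskell quine (not necessarily the wiki's exact example) is:

main = putStrLn (s ++ show s) where s = "main = putStrLn (s ++ show s) where s = "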
The question I have is, from programming language perspective, what language features (either theoretical design ones or syntax sugar) make the language more suitable/helpful for writing quines?
My definition of "more suitable" is "quines are easier to write" and "are shorter/more readable/less obfuscated". But you're welcome to add more criteria that are at least somewhat objective.
Please note that this question explicitly excludes degenerate cases, like a language which is designed to contain "print_a_quine" primitive.
I am not entirely sure, so correct me if any of you knows better.
I agree with both other answers, and go further by explaining that a quine is this:
Y g
where Y is a Y fixed-point combinator (or any other fixed-point combinator), which means in lambda calculus:
Y g = g(Y g)
Now it is quite apparent that we need the code to be data, and g to be a function which will print its argument.
So, to summarize: to construct such a quine we need functions, a printing function, a fixed-point combinator, and a call-by-name evaluation strategy.
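In Haskell terms (where laziness gives call-by-name behaviour essentially for free), the fixed-point combinator can be written and used directly; a small sketch, with fact as an illustrative name:

import Data.Function (fix)   -- fix f = f (fix f)

fact :: Integer -> Integer
fact = fix (\rec n -> if n <= 1 then 1 else n * rec (n - 1))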
The smallest language that satisfies these conditions is AFAIK Zot from the Iota and Jot family.
Languages like the Io Programming Language and others allow the treating of code as data. In tree walking systems, this typically allows the language implementer to expose the abstract syntax tree as a first class citizen. In the case of Io, this is what it does. Being object oriented, the AST is modelled around Message objects, and a special sentinel is created to represent the currently executing message; this sentinel is called thisMessage. thisMessage is a full Message like any other, and responds to the print message, which prints it to the screen. As a result, the shortest quine I've ever been able to produce in any language, has come from Io and looks like this:
thisMessage print
Anyway, I just couldn't help but share this with you on this subject. The above certainly makes writing quines easy, but not doing it this way certainly doesn't preclude easily creating a quine.
I'm not sure if this is a useful answer from a practical point of view, but there is some useful theory in computability theory. In particular, fixed points and Kleene's recursion theorem can be used for writing quines. Apparently, the theory can be used for writing a quine in LISP (as the Wikipedia page shows).
I recently started studying functional programming using Haskell and came upon this article on the official Haskell wiki: How to read Haskell.
The article claims that short variable names such as x, xs, and f are fitting for Haskell code, because of conciseness and abstraction. In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply.
What are your thoughts on this?
In a functional programming paradigm, people usually construct abstractions not only top-down, but also bottom-up. That means you basically enhance the host language. In this kind of situation I see terse naming as appropriate. The Haskell language is already terse and expressive, so you should be somewhat used to it.
However, when trying to model a certain domain, I don't believe succinct names are good, even when the function bodies are small. Domain knowledge should reflect in naming.
Just my opinion.
In response to your comment
I'll take two code snippets from Real World Haskell, both from chapter 3.
In the section named "A more controlled approach", the authors present a function that returns the second element of a list. Their final version is this:
tidySecond :: [a] -> Maybe a
tidySecond (_:x:_) = Just x
tidySecond _ = Nothing
The function is generic enough, due to the type parameter a and the fact we're acting on a built in type, so that we don't really care what the second element actually is. I believe x is enough in this case. Just like in a little mathematical equation.
On the other hand, in the section named "Introducing local variables", they're writing an example function that tries to model a small piece of the banking domain:
lend amount balance = let reserve    = 100
                          newBalance = balance - amount
                      in if balance < reserve
                         then Nothing
                         else Just newBalance
Using short variable names here is certainly not recommended. We actually do care what those amounts represent.
I think if the semantics of the arguments are clear within the context of the code then you can get away with short variable names. I often use these in C# lambdas for the same reason. However if it is ambiguous, you should be more explicit with naming.
map :: (a->b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
To someone who hasn't had any exposure to Haskell, that might seem like ugly, unmaintainable code. But most Haskell programmers will understand this right away. So it gets the job done.
var list = new int[] { 1, 2, 3, 4, 5 };
int countEven = list.Count(n => n % 2 == 0);
In that case, the short variable name seems appropriate.
list.Aggregate(0, (total, value) => total + value);
But in this case it seems more appropriate to name the variables, because it isn't immediately apparent what the Aggregate is doing.
Basically, I believe not to worry too much about convention unless it's absolutely necessary to keep people from screwing up. If you have any choice in the matter, use what makes sense in the context (language, team, block of code) you are working, and will be understandable by someone else reading it hours, weeks or years later. Anything else is just time-wasting OCD.
I think scoping is the #1 reason for this. In imperative languages, dynamic variables, especially global ones need to be named properly, as they're used in several functions. With lexical scoping, it's clear what the symbol is bound to at compile time.
Immutability also contributes to this to some extent: in traditional languages like C/C++/Java, a variable can represent different data at different points in time. Therefore, it needs to be given a name to give the programmer an idea of its role.
Personally, I feel that features like first-class functions make symbol names pretty redundant. In traditional languages, it's easier to relate to a symbol; based on its usage, we can tell if it's data or a function.
I'm studying Haskell now, but I don't feel that its naming conventions are so very different. Of course, in Java you will hardly find names like xs. But it is easy to find names like x in some mathematical functions, or i and j for counters, etc. I consider such names to be perfectly appropriate in the right context. xs in Haskell is appropriate only in generic functions over lists. There are a lot of them in Haskell, so this name is widespread. Java doesn't provide an easy way to handle such generic abstractions, which is why names for lists (and the lists themselves) are usually much more specific, e.g. lists or users.
I just attended a number of talks on Haskell with lots of code samples. As long as the code dealt with x, i and f, the naming didn't bother me. However, as soon as we got into heavy-duty list manipulation and the like, I found the three-letter-or-so names to be a lot less readable than I prefer.
To be fair a significant part of the naming followed a set of conventions, so I assume that once you get into the lingo it will be a little easier.
Fortunately, nothing prevents us from using meaningful names, but I don't agree that the language itself somehow makes three letter identifiers meaningful to the majority of people.
When in Rome, do as the Romans do
(Or as they say in my town: "Donde fueres, haz lo que vieres")
Anything that aids readability is a good thing - meaningful names are therefore a good thing in any language.
I use short variable names in many languages but they're reserved for things that aren't important in the overall meaning of the code or where the meaning is clear in the context.
I'd be careful how far I took the advice about Haskell names.
My Haskell practice is only of a mediocre level, thus I dare to reply only to the second, more general part of your question:
"In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply."
I suspect the answer is "yes", but my motivation behind this opinion rests only on experience with just one single functional language. Still, it may be interesting, because it is an extremely minimalistic one, thus theoretically very "pure", and it underlies a lot of practical functional languages.
I was curious how easy it is to write practical programs in such an "extremely" minimalistic functional programming language as combinatory logic.
Of course, functional programming languages lack mutable variables, but combinatory logic goes one step further: it even lacks formal parameters. It lacks any syntactic sugar, it lacks any predefined datatypes, even booleans or numbers. Everything must be mimicked by combinators, and traced back to the applications of just two basic combinators.
Despite such extreme minimalism, there are still practical methods for "programming" combinatory logic in a neat and pleasant way. I have written a quine in it in a modular and reusable way, and it would not be nasty even to bootstrap a self-interpreter on top of it.
For summary, I felt the following features in using this extremely minimalistic functional programming language:
There is a need to invent a lot of auxiliary functions. In Haskell, there is a lot of syntactic sugar (pattern matching, formal parameters). You can write quite complicated functions in a few lines. But in combinatory logic, a task that could be expressed in Haskell by a single function must be decomposed into well-chosen auxiliary functions. The burden of replacing Haskell's syntactic sugar is carried by cleverly chosen auxiliary functions in combinatory logic. As for answering your original question: it is worth inventing meaningful and catchy names for these legions of auxiliary functions, because they can be quite powerful and reusable in many further contexts, sometimes in unexpected ways.
Moreover, a programmer of combinatory logic is not only forced to find catchy names for a bunch of cleverly chosen auxiliary functions; even more, he is forced to (re)invent whole new theories. For example, to mimic lists, the programmer is forced to mimic them via their fold functions: basically, he has to (re)invent catamorphisms, deep concepts from algebra and category theory.
I conjecture that several of these differences can be traced back to the fact that functional languages have a powerful "glue".
In Haskell, meaning is conveyed less with variable names than with types. Being purely functional has the advantage of being able to ask for the type of any expression, regardless of context.
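A tiny illustration of that point (konst is an illustrative name): ignoring bottoms, the type alone already tells you what the function must do.

konst :: a -> b -> a
konst x _ = x

-- Parametricity means the only total function with this type
-- returns its first argument, whatever names we pick.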
I agree with a lot of the points made here about argument naming, but a quick 'find on page' shows that no one has mentioned tacit programming (aka pointfree / pointless). Whether this is easier to read may be debatable, so it's up to you and your team, but it's definitely worth a thorough consideration. A small example follows below.
No named arguments = No argument naming conventions.
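A small sketch of what that looks like in Haskell (sumOfSquares is an illustrative name):

-- Pointful: the argument has a name.
sumOfSquares :: Num a => [a] -> a
sumOfSquares xs = sum (map (^ 2) xs)

-- Tacit / point-free: no argument name at all.
sumOfSquares' :: Num a => [a] -> a
sumOfSquares' = sum . map (^ 2)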