In what sense is Constant Applicative Form applicative? - haskell

I understand a CAF is a form in the sense of it having a specific shape in memory, or being one of infinitely many possible graph representations of some value it can be evaluated to. (It is noted that "constant applicative form" is synonymous with "static thunk".)
I understand it being constant in that there are no free variables and all the information necessary to evaluate a Constant Form is already contained therein. It's a shape that has no arrows pointing outwards.
But why "applicative"? I can't sleep at night due to this. Everyone says caf, caf, but who actually knows what that literally means? Does it have something to do with applicative functors (I guess not)? What other kinds of applicative forms does one get out there?

A term in constant applicative form is a constant applied to (zero or more) other constants. (Of course each of those constants may require quite some computation before they're fully evaluated!)
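For a concrete picture (the names here are just for illustration), compare these top-level bindings:

-- Both of these are CAFs: a constant, possibly applied to other
-- constants. GHC evaluates a CAF at most once and shares the result.
answer :: Integer
answer = 42                      -- a plain constant

bigFactorial :: Integer
bigFactorial = product [1..100]  -- 'product' applied to a constant list

-- Not a CAF: it takes an argument, so 'n' is a lambda-bound variable
-- rather than a constant.
addTen :: Integer -> Integer
addTen n = n + 10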

Every CAF is a supercombinator, and supercombinators are, long story short, functions that take other (possibly nullary) functions and apply them to one another.
So, my understanding of "applicative" in the CAF name is as referring to their supercombinatorish nature.

I reached out to Haskell Cafe and Stephen Tetley kindly clarified this for me. Briefly:
Circa the 1970s and 80s "applicative" was often used in the UK as a synonym for functional ...
-- So we may kind of paraphrase the "caf" as "cff".
I still have to look into what this actually means. Stephen suggested a paper that talks, among other things, of applicative expressions, which may well be the same thing as our applicative forms, but it will take me an indefinite time to work through it far enough to make a reasonably well-founded statement on whether that is the case. So I will post an answer for now, while reserving the possibility of expanding it later.

Related

Why is it fair to think of just locally small cartesian closed categories in Haskell for the Curry class?

Control.Category.Constrained is a very interesting project that presents a class for cartesian closed categories: Curry.
Yet, I do not see why we get to treat all cartesian closed categories as allowing curry and uncurry (Hom(X * Y, Z) ≅ Hom(X, Z^Y) in terms of category theory). Wikipedia says that this property holds only for locally small cartesian closed categories. Under this post many people suggest that Hask itself is not locally small (on the other hand, everyone says that Hask is not a cartesian closed category at all, which I reckon to be pure and uninteresting formalism).
A post on Math.SE speaks of assuming that all categories are locally small. But it is given from a mathematical point of view, where we discuss properties. I would like to know why we decided to concentrate on curry and uncurry as Curry's methods. Is it because pretty much everyone who knows Haskell also knows these functions? Or is there any other reason?
I would like to know why we decided to concentrate on curry and uncurry as Curry’s methods. Is it because pretty much everyone who knows Haskell also knows these functions?
As the library author I can answer that with confidence, and the answer is yes: it is because curry and uncurry are a well-established part of the Haskell vernacular. constrained-categories was never intended to radically change Haskell and/or make it more mathematically solid in some sense, but rather to subtly generalise the existing class hierarchies – mostly to allow defining functors etc. that couldn't be given Prelude.Functor instances.
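For reference, here are curry and uncurry as they exist for plain Haskell functions (definitions equivalent to the Prelude's), witnessing Hom(X * Y, Z) ≅ Hom(X, Z^Y):

curry :: ((a, b) -> c) -> a -> b -> c
curry f x y = f (x, y)

uncurry :: (a -> b -> c) -> (a, b) -> c
uncurry f (x, y) = f x y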
Whether Curry could be formalised in terms of local smallness I frankly don't know. I'm also not sure whether that and other "maths foundations" aspects can even be meaningfully discussed in the context of a Haskell library.
Somewhat off-topic rant ahead: it's just a fact that Haskell is a non-total language, and yes, that means just about any axiom can be thwarted by some undefined attack. But I also don't really see that as a problem. Many people seem to think of Haskell as a sort of uncanny valley: too restrictive for use in real-world applications, yet nothing can be proved properly. I see it exactly the other way around: Haskell has a sufficiently powerful type system to be able to express the mathematical ideas that are useful for real-world applications, without getting its value semantics caught up too deep in the underlying foundations to be practical to actually use in the real world. (I.e., you don't constantly spend weeks proving some "obviously it's true that..." theorem. I'm looking at you, Coq...) Instead of writing 100% rigorous proofs, we narrow down the types as best as possible and then use QuickCheck to see whether something typically works as the maths would demand.
Don't get me wrong, I think formalising the foundations is important too and dependently-typed total languages are great, but all that is somewhat missing the point of where Haskell's potential really lies. At least it's not where I aim my Haskell development, including constrained-categories. If somebody who's deeper into the pure maths wants to chime in, I'm delighted to hear about it.

What does "a monad is a model of computation" mean

What does it mean exactly when people say "a monad is a model of computation"? Does this mean computation in the sense of Turing completeness? If so, how?
Clarification: This question is not about explaining monads but what people mean with "model of computation" in this context and how this relates to monads. See towards the end of this answer for a typical use of this phrase.
In my understanding a Turing machine, the theory of recursive functions, lambda calculus etc. are all models of computation, and I cannot see how a monad would relate to that, if at all.
The idea of monads as models of computation can be traced back to the work of Eugenio Moggi. Among Haskell practitioners, the best known paper by Moggi on this matter is Notions of computations as monads (1991). Relevant quotes include:
The [lambda]-calculus is considered a useful mathematical tool in the study of programming languages, since programs can be identified with [lambda]-terms. However, if one goes further and uses [beta][eta]-conversion to prove equivalence of programs, then a gross simplification is introduced (programs are identified with total functions from values to values) that may jeopardise the applicability of theoretical results. In this paper we introduce calculi based on a categorical semantics for computations, that provide a correct basis for proving equivalence of programs for a wide range of notions of computation. [p. 1]
[...]
We do not take as a starting point for proving equivalence of programs the theory of [beta][eta]-conversion, which identifies the denotation of a program (procedure) of type A -> B with a total function from A to B, since this identification wipes out completely behaviours such as non-termination, non-determinism, and side-effects, that can be exhibited by real programs. Instead, we proceed as follows:
We take category theory as a general theory of functions and develop on top a categorical semantics of computations based on monads. [...] [p. 1]
[...]
The basic idea behind the categorical semantics below is that, in order to interpret a programming language in a category [C], we distinguish the object A of values (of type A) from the object TA of computations (of type A), and take as denotations of programs (of type A) the elements of TA. In particular, we identify the type A with the object of values (of type A) and obtain the object of computations (of type A) by applying an unary type-constructor T to A. We call T a notion of computation, since it abstracts away from the type of values computations may produce. There are many choices for TA corresponding to different notions of computations. [pp. 2-3]
[...]
We have identified monads as important to modeling notions of computations, but computational monads seem to have additional properties; e.g., they have a tensorial strength and may satisfy the mono requirement. It is likely that there are other properties of computational monads still to be identified, and there is no reason to believe that such properties have to be found in the literature on monads. [p. 27 -- thanks danidiaz]
A related older paper by Moggi, Computational lambda-calculus and monads (1989 -- thanks michid for the reference), speaks literally of "computational model[s]":
A computational model is a monad (T;[eta];[mu]) satisfying the mono requirement: [eta-A] is a mono for every A [belonging to] C.
There is an alternative description of a monad (see [7]), which is easier to justify computationally. [...] [p. 2]
This particular bit of terminology was dropped in Notions of computations as monads, as Moggi sharpened the focus of his presentation on the "alternative description" (namely, Kleisli triples, which are composed of, in Haskell parlance, a type constructor, return and bind). The essence, though, remains the same throughout.
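In Haskell terms, the three components of a Kleisli triple are exactly what the Monad class packages up. A simplified rendering (primed names to avoid clashing with the Prelude; the real class has superclasses and more methods):

class Monad' m where
  return' :: a -> m a                    -- Moggi's [eta]: inject a value
  bind    :: m a -> (a -> m b) -> m b    -- i.e. (>>=): sequence computations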
Philip Wadler presents the idea with a more practical bent in Monads for functional programming (1992):
The use of monads to structure functional programs is described. Monads provide a convenient framework for simulating effects found in other languages, such as global state, exception handling, output, or non-determinism. [p. 1]
[...]
Pure functional languages have this advantage: all flow of data is made explicit. And this disadvantage: sometimes it is painfully explicit.
A program in a pure functional language is written as a set of equations. Explicit data flow ensures that the value of an expression depends only on its free variables. Hence substitution of equals for equals is always valid, making such programs especially easy to reason about. Explicit data flow also ensures that the order of computation is irrelevant, making such programs susceptible to lazy evaluation.
It is with regard to modularity that explicit data flow becomes both a blessing and a curse. On the one hand, it is the ultimate in modularity. All data in and all data out are rendered manifest and accessible, providing a maximum of flexibility. On the other hand, it is the nadir of modularity. The essence of an algorithm can become buried under the plumbing required to carry data from its point of creation to its point of use. [p. 2]
[...]
Say it is desired to add error checking, so that the second example above returns a sensible error message. In an impure language, this is easily achieved with the use of exceptions.
In a pure language, exception handling may be mimicked by introducing a type to represent computations that may raise an exception. [pp. 3-4 -- note this is before monads are introduced as a unifying abstraction.]
[...]
Each of the variations on the interpreter has a similar structure, which may be abstracted to yield the notion of a monad.
In each variation, we introduced a type of computations. Respectively, M represented computations that could raise exceptions, act on state, and generate output. By now the reader will have guessed that M stands for monad. [p. 6]
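A minimal sketch in the spirit of Wadler's exception example (the constructor names follow the paper; the instances are the obvious modern ones, not quoted from it):

-- Computations of type a that may raise an exception.
data Exc a = Raise String | Return a

instance Functor Exc where
  fmap _ (Raise e)  = Raise e
  fmap f (Return x) = Return (f x)

instance Applicative Exc where
  pure = Return
  Raise e  <*> _ = Raise e
  Return f <*> x = fmap f x

instance Monad Exc where
  Raise e  >>= _ = Raise e   -- an exception propagates past the rest
  Return x >>= f = f x       -- a value is fed to the next step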
This is one of the roots of the usage of "computation" to refer to monadic values.
A significant body of later literature makes use of the concept of computation in this manner. For instance, this is the opening passage of Notions of Computation as Monoids by Exequiel Rivas and Mauro Jaskelioff (2014 -- thanks danidiaz for the suggestion):
When constructing a semantic model of a system or when structuring computer code, there are several notions of computation that one might consider. Monads (Moggi, 1989; Moggi, 1991) are the most popular notion, but other notions, such as arrows (Hughes, 2000) and, more recently, applicative functors (McBride & Paterson, 2008) have been gaining widespread acceptance. Each of these notions of computation has particular characteristics that makes them more suitable for some tasks than for others. Nevertheless, there is much to be gained from unifying all three different notions under a single conceptual framework. [p. 1]
Another good example is Comonadic notions of computation by Tarmo Uustalu and Varmo Vene (2000):
Since the seminal work by Moggi in the late 80s, monads, more precisely, strong monads, have become a generally accepted tool for structuring effectful notions of computation, such as computation with exceptions, output, computation using an environment, state-transforming, nondeterministic and probabilistic computation etc. The idea is to use a Kleisli category as the category of impure, effectful functions, with the Kleisli inclusion giving an embedding of the pure functions from the base category. [...] [p. 263]
[...]
The starting-point in the monadic approach to (call-by-value) effectful computation is the idea that impure, effectful functions from A to B must be nothing else than pure functions from A to TB. Here pure functions live in a base category C and T is an endofunctor on C that describes the notion of effect of interest; it is useful to think of TA as the type of effectful computations of values of a given type A.
For this to work, impure functions must have identities and compose. Therefore T cannot merely be a functor, but must be a monad. [p. 265]
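That last requirement, identities and composition for effectful functions, is directly visible in Haskell as Kleisli composition. A small sketch with Maybe as the notion of effect (the function names are illustrative):

import Control.Monad ((>=>))

-- Effectful functions a -> m b compose with (>=>), and 'return'
-- is their identity, which is why m must be a monad.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

recipSqrt :: Double -> Maybe Double
recipSqrt = safeRecip >=> safeSqrt   -- Nothing if x == 0 or 1/x < 0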
Such uses of "computation" fit the usual computer science notion of models of computation (see danidiaz's answer for more on that). In the informal functional programming literature, allusions to monads as models of computation have varying degrees of precision. Still, they generally draw from, or at least are offshoots of, a rigorous idea.
Nothing. It doesn't mean anything. It's the output of someone struggling to find metaphors which make monads into something they already know. It almost means something. "It is possible to construct models of computation which form monads," for instance, is a meaningful statement. But the difference is significant. "Monads are models of computation" is an attempt to force a broad abstraction into a narrow interpretation. The other specifies that you can work with a broader abstraction for one use case.
Be very wary of reductive explanations. Do you think that an entire community of developers would keep using unfamiliar terminology if familiar terminology communicated the same thing? The term Monad has stuck around for 20 years in a language community that rapidly invents and discards abstractions as it searches for improvements. The only way that can happen is if it communicates something useful and precise.
It's just hard to write an explanation of the application of the idea to programming that makes any sense to people who don't know enough of the language to understand the constructs in use. If you aren't comfortable with at least higher-kinded types, type classes, and higher-order functions there's no way to understand what the notation is saying.
Learning prerequisite ideas will help. Practice writing code will help. Looking at how (>>=) works for various concrete types will help. Struggling through learning how to use a library like Parsec (or modern descendants like megaparsec) will help.
Trying to force the idea to match something you already know via metaphor will not.
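For instance, here is the kind of concrete (>>=) behaviour worth working through on its own (a small illustrative sketch):

-- Maybe: (>>=) short-circuits on Nothing.
ex1, ex2 :: Maybe Int
ex1 = Just 3  >>= \x -> Just (x + 1)   -- Just 4
ex2 = Nothing >>= \x -> Just (x + 1)   -- Nothing

-- Lists: (>>=) maps and concatenates, modelling non-determinism.
ex3 :: [Int]
ex3 = [1, 2, 3] >>= \x -> [x, x * 10]  -- [1,10,2,20,3,30]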
Expanding a little on duplode's answer, I think that when talking about computation, "model" can have at least two slightly different meanings.
One is model in the sense of the Church–Turing thesis. Here a model is a way of performing computations that is capable of expressing any algorithm. So Turing machines, lambda calculus, Post correspondence systems... are all models.
Another is model in the sense of programming language semantics. The idea is that we consider programs as composable syntactical structures, and we want them to "mean" something, ideally in a way that lets us determine the meaning of a composition from the meaning of the elements. In this sense, lambda calculus has models.
Now, one kind of semantics is denotational semantics, in which the meaning we assign to a program is some kind of mathematical object. For a trivial example, consider binary numbers. Here the "programs" are strings of 0s and 1s, regarded as mere symbols. And the "model" would be natural numbers, along with a function which maps each string of symbols to the corresponding natural number.
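As a tiny runnable rendering of that toy model (the function name is just for illustration):

import Data.List (foldl')

-- "Programs" are strings of the symbols '0' and '1'; the model maps
-- each string to the natural number it denotes.
denote :: String -> Integer
denote = foldl' step 0
  where
    step n '0' = 2 * n
    step n '1' = 2 * n + 1
    step _ _   = error "not a binary numeral"

-- denote "101" == 5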
Sometimes these denotations of programs are expressed in terms of category theory. This is the context of Moggi's papers: he is making use of machinery from category theory—like monads—to map programming language concepts like exceptions, continuations, input/output... into a mathematical model. Monads become a convenient way of structuring the mathematical universe of program meanings.

What's the current status of restricted monads?

Going back to at least the late 1990s there have been people wishing for the integration of restricted monads into Haskell in a friendly way.
For example, without restricted monads you can't make an efficient monad out of Set, Map or probability distributions. Here's a SO question from a few years ago where someone else ran afoul of this problem.
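The heart of the problem, sketched for Set: the natural bind needs an Ord constraint on the result type, but the Monad class demands that (>>=) work at every result type unconditionally.

import qualified Data.Set as Set

-- This is a perfectly good "bind" for sets...
bindSet :: Ord b => Set.Set a -> (a -> Set.Set b) -> Set.Set b
bindSet s f = Set.unions (map f (Set.toList s))

-- ...but it cannot be (>>=), whose type has no room for the Ord b
-- constraint: (>>=) :: Monad m => m a -> (a -> m b) -> m b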
There are various workarounds that people have come up with, including:
Creating a new type class for every possible restriction.
Using Template Haskell.
Using Constraint Kinds.
None of these approaches seem to be "canonical" however. I found a comment from Don Stewart on this blog post, in 2007, where he intimated that we were "quite close" to having restricted monads with Indexed types.
What is the current status? Is there now a 'canonical' way to do restricted monads? Or we are still living with workarounds?
There's a recent paper by Anders Persson, Emil Axelsson, and Josef Svenningsson that shows a way to encode restricted monads. I've forgotten the details, but I remember it was a nice paper.
Persson, A.; Axelsson, E.; Svenningsson, J. (2011). Generic monadic constructs for embedded languages. IFL 2011, the 23rd Symposium on Implementation and Application of Functional Languages.
Actually it is possible to obtain an efficient Set monad as a regular monad, without any restrictions, in two distinct ways. The following article explains both:
http://okmij.org/ftp/Haskell/set-monad.html
The article also points out that restricted monads are actually quite restricted and preclude many monadic idioms. I conjecture that the implementation methods are general and any restricted monad can be turned into the usual one without losing efficiency. So it may seem that we don't need restricted monads at all.
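For a flavour of how such an encoding can look, here is a continuation-style wrapper in that general spirit (an illustrative sketch, not necessarily the article's own construction):

{-# LANGUAGE RankNTypes #-}
import qualified Data.Set as S

-- A Set-backed monad with no constraint on (>>=): the Ord constraint
-- is pushed to the edges, where we enter and leave the wrapper.
newtype SetM a =
  SetM { runSetM :: forall r. Ord r => (a -> S.Set r) -> S.Set r }

instance Functor SetM where
  fmap f m = SetM $ \k -> runSetM m (k . f)

instance Applicative SetM where
  pure x    = SetM ($ x)
  mf <*> mx = SetM $ \k -> runSetM mf (\f -> runSetM mx (k . f))

instance Monad SetM where
  m >>= f = SetM $ \k -> runSetM m (\a -> runSetM (f a) k)

liftSet :: S.Set a -> SetM a
liftSet s = SetM $ \k -> S.unions (map k (S.toList s))

toSet :: Ord a => SetM a -> S.Set a
toSet m = runSetM m S.singleton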

Monad in non-programming terms [duplicate]

Possible Duplicate:
What is a monad?
How would you describe a monad in non-programming terms? Is there some concept/thing outside of programming (outside of all programming, not just FP) which could be said to act or be monad-like in a significant way?
Yes, there are several things outside programming that can be said to be like monads. No, none of them will help you understand monads. Please read Abstraction, intuition, and the “monad tutorial fallacy”:
Joe Haskeller is trying to learn about monads. After struggling to understand them for a week, looking at examples, writing code, reading things other people have written, he finally has an “aha!” moment: everything is suddenly clear, and Joe Understands Monads! What has really happened, of course, is that Joe’s brain has fit all the details together into a higher-level abstraction, a metaphor which Joe can use to get an intuitive grasp of monads; let us suppose that Joe’s metaphor is that Monads are Like Burritos. Here is where Joe badly misinterprets his own thought process: “Of course!” Joe thinks. “It’s all so simple now. The key to understanding monads is that they are Like Burritos. If only I had thought of this before!” The problem, of course, is that if Joe HAD thought of this before, it wouldn’t have helped: the week of struggling through details was a necessary and integral part of forming Joe’s Burrito intuition, not a sad consequence of his failure to hit upon the idea sooner.
But now Joe goes and writes a monad tutorial called “Monads are Burritos,” under the well-intentioned but mistaken assumption that if other people read his magical insight, learning about monads will be a snap for them. “Monads are easy,” Joe writes. “Think of them as burritos.” Joe hides all the actual details about types and such because those are scary, and people will learn better if they can avoid all that difficult and confusing stuff. Of course, exactly the opposite is true, and all Joe has done is make it harder for people to learn about monads, because now they have to spend a week thinking that monads are burritos and getting utterly confused, and then a week trying to forget about the burrito analogy, before they can actually get down to the business of learning about monads.
As I said in another answer long ago, sigfpe's article You Could Have Invented Monads! (And Maybe You Already Have.), as well as Philip Wadler's original paper Monads for functional programming, are both excellent introductions (which give not analogies but lots of examples), but beyond that you just keep coding, and eventually it will all seem trivial.
[Not a real answer: One place monads exist outside all programming, of course, is in mathematics. As this hilarious post points out, "a monad is a monoid in the category of endofunctors, what's the problem?" :-)]
Edit: The questioner seems to have interpreted this answer as condescending, saying something like "Monads are so complicated they are beyond analogy". In fact, nothing of the sort was intended, and it's monad-analogies that often appear condescending. Maybe I should restate my point as "You don't have to understand monads". You use particular monads because they're useful — you use the Maybe monad when you need Maybe types, you use the IO monad when you need to do IO, similarly other examples, and apparently in C#, you use the Nullable<> pattern, LINQ and query comprehensions, etc. Now, the insight that there's a single general abstraction underlying all these structures, which we call a monad, is not necessary to understand or use the specific monads. It is something that can come as an afterthought, after you've seen more than one example and recognise a pattern: learning proceeds from the concrete to the abstract. Directly explaining the abstraction, by appealing to analogies of the abstraction itself, does not usually help a learner grasp what it's an abstraction of.
Here's my current stab at it:
Monads are bucket brigades:
Each operation is a person standing in line; i.e. there's an unambiguous sequence in which the operations take place.
Each person takes one bucket as input, takes stuff out of it, and puts new stuff in the bucket. The bucket, in turn, is passed down to the next person in the brigade (through the bind, or >>=, operation).
The return operation is simply the operation of putting stuff in the bucket.
In the case of sequence (>>) operations, the contents of the bucket are dumped before they're passed to the next person. The next person doesn't care what was in the bucket, they're just waiting to receive it.
In the case of monads on (), a ticket is being passed around inside the bucket. It's called "the Unit", and it's just a blank sheet of paper.
In the case of IO monads, each person says something aloud that's either utterly profound or utterly stupid – but they can only speak when they're holding the bucket.
Hope this helps. :-)
Edit: I appreciate your support, but sadly, the Monad Tutorial curse has struck again. What I've described is just function application with containers, not monads! But I'm no nihilist – I believe the Monad Tutorial curse can be broken! So here's a somewhat more, um, complicated picture that I think describes it a bit better. You decide whether it's worth taking to your friends.
Monads are a bucket brigade with project managers. The project managers stand behind all but the first member of the brigade. The members of the bucket brigade are seated on stools, and have buckets in front of them.
The first person receives some stuff, does something with it, and puts it in a bucket. That person then hands off – not to the next person in the brigade, that would be too easy! :-) – but to the project manager standing behind that person.
The project manager (her name is bind, or >>=) takes the bucket and decides what to do with it. She may decide to take the first person's stuff out of the bucket and just hand it to the person in front of her without further ado (that's the IO monad). She may choose to throw the bucket away and end the brigade (that's fail). She may decide to just bypass the person in front of her and pass the bucket to the next manager in the brigade without further ado (that's what happens with Nothing in the Maybe monad). She may even decide to take the stuff out of the bucket and hand it to the person in front of her a piece at a time! (That's the List monad.) In the case of sequence (>>) she just taps the shoulder of the person in front of her, instead of handing them any stuff.
When the next person makes a bucket of stuff, the person hands it to the next project manager. The next project manager figures out again what to do with the bucket she's given, and hands the stuff in the bucket to her person. At the end, the bucket is passed back up the chain of project managers, who can optionally do stuff with the bucket (like the List monad assembling all the results). The first project manager produces a bucket of stuff as the result.
In the case of the do syntax, each person is actually an operation that's defined on the spot within the context of everything that's gone before – as if the project manager passes along not just what's in the bucket, but also the values (er, stuff) that have been generated by the previous members of the brigade. The context building in this case is much easier to see if you write out the computation using bind and sequence instead of using the do syntax – note each successive "statement" is an anonymous function constructed within the operation that precedes that point.
() values, IO monads, and the return operation remain described as above.
"But this is too complicated! Why can't the people just unload the buckets themselves?" I hear you ask. Well, the project manager can do a bunch of work behind the scenes that would otherwise complicate the person's work. We're trying to make it easy on these brigade members, so they don't have to do too much. In the case of the Maybe monad, for example, each person doesn't have to check the value of what they're given to see if they were given Nothing – the project manager takes care of that for them.
"Well, then, if you're realliy trying to make each person's job easier, why not go all the way – have a person just take stuff and hand off stuff, and let the project manager worry about the bucketing?" That's often done, and it has a special name called lifting the person (er, operation) into the monad. Sometimes, though, you want a person that has something a bit more complicated to do, where they want some control over the bucket that's produced (e.g. whether they need to return Nothing in the case of the Maybe monad), and that's what the monad in full generality provides.
The points being:
The operations are sequenced.
Each person knows how to make buckets, but not how to get stuff out of buckets.
Each project manager knows how to deal with buckets, and how to get stuff out of them, but doesn't care what's in them.
Thus ends my bedtime tutorial. :-P
In non-programming terms:
If F and G are a pair of adjoint functors, with F left adjoint to G, then the composition G.F is a monad.
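A programming instance of this fact, for the Haskell-inclined (a standard observation, sketched here):

-- F a = (a, s) is left adjoint to G b = s -> b; the adjunction is
-- precisely curry/uncurry: ((a, s) -> b)  ≅  (a -> (s -> b)).
-- Their composite G (F a) = s -> (a, s) is the State monad:
newtype State s a = State { runState :: s -> (a, s) }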
Is there some concept/thing outside of programming (outside of all programming, not just FP) which could be said to act or be monad-like in a significant way?
Yes, in fact there is. Monads are quite directly related to "possibility" in modal logic by an extension of the Curry-Howard isomorphism. (See: A Judgmental Reconstruction of Modal Logic.)
This is quite a strong relationship, and to me the concepts related to possibility on the logical side are more intuitive than those related to monads from category theory. The best way I've found to explain monads to my students draws on this relationship but without explicitly showing the isomorphism.
The basic idea is that without monads, all expressions exist in the same world, and all calculation is done in that world. But with monads there can be many worlds and the calculation moves between them. (e.g., each world might specify the current value of some mutable state)
In this view, a monad p means "in a possible reachable world from the current world".
In particular if t is a type then:
x :: t means something of type t is directly available in the current world
y :: p t means something of type t is available in a world reachable from the current one
Then, return allows us to use the current world as a reachable one.
return :: t -> p t
And >>= allows us to make use of something in a reachable world and then to reach additional worlds from that world.
(>>=) :: p t -> (t -> p s) -> p s
So >>= can be used to construct a path to a reachable world from smaller paths through other worlds.
With the worlds being something like states this is pretty easy to explain. For something like an IO monad, it's also pretty easy: a world is specified by all the interactions a program has had with the outside world.
For non-termination two worlds suffice - the ordinary one, and one that is infinitely far in the future. (Applying >>= with the second world is allowed, but you're unlikely to observe what happens in that world.) For a continuation monad, the world remains the same when continuations are used normally, and there are extra worlds for when they are not (e.g., for callcc).
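To make the state reading concrete, a small sketch (using Control.Monad.State from the mtl package; the name tick is just for illustration): each possible counter value is a world, and running tick moves us to the neighbouring world where the counter is one higher.

import Control.Monad.State

tick :: State Int Int
tick = do
  n <- get        -- observe the current world
  put (n + 1)     -- step to a reachable world
  return n

-- runState (tick >> tick) 0 == (1, 2)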
From this excellent post by Mike Vanier,
One of the key concepts in Haskell that sets it apart from other programming languages is the concept of a "monad". People seem to find this difficult to learn (I did as well), and as a result there are loads of monad tutorials on the web, some of which are very good (I particularly like All About Monads by Jeff Newbern). It's even been said that writing a monad tutorial is a rite of passage for new Haskell programmers. However, one big problem with many monad tutorials is that they try to explain what monads are in reference to existing concepts that the reader already understands (I've even seen this in presentations by Simon Peyton-Jones, the main author of the GHC compiler and general Haskell grand poobah). This is a mistake, and I'm going to tell you why.
It's natural, when trying to explain what something is, to explain it by reference to things the other person already knows about. This works well when the new thing is similar in some ways to things the other person is familiar with. It breaks down utterly when the new thing is completely out of the experience of the person learning it. For instance, if you were trying to explain what fire is to a caveman who had never seen a fire, what would you say? "It's kind of like a cross between air and water, but hot..." Not very effective. Similarly, explaining what an atom is in terms of quantum mechanics is problematic, because we know that the electron doesn't really orbit around the nucleus like a planet around a star, and the notion of a "delocalized electron cloud" doesn't really mean much. Feynman once said that nobody really understood quantum mechanics, and on an intuitive level that's true. But on a mathematical level, quantum mechanics is well-understood; we just don't have a good intuition for what the math really means.
How does this relate to monads? Time and again, in tutorials, blog posts and on the Haskell mailing lists, I've seen monads explained in one of two supposedly-intuitive ways: a monad is "kind of like an action" or "kind of like a container". How can something be both an action and a container? Aren't these separate concepts? Is a monad some kind of weird "active container"? No, but the point is that claiming that a monad is a kind of action or a kind of container is incorrect. So what is a monad, anyway?
Here's the answer: A monad is a purely abstract concept, with no fundamental relationship to anything you've probably ever heard of before. The notion of a monad comes from category theory, which is the most abstract branch of mathematics I know of. In fact, the whole point of category theory is to abstract out all of the structure of mathematics to expose the similarities and analogies between seemingly disparate areas (for instance, between algebra and topology), so as to condense mathematics into its fundamental concepts, and thus reduce redundancy. (I could go on about this for quite a while, but I'd rather get back to the point I'm trying to make.) Since I'm guessing that most programmers learning Haskell don't know much about category theory, monads are not going to mean anything to them. That doesn't mean that they need to learn all about category theory to use monads in Haskell (fortunately), but it does mean that they need to get comfortable thinking about things in a more abstract way than they are probably used to.
Please go to the link at the top of the post to read the full article.
In practice, most of the monads I've worked with behave like some kind of implicit context.
It's like when you and a friend are trying to have a conversation about a mutual friend. Every time you say "Bob," you're both referring to the same Bob, and that fact is just implicitly threaded through your conversation due to the context of Bob being your mutual friend.
You can, of course, have a conversation with your boss (not your friend) about your skip-level manager (not your friend) who happens to be named Bob. Here you can have another conversation, again with some implied connotation that only makes sense within the context of the conversation. You can even utter the exact same words as you did with your friend, but they will carry a different meaning because of the different context.
In programming it's the same. The way that tell behaves depends on which monad you're in; the way that information is assembled (>>=) depends on which monad you're in. Same idea, different mode of conversation.
Heck, even the rules of the conversation can be monadic. "Don't tell anyone what I told you" hides information the same way that runST prevents references from escaping the ST monad. Obviously, conversations can have layers and layers of context, just like we have stacks of monad transformers.
Hope that helps.
Well, here's a nicely detailed description of monads that's definitely outside of all programming. I know it's outside of programming because I'm a programmer and I don't understand even half of what it talks about.
There's also a series of videos on YouTube explaining monads of that variety--here's the first in the sequence.
I'm guessing that's not really what you were looking for, though...
I like to think of them as abstractions of computations that can be "bound." Or, burritos!
It depends on who you are talking to. Any explanation has to be pitched at the right level. My explanation to a chemical engineer would be different to my explanation to a mathematician or a finance manager.
The best approach is to relate it to something in the expertise of the person you are talking to. As a rule sequencing is a fairly universal problem, so try to find something the person knows about where you say "first do X, then do Y". Then explain how ordinary programming languages have a problem with that; if you say "do X, then do Y" to a computer it does X and Y immediately without waiting for further input, but it can't do Z in the meantime for someone else; the computer's idea of "and then do" is different from yours. So programmers have to write their programs differently from the way that you (the expert) would explain it. This creates a gap between what you say and what the program says. It costs time and money to cross that gap.
Monads let you put your version of "and then do" into the computer, so you can say "do X and then do Y", and the programmer can write "do {x ; y}", and it means what you mean.
Yes, monads come from a concept outside of Haskell. Haskell has many terms and ideas that have been borrowed from category theory, and this is one of them. So if this person who is not a programmer turns out to be a mathematician who has studied category theory, just say: "a monad is a monoid in the category of endofunctors."

Does functional programming mandate new naming conventions?

I recently started studying functional programming using Haskell and came upon this article on the official Haskell wiki: How to read Haskell.
The article claims that short variable names such as x, xs, and f are fitting for Haskell code, because of conciseness and abstraction. In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply.
What are your thoughts on this?
In a functional programming paradigm, people usually construct abstractions not only top-down, but also bottom-up. That means you basically enhance the host language. In this kind of situation I see terse naming as appropriate. The Haskell language is already terse and expressive, so you should be kind of used to it.
However, when trying to model a certain domain, I don't believe succinct names are good, even when the function bodies are small. Domain knowledge should reflect in naming.
Just my opinion.
In response to your comment
I'll take two code snippets from Real World Haskell, both from chapter 3.
In the section named "A more controlled approach", the authors present a function that returns the second element of a list. Their final version is this:
tidySecond :: [a] -> Maybe a
tidySecond (_:x:_) = Just x
tidySecond _ = Nothing
The function is generic enough, due to the type parameter a and the fact that we're acting on a built-in type, that we don't really care what the second element actually is. I believe x is enough in this case. Just like in a little mathematical equation.
On the other hand, in the section named "Introducing local variables", they're writing an example function that tries to model a small piece of the banking domain:
lend amount balance = let reserve    = 100
                          newBalance = balance - amount
                      in if balance < reserve
                         then Nothing
                         else Just newBalance
Using short variable names here is certainly not recommended. We actually do care what those amounts represent.
I think if the semantics of the arguments are clear within the context of the code then you can get away with short variable names. I often use these in C# lambdas for the same reason. However if it is ambiguous, you should be more explicit with naming.
map :: (a->b) -> [a] -> [b]
map f [] = []
map f (x:xs) = f x : map f xs
To someone who hasn't had any exposure to Haskell, that might seem like ugly, unmaintainable code. But most Haskell programmers will understand this right away. So it gets the job done.
var list = new int[] { 1, 2, 3, 4, 5 };
int countEven = list.Count(n => n % 2 == 0);
In that case, a short variable name seems appropriate.
list.Aggregate(0, (total, value) => total += value);
But in this case it seems more appropriate to name the variables, because it isn't immediately apparent what the Aggregate is doing.
Basically, I believe not to worry too much about convention unless it's absolutely necessary to keep people from screwing up. If you have any choice in the matter, use what makes sense in the context (language, team, block of code) you are working, and will be understandable by someone else reading it hours, weeks or years later. Anything else is just time-wasting OCD.
I think scoping is the #1 reason for this. In imperative languages, dynamic variables, especially global ones, need to be named properly, as they're used in several functions. With lexical scoping, it's clear what a symbol is bound to at compile time.
Immutability also contributes to this to some extent- in traditional languages like C/ C++/ Java, a variable can represent different data at different points in time. Therefore, it needs to be given a name to give the programmer an idea of its functionality.
Personally, I feel that features like first-class functions make symbol names pretty redundant. In traditional languages, it's easier to relate to a symbol; based on its usage, we can tell if it's data or a function.
I'm studying Haskell now, but I don't feel that its naming conventions are so very different. Of course, in Java you will hardly find names like xs. But it is easy to find names like x in some mathematical functions, or i and j for counters, etc. I consider such names to be perfectly appropriate in the right context. xs in Haskell is appropriate only in generic functions over lists. There are a lot of them in Haskell, so this name is widespread. Java doesn't provide an easy way to handle such generic abstractions; that's why names for lists (and lists themselves) are usually much more specific, e.g. lists or users.
I just attended a number of talks on Haskell with lots of code samples. As long as the code dealt with x, i and f the naming didn't bother me. However, as soon as we got into heavy-duty list manipulation and the like, I found the three-letter-or-so names to be a lot less readable than I prefer.
To be fair a significant part of the naming followed a set of conventions, so I assume that once you get into the lingo it will be a little easier.
Fortunately, nothing prevents us from using meaningful names, but I don't agree that the language itself somehow makes three letter identifiers meaningful to the majority of people.
When in Rome, do as the Romans do
(Or as they say in my town: "Donde fueres, haz lo que vieres")
Anything that aids readability is a good thing - meaningful names are therefore a good thing in any language.
I use short variable names in many languages but they're reserved for things that aren't important in the overall meaning of the code or where the meaning is clear in the context.
I'd be careful how far I took the advice about Haskell names
My Haskell practice is only at a mediocre level, thus I dare to reply only to the second, more general part of your question:
"In essence, it claims that functional programming is such a distinct paradigm that the naming conventions from other paradigms don't apply."
I suspect the answer is "yes", but my motivation behind this opinion rests on experience with just one functional language. Still, it may be interesting, because it is an extremely minimalistic one, thus theoretically very "pure", and underlying a lot of practical functional languages.
I was curious how easy it is to write practical programs in such an "extremely" minimalistic functional programming language as combinatory logic.
Of course, functional programming languages lack mutable variables, but combinatory logic goes one step further: it lacks even formal parameters. It lacks any syntactic sugar, it lacks any predefined datatypes, even booleans or numbers. Everything must be mimicked by combinators, and traced back to the applications of just two basic combinators.
Despite such extreme minimalism, there are still practical methods for "programming" combinatory logic in a neat and pleasant way. I have written a quine in it in a modular and reusable way, and it would not even be nasty to bootstrap a self-interpreter on top of it.
In summary, I observed the following while using this extremely minimalistic functional programming language:
There is a need to invent a lot of auxiliary functions. In Haskell, there is a lot of syntactic sugar (pattern matching, formal parameters). You can write quite complicated functions in a few lines. But in combinatory logic, a task that could be expressed in Haskell by a single function must be decomposed into well-chosen auxiliary functions. The burden of replacing Haskell's syntactic sugar is carried by cleverly chosen auxiliary functions. As for your original question: it is worth inventing meaningful and catchy names for these legions of auxiliary functions, because they can be quite powerful and reusable in many further contexts, sometimes in unexpected ways.
Moreover, a programmer of combinatory logic is not only forced to find catchy names for a bunch of cleverly chosen auxiliary functions, but even more, he is forced to (re)invent whole new theories. For example, to mimic lists, the programmer is forced to represent them by their fold functions; basically, he has to (re)invent catamorphisms, deep concepts from algebra and category theory.
I conjecture that several differences can be traced back to the fact that functional languages have a powerful "glue".
In Haskell, meaning is conveyed less with variable names than with types. Being purely functional has the advantage of being able to ask for the type of any expression, regardless of context.
I agree with a lot of the points made here about argument naming, but a quick 'find on page' shows that no one has mentioned tacit programming (aka point-free / pointless style). Whether this is easier to read may be debatable, so it's up to you and your team, but it is definitely worth a thorough consideration.
No named arguments = No argument naming conventions.
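A quick illustration of the difference (a toy example of my own):

-- Pointful: the argument is named.
sumSquares :: [Int] -> Int
sumSquares xs = sum (map (^ 2) xs)

-- Tacit (point-free): there is no argument to name at all.
sumSquares' :: [Int] -> Int
sumSquares' = sum . map (^ 2)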
