What does "a monad is a model of computation" mean


What does it mean exactly when people say "a monad is a model of computation"? Does this mean computation in the sense of Turing completeness? If so, how?
Clarification: This question is not about explaining monads but what people mean with "model of computation" in this context and how this relates to monads. See towards the end of this answer for a typical use of this phrase.
In my understanding a Turing machine, the theory of recursive functions, lambda calculus etc. are all models of computation and I cannot see how a monad would relate to that if at all.

The idea of monads as models of computation can be traced back to the work of Eugenio Moggi. Among Haskell practitioners, the best known paper by Moggi on this matter is Notions of computation and monads (1991). Relevant quotes include:
The [lambda]-calculus is considered a useful mathematical tool in the study of programming languages, since programs can be identified with [lambda]-terms. However, if one goes further and uses [beta][eta]-conversion to prove equivalence of programs, then a gross simplification is introduced (programs are identified with total functions from values to values) that may jeopardise the applicability of theoretical results. In this paper we introduce calculi based on a categorical semantics for computations, that provide a correct basis for proving equivalence of programs for a wide range of notions of computation. [p. 1]
[...]
We do not take as a starting point for proving equivalence of programs the theory of [beta][eta]-conversion, which identifies the denotation of a program (procedure) of type A -> B with a total function from A to B, since this identification wipes out completely behaviours such as non-termination, non-determinism, and side-effects, that can be exhibited by real programs. Instead, we proceed as follows:
We take category theory as a general theory of functions and develop on top a categorical semantics of computations based on monads. [...] [p. 1]
[...]
The basic idea behind the categorical semantics below is that, in order to interpret a programming language in a category [C], we distinguish the object A of values (of type A) from the object TA of computations (of type A), and take as denotations of programs (of type A) the elements of TA. In particular, we identify the type A with the object of values (of type A) and obtain the object of computations (of type A) by applying an unary type-constructor T to A. We call T a notion of computation, since it abstracts away from the type of values computations may produce. There are many choices for TA corresponding to different notions of computations. [pp. 2-3]
[...]
We have identified monads as important to modeling notions of computations, but computational monads seem to have additional properties; e.g., they have a tensorial strength and may satisfy the mono requirement. It is likely that there are other properties of computational monads still to be identified, and there is no reason to believe that such properties have to be found in the literature on monads. [p. 27 -- thanks danidiaz]
A related older paper by Moggi, Computational lambda-calculus and monads (1989 -- thanks michid for the reference), speaks literally of "computational model[s]":
A computational model is a monad (T;[eta];[mu]) satisfying the mono requirement: [eta-A] is a mono for every A [belonging to] C.
There is an alternative description of a monad (see[7]), which is easier to justify computationally. [...] [p. 2]
This particular bit of terminology was dropped in Notions of computation and monads, as Moggi sharpened the focus of his presentation on the "alternative description" (namely, Kleisli triples, which consist of, in Haskell parlance, a type constructor, return and bind). The essence, though, remains the same throughout.
Philip Wadler presents the idea with a more practical bent in Monads for functional programming (1992):
The use of monads to structure functional programs is described. Monads provide a convenient framework for simulating effects found in other languages, such as global state, exception handling, output, or non-determinism. [p. 1]
[...]
Pure functional languages have this advantage: all flow of data is made explicit. And this disadvantage: sometimes it is painfully explicit.
A program in a pure functional language is written as a set of equations. Explicit data flow ensures that the value of an expression depends only on its free variables. Hence substitution of equals for equals is always valid, making such programs especially easy to reason about. Explicit data flow also ensures that the order of computation is irrelevant, making such programs susceptible to lazy evaluation.
It is with regard to modularity that explicit data flow becomes both a blessing and a curse. On the one hand, it is the ultimate in modularity. All data in and all data out are rendered manifest and accessible, providing a maximum of flexibility. On the other hand, it is the nadir of modularity. The essence of an algorithm can become buried under the plumbing required to carry data from its point of creation to its point of use. [p. 2]
[...]
Say it is desired to add error checking, so that the second example above returns a sensible error message. In an impure language, this is easily achieved with the use of exceptions.
In a pure language, exception handling may be mimicked by introducing a type to represent computations that may raise an exception. [pp. 3-4 -- note this is before monads are introduced as a unifying abstraction.]
[...]
Each of the variations on the interpreter has a similar structure, which may be abstracted to yield the notion of a monad.
In each variation, we introduced a type of computations. Respectively, M represented computations that could raise exceptions, act on state, and generate output. By now the reader will have guessed that M stands for monad. [p. 6]
This is one of the roots of the usage of "computation" to refer to monadic values.
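As a rough illustration of the passage above, here is a minimal sketch in the spirit of Wadler's interpreter, not a transcription of the paper's code. The names Term, M, evalExplicit and eval are chosen for this example, and Either String is assumed as the type of computations that may raise an exception.

```haskell
-- Terms of a tiny arithmetic language: constants and division.
data Term = Con Int | Div Term Term

-- Assumption for this sketch: a computation that may raise an exception
-- is an Either, with Left carrying the error message.
type M a = Either String a

-- Evaluation with the error plumbing written out by hand.
evalExplicit :: Term -> M Int
evalExplicit (Con n)   = Right n
evalExplicit (Div t u) =
  case evalExplicit t of
    Left err -> Left err
    Right a  ->
      case evalExplicit u of
        Left err -> Left err
        Right b
          | b == 0    -> Left "divide by zero"
          | otherwise -> Right (a `div` b)

-- The same evaluator, with the plumbing handled by the Monad interface.
eval :: Term -> M Int
eval (Con n)   = return n
eval (Div t u) = do
  a <- eval t
  b <- eval u
  if b == 0 then Left "divide by zero" else return (a `div` b)
```

In GHCi, eval (Div (Div (Con 1972) (Con 2)) (Con 23)) gives Right 42, while eval (Div (Con 1) (Con 0)) gives Left "divide by zero".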
A significant body of later literature makes use of the concept of computation in this manner. For instance, this is the opening passage of Notions of Computation as Monoids by Exequiel Rivas and Mauro Jaskelioff (2014 -- thanks danidiaz for the suggestion):
When constructing a semantic model of a system or when structuring computer code, there are several notions of computation that one might consider. Monads (Moggi, 1989; Moggi, 1991) are the most popular notion, but other notions, such as arrows (Hughes, 2000) and, more recently, applicative functors (McBride & Paterson, 2008) have been gaining widespread acceptance. Each of these notions of computation has particular characteristics that makes them more suitable for some tasks than for others. Nevertheless, there is much to be gained from unifying all three different notions under a single conceptual framework. [p. 1]
Another good example is Comonadic notions of computation by Tarmo Uustalu and Varmo Vene (2008):
Since the seminal work by Moggi in the late 80s, monads, more precisely, strong monads, have become a generally accepted tool for structuring effectful notions of computation, such as computation with exceptions, output, computation using an environment, state-transforming, nondeterministic and probabilistic computation etc. The idea is to use a Kleisli category as the category of impure, effectful functions, with the Kleisli inclusion giving an embedding of the pure functions from the base category. [...] [p. 263]
[...]
The starting-point in the monadic approach to (call-by-value) effectful computation is the idea that impure, effectful functions from A to B must be nothing else than pure functions from A to TB. Here pure functions live in a base category C and T is an endofunctor on C that describes the notion of effect of interest; it is useful to think of TA as the type of effectful computations of values of a given type A.
For this to work, impure functions must have identities and compose. Therefore T cannot merely be a functor, but must be a monad. [p. 265]
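To make the "identities and compose" requirement concrete, here is a small sketch that takes T to be Maybe. The functions safeRecip, safeSqrt and the composites are hypothetical names for this example; (>=>) is the Kleisli composition operator from Control.Monad, and return plays the role of the identity.

```haskell
import Control.Monad ((>=>))

-- "Impure" functions from a to b, modelled as pure functions from a to Maybe b.
safeRecip :: Double -> Maybe Double
safeRecip 0 = Nothing
safeRecip x = Just (1 / x)

safeSqrt :: Double -> Maybe Double
safeSqrt x
  | x < 0     = Nothing
  | otherwise = Just (sqrt x)

-- Kleisli composition: the composite of two effectful functions
-- is again an effectful function.
recipThenSqrt :: Double -> Maybe Double
recipThenSqrt = safeRecip >=> safeSqrt

-- return acts as the identity for this composition (by the monad laws).
leftIdentity, rightIdentity :: Double -> Maybe Double
leftIdentity  = return >=> safeSqrt
rightIdentity = safeSqrt >=> return
```

Here recipThenSqrt 4 is Just 0.5, recipThenSqrt 0 is Nothing, and both leftIdentity and rightIdentity behave exactly like safeSqrt.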
Such uses of "computation" fit the usual computer science notion of models of computation (see danidiaz's answer for more on that). In the informal functional programming literature, allusions to monads as models of computation have varying degrees of precision. Still, they generally draw from, or at least are offshoots of, a rigorous idea.

Nothing. It doesn't mean anything. It's the output of someone struggling to find metaphors which make monads into something they already know. It almost means something. "It is possible to construct models of computation which form monads," for instance, is a meaningful statement. But the difference is significant. "Monads are models of computation" is an attempt to force a broad abstraction into a narrow interpretation. The other specifies that you can work with a broader abstraction for one use case.
Be very wary of reductive explanations. Do you think that an entire community of developers would keep using unfamiliar terminology if familiar terminology communicated the same thing? The term Monad has stuck around for 20 years in a language community that rapidly invents and discards abstractions as it searches for improvements. The only way that can happen is if it communicates something useful and precise.
It's just hard to write an explanation of the application of the idea to programming that makes any sense to people who don't know enough of the language to understand the constructs in use. If you aren't comfortable with at least higher-kinded types, type classes, and higher-order functions there's no way to understand what the notation is saying.
Learning prerequisite ideas will help. Practice writing code will help. Looking at how (>>=) works for various concrete types will help. Struggling through learning how to use a library like Parsec (or modern descendants like megaparsec) will help.
Trying to force the idea to match something you already know via metaphor will not.
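As one possible starting point for the "look at (>>=) at concrete types" exercise, the sketch below uses the same operator at three types from the Prelude. All function names are invented for this example.

```haskell
-- Maybe: stop at the first Nothing.
halve :: Int -> Maybe Int
halve n
  | even n    = Just (n `div` 2)
  | otherwise = Nothing

quarter :: Int -> Maybe Int
quarter n = halve n >>= halve

-- Lists: feed every result to the next step and collect all outcomes.
neighbours :: Int -> [Int]
neighbours n = [n - 1, n + 1]

twoSteps :: Int -> [Int]
twoSteps n = neighbours n >>= neighbours

-- Either: like Maybe, but the failure carries a message.
positive :: Int -> Either String Int
positive n
  | n > 0     = Right n
  | otherwise = Left ("not positive: " ++ show n)

checkedPred :: Int -> Either String Int
checkedPred n = positive n >>= \m -> positive (m - 1)
```

For instance, quarter 8 is Just 2 and quarter 6 is Nothing; twoSteps 0 is [-2,0,0,2]; checkedPred 2 is Right 1 and checkedPred 1 is Left "not positive: 0".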

Expanding a little on @duplode's answer, I think that when talking about computation, "model" can have at least two slightly different meanings.
One is model in the sense of the Church–Turing thesis. Here a model is a way of performing computations that is capable of expressing any algorithm. So Turing machines, lambda calculus, Post correspondence systems... are all models.
Another is model in the sense of programming language semantics. The idea is that we consider programs as composable syntactical structures, and we want them to "mean" something, ideally in a way that lets us determine the meaning of a composition from the meaning of the elements. In this sense, lambda calculus has models.
Now, one kind of semantics is denotational semantics, in which the meaning we assign to a program is some kind of mathematical object. For a trivial example, consider binary numbers. Here the "programs" are strings of 0s and 1s, regarded as mere symbols. And the "model" would be natural numbers, along with a function which maps each string of symbols to the corresponding natural number.
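The binary-numeral example can be written down directly. In this sketch the function name denote is an assumption of the example, and Maybe is used only to reject strings that are not numerals at all.

```haskell
import Numeric.Natural (Natural)

-- Syntax: a numeral is a non-empty string of the symbols '0' and '1'.
-- Semantics: its denotation is a natural number.
-- Nothing marks strings that are not well-formed numerals.
denote :: String -> Maybe Natural
denote "" = Nothing
denote s  = go 0 s
  where
    -- The meaning of a numeral extended by one digit is determined by
    -- the meaning of the shorter numeral: double it and add the digit.
    go acc []       = Just acc
    go acc (c : cs) = case c of
      '0' -> go (2 * acc) cs
      '1' -> go (2 * acc + 1) cs
      _   -> Nothing
```

Note that denote "101" and denote "00101" are both Just 5: two different pieces of syntax with the same meaning.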
Sometimes these denotations of programs are expressed in terms of category theory. This is the context of Moggi's papers: he is making use of machinery from category theory—like monads—to map programming language concepts like exceptions, continuations, input/output... into a mathematical model. Monads become a convenient way of structuring the mathematical universe of program meanings.

Related

What are algebraic structures in functional programming?

I've been doing some light reading on functional programming concepts and ideas. So far, so good, I've read about three main concepts: algebraic structures, type classes, and algebraic data types. I have a fairly good understanding of what algebraic data types are. I think sum types and product types are fairly straightforward. For example, I can imagine creating an algebraic data type like a Card type which is a product type consisting of two enum types, Suit (with four values and symbols) and Rank (with 13 values and symbols).
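For reference, the Card type described above might be sketched in Haskell as follows. The constructor names and the fullDeck helper are assumptions of this example.

```haskell
-- Two enumeration (sum) types with 4 and 13 values respectively.
data Suit = Clubs | Diamonds | Hearts | Spades
  deriving (Eq, Ord, Show, Enum, Bounded)

data Rank
  = Two | Three | Four | Five | Six | Seven | Eight | Nine | Ten
  | Jack | Queen | King | Ace
  deriving (Eq, Ord, Show, Enum, Bounded)

-- A product type: every Card holds one Rank and one Suit.
data Card = Card { rank :: Rank, suit :: Suit }
  deriving (Eq, Show)

-- Every combination of rank and suit: 13 * 4 = 52 cards.
fullDeck :: [Card]
fullDeck = [Card r s | s <- [minBound .. maxBound], r <- [minBound .. maxBound]]
```

The "product" in "product type" shows up in the count: length fullDeck is 52.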
However, I'm still hung up on trying to understand precisely what algebraic structures and type classes are. I just have a surface-level picture in my head but can't quite completely wrap my head around, for instance, the different types of algebraic structures like functors, monoids, monads, etc. How exactly are these different? How can they be used in a programming setting? How are type classes different from regular classes? Can anyone at least point me in the direction of a good book on abstract algebra and functional programming? Someone recommended I learn Haskell but do I really need to learn Haskell in order to understand functional programming?

"Algebraic structure" is a concept that goes well beyond programming; it belongs to mathematics.
Imagine the unfathomably deep sea of all possible mathematical objects. Numbers of every stripe (the naturals, the reals, p-adic numbers...) are there, but also things like sequences of letters, graphs, trees, symmetries of geometrical figures, and all well-defined transformations and mappings between them. And much else.
We can try to "throw a net" into this sea and retain only some of those entities, by specifying conditions. Like "collections of things, for which there is an operation that combines two of those things into a third thing of the same type, and for which the operation is associative". We can give those conditions their own name, like, say, "semigroup". (Because we are talking about highly abstract stuff, choosing a descriptive name is difficult.)
That leaves out many inhabitants of the mathematical "sea", but the description still fits a lot of them! Many collections of things are semigroups. The natural numbers with the multiplication operation for example, but also non-empty lists of letters with concatenation, or the symmetries of a square with composition.
You can expand your description with extra conditions. Like "a semigroup, and there's also an element such that combining it with any other element gives the other element, unchanged". That restricts the number of mathematical entities that fit the description, because you are demanding more of them. Some valid semigroups will lack that "neutral element". But a lot of mathematical entities will still satisfy the expanded description. If you aren't careful, you can declare conditions so restrictive that no possible mathematical entity can actually fit them! At other times, you can be so precise that only one entity fits them.
Working purely with these descriptions of mathematical entities, using only the general properties we require of them, we can obtain unexpected results about them, non-obvious at first sight, results that will apply to all entities which fit the description. Think of these discoveries as the mathematical equivalent of "code reuse". For example, if we know that some collection of things is a semigroup, then we can calculate exponentials using binary exponentiation instead of tediously combining a thing with itself n times. But that only works because of the associative property of the semigroup operation.
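The binary exponentiation remark can be sketched for any Semigroup instance. The function name power is hypothetical; the base library's Semigroup class already offers stimes for the same purpose.

```haskell
import Data.Semigroup (Product (..))

-- Combine x with itself n times (n >= 1) using O(log n) applications of (<>).
-- Regrouping the n copies of x like this is only valid because (<>) is associative.
power :: Semigroup a => Int -> a -> a
power n x
  | n <= 0    = error "power: exponent must be positive"
  | n == 1    = x
  | even n    = half <> half
  | otherwise = x <> half <> half
  where
    half = power (n `div` 2) x

-- The same function works for any semigroup, e.g. numbers under
-- multiplication and strings under concatenation.
twoToTheTenth :: Product Int
twoToTheTenth = power 10 (Product 2)

threeCopies :: String
threeCopies = power 3 "ab"
```

Here getProduct twoToTheTenth is 1024 and threeCopies is "ababab".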

You’ve asked quite a few questions here, but I can try to answer them as best I can:
… different types of algebraic structures like functors, monoids, monads, etc. How exactly are these different? How can they be used in a programming setting?
This is a very common question when learning Haskell. I won’t write yet another answer here — and a complete answer is fairly long anyway — but a simple Google search gives some very good answers: e.g. I can recommend 1 2 3
How are type classes different from regular classes?
(By ‘regular classes’ I assume you mean classes as found in OOP.)
This is another common question. Basically, the two have almost nothing in common except the name. A class in OOP is a combination of fields and methods. Classes are used by creating instances of that class; each instance can store data in its fields, and manipulate that data using its methods.
By contrast, a type class is simply a collection of functions (often also called methods, though there’s pretty much no connection). You can declare an instance of a type class for a data type (again, no connection) by redefining each method of the class for that type, after which you may use the methods with that type. For instance, the Eq class looks like this:
    class Eq a where
        (==) :: a -> a -> Bool
        (/=) :: a -> a -> Bool
And you can define an instance of that class for, say, Bool, by implementing each function:
    instance Eq Bool where
        True == True = True
        False == False = True
        _ == _ = False
        p /= q = not (p == q)
Can anyone at least point me in the direction of a good book on abstract algebra and functional programming?
I must admit that I can’t help with this (and it’s off-topic for Stack Overflow anyway).
Someone recommended I learn Haskell but do I really need to learn Haskell in order to understand functional programming?
No, you don’t — you can learn functional programming from any functional language, including Lisp (particularly Scheme dialects), OCaml, F#, Elm, Scala etc. Haskell happens to be a particularly ‘pure’ functional programming language, and I would recommend it as well, but if you just want to learn and understand functional programming then any one of those will do.

Type classes with laws that contain not equalities/symmetries but inequalities/asymmetries

All of the type classes that I've come across, I think have had laws that establish symmetries by specifying equations. I was wondering though if there are any prominent theoretical or even practical examples of type classes that establish asymmetries, i.e. ones that demand the lack of symmetry? By e.g. specifying <expr1> /= <expr2> or <, or not somePredicate(a, b).
I understand that an inequality can be expressed as an equality with a free variable, e.g. a > b as a = b + k, but I'm thinking the introduction of free variables itself might align with my idea of enforced asymmetry.
What would be the (theoretical) applications of such law? And are there any (runnable) examples of this?
Alternatively: if this can't be considered Haskell enough to be on SO, should this go on CS or CSTheory?

Algebraic laws in general typically are only specified in terms of equational identities, and not disequalities. A useful standpoint from which to think about this is model theory. A theory can be thought of as 1) a collection of symbols, of different arities, so that sentences can be constructed from them (e.g. of arity 0 we might have sequences of numerals, of arity 1 we have negation, and of arity 2 we have addition) and 2) a set of equations that provide relations between sentences constructed from such signatures.
This lets us describe things like various arithmetic theories, groups, rings, modules, etc.
Now a model of a theory is a set of concrete assignments of mathematical objects (numbers, functions, etc) to the elements of the signature, such that the translation of sentences into the elements of the model respects these equations.
Categorically, we often think of a theory as a special sort of category of all possible sentences generated by the signature. The arrows in this category are implication -- sentences which may be generated from others by application of the equational identities. This in turn induces equivalences between all sentences which are the same under the application of the equational identities (this yields the "generators and relations" approach). And in turn, a model is simply a functor from this theory to any other category, though typically Set.
This yields a very nice adjunction between syntax and semantics. The greater the collection of sentences you want to model, the fewer the models you can get, and the more models you have, the smaller the set of sentences that will be satisfied by all of them. (Here I am only sketching the idea rather than filling in lots of important details).
In any case, one consequence of this that people tend to ignore, but that really pays off, is that in such a setting you want a "terminal model" that is the least among all models, just as you want an "initial theory" that admits all models. The terminal (aka trivial) model is the functor that sends everything in the theory to a one-element set and to the unique maps on that set. Lots of very nice properties emerge when you have such things. But note -- to have such things, you must only have equational identities and not "disidentities." Such theories are called Algebraic Theories.
What does this all have to do with Haskell? Well, we can think of the signatures of typeclasses exactly as signatures of algebraic theories, and the laws of them as the equations of such theories. And that's generally how typeclasses are used in Haskell and why they were introduced -- to suit these sorts of situations.
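A minimal sketch of that reading, assuming a made-up class MonoidSig: the class declaration is the signature, the laws are equations between expressions built from it, and each instance is a model. The law functions are ordinary predicates that can be checked on sample values; nothing here is enforced by the compiler.

```haskell
-- Signature of the theory: one constant and one binary operation.
class MonoidSig a where
  unit    :: a
  combine :: a -> a -> a

-- The laws of the theory, written as equations that a model must satisfy.
lawAssoc :: (MonoidSig a, Eq a) => a -> a -> a -> Bool
lawAssoc x y z = combine (combine x y) z == combine x (combine y z)

lawLeftUnit, lawRightUnit :: (MonoidSig a, Eq a) => a -> Bool
lawLeftUnit  x = combine unit x == x
lawRightUnit x = combine x unit == x

-- Three models of the same theory.
instance MonoidSig Int where
  unit    = 0
  combine = (+)

instance MonoidSig [b] where
  unit    = []
  combine = (++)

-- The one-element type satisfies every equational law, whatever it says.
instance MonoidSig () where
  unit        = ()
  combine _ _ = ()
```

A law stated as a disequality, such as requiring combine x x to differ from x for some x, would rule out the () instance, which no purely equational law can do.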
But of course we don't have to do this -- we can have whatever laws we want. But we lose all sorts of nice properties along the way -- and often in fact find that disequalities mean our theory will have very few models, and with weird structure relating them. Since typeclasses are intended to capture common structure between various things, and since non-algebraic theories tend to fix unique(ish) models, then it turns out it is rarely the case that we would want to use disequality relations in typeclass laws. And indeed I can't think of any examples where I've seen it come up.
Here's another way to think of it -- consider a theory with equalities and disequalities both, and then eliminate the disequalities. What remains still admits all the prior models, but also may have a bunch of "unintended" ones. So we don't gain additional reasoning in terms of rewrites -- we just have certain models that are a priori excluded. Furthermore, when one wishes to rule out "unintended" models this is usually because we want to fix a particular "intended" one. And if we want to fix a particular intended model, the question immediately arises -- why not just use that concrete structure, instead of the more general typeclass?

Is complex differentiation of datatypes sensible?

pigworker once asked how to express that a type is infinitely differentiable. This question brought to mind the fact that in complex analysis, a function that is differentiable (on an open set) must be infinitely differentiable (on that set). Is there a way to talk about complex differentiation of datatypes? If so, does a similar theorem hold?

Not really an answer... but this rant is way too long for a comment.
I find it a bit misleading to think complex differentiability just implies infinite differentiability. It's in fact much stronger than that: if a function is complex differentiable, then its derivatives at any point determine the entire function. And because infinite differentiability gives you a full Taylor series, you have an analytic function which is equal to your function, i.e. is your function itself. So, in a sense complex differentiable functions are analytic... because they are.
From a (standard) calculus perspective, the key contrast between real diff'ability and complex diff'ability is that in the reals, there is only one axis along which you can take the limit of difference-quotients (f(x+δ) - f(x))/δ. You merely require that the left limit equals the right limit. But because that's an equality after the limit, this has only an effect locally. (Topologically speaking, the constraint just compares two discrete values, so it doesn't really deal with continuity properties at all.)
OTOH, for complex differentiability we require that the limit of the difference quotient is the same if we approach x from any direction in the entire complex plane. That's an entire continuous degree of freedom constrained. You can then go on to perform topological tricks (Cauchy integrals are essentially that) to “spread” the constraint through the entire domain.
I consider this a bit problematic philosophically. Holomorphic functions aren't really functions at all, as in: they're not so much defined by the entirety of their result values across the domain, as by some way to write them with analytic formulas (i.e. possibly-infinite algebraic expressions / polynomials).
Most mathematicians and physicists apparently like this a lot – such expressions are just the way in which they generally write functions.
I don't, really, like it at all: to me, a function should be a function, something defined by individual values, like field strengths you can measure in space or results you can define in Haskell.
Anyway, I digress...
If we translate this issue from functions on numbers to functors on Haskell types, I suppose the upshot is that complex diff'ability means nothing else but: a type can be written as a (possibly infinite?) ADT polynomial. And how to get infinite differentiability for such ADTs was shown in the post you linked to.
Another spin... perhaps closer to an answer.
These “derivatives” of Haskell types aren't really derivatives in the calculus sense. As in, they aren't motivated by a concept of small-perturbation response analysis†. It so happens that you can mathematically prove, for a very specific class of functions – those defined by an algebraic expression – that the calculus-derivative can again be written in a simple algebraic way (given by the well-known differentiation rules). That means trivially that you can differentiate infinitely often.
The usefulness of this symbolic differentiation also motivates thinking about it as a more abstract operation. And when you're differentiating Haskell types, it is mainly just this algebraic definition you're going after, not the original calculus one.
Which is fine... but once you're doing algebra rather than calculus, it's not very meaningful to distinguish “real” from “complex” – it's actually neither, because you're not handling values but symbolic representations of values. An untyped language, if you will (and indeed, Haskell's type language is still untyped, with everything having kind *).
†Be it with traditional convergent limits or NSA-infinitesimals.
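To show what the algebraic reading looks like in code, here is a sketch of the usual example: the "derivative" of the list type is the type of one-hole contexts, which for lists is a pair of lists. The names ListContext, holes and plug are chosen for this example.

```haskell
-- A one-hole context for a list: the elements before the hole
-- (nearest first) and the elements after it.
data ListContext a = ListContext [a] [a]
  deriving (Eq, Show)

-- All ways of removing one element from a list, each paired with
-- the context it leaves behind.
holes :: [a] -> [(a, ListContext a)]
holes = go []
  where
    go _      []       = []
    go before (x : xs) = (x, ListContext before xs) : go (x : before) xs

-- Plugging an element back into its hole recovers a list.
plug :: a -> ListContext a -> [a]
plug x (ListContext before after) = reverse before ++ x : after
```

For example, holes "abc" yields three element/context pairs, and plugging each element back into its own context gives "abc" again. No limits or perturbations are involved anywhere, only a rewriting of the type's algebraic description.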

What does "pure" mean in "pure functional language"?

Haskell has been called a "pure functional language."
What does "pure" mean in this context? What consequences does this have for a programmer?

In a pure functional language, you can't do anything that has a side effect.
A side effect would mean that evaluating an expression changes some internal state that would later cause evaluating the same expression to have a different result. In a pure functional language you can evaluate the same expression as often as you want with the same arguments, and it would always return the same value, because there is no state to change.
For example, a pure functional language cannot have an assignment operator or do input/output, although for practical purposes, even pure functional languages often call impure libraries to do I/O.

"Pure" and "functional" are two separate concepts, although one is not very useful without the other.
A pure expression is referentially transparent: it can be evaluated any number of times, with identical results each time. This means the expression cannot have any observable side-effects. For example, if a function mutated its arguments, set a variable somewhere, or changed behavior based on something other than its input alone, then that function call is not pure.
A functional programming language is one in which functions are first-class. In other words, you can manipulate functions with exactly the same ease with which you can manipulate all other first-class values. For example, being able to use a "function returning bool" as a "data structure representing a set" would be easy in a functional programming language.
Programming in functional programming languages is often done in a mostly-pure style, and it is difficult to be strictly pure without higher-order function manipulation enabled by functional programming languages.
Haskell is a functional programming language, in which (almost) all expressions are pure; thus, Haskell is a purely functional programming language.
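The "function returning bool as a set" remark can be made concrete with a short sketch. All names below are local to this example and are not taken from any library.

```haskell
-- A set represented by its membership test.
type Set a = a -> Bool

empty :: Set a
empty = const False

singleton :: Eq a => a -> Set a
singleton x = (== x)

-- Set operations are just ways of combining functions.
union, intersection :: Set a -> Set a -> Set a
union s t x        = s x || t x
intersection s t x = s x && t x

complement :: Set a -> Set a
complement s = not . s

member :: a -> Set a -> Bool
member x s = s x
```

With this representation, member 4 (union (singleton 3) even) is True, and even infinite sets such as "all even numbers" cost nothing to represent.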

A pure function is one which has no side effects — it takes a value in and gives a value back. There's no global state that functions modify. A pure functional language is one which forces functions to be pure. Purity has a number of interesting consequences, such as the fact that evaluation can be lazy — since a function call has no purpose but to return a value, then you don't need to actually execute the function if you aren't going to use its value. Thanks to this, things like recursive functions on infinite lists are common in Haskell.
Another consequence is that it doesn't matter in what order functions are evaluated — since they can't affect each other, you can do them in any order that's convenient. This means that some of the problems posed by parallel programming simply don't exist, since there isn't a "wrong" or "right" order for functions to execute.
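A small sketch of the laziness point, using names invented for this example: an infinite list defined in terms of itself, of which only the demanded prefix is computed, and a function whose unused argument is never evaluated.

```haskell
-- An infinite list defined recursively in terms of itself.
fibs :: [Integer]
fibs = 0 : 1 : zipWith (+) fibs (tail fibs)

-- Only the first n elements are ever computed.
firstFibs :: Int -> [Integer]
firstFibs n = take n fibs

-- The argument is never used, so it is never evaluated.
constOne :: a -> Int
constOne _ = 1

-- Passing an expression that would crash if evaluated is harmless here.
stillOne :: Int
stillOne = constOne (error "never evaluated" :: Int)
```

Here firstFibs 8 is [0,1,1,2,3,5,8,13] and stillOne is 1.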

Strictly speaking, a pure functional language is a functional language (i.e. a language where functions are first-class values) where expressions have no side effects. The term “purely functional language” is synonymous.
By this definition, Haskell is not a pure functional language. Any language in which you can write programs that display their result, read and write files, have a GUI, and so on, is not purely functional. Thus no general purpose programming language is purely functional (but there are useful domain-specific purely functional languages: they can typically be seen as embedded languages in some way).
There is a useful relaxed sense in which languages like Haskell and Erlang can be considered purely functional, but languages like ML and Scheme cannot. A language can be considered purely functional if there is a reasonably large, useful and well-characterised subset where side effects are impossible. For example, in Haskell, all programs whose type is not built from IO or another effect-denoting monad are side-effect-free. In Erlang, all programs that don't use IO-inducing libraries or concurrency features are side-effect-free (this is more of a stretch than the Haskell case). Conversely, in ML or Scheme, a side effect can be buried in any function.
By this perspective, the purely functional subset of Haskell can be seen as the embedded language to deal with the behavior inside each monad (of course this is an odd perspective, as almost all the computation is happening in this “embedded” subset), and the purely functional subset of Erlang can be seen as the embedded language to deal with local behavior.
Graham Hutton has a slightly different, and quite interesting, perspective on the topic of purely functional languages:
Sometimes, the term “purely functional” is also used in a broader sense to mean languages that might incorporate computational effects, but without altering the notion of ‘function’ (as evidenced by the fact that the essential properties of functions are preserved.) Typically, the evaluation of an expression can yield a ‘task’, which is then executed separately to cause computational effects. The evaluation and execution phases are separated in such a way that the evaluation phase does not compromise the standard properties of expressions and functions. The input/output mechanisms of Haskell, for example, are of this kind.
I.e. in Haskell, a function has the type a -> b and can't have side effects. An expression of type IO (a -> b) can have side effects, but it's not a function. Thus in Haskell functions must be pure, hence Haskell is purely functional.
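The separation between evaluating an expression and executing the resulting task can be observed directly. In this sketch the names bump, tasks and demo are hypothetical, and an IORef counter stands in for an arbitrary side effect.

```haskell
import Data.IORef (IORef, modifyIORef', newIORef, readIORef)

-- A value of type IO () describes a task; building it runs nothing.
bump :: IORef Int -> IO ()
bump ref = modifyIORef' ref (+ 1)

-- Evaluating this list performs no effects: it is a list of descriptions.
tasks :: IORef Int -> [IO ()]
tasks ref = replicate 3 (bump ref)

-- Effects happen only when the tasks are sequenced into the
-- IO computation that is actually being executed.
demo :: IO (Int, Int)
demo = do
  ref <- newIORef 0
  let pending = tasks ref
  -- Force the list to be evaluated, then read the counter: still 0.
  before <- length pending `seq` readIORef ref
  sequence_ pending
  after <- readIORef ref
  return (before, after)
```

Running demo returns (0, 3): the counter is untouched after the list of tasks has been evaluated, and changes only once the tasks are executed.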

As there cannot be any side effects in pure functional code, testing gets much easier as there is no external state to check or verify. Also, because of this, extending code may become easier.
I lost count of the number of times I had trouble with non-obvious side effects when extending/fixing (or trying to fix) code.

As others have mentioned, the term "pure" in "pure functional programming language" refers to the lack of observable side-effects. For me, this leads straight to the question:
What is a side-effect?
I have seen side-effects explained both as:
1. something that a function does other than simply compute its result
2. something that can affect the result of a function other than the inputs to the function.
If the first definition is the correct one, then any function that does I/O (e.g. writing to a file) cannot be said to be a "pure" function. Since Haskell programs can call functions which cause I/O to be performed, it would seem that Haskell is not a pure functional programming language (as it is claimed to be) according to this definition.
For this and other reasons, I think the second definition is the more useful one. According to the second definition, Haskell can still claim to be a completely pure functional programming language because functions that cause I/O to be performed compute results based only on function inputs. How Haskell reconciles these seemingly conflicting requirements is quite interesting, I think, but I'll resist the temptation to stray any further from answering the actual question.

Amr Sabry wrote a paper about what a pure functional language is. Haskell is by this definition considered pure, if we ignore things like unsafePerformIO. Using this definition also makes ML and Erlang impure. There are subsets of most languages that qualify as pure, but personally I don't think it's very useful to talk about C being a pure language.
Higher-orderness is orthogonal to purity: you can design a pure first-order functional language.

Is there a visual modeling language or style for the functional programming paradigm?

UML is a standard aimed at the modeling of software which will be written in OO languages, and goes hand in hand with Java. Still, could it possibly be used to model software meant to be written in the functional programming paradigm? Which diagrams would be rendered useful given the embedded visual elements?
Is there a modeling language aimed at functional programming, more specifically Haskell? What tools for putting together diagrams would you recommend?
Edited by OP Sept 02, 2009:
What I'm looking for is the most visual, lightest representation of what goes on in the code. Easy to follow diagrams, visual models not necessarily aimed at other programmers. I'll be developing a game in Haskell very soon but because this project is for my graduation conclusion work I need to introduce some sort of formalization of the proposed solution. I was wondering if there is an equivalent to the UML+Java standard, but for Haskell.
Should I just stick to storyboards, written descriptions, non-formalized diagrams (some shallow flow-chart-like images), non-formalized use case descriptions?
Edited by jcolebrand June 21, 2012:
Note that the asker originally wanted a visual metaphor, and now that we've had three years, we're looking for more/better tools. None of the original answers really addressed the concept of "visual metaphor design tool" so ... that's what the new bounty is looking to provide for.

I believe the modeling language for Haskell is called "math". It's often taught in schools.

Yes, there are widely used modeling/specification languages/techniques for Haskell.
They're not visual.
In Haskell, types give a partial specification.
Sometimes, this specification fully determines the meaning/outcome while leaving various implementation choices.
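As a small illustration (my own sketch, not taken from any particular library): for fully polymorphic signatures like the ones below, the type leaves essentially one total implementation, ignoring non-termination and `undefined`. The names `identity`, `first` and `compose` are chosen here just to avoid clashing with the Prelude.

```haskell
-- The only total function of this type returns its argument.
identity :: a -> a
identity x = x

-- Nothing is known about b, so the result must be the first component.
first :: (a, b) -> a
first (x, _) = x

-- The type forces "apply g, then f"; how it is written (explicit lambda,
-- point-free, etc.) is an implementation choice.
compose :: (b -> c) -> (a -> b) -> a -> c
compose f g = \x -> f (g x)
```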
Going beyond Haskell to languages with dependent types, as in Agda & Coq (among others), types are much more often useful as a complete specification.
Where types aren't enough, add formal specifications, which often take a simple functional form.
(Hence, I believe, the answers that the modeling language of choice for Haskell is Haskell itself or "math".)
In such a form, you give a functional definition that is optimized for clarity and simplicity and not at all for efficiency.
The definition might even involve uncomputable operations such as function equality over infinite domains.
Then, step by step, you transform the specification into the form of an efficiently computable functional program.
Every step preserves the semantics (denotation), and so the final form ("implementation") is guaranteed to be semantically equivalent to the original form ("specification").
You'll see this process referred to by various names, including "program transformation", "program derivation", and "program calculation".
The steps in a typical derivation are mostly applications of "equational reasoning", augmented with a few applications of mathematical induction (and co-induction).
Being able to perform such simple and useful reasoning was a primary motivation for functional programming languages in the first place, and they owe their validity to the "denotative" nature of "genuinely functional programming".
(The terms "denotative" and "genuinely functional" are from Peter Landin's seminal paper The Next 700 Programming Languages.)
Thus the rallying cry for pure functional programming used to be "good for equational reasoning", though I don't hear that description nearly as often these days.
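To make the derivation idea concrete, here is a textbook-style sketch (not from the answer's author): a clear but quadratic specification of list reversal, and a linear version obtained from it by equational reasoning plus induction on the list. The names `reverseSpec`, `revCat` and `reverseFast` are hypothetical.

```haskell
-- Specification: obviously correct, but quadratic because of (++).
reverseSpec :: [a] -> [a]
reverseSpec []       = []
reverseSpec (x : xs) = reverseSpec xs ++ [x]

-- Goal: find a direct definition satisfying
--   revCat xs ys = reverseSpec xs ++ ys
--
-- Case []:
--   reverseSpec [] ++ ys = [] ++ ys = ys
-- Case (x : xs):
--   reverseSpec (x : xs) ++ ys
--     = (reverseSpec xs ++ [x]) ++ ys   -- definition of reverseSpec
--     = reverseSpec xs ++ ([x] ++ ys)   -- associativity of (++)
--     = reverseSpec xs ++ (x : ys)      -- definition of (++)
--     = revCat xs (x : ys)              -- specification of revCat
revCat :: [a] -> [a] -> [a]
revCat []       ys = ys
revCat (x : xs) ys = revCat xs (x : ys)

-- Implementation: linear, and equal to the specification by construction,
-- since reverseSpec xs ++ [] = reverseSpec xs.
reverseFast :: [a] -> [a]
reverseFast xs = revCat xs []
```

Every step in the comment replaces equals by equals, which is exactly what makes the final definition trustworthy without re-testing it against the specification.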
In Haskell, denotative corresponds to types other than IO and types that rely on IO (such as STM).
While the denotative/non-IO types are good for correct equational reasoning, the IO/non-denotative types are designed to be bad for incorrect equational reasoning.
A specific version of derivation-from-specification that I use as often as possible in my Haskell work is what I call "semantic type class morphisms" (TCMs).
The idea there is to give a semantics/interpretation for a data type, and then use the TCM principle to determine (often uniquely) the meaning of most or all of the type's functionality via type class instances.
For instance, I say that the meaning of an Image type is as a function from 2D space.
The TCM principle then tells me the meaning of the Monoid, Functor, Applicative, Monad, Contrafunctor, and Comonad instances, as corresponding to those instances for functions.
That's a lot of useful functionality on images with very succinct and compelling specifications!
(The specification is the semantic function plus a list of standard type classes for which the semantic TCM principle must hold.)
And yet I have tremendous freedom of how to represent images, and the semantic TCM principle eliminates abstraction leaks.
If you're curious to see some examples of this principle in action, check out the paper Denotational design with type class morphisms.
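A very reduced sketch of the TCM idea, under the assumption that the representation is simply the semantic model itself (a real library would pick something more efficient). The names `Point`, `Image` and `meaning` are hypothetical, and only `Functor` and `Monoid` are shown.

```haskell
-- A point in 2D space (hypothetical representation).
type Point = (Double, Double)

-- Hypothetical image type; any representation would do, as long as the
-- instances below keep 'meaning' a type class morphism.
newtype Image a = Image (Point -> a)

-- The semantic function: an image means a function from 2D space.
meaning :: Image a -> (Point -> a)
meaning (Image f) = f

-- Functor morphism: meaning (fmap g im) == fmap g (meaning im)
instance Functor Image where
  fmap g (Image f) = Image (g . f)

-- Semigroup morphism: meaning (a <> b) == meaning a <> meaning b
instance Semigroup a => Semigroup (Image a) where
  Image f <> Image g = Image (\p -> f p <> g p)

-- Monoid morphism: meaning mempty == mempty
instance Monoid a => Monoid (Image a) where
  mempty = Image (const mempty)
```

The instances are not free design decisions here: each one is dictated by requiring that `meaning` commute with the corresponding instance for functions.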

We use theorem provers to do formal modelling (with verification), such as Isabelle or Coq. Sometimes we use domain specific languages (e.g. Cryptol) to do the high level design, before deriving the "low level" Haskell implementation.
Often we just use Haskell as the modelling language, and derive the actual implementation via rewriting.
QuickCheck properties also play a part in the design document, along with type and module decompositions.

Yes, Haskell.
I get the impression that programmers using functional languages don't feel the need to simplify their language of choice away when thinking about their design, which is one (rather glib) way of viewing what UML does for you.

I have watched a few video interviews, and read some interviews, with the likes of Erik Meijer and Simon Peyton Jones. It seems that when it comes to modelling and understanding one's problem domain, they use type signatures, especially function signatures.
Sequence diagrams (UML) could be related to the composition of functions.
A static class diagram (UML) could be related to type signatures.
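To illustrate the analogy, here is a toy sketch with made-up domain types (`RawInput`, `Order`, `Invoice` and the pricing rule are all hypothetical). The signatures play the role of the static diagram, and the composition in `process` reads like a sequence of steps.

```haskell
-- "Class diagram": the domain vocabulary, expressed as types.
newtype RawInput = RawInput String
newtype Order    = Order [String] deriving (Eq, Show)
newtype Invoice  = Invoice Int    deriving (Eq, Show)

-- Signatures state what each step consumes and produces.
parseOrder :: RawInput -> Order
parseOrder (RawInput s) = Order (words s)

-- Hypothetical pricing rule: a flat 10 per item.
priceOrder :: Order -> Invoice
priceOrder (Order items) = Invoice (10 * length items)

-- "Sequence diagram": the order of steps is the composition chain,
-- read right to left.
process :: RawInput -> Invoice
process = priceOrder . parseOrder
```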

In Haskell, you model by types.
Just begin with writing your function-, class- and data signatures without any implementation and try to make the types fit. Next step is QuickCheck.
E.g. to model a sort:
import Prelude hiding (Ord, compare)  -- avoid clashing with the Prelude's own Ord
class Ord a where
  compare :: a -> a -> Ordering
sort :: Ord a => [a] -> [a]
sort = undefined
then tests
prop_preservesLength l = (length l) == (length $ sort l)
...
and finally the implementation ...
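For completeness, one way the last two steps might look, as a sketch only. It uses the Prelude's `Ord` and a hypothetical name `sortModel` (to avoid clashing with `Data.List.sort`), and writes the properties as plain predicates so that they need nothing beyond `base`; with QuickCheck installed, each can be passed to `quickCheck` from `Test.QuickCheck` unchanged.

```haskell
import Data.List (insert)

-- Implementation: insertion sort, built from Data.List.insert.
sortModel :: Ord a => [a] -> [a]
sortModel = foldr insert []

-- Sorting must not add or drop elements.
prop_preservesLength :: [Int] -> Bool
prop_preservesLength l = length l == length (sortModel l)

-- Every element must be <= its successor in the result.
prop_ordered :: [Int] -> Bool
prop_ordered l = and (zipWith (<=) s (drop 1 s))
  where
    s = sortModel l

-- Sorting an already sorted list changes nothing.
prop_idempotent :: [Int] -> Bool
prop_idempotent l = sortModel (sortModel l) == sortModel l
```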

Although I can't recommend using it (it appears not to be available for download), the HOPS system visualizes term graphs, which are often a convenient representation of functional programs.
It may also be considered a design tool, as it supports documenting programs as well as constructing them; I believe it can also step through the rewriting of the terms if you want it to, so you can see them unfold.
Unfortunately, I believe it is no longer actively developed.

I realize I'm late to the party, but I'll still give my answer in case someone would find it useful.
I think I'd go for systemic methodologies of the likes of SADT/IDEF0.
https://en.wikipedia.org/wiki/Function_model
https://en.wikipedia.org/wiki/Structured_Analysis_and_Design_Technique
Such diagrams can be made with the Dia program, which is available on Linux, Windows and macOS.

You can use a data flow process network model, as described in Realtime Signal Processing: Dataflow, Visual, and Functional Programming by Hideki John Reekie.
For example for code like (Haskell):
fact n | n == 0    = 1
       | otherwise = n * fact (n - 1)
The visual representation would be:

What would be the point in modelling Haskell with Maths? I thought the whole point of Haskell was that it related so closely to Maths that Mathematicians could pick it up and run with it. Why would you translate a language into itself?
When working with another functional programming language (F#), I used diagrams on a whiteboard describing the large blocks and then modelled the system in an OO way using UML, using classes. There was a slight mismatch with the building blocks in F# (I split the classes up into data structures and the functions that acted upon them). But for understanding from a business perspective it worked a treat. I would add that the problem was business/Web oriented, and I don't know how well the technique would work for something a bit more financial. I think I would probably capture the functions as objects without state, and they should fit in nicely.
It all depends on the domain you're working in.

I use USL - Universal Systems Language. I'm learning Erlang and I think it's a perfect fit.
Too bad the documentation is very limited and nobody uses it.
More information here.
