Is there a visual modeling language or style for the functional programming paradigm? - programming-languages

UML is a standard aimed at the modeling of software which will be written in OO languages, and goes hand in hand with Java. Still, could it possibly be used to model software meant to be written in the functional programming paradigm? Which diagrams would be rendered useful given the embedded visual elements?
Is there a modeling language aimed at functional programming, more specifically Haskell? What tools for putting together diagrams would you recommend?
Edited by OP Sept 02, 2009:
What I'm looking for is the most visual, lightest representation of what goes on in the code. Easy to follow diagrams, visual models not necessarily aimed at other programmers. I'll be developing a game in Haskell very soon but because this project is for my graduation conclusion work I need to introduce some sort of formalization of the proposed solution. I was wondering if there is an equivalent to the UML+Java standard, but for Haskell.
Should I just stick to storyboards, written descriptions, non-formalized diagrams (some shallow flow-chart-like images), non-formalized use case descriptions?
Edited by jcolebrand June 21, 2012:
Note that the asker originally wanted a visual metphor, and now that we've had three years, we're looking for more/better tools. None of the original answers really addressed the concept of "visual metaphor design tool" so ... that's what the new bounty is looking to provide for.

I believe the modeling language for Haskell is called "math". It's often taught in schools.

Yes, there are widely used modeling/specification languages/techniques for Haskell.
They're not visual.
In Haskell, types give a partial specification.
Sometimes, this specification fully determines the meaning/outcome while leaving various implementation choices.
Going beyond Haskell to languages with dependent types, as in Agda & Coq (among others), types are much more often useful as a complete specification.
Where types aren't enough, add formal specifications, which often take a simple functional form.
(Hence, I believe, the answers that the modeling language of choice for Haskell is Haskell itself or "math".)
In such a form, you give a functional definition that is optimized for clarity and simplicity and not all for efficiency.
The definition might even involve uncomputable operations such as function equality over infinite domains.
Then, step by step, you transform the specification into the form of an efficiently computable functional program.
Every step preserves the semantics (denotation), and so the final form ("implementation") is guaranteed to be semantically equivalent to the original form ("specification").
You'll see this process referred to by various names, including "program transformation", "program derivation", and "program calculation".
The steps in a typical derivation are mostly applications of "equational reasoning", augmented with a few applications of mathematical induction (and co-induction).
Being able to perform such simple and useful reasoning was a primary motivation for functional programming languages in the first place, and they owe their validity to the "denotative" nature of "genuinely functional programming".
(The terms "denotative" and "genuinely functional" are from Peter Landin's seminal paper The Next 700 Programming languages.)
Thus the rallying cry for pure functional programming used to be "good for equational reasoning", though I don't hear that description nearly as often these days.
In Haskell, denotative corresponds to types other than IO and types that rely on IO (such as STM).
While the denotative/non-IO types are good for correct equational reasoning, the IO/non-denotative types are designed to be bad for incorrect equational reasoning.
A specific version of derivation-from-specification that I use as often as possible in my Haskell work is what I call "semantic type class morphisms" (TCMs).
The idea there is to give a semantics/interpretation for a data type, and then use the TCM principle to determine (often uniquely) the meaning of most or all of the type's functionality via type class instances.
For instance, I say that the meaning of an Image type is as a function from 2D space.
The TCM principle then tells me the meaning of the Monoid, Functor, Applicative, Monad, Contrafunctor, and Comonad instances, as corresponding to those instances for functions.
That's a lot of useful functionality on images with very succinct and compelling specifications!
(The specification is the semantic function plus a list of standard type classes for which the semantic TCM principle must hold.)
And yet I have tremendous freedom of how to represent images, and the semantic TCM principle eliminates abstraction leaks.
If you're curious to see some examples of this principle in action, check out the paper Denotational design with type class morphisms.

We use theorem provers to do formal modelling (with verification), such as Isabelle or Coq. Sometimes we use domain specific languages (e.g. Cryptol) to do the high level design, before deriving the "low level" Haskell implementation.
Often we just use Haskell as the modelling language, and derive the actual implementation via rewriting.
QuickCheck properties also play a part in the design document, along with type and module decompositions.

Yes, Haskell.
I get the impression that programmers using functional languages don't feel the need to simplify their language of choice away when thinking about their design, which is one (rather glib) way of viewing what UML does for you.

I have watched a few video interviews, and read some interviews, with the likes of erik meijer and simon peyton-jones. It seems as when it comes to modelling and understanding ones problem domain, they use type signatures, especially function signatures.
Sequence diagrams (UML) could be related to the composition of functions.
A static class diagram (UML) could be related to type signatures.

In Haskell, you model by types.
Just begin with writing your function-, class- and data signatures without any implementation and try to make the types fit. Next step is QuickCheck.
E.g. to model a sort:
class Ord a where
compare :: a -> a -> Ordering
sort :: Ord a => [a] -> [a]
sort = undefined
then tests
prop_preservesLength l = (length l) == (length $ sort l)
...
and finally the implementation ...

Although not a recommendation to use (as it appears to be not available for download), but the HOPS system visualizes term graphs, which are often a convenient representation of functional programs.
It may be also considered a design tool as it supports documenting the programs as well as constructing them; I believe it can also step through the rewriting of the terms if you want it to so you can see them unfold.
Unfortunately, I believe it is no longer actively developed though.

I realize I'm late to the party, but I'll still give my answer in case someone would find it useful.
I think I'd go for systemic methodologies of the likes of SADT/IDEF0.
https://en.wikipedia.org/wiki/Function_model
https://en.wikipedia.org/wiki/Structured_Analysis_and_Design_Technique
Such diagrams can be made with the Dia program that is available on both Linux, Windows and MacOS.

You can a data flow process network model as described in Realtime Signal Processing: Dataflow, Visual, and Functional Programming by Hideki John Reekie
For example for code like (Haskell):
fact n | n == 0 = 1
| otherwise = n * fact (n - 1)
The visual representation would be:

What would be the point in modelling Haskell with Maths? I thought the whole point of Haskell was that it related so closely to Maths that Mathematicians could pick it up and run with it. Why would you translate a language into itself?
When working with another functional programming language (f#) I used diagrams on a white board describing the large blocks and then modelled the system in an OO way using UML, using classes. There was a slight miss match in the building blocks in f# (split the classes up into data structures and functions that acted upon them). But for the understanding from a business perspective it worked a treat. I would add mind that the problem was business/Web oriented and don't know how well the technique would work for something a bit more financial. I think I would probably capture the functions as objects without state and they should fit in nicely.
It all depends on the domain your working in.

I use USL - Universal Systems Language. I'm learning Erlang and I think it's a perfect fit.
Too bad the documentation is very limited and nobody uses it.
More information here.

Related

What does "a monad is a model of computation" mean

What does it mean exactly when people say "a monad is a model of computation"? Does this mean computation in the sense of turing completeness? If so, how?
Clarification: This question is not about explaining monads but what people mean with "model of computation" in this context and how this relates to monads. See towards the end of this answer for a typical use of this phrase.
In my understanding a turing machine, the theory of recursive functions, lambda calculus etc. are all models of computation and I cannot see how a monad would relate to that if at all.
The idea of monads as models of computation can be traced back to the work of Eugenio Moggi. Among Haskell practitioners, the best known paper by Moggi on this matter is Notions of computations as monads (1991). Relevant quotes include:
The [lambda]-calculus is considered a useful mathematical tool in the study of programming languages, since programs can be identified with [lambda]-terms. However, if one goes further and uses [beta][eta]-conversion to prove equivalence of programs, then a gross simplification is introduced (programs are identified with total functions from values to values) that may jeopardise the applicability of theoretical results, In this paper we introduce calculi based on a categorical semantics for computations, that provide a correct basis for proving equivalence of programs for a wide range of notions of computation. [p. 1]
[...]
We do not take as a starting point for proving equivalence of programs the theory of [beta][eta]-conversion, which identifies the denotation of a program (procedure) of type A -> B with a total function from A to B, since this identification wipes out completely behaviours such as non-termination, non-determinism, and side-effects, that can be exhibited by real programs. Instead, we proceed as follows:
We take category theory as a general theory of functions and develop on top a categorical semantics of computations based on monads. [...] [p. 1]
[...]
The basic idea behind the categorical semantics below is that, in order to interpret a programming language in a category [C], we distinguish the object A of values (of type A) from the object TA of computations (of type A), and take as denotations of programs (of type A) the elements of TA. In particular, we identify the type A with the object of values (of type A) and obtain the object of computations (of type A) by applying an unary type-constructor T to A. We call T a notion of computation, since it abstracts away from the type of values computations may produce. There are many choices for TA corresponding to different notions of computations. [pp. 2-3]
[...]
We have identified monads as important to modeling notions of computations, but computational monads seem to have additional properties; e.g., they have a tensorial strength and may satisfy the mono requirement. It is likely that there are other properties of computational monads still to be identified, and there is no reason to believe that such properties have to be found in the literature on monads. [p. 27 -- thanks danidiaz]
A related older paper by Moggi, Computational lambda-calculus and monads (1989 -- thanks michid for the reference), speaks literally of "computational model[s]":
A computational model is a monad (T;[eta];[mu]) satisfying the mono requirement: [eta-A] is a mono for every A [belonging to] C.
There is an alternative description of a monad (see[7]), which is easier to justify computationally. [...] [p. 2]
This particular bit of terminology was dropped in the Notions of computations as monads, as Moggi sharpened the focus of his presentation on the "alternative description" (namely, Kleisli triples, which are composed by, in Haskell parlance, a type constructor, return and bind). The essence, though, remain the same throughout.
Philip Wadler presents the idea with a more practical bent in Monads for functional programming (1992):
The use of monads to structure functional programs is described. Monads provide a convenient framework for simulating effectsfound in other languages, such as global state, exception handling, out-put, or non-determinism. [p. 1]
[...]
Pure functional languages have this advantage: all flow of data is made explicit.And this disadvantage: sometimes it is painfully explicit.
A program in a pure functional language is written as a set of equations. Explicit data flow ensures that the value of an expression depends only on its free variables. Hence substitution of equals for equals is always valid, making such programs especially easy to reason about. Explicit data flow also ensures that the order of computation is irrelevant, making such programs susceptible to lazy evaluation.
It is with regard to modularity that explicit data flow becomes both a blessing and a curse. On the one hand, it is the ultimate in modularity. All data in and all data out are rendered manifest and accessible, providing a maximum of flexibility. On the other hand, it is the nadir of modularity. The essence of an algorithm can become buried under the plumbing required to carry data from its point of creation to its point of use. [p. 2]
[...]
Say it is desired to add error checking, so that the second example above returns a sensible error message. In an impure language, this is easily achieved with the use of exceptions.
In a pure language, exception handling may be mimicked by introducing a type to represent computations that may raise an exception. [pp. 3 -4 -- note this is before monads are introduced as an unifying abstraction.]
[...]
Each of the variations on the interpreter has a similar structure, which may be abstracted to yield the notion of a monad.
In each variation, we introduced a type of computations. Respectively, M represented computations that could raise exceptions, act on state, and generate output. By now the reader will have guessed that M stands for monad. [p. 6]
This is one of the roots of the usage of "computation" to refer to monadic values.
A significant body of later literature makes use of the concept of computation in this manner. For instance, this is the opening passage of Notions of Computation as Monoids by Exequiel Rivas and Mauro Jaskelioff (2014 -- thanks danidiaz for the suggestion):
When constructing a semantic model of a system or when structuring computer code,there are several notions of computation that one might consider. Monads (Moggi, 1989; Moggi, 1991) are the most popular notion, but other notions,such as arrows (Hughes, 2000) and, more recently, applicative functors (McBride & Paterson, 2008) have been gaining widespread acceptance. Each of these notions of computation has particular characteristics that makes them more suitable for some tasks than for others. Nevertheless, there is much to be gained from unifying all three different notions under a single conceptual framework. [p. 1]
Another good example is Comonadic notions of computation by Tarmo Uustalu and Varmo Vene (2000):
Since the seminal work by Moggi in the late 80s, monads, more precisely, strong monads, have become a generally accepted tool for structuring effectful notions of computation, such as computation with exceptions, output, computation using an environment, state-transforming, nondeterministic and probabilistic computation etc. The idea is to use a Kleisli category as the category of impure, effectful functions, with the Kleisli inclusion giving an embedding of the pure functions from the base category. [...] [p. 263]
[...]
The starting-point in the monadic approach to (call-by-value) effectful computation is the idea that impure, effectful functions from A to B must be nothing else than pure functions from A to TB. Here pure functions live in a base category C and T is an endofunctor on C that describes the notion of effect of interest; it is useful to think of TA as the type of effectful computations of values of a given type A.
For this to work, impure functions must have identities and compose. Therefore T cannot merely be a functor, but must be a monad. [p. 265]
Such uses of "computation" fit the usual computer science notion of models of computation (see danidiaz's answer for more on that). In the informal functional programming literature, allusions to monads as models of computation have varying degrees of precision. Still, they generally draw from, or at least are offshoots of, a rigorous idea.
Nothing. It doesn't mean anything. It's the output of someone struggling to find metaphors which make monads into something they already know. It almost means something. "It is possible to construct models of computation which form monads," for instance, is a meaningful statement. But the difference is significant. "Monads are models of computation" is an attempt to force a broad abstraction into a narrow interpretation. The other specifies that you can work with a broader abstraction for one use case.
Be very wary of reductive explanations. Do you think that an entire community of developers would keep using unfamiliar terminology if familiar terminology communicated the same thing? The term Monad has stuck around for 20 years in a language community that rapidly invents and discards abstractions as it searches for improvements. The only way that can happen is if it communicates something useful and precise.
It's just hard to write an explanation of the application of the idea to programming that makes any sense to people who don't know enough of the language to understand the constructs in use. If you aren't comfortable with at least higher-kinded types, type classes, and higher-order functions there's no way to understand what the notation is saying.
Learning prerequisite ideas will help. Practice writing code will help. Looking at how (>>=) works for various concrete types will help. Struggling through learning how to use a library like Parsec (or modern descendants like megaparsec) will help.
Trying to force the idea to match something you already know via metaphor will not.
Expanding a little on #duplode's answer, I think that when talking about computation, "model" can have at least two slightly different meanings.
One is model in the sense of the Church–Turing thesis. Here a model is a way of performing computations that is capable of expressing any algorithm. So turing machines, lambda calculus, post correspondence systems... are all models.
Another is model in the sense of programming language semantics. The idea is that we consider programs as composable syntactical structures, and we want them to "mean" something, ideally in a way that lets us determine the meaning of a composition from the meaning of the elements. In this sense, lambda calculus has models.
Now, one kind of semantics is denotational semantics, in which the meaning we assign to a program is some kind of mathematical object. For a trivial example, consider binary numbers. Here the "programs" are strings of 0s and 1s, regarded as mere symbols. And the "model" would be natural numbers, along with a function which maps each string of symbols to the corresponding natural number.
Sometimes these denotations of programs are expressed in terms of category theory. This is the context of Moggi's papers: he is making use of machinery from category theory—like monads—to map programming language concepts like exceptions, continuations, input/output... into a mathematical model. Monads become a convenient way of structuring the mathematical universe of program meanings.

What is the practical use for laziness as a built-in language feature?

It's fairly obvious why a functional programming language that wants to be lazy needs to be pure. I'm looking at the reverse question: if a language wants to be pure, is there a big advantage in being lazy? One argument, made by one of the designers of Haskell, is that it removes temptation; maybe, but I'm trying to weigh up the more concrete advantages.
Given that you want to do functional programming, what are the use cases where built-in laziness lets you express things more clearly, simply or concisely?
Stated simply: Why is laziness so important that you'd want to build it into the language?
(I'm looking for use cases more oriented towards an application rather than a demo - I know you can do things like producing an infinite list of prime numbers by filtering an infinite list of natural numbers, but who writes that ten times 'fore lunch...)
"Nothing is evaluated until it is needed at another place" is a simplified metaphor which doesn't cover all aspects of lazy evaluation (e.g. it doesn't mention the strictness phenomena).
From theoretical standpoint, there are 3 ways to go when designing a pure language (of course if it's based on some kind of lambda calculus and not on more exotic evaluation models): strict, non-strict and total.
Each of them has its advantages and disadvantages, so you need to read corresponding research papers.
Total languages are most pure of the three. In the other two the non-termination can be seen as a side effect, so strictness and totality analysers must be built to keep an implementation efficient. Both analyses are undecidable, so the analyzers can never be complete.
However, the total languages are least expressive: it's impossible for a total language to be Turing complete. A frequent approach to get good enough expressiveness is to have a built-in proof system for well-founded recursion, which is not much easier to build than the analyzers for non-total languages.
From practical standpoint, non-strict semantics lets you more easily define control abstractions, as control structures are essentially non-strict. In a strict language you still need some places with non-strict semantics. E.g. if construct has non-strict semantic even in strict languages.
So if your language is strict, control structures are a special case. In contrast, a non-strict language can be uniformly non-strict - it doesn't have an inherent need in strict constructs.
As for "who writes that ten times 'fore lunch" - anyone who uses Haskell for their projects does. I think developing a non-toy project using a language (a non-strict language in your case) is a best way to grasp its advantages and disadvantages.
Below are a few generic usecases for laziness illustrated by non-toy examples:
Cases when control flow is hard to predict. Think of attribute grammars when without laziness you have to perform a topological sort on attributes to resolve the dependensies. Re-sorting your code every time the dependency graph is changed is not practical. In Haskell you can implement the attribute grammar formalism without an explicit sorting, and there are at least two actual implementations on Hackage. The attribute grammars have wide application in compiler construction.
The "generate and search" approach to solve many optimizaton problems. In a strict language you have to interleave generation and search, in Haskell you just compose separate generation and searching functions, and your code remains syntactically modular, but interleaved at runtime. Think of the traveling salesman problem (TSP), when you generate all possible tours and then search through them using a branch-and-bound algorithm. Note that branch an bound algorithms only inspects certain first cities of a tour, only the necessary parts of routes are generated. The TSP has several applications even in its purest formulation, such as planning, logistics, and the manufacture of microchips. Slightly modified, it appears as a sub-problem in many areas, such as DNA sequencing.
Lazy code has non-modular control flow, so a single function can have many possible control flows depending on the environment it executes in. This phenomena can be seen as some kind of 'control flow polymorphism', so lazy control flow abstractions are more generic than their strict counterparts, and a standard library of higher-order functions is much more useful in a lazy language. Think of Python generators, loops and list iterators: in Haskell list functions cover all three usecases, with control flow adapting to different usage scenarios because of laziness. It is not limited to lists - think of Data.Arrow and iteratees, lazy and strict versions of State monad etc. Also note that non-modular control flow is both an advantage and disadvantage, as it makes reasoning about performance harder.
Lazy possibly infinite data structures are useful beyond toy examples. See works of Conal Elliott on memoizing higher order functions using tries. Infinite data structures appear as infinite search spaces (see 2), infinite loops and never-exhausting generators in Python sense (see 3).
Mac OS X's Core Image is a good practical example of lazy evaluation.
Basically, Core Image lets you create a directed acyclic graph of image generators and filters. No evaluation actually takes place until the last step in the process: materialization. When you request to materialize a Core Image graph, the final image frame is propagated backwards through the graph's transformations, thus minimizing the quantity of actual pixel values that need to be evaluated.
There's an extensive discussion of this point in Hughes's classic Why Functional Programming Matters. Therein, Hughes argues that laziness allows for improved modularity, using a number of accessible examples.

What's the next step to learning Haskell after monads?

I've been gradually learning Haskell, and even feel like I've got a hang of monads. However, there's still a lot of more exotic stuff that I barely understand, like Arrows, Applicative, etc. Although I'm picking up bits and pieces from Haskell code I've seen, it would be good to find a tutorial that really explains them wholly. (There seem to be dozens of tutorials on monads.. but everything seems to finish straight after that!)
Here are a few of the resources that I've found useful after "getting the hang of" monads:
As SuperBloup noted, Brent Yorgey's Typeclassopedia is indispensable (and it does in fact cover arrows).
There's a ton of great stuff in Real World Haskell that could be considered "after monads": applicative parsing, monad transformers, and STM, for example.
John Hughes's "Generalizing Monads to Arrows" is a great resource that taught me as much about monads as it did about arrows (even though I thought that I already understood monads when I read it).
The "Yampa Arcade" paper is a good introduction to Functional Reactive Programming.
On type families: I've found working with them easier than reading about them. The vector-space package is one place to start, or you could look at the code from Oleg Kiselyov and Ken Shan's course on Haskell and natural language semantics.
Pick a couple of chapters of Chris Okasaki's Purely Functional Data Structures and work through them in detail.
Raymond Smullyan's To Mock a Mockingbird is a fantastically accessible introduction to combinatory logic that will change the way you write Haskell.
Read Gérard Huet's Functional Pearl on zippers. The code is OCaml, but it's useful (and not too difficult) to be able to translate OCaml to Haskell in your head when working through papers like this.
Most importantly, dig into the code of any Hackage libraries you find yourself using. If they're doing something with syntax or idioms or extensions that you don't understand, look it up.
Regarding type classes:
Applicative is actually simpler than Monad. I've recently said a few things about it elsewhere, but the gist is that it's about enhanced Functors that you can lift functions into. To get a feel for Applicative, you could try writing something using Parsec without using do notation--my experience has been that applicative style works better than monadic for straightforward parsers.
Arrows are a very abstract way of working with things that are sort of like functions ("arrows" between types). They can be difficult to get your mind around until you stumble on something that's naturally Arrow-like. At one point I reinvented half of Control.Arrow (poorly) while writing interactive state machines with feedback loops.
You didn't mention it, but an oft-underrated, powerful type class is the humble Monoid. There are lots of places where monoid-like structure can be found. Take a look at the monoids package, for instance.
Aside from type classes, I'd offer a very simple answer to your question: Write programs! The best way to learn is by doing, so pick something fun or useful and just make it happen.
In fact, many of the more abstract concepts--like Arrow--will probably make more sense if you come back to them later and find that, like me, they offer a tidy solution to a problem you've encountered but hadn't even realized could be abstracted out.
However, if you want something specific to shoot for, why not take a look at Functional Reactive Programming--this is a family of techniques that have a lot of promise, but there are a lot of open questions of what the best way to do it is.
Typeclasses like Monad, Applicative, Arrow, Functor are great and all, and even more great for changing how you think about code than necessarily the convenience of having functions generic over them. But there's a common misconception that the "next step" in Haskell is learning about more typeclasses and ways of structuring control flow. The next step is in deciding what you want to write, and trying to write it, exploring what you need along the way.
And even if you understand Monads, that doesn't mean you've scratched the surface of what you can do with monadically structured code. Play with parser combinator libraries, or write your own. Explore why applicative notation is sometimes easier for them. Explore why limiting yourself to applicative parsers might be more efficient.
Look at logic or math problems and explore ways of implementing backtracking -- depth-first, breadth-first, etc. Explore the difference between ListT and LogicT and ChoiceT. Take a look at continuations.
Or do something completely different!
Far and away the most important thing you can do is explore more of Hackage. Grappling with the various exotic features of Haskell will perhaps let you find improved solutions to certain problems, while the libraries on Hackage will vastly expand your set of tools.
The best part about the Haskell ecosystem is that you get to balance learning surgically precise new abstraction techniques with learning how to use the giant buzz saws available to you on Hackage.
Start writing code. You'll learn necessary concepts as you go.
Beyond the language, to use Haskell effectively, you need to learn some real-world tools and techniques. Things to consider:
Cabal, a tool to manage dependencies, build and deploy Haskell applications*.
FFI (Foreign Function Interface) to use C libraries from your Haskell code**.
Hackage as a source of others' libraries.
How to profile and optimize.
Automatic testing frameworks (QuickCheck, HUnit).
*) cabal-init helps to quick-start.
**) Currently, my favourite tool for FFI bindings is bindings-DSL.
As a single next step (rather than half a dozen "next steps"), I suggest that you learn to write your own type classes. Here are a couple of simple problems to get you started:
Writing some interesting instance declarations for QuickCheck. Say for example that you want to generate random trees that are in some way "interesting".
Move on to the following little problem: define functions /\, \/, and complement ("and", "or", & "not") that can be applied not just to Booleans but to predicates of arbitrary arity. (If you look carefully, you can find the answer to this one on SO.)
You know all you need to go forth and write code. But if you're looking for more Haskell-y things to learn about, may I suggest:
Type families. Very handy feature. It basically gives you a way to write functions on the level of types, which is handy when you're trying to write a function whose parameters are polymorphic in a very precise way. One such example:
data TTrue = TTrue
data FFalse = FFalse
class TypeLevelIf tf a b where
type If tf a b
weirdIfStatement :: tf -> a -> b -> tf a b
instance TypeLevelIf TTrue a b where
type If TTrue a b = a
weirdIfStatement TTrue a b = a
instance TypeLevelIf FFalse a b where
type If FFalse a b = b
weirdIfStatement FFalse a b = a
This gives you a function that behaves like an if statement, but is able to return different types based on the truth value it is given.
If you're curious about type-level programming, type families provide one avenue into this topic.
Template Haskell. This is a huge subject. It gives you a power similar to macros in C, but with much more type safety.
Learn about some of the leading Haskell libraries. I can't count how many times parsec has enabled me to write an insanely useful utility quickly. dons periodically publishes a list of popular libraries on hackage; check it out.
Contribute to GHC!
Write a haskell compiler :-).

Language features helpful for writing quines (self-printing programs)?

OK, for those who have never encountered the term, a quine is a "self-replicating" computer program. To be more specific, one which - upon execution - produces a copy of its own source code as its only output.
The quines can, of course, be developed in many programming languages (but not all); but some languages are obviously more suited to producing quines than others (to clearly understand the somewhat subjective-sounding "more suited", look at a Haskell example vs. C example in the Wiki page - and I provide my more-objective definition below).
The question I have is, from programming language perspective, what language features (either theoretical design ones or syntax sugar) make the language more suitable/helpful for writing quines?
My definition of "more suitable" is "quines are easier to write" and "are shorter/more readable/less obfuscated". But you're welcome to add more criteria that are at least somewhat objective.
Please note that this question explicitly excludes degenerate cases, like a language which is designed to contain "print_a_quine" primitive.
I am not entirely sure, so correct me if anyone of you knows better.
I agree with both other answers, going further by explaining, that a quine is this:
Y g
where Y is a Y fixed-point combinator (or any other fixed-point combinator), which means in lambda calculus:
Y g = g(Y g)
now, it is quite apparent, that we need the code to be data and g be a function which will print its arguments.
So to summarize we need for constructing such a quines functions, printing function, fixed-point combinator and call-by-name evaluation strategy.
The smallest language that satisfies this conditions is AFAIK Zot from the Iota and Jot family.
Languages like the Io Programming Language and others allow the treating of code as data. In tree walking systems, this typically allows the language implementer to expose the abstract syntax tree as a first class citizen. In the case of Io, this is what it does. Being object oriented, the AST is modelled around Message objects, and a special sentinel is created to represent the currently executing message; this sentinel is called thisMessage. thisMessage is a full Message like any other, and responds to the print message, which prints it to the screen. As a result, the shortest quine I've ever been able to produce in any language, has come from Io and looks like this:
thisMessage print
Anyway, I just couldn't help but sharing this with you on this subject. The above certainly makes writing quines easy, but not doing it this way certainly doesn't preclude easily creating a quine.
I'm not sure if this is useful answer from a practical point of view, but there is some useful theory in computability theory. In particular fixed points and Kleene's recursion theorem can be used for writing quines. Apparently, the theory can be used for writing quine in LISP (as the wikipedia page shows).

What technique in functional programming is difficult to learn but useful afterwards?

This question is of course inspired by Monads in Haskell.
wrapping my head around continuation passing style has helped my javascript coding a lot
I would say First-class functions.
In computer science, a programming
language is said to support
first-class functions (or function
literals) if it treats functions as
first-class objects. Specifically,
this means that the language supports
constructing new functions during the
execution of a program, storing them
in data structures, passing them as
arguments to other functions, and
returning them as the values of other
functions. This concept doesn't cover
any means external to the language and
program (metaprogramming), such as
invoking a compiler or an eval
function to create a new function.
Do you want to measure the usefulness in connection with functional-programming itself or programming in general?
In general, the positive experience of functional programming doesn't result from particular techniques but from the way it changes your thinking -
Holding immutable data
Formulating declaratively (recursion, pattern-matching)
Treating functions as data
So I'd say that functional programming is the answer to your question itself.
But to give a more specific answer too, I'd vote for functional abstraction mechanisms like
monads
arrows
continuation-passing-style
zippers
higher-order-functions
generics + typeclasses.
As already said, they are very abstract things on the first view, but once you have understood them, they are extremely cool and valueable techniques to write concise, error-safe and last but not least highly reusable code.
Compare the following (Pseudocode):
// Concrete
def sumList(Data : List[Int]) = ...
// Generic
def sumGeneric[C : Collection[T], T : Num](Data : C) = ...
The latter might be somewhat unintuitive compared with the first definition, but it allows you to work with any collection and numeric type in general!
All in all, many modern (mainstream) languages have discovered such benefits and introduced very functional features like lambda functios or Linq. Having understood these techniques will also improve writing code in this languages.
One from the "advanced" department: Programming with phantom types (sometimes also called indexed types). It's admittedly not a "standard" technique in functional programming but not entirely esoteric either, and it's something to keep your brain busy for awhile (you asked for something difficult, right? ;)).
In a nutshell, it is about parameterizing types to encode and statically enforce certain properties at compile time. One of the standard examples is the vector addition function that statically ensures that given two vectors of length N and M will return a vector of length N+M or otherwise you get a compile-time error. Yes, there are more interesting applications.
These techniques are not quite as useful in C++ as they are in a proper functional programming language, but so far I've managed to sneak some of this stuff in all of my recent projects at work to a varying degree, most recently in a C++ EDSL context where it worked out really well. You don't necessarily have to encode fancy stuff, learning this helped me catching the situations where a few type tags can reduce the verbosity of an EDSL or allowed a cleaner syntax, for example.
Admittedly, the usefulness is somewhat restricted by language support and what you're trying to achieve.
Some starters:
Generic and Indexed Type (slides with some brief applications overview)
Fun with Phantom Types
The Kennedy and Russo paper mentioned in the slides is Generalized Algebraic Data Types
and Object Oriented Programming and puts some of this stuff into the context of C#/Java.
Chapter 3 in Dave Abraham's book C++ Template Metaprogramming is available online as sample chapter and uses these techniques in C++ for dimensional analysis.
A practical FP project using phantom types is HaskellDB.
I would say that Structural typing in OCaml is particularly rewarding.
recursion. Difficult to wrap your head around it at times
The concept of higher-order functions, lambda functions and the power of generic algorithms that are easy to combine were very beneficial for me. I'm always excited when I see what I can do with a fold in haskell.
Likewise my programming in C# has changed a lot (to the better, I hope) since I got into functional programming (haskell specifically).

Resources