I once asked a question on haskell-beginners about whether to use data/newtype or a typeclass. In my particular case it turned out that no typeclass was required. Additionally, Tom Ellis gave me a brilliant piece of advice on what to do when in doubt:
The simplest way of answering this which is mostly correct is:
use data
I know that typeclasses can make a few things a bit prettier, but not much, AFAIK. It also strikes me that typeclasses are mostly used for brain-stem stuff, whereas in newer code, new typeclasses hardly ever get introduced and everything is done with data/newtype.
Now I wonder if there are cases where typeclasses are absolutely required and things could not be expressed with data/newtype?
Answering a similar question on Stack Overflow, Gabriel Gonzalez said:
Use type classes if:
There is only one correct behavior per given type
The type class has associated equations (i.e. "laws") that all instances must satisfy
Hmm ..
Or are typeclasses and data/newtype somewhat competing concepts which coexist for historical reasons?
I would argue that typeclasses are an essential part of Haskell.
They are the part of Haskell that makes it the easiest language I know of to refactor, and they are a great asset to your being able to reason about the correctness of code.
So, let's talk about dictionary passing.
Now, any sort of dictionary passing is a big improvement on the state of affairs in traditional object-oriented languages. We know how to do OOP with vtables in C++. However, the vtable is 'part of the object' in OOP languages. Fusing the vtable with the object forces your code into a form with a rigid discipline about who can extend the core types with new features; it's really only the original author of the class who has to incorporate all the things others want to bake into their type. This leads to "lava flow code" and all sorts of other design antipatterns, etc.
Languages like C# give you the ability to hack in extension methods to fake new stuff, and "traits" in languages like Scala and multiple inheritance in other languages let you delegate some of the work as well, but they are partial solutions.
When you split vtables from the objects they manipulate, you get a heady rush of power. You can now pass them around wherever you want, but then of course you need to name them and talk about them. The ML discipline of modules/functors and the explicit dictionary-passing style take this approach.
Typeclasses take a slightly different tack. We rely on the uniqueness of a typeclass instance for a given type, and it is in large part this choice that permits us to get away with such simple core data types.
Why?
Because we can move the use of the dictionaries to the use sites and don't have to carry them around with the data types, and we can rely upon the fact that when we do so, nothing has changed about the behavior of the code.
Mechanical translation of the code to more complex manually passed dictionaries loses the uniqueness of such a dictionary at a given type. Passing the dictionaries in at different points in your program now leads to programs with greatly differing behavior. You may or may not have to remember the dictionaries your data type was constructed with, and woe betide you if you want to have conditional behavior based on what your arguments are.
For simple examples like Set you can get away with a manual dictionary translation. The price doesn't seem so high. You have to bake in the dictionary for, say, how you want to sort the Set when you make the object, and then insert/lookup would just preserve your choice. This might be a cost you can bear. When you union two Sets now, of course, it's up in the air which ordering you get. Maybe you take the smaller and insert it into the larger, but then the ordering would change willy-nilly, so instead you have to take, say, the left and always insert it into the right, or document this haphazard behavior. You're now being forced into suboptimally performing solutions in the interest of 'flexibility'.
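To make this concrete, here is a minimal sketch of the manual-dictionary translation described above; the names (OrdDict, SetD and friends) are invented for illustration, and the implementation is a naive sorted list rather than a balanced tree:

data OrdDict a = OrdDict { cmp :: a -> a -> Ordering }

data SetD a = SetD (OrdDict a) [a]   -- the dictionary is baked in at construction

emptyD :: OrdDict a -> SetD a
emptyD d = SetD d []

insertD :: a -> SetD a -> SetD a
insertD x (SetD d xs) = SetD d (go xs)
  where
    go []     = [x]
    go (y:ys) = case cmp d x y of
                  LT -> x : y : ys
                  EQ -> y : ys
                  GT -> y : go ys

unionD :: SetD a -> SetD a -> SetD a
unionD (SetD _ xs) r = foldr insertD r xs   -- arbitrary choice: insert the left into the right

Note how unionD has to commit, arbitrarily, to whose ordering survives -- exactly the haphazard choice described above.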
But Set is a trivial example. There you might bake an index into the type for which instance you are using, since there is only one class involved. What happens when you want more complex behavior? One of the things we do with Haskell is work with monad transformers. Now you have lots of instances floating around -- and you don't have a good place to store them all. MonadReader, MonadWriter, MonadState, etc. may all apply, conditionally, based on the underlying monad. What happens when you hoist and swap it out and now different things may or may not apply?
Carrying around explicit dictionaries for this is a lot of work; there isn't a good place to store them, and you are asking users to adopt a global program transformation to adopt this practice.
These are the things that typeclasses make effortless.
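To give a feel for what "effortless" means here, a small sketch using the standard mtl classes (the function step itself is made up): the constraints are resolved at each use site against whatever concrete stack you eventually run it in, with no dictionaries threaded by hand.

import Control.Monad.Reader (MonadReader, ask)
import Control.Monad.State  (MonadState, get, put)

step :: (MonadReader Int m, MonadState Int m) => m ()
step = do
  limit <- ask              -- uses whichever MonadReader instance the final stack provides
  n     <- get
  put (min limit (n + 1))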
Do I believe you should use them for everything?
Not by a long shot.
But I can't agree with the other replies here that they are inessential to Haskell.
Haskell is the only language that supplies them and they are critical to at least my ability to think in this language, and are a huge part of why I consider Haskell home.
I do agree with a few things here: use typeclasses when there are laws and when the choice is unambiguous.
I'd challenge, however, that if you don't have laws or the choice isn't unambiguous, you may not know enough about how to model the problem domain, and you should be seeking something that fits into the typeclass mold, possibly even into existing abstractions -- and when you finally find that solution, you'll find you can easily reuse it.
Typeclasses are, in most cases, inessential. Any typeclass code can be mechanically converted into dictionary-passing style. They mainly provide convenience, sometimes an essential amount of convenience (cf. kmett's answer).
Sometimes the single-instance property of typeclasses is used to enforce invariants. For example, you could not convert Data.Set into dictionary-passing style safely, because if you inserted twice with two different Ord dictionaries, you could break the data structure invariant. Of course you could still convert any working code to working code in dictionary-passing style, but you would not be able to outlaw as much broken code.
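A small sketch of that hazard, assuming the dictionary is passed at every call instead of being fixed once per type (all names here are illustrative):

import Data.List (insertBy)

newtype OrdD a = OrdD { cmpD :: a -> a -> Ordering }

insOrd :: OrdD a -> a -> [a] -> [a]   -- the list is supposed to stay sorted
insOrd d = insertBy (cmpD d)

ascending, descending :: OrdD Int
ascending  = OrdD compare
descending = OrdD (flip compare)

broken :: [Int]
broken = insOrd descending 1 (insOrd ascending 3 [2])
-- = insOrd descending 1 [2,3] = [2,3,1]: sorted under neither ordering, so a
-- lookup that assumes the invariant can now silently miss elements.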
Laws are another important cultural aspect of typeclasses. The compiler does not enforce laws, but Haskell programmers expect typeclasses to come with laws that all the instances satisfy. This can be leveraged to provide stronger guarantees about some functions. This advantage comes only from the conventions of the community, and is not a formal property of the language.
To answer that part of the question:
"typeclasses and data/newtype somewhat competing concepts"
No. Typeclasses are an extension to the type system that allows you to place constraints on polymorphic arguments. Like most things in programming, they are, of course, syntactic sugar [so they aren't essential in the sense that their use could not be replaced by anything else]. That doesn't mean they're superfluous. It just means you could express similar things using other language facilities, but you'd lose some clarity while you're at it. Dictionary passing can be used for mostly the same things, but it's ultimately less strict in the type system because it allows changing behavior at runtime (which is also an excellent example of where you'd use dictionary passing instead of type classes).
Data and newtype still mean exactly the same thing whether you have typeclasses or not: they introduce a new type, in the case of data as a new kind of data structure, and in the case of newtype as a typesafe variant of a type synonym (type).
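A tiny sketch of that distinction (all names invented for illustration): the class constrains a polymorphic argument, while data and newtype simply introduce new types.

class Describable a where
  describe :: a -> String

newtype Name = Name String            -- a typesafe variant of a plain String

data Person = Person Name Int         -- a new data structure

instance Describable Person where
  describe (Person (Name n) age) = n ++ " (" ++ show age ++ ")"

greet :: Describable a => a -> String -- the constraint restricts the polymorphic a
greet x = "Hello, " ++ describe x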
To expand slightly on my comment, I would suggest always starting with data and dictionary passing. If the boilerplate and manual instance plumbing becomes too much to bear, then consider introducing a typeclass. I suspect this approach generally leads to a cleaner design.
I just want to make a really mundane point about syntax.
People tend to underestimate the convenience afforded by type classes, probably because they have never tried Haskell without using any. This is a "the grass is greener on the other side of the fence" sort of phenomenon.
while :: Monad m -> m Bool -> m a -> m ()
while m p body = (>>=) m p $ \x ->
  if x
  then (>>) m body (while m p body)
  else return m ()

average :: Floating a -> a -> a -> a -> a
average f a b c = (/) f ((+) (floatingToNum f) a ((+) (floatingToNum f) b c))
                        (fromInteger (floatingToNum f) 3)
This is the historical motivation for type classes and it remains valid today. If we didn't have type classes, we'd certainly need some kind of replacement for it to avoid writing monstrosities like these. (Maybe something like record puns or Agda's "open".)
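For comparison, here is roughly what the same two functions look like when type classes do the dictionary plumbing (a sketch; while is not a standard library function, and average is just the obvious three-way mean):

while :: Monad m => m Bool -> m a -> m ()
while p body = p >>= \x ->
  if x
  then body >> while p body
  else return ()

average :: Floating a => a -> a -> a -> a
average a b c = (a + (b + c)) / 3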
I know that typeclasses can make a few things a bit prettier, but not much, AFAIK.
Bit prettier?? No! Way prettier! (as others have already noted)
However, the answer to this really depends very much on where this question comes from.
If Haskell is your tool of choice for serious software engineering, typeclasses are
powerful and essential.
If you are a beginner using Haskell to learn (functional) programming, the complexity and difficulty of typeclasses can outweigh the advantages – certainly at the beginning of your studies.
Here are a couple of examples comparing GHC with Gofer (predecessor of Hugs, predecessor of modern Haskell):
Gofer
? 1 ++ [2,3,4]
ERROR: Type error in application
*** expression :: 1 ++ [2,3,4]
*** term :: 1
*** type :: Int
*** does not match :: [Int]
Now compare with GHC:
Prelude> 1 ++ [2,3,4]
:2:1:
No instance for (Num [a0]) arising from the literal `1'
Possible fix: add an instance declaration for (Num [a0])
In the first argument of `(++)', namely `1'
In the expression: 1 ++ [2, 3, 4]
In an equation for `it': it = 1 ++ [2, 3, 4]
:2:7:
No instance for (Num a0) arising from the literal `2'
The type variable `a0' is ambiguous
Possible fix: add a type signature that fixes these type variable(s)
Note: there are several potential instances:
instance Num Double -- Defined in `GHC.Float'
instance Num Float -- Defined in `GHC.Float'
instance Integral a => Num (GHC.Real.Ratio a)
-- Defined in `GHC.Real'
...plus three others
In the expression: 2
In the second argument of `(++)', namely `[2, 3, 4]'
In the expression: 1 ++ [2, 3, 4]
This should suggest that error-message-wise, not only are typeclasses not prettier, they can be uglier!
One can go all the way (in Gofer) and use the 'simple prelude', which uses no typeclasses at all. This makes it quite unrealistic for serious programming but really neat for wrapping your head around Hindley-Milner:
Standard Prelude
? :t (==)
(==) :: Eq a => a -> a -> Bool
? :t (+)
(+) :: Num a => a -> a -> a
Simple Prelude
? :t (==)
(==) :: a -> a -> Bool
? :t (+)
(+) :: Int -> Int -> Int
To clarify my question, let me rephrase it in a more or less equivalent way:
Why is there a concept of superclass/class inheritance in Haskell?
What are the historical reasons that led to that design choice?
Why would it be so bad, for example, to have a base library with no class hierarchy, just typeclasses independent from each other?
Here I'll set out some random thoughts that made me want to ask this question. My intuitions might be inaccurate, as they are based on my current understanding of Haskell, which is not perfect, but here they are...
It is not obvious to me why type class inheritance exists in Haskell. I find it a bit weird, as it creates asymmetry in concepts.
Often in mathematics, concepts can be defined from different viewpoints; I don't necessarily want to favor one order in which they ought to be defined. OK, there is some order in which one should prove things, but once theorems and structures are there, I'd rather see them as available, independent tools.
Moreover, one perhaps-not-so-good thing I see with class inheritance is this: I think a class instance will silently pick a corresponding superclass instance, which was probably implemented to be the most natural one for that type. Let's consider a Monad viewed as a subclass of Functor. Maybe there could be more than one way to define a Functor on some type that also happens to be a Monad. But saying that a Monad is a Functor implicitly makes the choice of one particular Functor for that Monad. Someday, you might forget that you actually wanted some other Functor.
Perhaps this example is not the best fit, but I have the feeling this sort of situation might generalize and possibly be dangerous if your class is a child of many. Current Haskell inheritance sounds like it makes default choices about parents implicitly.
If instead you have a design without hierarchy, I feel you would always have to be explicit about all the properties required, which would perhaps mean a bit less risk, more clarity, and more symmetry. So far, what I'm seeing is that the cost of such a design would be more constraints to write in instance definitions, and newtype wrappers for each meaningful conversion from one set of concepts to another. I am not sure, but perhaps that could have been acceptable. Unfortunately, I think Haskell's auto-deriving mechanism for newtypes doesn't work very well; I would appreciate it if the language were somehow smarter with newtype wrapping/unwrapping and required less verbosity.
I'm not sure, but now that I think about it, perhaps an alternative to newtype wrappers could be specific imports of modules containing specific variations of instances.
Another alternative I thought about while writing this is that maybe one could weaken the meaning of class (P x) => C x: instead of requiring that an instance of C selects an instance of P, we could take it to loosely mean, for example, that the class C also contains P's methods, but no instance of P is automatically selected and no other relationship with P exists. So we could keep some sort of weaker hierarchy that might be more flexible.
Thanks if you have some clarifications over that topic, and/or correct my possible misunderstandings.
Maybe you're tired of hearing from me, but here goes...
I think superclasses were introduced as a relatively minor and unimportant feature of type classes. In Wadler and Blott, 1988, they are briefly discussed in Section 6 where the example class Eq a => Num a is given. There, the only rationale offered is that it's annoying to have to write (Eq a, Num a) => ... in a function type when it should be "obvious" that data types that can be added, multiplied, and negated ought to be testable for equality as well. The superclass relationship allows "a convenient abbreviation".
(The unimportance of this feature is underscored by the fact that this example is so terrible. Modern Haskell doesn't have class Eq a => Num a because the logical justification for all Nums also being Eqs is so weak. The example class Eq a => Ord a would have been a lot more convincing.)
So, the base library implemented without any superclasses would look more or less the same. There would just be more logically superfluous constraints on function type signatures in both library and user code, and instead of fielding this question, I'd be fielding a beginner question about why:
leq :: (Ord a) => a -> a -> Bool
leq x y = x < y || x == y
doesn't type check.
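In such a superclass-free world the fix would be to spell out the logically redundant constraint by hand, something like:

leq :: (Eq a, Ord a) => a -> a -> Bool
leq x y = x < y || x == y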
To your point about superclasses forcing a particular hierarchy, you're missing your target.
This kind of "forcing" is actually a fundamental feature of type classes. Type classes are "opinionated by design", and in a given Haskell program (where "program" includes all the libraries, include base used by the program), there can be only one instance of a particular type class for a particular type. This property is referred to as coherence. (Even though there is a language extension IncohorentInstances, it is considered very dangerous and should only be used when all possible instances of a particular type class for a particular type are functionally equivalent.)
This design decision comes with certain costs, but it also comes with a number of benefits. Edward Kmett talks about this in detail in this video, starting at around 14:25. In particular, he compares Haskell's coherent-by-design type classes with Scala's incoherent-by-design implicits and contrasts the increased power that comes with the Scala approach with the reusability (and refactoring benefits) of "dumb data types" that comes with the Haskell approach.
So, there's enough room in the design space for both coherent type classes and incoherent implicits, and Haskell's approach isn't necessarily the right one.
BUT, since Haskell has chosen coherent type classes, there's no "cost" to having a specific hierarchy:
class Functor a => Monad a
because, for a particular type, like [] or MyNewMonadDataType, there can only be one Monad and one Functor instance anyway. The superclass relationship introduces a requirement that any type with a Monad instance must have a Functor instance, but it doesn't restrict the choice of Functor instance, because you never had a choice in the first place. Or rather, your choice was between having zero Functor [] instances and exactly one.
Note that this is separate from the question of whether or not there's only one reasonable Functor instance for a Monad type. In principle, we could define a law-violating data type with incompatible Functor and Monad instances. We'd still be restricted to using that one Functor MyType instance and that one Monad MyType instance throughout our program, whether or not Functor was a superclass of Monad.
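For instance, a sketch of such a deliberately law-violating type (the name MyType and its definitions are made up purely to illustrate the point):

newtype MyType a = MyType [a]

instance Functor MyType where
  fmap _ _ = MyType []                -- ignores the function: breaks fmap id = id

instance Applicative MyType where
  pure x = MyType [x]
  MyType fs <*> MyType xs = MyType [f x | f <- fs, x <- xs]

instance Monad MyType where
  MyType xs >>= f = MyType (concat [ys | MyType ys <- map f xs])

Even so, coherence means these are the only Functor MyType and Monad MyType instances the whole program will ever see.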
In Python, I can add (union) and subtract (difference) sets with + and -. How would I set this up in Haskell? Would (-) = Data.Set.difference work? I tried it, but then I think regular subtraction with numbers got messed up.
Haskell places a few more restrictions on the overloading of numerical operators than Python does; there are rules and laws that must be followed in order to define them. For example, you would also need to define * and abs to go with it. Instead, use the operators already defined in Data.Set, namely \\ for set difference. There isn't one already defined for union, but you could easily make your own alias, or you could use it as
set1 `union` set2
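If you did want an alias as mentioned above, a minimal sketch might look like this (the operator name .+. is purely illustrative):

import qualified Data.Set as Set

(.+.) :: Ord a => Set.Set a -> Set.Set a -> Set.Set a
(.+.) = Set.union

example :: Set.Set Int
example = Set.fromList [1, 2] .+. Set.fromList [2, 3]   -- fromList [1,2,3]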
I recommend sticking with the already-defined functions and operators; it'll make your code much more readable to anyone else who takes a look at it. Feel free to introduce new operators that do more than just alias an existing function, although good practice says to do so sparingly.
What you are proposing to do is, to put it a bit comically, very unhaskellic. Haskellers generally adopt the following attitude:
The same name or symbol should not be overloaded to mean two different things.
This means that all overloadable names or symbols (i.e., class operations) must have a consistent core meaning that all of their overloaded instances must respect.
In Haskell, the (+) and (-) operations are defined by the Num class. The docs aren't explicit about it, but to implement a class you must implement all of its methods, which includes things like fromInteger :: Num a => Integer -> a (the operation that converts any Integer into an instance of your class) and abs :: Num a => a -> a (take the absolute value of a number).
You can't implement the Num class for sets without profoundly abusing its meaning. So don't do it.
Note that there are other classes that may be more suitable to what you're trying to do. For example, there is the Monoid class that provides generic operations that are suitable for sets. In fact, the Data.Set module implements Monoid as union, so you can use the mappend function or (<>) operator to take the union of two sets generically (or the append of two lists, or many other things).
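For example, a small sketch using that instance (assuming a recent GHC where (<>) is exported from the Prelude):

import Data.Set (Set, fromList)

unioned :: Set Int
unioned = fromList [1, 2] <> fromList [2, 3]   -- fromList [1,2,3], i.e. the union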
There is no obvious, popular class that the Set.difference operator would be an instance of, I'm afraid.
To define a Num instance for a type it would look like:
instance Num (Set a) where
  (+) = -- definition
  (-) = -- definition
  -- etc
If you merely define, at the top level:
(-) = -- definition
Then you are simply shadowing the (-) that comes from Num.
As bheklilr says, Set is not a valid instance for Num because it cannot satisfy the ring laws. Haskell will not forbid you from defining the instance but it is a poor idea. People work with type classes by using their laws, so violating them results in incorrect programs.
This post poses the question for the case of !!. The accepted answer tells us that what you are actually doing is creating a new function !!, and then you should avoid importing the standard one.
But why do so if the new function is to be applied to different types than the standard one? Isn't the compiler able to choose the right one according to its parameters?
Is there any compiler flag to allow this?
For instance, if * is not defined for [Float] * Float,
why does the compiler complain
> Ambiguous occurrence *
> It could refer to either `Main.*', defined at Vec.hs:4:1
> or `Prelude.*',
for this code:
(*) :: [Float] -> Float -> [Float]
(*) as k = map (\a -> a*k) as -- here: clearly Float*Float
r = [1.0, 2.0, 3.0] :: [Float]
s = r * 2.0 -- here: clearly [Float] * Float
main = do
  print r
  print s
Allowing the compiler to choose the correct implementation of a function based on its type is the purpose of typeclasses. It is not possible without them.
For a justification of this approach, you might read the paper that introduced them: How to make ad-hoc polymorphism less ad hoc [PDF].
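To connect this back to the example in the question, here is a sketch of how a typeclass expresses the intended overloading; the class name Scale and the operator *. are invented, and FlexibleInstances is needed for the [Float] instance:

{-# LANGUAGE FlexibleInstances #-}

class Scale v where
  (*.) :: v -> Float -> v

instance Scale Float where
  x *. k = x * k

instance Scale [Float] where
  xs *. k = map (* k) xs

main :: IO ()
main = do
  let r = [1.0, 2.0, 3.0] :: [Float]
  print (r *. 2.0)                 -- [2.0,4.0,6.0]
  print ((3.0 :: Float) *. 2.0)    -- 6.0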
Really, the reason is this: in Haskell, there is not necessarily a clear association “variable x has type T”.
Haskell is almost as flexible as dynamic languages, in the sense that any type can be a type variable, i.e. can have polymorphic type. But whereas in dynamic languages (and also e.g. OO polymorphism or C++ templates), the types of such type-variables are basically just extra information attached to the value-variables in your code (so an overloaded operator can see: argument is an Int -> do this, is a String -> do that), in Haskell the type variables live in a completely separate scope, in the type language. This gives you many advantages; for instance, higher-kinded polymorphism is pretty much impossible without such a system. However, it also means it's harder to reason about how overloaded functions should be resolved. If Haskell allowed you to just write overloads and assume the compiler does its best guess at resolving the ambiguity, you'd often end up with strange error messages in unexpected places. (Actually, this can easily happen with overloads even if you have no Hindley-Milner type system. C++ is notorious for it.)
Instead, Haskell chooses to force overloads to be explicit. You must first define a type class before you can overload methods, and though this can't completely preclude confusing compilation errors it makes them much easier to avoid. Also, it lets you express polymorphic methods with type resolution that couldn't be expressed with traditional overloading, in particular polymorphic results (which is great for writing very easily reusable code).
It is a design decision, not a theoretical problem, not to include this in Haskell. As you say, many other languages use types to disambiguate between terms in an ad-hoc way. But type classes have similar functionality and additionally allow abstraction over things that are overloaded. Type-directed name resolution does not.
Nevertheless, forms of type-directed name resolution have been discussed for Haskell (for example in the context of resolving record field selectors) and are supported by some languages similar to Haskell such as Agda (for data constructors) or Idris (more generally).
I have pretty decent intuition about types Haskell prohibits as "impredicative": namely ones where a forall appears in an argument to a type constructor other than ->. But just what is predicativity? What makes it important? How does it relate to the word "predicate"?
The central question of these type systems is: "Can you substitute a polymorphic type in for a type variable?". Predicative type systems are the no-nonsense schoolmarm answering, "ABSOLUTELY NOT", while impredicative type systems are your carefree buddy who thinks that sounds like a fun idea and what could possibly go wrong?
Now, Haskell muddies the discussion a bit because it believes polymorphism should be useful but invisible. So for the remainder of this post, I will be writing in a dialect of Haskell where uses of forall are not just allowed but required. This way we can distinguish between the type a, which is a monomorphic type which draws its value from a typing environment that we can define later, and the type forall a. a, which is one of the harder polymorphic types to inhabit. We'll also allow forall to go pretty much anywhere in a type -- as we'll see, GHC restricts its type syntax as a "fail-fast" mechanism rather than as a technical requirement.
Suppose we have told the compiler id :: forall a. a -> a. Can we later ask to use id as if it had type (forall b. b) -> (forall b. b)? Impredicative type systems are okay with this, because we can instantiate the quantifier in id's type to forall b. b, and substitute forall b. b for a everywhere in the result. Predicative type systems are a bit more wary of that: only monomorphic types are allowed in. (So if we had a particular b, we could write id :: b -> b.)
There's a similar story about [] :: forall a. [a] and (:) :: forall a. a -> [a] -> [a]. While your carefree buddy may be okay with [] :: [forall b. b] and (:) :: (forall b. b) -> [forall b. b] -> [forall b. b], the predicative schoolmarm isn't, so much. In fact, as you can see from the only two constructors of lists, there is no way to produce lists containing polymorphic values without instantiating the type variable in their constructors to a polymorphic value. So although the type [forall b. b] is allowed in our dialect of Haskell, it isn't really sensible -- there are no (terminating) terms of that type. This motivates GHC's decision to complain if you even think about such a type -- it's the compiler's way of telling you "don't bother".*
Well, what makes the schoolmarm so strict? As usual, the answer is about keeping type-checking and type-inference doable. Type inference for impredicative types is right out. Type checking seems like it might be possible, but it's bloody complicated and nobody wants to maintain that.
On the other hand, some might object that GHC is perfectly happy with some types that appear to require impredicativity:
> :set -XRank2Types
> :t id :: (forall b. b) -> (forall b. b)
{- no complaint, but very chatty -}
It turns out that some slightly-restricted versions of impredicativity are not too bad: specifically, type-checking higher-rank types (which allow type variables to be substituted by polymorphic types when they are only arguments to (->)) is relatively simple. You do lose type inference above rank-2, and principal types above rank-1, but sometimes higher rank types are just what the doctor ordered.
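A small sketch of the kind of higher-rank type GHC is happy with: the forall sits only to the left of (->), so nothing impredicative has to happen (applyBoth is an invented name):

{-# LANGUAGE RankNTypes #-}

applyBoth :: (forall a. a -> a) -> (Int, Bool) -> (Int, Bool)
applyBoth f (n, b) = (f n, f b)

example :: (Int, Bool)
example = applyBoth id (3, True)   -- (3, True)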
I don't know about the etymology of the word, though.
* You might wonder whether you can do something like this:
data FooTy a where
  FooTm :: FooTy (forall a. a)
Then you would get a term (FooTm) whose type had something polymorphic as an argument to something other than (->) (namely, FooTy); you don't have to cross the schoolmarm to do it, and so the belief "applying non-(->) stuff to polymorphic types isn't useful because you can't make them" would be invalidated. GHC doesn't let you write FooTy, and I will admit I'm not sure whether there's a principled reason for the restriction or not.
(Quick update some years later: there is a good, principled reason that FooTm is still not okay. Namely, the way that GADTs are implemented in GHC is via type equalities, so the expanded type of FooTm is actually FooTm :: forall a. (a ~ forall b. b) => FooTy a. Hence to actually use FooTm, one would indeed need to instantiate a type variable with a polymorphic type. Thanks to Stephanie Weirich for pointing this out to me.)
Let me just add a point regarding the "etymology" issue, since the other answer by @DanielWagner covers much of the technical ground.
A predicate on something like a is a -> Bool. Now a predicate logic is one that can in some sense reason about predicates -- so if we have some predicate P and we can talk about, for a given a, P(a), now in a "predicate logic" (such as first-order logic) we can also say ∀a. P(a). So we can quantify over variables and discuss the behavior of predicates over such things.
Now, in turn, we say a statement is predicative if all of the things a predicate is applied to are introduced prior to it. So statements are "predicated on" things that already exist. In turn, a statement is impredicative if it can in some sense refer to itself by its "bootstraps".
So in the case of e.g. the id example above, we find that we can give a type to id such that it takes something of the type of id to something else of the type of id. So now we can give a function a type where a quantified variable (introduced by forall a.) can "expand" to be the same type as that of the entire function itself!
Hence impredicativity introduces the possibility of a certain "self reference". But wait, you might say, wouldn't such a thing lead to contradiction? The answer is: "well, sometimes." In particular, "System F", which is the polymorphic lambda calculus and the essential "core" of GHC's "core" language, allows a form of impredicativity that nonetheless has two levels -- the value level, and the type level, which is allowed to quantify over itself. In this two-level stratification, we can have impredicativity and not contradiction/paradox.
Although note that this neat trick is very delicate and easy to screw up by the addition of more features, as this collection of articles by Oleg indicates: http://okmij.org/ftp/Haskell/impredicativity-bites.html
I'd like to make a comment on the etymology issue, since @sclv's answer isn't quite right (etymologically, not conceptually).
Go back in time, to the days of Russell, when everything was set theory -- including logic. One of the logical notions of particular import is the "principle of comprehension"; that is, given some logical predicate φ:A→2, we would like to have some principle to determine the set of all elements satisfying that predicate, written as "{x | φ(x)}" or some variation thereon. The key point to bear in mind is that "sets" and "predicates" are viewed as being fundamentally different things: predicates are mappings from objects to truth values, and sets are objects. Thus, for example, we may allow quantifying over sets but not quantifying over predicates.
Now, Russell was rather concerned by his eponymous paradox, and sought some way to get rid of it. There are numerous fixes, but the one of interest here is to restrict the principle of comprehension. But first, the formal definition of the principle: ∃S.∀x.S x ↔︎ φ(x); that is, for our particular φ there exists some object (i.e., set) S such that for every object (also a set, but thought of as an element) x, we have that S x (you can think of this as meaning "x∈S", though logicians of the time gave "∈" a different meaning than mere juxtaposition) is true just in case φ(x) is true. If we take the principle exactly as written then we end up with an impredicative theory. However, we can place restrictions on which φ we're allowed to take the comprehension of. (For example, if we say that φ must not contain any second-order quantifiers.) Thus, for any restriction R, if a set S is determined (i.e., generated via comprehension) by some R-predicate, then we say that S is "R-predicative". If every set in our language is R-predicative then we say that our language is "R-predicative". And then, as is often the case with hyphenated prefix things, the prefix gets dropped off and left implicit, whence "predicative" languages. And, naturally, languages which are not predicative are "impredicative".
That's the old school etymology. Since those days the terms have gone off and gotten lives of their own. The ways we use "predicative" and "impredicative" today are quite different, because the things we're concerned about have changed. So it can sometimes be a bit hard to see how the heck our modern usage ties back to this stuff. Honestly, I don't think knowing the etymology really helps any in terms of figuring out what the words are really about (these days).
What it says in the title. If I write a type signature, is it possible to algorithmically generate an expression which has that type signature?
It seems plausible that it might be possible to do this. We already know that if the type is a special-case of a library function's type signature, Hoogle can find that function algorithmically. On the other hand, many simple problems relating to general expressions are actually unsolvable (e.g., it is impossible to know if two functions do the same thing), so it's hardly implausible that this is one of them.
It's probably bad form to ask several questions all at once, but I'd like to know:
Can it be done?
If so, how?
If not, are there any restricted situations where it becomes possible?
It's quite possible for two distinct expressions to have the same type signature. Can you compute all of them? Or even some of them?
Does anybody have working code which does this stuff for real?
Djinn does this for a restricted subset of Haskell types, corresponding to a first-order logic. It can't manage recursive types or types that require recursion to implement, though; so, for instance, it can't write a term of type (a -> a) -> a (the type of fix), which corresponds to the proposition "if a implies a, then a", which is clearly false; you can use it to prove anything. Indeed, this is why fix gives rise to ⊥.
If you do allow fix, then writing a program to give a term of any type is trivial; the program would simply print fix id for every type.
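That is, a sketch of the degenerate "solution":

import Data.Function (fix)

anything :: a        -- inhabits every type, but only by never terminating
anything = fix id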
Djinn is mostly a toy, but it can do some fun things, like deriving the correct Monad instances for Reader and Cont given the types of return and (>>=). You can try it out by installing the djinn package, or using lambdabot, which integrates it as the #djinn command.
Oleg at okmij.org has an implementation of this. There is a short introduction here but the literate Haskell source contains the details and the description of the process. (I'm not sure how this corresponds to Djinn in power, but it is another example.)
There are cases where there is no unique function:
fst', snd' :: (a, a) -> a
fst' (a,_) = a
snd' (_,b) = b
Not only this; there are cases where there are an infinite number of functions:
list0, list1, list2 :: [a] -> a
list0 l = l !! 0
list1 l = l !! 1
list2 l = l !! 2
-- etc.
-- Or
mkList0, mkList1, mkList2 :: a -> [a]
mkList0 _ = []
mkList1 a = [a]
mkList2 a = [a,a]
-- etc.
(If you only want total functions, then consider [a] as restricted to infinite lists for list0, list1 etc, i.e. data List a = Cons a (List a))
In fact, if you have recursive types, any types involving these correspond to an infinite number of functions. However, at least in the case above, there is a countable number of functions, so it is possible to create an (infinite) list containing all of them. But, I think the type [a] -> [a] corresponds to an uncountably infinite number of functions (again restrict [a] to infinite lists) so you can't even enumerate them all!
(Summary: there are types that correspond to a finite, countably infinite and uncountably infinite number of functions.)
This is impossible in general (and for languages like Haskell, which does not even have the strong normalization property), and only possible in some (very) special cases (and for more restricted languages), such as when the codomain type has only one constructor (for example, a function f :: forall a. a -> () can be determined uniquely). In order to reduce the set of possible definitions for a given signature to a singleton set with just one definition, you need to give more restrictions (in the form of additional properties; it is still difficult to imagine how this can be helpful without giving an example of use).
From the (n-)categorical point of view, types correspond to objects, terms correspond to arrows (constructors also correspond to arrows), and function definitions correspond to 2-arrows. The question is analogous to asking whether one can construct a 2-category with the required properties by specifying only a set of objects. That is impossible, since you need either an explicit construction for arrows and 2-arrows (i.e., writing terms and definitions), or a deductive system which allows you to deduce the necessary structure from a certain set of properties (which still need to be defined explicitly).
There is also an interesting question: given an ADT (i.e., a subcategory of Hask), is it possible to automatically derive instances for Typeable, Data (yes, using SYB), Traversable, Foldable, Functor, Pointed, Applicative, Monad, etc.? In this case, we have the necessary signatures as well as additional properties (for example, the monad laws; these properties cannot be expressed in Haskell, but they can be expressed in a language with dependent types). There are some interesting constructions:
http://ulissesaraujo.wordpress.com/2007/12/19/catamorphisms-in-haskell
which shows what can be done for the list ADT.
The question is actually rather deep and I'm not sure of the answer, if you're asking about the full glory of Haskell types including type families, GADT's, etc.
What you're asking is whether a program can automatically prove that an arbitrary type is inhabited (contains a value) by exhibiting such a value. A principle called the Curry-Howard Correspondence says that types can be interpreted as mathematical propositions, and the type is inhabited if the proposition is constructively provable. So you're asking if there is a program that can prove a certain class of propositions to be theorems. In a language like Agda, the type system is powerful enough to express arbitrary mathematical propositions, and proving arbitrary ones is undecidable by Gödel's incompleteness theorem. On the other hand, if you drop down to (say) pure Hindley-Milner, you get a much weaker and (I think) decidable system. With Haskell 98, I'm not sure, because type classes are supposed to be able to be equivalent to GADT's.
With GADT's, I don't know if it's decidable or not, though maybe some more knowledgeable folks here would know right away. For example it might be possible to encode the halting problem for a given Turing machine as a GADT, so there is a value of that type iff the machine halts. In that case, inhabitability is clearly undecidable. But, maybe such an encoding isn't quite possible, even with type families. I'm not currently fluent enough in this subject for it to be obvious to me either way, though as I said, maybe someone else here knows the answer.
(Update:) Oh, a much simpler interpretation of your question occurs to me: you may be asking whether every Haskell type is inhabited. The answer is obviously not. Consider the polymorphic type
a -> b
There is no function with that signature (not counting something like unsafeCoerce, which makes the type system inconsistent).