Actually generic length function in Haskell

length is generic on the input in that it accepts any Foldable.
genericLength is generic on the output in that it outputs any Integral.
Is there an actually fully generic function with type:
(Integral b, Foldable t) => t a -> b
Or even:
(Num b, Foldable t) => t a -> b
As even though length should only ever produce integer values, integers are also floats, so producing any number works just fine.
Currently I am just using:
foldl' (const . (+ 1)) 0
I know one can be made fairly easily, but I also know that often the base functions are optimized in some way or another, so I was hoping there was an existing function for generically calculating length. If not my followup question is why? Specifically why doesn't genericLength accept a Foldable?

I mean this was pretty much answered in the comments. But just so this question is marked as answered the reason is basically that genericLength has terrible performance due to it supporting lazy natural numbers, and thus should pretty much be avoided. This performance issue is partially avoided via rewrite rules, but even with that it is still not a good function to use and thus it isn't worth it to attempt to generalize it further.
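For reference, here is a minimal sketch of a length that is generic on both sides, assuming a strict left fold is acceptable. The names lengthNum and lengthVia are made up for illustration and are not part of base.

```haskell
import Data.Foldable (foldl')

-- Hypothetical helper (not in base): counts the elements of any Foldable
-- into any Num type, forcing the accumulator at each step.
lengthNum :: (Foldable t, Num b) => t a -> b
lengthNum = foldl' (\n _ -> n + 1) 0

-- Alternative: count in Int with the library's length, then convert once.
lengthVia :: (Foldable t, Num b) => t a -> b
lengthVia = fromIntegral . length
```

The second version keeps whatever optimisations the container's own length has, at the cost of going through Int.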

Related

How can I add constraints to a type in a data type in haskell? [duplicate]

In many articles about Haskell they say it allows some checks to be made at compile time instead of at run time. So, I want to implement the simplest check possible - allow a function to be called only on integers greater than zero. How can I do it?
module Positive (toPositive, getPositive, Positive) where
newtype Positive = Positive { unPositive :: Int }
toPositive :: Int -> Maybe Positive
toPositive n = if (n <= 0) then Nothing else Just (Positive n)
-- We can't export unPositive, because unPositive can be used
-- to update the field. Trivially renaming it to getPositive
-- ensures that getPositive can only be used to access the field
getPositive :: Positive -> Int
getPositive = unPositive
The above module doesn't export the constructor, so the only way to build a value of type Positive is to supply toPositive with a positive integer, which you can then unwrap using getPositive to access the actual value.
You can then write a function that only accepts positive integers using:
positiveInputsOnly :: Positive -> ...
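As a rough sketch of how a consumer might look, here is the same type collapsed into a single file, with two hypothetical functions (divideBy and safeDivide are names invented for this example). In real code the constructor would stay hidden behind the module's export list as shown above.

```haskell
newtype Positive = Positive { unPositive :: Int }

toPositive :: Int -> Maybe Positive
toPositive n = if n <= 0 then Nothing else Just (Positive n)

getPositive :: Positive -> Int
getPositive = unPositive

-- Hypothetical consumer: no zero check needed, the type rules it out.
divideBy :: Positive -> Int -> Int
divideBy d n = n `div` getPositive d

-- The single runtime check happens where untrusted input enters.
safeDivide :: Int -> Int -> Maybe Int
safeDivide n d = fmap (\p -> divideBy p n) (toPositive d)
```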
Haskell can perform some checks at compile time that other languages perform at runtime. Your question seems to imply you are hoping for arbitrary checks to be lifted to compile time, which isn't possible without a large potential for proof obligations (which could mean you, the programmer, would need to prove the property is true for all uses).
In the below, I don't feel like I'm saying anything more than what pigworker touched on while mentioning the very cool sounding Inch tool. Hopefully the additional words on each topic will clarify some of the solution space for you.
What People Mean (when speaking of Haskell's static guarantees)
Typically when I hear people talk about the static guarantees provided by Haskell they are talking about Hindley-Milner style static type checking. This means one type cannot be confused for another - any such misuse is caught at compile time (ex: let x = "5" in x + 1 is invalid). Obviously, this only scratches the surface and we can discuss some more aspects of static checking in Haskell.
Smart Constructors: Check once at runtime, ensure safety via types
Gabriel's solution is to have a type, Positive, that can only be positive. Building positive values still requires a check at runtime but once you have a positive there are no checks required by consuming functions - the static (compile time) type checking can be leveraged from here.
This is a good solution for many, many problems. I recommended the same thing when discussing golden numbers. Nevertheless, I don't think this is what you are fishing for.
Exact Representations
dflemstr commented that you can use a type, Word, which is unable to represent negative numbers (a slightly different issue than representing positives). In this manner you really don't need to use a guarded constructor (as above) because there is no inhabitant of the type that violates your invariant.
A more common example of using proper representations is non-empty lists. If you want a type that can never be empty then you could just make a non-empty list type:
data NonEmptyList a = Single a | Cons a (NonEmptyList a)
This is in contrast to the traditional list definition using Nil instead of Single a.
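A small sketch of what this buys you, using helper names (neHead, neToList, fromList) chosen for this example. Note that base also ships a ready-made non-empty list in Data.List.NonEmpty.

```haskell
data NonEmptyList a = Single a | Cons a (NonEmptyList a)

-- Total: every value has a first element, so no Maybe and no error case.
neHead :: NonEmptyList a -> a
neHead (Single x) = x
neHead (Cons x _) = x

neToList :: NonEmptyList a -> [a]
neToList (Single x)  = [x]
neToList (Cons x xs) = x : neToList xs

-- Converting from an ordinary list is where the one runtime check lives.
fromList :: [a] -> Maybe (NonEmptyList a)
fromList []       = Nothing
fromList [x]      = Just (Single x)
fromList (x : xs) = Cons x <$> fromList xs
```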
Going back to the positive example, you could use a form of Peano numbers:
data Positive = One | S Positive
Or use GADTs to build unsigned binary numbers (and you can add Num, and other instances, allowing functions like +):
{-# LANGUAGE GADTs #-}
data Zero
data NonZero
data Binary a where
  I :: Binary a -> Binary NonZero
  O :: Binary a -> Binary a
  Z :: Binary Zero
  N :: Binary NonZero
instance Show (Binary a) where
  show (I x) = "1" ++ show x
  show (O x) = "0" ++ show x
  show (Z) = "0"
  show (N) = "1"
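To make the point of the index concrete, here is a sketch that repeats the type and adds two functions that are not in the original answer: toInt reads the digits most-significant first (the order the Show instance prints them in), and safeRecip is a hypothetical consumer that only accepts values whose type says NonZero.

```haskell
{-# LANGUAGE GADTs #-}

data Zero
data NonZero

data Binary a where
  I :: Binary a -> Binary NonZero
  O :: Binary a -> Binary a
  Z :: Binary Zero
  N :: Binary NonZero

-- Accumulate from the most significant digit downwards.
toInt :: Binary a -> Int
toInt = go 0
  where
    go :: Int -> Binary b -> Int
    go acc (I x) = go (2 * acc + 1) x
    go acc (O x) = go (2 * acc) x
    go acc Z     = 2 * acc
    go acc N     = 2 * acc + 1

-- Hypothetical consumer: `safeRecip Z` or `safeRecip (O Z)` is a type error.
safeRecip :: Binary NonZero -> Double
safeRecip b = 1 / fromIntegral (toInt b)
```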
External Proofs
While not part of the Haskell universe, it is possible to generate Haskell using alternate systems (such as Coq) that allow richer properties to be stated and proven. In this manner the Haskell code can simply omit checks like x > 0 but the fact that x will always be greater than 0 will be a static guarantee (again: the safety is not due to Haskell).
From what pigworker said, I would classify Inch in this category. Haskell has not grown sufficiently to perform your desired tasks, but tools to generate Haskell (in this case, very thin layers over Haskell) continue to make progress.
Research on More Descriptive Static Properties
The research community that works with Haskell is wonderful. While too immature for general use, people have developed tools to do things like statically check function partiality and contracts. If you look around you'll find it's a rich field.
I would be failing in my duty as his supervisor if I failed to plug Adam Gundry's Inch preprocessor, which manages integer constraints for Haskell.
Smart constructors and abstraction barriers are all very well, but they push too much testing to run time and don't allow for the possibility that you might actually know what you're doing in a way that checks out statically, with no need for Maybe padding. (A pedant writes. The author of another answer appears to suggest that 0 is positive, which some might consider contentious. Of course, the truth is that we have uses for a variety of lower bounds, 0 and 1 both occurring often. We also have some use for upper bounds.)
In the tradition of Xi's DML, Adam's preprocessor adds an extra layer of precision on top of what Haskell natively offers but the resulting code erases to Haskell as is. It would be great if what he's done could be better integrated with GHC, in coordination with the work on type level natural numbers that Iavor Diatchki has been doing. We're keen to figure out what's possible.
To return to the general point, Haskell is currently not sufficiently dependently typed to allow the construction of subtypes by comprehension (e.g., elements of Integer greater than 0), but you can often refactor the types to a more indexed version which admits static constraint. Currently, the singleton type construction is the cleanest of the available unpleasant ways to achieve this. You'd need a kind of "static" integers, then inhabitants of kind Integer -> * capture properties of particular integers such as "having a dynamic representation" (that's the singleton construction, giving each static thing a unique dynamic counterpart) but also more specific things like "being positive".
Inch represents an imagining of what it would be like if you didn't need to bother with the singleton construction in order to work with some reasonably well behaved subsets of the integers. Dependently typed programming is often possible in Haskell, but is currently more complicated than necessary. The appropriate sentiment toward this situation is embarrassment, and I for one feel it most keenly.
I know that this was answered a long time ago and I already provided an answer of my own, but I wanted to draw attention to a new solution that became available in the interim: Liquid Haskell, which you can read an introduction to here.
In this case, you can specify that a given value must be positive by writing:
{-@ myValue :: {v: Int | v > 0} @-}
myValue = 5
Similarly, you can specify that a function f requires only positive arguments like this:
{-@ f :: {v: Int | v > 0 } -> Int @-}
Liquid Haskell will verify at compile-time that the given constraints are satisfied.
This (or rather the similar desire for a type of natural numbers, including 0) is a common complaint about Haskell's numeric class hierarchy, which makes it impossible to provide a really clean solution.
Why? Look at the definition of Num:
class (Eq a, Show a) => Num a where
  (+) :: a -> a -> a
  (*) :: a -> a -> a
  (-) :: a -> a -> a
  negate :: a -> a
  abs :: a -> a
  signum :: a -> a
  fromInteger :: Integer -> a
Unless you revert to using error (which is a bad practice), there is no way you can provide definitions for (-), negate and fromInteger.
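To illustrate the problem, here is a sketch of a natural-number wrapper (Nat and minusMaybe are names made up for this example). The partial methods have nowhere to go but error; as far as I know, Numeric.Natural in base makes a similar trade-off and throws on underflow.

```haskell
newtype Nat = Nat Integer deriving (Eq, Ord, Show)

instance Num Nat where
  Nat a + Nat b = Nat (a + b)
  Nat a * Nat b = Nat (a * b)
  -- The three methods below cannot be total for natural numbers.
  Nat a - Nat b
    | a >= b    = Nat (a - b)
    | otherwise = error "Nat: subtraction would be negative"
  negate (Nat 0) = Nat 0
  negate _       = error "Nat: negate of a non-zero value"
  fromInteger n
    | n >= 0    = Nat n
    | otherwise = error "Nat: negative literal"
  abs = id
  signum (Nat a) = Nat (signum a)

-- A total alternative that lives outside the Num class.
minusMaybe :: Nat -> Nat -> Maybe Nat
minusMaybe (Nat a) (Nat b)
  | a >= b    = Just (Nat (a - b))
  | otherwise = Nothing
```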
Type-level natural numbers are planned for GHC 7.6.1: https://ghc.haskell.org/trac/ghc/ticket/4385
Using this feature it's trivial to write a "natural number" type, and it gives performance you could never achieve with, e.g., a manually written Peano number type.

Why do some Haskell functions not abstract over the concrete integral type?

For example, the function length abstracts over the concrete sequence type (Foldable), but does not abstract over the concrete integral type Int:
length :: Foldable t => t a -> Int
Would it be more usable or more convenient to have the following type signature?
length' :: (Foldable t, Integral i) => t a -> i
Yes, perhaps, however first take note that more polymorphism is not always a good thing. If all functions have highly polymorphic arguments and results, then the compiler has little information to start type inference, so you end up having to type more awkward local signatures.
Now as for length, there is very little reason why you'd want a result of any Integral type other than Int, at least not on a 64-bit machine:
Smaller types like Word16 don't usually give much of a performance or memory advantage in Haskell, because there'll be some boxing somewhere, and then you have a 64-bit pointer to only 16 bits of information... a bit silly.
It's basically impossible to have a list (let alone an array or map) so large that its length can't be measured with a 63-bit word, even for a crazy lazy list that never completely exists in memory at any time. Now, strictly speaking, Int is only guaranteed to cover the range [-2^29, 2^29 - 1], which can in some extreme cases be exhausted, but practically this is only relevant on 32-bit platforms, which are more limited in memory and performance anyway, so you wouldn't want to juggle such huge data.
For that matter, any application where performance or exhaustion could possibly be an issue should probably be optimised to something more efficient than lists or other Foldable containers (which are always boxed due to parametricity); unboxed vectors or ByteStrings perform much better.
If you need the result to be Integer or something else more expensive, not for length but simply context reasons, it is probably a better idea to just calculate the length in Int and convert once at the end, rather than dragging around a slow addition through the entire list.
That said, I do sometimes wish the signature were a Foldable version of the one the existing Data.List.genericLength already has for lists (genericLength :: Num i => [a] -> i), namely
length :: (Foldable t, Num i) => t a -> i
...note that the function doesn't need to know its result will be integral. And in fact it would often be pretty useful to make the result rational, so we could then just write
average :: Fractional n => [n] -> n
average l = sum l / length l
instead of needing an extra fromIntegral.
The reason this isn't the case in Prelude.length? I don't know, seems rather historical. The rationale is probably as I said in the beginning that you don't want too much polymorphism. Then again, IMO it's usually a good idea to make function results as polymorphic as possible, and rather constrain the arguments a bit more tightly, because then the result will always be bound to something that can be used for type inference.
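For comparison, a sketch of both styles side by side. lengthNum is a hypothetical stand-in for the wished-for polymorphic length; average' is what one writes against today's Prelude.

```haskell
import Data.Foldable (foldl')

-- Hypothetical polymorphic length (not in base).
lengthNum :: (Foldable t, Num i) => t a -> i
lengthNum = foldl' (\n _ -> n + 1) 0

-- With a Num-polymorphic length, no conversion is needed.
average :: Fractional n => [n] -> n
average l = sum l / lengthNum l

-- With Prelude.length, the Int result has to be converted explicitly.
average' :: Fractional n => [n] -> n
average' l = sum l / fromIntegral (length l)
```

Both versions divide by zero on an empty list; guarding that case is left out to keep the comparison short.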

Can compilers deduce/prove mathematically?

I'm starting to learn functional programming languages like Haskell and ML, and most of the exercises show off things like:
foldr (+) 0 [ 1 ..10]
which is equivalent to
sum = 0
for( i in [1..10] )
sum += i
So that leads me to wonder: why can't the compiler know that this is an arithmetic progression and use the O(1) formula to calculate it?
Especially for pure FP languages without side effect?
The same applies for
sum reverse list == sum list
Given a + b = b + a
and definition of reverse, can compilers/languages prove it automatically?
Compilers generally don't try to prove this kind of thing automatically, because it's hard to implement.
As well as adding the logic to the compiler to transform one fragment of code into another, you have to be very careful that it only tries to do it when it's actually safe - i.e. there are often lots of "side conditions" to worry about. For example in your example above, someone might have written an instance of the type class Num (and hence the (+) operator) where a + b is not b + a.
However, GHC does have rewrite rules which you can add to your own source code and could be used to cover some relatively simple cases like the ones you list above, particularly if you're not too bothered about the side conditions.
For example, and I haven't tested this, you might use the following rule for one of your examples above:
{-# RULES
  "sum/reverse" forall list . sum (reverse list) = sum list
  #-}
Note the parentheses around reverse list - what you've written in your question actually means (sum reverse) list and wouldn't typecheck.
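Here is a self-contained variant that should compile as written. It is a sketch under a few assumptions of mine: mySum and myReverse are local wrappers invented for the example (so the rule does not have to match a class method), they are marked NOINLINE so that they are still visible when the rule is tried, and the element type is fixed to Int so that the commutativity side condition actually holds. Rules are only applied when optimisation is enabled.

```haskell
import Data.Foldable (foldl')

-- Hypothetical monomorphic wrappers for the rule to match on.
mySum :: [Int] -> Int
mySum = foldl' (+) 0
{-# NOINLINE mySum #-}

myReverse :: [a] -> [a]
myReverse = reverse
{-# NOINLINE myReverse #-}

-- Safe for Int because (+) on Int is commutative and associative.
{-# RULES
"mySum/myReverse" forall xs. mySum (myReverse xs) = mySum xs
  #-}

-- A call site the rule can rewrite; the result is the same either way.
total :: [Int] -> Int
total xs = mySum (myReverse xs)
```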
EDIT:
As you're looking for official sources and pointers to research, I've listed a few.
Obviously it's hard to prove a negative but the fact that no-one has given an example of a general-purpose compiler that does this kind of thing routinely is probably quite strong evidence in itself.
As others have pointed out, even simple arithmetic optimisations are surprisingly dangerous, particularly on floating point numbers, and compilers generally have flags to turn them off - for example Visual C++, gcc. Even integer arithmetic isn't always clear-cut and people occasionally have big arguments about how to deal with things like overflow.
As Joachim noted, integer variables in loops are one place where slightly more sophisticated optimisations are applied because there are actually significant wins to be had. Muchnick's book is probably the best general source on the topic but it's not that cheap. The wikipedia page on strength reduction is probably as good an introduction as any to one of the standard optimisations of this kind, and has some references to the relevant literature.
FFTW is an example of a library that does all kinds of mathematical optimization internally. Some of its code is generated by a customised compiler the authors wrote specifically for the purpose. It's worthwhile because the authors have domain-specific knowledge of optimizations that in the specific context of the library are both worth the effort and safe.
People sometimes use template metaprogramming to write "self-optimising libraries" that again might rely on arithmetic identities, see for example Blitz++. Todd Veldhuizen's PhD dissertation has a good overview.
If you descend into the realms of toy and academic compilers all sorts of things go. For example my own PhD dissertation is about writing inefficient functional programs along with little scripts that explain how to optimise them. Many of the examples (see Chapter 6) rely on applying arithmetic rules to justify the underlying optimisations.
Also, it's worth emphasising that the last few examples are of specialised optimisations being applied only to certain parts of the code (e.g. calls to specific libraries) where it is expected to be worthwhile. As other answers have pointed out, it's simply too expensive for a compiler to go searching for all possible places in an entire program where an optimisation might apply. The GHC rewrite rules that I mentioned above are a great example of a compiler exposing a generic mechanism for individual libraries to use in a way that's most appropriate for them.
The answer
No, compilers don’t do that kind of stuff.
One reason why
And for your examples, it would even be wrong: Since you did not give type annotations, the Haskell compiler will infer the most general type, which would be
foldr (+) 0 [ 1 ..10] :: Num a => a
and similar
(\list -> sum (reverse list)) :: Num a => [a] -> a
and the Num instance for the type that is being used might well not fulfil the mathematical laws required for the transformation you suggest. The compiler should, before everything else, avoid changing the meaning (i.e. the semantics) of your program.
More pragmatically: The cases where the compiler could detect such large-scale transformations rarely occur in practice, so it would not be worth it to implement them.
An exception
Notable exceptions are linear transformations in loops. Most compilers will rewrite
for (int i = 0; i < n; i++) {
    ... 200 + 4 * i ...
}
to
for (int i = 0, j = 200; i < n; i++, j += 4) {
    ... j ...
}
or something similar, as that pattern often occurs in code working on arrays.
The optimizations you have in mind will probably not be done even in the presence of monomorphic types, because there are so many possibilities and so much knowledge required. For example, in this example:
sum list == sum (reverse list)
The compiler would need to know or take into account the following facts:
sum = foldl (+) 0
(+) is commutative
reverse list is a permutation of list
foldl x c l, where x is commutative and c is a constant, yields the same result for all permutations of l.
This all seems trivial. Sure, the compiler can most probably look up the definition of sum and inline it. It could be required that (+) be commutative, but remember that + is just another symbol without attached meaning to the compiler. The third point would require the compiler to prove some non-trivial properties about reverse.
But the point is:
You don't want the compiler to perform those calculations with each and every expression. Remember, to make this really useful, you'd have to heap up a lot of knowledge about many, many standard functions and operators.
You still can't replace the expression above with True unless you can rule out the possibility that list or some list element is bottom. Usually, one cannot do this. You can't even do the following "trivial" optimization of f x == f x in all cases
f x `seq` True
For, consider
f x = (undefined :: Bool, x)
then
f x `seq` True ==> True
f x == f x ==> undefined
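The difference can be observed directly. In this sketch the names seqVersion, eqVersion and eqVersionDiverges are mine, and f is fixed to Int so that the comparison type-checks; the exception from undefined is caught so that the program can report what happened.

```haskell
import Control.Exception (SomeException, evaluate, try)

f :: Int -> (Bool, Int)
f x = (undefined, x)

-- seq only forces the pair constructor, never the undefined inside it.
seqVersion :: Bool
seqVersion = f 1 `seq` True

-- (==) on pairs compares the first components, which forces undefined.
eqVersion :: Bool
eqVersion = f 1 == f 1

-- True when evaluating eqVersion throws instead of returning a Bool.
eqVersionDiverges :: IO Bool
eqVersionDiverges = do
  r <- try (evaluate eqVersion) :: IO (Either SomeException Bool)
  pure (either (const True) (const False) r)
```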
That being said, regarding your first example slightly modified for monomorphism:
f n = n * foldl (+) 0 [1..10] :: Int
it is imaginable to optimize the program by moving the expression out of its context and replace it with the name of a constant, like so:
const1 = foldl (+) 0 [1..10] :: Int
f n = n * const1
This is because the compiler can see that the expression must be constant.
What you're describing looks like super-compilation. In your case, if the expression had a monomorphic type like Int (as opposed to polymorphic Num a => a), the compiler could infer that the expression foldr (+) 0 [1 ..10] has no external dependencies, therefore it could be evaluated at compile time and replaced by 55. However, AFAIK no mainstream compiler currently does this kind of optimization.
(In functional programming "proving" is usually associated with something different. In languages with dependent types, types are powerful enough to express complex propositions, and then through the Curry-Howard correspondence programs become proofs of such propositions.)
As others have noted, it's unclear that your simplifications even hold in Haskell. For instance, I can define
newtype NInt = N Int
instance Num NInt where
  N a + _ = N a
  N b * _ = N b
  ... -- etc
and now sum . reverse :: Num a => [a] -> a does not equal sum :: Num a => [a] -> a since I can specialize each to [NInt] -> NInt where sum . reverse == sum clearly does not hold.
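A runnable version of this counterexample, as a sketch: the methods elided by "etc" are filled in with arbitrary choices of mine, and the fold is written out as foldr (+) 0, as in the question, so that the outcome does not depend on how the library's sum happens to associate.

```haskell
newtype NInt = N Int deriving (Eq, Show)

-- Deliberately lawless: both operators just return their left argument.
instance Num NInt where
  N a + _ = N a
  N a * _ = N a
  -- Arbitrary fillers for the remaining methods (an assumption).
  abs         = id
  signum      = id
  negate      = id
  fromInteger = N . fromInteger

-- N 1 + (N 2 + (N 3 + 0)) keeps the leftmost element.
forwards :: NInt
forwards = foldr (+) 0 [N 1, N 2, N 3]

-- After reversing, the leftmost element is N 3 instead.
backwards :: NInt
backwards = foldr (+) 0 (reverse [N 1, N 2, N 3])
```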
This is one general tension that exists around optimizing "complex" operations: you actually need quite a lot of information in order to successfully prove that it's okay to optimize something. This is why the syntax-level compiler optimizations which do exist are usually monomorphic and related to the structure of programs; it's usually such a simplified domain that there's "no way" for the optimization to go wrong. Even that is often unsafe because the domain is never quite so simplified and well-known to the compiler.
As an example, a very popular "high-level" syntactic optimization is stream fusion. In this case the compiler is given enough information to know that stream fusion can occur and is basically safe, but even in this canonical example we have to skirt around notions of non-termination.
So what does it take to have \x -> sum [0..x] get replaced by \x -> x*(x + 1)/2? The compiler would need a theory of numbers and algebra built-in. This is not possible in Haskell or ML, but becomes possible in dependently typed languages like Coq, Agda, or Idris. There you could specify things like
revCommute :: (_+_ :: a -> a -> a)
           -> Commutative _+_
           -> foldr _+_ z (reverse as) == foldr _+_ z as
and then, theoretically, tell the compiler to rewrite according to revCommute. This would still be difficult and finicky, but at least we'd have enough information around. To be clear, I'm writing something very strange above, a dependent type. The type not only depends on the ability to introduce both a type and a name for the argument inline, but also the existence of the entire syntax of your language "at the type level".
There are a lot of differences between what I just wrote and what you'd do in Haskell, though. First, in order to form a basis where such promises can be taken seriously, we must throw away general recursion (and thus we already don't have to worry about questions of non-termination like stream-fusion does). We also must have enough structure around to create something like the promise Commutative _+_ (this likely depends upon there being an entire theory of operators and mathematics built into the language's standard library, else you would need to create that yourself). Finally, the richness of type system required to even express these kinds of theories adds a lot of complexity to the entire system and tosses out type inference as you know it today.
But, given all that structure, I'd never be able to create an obligation Commutative _+_ for the _+_ defined to work on NInts and so we could be certain that foldr (+) 0 . reverse == foldr (+) 0 actually does hold.
But now we'd need to tell the compiler how to actually perform that optimization. For stream-fusion, the compiler rules only kick in when we write something in exactly the right syntactic form to be "clearly" an optimization redex. The same kinds of restrictions would apply to our sum . reverse rule. In fact, already we're sunk because
foldr (+) 0 . reverse
foldr (+) 0 (reverse as)
don't match. They're "obviously" the same due to some rules we could prove about (.), but that means that now the compiler must invoke two built-in rules in order to perform our optimization.
At the end of the day, you need a very smart optimization search over the sets of known laws in order to achieve the kinds of automatic optimizations you're talking about.
So not only do we add a lot of complexity to the entire system, require a lot of base work to build-in some useful algebraic theories, and lose Turing completeness (which might not be the worst thing), we also only get a finicky promise that our rule would even fire unless we perform an exponentially painful search during compilation.
Blech.
The compromise that exists today tends to be that sometimes we have enough control over what's being written to be mostly certain that a certain obvious optimization can be performed. This is the regime of stream fusion and it requires a lot of hidden types, carefully written proofs, exploitations of parametricity, and hand-waving before it's something the community trusts enough to run on their code.
And it doesn't even always fire. For an example of battling that problem take a look at the source of Vector for all of the RULES pragmas that specify all of the common circumstances where Vector's stream-fusion optimizations should kick in.
All of this is not at all a critique of compiler optimizations or dependent type theories. Both are really incredible. Instead it's just an amplification of the tradeoffs involved in introducing such an optimization. It's not to be done lightly.
Fun fact: Given two arbitrary formulas, do they both give the same output for the same inputs? The answer to this trivial question is not computable! In other words, it is mathematically impossible to write a computer program that always gives the correct answer in finite time.
Given this fact, it's perhaps not surprising that nobody has a compiler that can magically transform every possible computation into its most efficient form.
Also, isn't this the programmer's job? If you want the sum of an arithmetic sequence commonly enough that it's a performance bottleneck, why not just write some more efficient code yourself? Similarly, if you really want Fibonacci numbers (why?), use the O(1) algorithm.
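As a sketch of what "write it yourself" looks like for the arithmetic-progression example (sumTo and sumToFast are names invented here): note the guard, which is exactly the kind of side condition a compiler would have to discover on its own, since [0 .. n] is empty for negative n.

```haskell
-- Direct definition: one addition per element.
sumTo :: Integer -> Integer
sumTo n = sum [0 .. n]

-- Closed form n(n+1)/2, guarded for negative n where the list is empty.
sumToFast :: Integer -> Integer
sumToFast n
  | n < 0     = 0
  | otherwise = n * (n + 1) `div` 2
```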

Type algebra and Knuth's up arrow notation

Reading through this question and this blog post got me thinking more about type algebra and specifically how to abuse it.
Basically,
1) We can think of the Either A B type as addition: A+B
2) We can think of the ordered pair (A,B) as multiplication: A*B
3) We can think of the function A -> B as exponentiation: B^A
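The three correspondences can be checked by counting inhabitants of small types. This is only a sketch: Three is a made-up three-element type, and a function Three -> Bool is represented by its table of three outputs rather than as an actual function value.

```haskell
data Three = A | B | C deriving (Show, Eq, Enum, Bounded)

allThree :: [Three]
allThree = [minBound .. maxBound]

allBool :: [Bool]
allBool = [minBound .. maxBound]

-- Either Three Bool: 3 + 2 inhabitants.
sums :: [Either Three Bool]
sums = map Left allThree ++ map Right allBool

-- (Three, Bool): 3 * 2 inhabitants.
products :: [(Three, Bool)]
products = [(t, b) | t <- allThree, b <- allBool]

-- Three -> Bool: 2 ^ 3 inhabitants, one output table per function.
functionTables :: [[Bool]]
functionTables = mapM (const allBool) allThree
```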
There's an obvious pattern going on here: Multiplication is repeated addition, and exponentiation is repeated multiplication. This led Knuth to define the up arrow ↑ as exponentiation, ↑↑ as repeated exponentiation, ↑↑↑ as repeated ↑↑, and so on. Thus, 10↑↑↑↑10 is a HUGE number.
My question is: how can the function ↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑↑ be represented in algebraic data types? It seems like ↑ should be a function with an infinite number of arguments, but that doesn't make much sense. Would A↑B simply be [A] -> B and thus A↑↑↑↑B be [[[[A]]]]->B?
Bonus points if you can explain what the Ackermann function would look like, or any of the other hypergrowth functions.
At the most obvious level, you could identify a↑↑b with
((...(a -> a) -> ...) -> a) -- iterated b times
and a↑↑↑b is just
(a↑↑(a↑↑(...(a↑↑(a↑↑a))...))) -- iterated b times
so everything can be expressed in terms of some long function type (hence as some immensely long tuple type ...). But I don't think there's a convenient expression for an arbitrary up-arrow symbol in terms of (the cardinality of) familiar Haskell types (beyond the ones written above with ... or ↑), since I can't think of any common mathematical objects that have larger-than-exponential combinatorial dependencies on the size of the underlying sets (without going to recursive datatypes, which are too big) ... maybe there are some such objects in combinatorial set theory? (Your question seems [to me] more about the sizes of sets than anything specific to types.)
(The Wikipedia page you linked already connects these objects to the Ackermann function.)

Given a Haskell type signature, is it possible to generate the code automatically?

What it says in the title. If I write a type signature, is it possible to algorithmically generate an expression which has that type signature?
It seems plausible that it might be possible to do this. We already know that if the type is a special-case of a library function's type signature, Hoogle can find that function algorithmically. On the other hand, many simple problems relating to general expressions are actually unsolvable (e.g., it is impossible to know if two functions do the same thing), so it's hardly implausible that this is one of them.
It's probably bad form to ask several questions all at once, but I'd like to know:
Can it be done?
If so, how?
If not, are there any restricted situations where it becomes possible?
It's quite possible for two distinct expressions to have the same type signature. Can you compute all of them? Or even some of them?
Does anybody have working code which does this stuff for real?
Djinn does this for a restricted subset of Haskell types, corresponding to a first-order logic. It can't manage recursive types or types that require recursion to implement, though; so, for instance, it can't write a term of type (a -> a) -> a (the type of fix), which corresponds to the proposition "if a implies a, then a", which is clearly false; you can use it to prove anything. Indeed, this is why fix gives rise to ⊥.
If you do allow fix, then writing a program to give a term of any type is trivial; the program would simply print fix id for every type.
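A short sketch of both faces of fix (inhabitant and factorial are names chosen for this example): the first definition type-checks at every type but never produces a value when evaluated, while the second shows fix doing useful work as ordinary recursion.

```haskell
import Data.Function (fix)

-- Type-checks at any type; evaluating it never yields a value.
inhabitant :: a
inhabitant = fix id

-- The same combinator used productively: recursion without naming itself.
factorial :: Integer -> Integer
factorial = fix (\self n -> if n <= 1 then 1 else n * self (n - 1))
```

Laziness means inhabitant is harmless as long as nothing forces it, for example as the unused half of a pair.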
Djinn is mostly a toy, but it can do some fun things, like deriving the correct Monad instances for Reader and Cont given the types of return and (>>=). You can try it out by installing the djinn package, or using lambdabot, which integrates it as the @djinn command.
Oleg at okmij.org has an implementation of this. There is a short introduction here but the literate Haskell source contains the details and the description of the process. (I'm not sure how this corresponds to Djinn in power, but it is another example.)
There are cases where there is no unique function:
fst', snd' :: (a, a) -> a
fst' (a,_) = a
snd' (_,b) = b
Not only this; there are cases where there are an infinite number of functions:
list0, list1, list2 :: [a] -> a
list0 l = l !! 0
list1 l = l !! 1
list2 l = l !! 2
-- etc.
-- Or
mkList0, mkList1, mkList2 :: a -> [a]
mkList0 _ = []
mkList1 a = [a]
mkList2 a = [a,a]
-- etc.
(If you only want total functions, then consider [a] as restricted to infinite lists for list0, list1 etc, i.e. data List a = Cons a (List a))
In fact, if you have recursive types, any types involving these correspond to an infinite number of functions. However, at least in the case above, there is a countable number of functions, so it is possible to create an (infinite) list containing all of them. But, I think the type [a] -> [a] corresponds to an uncountably infinite number of functions (again restrict [a] to infinite lists) so you can't even enumerate them all!
(Summary: there are types that correspond to a finite, countably infinite and uncountably infinite number of functions.)
This is impossible in general (all the more so for languages like Haskell, which do not even have the strong normalization property), and only possible in some (very) special cases (and for more restricted languages), such as when the codomain type has only one constructor (for example, a function f :: forall a. a -> () can be determined uniquely). In order to reduce the set of possible definitions for a given signature to a singleton set with just one definition, one needs to give more restrictions (in the form of additional properties, for example; it is still difficult to imagine how this can be helpful without giving an example of use).
From the (n-)categorical point of view, types correspond to objects, terms correspond to arrows (constructors also correspond to arrows), and function definitions correspond to 2-arrows. The question is analogous to the question of whether one can construct a 2-category with the required properties by specifying only a set of objects. It's impossible, since you need either an explicit construction for arrows and 2-arrows (i.e., writing terms and definitions), or a deductive system which allows one to deduce the necessary structure using a certain set of properties (that still need to be defined explicitly).
There is also an interesting question: given an ADT (i.e., a subcategory of Hask), is it possible to automatically derive instances for Typeable, Data (yes, using SYB), Traversable, Foldable, Functor, Pointed, Applicative, Monad, etc.? In this case, we have the necessary signatures as well as additional properties (for example, the monad laws; these properties cannot be expressed in Haskell, but they can be expressed in a language with dependent types). There are some interesting constructions:
http://ulissesaraujo.wordpress.com/2007/12/19/catamorphisms-in-haskell
which shows what can be done for the list ADT.
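For the structural classes in that list, GHC can already generate the instances from the type definition alone. A minimal sketch, assuming the DeriveFunctor, DeriveFoldable and DeriveTraversable extensions; the Tree type and the helper names are made up for illustration.

```haskell
{-# LANGUAGE DeriveFunctor, DeriveFoldable, DeriveTraversable #-}

-- All three instances are generated from the shape of the type.
data Tree a = Leaf | Node (Tree a) a (Tree a)
  deriving (Show, Eq, Functor, Foldable, Traversable)

example :: Tree Int
example = Node (Node Leaf 1 Leaf) 2 (Node Leaf 3 Leaf)

-- Uses the derived Functor instance.
doubled :: Tree Int
doubled = fmap (* 2) example

-- Uses the derived Foldable instance.
total :: Int
total = sum example

-- Uses the derived Traversable instance: Nothing if any element fails.
allPositive :: Tree Int -> Maybe (Tree Int)
allPositive = traverse (\x -> if x > 0 then Just x else Nothing)
```

Applicative and Monad are a different matter, since a type can admit several lawful instances and the compiler has no way to pick one.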
The question is actually rather deep and I'm not sure of the answer, if you're asking about the full glory of Haskell types including type families, GADT's, etc.
What you're asking is whether a program can automatically prove that an arbitrary type is inhabited (contains a value) by exhibiting such a value. A principle called the Curry-Howard Correspondence says that types can be interpreted as mathematical propositions, and the type is inhabited if the proposition is constructively provable. So you're asking if there is a program that can prove a certain class of propositions to be theorems. In a language like Agda, the type system is powerful enough to express arbitrary mathematical propositions, and proving arbitrary ones is undecidable by Gödel's incompleteness theorem. On the other hand, if you drop down to (say) pure Hindley-Milner, you get a much weaker and (I think) decidable system. With Haskell 98, I'm not sure, because type classes are supposed to be able to be equivalent to GADT's.
With GADT's, I don't know if it's decidable or not, though maybe some more knowledgeable folks here would know right away. For example it might be possible to encode the halting problem for a given Turing machine as a GADT, so there is a value of that type iff the machine halts. In that case, inhabitability is clearly undecidable. But, maybe such an encoding isn't quite possible, even with type families. I'm not currently fluent enough in this subject for it to be obvious to me either way, though as I said, maybe someone else here knows the answer.
(Update:) Oh a much simpler interpretation of your question occurs to me: you may be asking if every Haskell type is inhabited. The answer is obviously not. Consider the polymorphic type
a -> b
There is no function with that signature (not counting something like unsafeCoerce, which makes the type system inconsistent).
