Haskell "Not a data constructor" [duplicate] - haskell

Thanks to some excellent answers here, I generally understand (clearly in a limited way) the purpose of Haskell's Maybe and that its definition is
data Maybe a = Nothing | Just a
however I'm not entirely clear on exactly why Just is a part of this definition. As near as I can tell, this is where Just itself is defined, but the relevant documentation doesn't say much about it.
Am I correct in thinking that the primary benefit of using Just in the definition of Maybe, rather than simply
data Maybe a = Nothing | a
is that it allows for pattern matching with Just _ and for useful functionality like isJust and fromJust?
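For instance, I understand Just enables things like this (a quick sketch using the standard Data.Maybe helpers):
import Data.Maybe (isJust, fromJust)

label :: Maybe Int -> String
label (Just n) = "got " ++ show n   -- pattern matching with Just _
label Nothing  = "got nothing"

-- isJust (Just 5) == True
-- fromJust (Just 5) == 5; fromJust Nothing throws an error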
Why is Maybe defined in the former way rather than the latter?

Haskell's algebraic data types are tagged unions. By design, when you combine two different types into another type, they have to have constructors to disambiguate them.
Your definition does not fit with how algebraic data types work.
data Maybe a = Nothing | a
There's no "tag" for a here. How would we tell an Maybe a apart from a normal, unwrapped a in your case?
Maybe has a Just constructor because it has to have a constructor by design.
Other languages do have union types which could work like what you imagine, but they would not be a good fit for Haskell. They play out differently in practice and tend to be somewhat error-prone.
There are some strong design reasons for preferring tagged unions to normal union types. They play well with type inference. Unions in real code often have a tag anyhow¹. And, from the point of view of elegance, tagged unions are a natural fit for the language because they are the dual of products (i.e. tuples and records). If you're curious, I wrote about this in a blog post introducing and motivating algebraic data types.
footnotes
¹ I've played with union types in two places: TypeScript and C. TypeScript compiles to JavaScript which is dynamically typed, meaning it keeps track of the type of a value at runtime—basically a tag.
C doesn't but, in practice, something like 90% of the uses of union types either have a tag or effectively emulate struct subtyping. One of my professors actually did an empirical study on how unions are used in real C code, but I don't remember what paper it was in off-hand.

Another way to look at it (in addition to Tikhon's answer) is to consider another one of the basic Haskell types, Either, which is defined like this:
-- | A value that contains either an @a@ (the 'Left' constructor) or
-- a @b@ (the 'Right' constructor).
data Either a b = Left a | Right b
This allows you to have values like these:
example1, example2 :: Either String Int
example1 = Left "Hello!"
example2 = Right 42
...but also like this one:
example3, example4 :: Either String String
example3 = Left "Hello!"
example4 = Right "Hello!"
The type Either String String, the first time you encounter it, sounds like "either a String or a String," and you might therefore think that it's the same as just String. But it isn't, because Haskell unions are tagged unions, and therefore an Either String String records not just a String, but also which of the "tags" (data constructors; in this case Left and Right) was used to construct it. So even though both alternatives carry a String as their payload, you're able to tell how any one value was originally built. This is good because there are lots of cases where the alternatives are the same type but the constructors/tags impart extra meaning:
data ResultMessage = FailureMessage String | SuccessMessage String
Here the data constructors are FailureMessage and SuccessMessage, and you can guess from the names that even though the payload in both cases is a String, they would mean very different things!
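For instance, a consumer of this type might dispatch on the tag (report is a hypothetical handler, just to illustrate):
report :: ResultMessage -> String
report (FailureMessage msg) = "ERROR: " ++ msg
report (SuccessMessage msg) = "OK: " ++ msg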
So bringing it back to Maybe/Just, what's happening here is that Haskell just uniformly works like that: every alternative of a union type has a distinct data constructor that must always be used to construct and pattern match values of its type. Even when it might seem possible to guess the constructor from context, Haskell never does so.
There are other reasons, a bit more technical. First, the rules for lazy evaluation are defined in terms of data constructors. The short version: lazy evaluation means that if Haskell is forced to peek inside of a value of type Maybe a, it will try to do the bare minimum amount of work needed to figure out whether it looks like Nothing or like Just x—preferably it won't peek inside the x when it does this.
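You can see this with a payload that would blow up if evaluated; inspecting only the constructor is safe:
import Data.Maybe (isJust)

-- isJust looks only at the outermost constructor, so the undefined
-- payload inside the Just is never evaluated.
demo :: Bool
demo = isJust (Just undefined)   -- True, no exception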
Second: the language needs to be able to distinguish types like Maybe a, Maybe (Maybe a) and Maybe (Maybe (Maybe a)). If you think about it, if we had a type definition that worked like you wrote:
data Maybe a = Nothing | a -- NOT VALID HASKELL
...and we wanted to make a value of type Maybe (Maybe a), you wouldn't be able to tell these two values apart:
example5, example6 :: Maybe (Maybe a)
example5 = Nothing
example6 = Just Nothing
This might seem a bit silly at first, but imagine you have a map whose values are "nullable":
-- Map of persons to their favorite number. If we know that some person
-- doesn't have a favorite number, we store `Nothing` as the value for
-- that person.
favoriteNumber :: Map Person (Maybe Int)
...and want to look up an entry:
Map.lookup :: Ord k => k -> Map k v -> Maybe v
So if we look up mary in the map we have:
Map.lookup mary favoriteNumber :: Maybe (Maybe Int)
And now the result Nothing means Mary's not in the map, while Just Nothing means Mary's in the map but she doesn't have a favorite number.
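Pattern matching lets us handle the three outcomes separately (describeLookup is a made-up name for illustration):
describeLookup :: Maybe (Maybe Int) -> String
describeLookup Nothing         = "not in the map"
describeLookup (Just Nothing)  = "in the map, but no favorite number"
describeLookup (Just (Just n)) = "favorite number is " ++ show n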

Just is a constructor: a alone would be a value of type a, whereas Just a constructs a value of the different type Maybe a.

Maybe a is designed to have exactly one more value than the type a. In type theory it is sometimes written as 1 + a (up to isomorphism), which makes that fact even more evident.
As an experiment, consider the type Maybe (Maybe Bool). Here we have 1 + (1 + 2) = 4 values, namely:
Nothing
Just Nothing
Just (Just False)
Just (Just True)
If we were allowed to define
data Maybe a = Nothing | a
we would lose the distinction between the cases Just Nothing and Nothing, since there would no longer be a Just to tell them apart. Indeed, Maybe (Maybe a) would collapse into Maybe a. This would be an inconvenient special case.

Related

Monad "unboxing"

My question came up while following the tutorial Functors, Applicatives, And Monads In Pictures and its JavaScript version.
When the text says that a functor unwraps the value from its context, I understand that a Just 5 -> 5 transformation is happening. As per What does the "Just" syntax mean in Haskell?, Just is "defined in scope" of the Maybe monad.
My question is: what is so magical about the whole unwrapping thing? I mean, what would be the problem with a language rule that automatically unwraps the "scoped" variables? It looks to me as if this action were merely a lookup in some kind of table where the symbol Just 5 corresponds to the integer 5.
My question is inspired by the JavaScript version, where Just 5 is an array instance. So unwrapping is, indeed, not rocket science at all.
Is this a "for-computation" type of reason or a "for-programmer" one? Why do we distinguish Just 5 from 5 at the programming-language level?
First of all, I don't think you can understand Monads and the like without understanding a Haskell-like type system (i.e. without learning a language like Haskell). Yes, there are many tutorials that claim otherwise, but I've read a lot of them before learning Haskell and I didn't get it. So my advice: if you want to understand Monads, learn at least some Haskell.
To your question "Why do we distinguish Just 5 from 5 on the programming language level?". For type safety. In most languages that happen not to be Haskell null, nil, whatever, is often used to represent the absence of a value. This however often results in things like NullPointerExceptions, because you didn't anticipate that a value may not be there.
In Haskell there is no null. So if you have a value of type Int, or anything else, that value cannot be null. You are guaranteed that there is a value. Great! But sometimes you actually want/need to encode the absence of a value. In Haskell we use Maybe for that. So something of type Maybe Int can either be something like Just 5 or Nothing. This way it is explicit that the value may not be there, and you cannot accidentally forget that it might be Nothing, because you have to explicitly unwrap the value.
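For example, to get an Int out of a Maybe Int you must decide what happens in the Nothing case, e.g. with fromMaybe from Data.Maybe:
import Data.Maybe (fromMaybe)

-- Supply a default value for the Nothing case.
withDefault :: Maybe Int -> Int
withDefault = fromMaybe 0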
This has nothing really to do with Monads, except that Maybe happens to implement the Monad type class (a type class is a bit like a Java interface, if you are familiar with Java). That is, Maybe is not primarily a Monad; it just happens to also be one.
I think you're looking at this from the wrong direction. Monad is explicitly not about unwrapping. Monad is about composition.
It lets you combine (not necessarily apply) a function of type a -> m b with a value of type m a to get a value of type m b. I can understand where you might think the obvious way to do that is unwrapping the value of type m a into a value of type a. But very few Monad instances work that way. In fact, the only ones that can work that way are the ones that are equivalent to the Identity type. For nearly all instances of Monad, it's just not possible to unwrap a value.
Consider Maybe. Unwrapping a value of type Maybe a into a value of type a is impossible when the starting value is Nothing. Monadic composition has to do something more interesting than just unwrapping.
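Concretely, the Maybe instance of (>>=) behaves like this standard definition:
bindMaybe :: Maybe a -> (a -> Maybe b) -> Maybe b
bindMaybe Nothing  _ = Nothing   -- nothing to unwrap: short-circuit
bindMaybe (Just x) f = f x       -- the payload is used, not returned bare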
Consider []. Unwrapping a value of type [a] into a value of type a is impossible unless the input just happens to be a list of length 1. In every other case, monadic composition is doing something more interesting than unwrapping.
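For lists, (>>=) is concatMap rather than any kind of extraction; for example:
-- Each element produces a whole list, and the results are concatenated.
withNegations :: [Int] -> [Int]
withNegations xs = xs >>= \x -> [x, negate x]
-- withNegations [1,2,3] == [1,-1,2,-2,3,-3]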
Consider IO. A value like getLine :: IO String doesn't contain a String value. It's plain impossible to unwrap, because it isn't wrapping something. Monadic composition of IO values doesn't unwrap anything. It combines IO values into more complex IO values.
I think it's worthwhile to adjust your perspective on what Monad means. If it were only an unwrapping interface, it would be pretty useless. It's more subtle, though. It's a composition interface.
A possible example is this: consider the Haskell type Maybe (Maybe Int). Its values can be of the following form
Nothing
Just Nothing
Just (Just n) for some integer n
Without the Just wrapper we couldn't distinguish between the first two.
Indeed, the whole point of the optional type Maybe a is to add a new value (Nothing) to an existing type a. To ensure this Nothing is indeed a fresh value, we wrap the other values inside Just.
It also helps during type inference. When we see the function call f 'a' we can see that f is called at the type Char, and not at type Maybe Char or Maybe (Maybe Char). The typeclass system would allow f to have a different implementation in each of these cases (this is similar to "overloading" in some OOP languages).
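For instance (Describe is a made-up class, just to illustrate the overloading):
class Describe a where
  describe :: a -> String

instance Describe Char where
  describe c = "a plain Char: " ++ [c]

instance Describe a => Describe (Maybe a) where
  describe Nothing  = "a missing value"
  describe (Just x) = "a present value: " ++ describe x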
My question is, what is so magical about the whole unwrapping thing?
There is nothing magical about it. You can use garden-variety pattern matching (here in the shape of a case expression) to define...
mapMaybe :: (a -> b) -> Maybe a -> Maybe b
mapMaybe f mx = case mx of
    Just x  -> Just (f x)
    Nothing -> Nothing
... which is exactly the same as fmap for Maybe. The only thing the Functor class adds -- and it is a very useful thing, make no mistake -- is an extra level of abstraction that covers the various structures that can be mapped over.
Why do we distinguish Just 5 from 5 on programming language level?
More meaningful than the distinction between Just 5 and 5 is the one between their types -- e.g. between Maybe Int and Int. If you have x :: Int, you can be certain x is an Int value you can work with. If you have mx :: Maybe Int, however, you have no such certainty, as the Int might be missing (i.e. mx might be Nothing), and the type system forces you to acknowledge and deal with this possibility.
See also: jpath's answer for further comments on the usefulness of Maybe (which isn't necessarily tied to classes such as Functor and Monad); Carl's answer for further comments on the usefulness of classes like Functor and Monad (beyond the Maybe example).
What "unwrap" means depends on the container. Maybe is just one example. "Unwrapping" means something completely different when the container is [] instead of Maybe.
The magic in the whole unwrapping thing is the abstraction: in a Monad we have a notion of "unwrapping" which abstracts over the nature of the container; and then it starts to get "magical"...
You ask what Just means: Just is nothing but a data constructor in Haskell, defined via a data declaration like:
data Maybe a = Just a | Nothing
Just takes a value of type a and creates a value of type Maybe a. It's Haskell's way to distinguish values of type a from values of type Maybe a.
First of all, you need to remove monads from your question; they have nothing to do with this. Treat that article as just one point of view on monads; if it does not suit you, it may be that you still understand too little of the type system to understand monads in Haskell.
And so your question can be rephrased as: why is there no implicit conversion Just 5 => 5? The answer is very simple: the value Just 5 has type Maybe Integer, so it might instead have been Nothing, and what should the compiler do in that case? Only the programmer can resolve this situation.
But there is a more uncomfortable question. There are types such as newtype Identity a = Identity a, which is just a wrapper around some value. So why is there no implicit conversion Identity a => a?
The simple answer is that any attempt to allow this would lead to a different type system, one lacking many of the fine qualities of the current one. Those conversions are sacrificed for the benefit of other possibilities.

Why does Maybe include Just?


Why can't value constructors take type variables WITHOUT parameterizing the type?

As a beginner, it's not obvious to me why this is not allowed:
data Pair = Pair a b
That is, why do Pair 5 "foo" and Pair 'C' [] HAVE to produce different types? Why is it not allowed for them both to create values of type Pair?
I'm learning from "Learn you a", RWH, and the Haskell WikiBook, but have not been able to find the kind of precise, wonky language describing parametrized types that I'm looking for.
Fundamentally, the issue is that you would have no information about the contents of Pair. If all you know is that it contains a value of any type, the only real function you could use on it would be id, which is pretty useless!
The problem is that, since each value could be anything, you have no guarantees about them at all. So you couldn't even use ==: what if the value was a function? You can't compare functions for equality!
Imagine writing a function acting on your hypothetical Pair type:
fn (Pair a b) = ...
What other functions could you use on a and b?
Anything with any sort of concrete type (e.g. Int -> Int or something) wouldn't work because you can't tell if a is an Int. More complicated types like Num n => n -> n wouldn't work because you don't even know if a is a number. The only functions that would work are ones with types like t1 -> t1 or t1 -> t2. However, the only reasonable function of the first type is id and there is no reasonable function of the second type at all.
Now, you could just say "I'm going to try this function; if the type doesn't work, throw an error." But this would then be dynamic typing and would basically be throwing away the type system entirely. This sounds horrible, but it might make sense sometimes, so you can use Data.Dynamic to accomplish something like that. However, you shouldn't worry about it as a beginner, and chances are you will never need to use it--I haven't, so far. I'm just including it for the sake of completeness.
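A minimal sketch of what using Data.Dynamic looks like (not something you need as a beginner):
import Data.Dynamic (Dynamic, toDyn, fromDynamic)

box :: Dynamic
box = toDyn (42 :: Int)

-- fromDynamic returns Nothing when asked for the wrong type.
unboxInt :: Maybe Int
unboxInt = fromDynamic box   -- Just 42

unboxStr :: Maybe String
unboxStr = fromDynamic box   -- Nothing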
With the existential types language extension you can define such a type:
{-# LANGUAGE ExistentialQuantification #-}
data Pair = forall a b. Pair a b
a, b :: Pair
a = Pair 1 2
b = Pair "abc" 'x'
Here both a and b have the same type.
Usually this isn't done this way because to do anything useful with a Pair you'd need to know what it contains, and the definition of Pair removes all that information.
So you can create such values if you really want, but it's hard to find anything useful to do with them.
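One common variation, for what it's worth, is to attach a class constraint to the existential so that at least something can be done with the fields (a sketch, assuming Show is enough for your purposes):
data ShowPair = forall a b. (Show a, Show b) => ShowPair a b

-- The Show constraint is all we know about a and b, so showing
-- them is essentially the only thing we can do.
showPair :: ShowPair -> String
showPair (ShowPair a b) = "(" ++ show a ++ ", " ++ show b ++ ")"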

Given a Haskell type signature, is it possible to generate the code automatically?

What it says in the title. If I write a type signature, is it possible to algorithmically generate an expression which has that type signature?
It seems plausible that it might be possible to do this. We already know that if the type is a special-case of a library function's type signature, Hoogle can find that function algorithmically. On the other hand, many simple problems relating to general expressions are actually unsolvable (e.g., it is impossible to know if two functions do the same thing), so it's hardly implausible that this is one of them.
It's probably bad form to ask several questions all at once, but I'd like to know:
Can it be done?
If so, how?
If not, are there any restricted situations where it becomes possible?
It's quite possible for two distinct expressions to have the same type signature. Can you compute all of them? Or even some of them?
Does anybody have working code which does this stuff for real?
Djinn does this for a restricted subset of Haskell types, corresponding to a first-order logic. It can't manage recursive types or types that require recursion to implement, though; so, for instance, it can't write a term of type (a -> a) -> a (the type of fix), which corresponds to the proposition "if a implies a, then a", which is clearly false; you can use it to prove anything. Indeed, this is why fix gives rise to ⊥.
If you do allow fix, then writing a program to give a term of any type is trivial; the program would simply print fix id for every type.
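That degenerate "solution" looks like this:
import Data.Function (fix)

-- A term inhabiting every type; forcing it loops forever (it denotes ⊥).
anything :: a
anything = fix id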
Djinn is mostly a toy, but it can do some fun things, like deriving the correct Monad instances for Reader and Cont given the types of return and (>>=). You can try it out by installing the djinn package, or using lambdabot, which integrates it as the #djinn command.
Oleg at okmij.org has an implementation of this. There is a short introduction here but the literate Haskell source contains the details and the description of the process. (I'm not sure how this corresponds to Djinn in power, but it is another example.)
There are cases where there is no unique function:
fst', snd' :: (a, a) -> a
fst' (a,_) = a
snd' (_,b) = b
Not only that; there are cases where there is an infinite number of functions:
list0, list1, list2 :: [a] -> a
list0 l = l !! 0
list1 l = l !! 1
list2 l = l !! 2
-- etc.
-- Or
mkList0, mkList1, mkList2 :: a -> [a]
mkList0 _ = []
mkList1 a = [a]
mkList2 a = [a,a]
-- etc.
(If you only want total functions, then consider [a] as restricted to infinite lists for list0, list1 etc, i.e. data List a = Cons a (List a))
In fact, if you have recursive types, any types involving these correspond to an infinite number of functions. However, at least in the case above, there is a countable number of functions, so it is possible to create an (infinite) list containing all of them. But, I think the type [a] -> [a] corresponds to an uncountably infinite number of functions (again restrict [a] to infinite lists) so you can't even enumerate them all!
(Summary: there are types that correspond to a finite, countably infinite and uncountably infinite number of functions.)
This is impossible in general (especially for languages like Haskell, which does not even have the strong normalization property), and only possible in some (very) special cases (and for more restricted languages), such as when the codomain type has only one constructor (for example, a function f :: forall a. a -> () is determined uniquely). In order to reduce the set of possible definitions for a given signature to a singleton set with just one definition, you need to impose more restrictions (in the form of additional properties; it is difficult to imagine how this could be helpful without an example of use).
From the (n-)categorical point of view, types correspond to objects, terms correspond to arrows (constructors also correspond to arrows), and function definitions correspond to 2-arrows. The question is analogous to asking whether one can construct a 2-category with the required properties by specifying only a set of objects. That is impossible, since you need either an explicit construction for the arrows and 2-arrows (i.e., writing terms and definitions), or a deductive system which allows the necessary structure to be deduced from a certain set of properties (which still need to be stated explicitly).
There is also an interesting question: given an ADT (i.e., a subcategory of Hask), is it possible to automatically derive instances for Typeable, Data (yes, using SYB), Traversable, Foldable, Functor, Pointed, Applicative, Monad, etc.? In this case, we have the necessary signatures as well as additional properties (for example, the monad laws; these properties cannot be expressed in Haskell, but they can be expressed in a language with dependent types). There are some interesting constructions along these lines:
http://ulissesaraujo.wordpress.com/2007/12/19/catamorphisms-in-haskell
which shows what can be done for the list ADT.
The question is actually rather deep and I'm not sure of the answer, if you're asking about the full glory of Haskell types including type families, GADT's, etc.
What you're asking is whether a program can automatically prove that an arbitrary type is inhabited (contains a value) by exhibiting such a value. A principle called the Curry-Howard Correspondence says that types can be interpreted as mathematical propositions, and the type is inhabited if the proposition is constructively provable. So you're asking if there is a program that can prove a certain class of propositions to be theorems. In a language like Agda, the type system is powerful enough to express arbitrary mathematical propositions, and proving arbitrary ones is undecidable by Gödel's incompleteness theorem. On the other hand, if you drop down to (say) pure Hindley-Milner, you get a much weaker and (I think) decidable system. With Haskell 98, I'm not sure, because type classes are supposed to be equivalent to GADTs.
With GADT's, I don't know if it's decidable or not, though maybe some more knowledgeable folks here would know right away. For example it might be possible to encode the halting problem for a given Turing machine as a GADT, so there is a value of that type iff the machine halts. In that case, inhabitability is clearly undecidable. But, maybe such an encoding isn't quite possible, even with type families. I'm not currently fluent enough in this subject for it to be obvious to me either way, though as I said, maybe someone else here knows the answer.
(Update:) Oh a much simpler interpretation of your question occurs to me: you may be asking if every Haskell type is inhabited. The answer is obviously not. Consider the polymorphic type
a -> b
There is no function with that signature (not counting something like unsafeCoerce, which makes the type system inconsistent).

Haskell set datatype/datastructure

What I want to do is to create a type Set in Haskell to represent a generic (polymorphic) set, e.g. {1, 'x', "aasdf", Phi}.
First, I want to make clear that in my program I want to treat Phi (the empty set) as something that belongs to all sets.
Here is my code:
data Set a b = Phi | Cons a (Set a b)
    deriving (Show, Eq, Ord)

isMember Phi _ = True
isMember _ Phi = False
isMember x (Cons a b) = if x == a
                          then True
                          else isMember x b
I'm facing a couple of problems:
I want isMember type to be
isMember :: Eq a => a -> Set a b -> Bool
but according to my code it is
isMember :: Eq a => Set a b -> Set (Set a b) c -> Bool
If I have a set of different types, the == operator doesn't work correctly, so I need some help please :D
Regarding your type error, the problem looks like the first clause to me:
isMember Phi _ = True
This is an odd clause to write, because Phi is an entire set, not a set element. Just deleting it should give you a function of the type you expect.
Observe that your Set type never makes use of its second type argument, so it could be written instead as
data Set a = Phi | Cons a (Set a)
...and at that point you should just use [a], since it's isomorphic and has a huge entourage of functions already written for using and abusing them.
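Putting both fixes together (dropping the first clause and the unused type parameter) gives:
data Set a = Phi | Cons a (Set a)
    deriving (Show, Eq, Ord)

isMember :: Eq a => a -> Set a -> Bool
isMember _ Phi        = False
isMember x (Cons a b) = x == a || isMember x b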
Finally, you ask to be able to put things of different types in. The short answer is that Haskell doesn't really swing that way. It's all about knowing exactly what kind of type a thing is at compile time, which isn't really compatible with what you're suggesting. There are actually some ways to do this; however, I strongly recommend getting much more familiar with Haskell's particular brand of type bondage before trying to take the bonds off.
A) Doing this is almost always not what you actually want.
B) There are a variety of ways to do this from embedding dynamic types (Dynamic) to using very complicated types (HList).
C) Here's a page describing some ways and issues: http://www.haskell.org/haskellwiki/Heterogenous_collections
D) If you're really going to do this, I'd suggest HList: http://homepages.cwi.nl/~ralf/HList/
E) But if you start to look at the documentation / HList paper and find yourself hopelessly confused, fall back to the dynamic solution (or better yet, rethink why you need this) and come back to HLists once you're significantly more comfortable with Haskell.
(Oh yes, and the existential solution described on that page is probably a terrible idea, since it almost never does anything particularly useful for you).
What you are trying to do is very difficult, as Haskell does not store any type information at runtime by default. Two modules that are very useful for such things are Data.Typeable and Data.Dynamic. They provide support for storing a monomorphic (!) type and support for dynamic monomorphic typing.
I have not attempted to code something like this previously, but I have some ideas for how to accomplish it:
1. Each element of your set is a triple (or quadruple) of the following things:
- A TypeRep of the stored data type
- The value itself, coerced into an Any.
- A comparison function (you can only use monomorphic values, so you somehow have to store the context)
- Similarly, a function to show the values.
2. Your set actually has two dimensions: first a tree keyed by the TypeRep, and then a list of values.
3. Whenever you insert a value, you coerce it into an Any and store all the required stuff together with it, as explained in (1), and put it in the right position as in (2).
4. When you want to find an element, you generate its TypeRep and find the subtree of the right type. Then you just compare each sub-element with the value you want to find.
Those are just some random thoughts. I guess it's actually much easier to use Dynamic.
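A rough sketch of the Dynamic route (DynSet and the helper names are made up for illustration):
import Data.Dynamic (Dynamic, toDyn, fromDynamic)
import Data.Typeable (Typeable)

-- A "heterogeneous set" as a plain list of Dynamics.
type DynSet = [Dynamic]

insert :: Typeable a => a -> DynSet -> DynSet
insert x s = toDyn x : s

-- A query can only match values stored at the same type.
member :: (Typeable a, Eq a) => a -> DynSet -> Bool
member x = any ((== Just x) . fromDynamic)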
