Why is context reduction necessary? - haskell

I've just read this paper ("Type classes: an exploration of the design space" by Peyton Jones & Jones), which explains some challenges with the early typeclass system of Haskell, and how to improve it.
Many of the issues that they raise are related to context reduction which is a way to reduce the set of constraints over instance and function declarations by following the "reverse entailment" relationship.
e.g. if you have somewhere instance (Ord a, Ord b) => Ord (a, b) ... then within contexts, Ord (a, b) gets reduced to {Ord a, Ord b} (reduction does not always shrink the number of constrains).
I did not understand from the paper why this reduction was necessary.
Well, I gathered it was used to perform some form of type checking. When you have your reduced set of constraint, you can check that there exist some instance that can satisfy them, otherwise it's an error. I'm not too sure what the added value of that is, since you would notice the problem at the use site, but okay.
But even if you have to do that check, why use the result of reduction inside inferred types? The paper points out it leads to unintuitive inferred types.
The paper is quite ancient (1997) but as far as I can tell, context reduction is still an ongoing concern. The Haskell 2010 spec does mention the inference behaviour I explain above (link).
So, why do it this way?

I don't know if this is The Reason, necessarily, but it might be considered A Reason: in early Haskell, type signatures were only permitted to have "simple" constraints, namely, a type class name applied to a type variable. Thus, for example, all of these were okay:
Ord a => a -> a -> Bool
Eq a => a -> a -> Bool
Graph gr => gr n e -> [n]
But none of these:
Ord (Tree a) => Tree a -> Tree a -> Bool
Eq (a -> b) => (a -> b) -> (a -> b) -> Bool
Graph Gr => Gr n e -> [n]
I think there was a feeling then -- and still today, as well -- that allowing the compiler to infer a type which one couldn't write manually would be a bit unfortunate. Context reduction was a way of turning the above signatures either into ones that could be written by hand as well or an informative error. For example, since one might reasonably have
instance Ord a => Ord (Tree a)
in scope, we could turn the illegal signature Ord (Tree a) => ... into the legal signature Ord a => .... On the other hand, if we don't have any instance of Eq for functions in scope, one would report an error about the type which was inferred to require Eq (a -> b) in its context.
This has a couple of other benefits:
Intuitively pleasing. Many of the context reduction rules do not change whether the type is legal, but do reflect things humans would do when writing the type. I'm thinking here of the de-duplication and subsumption rules that let you turn, e.g. (Eq a, Eq a, Ord a) into just Ord a -- a transformation one definitely would want to do for readability.
This can frequently catch stupid errors; rather than inferring a type like Eq (Integer -> Integer) => Bool which can't be satisfied in a law-abiding way, one can report an error like Perhaps you did not apply a function to enough arguments?. Much friendlier!
It becomes the compiler's job to pinpoint what went wrong. Instead of inferring a complicated context like Eq (Tree (Grizwump a, [Flagle (Gr n e) (Gr n' e') c])) and complaining that the context is not satisfiable, it instead is forced to reduce this to the constituent constraints; it will instead complain that we couldn't determine Eq (Grizwump a) from the existing context -- a much more precise and actionable error.

I think this is indeed desirable in a dictionary passing implementation. In such an implementation, a "dictionary", that is, a tuple or record of functions is passed as implicit argument for every type class constraint in the type of the applied function.
Now, the question is simply when and how those dictionaries are created. Observe that for simple types like Int by necessity all dictionaries for whatever type class Int is an instance of will be a constant.
Not so in the case of parameterized types like lists, Maybe or tuples. It is clear that to show a tuple, for instance, the Show instances of the actual tuple elements need to be known. Hence such a polymorphic dictionary cannot be a constant.
It appears that the principle guiding the dictionary passing is such that only dictionaries for types that appear as type variables in the type of the applied function are passed. Or, to put it differently: no redundant information is replicated.
Consider this function:
f :: (Show a, Show b) => (a,b) -> Int
f ab = length (show ab)
The information that a tuple of show-able components is also showable, thus a constraint like Show (a,b) needs not to appear when we already know (Show a, Show b).
An alternative implementation would be possible, though, where the caller .would be responsible to create and pass dictionaries. This could work without context reduction, such that the type of f would look like:
f :: Show (a,b) => (a,b) -> Int
But this would mean that the code to create the tuple dictionary would have to be repeated on every call site. And it is easy to come up with examples where the number of necessary constraints actually increases, like in:
g :: (Show (a,a), Show(b,b), Show (a,b), Show (b, a)) => a -> b -> Int
g a b = maximum (map length [show (a,a), show (a,b), show (b,a), show(b,b)])
It is instructive to implement a type class/instance system with actual records that are explicitly passed. For example:
data Show' a = Show' { show' :: a -> String }
showInt :: Show' Int
showInt = Show' { show' = intshow } where
intshow :: Int -> String
intshow = show
Once you do this you will probably easily recognize the need for "context reduction".

Related

Relationship between Haskell's 'forall' and '=>'

I'm having trouble wrapping my mind around the relationship (and interactions) between Haskell's forall and => (and for that matter the . that often connects them).
For example
λ> :t (+)
λ> :t id
give
(+) :: forall a. Num a => a -> a -> a
id :: forall a. a -> a
and while I understand how these work in these specific cases, I'm not comfortable parsing the expressions (signatures?) forall a. Num a => or forall a. themselves into something meaningful, or that I can generally understand in more complex contexts.
What do forall a. Num a => and forall a. mean? Specifically, what is the roles played in each by forall, => and a?
(As another perspective, without invoking the "implicit dictionary passing" implementation of type classes):
forall a. in Haskell means "for every type a".1 It's introducing a type variable, and declaring that the rest of the type expression has to be valid whatever choice is made for a.
You usually don't see it in basic Haskell (without turning on any extensions in GHC), because it's not necessary; you just use type variables in your type signature, and GHC automatically assumes there are foralls introducing those variables at the start of the expression.
For example:
zip :: forall a. ( forall b. ( [a] -> [b] -> [(a, b)] ))
zip :: forall a. forall b. [a] -> [b] -> [(a, b)]
zip :: forall a b. [a] -> [b] -> [(a, b)]
zip :: [a] -> [b] -> [(a, b)]
The above are all the same; they just tell us that zip can be a way of zipping a list of a together with a list of b to make a list of (a, b) pairs, whatever choice we feel like making for a and b.
forall mainly comes into play with extensions, because then you can introduce type variables with scopes other than the default ones assumed by GHC if you don't explicitly write them.
Now, the constraints => type syntax can be read roughly as "these constraints imply this type", or "provided these constraints hold, you can use this type". It's used all the time, even in vanilla Haskell with no extensions, so it's important to understand what it means and how it works and not just copy and paste and hope.
The => arrow allows us to state a set of constraints on the variables in the rest of the type expression; it lets us put limitations on what choices can be made to introduce the type variable. You should read it first by ignoring everything left of the => arrow, and reading the the right part on its own. This gives you the "shape" of the type. The stuff to the left of the => arrow tells you what kind of types you can use the rest of the type with.
An example:
(+) :: Num a => a -> a -> a
This means that (+) is exactly the same kind of thing as anything with a simpler type like a -> a -> a, except the Num a => is telling us that we're not free to choose any type a. We can only choose a type for a when we know that it is a member of the Num type class (another slightly more precise way of saying "a is a member of Num is "the constraint Num a holds").
Note that GHC is still assuming that there's an implicit forall a to introduce the type variable a here, so it really looks like:
(+) :: forall a. Num a => a -> a -> a
In which case you can read this off moderately easily as an English sentence once you know what forall a. and Num a => means: "For every type a, provided Num a holds, plus has the type a -> a -> a".
1 If you're familiar with formal logic at all, it's just an ASCII-friendly way of writing ∀a, a "universally quantified variable".
As the forall matter appears to be settled, I'll attempt to explain the => a bit. The things to the left of the => are arguments, much like ones to the left of a ->. But you don't apply these arguments manually, and they can only have specific types.
f :: Num a => a -> a
is a function that takes two arguments:
A Num a dictionary.
An a.
When you apply f, you just provide the a. GHC has to provide the Num a. If it's applied to a specific concrete type like Int, GHC knows Num Int and can supply it at the call site. Otherwise, it checks that Num a is provided by some outer context and uses that one. The great thing about Haskell's typeclass system is that it ensures that any two Num a dictionaries, however they are found, will be identical. So it doesn't matter where the dictionary comes from—it is sure to be the right one.
Further discussion
A lot of these things we're talking about aren't exactly part of Haskell so much as they're part of the way GHC interprets Haskell by translation to GHC core, AKA System FC, an extension of the very well-studied System F, AKA the Girard-Reynolds calculus. System FC is an explicitly typed polymorphic lambda calculus with algebraic datatypes, etc., but no type inference, no instance resolution, etc. After GHC checks the types in your Haskell code, it translates that code to System FC by a thoroughly mechanical process. It can do this confidently because the type checker "decorates" the code with all the information the desugarer needs to plumb all the dictionaries around. If you have a Haskell function that looks like
foo :: forall a . Num a => a -> a -> a
foo x y= x + y
then that will translate to something that looks like
foo :: forall a . Num a -> a -> a -> a
foo = /\ (a :: *) -> \ (d :: Num a) -> \ (x :: a) -> \ (y :: a) -> (+) #a d x y
The /\ is a type lambda—it's just line a normal lambda except it takes a type variable. The # represents application of a type to a function that takes one. The + is really just a record selector. It chooses the right field from the dictionary it's passed.
I suppose it helps if we add the implied parentheses:
(+) :: ∀ a . ( Num a => (a -> (a -> a)) )
id :: ∀ a . ( a -> a )
The ∀ always goes together with a .. It's basically special syntax meaning “anything between ∀ and . are type variables that I want to introduce into the following scope”†
=> denotes what Idris calls an implicit function: Num a is a dictionary for the instance Num a, and such a dictionary is implicitly needed whenever you're adding numbers. But whether a is a type variable here that was previously introduced by some ∀, or a fixed type, doesn't really matter. You could also have
(+) :: Num Int => Int -> Int -> Int
That's just superfluous, because the compiler knows that Int is a Num instance and hence automatically (implicitly!) chooses the right dictionary.
Really, there's no particular relationship between ∀ and =>, they just happen to be used often together.
†Actually this is a type-level lambda. The type expression ∀ a . b behaves analogously to the value level expression \a -> b.

Relationship between TypeRep and "Type" GADT

In Scrap your boilerplate reloaded, the authors describe a new presentation of Scrap Your Boilerplate, which is supposed to be equivalent to the original.
However, one difference is that they assume a finite, closed set of "base" types, encoded with a GADT
data Type :: * -> * where
Int :: Type Int
List :: Type a -> Type [a]
...
In the original SYB, type-safe cast is used, implemented using the Typeable class.
My questions are:
What is the relationship between these two approaches?
Why was the GADT representation chosen for the "SYB Reloaded" presentation?
[I am one of the authors of the "SYB Reloaded" paper.]
TL;DR We really just used it because it seemed more beautiful to us. The class-based Typeable approach is more practical. The Spine view can be combined with the Typeable class and does not depend on the Type GADT.
The paper states this in its conclusions:
Our implementation handles the two central ingredients of generic programming differently from the original SYB paper: we use overloaded functions with
explicit type arguments instead of overloaded functions based on a type-safe
cast 1 or a class-based extensible scheme [20]; and we use the explicit spine
view rather than a combinator-based approach. Both changes are independent
of each other, and have been made with clarity in mind: we think that the structure of the SYB approach is more visible in our setting, and that the relations
to PolyP and Generic Haskell become clearer. We have revealed that while the
spine view is limited in the class of generic functions that can be written, it is
applicable to a very large class of data types, including GADTs.
Our approach cannot be used easily as a library, because the encoding of
overloaded functions using explicit type arguments requires the extensibility of
the Type data type and of functions such as toSpine. One can, however, incorporate Spine into the SYB library while still using the techniques of the SYB
papers to encode overloaded functions.
So, the choice of using a GADT for type representation is one we made mainly for clarity. As Don states in his answer, there are some obvious advantages in this representation, namely that it maintains static information about what type a type representation is for, and that it allows us to implement cast without any further magic, and in particular without the use of unsafeCoerce. Type-indexed functions can also be implemented directly by using pattern matching on the type, and without falling back to various combinators such as mkQ or extQ.
Fact is that I (and I think the co-authors) simply were not very fond of the Typeable class. (In fact, I'm still not, although it is finally becoming a bit more disciplined now in that GHC adds auto-deriving for Typeable, makes it kind-polymorphic, and will ultimately remove the possibility to define your own instances.) In addition, Typeable wasn't quite as established and widely known as it is perhaps now, so it seemed appealing to "explain" it by using the GADT encoding. And furthermore, this was the time when we were also thinking about adding open datatypes to Haskell, thereby alleviating the restriction that the GADT is closed.
So, to summarize: If you actually need dynamic type information only for a closed universe, I'd always go for the GADT, because you can use pattern matching to define type-indexed functions, and you do not have to rely on unsafeCoerce nor advanced compiler magic. If the universe is open, however, which is quite common, certainly for the generic programming setting, then the GADT approach might be instructive, but isn't practical, and using Typeable is the way to go.
However, as we also state in the conclusions of the paper, the choice of Type over Typeable isn't a prerequisite for the other choice we're making, namely to use the Spine view, which I think is more important and really the core of the paper.
The paper itself shows (in Section 8) a variation inspired by the "Scrap your Boilerplate with Class" paper, which uses a Spine view with a class constraint instead. But we can also do a more direct development, which I show in the following. For this, we'll use Typeable from Data.Typeable, but define our own Data class which, for simplicity, just contains the toSpine method:
class Typeable a => Data a where
toSpine :: a -> Spine a
The Spine datatype now uses the Data constraint:
data Spine :: * -> * where
Constr :: a -> Spine a
(:<>:) :: (Data a) => Spine (a -> b) -> a -> Spine b
The function fromSpine is as trivial as with the other representation:
fromSpine :: Spine a -> a
fromSpine (Constr x) = x
fromSpine (c :<>: x) = fromSpine c x
Instances for Data are trivial for flat types such as Int:
instance Data Int where
toSpine = Constr
And they're still entirely straightforward for structured types such as binary trees:
data Tree a = Empty | Node (Tree a) a (Tree a)
instance Data a => Data (Tree a) where
toSpine Empty = Constr Empty
toSpine (Node l x r) = Constr Node :<>: l :<>: x :<>: r
The paper then goes on and defines various generic functions, such as mapQ. These definitions hardly change. We only get class constraints for Data a => where the paper has function arguments of Type a ->:
mapQ :: Query r -> Query [r]
mapQ q = mapQ' q . toSpine
mapQ' :: Query r -> (forall a. Spine a -> [r])
mapQ' q (Constr c) = []
mapQ' q (f :<>: x) = mapQ' q f ++ [q x]
Higher-level functions such as everything also just lose their explicit type arguments (and then actually look exactly the same as in original SYB):
everything :: (r -> r -> r) -> Query r -> Query r
everything op q x = foldl op (q x) (mapQ (everything op q) x)
As I said above, if we now want to define a generic sum function summing up all Int occurrences, we cannot pattern match anymore, but have to fall back to mkQ, but mkQ is defined purely in terms of Typeable and completely independent of Spine:
mkQ :: (Typeable a, Typeable b) => r -> (b -> r) -> a -> r
(r `mkQ` br) a = maybe r br (cast a)
And then (again exactly as in original SYB):
sum :: Query Int
sum = everything (+) sumQ
sumQ :: Query Int
sumQ = mkQ 0 id
For some of the stuff later in the paper (e.g., adding constructor information), a bit more work is needed, but it can all be done. So using Spine really does not depend on using Type at all.
Well, obviously the Typeable use is open -- new variants can be added after the fact, and without modifying the original definitions.
The important change though is that in that TypeRep is untyped. That is, there is no connection between the runtime type , TypeRep, and the static type it encodes. With the GADT approach we can encode the mapping between a type a and its Type, given by the GADT Type a.
We thus bake in evidence for the type rep being statically linked to its origin type, and can write statically typed dynamic application (for example) using Type a as evidence that we have a runtime a.
In the older TypeRep case, we have no such evidence and it comes down to runtime string equality, and a coerce and hope for the best through fromDynamic.
Compare the signatures:
toDyn :: Typeable a => a -> TypeRep -> Dynamic
versus GADT style:
toDyn :: Type a => a -> Type a -> Dynamic
I can't fake my type evidence, and I can use that later when reconstructing things, to e.g. lookup the type class instances for a when all I have is a Type a.

How to Interpret (Eq a)

I need to create a function of two parameters, an Int and a [Int], that returns a new [Int] with all occurrences of the first parameter removed.
I can create the function easily enough, both with list comprehension and list recursion. However, I do it with these parameters:
deleteAll_list_comp :: Integer -> [Integer] -> [Integer]
deleteAll_list_rec :: (Integer -> Bool) -> [Integer] -> [Integer]
For my assignment, however, my required parameters are
deleteAll_list_comp :: (Eq a) => a -> [a] -> [a]
deleteAll_list_rec :: (Eq a) => a -> [a] -> [a]
I don't know how to read this syntax. As Google has told me, (Eq a) merely explains to Haskell that a is a type that is comparable. However, I don't understand the point of this as all Ints are naturally comparable. How do I go about interpreting and implementing the methods using these parameters? What I mean is, what exactly are the parameters to begin with?
#groovy #pelotom
Thanks, this makes it very clear. I understand now that really it is only asking for two parameters as opposed to three. However, I still am running into a problem with this code.
deleteAll_list_rec :: (Eq a) => a -> [a] -> [a]
delete_list_rec toDelete [] = []
delete_list_rec toDelete (a:as) =
if(toDelete == a) then delete_list_rec toDelete as
else a:(delete_list_rec toDelete as)
This gives me a "The type signature for deleteAll_list_rec
lacks an accompanying binding" which makes no sense to me seeing as how I did bind the requirements properly, didn't I? From my small experience, (a:as) counts as a list while extracting the first element from it. Why does this generate an error but
deleteAll_list_comp :: (Eq a) => a -> [a] -> [a]
deleteAll_list_comp toDelete ls = [x | x <- ls, toDelete==x]
does not?
2/7/13 Update: For all those who might stumble upon this post in the future with the same question, I've found some good information about Haskell in general, and my question specifically, at this link : http://learnyouahaskell.com/types-and-typeclasses
"Interesting. We see a new thing here, the => symbol. Everything before the => symbol is >called a class constraint. We can read the previous type declaration like this: the >equality function takes any two values that are of the same type and returns a Bool. The >type of those two values must be a member of the Eq class (this was the class constraint).
The Eq typeclass provides an interface for testing for equality. Any type where it makes >sense to test for equality between two values of that type should be a member of the Eq >class. All standard Haskell types except for IO (the type for dealing with input and >output) and functions are a part of the Eq typeclass."
One way to think of the parameters could be:
(Eq a) => a -> [a] -> [a]
(Eq a) => means any a's in the function parameters should be members of the
class Eq, which can be evaluated as equal or unequal.*
a -> [a] means the function will have two parameters: (1) an element of
type a, and (2) a list of elements of the same type a (we know that
type a in this case should be a member of class Eq, such as Num or
String).
-> [a] means the function will return a list of elements of the same
type a; and the assignment states that this returned list should
exclude any elements that equal the first function parameter,
toDelete.
(* edited based on pelotom's comment)
What you implemented (rather, what you think you implemented) is a function that works only on lists of Integers, what the assignment wants you to do is create one that works on lists of all types provided they are equality-comparable (so that your function will also work on lists of booleans or strings). You probably don't have to change a lot: Try removing the explicit type signatures from your code and ask ghci about the type that it would infer from your code (:l yourfile.hs and then :t deleteAll_list_comp). Unless you use arithmetic operations or similar things, you will most likely find that your functions already work for all Eq a.
As a simpler example that may explain the concept: Let's say we want to write a function isequal that checks for equality (slightly useless, but hey):
isequal :: Integer -> Integer -> Bool
isequal a b = (a == b)
This is a perfectly fine definition of isequal, but the type constraints that I have manually put on it are way stronger than they have to. In fact, in the absence of the manual type signature, ghci infers:
Prelude> :t isequal
isequal :: Eq a => a -> a -> Bool
which tells us that the function will work for all input types, as long as they are deriving Eq, which means nothing more than having a proper == relation defined on them.
There is still a problem with your _rec function though, since it should do the same thing as your _comp function, the type signatures should match.

Programmatic type annotations in Haskell

When metaprogramming, it may be useful (or necessary) to pass along to Haskell's type system information about types that's known to your program but not inferable in Hindley-Milner. Is there a library (or language extension, etc) that provides facilities for doing this—that is, programmatic type annotations—in Haskell?
Consider a situation where you're working with a heterogenous list (implemented using the Data.Dynamic library or existential quantification, say) and you want to filter the list down to a bog-standard, homogeneously typed Haskell list. You can write a function like
import Data.Dynamic
import Data.Typeable
dynListToList :: (Typeable a) => [Dynamic] -> [a]
dynListToList = (map fromJust) . (filter isJust) . (map fromDynamic)
and call it with a manual type annotation. For example,
foo :: [Int]
foo = dynListToList [ toDyn (1 :: Int)
, toDyn (2 :: Int)
, toDyn ("foo" :: String) ]
Here foo is the list [1, 2] :: [Int]; that works fine and you're back on solid ground where Haskell's type system can do its thing.
Now imagine you want to do much the same thing but (a) at the time you write the code you don't know what the type of the list produced by a call to dynListToList needs to be, yet (b) your program does contain the information necessary to figure this out, only (c) it's not in a form accessible to the type system.
For example, say you've randomly selected an item from your heterogenous list and you want to filter the list down by that type. Using the type-checking facilities supplied by Data.Typeable, your program has all the information it needs to do this, but as far as I can tell—this is the essence of the question—there's no way to pass it along to the type system. Here's some pseudo-Haskell that shows what I mean:
import Data.Dynamic
import Data.Typeable
randList :: (Typeable a) => [Dynamic] -> IO [a]
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (dynListToList dl :: [<tr>]) -- This thing should have the type
-- represented by `tr`
(Assume randItem selects a random item from a list.)
Without a type annotation on the argument of return, the compiler will tell you that it has an "ambiguous type" and ask you to provide one. But you can't provide a manual type annotation because the type is not known at write-time (and can vary); the type is known at run-time, however—albeit in a form the type system can't use (here, the type needed is represented by the value tr, a TypeRep—see Data.Typeable for details).
The pseudo-code :: [<tr>] is the magic I want to happen. Is there any way to provide the type system with type information programatically; that is, with type information contained in a value in your program?
Basically I'm looking for a function with (pseudo-) type ??? -> TypeRep -> a that takes a value of a type unknown to Haskell's type system and a TypeRep and says, "Trust me, compiler, I know what I'm doing. This thing has the value represented by this TypeRep." (Note that this is not what unsafeCoerce does.)
Or is there something completely different that gets me the same place? For example, I can imagine a language extension that permits assignment to type variables, like a souped-up version of the extension enabling scoped type variables.
(If this is impossible or highly impractical,—e.g., it requires packing a complete GHCi-like interpreter into the executable—please try to explain why.)
No, you can't do this. The long and short of it is that you're trying to write a dependently-typed function, and Haskell isn't a dependently typed language; you can't lift your TypeRep value to a true type, and so there's no way to write down the type of your desired function. To explain this in a little more detail, I'm first going to show why the way you've phrased the type of randList doesn't really make sense. Then, I'm going to explain why you can't do what you want. Finally, I'll briefly mention a couple thoughts on what to actually do.
Existentials
Your type signature for randList can't mean what you want it to mean. Remembering that all type variables in Haskell are universally quantified, it reads
randList :: forall a. Typeable a => [Dynamic] -> IO [a]
Thus, I'm entitled to call it as, say, randList dyns :: IO [Int] anywhere I want; I must be able to provide a return value for all a, not simply for some a. Thinking of this as a game, it's one where the caller can pick a, not the function itself. What you want to say (this isn't valid Haskell syntax, although you can translate it into valid Haskell by using an existential data type1) is something more like
randList :: [Dynamic] -> (exists a. Typeable a => IO [a])
This promises that the elements of the list are of some type a, which is an instance of Typeable, but not necessarily any such type. But even with this, you'll have two problems. First, even if you could construct such a list, what could you do with it? And second, it turns out that you can't even construct it in the first place.
Since all that you know about the elements of the existential list is that they're instances of Typeable, what can you do with them? Looking at the documentation, we see that there are only two functions2 which take instances of Typeable:
typeOf :: Typeable a => a -> TypeRep, from the type class itself (indeed, the only method therein); and
cast :: (Typeable a, Typeable b) => a -> Maybe b (which is implemented with unsafeCoerce, and couldn't be written otherwise).
Thus, all that you know about the type of the elements in the list is that you can call typeOf and cast on them. Since we'll never be able to usefully do anything else with them, our existential might just as well be (again, not valid Haskell)
randList :: [Dynamic] -> IO [(TypeRep, forall b. Typeable b => Maybe b)]
This is what we get if we apply typeOf and cast to every element of our list, store the results, and throw away the now-useless existentially typed original value. Clearly, the TypeRep part of this list isn't useful. And the second half of the list isn't either. Since we're back to a universally-quantified type, the caller of randList is once again entitled to request that they get a Maybe Int, a Maybe Bool, or a Maybe b for any (typeable) b of their choosing. (In fact, they have slightly more power than before, since they can instantiate different elements of the list to different types.) But they can't figure out what type they're converting from unless they already know it—you've still lost the type information you were trying to keep.
And even setting aside the fact that they're not useful, you simply can't construct the desired existential type here. The error arises when you try to return the existentially-typed list (return $ dynListToList dl). At what specific type are you calling dynListToList? Recall that dynListToList :: forall a. Typeable a => [Dynamic] -> [a]; thus, randList is responsible for picking which a dynListToList is going to use. But it doesn't know which a to pick; again, that's the source of the question! So the type that you're trying to return is underspecified, and thus ambiguous.3
Dependent types
OK, so what would make this existential useful (and possible)? Well, we actually have slightly more information: not only do we know there's some a, we have its TypeRep. So maybe we can package that up:
randList :: [Dynamic] -> (exists a. Typeable a => IO (TypeRep,[a]))
This isn't quite good enough, though; the TypeRep and the [a] aren't linked at all. And that's exactly what you're trying to express: some way to link the TypeRep and the a.
Basically, your goal is to write something like
toType :: TypeRep -> *
Here, * is the kind of all types; if you haven't seen kinds before, they are to types what types are to values. * classifies types, * -> * classifies one-argument type constructors, etc. (For instance, Int :: *, Maybe :: * -> *, Either :: * -> * -> *, and Maybe Int :: *.)
With this, you could write (once again, this code isn't valid Haskell; in fact, it really bears only a passing resemblance to Haskell, as there's no way you could write it or anything like it within Haskell's type system):
randList :: [Dynamic] -> (exists (tr :: TypeRep).
Typeable (toType tr) => IO (tr, [toType tr]))
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (tr, dynListToList dl :: [toType tr])
-- In fact, in an ideal world, the `:: [toType tr]` signature would be
-- inferable.
Now, you're promising the right thing: not that there exists some type which classifies the elements of the list, but that there exists some TypeRep such that its corresponding type classifies the elements of the list. If only you could do this, you would be set. But writing toType :: TypeRep -> * is completely impossible in Haskell: doing this requires a dependently-typed language, since toType tr is a type which depends on a value.
What does this mean? In Haskell, it's perfectly acceptable for values to depend on other values; this is what a function is. The value head "abc", for instance, depends on the value "abc". Similarly, we have type constructors, so it's acceptable for types to depend on other types; consider Maybe Int, and how it depends on Int. We can even have values which depend on types! Consider id :: a -> a. This is really a family of functions: id_Int :: Int -> Int, id_Bool :: Bool -> Bool, etc. Which one we have depends on the type of a. (So really, id = \(a :: *) (x :: a) -> x; although we can't write this in Haskell, there are languages where we can.)
Crucially, however, we can never have a type that depends on a value. We might want such a thing: imagine Vec 7 Int, the type of length-7 lists of integers. Here, Vec :: Nat -> * -> *: a type whose first argument must be a value of type Nat. But we can't write this sort of thing in Haskell.4 Languages which support this are called dependently-typed (and will let us write id as we did above); examples include Coq and Agda. (Such languages often double as proof assistants, and are generally used for research work as opposed to writing actual code. Dependent types are hard, and making them useful for everyday programming is an active area of research.)
Thus, in Haskell, we can check everything about our types first, throw away all that information, and then compile something that refers only to values. In fact, this is exactly what GHC does; since we can never check types at run-time in Haskell, GHC erases all the types at compile-time without changing the program's run-time behavior. This is why unsafeCoerce is easy to implement (operationally) and completely unsafe: at run-time, it's a no-op, but it lies to the type system. Consequently, something like toType is completely impossible to implement in the Haskell type system.
In fact, as you noticed, you can't even write down the desired type and use unsafeCoerce. For some problems, you can get away with this; we can write down the type for the function, but only implement it with by cheating. That's exactly how fromDynamic works. But as we saw above, there's not even a good type to give to this problem from within Haskell. The imaginary toType function allows you to give the program a type, but you can't even write down toType's type!
What now?
So, you can't do this. What should you do? My guess is that your overall architecture isn't ideal for Haskell, although I haven't seen it; Typeable and Dynamic don't actually show up that much in Haskell programs. (Perhaps you're "speaking Haskell with a Python accent", as they say.) If you only have a finite set of data types to deal with, you might be able to bundle things into a plain old algebraic data type instead:
data MyType = MTInt Int | MTBool Bool | MTString String
Then you can write isMTInt, and just use filter isMTInt, or filter (isSameMTAs randomMT).
Although I don't know what it is, there's probably a way you could unsafeCoerce your way through this problem. But frankly, that's not a good idea unless you really, really, really, really, really, really know what you're doing. And even then, it's probably not. If you need unsafeCoerce, you'll know, it won't just be a convenience thing.
I really agree with Daniel Wagner's comment: you're probably going to want to rethink your approach from scratch. Again, though, since I haven't seen your architecture, I can't say what that will mean. Maybe there's another Stack Overflow question in there, if you can distill out a concrete difficulty.
1 That looks like the following:
{-# LANGUAGE ExistentialQuantification #-}
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
However, since none of this code compiles anyway, I think writing it out with exists is clearer.
2 Technically, there are some other functions which look relevant, such as toDyn :: Typeable a => a -> Dynamic and fromDyn :: Typeable a => Dynamic -> a -> a. However, Dynamic is more or less an existential wrapper around Typeables, relying on typeOf and TypeReps to know when to unsafeCoerce (GHC uses some implementation-specific types and unsafeCoerce, but you could do it this way, with the possible exception of dynApply/dynApp), so toDyn doesn't do anything new. And fromDyn doesn't really expect its argument of type a; it's just a wrapper around cast. These functions, and the other similar ones, don't provide any extra power that isn't available with just typeOf and cast. (For instance, going back to a Dynamic isn't very useful for your problem!)
3 To see the error in action, you can try to compile the following complete Haskell program:
{-# LANGUAGE ExistentialQuantification #-}
import Data.Dynamic
import Data.Typeable
import Data.Maybe
randItem :: [a] -> IO a
randItem = return . head -- Good enough for a short and non-compiling example
dynListToList :: Typeable a => [Dynamic] -> [a]
dynListToList = mapMaybe fromDynamic
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
randList dl = do
tr <- randItem $ map dynTypeRep dl
return . TypeableList $ dynListToList dl -- Error! Ambiguous type variable.
Sure enough, if you try to compile this, you get the error:
SO12273982.hs:17:27:
Ambiguous type variable `a0' in the constraint:
(Typeable a0) arising from a use of `dynListToList'
Probable fix: add a type signature that fixes these type variable(s)
In the second argument of `($)', namely `dynListToList dl'
In a stmt of a 'do' block: return . TypeableList $ dynListToList dl
In the expression:
do { tr <- randItem $ map dynTypeRep dl;
return . TypeableList $ dynListToList dl }
But as is the entire point of the question, you can't "add a type signature that fixes these type variable(s)", because you don't know what type you want.
4 Mostly. GHC 7.4 has support for lifting types to kinds and for kind polymorphism; see section 7.8, "Kind polymorphism and promotion", in the GHC 7.4 user manual. This doesn't make Haskell dependently typed—something like TypeRep -> * example is still out5—but you will be able to write Vec by using very expressive types that look like values.
5 Technically, you could now write down something which looks like it has the desired type: type family ToType :: TypeRep -> *. However, this takes a type of the promoted kind TypeRep, and not a value of the type TypeRep; and besides, you still wouldn't be able to implement it. (At least I don't think so, and I can't see how you would—but I am not an expert in this.) But at this point, we're pretty far afield.
What you're observing is that the type TypeRep doesn't actually carry any type-level information along with it; only term-level information. This is a shame, but we can do better when we know all the type constructors we care about. For example, suppose we only care about Ints, lists, and function types.
{-# LANGUAGE GADTs, TypeOperators #-}
import Control.Monad
data a :=: b where Refl :: a :=: a
data Dynamic where Dynamic :: TypeRep a -> a -> Dynamic
data TypeRep a where
Int :: TypeRep Int
List :: TypeRep a -> TypeRep [a]
Arrow :: TypeRep a -> TypeRep b -> TypeRep (a -> b)
class Typeable a where typeOf :: TypeRep a
instance Typeable Int where typeOf = Int
instance Typeable a => Typeable [a] where typeOf = List typeOf
instance (Typeable a, Typeable b) => Typeable (a -> b) where
typeOf = Arrow typeOf typeOf
congArrow :: from :=: from' -> to :=: to' -> (from -> to) :=: (from' -> to')
congArrow Refl Refl = Refl
congList :: a :=: b -> [a] :=: [b]
congList Refl = Refl
eq :: TypeRep a -> TypeRep b -> Maybe (a :=: b)
eq Int Int = Just Refl
eq (Arrow from to) (Arrow from' to') = liftM2 congArrow (eq from from') (eq to to')
eq (List t) (List t') = liftM congList (eq t t')
eq _ _ = Nothing
eqTypeable :: (Typeable a, Typeable b) => Maybe (a :=: b)
eqTypeable = eq typeOf typeOf
toDynamic :: Typeable a => a -> Dynamic
toDynamic a = Dynamic typeOf a
-- look ma, no unsafeCoerce!
fromDynamic_ :: TypeRep a -> Dynamic -> Maybe a
fromDynamic_ rep (Dynamic rep' a) = case eq rep rep' of
Just Refl -> Just a
Nothing -> Nothing
fromDynamic :: Typeable a => Dynamic -> Maybe a
fromDynamic = fromDynamic_ typeOf
All of the above is pretty standard. For more on the design strategy, you'll want to read about GADTs and singleton types. Now, the function you want to write follows; the type is going to look a bit daft, but bear with me.
-- extract only the elements of the list whose type match the head
firstOnly :: [Dynamic] -> Dynamic
firstOnly [] = Dynamic (List Int) []
firstOnly (Dynamic rep v:xs) = Dynamic (List rep) (v:go xs) where
go [] = []
go (Dynamic rep' v:xs) = case eq rep rep' of
Just Refl -> v : go xs
Nothing -> go xs
Here we've picked a random element (I rolled a die, and it came up 1) and extracted only the elements that have a matching type from the list of dynamic values. Now, we could have done the same thing with regular boring old Dynamic from the standard libraries; however, what we couldn't have done is used the TypeRep in a meaningful way. I now demonstrate that we can do so: we'll pattern match on the TypeRep, and then use the enclosed value at the specific type the TypeRep tells us it is.
use :: Dynamic -> [Int]
use (Dynamic (List (Arrow Int Int)) fs) = zipWith ($) fs [1..]
use (Dynamic (List Int) vs) = vs
use (Dynamic Int v) = [v]
use (Dynamic (Arrow (List Int) (List (List Int))) f) = concat (f [0..5])
use _ = []
Note that on the right-hand sides of these equations, we are using the wrapped value at different, concrete types; the pattern match on the TypeRep is actually introducing type-level information.
You want a function that chooses a different type of values to return based on runtime data. Okay, great. But the whole purpose of a type is to tell you what operations can be performed on a value. When you don't know what type will be returned from a function, what do you do with the values it returns? What operations can you perform on them? There are two options:
You want to read the type, and perform some behaviour based on which type it is. In this case you can only cater for a finite list of types known in advance, essentially by testing "is it this type? then we do this operation...". This is easily possible in the current Dynamic framework: just return the Dynamic objects, using dynTypeRep to filter them, and leave the application of fromDynamic to whoever wants to consume your result. Moreover, it could well be possible without Dynamic, if you don't mind setting the finite list of types in your producer code, rather than your consumer code: just use an ADT with a constructor for each type, data Thing = Thing1 Int | Thing2 String | Thing3 (Thing,Thing). This latter option is by far the best if it is possible.
You want to perform some operation that works across a family of types, potentially some of which you don't know about yet, e.g. by using type class operations. This is trickier, and it's tricky conceptually too, because your program is not allowed to change behaviour based on whether or not some type class instance exists – it's an important property of the type class system that the introduction of a new instance can either make a program type check or stop it from type checking, but it can't change the behaviour of a program. Hence you can't throw an error if your input list contains inappropriate types, so I'm really not sure that there's anything you can do that doesn't essentially involve falling back to the first solution at some point.

What does the => symbol mean in Haskell?

I'm new to Haskell and, in general, to functional programming, and I'm a bit uncomfortable with its syntax.
In the following code what does the => denote? And also (Num a, Ord a)?
loop :: (Num a, Ord a) => a -> (t -> t) -> t -> t
This is a typeclass constraint; (Num a, Ord a) => ... means that loop works with any type a that is an instance of the Num and Ord typeclasses, corresponding to numeric types and ordered types respectively. Basically, you can think of loop as having the type on the right hand side of the =>, except that a is required to be an instance of Num and Ord.
You can think of typeclasses as basically similar to OOP interfaces (but they're not the same thing!) — they encapsulate a set of definitions which any instance must support, and generic code can be written using these definitions. For instance, Num includes numeric operations like addition and multiplication, while Ord includes less than, greater than, and so on.
For more information on typeclasses, see this introduction from Learn You a Haskell.
=> separates two parts of a type signature:
On the left, typeclass constraints
On the right, the actual type
So you can think of (Num a, Ord a) => a -> (t -> t) -> t -> t as meaning "the type is a -> (t -> t) -> t -> t and also there must be a Num instance for a and an Ord instance for a".
For more on typeclasses see http://www.learnyouahaskell.com/types-and-typeclasses
One way to think about it is that Ord a and Num a are additional inputs to the function. They are a special kind of input though: dictionaries. When you use this function with a particular type a, there must also be dictionaries available for the Ord and Num operations on the type a as well.
Any function that makes use of a function with dictionary inputs must also have the same dictionary inputs.
foo :: (Num a, Ord a) => a -> t
foo x = loop x someFunc someT
However, you do not have to explicitly pass these dictionaries around. Haskell will take care of that for you, assuming there is a dictionary available. You can create a dictionary with a typeclass instance.
instance Num MyType with
x + y = ...
x - y = ...
...
This creates a dictionary for the Num operations on MyType, therefore MyType can be used anywhere that Num a is a required input (assuming it satisfies the other requirements, of course).
On the left hand side of the => you declare constraints for the types that are used on the right.
In the example you give, it means that a is constrained to being an instance of both the Ord type class and the Num type class.

Resources