GADT for polymorphic list - haskell

I am parsing a few statements of the form
v1 = expression1
v2 = expression2
...
I am using the State Monad and my state should be a pair of (String, Expr a), I really insist on having the expressions typed. I tried to implement the state as [PPair] where I define PPair by the GADT:
data PPair where
PPair :: (String, Expr a) -> PPair
Once this line passed the compiler, I felt that I am doing something really really wrong. I suppressed the thought and went on coding. When I came to writing the code that would extract the value of the variable from the State, I realized the problem:
evalVar k ((PPair (kk, v)):s) = if k == kk then v else evalVar k s
I get:
Inferred type is less polymorphic than expected
which is quite expected. How do I work around this problem? I know I can solve it by breaking up the type over all candidate types a, but is there no neater way?

The problem is that there's no possible type evalVar can have:
evalVar :: String -> [PPair] -> Expr ?
You can't say ? is a, because then you're claiming your return value works for any value of a. What you can do, however, is wrap up "an Expr with an unknown type" into its own data type:
data SomeExpr where
SomeExpr :: Expr a -> SomeExpr
or, equivalently, with RankNTypes rather than GADTs:
data SomeExpr = forall a. SomeExpr (Expr a)
This is called existential quantification. You can then rewrite PPair using SomeExpr:
data PPair = PPair String SomeExpr
and evalVar works out:
evalVar k (PPair kk v : xs)
| k == kk = v
| otherwise = evalVar k xs
(Of course, you could just use a [(String,SomeExpr)] instead, and the standard lookup function.)
In general, though, trying to keep expressions completely typed at the Haskell level like this is probably an exercise in futility; a dependently-typed language like Agda would have no trouble with it, but you'll probably end up running into something Haskell can't do quite quickly, or weakening things to the point where the compile-time safety you wanted out of the effort is lost.
That's not to say it never works, of course; typed languages were one of the motivating examples for GADTs. But it might not work as well as you want, and you'll probably run into trouble if your language has any non-trivial type system features like polymorphism.
If you really want to keep the typing, then I'd use a richer structure than strings to name variables; have a Var a type that explicitly carries the type, like this:
data PPair where
PPair :: Var a -> Expr a -> PPair
evalVar :: Var a -> [PPair] -> Maybe (Expr a)
A good way to achieve something similar to this would be to use the vault package; you can construct Keys from ST and IO, and use Vault as a heterogeneous container. It's basically like a Map where the keys hold the type of the corresponding value. Specifically, I'd suggest defining Var a as Key (Expr a) and using a Vault instead of your [PPair]. (Full disclosure: I've worked on the vault package.)
Of course, you'll still have to map the variable names to the Key values, but you could create all the Keys right after parsing, and carry those around instead of the strings. (It'd be a bit of work to go from a Var to its corresponding variable name with this strategy, though; you can do it with a list of existentials, but the solution is too long to put in this answer.)
(By the way, you can have multiple arguments to a data constructor with GADTs, just like regular types: data PPair where PPair :: String -> Expr a -> PPair.)

Related

Factoring out recursion in a complex AST

For a side project I am working on I currently have to deal with an abstract syntax tree and transform it according to rules (the specifics are unimportant).
The AST itself is nontrivial, meaning it has subexpressions which are restricted to some types only. (e.g. the operator A must take an argument which is of type B only, not any Expr. A drastically simplified reduced version of my datatype looks like this:
data Expr = List [Expr]
| Strange Str
| Literal Lit
data Str = A Expr
| B Expr
| C Lit
| D String
| E [Expr]
data Lit = Int Int
| String String
My goal is to factor out the explicit recursion and rely on recursion schemes instead, as demonstrated in these two excellent blog posts, which provide very powerful general-purpose tools to operate on my AST. Applying the necessary factoring, we end up with:
data ExprF a = List [a]
| Strange (StrF a)
| Literal (LitF a)
data StrF a = A a
| B a
| C (LitF a)
| D String
| E [a]
data LitF a = Int Int
| String String
If I didn't mess up, type Expr = Fix ExprF should now be isomorphic to the previously defined Expr.
However, writing cata for these cases becomes rather tedious, as I have to pattern match B a :: StrF a inside of an Str :: ExprF a for cata to be well-typed. For the entire original AST this is unfeasible.
I stumbled upon fixing GADTs, which seems to me like it is a solution to my problem, however the user-unfriendly interface of the duplicated higher-order type classes etc. is quite the unneccessary boilerplate.
So, to sum up my questions:
Is rewriting the AST as a GADT the correct way to go about this?
If yes, how could I transform the example into a well-working version? On a second note, is there better support for higher kinded Functors in GHC now?
If you've gone through the effort of to separate out the recursion in your data type, then you can just derive Functor and you're done. You don't need any fancy features to get the recursion scheme. (As a side note, there's no reason to parameterize the Lit data type.)
The fold is:
newtype Fix f = In { out :: f (Fix f) }
gfold :: (Functor f) => (f a -> a) -> Fix f -> a
gfold alg = alg . fmap (gfold alg) . out
To specify the algebra (the alg parameter), you need to do a case analysis against ExprF, but the alternative would be to have the fold have a dozen or more parameters: one for each data constructor. That wouldn't really save you much typing and would be much harder to read. If you want (and this may require rank-2 types in general), you can package all those parameters up into a record and then you could use record update to update "pre-made" records that provide "default" behavior in various circumstances. There's an old paper Dealing with Large Bananas that takes an approach like this. What I'm suggesting, to be clear, is just wrapping the gfold function above with a function that takes a record, and passes in an algebra that will do the case analysis and call the appropriate field of the record for each case.
Of course, you could use GHC Generics or the various "generic/polytypic" programming libraries like Scrap Your Boilerplate instead of this. You are basically recreating what they do.

Haskell type keyword used with signature

In the code from Scrap Your Zippers, what does the following line mean:
type Move a = Zipper a -> Maybe (Zipper a)
Type is a synonym for a type and uses the same data constructors, so this make no sense. How is it used here?
type allows us to make synonyms, as you say. This means we can make shortened versions of long and complicated types. Here is the definition of the String base type. Yes, this is how it's defined:
type String = [Char]
This allows us to make types more readable when we write them; everyone prefers seeing String to [Char].
You can also have type arguments like in the data keyword. Here are some Examples:
type Predicate t = t -> Bool
type Transform t = t -> t
type RightFoldSignature a b = (a -> b -> b) -> b -> [a] -> b
type TwoTuple a b = (a,b)
type ThreeTuple a b c = (a,b,c)
... And so on. So, there's nothing particularly strange going on with the declaration you have there - the author is making a type synonym to make things easier to write and clearer to read, presumably to be used in the types of the functions the author wants to create.
Learn you a Haskell has it's own little section on this, a list of the different declarations can be found here, and an article here.

Parsec parsing in Haskell

I have 2 parsers:
nexpr::Parser (Expr Double)
sexpr::Parser (Expr String)
How do I build a parser that tries one and then the other if it doesn't work? I can't figure out what to return. There must be a clever way to do this.
Thanks.
EDIT:
Adding a bit more info...
I'm learning Haskel, so I started with :
data Expr a where
N::Double -> Expr Double
S::String -> Expr String
Add::Expr Double -> Expr Double -> Expr Double
Cat::Expr String -> Expr String -> Expr String
then I read about F-algebra (here) and so I changed it to:
data ExprF :: (* -> *) -> * -> * where
N::Double -> ExprF r Double
S::String -> ExprF r String
Add::r Double -> r Double -> ExprF r Double
Cat::r String -> r String -> ExprF r String
with
type Expr = HFix ExprF
so my parse to:
Parser (Expr Double)
is actually:
Parser (ExprF HFix Double)
Maybe I'm biting off more than I can chew...
As noted in the comments, you can have a parser like this
nOrSexpr :: Parser (Either (Expr Double) (Expr String))
nOrSexpr = (Left <$> nexpr) <|> (Right <$> sexpr)
However, I think the reason that you are having this difficulty is because you are not representing your parse tree as a single type, which is the more usual thing to do. Something like this:
data Expr =
ExprDouble Double
| ExprInt Int
| ExprString String
That way you can have parsers for each kind of expression that are all of type Parser Expr. This is the same as using Either but more flexible and maintainable. So you might have
doubleParser :: Parser Expr
doubleParser = ...
intParser :: Parser Expr
intParser = ...
stringParser :: Parser Expr
stringParser = ...
exprParser :: Parser Expr
exprParser = intParser <|> doubleParser <|> stringParser
Note that the order of the parsers does matter and use can use Parsec's try function if backtracking is needed.
So, for example, if you want to have a sum expression now, you can add to the data type
data Expr =
ExprDouble Double
| ExprInt Int
| ExprString String
| ExprSum Expr Expr
and make the parser
sumParser :: Parser Expr
sumParser = do
a <- exprParser
string " + "
b <- exprParser
return $ ExprSum a b
UPDATE
Well, I take my hat off to you diving straight into GADTs if you are just starting with Haskell. I have been reading through the paper you linked and noticed this immediately in the first paragraph:
The jury is still out on whether the additional type-safety provided by GADTs is worth the added inconvenience of working with them.
There are three points worth taking away here I think. The first is simply that I would have a go with the simpler way of doing things first, to get an idea of how it works and why you might want to add more type safety, before trying to more complicated type theoretical stuff. That comment may not help so feel free to ignore it!
Secondly, and more importantly, your representation...
data ExprF :: (* -> *) -> * -> * where
N :: Double -> ExprF r Double
S :: String -> ExprF r String
Add :: r Double -> r Double -> ExprF r Double
Cat :: r String -> r String -> ExprF r String
...is specifically designed to not allow ill formed type expressions. Contrasted with mine which can, eg ExprSum (ExprDouble 5.0) (ExprString "test"). So the question you really want to ask is what should actually happen when the parser attempts to parse something like "5.0 + \"test\""? Do you want it to just not parse, or do you want it to return a nice message saying that this expression is the wrong type? Compilers are usually designed in multiple stages for this reason. The first pass turns the input into an abstract syntax tree (AST), and further passes annotate this tree with type judgements. This annotated AST can then be transformed into the semantic representation that you really want it in.
So in your case I would recommend two stages. first, parse into a dumb representation like mine, that will give you the correct tree shape but allow ill-typed expressions. Like
data ExprAST =
ExprASTDouble Double
| ExprASTInt Int
| ExprASTString String
| ExprASTAdd Expr Expr
Then have another function that will typecheck the ExprAST. Something like
typecheck :: ExprAST -> Maybe (ExprF HFix a)
(You could also use Either and return either the typechecked GADT or an error string saying what the problem is.) The further problem here is that you don't know what a is statically. The other answer solves this by using type tags and an existential wrapper, which you might find to be the best way to go. I feel like it might be simpler to have a top level expression in your GADT that all expressions must live in, so an entire parse will always have the same type. In the end there is usually only one program type.
My third, and last, point is related to this
The jury is still out on whether the additional type-safety provided by GADTs is worth the added inconvenience of working with them.
The more type safety, generally the more work you have to do to get it. You mention you are new to Haskell, yet this adventure has taken us right to the edge of what it is capable of doing. The type of the parsed expression cannot depend only on the input string in a Haskell function, because it does not allow for dependant types. If you want to go down this path, I might suggest you have a look at a language called Idris. A great introduction to what it is capable of can be found in this video, in which he constructs a typesafe printf.
The problem described looks to be using Parsec to parse into a GADT representation, for which probably the easiest solution would be parse into a monotype representation and then have a (likely partial) type checking phase to produce the well-typed GADT, if it can. The monotype representation could be an existential wrapper over a GADT term, with a type-tag to reify the GADT index.
EDIT: a quick example
Let's define a type for type-tags and an existential wrapper:
data Type :: * -> * where
TDouble :: Type Double
TString :: Type String
data Judgement f = forall ix. Judgement (f ix) (Type ix)
With the example GADT given in the original post, we only have a problem with the outer-most production, which we need to parse to a monotype as we don't know statically which expression type we will get at runtime:
pExpr :: Parser (Judgement Expr)
pExpr = Judgement <$> pDblExpr <*> pure TDouble
<|> Judgement <$> pStrExpr <*> pure TString
We can write a type check phase to produce a GADT or fail, depending on whether the type assertion succeeds or not:
typecheck :: Judgement Expr -> Type ix -> Maybe (Expr ix)
typecheck (Judgement e TDouble) TDouble = Just e
typecheck (Judgement e TString) TString = Just e
typecheck _ _ = Nothing

Could not deduce (b ~ a)

So I have the following code, I'm trying to write an abstract syntax tree for an interpreter, & I prefer not to jam everything in the same data type, so I was going to write a typeclass that had the basic behaviour (in this case AST).
{-# LANGUAGE ExistentialQuantification #-}
import qualified Data.Map as M
-- ...
data Expr = forall a. AST a => Expr a
type Env = [M.Map String Expr]
class AST a where
reduce :: AST b => a -> Env -> b
-- when i remove the line below, it compiles fine
reduce ast _ = ast
-- ...
When I remove the default implementation of reduce in the typeclass AST it compiles fine, but when ever I provide an implementation that returns it's self it complains. I get the following compiler error
src/Data/AbstractSyntaxTree.hs:13:18:
Could not deduce (b ~ a)
from the context (AST a)
bound by the class declaration for `AST'
at src/Data/AbstractSyntaxTree.hs:(11,1)-(13,20)
or from (AST b)
bound by the type signature for reduce :: AST b => a -> Env -> b
at src/Data/AbstractSyntaxTree.hs:12:13-36
`b' is a rigid type variable bound by
the type signature for reduce :: AST b => a -> Env -> b
at src/Data/AbstractSyntaxTree.hs:12:13
`a' is a rigid type variable bound by
the class declaration for `AST'
at src/Data/AbstractSyntaxTree.hs:11:11
In the expression: ast
In an equation for `reduce': reduce ast _ = ast
The behaviour of AST's reduce will evaluate an AST, and occasionally return a different type of AST, and sometimes the same type of AST.
Edit: Regarding data Expr = forall a. AST a => Expr a & GADTs
I originally went with data Expr = forall a. AST a => Expr a because I wanted to represent types like this.
(+ 2 true) -> Compound [Expr (Ref "+"), Expr 2, Expr true]
(+ a 2) -> Compound [Expr (Ref "+"), Expr (Ref "a"), Expr 2]
(eval (+ 2 2)) -> Compound [Expr (Ref "eval"),
Compound [
Expr (Ref "+"),
Expr 2,
Expr 2]]
((lambda (a b) (+ a b)) 2 2) -> Compound [Expr SomeLambdaAST, Expr 2, Expr 2]
Since I'm generating ASTs from text I feel it would be a burden to represent a strictly typed ASTs in a GADT, although I do see where they could be useful in case like DSLs in Haskell.
But since I'm generating the AST from text (which could contain some of the examples above), it might be a bit hard to predict what AST I'll end up with. I don't want to start juggling between Eithers & Maybes. That is what I ended up doing last time & it was a mess, & I gave up trying to attempt this in Haskell.
But again I'm not the most experienced Haskell programmer so maybe I'm looking at this the wrong way, maybe I can implement an AST with so more rigours typing, so I'll have a look and see if I can come up with GADTs, but I have my doubts & I have a feeling that it might end the way it did last time.
Ultimately I'm just trying to learn Haskell at the moment with a fun finish able project, so I don't mind if my first Haskell project isn't really idiomatic Haskell. Getting something working is a higher priority just so I can make my way around the language and have something to show for it.
Update:
I've taken #cdk's & #bheklilr advice and ditched the existential type, although I've gone with a much simpler type, as opposed to utilising GADTs (also suggested by #cdk's & #bheklilr). It could possibly be a stronger type but again I'm just trying to get familiar with Haskell, so I gave up up after a few hours & went with a simple data type like so :P
import qualified Data.Map as M
type Environment = [M.Map String AST]
data AST
= Compound [AST]
| FNum Double
| Func Function
| Err String
| Ref String
data Function
= NativeFn ([AST] -> AST)
| LangFn [String] AST
-- previously called reduce
eval :: Environment -> AST -> AST
eval env ast = case ast of
Ref ref -> case (lookup ref env ) of
Just ast -> ast
Nothing -> Err ("Not in scope `" ++ ref ++ "'")
Compound elements -> case elements of
[] -> Err "You tried to invoke `()'"
function : args -> case (eval env function) of
Func fn -> invoke env fn args
other -> Err $ "Cannot invoke " ++ (show other)
_ -> ast
-- invoke & lookup are defined else where
Although I will still probably look at GADTs as they seem to be pretty interesting & have lead me to some interesting reading material regarding implementing abstract syntax trees in haskell.
What part of the error message are you having difficulties understanding? I think it's quite clear.
The type of reduce is
reduce :: AST b => a -> Env -> b
The first argument has type a and GHC expects reduce to return something of type b, which may be entirely different from a. GHC is correct to complain that you've tried to return a value of a when it expects b.
The "existential quantification with type class" is (as noted by bheklilr) an anti-pattern. A better approach would be to create an Algebraic Data Type for AST:
data AST a
now reduce becomes a simple function:
reduce :: Env -> AST a -> AST b
if you want reduce to be able to return a different type of AST, you could use Either
reduce :: Env -> AST a -> Either (AST a) (AST b)
but I don't think this is what you really want. My advice is to take a look at the GADT style of creating ASTs and re-evaluate your approach.
You are interpreting this type signature incorrectly (in a way that is common to OO programmers):
reduce :: AST b => a -> Env -> b
This does not mean that reduce can choose any type it likes (that is a member of AST) and return a value of that type. If it did, your implementation would be valid. Rather, it means that for any type b the caller likes (that is a member of AST), reduce must be able to return a value in that type. b could well be the same as a sometimes, but it's the caller's choice, not the choice of reduce.
If your implementation returns a value of type a, then this can only be true if b is always equal to a, which is what the compiler is on about when it reports failing to prove that b ~ a.
Haskell does not have subtypes. Type variables are not supertypes of all the concrete types that could instantiate them, as you might be used to using Object or abstract interface types in OO languages. Rather type variables are parameters; any implementation which claims to have a parametric type must work regardless of what types are chosen for the parameters.
If you want to use a design where reduce can return a value in whatever type of AST it feels like (rather than whatever type of AST is asked of it), then you need to use your Expr box again, since Expr is not parameterized by the type of AST it contains, but can still contain any AST:
reduce :: a -> Env -> Expr
reduce ast _ = Expr ast
Now reduce can work regardless of the types chosen for its type parameters, since there's only a. Consumers of the returned Expr will have no way of constraining the type inside the Expr, so they'll have to be written to work regardless of what that type is.
Your default implementation doesn't compile, because it has the wrong definition.
reduce :: AST b => a -> b -> Env -> b
reduce ast _ = ast
Now ast has the type a and reduce function returns type b but according to your implementation you return ast which is of type a but the compiler expects b.
Even something like this will work:
reduce :: AST b => a -> b -> Env -> b
reduce _ ast _ = ast

Programmatic type annotations in Haskell

When metaprogramming, it may be useful (or necessary) to pass along to Haskell's type system information about types that's known to your program but not inferable in Hindley-Milner. Is there a library (or language extension, etc) that provides facilities for doing this—that is, programmatic type annotations—in Haskell?
Consider a situation where you're working with a heterogenous list (implemented using the Data.Dynamic library or existential quantification, say) and you want to filter the list down to a bog-standard, homogeneously typed Haskell list. You can write a function like
import Data.Dynamic
import Data.Typeable
dynListToList :: (Typeable a) => [Dynamic] -> [a]
dynListToList = (map fromJust) . (filter isJust) . (map fromDynamic)
and call it with a manual type annotation. For example,
foo :: [Int]
foo = dynListToList [ toDyn (1 :: Int)
, toDyn (2 :: Int)
, toDyn ("foo" :: String) ]
Here foo is the list [1, 2] :: [Int]; that works fine and you're back on solid ground where Haskell's type system can do its thing.
Now imagine you want to do much the same thing but (a) at the time you write the code you don't know what the type of the list produced by a call to dynListToList needs to be, yet (b) your program does contain the information necessary to figure this out, only (c) it's not in a form accessible to the type system.
For example, say you've randomly selected an item from your heterogenous list and you want to filter the list down by that type. Using the type-checking facilities supplied by Data.Typeable, your program has all the information it needs to do this, but as far as I can tell—this is the essence of the question—there's no way to pass it along to the type system. Here's some pseudo-Haskell that shows what I mean:
import Data.Dynamic
import Data.Typeable
randList :: (Typeable a) => [Dynamic] -> IO [a]
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (dynListToList dl :: [<tr>]) -- This thing should have the type
-- represented by `tr`
(Assume randItem selects a random item from a list.)
Without a type annotation on the argument of return, the compiler will tell you that it has an "ambiguous type" and ask you to provide one. But you can't provide a manual type annotation because the type is not known at write-time (and can vary); the type is known at run-time, however—albeit in a form the type system can't use (here, the type needed is represented by the value tr, a TypeRep—see Data.Typeable for details).
The pseudo-code :: [<tr>] is the magic I want to happen. Is there any way to provide the type system with type information programatically; that is, with type information contained in a value in your program?
Basically I'm looking for a function with (pseudo-) type ??? -> TypeRep -> a that takes a value of a type unknown to Haskell's type system and a TypeRep and says, "Trust me, compiler, I know what I'm doing. This thing has the value represented by this TypeRep." (Note that this is not what unsafeCoerce does.)
Or is there something completely different that gets me the same place? For example, I can imagine a language extension that permits assignment to type variables, like a souped-up version of the extension enabling scoped type variables.
(If this is impossible or highly impractical,—e.g., it requires packing a complete GHCi-like interpreter into the executable—please try to explain why.)
No, you can't do this. The long and short of it is that you're trying to write a dependently-typed function, and Haskell isn't a dependently typed language; you can't lift your TypeRep value to a true type, and so there's no way to write down the type of your desired function. To explain this in a little more detail, I'm first going to show why the way you've phrased the type of randList doesn't really make sense. Then, I'm going to explain why you can't do what you want. Finally, I'll briefly mention a couple thoughts on what to actually do.
Existentials
Your type signature for randList can't mean what you want it to mean. Remembering that all type variables in Haskell are universally quantified, it reads
randList :: forall a. Typeable a => [Dynamic] -> IO [a]
Thus, I'm entitled to call it as, say, randList dyns :: IO [Int] anywhere I want; I must be able to provide a return value for all a, not simply for some a. Thinking of this as a game, it's one where the caller can pick a, not the function itself. What you want to say (this isn't valid Haskell syntax, although you can translate it into valid Haskell by using an existential data type1) is something more like
randList :: [Dynamic] -> (exists a. Typeable a => IO [a])
This promises that the elements of the list are of some type a, which is an instance of Typeable, but not necessarily any such type. But even with this, you'll have two problems. First, even if you could construct such a list, what could you do with it? And second, it turns out that you can't even construct it in the first place.
Since all that you know about the elements of the existential list is that they're instances of Typeable, what can you do with them? Looking at the documentation, we see that there are only two functions2 which take instances of Typeable:
typeOf :: Typeable a => a -> TypeRep, from the type class itself (indeed, the only method therein); and
cast :: (Typeable a, Typeable b) => a -> Maybe b (which is implemented with unsafeCoerce, and couldn't be written otherwise).
Thus, all that you know about the type of the elements in the list is that you can call typeOf and cast on them. Since we'll never be able to usefully do anything else with them, our existential might just as well be (again, not valid Haskell)
randList :: [Dynamic] -> IO [(TypeRep, forall b. Typeable b => Maybe b)]
This is what we get if we apply typeOf and cast to every element of our list, store the results, and throw away the now-useless existentially typed original value. Clearly, the TypeRep part of this list isn't useful. And the second half of the list isn't either. Since we're back to a universally-quantified type, the caller of randList is once again entitled to request that they get a Maybe Int, a Maybe Bool, or a Maybe b for any (typeable) b of their choosing. (In fact, they have slightly more power than before, since they can instantiate different elements of the list to different types.) But they can't figure out what type they're converting from unless they already know it—you've still lost the type information you were trying to keep.
And even setting aside the fact that they're not useful, you simply can't construct the desired existential type here. The error arises when you try to return the existentially-typed list (return $ dynListToList dl). At what specific type are you calling dynListToList? Recall that dynListToList :: forall a. Typeable a => [Dynamic] -> [a]; thus, randList is responsible for picking which a dynListToList is going to use. But it doesn't know which a to pick; again, that's the source of the question! So the type that you're trying to return is underspecified, and thus ambiguous.3
Dependent types
OK, so what would make this existential useful (and possible)? Well, we actually have slightly more information: not only do we know there's some a, we have its TypeRep. So maybe we can package that up:
randList :: [Dynamic] -> (exists a. Typeable a => IO (TypeRep,[a]))
This isn't quite good enough, though; the TypeRep and the [a] aren't linked at all. And that's exactly what you're trying to express: some way to link the TypeRep and the a.
Basically, your goal is to write something like
toType :: TypeRep -> *
Here, * is the kind of all types; if you haven't seen kinds before, they are to types what types are to values. * classifies types, * -> * classifies one-argument type constructors, etc. (For instance, Int :: *, Maybe :: * -> *, Either :: * -> * -> *, and Maybe Int :: *.)
With this, you could write (once again, this code isn't valid Haskell; in fact, it really bears only a passing resemblance to Haskell, as there's no way you could write it or anything like it within Haskell's type system):
randList :: [Dynamic] -> (exists (tr :: TypeRep).
Typeable (toType tr) => IO (tr, [toType tr]))
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (tr, dynListToList dl :: [toType tr])
-- In fact, in an ideal world, the `:: [toType tr]` signature would be
-- inferable.
Now, you're promising the right thing: not that there exists some type which classifies the elements of the list, but that there exists some TypeRep such that its corresponding type classifies the elements of the list. If only you could do this, you would be set. But writing toType :: TypeRep -> * is completely impossible in Haskell: doing this requires a dependently-typed language, since toType tr is a type which depends on a value.
What does this mean? In Haskell, it's perfectly acceptable for values to depend on other values; this is what a function is. The value head "abc", for instance, depends on the value "abc". Similarly, we have type constructors, so it's acceptable for types to depend on other types; consider Maybe Int, and how it depends on Int. We can even have values which depend on types! Consider id :: a -> a. This is really a family of functions: id_Int :: Int -> Int, id_Bool :: Bool -> Bool, etc. Which one we have depends on the type of a. (So really, id = \(a :: *) (x :: a) -> x; although we can't write this in Haskell, there are languages where we can.)
Crucially, however, we can never have a type that depends on a value. We might want such a thing: imagine Vec 7 Int, the type of length-7 lists of integers. Here, Vec :: Nat -> * -> *: a type whose first argument must be a value of type Nat. But we can't write this sort of thing in Haskell.4 Languages which support this are called dependently-typed (and will let us write id as we did above); examples include Coq and Agda. (Such languages often double as proof assistants, and are generally used for research work as opposed to writing actual code. Dependent types are hard, and making them useful for everyday programming is an active area of research.)
Thus, in Haskell, we can check everything about our types first, throw away all that information, and then compile something that refers only to values. In fact, this is exactly what GHC does; since we can never check types at run-time in Haskell, GHC erases all the types at compile-time without changing the program's run-time behavior. This is why unsafeCoerce is easy to implement (operationally) and completely unsafe: at run-time, it's a no-op, but it lies to the type system. Consequently, something like toType is completely impossible to implement in the Haskell type system.
In fact, as you noticed, you can't even write down the desired type and use unsafeCoerce. For some problems, you can get away with this; we can write down the type for the function, but only implement it with by cheating. That's exactly how fromDynamic works. But as we saw above, there's not even a good type to give to this problem from within Haskell. The imaginary toType function allows you to give the program a type, but you can't even write down toType's type!
What now?
So, you can't do this. What should you do? My guess is that your overall architecture isn't ideal for Haskell, although I haven't seen it; Typeable and Dynamic don't actually show up that much in Haskell programs. (Perhaps you're "speaking Haskell with a Python accent", as they say.) If you only have a finite set of data types to deal with, you might be able to bundle things into a plain old algebraic data type instead:
data MyType = MTInt Int | MTBool Bool | MTString String
Then you can write isMTInt, and just use filter isMTInt, or filter (isSameMTAs randomMT).
Although I don't know what it is, there's probably a way you could unsafeCoerce your way through this problem. But frankly, that's not a good idea unless you really, really, really, really, really, really know what you're doing. And even then, it's probably not. If you need unsafeCoerce, you'll know, it won't just be a convenience thing.
I really agree with Daniel Wagner's comment: you're probably going to want to rethink your approach from scratch. Again, though, since I haven't seen your architecture, I can't say what that will mean. Maybe there's another Stack Overflow question in there, if you can distill out a concrete difficulty.
1 That looks like the following:
{-# LANGUAGE ExistentialQuantification #-}
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
However, since none of this code compiles anyway, I think writing it out with exists is clearer.
2 Technically, there are some other functions which look relevant, such as toDyn :: Typeable a => a -> Dynamic and fromDyn :: Typeable a => Dynamic -> a -> a. However, Dynamic is more or less an existential wrapper around Typeables, relying on typeOf and TypeReps to know when to unsafeCoerce (GHC uses some implementation-specific types and unsafeCoerce, but you could do it this way, with the possible exception of dynApply/dynApp), so toDyn doesn't do anything new. And fromDyn doesn't really expect its argument of type a; it's just a wrapper around cast. These functions, and the other similar ones, don't provide any extra power that isn't available with just typeOf and cast. (For instance, going back to a Dynamic isn't very useful for your problem!)
3 To see the error in action, you can try to compile the following complete Haskell program:
{-# LANGUAGE ExistentialQuantification #-}
import Data.Dynamic
import Data.Typeable
import Data.Maybe
randItem :: [a] -> IO a
randItem = return . head -- Good enough for a short and non-compiling example
dynListToList :: Typeable a => [Dynamic] -> [a]
dynListToList = mapMaybe fromDynamic
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
randList dl = do
tr <- randItem $ map dynTypeRep dl
return . TypeableList $ dynListToList dl -- Error! Ambiguous type variable.
Sure enough, if you try to compile this, you get the error:
SO12273982.hs:17:27:
Ambiguous type variable `a0' in the constraint:
(Typeable a0) arising from a use of `dynListToList'
Probable fix: add a type signature that fixes these type variable(s)
In the second argument of `($)', namely `dynListToList dl'
In a stmt of a 'do' block: return . TypeableList $ dynListToList dl
In the expression:
do { tr <- randItem $ map dynTypeRep dl;
return . TypeableList $ dynListToList dl }
But as is the entire point of the question, you can't "add a type signature that fixes these type variable(s)", because you don't know what type you want.
4 Mostly. GHC 7.4 has support for lifting types to kinds and for kind polymorphism; see section 7.8, "Kind polymorphism and promotion", in the GHC 7.4 user manual. This doesn't make Haskell dependently typed—something like TypeRep -> * example is still out5—but you will be able to write Vec by using very expressive types that look like values.
5 Technically, you could now write down something which looks like it has the desired type: type family ToType :: TypeRep -> *. However, this takes a type of the promoted kind TypeRep, and not a value of the type TypeRep; and besides, you still wouldn't be able to implement it. (At least I don't think so, and I can't see how you would—but I am not an expert in this.) But at this point, we're pretty far afield.
What you're observing is that the type TypeRep doesn't actually carry any type-level information along with it; only term-level information. This is a shame, but we can do better when we know all the type constructors we care about. For example, suppose we only care about Ints, lists, and function types.
{-# LANGUAGE GADTs, TypeOperators #-}
import Control.Monad
data a :=: b where Refl :: a :=: a
data Dynamic where Dynamic :: TypeRep a -> a -> Dynamic
data TypeRep a where
Int :: TypeRep Int
List :: TypeRep a -> TypeRep [a]
Arrow :: TypeRep a -> TypeRep b -> TypeRep (a -> b)
class Typeable a where typeOf :: TypeRep a
instance Typeable Int where typeOf = Int
instance Typeable a => Typeable [a] where typeOf = List typeOf
instance (Typeable a, Typeable b) => Typeable (a -> b) where
typeOf = Arrow typeOf typeOf
congArrow :: from :=: from' -> to :=: to' -> (from -> to) :=: (from' -> to')
congArrow Refl Refl = Refl
congList :: a :=: b -> [a] :=: [b]
congList Refl = Refl
eq :: TypeRep a -> TypeRep b -> Maybe (a :=: b)
eq Int Int = Just Refl
eq (Arrow from to) (Arrow from' to') = liftM2 congArrow (eq from from') (eq to to')
eq (List t) (List t') = liftM congList (eq t t')
eq _ _ = Nothing
eqTypeable :: (Typeable a, Typeable b) => Maybe (a :=: b)
eqTypeable = eq typeOf typeOf
toDynamic :: Typeable a => a -> Dynamic
toDynamic a = Dynamic typeOf a
-- look ma, no unsafeCoerce!
fromDynamic_ :: TypeRep a -> Dynamic -> Maybe a
fromDynamic_ rep (Dynamic rep' a) = case eq rep rep' of
Just Refl -> Just a
Nothing -> Nothing
fromDynamic :: Typeable a => Dynamic -> Maybe a
fromDynamic = fromDynamic_ typeOf
All of the above is pretty standard. For more on the design strategy, you'll want to read about GADTs and singleton types. Now, the function you want to write follows; the type is going to look a bit daft, but bear with me.
-- extract only the elements of the list whose type match the head
firstOnly :: [Dynamic] -> Dynamic
firstOnly [] = Dynamic (List Int) []
firstOnly (Dynamic rep v:xs) = Dynamic (List rep) (v:go xs) where
go [] = []
go (Dynamic rep' v:xs) = case eq rep rep' of
Just Refl -> v : go xs
Nothing -> go xs
Here we've picked a random element (I rolled a die, and it came up 1) and extracted only the elements that have a matching type from the list of dynamic values. Now, we could have done the same thing with regular boring old Dynamic from the standard libraries; however, what we couldn't have done is used the TypeRep in a meaningful way. I now demonstrate that we can do so: we'll pattern match on the TypeRep, and then use the enclosed value at the specific type the TypeRep tells us it is.
use :: Dynamic -> [Int]
use (Dynamic (List (Arrow Int Int)) fs) = zipWith ($) fs [1..]
use (Dynamic (List Int) vs) = vs
use (Dynamic Int v) = [v]
use (Dynamic (Arrow (List Int) (List (List Int))) f) = concat (f [0..5])
use _ = []
Note that on the right-hand sides of these equations, we are using the wrapped value at different, concrete types; the pattern match on the TypeRep is actually introducing type-level information.
You want a function that chooses a different type of values to return based on runtime data. Okay, great. But the whole purpose of a type is to tell you what operations can be performed on a value. When you don't know what type will be returned from a function, what do you do with the values it returns? What operations can you perform on them? There are two options:
You want to read the type, and perform some behaviour based on which type it is. In this case you can only cater for a finite list of types known in advance, essentially by testing "is it this type? then we do this operation...". This is easily possible in the current Dynamic framework: just return the Dynamic objects, using dynTypeRep to filter them, and leave the application of fromDynamic to whoever wants to consume your result. Moreover, it could well be possible without Dynamic, if you don't mind setting the finite list of types in your producer code, rather than your consumer code: just use an ADT with a constructor for each type, data Thing = Thing1 Int | Thing2 String | Thing3 (Thing,Thing). This latter option is by far the best if it is possible.
You want to perform some operation that works across a family of types, potentially some of which you don't know about yet, e.g. by using type class operations. This is trickier, and it's tricky conceptually too, because your program is not allowed to change behaviour based on whether or not some type class instance exists – it's an important property of the type class system that the introduction of a new instance can either make a program type check or stop it from type checking, but it can't change the behaviour of a program. Hence you can't throw an error if your input list contains inappropriate types, so I'm really not sure that there's anything you can do that doesn't essentially involve falling back to the first solution at some point.

Resources