Right way to set constraint on function arguments - haskell

I have some function. And I want to restrict types that can be passed to this function. Let's say only types that are cacheable. I can enumerate such types with a type family like:
type family Cacheable a::Bool where
Cacheable X = 'True
Cacheable _ = 'False
And add such a constraint to my function:
myFunc :: forall a. (Cacheable a ~ 'True) => ....
but this constraint is little bit redundant: I can remove it from the function's signature, and nothing changes. Nothing forces it to be in the signature.
Another approach is to create some typeclass whose usage in the body of myFunc will force me to add this constraint to myFunc's signature:
class Cacheable a
ensureCacheable :: ()
ensureCacheable = ()
instance Cacheable X
myFunc :: forall a. (Cacheable a) => ...
myFunc =
let _ = ensureCacheable #a
in ...
but it looks little bit funny.
What is the right/canonical way to do it in Haskell?
Let's suppose that Cacheable cannot have any reasoning methods. Think about it like about some classifier. Another name is IsQuery (vs IsCommand) and queries are cacheable, commands - no.

For the sake of discussion, let's pick a single terminology:
type family CacheableTF a :: Bool where
CacheableTF X = 'True
CacheableTF _ = 'False
class CacheableC a
instance CacheableC X
What's the difference between these? Two aspects:
CacheableTF a expresses in classical logic the fact that a is a cacheable type. It's just a boolean after all. Thus you can arbitrarily stack negations on top of this. (Cacheable a ~ 'False) => .... is a constraint just as valid as the 'True version.
By contrast, CacheableC a is constructive, it's a proposition, a promise that anything which has this in context will be able to access the methods of the type class. (Of course, in your example the method was pretty useless, but even then you could still build other functions on top of it.) Unlike with CacheableTF, you can't really use the negation of CacheableC at all.This aspect is directly reflected in the kinds:
CacheableTF :: Type -> Bool
CacheableC :: Type -> Constraint
CacheableTF is closed-world, CacheableC is open-world. If you want this to be an interface where people can later on make their own types cacheable as well, you need CacheableC. But this power is one of the reasons why class constraints can't be negated: just because the compiler can't find any instance while compiling one module, doesn't mean the type won't have an instance by the time the complete program is linked together.
If you want to make a decision based on whether or not the type is cacheable, you need CacheableTF. However in practice, you'd typically also need some methods detailing how to cache it, in the True case, not just trivial methods like in your CacheableC. This can be accomplished by a more general class that covers both the cacheable- and non-cacheable cases:
class DecideCache a where
type IsCacheable a :: Bool
howToActuallyGoAboutCachingIt
:: (IsCacheable a ~ 'True) => SomeCompl -> Icated -> Stora -> GeMethod

Related

Can we tweak "a -> a" function in Haskell?

In Haskell id function is defined on type level as id :: a -> a and implemented as just returning its argument without any modification, but if we have some type introspection with TypeApplications we can try to modify values without breaking type signature:
{-# LANGUAGE AllowAmbiguousTypes, ScopedTypeVariables, TypeApplications #-}
module Main where
class TypeOf a where
typeOf :: String
instance TypeOf Bool where
typeOf = "Bool"
instance TypeOf Char where
typeOf = "Char"
instance TypeOf Int where
typeOf = "Int"
tweakId :: forall a. TypeOf a => a -> a
tweakId x
| typeOf #a == "Bool" = not x
| typeOf #a == "Int" = x+1
| typeOf #a == "Char" = x
| otherwise = x
This fail with error:
"Couldn't match expected type ‘a’ with actual type ‘Bool’"
But I don't see any problems here (type signature satisfied):
My question is:
How can we do such a thing in a Haskell?
If we can't, that is theoretical\philosophical etc reasons for this?
If this implementation of tweak_id is not "original id", what are theoretical roots that id function must not to do any modifications on term level. Or can we have many implementations of id :: a -> a function (I see that in practice we can, I can implement such a function in Python for example, but what the theory behind Haskell says to this?)
You need GADTs for that.
{-# LANGUAGE ScopedTypeVariables, TypeApplications, GADTs #-}
import Data.Typeable
import Data.Type.Equality
tweakId :: forall a. Typeable a => a -> a
tweakId x
| Just Refl <- eqT #a #Int = x + 1
-- etc. etc.
| otherwise = x
Here we use eqT #type1 #type2 to check whether the two types are equal. If they are, the result is Just Refl and pattern matching on that Refl is enough to convince the type checker that the two types are indeed equal, so we can use x + 1 since x is now no longer only of type a but also of type Int.
This check requires runtime type information, which we usually do not have due to Haskell's type erasure property. The information is provided by the Typeable type class.
This can also be achieved using a user-defined class like your TypeOf if we make it provide a custom GADT value. This can work well if we want to encode some constraint like "type a is either an Int, a Bool, or a String" where we statically know what types to allow (we can even recursively define a set of allowed types in this way). However, to allow any type, including ones that have not yet been defined, we need something like Typeable. That is also very convenient since any user-defined type is automatically made an instance of Typeable.
This fail with error: "Couldn't match expected type ‘a’ with actual type ‘Bool’"But I don't see any problems here
Well, what if I add this instance:
instance TypeOf Float where
typeOf = "Bool"
Do you see the problem now? Nothing prevents somebody from adding such an instance, no matter how silly it is. And so the compiler can't possibly make the assumption that having checked typeOf #a == "Bool" is sufficient to actually use x as being of type Bool.
You can squelch the error if you are confident that nobody will add malicious instances, by using unsafe coercions.
import Unsafe.Coerce
tweakId :: forall a. TypeOf a => a -> a
tweakId x
| typeOf #a == "Bool" = unsafeCoerce (not $ unsafeCoerce x)
| typeOf #a == "Int" = unsafeCoerce (unsafeCoerce x+1 :: Int)
| typeOf #a == "Char" = unsafeCoerce (unsafeCoerce x :: Char)
| otherwise = x
but I would not recommend this. The correct way is to not use strings as a poor man's type representation, but instead the standard Typeable class which is actually tamper-proof and comes with suitable GADTs so you don't need manual unsafe coercions. See chi's answer.
As an alternative, you could also use type-level strings and a functional dependency to make the unsafe coercions safe:
{-# LANGUAGE DataKinds, FunctionalDependencies
, ScopedTypeVariables, UnicodeSyntax, TypeApplications #-}
import GHC.TypeLits (KnownSymbol, symbolVal)
import Data.Proxy (Proxy(..))
import Unsafe.Coerce
class KnownSymbol t => TypeOf a t | a->t, t->a
instance TypeOf Bool "Bool"
instance TypeOf Int "Int"
tweakId :: ∀ a t . TypeOf a t => a -> a
tweakId x = case symbolVal #t Proxy of
"Bool" -> unsafeCoerce (not $ unsafeCoerce x)
"Int" -> unsafeCoerce (unsafeCoerce x + 1 :: Int)
_ -> x
The trick is that the fundep t->a makes writing another instance like
instance TypeOf Float "Bool"
a compile error right there.
Of course, really the most sensible approach is probably to not bother with any kind of manual type equality at all, but simply use the class right away for the behaviour changes you need:
class Tweakable a where
tweak :: a -> a
instance Tweakable Bool where
tweak = not
instance Tweakable Int where
tweak = (+1)
instance Tweakable Char where
tweak = id
The other answers are both very good for covering the ways you can do something like this in Haskell. But I thought it was worth adding something speaking more to this part of the question:
If we can't, that is theoretical\philosophical etc reasons for this?
Actually Haskellers do generally rely quite strongly on the theory that forbids something like your tweakId from existing with type forall a. a -> a. (Even though there are ways to cheat, using things like unsafeCoerce; this is usually considered bad style if you haven't done something like in leftaroundabout's answer, where a class with functional dependencies ensures the unsafe coerce is always valid)
Haskell uses parametric polymorphism1. That means we can write code that works on multiple types because it will treat them all the same; the code only uses operations that will work regardless of the specific type it is invoked on. This is expressed in Haskell types by using type variables; a function with a variable in its type can be used with any type at all substituted for the variable, because every single operation in the function definition will work regardless of what type is chosen.
About the simplest example is indeed the function id, which might be defined like this:
id :: forall a. a -> a
id x = x
Because it's parametrically polymorphic, we can simply choose any type at all we like and use id as if it was defined on that type. For example as if it were any of the following:
id :: Bool -> Bool
id x = x
id :: Int -> Int
id x = x
id :: Maybe (Int -> [IO Bool]) -> Maybe (Int -> [IO Bool])
id x = x
But to ensure that the definition does work for any type, the compiler has to check a very strong restriction. Our id function can only use operations that don't depend on any property of any specific type at all. We can't call not x because the x might not be a Bool, we can't call x + 1 because the x might not be a number, can't check whether x is equal to anything because it might not be a type that supports equality, etc, etc. In fact there is almost nothing you can do with x in the body of id. We can't even ignore x and return some other value of type a; this would require us to write an expression for a value that can be of any type at all and the only things that can do that are things like undefined that don't evaluate to a value at all (because they throw exceptions). It's often said that in fact there is only one valid function with type forall a. a -> a (and that is id)2.
This restriction on what you can do with values whose type contains variables isn't just a restriction for the sake of being picky, it's actually a huge part of what makes Haskell types useful. It means that just looking at the type of a function can often tell you quite a bit about what it can possibly do, and once you get used to it Haskellers rely on this kind of thinking all the time. For example, consider this function signature:
map :: forall a b. (a -> b) -> [a] -> [b]
Just from this type (and the assumption that the code doesn't do anything dumb like add in extra undefined elements of the list) I can tell:
All of the items in the resulting list come are results of the function input; map will have no other way of producing values of type b to put in the list (except undefined, etc).
All of the items in the resulting list correspond to something in the input list mapped through the function; map will have no way of getting any a values to feed to the function (except undefined, etc)
If any items of the input list are dropped or re-ordered, it will be done in a "blind" way that isn't considering the elements at all, only their position in the list; map ultimately has no way of testing any property of the a and b values to decide which order they should go in. For example it might leave out the third element, or swap the 2nd and 76th elements if there are at least 100 elements, etc. But if it applies rules like that it will have to always apply them, regardless of the actual items in the list. It cannot e.g. drop the 4th element if it is less than the 5th element, or only keep outputs from the function that are "truthy", etc.
None of this would be true if Haskell allowed parametrically polymorphic types to have Python-like definitions that check the type of their arguments and then run different code. Such a definition for map could check if the function is supposed to return integers and if so return [1, 2, 3, 4] regardless of the input list, etc. So the type checker would be enforcing a lot less (and thus catching fewer mistakes) if it worked this way.
This kind of reasoning is formalised in the concept of free theorems; it's literally possible to derive formal proofs about a piece of code from its type (and thus get theorems for free). You can google this if you're interested in further reading, but Haskellers generally use this concept informally rather doing real proofs.
Sometimes we do need non-parametric polymorphism. The main tool Haskell provides for that is type classes. If a type variable has a class constraint, then there will be an interface of class methods provided by that constraint. For example the Eq a constraint allows (==) :: a -> a -> Bool to be used, and your own TypeOf a constraint allows typeOf #a to be used. Type class methods do allow you to run different code for different types, so this breaks parametricity. Even just adding Eq a to the type of map means I can no longer assume property 3 from above.
map :: forall a b. Eq a => (a -> b) -> [a] -> [b]
Now map can tell whether some of the items in the original list are equal to each other, so it can use that to decide whether to include them in the result, and in what order. Likewise Monoid a or Monoid b would allow map to break the first two properties by using mempty :: a to produce new values that weren't in the list originally or didn't come from the function. If I add Typeable constraints I can't assume anything, because the function could do all of the Python-style checking of types to apply special-case logic, make use of existing values it knows about if a or b happen to be those types, etc.
So something like your tweakId cannot be given the type forall a. a -> a, for theoretical reasons that are also extremely practically important. But if you need a function that behaves like your tweakId adding a class constraint was the right thing to do to break out of the constraints of parametricity. However simply being able to get a String for each type isn't enough; typeOf #a == "Int" doesn't tell the type checker that a can be used in operations requiring an Int. It knows that in that branch the equality check returned True, but that's just a Bool; the type checker isn't able to reason backwards to why this particular Bool is True and deduce that it could only have happened if a were the type Int. But there are alternative constructs using GADTs that do give the type checker additional knowledge within certain code branches, allowing you to check types at runtime and use different code for each type. The class Typeable is specifically designed for this, but it's a hammer that completely bypasses parametricity; I think most Haskellers would prefer to keep more type-based reasoning intact where possible.
1 Parametric polymorphism is in contrast to class-based polymorphism you may have seen in OO languages (where each class says how a method is implemented for objects of that specific class), or ad-hoc polymophism (as seen in C++) where you simply define multiple definitions with the same name but different types and the types at each application determine which definition is used. I'm not covering those in detail, but the key distinction is both of them allow the definition to have different code for each supported type, rather than guaranteeing the same code will process all supported types.
2 It's not 100% true that there's only one valid function with type forall a. a -> a unless you hide some caveats in "valid". But if you don't use any unsafe features (like unsafeCoerce or the foreign language interface), then a function with type forall a. a -> a will either always throw an exception or it will return its argument unchanged.
The "always throws an exception" isn't terribly useful so we usually assume an unknown function with that type isn't going to do that, and thus ignore this possibility.
There are multiple ways to implement "returns its argument unchanged", like id x = head . head . head $ [[[x]]], but they can only differ from the normal id in being slower by building up some structure around x and then immediately tearing it down again. A caller that's only worrying about correctness (rather than performance) can treat them all the same.
Thus, ignoring the "always undefined" possibility and treating all of the dumb elaborations of id x = x the same, we come to the perspective where we can say "there's only one function with forall a. a -> a".

Why is it said that typeclasses are existential?

According to this link describing existential types:
A value of an existential type like ∃x. F(x) is a pair containing some type x and a value of the type F(x). Whereas a value of a polymorphic type like ∀x. F(x) is a function that takes some type x and produces a value of type F(x). In both cases, the type closes over some type constructor F.
But a function definition with type class constraints doesn't pair with the type class instance.
It's not forall f, exists Functor f, ... (because it's obvious not every type f has instance of Functor f, hence exists Functor f ... not true).
It's not exists f and Functor f, ... (because it's applicable to all instances of satisfied f, not only the chosen one).
To me, it's forall f and instances of Functor f, ..., more like to scala's implicit arguments rather than existential types.
And according to this link describing type classes:
[The class declaration for Eq] means, logically, there is a type a for which the type a -> a -> Bool is inhabited, or, from a it can be proved that a -> a -> Bool (the class promises two different proofs for this, having names == and /=). This proposition is of existential nature (not to be confused with existential type)
What's the difference between type classes and existential types, and why are they both considered "existential"?
The wiki you quote is wrong, or at least being imprecise. A class declaration is not an existential proposition; it is not a proposition of any kind, it is merely a definition of a shorthand. One could then move on to making a proposition using that definition if you wanted, but on its own it's nothing like that. For example,
class Eq a where (==) :: a -> a -> Bool
makes a new definition. One could then write a non-existential, non-universal proposition using it, say,
Eq ()
which we could "prove" by writing:
instance Eq () where () == () = True
Or one could write
prop_ExistsFoo :: exists a. Eq a *> a
as an existential proposition. (Haskell doesn't actually have the exists proposition former, nor (*>). Think of (*>) as dual to (=>) -- just like exists is dual to forall. So where (=>) is a function which takes evidence of a constraint, (*>) is a tuple that contains evidence of a constraint, just like forall is for a function that takes a type while exists is for a tuple that contains a type.) We could "prove" this proposition by, e.g.
prop_ExistsFoo = ()
Note here that the type contained in the exists tuple is (); the evidence contained in the (*>) tuple is the Eq () instance we wrote above. I have honored Haskell's tendency to make types and instances silent and implicit here, so they don't appear in the visible proof text.
Similarly, we could make a different, universal proposition out of Eq by writing something like
prop_ForallEq :: forall a. Eq a => a
which is not nontrivially provable, or
prop_ForallEq2 :: forall a. Eq a => a -> a -> Bool
which we could "prove", for example, by writing
prop_ForallEq2 x y = not (x == y)
or in many other ways.
But the class declaration in itself is definitely not an existential proposition, and it doesn't have "existential nature", whatever that is supposed to mean. Instead of getting hung up and confused on that, please congratulate yourself for correctly labeling this incorrect claim as confusing!
The second quote is imprecise. The existential claim comes with the instances, not with the class itself. Consider the following class:
class Chaos a where
to :: a -> y
from :: x -> a
While this is a perfectly valid declaration, there can't possibly be any instances of Chaos (it there were, to . from would exist, which would be quite amusing). The type of, say, to...
GHCi> :t to
to :: Chaos a => a -> y
... tells us that, given any type a, if a is an instance of Chaos, there is a function which can turn an a into a value of any type whatsoever. If Chaos has no instances, that statement is vacuously true, so we can't infer the existence of any such function from it.
Putting classes aside for a moment, this situation is rather similar to what we have with the absurd function:
absurd :: Void -> a
This type says that, given a Void value, we can produce a value of any type whatsoever. That sounds, well, absurd -- but then we remember that Void is the empty type, which means there are no Void values, and it's all good.
For the sake of contrast, we might note that instances become possible once we break Chaos apart in two classes:
class Primordial a where
conjure :: a -> y
class Doom a where
destroy :: x -> a
instance Primordial Void where
conjure = absurd
instance Doom () where
destroy = const ()
When we, for example, write instance Primordial Void, we are claiming that Void is an instance of Primordial. That implies there must exist a function conjure :: Void -> y, at which point we must back up the claim by supplying an implementation.

Are type synonyms with typeclass constraints possible?

Feel free to change the title, I'm just not experienced enough to know what's really going on.
So, I was writing a program loosely based on this, and wrote this (as it is in the original)
type Row a = [a]
type Matrix a = [Row a]
Nothing special there.
However, I found myself writing a couple of functions with a type like this:
Eq a => Row a -> ...
So I thought that perhaps I could write this constraint into the type synonym definition, because to my mind it shouldn't be that much more complicated, right? If the compiler can work with this in functions, it should work as a type synonym. There are no partial applications here or anything or some kind of trickery (to my eyes).
So I tried this:
type Row a = Eq a => [a]
This doesn't work, and the compiler suggested switching on RankNTypes. The option made it compile, but the functions still required that I leave the Eq a => in their type declarations. As an aside, if I tried also having a type synonym like type Matrix a = [Row a] like before, it results in an error.
So my question(s) are thus:
Is it possible to have a type synonym with a typeclass constraint in its definition?
If not, why?
Is the goal behind this question achievable in some other way?
Constraints on a type variable can not be part of any Haskell type signature.
This may seem a bit of a ridiculous statement: "what's (==) :: Eq a => a -> a -> a then?"
The answer is that a doesn't really exist, in much the same way there is not really an x in the definition f x = x * log x. You've sure enough used the x symbol in defining that function, but really it was just a local tool used in the lambda-abstraction. There is absolutely no way to access this symbol from the outside, indeed it's not required that the compiler even generates anything corresponding to x in the machine code – it might just get optimised away.
Indeed, any polymorphic signature can basically be read as a lambda expression accepting a type variable; various writing styles:
(==) :: forall a . Eq a => a -> a -> a
(==) :: ∀ a . Eq a => a -> a -> a
(==) :: Λa. {Eq a} -> a -> a -> a
This is called System F.
Note that there is not really a "constraint" in this signature, but an extra argument: the Eq-class dictionary.
Usually you want to avoid having constraints in your type synonyms unless it's really necessary. Take the Data.Set API from containers for example.
Many operations in Data.Set require the elements of the set to be instances of Ord because Set is implemented internally as a binary tree. member or insert both require Ord
member :: Ord a => a -> Set a -> Bool
insert :: Ord a => a -> Set a -> Set a
However the definition of Set doesn't mention Ord at all.
This is because some operations on Set dont require an Ord instance, like size or null.
size :: Set a -> Int
null :: Set a -> Bool
If the type class constraint was part of the definition of Set, these functions would have to include the constraint, even though it's not necessary.
So yes, it is possible to have constraints in type synonyms using RankNTypes, but it's generally ill-advised. It's better to write the constraint for the functions that need them instead.

Relationship between TypeRep and "Type" GADT

In Scrap your boilerplate reloaded, the authors describe a new presentation of Scrap Your Boilerplate, which is supposed to be equivalent to the original.
However, one difference is that they assume a finite, closed set of "base" types, encoded with a GADT
data Type :: * -> * where
Int :: Type Int
List :: Type a -> Type [a]
...
In the original SYB, type-safe cast is used, implemented using the Typeable class.
My questions are:
What is the relationship between these two approaches?
Why was the GADT representation chosen for the "SYB Reloaded" presentation?
[I am one of the authors of the "SYB Reloaded" paper.]
TL;DR We really just used it because it seemed more beautiful to us. The class-based Typeable approach is more practical. The Spine view can be combined with the Typeable class and does not depend on the Type GADT.
The paper states this in its conclusions:
Our implementation handles the two central ingredients of generic programming differently from the original SYB paper: we use overloaded functions with
explicit type arguments instead of overloaded functions based on a type-safe
cast 1 or a class-based extensible scheme [20]; and we use the explicit spine
view rather than a combinator-based approach. Both changes are independent
of each other, and have been made with clarity in mind: we think that the structure of the SYB approach is more visible in our setting, and that the relations
to PolyP and Generic Haskell become clearer. We have revealed that while the
spine view is limited in the class of generic functions that can be written, it is
applicable to a very large class of data types, including GADTs.
Our approach cannot be used easily as a library, because the encoding of
overloaded functions using explicit type arguments requires the extensibility of
the Type data type and of functions such as toSpine. One can, however, incorporate Spine into the SYB library while still using the techniques of the SYB
papers to encode overloaded functions.
So, the choice of using a GADT for type representation is one we made mainly for clarity. As Don states in his answer, there are some obvious advantages in this representation, namely that it maintains static information about what type a type representation is for, and that it allows us to implement cast without any further magic, and in particular without the use of unsafeCoerce. Type-indexed functions can also be implemented directly by using pattern matching on the type, and without falling back to various combinators such as mkQ or extQ.
Fact is that I (and I think the co-authors) simply were not very fond of the Typeable class. (In fact, I'm still not, although it is finally becoming a bit more disciplined now in that GHC adds auto-deriving for Typeable, makes it kind-polymorphic, and will ultimately remove the possibility to define your own instances.) In addition, Typeable wasn't quite as established and widely known as it is perhaps now, so it seemed appealing to "explain" it by using the GADT encoding. And furthermore, this was the time when we were also thinking about adding open datatypes to Haskell, thereby alleviating the restriction that the GADT is closed.
So, to summarize: If you actually need dynamic type information only for a closed universe, I'd always go for the GADT, because you can use pattern matching to define type-indexed functions, and you do not have to rely on unsafeCoerce nor advanced compiler magic. If the universe is open, however, which is quite common, certainly for the generic programming setting, then the GADT approach might be instructive, but isn't practical, and using Typeable is the way to go.
However, as we also state in the conclusions of the paper, the choice of Type over Typeable isn't a prerequisite for the other choice we're making, namely to use the Spine view, which I think is more important and really the core of the paper.
The paper itself shows (in Section 8) a variation inspired by the "Scrap your Boilerplate with Class" paper, which uses a Spine view with a class constraint instead. But we can also do a more direct development, which I show in the following. For this, we'll use Typeable from Data.Typeable, but define our own Data class which, for simplicity, just contains the toSpine method:
class Typeable a => Data a where
toSpine :: a -> Spine a
The Spine datatype now uses the Data constraint:
data Spine :: * -> * where
Constr :: a -> Spine a
(:<>:) :: (Data a) => Spine (a -> b) -> a -> Spine b
The function fromSpine is as trivial as with the other representation:
fromSpine :: Spine a -> a
fromSpine (Constr x) = x
fromSpine (c :<>: x) = fromSpine c x
Instances for Data are trivial for flat types such as Int:
instance Data Int where
toSpine = Constr
And they're still entirely straightforward for structured types such as binary trees:
data Tree a = Empty | Node (Tree a) a (Tree a)
instance Data a => Data (Tree a) where
toSpine Empty = Constr Empty
toSpine (Node l x r) = Constr Node :<>: l :<>: x :<>: r
The paper then goes on and defines various generic functions, such as mapQ. These definitions hardly change. We only get class constraints for Data a => where the paper has function arguments of Type a ->:
mapQ :: Query r -> Query [r]
mapQ q = mapQ' q . toSpine
mapQ' :: Query r -> (forall a. Spine a -> [r])
mapQ' q (Constr c) = []
mapQ' q (f :<>: x) = mapQ' q f ++ [q x]
Higher-level functions such as everything also just lose their explicit type arguments (and then actually look exactly the same as in original SYB):
everything :: (r -> r -> r) -> Query r -> Query r
everything op q x = foldl op (q x) (mapQ (everything op q) x)
As I said above, if we now want to define a generic sum function summing up all Int occurrences, we cannot pattern match anymore, but have to fall back to mkQ, but mkQ is defined purely in terms of Typeable and completely independent of Spine:
mkQ :: (Typeable a, Typeable b) => r -> (b -> r) -> a -> r
(r `mkQ` br) a = maybe r br (cast a)
And then (again exactly as in original SYB):
sum :: Query Int
sum = everything (+) sumQ
sumQ :: Query Int
sumQ = mkQ 0 id
For some of the stuff later in the paper (e.g., adding constructor information), a bit more work is needed, but it can all be done. So using Spine really does not depend on using Type at all.
Well, obviously the Typeable use is open -- new variants can be added after the fact, and without modifying the original definitions.
The important change though is that in that TypeRep is untyped. That is, there is no connection between the runtime type , TypeRep, and the static type it encodes. With the GADT approach we can encode the mapping between a type a and its Type, given by the GADT Type a.
We thus bake in evidence for the type rep being statically linked to its origin type, and can write statically typed dynamic application (for example) using Type a as evidence that we have a runtime a.
In the older TypeRep case, we have no such evidence and it comes down to runtime string equality, and a coerce and hope for the best through fromDynamic.
Compare the signatures:
toDyn :: Typeable a => a -> TypeRep -> Dynamic
versus GADT style:
toDyn :: Type a => a -> Type a -> Dynamic
I can't fake my type evidence, and I can use that later when reconstructing things, to e.g. lookup the type class instances for a when all I have is a Type a.

Programmatic type annotations in Haskell

When metaprogramming, it may be useful (or necessary) to pass along to Haskell's type system information about types that's known to your program but not inferable in Hindley-Milner. Is there a library (or language extension, etc) that provides facilities for doing this—that is, programmatic type annotations—in Haskell?
Consider a situation where you're working with a heterogenous list (implemented using the Data.Dynamic library or existential quantification, say) and you want to filter the list down to a bog-standard, homogeneously typed Haskell list. You can write a function like
import Data.Dynamic
import Data.Typeable
dynListToList :: (Typeable a) => [Dynamic] -> [a]
dynListToList = (map fromJust) . (filter isJust) . (map fromDynamic)
and call it with a manual type annotation. For example,
foo :: [Int]
foo = dynListToList [ toDyn (1 :: Int)
, toDyn (2 :: Int)
, toDyn ("foo" :: String) ]
Here foo is the list [1, 2] :: [Int]; that works fine and you're back on solid ground where Haskell's type system can do its thing.
Now imagine you want to do much the same thing but (a) at the time you write the code you don't know what the type of the list produced by a call to dynListToList needs to be, yet (b) your program does contain the information necessary to figure this out, only (c) it's not in a form accessible to the type system.
For example, say you've randomly selected an item from your heterogenous list and you want to filter the list down by that type. Using the type-checking facilities supplied by Data.Typeable, your program has all the information it needs to do this, but as far as I can tell—this is the essence of the question—there's no way to pass it along to the type system. Here's some pseudo-Haskell that shows what I mean:
import Data.Dynamic
import Data.Typeable
randList :: (Typeable a) => [Dynamic] -> IO [a]
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (dynListToList dl :: [<tr>]) -- This thing should have the type
-- represented by `tr`
(Assume randItem selects a random item from a list.)
Without a type annotation on the argument of return, the compiler will tell you that it has an "ambiguous type" and ask you to provide one. But you can't provide a manual type annotation because the type is not known at write-time (and can vary); the type is known at run-time, however—albeit in a form the type system can't use (here, the type needed is represented by the value tr, a TypeRep—see Data.Typeable for details).
The pseudo-code :: [<tr>] is the magic I want to happen. Is there any way to provide the type system with type information programatically; that is, with type information contained in a value in your program?
Basically I'm looking for a function with (pseudo-) type ??? -> TypeRep -> a that takes a value of a type unknown to Haskell's type system and a TypeRep and says, "Trust me, compiler, I know what I'm doing. This thing has the value represented by this TypeRep." (Note that this is not what unsafeCoerce does.)
Or is there something completely different that gets me the same place? For example, I can imagine a language extension that permits assignment to type variables, like a souped-up version of the extension enabling scoped type variables.
(If this is impossible or highly impractical,—e.g., it requires packing a complete GHCi-like interpreter into the executable—please try to explain why.)
No, you can't do this. The long and short of it is that you're trying to write a dependently-typed function, and Haskell isn't a dependently typed language; you can't lift your TypeRep value to a true type, and so there's no way to write down the type of your desired function. To explain this in a little more detail, I'm first going to show why the way you've phrased the type of randList doesn't really make sense. Then, I'm going to explain why you can't do what you want. Finally, I'll briefly mention a couple thoughts on what to actually do.
Existentials
Your type signature for randList can't mean what you want it to mean. Remembering that all type variables in Haskell are universally quantified, it reads
randList :: forall a. Typeable a => [Dynamic] -> IO [a]
Thus, I'm entitled to call it as, say, randList dyns :: IO [Int] anywhere I want; I must be able to provide a return value for all a, not simply for some a. Thinking of this as a game, it's one where the caller can pick a, not the function itself. What you want to say (this isn't valid Haskell syntax, although you can translate it into valid Haskell by using an existential data type1) is something more like
randList :: [Dynamic] -> (exists a. Typeable a => IO [a])
This promises that the elements of the list are of some type a, which is an instance of Typeable, but not necessarily any such type. But even with this, you'll have two problems. First, even if you could construct such a list, what could you do with it? And second, it turns out that you can't even construct it in the first place.
Since all that you know about the elements of the existential list is that they're instances of Typeable, what can you do with them? Looking at the documentation, we see that there are only two functions2 which take instances of Typeable:
typeOf :: Typeable a => a -> TypeRep, from the type class itself (indeed, the only method therein); and
cast :: (Typeable a, Typeable b) => a -> Maybe b (which is implemented with unsafeCoerce, and couldn't be written otherwise).
Thus, all that you know about the type of the elements in the list is that you can call typeOf and cast on them. Since we'll never be able to usefully do anything else with them, our existential might just as well be (again, not valid Haskell)
randList :: [Dynamic] -> IO [(TypeRep, forall b. Typeable b => Maybe b)]
This is what we get if we apply typeOf and cast to every element of our list, store the results, and throw away the now-useless existentially typed original value. Clearly, the TypeRep part of this list isn't useful. And the second half of the list isn't either. Since we're back to a universally-quantified type, the caller of randList is once again entitled to request that they get a Maybe Int, a Maybe Bool, or a Maybe b for any (typeable) b of their choosing. (In fact, they have slightly more power than before, since they can instantiate different elements of the list to different types.) But they can't figure out what type they're converting from unless they already know it—you've still lost the type information you were trying to keep.
And even setting aside the fact that they're not useful, you simply can't construct the desired existential type here. The error arises when you try to return the existentially-typed list (return $ dynListToList dl). At what specific type are you calling dynListToList? Recall that dynListToList :: forall a. Typeable a => [Dynamic] -> [a]; thus, randList is responsible for picking which a dynListToList is going to use. But it doesn't know which a to pick; again, that's the source of the question! So the type that you're trying to return is underspecified, and thus ambiguous.3
Dependent types
OK, so what would make this existential useful (and possible)? Well, we actually have slightly more information: not only do we know there's some a, we have its TypeRep. So maybe we can package that up:
randList :: [Dynamic] -> (exists a. Typeable a => IO (TypeRep,[a]))
This isn't quite good enough, though; the TypeRep and the [a] aren't linked at all. And that's exactly what you're trying to express: some way to link the TypeRep and the a.
Basically, your goal is to write something like
toType :: TypeRep -> *
Here, * is the kind of all types; if you haven't seen kinds before, they are to types what types are to values. * classifies types, * -> * classifies one-argument type constructors, etc. (For instance, Int :: *, Maybe :: * -> *, Either :: * -> * -> *, and Maybe Int :: *.)
With this, you could write (once again, this code isn't valid Haskell; in fact, it really bears only a passing resemblance to Haskell, as there's no way you could write it or anything like it within Haskell's type system):
randList :: [Dynamic] -> (exists (tr :: TypeRep).
Typeable (toType tr) => IO (tr, [toType tr]))
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (tr, dynListToList dl :: [toType tr])
-- In fact, in an ideal world, the `:: [toType tr]` signature would be
-- inferable.
Now, you're promising the right thing: not that there exists some type which classifies the elements of the list, but that there exists some TypeRep such that its corresponding type classifies the elements of the list. If only you could do this, you would be set. But writing toType :: TypeRep -> * is completely impossible in Haskell: doing this requires a dependently-typed language, since toType tr is a type which depends on a value.
What does this mean? In Haskell, it's perfectly acceptable for values to depend on other values; this is what a function is. The value head "abc", for instance, depends on the value "abc". Similarly, we have type constructors, so it's acceptable for types to depend on other types; consider Maybe Int, and how it depends on Int. We can even have values which depend on types! Consider id :: a -> a. This is really a family of functions: id_Int :: Int -> Int, id_Bool :: Bool -> Bool, etc. Which one we have depends on the type of a. (So really, id = \(a :: *) (x :: a) -> x; although we can't write this in Haskell, there are languages where we can.)
Crucially, however, we can never have a type that depends on a value. We might want such a thing: imagine Vec 7 Int, the type of length-7 lists of integers. Here, Vec :: Nat -> * -> *: a type whose first argument must be a value of type Nat. But we can't write this sort of thing in Haskell.4 Languages which support this are called dependently-typed (and will let us write id as we did above); examples include Coq and Agda. (Such languages often double as proof assistants, and are generally used for research work as opposed to writing actual code. Dependent types are hard, and making them useful for everyday programming is an active area of research.)
Thus, in Haskell, we can check everything about our types first, throw away all that information, and then compile something that refers only to values. In fact, this is exactly what GHC does; since we can never check types at run-time in Haskell, GHC erases all the types at compile-time without changing the program's run-time behavior. This is why unsafeCoerce is easy to implement (operationally) and completely unsafe: at run-time, it's a no-op, but it lies to the type system. Consequently, something like toType is completely impossible to implement in the Haskell type system.
In fact, as you noticed, you can't even write down the desired type and use unsafeCoerce. For some problems, you can get away with this; we can write down the type for the function, but only implement it with by cheating. That's exactly how fromDynamic works. But as we saw above, there's not even a good type to give to this problem from within Haskell. The imaginary toType function allows you to give the program a type, but you can't even write down toType's type!
What now?
So, you can't do this. What should you do? My guess is that your overall architecture isn't ideal for Haskell, although I haven't seen it; Typeable and Dynamic don't actually show up that much in Haskell programs. (Perhaps you're "speaking Haskell with a Python accent", as they say.) If you only have a finite set of data types to deal with, you might be able to bundle things into a plain old algebraic data type instead:
data MyType = MTInt Int | MTBool Bool | MTString String
Then you can write isMTInt, and just use filter isMTInt, or filter (isSameMTAs randomMT).
Although I don't know what it is, there's probably a way you could unsafeCoerce your way through this problem. But frankly, that's not a good idea unless you really, really, really, really, really, really know what you're doing. And even then, it's probably not. If you need unsafeCoerce, you'll know, it won't just be a convenience thing.
I really agree with Daniel Wagner's comment: you're probably going to want to rethink your approach from scratch. Again, though, since I haven't seen your architecture, I can't say what that will mean. Maybe there's another Stack Overflow question in there, if you can distill out a concrete difficulty.
1 That looks like the following:
{-# LANGUAGE ExistentialQuantification #-}
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
However, since none of this code compiles anyway, I think writing it out with exists is clearer.
2 Technically, there are some other functions which look relevant, such as toDyn :: Typeable a => a -> Dynamic and fromDyn :: Typeable a => Dynamic -> a -> a. However, Dynamic is more or less an existential wrapper around Typeables, relying on typeOf and TypeReps to know when to unsafeCoerce (GHC uses some implementation-specific types and unsafeCoerce, but you could do it this way, with the possible exception of dynApply/dynApp), so toDyn doesn't do anything new. And fromDyn doesn't really expect its argument of type a; it's just a wrapper around cast. These functions, and the other similar ones, don't provide any extra power that isn't available with just typeOf and cast. (For instance, going back to a Dynamic isn't very useful for your problem!)
3 To see the error in action, you can try to compile the following complete Haskell program:
{-# LANGUAGE ExistentialQuantification #-}
import Data.Dynamic
import Data.Typeable
import Data.Maybe
randItem :: [a] -> IO a
randItem = return . head -- Good enough for a short and non-compiling example
dynListToList :: Typeable a => [Dynamic] -> [a]
dynListToList = mapMaybe fromDynamic
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
randList dl = do
tr <- randItem $ map dynTypeRep dl
return . TypeableList $ dynListToList dl -- Error! Ambiguous type variable.
Sure enough, if you try to compile this, you get the error:
SO12273982.hs:17:27:
Ambiguous type variable `a0' in the constraint:
(Typeable a0) arising from a use of `dynListToList'
Probable fix: add a type signature that fixes these type variable(s)
In the second argument of `($)', namely `dynListToList dl'
In a stmt of a 'do' block: return . TypeableList $ dynListToList dl
In the expression:
do { tr <- randItem $ map dynTypeRep dl;
return . TypeableList $ dynListToList dl }
But as is the entire point of the question, you can't "add a type signature that fixes these type variable(s)", because you don't know what type you want.
4 Mostly. GHC 7.4 has support for lifting types to kinds and for kind polymorphism; see section 7.8, "Kind polymorphism and promotion", in the GHC 7.4 user manual. This doesn't make Haskell dependently typed—something like TypeRep -> * example is still out5—but you will be able to write Vec by using very expressive types that look like values.
5 Technically, you could now write down something which looks like it has the desired type: type family ToType :: TypeRep -> *. However, this takes a type of the promoted kind TypeRep, and not a value of the type TypeRep; and besides, you still wouldn't be able to implement it. (At least I don't think so, and I can't see how you would—but I am not an expert in this.) But at this point, we're pretty far afield.
What you're observing is that the type TypeRep doesn't actually carry any type-level information along with it; only term-level information. This is a shame, but we can do better when we know all the type constructors we care about. For example, suppose we only care about Ints, lists, and function types.
{-# LANGUAGE GADTs, TypeOperators #-}
import Control.Monad
data a :=: b where Refl :: a :=: a
data Dynamic where Dynamic :: TypeRep a -> a -> Dynamic
data TypeRep a where
Int :: TypeRep Int
List :: TypeRep a -> TypeRep [a]
Arrow :: TypeRep a -> TypeRep b -> TypeRep (a -> b)
class Typeable a where typeOf :: TypeRep a
instance Typeable Int where typeOf = Int
instance Typeable a => Typeable [a] where typeOf = List typeOf
instance (Typeable a, Typeable b) => Typeable (a -> b) where
typeOf = Arrow typeOf typeOf
congArrow :: from :=: from' -> to :=: to' -> (from -> to) :=: (from' -> to')
congArrow Refl Refl = Refl
congList :: a :=: b -> [a] :=: [b]
congList Refl = Refl
eq :: TypeRep a -> TypeRep b -> Maybe (a :=: b)
eq Int Int = Just Refl
eq (Arrow from to) (Arrow from' to') = liftM2 congArrow (eq from from') (eq to to')
eq (List t) (List t') = liftM congList (eq t t')
eq _ _ = Nothing
eqTypeable :: (Typeable a, Typeable b) => Maybe (a :=: b)
eqTypeable = eq typeOf typeOf
toDynamic :: Typeable a => a -> Dynamic
toDynamic a = Dynamic typeOf a
-- look ma, no unsafeCoerce!
fromDynamic_ :: TypeRep a -> Dynamic -> Maybe a
fromDynamic_ rep (Dynamic rep' a) = case eq rep rep' of
Just Refl -> Just a
Nothing -> Nothing
fromDynamic :: Typeable a => Dynamic -> Maybe a
fromDynamic = fromDynamic_ typeOf
All of the above is pretty standard. For more on the design strategy, you'll want to read about GADTs and singleton types. Now, the function you want to write follows; the type is going to look a bit daft, but bear with me.
-- extract only the elements of the list whose type match the head
firstOnly :: [Dynamic] -> Dynamic
firstOnly [] = Dynamic (List Int) []
firstOnly (Dynamic rep v:xs) = Dynamic (List rep) (v:go xs) where
go [] = []
go (Dynamic rep' v:xs) = case eq rep rep' of
Just Refl -> v : go xs
Nothing -> go xs
Here we've picked a random element (I rolled a die, and it came up 1) and extracted only the elements that have a matching type from the list of dynamic values. Now, we could have done the same thing with regular boring old Dynamic from the standard libraries; however, what we couldn't have done is used the TypeRep in a meaningful way. I now demonstrate that we can do so: we'll pattern match on the TypeRep, and then use the enclosed value at the specific type the TypeRep tells us it is.
use :: Dynamic -> [Int]
use (Dynamic (List (Arrow Int Int)) fs) = zipWith ($) fs [1..]
use (Dynamic (List Int) vs) = vs
use (Dynamic Int v) = [v]
use (Dynamic (Arrow (List Int) (List (List Int))) f) = concat (f [0..5])
use _ = []
Note that on the right-hand sides of these equations, we are using the wrapped value at different, concrete types; the pattern match on the TypeRep is actually introducing type-level information.
You want a function that chooses a different type of values to return based on runtime data. Okay, great. But the whole purpose of a type is to tell you what operations can be performed on a value. When you don't know what type will be returned from a function, what do you do with the values it returns? What operations can you perform on them? There are two options:
You want to read the type, and perform some behaviour based on which type it is. In this case you can only cater for a finite list of types known in advance, essentially by testing "is it this type? then we do this operation...". This is easily possible in the current Dynamic framework: just return the Dynamic objects, using dynTypeRep to filter them, and leave the application of fromDynamic to whoever wants to consume your result. Moreover, it could well be possible without Dynamic, if you don't mind setting the finite list of types in your producer code, rather than your consumer code: just use an ADT with a constructor for each type, data Thing = Thing1 Int | Thing2 String | Thing3 (Thing,Thing). This latter option is by far the best if it is possible.
You want to perform some operation that works across a family of types, potentially some of which you don't know about yet, e.g. by using type class operations. This is trickier, and it's tricky conceptually too, because your program is not allowed to change behaviour based on whether or not some type class instance exists – it's an important property of the type class system that the introduction of a new instance can either make a program type check or stop it from type checking, but it can't change the behaviour of a program. Hence you can't throw an error if your input list contains inappropriate types, so I'm really not sure that there's anything you can do that doesn't essentially involve falling back to the first solution at some point.

Resources