Satisfying monad laws without a type constructor - haskell

Given e.g. a type like
data Tree a = Branch (Tree a) (Tree a)
| Leaf a
I can easily write instances for Functor, Applicative, Monad, etc.
But if the "contained" type is predetermined, e.g.
data StringTree = Branch StringTree StringTree
| Leaf String
I lose the ability to write these instances.
If I were to write functions for my StringTree type
stringTreeReturn :: String -> StringTree
stringTreeBind :: String -> (String -> StringTree) -> StringTree
stringTreeFail :: String -> StringTree
-- etc.
that satisfied the monad laws, could I still say that StringTree is a monad?

Tree a isn't a monad, either generally or for any particular a, Tree itself is a monad. A monad isn't a type, it's a correspondence between any type and a "monadic version" that type. For example, Integer is the type of integers, and Maybe Integer is the type of integers in the Maybe monad.
Consequently, StringTree, which is a type, can't be a monad. It's just not the same kind of thing. You can try to imagine it as the type of strings in a monad, but your functions stringTreeReturn, etc, don't match the types of their monadic correspondents. Look at the type of >>= in the Maybe monad:
Maybe a -> (a -> Maybe b) -> Maybe b
The second argument is a function from some a to any type in the Maybe monad (Maybe b). stringTreeBind has the type:
String -> (String -> StringTree) -> StringTree
The second argument can only be a function from String to the monadic version of String, rather than to the monadic version of any type.
Consequently, you can't do everything you can do to values in an arbitrary monadic type to StringTree values, which is why it can't be made an instance. Even if you could somehow get it treated as a monad, things would start going wrong when generic monadic code expects to be able to use generic monadic operations in ways that don't make sense for StringTree.
Ultimately, if you're thinking of it as "like" a monad because it's String in a container that can behave similarly to monadic containers, the easiest thing to do is simply make it a generic container of any type (Tree a). If you need have auxiliary functionality that depends specifically on it being a tree of strings, then you can write that code as operating only on Tree String values, and it will happily co-exist along with code that works generically on Tree a.

No. You're looking at the difference between a type and a kind. Simplistically, a kind is the way that Haskell classifies types, so it's a level of abstraction above types. ghci can help.
:type stringReturn
stringReturn :: String -> StringTree
:type Functor
<interactive>:1:1: Not in scope: data constructor `Functor'
So right off the bat, we can see that a function, which has a type, is an entirely different thing than a type, which has a kind.
We can also ask ghci about kinds.
:kind Functor
Functor (* -> *) -> Constraint
:kind StringTree
StringTree :: *
:kind Tree
Tree :: * -> *
Kinds use * for variables in their notation. We can see from the interaction above that Functor expects a type that is parameterized over one argument. We can also see that StringTree doesn't take any parameters, and Tree takes one.
That's a long way of expressing and unpacking your question, but hopefully it shows the difference between types and kinds.

No, that wouldn't be a monad. Think about it, can you write a monad m so only m String is ever possible?

Related

Which is a polymorphic type: a type or a set of types?

Programming in Haskell by Hutton says:
A type that contains one or more type variables is called polymorphic.
Which is a polymorphic type: a type or a set of types?
Is a polymorphic type with a concrete type substituting its type variable a type?
Is a polymorphic type with different concrete types substituting its type variable considered the same or different types?
Is a polymorphic type with a concrete type substituting its type variable a type?
That's the point, yes. However, you need to be careful. Consider:
id :: a -> a
That's polymorphic. You can substitute a := Int and get Int -> Int, and a := Float -> Float and get (Float -> Float) -> Float -> Float. However, you cannot say a := Maybe and get id :: Maybe -> Maybe. That just doesn't make sense. Instead, we have to require that you can only substitute concrete types like Int and Maybe Float for a, not abstract ones like Maybe. This is handled with the kind system. This is not too important for your question, so I'll just summarize. Int and Float and Maybe Float are all concrete types (that is, they have values), so we say that they have type Type (the type of a type is often called its kind). Maybe is a function that takes a concrete type as an argument and returns a new concrete type, so we say Maybe :: Type -> Type. In the type a -> a, we say the type variable a must have type Type, so now the substitutions a := Int, a := String, etc. are allowed, while stuff like a := Maybe isn't.
Is a polymorphic type with different concrete types substituting its type variable considered the same or different types?
No. Back to a -> a: a := Int gives Int -> Int, but a := Float gives Float -> Float. Not the same.
Which is a polymorphic type: a type or a set of types?
Now that's a loaded question. You can skip to the TL;DR at the end, but the question of "what is a polymorphic type" is actually really confusing in Haskell, so here's a wall of text.
There are two ways to see it. Haskell started with one, then moved to the other, and now we have a ton of old literature referring to the old way, so the syntax of the modern system tries to maintain compatibility. It's a bit of a hot mess. Consider
id x = x
What is the type of id? One point of view is that id :: Int -> Int, and also id :: Float -> Float, and also id :: (Int -> Int) -> Int -> Int, ad infinitum, all simultaneously. This infinite family of types can be summed up with one polymorphic type, id :: a -> a. This point of view gives you the Hindley-Milner type system. This is not how modern GHC Haskell works, but this system is what Haskell was based on at its creation.
In Hindley-Milner, there is a hard line between polymorphic types and monomorphic types, and the union of these two groups gives you "types" in general. It's not really fair to say that, in HM, polymorphic types (in HM jargon, "polytypes") are types. You can't take polytypes as arguments, or return them from functions, or place them in a list. Instead, polytypes are only templates for monotypes. If you squint, in HM, a polymorphic type can be seen as a set of those monotypes that fit the schema.
Modern Haskell is built on System F (plus extensions). In System F,
id = \x -> x -- rewriting the example
is not a complete definition. Therefore we can't even think about giving it a type. Every lambda-bound variable needs a type annotation, but x has no annotation. Worse, we can't even decide on one: \(x :: Int) -> x is just as good as \(x :: Float) -> x. In System F, what we do is we write
id = /\(a :: Type) -> \(x :: a) -> x
using /\ to represent Λ (upper-case lambda) much as we use \ to represent λ.
id is a function taking two arguments. The first argument is a Type, named a. The second argument is an a. The result is also an a. The type signature is:
id :: forall (a :: Type). a -> a
forall is a new kind of function arrow, basically. Note that it provides a binder for a. In HM, when we said id :: a -> a, we didn't really define what a was. It was a fresh, global variable. By convention, more than anything else, that variable is not used anywhere else (otherwise the Generalization rule doesn't apply and everything breaks down). If I had written e.g. inject :: a -> Maybe a, afterwards, the textual occurrences of a would be referring to a new global entity, different from the one in id. In System F, the a in forall a. a -> a actually has scope. It's a "local variable" available only for use underneath that forall. The a in inject :: forall a. a -> Maybe a may or may not be the "same" a; it doesn't matter, because we have actual scoping rules that keep everything from falling apart.
Because System F has hygienic scoping rules for type variables, polymorphic types are allowed to do everything other types can do. You can take them as arguments
runCont :: forall (a :: Type). (forall (r :: Type). (a -> r) -> r) -> a
runCons a f = f a (id a) -- omitting type signatures; you can fill them in
You put them in data constructors
newtype Yoneda f a = Yoneda (forall b. (a -> b) -> f b)
You can place them in polymorphic containers:
type Bool = forall a. a -> a -> a
true, false :: Bool
true a t f = t
false a t f = f
thueMorse :: [Bool]
thueMorse = false : true : true : false : _etc
There's an important difference from HM. In HM, if something has polymorphic type, it also has, simultaneously, an infinity of monomorphic types. In System F, a thing can only have one type. id = /\a -> \(x :: a) -> x has type forall a. a -> a, not Int -> Int or Float -> Float. In order to get an Int -> Int out of id, you have to actually give it an argument: id Int :: Int -> Int, and id Float :: Float -> Float.
Haskell is not System F, however. System F is closer to what GHC calls Core, which is an internal language that GHC compiles Haskell to—basically Haskell without any syntax sugar. Haskell is a Hindley-Milner flavored veneer on top of a System F core. In Haskell, nominally a polymorphic type is a type. They do not act like sets of types. However, polymorphic types are still second class. Haskell doesn't let you actually type forall without -XExplicitForalls. It emulates Hindley-Milner's wonky implicit global variable creation by inserting foralls in certain places. The places where it does so are changed by -XScopedTypeVariables. You can't take polymorphic arguments or have polymorphic fields unless you enable -XRankNTypes. You cannot say things like [forall a. a -> a -> a], nor can you say id (forall a. a -> a -> a) :: (forall a. a -> a -> a) -> (forall a. a -> a -> a)—you must define e.g. newtype Bool = Bool { ifThenElse :: forall a. a -> a -> a } to wrap the polymorphism under something monomorphic. You cannot explicitly give type arguments unless you enable -XTypeApplications, and then you can write id #Int :: Int -> Int. You cannot write type lambdas (/\), period; instead, they are inserted implicitly whenever possible. If you define id :: forall a. a -> a, then you cannot even write id in Haskell. It will always be implicitly expanded to an application, id #_.
TL;DR: In Haskell, a polymorphic type is a type. It's not treated as a set of types, or a rule/schema for types, or whatever. However, due to historical reasons, they are treated as second class citizens. By default, it looks like they are treated as mere sets of types, if you squint a bit. Most restrictions on them can be lifted with suitable language extensions, at which point they look more like "just types". The one remaining big restriction (no impredicative instantiations allowed) is rather fundamental and cannot be erased, but that's fine because there's a workaround.
There is some nuance in the word "type" here. Values have concrete types, which cannot be polymorphic. Expressions, on the other hand, have general types, which can be polymorphic. If you're thinking of types for values, then a polymorphic type can be thought of loosely as defining sets of possible concrete types. (At least first-order polymorphic types! Higher-order polymorphism breaks this intuition.) But that's not always a particularly useful way of thinking, and it's not a sufficient definition. It doesn't capture which sets of types can be described in this way (and related notions like parametricity.)
It's a good observation, though, that the same word, "type", is used in these two related, but different, ways.
EDIT: The answer below turns out not to answer the question. The difference is a subtle mistake in terminology: types like Maybe and [] are higher-kinded, whereas types like forall a. a -> a and forall a. Maybe a are polymorphic. The answer below relates to higher-kinded types, but the question was asked about polymorphic types. I’m still leaving this answer up in case it helps anyone else, but I realise now it’s not really an answer to the question.
I would argue that a polymorphic higher-kinded type is closer to a set of types. For instance, you could see Maybe as the set {Maybe Int, Maybe Bool, …}.
However, strictly speaking, this is a bit misleading. To address this in more detail, we need to learn about kinds. Similarly to how types describe values, we say that kinds describe types. The idea is:
A concrete type (that is, one which has values) has a kind of *. Examples include Bool, Char, Int and Maybe String, which all have type *. This is denoted e.g. Bool :: *. Note that functions such as Int -> String also have kind *, as these are concrete types which can contain values such as show!
A type with a type parameter has a kind containing arrows. For instance, in the same way that id :: a -> a, we can say that Maybe :: * -> *, since Maybe takes a concrete type as an argument (such as Int), and produces a concrete type as a result (such as Maybe Int). Something like a -> a also has kind * -> *, since it has one type parameter (a) and produces a concrete result (a -> a). You can get more complex kinds as well: for instance, data Foo f x = FooConstr (f x x) has kind Foo :: (* -> * -> *) -> * -> *. (Can you see why?)
(If the above explanation doesn’t make sense, the Learn You a Haskell book has a great section on kinds as well.)
So now we can answer your questions properly:
Which is a polymorphic higher-kinded type: a type or a set of types?
Neither: a polymorphic higher-kinded type is a type-level function, as indicated by the arrows in its kind. For instance, Maybe :: * -> * is a type-level function which converts e.g. Int → Maybe Int, Bool → Maybe Bool etc.
Is a polymorphic higher-kinded type with a concrete type substituting its type variable a type?
Yes, when your polymorphic higher-kinded type has a kind * -> * (i.e. it has one type parameter, which accepts a concrete type). When you apply a concrete type Conc :: * to a type Poly :: * -> *, it’s just function application, as detailed above, with the result being Poly Conc :: * i.e. a concrete type.
Is a polymorphic higher-kinded type with different concrete types substituting its type variable considered the same or different types?
This question is a bit out of place, as it doesn’t have anything to do with kinds. The answer is definitely no: two types like Maybe Int and Maybe Bool are not the same. Nothing may be a member of both types, but only the former contains a value Just 4, and only the latter contains a value Just False.
On the other hand, it is possible to have two different substitutions where the resulting types are isomorphic. (An isomorphism is where two types are different, but equivalent in some way. For instance, (a, b) and (b, a) are isomorphic, despite being the same type. The formal condition is that two types p,q are isomorphic when you can write two inverse functions p -> q and q -> p.)
One example of this is Const:
data Const a b = Const { getConst :: a }
This type just ignores its second type parameter; as a result, two types like Const Int Char and Const Int Bool are isomorphic. However, they are not the same type: if you make a value of type Const Int Char, but then use it as something of type Const Int Bool, this will result in a type error. This sort of functionality is incredibly useful, as it means you can ‘tag’ a type a using Const a tag, then use the tag as a marker of information on the type level.

Why is Identity monad useful?

I often read that
It seem that identity monad is useless. It's not... but that's another
topic.
So can anyone tell my how is it useful?
Identity is to monads, functors and applicative functors as 0 is to numbers. On its own it seems useless, but it's often needed in places where one expects a monad or an (applicative) functor that actually doesn't do anything.
As already mentioned, Identity allows us to define just monad transformers and then define their corresponding monads just as SomeT Identity.
But that's not all. It's often convenient to also define other concepts in terms of monads, which usually adds a lot of flexibility. For example Conduit i m o (also see this tutorial) defines an element in a pipeline that can request data of type i, can produce data of type o, and uses monad m for internal processing. Then such a pipeline can be run in the given monad using
($$) :: Monad m => Source m a -> Sink a m b -> m b
(where Source is an alias for Conduit with no input and Sink for Conduit with no output). And when no effectful computations are needed in the pipeline, just pure code, we just specialize m to Identity and run such a pipeline as
runIdentity (source $$ sink)
Identity is also the "empty" functor and applicative functor: Identity composed with another functor or applicative functor is isomorphic to the original. For example, Lens' is defined as a function polymorphic in a Functor:
Functor f => (a -> f a) -> s -> f s
roughly speaking, such a lens allows to read or manipulate something of type a inside s, for example a field inside a record (for an introduction to lenses see this post). If we specialize f to Identity, we get
(a -> Identity a) -> s -> Identity s
which is isomorphic to
(a -> a) -> s -> s
so given an updating function on a, return an updating function on s. (For completeness: If we specialize f to Const a, we get (a -> Const b a) -> s -> Const b s, which is isomorphic to (a -> b) -> (s -> b), that is, given a reader on a, return a reader on s.)
Sometimes I work with records whose fields are optional in some contexts (like when parsing the record from JSON) but mandatory in others.
I solve that by parametrizing the record with a functor, and using Maybe or Identity in each case.
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE StandaloneDeriving #-}
data Query f = Query
{
_viewName :: String
, _target :: f Server -- Server is some type, it doesn't matter which
}
deriving (Generic)
The server field is optional when parsing JSON:
instance FromJSON (Query Maybe)
But then I have a function like
withDefaultServer :: Server -> Query Maybe -> Query Identity
withDefaultServer = undefined
that returns a record in which the _target field is mandatory.
(This answer doesn't use anything monadic about Identity, though.)
One use of it is as a base monad for monad transformer stacks: instead of having to provide two types Some :: * ->* and SomeT :: (* -> *) -> * -> *, it is enough to provide just a latter by setting type Some = SomeT Identity.
Another, somewhat similar use case (but completely detached from the whole monad business) is when you need to refer to tuples: we can say () is a nullary tuple, (a, b) is a binary tuple, (a, b, c) is a ternary tuple, and so on, but what does that leave for the unary case? Saying a is a unary tuple for any choice of a is often not satisfactory, for example when we are building some typeclass instances like Data.Tuple.Select, some type constructor is needed to act as the unambiguous key. So by adding e.g. Sel1 instances to Identity a, it forces us to distinguish between (a, b) (a two-tuple containing an a and a b), and Identity (a, b) (a one-tuple containing a single (a, b) value).
(Note that Data.Tuple.Select defines its own type called OneTuple instead of reusing Identity, but it is isomorphic to Identity—in fact, it's just a rename away—and I think it only exists to avoid a non-base dependency.)
One real use-case is to be a (pure) base of monad transformers stack, e.g.
type Reader r = ReaderT r Identity

Relationship between TypeRep and "Type" GADT

In Scrap your boilerplate reloaded, the authors describe a new presentation of Scrap Your Boilerplate, which is supposed to be equivalent to the original.
However, one difference is that they assume a finite, closed set of "base" types, encoded with a GADT
data Type :: * -> * where
Int :: Type Int
List :: Type a -> Type [a]
...
In the original SYB, type-safe cast is used, implemented using the Typeable class.
My questions are:
What is the relationship between these two approaches?
Why was the GADT representation chosen for the "SYB Reloaded" presentation?
[I am one of the authors of the "SYB Reloaded" paper.]
TL;DR We really just used it because it seemed more beautiful to us. The class-based Typeable approach is more practical. The Spine view can be combined with the Typeable class and does not depend on the Type GADT.
The paper states this in its conclusions:
Our implementation handles the two central ingredients of generic programming differently from the original SYB paper: we use overloaded functions with
explicit type arguments instead of overloaded functions based on a type-safe
cast 1 or a class-based extensible scheme [20]; and we use the explicit spine
view rather than a combinator-based approach. Both changes are independent
of each other, and have been made with clarity in mind: we think that the structure of the SYB approach is more visible in our setting, and that the relations
to PolyP and Generic Haskell become clearer. We have revealed that while the
spine view is limited in the class of generic functions that can be written, it is
applicable to a very large class of data types, including GADTs.
Our approach cannot be used easily as a library, because the encoding of
overloaded functions using explicit type arguments requires the extensibility of
the Type data type and of functions such as toSpine. One can, however, incorporate Spine into the SYB library while still using the techniques of the SYB
papers to encode overloaded functions.
So, the choice of using a GADT for type representation is one we made mainly for clarity. As Don states in his answer, there are some obvious advantages in this representation, namely that it maintains static information about what type a type representation is for, and that it allows us to implement cast without any further magic, and in particular without the use of unsafeCoerce. Type-indexed functions can also be implemented directly by using pattern matching on the type, and without falling back to various combinators such as mkQ or extQ.
Fact is that I (and I think the co-authors) simply were not very fond of the Typeable class. (In fact, I'm still not, although it is finally becoming a bit more disciplined now in that GHC adds auto-deriving for Typeable, makes it kind-polymorphic, and will ultimately remove the possibility to define your own instances.) In addition, Typeable wasn't quite as established and widely known as it is perhaps now, so it seemed appealing to "explain" it by using the GADT encoding. And furthermore, this was the time when we were also thinking about adding open datatypes to Haskell, thereby alleviating the restriction that the GADT is closed.
So, to summarize: If you actually need dynamic type information only for a closed universe, I'd always go for the GADT, because you can use pattern matching to define type-indexed functions, and you do not have to rely on unsafeCoerce nor advanced compiler magic. If the universe is open, however, which is quite common, certainly for the generic programming setting, then the GADT approach might be instructive, but isn't practical, and using Typeable is the way to go.
However, as we also state in the conclusions of the paper, the choice of Type over Typeable isn't a prerequisite for the other choice we're making, namely to use the Spine view, which I think is more important and really the core of the paper.
The paper itself shows (in Section 8) a variation inspired by the "Scrap your Boilerplate with Class" paper, which uses a Spine view with a class constraint instead. But we can also do a more direct development, which I show in the following. For this, we'll use Typeable from Data.Typeable, but define our own Data class which, for simplicity, just contains the toSpine method:
class Typeable a => Data a where
toSpine :: a -> Spine a
The Spine datatype now uses the Data constraint:
data Spine :: * -> * where
Constr :: a -> Spine a
(:<>:) :: (Data a) => Spine (a -> b) -> a -> Spine b
The function fromSpine is as trivial as with the other representation:
fromSpine :: Spine a -> a
fromSpine (Constr x) = x
fromSpine (c :<>: x) = fromSpine c x
Instances for Data are trivial for flat types such as Int:
instance Data Int where
toSpine = Constr
And they're still entirely straightforward for structured types such as binary trees:
data Tree a = Empty | Node (Tree a) a (Tree a)
instance Data a => Data (Tree a) where
toSpine Empty = Constr Empty
toSpine (Node l x r) = Constr Node :<>: l :<>: x :<>: r
The paper then goes on and defines various generic functions, such as mapQ. These definitions hardly change. We only get class constraints for Data a => where the paper has function arguments of Type a ->:
mapQ :: Query r -> Query [r]
mapQ q = mapQ' q . toSpine
mapQ' :: Query r -> (forall a. Spine a -> [r])
mapQ' q (Constr c) = []
mapQ' q (f :<>: x) = mapQ' q f ++ [q x]
Higher-level functions such as everything also just lose their explicit type arguments (and then actually look exactly the same as in original SYB):
everything :: (r -> r -> r) -> Query r -> Query r
everything op q x = foldl op (q x) (mapQ (everything op q) x)
As I said above, if we now want to define a generic sum function summing up all Int occurrences, we cannot pattern match anymore, but have to fall back to mkQ, but mkQ is defined purely in terms of Typeable and completely independent of Spine:
mkQ :: (Typeable a, Typeable b) => r -> (b -> r) -> a -> r
(r `mkQ` br) a = maybe r br (cast a)
And then (again exactly as in original SYB):
sum :: Query Int
sum = everything (+) sumQ
sumQ :: Query Int
sumQ = mkQ 0 id
For some of the stuff later in the paper (e.g., adding constructor information), a bit more work is needed, but it can all be done. So using Spine really does not depend on using Type at all.
Well, obviously the Typeable use is open -- new variants can be added after the fact, and without modifying the original definitions.
The important change though is that in that TypeRep is untyped. That is, there is no connection between the runtime type , TypeRep, and the static type it encodes. With the GADT approach we can encode the mapping between a type a and its Type, given by the GADT Type a.
We thus bake in evidence for the type rep being statically linked to its origin type, and can write statically typed dynamic application (for example) using Type a as evidence that we have a runtime a.
In the older TypeRep case, we have no such evidence and it comes down to runtime string equality, and a coerce and hope for the best through fromDynamic.
Compare the signatures:
toDyn :: Typeable a => a -> TypeRep -> Dynamic
versus GADT style:
toDyn :: Type a => a -> Type a -> Dynamic
I can't fake my type evidence, and I can use that later when reconstructing things, to e.g. lookup the type class instances for a when all I have is a Type a.

What are these explicit "forall"s doing?

What is the purpose of the foralls in this code?
class Monad m where
(>>=) :: forall a b. m a -> (a -> m b) -> m b
(>>) :: forall a b. m a -> m b -> m b
-- Explicit for-alls so that we know what order to
-- give type arguments when desugaring
(some code omitted). This is from the code for Monads.
My background: I don't really understand forall or when Haskell has them implicitly.
Also, and it may not be significant, but GHCi allows me to omit the forall when giving >> a type:
Prelude> :t (>>) :: Monad m => m a -> m b -> m b
(>>) :: Monad m => m a -> m b -> m b
:: (Monad m) => m a -> m b -> m b
(no error).
My background: I don't really understand forall or when Haskell has them implicitly.
Okay, consider the type of id, a -> a. What does a mean, and where does it come from? When you define a value, you can't just use arbitrary variables that aren't defined anywhere. You need a top-level definition, or a function argument, or a where clause, &c. In general, if you use a variable, it must be bound somewhere.
The same is true of type variables, and forall is one such way to bind a type variable. Anywhere you see a type variable that isn't explicitly bound (for example, class Foo a where ... binds a inside the class definition), it's implicitly bound by a forall.
So, the type of id is implicitly forall a. a -> a. What does this mean? Pretty much what it says. We can get a type a -> a for all possible types a, or from another perspective, if you pick any specific type you can get a type representing "functions from your chosen type to itself". The latter phrasing should sound a bit like defining a function, and as such you can think of forall as being similar to a lambda abstraction for types.
GHC uses various intermediate representations during compilation, and one of the transformations it applies is making the similarity to functions more direct: implicit foralls are made explicit, and anywhere a polymorphic value is used for a specific type, it is first applied to a type argument.
We can even write both foralls and lambdas as one expression. I'll abuse notation for a moment and replace forall a. with /\a => for visual consistency. In this style, we can define id = /\a => \(x::a) -> (x::a) or something similar. So, an expression like id True in your code would end up translated to something like id Bool True instead; just id True would no longer even make sense.
Just as you can reorder function arguments, you can likewise reorder the type arguments, subject only to the (rather obvious) restriction that type arguments must come before any value arguments of that type. Since implicit foralls are always the outermost layer, GHC could potentially choose any order it wanted when making them explicit. Under normal circumstances, this obviously doesn't matter.
I'm not sure exactly what's going on in this case, but based on the comments I would guess that the conversion to using explicit type arguments and the desugaring of do notation are, in some sense, not aware of each other, and therefore the order of type arguments is specified explicitly to ensure consistency. After all, if something is blindly applying two type arguments to an expression, it matters a great deal whether that expression's type is forall a b. m a -> m b -> m b or forall b a. m a -> m b -> m b!

Programmatic type annotations in Haskell

When metaprogramming, it may be useful (or necessary) to pass along to Haskell's type system information about types that's known to your program but not inferable in Hindley-Milner. Is there a library (or language extension, etc) that provides facilities for doing this—that is, programmatic type annotations—in Haskell?
Consider a situation where you're working with a heterogenous list (implemented using the Data.Dynamic library or existential quantification, say) and you want to filter the list down to a bog-standard, homogeneously typed Haskell list. You can write a function like
import Data.Dynamic
import Data.Typeable
dynListToList :: (Typeable a) => [Dynamic] -> [a]
dynListToList = (map fromJust) . (filter isJust) . (map fromDynamic)
and call it with a manual type annotation. For example,
foo :: [Int]
foo = dynListToList [ toDyn (1 :: Int)
, toDyn (2 :: Int)
, toDyn ("foo" :: String) ]
Here foo is the list [1, 2] :: [Int]; that works fine and you're back on solid ground where Haskell's type system can do its thing.
Now imagine you want to do much the same thing but (a) at the time you write the code you don't know what the type of the list produced by a call to dynListToList needs to be, yet (b) your program does contain the information necessary to figure this out, only (c) it's not in a form accessible to the type system.
For example, say you've randomly selected an item from your heterogenous list and you want to filter the list down by that type. Using the type-checking facilities supplied by Data.Typeable, your program has all the information it needs to do this, but as far as I can tell—this is the essence of the question—there's no way to pass it along to the type system. Here's some pseudo-Haskell that shows what I mean:
import Data.Dynamic
import Data.Typeable
randList :: (Typeable a) => [Dynamic] -> IO [a]
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (dynListToList dl :: [<tr>]) -- This thing should have the type
-- represented by `tr`
(Assume randItem selects a random item from a list.)
Without a type annotation on the argument of return, the compiler will tell you that it has an "ambiguous type" and ask you to provide one. But you can't provide a manual type annotation because the type is not known at write-time (and can vary); the type is known at run-time, however—albeit in a form the type system can't use (here, the type needed is represented by the value tr, a TypeRep—see Data.Typeable for details).
The pseudo-code :: [<tr>] is the magic I want to happen. Is there any way to provide the type system with type information programatically; that is, with type information contained in a value in your program?
Basically I'm looking for a function with (pseudo-) type ??? -> TypeRep -> a that takes a value of a type unknown to Haskell's type system and a TypeRep and says, "Trust me, compiler, I know what I'm doing. This thing has the value represented by this TypeRep." (Note that this is not what unsafeCoerce does.)
Or is there something completely different that gets me the same place? For example, I can imagine a language extension that permits assignment to type variables, like a souped-up version of the extension enabling scoped type variables.
(If this is impossible or highly impractical,—e.g., it requires packing a complete GHCi-like interpreter into the executable—please try to explain why.)
No, you can't do this. The long and short of it is that you're trying to write a dependently-typed function, and Haskell isn't a dependently typed language; you can't lift your TypeRep value to a true type, and so there's no way to write down the type of your desired function. To explain this in a little more detail, I'm first going to show why the way you've phrased the type of randList doesn't really make sense. Then, I'm going to explain why you can't do what you want. Finally, I'll briefly mention a couple thoughts on what to actually do.
Existentials
Your type signature for randList can't mean what you want it to mean. Remembering that all type variables in Haskell are universally quantified, it reads
randList :: forall a. Typeable a => [Dynamic] -> IO [a]
Thus, I'm entitled to call it as, say, randList dyns :: IO [Int] anywhere I want; I must be able to provide a return value for all a, not simply for some a. Thinking of this as a game, it's one where the caller can pick a, not the function itself. What you want to say (this isn't valid Haskell syntax, although you can translate it into valid Haskell by using an existential data type1) is something more like
randList :: [Dynamic] -> (exists a. Typeable a => IO [a])
This promises that the elements of the list are of some type a, which is an instance of Typeable, but not necessarily any such type. But even with this, you'll have two problems. First, even if you could construct such a list, what could you do with it? And second, it turns out that you can't even construct it in the first place.
Since all that you know about the elements of the existential list is that they're instances of Typeable, what can you do with them? Looking at the documentation, we see that there are only two functions2 which take instances of Typeable:
typeOf :: Typeable a => a -> TypeRep, from the type class itself (indeed, the only method therein); and
cast :: (Typeable a, Typeable b) => a -> Maybe b (which is implemented with unsafeCoerce, and couldn't be written otherwise).
Thus, all that you know about the type of the elements in the list is that you can call typeOf and cast on them. Since we'll never be able to usefully do anything else with them, our existential might just as well be (again, not valid Haskell)
randList :: [Dynamic] -> IO [(TypeRep, forall b. Typeable b => Maybe b)]
This is what we get if we apply typeOf and cast to every element of our list, store the results, and throw away the now-useless existentially typed original value. Clearly, the TypeRep part of this list isn't useful. And the second half of the list isn't either. Since we're back to a universally-quantified type, the caller of randList is once again entitled to request that they get a Maybe Int, a Maybe Bool, or a Maybe b for any (typeable) b of their choosing. (In fact, they have slightly more power than before, since they can instantiate different elements of the list to different types.) But they can't figure out what type they're converting from unless they already know it—you've still lost the type information you were trying to keep.
And even setting aside the fact that they're not useful, you simply can't construct the desired existential type here. The error arises when you try to return the existentially-typed list (return $ dynListToList dl). At what specific type are you calling dynListToList? Recall that dynListToList :: forall a. Typeable a => [Dynamic] -> [a]; thus, randList is responsible for picking which a dynListToList is going to use. But it doesn't know which a to pick; again, that's the source of the question! So the type that you're trying to return is underspecified, and thus ambiguous.3
Dependent types
OK, so what would make this existential useful (and possible)? Well, we actually have slightly more information: not only do we know there's some a, we have its TypeRep. So maybe we can package that up:
randList :: [Dynamic] -> (exists a. Typeable a => IO (TypeRep,[a]))
This isn't quite good enough, though; the TypeRep and the [a] aren't linked at all. And that's exactly what you're trying to express: some way to link the TypeRep and the a.
Basically, your goal is to write something like
toType :: TypeRep -> *
Here, * is the kind of all types; if you haven't seen kinds before, they are to types what types are to values. * classifies types, * -> * classifies one-argument type constructors, etc. (For instance, Int :: *, Maybe :: * -> *, Either :: * -> * -> *, and Maybe Int :: *.)
With this, you could write (once again, this code isn't valid Haskell; in fact, it really bears only a passing resemblance to Haskell, as there's no way you could write it or anything like it within Haskell's type system):
randList :: [Dynamic] -> (exists (tr :: TypeRep).
Typeable (toType tr) => IO (tr, [toType tr]))
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (tr, dynListToList dl :: [toType tr])
-- In fact, in an ideal world, the `:: [toType tr]` signature would be
-- inferable.
Now, you're promising the right thing: not that there exists some type which classifies the elements of the list, but that there exists some TypeRep such that its corresponding type classifies the elements of the list. If only you could do this, you would be set. But writing toType :: TypeRep -> * is completely impossible in Haskell: doing this requires a dependently-typed language, since toType tr is a type which depends on a value.
What does this mean? In Haskell, it's perfectly acceptable for values to depend on other values; this is what a function is. The value head "abc", for instance, depends on the value "abc". Similarly, we have type constructors, so it's acceptable for types to depend on other types; consider Maybe Int, and how it depends on Int. We can even have values which depend on types! Consider id :: a -> a. This is really a family of functions: id_Int :: Int -> Int, id_Bool :: Bool -> Bool, etc. Which one we have depends on the type of a. (So really, id = \(a :: *) (x :: a) -> x; although we can't write this in Haskell, there are languages where we can.)
Crucially, however, we can never have a type that depends on a value. We might want such a thing: imagine Vec 7 Int, the type of length-7 lists of integers. Here, Vec :: Nat -> * -> *: a type whose first argument must be a value of type Nat. But we can't write this sort of thing in Haskell.4 Languages which support this are called dependently-typed (and will let us write id as we did above); examples include Coq and Agda. (Such languages often double as proof assistants, and are generally used for research work as opposed to writing actual code. Dependent types are hard, and making them useful for everyday programming is an active area of research.)
Thus, in Haskell, we can check everything about our types first, throw away all that information, and then compile something that refers only to values. In fact, this is exactly what GHC does; since we can never check types at run-time in Haskell, GHC erases all the types at compile-time without changing the program's run-time behavior. This is why unsafeCoerce is easy to implement (operationally) and completely unsafe: at run-time, it's a no-op, but it lies to the type system. Consequently, something like toType is completely impossible to implement in the Haskell type system.
In fact, as you noticed, you can't even write down the desired type and use unsafeCoerce. For some problems, you can get away with this; we can write down the type for the function, but only implement it with by cheating. That's exactly how fromDynamic works. But as we saw above, there's not even a good type to give to this problem from within Haskell. The imaginary toType function allows you to give the program a type, but you can't even write down toType's type!
What now?
So, you can't do this. What should you do? My guess is that your overall architecture isn't ideal for Haskell, although I haven't seen it; Typeable and Dynamic don't actually show up that much in Haskell programs. (Perhaps you're "speaking Haskell with a Python accent", as they say.) If you only have a finite set of data types to deal with, you might be able to bundle things into a plain old algebraic data type instead:
data MyType = MTInt Int | MTBool Bool | MTString String
Then you can write isMTInt, and just use filter isMTInt, or filter (isSameMTAs randomMT).
Although I don't know what it is, there's probably a way you could unsafeCoerce your way through this problem. But frankly, that's not a good idea unless you really, really, really, really, really, really know what you're doing. And even then, it's probably not. If you need unsafeCoerce, you'll know, it won't just be a convenience thing.
I really agree with Daniel Wagner's comment: you're probably going to want to rethink your approach from scratch. Again, though, since I haven't seen your architecture, I can't say what that will mean. Maybe there's another Stack Overflow question in there, if you can distill out a concrete difficulty.
1 That looks like the following:
{-# LANGUAGE ExistentialQuantification #-}
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
However, since none of this code compiles anyway, I think writing it out with exists is clearer.
2 Technically, there are some other functions which look relevant, such as toDyn :: Typeable a => a -> Dynamic and fromDyn :: Typeable a => Dynamic -> a -> a. However, Dynamic is more or less an existential wrapper around Typeables, relying on typeOf and TypeReps to know when to unsafeCoerce (GHC uses some implementation-specific types and unsafeCoerce, but you could do it this way, with the possible exception of dynApply/dynApp), so toDyn doesn't do anything new. And fromDyn doesn't really expect its argument of type a; it's just a wrapper around cast. These functions, and the other similar ones, don't provide any extra power that isn't available with just typeOf and cast. (For instance, going back to a Dynamic isn't very useful for your problem!)
3 To see the error in action, you can try to compile the following complete Haskell program:
{-# LANGUAGE ExistentialQuantification #-}
import Data.Dynamic
import Data.Typeable
import Data.Maybe
randItem :: [a] -> IO a
randItem = return . head -- Good enough for a short and non-compiling example
dynListToList :: Typeable a => [Dynamic] -> [a]
dynListToList = mapMaybe fromDynamic
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
randList dl = do
tr <- randItem $ map dynTypeRep dl
return . TypeableList $ dynListToList dl -- Error! Ambiguous type variable.
Sure enough, if you try to compile this, you get the error:
SO12273982.hs:17:27:
Ambiguous type variable `a0' in the constraint:
(Typeable a0) arising from a use of `dynListToList'
Probable fix: add a type signature that fixes these type variable(s)
In the second argument of `($)', namely `dynListToList dl'
In a stmt of a 'do' block: return . TypeableList $ dynListToList dl
In the expression:
do { tr <- randItem $ map dynTypeRep dl;
return . TypeableList $ dynListToList dl }
But as is the entire point of the question, you can't "add a type signature that fixes these type variable(s)", because you don't know what type you want.
4 Mostly. GHC 7.4 has support for lifting types to kinds and for kind polymorphism; see section 7.8, "Kind polymorphism and promotion", in the GHC 7.4 user manual. This doesn't make Haskell dependently typed—something like TypeRep -> * example is still out5—but you will be able to write Vec by using very expressive types that look like values.
5 Technically, you could now write down something which looks like it has the desired type: type family ToType :: TypeRep -> *. However, this takes a type of the promoted kind TypeRep, and not a value of the type TypeRep; and besides, you still wouldn't be able to implement it. (At least I don't think so, and I can't see how you would—but I am not an expert in this.) But at this point, we're pretty far afield.
What you're observing is that the type TypeRep doesn't actually carry any type-level information along with it; only term-level information. This is a shame, but we can do better when we know all the type constructors we care about. For example, suppose we only care about Ints, lists, and function types.
{-# LANGUAGE GADTs, TypeOperators #-}
import Control.Monad
data a :=: b where Refl :: a :=: a
data Dynamic where Dynamic :: TypeRep a -> a -> Dynamic
data TypeRep a where
Int :: TypeRep Int
List :: TypeRep a -> TypeRep [a]
Arrow :: TypeRep a -> TypeRep b -> TypeRep (a -> b)
class Typeable a where typeOf :: TypeRep a
instance Typeable Int where typeOf = Int
instance Typeable a => Typeable [a] where typeOf = List typeOf
instance (Typeable a, Typeable b) => Typeable (a -> b) where
typeOf = Arrow typeOf typeOf
congArrow :: from :=: from' -> to :=: to' -> (from -> to) :=: (from' -> to')
congArrow Refl Refl = Refl
congList :: a :=: b -> [a] :=: [b]
congList Refl = Refl
eq :: TypeRep a -> TypeRep b -> Maybe (a :=: b)
eq Int Int = Just Refl
eq (Arrow from to) (Arrow from' to') = liftM2 congArrow (eq from from') (eq to to')
eq (List t) (List t') = liftM congList (eq t t')
eq _ _ = Nothing
eqTypeable :: (Typeable a, Typeable b) => Maybe (a :=: b)
eqTypeable = eq typeOf typeOf
toDynamic :: Typeable a => a -> Dynamic
toDynamic a = Dynamic typeOf a
-- look ma, no unsafeCoerce!
fromDynamic_ :: TypeRep a -> Dynamic -> Maybe a
fromDynamic_ rep (Dynamic rep' a) = case eq rep rep' of
Just Refl -> Just a
Nothing -> Nothing
fromDynamic :: Typeable a => Dynamic -> Maybe a
fromDynamic = fromDynamic_ typeOf
All of the above is pretty standard. For more on the design strategy, you'll want to read about GADTs and singleton types. Now, the function you want to write follows; the type is going to look a bit daft, but bear with me.
-- extract only the elements of the list whose type match the head
firstOnly :: [Dynamic] -> Dynamic
firstOnly [] = Dynamic (List Int) []
firstOnly (Dynamic rep v:xs) = Dynamic (List rep) (v:go xs) where
go [] = []
go (Dynamic rep' v:xs) = case eq rep rep' of
Just Refl -> v : go xs
Nothing -> go xs
Here we've picked a random element (I rolled a die, and it came up 1) and extracted only the elements that have a matching type from the list of dynamic values. Now, we could have done the same thing with regular boring old Dynamic from the standard libraries; however, what we couldn't have done is used the TypeRep in a meaningful way. I now demonstrate that we can do so: we'll pattern match on the TypeRep, and then use the enclosed value at the specific type the TypeRep tells us it is.
use :: Dynamic -> [Int]
use (Dynamic (List (Arrow Int Int)) fs) = zipWith ($) fs [1..]
use (Dynamic (List Int) vs) = vs
use (Dynamic Int v) = [v]
use (Dynamic (Arrow (List Int) (List (List Int))) f) = concat (f [0..5])
use _ = []
Note that on the right-hand sides of these equations, we are using the wrapped value at different, concrete types; the pattern match on the TypeRep is actually introducing type-level information.
You want a function that chooses a different type of values to return based on runtime data. Okay, great. But the whole purpose of a type is to tell you what operations can be performed on a value. When you don't know what type will be returned from a function, what do you do with the values it returns? What operations can you perform on them? There are two options:
You want to read the type, and perform some behaviour based on which type it is. In this case you can only cater for a finite list of types known in advance, essentially by testing "is it this type? then we do this operation...". This is easily possible in the current Dynamic framework: just return the Dynamic objects, using dynTypeRep to filter them, and leave the application of fromDynamic to whoever wants to consume your result. Moreover, it could well be possible without Dynamic, if you don't mind setting the finite list of types in your producer code, rather than your consumer code: just use an ADT with a constructor for each type, data Thing = Thing1 Int | Thing2 String | Thing3 (Thing,Thing). This latter option is by far the best if it is possible.
You want to perform some operation that works across a family of types, potentially some of which you don't know about yet, e.g. by using type class operations. This is trickier, and it's tricky conceptually too, because your program is not allowed to change behaviour based on whether or not some type class instance exists – it's an important property of the type class system that the introduction of a new instance can either make a program type check or stop it from type checking, but it can't change the behaviour of a program. Hence you can't throw an error if your input list contains inappropriate types, so I'm really not sure that there's anything you can do that doesn't essentially involve falling back to the first solution at some point.

Resources