Tagging a string with corresponding symbol - haskell

I would like an easy way to create a String tagged with itself. Right now I can
do something like:
data TagString :: Symbol -> * where
Tag :: String -> TagString s
deriving Show
tag :: KnownSymbol s => Proxy s -> TagString s
tag s = Tag (symbolVal s)
and use it like
tag (Proxy :: Proxy "blah")
But this is not nice because
The guarantee about the tag is only provided by tag not by the GADT.
Every time I want to create a value I have to provide a type signature, which
gets unwieldy if the value is part of some bigger expression.
Is there any way to improve this, preferably going in the opposite direction, i.e. from String to Symbol? I would like to write Tag "blah" and have ghc infer the type
TagString "blah".
GHC.TypeLits provides the someSymbolVal function which looks somewhat
related but it produces a SomeSymbol, not a Symbol and I can quite grasp how to use
it.

Is there any way to improve this, preferably going in the opposite direction, i.e. from String to Symbol?
There is no way to go directly from String to Symbol, because Haskell isn't dependently typed, unfortunately. You do have to write out a type annotation every time you want a new value and there isn't an existing tag with the desired symbol already around.
The guarantee about the tag is only provided by tag not by the GADT.
The following should work well (in fact, the same type can be found in the singletons package):
data SSym :: Symbol -> * where
SSym :: KnownSymbol s => SSym s
-- defining values
sym1 = SSym :: SSym "foo"
sym2 = SSym :: SSym "bar"
This type essentially differs from Proxy only by having the KnownSymbol dictionary in the constructor. The dictionary lets us recover the string contained within even if the symbol is not known statically:
extractString :: SSym s -> String
extractString s#SSym = symbolVal s
We pattern matched on SSym, thereby bringing into scope the implicit KnownSymbol dictionary. The same doesn't work with a mere Proxy:
extractString' :: forall (s :: Symbol). Proxy s -> String
extractString' p#Proxy = symbolVal p
-- type error, we can't recover the string from anywhere
... it produces a SomeSymbol, not a Symbol and I can quite grasp how to use it.
SomeSymbol is like SSym except it hides the string it carries around so that it doesn't appear in the type. The string can be recovered by pattern matching on the constructor.
extractString'' :: SomeSymbol -> String
extractString'' (SomeSymbol proxy) = symbolVal proxy
It can be useful when you want to manipulate different symbols in bulk, for example you can put them in a list (which you can't do with different SSym-s, because their types differ).

Related

Type design for the AST of my language remembering token locations

I wrote a parser and evaluator for a simple programming language. Here is a simplified version of the types for the AST:
data Value = IntV Int | FloatV Float | BoolV Bool
data Expr = IfE Value [Expr] | VarDefE String Value
type Program = [Expr]
I want error messages to tell the line and column of the source code in which the error occured. For example, if the value in an If expression is not a boolean, I want the evaluator to show an error saying "expected boolean at line x, column y", with x and y referring to the location of the value.
So, what I need to do is redefine the previous types so that they can store the relevant locations of different things. One option would be to add a location to each constructor for expressions, like so:
type Location = (Int, Int)
data Expr = IfE Value [Expr] Location | VarDef String Value Location
This clearly isn't optimal, because I have to add those Location fields to every possible expression, and if for example a value contained other values, I would need to add locations to that value too:
{-
this would turn into FunctionCall String [Value] [Location],
with one location for each value in the function call
-}
data Value = ... | FunctionCall String [Value]
I came up with another solution, which allows me to add locations to everything:
data Located a = Located Location a
type LocatedExpr = Located Expr
type LocatedValue = Located Value
data Value = IntV Int | FloatV Float | BoolV Bool | FunctionCall String [LocatedValue]
data Expr = IfE LocatedValue [LocatedExpr] | VarDef String LocatedValue
data Program = [LocatedExpr]
However I don't like this that much. First of all, it clutters the definition of the evaluator and pattern matching has an extra layer every time. Also, I don't think saying that a function call takes located values as arguments is quite right. Function calls should take values as arguments, and locations should be metadata that doesn't interfere with the evaluator.
I need help redefining my types so that the solution is as clean as possible. Maybe there is a language extension or a design pattern I don't know about that could be helpful.
There are many ways to annotate an AST! This is half of what’s known as the AST typing problem, the other half being how you manage an AST that changes over the course of compilation. The problem isn’t exactly “solved”: all of the solutions have tradeoffs, and which one to pick depends on your expected use cases. I’ll go over a few that you might like to investigate at the end.
Whichever method you choose for organising the actual data types, if it makes pattern-matching ugly or unwieldy, the natural solution is PatternSynonyms.
Considering your first example:
{-# Language PatternSynonyms #-}
type Location = (Int, Int)
data Expr
= LocatedIf Value [Expr] Location
| LocatedVarDef String Value Location
-- Unidirectional pattern synonyms which ignore the location:
pattern If :: Value -> [Expr] -> Expr
pattern If val exprs <- LocatedIf val exprs _loc
pattern VarDef :: String -> Value -> Expr
pattern VarDef name expr <- LocatedVarDef name expr _loc
-- Inform GHC that matching ‘If’ and ‘VarDef’ is just as good
-- as matching ‘LocatedIf’ and ‘LocatedVarDef’.
{-# Complete If, VarDef #-}
This may be sufficiently tidy for your purposes already. But here are a few more tips that I find helpful.
Put annotations first: when adding an annotation type to an AST directly, I often prefer to place it as the first parameter of each constructor, so that it can be conveniently partially applied.
data LocatedExpr
= LocatedIf Location Value [Expr]
| LocatedVarDef Location String Value
If the annotation is a location, then this also makes it more convenient to obtain when writing certain kinds of parsers, along the lines of AnnotatedIf <$> (getSourceLocation <* ifKeyword) <*> value <*> many expr in a parser combinator library.
Parameterise your annotations: I often make the annotation type into a type parameter, so that GHC can derive some useful classes for me:
{-# Language
DeriveFoldable,
DeriveFunctor,
DeriveTraversable #-}
data AnnotatedExpr a
= AnnotatedIf a Value [Expr]
| AnnotatedVarDef a String Value
deriving (Functor, Foldable, Traversable)
type LocatedExpr = AnnotatedExpr Location
-- Get the annotation of an expression.
-- (Total as long as every constructor is annotated.)
exprAnnotation :: AnnotatedExpr a -> a
exprAnnotation = head
-- Update annotations purely.
mapAnnotations
:: (a -> b)
-> AnnotatedExpr a -> AnnotatedExpr b
mapAnnotations = fmap
-- traverse, foldMap, &c.
If you want “doesn’t interfere”, use polymorphism: you can enforce that the evaluator can’t inspect the annotation type by being polymorphic over it. Pattern synonyms still let you match on these expressions conveniently:
pattern If :: Value -> [AnnotatedExpr a] -> AnnotatedExpr a
pattern If val exprs <- AnnotatedIf _anno val exprs
-- …
eval :: AnnotatedExpr a -> Value
eval expr = case expr of
If val exprs -> -- …
VarDef name expr -> -- …
Unannotated terms aren’t your enemy: a term without source locations is no good for error reporting, but I think it’s still a good idea to make the pattern synonyms bidirectional for the convenience of constructing unannotated terms with a unit () annotation. (Or something equivalent, if you use e.g. Maybe Location as the annotation type.)
The reason is that this is quite convenient for writing unit tests, where you want to check the output, but want to use Eq instead of pattern matching, and don’t want to have to compare all the source locations in tests that aren’t concerned with them. Using the derived classes, void :: (Functor f) => f a -> f () strips out all the annotations on an AST.
import Control.Monad (void)
type BareExpr = AnnotatedExpr ()
-- One way to define bidirectional synonyms, so e.g.
-- ‘If’ can be used as either a pattern or a constructor.
pattern If :: Value -> [BareExpr] -> BareExpr
pattern If val exprs = AnnotatedIf () val exprs
-- …
stripAnnotations :: AnnotatedExpr a -> BareExpr
stripAnnotations = void
Equivalently, you could use GADTs / ExistentialQuantification to say data AnyExpr where { AnyExpr :: AnnotatedExpr a -> AnyExpr } / data AnyExpr = forall a. AnyExpr (AnnotatedExpr a); that way, the annotations have exactly as much information as (), but you don’t need to fmap over the entire tree with void in order to strip it, just apply the AnyExpr constructor to hide the type.
Finally, here are some brief introductions to a few AST typing solutions.
Annotate each AST node with a tag (e.g. a unique ID), then store all metadata like source locations, types, and whatever else, separately from the AST:
import Data.IntMap (IntMap)
-- More sophisticated/stronglier-typed tags are possible.
newtype Tag = Tag Int
newtype TagMap a = TagMap (IntMap a)
data Expr
= If !Tag Value [Expr]
| VarDef !Tag String Expr
type Span = (Location, Location)
type SourceMap = TagMap Span
type CommentMap = TagMap (Span, String)
parse
:: String -- Input
-> Either ParseError
( Expr -- Parsed expression
, SourceMap -- Source locations of tags
, CommentMap -- Sideband for comments
-- …
)
The advantage is that you can very easily mix in arbitrary new types of annotations anywhere, without affecting the AST itself, and avoid rewriting the AST just to change annotations. You can think of the tree and annotation tables as a kind of database, where the tags are the “foreign keys” relating them. A downside is that you must be careful to maintain these tags when you do rewrite the AST.
I don’t know if this approach has an established name; I think of it as just “tagging” or a “tagged AST”.
recursion-schemes and/or Data Types à la CartePDF: separate out the “recursive” part of an annotated expression tree from the “annotation” part, and use Fix to tie them back together, with Compose (or Cofree) to add annotations in the middle.
data ExprF e
= IfF Value [e]
| VarDefF String e
-- …
deriving (Foldable, Functor, Traversable, …)
-- Unannotated: Expr ~ ExprF (ExprF (ExprF (…)))
type Expr = Fix ExprF
-- With a location at each recursive step:
--
-- LocatedExpr ~ Located (ExprF (Located (ExprF (…))))
type LocatedExpr = Fix (Compose Located ExprF)
data Located a = Located Location a
deriving (Foldable, Functor, Traversable, …)
-- or: type Located = (,) Location
A distinct advantage is that you get a bunch of nice traversal stuff like cata for free-ish, so you can avoid having to write manual traversals over your AST over and over. A downside is that it adds some pattern clutter to clean up, as does the “à la carte” approach, but they do offer a lot of flexibility.
Trees That GrowPDF is overkill for just source locations, but in a serious compiler it’s quite helpful. If you expect to have more than one annotation type (such as inferred types or other analysis results) or an AST that changes over time, then you add a type parameter for the “compilation phase” (parsed, renamed, typechecked, desugared, &c.) and select field types or enable & disable constructors based on that index.
A really unfortunate downside of this is that you often have to rewrite the tree even in places nothing has changed, because everything depends on the “phase”. An alternative that I use is to add one type parameter for each type of phase or annotation that can vary independently, e.g. data Expr annotation termVarName typeVarName, and abstract over that with type and pattern synonyms. This lets you update indices independently and still use classes like Functor and Bitraversable.

Resolving an ambiguous type variable

I have these two functions:
load :: Asset a => Reference -> IO (Maybe a)
send :: Asset a => a -> IO ()
The Asset class look like this:
class (Typeable a,ToJSON a, FromJSON a) => Asset a where
ref :: a -> Reference
...
The first reads an asset from disk, and the second transmits a JSON representation to a WebSocket. In isolation they work fine, but when I combine them the compiler cannot deduce what concrete type a should be. (Could not deduce (Asset a0) arising from a use of 'load')
This makes sense, I have not given a concrete type and both load and send are polymorphic. Somehow the compiler has to decide which version of send (and by extension what version of toJSON) to use.
I can determine at run time what the concrete type of a is. This information is actually encoded both in the data on the disk and the Reference type, but I do not know for sure at compile time as the type checker is being run.
Is there a way to pass the correct type at run time an still keep the type checker happy?
Additional Information
The definition of Reference
data Reference = Ref {
assetType:: String
, assetIndex :: Int
} deriving (Eq, Ord, Show, Generic)
References are derived by parsing a request from a WebSocket as follows where Parser comes from the Parsec library.
reference :: Parser Reference
reference = do
t <- string "User"
<|> string "Port"
<|> string "Model"
<|> ...
char '-'
i <- int
return Ref {assetType = t, assetIndex =i}
If I added a type parameter to Reference I simply push my problem back into the parser. I still need to turn a string that I do not know at compile time into a type to make this work.
You can't make a function that turns string data into values of different types depending on what is in the string. That's simply impossible. You need to rearrange things so that your return-type doesn't depend on the string contents.
Your type for load, Asset a => Reference -> IO (Maybe a) says "pick any a (where Asset a) you like and give me a Reference, and I'll give you back an IO action that produces Maybe a". The caller picks the type they expect to be loaded by the reference; the contents of the file do not influence which type is loaded. But you don't want it to be chosen by the caller, you want it to be chosen by what's stored on disk, so the type signature simply doesn't express the operation you actually want. That's your real problem; the ambiguous type variable when combining load and send would be easily resolved (with a type signature or TypeApplications) if load and send were individually correct and combining them was the only problem.
Basically you can't just have load return a polymorphic type, because if it does then the caller gets to (must) decide what type it returns. There's two ways to avoid this that are more-or-less equivalent: return an existential wrapper, or use rank 2 types and add a polymorphic handler function (continuation) as a parameter.
Using an existential wrapper (requires GADTs extension), it looks something like this:
data SomeAsset
where Some :: Asset a => a -> SomeAsset
load :: Reference -> IO (Maybe SomeAsset)
Notice load is no longer polymorphic. You get a SomeAsset that (as far as the type checker is concerned) could contain any type that has an Asset instance. load can internally use whatever logic it wants split into multiple branches and come up with values of different types of asset on different branches; provided each branch ends with wrapping up the asset value with the SomeAsset constructor all of the branches will return the same type.
To send it, you would use something like (ignoring that I'm not handling Nothing):
loadAndSend :: Reference -> IO ()
loadAndSend ref
= do Just someAsset <- load ref
case someAsset
of SomeAsset asset -> send asset
The SomeAsset wrapper guarantees that Asset holds for its wrapped value, so you can unwrap them and call any Asset-polymorphic function on the result. However you can never do anything with the value that depends on the specific type in any other way1, which is why you have to keep it wrapped up and case match on it all the time; if the case expression results in a type that depends on the contained type (such as case someAsset of SomeAsset a -> a) the compiler will not accept your code.
The other way is to instead use RankNTypes and give load a type like this:
load :: (forall a. Asset a => a -> r) -> Reference -> IO (Maybe r)
Here load doesn't return a value representing the loaded asset at all. What it does instead is take a polymorphic function as an argument; the function works on any Asset and returns a type r (that was chosen by load's caller), so again load can internally branch however it wants and construct differently-typed assets in the different branches. The different asset types can all be passed to the handler, so the handler can be called in every branch.
My preference is often to use the SomeAsset approach, but then to also use RankNTypes and define a helper function like:
withSomeAsset :: (forall a. Asset a => a -> r) -> (SomeAsset -> r)
withSomeAsset f (SomeAsset a) = f a
This avoids having to restructure your code into continuation passing style, but takes away the heave case syntax everywhere you need to use a SomeAsset:
loadAndSend :: Reference -> IO ()
loadAndSend ref
= do Just asset <- load ref
withSomeAsset send asset
Or even add:
sendSome = withSomeAsset send
Daniel Wagner suggested adding the type parameter to Reference, which the OP objected to by stating that simply moves the same problem to when the references are constructed. If the references contain data representing which type of asset they refer to, then I would strongly recommend taking Daniel's advice, and using the concepts described in this answer to address that problem at the reference-constructing level. Reference having a type parameter prevents mixing up references to the wrong types of assets where you do know the type.
And if you do significant processing with references and assets of the same type, then having the type parameter in your workhorse code can catch easy mistakes mixing them up even if you usually existential the type away at the outer levels of code.
1 Technically your Asset implies Typeable, so you can test it for specific types and then return those.
Sure, make Reference store the type.
data Reference a where
UserRef :: Int -> Reference User
PortRef :: Int -> Reference Port
ModelRef :: Int -> Reference Model
load :: Asset a => Reference a -> IO (Maybe a)
send :: Asset a => a -> IO ()
If necessary, you can still recover the strong points of your original Reference type by existentially boxing it.
data SomeAsset f where SomeAsset :: Asset a => f a -> SomeAsset f
reference :: Parser (SomeAsset Reference)
reference = asum
[ string "User" *> go UserRef
, string "Port" *> go PortRef
, string "Model" *> go ModelRef
]
where
go :: Asset a => (Int -> Parser (Reference a)) -> Parser (SomeAsset Reference)
go constructor = constructor <$ char '-' <*> int
loadAndSend :: SomeAsset Reference -> IO ()
loadAndSend (SomeAsset reference) = load reference >>= traverse_ send
After reviewing the answers from Daniel Wagner and Ben, I ultimately resolved my issue using a combination of the two which I place here in hopes it will aid others.
First, per Daniel Wagner's answer, I added a phantom type to Reference:
data Reference a = Ref {
assetType:: String
, assetIndex :: Int
} deriving (Eq, Ord, Show, Generic)
I chose not to use a GADT constructors and leave the string reference to assetType as I frequently send references over the wire and/or parse them from incoming text. I felt there were too many code points where I needed a generic reference. For those cases, I fill in the phantom type with Void:
{-# LANGUAGE EmptyDataDecls #-}
data Void
-- make this reference Generic
voidRef :: Reference a -> Reference Void
castRef :: a -> Reference b -> Reference a
-- ^^^ Note this can be undefined used only for its type
With this the load type signature becomes load :: Asset a => Reference a -> IO (Maybe a) So the Asset is always matches the type of the Reference. (Yay type safety!)
That still doesn't address how to load a generic reference. For those cases, I wrote some new code using the second half of Ben's answer. By wrapping the asset in SomeAsset, I can return a Type which is making the type checker happy.
{-# LANGUAGE GADTs #-}
import Data.Aeson (encode)
loadGenericAsset :: Reference Void -> IO SomeAsset
loadGenericAsset ref =
case assetType ref of
"User" -> Some <$> load (castRef (undefined :: User) ref)
"Port" -> Some <$> load (castRef (undefined :: Port) ref)
[etc...]
send :: SomeAsset -> IO ()
send (Some a) = writeToUser (encode a)
data SomeAsset where
Some :: Asset a => a -> SomeAsset

Programmatic type annotations in Haskell

When metaprogramming, it may be useful (or necessary) to pass along to Haskell's type system information about types that's known to your program but not inferable in Hindley-Milner. Is there a library (or language extension, etc) that provides facilities for doing this—that is, programmatic type annotations—in Haskell?
Consider a situation where you're working with a heterogenous list (implemented using the Data.Dynamic library or existential quantification, say) and you want to filter the list down to a bog-standard, homogeneously typed Haskell list. You can write a function like
import Data.Dynamic
import Data.Typeable
dynListToList :: (Typeable a) => [Dynamic] -> [a]
dynListToList = (map fromJust) . (filter isJust) . (map fromDynamic)
and call it with a manual type annotation. For example,
foo :: [Int]
foo = dynListToList [ toDyn (1 :: Int)
, toDyn (2 :: Int)
, toDyn ("foo" :: String) ]
Here foo is the list [1, 2] :: [Int]; that works fine and you're back on solid ground where Haskell's type system can do its thing.
Now imagine you want to do much the same thing but (a) at the time you write the code you don't know what the type of the list produced by a call to dynListToList needs to be, yet (b) your program does contain the information necessary to figure this out, only (c) it's not in a form accessible to the type system.
For example, say you've randomly selected an item from your heterogenous list and you want to filter the list down by that type. Using the type-checking facilities supplied by Data.Typeable, your program has all the information it needs to do this, but as far as I can tell—this is the essence of the question—there's no way to pass it along to the type system. Here's some pseudo-Haskell that shows what I mean:
import Data.Dynamic
import Data.Typeable
randList :: (Typeable a) => [Dynamic] -> IO [a]
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (dynListToList dl :: [<tr>]) -- This thing should have the type
-- represented by `tr`
(Assume randItem selects a random item from a list.)
Without a type annotation on the argument of return, the compiler will tell you that it has an "ambiguous type" and ask you to provide one. But you can't provide a manual type annotation because the type is not known at write-time (and can vary); the type is known at run-time, however—albeit in a form the type system can't use (here, the type needed is represented by the value tr, a TypeRep—see Data.Typeable for details).
The pseudo-code :: [<tr>] is the magic I want to happen. Is there any way to provide the type system with type information programatically; that is, with type information contained in a value in your program?
Basically I'm looking for a function with (pseudo-) type ??? -> TypeRep -> a that takes a value of a type unknown to Haskell's type system and a TypeRep and says, "Trust me, compiler, I know what I'm doing. This thing has the value represented by this TypeRep." (Note that this is not what unsafeCoerce does.)
Or is there something completely different that gets me the same place? For example, I can imagine a language extension that permits assignment to type variables, like a souped-up version of the extension enabling scoped type variables.
(If this is impossible or highly impractical,—e.g., it requires packing a complete GHCi-like interpreter into the executable—please try to explain why.)
No, you can't do this. The long and short of it is that you're trying to write a dependently-typed function, and Haskell isn't a dependently typed language; you can't lift your TypeRep value to a true type, and so there's no way to write down the type of your desired function. To explain this in a little more detail, I'm first going to show why the way you've phrased the type of randList doesn't really make sense. Then, I'm going to explain why you can't do what you want. Finally, I'll briefly mention a couple thoughts on what to actually do.
Existentials
Your type signature for randList can't mean what you want it to mean. Remembering that all type variables in Haskell are universally quantified, it reads
randList :: forall a. Typeable a => [Dynamic] -> IO [a]
Thus, I'm entitled to call it as, say, randList dyns :: IO [Int] anywhere I want; I must be able to provide a return value for all a, not simply for some a. Thinking of this as a game, it's one where the caller can pick a, not the function itself. What you want to say (this isn't valid Haskell syntax, although you can translate it into valid Haskell by using an existential data type1) is something more like
randList :: [Dynamic] -> (exists a. Typeable a => IO [a])
This promises that the elements of the list are of some type a, which is an instance of Typeable, but not necessarily any such type. But even with this, you'll have two problems. First, even if you could construct such a list, what could you do with it? And second, it turns out that you can't even construct it in the first place.
Since all that you know about the elements of the existential list is that they're instances of Typeable, what can you do with them? Looking at the documentation, we see that there are only two functions2 which take instances of Typeable:
typeOf :: Typeable a => a -> TypeRep, from the type class itself (indeed, the only method therein); and
cast :: (Typeable a, Typeable b) => a -> Maybe b (which is implemented with unsafeCoerce, and couldn't be written otherwise).
Thus, all that you know about the type of the elements in the list is that you can call typeOf and cast on them. Since we'll never be able to usefully do anything else with them, our existential might just as well be (again, not valid Haskell)
randList :: [Dynamic] -> IO [(TypeRep, forall b. Typeable b => Maybe b)]
This is what we get if we apply typeOf and cast to every element of our list, store the results, and throw away the now-useless existentially typed original value. Clearly, the TypeRep part of this list isn't useful. And the second half of the list isn't either. Since we're back to a universally-quantified type, the caller of randList is once again entitled to request that they get a Maybe Int, a Maybe Bool, or a Maybe b for any (typeable) b of their choosing. (In fact, they have slightly more power than before, since they can instantiate different elements of the list to different types.) But they can't figure out what type they're converting from unless they already know it—you've still lost the type information you were trying to keep.
And even setting aside the fact that they're not useful, you simply can't construct the desired existential type here. The error arises when you try to return the existentially-typed list (return $ dynListToList dl). At what specific type are you calling dynListToList? Recall that dynListToList :: forall a. Typeable a => [Dynamic] -> [a]; thus, randList is responsible for picking which a dynListToList is going to use. But it doesn't know which a to pick; again, that's the source of the question! So the type that you're trying to return is underspecified, and thus ambiguous.3
Dependent types
OK, so what would make this existential useful (and possible)? Well, we actually have slightly more information: not only do we know there's some a, we have its TypeRep. So maybe we can package that up:
randList :: [Dynamic] -> (exists a. Typeable a => IO (TypeRep,[a]))
This isn't quite good enough, though; the TypeRep and the [a] aren't linked at all. And that's exactly what you're trying to express: some way to link the TypeRep and the a.
Basically, your goal is to write something like
toType :: TypeRep -> *
Here, * is the kind of all types; if you haven't seen kinds before, they are to types what types are to values. * classifies types, * -> * classifies one-argument type constructors, etc. (For instance, Int :: *, Maybe :: * -> *, Either :: * -> * -> *, and Maybe Int :: *.)
With this, you could write (once again, this code isn't valid Haskell; in fact, it really bears only a passing resemblance to Haskell, as there's no way you could write it or anything like it within Haskell's type system):
randList :: [Dynamic] -> (exists (tr :: TypeRep).
Typeable (toType tr) => IO (tr, [toType tr]))
randList dl = do
tr <- randItem $ map dynTypeRep dl
return (tr, dynListToList dl :: [toType tr])
-- In fact, in an ideal world, the `:: [toType tr]` signature would be
-- inferable.
Now, you're promising the right thing: not that there exists some type which classifies the elements of the list, but that there exists some TypeRep such that its corresponding type classifies the elements of the list. If only you could do this, you would be set. But writing toType :: TypeRep -> * is completely impossible in Haskell: doing this requires a dependently-typed language, since toType tr is a type which depends on a value.
What does this mean? In Haskell, it's perfectly acceptable for values to depend on other values; this is what a function is. The value head "abc", for instance, depends on the value "abc". Similarly, we have type constructors, so it's acceptable for types to depend on other types; consider Maybe Int, and how it depends on Int. We can even have values which depend on types! Consider id :: a -> a. This is really a family of functions: id_Int :: Int -> Int, id_Bool :: Bool -> Bool, etc. Which one we have depends on the type of a. (So really, id = \(a :: *) (x :: a) -> x; although we can't write this in Haskell, there are languages where we can.)
Crucially, however, we can never have a type that depends on a value. We might want such a thing: imagine Vec 7 Int, the type of length-7 lists of integers. Here, Vec :: Nat -> * -> *: a type whose first argument must be a value of type Nat. But we can't write this sort of thing in Haskell.4 Languages which support this are called dependently-typed (and will let us write id as we did above); examples include Coq and Agda. (Such languages often double as proof assistants, and are generally used for research work as opposed to writing actual code. Dependent types are hard, and making them useful for everyday programming is an active area of research.)
Thus, in Haskell, we can check everything about our types first, throw away all that information, and then compile something that refers only to values. In fact, this is exactly what GHC does; since we can never check types at run-time in Haskell, GHC erases all the types at compile-time without changing the program's run-time behavior. This is why unsafeCoerce is easy to implement (operationally) and completely unsafe: at run-time, it's a no-op, but it lies to the type system. Consequently, something like toType is completely impossible to implement in the Haskell type system.
In fact, as you noticed, you can't even write down the desired type and use unsafeCoerce. For some problems, you can get away with this; we can write down the type for the function, but only implement it with by cheating. That's exactly how fromDynamic works. But as we saw above, there's not even a good type to give to this problem from within Haskell. The imaginary toType function allows you to give the program a type, but you can't even write down toType's type!
What now?
So, you can't do this. What should you do? My guess is that your overall architecture isn't ideal for Haskell, although I haven't seen it; Typeable and Dynamic don't actually show up that much in Haskell programs. (Perhaps you're "speaking Haskell with a Python accent", as they say.) If you only have a finite set of data types to deal with, you might be able to bundle things into a plain old algebraic data type instead:
data MyType = MTInt Int | MTBool Bool | MTString String
Then you can write isMTInt, and just use filter isMTInt, or filter (isSameMTAs randomMT).
Although I don't know what it is, there's probably a way you could unsafeCoerce your way through this problem. But frankly, that's not a good idea unless you really, really, really, really, really, really know what you're doing. And even then, it's probably not. If you need unsafeCoerce, you'll know, it won't just be a convenience thing.
I really agree with Daniel Wagner's comment: you're probably going to want to rethink your approach from scratch. Again, though, since I haven't seen your architecture, I can't say what that will mean. Maybe there's another Stack Overflow question in there, if you can distill out a concrete difficulty.
1 That looks like the following:
{-# LANGUAGE ExistentialQuantification #-}
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
However, since none of this code compiles anyway, I think writing it out with exists is clearer.
2 Technically, there are some other functions which look relevant, such as toDyn :: Typeable a => a -> Dynamic and fromDyn :: Typeable a => Dynamic -> a -> a. However, Dynamic is more or less an existential wrapper around Typeables, relying on typeOf and TypeReps to know when to unsafeCoerce (GHC uses some implementation-specific types and unsafeCoerce, but you could do it this way, with the possible exception of dynApply/dynApp), so toDyn doesn't do anything new. And fromDyn doesn't really expect its argument of type a; it's just a wrapper around cast. These functions, and the other similar ones, don't provide any extra power that isn't available with just typeOf and cast. (For instance, going back to a Dynamic isn't very useful for your problem!)
3 To see the error in action, you can try to compile the following complete Haskell program:
{-# LANGUAGE ExistentialQuantification #-}
import Data.Dynamic
import Data.Typeable
import Data.Maybe
randItem :: [a] -> IO a
randItem = return . head -- Good enough for a short and non-compiling example
dynListToList :: Typeable a => [Dynamic] -> [a]
dynListToList = mapMaybe fromDynamic
data TypeableList = forall a. Typeable a => TypeableList [a]
randList :: [Dynamic] -> IO TypeableList
randList dl = do
tr <- randItem $ map dynTypeRep dl
return . TypeableList $ dynListToList dl -- Error! Ambiguous type variable.
Sure enough, if you try to compile this, you get the error:
SO12273982.hs:17:27:
Ambiguous type variable `a0' in the constraint:
(Typeable a0) arising from a use of `dynListToList'
Probable fix: add a type signature that fixes these type variable(s)
In the second argument of `($)', namely `dynListToList dl'
In a stmt of a 'do' block: return . TypeableList $ dynListToList dl
In the expression:
do { tr <- randItem $ map dynTypeRep dl;
return . TypeableList $ dynListToList dl }
But as is the entire point of the question, you can't "add a type signature that fixes these type variable(s)", because you don't know what type you want.
4 Mostly. GHC 7.4 has support for lifting types to kinds and for kind polymorphism; see section 7.8, "Kind polymorphism and promotion", in the GHC 7.4 user manual. This doesn't make Haskell dependently typed—something like TypeRep -> * example is still out5—but you will be able to write Vec by using very expressive types that look like values.
5 Technically, you could now write down something which looks like it has the desired type: type family ToType :: TypeRep -> *. However, this takes a type of the promoted kind TypeRep, and not a value of the type TypeRep; and besides, you still wouldn't be able to implement it. (At least I don't think so, and I can't see how you would—but I am not an expert in this.) But at this point, we're pretty far afield.
What you're observing is that the type TypeRep doesn't actually carry any type-level information along with it; only term-level information. This is a shame, but we can do better when we know all the type constructors we care about. For example, suppose we only care about Ints, lists, and function types.
{-# LANGUAGE GADTs, TypeOperators #-}
import Control.Monad
data a :=: b where Refl :: a :=: a
data Dynamic where Dynamic :: TypeRep a -> a -> Dynamic
data TypeRep a where
Int :: TypeRep Int
List :: TypeRep a -> TypeRep [a]
Arrow :: TypeRep a -> TypeRep b -> TypeRep (a -> b)
class Typeable a where typeOf :: TypeRep a
instance Typeable Int where typeOf = Int
instance Typeable a => Typeable [a] where typeOf = List typeOf
instance (Typeable a, Typeable b) => Typeable (a -> b) where
typeOf = Arrow typeOf typeOf
congArrow :: from :=: from' -> to :=: to' -> (from -> to) :=: (from' -> to')
congArrow Refl Refl = Refl
congList :: a :=: b -> [a] :=: [b]
congList Refl = Refl
eq :: TypeRep a -> TypeRep b -> Maybe (a :=: b)
eq Int Int = Just Refl
eq (Arrow from to) (Arrow from' to') = liftM2 congArrow (eq from from') (eq to to')
eq (List t) (List t') = liftM congList (eq t t')
eq _ _ = Nothing
eqTypeable :: (Typeable a, Typeable b) => Maybe (a :=: b)
eqTypeable = eq typeOf typeOf
toDynamic :: Typeable a => a -> Dynamic
toDynamic a = Dynamic typeOf a
-- look ma, no unsafeCoerce!
fromDynamic_ :: TypeRep a -> Dynamic -> Maybe a
fromDynamic_ rep (Dynamic rep' a) = case eq rep rep' of
Just Refl -> Just a
Nothing -> Nothing
fromDynamic :: Typeable a => Dynamic -> Maybe a
fromDynamic = fromDynamic_ typeOf
All of the above is pretty standard. For more on the design strategy, you'll want to read about GADTs and singleton types. Now, the function you want to write follows; the type is going to look a bit daft, but bear with me.
-- extract only the elements of the list whose type match the head
firstOnly :: [Dynamic] -> Dynamic
firstOnly [] = Dynamic (List Int) []
firstOnly (Dynamic rep v:xs) = Dynamic (List rep) (v:go xs) where
go [] = []
go (Dynamic rep' v:xs) = case eq rep rep' of
Just Refl -> v : go xs
Nothing -> go xs
Here we've picked a random element (I rolled a die, and it came up 1) and extracted only the elements that have a matching type from the list of dynamic values. Now, we could have done the same thing with regular boring old Dynamic from the standard libraries; however, what we couldn't have done is used the TypeRep in a meaningful way. I now demonstrate that we can do so: we'll pattern match on the TypeRep, and then use the enclosed value at the specific type the TypeRep tells us it is.
use :: Dynamic -> [Int]
use (Dynamic (List (Arrow Int Int)) fs) = zipWith ($) fs [1..]
use (Dynamic (List Int) vs) = vs
use (Dynamic Int v) = [v]
use (Dynamic (Arrow (List Int) (List (List Int))) f) = concat (f [0..5])
use _ = []
Note that on the right-hand sides of these equations, we are using the wrapped value at different, concrete types; the pattern match on the TypeRep is actually introducing type-level information.
You want a function that chooses a different type of values to return based on runtime data. Okay, great. But the whole purpose of a type is to tell you what operations can be performed on a value. When you don't know what type will be returned from a function, what do you do with the values it returns? What operations can you perform on them? There are two options:
You want to read the type, and perform some behaviour based on which type it is. In this case you can only cater for a finite list of types known in advance, essentially by testing "is it this type? then we do this operation...". This is easily possible in the current Dynamic framework: just return the Dynamic objects, using dynTypeRep to filter them, and leave the application of fromDynamic to whoever wants to consume your result. Moreover, it could well be possible without Dynamic, if you don't mind setting the finite list of types in your producer code, rather than your consumer code: just use an ADT with a constructor for each type, data Thing = Thing1 Int | Thing2 String | Thing3 (Thing,Thing). This latter option is by far the best if it is possible.
You want to perform some operation that works across a family of types, potentially some of which you don't know about yet, e.g. by using type class operations. This is trickier, and it's tricky conceptually too, because your program is not allowed to change behaviour based on whether or not some type class instance exists – it's an important property of the type class system that the introduction of a new instance can either make a program type check or stop it from type checking, but it can't change the behaviour of a program. Hence you can't throw an error if your input list contains inappropriate types, so I'm really not sure that there's anything you can do that doesn't essentially involve falling back to the first solution at some point.

Why does the Data.String.IsString typeclass only define one conversion?

Why does the Haskell base package only define the IsString class to have a conversion from String to 'like-string' value, and not define the inverse transformation, from 'like-string' value to String?
The class should be defined as:
class IsString a where
fromString :: String -> a
toString :: a -> String
ref: http://hackage.haskell.org/packages/archive/base/4.4.0.0/doc/html/Data-String.html
The reason is IMHO that IsString's primary purpose is to be used for string literals in Haskell source code (or (E)DSLs -- see also Paradise: A two-stage DSL embedded in Haskell) via the OverloadedStrings language extension in an analogous way to how other polymorphic literals work (e.g. via fromRational for floating point literals or fromInteger for integer literals)
The term IsString might be a bit misleading, as it suggests that the type-class represents string-like structures, whereas it's really just to denote types which have a quoted-string-representation in Haskell source code.
If you desire to use toString :: a -> String, I think you're simply forgetting about show :: a -> String, or more properly Show a => show :: a -> String.
If you want to operate on a type both having a :: a -> String and :: String -> a, you can simply put those type-class constraints on the functions.
doubleConstraintedFunction :: Show a, IsString a => a -> .. -> .. -> a
We carefully note that we avoid defining type classes having a set of functions that can as well be split into two subclasses. Therefor we don't put toString in IsString.
Finally, I must also mention about Read, which provides Read a => String -> a. You use read and show for very simple serialization. fromString from IsString has a different purpose, it's useful with the language pragma OverloadedStrings, then you can very conveniently insert code like "This is not a string" :: Text. (Text is a (efficient) data-structure for Strings)

Is it possible to use SYB to transform the type?

I want to write a rename function to replace String names (which represent hierarchical identifiers) in my AST with GUID names (integers) from a symbol table carried as hidden state in a Renamer monad.
I have an AST a type that is parameterized over the type of name. Names in the leaves of the AST are of type Name a:
data Name a = Name a
Which makes it easy to target them with a SYB transformer.
The parser is typed (ignoring the possibility of error for brevity):
parse :: String -> AST String
and I want the rename function to be typed:
rename :: AST String -> Renamer (AST GUID)
Is it possible to use SYB to transform all Name String's into Name GUID's with a transformer:
resolveName :: Name String -> Renamer (Name GUID)
and all other values from c String to c GUID by transforming their children, and pasting them back together with the same constructor, albeit with a different type parameter?
The everywhereM function is close to what I want, but it can only transform c a -> m (c a) and not c a -> m (c b).
My fallback solution (other than writing the boiler-plate by hand) is to remove the type parameter from AST, and define Name like this:
data Name = StrName String
| GuidName GUID
so that the rename would be typed:
rename :: AST -> Renamer AST
making it work with everywhereM. However, this would leave the possibility that an AST could still hold StrName's after being renamed. I wanted to use the type system to formally capture the fact that a renamed AST can only hold GUID names.
One solution (perhaps less efficient than you were hoping for) would be to make your AST an instance of Functor, Data and Typeable (GHC 7 can probably derive all of these for you) then do:
import Data.Generics.Uniplate.Data(universeBi) -- from the uniplate package
import qualified Data.Map as Map
rename :: AST String -> Renamer (AST GUID)
rename x = do
let names = nub $ universeBi x :: [Name String]
guids <- mapM resolveName names
let mp = Map.fromList $ zip names guids
return $ fmap (mp Map.!) x
Two points:
I'm assuming it's easy to eliminate the Name bit from resolveName, but I suspect it is.
You can switch universeBi for something equivalent in SYB, but I find it much easier to understand the Uniplate versions.

Resources