Typed abstract syntax and DSL design in Haskell - haskell

I'm designing a DSL in Haskell and I would like to have an assignment operation. Something like this (the code below is just for explaining my problem in a limited context, I didn't have type checked Stmt type):
data Stmt = forall a . Assign String (Exp a) -- Assignment operation
| forall a. Decl String a -- Variable declaration
data Exp t where
EBool :: Bool -> Exp Bool
EInt :: Int -> Exp Int
EAdd :: Exp Int -> Exp Int -> Exp Int
ENot :: Exp Bool -> Exp Bool
In the previous code, I'm able to use a GADT to enforce type constraints on expressions. My problem is how can I enforce that the left hand side of an assignment is: 1) Defined, i.e., a variable must be declared before it is used and 2) The right hand side must have the same type of the left hand side variable?
I know that in a full dependently typed language, I could define statements indexed by some sort of typing context, that is, a list of defined variables and their type. I believe that this would solve my problem. But, I'm wondering if there is some way to achieve this in Haskell.
Any pointer to example code or articles is highly appreciated.

Given that my work focuses on related issues of scope and type safety being encoded at the type-level, I stumbled upon this old-ish question whilst googling around and thought I'd give it a try.
This post provides, I think, an answer quite close to the original specification. The whole thing is surprisingly short once you have the right setup.
First, I'll start with a sample program to give you an idea of what the end result looks like:
program :: Program
program = Program
$ Declare (Var :: Name "foo") (Of :: Type Int)
:> Assign (The (Var :: Name "foo")) (EInt 1)
:> Declare (Var :: Name "bar") (Of :: Type Bool)
:> increment (The (Var :: Name "foo"))
:> Assign (The (Var :: Name "bar")) (ENot $ EBool True)
:> Done
Scoping
In order to ensure that we may only assign values to variables which have been declared before, we need a notion of scope.
GHC.TypeLits provides us with type-level strings (called Symbol) so we can very-well use strings as variable names if we want. And because we want to ensure type safety, each variable declaration comes with a type annotation which we will store together with the variable name. Our type of scopes is therefore: [(Symbol, *)].
We can use a type family to test whether a given Symbol is in scope and return its associated type if that is the case:
type family HasSymbol (g :: [(Symbol,*)]) (s :: Symbol) :: Maybe * where
HasSymbol '[] s = 'Nothing
HasSymbol ('(s, a) ': g) s = 'Just a
HasSymbol ('(t, a) ': g) s = HasSymbol g s
From this definition we can define a notion of variable: a variable of type a in scope g is a symbol s such that HasSymbol g s returns 'Just a. This is what the ScopedSymbol data type represents by using an existential quantification to store the s.
data ScopedSymbol (g :: [(Symbol,*)]) (a :: *) = forall s.
(HasSymbol g s ~ 'Just a) => The (Name s)
data Name (s :: Symbol) = Var
Here I am purposefully abusing notations all over the place: The is the constructor for the type ScopedSymbol and Name is a Proxy type with a nicer name and constructor. This allows us to write such niceties as:
example :: ScopedSymbol ('("foo", Int) ': '("bar", Bool) ': '[]) Bool
example = The (Var :: Name "bar")
Statements
Now that we have a notion of scope and of well-typed variables in that scope, we can start considering the effects Statements should have. Given that new variables can be declared in a Statement, we need to find a way to propagate this information in the scope. The key hindsight is to have two indices: an input and an output scope.
To Declare a new variable together with its type will expand the current scope with the pair of the variable name and the corresponding type.
Assignments on the other hand do not modify the scope. They merely associate a ScopedSymbol to an expression of the corresponding type.
data Statement (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
Declare :: Name s -> Type a -> Statement g ('(s, a) ': g)
Assign :: ScopedSymbol g a -> Exp g a -> Statement g g
data Type (a :: *) = Of
Once again we have introduced a proxy type to have a nicer user-level syntax.
example' :: Statement '[] ('("foo", Int) ': '[])
example' = Declare (Var :: Name "foo") (Of :: Type Int)
example'' :: Statement ('("foo", Int) ': '[]) ('("foo", Int) ': '[])
example'' = Assign (The (Var :: Name "foo")) (EInt 1)
Statements can be chained in a scope-preserving way by defining the following GADT of type-aligned sequences:
infixr 5 :>
data Statements (g :: [(Symbol, *)]) (h :: [(Symbol,*)]) where
Done :: Statements g g
(:>) :: Statement g h -> Statements h i -> Statements g i
Expressions
Expressions are mostly unchanged from your original definition except that they are now scoped and a new constructor EVar lets us dereference a previously-declared variable (using ScopedSymbol) giving us an expression of the appropriate type.
data Exp (g :: [(Symbol,*)]) (t :: *) where
EVar :: ScopedSymbol g a -> Exp g a
EBool :: Bool -> Exp g Bool
EInt :: Int -> Exp g Int
EAdd :: Exp g Int -> Exp g Int -> Exp g Int
ENot :: Exp g Bool -> Exp g Bool
Programs
A Program is quite simply a sequence of statements starting in the empty scope. We use, once more, an existential quantification to hide the scope we end up with.
data Program = forall h. Program (Statements '[] h)
It is obviously possible to write subroutines in Haskell and use them in your programs. In the example, I have the very simple increment which can be defined like so:
increment :: ScopedSymbol g Int -> Statement g g
increment v = Assign v (EAdd (EVar v) (EInt 1))
I have uploaded the whole code snippet together with the right LANGUAGE pragmas and the examples listed here in a self-contained gist. I haven't however included any comments there.

You should know that your goals are quite lofty. I don't think you will get very far treating your variables exactly as strings. I'd do something slightly more annoying to use, but more practical. Define a monad for your DSL, which I'll call M:
newtype M a = ...
data Exp a where
... as before ...
data Var a -- a typed variable
assign :: Var a -> Exp a -> M ()
declare :: String -> a -> M (Var a)
I'm not sure why you have Exp a for assignment and just a for declaration, but I reproduced that here. The String in declare is just for cosmetics, if you need it for code generation or error reporting or something -- the identity of the variable should really not be tied to that name. So it's usually used as
myFunc = do
foobar <- declare "foobar" 42
which is the annoying redundant bit. Haskell doesn't really have a good way around this (though depending on what you're doing with your DSL, you may not need the string at all).
As for the implementation, maybe something like
data Stmt = forall a. Assign (Var a) (Exp a)
| forall a. Declare (Var a) a
data Var a = Var String Integer -- string is auxiliary from before, integer
-- stores real identity.
For M, we need a unique supply of names and a list of statements to output.
newtype M a = M { runM :: WriterT [Stmt] (StateT Integer Identity a) }
deriving (Functor, Applicative, Monad)
Then the operations as usually fairly trivial.
assign v a = M $ tell [Assign v a]
declare name a = M $ do
ident <- lift get
lift . put $! ident + 1
let var = Var name ident
tell [Declare var a]
return var
I've made a fairly large DSL for code generation in another language using a fairly similar design, and it scales well. I find it a good idea to stay "near the ground", just doing solid modeling without using too many fancy type-level magical features, and accepting minor linguistic annoyances. That way Haskell's main strength -- it's ability to abstract -- can still be used for code in your DSL.
One drawback is that everything needs to be defined within a do block, which can be a hinderance to good organization as the amount of code grows. I'll steal declare to show a way around that:
declare :: String -> M a -> M a
used like
foo = declare "foo" $ do
-- actual function body
then your M can have as a component of its state a cache from names to variables, and the first time you use a declaration with a certain name you render it and put it in a variable (this will require a bit more sophisticated monoid than [Stmt] as the target of your Writer). Later times you just look up the variable. It does have a rather floppy dependence on uniqueness of names, unfortunately; an explicit model of namespaces can help with that but never eliminate it entirely.

After seeing all the code by #Cactus and the Haskell suggestions by #luqui, I've managed to got a solution close to what I want in Idris. The complete code is available at the following gist:
(https://gist.github.com/rodrigogribeiro/33356c62e36bff54831d)
Some little things I need to fix in the previous solution:
I don't know (yet) if Idris support integer literal overloading, what would be quite useful to build my DSL.
I've tried to define in DSL syntax a prefix operator for program variables, but it didn't worked as I like. I've got a solution (in the previous gist) that uses a keyword --- use --- for variable access.
I'll check this minor points with guys in Idris #freenode channel to see if these two points are possible.

Related

Is it possible to ensure that two GADT type variables are the same without dependent types?

I'm writing a compiler where I'm using GADTs for my IR but standard data types for my everything else. I'm having trouble during the conversion from the old data type to the GADT. I've attempted to recreate the situation with a smaller/simplified language below.
To start with, I have the following data types:
data OldLVal = VarOL Int -- The nth variable. Can be used to construct a Temp later.
| LDeref OldLVal
data Exp = Var Int -- See above
| IntT Int32
| Deref Exp
data Statement = AssignStmt OldLVal Exp
| ...
I want to convert these into this intermediate form:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
-- Note: this is a Phantom type
data Temp a = Temp Int
data Type = IntT
| PtrT Type
data Command where
Assign :: NewLVal a -> Pure a -> Command
...
data NewLVal :: Type -> * where
VarNL :: Temp a -> NewLVal a
DerefNL :: NewLVal ('PtrT ('Just a)) -> NewLVal a
data Pure :: Type -> * where
ConstP :: Int32 -> Pure 'IntT
ConstPtrP :: Int32 -> Pure ('PtrT a)
VarP :: Temp a -> Pure a
At this point, I just want to write a conversion from the old data type to the new GADT. For right now, I have something that looks like this.
convert :: Statement -> Either String Command
convert (AssignStmt oldLval exp) = do
newLval <- convertLVal oldLval -- Either String (NewLVal a)
pure <- convertPure exp -- Either String (Pure b)
-- return $ Assign newLval pure -- Obvious failure. Can't ensure a ~ b.
pure' <- matchType newLval pure -- Either String (Pure a)
return $ Assign newLval pure'
-- Converts Pure b into Pure a. Should essentially be a noop, but simply
-- proves that it is possible.
matchType :: NewLVal a -> Pure b -> Either String (Pure a)
matchType = undefined
I realized that I couldn't write convert trivially, so I attempted to solve the problem using this idea of matchType which acts as a proof that these two types are indeed equal. The question is: how do I actually write matchType? This would be much easier if I had fully dependent types (or so I'm told), but can I finish this code here?
An alternative to this would be to somehow provide newLval as an argument to convertPure, but I think that essentially is just attempting to use dependent types.
Any other suggestions are welcome.
If it helps, I also have a function that can convert an Exp or OldLVal to its type:
class Typed a where
typeOf :: a -> Type
instance Typed Exp where
...
instance Typed OldLVal where
...
EDIT:
Thanks to the excellent answers below, I've been able to finish writing this module.
I ended up using the singletons package mentioned below. It was a little strange at first, but I found it pretty reasonable to use after I started understanding what I was doing. However, I did run into one pitfall: The type of convertLVal and convertPure requires an existential to express.
data WrappedPure = forall a. WrappedPure (Pure a, SType a)
data WrappedLVal = forall a. WrappedLVal (NewLVal a, SType a)
convertPure :: Exp -> Either String WrappedPure
convertLVal :: OldLVal -> Either String WrappedLVal
This means that you'll have to unwrap that existential in convert, but otherwise, the answers below show you the way. Thanks so much once again.
You want to perform a comparison at runtime on some type level data (namely the Types by which your values are indexed). But by the time you run your code, and the values start to interact, the types are long gone. They're erased by the compiler, in the name of producing efficient code. So you need to manually reconstruct the type level data that was erased, using a value which reminds you of the type you'd forgotten you were looking at. You need a singleton copy of Type.
data SType t where
SIntT :: SType IntT
SPtrT :: SType t -> SType (PtrT t)
Members of SType look like members of Type - compare the structure of a value like SPtrT (SPtrT SIntT) with that of PtrT (PtrT IntT) - but they're indexed by the (type-level) Types that they resemble. For each t :: Type there's precisely one SType t (hence the name singleton), and because SType is a GADT, pattern matching on an SType t tells the type checker about the t. Singletons span the otherwise strictly-enforced separation between types and values.
So when you're constructing your typed tree, you need to track the runtime STypes of your values and compare them when necessary. (This basically amounts to writing a partially verified type checker.) There's a class in Data.Type.Equality containing a function which compares two singletons and tells you whether their indexes match or not.
instance TestEquality SType where
-- testEquality :: SType t1 -> SType t2 -> Maybe (t1 :~: t2)
testEquality SIntT SIntT = Just Refl
testEquality (SPtrT t1) (SPtrT t2)
| Just Refl <- testEquality t1 t2 = Just Refl
testEquality _ _ = Nothing
Applying this in your convert function looks roughly like this:
convert :: Statement -> Either String Command
convert (AssignStmt oldLval exp) = do
(newLval, newLValSType) <- convertLVal oldLval
(pure, pureSType) <- convertPure exp
case testEquality newLValSType pureSType of
Just Refl -> return $ Assign newLval pure'
Nothing -> Left "type mismatch"
There actually aren't a whole lot of dependently typed programs you can't fake up with TypeInType and singletons (are there any?), but it's a real hassle to duplicate all of your datatypes in both "normal" and "singleton" form. (The duplication gets even worse if you want to pass singletons around implicitly - see Hasochism for the details.) The singletons package can generate much of the boilerplate for you, but it doesn't really alleviate the pain caused by duplicating the concepts themselves. That's why people want to add real dependent types to Haskell, but we're a good few years away from that yet.
The new Type.Reflection module contains a rewritten Typeable class. Its TypeRep is GADT-like and can act as a sort of "universal singleton". But programming with it is even more awkward than programming with singletons, in my opinion.
matchType as written is not possible to implement, but the idea you are going for is definitely possible. Do you know about Data.Typeable? Typeable is a class that provides some basic reflective operations for inspecting types. To use it, you need a Typeable a constraint in scope for any type variable a you want to know about. So for matchType you would have
matchType :: (Typeable a, Typeable b) => NewLVal a -> Pure b -> Either String (Pure a)
It needs also to infect your GADTs any time you want to hide a type variable:
data Command where
Assign :: (Typeable a) => NewLVal a -> Pure a -> Command
...
But if you have the appropriate constraints in scope, you can use eqT to make type-safe runtime type comparisons. For example
-- using ScopedTypeVariables and TypeApplications
matchType :: forall a b. (Typeable a, Typeable b) => NewLVal a -> Pure b -> Either String (Pure b)
matchType = case eqT #a #b of
Nothing -> Left "types are not equal"
Just Refl -> {- in this scope the compiler knows that
a and b are the same type -}

Parsing to Free Monads

Say I have the following free monad:
data ExampleF a
= Foo Int a
| Bar String (Int -> a)
deriving Functor
type Example = Free ExampleF -- this is the free monad want to discuss
I know how I can work with this monad, eg. I could write some nice helpers:
foo :: Int -> Example ()
foo i = liftF $ Foo i ()
bar :: String -> Example Int
bar s = liftF $ Bar s id
So I can write programs in haskell like:
fooThenBar :: Example Int
fooThenBar =
do
foo 10
bar "nice"
I know how to print it, interpret it, etc. But what about parsing it?
Would it be possible to write a parser that could parse arbitrary
programs like:
foo 12
bar nice
foo 11
foo 42
So I can store them, serialize them, use them in cli programs etc.
The problem I keep running into is that the type of the program depends on which program is being parsed. If the program ends with a foo it's of
type Example () if it ends with a bar it's of type Example Int.
I do not feel like writing parsers for every possible permutation (it's simple here because there are only two possibilities, but imagine we add
Baz Int (String -> a), Doo (Int -> a), Moz Int a, Foz String a, .... This get's tedious and error-prone).
Perhaps I'm solving the wrong problem?
Boilerplate
To run the above examples, you need to add this to the beginning of the file:
{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free
import Text.ParserCombinators.Parsec
Note: I put up a gist containing this code.
Not every Example value can be represented on the page without reimplementing some portion of Haskell. For example, return putStrLn has a type of Example (String -> IO ()), but I don't think it makes sense to attempt to parse that sort of Example value out of a file.
So let's restrict ourselves to parsing the examples you've given, which consist only of calls to foo and bar sequenced with >> (that is, no variable bindings and no arbitrary computations)*. The Backus-Naur form for our grammar looks approximately like this:
<program> ::= "" | <expr> "\n" <program>
<expr> ::= "foo " <integer> | "bar " <string>
It's straightforward enough to parse our two types of expression...
type Parser = Parsec String ()
int :: Parser Int
int = fmap read (many1 digit)
parseFoo :: Parser (Example ())
parseFoo = string "foo " *> fmap foo int
parseBar :: Parser (Example Int)
parseBar = string "bar " *> fmap bar (many1 alphaNum)
... but how can we give a type to the composition of these two parsers?
parseExpr :: Parser (Example ???)
parseExpr = parseFoo <|> parseBar
parseFoo and parseBar have different types, so we can't compose them with <|> :: Alternative f => f a -> f a -> f a. Moreover, there's no way to know ahead of time which type the program we're given will be: as you point out, the type of the parsed program depends on the value of the input string. "Types depending on values" is called dependent types; Haskell doesn't feature a proper dependent type system, but it comes close enough for us to have a stab at making this example work.
Let's start by forcing the expressions on either side of <|> to have the same type. This involves erasing Example's type parameter using existential quantification.†
data Ex a = forall i. Wrap (a i)
parseExpr :: Parser (Ex Example)
parseExpr = fmap Wrap parseFoo <|> fmap Wrap parseBar
This typechecks, but the parser now returns an Example containing a value of an unknown type. A value of unknown type is of course useless - but we do know something about Example's parameter: it must be either () or Int because those are the return types of parseFoo and parseBar. Programming is about getting knowledge out of your brain and onto the page, so we're going to wrap up the Example value with a bit of GADT evidence which, when unwrapped, will tell you whether a was Int or ().
data Ty a where
IntTy :: Ty Int
UnitTy :: Ty ()
data (a :*: b) i = a i :&: b i
type Sig a b = Ex (a :*: b)
pattern Sig x y = Wrap (x :&: y)
parseExpr :: Parser (Sig Ty Example)
parseExpr = fmap (\x -> Sig UnitTy x) parseFoo <|>
fmap (\x -> Sig IntTy x) parseBar
Ty is (something like) a runtime "singleton" representative of Example's type parameter. When you pattern match on IntTy, you learn that a ~ Int; when you pattern match on UnitTy you learn that a ~ (). (Information can be made to flow the other way, from types to values, using classes.) :*:, the functor product, pairs up two type constructors ensuring that their parameters are equal; thus, pattern matching on the Ty tells you about its accompanying Example.
Sig is therefore called a dependent pair or sigma type - the type of the second component of the pair depends on the value of the first. This is a common technique: when you erase a type parameter by existential quantification, it usually pays to make it recoverable by bundling up a runtime representative of that parameter.
Note that this use of Sig is equivalent to Either (Example Int) (Example ()) - a sigma type is a sum, after all - but this version scales better when you're summing over a large (or possibly infinite) set.
Now it's easy to build our expression parser into a program parser. We just have to repeatedly apply the expression parser, and then manipulate the dependent pairs in the list.
parseProgram :: Parser (Sig Ty Example)
parseProgram = fmap (foldr1 combine) $ parseExpr `sepBy1` (char '\n')
where combine (Sig _ val) (Sig ty acc) = Sig ty (val >> acc)
The code I've shown you is not exemplary. It doesn't separate the concerns of parsing and typechecking. In production code I would modularise this design by first parsing the data into an untyped syntax tree - a separate data type which doesn't enforce the typing invariant - then transform that into a typed version by type-checking it. The dependent pair technique would still be necessary to give a type to the output of the type-checker, but it wouldn't be tangled up in the parser.
*If binding is not a requirement, have you thought about using a free applicative to represent your data?
†Ex and :*: are reusable bits of machinery which I lifted from the Hasochism paper
So, I worry that this is the same sort of premature abstraction that you see in object-oriented languages, getting in the way of things. For example, I am not 100% sure that you are using the structure of the free monad -- your helpers for example simply seem to use id and () in a rather boring way, in fact I'm not sure if your Int -> x is ever anything other than either Pure :: Int -> Free ExampleF Int or const (something :: Free ExampleF Int).
The free monad for a functor F can basically be described as a tree whose data is stored in leaves and whose branching factor is controlled by the recursion in each constructor of the functor F. So for example Free Identity has no branching, hence only one leaf, and thus has the same structure as the monad:
data MonoidalFree m x = MF m x deriving (Functor)
instance Monoid m => Monad (MonoidalFree m) where
return x = MF mempty x
MF m x >>= my_x = case my_x x of MF n y -> MF (mappend m n) y
In fact Free Identity is isomorphic to MonoidalFree (Sum Integer), the difference is just that instead of MF (Sum 3) "Hello" you see Free . Identity . Free . Identity . Free . Identity $ Pure "Hello" as the means of tracking this integer. On the other hand if you have data E x = L x | R x deriving (Functor) then you get a sort of "path" of Ls and Rs before you hit this one leaf, Free E is going to be isomorphic to MonoidalFree [Bool].
The reason I'm going through this is that when you combine Free with an Integer -> x functor, you get an infinitely branching tree, and when I'm looking through your code to figure out how you're actually using this tree, all I see is that you use the id function with it. As far as I can tell, that restricts the recursion to either have the form Free (Bar "string" Pure) or else Free (Bar "string" (const subExpression)), in which case the system would seem to reduce completely to the MonoidalFree [Either Int String] monad.
(At this point I should pause to ask: Is that correct as far as you know? Was this what was intended?)
Anyway. Aside from my problems with your premature abstraction, the specific problem that you're citing with your monad (you can't tell the difference between () and Int has a bunch of really complicated solutions, but one really easy one. The really easy solution is to yield a value of type Example (Either () Int) and if you have a () you can fmap Left onto it and if you have an Int you can fmap Right onto it.
Without a much better understanding of how you're using this thing over TCP/IP we can't recommend a better structure for you than the generic free monads that you seem to be finding -- in particular we'd need to know how you're planning on using the infinite-branching of Int -> x options in practice.

Does exporting type constructors make a difference?

Let's say I have an internal data type, T a, that is used in the signature of exported functions:
module A (f, g) where
newtype T a = MkT { unT :: (Int, a) }
deriving (Functor, Show, Read) -- for internal use
f :: a -> IO (T a)
f a = fmap (\i -> T (i, a)) randomIO
g :: T a -> a
g = snd . unT
What is the effect of not exporting the type constructor T? Does it prevent consumers from meddling with values of type T a? In other words, is there a difference between the export list (f, g) and (f, g, T()) here?
Prevented
The first thing a consumer will see is that the type doesn't appear in Haddock documentation. In the documentation for f and g, the type Twill not be hyperlinked like an exported type. This may prevent a casual reader from discovering T's class instances.
More importantly, a consumer cannot doing anything with T at the type level. Anything that requires writing a type will be impossible. For instance, a consumer cannot write new class instances involving T, or include T in a type family. (I don't think there's a way around this...)
At the value level, however, the main limitation is that a consumer cannot write a type annotation including T:
> :t (f . read) :: Read b => String -> IO (A.T b)
<interactive>:1:39: Not in scope: type constructor or class `A.T'
Not prevented
The restriction on type signatures is not as significant a limitation as it appears. The compiler can still infer such a type:
> :t f . read
f . read :: Read b => String -> IO (A.T b)
Any value expression within the inferrable subset of Haskell may therefore be expressed regardless of the availability of the type constructor T. If, like me, you're addicted to ScopedTypeVariables and extensive annotations, you may be a little surprised by the definition of unT' below.
Furthermore, because typeclass instances have global scope, a consumer can use any available class functions without additional limitation. Depending on the classes involved, this may allow significant manipulation of values of the unexposed type. With classes like Functor, a consumer can also freely manipulate type parameters, because there's an available function of type T a -> T b.
In the example of T, deriving Show of course exposes the "internal" Int, and gives a consumer enough information to hackishly implement unT:
-- :: (Show a, Read a) => T a -> (Int, a)
unT' = (read . strip . show') `asTypeOf` (mkPair . g)
where
strip = reverse . drop 1 . reverse . drop 9
-- :: T a -> String
show' = show `asTypeOf` (mkString . g)
mkPair :: t -> (Int, t)
mkPair = undefined
mkString :: t -> String
mkString = undefined
> :t unT'
unT' :: (Show b, Read b) => A.T b -> (Int, b)
> x <- f "x"
> unT' x
(-29353, "x")
Implementing mkT' with the Read instance is left as an exercise.
Deriving something like Generic will completely explode any idea of containment, but you'd probably expect that.
Prevented?
In the corners of Haskell where type signatures are necessary or where asTypeOf-style tricks don't work, I guess not exporting the type constructor could actually prevent a consumer from doing something they could with the export list (f, g, T()).
Recommendation
Export all type constructors that are used in the type of any value you export. Here, go ahead and include T() in your export list. Leaving it out doesn't accomplish anything other than muddying the documentation. If you want to expose an purely abstract immutable type, use a newtype with a hidden constructor and no class instances.

RankNTypes and pattern matching

Is there a way in haskell to erase type information/downcast to a polymorphic value ?
In the example I have a boxed type T which can contain either an Int or a Char
And I want to write a function which extract this value without knowing which type it is.
{#- LANGUAGE RankNTypes -#}
data T = I Int | C Char
-- This is not working because GHC cannot bind "a"
-- with specific type Int and Char at the same time.
-- I just want a polymorphic value back ;(
getValue :: T -> (forall a. a)
getValue (I val) = val
getValue (C val) = val
-- This on the other hand works, because the function
-- is local to the pattern matching expression
onValue :: T -> (forall a. a -> a) -> T
onValue (I val) f = I $ f val
onValue (C val) f = C $ f val
Is there a way to write a function that can extract this value without forcing a type at the end ?
a getValue function like the first one ?
Let me know if it is not clear enough.
Answer
So the question was stupid as AndrewC (in the comment) and YellPika pointed out. An infinite type has no meaning.
J. Abrahamson provides an explanation for what I am looking for, so I put his answer as the solution.
P.S: I do not want to use GADT as I do not want a new type each time.
What you probably want is not to return a value (forall a . a) as it is wrong on several fronts. For one, you do not have any value but instead just one of two. For two, such a type cannot exist in a well-behaved program: it corresponds to the type of infinite loops and exceptions, e.g. bottom.
Finally, such a type allows the person who owns it to make the choice to instantiate it more concretely. Since you're giving it to the caller of your function that means that they would get to choose which of an Int or Char you had. Clearly that doesn't make sense.
Instead, what you most likely want is to make a demand of the user of your function: "you have to work regardless of what this type is".
foo :: (forall a . a -> r) -> (T -> r)
foo f (I i) = f i
foo f (C c) = f c
You'll find this function to be really similar to the following
bar :: r -> T -> r
bar x (I _) = x
bar x (C _) = x
In other words, if you force the consumer of your function to disregard all type information then, well, actually nothing at all remains: e.g. a constant function.
You can use GADTs:
{-# LANGUAGE GADTs #-}
data T a where
I :: Int -> T Int
C :: Char -> T Char
getValue :: T a -> a
getValue (I i) = i
getValue (C c) = c
If you turn on ExistentialTypes, you can write:
data Anything = forall a. Anything a
getValue :: T -> Anything
getValue (I val) = Anything val
getValue (C val) = Anything val
However, this is pretty useless. Say we pattern match on an Anything:
doSomethingWith (Anything x) = ?
We don't know anything about x other than that it exists... (well, not even - it might be undefined). There's no type information, so we can't do anything with it.

Haskell get type of algebraic parameter

I have a type
class IntegerAsType a where
value :: a -> Integer
data T5
instance IntegerAsType T5 where value _ = 5
newtype (IntegerAsType q) => Zq q = Zq Integer deriving (Eq)
newtype (Num a, IntegerAsType n) => PolyRing a n = PolyRing [a]
I'm trying to make a nice "show" for the PolyRing type. In particular, I want the "show" to print out the type 'a'. Is there a function that returns the type of an algebraic parameter (a 'show' for types)?
The other way I'm trying to do it is using pattern matching, but I'm running into problems with built-in types and the algebraic type.
I want a different result for each of Integer, Int and Zq q.
(toy example:)
test :: (Num a, IntegerAsType q) => a -> a
(Int x) = x+1
(Integer x) = x+2
(Zq x) = x+3
There are at least two different problems here.
1) Int and Integer are not data constructors for the 'Int' and 'Integer' types. Are there data constructors for these types/how do I pattern match with them?
2) Although not shown in my code, Zq IS an instance of Num. The problem I'm getting is:
Ambiguous constraint `IntegerAsType q'
At least one of the forall'd type variables mentioned by the constraint
must be reachable from the type after the '=>'
In the type signature for `test':
test :: (Num a, IntegerAsType q) => a -> a
I kind of see why it is complaining, but I don't know how to get around that.
Thanks
EDIT:
A better example of what I'm trying to do with the test function:
test :: (Num a) => a -> a
test (Integer x) = x+2
test (Int x) = x+1
test (Zq x) = x
Even if we ignore the fact that I can't construct Integers and Ints this way (still want to know how!) this 'test' doesn't compile because:
Could not deduce (a ~ Zq t0) from the context (Num a)
My next try at this function was with the type signature:
test :: (Num a, IntegerAsType q) => a -> a
which leads to the new error
Ambiguous constraint `IntegerAsType q'
At least one of the forall'd type variables mentioned by the constraint
must be reachable from the type after the '=>'
I hope that makes my question a little clearer....
I'm not sure what you're driving at with that test function, but you can do something like this if you like:
{-# LANGUAGE ScopedTypeVariables #-}
class NamedType a where
name :: a -> String
instance NamedType Int where
name _ = "Int"
instance NamedType Integer where
name _ = "Integer"
instance NamedType q => NamedType (Zq q) where
name _ = "Zq (" ++ name (undefined :: q) ++ ")"
I would not be doing my Stack Overflow duty if I did not follow up this answer with a warning: what you are asking for is very, very strange. You are probably doing something in a very unidiomatic way, and will be fighting the language the whole way. I strongly recommend that your next question be a much broader design question, so that we can help guide you to a more idiomatic solution.
Edit
There is another half to your question, namely, how to write a test function that "pattern matches" on the input to check whether it's an Int, an Integer, a Zq type, etc. You provide this suggestive code snippet:
test :: (Num a) => a -> a
test (Integer x) = x+2
test (Int x) = x+1
test (Zq x) = x
There are a couple of things to clear up here.
Haskell has three levels of objects: the value level, the type level, and the kind level. Some examples of things at the value level include "Hello, world!", 42, the function \a -> a, or fix (\xs -> 0:1:zipWith (+) xs (tail xs)). Some examples of things at the type level include Bool, Int, Maybe, Maybe Int, and Monad m => m (). Some examples of things at the kind level include * and (* -> *) -> *.
The levels are in order; value level objects are classified by type level objects, and type level objects are classified by kind level objects. We write the classification relationship using ::, so for example, 32 :: Int or "Hello, world!" :: [Char]. (The kind level isn't too interesting for this discussion, but * classifies types, and arrow kinds classify type constructors. For example, Int :: * and [Int] :: *, but [] :: * -> *.)
Now, one of the most basic properties of Haskell is that each level is completely isolated. You will never see a string like "Hello, world!" in a type; similarly, value-level objects don't pass around or operate on types. Moreover, there are separate namespaces for values and types. Take the example of Maybe:
data Maybe a = Nothing | Just a
This declaration creates a new name Maybe :: * -> * at the type level, and two new names Nothing :: Maybe a and Just :: a -> Maybe a at the value level. One common pattern is to use the same name for a type constructor and for its value constructor, if there's only one; for example, you might see
newtype Wrapped a = Wrapped a
which declares a new name Wrapped :: * -> * at the type level, and simultaneously declares a distinct name Wrapped :: a -> Wrapped a at the value level. Some particularly common (and confusing examples) include (), which is both a value-level object (of type ()) and a type-level object (of kind *), and [], which is both a value-level object (of type [a]) and a type-level object (of kind * -> *). Note that the fact that the value-level and type-level objects happen to be spelled the same in your source is just a coincidence! If you wanted to confuse your readers, you could perfectly well write
newtype Huey a = Louie a
newtype Louie a = Dewey a
newtype Dewey a = Huey a
where none of these three declarations are related to each other at all!
Now, we can finally tackle what goes wrong with test above: Integer and Int are not value constructors, so they can't be used in patterns. Remember -- the value level and type level are isolated, so you can't put type names in value definitions! By now, you might wish you had written test' instead:
test' :: Num a => a -> a
test' (x :: Integer) = x + 2
test' (x :: Int) = x + 1
test' (Zq x :: Zq a) = x
...but alas, it doesn't quite work like that. Value-level things aren't allowed to depend on type-level things. What you can do is to write separate functions at each of the Int, Integer, and Zq a types:
testInteger :: Integer -> Integer
testInteger x = x + 2
testInt :: Int -> Int
testInt x = x + 1
testZq :: Num a => Zq a -> Zq a
testZq (Zq x) = Zq x
Then we can call the appropriate one of these functions when we want to do a test. Since we're in a statically-typed language, exactly one of these functions is going to be applicable to any particular variable.
Now, it's a bit onerous to remember to call the right function, so Haskell offers a slight convenience: you can let the compiler choose one of these functions for you at compile time. This mechanism is the big idea behind classes. It looks like this:
class Testable a where test :: a -> a
instance Testable Integer where test = testInteger
instance Testable Int where test = testInt
instance Num a => Testable (Zq a) where test = testZq
Now, it looks like there's a single function called test which can handle any of Int, Integer, or numeric Zq's -- but in fact there are three functions, and the compiler is transparently choosing one for you. And that's an important insight. The type of test:
test :: Testable a => a -> a
...looks at first blush like it is a function that takes a value that could be any Testable type. But in fact, it's a function that can be specialized to any Testable type -- and then only takes values of that type! This difference explains yet another reason the original test function didn't work. You can't have multiple patterns with variables at different types, because the function only ever works on a single type at a time.
The ideas behind the classes NamedType and Testable above can be generalized a bit; if you do, you get the Typeable class suggested by hammar above.
I think now I've rambled more than enough, and likely confused more things than I've clarified, but leave me a comment saying which parts were unclear, and I'll do my best.
Is there a function that returns the type of an algebraic parameter (a 'show' for types)?
I think Data.Typeable may be what you're looking for.
Prelude> :m + Data.Typeable
Prelude Data.Typeable> typeOf (1 :: Int)
Int
Prelude Data.Typeable> typeOf (1 :: Integer)
Integer
Note that this will not work on any type, just those which have a Typeable instance.
Using the extension DeriveDataTypeable, you can have the compiler automatically derive these for your own types:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
data Foo = Bar
deriving Typeable
*Main> typeOf Bar
Main.Foo
I didn't quite get what you're trying to do in the second half of your question, but hopefully this should be of some help.

Resources