Reflection on inputs to a function in Haskell? - haskell

I have a reflective situation where I want to display the types of inputs/outputs of a function. I could just add it to a separate data structure, but then I have duplication and would have to make sure they stay in sync manually.
For example, a function:
myFunc :: (String, MyType, Double) -> (Int, SomeType TypeP, MyOtherType a)
and so now I want to have something like (can be somewhat flexible, esp. when params involved):
input = ["String", "MyType", "Double"]
output = ["Int", "SomeType TypeP", "MyOtherType a"]
defined automatically. It doesn't have to be Strings directly. Is there a simple way to do this?

You don't really need a custom parser. You can just reflect on the TypeRep value you receive. For instance the following would work:
module ModuleReflect where
import Data.Typeable
import Control.Arrow
foo :: (Int, Bool, String) -> String -> String
foo = undefined
-- | We first want to in the case that no result is returned from splitTyConApp
-- to just return the input type, this. If we find an arrow constructor (->)
-- we want to get the start of the list and then recurse on the tail if any.
-- if we get anything else, like [] Char then we want to return the original type
-- [Char]
values :: Typeable a => a -> [TypeRep]
values x = split (typeOf x)
where split t = case first tyConString (splitTyConApp t) of
(_ , []) -> [t]
("->", [x]) -> [x]
("->", x) -> let current = init x
next = last x
in current ++ split next
(_ , _) -> [t]
inputs :: Typeable a => a -> [TypeRep]
inputs x = case values x of
(x:xs) | not (null xs) -> x : init xs
_ -> []
output :: Typeable a => a -> TypeRep
output x = last $ values x
and a sample session
Ok, modules loaded: ModuleReflect.
*ModuleReflect Data.Function> values foo
[(Int,Bool,[Char]),[Char],[Char]]
*ModuleReflect Data.Function> output foo
[Char]
*ModuleReflect Data.Function> inputs foo
[(Int,Bool,[Char]),[Char]]
This is just some quick barely tested code, I'm sure it can be cleaned up. And the reason I don't use typeRepArgs is that in the case of other constructors they would be broken up to, e.g [] Char returns Char instead of [Char].
This version does not treat elements of tuples as seperate results, but it can be easily changed to add that.
However as mentioned before this has a limitation, It'll only work on monomorphic types. If you want to be able to find this for any type, then you should probably use the GHC api. You can load the module, ask it to Typecheck and then inspect the Typecheck AST for the function and thus find it's type. Or you can use Hint to ask for the type of a function. and parse that.

Have a look at Data.Typeable: It defined a functiontypeOf :: Typeable a => a -> TypeRep, TypeRep is instance of Show. For example:
$ ghci
GHCi, version 6.12.1: http://www.haskell.org/ghc/ :? for help
:m Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Prelude> :m +Data.Typeable
Prelude Data.Typeable> :i TypeRep
data TypeRep
= Data.Typeable.TypeRep !Data.Typeable.Key TyCon [TypeRep]
-- Defined in Data.Typeable
instance [overlap ok] Eq TypeRep -- Defined in Data.Typeable
instance [overlap ok] Show TypeRep -- Defined in Data.Typeable
instance [overlap ok] Typeable TypeRep -- Defined in Data.Typeable
Prelude Data.Typeable> typeOf (+)
Integer -> Integer -> Integer
Prelude Data.Typeable> typeOf (\(a,(_:x),1) -> (a :: (),x :: [()]))
((),[()],Integer) -> ((),[()])
Now, you only need a custom parser that translated this output into something suitable for you. This is left as an exercise to the reader.
PS: There seems to be one limitation: All types must be known before, for instance typeOf id will fail.

Related

How can I read the metadata of a type at runtime?

I'd like to write a program that prints out some metadata of a Haskell type. Although I know this isn't valid code, the idea is something like:
data Person = Person { name :: String, age :: Int }
metadata :: Type -> String
metadata t = ???
metadata Person -- returns "Person (name,age)"
The important restriction being I don't have an instance of Person, just the type.
I've started looking into Generics & Typeable/Data, but without an instance I'm not sure they'll do what I need. Can anyone point me in the right direction?
Reflection in Haskell works using the Typeable class, which is defined in Data.Typeable and includes the typeOf* method to get a run-time representation of a value's type.
ghci> :m +Data.Typeable
ghci> :t typeOf 'a'
typeOf 'a' :: TypeRep
ghci> typeOf 'a' -- We could use any value of type Char and get the same result
Char -- the `Show` instance of `TypeRep` just returns the name of the type
If you want Typeable to work for your own types, you can have the compiler generate an instance for you with the DeriveDataTypeable extension.
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
data Person = Person { name :: String, age :: Int } deriving Typeable
You can also write your own instance, but really, no one has the time for that. Apparently you can't - see the comments
You can now use typeOf to grab a run-time representation of your type. We can query information about the type constructor (abbreviated to TyCon) and its type arguments:
-- (undefined :: Person) stands for "some value of type Person".
-- If you have a real Person you can use that too.
-- typeOf does not use the value, only the type
-- (which is known at compile-time; typeOf is dispatched using the normal instance selection rules)
ghci> typeOf (undefined :: Person)
Person
ghci> tyConName $ typeRepTyCon $ typeOf (undefined :: Person)
"Person"
ghci> tyConModule $ typeRepTyCon $ typeOf (undefined :: Person)
"Main"
Data.Typeable also provides a type-safe cast operation which allows you to branch on a value's runtime type, somewhat like C#'s as operator.
f :: Typeable a => a -> String
f x = case (cast x :: Maybe Int) of
Just i -> "I can treat i as an int in this branch " ++ show (i * i)
Nothing -> case (cast x :: Maybe Bool) of
Just b -> "I can treat b as a bool in this branch " ++ if b then "yes" else "no"
Nothing -> "x was of some type other than Int or Bool"
ghci> f True
"I can treat b as a bool in this branch yes"
ghci> f (3 :: Int)
"I can treat i as an int in this branch 9"
Incidentally, a nicer way to write f is to use a GADT enumerating the set of types you expect your function to be called with. This allows us to lose the Maybe (f can never fail!), does a better job of documenting our assumptions, and gives compile-time feedback when we need to change the set of admissible argument types for f. (You can write a class to make Admissible implicit if you like.)
data Admissible a where
AdInt :: Admissible Int
AdBool :: Admissible Bool
f :: Admissible a -> a -> String
f AdInt i = "I can treat i as an int in this branch " ++ show (i * i)
f AdBool b = "I can treat b as a bool in this branch " ++ if b then "yes" else "no"
In reality I probably wouldn't do either of these - I'd just stick f in a class and define instances for Int and Bool.
If you want run-time information about the right-hand side of a type definition, you need to use the entertainingly-named Data.Data, which defines a subclass of Typeable called Data.** GHC can derive Data for you too, with the same extension:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
import Data.Data
data Person = Person { name :: String, age :: Int } deriving (Typeable, Data)
Now we can grab a run-time representation of the values of a type, not just the type itself:
ghci> dataTypeOf (undefined :: Person)
DataType {tycon = "Main.Person", datarep = AlgRep [Person]}
ghci> dataTypeConstrs $ dataTypeOf (undefined :: Person)
[Person] -- Person only defines one constructor, called Person
ghci> constrFields $ head $ dataTypeConstrs $ dataTypeOf (undefined :: Person)
["name","age"]
Data.Data is the API for generic programming; if you ever hear people talking about "Scrap Your Boilerplate", this (along with Data.Generics, which builds on Data.Data) is what they mean. For example, you can write a function which converts record types to JSON using reflection on the type's fields.
toJSON :: Data a => a -> String
-- Implementation omitted because it is boring.
-- But you only have to write the boring code once,
-- and it'll be able to serialise any instance of `Data`.
-- It's a good exercise to try to write this function yourself!
* In recent versions of GHC, this API has changed somewhat. Consult the docs.
** Yes, the fully-qualified name of that class is Data.Data.Data.

Generic data constructor for Data instance

Given a datatype
data Foo = IFoo Int | SFoo String deriving (Data, Typeable)
what is a simple definition of
gconstr :: (Typeable a, Data t) => a -> t
such that
gconstr (5 :: Int) :: Foo == IFoo 5
gconstr "asdf" :: Foo == SFoo "asdf"
gconstr True :: Foo == _|_
It would be essentially the opposite of syb's gfindtype.
Or does such a thing exist already? I've tried hoogle-ing the type and haven't found much, but the syb types are kind of hard to interpret. A function returning Nothing on error is also acceptable.
This seems to be possible, though it's not completely trivial.
Preliminaries:
{-# LANGUAGE DeriveDataTypeable #-}
import Control.Monad ( msum )
import Data.Data
import Data.Maybe
First a helper function gconstrn, which tries to do the same thing as required of gconstr, but for a specific constructor only:
gconstrn :: (Typeable a, Data t) => Constr -> a -> Maybe t
gconstrn constr arg = gunfold addArg Just constr
where
addArg :: Data b => Maybe (b -> r) -> Maybe r
addArg Nothing = Nothing
addArg (Just f) =
case cast arg of
Just v -> Just (f v)
Nothing -> Nothing
The key part is that the addArg function will use arg as an argument to the constructor, if the types match.
Essentially gunfold starts unfolding with Just IFoo or Just SFoo, and then the next step is to try addArg to provide it with its argument.
For multi-argument constructors this would be called repeatedly, so if you defined an IIFoo constructor that took two Ints, it would also get successfully filled in by gconstrn. Obviously with a bit more work you could do something more sophisticated like providing a list of arguments.
Then it's just a question of trying this with all possible constructors. The recursive definition between result and dt is just to get the right type argument for dataTypeOf, the actual value being passed in doesn't matter at all. ScopedTypeVariables would be an alternative for achieving this.
gconstr :: (Typeable a, Data t) => a -> Maybe t
gconstr arg = result
where result = msum [gconstrn constr arg | constr <- dataTypeConstrs dt]
dt = dataTypeOf (fromJust result)
As discussed in the comments, both functions can be simplified with <*> from Control.Applicative to the following, though it's a bit harder to see what's going on in the gunfold:
gconstr :: (Typeable a, Data t) => a -> Maybe t
gconstr arg = result
where
result = msum $ map (gunfold (<*> cast arg) Just) (dataTypeConstrs dt)
dt = dataTypeOf (fromJust result)

Binary instance for an existential

Given an existential data type, for example:
data Foo = forall a . (Typeable a, Binary a) => Foo a
I'd like to write instance Binary Foo. I can write the serialisation (serialise the TypeRep then serialise the value), but I can't figure out how to write the deserialisation. The basic problem is that given a TypeRep you need to map back to the type dictionary for that type - and I don't know if that can be done.
This question has been asked before on the haskell mailing list http://www.haskell.org/pipermail/haskell/2006-September/018522.html, but no answers were given.
You need some way that each Binary instance can register itself (just as in your witness version). You can do this by bundling each instance declaration with an exported foreign symbol, where the symbol name is derived from the TypeRep. Then when you want to deserialize you get the name from the TypeRep and look up that symbol dynamically (with dlsym() or something similar). The value exported by the foreign export can, e.g., be the deserializer function.
It's crazy ugly, but it works.
This can be solved in GHC 7.10 and onwards using the Static Pointers Language extension:
{-# LANGUAGE StaticPointers #-}
{-# LANGUAGE InstanceSigs #-}
data Foo = forall a . (StaticFoo a, Binary a, Show a) => Foo a
class StaticFoo a where
staticFoo :: a -> StaticPtr (Get Foo)
instance StaticFoo String where
staticFoo _ = static (Foo <$> (get :: Get String))
instance Binary Foo where
put (Foo x) = do
put $ staticKey $ staticFoo x
put x
get = do
ptr <- get
case unsafePerformIO (unsafeLookupStaticPtr ptr) of
Just value -> deRefStaticPtr value :: Get Foo
Nothing -> error "Binary Foo: unknown static pointer"
A full description of the solution can be found on this blog post, and a complete snippet here.
If you could do that, you would also be able to implement:
isValidRead :: TypeRep -> String -> Bool
This would be a function that changes its behavior due to someone defining a new type! Not very pure-ish.. I think (and hope) that one can't implement this in Haskell..
I have an answer that slightly works in some situations (not enough for my purposes), but may be the best that can be done. You can add a witness function to witness any types that you have, and then the deserialisation can lookup in the witness table. The rough idea is (untested):
witnesses :: IORef [Foo]
witnesses = unsafePerformIO $ newIORef []
witness :: (Typeable a, Binary a) => a -> IO ()
witness x = modifyIORef (Foo x :)
instance Binary Foo where
put (Foo x) = put (typeOf x) >> put x
get = do
ty <- get
wits <- unsafePerformIO $ readIORef witnesses
case [Foo x | Foo x <- wits, typeOf x == ty] of
Foo x:_ -> fmap Foo $ get `asTypeOf` return x
[] -> error $ "Could not find a witness for the type: " ++ show ty
The idea is that as you go through, you call witness on values of every type that you may plausibly encounter when deserialising. When you deserialise you search this list. The obvious problem is that if you fail to call witness before deserialisation you get a crash.

Why doesn't this typecheck?

This is toy-example.hs:
{-# LANGUAGE ImpredicativeTypes #-}
import Control.Arrow
data From = From (forall a. Arrow a => a Int Char -> a [Int] String)
data Fine = Fine (forall a. Arrow a => a Int Char -> a () String)
data Broken = Broken (Maybe (forall a. Arrow a => a Int Char -> a () String))
fine :: From -> Fine
fine (From f) = Fine g
where g :: forall a. Arrow a => a Int Char -> a () String
g x = f x <<< arr (const [1..5])
broken :: From -> Broken
broken (From f) = Broken (Just g) -- line 17
where g :: forall a. Arrow a => a Int Char -> a () String
g x = f x <<< arr (const [1..5])
And this is what ghci thinks of it:
GHCi, version 7.0.3: http://www.haskell.org/ghc/ :? for help
Loading package ghc-prim ... linking ... done.
Loading package integer-gmp ... linking ... done.
Loading package base ... linking ... done.
Loading package ffi-1.0 ... linking ... done.
Prelude> :l toy-example.hs
[1 of 1] Compiling Main ( toy-example.hs, interpreted )
toy-example.hs:17:32:
Couldn't match expected type `forall (a :: * -> * -> *).
Arrow a =>
a Int Char -> a () String'
with actual type `a0 Int Char -> a0 () String'
In the first argument of `Just', namely `g'
In the first argument of `Broken', namely `(Just g)'
In the expression: Broken (Just g)
Failed, modules loaded: none.
Why does fine typecheck while broken does not?
How do I get broken to typecheck?
(In my real code I can add the type parameter a to Broken if I have to, instead of having it universally quantified inside the constructor, but I'd like to avoid this if possible.)
Edit: If I change the definition of Broken to
data Broken = Broken (forall a. Arrow a => Maybe (a Int Char -> a () String))
then broken typechecks. Yay!
But if I then add the following function
munge :: Broken -> String
munge (Broken Nothing) = "something" -- line 23
munge (Broken (Just f)) = f chr ()
then I get the error message
toy-example.hs:23:15:
Ambiguous type variable `a0' in the constraint:
(Arrow a0) arising from a pattern
Probable fix: add a type signature that fixes these type variable(s)
In the pattern: Nothing
In the pattern: Broken Nothing
In an equation for `munge': munge (Broken Nothing) = "something"
How do I get munge to typecheck as well?
2nd edit: In my real program I have replaced the Broken (Maybe ...) constructor with BrokenNothing and BrokenJust ... constructors (there were already other constructors), but I'm curious how pattern matching is supposed to work in this situation.
ImpredicativeTypes leaves you on quite shaky ground that changes from GHC version to version in any case - they're struggling to find a formulation of impredicativity that appropriately balances power, ease of use and ease of implementation.
In this particular case, trying to put a quantified type inside a Maybe, which is a datatype not explicitly defined to behave this way, is really tricky, so I'd advise going for the custom-defined constructors instead as you mention.
I think that you can fix munge above by re-deconstructing the argument to Broken on the RHS, at a time when the type its being used as will be known, e.g.:
munge (Broken x#(Just _)) = fromJust x chr ()
It's pretty ugly, though.

Get a list of the instances in a type class in Haskell

Is there a way to programmatically get a list of instances of a type class?
It strikes me that the compiler must know this information in order to type check and compile the code, so is there some way to tell the compiler: hey, you know those instances of that class, please put a list of them right here (as strings or whatever some representation of them).
You can generate the instances in scope for a given type class using Template Haskell.
import Language.Haskell.TH
-- get a list of instances
getInstances :: Name -> Q [ClassInstance]
getInstances typ = do
ClassI _ instances <- reify typ
return instances
-- convert the list of instances into an Exp so they can be displayed in GHCi
showInstances :: Name -> Q Exp
showInstances typ = do
ins <- getInstances typ
return . LitE . stringL $ show ins
Running this in GHCi:
*Main> $(showInstances ''Num)
"[ClassInstance {ci_dfun = GHC.Num.$fNumInteger, ci_tvs = [], ci_cxt = [], ci_cls = GHC.Num.Num, ci_tys = [ConT GHC.Integer.Type.Integer]},ClassInstance {ci_dfun = GHC.Num.$fNumInt, ci_tvs = [], ci_cxt = [], ci_cls = GHC.Num.Num, ci_tys = [ConT GHC.Types.Int]},ClassInstance {ci_dfun = GHC.Float.$fNumFloat, ci_tvs = [], ci_cxt = [], ci_cls = GHC.Num.Num, ci_tys = [ConT GHC.Types.Float]},ClassInstance {ci_dfun = GHC.Float.$fNumDouble, ci_tvs = [], ci_cxt = [], ci_cls = GHC.Num.Num, ci_tys = [ConT GHC.Types.Double]}]"
Another useful technique is showing all instances in scope for a given type class using GHCi.
Prelude> :info Num
class (Eq a, Show a) => Num a where
(+) :: a -> a -> a
(*) :: a -> a -> a
(-) :: a -> a -> a
negate :: a -> a
abs :: a -> a
signum :: a -> a
fromInteger :: Integer -> a
-- Defined in GHC.Num
instance Num Integer -- Defined in GHC.Num
instance Num Int -- Defined in GHC.Num
instance Num Float -- Defined in GHC.Float
instance Num Double -- Defined in GHC.Float
Edit: The important thing to know is that the compiler is only aware of type classes in scope in any given module (or at the ghci prompt, etc.). So if you call the showInstances TH function with no imports, you'll only get instances from the Prelude. If you have other modules in scope, e.g. Data.Word, then you'll see all those instances too.
See the template haskell documentation: http://hackage.haskell.org/packages/archive/template-haskell/2.5.0.0/doc/html/Language-Haskell-TH.html
Using reify, you can get an Info record, which for a class includes its list of instances. You can also use isClassInstance and classInstances directly.
This is going to run into a lot of problems as soon as you get instance declarations like
instance Eq a => Eq [a] where
[] == [] = True
(x:xs) == (y:ys) = x == y && xs == ys
_ == _ = False
and
instance (Eq a,Eq b) => Eq (a,b) where
(a1,b1) == (a2,b2) = a1 == a2 && b1 == b2
along with a single concrete instance (e.g. instance Eq Bool).
You'll get an infinite list of instances for Eq - Bool,[Bool],[[Bool]],[[[Bool]]] and so on, (Bool,Bool), ((Bool,Bool),Bool), (((Bool,Bool),Bool),Bool) etcetera, along with various combinations of these such as ([((Bool,[Bool]),Bool)],Bool) and so forth. It's not clear how to represent these in a String; even a list of TypeRep would require some pretty smart enumeration.
The compiler can (try to) deduce whether a type is an instance of Eq for any given type, but it doesn't read in all the instance declarations in scope and then just starts deducing all possible instances, since that will never finish!
The important question is of course, what do you need this for?
I guess, it's not possible. I explain you the implementation of typeclasses (for GHC), from it, you can see, that the compiler has no need to know which types are instance of a typeclass. It only has to know, whether a specific type is instance or not.
A typeclass will be translated into a datatype. As an example, let's take Eq:
class Eq a where
(==),(/=) :: a -> a -> Bool
The typeclass will be translated into a kind of dictionary, containing all its functions:
data Eq a = Eq {
(==) :: a -> a -> Bool,
(/=) :: a -> a -> Bool
}
Each typeclass constraint is then translated into an extra argument containing the dictionary:
elem :: Eq a => a -> [a] -> Bool
elem _ [] = False
elem a (x:xs) | x == a = True
| otherwise = elem a xs
becomes:
elem :: Eq a -> a -> [a] -> Bool
elem _ _ [] = False
elem eq a (x:xs) | (==) eq x a = True
| otherwise = elem eq a xs
The important thing is, that the dictionary will be passed at runtime. Imagine, your project contains many modules. GHC doesn't have to check all the modules for instances, it just has to look up, whether an instance is defined anywhere.
But if you have the source available, I guess an old-style grep for the instances would be sufficient.
It is not possible to automatically do this for existing classes. For your own class and instances thereof you could do it. You would need to declare everything via Template Haskell (or perhaps the quasi-quoting) and it would automatically generate some strange data structure that encodes the declared instances. Defining the strange data structure and making Template Haskell do this are details left to whomever has a use case for them.
Perhaps you could add some Template Haskell or other magic to your build to include all the source files as text available at run-time (c.f. program quine). Then your program would 'grep itself'...

Resources