How to create a well-typed function that returns two different types? - haskell

I'm highly interested in compiling Formality-Core modules to Haskell libraries. While I could use unsafeCoerce everywhere, it would be more interesting if I could preserve the type information, allowing compiled modules to be published on Cabal and used by other Haskell projects. The problem is that dependent types allow programs that are forbidden by Haskell. For example, the function foo below:
foo: (b : Bool) -> If(b)(Nat)(Bool)
(b)
b<(b) If(b)(Nat)(Bool)>
| zero;
| false;
Returns a different type depending on the input. If the input is false, it returns the number zero. If the input is true, it returns the boolean false. It seems like a function like this can't be translated to Haskell. I believe that, on the last years, Haskell has made good progress towards dependent type, so, I wonder: is there any way to write functions that return different types based on the input value?

GADTs + TypeFamilies (optionally, + DataKinds) can do roughly this. So:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE TypeFamilies #-}
data GADTBool a where
GADTFalse :: GADTBool False
GADTTrue :: GADTBool True
type family If cond t f where
If False t f = f
If True t f = t
foo :: GADTBool b -> If b Int Bool
foo GADTTrue = 0
foo GADTFalse = False
Of course, you'll probably actually want foo :: GADTBool b -> If b (GADTInt 0) (GADTBool False) if you plan to do this kind of thing pervasively. The search term you want for seeing more examples of this kind of hackery is "singleton types", often abbreviated just "singletons".

The state of the art remains the singleton approach.
data SBool b where
SFalse :: SBool 'False
STrue :: SBool 'True
type family If (b :: Bool) (t1 :: k) (t2 :: k) :: k where
If 'False x _ = x
If 'True _ y = y
foo :: SBool b -> If b Natural Bool
foo SFalse = 0
foo STrue = False

It's maybe worth noting that, as a practical matter, the singletons library can be used to take care of most of the boilerplate. So, you can write:
{-# LANGUAGE GADTs #-}
module Formality where
import Numeric.Natural
import Data.Singletons.Prelude
foo :: SBool b -> If b Bool Natural
foo SFalse = 0
foo STrue = False
using almost exactly the syntax used by #dfeuer, up to the order of arguments to If.
The main disadvantage of the singletons library is that any serious type dependent programming is going to eventually require understanding how things are actually implemented internally, and the guts of the library are complicated and not very well documented.
You may find it helpful to start by hand-compiling some Formality using a from-scratch solution using your own singleton GADTs and type families (as in the other answers), and then try to convert it over to use singletons.

Related

Parameterized Types in Haskell

Why do types in Haskell have to be explicitly parameterized in the type constructor parameter?
For example:
data Maybe a = Nothing | Just a
Here a has to be specified with the type. Why can't it be specified only in the constructor?
data Maybe = Nothing | Just a
Why did they make this choice from a design point of view? Is one better than the other?
I do understand that first is more strongly typed than the second, but there isn't even an option for the second one.
Edit :
Example function
data Maybe = Just a | Nothing
div :: (Int -> Int -> Maybe)
div a b
| b == 0 = Nothing
| otherwise = Just (a / b)
It would probably clear things up to use GADT notation, since the standard notation kind of mangles together the type- and value-level languages.
The standard Maybe type looks thus as a GADT:
{-# LANGUAGE GADTs #-}
data Maybe a where
Nothing :: Maybe a
Just :: a -> Maybe a
The “un-parameterised” version is also possible:
data EMaybe where
ENothing :: EMaybe
EJust :: a -> EMaybe
(as Joseph Sible commented, this is called an existential type). And now you can define
foo :: Maybe Int
foo = Just 37
foo' :: EMaybe
foo' = EJust 37
Great, so why don't we just use EMaybe always?
Well, the problem is when you want to use such a value. With Maybe it's fine, you have full control of the contained type:
bhrar :: Maybe Int -> String
bhrar Nothing = "No number 😞"
bhrar (Just i)
| i<0 = "Negative 😖"
| otherwise = replicate i '😌'
But what can you do with a value of type EMaybe? Not much, it turns out, because EJust contains a value of some unknown type. So whatever you try to use the value for, will be a type error, because the compiler has no way to confirm it's actually the right type.
bhrar :: EMaybe -> String
bhrar' (EJust i) = replicate i '😌'
=====> Error couldn't match expected type Int with a
If a variable is not reflected in the return type it is considered existential. This is possible to define data ExMaybe = ExNothing | forall a. ExJust a but the argument to ExJust is completely useless. ExJust True and ExJust () both have type ExMaybe and are indistinguisable from the type system's perspective.
Here is the GADT syntax for both the original Maybe and the existential ExMaybe
{-# Language GADTs #-}
{-# Language LambdaCase #-}
{-# Language PolyKinds #-}
{-# Language ScopedTypeVariables #-}
{-# Language StandaloneKindSignatures #-}
{-# Language TypeApplications #-}
import Data.Kind (Type)
import Prelude hiding (Maybe(..))
type Maybe :: Type -> Type
data Maybe a where
Nothing :: Maybe a
Just :: a -> Maybe a
type ExMaybe :: Type
data ExMaybe where
ExNothing :: ExMaybe
ExJust :: a -> ExMaybe
You're question is like asking why a function f x = .. needs to specify its argument, there is the option of making the type argument invisible but this is very odd but the argument is still there even if invisible.
-- >> :t JUST
-- JUST :: a -> MAYBE
-- >> :t JUST 'a'
-- JUST 'a' :: MAYBE
type MAYBE :: forall (a :: Type). Type
data MAYBE where
NOTHING :: MAYBE #a
JUST :: a -> MAYBE #a
mAYBE :: b -> (a -> b) -> MAYBE #a -> b
mAYBE nOTHING jUST = \case
NOTHING -> nOTHING
JUST a -> jUST a
Having explicit type parameters makes it much more expressive. You lose so much information without it. For example, how would you write the type of map? Or functors in general?
map :: (a -> b) -> [a] -> [b]
This version says almost nothing about what’s going on
map :: (a -> b) -> [] -> []
Or even worse, head:
head :: [] -> a
Now we suddenly have access to unsafe coerce and zero type safety at all.
unsafeCoerce :: a -> b
unsafeCoerce x = head [x]
But we don’t just lose safety, we also lose the ability to do some things. For example if we want to read something into a list or Maybe, we can no longer specify what kind of list we want.
read :: Read a => a
example :: [Int] -> String
main = do
xs <- getLine
putStringLine (example xs)
This program would be impossible to write without lists having an explicit type parameter. (Or rather, read would be unable to have different implementations for different list types, since content type is now opaque)
It is however, as was mentioned by others, still possible to define a similar type by using the ExistentialQuantification extension. But in those cases you are very limited in how you can use those data types, since you cannot know what they contain.

How to 'show' unshowable types?

I am using data-reify and graphviz to transform an eDSL into a nice graphical representation, for introspection purposes.
As simple, contrived example, consider:
{-# LANGUAGE GADTs #-}
data Expr a where
Constant :: a -> Expr a
Map :: (other -> a) -> Expr a -> Expr a
Apply :: Expr (other -> a) -> Expr a -> Expr a
instance Functor Expr where
fmap fun val = Map fun val
instance Applicative Expr where
fun_expr <*> data_expr = Apply fun_expr data_expr
pure val = Constant val
-- And then some functions to optimize an Expr AST, evaluate Exprs, etc.
To make introspection nicer, I would like to print the values which are stored inside certain AST nodes of the DSL datatype.
However, in general any a might be stored in Constant, even those that do not implement Show. This is not necessarily a problem since we can constrain the instance of Expr like so:
instance Show a => Show (Expr a) where
...
This is not what I want however: I would still like to be able to print Expr even if a is not Show-able, by printing some placeholder value (such as just its type and a message that it is unprintable) instead.
So we want to do one thing if we have an a implementing Show, and another if a particular a does not.
Furthermore, the DSL also has the constructors Map and Apply which are even more problematic. The constructor is existential in other, and thus we cannot assume anything about other, a or (other -> a). Adding constraints to the type of other to the Map resp. Apply constructors would break the implementation of Functor resp. Applicative which forwards to them.
But here also I'd like to print for the functions:
a unique reference. This is always possible (even though it is not pretty as it requires unsafePerformIO) using System.Mem.StableName.
Its type, if possible (one technique is to use show (typeOf fun), but it requires that fun is Typeable).
Again we reach the issue where we want to do one thing if we have an f implementing Typeable and another if f does not.
How to do this?
Extra disclaimer: The goal here is not to create 'correct' Show instances for types that do not support it. There is no aspiration to be able to Read them later, or that print a != print b implies a != b.
The goal is to print any datastructure in a 'nice for human introspection' way.
The part I am stuck at, is that I want to use one implementation if extra constraints are holding for a resp. (other -> a), but a 'default' one if these do not exist.
Maybe type classes with FlexibleInstances, or maybe type families are needed here? I have not been able to figure it out (and maybe I am on the wrong track all together).
Not all problems have solutions. Not all constraint systems have a satisfying assignment.
So... relax the constraints. Store the data you need to make a sensible introspective function in your data structure, and use functions with type signatures like show, fmap, pure, and (<*>), but not exactly equal to them. If you need IO, use IO in your type signature. In short: free yourself from the expectation that your exceptional needs fit into the standard library.
To deal with things where you may either have an instance or not, store data saying whether you have an instance or not:
data InstanceOrNot c where
Instance :: c => InstanceOrNot c
Not :: InstanceOrNot c
(Perhaps a Constraint-kinded Either-alike, rather than Maybe-alike, would be more appropriate. I suspect as you start coding this you will discover what's needed.) Demand that clients that call notFmap and friends supply these as appropriate.
In the comments, I propose parameterizing your type by the constraints you demand, and giving a Functor instance for the no-constraints version. Here's a short example showing how that might look:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE TypeFamilies #-}
{-# LANGUAGE FlexibleInstances #-}
import Data.Kind
type family All cs a :: Constraint where
All '[] a = ()
All (c:cs) a = (c a, All cs a)
data Lol cs a where
Leaf :: a -> Lol cs a
Fmap :: All cs b => (a -> b) -> Lol cs a -> Lol cs b
instance Functor (Lol '[]) where
fmap f (Leaf a) = Leaf (f a)
fmap f (Fmap g garg) = Fmap (f . g) garg
Great timing! Well-typed recently released a library which allows you to recover runtime information. They specifically have an example of showing arbitrary values. It's on github at https://github.com/well-typed/recover-rtti.
It turns out that this is a problem which has been recognized by multiple people in the past, known as the 'Constrained Monad Problem'. There is an elegant solution, explained in detail in the paper The Constrained-Monad Problem by Neil Sculthorpe and Jan Bracker and George Giorgidze and Andy Gill.
A brief summary of the technique: Monads (and other typeclasses) have a 'normal form'. We can 'lift' primitives (which are constrained any way we wish) into this 'normal form' construction, itself an existential datatype, and then use any of the operations available for the typeclass we have lifted into. These operations themselves are not constrained, and thus we can use all of Haskell's normal typeclass functions.
Finally, to turn this back into the concrete type (which again has all the constraints we are interested in) we 'lower' it, which is an operation that takes for each of the typeclass' operations a function which it will apply at the appropriate time.
This way, constraints from the outside (which are part of the functions supplied to the lowering) and constraints from the inside (which are part of the primitives we lifted) are able to be matched, and finally we end up with one big happy constrained datatype for which we have been able to use any of the normal Functor/Monoid/Monad/etc. operations.
Interestingly, while the intermediate operations are not constrained, to my knowledge it is impossible to write something which 'breaks' them as this would break the categorical laws that the typeclass under consideration should adhere to.
This is available in the constrained-normal Hackage package to use in your own code.
The example I struggled with, could be implemented as follows:
{-# LANGUAGE GADTs #-}
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE ConstraintKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE UndecidableInstances #-}
module Example where
import Data.Dynamic
import Data.Kind
import Data.Typeable
import Control.Monad.ConstrainedNormal
-- | Required to have a simple constraint which we can use as argument to `Expr` / `Expr'`.
-- | This is definitely the part of the example with the roughest edges: I have yet to figure out
-- | how to make Haskell happy with constraints
class (Show a, Typeable a) => Introspectable a where {}
instance (Show a, Typeable a) => Introspectable a where {}
data Expr' (c :: * -> Constraint) a where
C :: a -> Expr' c a
-- M :: (a -> b) -> Expr' a -> Expr' b --^ NOTE: This one is actually never used as ConstrainedNormal will use the 'free' implementation based on A + C.
A :: c a => Expr' c (a -> b) -> Expr' c a -> Expr' c b
instance Introspectable a => Show (Expr' Introspectable a) where
show e = case e of
C x -> "(C " ++ show x ++ ")"
-- M f x = "(M " ++ show val ++ ")"
A fx x -> "(A " ++ show (typeOf fx) ++ " " ++ show x ++ ")"
-- | In user-facing code you'd not want to expose the guts of this construction
-- So let's introduce a 'wrapper type' which is what a user would normally interact with.
type Expr c a = NAF c (Expr' c) a
liftExpr :: c a => Expr' c a -> Expr c a
liftExpr expr = liftNAF expr
lowerExpr :: c a => Expr c a -> Expr' c a
lowerExpr lifted_expr = lowerNAF C A lifted_expr
constant :: Introspectable a => a -> Expr c a
constant val = pure val -- liftExpr (C val)
You could now for instance write
ghci> val = constant 10 :: Expr Introspectable Int
(C 10)
ghci> (+2) <$> val
(C 12)
ghci> (+) <$> constant 10 <*> constant 32 :: Expr Introspectable Int
And by using Data.Constraint.Trivial (part of the trivial-constrained library, although it is also possible to write your own 'empty constrained') one could instead write e.g.
ghci> val = constant 10 :: Expr Unconstrained Int
which will work just as before, but now val cannot be printed.
The one thing I have not yet figured out, is how to properly work with subsets of constraints (i.e. if I have a function that only requires Show, make it work with something that is Introspectable). Currently everything has to work with the 'big' set of constraints.
Another minor drawback is of course that you'll have to annotate the constraint type (e.g. if you do not want constraints, write Unconstrained manually), as GHC will otherwise complain that c0 is not known.
We've reached the goal of having a type which can be optionally be constrained to be printable, with all machinery that does not need printing to work also on all instances of the family of types including those that are not printable, and the types can be used as Monoids, Functors, Applicatives, etc just as you like.
I think it is a beautiful approach, and want to commend Neil Sculthorpe et al. for their work on the paper and the constrained-normal library that makes this possible. It's very cool!

Is this use of GADTs fully equivalent to existential types?

Existentially quantified data constructors like
data Foo = forall a. MkFoo a (a -> Bool)
| Nil
can be easily translated to GADTs:
data Foo where
MkFoo :: a -> (a -> Bool) -> Foo
Nil :: Foo
Are there any differences between them: code which compiles with one but not another, or gives different results?
They are nearly equivalent, albeit not completely so, depending on which extensions you turn on.
First of all, note that you don't need to enable the GADTs extension to use the data .. where syntax for existential types. It suffices to enable the following lesser extensions.
{-# LANGUAGE GADTSyntax #-}
{-# LANGUAGE ExistentialQuantification #-}
With these extensions, you can compile
data U where
U :: a -> (a -> String) -> U
foo :: U -> String
foo (U x f) = f x
g x = let h y = const y x
in (h True, h 'a')
The above code also compiles if we replace the extensions and the type definition with
{-# LANGUAGE ExistentialQuantification #-}
data U = forall a . U a (a -> String)
The above code, however, does not compile with the GADTs extension turned on! This is because GADTs also turns on the MonoLocalBinds extension, which prevents the above definition of g to compile. This is because the latter extension prevents h to receive a polymorphic type.
From the documentation:
Notice that GADT-style syntax generalises existential types (Existentially quantified data constructors). For example, these two declarations are equivalent:
data Foo = forall a. MkFoo a (a -> Bool)
data Foo' where { MKFoo :: a -> (a->Bool) -> Foo' }
(emphasis on the word equivalent)
The latter isn't actually a GADT - it's an existentially quantified data type declared with GADT syntax. As such, it is identical to the former.
The reason it's not a GADT is that there is no type variable that gets refined based on the choice of constructor. That's the key new functionality added by GADTs. If you have a GADT like this:
data Foo a where
StringFoo :: String -> Foo String
IntFoo :: Int -> Foo Int
Then pattern-matching on each constructor reveals additional information that can be used inside the matching clause. For instance:
deconstructFoo :: Foo a -> a
deconstructFoo (StringFoo s) = "Hello! " ++ s ++ " is a String!"
deconstructFoo (IntFoo i) = i * 3 + 1
Notice that something very interesting is happening there, from the point of view of the type system. deconstructFoo promises it will work for any choice of a, as long as it's passed a value of type Foo a. But then the first equation returns a String, and the second equation returns an Int.
This is what you cannot do with a regular data type, and the new thing GADTs provide. In the first equation, the pattern match adds the constraint (a ~ String) to its context. In the second equation, the pattern match adds (a ~ Int).
If you haven't created a type where pattern-matching can cause type refinement, you don't have a GADT. You just have a type declared with GADT syntax. Which is fine - in a lot of ways, it's a better syntax than the basic data type syntax. It's just more verbose for the easiest cases.

Specialization of singleton parameters

I'm playing around with specialization of singletons:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
module Data.Test where
data SingBool (b :: Bool) where
STrue :: SingBool 'True
SFalse :: SingBool 'False
sing :: SingBool b -> Bool
sing SFalse = False
sing STrue = True
{-# SPECIALIZE sing :: SingBool 'False -> Bool #-}
This specializes to something like the following:
singSFalse :: SingBool 'False -> Bool
singSFalse SFalse = False
I'd expect it to generate an RHS of singSFalse _ = False instead.
Is that coercion unpacked only to satisfy the type-checker or is there actual runtime overhead involved in that pattern match? I imagine that GHC does not discard the pattern match on the argument to account for bottom, in order not to increase laziness. But I want to be sure before I begin to model this through Proxy + a SingI-style type class.
OK, to mostly answer my own question: Knowing that SingBool 'False only has one inhabitant is not enough for GHC to get rid of the pattern match, because we could call the function like singSFalse (error "matched"), e.g. bottom is always another inhabitant.
So, specialization (e.g. inlining based on concrete TypeApplications) does not really work well with singletons (turning those type applications into presumably constant value applications) in Haskell (lazy, non-total) w.r.t. zero cost abstractions.
However, by using a SingI-style type class with a proxy (e.g. singByProxy), we don't have the same problems:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE MagicHash #-}
module Data.Test where
import GHC.Exts (Proxy#)
class SingIBool (b :: Bool) where
sing :: Proxy# b -> Bool
instance SingIBool 'False where
sing _ = False
instance SingIBool 'True where
sing _ = True
refurbulate :: SingIBool b => Proxy# b -> Int
refurbulate p
| sing p = 0
| otherwise = 1
The specialization refurbulate #(Proxy# 'False) will not only be implemented as const False, also there will not be passed any Proxy# argument at the value level, so it's rather coerce False :: Proxy# -> Bool. Neat! However, I don't get to use singletons in the real world :(
To recap why singletons fail (to get optimized) and type classes work:
By specializing the type class instance, we get to know the RHS of sing, from which we can deduce totality.
By specializing the singleton, we get to know what value the parameter evaluates to, if evaluation terminates.
Knowing the canonical RHS of a type class method x :: () is more informative than just knowing that a parameter x :: () can only evaluate to one value in a non-total, lazy (e.g. Haskell's) setting.

Test if a value matches a constructor

Say I have a data type like so:
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int]
Now I wish to filter out the elements matching a given constructor from a [NumCol]. I can write it for, say, Pair:
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter is_pair
where is_pair (Pair _ _) = True
is_pair _ = False
This works, but it's not generic. I have to write a separate function for is_single, is_lots, etc.
I wish instead I could write:
get_pairs = filter (== Pair)
But this only works for type constructors that take no arguments (i.e. Empty).
So the question is, how can I write a function that takes a value and a constructor, and returns whether the value matches the constructor?
At least get_pairs itself can be defined relatively simply by using a list comprehension to filter instead:
get_pairs xs = [x | x#Pair {} <- xs]
For a more general solution of matching constructors, you can use prisms from the lens package:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Lens.Extras (is)
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int]
-- Uses Template Haskell to create the Prisms _Empty, _Single, _Pair and _Lots
-- corresponding to your constructors
makePrisms ''NumCol
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter (is _Pair)
Tags of tagged unions ought to be first-class values, and with a wee bit of effort, they are.
Jiggery-pokery alert:
{-# LANGUAGE GADTs, DataKinds, KindSignatures,
TypeFamilies, PolyKinds, FlexibleInstances,
PatternSynonyms
#-}
Step one: define type-level versions of the tags.
data TagType = EmptyTag | SingleTag | PairTag | LotsTag
Step two: define value-level witnesses for the representability of the type-level tags. Richard Eisenberg's Singletons library will do this for you. I mean something like this:
data Tag :: TagType -> * where
EmptyT :: Tag EmptyTag
SingleT :: Tag SingleTag
PairT :: Tag PairTag
LotsT :: Tag LotsTag
And now we can say what stuff we expect to find associated with a given tag.
type family Stuff (t :: TagType) :: * where
Stuff EmptyTag = ()
Stuff SingleTag = Int
Stuff PairTag = (Int, Int)
Stuff LotsTag = [Int]
So we can refactor the type you first thought of
data NumCol :: * where
(:&) :: Tag t -> Stuff t -> NumCol
and use PatternSynonyms to recover the behaviour you had in mind:
pattern Empty = EmptyT :& ()
pattern Single i = SingleT :& i
pattern Pair i j = PairT :& (i, j)
pattern Lots is = LotsT :& is
So what's happened is that each constructor for NumCol has turned into a tag indexed by the kind of tag it's for. That is, constructor tags now live separately from the rest of the data, synchronized by a common index which ensures that the stuff associated with a tag matches the tag itself.
But we can talk about tags alone.
data Ex :: (k -> *) -> * where -- wish I could say newtype here
Witness :: p x -> Ex p
Now, Ex Tag, is the type of "runtime tags with a type level counterpart". It has an Eq instance
instance Eq (Ex Tag) where
Witness EmptyT == Witness EmptyT = True
Witness SingleT == Witness SingleT = True
Witness PairT == Witness PairT = True
Witness LotsT == Witness LotsT = True
_ == _ = False
Moreover, we can easily extract the tag of a NumCol.
numColTag :: NumCol -> Ex Tag
numColTag (n :& _) = Witness n
And that allows us to match your specification.
filter ((Witness PairT ==) . numColTag) :: [NumCol] -> [NumCol]
Which raises the question of whether your specification is actually what you need. The point is that detecting a tag entitles you an expectation of that tag's stuff. The output type [NumCol] doesn't do justice to the fact that you know you have just the pairs.
How might you tighten the type of your function and still deliver it?
One approach is to use DataTypeable and the Data.Data module. This approach relies on two autogenerated typeclass instances that carry metadata about the type for you: Typeable and Data. You can derive them with {-# LANGUAGE DeriveDataTypeable #-}:
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int] deriving (Typeable, Data)
Now we have a toConstr function which, given a value, gives us a representation of its constructor:
toConstr :: Data a => a -> Constr
This makes it easy to compare two terms just by their constructors. The only remaining problem is that we need a value to compare against when we define our predicate! We can always just create a dummy value with undefined, but that's a bit ugly:
is_pair x = toConstr x == toConstr (Pair undefined undefined)
So the final thing we'll do is define a handy little class that automates this. The basic idea is to call toConstr on non-function values and recurse on any functions by first passing in undefined.
class Constrable a where
constr :: a -> Constr
instance Data a => Constrable a where
constr = toConstr
instance Constrable a => Constrable (b -> a) where
constr f = constr (f undefined)
This relies on FlexibleInstance, OverlappingInstances and UndecidableInstances, so it might be a bit evil, but, using the (in)famous eyeball theorem, it should be fine. Unless you add more instances or try to use it with something that isn't a constructor. Then it might blow up. Violently. No promises.
Finally, with the evil neatly contained, we can write an "equal by constructor" operator:
(=|=) :: (Data a, Constrable b) => a -> b -> Bool
e =|= c = toConstr e == constr c
(The =|= operator is a bit of a mnemonic, because constructors are syntactically defined with a |.)
Now you can write almost exactly what you wanted!
filter (=|= Pair)
Also, maybe you'd want to turn off the monomorphism restriction. In fact, here's the list of extensions I enabled that you can just use:
{-# LANGUAGE DeriveDataTypeable, FlexibleInstances, NoMonomorphismRestriction, OverlappingInstances, UndecidableInstances #-}
Yeah, it's a lot. But that's what I'm willing to sacrifice for the cause. Of not writing extra undefineds.
Honestly, if you don't mind relying on lens (but boy is that dependency a doozy), you should just go with the prism approach. The only thing to recommend mine is that you get to use the amusingly named Data.Data.Data class.

Resources