How to count the length of a tuple in Haskell? - haskell

I tried searching the web and Stack Exchange, but surprisingly no one seems to have asked how to calculate the length of a tuple in Haskell in the form below.
Suppose you have a tuple like (1,2,3,4) or (1,3,5,6,7) in Haskell and wish to write a length function that calculates the length of a tuple. How can I do this? For a list I know how to do it using recursion without explicitly invoking built-in functions. But a tuple is different, and I can't use the "head"-"tail" distinction.
Will the method involve creating a new data type?

The reason you don't find anything on this is that it doesn't make a lot of sense to count the length of a tuple:
It is predetermined at compile time (i.e. you might just as well hard-code it)
There's not really much you could do with that information: unlike with e.g. a list, it's not possible to index specific entries in a generic tuple.
That said, it is possible to achieve your goal; however, this shouldn't be a normal function but a type-level function, aka a type family. Using singleton type nats:
{-# LANGUAGE TypeFamilies, DataKinds #-}
import Data.Singletons
import Data.Singletons.TypeLits
type family TupLength a :: Nat where
  TupLength () = 0
  TupLength (a,b) = 2
  TupLength (a,b,c) = 3
  TupLength (a,b,c,d) = 4
  -- ...
  TupLength x = 1
Then
> mapM_ print [ natVal (sing :: SNat (TupLength ()))
              , natVal (sing :: SNat (TupLength (Int,String,Double)))
              , natVal (sing :: SNat (TupLength Bool)) ]
0
3
1
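The type family above can also be tied to an ordinary function using only base, without the singletons package. The following is a minimal sketch under that assumption: tupLength is a hypothetical helper name, and the family is cut down to small tuples for brevity.

```haskell
{-# LANGUAGE DataKinds, FlexibleContexts, ScopedTypeVariables, TypeFamilies #-}
import Data.Proxy (Proxy (..))
import GHC.TypeLits (KnownNat, Nat, natVal)

-- Same closed type family as above, limited to small tuples for brevity.
type family TupLength a :: Nat where
  TupLength ()        = 0
  TupLength (a, b)    = 2
  TupLength (a, b, c) = 3
  TupLength x         = 1

-- Hypothetical helper: the argument is used only for its type, so the
-- result is fixed at compile time and the value is never inspected.
tupLength :: forall a. KnownNat (TupLength a) => a -> Integer
tupLength _ = natVal (Proxy :: Proxy (TupLength a))
```

With this, tupLength ('a', True, "x") evaluates to 3 and tupLength 'c' to 1, because anything that is not a listed tuple falls through to the last equation.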

One possible answer (using just the base library) is
import Prelude hiding (length) -- avoid clashing with the Prelude's length
import Data.Data
import Data.Functor.Const
length :: Data a => a -> Int
length =
  getConst .
  gfoldl (\(Const c) _ -> Const (c+1)) (const 0)
I gave a whole talk on this subject at London Haskell recently. The slides are here, but the video has not been published yet.
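A shorter variation on the same Data-based idea, offered only as a sketch: gmapQ from Data.Data maps a query over the immediate fields of a value's outermost constructor, so counting its results gives the number of fields. The name fieldCount is hypothetical.

```haskell
import Data.Data (Data, gmapQ)

-- Hypothetical name: counts the immediate fields of the outermost
-- constructor, which for a tuple is its number of components.
fieldCount :: Data a => a -> Int
fieldCount = length . gmapQ (const ())
```

As with the gfoldl version, this works for any Data instance, not just tuples: fieldCount (Just 'x') is 1, and a non-empty list gives 2 (the head and the tail of the cons cell), so it is not a list length.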

Related

Type-level constraints in Haskell

I'm trying to encode a term algebra as a data type in Haskell for further use in an algorithm. By term algebra I mean a set of terms which are either variables or functions applied to other terms. The functions with zero arguments are constants (but that does not actually matter here).
Firstly, one would need the following GHC language extensions to replicate my code:
{-# LANGUAGE GADTs, DataKinds,
             TypeApplications,
             TypeFamilies,
             TypeOperators,
             StandaloneKindSignatures,
             UndecidableInstances #-}
and the following imports:
import qualified GHC.TypeLits as GTL
import Data.Kind
The direct way to encode terms (the first one I took):
data Term where
  Var :: String -> Term
  Func :: String -> Integer -> [Term] -> Term
where by String I want to encode the name, by Integer the arity and by [Term] the list of arguments of a function.
Then I want to be sure that the list of terms given as arguments has the same length as the arity.
The first idea is to use smart constructors, but I would like to encode such constraints on a type level. So the second idea would be to create type-level naturals, lists of a specified length and the data type where these numbers coincide:
data Z = Z
data S a = S a
data List n a where
  Nil :: List Z a
  Cons :: a -> List m a -> List (S m) a
data WWTerm where
  WWVar :: String -> WWTerm
  WWFunc :: String -> m -> List m WWTerm -> WWTerm
My question here is the following: is there a way to impose type-level constraints using ordinary lists at the same time via 1) creating special type-families, 2) creating special type classes, or 3) via constraints in data types?
Regarding my attempts, I wrote the following:
type MyLength :: [Type] -> GTL.Nat
type family MyLength xs where
  MyLength '[] = 0
  MyLength (x ': xs) = 1 GTL.+ (MyLength xs)
data QFunc n l where
  QFunc :: (MyLength l ~ n) => String -> (n :: GTL.Nat) -> l -> QFunc n l
Unfortunately, this part of code doesn't compile for the following reasons:
Expected a type, but ‘n :: Nat’ has kind ‘Nat’
...
Expected a type, but ‘l’ has kind ‘[*]’
...
Are there any thoughts on how to approach my goal?

Is there a canonical way of comparing/changing one/two records in haskell?

I want to compare two records in Haskell without having to repeat, for every field of the record, both the change to the record's datatype and the function that combines two values.
I read about lens, but I could not find an example for that,
and do not know where to begin reading in the documentation.
Example, not working:
data TheState = TheState { number :: Int,
                           truth :: Bool
                         }
initState = TheState 77 True
-- not working, example:
stateMaybe = fmap Just initState
-- result should be:
-- ANewStateType{ number = Just 77, truth = Just True}
The same way, I want to compare the 2 states:
state2 = TheState 78 True
-- not working, example
stateMaybe2 = someNewCompare initState state2
-- result should be:
-- ANewStateType{ number = Just 78, truth = Nothing}
As others have mentioned in comments, it's most likely easier to create a different record to hold the Maybe version of the fields and do the manual conversion. However there is a way to get the functor like mapping over your fields in a more automated way.
It's probably more involved than what you would want but it's possible to achieve using a pattern called Higher Kinded Data (HKD) and a library called barbies.
Here is an amazing blog post on the subject: https://chrispenner.ca/posts/hkd-options
And here is my attempt at using HKD on your specific example:
{-# LANGUAGE DeriveAnyClass #-}
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE FlexibleContexts #-}
-- base
import Data.Functor.Identity
import GHC.Generics (Generic)
-- barbie
import Data.Barbie
type TheState = TheState_ Identity
data TheState_ f = TheState
  { number :: f Int
  , truth :: f Bool
  } deriving (Generic, FunctorB)
initState :: TheState
initState = TheState (pure 77) (pure True)
stateMaybe :: TheState_ Maybe
stateMaybe = bmap (Just . runIdentity) initState
What is happening here is that we are wrapping every field of the record in a custom f. We now get to choose what to parameterise TheState with in order to wrap every field. A normal record now has all of its fields wrapped in Identity. But you can have other versions of the record easily available as well. The bmap function lets you map your transformation from one type of TheState_ to another.
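To make the shape of the idea concrete without any library, here is a minimal hand-written sketch of the comparison asked for in the question, using the same higher-kinded record. The helpers changed and diffState are hypothetical names; this version still spells out each field by hand, which is the part a library such as barbies is meant to automate.

```haskell
import Data.Functor.Identity (Identity (..))

-- Same shape as TheState_ above, written without barbies.
data TheState_ f = TheState
  { number :: f Int
  , truth  :: f Bool
  }

-- Hypothetical helper: keep the new value only when it differs from the old one.
changed :: Eq a => Identity a -> Identity a -> Maybe a
changed (Identity old) (Identity new)
  | old /= new = Just new
  | otherwise  = Nothing

-- Hypothetical field-by-field comparison producing the Maybe-wrapped record.
diffState :: TheState_ Identity -> TheState_ Identity -> TheState_ Maybe
diffState s1 s2 = TheState
  { number = changed (number s1) (number s2)
  , truth  = changed (truth s1) (truth s2)
  }
```

Comparing TheState (pure 77) (pure True) with TheState (pure 78) (pure True) gives number = Just 78 and truth = Nothing, matching the result sketched in the question.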
Honestly, the blog post will do a much better job at explaining this than I would. I find the subject very interesting, but I am still very new to it myself.
Hope this helped! :-)
How to make a Functor out of a record. For that I have an answer: apply the function to all of the items of the record.
I want to use the record as a heterogeneous container / hashmap, where the names determine the value types
While there's no "easy", direct way of doing this, it can be accomplished with several existing libraries.
This answer uses the red-black-record library, which is itself built over the anonymous products of sop-core. "sop-core" allows each field in a product to be wrapped in a functor like Maybe and provides functions to manipulate fields uniformly. "red-black-record" inherits this, adding named fields and conversions from normal records.
To make TheState compatible with "red-black-record", we need to do the following:
{-# LANGUAGE DataKinds, FlexibleContexts, ScopedTypeVariables,
             DeriveGeneric, DeriveAnyClass,
             TypeApplications #-}
import GHC.Generics
import Data.SOP
import Data.SOP.NP (NP,cliftA2_NP) -- anonymous n-ary products
import Data.RBR (Record, -- generalized record type with fields wrapped in functors
                 I(..), -- an identity functor for "simple" cases
                 Productlike, -- relates a map of types to its flattened list of types
                 ToRecord, toRecord, -- convert a normal record to its generalized form
                 RecordCode, -- returns the map of types corresponding to a normal record
                 toNP, fromNP, -- convert generalized record to and from n-ary product
                 getField) -- access field from generalized record using TypeApplications
data TheState = TheState { number :: Int,
                           truth :: Bool
                         } deriving (Generic,ToRecord)
We auto-derive the Generic instance that allows other code to introspect the structure of the datatype. This is needed by ToRecord, that allows conversion of normal records into their "generalized forms".
Now consider the following function:
compareRecords :: forall r flat. (ToRecord r,
                                  Productlike '[] (RecordCode r) flat,
                                  All Eq flat)
               => r
               -> r
               -> Record Maybe (RecordCode r)
compareRecords state1 state2 =
    let mapIIM :: forall a. Eq a => I a -> I a -> Maybe a
        mapIIM (I val1) (I val2) = if val1 /= val2 then Just val2
                                                   else Nothing
        resultNP :: NP Maybe flat
        resultNP = cliftA2_NP (Proxy @Eq)
                              mapIIM
                              (toNP (toRecord state1))
                              (toNP (toRecord state2))
     in fromNP resultNP
It compares any two records that have ToRecord r instances and whose corresponding flattened list of types all have Eq instances (the Productlike '[] (RecordCode r) flat and All Eq flat constraints).
First it converts the initial record arguments to their generalized forms with toRecord. These generalized forms are parameterized with an identity functor I because they come from "pure" values and there aren't any effects at play, yet.
The generalized record forms are in turn converted to n-ary products with toNP.
Then we can use the cliftA2_NP function from "sop-core" to compare across all fields using their respective Eq instances. The function requires specifying the Eq constraint using a Proxy.
The only thing left to do is reconstructing a generalized record (this one parameterized by Maybe) using fromNP.
An example of use:
main :: IO ()
main = do
    let comparison = compareRecords (TheState 0 False) (TheState 0 True)
    print (getField @"number" comparison)
    print (getField @"truth" comparison)
getField is used to extract values from generalized records. The field name is given as a Symbol by way of -XTypeApplications.

What's a better way of managing large Haskell records?

Replacing fields names with letters, I have cases like this:
data Foo = Foo { a :: Maybe ...
               , b :: [...]
               , c :: Maybe ...
               , ... for a lot more fields ...
               } deriving (Show, Eq, Ord)
instance Writer Foo where
  write x = maybeWrite a ++
            listWrite b ++
            maybeWrite c ++
            ... for a lot more fields ...
parser = permute (Foo
                  <$?> (Nothing, Just `liftM` aParser)
                  <|?> ([], bParser)
                  <|?> (Nothing, Just `liftM` cParser)
                  ... for a lot more fields ...
-- this is particularly hideous
foldl1 merge [foo1, foo2, ...]
merge (Foo a b c ...seriously a lot more...)
      (Foo a' b' c' ...) =
  Foo (max a a') (b ++ b') (max c c') ...
What techniques would allow me to better manage this growth?
In a perfect world a, b, and c would all be the same type so I could keep them in a list, but they can be many different types. I'm particularly interested in any way to fold the records without needing the massive patterns.
I'm using this large record to hold the different types resulting from permutation parsing the vCard format.
Update
I've implemented both the generics and the foldl approaches suggested below. They both work, and they both reduce three large field lists to one.
Datatype-generic programming techniques can be used to transform all the fields of a record in some "uniform" sort of way.
Perhaps all the fields in the record implement some typeclass that we want to use (the typical example is Show). Or perhaps we have another record of "similar" shape that contains functions, and we want to apply each function to the corresponding field of the original record.
For these kinds of uses, the generics-sop library is a good option. It expands the default Generics functionality of GHC with extra type-level machinery that provides analogues of functions like sequence or ap, but which work over all the fields of a record.
Using generics-sop, I tried to create a slightly less verbose version of your merge function. Some preliminary imports:
{-# language TypeOperators #-}
{-# language DeriveGeneric #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}
import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP
A helper function that lifts a binary operation to a form useable by the functions of generics-sop:
fn_2' :: (a -> a -> a) -> (I -.-> (I -.-> I)) a -- I is simply an Identity functor
fn_2' = fn_2 . liftA2
A general merge function that takes a vector of operators and works on any single-constructor record that derives Generic:
merge :: (Generic a, Code a ~ '[ xs ]) => NP (I -.-> (I -.-> I)) xs -> a -> a -> a
merge funcs reg1 reg2 =
    case (from reg1, from reg2) of
        (SOP (Z np1), SOP (Z np2)) ->
            let npResult = funcs `hap` np1 `hap` np2
            in to (SOP (Z npResult))
Code is a type family that returns a type-level list of lists describing the structure of a datatype. The outer list is for constructors, the inner lists contain the types of the fields for each constructor.
The Code a ~ '[ xs ] part of the constraint says "the datatype can only have one constructor" by requiring the outer list to have exactly one element.
The SOP (Z _) pattern matches extract the (heterogeneous) vector of field values from the record's generic representation. SOP stands for "sum-of-products".
A concrete example:
data Person = Person
    {
        name :: String
    ,   age :: Int
    } deriving (Show,GHC.Generic)
instance Generic Person -- this Generic is from generics-sop
mergePerson :: Person -> Person -> Person
mergePerson = merge (fn_2' (++) :* fn_2' (+) :* Nil)
The Nil and :* constructors are used to build the vector of operators (the type is called NP, from n-ary product). If the vector doesn't match the number of fields in the record, the program won't compile.
Update. Given that the types in your record are highly uniform, an alternative way of creating the vector of operations is to define instances of an auxiliary typeclass for each field type, and then use the hcpure function:
class Mergeable a where
    mergeFunc :: a -> a -> a
instance Mergeable String where
    mergeFunc = (++)
instance Mergeable Int where
    mergeFunc = (+)
mergePerson :: Person -> Person -> Person
mergePerson = merge (hcpure (Proxy :: Proxy Mergeable) (fn_2' mergeFunc))
The hcliftA2 function (that combines hcpure, fn_2 and hap) could be used to simplify things further.
Some suggestions:
(1) You can use the RecordWildCards extension to automatically
unpack a record into variables. It doesn't help if you need to unpack
two records of the same type, but it's useful to keep in mind.
Oliver Charles has a nice blog post on it: (link)
(2) It appears your example application is performing a fold over the records.
Have a look at Gabriel Gonzalez's foldl package. There is also a blog post: (link)
Here is an example of how you might use it with a record like:
data Foo = Foo { _a :: Int, _b :: String }
The following code computes the maximum of the _a fields and the
concatenation of the _b fields.
import qualified Control.Foldl as L
import Data.Profunctor
data Foo = Foo { _a :: Int, _b :: String }
  deriving (Show)
fold_a :: L.Fold Foo Int
fold_a = lmap _a (L.Fold max 0 id)
fold_b :: L.Fold Foo String
fold_b = lmap _b (L.Fold (++) "" id)
fold_foos :: L.Fold Foo Foo
fold_foos = Foo <$> fold_a <*> fold_b
theFoos = [ Foo 1 "a", Foo 3 "b", Foo 2 "c" ]
test = L.fold fold_foos theFoos
Note the use of the Profunctor function lmap to extract out
the fields we want to fold over. The expression:
L.Fold max 0 id
is a fold over a list of Ints (or any Num instance), and therefore:
lmap _a (L.Fold max 0 id)
is the same fold but over a list of Foo records where we use _a
to produce the Ints.
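To show why the pieces compose this way, here is a dependency-free sketch of the idea behind the fold type. It is a simplified stand-in written for illustration, not the foldl package's actual source: Fold, premap and runFold here are local definitions, with premap playing the role that lmap plays above.

```haskell
{-# LANGUAGE ExistentialQuantification #-}
import Data.List (foldl')

-- Simplified stand-in for Control.Foldl.Fold: a step function, an initial
-- accumulator and a final extraction, with the accumulator type hidden.
data Fold a b = forall x. Fold (x -> a -> x) x (x -> b)

instance Functor (Fold a) where
  fmap f (Fold step begin done) = Fold step begin (f . done)

-- Two folds are combined by running both accumulators side by side,
-- so the input list is still traversed only once.
instance Applicative (Fold a) where
  pure b = Fold (\() _ -> ()) () (\() -> b)
  Fold stepL beginL doneL <*> Fold stepR beginR doneR =
    Fold (\(l, r) a -> (stepL l a, stepR r a))
         (beginL, beginR)
         (\(l, r) -> doneL l (doneR r))

-- Pass every input through f before it reaches the fold.
premap :: (a -> b) -> Fold b r -> Fold a r
premap f (Fold step begin done) = Fold (\x a -> step x (f a)) begin done

runFold :: Fold a b -> [a] -> b
runFold (Fold step begin done) xs = done (foldl' step begin xs)

data Foo = Foo { _a :: Int, _b :: String } deriving (Eq, Show)

-- One small fold per field, reassembled with the record constructor.
foldFoos :: Fold Foo Foo
foldFoos = Foo <$> premap _a (Fold max 0 id) <*> premap _b (Fold (++) "" id)
```

Running runFold foldFoos over [Foo 1 "a", Foo 3 "b", Foo 2 "c"] yields Foo 3 "abc". Adding a field means adding one more premap line rather than widening a pair of patterns.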

Simple dependent type example in Haskell for Dummies. How are they useful in practice in Haskell? Why should I care about dependent types ?

I hear a lot about dependent types nowadays and I heard that DataKinds is somehow related to dependent typing (but I am not sure about this... just heard it on a Haskell Meetup).
Could someone illustrate with a super simple Haskell example what dependent typing is and what it is good for?
On Wikipedia it is written that dependent types can help prevent bugs. Could you give a simple example of how dependent types in Haskell can prevent bugs?
Something that I could start using in five minutes right now to prevent bugs in my Haskell code?
Dependent types are basically functions from values to types; how can this be used in practice? Why is that good?
Late to the party, this answer is basically a shameless plug.
Sam Lindley and I wrote a paper about Hasochism, the pleasure and pain of dependently typed programming in Haskell. It gives plenty of examples of what's possible now in Haskell and draws points of comparison (favourable as well as not) with the Agda/Idris generation of dependently typed languages.
Although it is an academic paper, it is about actual programs, and you can grab the code from Sam's repo. We have lots of little examples (e.g. orderedness of mergesort output) but we end up with a text editor example, where we use indexing by width and height to manage screen geometry: we make sure that components are regular rectangles (vectors of vectors, not ragged lists of lists) and that they fit together exactly.
The key power of dependent types is to maintain consistency between separate data components (e.g., the head vector in a matrix and every vector in its tail must all have the same length). That's never more important than when writing conditional code. The situation (which will one day come to be seen as having been ridiculously naïve) is that the following are all type-preserving rewrites
if b then t else e => if b then e else t
if b then t else e => t
if b then t else e => e
Although we are presumably testing b because it gives us some useful insight into what would be appropriate (or even safe) to do next, none of that insight is mediated via the type system: the idea that b's truth justifies t and its falsity justifies e is missing, despite being critical.
Plain old Hindley-Milner does give us one means to ensure some consistency. Whenever we have a polymorphic function
f :: forall a. r[a] -> s[a] -> t[a]
we must instantiate a consistently: however the first argument fixes a, the second argument must play along, and we learn something useful about the result while we are at it. Allowing data at the type level is useful because some forms of consistency (e.g. lengths of things) are more readily expressed in terms of data (numbers).
But the real breakthrough is GADT pattern matching, where the type of a pattern can refine the type of the argument it matches. You have a vector of length n; you look to see whether it's nil or cons; now you know whether n is zero or not. This is a form of testing where the type of the code in each case is more specific than the type of the whole, because in each case something which has been learned is reflected at the type level. It is learning by testing which makes a language dependently typed, at least to some extent.
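A small sketch of that "learning by testing", under the assumption of a hand-rolled length-indexed vector (Vec, VNil, VCons and zipVec are illustrative names, not taken from the paper's code). Matching the first argument against VNil tells the type checker that n is zero, so the second argument can only be VNil as well, and no mismatched-length cases need to be written.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Kind (Type)

data Nat = Z | S Nat

-- A list whose length is recorded in its type.
data Vec (n :: Nat) (a :: Type) where
  VNil  :: Vec 'Z a
  VCons :: a -> Vec n a -> Vec ('S n) a

-- Both arguments share the index n, so the two clauses below are
-- exhaustive: the mixed nil/cons cases are ruled out by the types.
zipVec :: Vec n a -> Vec n b -> Vec n (a, b)
zipVec VNil         VNil         = VNil
zipVec (VCons x xs) (VCons y ys) = VCons (x, y) (zipVec xs ys)

vecToList :: Vec n a -> [a]
vecToList VNil         = []
vecToList (VCons x xs) = x : vecToList xs
```

Trying to zip a two-element vector with a three-element one is rejected at compile time, which is the consistency between separate data components described above.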
Here's a silly game to play, whatever typed language you use. Replace every type variable and every primitive type in your type expressions with 1 and evaluate types numerically (sum the sums, multiply the products, s -> t means t-to-the-s) and see what you get: if you get 0, you're a logician; if you get 1, you're a software engineer; if you get a power of 2, you're an electronic engineer; if you get infinity, you're a programmer. What's going on in this game is a crude attempt to measure the information we're managing and the choices our code must make. Our usual type systems are good at managing the "software engineering" aspects of coding: unpacking and plugging together components. But as soon as a choice has been made, there is no way for types to observe it, and as soon as there are choices to make, there is no way for types to guide us: non-dependent type systems approximate all values in a given type as the same. That's a pretty serious limitation on their use in bug prevention.
The common example is to encode the length of a list in its type, so you can do things like this (pseudocode):
cons :: a -> List a n -> List a (n+1)
where n is an integer. This lets you specify that adding an element to a list increments its length by one.
You can then prevent head (which gives you the first element of a list) from being run on an empty list:
head :: n > 0 => List a n -> a
Or do things like
to3uple :: List a 3 -> (a,a,a)
The problem with this type of approach is that you then can't call head on an arbitrary list without first having proven that the list is not empty.
Sometimes the proof can be done by the compiler, e.g.:
head (a `cons` l)
Otherwise, you have to do things like
if null list
  then ...
  else (head list)
Here it's safe to call head, because you are in the else branch and are therefore guaranteed that the length is not zero.
However, Haskell doesn't support dependent types at the moment, so the examples given above won't work as nicely. You should still be able to declare this type of list using DataKinds, because you can promote an integer literal to a type, which allows you to instantiate List a b as List Int 1 (b is a phantom type taking a literal).
If you are interested in this type of safety, you can have a look at Liquid Haskell.
Here is an example of such code:
{-# LANGUAGE DataKinds, KindSignatures, TypeFamilies, TypeOperators #-}
import GHC.TypeLits
data List a (n:: Nat) = List [a] deriving Show
cons :: a -> List a n -> List a (n + 1)
cons x (List xs) = List (x:xs)
singleton :: a -> List a 1
singleton x = List [x]
data NonEmpty
data EmptyList
type family ListLength a where
ListLength (List a 0) = EmptyList
ListLength (List a n) = NonEmpty
head' :: (ListLength (List a n) ~ NonEmpty) => List a n -> a
head' (List xs) = head xs
tail' :: (ListLength (List a n) ~ NonEmpty) => List a n -> List a (n-1)
tail' (List xs) = List (tail xs)
list = singleton "a"
head' list -- return "a"
Trying to do head' (tail' list) doesn't compile and gives
Couldn't match type ‘EmptyList’ with ‘NonEmpty’
Expected type: NonEmpty
Actual type: ListLength (List [Char] 0)
In the expression: head' (tail' list)
In an equation for ‘it’: it = head' (tail' list)
Adding to @mb14's example, here's some simpler working code.
First, we need DataKinds, GADTs, and KindSignatures to really make it clear:
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
Now let's define a Nat type, and a Vector type based on it:
data Nat :: * where
  Z :: Nat
  S :: Nat -> Nat
data Vector :: Nat -> * -> * where
  Nil :: Vector Z a
  (:-:) :: a -> Vector n a -> Vector (S n) a
And voila, lists using dependent types that can be called safe in certain circumstances.
Here are the head and tail functions:
head' :: Vector (S n) a -> a
head' (a :-: _) = a
-- The other constructor, Nil, doesn't apply here because of the type signature!
tail' :: Vector (S n) a -> Vector n a
tail' (_ :-: xs) = xs
-- Ditto here.
This is a more concrete and understandable example than above, but does the same sort of thing.
Note that in Haskell, types can influence values, but values cannot influence types in the same dependent ways. There are languages such as Idris that are similar to Haskell but also support value-to-type dependent typing, which I would recommend looking into.
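A common workaround for that limitation is a singleton type: a runtime value whose type mirrors it exactly, so that pattern matching on the value tells the compiler which type-level number it is dealing with. The sketch below reuses the Vector shape from above; SNat, replicateV and toList are illustrative names introduced here, not part of the answer's code.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures #-}
import Data.Kind (Type)

data Nat = Z | S Nat

data Vector (n :: Nat) (a :: Type) where
  Nil   :: Vector 'Z a
  (:-:) :: a -> Vector n a -> Vector ('S n) a
infixr 5 :-:

-- Singleton: for each type-level n there is exactly one value of SNat n.
data SNat (n :: Nat) where
  SZ :: SNat 'Z
  SS :: SNat n -> SNat ('S n)

-- The runtime argument determines the length in the result type.
replicateV :: SNat n -> a -> Vector n a
replicateV SZ     _ = Nil
replicateV (SS n) x = x :-: replicateV n x

toList :: Vector n a -> [a]
toList Nil        = []
toList (x :-: xs) = x : toList xs
```

Here replicateV (SS (SS (SS SZ))) 'x' has type Vector ('S ('S ('S 'Z))) Char, and toList turns it into "xxx".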
The machines package lets users define machines that can request values. Many machines request only one type of value, but it's also possible to define machines that sometimes ask for one type and sometimes ask for another type. The requests are values of a GADT type, which allows the value of the request to determine the type of the response.
Step k o r = ...
| forall t . Await (t -> r) (k t) r
The machine provides a request of type k t for some unspecified type t, and a function to deal with the result. By pattern matching on the request, the machine runner learns what type it must supply the machine. The machine's response handler doesn't need to check that it got the right sort of response.
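The following is a minimal sketch of that mechanism with made-up types; Request, Step, reply and runStep are hypothetical and much simpler than the real definitions in the machines package. It only illustrates how a GADT request fixes the type of the response.

```haskell
{-# LANGUAGE GADTs #-}

-- Hypothetical request type: the constructor fixes the type of the reply.
data Request t where
  AskInt  :: Request Int
  AskName :: Request String

-- A step either finishes or awaits the reply to a request of some type t.
data Step r where
  Done  :: r -> Step r
  Await :: (t -> r) -> Request t -> Step r

-- The runner learns which type to supply by matching on the request.
reply :: Request t -> t
reply AskInt  = 42
reply AskName = "machine"

-- The handler k needs no runtime check: it is given exactly a t.
runStep :: Step r -> r
runStep (Done r)      = r
runStep (Await k req) = k (reply req)
```

For example, runStep (Await (+ 1) AskInt) evaluates to 43, while pairing a String handler with AskInt would be a type error.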

Test if a value matches a constructor

Say I have a data type like so:
data NumCol = Empty |
              Single Int |
              Pair Int Int |
              Lots [Int]
Now I wish to filter out the elements matching a given constructor from a [NumCol]. I can write it for, say, Pair:
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter is_pair
  where is_pair (Pair _ _) = True
        is_pair _ = False
This works, but it's not generic. I have to write a separate function for is_single, is_lots, etc.
I wish instead I could write:
get_pairs = filter (== Pair)
But this only works for data constructors that take no arguments (i.e. Empty).
So the question is, how can I write a function that takes a value and a constructor, and returns whether the value matches the constructor?
At least get_pairs itself can be defined relatively simply by using a list comprehension to filter instead:
get_pairs xs = [x | x@Pair {} <- xs]
For a more general solution of matching constructors, you can use prisms from the lens package:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Lens.Extras (is)
data NumCol = Empty |
              Single Int |
              Pair Int Int |
              Lots [Int]
-- Uses Template Haskell to create the Prisms _Empty, _Single, _Pair and _Lots
-- corresponding to your constructors
makePrisms ''NumCol
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter (is _Pair)
Tags of tagged unions ought to be first-class values, and with a wee bit of effort, they are.
Jiggery-pokery alert:
{-# LANGUAGE GADTs, DataKinds, KindSignatures,
TypeFamilies, PolyKinds, FlexibleInstances,
PatternSynonyms
#-}
Step one: define type-level versions of the tags.
data TagType = EmptyTag | SingleTag | PairTag | LotsTag
Step two: define value-level witnesses for the representability of the type-level tags. Richard Eisenberg's Singletons library will do this for you. I mean something like this:
data Tag :: TagType -> * where
  EmptyT :: Tag EmptyTag
  SingleT :: Tag SingleTag
  PairT :: Tag PairTag
  LotsT :: Tag LotsTag
And now we can say what stuff we expect to find associated with a given tag.
type family Stuff (t :: TagType) :: * where
  Stuff EmptyTag = ()
  Stuff SingleTag = Int
  Stuff PairTag = (Int, Int)
  Stuff LotsTag = [Int]
So we can refactor the type you first thought of
data NumCol :: * where
  (:&) :: Tag t -> Stuff t -> NumCol
and use PatternSynonyms to recover the behaviour you had in mind:
pattern Empty = EmptyT :& ()
pattern Single i = SingleT :& i
pattern Pair i j = PairT :& (i, j)
pattern Lots is = LotsT :& is
So what's happened is that each constructor for NumCol has turned into a tag indexed by the kind of tag it's for. That is, constructor tags now live separately from the rest of the data, synchronized by a common index which ensures that the stuff associated with a tag matches the tag itself.
But we can talk about tags alone.
data Ex :: (k -> *) -> * where -- wish I could say newtype here
  Witness :: p x -> Ex p
Now, Ex Tag, is the type of "runtime tags with a type level counterpart". It has an Eq instance
instance Eq (Ex Tag) where
  Witness EmptyT == Witness EmptyT = True
  Witness SingleT == Witness SingleT = True
  Witness PairT == Witness PairT = True
  Witness LotsT == Witness LotsT = True
  _ == _ = False
Moreover, we can easily extract the tag of a NumCol.
numColTag :: NumCol -> Ex Tag
numColTag (n :& _) = Witness n
And that allows us to match your specification.
filter ((Witness PairT ==) . numColTag) :: [NumCol] -> [NumCol]
Which raises the question of whether your specification is actually what you need. The point is that detecting a tag entitles you to an expectation of that tag's stuff. The output type [NumCol] doesn't do justice to the fact that you know you have just the pairs.
How might you tighten the type of your function and still deliver it?
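One possible way to tighten it, offered as a sketch rather than as the answer the author had in mind: compare tags with a function that returns a type equality proof, so that a successful match lets the payload be returned at its precise type. The names sameTag, pick and select are hypothetical; the other definitions repeat the ones above so that the block stands alone.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeFamilies, TypeOperators #-}
import Data.Kind (Type)
import Data.Maybe (mapMaybe)
import Data.Type.Equality ((:~:) (Refl))

data TagType = EmptyTag | SingleTag | PairTag | LotsTag

data Tag :: TagType -> Type where
  EmptyT  :: Tag EmptyTag
  SingleT :: Tag SingleTag
  PairT   :: Tag PairTag
  LotsT   :: Tag LotsTag

type family Stuff (t :: TagType) :: Type where
  Stuff EmptyTag  = ()
  Stuff SingleTag = Int
  Stuff PairTag   = (Int, Int)
  Stuff LotsTag   = [Int]

data NumCol where
  (:&) :: Tag t -> Stuff t -> NumCol

-- Hypothetical helper: equal tags yield a proof that their indices agree.
sameTag :: Tag s -> Tag t -> Maybe (s :~: t)
sameTag EmptyT  EmptyT  = Just Refl
sameTag SingleT SingleT = Just Refl
sameTag PairT   PairT   = Just Refl
sameTag LotsT   LotsT   = Just Refl
sameTag _       _       = Nothing

-- Matching on Refl shows that the stored payload has type Stuff t.
pick :: Tag t -> NumCol -> Maybe (Stuff t)
pick want (have :& stuff) = case sameTag have want of
  Just Refl -> Just stuff
  Nothing   -> Nothing

-- Keep only the payloads for the requested tag, at their exact type.
select :: Tag t -> [NumCol] -> [Stuff t]
select want = mapMaybe (pick want)
```

With this, select PairT has type [NumCol] -> [(Int, Int)], so the caller no longer has to re-match on a constructor that is already known.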
One approach is to use DeriveDataTypeable and the Data.Data module. This approach relies on two autogenerated typeclass instances that carry metadata about the type for you: Typeable and Data. You can derive them with {-# LANGUAGE DeriveDataTypeable #-}:
data NumCol = Empty |
              Single Int |
              Pair Int Int |
              Lots [Int] deriving (Typeable, Data)
Now we have a toConstr function which, given a value, gives us a representation of its constructor:
toConstr :: Data a => a -> Constr
This makes it easy to compare two terms just by their constructors. The only remaining problem is that we need a value to compare against when we define our predicate! We can always just create a dummy value with undefined, but that's a bit ugly:
is_pair x = toConstr x == toConstr (Pair undefined undefined)
So the final thing we'll do is define a handy little class that automates this. The basic idea is to call toConstr on non-function values and recurse on any functions by first passing in undefined.
class Constrable a where
  constr :: a -> Constr
instance Data a => Constrable a where
  constr = toConstr
instance Constrable a => Constrable (b -> a) where
  constr f = constr (f undefined)
This relies on FlexibleInstances, OverlappingInstances and UndecidableInstances, so it might be a bit evil, but, using the (in)famous eyeball theorem, it should be fine. Unless you add more instances or try to use it with something that isn't a constructor. Then it might blow up. Violently. No promises.
Finally, with the evil neatly contained, we can write an "equal by constructor" operator:
(=|=) :: (Data a, Constrable b) => a -> b -> Bool
e =|= c = toConstr e == constr c
(The =|= operator is a bit of a mnemonic, because constructors are syntactically defined with a |.)
Now you can write almost exactly what you wanted!
filter (=|= Pair)
Also, maybe you'd want to turn off the monomorphism restriction. In fact, here's the list of extensions I enabled that you can just use:
{-# LANGUAGE DeriveDataTypeable, FlexibleInstances, NoMonomorphismRestriction, OverlappingInstances, UndecidableInstances #-}
Yeah, it's a lot. But that's what I'm willing to sacrifice for the cause. Of not writing extra undefineds.
Honestly, if you don't mind relying on lens (but boy is that dependency a doozy), you should just go with the prism approach. The only thing to recommend mine is that you get to use the amusingly named Data.Data.Data class.
