Anonymous records: what ways to type-level tag in Haskell? - haskell

I'm playing with lightweight anonymous record-alikes, more to explore the type theory for them than anything 'industrial strength'. I want the fields to be simply type-tagged.
myRec = (EmpId 54321, EmpName "Jo", EmpPhone "98-7654321") -- in which
newtype EmpPhone a = EmpPhone a -- and maybe
data EmpName a where EmpName :: IsString a => a -> EmpName a -- GADT
data EmpId a where EmpId :: Int -> EmpId Int -- GADT to same pattern
Although I could put newtype EmpId = EmpId Int, I want to follow the same pattern for all tags, so that I can go for example:
project (EmpId, EmpName) myRec -- use tags as field names
I'll also use StandaloneDeriving/DeriveAnyType to derive instance Eq, Show, Num etc.
Other possible designs
For the records, rather than Haskell tuples I could use HList or make my own data types Tuple0, Tuple1, Tuple2, .... I don't think that would affect the typing issues below.
For the tags/fields I could pair a Symbol (type-level String) as phantom type with the value -- for example CTRex does something like that. Then use TypeApplications to build fields.
data Tag (tag :: Symbol) a = Tag a
myRec = (Tag #"EmpId" 54321, ...)
That makes the field syntax (and projection list) rather 'noisy'; also prevents any validation that EmpIds are Int, etc.
Three related lines of questions on typing for these:
How best to prevent
sillyRec = (EmpId 65432, Just "not my tag", "or [] as constructor",
Right "or even worse" :: Either Int String)
I could declare a class, put my tags only in it (not too bad with DeriveAnyClass), put constraints everywhere. But my tags have a consistent structure: single data constructor named same as the type; single type parameter which is the only parameter to the data constructor.
How to express I want each record-alike to follow a consistent type pattern? That is prevent:
notaRec = (EmpId 76543, EmpName)
Bare EmpName is OK in a projection list, providing all the other fields are bare constructors. I want to say that notaRec is not well-Kinded, but bare EmpName is Kind * -> *, which is unifiable with *. So I mean more like: all fields in the record fit the same type pattern.
Then when I get to sets-of-records (aka tables/relations)
myTable = ( myRec, -- tuple of tuples
(EmpName "Kaz", EmpPhone 987654312, EmpId 87654),
EmpId 98765, EmpPhone "21-4365879", EmpName "Bo")
Putting the fields in a different order is OK because we have a tuple-of-tuples. But EmpPhone is at two different types in the two records. And the last line isn't a record at all: it's fields at the 'wrong' pattern. (Same mis-match as with bare EmpName in 2.)
Again I want to say these are ill-Kinded. My field tags are appearing at different 'depths' or in differing type patterns.
I guess I could get there with a great deal of hard-coding for valid instances/combos of types. Is there a more generic way?
EDIT: In response to comments. (Yes I'm mortal too. Thanks #duplode for figuring out the formatting.)
why not type Record = (EmpId Int, EmpName String, EmpPhone String)?
As a type synonym that's fine. But doesn't answer the question because I want it equivalent to any permutation of those tags. (I think I can verify that equivalence at type level using HList techniques.)
some sort of high-level overview of your objective [thank you David]
I want to treat the ( ... , ... , ... ) as a set. Because the Relational Database Model says relations are sets of 'tuples' [not Haskell tuples] and 'tuples' are sets of pairs of tag-value. I also want to treat the project function as having a first-class parameter which is a set of tags. (Contrast that in Codd's Relational Algebra, the π operator has its set of tags subscripted as if part of the operator.)
These couldn't be Haskell Sets because the elements are not the same type. I want to say the elements are the same Kind; and that a Haskell-tuple of same-Kinded elements represents a set-of that Kind. But I know that's abusing terminology. (The alternative design I considered using Symbol tags perhaps shows better there's a Kindiness aspect.)
If I can treat the Haskell tuples as set-ish, I can use well-known HList techniques to emulate the Relational Operators.
If this helps explain, I could do this with a lot of boilerplate:
class MyTag a -- type/kind-level predicate
deriving instance MyTag (EmpId Int) -- uses DeriveAnyClass
-- etc for all my tags
class WellKinded tup
instance WellKinded ()
instance {-# OVERLAPPING #-}
(MyTag (n1 a1), MyTag (n2 a2), MyTag (n3 a3))
=> WellKinded (n1 a1, n2 a2, n3 a3) -- and so on for every arity of tuple
instance {-# OVERLAPPABLE #-}
(MyTag (n1 a1), MyTag (n2 a2), MyTag (n3 a3))
=> WellKinded (a1 -> n1 a1, a2 -> n2 a2, a3 -> n3 a3)
All those instances for different arities are rapidly going to get tedious, so I could convert to HList; despatch an instance on the Kind of the first element; iterate down the list verifying all the same Kind.
For tuple-of-tuples, detect the Kind of the first element of the first sub-tuple; iterate both across and down. (Again needs OverlappingInstances: a tuple-of-tuples-of-tuples is still a tuple. This is what I mean by "a great deal of hard-coding" above.) It doesn't seem unachievable. But it does feel like going down the wrong rabbit-hole.

This is crazy enough it might just work. Pattern synonyms to the rescue:
newtype Label (n :: Symbol) (a :: *) = MkLab a -- newtype yay!
deriving (Eq, Ord, Show)
pattern EmpPhone x = MkLab x :: Label "EmpPhone" a
pattern EmpName x = MkLab x :: IsString a => Label "EmpName" a
pattern EmpId x = MkLab x :: Label "EmpId" Int
myRec = (EmpId 54321, EmpName "Jo", EmpPhone "98-7654321") -- works a treat
Then to answer the q's
To count as a record, all tuple elements must be of type Label s a.
To count as a projection list, all tuple elements must be of type a -> Label s a.
(That works, by the way.)
Those are the only types/kinds allowed in tuples-as-records.
So to parse a tuple-of-tuples at type level, I need only despatch on the type of the leftmost element.
I'm looking for type constructor Label.
All the rest I can do with HList-style type matching.
For those patterns I did need to switch on a swag of extensions:
{-# LANGUAGE PatternSynonyms,
KindSignatures, DataKinds,
ScopedTypeVariables, -- for the signatures on patterns
RankNTypes #-} -- for the signatures with contexts
import GHC.TypeLits -- for the Symbols

Here's a kinda answer or at least explanation for 2., 3.; a partial answer to 1.
How to express I want each record-alike to follow a consistent type pattern? That is prevent:
notaRec = (EmpId 76543, EmpName)
On the face of it EmpId 76543 matches type pattern (n a); whereas EmpName :: a -> (n a). But Hindley-Milner doesn't "match" simplistically like that, it uses unifiability. So all of these unify with (n a):
-- as `( n a )`
a -> (n a) -- as `( ((->) a) (n a) )`
(b, c) -- as `( (,) b ) c `
(b, c, d) -- as `( (,,) b c ) d ` -- etc for all larger Haskell tuples
[ a ], Maybe a -- as `( [] a )`, `( Maybe a )`
Either b c -- as `( (Either b) c )`
b -> (Either b c) -- as `( ((->) b) (Either b c) )` -- for example, bare `Left`
To disagree with myself on the abuse of terminology:
I want to say these are ill-Kinded. My field tags are appearing at different 'depths' ...
But I know that's abusing terminology.
Any type with a -> outermost constructor is at a different Kind vs one without. Either is at a different Kind vs EmpId, because it is different arity. Type unification builds the 'most general unifier', and that makes them appear same-Kinded.
For the purposes here we want the opposite of the mgu -- call it the 'maximally specific Kind', MaSK for short.
We can express it with a closed Type Family and lots of overlapping equations (so the order of them is critical). This can also catch the Prelude's constructors that shouldn't count:
type family MaSK ( a :: * ) where
-- presume the result is one from some pre-declared bunch of types
-- use that result to verify all 'elements' of a set are same-kinded
MaSK (_ -> _ _ _) = No -- e.g. bare `Left`
MaSK (_ -> [ _ ]) = No -- reject unwanted constructors
MaSK (_ -> Maybe _ ) = No -- ditto
MaSK (a' -> n a') = YesAsBareTag -- this we want
MaSK (_ -> _ _ ) = No --
MaSK (_ -> _ ) = No
MaSK ( _ , _ , _ , _ ) = YesAsSet -- etc for greater arities
MaSK ( _ , _ , _ ) = YesAsSet
MaSK ( _ , _ ) = YesAsSet
MaSK (_ _ _ ) = No -- too much arity, e.g. `Either b c`
MaSK [ _ ] = No -- reject unwanted constructors
MaSK (Maybe _) = No -- ditto
MaSK (n a) = YesAsTagValue -- this we want providing all the above eliminated
MaSK _ = No -- i.e. bare `Int, Bool, Char, ...`
Limitations: this approach can't check there's a single data constructor for the type, nor that other constructors for that type match the pattern, nor that the constructor is named same as the type, nor that the constructor might smuggle in existentially-quantified parameters. For that, go full metal generics.

Related

How to define a family of type dependent function in Haskell

This is a (series of) Haskell question(s). I am fairly new to Haskell.
Suppose we have a 4-tuple (a1,a2,a3,a4). How do we define a function, kth, that gives the k-th element in this tuple? Example,
kth (1,"A",'b',True) 3 = 'b'
If the types of a1, a2, a3, a4 are the same, then it has a fairly simple definition. For example, if they are all integers:
kth :: (Int,Int,Int,Int) -> Int -> Int
kth (a1,a2,a3,a4) 1 = a1
kth (a1,a2,a3,a4) 2 = a2
kth (a1,a2,a3,a4) 3 = a3
kth (a1,a2,a3,a4) 4 = a4
My suspicion of why this is not straightforward is because Haskell must know the type in advance. In the library function fst and snd, Haskell knows that the output type is the type of the first element for the formal, and the output type is the type of the second element for the latter. Hence, there is no ambiguity. In kth, the output type depends on the second input, hence Haskell cannot do type check based on the syntax.
Now, suppose we have a n-th tuple (a1,a2,...,an). Can we define a family of length functions such that
lengthTuple :: a -> Int
lengthTuple (a1,a2,...,an) = n
This kind of problem (dependent type) is still a headache in Haskell. The Tuple from Prelude is not quite suitable for this kind of task (perhaps doable though). But you can use the sized vector with dependent type for this kind of problem.
Example:
https://www.schoolofhaskell.com/user/konn/prove-your-haskell-for-great-safety/dependent-types-in-haskell
You can not implement your function if the index must be an Int, but you could if that were a custom "singleton" index type. Essentially, if we want to mimic dependent types, the best option we have is to pass singletons around a lot, to connect type-level values to term-level ones.
Here is one example:
{-# LANGUAGE GADTs, DataKinds, TypeFamilies #-}
{-# OPTIONS -Wall #-}
-- custom index type
data Ix = I1 | I2 | I3 | I4
-- associated singleton type (this could be autogenerated using the singletons library)
data SIx (ix :: Ix) where
SI1 :: SIx 'I1
SI2 :: SIx 'I2
SI3 :: SIx 'I3
SI4 :: SIx 'I4
-- type level function
type family T (ix :: Ix) a1 a2 a3 a4
type instance T 'I1 a1 _ _ _ = a1
type instance T 'I2 _ a2 _ _ = a2
type instance T 'I3 _ _ a3 _ = a3
type instance T 'I4 _ _ _ a4 = a4
-- our "dependent" tuple selector
choose :: (a1, a2, a3, a4) -> SIx ix -> T ix a1 a2 a3 a4
choose (x1, _, _, _) SI1 = x1
choose (_, x2, _, _) SI2 = x2
choose (_, _, x3, _) SI3 = x3
choose (_, _, _, x4) SI4 = x4
If wanted, we can "hide" the ix parameter of SIx ix and T ix a1 a2 a3 a4 using an existential wrapper (as a sort of dependent sum type), building a function that given "some index" returns "some component".
This would be much more convenient if we had real dependent types. Still, this is the price we currently pay to have type erasure at runtime. If Haskell one day adds non-erased pi a . ... types to the erased ones forall a . ... we have now, we will have much more control.
The short answer, as suggested in some comments, is that you shouldn't seriously be doing this in Haskell. If you find yourself needing to write functions that can operate on differently sized tuples, you're programming Haskell wrong.
However, the idiomatic method of defining a function like lengthTuple is to use type classes with explicit instances for different tuple sizes. If this is for a library, you pick some upper bound and write instances up to that size. A reasonable choice might be a 15-tuple, since that's also the largest tuple that has a Show instance:
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
(1,2,3,4,5,6,7,8,9,10,11,12,13,14,15)
> (1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16)
<interactive>:72:1: error:
• No instance for (Show ...
So, the definition for lengthTuple would look like this:
class Tuple a where lengthTuple :: a -> Int
instance Tuple (a,b) where lengthTuple _ = 2
instance Tuple (a,b,c) where lengthTuple _ = 3
instance Tuple (a,b,c,d) where lengthTuple _ = 4
...up to 15...
Tedious, but pretty standard.
Remarkably, it is possible to write lengthTuple without any boilerplate using Data.Data generics. These generics provide a way to fold over the structure of an algebraic data type in a fairly general way, and you can use a Const functor to ignore the actual content of the data type while counting the number of fields:
import Data.Data
import Data.Functor.Const
lengthTuple :: (Data a) => a -> Int
lengthTuple = getConst . gfoldl (\(Const n) _ -> Const (n+1))
(\_ -> Const 0)
This works fine, though there's no straightforward way to restrict it to tuples, and you may find its return value for non-tuples somewhat surprising:
> lengthTuple (1,2,3,4)
4
> lengthTuple ""
0
> lengthTuple "what the heck?"
2
Writing kth is much, much harder. Your intuition is right. Because the type of the expression kth tuple n depends on the value rather than the type of the parameter n, a simple definition isn't possible. The other answers have covered a couple of approaches.

What can type families do that multi param type classes and functional dependencies cannot

I have played around with TypeFamilies, FunctionalDependencies, and MultiParamTypeClasses. And it seems to me as though TypeFamilies doesn't add any concrete functionality over the other two. (But not vice versa). But I know type families are pretty well liked so I feel like I am missing something:
"open" relation between types, such as a conversion function, which does not seem possible with TypeFamilies. Done with MultiParamTypeClasses:
class Convert a b where
convert :: a -> b
instance Convert Foo Bar where
convert = foo2Bar
instance Convert Foo Baz where
convert = foo2Baz
instance Convert Bar Baz where
convert = bar2Baz
Surjective relation between types, such as a sort of type safe pseudo-duck typing mechanism, that would normally be done with a standard type family. Done with MultiParamTypeClasses and FunctionalDependencies:
class HasLength a b | a -> b where
getLength :: a -> b
instance HasLength [a] Int where
getLength = length
instance HasLength (Set a) Int where
getLength = S.size
instance HasLength Event DateDiff where
getLength = dateDiff (start event) (end event)
Bijective relation between types, such as for an unboxed container, which could be done through TypeFamilies with a data family, although then you have to declare a new data type for every contained type, such as with a newtype. Either that or with an injective type family, which I think is not available prior to GHC 8. Done with MultiParamTypeClasses and FunctionalDependencies:
class Unboxed a b | a -> b, b -> a where
toList :: a -> [b]
fromList :: [b] -> a
instance Unboxed FooVector Foo where
toList = fooVector2List
fromList = list2FooVector
instance Unboxed BarVector Bar where
toList = barVector2List
fromList = list2BarVector
And lastly a surjective relations between two types and a third type, such as python2 or java style division function, which can be done with TypeFamilies by also using MultiParamTypeClasses. Done with MultiParamTypeClasses and FunctionalDependencies:
class Divide a b c | a b -> c where
divide :: a -> b -> c
instance Divide Int Int Int where
divide = div
instance Divide Int Double Double where
divide = (/) . fromIntegral
instance Divide Double Int Double where
divide = (. fromIntegral) . (/)
instance Divide Double Double Double where
divide = (/)
One other thing I should also add is that it seems like FunctionalDependencies and MultiParamTypeClasses are also quite a bit more concise (for the examples above anyway) as you only have to write the type once, and you don't have to come up with a dummy type name which you then have to type for every instance like you do with TypeFamilies:
instance FooBar LongTypeName LongerTypeName where
FooBarResult LongTypeName LongerTypeName = LongestTypeName
fooBar = someFunction
vs:
instance FooBar LongTypeName LongerTypeName LongestTypeName where
fooBar = someFunction
So unless I am convinced otherwise it really seems like I should just not bother with TypeFamilies and use solely FunctionalDependencies and MultiParamTypeClasses. Because as far as I can tell it will make my code more concise, more consistent (one less extension to care about), and will also give me more flexibility such as with open type relationships or bijective relations (potentially the latter is solver by GHC 8).
Here's an example of where TypeFamilies really shines compared to MultiParamClasses with FunctionalDependencies. In fact, I challenge you to come up with an equivalent MultiParamClasses solution, even one that uses FlexibleInstances, OverlappingInstance, etc.
Consider the problem of type level substitution (I ran across a specific variant of this in Quipper in QData.hs). Essentially what you want to do is recursively substitute one type for another. For example, I want to be able to
substitute Int for Bool in Either [Int] String and get Either [Bool] String,
substitute [Int] for Bool in Either [Int] String and get Either Bool String,
substitute [Int] for [Bool] in Either [Int] String and get Either [Bool] String.
All in all, I want the usual notion of type level substitution. With a closed type family, I can do this for any types (albeit I need an extra line for each higher-kinded type constructor - I stopped at * -> * -> * -> * -> *).
{-# LANGUAGE TypeFamilies #-}
-- Subsitute type `x` for type `y` in type `a`
type family Substitute x y a where
Substitute x y x = y
Substitute x y (k a b c d) = k (Substitute x y a) (Substitute x y b) (Substitute x y c) (Substitute x y d)
Substitute x y (k a b c) = k (Substitute x y a) (Substitute x y b) (Substitute x y c)
Substitute x y (k a b) = k (Substitute x y a) (Substitute x y b)
Substitute x y (k a) = k (Substitute x y a)
Substitute x y a = a
And trying at ghci I get the desired output:
> :t undefined :: Substitute Int Bool (Either [Int] String)
undefined :: Either [Bool] [Char]
> :t undefined :: Substitute [Int] Bool (Either [Int] String)
undefined :: Either Bool [Char]
> :t undefined :: Substitute [Int] [Bool] (Either [Int] String)
undefined :: Either [Bool] [Char]
With that said, maybe you should be asking yourself why am I using MultiParamClasses and not TypeFamilies. Of the examples you gave above, all except Convert translate to type families (albeit you will need an extra line per instance for the type declaration).
Then again, for Convert, I am not convinced it is a good idea to define such a thing. The natural extension to Convert would be instances such as
instance (Convert a b, Convert b c) => Convert a c where
convert = convert . convert
instance Convert a a where
convert = id
which are as unresolvable for GHC as they are elegant to write...
To be clear, I am not saying there are no uses of MultiParamClasses, just that when possible you should be using TypeFamilies - they let you think about type-level functions instead of just relations.
This old HaskellWiki page does an OK job of comparing the two.
EDIT
Some more contrasting and history I stumbled upon from augustss blog
Type families grew out of the need to have type classes with
associated types. The latter is not strictly necessary since it can be
emulated with multi-parameter type classes, but it gives a much nicer
notation in many cases. The same is true for type families; they can
also be emulated by multi-parameter type classes. But MPTC gives a
very logic programming style of doing type computation; whereas type
families (which are just type functions that can pattern match on the
arguments) is like functional programming.
Using closed type families
adds some extra strength that cannot be achieved by type classes. To
get the same power from type classes we would need to add closed type
classes. Which would be quite useful; this is what instance chains
gives you.
Functional dependencies only affect the process of constraint solving, while type families introduced the notion of non-syntactic type equality, represented in GHC's intermediate form by coercions. This means type families interact better with GADTs. See this question for the canonical example of how functional dependencies fail here.

What's a better way of managing large Haskell records?

Replacing fields names with letters, I have cases like this:
data Foo = Foo { a :: Maybe ...
, b :: [...]
, c :: Maybe ...
, ... for a lot more fields ...
} deriving (Show, Eq, Ord)
instance Writer Foo where
write x = maybeWrite a ++
listWrite b ++
maybeWrite c ++
... for a lot more fields ...
parser = permute (Foo
<$?> (Nothing, Just `liftM` aParser)
<|?> ([], bParser)
<|?> (Nothing, Just `liftM` cParser)
... for a lot more fields ...
-- this is particularly hideous
foldl1 merge [foo1, foo2, ...]
merge (Foo a b c ...seriously a lot more...)
(Foo a' b' c' ...) =
Foo (max a a') (b ++ b') (max c c') ...
What techniques would allow me to better manage this growth?
In a perfect world a, b, and c would all be the same type so I could keep them in a list, but they can be many different types. I'm particularly interested in any way to fold the records without needing the massive patterns.
I'm using this large record to hold the different types resulting from permutation parsing the vCard format.
Update
I've implemented both the generics and the foldl approaches suggested below. They both work, and they both reduce three large field lists to one.
Datatype-generic programming techniques can be used to transform all the fields of a record in some "uniform" sort of way.
Perhaps all the fields in the record implement some typeclass that we want to use (the typical example is Show). Or perhaps we have another record of "similar" shape that contains functions, and we want to apply each function to the corresponding field of the original record.
For these kinds of uses, the generics-sop library is a good option. It expands the default Generics functionality of GHC with extra type-level machinery that provides analogues of functions like sequence or ap, but which work over all the fields of a record.
Using generics-sop, I tried to create a slightly less verbose version of your merge funtion. Some preliminary imports:
{-# language TypeOperators #-}
{-# language DeriveGeneric #-}
{-# language TypeFamilies #-}
{-# language DataKinds #-}
import Control.Applicative (liftA2)
import qualified GHC.Generics as GHC
import Generics.SOP
A helper function that lifts a binary operation to a form useable by the functions of generics-sop:
fn_2' :: (a -> a -> a) -> (I -.-> (I -.-> I)) a -- I is simply an Identity functor
fn_2' = fn_2 . liftA2
A general merge function that takes a vector of operators and works on any single-constructor record that derives Generic:
merge :: (Generic a, Code a ~ '[ xs ]) => NP (I -.-> (I -.-> I)) xs -> a -> a -> a
merge funcs reg1 reg2 =
case (from reg1, from reg2) of
(SOP (Z np1), SOP (Z np2)) ->
let npResult = funcs `hap` np1 `hap` np2
in to (SOP (Z npResult))
Code is a type family that returns a type-level list of lists describing the structure of a datatype. The outer list is for constructors, the inner lists contain the types of the fields for each constructor.
The Code a ~ '[ xs ] part of the constraint says "the datatype can only have one constructor" by requiring the outer list to have exactly one element.
The (SOP (Z _) pattern matches extract the (heterogeneus) vector of field values from the record's generic representation. SOP stands for "sum-of-products".
A concrete example:
data Person = Person
{
name :: String
, age :: Int
} deriving (Show,GHC.Generic)
instance Generic Person -- this Generic is from generics-sop
mergePerson :: Person -> Person -> Person
mergePerson = merge (fn_2' (++) :* fn_2' (+) :* Nil)
The Nil and :* constructors are used to build the vector of operators (the type is called NP, from n-ary product). If the vector doesn't match the number of fields in the record, the program won't compile.
Update. Given that the types in your record are highly uniform, an alternative way of creating the vector of operations is to define instances of an auxiliary typeclass for each field type, and then use the hcpure function:
class Mergeable a where
mergeFunc :: a -> a -> a
instance Mergeable String where
mergeFunc = (++)
instance Mergeable Int where
mergeFunc = (+)
mergePerson :: Person -> Person -> Person
mergePerson = merge (hcpure (Proxy :: Proxy Mergeable) (fn_2' mergeFunc))
The hcliftA2 function (that combines hcpure, fn_2 and hap) could be used to simplify things further.
Some suggestions:
(1) You can use the RecordWildCards extension to automatically
unpack a record into variables. Doesn't help if you need to unpack
two records of the same type, but it's a useful to keep in mind.
Oliver Charles has a nice blog post on it: (link)
(2) It appears your example application is performing a fold over the records.
Have a look at Gabriel Gonzalez's foldl package. There is also a blog post: (link)
Here is a example of how you might use it with a record like:
data Foo = Foo { _a :: Int, _b :: String }
The following code computes the maximum of the _a fields and the
concatenation of the _b_ fields.
import qualified Control.Foldl as L
import Data.Profunctor
data Foo = Foo { _a :: Int, _b :: String }
deriving (Show)
fold_a :: L.Fold Foo Int
fold_a = lmap _a (L.Fold max 0 id)
fold_b :: L.Fold Foo String
fold_b = lmap _b (L.Fold (++) "" id)
fold_foos :: L.Fold Foo Foo
fold_foos = Foo <$> fold_a <*> fold_b
theFoos = [ Foo 1 "a", Foo 3 "b", Foo 2 "c" ]
test = L.fold fold_foos theFoos
Note the use of the Profunctor function lmap to extract out
the fields we want to fold over. The expression:
L.Fold max 0 id
is a fold over a list of Ints (or any Num instance), and therefore:
lmap _a (L.Fold max 0 id)
is the same fold but over a list of Foo records where we use _a
to produce the Ints.

Test if a value matches a constructor

Say I have a data type like so:
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int]
Now I wish to filter out the elements matching a given constructor from a [NumCol]. I can write it for, say, Pair:
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter is_pair
where is_pair (Pair _ _) = True
is_pair _ = False
This works, but it's not generic. I have to write a separate function for is_single, is_lots, etc.
I wish instead I could write:
get_pairs = filter (== Pair)
But this only works for type constructors that take no arguments (i.e. Empty).
So the question is, how can I write a function that takes a value and a constructor, and returns whether the value matches the constructor?
At least get_pairs itself can be defined relatively simply by using a list comprehension to filter instead:
get_pairs xs = [x | x#Pair {} <- xs]
For a more general solution of matching constructors, you can use prisms from the lens package:
{-# LANGUAGE TemplateHaskell #-}
import Control.Lens
import Control.Lens.Extras (is)
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int]
-- Uses Template Haskell to create the Prisms _Empty, _Single, _Pair and _Lots
-- corresponding to your constructors
makePrisms ''NumCol
get_pairs :: [NumCol] -> [NumCol]
get_pairs = filter (is _Pair)
Tags of tagged unions ought to be first-class values, and with a wee bit of effort, they are.
Jiggery-pokery alert:
{-# LANGUAGE GADTs, DataKinds, KindSignatures,
TypeFamilies, PolyKinds, FlexibleInstances,
PatternSynonyms
#-}
Step one: define type-level versions of the tags.
data TagType = EmptyTag | SingleTag | PairTag | LotsTag
Step two: define value-level witnesses for the representability of the type-level tags. Richard Eisenberg's Singletons library will do this for you. I mean something like this:
data Tag :: TagType -> * where
EmptyT :: Tag EmptyTag
SingleT :: Tag SingleTag
PairT :: Tag PairTag
LotsT :: Tag LotsTag
And now we can say what stuff we expect to find associated with a given tag.
type family Stuff (t :: TagType) :: * where
Stuff EmptyTag = ()
Stuff SingleTag = Int
Stuff PairTag = (Int, Int)
Stuff LotsTag = [Int]
So we can refactor the type you first thought of
data NumCol :: * where
(:&) :: Tag t -> Stuff t -> NumCol
and use PatternSynonyms to recover the behaviour you had in mind:
pattern Empty = EmptyT :& ()
pattern Single i = SingleT :& i
pattern Pair i j = PairT :& (i, j)
pattern Lots is = LotsT :& is
So what's happened is that each constructor for NumCol has turned into a tag indexed by the kind of tag it's for. That is, constructor tags now live separately from the rest of the data, synchronized by a common index which ensures that the stuff associated with a tag matches the tag itself.
But we can talk about tags alone.
data Ex :: (k -> *) -> * where -- wish I could say newtype here
Witness :: p x -> Ex p
Now, Ex Tag, is the type of "runtime tags with a type level counterpart". It has an Eq instance
instance Eq (Ex Tag) where
Witness EmptyT == Witness EmptyT = True
Witness SingleT == Witness SingleT = True
Witness PairT == Witness PairT = True
Witness LotsT == Witness LotsT = True
_ == _ = False
Moreover, we can easily extract the tag of a NumCol.
numColTag :: NumCol -> Ex Tag
numColTag (n :& _) = Witness n
And that allows us to match your specification.
filter ((Witness PairT ==) . numColTag) :: [NumCol] -> [NumCol]
Which raises the question of whether your specification is actually what you need. The point is that detecting a tag entitles you an expectation of that tag's stuff. The output type [NumCol] doesn't do justice to the fact that you know you have just the pairs.
How might you tighten the type of your function and still deliver it?
One approach is to use DataTypeable and the Data.Data module. This approach relies on two autogenerated typeclass instances that carry metadata about the type for you: Typeable and Data. You can derive them with {-# LANGUAGE DeriveDataTypeable #-}:
data NumCol = Empty |
Single Int |
Pair Int Int |
Lots [Int] deriving (Typeable, Data)
Now we have a toConstr function which, given a value, gives us a representation of its constructor:
toConstr :: Data a => a -> Constr
This makes it easy to compare two terms just by their constructors. The only remaining problem is that we need a value to compare against when we define our predicate! We can always just create a dummy value with undefined, but that's a bit ugly:
is_pair x = toConstr x == toConstr (Pair undefined undefined)
So the final thing we'll do is define a handy little class that automates this. The basic idea is to call toConstr on non-function values and recurse on any functions by first passing in undefined.
class Constrable a where
constr :: a -> Constr
instance Data a => Constrable a where
constr = toConstr
instance Constrable a => Constrable (b -> a) where
constr f = constr (f undefined)
This relies on FlexibleInstance, OverlappingInstances and UndecidableInstances, so it might be a bit evil, but, using the (in)famous eyeball theorem, it should be fine. Unless you add more instances or try to use it with something that isn't a constructor. Then it might blow up. Violently. No promises.
Finally, with the evil neatly contained, we can write an "equal by constructor" operator:
(=|=) :: (Data a, Constrable b) => a -> b -> Bool
e =|= c = toConstr e == constr c
(The =|= operator is a bit of a mnemonic, because constructors are syntactically defined with a |.)
Now you can write almost exactly what you wanted!
filter (=|= Pair)
Also, maybe you'd want to turn off the monomorphism restriction. In fact, here's the list of extensions I enabled that you can just use:
{-# LANGUAGE DeriveDataTypeable, FlexibleInstances, NoMonomorphismRestriction, OverlappingInstances, UndecidableInstances #-}
Yeah, it's a lot. But that's what I'm willing to sacrifice for the cause. Of not writing extra undefineds.
Honestly, if you don't mind relying on lens (but boy is that dependency a doozy), you should just go with the prism approach. The only thing to recommend mine is that you get to use the amusingly named Data.Data.Data class.

Haskell get type of algebraic parameter

I have a type
class IntegerAsType a where
value :: a -> Integer
data T5
instance IntegerAsType T5 where value _ = 5
newtype (IntegerAsType q) => Zq q = Zq Integer deriving (Eq)
newtype (Num a, IntegerAsType n) => PolyRing a n = PolyRing [a]
I'm trying to make a nice "show" for the PolyRing type. In particular, I want the "show" to print out the type 'a'. Is there a function that returns the type of an algebraic parameter (a 'show' for types)?
The other way I'm trying to do it is using pattern matching, but I'm running into problems with built-in types and the algebraic type.
I want a different result for each of Integer, Int and Zq q.
(toy example:)
test :: (Num a, IntegerAsType q) => a -> a
(Int x) = x+1
(Integer x) = x+2
(Zq x) = x+3
There are at least two different problems here.
1) Int and Integer are not data constructors for the 'Int' and 'Integer' types. Are there data constructors for these types/how do I pattern match with them?
2) Although not shown in my code, Zq IS an instance of Num. The problem I'm getting is:
Ambiguous constraint `IntegerAsType q'
At least one of the forall'd type variables mentioned by the constraint
must be reachable from the type after the '=>'
In the type signature for `test':
test :: (Num a, IntegerAsType q) => a -> a
I kind of see why it is complaining, but I don't know how to get around that.
Thanks
EDIT:
A better example of what I'm trying to do with the test function:
test :: (Num a) => a -> a
test (Integer x) = x+2
test (Int x) = x+1
test (Zq x) = x
Even if we ignore the fact that I can't construct Integers and Ints this way (still want to know how!) this 'test' doesn't compile because:
Could not deduce (a ~ Zq t0) from the context (Num a)
My next try at this function was with the type signature:
test :: (Num a, IntegerAsType q) => a -> a
which leads to the new error
Ambiguous constraint `IntegerAsType q'
At least one of the forall'd type variables mentioned by the constraint
must be reachable from the type after the '=>'
I hope that makes my question a little clearer....
I'm not sure what you're driving at with that test function, but you can do something like this if you like:
{-# LANGUAGE ScopedTypeVariables #-}
class NamedType a where
name :: a -> String
instance NamedType Int where
name _ = "Int"
instance NamedType Integer where
name _ = "Integer"
instance NamedType q => NamedType (Zq q) where
name _ = "Zq (" ++ name (undefined :: q) ++ ")"
I would not be doing my Stack Overflow duty if I did not follow up this answer with a warning: what you are asking for is very, very strange. You are probably doing something in a very unidiomatic way, and will be fighting the language the whole way. I strongly recommend that your next question be a much broader design question, so that we can help guide you to a more idiomatic solution.
Edit
There is another half to your question, namely, how to write a test function that "pattern matches" on the input to check whether it's an Int, an Integer, a Zq type, etc. You provide this suggestive code snippet:
test :: (Num a) => a -> a
test (Integer x) = x+2
test (Int x) = x+1
test (Zq x) = x
There are a couple of things to clear up here.
Haskell has three levels of objects: the value level, the type level, and the kind level. Some examples of things at the value level include "Hello, world!", 42, the function \a -> a, or fix (\xs -> 0:1:zipWith (+) xs (tail xs)). Some examples of things at the type level include Bool, Int, Maybe, Maybe Int, and Monad m => m (). Some examples of things at the kind level include * and (* -> *) -> *.
The levels are in order; value level objects are classified by type level objects, and type level objects are classified by kind level objects. We write the classification relationship using ::, so for example, 32 :: Int or "Hello, world!" :: [Char]. (The kind level isn't too interesting for this discussion, but * classifies types, and arrow kinds classify type constructors. For example, Int :: * and [Int] :: *, but [] :: * -> *.)
Now, one of the most basic properties of Haskell is that each level is completely isolated. You will never see a string like "Hello, world!" in a type; similarly, value-level objects don't pass around or operate on types. Moreover, there are separate namespaces for values and types. Take the example of Maybe:
data Maybe a = Nothing | Just a
This declaration creates a new name Maybe :: * -> * at the type level, and two new names Nothing :: Maybe a and Just :: a -> Maybe a at the value level. One common pattern is to use the same name for a type constructor and for its value constructor, if there's only one; for example, you might see
newtype Wrapped a = Wrapped a
which declares a new name Wrapped :: * -> * at the type level, and simultaneously declares a distinct name Wrapped :: a -> Wrapped a at the value level. Some particularly common (and confusing examples) include (), which is both a value-level object (of type ()) and a type-level object (of kind *), and [], which is both a value-level object (of type [a]) and a type-level object (of kind * -> *). Note that the fact that the value-level and type-level objects happen to be spelled the same in your source is just a coincidence! If you wanted to confuse your readers, you could perfectly well write
newtype Huey a = Louie a
newtype Louie a = Dewey a
newtype Dewey a = Huey a
where none of these three declarations are related to each other at all!
Now, we can finally tackle what goes wrong with test above: Integer and Int are not value constructors, so they can't be used in patterns. Remember -- the value level and type level are isolated, so you can't put type names in value definitions! By now, you might wish you had written test' instead:
test' :: Num a => a -> a
test' (x :: Integer) = x + 2
test' (x :: Int) = x + 1
test' (Zq x :: Zq a) = x
...but alas, it doesn't quite work like that. Value-level things aren't allowed to depend on type-level things. What you can do is to write separate functions at each of the Int, Integer, and Zq a types:
testInteger :: Integer -> Integer
testInteger x = x + 2
testInt :: Int -> Int
testInt x = x + 1
testZq :: Num a => Zq a -> Zq a
testZq (Zq x) = Zq x
Then we can call the appropriate one of these functions when we want to do a test. Since we're in a statically-typed language, exactly one of these functions is going to be applicable to any particular variable.
Now, it's a bit onerous to remember to call the right function, so Haskell offers a slight convenience: you can let the compiler choose one of these functions for you at compile time. This mechanism is the big idea behind classes. It looks like this:
class Testable a where test :: a -> a
instance Testable Integer where test = testInteger
instance Testable Int where test = testInt
instance Num a => Testable (Zq a) where test = testZq
Now, it looks like there's a single function called test which can handle any of Int, Integer, or numeric Zq's -- but in fact there are three functions, and the compiler is transparently choosing one for you. And that's an important insight. The type of test:
test :: Testable a => a -> a
...looks at first blush like it is a function that takes a value that could be any Testable type. But in fact, it's a function that can be specialized to any Testable type -- and then only takes values of that type! This difference explains yet another reason the original test function didn't work. You can't have multiple patterns with variables at different types, because the function only ever works on a single type at a time.
The ideas behind the classes NamedType and Testable above can be generalized a bit; if you do, you get the Typeable class suggested by hammar above.
I think now I've rambled more than enough, and likely confused more things than I've clarified, but leave me a comment saying which parts were unclear, and I'll do my best.
Is there a function that returns the type of an algebraic parameter (a 'show' for types)?
I think Data.Typeable may be what you're looking for.
Prelude> :m + Data.Typeable
Prelude Data.Typeable> typeOf (1 :: Int)
Int
Prelude Data.Typeable> typeOf (1 :: Integer)
Integer
Note that this will not work on any type, just those which have a Typeable instance.
Using the extension DeriveDataTypeable, you can have the compiler automatically derive these for your own types:
{-# LANGUAGE DeriveDataTypeable #-}
import Data.Typeable
data Foo = Bar
deriving Typeable
*Main> typeOf Bar
Main.Foo
I didn't quite get what you're trying to do in the second half of your question, but hopefully this should be of some help.

Resources