Creating instance of custom class - haskell

I seem to get an error message when trying to create an instance of IntegerGraph. I do not understand the error message. Any tips?
Here's the code:
import Data.Map (Map)
import qualified Data.Map as Map
import Data.Set (Set)
import qualified Data.Set as Set
class IntGraph g where
emptyG :: g
... (more constructors)
type MyGraph n = (Map n (Set n))
instance IntGraph MyGraph where
Error message:
• Expecting one more argument to ‘MyGraph’
Expected a type, but ‘MyGraph’ has kind ‘* -> *’
• In the first argument of ‘IntGraph’, namely ‘MyGraph’
In the instance declaration for ‘IntGraph MyGraph

In Haskell, just as term-level expressions have types, type-level expressions have "kinds". Type expressions that have kind *, like Bool, Int, Maybe [Char], and MyGraph Char represent types that have associated values of the type. For example, you can construct a MyGraph Char, like Map.fromList [('a', Set.fromList ['b','c'])] :: MyGraph Char, which is how we know MyGraph Char must be of kind *.
In constrast, type-level expressions that have kind * -> *, like MyGraph and Maybe have no associated values. There is no value of type Maybe, for example. Instead, these type-level expressions must first be "applied" to a type of kind * to produce another type-level expression of kind *, representing a type that can have values. For example, if we apply Maybe of kind * -> * to type [Char] of kind *, we get a new type Maybe [Char] of kind *.
If this is all new to you, you might find this old Computer Science Stack Exchange answer of mine helpful.
Anyway, your type class IntGraph can only have instances IntGraph g when g is a type expression of kind *. We can tell this is the case because your definition of class IntGraph g includes a method emptyG that returns a value of type g, meaning that g must be of kind *, the kind of type expressions representing types that have values.
When you try to define an instance instance IntGraph MyGraph, you are trying to define an instance for a type expression MyGraph of kind * -> *, and GHC doesn't like this. It tells you that in the expression instance IntGraph MyGraph, it was expecting MyGraph to be a "type" (technically, a type expression of kind *), but it instead discovered that MyGraph was of kind * -> *.
You can fix this by defining instances for specific MyGraphs:
instance IntGraph (MyGraph Char) where
or a single instance for all possible MyGraph as, but you do need to include that type variable a in the instance declaration, like so:
instance IntGraph (MyGraph a) where

Related

Clarification of Terms around Haskell Type system

Type system in haskell seem to be very Important and I wanted to clarify some terms revolving around haskell type system.
Some type classes
Functor
Applicative
Monad
After using :info I found that Functor is a type class, Applicative is a type class with => (deriving?) Functor and Monad deriving Applicative type class.
I've read that Maybe is a Monad, does that mean Maybe is also Applicative and Functor?
-> operator
When i define a type
data Maybe = Just a | Nothing
and check :t Just I get Just :: a -> Maybe a. How to read this -> operator?
It confuses me with the function where a -> b means it evaluates a to b (sort of returns a maybe) – I tend to think lhs to rhs association but it turns when defining types?
The term type is used in ambiguous ways, Type, Type Class, Type Constructor, Concrete Type etc... I would like to know what they mean to be exact
Indeed the word “type” is used in somewhat ambiguous ways.
The perhaps most practical way to look at it is that a type is just a set of values. For example, Bool is the finite set containing the values True and False.Mathematically, there are subtle differences between the concepts of set and type, but they aren't really important for a programmer to worry about. But you should in general consider the sets to be infinite, for example Integer contains arbitrarily big numbers.
The most obvious way to define a type is with a data declaration, which in the simplest case just lists all the values:
data Colour = Red | Green | Blue
There we have a type which, as a set, contains three values.
Concrete type is basically what we say to make it clear that we mean the above: a particular type that corresponds to a set of values. Bool is a concrete type, that can easily be understood as a data definition, but also String, Maybe Integer and Double -> IO String are concrete types, though they don't correspond to any single data declaration.
What a concrete type can't have is type variables†, nor can it be an incompletely applied type constructor. For example, Maybe is not a concrete type.
So what is a type constructor? It's the type-level analogue to value constructors. What we mean mathematically by “constructor” in Haskell is an injective function, i.e. a function f where if you're given f(x) you can clearly identify what was x. Furthermore, any different constructors are assumed to have disjoint ranges, which means you can also identify f.‡
Just is an example of a value constructor, but it complicates the discussion that it also has a type parameter. Let's consider a simplified version:
data MaybeInt = JustI Int | NothingI
Now we have
JustI :: Int -> MaybeInt
That's how JustI is a function. Like any function of the same signature, it can be applied to argument values of the right type, like, you can write JustI 5.What it means for this function to be injective is that I can define a variable, say,
quoxy :: MaybeInt
quoxy = JustI 9328
and then I can pattern match with the JustI constructor:
> case quoxy of { JustI n -> print n }
9328
This would not be possible with a general function of the same signature:
foo :: Int -> MaybeInt
foo i = JustI $ negate i
> case quoxy of { foo n -> print n }
<interactive>:5:17: error: Parse error in pattern: foo
Note that constructors can be nullary, in which case the injective property is meaningless because there is no contained data / arguments of the injective function. Nothing and True are examples of nullary constructors.
Type constructors are the same idea as value constructors: type-level functions that can be pattern-matched. Any type-name defined with data is a type constructor, for example Bool, Colour and Maybe are all type constructors. Bool and Colour are nullary, but Maybe is a unary type constructor: it takes a type argument and only the result is then a concrete type.
So unlike value-level functions, type-level functions are kind of by default type constructors. There are also type-level functions that aren't constructors, but they require -XTypeFamilies.
A type class may be understood as a set of types, in the same vein as a type can be seen as a set of values. This is not quite accurate, it's closer to true to say a class is a set of type constructors but again it's not as useful to ponder the mathematical details – better to look at examples.
There are two main differences between type-as-set-of-values and class-as-set-of-types:
How you define the “elements”: when writing a data declaration, you need to immediately describe what values are allowed. By contrast, a class is defined “empty”, and then the instances are defined later on, possibly in a different module.
How the elements are used. A data type basically enumerates all the values so they can be identified again. Classes meanwhile aren't generally concerned with identifying types, rather they specify properties that the element-types fulfill. These properties come in the form of methods of a class. For example, the instances of the Num class are types that have the property that you can add elements together.
You could say, Haskell is statically typed on the value level (fixed sets of values in each type), but duck-typed on the type level (classes just require that somebody somewhere implements the necessary methods).
A simplified version of the Num example:
class Num a where
(+) :: a -> a -> a
instance Num Int where
0 + x = x
x + y = ...
If the + operator weren't already defined in the prelude, you would now be able to use it with Int numbers. Then later on, perhaps in a different module, you could also make it usable with new, custom number types:
data MyNumberType = BinDigits [Bool]
instance Num MyNumberType where
BinDigits [] + BinDigits l = BinDigits l
BinDigits (False:ds) + BinDigits (False:es)
= BinDigits (False : ...)
Unlike Num, the Functor...Monad type classes are not classes of types, but of 1-ary type constructors. I.e. every functor is a type constructor taking one argument to make it a concrete type. For instance, recall that Maybe is a 1-ary type constructor.
class Functor f where
fmap :: (a->b) -> f a -> f b
instance Functor Maybe where
fmap f (Just a) = Just (f a)
fmap _ Nothing = Nothing
As you have concluded yourself, Applicative is a subclass of Functor. D being a subclass of C means basically that D is a subset of the set of type constructors in C. Therefore, yes, if Maybe is an instance of Monad it also is an instance of Functor.
†That's not quite true: if you consider the _universal quantor_ explicitly as part of the type, then a concrete type can contain variables. This is a bit of an advanced subject though.
‡This is not guaranteed to be true if the -XPatternSynonyms extension is used.

Cannot implement Foldable instance due to wrong kind

I am learning haskell and trying to make a pretty print program. At some point I want to get the length of a row (i.e. number of columns in that row). To be able to do that on my datatype I understand I have to implement Foldable, which relies on Monoid.
Previously my row was just a type alias for a list but for sake of learning I want to make this move
import System.IO
import System.Directory
import Control.Monad
import Data.Maybe
import Data.Monoid
import Data.Foldable
import Data.Functor
import Data.List.Split
type Field = String
data Row = Row [Field]
instance Monoid Row where
mempty = Row []
instance Foldable Row where
foldMap f (Row fs) = foldMap f fs
But I get the following compiler error (on ghci 8.0.2)
main.hs:20:19: error:
• Expected kind ‘* -> *’, but ‘Row’ has kind ‘*’
• In the first argument of ‘Foldable’, namely ‘Row’
In the instance declaration for ‘Foldable Row’
Now I am not familiar with what the kind of a datatype is. I was expecting this to simply defer to Row's only property of type List
When we have Foldable T, T must be a parametric type, i.e. we must be able to form types T Int, T String, etc.
In Haskell we write T :: * -> * for "a type parametrized over a type", since it resembles a function from types to types. The syntax * -> * is called the kind of T.
In your case, Row is not parametrized, it is a plain type, something of kind *, not * -> *. So, Foldable Row is a kind error. In a sense, a foldable must be a generic list-like container, not one that only carries Field as in your case.
You could instead define data Row a = Row [a], and use Row Field when you need that specific case.
Alternatively, you could try MonoFoldable Row from the mono-traversable package, but note that this is a more advanced option, involving type families. Do not take this path lightly before considering its consequences. It ultimately boils down to why you need a Foldable instance.
what is the "kind" of a datatype?
Types "of kind *" are types of things that can appear in a Haskell program.
Example: Int.
Not an example: Maybe.
a :: Int ; a = 1 can appear in a Haskell program but b :: Maybe ; b = Just 1 can't. It must be b :: Maybe Int ; b = Just 1, to be valid for it to appear in a Haskell program.
What is Maybe Int? It is a type of kind * just as Int is. So what is Maybe? It is a type of kind * -> *. Because Maybe Maybe is also invalid.
The type t appearing after the Maybe must itself be of kind * for Maybe t to be of kind *. Thus the kind of Maybe is * -> *.
Haskell now calls the kind * by new name, Type. Calling it Thing or something could have been more intuitive.

Which is a polymorphic type: a type or a set of types?

Programming in Haskell by Hutton says:
A type that contains one or more type variables is called polymorphic.
Which is a polymorphic type: a type or a set of types?
Is a polymorphic type with a concrete type substituting its type variable a type?
Is a polymorphic type with different concrete types substituting its type variable considered the same or different types?
Is a polymorphic type with a concrete type substituting its type variable a type?
That's the point, yes. However, you need to be careful. Consider:
id :: a -> a
That's polymorphic. You can substitute a := Int and get Int -> Int, and a := Float -> Float and get (Float -> Float) -> Float -> Float. However, you cannot say a := Maybe and get id :: Maybe -> Maybe. That just doesn't make sense. Instead, we have to require that you can only substitute concrete types like Int and Maybe Float for a, not abstract ones like Maybe. This is handled with the kind system. This is not too important for your question, so I'll just summarize. Int and Float and Maybe Float are all concrete types (that is, they have values), so we say that they have type Type (the type of a type is often called its kind). Maybe is a function that takes a concrete type as an argument and returns a new concrete type, so we say Maybe :: Type -> Type. In the type a -> a, we say the type variable a must have type Type, so now the substitutions a := Int, a := String, etc. are allowed, while stuff like a := Maybe isn't.
Is a polymorphic type with different concrete types substituting its type variable considered the same or different types?
No. Back to a -> a: a := Int gives Int -> Int, but a := Float gives Float -> Float. Not the same.
Which is a polymorphic type: a type or a set of types?
Now that's a loaded question. You can skip to the TL;DR at the end, but the question of "what is a polymorphic type" is actually really confusing in Haskell, so here's a wall of text.
There are two ways to see it. Haskell started with one, then moved to the other, and now we have a ton of old literature referring to the old way, so the syntax of the modern system tries to maintain compatibility. It's a bit of a hot mess. Consider
id x = x
What is the type of id? One point of view is that id :: Int -> Int, and also id :: Float -> Float, and also id :: (Int -> Int) -> Int -> Int, ad infinitum, all simultaneously. This infinite family of types can be summed up with one polymorphic type, id :: a -> a. This point of view gives you the Hindley-Milner type system. This is not how modern GHC Haskell works, but this system is what Haskell was based on at its creation.
In Hindley-Milner, there is a hard line between polymorphic types and monomorphic types, and the union of these two groups gives you "types" in general. It's not really fair to say that, in HM, polymorphic types (in HM jargon, "polytypes") are types. You can't take polytypes as arguments, or return them from functions, or place them in a list. Instead, polytypes are only templates for monotypes. If you squint, in HM, a polymorphic type can be seen as a set of those monotypes that fit the schema.
Modern Haskell is built on System F (plus extensions). In System F,
id = \x -> x -- rewriting the example
is not a complete definition. Therefore we can't even think about giving it a type. Every lambda-bound variable needs a type annotation, but x has no annotation. Worse, we can't even decide on one: \(x :: Int) -> x is just as good as \(x :: Float) -> x. In System F, what we do is we write
id = /\(a :: Type) -> \(x :: a) -> x
using /\ to represent Λ (upper-case lambda) much as we use \ to represent λ.
id is a function taking two arguments. The first argument is a Type, named a. The second argument is an a. The result is also an a. The type signature is:
id :: forall (a :: Type). a -> a
forall is a new kind of function arrow, basically. Note that it provides a binder for a. In HM, when we said id :: a -> a, we didn't really define what a was. It was a fresh, global variable. By convention, more than anything else, that variable is not used anywhere else (otherwise the Generalization rule doesn't apply and everything breaks down). If I had written e.g. inject :: a -> Maybe a, afterwards, the textual occurrences of a would be referring to a new global entity, different from the one in id. In System F, the a in forall a. a -> a actually has scope. It's a "local variable" available only for use underneath that forall. The a in inject :: forall a. a -> Maybe a may or may not be the "same" a; it doesn't matter, because we have actual scoping rules that keep everything from falling apart.
Because System F has hygienic scoping rules for type variables, polymorphic types are allowed to do everything other types can do. You can take them as arguments
runCont :: forall (a :: Type). (forall (r :: Type). (a -> r) -> r) -> a
runCons a f = f a (id a) -- omitting type signatures; you can fill them in
You put them in data constructors
newtype Yoneda f a = Yoneda (forall b. (a -> b) -> f b)
You can place them in polymorphic containers:
type Bool = forall a. a -> a -> a
true, false :: Bool
true a t f = t
false a t f = f
thueMorse :: [Bool]
thueMorse = false : true : true : false : _etc
There's an important difference from HM. In HM, if something has polymorphic type, it also has, simultaneously, an infinity of monomorphic types. In System F, a thing can only have one type. id = /\a -> \(x :: a) -> x has type forall a. a -> a, not Int -> Int or Float -> Float. In order to get an Int -> Int out of id, you have to actually give it an argument: id Int :: Int -> Int, and id Float :: Float -> Float.
Haskell is not System F, however. System F is closer to what GHC calls Core, which is an internal language that GHC compiles Haskell to—basically Haskell without any syntax sugar. Haskell is a Hindley-Milner flavored veneer on top of a System F core. In Haskell, nominally a polymorphic type is a type. They do not act like sets of types. However, polymorphic types are still second class. Haskell doesn't let you actually type forall without -XExplicitForalls. It emulates Hindley-Milner's wonky implicit global variable creation by inserting foralls in certain places. The places where it does so are changed by -XScopedTypeVariables. You can't take polymorphic arguments or have polymorphic fields unless you enable -XRankNTypes. You cannot say things like [forall a. a -> a -> a], nor can you say id (forall a. a -> a -> a) :: (forall a. a -> a -> a) -> (forall a. a -> a -> a)—you must define e.g. newtype Bool = Bool { ifThenElse :: forall a. a -> a -> a } to wrap the polymorphism under something monomorphic. You cannot explicitly give type arguments unless you enable -XTypeApplications, and then you can write id #Int :: Int -> Int. You cannot write type lambdas (/\), period; instead, they are inserted implicitly whenever possible. If you define id :: forall a. a -> a, then you cannot even write id in Haskell. It will always be implicitly expanded to an application, id #_.
TL;DR: In Haskell, a polymorphic type is a type. It's not treated as a set of types, or a rule/schema for types, or whatever. However, due to historical reasons, they are treated as second class citizens. By default, it looks like they are treated as mere sets of types, if you squint a bit. Most restrictions on them can be lifted with suitable language extensions, at which point they look more like "just types". The one remaining big restriction (no impredicative instantiations allowed) is rather fundamental and cannot be erased, but that's fine because there's a workaround.
There is some nuance in the word "type" here. Values have concrete types, which cannot be polymorphic. Expressions, on the other hand, have general types, which can be polymorphic. If you're thinking of types for values, then a polymorphic type can be thought of loosely as defining sets of possible concrete types. (At least first-order polymorphic types! Higher-order polymorphism breaks this intuition.) But that's not always a particularly useful way of thinking, and it's not a sufficient definition. It doesn't capture which sets of types can be described in this way (and related notions like parametricity.)
It's a good observation, though, that the same word, "type", is used in these two related, but different, ways.
EDIT: The answer below turns out not to answer the question. The difference is a subtle mistake in terminology: types like Maybe and [] are higher-kinded, whereas types like forall a. a -> a and forall a. Maybe a are polymorphic. The answer below relates to higher-kinded types, but the question was asked about polymorphic types. I’m still leaving this answer up in case it helps anyone else, but I realise now it’s not really an answer to the question.
I would argue that a polymorphic higher-kinded type is closer to a set of types. For instance, you could see Maybe as the set {Maybe Int, Maybe Bool, …}.
However, strictly speaking, this is a bit misleading. To address this in more detail, we need to learn about kinds. Similarly to how types describe values, we say that kinds describe types. The idea is:
A concrete type (that is, one which has values) has a kind of *. Examples include Bool, Char, Int and Maybe String, which all have type *. This is denoted e.g. Bool :: *. Note that functions such as Int -> String also have kind *, as these are concrete types which can contain values such as show!
A type with a type parameter has a kind containing arrows. For instance, in the same way that id :: a -> a, we can say that Maybe :: * -> *, since Maybe takes a concrete type as an argument (such as Int), and produces a concrete type as a result (such as Maybe Int). Something like a -> a also has kind * -> *, since it has one type parameter (a) and produces a concrete result (a -> a). You can get more complex kinds as well: for instance, data Foo f x = FooConstr (f x x) has kind Foo :: (* -> * -> *) -> * -> *. (Can you see why?)
(If the above explanation doesn’t make sense, the Learn You a Haskell book has a great section on kinds as well.)
So now we can answer your questions properly:
Which is a polymorphic higher-kinded type: a type or a set of types?
Neither: a polymorphic higher-kinded type is a type-level function, as indicated by the arrows in its kind. For instance, Maybe :: * -> * is a type-level function which converts e.g. Int → Maybe Int, Bool → Maybe Bool etc.
Is a polymorphic higher-kinded type with a concrete type substituting its type variable a type?
Yes, when your polymorphic higher-kinded type has a kind * -> * (i.e. it has one type parameter, which accepts a concrete type). When you apply a concrete type Conc :: * to a type Poly :: * -> *, it’s just function application, as detailed above, with the result being Poly Conc :: * i.e. a concrete type.
Is a polymorphic higher-kinded type with different concrete types substituting its type variable considered the same or different types?
This question is a bit out of place, as it doesn’t have anything to do with kinds. The answer is definitely no: two types like Maybe Int and Maybe Bool are not the same. Nothing may be a member of both types, but only the former contains a value Just 4, and only the latter contains a value Just False.
On the other hand, it is possible to have two different substitutions where the resulting types are isomorphic. (An isomorphism is where two types are different, but equivalent in some way. For instance, (a, b) and (b, a) are isomorphic, despite being the same type. The formal condition is that two types p,q are isomorphic when you can write two inverse functions p -> q and q -> p.)
One example of this is Const:
data Const a b = Const { getConst :: a }
This type just ignores its second type parameter; as a result, two types like Const Int Char and Const Int Bool are isomorphic. However, they are not the same type: if you make a value of type Const Int Char, but then use it as something of type Const Int Bool, this will result in a type error. This sort of functionality is incredibly useful, as it means you can ‘tag’ a type a using Const a tag, then use the tag as a marker of information on the type level.

What does Functor's fmap tell about types?

What does f a and f b tell me about its type?
class Functor f where
fmap :: (a -> b) -> f a -> f b
I think I get the idea behind standard instances of a functor. However I'm having hard time understanding what f a and f actually represent.
I understand that f a and f b are just types and they must carry information what type constructor was used to create them and type arguments that were used.
Is f a type constructor of kind * -> *? Is (->) r a type constructor just like Maybe is?
I understand that f a and f b are just types and they must carry information what type constructor was used to create them and type arguments that were used.
Good explanation.
Is f a type constructor of kind * -> *?
In effect.
Is (->) r a type constructor just like Maybe is?
In effect, yes:
Yes in the sense that you can apply it to a type like String and get r -> String, just like you can apply Maybe to String to get Maybe String. You can use for f anything that gives you a type from any other type.
..but no...
No, in the sense that Daniel Wagner points out; To be precise, Maybe and [] are type constructors, but (->) r and Either a are sort of like partially applied type constructors. Nevertheless they make good functors, because you can freely apply functions "inside" them and change the type of "the contents".
(Stuff in inverted commas is very hand-wavy imprecise terminology.)
My (possibly mildly tortured) reading of chapter 4 of the Haskell 2010 Report is that Maybe and (->) r are both types, of kind * -> *. Alternatively, the Report also labels them as type expressions—but I can't discern a firm difference in how the Report uses the two terms, except perhaps for surface syntax details. (->) and Maybe are type constructors; type expressions are assembled from type constructors and type variables.
For example, section 4.1.1 ("Kinds") of the 2010 report says (my boldface):
To ensure that they are valid, type expressions are classified into different kinds, which take one of two possible forms:
The symbol ∗ represents the kind of all nullary type constructors.
If κ1 and κ2 are kinds, then κ1 → κ2 is the kind of types that take a type of kind κ1 and return a type of kind κ2.
Section 4.3.2, "Instance Declarations" (my boldface):
An instance declaration that makes the type T to be an instance of class C is called a C-T instance declaration and is subject to these static restrictions:
A type may not be declared as an instance of a particular class more than once in the program.
The class and type must have the same kind; this can be determined using kind inference as described in Section 4.6.
So going by that language, the following instance declaration makes the type (->) r to be an instance of the class Functor:
instance Functor ((->) r) where
fmap f g = f . g
The funny thing about this terminology is that we call (->) r a "type" even though there are no expressions in Haskell that have that type—not even undefined:
foo :: (->) r
foo = undefined
{-
[1 of 1] Compiling Main ( ../src/scratch.hs, interpreted )
../src/scratch.hs:1:8:
Expecting one more argument to `(->) r'
In the type signature for `foo': foo :: (->) r
-}
But I think that's not a big deal. Basically, all declarations in Haskell must have types of kind *.
As a side note, from my limited understanding of dependently typed languages, many of these lack Haskell's firm distinction between terms and types, so that something like (->) Boolean is an expression whose value is a function that takes a type as its argument and produces a type as its result.

'type family' vs 'data family', in brief?

I'm confused about how to choose between data family and type family. The wiki page on TypeFamilies goes into a lot of detail. Occasionally it informally refers to Haskell's data family as a "type family" in prose, but of course there is also type family in Haskell.
There is a simple example that shows where two versions of code are shown, differing only on whether a data family or a type family is being declared:
-- BAD: f is too ambiguous, due to non-injectivity
-- type family F a
-- OK
data family F a
f :: F a -> F a
f = undefined
g :: F Int -> F Int
g x = f x
type and data here have the same meaning, but the type family version fails to type-check, while the data family version is fine, because data family "creates new types and are therefore injective" (says the wiki page).
My takeaway from all of this is "try data family for simple cases, and, if it isn't powerful enough, then try type family". Which is fine, but I'd like to understand it better. Is there a Venn diagram or a decision tree I can follow to distinguish when to use which?
(Boosting useful information from comments into an answer.)
Standalone vs In-Class declaration
Two syntactically different ways to declare a type family and/or data family, that are semantically equivalent:
standalone:
type family Foo
data family Bar
or as part of a typeclass:
class C where
type Foo
data Bar
both declare a type family, but inside a typeclass the family part is implied by the class context, so GHC/Haskell abbreviates the declaration.
"New type" vs "Type Synonym" / "Type Alias"
data family F creates a new type, similar to how data F = ... creates a new type.
type family F does not create a new type, similar to how type F = Bar Baz doesn't create a new type (it just creates an alias/synonym to an existing type).
Example of non-injectivity of type family
An example (slightly modified) from Data.MonoTraversable.Element:
import Data.ByteString as S
import Data.ByteString.Lazy as L
-- Declare a family of type synonyms, called `Element`
-- `Element` has kind `* -> *`; it takes one parameter, which we call `container`
type family Element container
-- ByteString is a container for Word8, so...
-- The Element of a `S.ByteString` is a `Word8`
type instance Element S.ByteString = Word8
-- and the Element of a `L.ByteString` is also `Word8`
type instance Element L.ByteString = Word8
In a type family, the right-side of equations Word8 names an existing type; the things are the left-side creates new synonyms: Element S.ByteString and Element L.ByteString
Having a synonym means we can interchange Element Data.ByteString with Word8:
-- `w` is a Word8....
>let w = 0::Word8
-- ... and also an `Element L.ByteString`
>:t w :: Element L.ByteString
w :: Element L.ByteString :: Word8
-- ... and also an `Element S.ByteString`
>:t w :: Element S.ByteString
w :: Element S.ByteString :: Word8
-- but not an `Int`
>:t w :: Int
Couldn't match expected type `Int' with actual type `Word8'
These type synonyms are "non-injective" ("one-way"), and therefore non-invertible.
-- As before, `Word8` matches `Element L.ByteString` ...
>(0::Word8)::(Element L.ByteString)
-- .. but GHC can't infer which `a` is the `Element` of (`L.ByteString` or `S.ByteString` ?):
>(w)::(Element a)
Couldn't match expected type `Element a'
with actual type `Element a0'
NB: `Element' is a type function, and may not be injective
The type variable `a0' is ambiguous
Worse, GHC can't even solve non-ambiguous cases!:
type instance Element Int = Bool
> True::(Element a)
> NB: `Element' is a type function, and may not be injective
Note the use of "may not be"! I think GHC is being conservative, and refusing to check whether the Element truly is injective. (Perhaps because a programmer could add another type instance later, after importing a pre-compiled module, adding ambiguity.
Example of injectivity of data family
In contrast: In a data family, each right-hand side contains a unique constructor , so the definitions are injective ("reversible") equations.
-- Declare a list-like data family
data family XList a
-- Declare a list-like instance for Char
data instance XList Char = XCons Char (XList Char) | XNil
-- Declare a number-like instance for ()
data instance XList () = XListUnit Int
-- ERROR: "Multiple declarations of `XListUnit'"
data instance XList () = XListUnit Bool
-- (Note: GHCI accepts this; the new declaration just replaces the previous one.)
With data family, seeing the constructor name on the right (XCons, or XListUnit)
is enough to let the type-inferencer know we must be working with XList () not an XList Char. Since constructor names are unique, these defintions are injective/reversible.
If type "just" declares a synonym, why is it semantically useful?
Usually, type synonyms are just abbreviations, but type family synonyms have added power: They can make a simple type (kind *) become a synonym of a "type with kind * -> * applied to an argument":
type instance F A = B
makes B match F a. This is used, for example, in Data.MonoTraversable to make a simple type Word8 match functions of the type Element a -> a (Element is defined above).
For example, (a bit silly), suppose we have a version of const that only works with "related" types:
> class Const a where constE :: (Element a) -> a -> (Element a)
> instance Const S.ByteString where constE = const
> constE (0::Word8) undefined
ERROR: Couldn't match expected type `Word8' with actual type `Element a0'
-- By providing a match `a = S.ByteString`, `Word8` matches `(Element S.ByteString)`
> constE (0::Word8) (undefined::S.ByteString)
0
-- impossible, since `Char` cannot match `Element a` under our current definitions.
> constE 'x' undefined
I don't think any decision tree or Venn diagram will exist because the applications for type and data families are pretty wide.
Generally you've already highlighted the key design differences and I would agree with your takeaway to first see if you can get away with data family.
For me the key point is that each instance of a data family creates a new type, which does substantially limit the power because you can't do what is often the most natural thing and make an existing type be the instance.
For example the GMapKey example on the Haskell wiki page on "indexed types" is a reasonably natural fit for data families:
class GMapKey k where
data GMap k :: * -> *
empty :: GMap k v
lookup :: k -> GMap k v -> Maybe v
insert :: k -> v -> GMap k v -> GMap k v
The key type of the map k is the argument to the data family and the actual map type is the result of the data family (GMap k) . As a user of a GMapKey instance you're probably quite happy for the GMap k type to be abstract to you and just manipulate it via the generic map operations in the type class.
In contrast the Collects example on the same wiki page is sort of the opposite:
class Collects ce where
type Elem ce
empty :: ce
insert :: Elem ce -> ce -> ce
member :: Elem ce -> ce -> Bool
toList :: ce -> [Elem ce]
The argument type is the collection and the result type is the element of the collection. In general a user is going to want to operate on those elements directly using the normal operations on that type. For example the collection might be IntSet and the element might be Int. Having the Int be wrapped up in some other type would be quite inconvenient.
Note - these two examples are with type classes and therefore don't need the family keyword as declaring a type inside a type class implies it must be a family. Exactly the same considerations apply as for standalone families though, it's just a question of how the abstraction is organised.

Resources