Splitting type-classes and their instances to the different submodules in Haskell - haskell

I am currently writing a small helper library and I've faced the problem of really huge source code in one of the modules.
Basically, I am declaring a new parametric type-class and want to implement it for two different monad stacks.
I've decided to split the declaration of type-class and its implementations to the different modules, but I'm constantly getting warnings about orphaned instances.
As I know, that might happen if it is possible to import a datatype without an instance, i.e. if they are in a different module. But I have both type declaration and instance implementation inside each module.
To simplify the whole example, here is what I have now:
First is the module, where I define a type-class
-- File ~/library/src/Lib/API.hs
module Lib.API where
-- Lots of imports
class (Monad m) => MyClass m where
foo :: String -> m ()
-- More functions are declared
Then the module with instance implementation:
-- File ~/library/src/Lib/FirstImpl.hs
{-# LANGUAGE TypeSynonymInstances #-}
{-# LANGUAGE FlexibleInstances #-}
module Lib.FirstImpl where
import Lib.API
import Data.IORef
import Control.Monad.Reader
type FirstMonad = ReaderT (IORef String) IO
instance MyClass FirstMonad where
foo = undefined
Both of them are listed in my project's .cabal file, it's also impossible to use FirstMonad without the instance because they are defined in one file.
However, when I launch ghci using stack ghci lib, I'm getting the next warning:
~/library/src/Lib/FirstImpl.hs:11:1: warning: [-Worphans]
Orphan instance: instance MyClass FirstMonad
To avoid this
move the instance declaration to the module of the class or of the type, or
wrap the type with a newtype and declare the instance on the new type.
|
11 | instance MyClass FirstMonad where
| ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^...
Ok, two modules loaded
What am I missing and is there any way to split type-class declarations and their implementations into the different submodules?

To avoid this, you can wrap type in newtype
newtype FirstMonad a = FirstMonad (ReaderT (IORef String) IO a)
But after deep considering you feel need orphan instances, you can suppress warnings:
{-# OPTIONS_GHC -fno-warn-orphans #-}
Detail
Coherence
For example, considering following definition for now:
data A = A
instance Eq A where
...
It can be regarded as type based overloading. In the above, Checking equality (==) is able to be used under various types:
f :: Eq a => a -> a -> a -> Bool
f x y z = x == y && y == z
g :: A -> A -> A -> Bool
g x y z = x == y && y == z
In definition of f, type a is abstract and under constraint Eq, but in g, type A is concrete. The former derives method from constrains, but Haskell also in the latter can derive. How to derive is to just elaborate Haskell into language which has no type class. This way is called dictionary passing.
class C a where
m1 :: a -> a
instance C A where
m1 x = x
f :: C a => a -> a
f = m1 . m1
It will be converted:
data DictC a = DictC
{ m1 :: a -> a
}
instDictC_A :: DictC A
instDictC_A = DictC
{ m1 = \x -> x
}
f :: DictC a -> a -> a
f d = m1 d . m1 d
As the above, make a data type called dictionary corresponds to a type class, and pass the value of the type.
Haskell has a constraint that a type may not be declared as an instance of a particular class more than once in the program. This causes various problems.
class C1 a where
m1 :: a
class C1 a => C2 a where
m2 :: a -> a
instance C1 Int where
m1 = 0
instance C2 Int where
m2 x = x + 1
f :: (C1 a, C2 a) => a
f = m2 m1
g :: Int
g = f
This code uses inheritance of type class. It derives following elaborated code.
{ m1 :: a
}
data DictC2 a = DictC2
{ superC1 :: DictC1 a
, m2 :: a -> a
}
instDictC1_Int :: DictC1 Int
instDictC1_Int = DictC1
{ m1 = 0
}
instDictC2_Int :: DictC2 Int
instDictC2_Int = DictC2
{ superC1 = instDictC1_Int
, m2 = \x -> x + 1
}
f :: DictC1 a -> DictC2 a -> a
f d1 d2 = ???
g :: Int
g = f instDictC1_Int instDictC2_Int
Well, what is definition of f going on? Actually, Definition's' are following:
f :: DictC1 a -> DictC2 a -> a
f d1 d2 = m2 d2 (m1 d1)
f :: DictC1 a -> DictC2 a -> a
f _ d2 = m2 d2 (m1 d1)
where
d1 = superC1 d2
Do you confirm it has no problem in typing? If Haskell can define Int as a instance of C1 repeatedly, superC1 in DictC2 will be filled in elaboration, the value will be probably defferent from DictC1 a passed to f when call g.
Let's see more example:
h :: (Int, Int)
h = (m1, m1)
Of course, elaboration is one:
h :: (Int, Int)
h = (m1 instDictC1_Int, m1 instDictC1_Int)
But if can define instance repeatedly, can also consider following elaboration:
h :: (Int, Int)
h = (m1 instDictC1_Int, m1 instDictC1_Int')
Hence, two same types are applied two different instances. For example, calling same function twice, but returns different value by different algorithm possibly.
The stated example is little bit exaggerated, though how about next example?
instance C1 Int where
m1 = 0
h1 :: Int
h1 = m1
instance C1 Int where
m1 = 1
h2 :: (Int, Int)
h2 = (m1, h1)
In this case, quite possibly use different instances m1 in h1 and m1 in h2.
Haskell often prefers to transformation based on equational reasoning, so it will be a problem that h1 is not able to be replaced directly to m1.
Generally, type system include resolving instances of type classes. In such a case, resolve instances when check types. And codes are elaborated by derivation tree made during checking types. Such transformation is sometimes adapted by besides type class, specifically, implicit type conversion, record type and so on. Then, these cases possibly cause the problem as the above. This problem can formalized following:
When convert derivation tree of type into language, in two different derivation tree of one type, results of conversion don't become semantically equivalent.
As the stated, even apply whatever instance matches type, and it generally must pass type checking. However, a result of elaboration by using a instance is possibly different a result of elaboration after resolving other instance. Vice versa, if don't have this problem, can acquire certain guarantee of type system. This guarantee, a combination of type system which the problem formalized above doesn't work and property pf elaboration, is generally called coherence. There are some way to guarantee coherence, Haskell limits number of instance definition corresponding type class to one in order to guarantee coherence.
Orphan Instance
How Haskell does is easy to say, but has some issues. Quite famous one is orphan instance. GHC, in a type declaration T as an instance of C, treatment of instance depends on whether or not the declaration is in a same module which has declaration T or C. Especially, not in same module, called orphan instance, GHC will warn. Why how it works?
First, in Haskell, instances propagate implicitly between modules. This is stipulated as following:
All instances in scope within a module are always exported and any import brings all instances in from the imported module. Thus, an instance declaration is in scope if and only if a chain of import declarations leads to the module containing the instance declaration.
--5 Modules
We can't stop this, can't control this. In the first place, Haskell decided to let us define one type as one instance, so it's unnecessary to mind it. By the way, it's as good there is such regulation, actually compiler of Haskell must resolve instances according to the regulation. Of course, compiler doesn't know which modules have instances, must check all modules at worst case. It also bothers us. If two important modules hold each instance definition toward same type, all modules which have their import chains include the modules become unavailable in order to conflict.
Well, to use a type as a instance of a class, we need information of them, so we will go to see a module which has declarations. Then, that a third party fiddles the module is not going to happen. Therefore, if either of the modules includes the instance declaration, compiler can see necessary information with instances, we are happy that enable to load modules guarantees that they have no conflicts. For that reason, that a type as an instance of a class placed in a same module which has declaration the type or the class is being recommended. On the contrary, avoiding orphan instance as much as possible is being recommended. Hence, if want to make a type as a independent instance, making a new type by newtype in order to only change semantics of a instance, declaring the type as the instance.
In addition, GHC marks up internally modules have orphan instances, modules have orphan instances are enumerated in their dependent modules' interface files. And then, compiler refers all of the list. Thus, to make orphan instance once, an interface file of a module which has the instance, when all modules depend on the module recompile, will reloaded if whatever changes. So, orphan instance affects bad to compile time.
Detail is under CC BY-SA 4.0 (C) Mizunashi Mana
Original is 続くといいな日記 – 型クラスの Coherence と Orphan Instance
2020-12-22 revised and translated by Akihito Kirisaki

Related

Ambiguous type variable issue in Haskell

I have read many of the other ambiguous type variable questions on the site but was not able to find a solution to the following issue, although I am new to Haskell so may just not have understood the answers to other questions properly. Simplifying to only include the relevant parts of these definitions, I have a type class
class Dist a where
un :: a
and I have a multiparameter type class
class Dist a => UnBin a b where
opu :: b -> b
opb :: a -> b -> a
where in general a and b are completely unrelated. However, I also have a data type
data LC a b = LC [(a,b)]
and I make it an instance of UnBin using
instance Dist a => UnBin a (LC [(a,b)]) where
opu = ...
opb = ...
where in this particular case, the type LC [(a,b)] already depends on a and I want the two as here to be the same.
What I mean by the two as should be the same is that if I define x = LC [(1 :: Int,'a')]
for example (assuming I have made Int a part of the Dist typeclass already) then I would want to be able to just write opb un x and have Haskell automatically infer that I want un :: Int since a is already determined by the type of x however unless I explicitly use a type signature for un and write opb (un::Int) x I get the error
Ambiguous type variable ‘a0’ arising from a use of ‘un’
prevents the constraint ‘(Dist a0)’ from being solved.
Probable fix: use a type annotation to specify what ‘a0’ should be.
These potential instance exist:
one instance involving out-of-scope types
(use -fprint-potential-instances to see them all)
Presumably this means that Haskell is treating the two as in my instance declaration as unrelated when I want them to really be the same. Is there a way to make Haskell automatically able to infer the correct type of un without having to give a type signature every time I want to use un in this way?
Presumably this means that Haskell is treating the two as in my instance declaration as unrelated when I want them to really be the same.
Actually, GHC does know that the two as in the instance you wrote are the same. However, it doesn't know that it should use that instance. The problem is that there could be another instance out there. For instance, for all GHC knows, you've also written the instance:
instance UnBin Char (LC Int b) where
opu = ...
opb = ...
Then, there are two different types that the un in opb un x could be. It really is ambiguous!
But, you might say, I didn't write any other instance. That may be true, but instances are sneaky. If someone imports your code and then creates an orphan instance like the one I wrote above, what should happen? That value opb un x needs to be the same, but with this new instance in scope, it would also need to change. GHC doesn't like this future possibility, so it gives you an error.
You mentioned in your question that "in general a and b are completely unrelated". Hopefully it's the case that a given b implies a given a. If not, or in other words, if you want to be able to write instances like UnBin Char (LC Int b) like I did above, then you're out of luck. You can't have it both ways: you can't have GHC infer the type you want while also keeping the freedom to decide for yourself whatever type you want.
However, if it is true that a and b are related (in that, say, b can imply a), then there are a few ways forward. For instance, as shown by #chi, you can use a type equality in the instance context to trick GHC into matching on the instance you want first and only verifying that the types are the same later. Another option is to use type families or functional dependencies to achieve a similar goal. For instance:
{-# LANGUAGE FunctionalDependencies #-}
class Dist a => UnBin a b | b -> a where
Note that in either case, you're limiting the total number of instances you can write. In the type equality approach, you have generalized your instance head, and with the functional dependency approach, you force b to imply a. Hopefully, that's okay with you.
Fully working example:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE AllowAmbiguousTypes #-}
{-# LANGUAGE GADTs #-}
class Dist a where
un :: a
instance Dist Int where
un = 42
class Dist a => UnBin a b where
opu :: b -> b
opb :: a -> b -> a
data LC a b = LC [(a,b)]
instance (Dist a, a ~ a') => UnBin a (LC a' b) where
opu = undefined
opb = undefined
x = LC [(1 :: Int,'a')]
test = opb un x -- here GHC infers un::Int as wanted
The main technique is this: instead of writing
instance Dist a => UnBin a (LC a b) where
we write
instance (Dist a, a ~ a') => UnBin a (LC a' b) where
so that the instance head UnBin a (LC a' b) matches even when types a and a' are not the same. Then, we require in the instance context a ~ a' forcing them to be the same.
In this way, the instance is picked during inference at a time where we don't yet know how a and a' are related. After this happens, GHC immediately deduces a~a' and exploits it to infer the type Int for un in test = opb un x.

Subset algebraic data type, or type-level set, in Haskell

Suppose you have a large number of types and a large number of functions that each return "subsets" of these types.
Let's use a small example to make the situation more explicit. Here's a simple algebraic data type:
data T = A | B | C
and there are two functions f, g that return a T
f :: T
g :: T
For the situation at hand, assume it is important that f can only return a A or B and g can only return a B or C.
I would like to encode this in the type system. Here are a few reasons/circumstances why this might be desirable:
Let the functions f and g have a more informative signature than just ::T
Enforce that implementations of f and g do not accidentally return a forbidden type that users of the implementation then accidentally use
Allow code reuse, e.g. when helper functions are involved that only operate on subsets of type T
Avoid boilerplate code (see below)
Make refactoring (much!) easier
One way to do this is to split up the algebraic datatype and wrap the individual types as needed:
data A = A
data B = B
data C = C
data Retf = RetfA A | RetfB B
data Retg = RetgB B | RetgC C
f :: Retf
g :: Retg
This works, and is easy to understand, but carries a lot of boilerplate for frequent unwrapping of the return types Retf and Retg.
I don't see polymorphism being of any help, here.
So, probably, this is a case for dependent types. It's not really a type-level list, rather a type-level set, but I've never seen a type-level set.
The goal, in the end, is to encode the domain knowledge via the types, so that compile-time checks are available, without having excessive boilerplate. (The boilerplate gets really annoying when there are lots of types and lots of functions.)
Define an auxiliary sum type (to be used as a data kind) where each branch corresponds to a version of your main type:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE StandaloneKindSignatures #-}
{-# LANGUAGE StandaloneDeriving #-}
{-# LANGUAGE DataKinds #-}
import Data.Kind
import Data.Void
import GHC.TypeLits
data Version = AllEnabled | SomeDisabled
Then define a type family that maps the version and the constructor name (given as a type-level Symbol) to the type () if that branch is allowed, and to the empty type Void if it's disallowed.
type Enabled :: Version -> Symbol -> Type
type family Enabled v ctor where
Enabled SomeDisabled "C" = Void
Enabled _ _ = ()
Then define your type as follows:
type T :: Version -> Type
data T v = A !(Enabled v "A")
| B !(Enabled v "B")
| C !(Enabled v "C")
(The strictness annotations are there to help the exhaustivity checker.)
Typeclass instances can be derived, but separately for each version:
deriving instance Show (T AllEnabled)
deriving instance Eq (T AllEnabled)
deriving instance Show (T SomeDisabled)
deriving instance Eq (T SomeDisabled)
Here's an example of use:
noC :: T SomeDisabled
noC = A ()
main :: IO ()
main = print $ case noC of
A _ -> "A"
B _ -> "B"
-- this doesn't give a warning with -Wincomplete-patterns
This solution makes pattern-matching and construction more cumbersome, because those () are always there.
A variation is to have one type family per branch (as in Trees that Grow) instead of a two-parameter type family.
I tried to achieve something like this in the past, but without much success -- I was not too satisfied with my solution.
Still, one can use GADTs to encode this constraint:
data TagA = IsA | NotA
data TagC = IsC | NotC
data T (ta :: TagA) (tc :: TagC) where
A :: T 'IsA 'NotC
B :: T 'NotA 'NotC
C :: T 'NotA 'IsC
-- existential wrappers
data TnotC where TnotC :: T ta 'NotC -> TnotC
data TnotA where TnotA :: T 'NotA tc -> TnotA
f :: TnotC
g :: TnotA
This however gets boring fast, because of the wrapping/unwrapping of the exponentials. Consumer functions are more convenient since we can write
giveMeNotAnA :: T 'NotA tc -> Int
to require anything but an A. Producer functions instead need to use existentials.
In a type with many constructors, it also gets inconvenient since we have to use a GADT with many tags/parameters. Maybe this can be streamlined with some clever typeclass machinery.
Giving each individual value its own type scales extremely badly, and is quite unnecessarily fine-grained.
What you probably want is just restrict the types by some property on their values. In e.g. Coq, that would be a subset type:
Inductive T: Type :=
| A
| B
| C.
Definition Retf: Type := { x: T | x<>C }.
Definition Retg: Type := { x: T | x<>A }.
Well, Haskell has no way of expressing such value constraints, but that doesn't stop you from creating types that conceptually fulfill them. Just use newtypes:
newtype Retf = Retf { getRetf :: T }
mkRetf :: T -> Maybe Retf
mkRetf C = Nothing
mkRetf x = Retf x
newtype Retg = Retg { getRetg :: T }
mkRetg :: ...
Then in the implementation of f, you match for the final result of mkRetf and raise an error if it's Nothing. That way, an implementation mistake that makes it give a C will unfortunately not give a compilation error, but at least a runtime error from within the function that's actually at fault, rather than somewhere further down the line.
An alternative that might be ideal for you is Liquid Haskell, which does support subset types. I can't say too much about it, but it's supposedly pretty good (and will in new GHC versions have direct support).

Overlapping instances - how to circumvent them

I am a beginner at Haskell, so please be indulgent. For reasons that are not important here, I am trying to define a operator <^> that takes a function and an argument and returns the value of the function by the argument, irrespective of which of the function and the argument came first. In short, I would like to be able to write the following:
foo :: Int -> Int
foo x = x * x
arg :: Int
arg = 2
foo <^> arg -- valid, returns 4
arg <^> foo -- valid, returns 4
I have tried to accomplish that through type families, as follows:
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, TypeFamilies, TypeOperators #-}
class Combine t1 t2 where
type Output t1 t2 :: *
(<^>) :: t1 -> t2 -> Output t1 t2
instance Combine (a->b) a where
type Output (a->b) a = b
f <^> x = f x
instance Combine a (a->b) where
type Output a a->b = b
x <^> f = f x
On this code, GHC throws a Conflicting family instance declarations. My guess is that the overlap GHC complains about occurs when type a->b and type a are the same. I don't know Haskell well enough, but I suspect that with recursive type definitions, one may be able to construct such a situation. I have a couple of questions:
Since this is a rather remote scenario that will never occur in my application (in particular not with foo and arg above), I was wondering if there was a way of specifying a dummy default instance to use in case of overlap? I have tried the different OVERLAPS and OVERLAPPING flags, but they didn't have any effect.
If not, is there a better way of achieving what I want?
Thanks!
This is a bad idea, in my view, but I'll play along.
A possible solution is to switch to functional dependencies. Usually I tend to avoid fundeps in favor of type families, but here they make the instances compile in a simple way.
class Combine t1 t2 r | t1 t2 -> r where
(<^>) :: t1 -> t2 -> r
instance Combine (a->b) a b where
f <^> x = f x
instance Combine a (a->b) b where
x <^> f = f x
Note that this class will likely cause problems during type inference if we use polymorphic functions. This is because, with polymorphic functions, the code can easily become ambiguous.
For instance id <^> id could pick any of the two instances. Above, melpomene already reported const <^> id being ambiguous as well.
The following is weakly related, but I want to share it anyway:
What about type families instead? I tried to experiment a bit, and I just discovered a limitation which I did not know. Consider the closed type family
type family Output a b where
Output (a->b) a = b
Output a (a->b) = b
The code above compiles, but then the type Output a (a->b) is stuck. The second equation does not get applied, as if the first one could potentially match.
Usually, I can understand this in some other scenarios, but here unifying
Output (a' -> b') b' ~ Output a (a -> b)
seems to fail since we would need a ~ (a' -> b') ~ (a' -> a -> b) which is impossible, with finite types. For some reason, GHC does not use this argument (does it pretend infinite types exist in this check? why?)
Anyway, this makes replacing fundeps with type families harder than it could be, it seems. I have no idea about why GHC accepts the fundeps code I posted, yet refuses the OP's code which is essentially the same thing, except using type families.
#chi is close; an approach using either FunDeps or Closed Type Families is possible. But the Combine instances are potentially ambiguous/unifiable just as much as the CTF Output equations.
When chi says the FunDep code is accepted, that's only half-true: GHC plain leads you down the garden path. It will accept the instances but then you find you can't use them/you get weird error messages. See the Users Guide at "potential for overlap".
If you're looking to resolve a potentially ambiguous Combine constraint, you might get an error suggesting you try IncoherentInstances (or INCOHERENT pragma). Don't do that. You have a genuinely incoherent problem; all that will do is defer the problem to somewhere else. It's always possible to avoid Incoherent -- providing you can rejig your instances (as follows) and they're not locked away in libraries.
Notice that because of the potential ambiguity, another Haskell compiler (Hugs) doesn't let you write Combine like that. It has a more correct implementation of Haskell's (not-well-stated) rules.
The answer is to use a sort of overlap where one instance is strictly more specific. You must first decide which you way you want to prefer in case of ambiguity. I'll choose function prefixed to argument:
{-# LANGUAGE UndecidableInstances, TypeFamilies #-}
instance {-# OVERLAPPING #-} (r ~ b)
=> Combine (a->b) a r where ...
instance {-# OVERLAPPABLE #-} (Combine2 t1 t2 r)
=> Combine t1 t2 r where
(<^>) = revApp
class Combine2 t1 t2 r | t1 t2 -> r where
revApp :: t1 -> t2 -> r
instance (b ~ r) => Combine2 a (a->b) r where
revApp x f = f x
Notice that the OVERLAPPABLE instance for Combine has bare tyvars, it's a catch-all so it's always matchable. All the compiler has to do is decide whether some wanted constraint is of the form of the OVERLAPPING instance.
The Combine2 constraint on the OVERLAPPABLE instance is no smaller than the head, so you need UndecidableInstances. Also beware that deferring to Combine2 will mean that if the compiler still can't resolve, you're likely to get puzzling error messages.
Talking of bare tyvars/"always matchable", I've used an additional trick to make the compiler work really hard to improve the types: There's bare r in the head of the instance, with an Equality type improvement constraint (b ~ r) =>. To use the ~, you need to switch on TypeFamilies even though you're not writing any type families.
A CTF approach would be similar. You need a catch-all equation on Output that calls an auxiliary type function. Again you need UndecidableInstances.

Practical applications of Rank 2 polymorphism?

I'm covering polymorphism and I'm trying to see the practical uses of such a feature.
My basic understanding of Rank 2 is:
type MyType = ∀ a. a -> a
subFunction :: a -> a
subFunction el = el
mainFunction :: MyType -> Int
mainFunction func = func 3
I understand that this is allowing the user to use a polymorphic function (subFunction) inside mainFunction and strictly specify it's output (Int). This seems very similar to GADT's:
data Example a where
ExampleInt :: Int -> Example Int
ExampleBool :: Bool -> Example Bool
1) Given the above, is my understanding of Rank 2 polymorphism correct?
2) What are the general situations where Rank 2 polymorphism can be used, as opposed to GADT's, for example?
If you pass a polymorphic function as and argument to a Rank2-polymorphic function, you're essentially passing not just one function but a whole family of functions – for all possible types that fulfill the constraints.
Typically, those forall quantifiers come with a class constraint. For example, I might wish to do number arithmetic with two different types simultaneously (for comparing precision or whatever).
data FloatCompare = FloatCompare {
singlePrecision :: Float
, doublePrecision :: Double
}
Now I might want to modify those numbers through some maths operation. Something like
modifyFloat :: (Num -> Num) -> FloatCompare -> FloatCompare
But Num is not a type, only a type class. I could of course pass a function that would modify any particular number type, but I couldn't use that to modify both a Float and a Double value, at least not without some ugly (and possibly lossy) converting back and forth.
Solution: Rank-2 polymorphism!
modifyFloat :: (∀ n . Num n => n -> n) -> FloatCompare -> FloatCompare
mofidyFloat f (FloatCompare single double)
= FloatCompare (f single) (f double)
The best single example of how this is useful in practice are probably lenses. A lens is a “smart accessor function” to a field in some larger data structure. It allows you to access fields, update them, gather results... while at the same time composing in a very simple way. How it works: Rank2-polymorphism; every lens is polymorphic, with the different instantiations corresponding to the “getter” / “setter” aspects, respectively.
The go-to example of an application of rank-2 types is runST as Benjamin Hodgson mentioned in the comments. This is a rather good example and there are a variety of examples using the same trick. For example, branding to maintain abstract data type invariants across multiple types, avoiding confusion of differentials in ad, a region-based version of ST.
But I'd actually like to talk about how Haskell programmers are implicitly using rank-2 types all the time. Every type class whose methods have universally quantified types desugars to a dictionary with a field with a rank-2 type. In practice, this is virtually always a higher-kinded type class* like Functor or Monad. I'll use a simplified version of Alternative as an example. The class declaration is:
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
The dictionary representing this class would be:
data AlternativeDict f = AlternativeDict {
empty :: forall a. f a,
(<|>) :: forall a. f a -> f a -> f a }
Sometimes such an encoding is nice as it allows one to use different "instances" for the same type, perhaps only locally. For example, Maybe has two obvious instances of Alternative depending on whether Just a <|> Just b is Just a or Just b. Languages without type classes, such as Scala, do indeed use this encoding.
To connect to leftaroundabout's reference to lenses, you can view the hierarchy there as a hierarchy of type classes and the lens combinators as simply tools for explicitly building the relevant type class dictionaries. Of course, the reason it isn't actually a hierarchy of type classes is that we usually will have multiple "instances" for the same type. E.g. _head and _head . _tail are both "instances" of Traversal' s a.
* A higher-kinded type class doesn't necessarily lead to this, and it can happen for a type class of kind *. For example:
-- Higher-kinded but doesn't require universal quantification.
class Sum c where
sum :: c Int -> Int
-- Not higher-kinded but does require universal quantification.
class Length l where
length :: [a] -> l
If you are using modules in Haskell, you are already using Rank-2 types. Theoretically speaking, modules are records with rank-2 type properties.
For example, the Foo module below in Haskell ...
module Foo(id) where
id :: forall a. a -> a
id x = x
import qualified Foo
main = do
putStrLn (Foo.id "hello")
return ()
... can actually be thought as a record as follows:
type FooType = FooType {
id :: forall a. a -> a
}
Foo :: FooType
Foo = Foo {
id = \x -> x
}
P/S (unrelated this question): from a language design perspective, if you are going to support module system, then you might as well support higher-rank types (i.e. allow arbitrary quantification of type variables on any level) to reduce duplication of efforts (i.e. type checking a module should be almost the same as type checking a record with higher rank types).

Functions with higher kinds?

Suppose the following data types are defined:
data X a = X {getX :: a}
data Y a = Y {getY :: a}
data Z a = Z {getZ :: a}
Must there be three separate functions, getX, getY, and getZ? It seems to me that there could be a function defined something like this:
get :: forall (τ :: (* -> *)) (a :: *). τ a -> a
get (_ x) = x
Obviously this is not valid standard Haskell, but there are so many extensions to GHC that seem like they might have a solution (RankNTypes,ExistentialQuantification,DataKinds,etc.). Besides the simple reason of avoiding a tiny amount of typing, there is the benefit of avoiding the namespace pollution that the record solution creates. I suppose this is really just a more implicit solution than using a type class like this:
class Get f where
get :: f a -> a
However, it appears that defining a generic function would be more useful than a type class, because the fact that it is implicitly defined means it could be used in many more places, in the same way that ($) or (.) is used. So my question has three parts: is there a way to accomplish this, is it a good idea, and if not, what is a better way?
How about this type?
newtype Pred a = Pred (a -> Bool)
Or this one?
data Proxy a = Proxy
There's no way to get an a out of a Pred a. You can only put as in. Likewise, there's no way to get an a out of a Proxy a, because there aren't any as inside it.
So a function get :: forall f a. f a -> a can't exist in general. You need to use a type class to distinguish between those types f from which you can extract an a and those from which you can't.
Well, that unconstrained generic type of get certainly can't work. This would also allow you to extract, say, a Void value from Const () :: Const () Void.
You can however obtain a suitably constrained version of this function quite simply with generics. You still need a type class, but not need to define instances in the traditional sense. It ultimately looks like this:
{-# LANGUAGE TypeFamilies, DeriveGeneric, DeriveAnyClass #-}
import GHC.Generics
class Get τ where
get :: τ a -> a
data X a = X a deriving (Generic1, Get)
data Y a = Y a deriving (Generic1, Get)
data Z a = Z a deriving (Generic1, Get)
To actually get this to work, we only need two weird representation-type instances:
instance Get f => Get (M1 i t f) where get = get . unM1
instance Get Par1 where get = unPar1
Now the actual implementation for X, Y and Z can just use a default signature and reduce the extraction to the underlying type-representation. To this end, define the class thus:
{-# LANGUAGE DefaultSignatures #-}
class Get τ where
get :: τ a -> a
default get :: (Generic1 τ, Get (Rep1 τ)) => τ a -> a
get = get . from1

Resources