Extensible Haskell Type Classes


I am reading a paper on dependently-typed programming and came across the following quote:
"[...] in contrast to Haskell's type classes, the data type [...] is closed", in the sense that one cannot add new types to the universe without extending the data type.
My newbie question is: in what sense are Haskell type classes open? How are they extensible? Also, what are the type-theoretical consequences of having this property (open vs closed)?
Thank you!

Type classes are open because you can make an arbitrary type an instance of one. When you create a type class you specify an interface, but not the types that belong to it. Then, in any code that imports the type class definition, you can make your own type an instance of it by providing the necessary functions from the interface, using the instance TypeClass Type where ... syntax.

Given a type class like:
class Monoid m where
    mempty :: m
    mappend :: m -> m -> m
... it is (basically) implemented under the hood as a dictionary type:
data Monoid m = Monoid
    { mempty :: m
    , mappend :: m -> m -> m
    }
Instances like:
instance Monoid [a] where
    mempty = []
    mappend = (++)
... get translated to dictionaries:
listIsAMonoid :: Monoid [a]
listIsAMonoid = Monoid
    { mempty = []
    , mappend = (++)
    }
... and the compiler consults the above dictionary whenever you use lists in their capacity as Monoids.
This brings us to your questions:
in what sense are Haskell type classes open? How are they extensible?
They are open in the same sense that polymorphic values are open. We have some polymorphic data type:
data Monoid m = ...
... and we can instantiate the polymorphic m type variable to any type where we can provide suitable values for the mempty and mappend fields.
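The dictionary view above can be sketched as ordinary Haskell, with the dictionary passed by hand instead of by the compiler. This is only an illustration of the idea, not GHC's actual internals; the names MonoidDict, memptyD, mappendD, mconcatWith, listMonoid and sumMonoid are made up for the example.

```haskell
-- Explicit dictionary-passing sketch; all names here are hypothetical
-- and chosen so they do not clash with the Prelude's Monoid class.
data MonoidDict m = MonoidDict
  { memptyD  :: m
  , mappendD :: m -> m -> m
  }

-- A "constrained" function becomes a function that takes the dictionary.
mconcatWith :: MonoidDict m -> [m] -> m
mconcatWith d = foldr (mappendD d) (memptyD d)

-- The dictionary corresponding to the list instance.
listMonoid :: MonoidDict [a]
listMonoid = MonoidDict { memptyD = [], mappendD = (++) }

-- A dictionary for another type, added later; nothing above has to change.
sumMonoid :: MonoidDict Int
sumMonoid = MonoidDict { memptyD = 0, mappendD = (+) }
```

Loaded in GHCi, mconcatWith listMonoid ["ab", "cd"] and mconcatWith sumMonoid [1, 2, 3] both reuse the same mconcatWith, which is the sense in which the polymorphic m stays open.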

Type classes are "open" because they can always have more types added to them "after the fact" by adding more instance declarations. This can even be done in "client" code that merely uses the module containing the type class.
The key point is that I can write code that operates on values with some type-class constraint, and that same code with no modification can be used on types that weren't in existence when I wrote the type class.
Concrete data types in Haskell are "closed" in that this cannot happen. If I write code that operates on members of a specific data type (even if it's polymorphic), then there's no way you can use that code to operate on new kinds of thing I hadn't thought of unless you're able to modify the type (which then probably requires modifying all the places where it is used).
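A minimal sketch of the contrast, using a made-up Shape class (Shape, totalArea, Square, Circle and ClosedShape are all hypothetical names for this illustration):

```haskell
-- "Library" code: the class and a function written against it.
class Shape a where
  area :: a -> Double

totalArea :: Shape a => [a] -> Double
totalArea = sum . map area

-- "Client" code, possibly written much later in another module.
-- totalArea works on these types without being modified.
newtype Square = Square Double

instance Shape Square where
  area (Square s) = s * s

newtype Circle = Circle Double

instance Shape Circle where
  area (Circle r) = pi * r * r

-- Closed alternative: adding a new case means editing this type and
-- every function that pattern matches on it.
data ClosedShape = CSquare Double | CCircle Double

closedArea :: ClosedShape -> Double
closedArea (CSquare s) = s * s
closedArea (CCircle r) = pi * r * r
```

The trade-off goes both ways: with the closed type, closedArea can be checked for exhaustiveness, while the open class can never know the full set of its instances.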

Related

Clarification of Terms around Haskell Type system

The type system in Haskell seems to be very important, and I wanted to clarify some terms revolving around it.
Some type classes
Functor
Applicative
Monad
After using :info I found that Functor is a type class, Applicative is a type class with => (deriving?) Functor and Monad deriving Applicative type class.
I've read that Maybe is a Monad, does that mean Maybe is also Applicative and Functor?
-> operator
When I define a type
data Maybe a = Just a | Nothing
and check :t Just I get Just :: a -> Maybe a. How to read this -> operator?
It confuses me with the function where a -> b means it evaluates a to b (sort of returns a maybe) – I tend to think lhs to rhs association but it turns when defining types?
The term type is used in ambiguous ways, Type, Type Class, Type Constructor, Concrete Type etc... I would like to know what they mean to be exact

Indeed the word “type” is used in somewhat ambiguous ways.
Perhaps the most practical way to look at it is that a type is just a set of values. For example, Bool is the finite set containing the values True and False. Mathematically, there are subtle differences between the concepts of set and type, but they aren't really important for a programmer to worry about. But you should in general consider the sets to be infinite, for example Integer contains arbitrarily big numbers.
The most obvious way to define a type is with a data declaration, which in the simplest case just lists all the values:
data Colour = Red | Green | Blue
There we have a type which, as a set, contains three values.
Concrete type is basically what we say to make it clear that we mean the above: a particular type that corresponds to a set of values. Bool is a concrete type, that can easily be understood as a data definition, but also String, Maybe Integer and Double -> IO String are concrete types, though they don't correspond to any single data declaration.
What a concrete type can't have is type variables†, nor can it be an incompletely applied type constructor. For example, Maybe is not a concrete type.
So what is a type constructor? It's the type-level analogue to value constructors. What we mean mathematically by “constructor” in Haskell is an injective function, i.e. a function f where if you're given f(x) you can clearly identify what was x. Furthermore, any different constructors are assumed to have disjoint ranges, which means you can also identify f.‡
Just is an example of a value constructor, but it complicates the discussion that it also has a type parameter. Let's consider a simplified version:
data MaybeInt = JustI Int | NothingI
Now we have
JustI :: Int -> MaybeInt
That's how JustI is a function. Like any function of the same signature, it can be applied to argument values of the right type; for example, you can write JustI 5. What it means for this function to be injective is that I can define a variable, say,
quoxy :: MaybeInt
quoxy = JustI 9328
and then I can pattern match with the JustI constructor:
> case quoxy of { JustI n -> print n }
9328
This would not be possible with a general function of the same signature:
foo :: Int -> MaybeInt
foo i = JustI $ negate i
> case quoxy of { foo n -> print n }
<interactive>:5:17: error: Parse error in pattern: foo
Note that constructors can be nullary, in which case the injective property is meaningless because there is no contained data / arguments of the injective function. Nothing and True are examples of nullary constructors.
Type constructors are the same idea as value constructors: type-level functions that can be pattern-matched. Any type-name defined with data is a type constructor, for example Bool, Colour and Maybe are all type constructors. Bool and Colour are nullary, but Maybe is a unary type constructor: it takes a type argument and only the result is then a concrete type.
So unlike value-level functions, type-level functions are kind of by default type constructors. There are also type-level functions that aren't constructors, but they require -XTypeFamilies.
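As a small sketch of such a non-constructor type-level function, here is a closed type family. Elem and firstOr are hypothetical names for this example.

```haskell
{-# LANGUAGE TypeFamilies #-}

-- Elem computes a type from a type, but it is not a constructor:
-- it is not injective (two different arguments give Int below),
-- so the compiler cannot recover the argument from the result.
type family Elem c where
  Elem [a]       = a
  Elem (Maybe a) = a

-- Elem [Int] and Elem (Maybe Int) both reduce to Int, so this
-- signature is just Int -> [Int] -> Int in disguise.
firstOr :: Elem [Int] -> [Int] -> Elem (Maybe Int)
firstOr def []      = def
firstOr _   (x : _) = x
```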
A type class may be understood as a set of types, in the same vein as a type can be seen as a set of values. This is not quite accurate, it's closer to true to say a class is a set of type constructors but again it's not as useful to ponder the mathematical details – better to look at examples.
There are two main differences between type-as-set-of-values and class-as-set-of-types:
How you define the “elements”: when writing a data declaration, you need to immediately describe what values are allowed. By contrast, a class is defined “empty”, and then the instances are defined later on, possibly in a different module.
How the elements are used. A data type basically enumerates all the values so they can be identified again. Classes meanwhile aren't generally concerned with identifying types, rather they specify properties that the element-types fulfill. These properties come in the form of methods of a class. For example, the instances of the Num class are types that have the property that you can add elements together.
You could say, Haskell is statically typed on the value level (fixed sets of values in each type), but duck-typed on the type level (classes just require that somebody somewhere implements the necessary methods).
A simplified version of the Num example:
class Num a where
  (+) :: a -> a -> a
instance Num Int where
  0 + x = x
  x + y = ...
If the + operator weren't already defined in the prelude, you would now be able to use it with Int numbers. Then later on, perhaps in a different module, you could also make it usable with new, custom number types:
data MyNumberType = BinDigits [Bool]
instance Num MyNumberType where
  BinDigits [] + BinDigits l = BinDigits l
  BinDigits (False:ds) + BinDigits (False:es)
     = BinDigits (False : ...)
Unlike Num, the Functor...Monad type classes are not classes of types, but of 1-ary type constructors. I.e. every functor is a type constructor taking one argument to make it a concrete type. For instance, recall that Maybe is a 1-ary type constructor.
class Functor f where
  fmap :: (a->b) -> f a -> f b
instance Functor Maybe where
  fmap f (Just a) = Just (f a)
  fmap _ Nothing = Nothing
As you have concluded yourself, Applicative is a subclass of Functor. D being a subclass of C means basically that D is a subset of the set of type constructors in C. Therefore, yes, if Maybe is an instance of Monad it is also an instance of Applicative and of Functor.
†That's not quite true: if you consider the universal quantifier explicitly as part of the type, then a concrete type can contain variables. This is a bit of an advanced subject though.
‡This is not guaranteed to be true if the -XPatternSynonyms extension is used.
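To make the subclass relationship concrete, here is a short sketch that uses Maybe through each class in the chain. The names viaFunctor, viaApplicative, viaMonad and increment are hypothetical.

```haskell
-- Maybe used through each class of the Functor / Applicative / Monad chain.
viaFunctor :: Maybe Int
viaFunctor = fmap (+ 1) (Just 2)

viaApplicative :: Maybe Int
viaApplicative = (+) <$> Just 2 <*> Just 3

viaMonad :: Maybe Int
viaMonad = Just 10 >>= \x -> if x > 5 then Just (x * 2) else Nothing

-- Code that only asks for Functor accepts the type constructor of any
-- Monad, because every Monad instance must also be a Functor instance.
increment :: Functor f => f Int -> f Int
increment = fmap (+ 1)
```

Here increment works equally on Maybe Int and [Int], since both Maybe and [] sit in the Functor class.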

Can Haskell type synonyms be used as type constructors?

I'm writing a benchmark to compare the performance of a number of Haskell collections, including STArray, on a given task. To eliminate repetition, I'm trying to write a set of functions that provide a uniform interface to these collections, so that I can implement the task as a polymorphic higher-order function. More specifically, the task is implemented in terms of a polymorphic monad, which is ST s for STArray, and Identity for collections like HashMap, that do not typically need to be manipulated within a monad.
Due to uniformity requirements, I can't use the Identity and HashMap types directly, as I need their kinds to match the kinds of ST and STArray. I thought that the simplest way to achieve this would be to define type synonyms with phantom parameters:
type Identity' s a = Identity a
type HashMap' s i e = HashMap i e
-- etc.
Unfortunately this doesn't work, because when I try to use these synonyms as type constructors in places where I use ST and STArray as type constructors, GHC gives errors like:
The type synonym ‘Identity'’ should have 2 arguments, but has been given none
I came across the -XLiberalTypeSynonyms GHC extension, and thought it would allow me to do this, as the documentation says:
You can apply a type synonym to a partially applied type synonym
and gives this example of doing so:
type Generic i o = forall x. i x -> o x
type Id x = x
foo :: Generic Id []
That example works in GHC 8.0.2 (with -XExistentialQuantification and -XRank2Types). But replacing Generic with a newtype or data declaration, as needed in my use case, does not work.
I.e. the following code leads to the same kind of error that I reported above:
newtype Generic i o = Generic (forall x. i x -> o x)
type Id x = x
foo :: Generic Id []
foo = Generic (\x -> [x])
Question
Is there some other extension that I need to enable to get this to work? If not, is there a good reason why this doesn't work, or is it just an oversight?
Workaround
I'm aware that I can work around this by defining Identity', etc. as fully-fledged types, e.g.:
newtype Identity' s a = Identity' a
newtype Collection collection s i e = Collection (collection i e)
-- etc.
This is not ideal though, as it means that I have to reimplement Identity's Functor, Applicative and Monad instances for Identity', and it means that I have to write additional wrapping and unwrapping code for the collections.
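One way to soften the first drawback, offered only as a sketch and not as an answer to whether synonyms can be made to work: wrap Identity itself in the newtype and let GHC reuse its instances through the GeneralizedNewtypeDeriving and DerivingStrategies extensions. The helper runIdentity' and the value example are hypothetical names.

```haskell
{-# LANGUAGE DerivingStrategies #-}
{-# LANGUAGE GeneralizedNewtypeDeriving #-}

import Data.Functor.Identity (Identity (..))

-- s is a phantom parameter, so Identity' takes two type arguments like ST.
-- The instances are borrowed from Identity rather than written by hand.
newtype Identity' s a = Identity' (Identity a)
  deriving newtype (Functor, Applicative, Monad)

-- Unwrap both layers to get the result out.
runIdentity' :: Identity' s a -> a
runIdentity' (Identity' m) = runIdentity m

-- Monadic code written against Identity' with no hand-written instances.
example :: Identity' s Int
example = do
  x <- pure 20
  y <- pure 22
  pure (x + y)
```

The wrapping and unwrapping of the collections themselves is not addressed by this sketch.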

When shall I define polymorphic functions by type classes or by some other ways?

I am trying to figure out the purpose of type class, and what else is there if not using type class.
Is type class a way to define polymorphic functions?
Is type class the only way to define polymorphic functions? For example:
class Eq a where
  (==), (/=) :: a -> a -> Bool
  x /= y = not (x == y)
instance Eq Bool where
  False == False = True
  True == True = True
  _ == _ = False
Can I define == and /= for Bool (and any other type) without using type class Eq?
Where there is any other way, when shall I use which way to define polymorphic functions, by using type class or by using the other way?

You can always write an unconstrained polymorphic function, one that doesn't require any typeclass. A simple example is
length :: [a] -> Int
– this works without a typeclass, and (well, because) it works for any type a whatsoever. Namely, length doesn't actually care what the values in that list are, it only cares about the structure in which those values are contained. It never actually does anything with those values themselves, and the polymorphic type actually guarantees that.
If the polymorphic task you need is of this form, i.e. a type that you don't actually need to access, you just know it's there, then you should not write/invoke a type class, just use ML-style parametric polymorphism as in length. However, quite often you will need to access the values themselves, inspect them in some way. Doing that without however limiting you to a particular concrete type is what type classes are there for. Eq, as you quoted yourself, is an example.
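The two situations can be put side by side in a short sketch; count and occurrences are hypothetical names chosen to avoid clashing with the Prelude.

```haskell
-- Parametric: never inspects the elements, so no constraint is needed.
count :: [a] -> Int
count = foldr (\_ n -> n + 1) 0

-- Ad hoc: must compare elements, so it asks for an Eq instance.
occurrences :: Eq a => a -> [a] -> Int
occurrences x = count . filter (== x)
```

Removing the Eq a constraint from occurrences makes it fail to type check, because nothing is known about a that would allow (==) to be used.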

Is type class a way to define polymorphic functions?
Yes, it is a way, but not the only way. For example, parametric polymorphism simply means that if you define a function like init :: [a] -> [a], it will work for any a. Type classes are used for ad-hoc polymorphism: depending on the type, the implementation can be entirely different. This is in contrast to parametric polymorphism, where the init function is always the same, regardless of the type chosen for a.
Is type class the only way to define polymorphic functions?
No, see the previous section.
Can I define == and /= for Bool (and any other type) without using type class Eq?
That depends on whether the implementation is the same for all types or not. You can use the -XNoImplicitPrelude flag to avoid importing the Prelude, and then you can define your own (==) function.
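As a sketch of the class-free route for a single type, equality on Bool can simply be an ordinary monomorphic function, and generic code can take the comparison as an argument. The names eqBool, neqBool and memberBy are hypothetical.

```haskell
-- Monomorphic equality for Bool, with no type class involved.
eqBool :: Bool -> Bool -> Bool
eqBool True  True  = True
eqBool False False = True
eqBool _     _     = False

neqBool :: Bool -> Bool -> Bool
neqBool x y = not (eqBool x y)

-- Without a class, generic code receives the comparison explicitly.
memberBy :: (a -> a -> Bool) -> a -> [a] -> Bool
memberBy eq x = any (eq x)
```

The cost is that every caller has to name the right comparison itself; the Eq class exists so the compiler can pick it from the type.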

There is a difference between polymorphic functions in OOP and in Haskell; I mention it because the term "polymorphism" is usually used in OOP.
Functions over lists, for example, are polymorphic:
cons:: a -> [a] -> [a]
cons x xs = x:xs
where a is the polymorphic type, and there is no typeclass there.
By the way, there is a quick way to get default implementations of typeclasses such as Eq or Show, for example:
data MBool = MTrue | MFalse deriving (Eq, Show)
So, the difference is that the typeclass is a constraint, imagine this function with lists:
mapShow :: Show a => [a] -> [String]
mapShow = map show
That's different, because now a is restricted: it can't be just any a. It must implement the typeclass Show.
In conclusion, the type a in the cons function is more general than the constrained type Show a => a in the mapShow function.

Practical applications of Rank 2 polymorphism?

I'm covering polymorphism and I'm trying to see the practical uses of such a feature.
My basic understanding of Rank 2 is:
type MyType = ∀ a. a -> a
subFunction :: a -> a
subFunction el = el
mainFunction :: MyType -> Int
mainFunction func = func 3
I understand that this allows the user to use a polymorphic function (subFunction) inside mainFunction and strictly specify its output (Int). This seems very similar to GADTs:
data Example a where
  ExampleInt :: Int -> Example Int
  ExampleBool :: Bool -> Example Bool
1) Given the above, is my understanding of Rank 2 polymorphism correct?
2) What are the general situations where Rank 2 polymorphism can be used, as opposed to GADT's, for example?

If you pass a polymorphic function as an argument to a Rank2-polymorphic function, you're essentially passing not just one function but a whole family of functions, one for each possible type that fulfills the constraints.
Typically, those forall quantifiers come with a class constraint. For example, I might wish to do number arithmetic with two different types simultaneously (for comparing precision or whatever).
data FloatCompare = FloatCompare {
    singlePrecision :: Float
  , doublePrecision :: Double
  }
Now I might want to modify those numbers through some maths operation. Something like
modifyFloat :: (Num -> Num) -> FloatCompare -> FloatCompare
But Num is not a type, only a type class. I could of course pass a function that would modify any particular number type, but I couldn't use that to modify both a Float and a Double value, at least not without some ugly (and possibly lossy) converting back and forth.
Solution: Rank-2 polymorphism!
modifyFloat :: (∀ n . Num n => n -> n) -> FloatCompare -> FloatCompare
modifyFloat f (FloatCompare single double)
    = FloatCompare (f single) (f double)
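A self-contained version of the above that loads in GHC, assuming the RankNTypes extension and the ASCII forall spelling; squaredPlusOne is a hypothetical usage example.

```haskell
{-# LANGUAGE RankNTypes #-}

data FloatCompare = FloatCompare
  { singlePrecision :: Float
  , doublePrecision :: Double
  } deriving (Show, Eq)

-- The argument must work for every Num type, so the body can apply it
-- at Float and at Double in the same expression.
modifyFloat :: (forall n. Num n => n -> n) -> FloatCompare -> FloatCompare
modifyFloat f (FloatCompare single double) =
  FloatCompare (f single) (f double)

-- The same polymorphic lambda is used on both fields.
squaredPlusOne :: FloatCompare
squaredPlusOne = modifyFloat (\x -> x * x + 1) (FloatCompare 3 3)
```

Passing a monomorphic function such as (sqrt :: Double -> Double) is rejected by the type checker, since it does not work for every Num type.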
The best single example of how this is useful in practice is probably lenses. A lens is a “smart accessor function” to a field in some larger data structure. It allows you to access fields, update them, gather results... while at the same time composing in a very simple way. How it works: Rank2-polymorphism; every lens is polymorphic, with the different instantiations corresponding to the “getter” / “setter” aspects, respectively.

The go-to example of an application of rank-2 types is runST as Benjamin Hodgson mentioned in the comments. This is a rather good example and there are a variety of examples using the same trick. For example, branding to maintain abstract data type invariants across multiple types, avoiding confusion of differentials in ad, a region-based version of ST.
But I'd actually like to talk about how Haskell programmers are implicitly using rank-2 types all the time. Every type class whose methods have universally quantified types desugars to a dictionary with a field with a rank-2 type. In practice, this is virtually always a higher-kinded type class* like Functor or Monad. I'll use a simplified version of Alternative as an example. The class declaration is:
class Alternative f where
    empty :: f a
    (<|>) :: f a -> f a -> f a
The dictionary representing this class would be:
data AlternativeDict f = AlternativeDict {
    empty :: forall a. f a,
    (<|>) :: forall a. f a -> f a -> f a }
Sometimes such an encoding is nice as it allows one to use different "instances" for the same type, perhaps only locally. For example, Maybe has two obvious instances of Alternative depending on whether Just a <|> Just b is Just a or Just b. Languages without type classes, such as Scala, do indeed use this encoding.
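The two Maybe "instances" can be sketched as two explicit dictionaries. The field names emptyD and orElse, and the names firstJust, lastJust and choice, are hypothetical; the fields are renamed only to avoid clashing with Control.Applicative.

```haskell
{-# LANGUAGE RankNTypes #-}

-- Dictionary with rank-2 (universally quantified) fields.
data AlternativeDict f = AlternativeDict
  { emptyD :: forall a. f a
  , orElse :: forall a. f a -> f a -> f a
  }

-- Prefers the leftmost Just.
firstJust :: AlternativeDict Maybe
firstJust = AlternativeDict
  { emptyD = Nothing
  , orElse = \l r -> case l of
      Just _  -> l
      Nothing -> r
  }

-- Prefers the rightmost Just.
lastJust :: AlternativeDict Maybe
lastJust = AlternativeDict
  { emptyD = Nothing
  , orElse = \l r -> case r of
      Just _  -> r
      Nothing -> l
  }

-- Generic code picks its "instance" by taking the dictionary explicitly.
choice :: AlternativeDict f -> [f a] -> f a
choice d = foldr (orElse d) (emptyD d)
```

With a real type class only one of these could be the Alternative instance for Maybe; with dictionaries both can coexist and be chosen per call site.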
To connect to leftaroundabout's reference to lenses, you can view the hierarchy there as a hierarchy of type classes and the lens combinators as simply tools for explicitly building the relevant type class dictionaries. Of course, the reason it isn't actually a hierarchy of type classes is that we usually will have multiple "instances" for the same type. E.g. _head and _head . _tail are both "instances" of Traversal' s a.
* A higher-kinded type class doesn't necessarily lead to this, and it can happen for a type class of kind *. For example:
-- Higher-kinded but doesn't require universal quantification.
class Sum c where
    sum :: c Int -> Int
-- Not higher-kinded but does require universal quantification.
class Length l where
    length :: [a] -> l

If you are using modules in Haskell, you are already using Rank-2 types. Theoretically speaking, modules are records with rank-2 type properties.
For example, the Foo module below in Haskell ...
module Foo(id) where
id :: forall a. a -> a
id x = x
import qualified Foo
main = do
  putStrLn (Foo.id "hello")
  return ()
... can actually be thought of as a record as follows:
data FooType = FooType {
  id :: forall a. a -> a
}
foo :: FooType
foo = FooType {
  id = \x -> x
}
P.S. (unrelated to this question): from a language design perspective, if you are going to support a module system, then you might as well support higher-rank types (i.e. allow arbitrary quantification of type variables at any level) to reduce duplication of effort (i.e. type checking a module should be almost the same as type checking a record with higher-rank types).

On inferring fmap for ADTs

Suppose that two new types are defined like this
type MyProductType a = (FType1 a, FType2 a)
type MyCoproductType a = Either (FType1 a) (FType2 a)
...and that FType1 and FType2 are both instances of Functor.
If one now were to declare MyProductType and MyCoproductType as instances of Functor, would the compiler require explicit definitions for their respective fmap's, or can it infer these definitions from the previous ones?
Also, is the answer to this question implementation-dependent, or does it follow from the Haskell spec?
By way of background, this question was motivated by trying to make sense of a remark in something I'm reading. The author first defines
type Writer a = (a, String)
...and later writes (my emphasis)
...the Writer type constructor is functorial in a. We don't even have to implement fmap for it, because it's just a simple product type.
The emphasized text is the remark I'm trying to make sense of. I thought it meant that Haskell could infer fmap's for any ADT based on functorial types, and, in particular, it could infer the fmap for a "simple product type" like Writer, but now I think this interpretation is not right (at least if I'm reading Ørjan Johansen's answer correctly).
As for what the author meant by that sentence, now I really have no clue. Maybe all he meant is that it's not worth the trouble to re-define Writer in such a way that its functoriality can be made explicit, since it's such a "simple ... type". (Grasping at straws here.)

First, you cannot generally define new instances for type synonyms, especially not partially applied ones as you would need in your case. I think you meant to define a newtype or data instead:
newtype MyProductType a = MP (FType1 a, FType2 a)
newtype MyCoproductType a = MC (Either (FType1 a) (FType2 a))
Standard Haskell says nothing about deriving Functor automatically at all, that is only possible with GHC's DeriveFunctor extension. (Or sometimes GeneralizedNewtypeDeriving, but that doesn't apply in your examples because you're not using a just as the last argument inside the constructor.)
So let's try that:
{-# LANGUAGE DeriveFunctor #-}
data FType1 a = FType1 a deriving Functor
data FType2 a = FType2 a deriving Functor
newtype MyProductType a = MP (FType1 a, FType2 a) deriving Functor
newtype MyCoproductType a = MC (Either (FType1 a) (FType2 a)) deriving Functor
We get the error message:
Test.hs:6:76:
Can't make a derived instance of ‘Functor MyCoproductType’:
Constructor ‘MC’ must use the type variable only as the last argument of a data type
In the newtype declaration for ‘MyCoproductType’
It turns out that GHC can derive the first three, but not the last one. I believe the third one only works because tuples are special cased. Either doesn't work though, because GHC doesn't keep any special knowledge about how Either treats its first argument. It's nominally a mathematical functor in that argument, but not a Haskell Functor.
Note that GHC is smarter about using variables only as last argument of types known to be Functors. The following works fine:
newtype MyWrappedType a = MW (Either (FType1 Int) (FType2 (Maybe a))) deriving Functor
So to sum up: It depends, GHC has an extension for this but it's not always smart enough to do what you want.
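Putting the pieces together, a sketch that compiles: the product case is derived, and the coproduct case gets a hand-written instance that maps under both branches. The Show and Eq derivations are additions made only so the results can be inspected.

```haskell
{-# LANGUAGE DeriveFunctor #-}

data FType1 a = FType1 a deriving (Functor, Show, Eq)
data FType2 a = FType2 a deriving (Functor, Show, Eq)

-- The product case is derivable.
newtype MyProductType a = MP (FType1 a, FType2 a)
  deriving (Functor, Show, Eq)

-- The coproduct case is not derived here; see the instance below.
newtype MyCoproductType a = MC (Either (FType1 a) (FType2 a))
  deriving (Show, Eq)

-- Written by hand: apply fmap inside whichever branch is present.
instance Functor MyCoproductType where
  fmap f (MC (Left x))  = MC (Left (fmap f x))
  fmap f (MC (Right y)) = MC (Right (fmap f y))
```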
