From what I understand Foldable basically represents structures that have a number of elements of the same type that can be iterated over, i.e. lists, maps, sets, etc.
Is there a class like Appendable or Insertable which basically represents structures one can add elements to? Of course there would be no guarantee of the order of which elements are retrieved.
I'd rather not create a class myself if there already is one.
You should look at the Data.Collections package. It contains the Unfoldable typeclass with the following methods:
class Unfoldable c i | c -> i where
insert :: i -> c -> c
empty :: c
singleton :: i -> c
It also provides the insertMany and insertManySorted methods, to insert all the elements from a Foldable into an Unfoldable.
If you make your type an instance of both Foldable and Unfoldable then you can both insert and retrieve elements from it.
I think insertion itself isn't really a sensible concept. There are better ways to generalize this. For example Alternative is a sensible type class there. You get pure for singletons and some generic union operation in the form of <|>.
Is there a class like Appendable or Insertable which basically represents structures one can add elements to?
You want to be clearer on what you mean by "add elements." Because there are two ways this could go:
class Insertable c where
-- Add one element to the collection.
insert :: a -> c a -> c a
class Appendable c where
-- Append a collection to another.
append :: c a -> c a -> c a
The latter, you will note, does not support adding a lone a to the collection unless you add an operation like this:
class Pointed c where
singleton :: a -> c a
Note that if you have Appendable and Pointed instances you can define an Insertable:
instance (Appendable c, Pointed c) => Insertable c where
insert x xs = append (singleton x) xs
The Insertable class, together with operations for actually accessing the collection's elements (e.g. the Foldable class), can likewise be used to define an Appendable instance.
In any case, my mock Appendable class above is really just Monoid in disguise. My Insertable can likewise be seen as a disguised version of Chris Taylor's suggested Unfoldable class. I'd go with the Collection class from that package, though, which combines both Unfoldable with Foldable.
Related
This might seem like a stupid question, but let's say I have a Haskell type class C (a :: K) defined for some fixed kind K, and I know I have instances C a for all types a :: K. Let's furthermore assume I have T :: K -> * and a value t :: C A => T A, is there any way to eliminate the constraint? I feel like this should be possible, but I can't come up with anything.
I'm mainly interested in the case where K is a data kind.
No.
The problem here is simple: foralls in Haskell are parametric polymorphism. This means that the same code must run no matter which type gets chosen. The thing I want to say next needs a way to pull out the C a => part of a type and talk about it in isolation, and there isn't really a way to do that in Haskell's syntax. But making up a type that's kinda like that is pretty standard:
data Dict c where Dict :: c => Dict c
Now the thing you are asking for, essentially, is:
magic :: forall a. Dict (C a)
Unfortunately, the code that you want this to represent -- the behaviors of the class methods -- is different for each type of the appropriate kind.
"Okay, okay," I hear you cry, "but when I write t :: C a => T a, isn't that running different code depending on the choice of a?". But no, it turns out, you aren't. There are two extant implementation schemes for typeclasses: dictionaries (alluded to above, and essentially the same as vtables if you're familiar with that from C++) and explicit type representations. In both cases, under the hood, <C> => <T> is actually a function -- taking the methods implementing the constraint <C> in the one case and a runtime representation of all the types mentioned in the constraint <C> in the other. And that function is the same function in all cases -- the dispatch to different code paths happens in the caller.
The type forall a. Dict (C a), on the other hand, has no such luxury: its type is not permitted to be a function under the hood, but must instead be a value of a specific data type.
Consider this module:
data K = A | B
class C (t::K) where
str :: String
instance C 'A where
str = "A"
instance C 'B where
str = "B"
type family F :: K
foo :: String
foo = str #F
Arguably, C t holds for all t::K, since we have the two instances. Then, however, we define a type family F::K which is, essentially, an unknown kind: it could be A, it could be B. Without a line such type instance F = ... defining F, it is impossible to define foo, since it could be "A" or "B".
Indeed, it is impossible to resolve this constraint.
Note that the type instance might even be defined in another module. Indeed, it is pretty common to define type families and type instances in distinct modules, like we do for type classes and their instances. It is impossible, however, to perform separate compilation when we have no information about the type instance.
The solution is, of course, deferring the constraint resolution by adding the constraint to foo:
foo :: C F => String
foo = str #F
Coming from the OOP world, I sometimes find myself trying to use the inheritance pattern in Haskell, with varying degrees of success. Here's a little puzzle I encountered with subclassing (using GHC 8.10.7).
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}
import Data.List (sort)
class Collection c a where
-- gets list of elements in the collection
elements :: c a -> [a]
class OrderedCollection c a where
-- gets sorted list of elements in the collection
orderedElements :: c a -> [a]
instance (Ord a, OrderedCollection c a) => Collection c a where
-- "default" implementation
elements = orderedElements
newtype SortedList a = SortedList [a]
deriving Show
instance (Ord a) => OrderedCollection SortedList a where
-- need to sort the elements in the list
orderedElements (SortedList xs) = sort xs
instance Collection SortedList a where
-- "optimized" implementation: no need to sort
elements (SortedList xs) = xs
test :: (Ord a, Show a, OrderedCollection c a) => c a -> IO ()
test coll = do
putStrLn $ "ordered elements: " ++ show (orderedElements coll)
putStrLn $ "elements: " ++ show (elements coll)
myList :: SortedList Int
myList = SortedList [3, 2, 1]
main :: IO ()
main = do
test myList
After including the necessary language extensions, this still gave me an error: Overlapping instances for Collection c a arising from a use of ‘elements’. It suggests using IncoherentInstances. Since this extension is now deprecated in favor of per-instance pragmas, I added an INCOHERENT pragma to the subclass instance:
instance {-# INCOHERENT #-} (Ord a, OrderedCollection c a) => Collection c a where
...
This successfully compiled. However, the result was not what I expected, as the output was:
ordered elements: [1,2,3]
elements: [1,2,3]
What I wanted was for the specialized implementation of Collection for SortedList to override the default (in an OO language, SortedList would inherit from OrderedCollection and then override the elements method). But here the type checker does not know to use SortedList's custom Collection implementation, because the type signature of test only imposes the constraint OrderedCollection c a.
Next, I tried adding the Collection constraint:
test :: (Ord a, Show a, Collection c a, OrderedCollection c a) => c a -> IO ()
This gave me the output I wanted:
ordered elements: [1,2,3]
elements: [3,2,1]
However, GHC also issued a warning about "fragile inner bindings" and suggested I add the MonoLocalBinds extension, which silences that warning. In any case, I'm not thrilled with having to include the Collection c a constraint (given it's implied by OrderedCollection c a), or having to use incoherent instances.
Interestingly, if I changed the INCOHERENT pragma to OVERLAPPABLE, it still compiled, and it also allowed me to remove MonoLocalBinds.
My question is, are there any alternative approaches to achieving the desired "inheritance" behavior here, without needing the redundant constraint in test?
When you write this:
instance ... => Collection c a where
You're declaring a Collection instance for all types ever. And it doesn't matter at all what's on the left of the fat arrow =>. Constraints do not participate in instance resolution. When the compiler tries to lookup an instance for a particular type, it only looks at what's on the right of the fat arrow =>, and only after finding a matching instance does it check if its constraints are satisfied. And if they're not, the compiler won't go back to look for another instance. That's how instance resolution works, and there are good reasons for it.
So, to reiterate: Collection c a means that this is an instance for all types.
And therefore, any subsequent Collection instances you might declare would of course be overlapping.
Thankfully, in this particular case, there is a better way: you can declare default methods without creating a universal instance like that. To do that, declare the method right inside the class declaration. And yes, you can put constraints on it too (see docs):
class Collection c a where
-- gets list of elements in the collection
elements :: c a -> [a]
default elements :: OrderedCollection c a => c a -> [a]
elements = orderedElements
But more generally, while type classes plus existential quantification is technically equivalent to OOP-style class hierarchies, if you try to actually model your domain like that, it would be more and more awkward and painful the further you go. It's a bit like trying to model ADTs in something like Java. Technically possible, but oh so messy!
There are some legitimate cases where a class hierarchy may make sense (one notable example is the GHC exception system), but most of the time there are much simpler ways.
I'm covering polymorphism and I'm trying to see the practical uses of such a feature.
My basic understanding of Rank 2 is:
type MyType = ∀ a. a -> a
subFunction :: a -> a
subFunction el = el
mainFunction :: MyType -> Int
mainFunction func = func 3
I understand that this is allowing the user to use a polymorphic function (subFunction) inside mainFunction and strictly specify it's output (Int). This seems very similar to GADT's:
data Example a where
ExampleInt :: Int -> Example Int
ExampleBool :: Bool -> Example Bool
1) Given the above, is my understanding of Rank 2 polymorphism correct?
2) What are the general situations where Rank 2 polymorphism can be used, as opposed to GADT's, for example?
If you pass a polymorphic function as and argument to a Rank2-polymorphic function, you're essentially passing not just one function but a whole family of functions – for all possible types that fulfill the constraints.
Typically, those forall quantifiers come with a class constraint. For example, I might wish to do number arithmetic with two different types simultaneously (for comparing precision or whatever).
data FloatCompare = FloatCompare {
singlePrecision :: Float
, doublePrecision :: Double
}
Now I might want to modify those numbers through some maths operation. Something like
modifyFloat :: (Num -> Num) -> FloatCompare -> FloatCompare
But Num is not a type, only a type class. I could of course pass a function that would modify any particular number type, but I couldn't use that to modify both a Float and a Double value, at least not without some ugly (and possibly lossy) converting back and forth.
Solution: Rank-2 polymorphism!
modifyFloat :: (∀ n . Num n => n -> n) -> FloatCompare -> FloatCompare
mofidyFloat f (FloatCompare single double)
= FloatCompare (f single) (f double)
The best single example of how this is useful in practice are probably lenses. A lens is a “smart accessor function” to a field in some larger data structure. It allows you to access fields, update them, gather results... while at the same time composing in a very simple way. How it works: Rank2-polymorphism; every lens is polymorphic, with the different instantiations corresponding to the “getter” / “setter” aspects, respectively.
The go-to example of an application of rank-2 types is runST as Benjamin Hodgson mentioned in the comments. This is a rather good example and there are a variety of examples using the same trick. For example, branding to maintain abstract data type invariants across multiple types, avoiding confusion of differentials in ad, a region-based version of ST.
But I'd actually like to talk about how Haskell programmers are implicitly using rank-2 types all the time. Every type class whose methods have universally quantified types desugars to a dictionary with a field with a rank-2 type. In practice, this is virtually always a higher-kinded type class* like Functor or Monad. I'll use a simplified version of Alternative as an example. The class declaration is:
class Alternative f where
empty :: f a
(<|>) :: f a -> f a -> f a
The dictionary representing this class would be:
data AlternativeDict f = AlternativeDict {
empty :: forall a. f a,
(<|>) :: forall a. f a -> f a -> f a }
Sometimes such an encoding is nice as it allows one to use different "instances" for the same type, perhaps only locally. For example, Maybe has two obvious instances of Alternative depending on whether Just a <|> Just b is Just a or Just b. Languages without type classes, such as Scala, do indeed use this encoding.
To connect to leftaroundabout's reference to lenses, you can view the hierarchy there as a hierarchy of type classes and the lens combinators as simply tools for explicitly building the relevant type class dictionaries. Of course, the reason it isn't actually a hierarchy of type classes is that we usually will have multiple "instances" for the same type. E.g. _head and _head . _tail are both "instances" of Traversal' s a.
* A higher-kinded type class doesn't necessarily lead to this, and it can happen for a type class of kind *. For example:
-- Higher-kinded but doesn't require universal quantification.
class Sum c where
sum :: c Int -> Int
-- Not higher-kinded but does require universal quantification.
class Length l where
length :: [a] -> l
If you are using modules in Haskell, you are already using Rank-2 types. Theoretically speaking, modules are records with rank-2 type properties.
For example, the Foo module below in Haskell ...
module Foo(id) where
id :: forall a. a -> a
id x = x
import qualified Foo
main = do
putStrLn (Foo.id "hello")
return ()
... can actually be thought as a record as follows:
type FooType = FooType {
id :: forall a. a -> a
}
Foo :: FooType
Foo = Foo {
id = \x -> x
}
P/S (unrelated this question): from a language design perspective, if you are going to support module system, then you might as well support higher-rank types (i.e. allow arbitrary quantification of type variables on any level) to reduce duplication of efforts (i.e. type checking a module should be almost the same as type checking a record with higher rank types).
As map exists for Data.Text it could be an instance of Functor which would be useful when writing generic algorithms that work on Strings and Texts. Is there a reason why no such instance exists?
Yes, because a Functor has to be a general 'container', and Text can only contain Char.
A more general explanation in addition to augustss answer: a Functor F must have kind * -> * so as to provide a mapping a -> F a, but the kind of Text is *
Text can't be an instance of Functor for the reason already given. But there is a type class called ListLike, which is targeted for data types that have only one kind of elements, such as Text can contain only Chars. With ListLike you can write a function that works on String as well as Text.
It's defined using functional dependencies as
class (FoldableLL full item, Monoid full) => ListLike full item | full -> item where
-- ...
map :: ListLike full' item' => (item -> item') -> full -> full'
-- ...
which allows you to map elements using a function and create another (possibly different) ListLike from them. (However note that some of ListLike's operations can be inefficient, so be careful if you'll be using them for large data.)
See Making a single function work on lists, ByteStrings and Texts (and perhaps other similar representations).
In Scrap your boilerplate reloaded, the authors describe a new presentation of Scrap Your Boilerplate, which is supposed to be equivalent to the original.
However, one difference is that they assume a finite, closed set of "base" types, encoded with a GADT
data Type :: * -> * where
Int :: Type Int
List :: Type a -> Type [a]
...
In the original SYB, type-safe cast is used, implemented using the Typeable class.
My questions are:
What is the relationship between these two approaches?
Why was the GADT representation chosen for the "SYB Reloaded" presentation?
[I am one of the authors of the "SYB Reloaded" paper.]
TL;DR We really just used it because it seemed more beautiful to us. The class-based Typeable approach is more practical. The Spine view can be combined with the Typeable class and does not depend on the Type GADT.
The paper states this in its conclusions:
Our implementation handles the two central ingredients of generic programming differently from the original SYB paper: we use overloaded functions with
explicit type arguments instead of overloaded functions based on a type-safe
cast 1 or a class-based extensible scheme [20]; and we use the explicit spine
view rather than a combinator-based approach. Both changes are independent
of each other, and have been made with clarity in mind: we think that the structure of the SYB approach is more visible in our setting, and that the relations
to PolyP and Generic Haskell become clearer. We have revealed that while the
spine view is limited in the class of generic functions that can be written, it is
applicable to a very large class of data types, including GADTs.
Our approach cannot be used easily as a library, because the encoding of
overloaded functions using explicit type arguments requires the extensibility of
the Type data type and of functions such as toSpine. One can, however, incorporate Spine into the SYB library while still using the techniques of the SYB
papers to encode overloaded functions.
So, the choice of using a GADT for type representation is one we made mainly for clarity. As Don states in his answer, there are some obvious advantages in this representation, namely that it maintains static information about what type a type representation is for, and that it allows us to implement cast without any further magic, and in particular without the use of unsafeCoerce. Type-indexed functions can also be implemented directly by using pattern matching on the type, and without falling back to various combinators such as mkQ or extQ.
Fact is that I (and I think the co-authors) simply were not very fond of the Typeable class. (In fact, I'm still not, although it is finally becoming a bit more disciplined now in that GHC adds auto-deriving for Typeable, makes it kind-polymorphic, and will ultimately remove the possibility to define your own instances.) In addition, Typeable wasn't quite as established and widely known as it is perhaps now, so it seemed appealing to "explain" it by using the GADT encoding. And furthermore, this was the time when we were also thinking about adding open datatypes to Haskell, thereby alleviating the restriction that the GADT is closed.
So, to summarize: If you actually need dynamic type information only for a closed universe, I'd always go for the GADT, because you can use pattern matching to define type-indexed functions, and you do not have to rely on unsafeCoerce nor advanced compiler magic. If the universe is open, however, which is quite common, certainly for the generic programming setting, then the GADT approach might be instructive, but isn't practical, and using Typeable is the way to go.
However, as we also state in the conclusions of the paper, the choice of Type over Typeable isn't a prerequisite for the other choice we're making, namely to use the Spine view, which I think is more important and really the core of the paper.
The paper itself shows (in Section 8) a variation inspired by the "Scrap your Boilerplate with Class" paper, which uses a Spine view with a class constraint instead. But we can also do a more direct development, which I show in the following. For this, we'll use Typeable from Data.Typeable, but define our own Data class which, for simplicity, just contains the toSpine method:
class Typeable a => Data a where
toSpine :: a -> Spine a
The Spine datatype now uses the Data constraint:
data Spine :: * -> * where
Constr :: a -> Spine a
(:<>:) :: (Data a) => Spine (a -> b) -> a -> Spine b
The function fromSpine is as trivial as with the other representation:
fromSpine :: Spine a -> a
fromSpine (Constr x) = x
fromSpine (c :<>: x) = fromSpine c x
Instances for Data are trivial for flat types such as Int:
instance Data Int where
toSpine = Constr
And they're still entirely straightforward for structured types such as binary trees:
data Tree a = Empty | Node (Tree a) a (Tree a)
instance Data a => Data (Tree a) where
toSpine Empty = Constr Empty
toSpine (Node l x r) = Constr Node :<>: l :<>: x :<>: r
The paper then goes on and defines various generic functions, such as mapQ. These definitions hardly change. We only get class constraints for Data a => where the paper has function arguments of Type a ->:
mapQ :: Query r -> Query [r]
mapQ q = mapQ' q . toSpine
mapQ' :: Query r -> (forall a. Spine a -> [r])
mapQ' q (Constr c) = []
mapQ' q (f :<>: x) = mapQ' q f ++ [q x]
Higher-level functions such as everything also just lose their explicit type arguments (and then actually look exactly the same as in original SYB):
everything :: (r -> r -> r) -> Query r -> Query r
everything op q x = foldl op (q x) (mapQ (everything op q) x)
As I said above, if we now want to define a generic sum function summing up all Int occurrences, we cannot pattern match anymore, but have to fall back to mkQ, but mkQ is defined purely in terms of Typeable and completely independent of Spine:
mkQ :: (Typeable a, Typeable b) => r -> (b -> r) -> a -> r
(r `mkQ` br) a = maybe r br (cast a)
And then (again exactly as in original SYB):
sum :: Query Int
sum = everything (+) sumQ
sumQ :: Query Int
sumQ = mkQ 0 id
For some of the stuff later in the paper (e.g., adding constructor information), a bit more work is needed, but it can all be done. So using Spine really does not depend on using Type at all.
Well, obviously the Typeable use is open -- new variants can be added after the fact, and without modifying the original definitions.
The important change though is that in that TypeRep is untyped. That is, there is no connection between the runtime type , TypeRep, and the static type it encodes. With the GADT approach we can encode the mapping between a type a and its Type, given by the GADT Type a.
We thus bake in evidence for the type rep being statically linked to its origin type, and can write statically typed dynamic application (for example) using Type a as evidence that we have a runtime a.
In the older TypeRep case, we have no such evidence and it comes down to runtime string equality, and a coerce and hope for the best through fromDynamic.
Compare the signatures:
toDyn :: Typeable a => a -> TypeRep -> Dynamic
versus GADT style:
toDyn :: Type a => a -> Type a -> Dynamic
I can't fake my type evidence, and I can use that later when reconstructing things, to e.g. lookup the type class instances for a when all I have is a Type a.