How does one statisfy a class constraint in an instance of a class that requires a type constructor rather than a concrete type? - haskell

I'm currently in Chapter 8 of Learn you a Haskell, and I've reached the section on the Functor typeclass. In said section the author gives examples of how different types could be made instances of the class (e.g Maybe, a custom Tree type, etc.) Seeing this, I decided to (for fun and practice) try implementing an instance for the Data.Set type; in all of this ignoring Data.Set.map, of course.
The actual instance itself is pretty straight-forward, and I wrote it as:
instance Functor Set.Set where
fmap f empty = Set.empty
fmap f s = Set.fromList $ map f (Set.elems s)
But, since I happen to use the function fromList this brings in a class constraint calling for the types used in the Set to be Ord, as is explained by a compiler error:
Error occurred
ERROR line 4 - Cannot justify constraints in instance member binding
*** Expression : fmap
*** Type : Functor Set => (a -> b) -> Set a -> Set b
*** Given context : Functor Set
*** Constraints : Ord b
See: Live Example
I tried putting a constraint on the instance, or adding a type signature to fmap, but neither succeeded (both were compiler errors as well.)
Given a situation like this, how can a constraint be fulfilled and satisfied? Is there any possible way?
Thanks in advance! :)

Unfortunately, there is no easy way to do this with the standard Functor class. This is why Set does not come with a Functor instance by default: you cannot write one.
This is something of a problem, and there have been some suggested solutions (e.g. defining the Functor class in a different way), but I do not know if there is a consensus on how to best handle this.
I believe one approach is to rewrite the Functor class using constraint kinds to reify the additional constraints instances of the new Functor class may have. This would let you specify that Set has to contain types from the Ord class.
Another approach uses only multi-parameter classes. I could only find the article about doing this for the Monad class, but making Set part of Monad faces the same problems as making it part of Functor. It's called Restricted Monads.
The basic gist of using multi-parameter classes here seems to be something like this:
class Functor' f a b where
fmap' :: (a -> b) -> f a -> f b
instance (Ord a, Ord b) => Functor' Data.Set.Set a b where
fmap' = Data.Set.map
Essentially, all you're doing here is making the types in the Set also part of the class. This then lets you constrain what these types can be when you write an instance of that class.
This version of Functor needs two extensions: MultiParamTypeClasses and FlexibleInstances. (You need the first extension to be able to define the class and the second extension to be able to define an instance for Set.)
Haskell : An example of a Foldable which is not a Functor (or not Traversable)? has a good discussion about this.

This is impossible. The purpose of the Functor class is that if you have Functor f => f a, you can replace the a with whatever you like. The class is not allowed to constrain you to only return this or that. Since Set requires that its elements satisfy certain constraints (and indeed this isn't an implementation detail but really an essential property of sets), it doesn't satisfy the requirements of Functor.
There are, as mentioned in another answer, ways of developing a class like Functor that does constrain you in that way, but it's really a different class, because it gives the user of the class fewer guarantees (you don't get to use this with whatever type parameter you want), in exchange for becoming applicable to a wider range of types. That is, after all, the classic tradeoff of defining a property of types: the more types you want to satisfy it, the less they must be forced to satisfy.
(Another interesting example of where this shows up is the MonadPlus class. In particular, for every instance MonadPlus TC you can make an instance Monoid (TC a), but you can't always go the other way around. Hence the Monoid (Maybe a) instance is different from the MonadPlus Maybe instance, because the former can restrict the a but the latter can't.)

You can do this using a CoYoneda Functor.
{-# LANGUAGE GADTs #-}
data CYSet a where
CYSet :: (Ord a) => Set.Set a -> (a -> b) -> CYSet b
liftCYSet :: (Ord a) => Set.Set a -> CYSet a
liftCYSet s = CYSet s id
lowerCYSet :: (Ord a) => CYSet a -> Set.Set a
lowerCYSet (CYSet s f) = Set.fromList $ map f $ Set.elems s
instance Functor CYSet where
fmap f (CYSet s g) = CYSet s (f . g)
main = putStrLn . show
$ lowerCYSet
$ fmap (\x -> x `mod` 3)
$ fmap abs
$ fmap (\x -> x - 5)
$ liftCYSet $ Set.fromList [1..10]
-- prints "fromList [0,1,2]"

Related

Instance signatures: constraints on methods

With InstanceSignatures, I can give a signature for a method within an instance decl.
The type signature in the instance declaration must be more polymorphic than (or the same as) the one in the class declaration, instantiated with the instance type.
I can see from this answer that the type part of the sig can't be more specific; but why couldn't you add extra constraints? After all you can put constraints on the instance decl that will make the instance more specific than "the class declaration, instantiated with the instance type".
Addit: (In response to the first couple of comments.) To explain "you can put constraints on the instance decl ... instance more specific": the OVERLAPPABLE instance head here alleges it provides addNat for all types a, b, c. But it doesn't: it only provides if a is of the form SNat a', c is of the form SNat c', etc. I'm asking why I can't similarly restrict the types at the method level?
For example [adapted from Hughes 1999] (more realistically this could be a BST/Rose tree, etc):
data AscList a = AscList [a] -- ascending list
instance Functor AscList where -- constructor class
fmap f (AscList xs) = AscList $ sort $ fmap f xs
-- :: Ord b => (a -> b) -> AscList a -> AscList b -- inferred
Without an instance signature GHC complains no instance for (Ord b). With the signature GHC complains the instantiated sig from the class is more polymorphic than the one given.
This (excellent) answer explains the machinery in the instance dictionary. I can see the entry in the dictionary for the method has a type set from the number of constraints for the method (if any) in the class decl. There's no room to put an extra dictionary/parameter for the method.
That seems merely that the implementation didn't foresee a need. Is there a deeper/more theory-based reason against?
(BTW I'm not ever-so convinced by Hughes' approach: that would want both Ord a => WFT(AscList a) and Ord b => WFT(AscList b). But there's no need to require the incoming (AscList a) is ascending. Other constructor classes for AscList perhaps don't need an Ord constraint anywhere.)
If you allow adding constraints to instance methods that aren't already in the class method, a polymorphic function using type classes might no longer typecheck when you specialize it.
For example, consider your AscList instance above, and the following usage of Functor:
data T f = T (f (IO ())) -- f applied to a non-Ord
example :: Functor f => T f -> T f
example (T t) = T (fmap (\x -> x >> print 33) t)
The type of example says you can instantiate f with any functor, such as f = AscList.
But that makes no sense because it would try to sort an AscList (IO ()).
The only way to tell that something goes wrong when we specialize example here is to read its definition, to find whether uses of fmap are still valid, and that goes against modularity.
The type class constraints in a type signature do not and should not record how their methods are used. Conversely, that means that instances do not get to add constraints to their method implementations.

Hidden forall quantified types in ReifiedTraversal

This question really is more generic, since while I was asking it I found out how to fix it in this particular case (even though I don't like it) but I'll phrase it in my particular context.
Context:
I'm using the lens library and I found it particularly useful to provide functionality for "adding" traversals (conceptually, a traversal that traverses all the elements in both original traversals). I did not find a default implementation so I did it using Monoid. In order to be able to implement an instance, I had to use the ReifiedTraversal wrapper, which I assume is in the library precisely for this purpose:
-- Adding traversals
add_traversals :: Semigroup t => Traversal s t a b -> Traversal s t a b -> Traversal s t a b
add_traversals t1 t2 f s = liftA2 (<>) (t1 f s) (t2 f s)
instance Semigroup t => Semigroup (ReifiedTraversal s t a b) where
a1 <> a2 = Traversal (add_traversals (runTraversal a1) (runTraversal a2))
instance Semigroup s => Monoid (ReifiedTraversal' s a) where
mempty = Traversal (\_ -> pure . id)
The immediate application I want to extract from this is being able to provide a traversal for a specified set of indices in a list. Therefore, the underlying semigroup is [] and so is the underlying Traversable. First, I implemented a lens for an individual index in a list:
lens_idx :: Int -> Lens' [a] a
lens_idx _ f [] = error "No such index in the list"
lens_idx 0 f (x:xs) = fmap (\rx -> rx:xs) (f x)
lens_idx n f (x:xs) = fmap (\rxs -> x:rxs) (lens_idx (n-1) f xs)
All that remains to be done is to combine these two things, ideally to implement a function traversal_idxs :: [Int] -> Traversal' [a] a
Problem:
I get type checking errors when I try to use this. I know it has to do with the fact that Traversal is a type that includes a constrained forall quantifier in its definition. In order to be able to use the Monoid instance, I need to first reify the lenses provided by lens_idx (which are, of course, also traversals). I try to do this by doing:
r_lens_idx :: Int -> ReifiedTraversal' [a] a
r_lens_idx = Traversal . lens_idx
But this fails with two errors (two versions of the same error really):
Couldn't match type ‘f’ with ‘f0’...
Ambiguous type variable ‘f0’ arising from a use of ‘lens_idx’
prevents the constraint ‘(Functor f0)’ from being solved...
I understand this has to do with the hidden forall f. Functor f => in the Traversal definition. While writing this, I realized that the following does work:
r_lens_idx :: Int -> ReifiedTraversal' [a] a
r_lens_idx idx = Traversal (lens_idx idx)
So, by giving it the parameter it can make the f explicit to itself and then it can work with it. However, this feels extremely ad-hoc. Specially because originally I was trying to build this r_lens_idx inline in a where clause in the definition of the traversal_idxs function (in fact... on a function defining this function inline because I'm not really going to use it that often).
So, sure, I guess I can always use lambda abstraction, but... is this really the right way to deal with this? It feels like a hack, or rather, that the original error is an oversight by the type-checker.
The "adding" of traversals that you want was added in the most recent lens release, you can find it under the name adjoin. Note that it is unsound to use if your traversals overlap at all.
I am replying to my own question, although it is only pointing out that what I was trying to do with traversals was not actually possible in that shape and how I overcame it. There is still the underlying problem of the hidden forall quantified variables and how is it possible that lambda abstraction can make code that does not type check suddenly type check (or rather, why it did not type check to start with).
It turns out my implementation of Monoid for Traversal was deeply flawed. I realized when I started debugging it. For instance, I was trying to combine a list of indices, and a function that would return a lens for each index, mapping to that index in a list, to a traversal that would map to exactly those indices. That is possible, but it relies on the fact that List is a Monad, instead of just using the Applicative structure.
The function that I had written originally for add_traversal used only the Applicative structure, but instead of mapping to those indices in the list, it would duplicate the list for each index, concatenating them, each version of the list having applied its lens.
When trying to fix it, I realized I needed to use bind to implement what I really wanted, and then I stumbled upon this: https://www.reddit.com/r/haskell/comments/4tfao3/monadic_traversals/
So the answer was clear: I can do what I want, but it's not a Monoid over Traversal, but instead a Monoid over MTraversal. It still serves my purposes perfectly.
This is the resulting code for that:
-- Monadic traversals: Traversals that only work with monads, but they allow other things that rely on the fact they only need to work with monads, like sum.
type MTraversal s t a b = forall m. Monad m => (a -> m b) -> s -> m t
type MTraversal' s a = MTraversal s s a a
newtype ReifiedMTraversal s t a b = MTraversal {runMTraversal :: MTraversal s t a b}
type ReifiedMTraversal' s a = ReifiedMTraversal s s a a
-- Adding mtraversals
add_mtraversals :: Semigroup t => MTraversal r t a b -> MTraversal s r a b -> MTraversal s t a b
add_mtraversals t1 t2 f s = (t2 f s) >>= (t1 f)
instance Semigroup s => Semigroup (ReifiedMTraversal' s a) where
a1 <> a2 = MTraversal (add_mtraversals (runMTraversal a1) (runMTraversal a2))
instance Semigroup s => Monoid (ReifiedMTraversal' s a) where
mempty = MTraversal (\_ -> return . id)
Note that MTraversal is still a LensLike and an ASetter, so you can use many operators from the lens package, like .~.
As I mentioned, though, I still have to use lambda abstraction when using this for my purposes due to the forall quantifier being in an uncomfortable place, and I'd love if someone could clarify what the heck is up with the type checker in that regard.

The relationship between type classes - dependency vs using instance

Let's say I want to define my own type classes for semigroup and monoid. So I write this code:
class Semigroup g where
(<>) :: g -> g -> g
class Semigroup m => Monoid m where
mempty :: m
But there's another way I could define the relationship between those type classes, with some extensions:
class Semigroup g where
gappend :: g -> g -> g
class Monoid m where
mempty :: m
mappend :: m -> m -> m
instance Monoid m => Semigroup m where
gappend = mappend
The latter design has an advantage - Later I can add more instances for Monoid. For example, if I have a typeclass for a vector space I can later make it an additive group without having to specify it in the class declaration. On the other hand, I am forced to use flexible instances and undecidable instances.
My question is - what is the best design for this particular case?
The first version says "monoids are semigroups, with the additional property mempty".
The second certain says "all types are semigroups, provided they are also monoids". Unless you're happy to turn on overlapping instances you can't add any other instances, so this amounts to "a type is a semigroup if and only if it is a monoid"; exactly backwards from the true relationship.
I would almost always prefer the former. Yes, it forces someone wanting to add a Monoid instance to also write a Semigroup instance, but the real "work" they have to do is the same either way: decide on implementations for mempty and mappend. The only thing you save them from by using the second approach is a little boilerplate.

Any advantage of using type constructors in type classes?

Take for example the class Functor:
class Functor a
instance Functor Maybe
Here Maybe is a type constructor.
But we can do this in two other ways:
Firstly, using multi-parameter type classes:
class MultiFunctor a e
instance MultiFunctor (Maybe a) a
Secondly using type families:
class MonoFunctor a
instance MonoFunctor (Maybe a)
type family Element
type instance Element (Maybe a) a
Now there's one obvious advantage of the two latter methods, namely that it allows us to do things like this:
instance Text Char
Or:
instance Text
type instance Element Text Char
So we can work with monomorphic containers.
The second advantage is that we can make instances of types that don't have the type parameter as the final parameter. Lets say we make an Either style type but put the types the wrong way around:
data Silly t errorT = Silly t errorT
instance Functor Silly -- oh no we can't do this without a newtype wrapper
Whereas
instance MultiFunctor (Silly t errorT) t
works fine and
instance MonoFunctor (Silly t errorT)
type instance Element (Silly t errorT) t
is also good.
Given these flexibility advantages of only using complete types (not type signatures) in type class definitions, is there any reason to use the original style definition, assuming you're using GHC and don't mind using the extensions? That is, is there anything special you can do putting a type constructor, not just a full type in a type class that you can't do with multi-parameter type classes or type families?
Your proposals ignore some rather important details about the existing Functor definition because you didn't work through the details of writing out what would happen with the class's member function.
class MultiFunctor a e where
mfmap :: (e -> ??) -> a -> ????
instance MultiFunctor (Maybe a) a where
mfmap = ???????
An important property of fmap at the moment is that its first argument can change types. fmap show :: (Functor f, Show a) => f a -> f String. You can't just throw that away, or you lose most of the value of fmap. So really, MultiFunctor would need to look more like...
class MultiFunctor s t a b | s -> a, t -> b, s b -> t, t a -> s where
mfmap :: (a -> b) -> s -> t
instance (a ~ c, b ~ d) => MultiFunctor (Maybe a) (Maybe b) c d where
mfmap _ Nothing = Nothing
mfmap f (Just a) = Just (f a)
Note just how incredibly complicated this has become to try to make inference at least close to possible. All the functional dependencies are in place to allow instance selection without annotating types all over the place. (I may have missed a couple possible functional dependencies in there!) The instance itself grew some crazy type equality constraints to allow instance selection to be more reliable. And the worst part is - this still has worse properties for reasoning than fmap does.
Supposing my previous instance didn't exist, I could write an instance like this:
instance MultiFunctor (Maybe Int) (Maybe Int) Int Int where
mfmap _ Nothing = Nothing
mfmap f (Just a) = Just (if f a == a then a else f a * 2)
This is broken, of course - but it's broken in a new way that wasn't even possible before. A really important part of the definition of Functor is that the types a and b in fmap don't appear anywhere in the instance definition. Just looking at the class is enough to tell the programmer that the behavior of fmap cannot depend on the types a and b. You get that guarantee for free. You don't need to trust that instances were written correctly.
Because fmap gives you that guarantee for free, you don't even need to check both Functor laws when defining an instance. It's sufficient to check the law fmap id x == x. The second law comes along for free when the first law is proven. But with that broken mfmap I just provided, mfmap id x == x is true, even though the second law is not.
As the implementer of mfmap, you have more work to do to prove your implementation is correct. As a user of it, you have to put more trust in the implementation's correctness, since the type system can't guarantee as much.
If you work out more complete examples for the other systems, you find that they have just as many issues if you want to support the full functionality of fmap. And this is why they aren't really used. They add a lot of complexity for only a small gain in utility.
Well, for one thing the traditional functor class is just much simpler. That alone is a valid reason to prefer it, even though this is Haskell and not Python. And it also represents the mathematical idea better of what a functor is supposed to be: a mapping from objects to objects (f :: *->*), with extra property (->Constraint) that each (forall (a::*) (b::*)) morphism (a->b) is lifted to a morphism on the corresponding object mapped to (-> f a->f b). None of that can be seen very clearly in the * -> * -> Constraint version of the class, or its TypeFamilies equivalent.
On a more practical account, yes, there are also things you can only do with the (*->*)->Constraint version.
In particular, what this constraint guarantees you right away is that all Haskell types are valid objects you can put into the functor, whereas for MultiFunctor you need to check every possible contained type, one by one. Sometimes that's just not possible (or is it?), like when you're mapping over infinitely many types:
data Tough f a = Doable (f a)
| Tough (f (Tough f (a, a)))
instance (Applicative f) = Semigroup (Tough f a) where
Doable x <> Doable y = Tough . Doable $ (,)<$>x<*>y
Tough xs <> Tough ys = Tough $ xs <> ys
-- The following actually violates the semigroup associativity law. Hardly matters here I suppose...
xs <> Doable y = xs <> Tough (Doable $ fmap twice y)
Doable x <> ys = Tough (Doable $ fmap twice x) <> ys
twice x = (x,x)
Note that this uses the Applicative instance of f not just on the a type, but also on arbitrary tuples thereof. I can't see how you could express that with a MultiParamTypeClasses- or TypeFamilies-based applicative class. (It might be possible if you make Tough a suitable GADT, but without that... probably not.)
BTW, this example is perhaps not as useless as it may look – it basically expresses read-only vectors of length 2n in a monadic state.
The expanded variant is indeed more flexible. It was used e.g. by Oleg Kiselyov to define restricted monads. Roughly, you can have
class MN2 m a where
ret2 :: a -> m a
class (MN2 m a, MN2 m b) => MN3 m a b where
bind2 :: m a -> (a -> m b) -> m b
allowing monad instances to be parametrized over a and b. This is useful because you can restrict those types to members of some other class:
import Data.Set as Set
instance MN2 Set.Set a where
-- does not require Ord
return = Set.singleton
instance Prelude.Ord b => MN3 SMPlus a b where
-- Set.union requires Ord
m >>= f = Set.fold (Set.union . f) Set.empty m
Note than because of that Ord constraint, we are unable to define Monad Set.Set using unrestricted monads. Indeed, the monad class requires the monad to be usable at all types.
Also see: parameterized (indexed) monad.

Does exporting type constructors make a difference?

Let's say I have an internal data type, T a, that is used in the signature of exported functions:
module A (f, g) where
newtype T a = MkT { unT :: (Int, a) }
deriving (Functor, Show, Read) -- for internal use
f :: a -> IO (T a)
f a = fmap (\i -> T (i, a)) randomIO
g :: T a -> a
g = snd . unT
What is the effect of not exporting the type constructor T? Does it prevent consumers from meddling with values of type T a? In other words, is there a difference between the export list (f, g) and (f, g, T()) here?
Prevented
The first thing a consumer will see is that the type doesn't appear in Haddock documentation. In the documentation for f and g, the type Twill not be hyperlinked like an exported type. This may prevent a casual reader from discovering T's class instances.
More importantly, a consumer cannot doing anything with T at the type level. Anything that requires writing a type will be impossible. For instance, a consumer cannot write new class instances involving T, or include T in a type family. (I don't think there's a way around this...)
At the value level, however, the main limitation is that a consumer cannot write a type annotation including T:
> :t (f . read) :: Read b => String -> IO (A.T b)
<interactive>:1:39: Not in scope: type constructor or class `A.T'
Not prevented
The restriction on type signatures is not as significant a limitation as it appears. The compiler can still infer such a type:
> :t f . read
f . read :: Read b => String -> IO (A.T b)
Any value expression within the inferrable subset of Haskell may therefore be expressed regardless of the availability of the type constructor T. If, like me, you're addicted to ScopedTypeVariables and extensive annotations, you may be a little surprised by the definition of unT' below.
Furthermore, because typeclass instances have global scope, a consumer can use any available class functions without additional limitation. Depending on the classes involved, this may allow significant manipulation of values of the unexposed type. With classes like Functor, a consumer can also freely manipulate type parameters, because there's an available function of type T a -> T b.
In the example of T, deriving Show of course exposes the "internal" Int, and gives a consumer enough information to hackishly implement unT:
-- :: (Show a, Read a) => T a -> (Int, a)
unT' = (read . strip . show') `asTypeOf` (mkPair . g)
where
strip = reverse . drop 1 . reverse . drop 9
-- :: T a -> String
show' = show `asTypeOf` (mkString . g)
mkPair :: t -> (Int, t)
mkPair = undefined
mkString :: t -> String
mkString = undefined
> :t unT'
unT' :: (Show b, Read b) => A.T b -> (Int, b)
> x <- f "x"
> unT' x
(-29353, "x")
Implementing mkT' with the Read instance is left as an exercise.
Deriving something like Generic will completely explode any idea of containment, but you'd probably expect that.
Prevented?
In the corners of Haskell where type signatures are necessary or where asTypeOf-style tricks don't work, I guess not exporting the type constructor could actually prevent a consumer from doing something they could with the export list (f, g, T()).
Recommendation
Export all type constructors that are used in the type of any value you export. Here, go ahead and include T() in your export list. Leaving it out doesn't accomplish anything other than muddying the documentation. If you want to expose an purely abstract immutable type, use a newtype with a hidden constructor and no class instances.

Resources