Strange program that requires incoherent instances yet always seems to pick the "right" one? - haskell

Consider the following program, which only compiles with incoherent instances enabled:
{-# LANGUAGE TypeFamilies, MultiParamTypeClasses, FlexibleInstances #-}
{-# LANGUAGE IncoherentInstances #-}
main = do
print (g (undefined :: Int))
print (g (undefined :: Bool))
print (g (undefined :: Char))
data True
class CA t where
type A t; type A t = True
fa :: t -> String
instance CA Int where
fa _ = "Int"
class CB t where
type B t; type B t = True
fb :: t -> String
instance CB Bool where
fb _ = "Bool"
class CC t where
type C t; type C t = True
fc :: t -> String
instance CC Char where
fc _ = "Char"
class CAll t t1 t2 t3 where
g :: (t1 ~ A t, t2 ~ B t, t3 ~ C t) => t -> String
instance (CA t) => CAll t True t2 t3 where g = fa
instance (CB t) => CAll t t1 True t3 where g = fb
instance (CC t) => CAll t t1 t2 True where g = fc
When compiling without incoherent instances it claims multiple instances match. This would seem to imply that when incoherent instances are allowed, the instances would be chosen arbitrarily.
Unless I'm exceedingly lucky, this would result in compile errors as the instance constraints in most cases would not be satisfied.
But with incoherent instances I get no compile errors, and indeed get the following output with the "correct" instances chosen:
"Int"
"Bool"
"Char"
So I can only conclude either one of a few things here:
GHC is backtracking on instance context failures (something it says it doesn't do in it's own documentation)
GHC actually knows there's only one instance that matches, but isn't brave enough to use it unless one turns on incoherent instances
I've just got exceedingly lucky (1 in 33 = 27 chance)
Something else is happening.
I suspect the answer is 4 (maybe combined with 2). I'd like any answer to explain what's going on here, and how much I can rely on this behavior with incoherent instances? If it's reliable it seems I could use this behaviour to make quite complex class hierarchies, that actually behave somewhat like subtyping, e.g. I could say all types of class A and class B are in class C, and an instance writer could make an instance for A without having to explicitly make an instance for C.
Edit:
I suspect the answer has something to do with this in the GHC docs:
Find all instances I that match the target constraint; that is, the target constraint is a substitution instance of I. These instance declarations are the candidates.
Eliminate any candidate IX for which both of the following hold:
There is another candidate IY that is strictly more specific; that is, IY is a substitution instance of IX but not vice versa.
Either IX is overlappable, or IY is overlapping. (This "either/or" design, rather than a "both/and" design, allow a client to deliberately override an instance from a library, without requiring a change to the library.)
If exactly one non-incoherent candidate remains, select it. If all remaining candidates are incoherent, select an arbitary one. Otherwise the search fails (i.e. when more than one surviving candidate is not incoherent).
If the selected candidate (from the previous step) is incoherent, the search succeeds, returning that candidate.
If not, find all instances that unify with the target constraint, but do not match it. Such non-candidate instances might match when the target constraint is further instantiated. If all of them are incoherent, the search succeeds, returning the selected candidate; if not, the search fails.
Correct me if I'm wrong, but whether incoherent instances is selected or not, in the first step, there's only one instance that "matches". In the incoherent case, we fall out with that "matched" case at step 4. But in the non-incoherent case, we then go to step 4, and even though only one instance "matches", we find other instances "unify". So we must reject.
Is this understanding correct?
And if so, could someone explain exactly what "match" and "unify" mean, and the difference between them?

Related

Overlapping multi-parameter instances and instance specificity

{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE FlexibleContexts #-}
module OverlappingSpecificsError where
class EqM a b where
(===) :: a -> b -> Bool
instance {-# OVERLAPPABLE #-} Eq a => EqM a a where
a === b = a == b
instance {-# OVERLAPPABLE #-} EqM a b where
a === b = False
aretheyreallyeq :: (Eq a, Eq b) => Either a b -> Either a b -> Bool
aretheyreallyeq (Left a1) (Right b2) = a1 == b2
aretheyeq :: (Eq a, Eq b) => Either a b -> Either a b -> Bool
aretheyeq (Left a1) (Right b2) = a1 === b2
Neither aretheyreallyeq or aretheyeq compile, but the error for aretheyreallyeq makes sense to me, and also tells me that aretheyeq should not give an error: One of the instances that GHCi suggests are possible for EqM in aretheyeq should be impossible due to the same error on aretheyreallyeq. What's going on?
The point is, GHCi insists that both of the instances of EqM are applicable in aretheyeq. But a1 is of type a and b2 is of type b, so in order for the first instance to be applicable, it would have to have the types a and b unify.
But this should not be possible, since they are declared as type variables at the function signature (that is, using the first EqM instance would give rise to the function being of type Either a a -> Either a a -> Bool, and the error in aretheyreallyeq tells me that GHCi will not allow that (which is what I expected anyway).
Am I missing something, or is this a bug in how overlapping instances with multi-parameter type classes are checked?
I am thinking maybe it has to do with the fact that a and b could be further instantiated later on to the point where they are equal, outside aretheyeq, and then the first instance would be valid? But the same is true for aretheyreallyeq. The only difference is that if they do not ever unify we have an option for aretheyeq, but we do not for aretheyreallyeq. In any case, Haskell does not have dynamic dispatch for plenty of good and obvious reasons, so what is the fear in committing to the instance that will always work regardless of whether later on a and b are unifiable? Maybe there is some way to present this that would make choosing the instance when calling the function possible in some way?
It is worth noting that if I remove the second instance, then the function obviously still does not compile, stating that no instance EqM a b can be found. So if I do not have that instance, then none works, but when that one works, suddenly the other does too and I have an overlap? Smells like bug to me miles away.
Instance matching on generic variables works this way in order to prevent some potentially confusing (and dangerous) scenarios.
If the compiler gave in to your intuition and chose the EqM a b instance when compiling aretheyeq (because a and b do not necessarily unify, as you're saying), then the following call:
x = aretheyeq (Left 'z') (Right 'z')
would return False, contrary to intuition.
Q: wait a second! But in this case, a ~ Char and b ~ Char, and we also have Eq a and Eq b, which means Eq Char, which should make it possible to choose the EqM a a instance, shouldn't it?
Well, yes, I suppose this could be happening in theory, but Haskell just doesn't work this way. Class instances are merely extra parameters passed to functions (as method dictionaries), so in order for there to be an instance, it must either be unambiguously choosable within the function itself, or it must be passed in from the consumer.
The former (unambiguously choosable instance) necessarily requires that there is just one instance. And indeed, if you remove the EqM a a instance, your function compiles and always returns False.
The latter (passing an instance from the consumer) means a constraint on the function, like this:
aretheyeq :: EqM a b => Either a b -> Either a b -> Bool
What you are asking is that Haskell essentially has two different versions of this function: one requiring that a ~ b and picking the EqM a a instance, and the other not requiring that, and picking the EqM a b instance.
And then the compiler would cleverly pick the "right" version. So that if I call aretheyeq (Left 'z') (Right 'z'), the first version gets called, but if I call aretheyeq (Left 'z') (Right 42) - the second.
But now think further: if there are two versions of aretheyeq, and which one to pick depends on whether the types are equal, then consider this:
dummy :: a -> b -> Bool
dummy a b = aretheyeq (Left a) (Right b)
How does dummy know which version of aretheyeq to pick? So now there has to be two versions of dummy as well: one for when a ~ b and another for other cases.
And so on. The ripple effect continues until there are concrete types.
Q: wait a second! Why two versions? Can't there be just one version, which then decides what to do based on what arguments are passed in?
Ah, but it can't! This is because types are erased at compile time. By the time the function starts to run, it's already compiled, and there is no more type information. So everything has to be decided at compile time: which instance to pick, and the ripple effect from it.
It isn't a bug in sense of working exactly as documented. Starting with
Now suppose that, in some client module, we are searching for an instance of the target constraint (C ty1 .. tyn). The search works like this:
The first stage of finding candidate instances works as you expect; EqM a b is the only candidate and so the prime candidate. But the last step is
Now find all instances, or in-scope given constraints, that unify with the target constraint, but do not match it. Such non-candidate instances might match when the target constraint is further instantiated. If all of them are incoherent top-level instances, the search succeeds, returning the prime candidate. Otherwise the search fails.
The EqM a a instance falls into this category, and isn't incoherent, so the search fails. And you can achieve the behavior you want by marking it as {-# INCOHERENT #-} instead of overlappable.
To further complete Alexey's answer, which really gave me the hint at what I should do to achieve the behaviour I wanted, I compiled the following minimum working example of a slightly different situation more alike my real use case (which has to do with ExistentialQuantification):
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE ExistentialQuantification #-}
module ExistentialTest where
class Class1 a b where
foo :: a -> b
instance {-# INCOHERENT #-} Monoid a => Class1 a (Either b a) where
foo x = Right (x <> x)
instance {-# INCOHERENT #-} Monoid a => Class1 a (Either a b) where
foo x = Left x
data Bar a = Dir a | forall b. Class1 b a => FromB b
getA :: Bar a -> a
getA (Dir a) = a
getA (FromB b) = foo b
createBar :: Bar (Either String t)
createBar = FromB "abc"
createBar2 :: Bar (Either t String)
createBar2 = FromB "def"
If you remove the {-# INCOHERENT #-} annotations, you get exactly the same compile errors as in my original example on both createBar and createBar2, and the point is the same: in createBar it is clear that the only suitable instance is the second one, whereas in createBar2 the only suitable one is the first one, but Haskell refuses to compile because of this apparent confusion it might create when using it, until you annotate them with INCOHERENT.
And then, the code works exactly as you'd expect it to: getA createBar returns Left "abc" whereas getA createBar2 returns Right "defdef", which is exactly the only thing that could happen in a sensible type system.
So, my conclusion is: the INCOHERENT annotation is precisely to allow what I wanted to do since the beginning without Haskell complaining about potentially confusing instances and indeed taking the only one that makes sense. A doubt remains as to whether INCOHERENT may make it so that instances that indeed remain overlapping even after taking into account everything compile, using an arbitrary one (which is obviously bad and dangerous). So, a corollary to my conclusion is: only use INCOHERENT when you absolutely need to and are absolutely convinced that there is indeed only one valid instance.
I still think it is a bit absurd that Haskell has no more natural and safe way to tell the compiler to stop worrying about me being potentially being confused and doing what is obviously the only type checking answer to the problem...
aretheyreallyeq fails because there are two different type variables in scope. In
aretheyreallyeq :: (Eq a, Eq b) => Either a b -> Either a b -> Bool
aretheyreallyeq (Left a1) (Right b2) = a1 == b2
a1 :: a, and b2 :: b, there's no method for comparing values of potentially different types (as this is how they're declared), so this fails. This has nothing to do with any of the enabled extensions or pragmas of course.
aretheyeq fails because there are two instances that could match, not that they definitely do. I'm not sure what version of GHC you're using but here's the exception message I see and it seems to be fairly clear to me:
• Overlapping instances for EqM a b arising from a use of ‘===’
Matching instances:
instance [overlappable] EqM a b -- Defined at /home/tmp.hs:12:31
instance [overlappable] Eq a => EqM a a
-- Defined at /home/tmp.hs:9:31
(The choice depends on the instantiation of ‘a, b’
To pick the first instance above, use IncoherentInstances
when compiling the other instance declarations)
• In the expression: a1 === b2
In an equation for ‘aretheyeq’:
aretheyeq (Left a1) (Right b2) = a1 === b2
In this case, my interpretation is that it's saying that given certain choices for a and b, there are potentially multiple different matching instances.

Overlapping instances - how to circumvent them

I am a beginner at Haskell, so please be indulgent. For reasons that are not important here, I am trying to define a operator <^> that takes a function and an argument and returns the value of the function by the argument, irrespective of which of the function and the argument came first. In short, I would like to be able to write the following:
foo :: Int -> Int
foo x = x * x
arg :: Int
arg = 2
foo <^> arg -- valid, returns 4
arg <^> foo -- valid, returns 4
I have tried to accomplish that through type families, as follows:
{-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, TypeFamilies, TypeOperators #-}
class Combine t1 t2 where
type Output t1 t2 :: *
(<^>) :: t1 -> t2 -> Output t1 t2
instance Combine (a->b) a where
type Output (a->b) a = b
f <^> x = f x
instance Combine a (a->b) where
type Output a a->b = b
x <^> f = f x
On this code, GHC throws a Conflicting family instance declarations. My guess is that the overlap GHC complains about occurs when type a->b and type a are the same. I don't know Haskell well enough, but I suspect that with recursive type definitions, one may be able to construct such a situation. I have a couple of questions:
Since this is a rather remote scenario that will never occur in my application (in particular not with foo and arg above), I was wondering if there was a way of specifying a dummy default instance to use in case of overlap? I have tried the different OVERLAPS and OVERLAPPING flags, but they didn't have any effect.
If not, is there a better way of achieving what I want?
Thanks!
This is a bad idea, in my view, but I'll play along.
A possible solution is to switch to functional dependencies. Usually I tend to avoid fundeps in favor of type families, but here they make the instances compile in a simple way.
class Combine t1 t2 r | t1 t2 -> r where
(<^>) :: t1 -> t2 -> r
instance Combine (a->b) a b where
f <^> x = f x
instance Combine a (a->b) b where
x <^> f = f x
Note that this class will likely cause problems during type inference if we use polymorphic functions. This is because, with polymorphic functions, the code can easily become ambiguous.
For instance id <^> id could pick any of the two instances. Above, melpomene already reported const <^> id being ambiguous as well.
The following is weakly related, but I want to share it anyway:
What about type families instead? I tried to experiment a bit, and I just discovered a limitation which I did not know. Consider the closed type family
type family Output a b where
Output (a->b) a = b
Output a (a->b) = b
The code above compiles, but then the type Output a (a->b) is stuck. The second equation does not get applied, as if the first one could potentially match.
Usually, I can understand this in some other scenarios, but here unifying
Output (a' -> b') b' ~ Output a (a -> b)
seems to fail since we would need a ~ (a' -> b') ~ (a' -> a -> b) which is impossible, with finite types. For some reason, GHC does not use this argument (does it pretend infinite types exist in this check? why?)
Anyway, this makes replacing fundeps with type families harder than it could be, it seems. I have no idea about why GHC accepts the fundeps code I posted, yet refuses the OP's code which is essentially the same thing, except using type families.
#chi is close; an approach using either FunDeps or Closed Type Families is possible. But the Combine instances are potentially ambiguous/unifiable just as much as the CTF Output equations.
When chi says the FunDep code is accepted, that's only half-true: GHC plain leads you down the garden path. It will accept the instances but then you find you can't use them/you get weird error messages. See the Users Guide at "potential for overlap".
If you're looking to resolve a potentially ambiguous Combine constraint, you might get an error suggesting you try IncoherentInstances (or INCOHERENT pragma). Don't do that. You have a genuinely incoherent problem; all that will do is defer the problem to somewhere else. It's always possible to avoid Incoherent -- providing you can rejig your instances (as follows) and they're not locked away in libraries.
Notice that because of the potential ambiguity, another Haskell compiler (Hugs) doesn't let you write Combine like that. It has a more correct implementation of Haskell's (not-well-stated) rules.
The answer is to use a sort of overlap where one instance is strictly more specific. You must first decide which you way you want to prefer in case of ambiguity. I'll choose function prefixed to argument:
{-# LANGUAGE UndecidableInstances, TypeFamilies #-}
instance {-# OVERLAPPING #-} (r ~ b)
=> Combine (a->b) a r where ...
instance {-# OVERLAPPABLE #-} (Combine2 t1 t2 r)
=> Combine t1 t2 r where
(<^>) = revApp
class Combine2 t1 t2 r | t1 t2 -> r where
revApp :: t1 -> t2 -> r
instance (b ~ r) => Combine2 a (a->b) r where
revApp x f = f x
Notice that the OVERLAPPABLE instance for Combine has bare tyvars, it's a catch-all so it's always matchable. All the compiler has to do is decide whether some wanted constraint is of the form of the OVERLAPPING instance.
The Combine2 constraint on the OVERLAPPABLE instance is no smaller than the head, so you need UndecidableInstances. Also beware that deferring to Combine2 will mean that if the compiler still can't resolve, you're likely to get puzzling error messages.
Talking of bare tyvars/"always matchable", I've used an additional trick to make the compiler work really hard to improve the types: There's bare r in the head of the instance, with an Equality type improvement constraint (b ~ r) =>. To use the ~, you need to switch on TypeFamilies even though you're not writing any type families.
A CTF approach would be similar. You need a catch-all equation on Output that calls an auxiliary type function. Again you need UndecidableInstances.

Why does this code using UndecidableInstances compile, then generate a runtime infinite loop?

When writing some code using UndecidableInstances earlier, I ran into something that I found very odd. I managed to unintentionally create some code that typechecks when I believed it should not:
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE ScopedTypeVariables #-}
{-# LANGUAGE UndecidableInstances #-}
data Foo = Foo
class ConvertFoo a b where
convertFoo :: a -> b
instance (ConvertFoo a Foo, ConvertFoo Foo b) => ConvertFoo a b where
convertFoo = convertFoo . (convertFoo :: a -> Foo)
evil :: Int -> String
evil = convertFoo
Specifically, the convertFoo function typechecks when given any input to produce any output, as demonstrated by the evil function. At first, I thought that perhaps I managed to accidentally implement unsafeCoerce, but that is not quite true: actually attempting to call my convertFoo function (by doing something like evil 3, for example) simply goes into an infinite loop.
I sort of understand what’s going on in a vague sense. My understanding of the problem is something like this:
The instance of ConvertFoo that I have defined works on any input and output, a and b, only limited by the two additional constraints that conversion functions must exist from a -> Foo and Foo -> b.
Somehow, that definition is able to match any input and output types, so it would seem that the call to convertFoo :: a -> Foo is picking the very definition of convertFoo itself, since it’s the only one available, anyway.
Since convertFoo calls itself infinitely, the function goes into an infinite loop that never terminates.
Since convertFoo never terminates, the type of that definition is bottom, so technically none of the types are ever violated, and the program typechecks.
Now, even if the above understanding is correct, I am still confused about why the whole program typechecks. Specifically, I would expect the ConvertFoo a Foo and ConvertFoo Foo b constraints to fail, given that no such instances exist.
I do understand (at least fuzzily) that constraints don’t matter when picking an instance—the instance is picked based solely on the instance head, then constraints are checked—so I could see that those constraints might resolve just fine because of my ConvertFoo a b instance, which is about as permissive as it can possibly be. However, that would then require the same set of constraints to be resolved, which I think would put the typechecker into an infinite loop, causing GHC to either hang on compilation or give me a stack overflow error (the latter of which I’ve seen before).
Clearly, though, the typechecker does not loop, because it happily bottoms out and compiles my code happily. Why? How is the instance context satisfied in this particular example? Why doesn’t this give me a type error or produce a typechecking loop?
I wholeheartedly agree that this is a great question. It speaks to how
our intuitions about typeclasses differ from the reality.
Typeclass surprise
To see what is going on here, going to raise the stakes on the
type signature for evil:
data X
class Convert a b where
convert :: a -> b
instance (Convert a X, Convert X b) => Convert a b where
convert = convert . (convert :: a -> X)
evil :: a -> b
evil = convert
Clearly the Covert a b instance is being chosen as there is only
one instance of this class. The typechecker is thinking something like
this:
Convert a X is true if...
Convert a X is true [true by assumption]
and Convert X X is true
Convert X X is true if...
Convert X X is true [true by assumption]
Convert X X is true [true by assumption]
Convert X b is true if...
Convert X X is true [true from above]
and Convert X b is true [true by assumption]
The typechecker has surprised us. We do not expect Convert X X to be
true as we have not defined anything like it. But (Convert X X, Convert X X) => Convert X X is a kind of tautology: it is
automatically true and it is true no matter what methods are defined in the class.
This might not match our mental model of typeclasses. We expect the
compiler to gawk at this point and complain about how Convert X X
cannot be true because we have defined no instance for it. We expect
the compiler to stand at the Convert X X, to look for another spot
to walk to where Convert X X is true, and to give up because there
is no other spot where that is true. But the compiler is able to
recurse! Recurse, loop, and be Turing-complete.
We blessed the typechecker with this capability, and we did it with
UndecidableInstances. When the documentation states that it is
possible to send the compiler into a loop it is easy to assume the
worst and we assumed that the bad loops are always infinite loops. But
here we have demonstrated a loop even deadlier, a loop that
terminates – except in a surprising way.
(This is demonstrated even more starkly in Daniel's comment:
class Loop a where
loop :: a
instance Loop a => Loop a where
loop = loop
.)
The perils of undecidability
This is the exact sort of situation that UndecidableInstances
allows. If we turn that extension off and turn FlexibleContexts on
(a harmless extension that is just syntactic in nature), we get a
warning about a violation of one of the Paterson
conditions:
...
Constraint is no smaller than the instance head
in the constraint: Convert a X
(Use UndecidableInstances to permit this)
In the instance declaration for ‘Convert a b’
...
Constraint is no smaller than the instance head
in the constraint: Convert X b
(Use UndecidableInstances to permit this)
In the instance declaration for ‘Convert a b’
"No smaller than instance head," although we can mentally rewrite it
as "it is possible this instance will be used to prove an assertion of
itself and cause you much anguish and gnashing and typing." The
Paterson conditions together prevent looping in instance resolution.
Our violation here demonstrates why they are necessary, and we can
presumably consult some paper to see why they are sufficient.
Bottoming out
As for why the program at runtime infinite loops: There is the boring
answer, where evil :: a -> b cannot but infinite loop or throw an
exception or generally bottom out because we trust the Haskell
typechecker and there is no value that can inhabit a -> b except
bottom.
A more interesting answer is that, since Convert X X is
tautologically true, its instance definition is this infinite loop
convertXX :: X -> X
convertXX = convertXX . convertXX
We can similarly expand out the Convert A B instance definition.
convertAB :: A -> B
convertAB =
convertXB . convertAX
where
convertAX = convertXX . convertAX
convertXX = convertXX . convertXX
convertXB = convertXB . convertXX
This surprising behavior, and how constrained instance resolution (by
default without extensions) is meant to be as to avoid these
behaviors, perhaps can be taken as a good reason for why Haskell's
typeclass system has yet to pick up wide adoption. Despite its
impressive popularity and power, there are odd corners to it (whether
it is in documentation or error messages or syntax or maybe even in
its underlying logic) that seem particularly ill fit to how we humans
think about type-level abstractions.
Here's how I mentally process these cases:
class ConvertFoo a b where convertFoo :: a -> b
instance (ConvertFoo a Foo, ConvertFoo Foo b) => ConvertFoo a b where
convertFoo = ...
evil :: Int -> String
evil = convertFoo
First, we start by computing the set of required instances.
evil directly requires ConvertFoo Int String (1).
Then, (1) requires ConvertFoo Int Foo (2) and ConvertFoo Foo String (3).
Then, (2) requires ConvertFoo Int Foo (we already counted this) and ConvertFoo Foo Foo (4).
Then (3) requires ConvertFoo Foo Foo (counted) and ConvertFoo Foo String (counted).
Then (4) requires ConvertFoo Foo Foo (counted) and ConvertFoo Foo Foo (counted).
Hence, we reach a fixed point, which is a finite set of required instances. The compiler has no trouble with computing that set in finite time: just apply the instance definitions until no more constraint is needed.
Then, we proceed to provide the code for those instances. Here it is.
convertFoo_1 :: Int -> String
convertFoo_1 = convertFoo_3 . convertFoo_2
convertFoo_2 :: Int -> Foo
convertFoo_2 = convertFoo_4 . convertFoo_2
convertFoo_3 :: Foo -> String
convertFoo_3 = convertFoo_3 . convertFoo_4
convertFoo_4 :: Foo -> Foo
convertFoo_4 = convertFoo_4 . convertFoo_4
We get a bunch of mutually recursive instance definitions. These, in this case, will loop at runtime, but there's no reason to reject them at compile time.

Constraint Inference from Instances

Consider the following:
{-# LANGUAGE FlexibleContexts #-}
module Foo where
data D a = D a
class Foo b
instance (Num a) => Foo (D a)
f :: (Foo (D a)) => a -> a
f x = x+1
GHC complains that it cannot deduce Num a in f. I would like this constraint to be inferred from the (non-overlapping) instance of Foo for D a.
I know I could use a GADT for D and add the constraint Num a there, but I'm hoping to not have to pollute the constructor for D with lots of unnecessary constraints. Is there any hope of this ever happening, and is it possible now?
I am guessing this would break for overlapping instances, and therefore is not inferred in general. That is, you could have
{-# LANGUAGE OverlappingInstances #-}
...
instance (Num a) => Foo (D a)
instance Foo (D Bool)
and then your desired inference would certainly not be sound.
EDIT: Looking more closely at the documentation, it is possible to have
{-# LANGUAGE FlexibleContexts #-}
module Foo where
data D a = D a
class Foo b
instance (Num a) => Foo (D a)
f :: (Foo (D a)) => a -> a
f x = x+1
and then in a separate file:
{-# LANGUAGE OverlappingInstances #-}
module Bar where
import Foo
instance Foo Bool
test = f True
That is, the documentation implies only one of the modules defining the two instances needs to have the OverlappingInstances flag, so if Foo.f were definable as this, you could make another module Bar break type safety completely. Note that with GHC's separate compilation, f would be compiled completely without knowledge of the module Bar.
The arrow => is directional. It means that if Num a holds then Foo (D a). It does not mean that if Foo (D a) holds then Num a holds.
The knowledge that there are (and will never be) any overlapping instances for Foo (D a) should imply that the reverse implication is also true, but (a) GHC doesn't know this and (b) GHC's instance machinery is not set up to use this knowledge.
To actually compile functions that use type classes, it's not enough for GHC to merely prove that a type must be an instance of a class. It has to actually come up with a specific instance declaration that provides definitions of the member functions. We need a constructive proof, not just an existence proof.
To identify an instance of class C, it can either reuse one that will be chosen by the caller of the function being compiled, or it must know the types involved concretely enough to select a single instance from those available. The function being compiled will only be passed an instance for C if it has a constraint for C; otherwise the function must be sufficiently monomorphic that it can only use a single instance.
Considering your example specifically, we can see that f has a constraint for Foo (D a), so we can rely on the caller providing that for us. But the caller isn't going to give us an instance for Num a. Even if you presume that we know from the Num a constraint on Foo (D a) that there must be such an instance out there somewhere, we have no idea what a is, so which definition of + should we invoke? We can't even call another function that works for any Num a but is defined outside the class, because they will all have the Num a constraint and thus expect us to identify an instance for them. Knowing that there is an instance without having having the instance is just not useful.
It isn't at all obvious, but what you're actually asking GHC to do is to do a runtime switch on the type a that arrives at runtime. This is impossible, because we're supposed to be emitting code that works for any type in Num, even types that don't exist yet, or whose instances don't exist yet.
A similar idea that does work is when you have a constraint on the class rather than on the instance. For example:
class Num a => Foo a
f :: Foo a => a -> a
f x = x + 1
But this only works because we know that all Foo instances must have a corresponding Num instance, and thus all callers of a function polymorphic in Foo a know to also select a Num instance. So even without knowing the particular a in order to select a Num instance, f knows that its caller will also provide a Num instance along with the Foo instance.
In your case, the class Foo knows nothing about Num. In other examples Num might not even be defined in code accessible to the module where the class Foo is defined. It's the class that sets the required information that has to be provided to call a function that is polymorphic in the type class, and those polymorphic functions have to be able to work without any knowledge specific to a certain instance.
So the Num a => Foo (D a) instance can't store the Num instance - indeed, the instance definition is also polymorphic in a, so it's not able to select a particular instance to store even if there was space! So even though f might be able to know that there is a Num a instance from Foo (D a) (if we presume certain knowledge that no overlapping could ever be involved), it still needs a Num a constraint in order to require its callers to select a Num instance for it to use.

Associated Parameter Restriction using Functional Dependency

The function f below, for a given type 'a', takes a parameter of type 'c'. For different types 'a', 'c' is restricted in different ways. Concretely, when 'a' is any Integral type, 'c' should be allowed to be any 'Real' type. When 'a' is Float, 'c' can ONLY be Float.
One attempt is:
{-# LANGUAGE
MultiParamTypeClasses,
FlexibleInstances,
FunctionalDependencies,
UndecidableInstances #-}
class AllowedParamType a c | a -> c
class Foo a where
f :: (AllowedParamType a c) => c -> a
fIntegral :: (Integral a, Real c) => c -> a
fIntegral = error "implementation elided"
instance (Integral i, AllowedParamType i d, Real d) => Foo i where
f = fIntegral
For some reason, GHC 7.4.1 complains that it "could not deduce (Real c) arising from a use of fIntegral". It seems to me that the functional dependency should allow this deduction. In the instance, a is unified with i, so by the functional dependency, d should be unified with c, which in the instance is declared to be 'Real'. What am I missing here?
Functional dependencies aside, will this approach be expressive enough to enforce the restrictions above, or is there a better way? We are only working with a few different values for 'a', so there will be instances like:
instance (Integral i, Real c) => AllowedParamType i c
instance AllowedParamType Float Float
Thanks
A possibly better way, is to use constraint kinds and type families (GHC extensions, requires GHC 7.4, I think). This allows you to specify the constraint as part of the class instance.
{-# LANGUAGE ConstraintKinds, TypeFamilies, FlexibleInstances, UndecidableInstances #-}
import GHC.Exts (Constraint)
class Foo a where
type ParamConstraint a b :: Constraint
f :: ParamConstraint a b => b -> a
instance Integral i => Foo i where
type ParamConstraint i b = Real b
f = fIntegral
EDIT: Upon further experimentation, there are some subtleties that mean that this doesn't work as expected, specifically, type ParamConstraint i b = Real b is too general. I don't know a solution (or if one exists) right now.
OK, this one's been nagging at me. given the wide variety of instances,
let's go the whole hog and get rid of any relationship between the
source and target type other than the presence of an instance:
{-# LANGUAGE OverlappingInstances, FlexibleInstances,TypeSynonymInstances,MultiParamTypeClasses #-}
class Foo a b where f :: a -> b
Now we can match up pairs of types with an f between them however we like, for example:
instance Foo Int Int where f = (+1)
instance Foo Int Integer where f = toInteger.((7::Int) -)
instance Foo Integer Int where f = fromInteger.(^ (2::Integer))
instance Foo Integer Integer where f = (*100)
instance Foo Char Char where f = id
instance Foo Char String where f = (:[]) -- requires TypeSynonymInstances
instance (Foo a b,Functor f) => Foo (f a) (f b) where f = fmap f -- requires FlexibleInstances
instance Foo Float Int where f = round
instance Foo Integer Char where f n = head $ show n
This does mean a lot of explicit type annotation to avoid No instance for... and Ambiguous type error messages.
For example, you can't do main = print (f 6), but you can do main = print (f (6::Int)::Int)
You could list all of the instances with the standard types that you want,
which could lead to an awful lot of repetition, our you could light the blue touchpaper and do:
instance Integral i => Foo Double i where f = round -- requires FlexibleInstances
instance Real r => Foo Integer r where f = fromInteger -- requires FlexibleInstances
Beware: this does not mean "Hey, if you've got an integral type i,
you can have an instance Foo Double i for free using this handy round function",
it means: "every time you have any type i, it's definitely an instance
Foo Double i. By the way, I'm using round for this, so unless your type i is Integral,
we're going to fall out." That's a big issue for the Foo Integer Char instance, for example.
This can easily break your other instances, so if you now type f (5::Integer) :: Integer you get
Overlapping instances for Foo Integer Integer
arising from a use of `f'
Matching instances:
instance Foo Integer Integer
instance Real r => Foo Integer r
You can change your pragmas to include OverlappingInstances:
{-# LANGUAGE OverlappingInstances, FlexibleInstances,TypeSynonymInstances,MultiParamTypeClasses #-}
So now f (5::Integer) :: Integer returns 500, so clearly it's using the more specific Foo Integer Integer instance.
I think this sort of approach might work for you, defining many instances by hand, carefully considering when to go completely wild
making instances out of standard type classes. (Alternatively, there aren't all that many standard types, and as we all know, notMany choose 2 = notIntractablyMany, so you could just list them all.)
Here's a suggestion to solve a more general problem, not yours specifically (I need more detail yet first - I promise to check later). I'm writing it in case other people are searching for a solution to a similar problem to you, I certainly was in the past, before I discovered SO. SO is especially great when it helps you try a radically new approach.
I used to have the work habit:
Introduce a multi-parameter type class (Types hanging out all over the place, so...)
Introduce functional dependencies (Should tidy it up but then I end up needing...)
Add FlexibleInstances (Alarm bells start ringing. There's a reason the compiler has this off by default...)
Add UndecidableInstances (GHC is telling you you're on your own, because it's not convinced it's up to the challenge you're setting it.)
Everything blows up. Refactor somehow.
Then I discovered the joys of type families (functional programming for types (hooray) - multi-parameter type classes are (a bit like) logic programming for types). My workflow changed to:
Introduce a type class including an associated type, i.e. replace
class MyProblematicClass a b | a -> b where
thing :: a -> b
thang :: b -> a -> b
with
class MyJustWorksClass a where
type Thing a :: * -- Thing a is a type (*), not a type constructor (* -> *)
thing :: a -> Thing a
thang :: Thing a -> a -> Thing a
Nervously add FlexibleInstances. Nothing goes wrong at all.
Sometimes fix things by using constraints like (MyJustWorksClass j,j~a)=> instead of (MyJustWorksClass a)=> or (Show t,t ~ Thing a,...)=> instead of (Show (Thing a),...) => to help ghc out. (~ essentially means 'is the same type as')
Nervously add FlexibleContexts. Nothing goes wrong at all.
Everything works.
The reason "Nothing goes wrong at all" is that ghc calculates the type Thing a using my type function Thang rather than trying to deduce it using a merely a bunch of assertions that there's a function there and it ought to be able to work it out.
Give it a go! Read Fun with Type Functions before reading the manual!

Resources