Which dictionary does GHC choose when more than one is in scope? - haskell

Consider the following example:
import Data.Constraint

class Bar a where
  bar :: a -> a

foo :: (Bar a) => Dict (Bar a) -> a -> a
foo Dict = bar
GHC has two choices for the dictionary to use when selecting a Bar instance in foo: it could use the dictionary from the Bar a constraint on foo, or it could use the runtime Dict to get a dictionary. See this question for an example where the dictionaries correspond to different instances.
Which dictionary does GHC use, and why is it the "correct" choice?

It just picks one. This isn't the correct choice; it's a pretty well-known wart. You can cause crashes this way, so it's a pretty bad state of affairs. Here is a short example using nothing but GADTs that demonstrates that it is possible to have two different instances in scope at once:
-- file Class.hs
{-# LANGUAGE GADTs #-}
module Class where

data Dict a where
  Dict :: C a => Dict a

class C a where
  test :: a -> Bool

-- file A.hs
module A where
import Class

instance C Int where
  test _ = True

v :: Dict Int
v = Dict

-- file B.hs
module B where
import Class

instance C Int where
  test _ = False

f :: Dict Int -> Bool
f Dict = test (0 :: Int)

-- file Main.hs
import A
import B

main = print (f v)
You will find that Main.hs compiles just fine, and even runs. It prints True on my machine with GHC 7.10.1, but that's not a stable outcome. Turning this into a crash is left to the reader.

GHC just picks one, and this is the correct choice. Any two dictionaries for the same constraint are supposed to be equal.
OverlappingInstances and IncoherentInstances are basically equivalent in destructive power: both give up instance coherence by design (the property that any two equal constraints in your program are satisfied by the same dictionary). OverlappingInstances gives you a little more ability to work out, case by case, which instance will be used, but that isn't worth much once you start passing around Dicts as first-class values. I would only consider using OverlappingInstances when the overlapping instances are extensionally equivalent (e.g., a more efficient but otherwise identical implementation for a specific type like Int); but even then, if I care enough about performance to write that specialized implementation, isn't it a performance bug whenever it fails to be used where it could have been?
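To make the "extensionally equivalent" case concrete, here is a made-up sketch (Render is not from the question; the specialised instance is meant to be observably identical to the generic one, just hypothetically faster):
{-# LANGUAGE FlexibleInstances, UndecidableInstances #-}

class Render a where
  render :: a -> String

instance Show a => Render a where
  render = show        -- generic implementation

instance {-# OVERLAPPING #-} Render Int where
  render = show        -- specialised: imagine a faster but observably equal version
At a polymorphic call site GHC commits to the generic instance, which is exactly the situation where the "faster" Render Int never gets used.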
In short, if you use OverlappingInstances, you give up the right to ask the question of which dictionary will be selected here.
Now it's true that you can break instance coherence without OverlappingInstances. In fact you can do it without orphans and without any extensions other than FlexibleInstances (arguably the problem is that the definition of "orphan" is wrong when FlexibleInstances is enabled). This is a very long-standing GHC bug, which hasn't been fixed in part because (a) it actually can't cause crashes directly as far as anybody seems to know, and (b) there might be a lot of programs that actually rely on having multiple instances for the same constraint in separate parts of the program, and that might be hard to avoid.
Getting back to the main topic: in principle it matters that GHC can select any dictionary it has available to satisfy a constraint, because even though they are all supposed to be equal, GHC may have more static information about some of them than others. Your example is a bit too simple to be illustrative, but imagine you passed an argument to bar. In general GHC knows nothing about the dictionary unpacked from Dict, so it has to treat the call as a call to an unknown function. But if you called foo at a specific type T for which a Bar T instance was in scope, then GHC would know that the bar from the Bar a constraint dictionary was T's bar, could generate a call to a known function, and could potentially inline T's bar and do more optimizations as a result.
In practice, GHC is currently not this smart: it just uses the innermost dictionary available. It would probably already be better to always use the outermost dictionary. But cases like this, where multiple dictionaries are available, are not very common, so we don't have good benchmarks to test on.
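For example (a minimal sketch building on foo from the question; the type T and useFoo are made-up names), when foo is instantiated at a concrete type, the dictionary for the Bar a constraint is known statically:
data T = T

instance Bar T where
  bar = id

useFoo :: T -> T
useFoo = foo Dict
-- Both the Bar T constraint on foo and the Dict argument are built here from
-- the same statically known instance, so GHC could in principle turn the call
-- to bar into a known call and inline T's bar; a dictionary unpacked from an
-- arbitrary Dict value is opaque and forces an unknown call.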

Here's a test:
{-# LANGUAGE FlexibleInstances, OverlappingInstances, IncoherentInstances #-}
import Data.Constraint

class C a where foo :: a -> String

instance C [a]  where foo _ = "[a]"
instance C [()] where foo _ = "[()]"

aDict :: Dict (C [a])
aDict = Dict

bDict :: Dict (C [()])
bDict = Dict

bar1 :: String
bar1 = case (bDict, aDict :: Dict (C [()])) of
  (Dict, Dict) -> foo [()] -- output: "[a]"

bar2 :: String
bar2 = case (aDict :: Dict (C [()]), bDict) of
  (Dict, Dict) -> foo [()] -- output: "[()]"
GHC above happens to use the "last" dictionary which was brought into scope. I wouldn't rely on this, though.
If you limit yourself to overlapping instances only, then you wouldn't be able to bring into scope two different dictionaries for the same type (as far as I can see), and everything should be fine, since the choice of dictionary becomes immaterial.
However, incoherent instances are another beast, since they allow you to commit to a generic instance and then use it at a type which has a more specific instance. This makes it very hard to understand which instance will be used.
In short, incoherent instances are evil.
Update: I ran some further tests. Using only overlapping instances and an orphan instance in a separate module you can still obtain two different dictionaries for the same type. So, we need even more caveats. :-(
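Roughly, a setup like the following (a sketch with made-up module names, using per-instance pragmas rather than the OverlappingInstances flag) yields two Dicts for C [Int] built from different instances:
-- file CLib.hs
module CLib where
import Data.Constraint
class C a where foo :: a -> String
instance C [a] where foo _ = "[a]"

-- file M1.hs: only the generic instance is visible here
module M1 where
import Data.Constraint
import CLib
d1 :: Dict (C [Int])
d1 = Dict                 -- captures the C [a] dictionary

-- file M2.hs: adds an overlapping orphan instance for the same type
{-# LANGUAGE FlexibleInstances #-}
module M2 where
import Data.Constraint
import CLib
instance {-# OVERLAPPING #-} C [Int] where foo _ = "[Int]"
d2 :: Dict (C [Int])
d2 = Dict                 -- captures the C [Int] dictionary
Pattern-matching on d1 and d2 in a further module then gives two different foos for the same type, just as in the test above.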

Related

Haskell subclassing and instance overlap

Coming from the OOP world, I sometimes find myself trying to use the inheritance pattern in Haskell, with varying degrees of success. Here's a little puzzle I encountered with subclassing (using GHC 8.10.7).
{-# LANGUAGE FlexibleInstances #-}
{-# LANGUAGE MultiParamTypeClasses #-}
{-# LANGUAGE UndecidableInstances #-}

import Data.List (sort)

class Collection c a where
  -- gets list of elements in the collection
  elements :: c a -> [a]

class OrderedCollection c a where
  -- gets sorted list of elements in the collection
  orderedElements :: c a -> [a]

instance (Ord a, OrderedCollection c a) => Collection c a where
  -- "default" implementation
  elements = orderedElements

newtype SortedList a = SortedList [a]
  deriving Show

instance (Ord a) => OrderedCollection SortedList a where
  -- need to sort the elements in the list
  orderedElements (SortedList xs) = sort xs

instance Collection SortedList a where
  -- "optimized" implementation: no need to sort
  elements (SortedList xs) = xs

test :: (Ord a, Show a, OrderedCollection c a) => c a -> IO ()
test coll = do
  putStrLn $ "ordered elements: " ++ show (orderedElements coll)
  putStrLn $ "elements: " ++ show (elements coll)

myList :: SortedList Int
myList = SortedList [3, 2, 1]

main :: IO ()
main = do
  test myList
After including the necessary language extensions, this still gave me an error: Overlapping instances for Collection c a arising from a use of ‘elements’. It suggests using IncoherentInstances. Since this extension is now deprecated in favor of per-instance pragmas, I added an INCOHERENT pragma to the subclass instance:
instance {-# INCOHERENT #-} (Ord a, OrderedCollection c a) => Collection c a where
...
This successfully compiled. However, the result was not what I expected, as the output was:
ordered elements: [1,2,3]
elements: [1,2,3]
What I wanted was for the specialized implementation of Collection for SortedList to override the default (in an OO language, SortedList would inherit from OrderedCollection and then override the elements method). But here the type checker does not know to use SortedList's custom Collection implementation, because the type signature of test only imposes the constraint OrderedCollection c a.
Next, I tried adding the Collection constraint:
test :: (Ord a, Show a, Collection c a, OrderedCollection c a) => c a -> IO ()
This gave me the output I wanted:
ordered elements: [1,2,3]
elements: [3,2,1]
However, GHC also issued a warning about "fragile inner bindings" and suggested I add the MonoLocalBinds extension, which silences that warning. In any case, I'm not thrilled with having to include the Collection c a constraint (given it's implied by OrderedCollection c a), or having to use incoherent instances.
Interestingly, if I changed the INCOHERENT pragma to OVERLAPPABLE, it still compiled, and it also allowed me to remove MonoLocalBinds.
My question is, are there any alternative approaches to achieving the desired "inheritance" behavior here, without needing the redundant constraint in test?
When you write this:
instance ... => Collection c a where
You're declaring a Collection instance for all types ever. And it doesn't matter at all what's on the left of the fat arrow =>. Constraints do not participate in instance resolution. When the compiler tries to look up an instance for a particular type, it only looks at what's on the right of the fat arrow =>, and only after finding a matching instance does it check whether its constraints are satisfied. If they're not, the compiler won't go back and look for another instance. That's how instance resolution works, and there are good reasons for it.
So, to reiterate: Collection c a means that this is an instance for all types.
And therefore, any subsequent Collection instances you might declare would of course be overlapping.
Thankfully, in this particular case, there is a better way: you can declare default methods without creating a universal instance like that. To do that, declare the method right inside the class declaration. And yes, you can put constraints on it too (see docs):
{-# LANGUAGE DefaultSignatures #-}

class Collection c a where
  -- gets list of elements in the collection
  elements :: c a -> [a]

  default elements :: OrderedCollection c a => c a -> [a]
  elements = orderedElements
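With that default in the class, the SortedList instance from the question keeps only its specialised method, and any other ordered collection can take the default with an empty instance body. A sketch, assuming the same pragmas as the question's module plus DefaultSignatures; SortedSet is a made-up second collection:
instance Collection SortedList a where
  elements (SortedList xs) = xs          -- specialised: already sorted, skip the sort

newtype SortedSet a = SortedSet [a]

instance Ord a => OrderedCollection SortedSet a where
  orderedElements (SortedSet xs) = sort xs

instance Ord a => Collection SortedSet a
  -- empty body: elements = orderedElements via the class default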
But more generally, while type classes plus existential quantification is technically equivalent to OOP-style class hierarchies, if you try to actually model your domain like that, it would be more and more awkward and painful the further you go. It's a bit like trying to model ADTs in something like Java. Technically possible, but oh so messy!
There are some legitimate cases where a class hierarchy may make sense (one notable example is the GHC exception system), but most of the time there are much simpler ways.

How to declare instances of a typeclass (like Show) for all types in my own typeclass?

I have a typeclass:
class Wrapper w where
  open  :: w -> Map String Int
  close :: Map String Int -> w
It doesn't look very useful, but I use it to strongly (not just a type synonym) distinguish between semantically different varieties of Map String Ints:
newtype FlapMap = Flap (Map String Int)
newtype SnapMap = Snap (Map String Int)
...
and still have functions that operate on any type of the class.
Is there a better way to do this distinction (maybe without the Wrapper instances boilerplate)?
I want to do this:
instance (Wrapper wrapper) => Show wrapper where
  show w = show $ toList $ open w
instead of writing many boilerplate Show instances as well.
Via FlexibleInstances and UndecidableInstances, GHC leads me to a point where it thinks my instance declaration applies to everything, since it allegedly clashes with the other Show instances in my code and in GHC.Show. StackOverflow answerers and HaskellWiki convince me that OverlappingInstances is not quite safe and possibly confusing. GHC doesn't even suggest it.
Why does GHC first complain about not knowing which instance of, e.g., Show Int to pick (that is, why doesn't it look at the constraint I give at compile time?) and then, once told that instances may overlap, suddenly know what to do?
Can I avoid allowing OverlappingInstances with my newtypes?
You can't do this without OverlappingInstances, which, as you mention, is unpredictable. It won't help you here anyway, so you pretty much can't do this at all without a wrapper type.
That’s rather unsatisfying, of course, so why is this the case? As you’ve already determined, GHC does not look at the instance context when picking an instance, only the instance head. Why? Well, consider the following code:
class Foo a where
  fooToString :: a -> String

class Bar a where
  barToString :: a -> String

data Something = Something

instance Foo Something where
  fooToString _ = "foo something"

instance Bar Something where
  barToString _ = "bar something"

instance Foo a => Show a where
  show = fooToString

instance Bar a => Show a where
  show = barToString
If you consider the Foo or Bar typeclasses in isolation, the above definitions make sense. Anything that implements the Foo typeclass should get a Show instance “for free”. Unfortunately, the same is true of the Bar instance, so now you have two valid instances for show Something.
Since typeclasses are always open (and indeed, Show must be open if you are able to define your own instances for it), it’s impossible to know that someone will not come along and add their own similar instance, then create an instance on your datatype, creating ambiguity. This is effectively the classic diamond problem from OO multiple inheritance in typeclass form.
The best you can get is to create a wrapper type that provides the relevant instances:
{-# LANGUAGE ExistentialQuantification #-}
data ShowableWrapper = forall w. Wrapper w => ShowableWrapper w

instance Show ShowableWrapper where
  show (ShowableWrapper w) = show . toList $ open w
At that point, though, you really aren’t getting much of an advantage over just writing your own showWrapper :: Wrapper w => w -> String function.
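That plain-function alternative is just the following (a minimal sketch, assuming Data.Map's toList is in scope as in the question):
showWrapper :: Wrapper w => w -> String
showWrapper = show . toList . open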

How can I make my type an instance of Arbitrary?

I have the following data and function
data Foo = A | B deriving (Show)
foolist :: Maybe Foo -> [Foo]
foolist Nothing = [A]
foolist (Just x) = [x]
prop_foolist x = (length (foolist x)) == 1
when running quickCheck prop_foolist, ghc tells me that Foo needs to be an instance of Arbitrary.
No instance for (Arbitrary Foo) arising from a use of ‘quickCheck’
In the expression: quickCheck prop_foolist
In an equation for ‘it’: it = quickCheck prop_foolist
I tried data Foo = A | B deriving (Show, Arbitrary), but this results in
Can't make a derived instance of ‘Arbitrary Foo’:
‘Arbitrary’ is not a derivable class
Try enabling DeriveAnyClass
In the data declaration for ‘Foo’
However, I can't figure out how to enable DeriveAnyClass. I just wanted to use QuickCheck with my simple function! The possible values of x are Nothing, Just A, and Just B. Surely this should be possible to test?
There are two reasonable approaches:
Reuse an existing instance
If there's another instance that looks similar, you can use it. The Gen type is an instance of Functor, Applicative, and even Monad, so you can easily build generators from other ones. This is probably the most important general technique for writing Arbitrary instances. Most complex instances will be built up from one or more simpler ones.
boolToFoo :: Bool -> Foo
boolToFoo False = A
boolToFoo True  = B

instance Arbitrary Foo where
  arbitrary = boolToFoo <$> arbitrary
In this case, Foo can't be "shrunk" to subparts in any meaningful way, so the default trivial implementation of shrink will work fine. If it were a more interesting type, you could have used some analogue of
shrink = map boolToFoo . shrink . fooToBool
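Here fooToBool would be the inverse of boolToFoo; it isn't defined above, so a minimal sketch:
fooToBool :: Foo -> Bool
fooToBool A = False
fooToBool B = True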
Use the pieces available in Test.QuickCheck.Arbitrary and/or Test.QuickCheck.Gen
In this case, it's pretty easy to just put together the pieces:
import Test.QuickCheck.Arbitrary

data Foo = A | B
  deriving (Show, Enum, Bounded)

instance Arbitrary Foo where
  arbitrary = arbitraryBoundedEnum
As mentioned, the default shrink implementation would be fine in this case. In the case of a recursive type, you'd likely want to add
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
and then derive Generic for your type and use
instance Arbitrary ... where
  ...
  shrink = genericShrink
As the documentation warns, genericShrink does not respect any internal validity conditions you may wish to impose, so some care may be required in some cases.
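As a concrete illustration of that recipe (a sketch with a made-up recursive type Tree, not from the question):
{-# LANGUAGE DeriveGeneric #-}
import GHC.Generics (Generic)
import Test.QuickCheck

data Tree = Leaf | Node Tree Tree
  deriving (Show, Generic)

instance Arbitrary Tree where
  -- size-bounded generator so the tree stays finite
  arbitrary = sized gen
    where
      gen 0 = pure Leaf
      gen n = oneof [pure Leaf, Node <$> gen (n `div` 2) <*> gen (n `div` 2)]
  shrink = genericShrink   -- shrinks to subterms and shrunken components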
You asked about DeriveAnyClass. If you wanted that, you'd add
{-# LANGUAGE DeriveAnyClass #-}
to the top of your file. But you don't want that. You certainly don't want it here, anyway. It only works for classes that have a full complement of defaults based on Generics, typically using the DefaultSignatures extension. In this case, there is no default arbitrary :: Generic a => Gen a line in the Arbitrary class definition, and arbitrary is mandatory. So an instance of Arbitrary produced by DeriveAnyClass will produce a runtime error as soon as QuickCheck tries to call its arbitrary method.
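With either Arbitrary Foo instance above in scope, the original property then runs directly (a minimal sketch):
import Test.QuickCheck

main :: IO ()
main = quickCheck prop_foolist   -- should report something like: +++ OK, passed 100 tests.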

Transparently implementing a particular form of dynamic typing

The basic idea is that I have a range of functions that work on any types from a particular class, but at runtime the program is supposed to read a configuration file and extract an element of one of the types in the class.
For instance, I have a 'Coefficient' class, various instances of it, and functions of various types that are polymorphic over types of that class; at runtime one particular type of that class is to be determined, and passed around.
I'm unsure how to properly address this; I tried making up 'compound' types, doing something like:
data CompoundCoeff = CompoundInt Int | CompoundDouble Double | ...
where Int, Double, ... are instances of the class 'Coefficient'.
However, it started to become a big effort to adapt all the functions involved in the code to work with these compound types (and it's not a nice solution either, really). It would be OK if all functions had the same, easy type, e.g.
Coefficient a => a -> (stuff not involving a anymore)
but that's unfortunately not the case.
Another issue I ran into, is that I'm using type families, and have something like
class (Monoid (ColourData c), Coordinate (InputData c)) => ColourScheme c where
  type ColourData c :: *
  type InputData c :: *

  colouriseData :: c -> (ColourData c) -> AlphaColour Double
  processInput  :: c -> InputData c -> ColourData c
This doesn't go through cleanly if I have to use some sort of compound ColourData datatype, like the previous one; in particular I can no longer guarantee that the data stream gives a consistent type (and not just different 'subtypes' of a compound type), and would (among other things) have to make up a bogus Monoid instance if I did make up a compound ColourData type.
I've also looked into Data.Dynamic, but again I can't see how it would properly address the issues; the exact same problems seem to appear (well, slightly worse even, given that there is only one 'generic' Dynamic type as I understand it).
Question: How can I implement dynamic datatypes subordinate to particular classes, without having to rewrite all the functions involving those data types? It would be best if I didn't have to sacrifice any type safety, but I'm not too optimistic.
The program is supposed to read a configuration file at runtime, and all the requisite functions, polymorphic over the relevant class, are to be applied.
The traditional way to provide an object that guarantees that it is an instance of typeclass Foo, but makes no additional guarantees, is like so:
{-# LANGUAGE ExistentialQuantification #-}

data SomeFoo = forall a . Foo a => SomeFoo a

instance Foo SomeFoo where
  -- all operations just unwrap the SomeFoo straightforwardly
or, with GADTs, which might be more readable...
data SomeFoo where
  SomeFoo :: Foo a => a -> SomeFoo
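Concretely, for a made-up single-method Foo (each real method would delegate the same way), the "just unwrap" instance alluded to above looks like this:
class Foo a where
  describe :: a -> String

instance Foo SomeFoo where
  describe (SomeFoo x) = describe x   -- unwrap and delegate to the inner value's instance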
One proposal would be to write a single top-level function that does all the finishing touches once you've chosen a type:
topLevel :: SomeTypeClass a => a -> IO ()
Your program can then be written something like this:
main = do
  config <- readConfig
  case config of
    UseDouble n                -> topLevel n
    UseSymbolic x              -> topLevel x
    UseWidgetFrobnosticator wf -> topLevel wf

Is there a way to define an existentially quantified newtype in GHC Haskell?

Is it possible in (GHC) Haskell to define an existentially-quantified newtype? I understand that if type classes are involved it can't be done in a dictionary-passing implementation, but for my purposes type-classes are not needed. What I'd really like to define is this:
newtype Key t where Key :: t a -> Key t
But GHC does not seem to like it. Currently I'm using data Key t where Key :: !(t a) -> Key t. Is there any way (perhaps just using -funbox-strict-fields?) to define a type with the same semantics and overhead as the newtype version above? My understanding is that even with strict fields unboxed there will still be an extra tag word, though I could be totally wrong there.
This is not something that's causing me any noticeable performance issues. It just surprised me that the newtype was not allowed. I'm a naturally curious person, so I can't help wondering whether the version I have is being compiled to the same representation or whether any equivalent type could be defined which would be.
No, according to GHC:
A newtype constructor cannot have an existential context
However, data is just fine:
{-# LANGUAGE ExistentialQuantification #-}
data E = forall a. Show a => E a

test = [ E "foo"
       , E (7 :: Int)
       , E 'x'
       ]
main = mapM_ (\(E e) -> print e) test
E.g.
*Main> main
"foo"
7
'x'
Logically, you do need the dictionary (or tag) allocated somewhere. And that doesn't make sense if you erase the constructor.
Note: You can't unbox functions though, as you seem to be hinting at, nor polymorphic fields.
Is there any way (perhaps just using -funbox-strict-fields?) to define a type with the same semantics and overhead as the newtype version above?
Removing the -XGADTs helps me think about this:
{-# LANGUAGE ExistentialQuantification #-}
data Key t = forall a. Key !(t a)
As in, Key (Just 'x') :: Key Maybe
So you want to guarantee the Key constructor is erased.
Here's the code in GHC for type checking the constraints on newtype:
-- Checks for the data constructor of a newtype
checkNewDataCon con
  = do { checkTc (isSingleton arg_tys) (newtypeFieldErr con (length arg_tys))
           -- One argument
       ; checkTc (null eq_spec) (newtypePredError con)
           -- Return type is (T a b c)
       ; checkTc (null ex_tvs && null eq_theta && null dict_theta) (newtypeExError con)
           -- No existentials
       ; checkTc (not (any isBanged (dataConStrictMarks con)))
                 (newtypeStrictError con)
           -- No strictness
We can see why ! won't have any effect on the representation: the field is polymorphic, so it has to use the universal boxed representation. An unlifted newtype doesn't make sense either, nor do non-singleton constructors.
The only thing I can think of is that, like for record accessors for existentials, the opaque type variable will escape if the newtype is exposed.
I don't see any reason it couldn't be made to work, but perhaps GHC has some internal representation issues with it.

Resources