How to state that a type variable in a newtype statement is of a type that belongs to some type class? - haskell

Suppose that I have this newtype:
newtype SomeType a = SomeType { foo :: OtherType a }
I want to ensure that a is showable (belongs to the type class Show x).
How do I ensure that? (Is it even possible?)
Bonus points: am I using the terminology correctly?

It is possible with the DatatypeContexts extension, but it is strongly discouraged:
newtype Show a => SomeType a = SomeType { foo :: Maybe a }
It is recommended to put the constraint on the functions that use SomeType or use GADTs. See the answers to these questions for more information.
Alternative for deprecated -XDatatypeContext?
DatatypeContexts Deprecated in Latest GHC: Why?
Basically, it doesn't add anything useful and it makes you have to put constraints where they wouldn't otherwise be necessary.
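To make the two recommended alternatives concrete, here is a minimal sketch, using Maybe a for the field as in the example above (the describe function and the SomeTypeG name are illustrative, not from the question):

{-# LANGUAGE GADTs #-}

-- Alternative 1: leave the type unconstrained and put the constraint
-- on the functions that actually need it.
newtype SomeType a = SomeType { foo :: Maybe a }

describe :: Show a => SomeType a -> String
describe (SomeType x) = "SomeType " ++ show x

-- Alternative 2: a GADT whose constructor captures the Show dictionary;
-- pattern matching on SomeTypeG brings Show a back into scope.
data SomeTypeG a where
  SomeTypeG :: Show a => Maybe a -> SomeTypeG a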

To demonstrate @David's answer in a small example, imagine you're implementing another incarnation of a balanced binary tree. Of course the keys must be ordered (have an Ord instance), so you add a constraint to the type declaration:
data Ord k => Tree k v = Tip | Bin k v (Tree k v) (Tree k v)
Now this constraint infects every other signature wherever you use this type, even when you don't actually need to order keys. That probably wouldn't be so bad (after all, you'll need to order them at least somewhere – otherwise you aren't really using this Tree) and it definitely doesn't break the code, but it still makes the code less readable, adds noise, and distracts from the important things.
empty :: Ord k => Tree k v
singleton :: Ord k => k -> v -> Tree k v
find :: Ord k => (v -> Bool) -> Tree k v -> Maybe v
instance Ord k => Functor (Tree k)
In all of these signatures the constraint could be omitted without any problems.
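For comparison, dropping the constraint from the declaration means only the operations that genuinely compare keys mention Ord. A minimal sketch (this insert is illustrative and skips rebalancing):

data Tree k v = Tip | Bin k v (Tree k v) (Tree k v)

empty :: Tree k v -- no Ord needed
empty = Tip

insert :: Ord k => k -> v -> Tree k v -> Tree k v -- Ord only where keys are compared
insert k v Tip = Bin k v Tip Tip
insert k v (Bin k' v' l r) = case compare k k' of
  LT -> Bin k' v' (insert k v l) r
  GT -> Bin k' v' l (insert k v r)
  EQ -> Bin k v l r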

Related

How to change the behavior of the function based on class constraints in Haskell?

I have a data type that represents a collection of values paired with a probability. At first the implementation just used good old lists but, as you can imagine, this can be inefficient (for example, I use a Tree instead of a list to store ordered values).
After some research, I thought about using GADTs:
data Tree a b = Leaf | Node { left :: Tree a b, val :: (a, b), right :: Tree a b }

data Prob a where
  POrd :: Ord a => Tree a Rational -> Prob a
  PEq :: Eq a => [(a, Rational)] -> Prob a
  PPlain :: [(a, Rational)] -> Prob a
So far, so good. I'm now stuck trying to create a smart constructor for my new data type, one that takes [(a, Rational)] and, depending on the constraints a satisfies, chooses the correct constructor for Prob. Basically:
prob :: [(a, Rational)] -> Prob a
-- chooses the "best" constructor based on the constraints of a
Is this at all possible? If not, how should I go about designing something better? Am I missing something?
Thanks!
There is no way to perform a check of the form "is type T in class C?" in Haskell. The issue is that it is hard to answer such a question negatively while still allowing separate compilation: T could be in C in the scope of one module but not in the scope of another, which would make the semantics rather fragile.
To ensure consistency, Haskell only allows you to require a constraint, raising a compile-time error when it cannot be satisfied.
As far as I can see, the best you can do is to use another custom type class, which tells you which case is the best one. E.g.
{-# LANGUAGE GADTs, AllowAmbiguousTypes, TypeApplications, ScopedTypeVariables #-}

data BestConstraint a where
  BCOrd :: Ord a => BestConstraint a
  BCEq :: Eq a => BestConstraint a
  BCNone :: BestConstraint a

class BC a where
  bestC :: BestConstraint a

instance BC Int where bestC = BCOrd
-- ... etc.

instance BC a => BC [a] where
  bestC = case bestC @a of
    BCOrd -> BCOrd
    BCEq -> BCEq
    BCNone -> BCNone

prob :: forall a. BC a => [(a, Rational)] -> Prob a
prob xs = case bestC @a of
  BCOrd -> POrd .... -- build the tree from xs
  BCEq -> PEq xs
  BCNone -> PPlain xs
You will have to provide an instance for any type you want to use, though.
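For example, a type with no Eq or Ord instance simply selects the plain representation (the Color type below is an illustrative assumption, not from the question):

data Color = Red | Green | Blue

instance BC Color where
  bestC = BCNone

With this instance in scope, prob [(Red, 1/2), (Blue, 1/2)] picks the PPlain constructor, while prob [(1 :: Int, 1)] picks POrd.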

Generics : run-time ADT for types with instances

Is it possible, with Haskell / GHC, to extract an algebraic data type representing all types with Eq and Ord instances? This would probably need Generics, Typeable, etc.
What I would like is something like:
data Data_Eq_Ord = Data_String String
                 | Data_Int Int
                 | Data_Bool Bool
                 | ...
                 deriving (Eq, Ord)
For all types known to have instances for Eq and Ord. If it makes the solution easier, we can limit our scope to Ord instances, since Eq is implied by Ord. But it would be interesting to know whether intersecting constraints is possible.
This data type would be useful because it could be used wherever Eq and Ord constraints are required, and pattern-matched at runtime to refine on types.
I would need this to implement a generic Map Key Value, where Key would be this type, in a Document Indexing library, where the keys and their types are known only at run-time. For the moment I have worked around the issue by defining a data DocIndexKey and a FieldKey class, but this is not fully satisfactory since it requires boilerplate and can't cover all legitimate candidates.
Any good alternative approach to this situation is welcome. For some reasons, I prefer to avoid Template Haskell.
Well, it's not an ADT, but this definitely works:
{-# LANGUAGE ConstraintKinds, ExistentialQuantification, FlexibleContexts,
             FlexibleInstances, MultiParamTypeClasses, RankNTypes,
             ScopedTypeVariables, TypeOperators, UndecidableInstances #-}

import Data.Foldable (traverse_)
import Data.List (sort)
import Data.Type.Equality ((:~:)(Refl), testEquality)
import Type.Reflection (TypeRep, Typeable, typeRep)

data Satisfying c = forall a. c a => Satisfy a

class (l a, r a) => And l r a where
instance (l a, r a) => And l r a where

ex :: [Satisfying (Typeable `And` Show `And` Ord)]
ex = [ Satisfy (7 :: Int)
     , Satisfy "Hello"
     , Satisfy (5 :: Int)
     , Satisfy [10..20 :: Int]
     , Satisfy ['a'..'z']
     , Satisfy ((), 'a') ]

-- An example of use, with "complicated" logic
data With f c = forall a. c a => With (f a)

-- vvvvvvvvvvvvvvvvvvvvvvvvvv QuantifiedConstraints chokes on this, which is probably a bug...
partitionTypes :: (forall a. c a => TypeRep a) -> [Satisfying c] -> [[] `With` c]
partitionTypes rep = foldr go []
  where go (Satisfy x) [] = [With [x]]
        go x'@(Satisfy (x :: a)) (xs'@(With (xs :: [b])) : xss) =
          case testEquality rep rep :: Maybe (a :~: b) of
            Just Refl -> With (x : xs) : xss
            Nothing -> xs' : go x' xss

main :: IO ()
main = traverse_ (\(With xs) -> print (sort xs)) $ partitionTypes typeRep ex
Exhaustivity is much harder. Perhaps with a compiler plugin you could get GHC to do it, but why bother? I don't believe GHC actually tries to keep track of which types it has seen. In particular, you'd have to scan all modules in the project and its dependencies, even those that haven't been loaded by the module containing the type definition. You'd have to implement it from the ground up. And, as this answer shows, I very much doubt you would actually be able to use such exhaustivity for anything that you can't already do by just taking the open universe as it is.

Constraints for instances of typeclasses not mentioning the constrained type variable

Let's say we're writing a simple bitset that uses a field of type c (think Word) to signify the presence of different values of type a:
data BitSet c a = BitSet c
  deriving (Eq, Ord, Data, Typeable, Generic, NFData)
isSet :: (Bits c, Enum a) => a -> BitSet c a -> Bool
isSet e (BitSet c) = testBit c $ fromEnum e
a is assumed to be Enum so that it can be converted to an integer corresponding to the bit index.
Let's now try writing a Foldable instance for BitSet c, starting with the following:
instance FiniteBits c => Foldable (BitSet c) where
  foldr f b0 (BitSet c) = foldr f b0 list
    where (list :: [a]) = toEnum <$> filter (testBit c) [0 .. finiteBitSize c - 1]
But this won't work: there is no constraint that a should satisfy Enum a, and I don't see any way to provide one (and I have a gut feeling that it doesn't make much sense from a more theoretical point of view if a isn't mentioned in the instance declaration, but that's probably another story).
The most voted answer here suggests that using GADTs might help, but GADTs have their own shortcomings:
One would need to use standalone deriving, which gets particularly ugly for Data and Typeable, and, what's worse,
GADTs seem to break derivation of Generic for this type at all.
Is there any other way to write a Foldable instance while still keeping autoderived Generic instance?
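For reference, here is a minimal sketch of the GADT approach alluded to above, under the assumption that it matches the linked answer (the BitSetG name is illustrative): the constructor captures the FiniteBits c and Enum a constraints, so the Foldable instance compiles, at the cost of the deriving problems just described.

{-# LANGUAGE GADTs #-}

import Data.Bits

data BitSetG c a where
  BitSetG :: (FiniteBits c, Enum a) => c -> BitSetG c a

instance Foldable (BitSetG c) where
  foldr f z (BitSetG c) =
    foldr f z [ toEnum i | i <- [0 .. finiteBitSize c - 1], testBit c i ]

Note that the instance itself needs no context: pattern matching on BitSetG brings Enum a and FiniteBits c into scope.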

Why is context reduction necessary?

I've just read this paper ("Type classes: exploring the design space" by Peyton Jones, Jones & Meijer), which explains some challenges with the early typeclass system of Haskell, and how to improve it.
Many of the issues that they raise are related to context reduction which is a way to reduce the set of constraints over instance and function declarations by following the "reverse entailment" relationship.
e.g. if you have instance (Ord a, Ord b) => Ord (a, b) ... somewhere in scope, then within contexts Ord (a, b) gets reduced to {Ord a, Ord b} (reduction does not always shrink the number of constraints).
I did not understand from the paper why this reduction was necessary.
Well, I gathered it was used to perform some form of type checking. When you have your reduced set of constraint, you can check that there exist some instance that can satisfy them, otherwise it's an error. I'm not too sure what the added value of that is, since you would notice the problem at the use site, but okay.
But even if you have to do that check, why use the result of reduction inside inferred types? The paper points out it leads to unintuitive inferred types.
The paper is quite ancient (1997) but as far as I can tell, context reduction is still an ongoing concern. The Haskell 2010 spec does mention the inference behaviour I explain above (link).
So, why do it this way?
I don't know if this is The Reason, necessarily, but it might be considered A Reason: in early Haskell, type signatures were only permitted to have "simple" constraints, namely, a type class name applied to a type variable. Thus, for example, all of these were okay:
Ord a => a -> a -> Bool
Eq a => a -> a -> Bool
Graph gr => gr n e -> [n]
But none of these:
Ord (Tree a) => Tree a -> Tree a -> Bool
Eq (a -> b) => (a -> b) -> (a -> b) -> Bool
Graph Gr => Gr n e -> [n]
I think there was a feeling then -- and still today, as well -- that allowing the compiler to infer a type which one couldn't write manually would be a bit unfortunate. Context reduction was a way of turning the above signatures either into ones that could also be written by hand, or into an informative error. For example, since one might reasonably have
instance Ord a => Ord (Tree a)
in scope, we could turn the illegal signature Ord (Tree a) => ... into the legal signature Ord a => .... On the other hand, if we don't have any instance of Eq for functions in scope, one would report an error about the type which was inferred to require Eq (a -> b) in its context.
This has a couple of other benefits:
Intuitively pleasing. Many of the context reduction rules do not change whether the type is legal, but do reflect things humans would do when writing the type. I'm thinking here of the de-duplication and subsumption rules that let you turn, e.g. (Eq a, Eq a, Ord a) into just Ord a -- a transformation one definitely would want to do for readability.
This can frequently catch stupid errors; rather than inferring a type like Eq (Integer -> Integer) => Bool which can't be satisfied in a law-abiding way, one can report an error like Perhaps you did not apply a function to enough arguments?. Much friendlier!
It becomes the compiler's job to pinpoint what went wrong. Instead of inferring a complicated context like Eq (Tree (Grizwump a, [Flagle (Gr n e) (Gr n' e') c])) and complaining that the context is not satisfiable, it instead is forced to reduce this to the constituent constraints; it will instead complain that we couldn't determine Eq (Grizwump a) from the existing context -- a much more precise and actionable error.
I think this is indeed desirable in a dictionary passing implementation. In such an implementation, a "dictionary", that is, a tuple or record of functions is passed as implicit argument for every type class constraint in the type of the applied function.
Now, the question is simply when and how those dictionaries are created. Observe that, for simple types like Int, all dictionaries for whatever type class Int is an instance of will necessarily be constants.
Not so in the case of parameterized types like lists, Maybe or tuples. It is clear that to show a tuple, for instance, the Show instances of the actual tuple elements need to be known. Hence such a polymorphic dictionary cannot be a constant.
It appears that the principle guiding the dictionary passing is such that only dictionaries for types that appear as type variables in the type of the applied function are passed. Or, to put it differently: no redundant information is replicated.
Consider this function:
f :: (Show a, Show b) => (a,b) -> Int
f ab = length (show ab)
The compiler knows that a tuple of showable components is itself showable, so a constraint like Show (a,b) need not appear when we already know (Show a, Show b).
An alternative implementation would be possible, though, where the caller would be responsible for creating and passing dictionaries. This could work without context reduction, such that the type of f would look like:
f :: Show (a,b) => (a,b) -> Int
But this would mean that the code to create the tuple dictionary would have to be repeated at every call site. And it is easy to come up with examples where the number of necessary constraints actually increases, as in:
g :: (Show (a,a), Show (b,b), Show (a,b), Show (b,a)) => a -> b -> Int
g a b = maximum (map length [show (a,a), show (a,b), show (b,a), show (b,b)])
It is instructive to implement a type class/instance system with actual records that are explicitly passed. For example:
data Show' a = Show' { show' :: a -> String }

showInt :: Show' Int
showInt = Show' { show' = intshow }
  where
    intshow :: Int -> String
    intshow = show
Once you do this you will probably easily recognize the need for "context reduction".
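To make this concrete, here is what the dictionary corresponding to instance (Show a, Show b) => Show (a, b) looks like in this encoding: a function from the component dictionaries rather than a constant (showPair and f' are illustrative names, continuing the Show' sketch above):

showPair :: Show' a -> Show' b -> Show' (a, b)
showPair sa sb = Show' { show' = \(a, b) -> "(" ++ show' sa a ++ "," ++ show' sb b ++ ")" }

-- f from above, in explicit-dictionary style: the (Show a, Show b) context
-- becomes two dictionary arguments, and the tuple dictionary is built locally.
f' :: Show' a -> Show' b -> (a, b) -> Int
f' sa sb ab = length (show' (showPair sa sb) ab)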

Why constraints on data are a bad thing?

I know this question has been asked and answered lots of times but I still don't really understand why putting constraints on a data type is a bad thing.
For example, let's take Data.Map k a. All of the useful functions involving a Map need an Ord k constraint. So there is an implicit constraint on the definition of Data.Map. Why is it better to keep it implicit instead of letting the compiler and programmers know that Data.Map needs an orderable key?
Also, specifying a concrete type in a data declaration is common, and one can see it as a way of "super-constraining" a data type.
For example, I can write
data User = User { name :: String }
and that's acceptable. However, is that not a constrained version of
data User' s = User' { name :: s }
After all, 99% of the functions I'll write for the User type won't need a String, and the few that do would probably only need s to be IsString and Show.
So, why is the lax version of User considered bad:
data (IsString s, Show s, ...) => User'' s = User'' { name :: s }
while both User and User' are considered good?
I'm asking this, because lots of the time, I feel I'm unnecessarily narrowing my data (or even function) definitions, just to not have to propagate constraints.
Update
As far as I understand, data type constraints only apply to the constructor and don't propagate. So my question is then, why do data type constraints not work as expected (and propagate)? It's an extension anyway, so why not have a new extension doing data properly, if it was considered useful by the community?
TL;DR:
Use GADTs to provide implicit data contexts.
Don't use any kind of data constraint if you could do with Functor instances etc.
Map's too old to change to a GADT anyway.
Scroll to the bottom if you want to see the User implementation with GADTs
Let's use a case study of a Bag, where all we care about is how many times something is in it (like an unordered sequence); we nearly always need an Eq constraint to do anything useful with it.
I'll use the inefficient list implementation so as not to muddy the waters over the Data.Map issue.
GADTs - the solution to the data constraint "problem"
The easy way to do what you're after is to use a GADT:
Notice below how the Eq constraint not only forces you to use types with an Eq instance when making GADTBags, it provides that instance implicitly wherever the GADTBag constructor appears. That's why count doesn't need an Eq context, whereas countV2 does - it doesn't use the constructor:
{-# LANGUAGE GADTs #-}

data GADTBag a where
  GADTBag :: Eq a => [a] -> GADTBag a

unGADTBag (GADTBag xs) = xs

instance Show a => Show (GADTBag a) where
  showsPrec i (GADTBag xs) = showParen (i > 9) (("GADTBag " ++ show xs) ++)

count :: a -> GADTBag a -> Int -- no Eq here
count a (GADTBag xs) = length . filter (== a) $ xs -- but == here

countV2 a = length . filter (== a) . unGADTBag

size :: GADTBag a -> Int
size (GADTBag xs) = length xs
ghci> count 'l' (GADTBag "Hello")
2
ghci> :t countV2
countV2 :: Eq a => a -> GADTBag a -> Int
Now we didn't need the Eq constraint when we found the total size of the bag, but it didn't clutter up our definition anyway. (We could have used size = length . unGADTBag just as well.)
Now let's make a functor:
instance Functor GADTBag where
  fmap f (GADTBag xs) = GADTBag (map f xs)
oops!
DataConstraints_so.lhs:49:30:
    Could not deduce (Eq b) arising from a use of `GADTBag'
    from the context (Eq a)
That's unfixable (with the standard Functor class) because I can't restrict the type of fmap, but I would need to for the new list.
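The usual workaround (the mapBag mentioned further down) is to write a constrained map of our own instead of a Functor instance, for example:

mapBag :: Eq b => (a -> b) -> GADTBag a -> GADTBag b
mapBag f (GADTBag xs) = GADTBag (map f xs)

This works because the Eq b that the GADTBag constructor demands is now part of the function's own signature; the price is giving up fmap and every Functor-based combinator.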
Data Constraint version
Can we do as you asked? Well, yes, except that you have to keep repeating the Eq constraint wherever you use the constructor:
{-# LANGUAGE DatatypeContexts #-}

data Eq a => EqBag a = EqBag { unEqBag :: [a] }
  deriving Show

count' a (EqBag xs) = length . filter (== a) $ xs
size' (EqBag xs) = length xs -- Note: doesn't use (==) at all
Let's go to ghci to find out some less pretty things:
ghci> :so DataConstraints
DataConstraints_so.lhs:1:19: Warning:
-XDatatypeContexts is deprecated: It was widely considered a misfeature,
and has been removed from the Haskell language.
[1 of 1] Compiling Main ( DataConstraints_so.lhs, interpreted )
Ok, modules loaded: Main.
ghci> :t count
count :: a -> GADTBag a -> Int
ghci> :t count'
count' :: Eq a => a -> EqBag a -> Int
ghci> :t size
size :: GADTBag a -> Int
ghci> :t size'
size' :: Eq a => EqBag a -> Int
ghci>
So our EqBag count' function requires an Eq constraint, which I think is perfectly reasonable, but our size' function also requires one, which is less pretty. This is because the type of the EqBag constructor is EqBag :: Eq a => [a] -> EqBag a, and this constraint must be added every time.
We can't make a functor here either:
instance Functor EqBag where
  fmap f (EqBag xs) = EqBag (map f xs)
for exactly the same reason as with the GADTBag
Constraintless bags
data ListBag a = ListBag { unListBag :: [a] }
  deriving Show

count'' a = length . filter (== a) . unListBag
size'' = length . unListBag

instance Functor ListBag where
  fmap f (ListBag xs) = ListBag (map f xs)
Now the types of count'' and size'' are exactly as we expect, and we can use standard constructor classes like Functor:
ghci> :t count''
count'' :: Eq a => a -> ListBag a -> Int
ghci> :t size''
size'' :: ListBag a -> Int
ghci> fmap (Data.Char.ord) (ListBag "hello")
ListBag {unListBag = [104,101,108,108,111]}
ghci>
Comparison and conclusions
The GADT version automagically propagates the Eq constraint everywhere the constructor is used. The type checker can rely on there being an Eq instance, because you can't use the constructor for a non-Eq type.
The DatatypeContexts version forces the programmer to manually propagate the Eq constraint, which is fine by me if you want it, but is deprecated because it doesn't give you anything more than the GADT one does and was seen by many as pointless and annoying.
The unconstrained version is good because it doesn't prevent you from making Functor, Monad etc instances. The constraints are written exactly when they're needed, no more or less. Data.Map uses the unconstrained version partly because unconstrained is generally seen as most flexible, but also partly because it predates GADTs by some margin, and there needs to be a compelling reason to potentially break existing code.
What about your excellent User example?
I think that's a great example of a one-purpose data type that benefits from a constraint on the type, and I'd advise you to use a GADT to implement it.
(That said, sometimes I have a one-purpose data type and end up making it unconstrainedly polymorphic just because I love to use Functor (and Applicative), and would rather use fmap than mapBag because I feel it's clearer.)
{-# LANGUAGE GADTs #-}

import Data.String

data User s where
  User :: (IsString s, Show s) => s -> User s

name :: User s -> s
name (User s) = s

instance Show (User s) where -- cool, no Show context
  showsPrec i (User s) = showParen (i > 9) (("User " ++ show s) ++)

instance (IsString s, Show s) => IsString (User s) where
  fromString = User . fromString
Notice that since fromString does construct a value of type User s, we need the context explicitly. After all, we composed with the constructor User :: (IsString s, Show s) => s -> User s. The User constructor removes the need for an explicit context when we pattern match (destruct), because it already enforced the constraint when we used it as a constructor.
We didn't need the Show context in the Show instance because we used (User s) on the left hand side in a pattern match.
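A quick usage sketch (assuming OverloadedStrings; the value alice is illustrative):

{-# LANGUAGE OverloadedStrings #-}

alice :: User String
alice = "alice" -- goes through the IsString instance, i.e. User . fromString

ghci> alice
User "alice"
ghci> name alice
"alice"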
Constraints
The problem is that constraints are not a property of the data type, but of the algorithm/function that operates on them. Different functions might need different and unique constraints.
A Box example
As an example, let's assume we want to create a container called Box which contains only 2 values.
data Box a = Box a a
We want it to:
be showable
allow the sorting of the two elements via sort
Does it make sense to apply the constraints of both Ord and Show on the data type? No, because the data type in itself could be only shown or only sorted, and therefore the constraints are related to its use, not its definition.
instance (Show a) => Show (Box a) where
  show (Box a b) = concat ["'", show a, ", ", show b, "'"]

instance (Eq a) => Eq (Box a) where -- needed: Eq is a superclass of Ord
  Box a b == Box c d = a == c && b == d

instance (Ord a) => Ord (Box a) where
  compare (Box a b) (Box c d) =
    let ca = compare a c
        cb = compare b d
    in if ca /= EQ then ca else cb
The Data.Map case
Data.Map's Ord constraint on the key type is really needed only when we have more than one element in the container. Otherwise the container is usable even without an Ord key. For example, this algorithm:
transf :: Map NonOrd Int -> Map NonOrd Int
transf x =
  if Map.null x
    then Map.singleton NonOrdA 1
    else x
works just fine without the Ord constraint and always produces a non-empty map.
Using DatatypeContexts reduces the number of programs you can write. If most of those illegal programs are nonsense, you might say it's worth the runtime cost associated with GHC passing in a type class dictionary that isn't used. For example, if we had
data Ord k => MapDTC k a
then @jefffrey's transf is rejected. But we should probably have transf _ = return (NonOrdA, 1) instead.
In some sense the context is documentation that says "every Map must have ordered keys". If you look at all of the functions in Data.Map you'll reach a similar conclusion: "every useful Map has ordered keys". You can create maps with unordered keys using
mapKeysMonotonic :: (k1 -> k2) -> Map k1 a -> Map k2 a
singleton :: k2 -> a -> Map k2 a
but the moment you try to do anything useful with them, you'll wind up with No instance for (Ord k2) somewhat later.
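To see that in action, here is a small sketch (NonOrd is the hypothetical key type from the transf example above, assumed to have no Ord instance):

import Data.Map (Map)
import qualified Data.Map as Map

data NonOrd = NonOrdA | NonOrdB

ok :: Map NonOrd Int
ok = Map.singleton NonOrdA 1 -- compiles: singleton never compares keys

-- bad = Map.member NonOrdA ok -- rejected: No instance for (Ord NonOrd)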
