Confused about GADTs and propagating constraints - haskell

There's plenty of Q&A about GADTs being better than DatatypeContexts, because GADTs automagically make constraints available in the right places. For example here, here, here. But sometimes it seems I still need an explicit constraint. What's going on? Example adapted from this answer:
{-# LANGUAGE GADTs #-}
import Data.Maybe -- fromJust
data GADTBag a where
MkGADTBag :: Eq a => { unGADTBag :: [a] } -> GADTBag a
baz (MkGADTBag x) (Just y) = x == y
baz2 x y = unGADTBag x == fromJust y
-- unGADTBag :: GADTBag a -> [a] -- inferred, no Eq a
-- baz :: GADTBag a -> Maybe [a] -> Bool -- inferred, no Eq a
-- baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool -- inferred, with Eq a
Why can't the type for unGADTBag tell us Eq a?
baz and baz2 are morally equivalent, yet have different types. Presumably because unGADTBag has no Eq a, then the constraint can't propagate into any code using unGADTBag.
But with baz2 there's an Eq a constraint hiding inside the GADTBag a. Presumably baz2's Eq a will want a duplicate of the dictionary already there(?)
Is it that potentially a GADT might have many data constructors, each with different (or no) constraints? That's not the case here, or with typical examples for constrained data structures like Bags, Sets, Ordered Lists.
The equivalent for a GADTBag datatype using DatatypeContexts infers baz's type same as baz2.
Bonus question: why can't I get an ordinary ... deriving (Eq) for GADTBag? I can get one with StandaloneDeriving, but it's blimmin obvious, why can't GHC just do it for me?
deriving instance (Eq a) => Eq (GADTBag a)
Is the problem again that there might be other data constructors?
(Code exercised at GHC 8.6.5, if that's relevant.)
Addit: in light of #chi's and #leftroundabout's answers -- neither of which I find convincing. All of these give *** Exception: Prelude.undefined:
*DTContexts> unGADTBag undefined
*DTContexts> unGADTBag $ MkGADTBag undefined
*DTContexts> unGADTBag $ MkGADTBag (undefined :: String)
*DTContexts> unGADTBag $ MkGADTBag (undefined :: [a])
*DTContexts> baz undefined (Just "hello")
*DTContexts> baz (MkGADTBag undefined) (Just "hello")
*DTContexts> baz (MkGADTBag (undefined :: String)) (Just "hello")
*DTContexts> baz2 undefined (Just "hello")
*DTContexts> baz2 (MkGADTBag undefined) (Just "hello")
*DTContexts> baz2 (MkGADTBag (undefined :: String)) (Just "hello")
Whereas these two give the same type error at compile time * Couldn't match expected type ``[Char]'* No instance for (Eq (Int -> Int)) arising from a use of ``MkGADTBag'/ ``baz2' respectively [Edit: my initial Addit gave the wrong expression and wrong error message]:
*DTContexts> baz (MkGADTBag (undefined :: [Int -> Int])) (Just [(+ 1)])
*DTContexts> baz2 (MkGADTBag (undefined :: [Int -> Int])) (Just [(+ 1)])
So baz, baz2 are morally equivalent not just in that they return the same result for the same well-defined arguments; but also in that they exhibit the same behaviour for the same ill-defined arguments. Or they differ only in where the absence of an Eq instance gets reported?
#leftroundabout Before you've actually deconstructed the x value, there's no way of knowing that the MkGADTBag constructor indeed applies.
Yes there is: field label unGADTBag is defined if and only if there's a pattern match on MkGADTBag. (It would maybe be different if there were other constructors for the type -- especially if those also had a label unGADTBag.) Again, being undefined/lazy evaluation doesn't postpone the type-inference.
To be clear, by "[not] convincing" I mean: I can see the behaviour and the inferred types I'm getting. I don't see that laziness or potential undefinedness gets in the way of type inference. How could I expose a difference between baz, baz2 that would explain why they have different types?

Function calls never bring type class constraints in scope, only (strict) pattern matching does.
The comparison
unGADTBag x == fromJust y
is essentially a function call of the form
foo (unGADTBag x) (fromJust y)
where foo requires Eq a. That would morally be provided by unGADTBag x, but that expression is not yet evaluated! Because of laziness, unGADTBag x will be evaluated only when (and if) foo demands its first argument.
So, in order to call foo in this example we need its argument to be evaluated in advance. While Haskell could work like this, it would be a rather surprising semantics, where arguments are evaluated or not depending on whether they provide a type class constraint which is needed. Imagine more general cases like
foo (if cond then unGADTBag x else unGADTBag z) (fromJust y)
What should be evaluated here? unGADTBag x? unGADTBag y? Both? cond as well? It's hard to tell.
Because of these issues, Haskell was designed so that we need to manually require the evaluation of a GADT value like x using pattern matching.

Why can't the type for unGADTBag tell us Eq a?
Before you've actually deconstructed the x value, there's no way of knowing that the MkGADTBag constructor indeed applies. Sure, if it doesn't then you have other problems (bottom), but those might conceivably not surface. Consider
ignore :: a -> b -> b
ignore _ = id
baz2' :: GADTBag a -> Maybe [a] -> Bool
baz2' x y = ignore (unGADTBag x) (y==y)
Note that I could now invoke the function with, say, undefined :: GADTBag (Int->Int). Shouldn't be a problem since the undefined is ignored, right★? Problem is, despite Int->Int not having an Eq instance, I was able to write y==y, which y :: Maybe [Int->Int] can't in fact support.
So, we can't have that only mentioning unGADTBag is enough to spew the Eq a constraint into its surrounding scope. Instead, we must clearly delimit the scope of that constraint to where we've confirmed that the MkGADTBag constructor does apply, and a pattern match accomplishes that.
★If you're annoyed that my argument relies on undefined, note that the same issue arises also when there are multiple constructors which would bring different constraints into scope.
An alternative to a pattern-match that does work is this:
{-# LANGUAGE RankNTypes #-}
withGADTBag :: GADTBag a -> (Eq a => [a] -> b) -> b
withGADTBag (MkGADTBag x) f = f x
baz3 :: GADTBag a -> Maybe [a] -> Bool
baz3 x y = withGADTBag x (== fromJust y)
Response to edits
All of these give *** Exception: Prelude.undefined:
Yes of course they do, because you actually evaluate x == y in your function. So the function can only possibly yield non-⟂ if the inputs have a NF. But that's by no means the case for all functions.
Whereas these two give the same type error at compile time
Of course they do, because you're trying to wrap a value of non-Eq type in the MkGADTBag constructor, which explicitly requires that constraint (and allows you to explicitly unwrap it again!), whereas the GADTBag type doesn't require that constraint. (Which is kind of the whole point about this sort of encapsulation!)
Before you've actually deconstructed the x value, there's no way of knowing that the `MkGADTBag` constructor indeed applies.Yes there is: field label `unGADTBag` is defined if and only if there's a pattern match on `MkGADTBag`.
Arguably, that's the way field labels should work, but they don't, in Haskell. A field label is nothing but a function from the data type to the field type, and a nontotal function at that if there are multiple constructors.Yeah, Haskell records are one of the worst-designed features of the language. I personally tend to use field labels only for big, single-constructor, plain-old-data types (and even then I prefer using not the field labels directly but lenses derived from them).
Anyway though, I don't see how “field label is defined iff there's a pattern match” could even be implemented in a way that would allow your code to work the way you think it should. The compiler would have to insert the step of confirming that the constructor applies (and extracting its GADT-encapsulated constraint) somewhere. But where? In your example it's reasonably obvious, but in general x could inhabit a vast scope with lots of decision branches and you really don't want it to get evaluated in a branch where the constraint isn't actually needed.
Also keep in mind that when we argue with undefined/⟂ it's not just about actually diverging computations, more typically you're worried about computations that would simply take a long time (just, Haskell doesn't actually have a notion of “taking a long time”).

The way to think about this is OutsideIn(X) ... with local assumptions. It's not about undefinedness or lazy evaluation. A pattern match on a GADT constructor is outside, the RHS of the equation is inside. Constraints from the constructor are made available only locally -- that is only inside.
baz (MkGADTBag x) (Just y) = x == y
Has an explicit data constructor MkGADTBag outside, supplying an Eq a. The x == y raises a wanted Eq a locally/inside, which gets discharged from the pattern match. OTOH
baz2 x y = unGADTBag x == fromJust y
Has no explicit data constructor outside, so no context is supplied. unGADTBag has a Eq a, but that is deeper inside the l.h. argument to ==; type inference doesn't go looking deeper inside. It just doesn't. Then in the effective definition for unGADTBag
unGADTBag (MkGADTBag x) = x
there is an Eq a made available from the outside, but it cannot escape from the RHS into the type environment at a usage site for unGADTBag. It just doesn't. Sad!
The best I can see for an explanation is towards the end of the OutsideIn paper, Section 9.7 Is the emphasis on principal types well-justified? (A rhetorical question but my answer would me: of course we must emphasise principal types; type inference could get better principaled under some circumstances.) That last section considers this example
data R a where
RInt :: Int -> R Int
RBool :: Bool -> R Bool
RChar :: Char -> R Char
flop1 (RInt x) = x
there is a third type that is arguably more desirable [for flop1], and that type is R Int -> Int.
flop1's definition is of the same form as unGADTBag, with a constrained to be Int.
flop2 (RInt x) = x
flop2 (RBool x) = x
Unfortunately, ordinary polymorphic types are too weak to express this restriction [that a must be only Int or Bool] and we can only get Ɐa.R a -> a for flop2, which does not rule the application of flop2 to values of type R Char.
So at that point the paper seems to give up trying to refine better principal types:
In conclusion, giving up on some natural principal types in favor of more specialized types that eliminate more pattern match errors at runtime is appealing but does not quite work unless we consider a more expressive syntax of types. Furthermore it is far from obvious how to specify these typings in a high-level declarative specification.
"is appealing". It just doesn't.
I can see a general solution is difficult/impossible. But for use-cases of constrained Bags/Lists/Sets, the specification is:
All data constructors have the same constraint(s) on the datatype's parameters.
All constructors yield the same type (... -> T a or ... -> T [a] or ... -> T Int, etc).
Datatypes with a single constructor satisfy that trivially.
To satisfy the first bullet, for a Set type using a binary balanced tree, there'd be a non-obvious definition for the Nil constructor:
data OrdSet a where
SNode :: Ord a => OrdSet a -> a -> OrdSet a -> OrdSet a
SNil :: Ord a => OrdSet a -- seemingly redundant Ord constraint
Even so, repeating the constraint on every node and every terminal seems wasteful: it's the same constraint all the way down (which is unlike GADTs for EDSL abstract syntax trees); presumably each node carries a copy of exactly the same dictionary.
The best way to ensure same constraint(s) on every constructor could just be prefixing the constraint to the datatype:
data Ord a => OrdSet a where ...
And perhaps the constraint could go 'OutsideOut' to the environment that's accessing the tree.

Another possible approach is to use a PatternSynonym with an explicit signature giving a Required constraint.
pattern EqGADTBag :: Eq a => [a] -> GADTBag a -- that Eq a is the *Required*
pattern EqGADTBag{ unEqGADTBag } = MkGADTBag unEqGADTBag -- without sig infers Eq a only as *Provided*
That is, without that explicit sig:
*> :i EqGADTBag
pattern EqGADTBag :: () => Eq a => [a] -> GADTBag a
The () => Eq a => ... shows Eq a is Provided, arising from the GADT constructor.
Now we get both inferred baz, baz2 :: Eq a => GADTBag a -> Maybe [a] -> Bool:
baz (EqGADTBag x) (Just y) = x == y
baz2 x y = unEqGADTBag x == fromJust y
As a curiosity: it's possible to give those equations for baz, baz2 as well as those in the O.P. using the names from the GADT decl. GHC warns of overlapping patterns [correctly]; and does infer the constrained sig for baz.
I wonder if there's a design pattern here? Don't put constraints on the data constructor -- that is, don't make it a GADT. Instead declare a 'shadow' PatternSynonym with the Required/Provided constraints.

You can capture the constraint in a fold function, (Eq a => ..) says you can assume Eq a but only within the function next (which is defined after a pattern match). If you instantiate next as = fromJust maybe == as it uses this constraint to witness equality
-- local constraint
-- |
-- vvvvvvvvvvvvvvvvvv
foldGADTBag :: (Eq a => [a] -> res) -> GADTBag a -> res
foldGADTBag next (MkGADTBag as) = next as
baz3 :: GADTBag a -> Maybe [a] -> Bool
baz3 gadtBag maybe = foldGADTBag (fromJust maybe ==) gadtBag
type Ty :: Type -> Type
data Ty a where
TyInt :: Int -> Ty Int
TyUnit :: Ty ()
-- locally assume Int locally assume unit
-- | |
-- vvvvvvvvvvvvvvvvvvv vvvvvvvvvvvvv
foldTy :: (a ~ Int => a -> res) -> (a ~ () => res) -> (Ty a -> res)
foldTy int unit (TyInt i) = int i
foldTy int unit TyUnit = unit
eval :: Ty a -> a
eval = foldTy id ()

Related

Generics : run-time ADT for types with instances

Is it possible with Haskell / GHC, to extract an algebraic data type representing all types with Eq and Ord instances ? This would probably need Generics, Typeable, etc.
What I would like is something like :
data Data_Eq_Ord = Data_String String
| Data_Int Int
| Data_Bool Bool
| ...
deriving (Eq, Ord)
For all types known to have instances for Eq and Ord. If it makes the solution easier, we can limit our scope to Ord instances, since Eq is implied by Ord. But is would be interesting to know if constraints intersection is possible.
This data type would be useful because it gives the possibility to use it where Eq and Ord constraints are required, and pattern-match at runtime to refine on types.
I would need this to implement a generic Map Key Value, where Key would be this type, in a Document Indexing library, where the keys and their type is known at run-time. This library is here. For the moment I worked around the issue by defining a data DocIndexKey, and a FieldKey class, but this is not fully satisfactory since it requires boilerplate and can't cover all legit candidates.
Any good alternative approach to this situation is welcome. For some reasons, I prefer to avoid Template Haskell.
Well, it's not an ADT, but this definitely works:
data Satisfying c = forall a. c a => Satisfy a
class (l a, r a) => And l r a where
instance (l a, r a) => And l r a where
ex :: [Satisfying (Typeable `And` Show `And` Ord)]
ex = [ Satisfy (7 :: Int)
, Satisfy "Hello"
, Satisfy (5 :: Int)
, Satisfy [10..20 :: Int]
, Satisfy ['a'..'z']
, Satisfy ((), 'a')]
-- An example of use, with "complicated" logic
data With f c = forall a. c a => With (f a)
-- vvvvvvvvvvvvvvvvvvvvvvvvvv QuantifiedConstraints chokes on this, which is probably a bug...
partitionTypes :: (forall a. c a => TypeRep a) -> [Satisfying c] -> [[] `With` c]
partitionTypes rep = foldr go []
where go (Satisfy x) [] = [With [x]]
go x'#(Satisfy (x :: a)) (xs'#(With (xs :: [b])) : xss) =
case testEquality rep rep :: Maybe (a :~: b) of
Just Refl -> With (x : xs) : xss
Nothing -> xs' : go x' xss
main :: IO ()
main = traverse_ (\(With xs) -> print (sort xs)) $ partitionTypes typeRep ex
Exhaustivity is much harder. Perhaps with a plugin, you could get GHC to do it, but why bother? I don't believe GHC actually tries to keep track of what types it has seen. In particular, you'd have to scan all modules in the project and its dependencies, even those that haven't been loaded by the module containing the type definition. You'd have to implement it from the ground-up. And, as this answer shows, I very much doubt you would actually be able to use such exhaustivity for anything that you can't already do by just taking the open universe as it is.

Can a Haskell type constructor have non-type parameters?

A type constructor produces a type given a type. For example, the Maybe constructor
data Maybe a = Nothing | Just a
could be a given a concrete type, like Char, and give a concrete type, like Maybe Char. In terms of kinds, one has
GHCI> :k Maybe
Maybe :: * -> *
My question: Is it possible to define a type constructor that yields a concrete type given a Char, say? Put another way, is it possible to mix kinds and types in the type signature of a type constructor? Something like
GHCI> :k my_type
my_type :: Char -> * -> *
Can a Haskell type constructor have non-type parameters?
Let's unpack what you mean by type parameter. The word type has (at least) two potential meanings: do you mean type in the narrow sense of things of kind *, or in the broader sense of things at the type level? We can't (yet) use values in types, but modern GHC features a very rich kind language, allowing us to use a wide range of things other than concrete types as type parameters.
Higher-Kinded Types
Type constructors in Haskell have always admitted non-* parameters. For example, the encoding of the fixed point of a functor works in plain old Haskell 98:
newtype Fix f = Fix { unFix :: f (Fix f) }
ghci> :k Fix
Fix :: (* -> *) -> *
Fix is parameterised by a functor of kind * -> *, not a type of kind *.
Beyond * and ->
The DataKinds extension enriches GHC's kind system with user-declared kinds, so kinds may be built of pieces other than * and ->. It works by promoting all data declarations to the kind level. That is to say, a data declaration like
data Nat = Z | S Nat -- natural numbers
introduces a kind Nat and type constructors Z :: Nat and S :: Nat -> Nat, as well as the usual type and value constructors. This allows you to write datatypes parameterised by type-level data, such as the customary vector type, which is a linked list indexed by its length.
data Vec n a where
Nil :: Vec Z a
(:>) :: a -> Vec n a -> Vec (S n) a
ghci> :k Vec
Vec :: Nat -> * -> *
There's a related extension called ConstraintKinds, which frees constraints like Ord a from the yoke of the "fat arrow" =>, allowing them to roam across the landscape of the type system as nature intended. Kmett has used this power to build a category of constraints, with the newtype (:-) :: Constraint -> Constraint -> * denoting "entailment": a value of type c :- d is a proof that if c holds then d also holds. For example, we can prove that Ord a implies Eq [a] for all a:
ordToEqList :: Ord a :- Eq [a]
ordToEqList = Sub Dict
Life after forall
However, Haskell currently maintains a strict separation between the type level and the value level. Things at the type level are always erased before the program runs, (almost) always inferrable, invisible in expressions, and (dependently) quantified by forall. If your application requires something more flexible, such as dependent quantification over runtime data, then you have to manually simulate it using a singleton encoding.
For example, the specification of split says it chops a vector at a certain length according to its (runtime!) argument. The type of the output vector depends on the value of split's argument. We'd like to write this...
split :: (n :: Nat) -> Vec (n :+: m) a -> (Vec n a, Vec m a)
... where I'm using the type function (:+:) :: Nat -> Nat -> Nat, which stands for addition of type-level naturals, to ensure that the input vector is at least as long as n...
type family n :+: m where
Z :+: m = m
S n :+: m = S (n :+: m)
... but Haskell won't allow that declaration of split! There aren't any values of type Z or S n; only types of kind * contain values. We can't access n at runtime directly, but we can use a GADT which we can pattern-match on to learn what the type-level n is:
data Natty n where
Zy :: Natty Z
Sy :: Natty n -> Natty (S n)
ghci> :k Natty
Natty :: Nat -> *
Natty is called a singleton, because for a given (well-defined) n there is only one (well-defined) value of type Natty n. We can use Natty n as a run-time stand-in for n.
split :: Natty n -> Vec (n :+: m) a -> (Vec n a, Vec m a)
split Zy xs = (Nil, xs)
split (Sy n) (x :> xs) =
let (ys, zs) = split n xs
in (x :> ys, zs)
Anyway, the point is that values - runtime data - can't appear in types. It's pretty tedious to duplicate the definition of Nat in singleton form (and things get worse if you want the compiler to infer such values); dependently-typed languages like Agda, Idris, or a future Haskell escape the tyranny of strictly separating types from values and give us a range of expressive quantifiers. You're able to use an honest-to-goodness Nat as split's runtime argument and mention its value dependently in the return type.
#pigworker has written extensively about the unsuitability of Haskell's strict separation between types and values for modern dependently-typed programming. See, for example, the Hasochism paper, or his talk on the unexamined assumptions that have been drummed into us by four decades of Hindley-Milner-style programming.
Dependent Kinds
Finally, for what it's worth, with TypeInType modern GHC unifies types and kinds, allowing us to talk about kind variables using the same tools that we use to talk about type variables. In a previous post about session types I made use of TypeInType to define a kind for tagged type-level sequences of types:
infixr 5 :!, :?
data Session = Type :! Session -- Type is a synonym for *
| Type :? Session
| E
I'd recommend #Benjamin Hodgson's answer and the references he gives to see how to make this sort of thing useful. But, to answer your question more directly, using several extensions (DataKinds, KindSignatures, and GADTs), you can define types that are parameterized on (certain) concrete types.
For example, here's one parameterized on the concrete Bool datatype:
{-# LANGUAGE DataKinds, KindSignatures, GADTs #-}
{-# LANGUAGE FlexibleInstances #-}
module FlaggedType where
-- The single quotes below are optional. They serve to notify
-- GHC that we are using the type-level constructors lifted from
-- data constructors rather than types of the same name (and are
-- only necessary where there's some kind of ambiguity otherwise).
data Flagged :: Bool -> * -> * where
Truish :: a -> Flagged 'True a
Falsish :: a -> Flagged 'False a
-- separate instances, just as if they were different types
-- (which they are)
instance (Show a) => Show (Flagged 'False a) where
show (Falsish x) = show x
instance (Show a) => Show (Flagged 'True a) where
show (Truish x) = show x ++ "*"
-- these lists have types as indicated
x = [Truish 1, Truish 2, Truish 3] -- :: Flagged 'True Integer
y = [Falsish "a", Falsish "b", Falsish "c"] -- :: Flagged 'False String
-- this won't typecheck: it's just like [1,2,"abc"]
z = [Truish 1, Truish 2, Falsish 3] -- won't typecheck
Note that this isn't much different from defining two completely separate types:
data FlaggedTrue a = Truish a
data FlaggedFalse a = Falsish a
In fact, I'm hard pressed to think of any advantage Flagged has over defining two separate types, except if you have a bar bet with someone that you can write useful Haskell code without type classes. For example, you can write:
getInt :: Flagged a Int -> Int
getInt (Truish z) = z -- same polymorphic function...
getInt (Falsish z) = z -- ...defined on two separate types
Maybe someone else can think of some other advantages.
Anyway, I believe that parameterizing types with concrete values really only becomes useful when the concrete type is sufficient "rich" that you can use it to leverage the type checker, as in Benjamin's examples.
As #user2407038 noted, most interesting primitive types, like Ints, Chars, Strings and so on can't be used this way. Interestingly enough, though, you can use literal positive integers and strings as type parameters, but they are treated as Nats and Symbols (as defined in GHC.TypeLits) respectively.
So something like this is possible:
import GHC.TypeLits
data Tagged :: Symbol -> Nat -> * -> * where
One :: a -> Tagged "one" 1 a
Two :: a -> Tagged "two" 2 a
Three :: a -> Tagged "three" 3 a
Look at using Generalized Algebraic Data Types (GADTS), which enable you to define concrete outputs based on input type, e.g.
data CustomMaybe a where
MaybeChar :: Maybe a -> CustomMaybe Char
MaybeString :: Maybe a > CustomMaybe String
MaybeBool :: Maybe a -> CustomMaybe Bool
exampleFunction :: CustomMaybe a -> a
exampleFunction (MaybeChar maybe) = 'e'
exampleFunction (MaybeString maybe) = True //Compile error
main = do
print $ exampleFunction (MaybeChar $ Just 10)
To a similar effect, RankNTypes can allow the implementation of similar behaviour:
exampleFunctionOne :: a -> a
exampleFunctionOne el = el
type PolyType = forall a. a -> a
exampleFuntionTwo :: PolyType -> Int
exampleFunctionTwo func = func 20
exampleFunctionTwo func = func "Hello" --Compiler error, PolyType being forced to return 'Int'
main = do
print $ exampleFunctionTwo exampleFunctionOne
The PolyType definition allows you to insert the polymorphic function within exampleFunctionTwo and force its output to be 'Int'.
No. Haskell doesn't have dependent types (yet). See https://typesandkinds.wordpress.com/2016/07/24/dependent-types-in-haskell-progress-report/ for some discussion of when it may.
In the meantime, you can get behavior like this in Agda, Idris, and Cayenne.

How do you apply function constraints in instance methods in Haskell?

I'm learning how to use typeclasses in Haskell.
Consider the following implementation of a typeclass T with a type constrained class function f.
class T t where
f :: (Eq u) => t -> u
data T_Impl = T_Impl_Bool Bool | T_Impl_Int Int | T_Impl_Float Float
instance T T_Impl where
f (T_Impl_Bool x) = x
f (T_Impl_Int x) = x
f (T_Impl_Float x) = x
When I load this into GHCI 7.10.2, I get the following error:
Couldn't match expected type ‘u’ with actual type ‘Float’
‘u’ is a rigid type variable bound by
the type signature for f :: Eq u => T_Impl -> u
at generics.hs:6:5
Relevant bindings include
f :: T_Impl -> u (bound at generics.hs:6:5)
In the expression: x
In an equation for ‘f’: f (T_Impl_Float x) = x
What am I doing/understanding wrong? It seems reasonable to me that one would want to specialize a typeclass in an instance by providing an accompaning data constructor and function implementation. The part
Couldn't match expected type 'u' with actual type 'Float'
is especially confusing. Why does u not match Float if u only has the constraint that it must qualify as an Eq type (Floats do that afaik)?
The signature
f :: (Eq u) => t -> u
means that the caller can pick t and u as wanted, with the only burden of ensuring that u is of class Eq (and t of class T -- in class methods there's an implicit T t constraint).
It does not mean that the implementation can choose any u.
So, the caller can use f in any of these ways: (with t in class T)
f :: t -> Bool
f :: t -> Char
f :: t -> Int
...
The compiler is complaining that your implementation is not general enough to cover all these cases.
Couldn't match expected type ‘u’ with actual type ‘Float’
means "You gave me a Float, but you must provide a value of the general type u (where u will be chosen by the caller)"
Chi has already pointed out why your code doesn't compile. But it's not even that typeclasses are the problem; indeed, your example has only one instance, so it might just as well be a normal function rather than a class.
Fundamentally, the problem is that you're trying to do something like
foobar :: Show x => Either Int Bool -> x
foobar (Left x) = x
foobar (Right x) = x
This won't work. It tries to make foobar return a different type depending on the value you feed it at run-time. But in Haskell, all types must be 100% determined at compile-time. So this cannot work.
There are several things you can do, however.
First of all, you can do this:
foo :: Either Int Bool -> String
foo (Left x) = show x
foo (Right x) = show x
In other words, rather than return something showable, actually show it. That means the result type is always String. It means that which version of show gets called will vary at run-time, but that's fine. Code paths can vary at run-time, it's types which cannot.
Another thing you can do is this:
toInt :: Either Int Bool -> Maybe Int
toInt (Left x) = Just x
toInt (Right x) = Nothing
toBool :: Either Int Bool -> Maybe Bool
toBool (Left x) = Nothing
toBool (Right x) = Just x
Again, that works perfectly fine.
There are other things you can do; without knowing why you want this, it's difficult to suggest others.
As a side note, you want to stop thinking about this like it's object oriented programming. It isn't. It requires a new way of thinking. In particular, don't reach for a typeclass unless you really need one. (I realise this particular example may just be a learning exercise to learn about typeclasses of course...)
It's possible to do this:
class Eq u => T t u | t -> u where
f :: t -> u
You need FlexibleContextx+FunctionalDepencencies and MultiParamTypeClasses+FlexibleInstances on call-site. Or to eliminate class and to use data types instead like Gabriel shows here

When are type signatures necessary in Haskell?

Many introductory texts will tell you that in Haskell type signatures are "almost always" optional. Can anybody quantify the "almost" part?
As far as I can tell, the only time you need an explicit signature is to disambiguate type classes. (The canonical example being read . show.) Are there other cases I haven't thought of, or is this it?
(I'm aware that if you go beyond Haskell 2010 there are plenty for exceptions. For example, GHC will never infer rank-N types. But rank-N types are a language extension, not part of the official standard [yet].)
Polymorphic recursion needs type annotations, in general.
f :: (a -> a) -> (a -> b) -> Int -> a -> b
f f1 g n x =
if n == (0 :: Int)
then g x
else f f1 (\z h -> g (h z)) (n-1) x f1
(Credit: Patrick Cousot)
Note how the recursive call looks badly typed (!): it calls itself with five arguments, despite f having only four! Then remember that b can be instantiated with c -> d, which causes an extra argument to appear.
The above contrived example computes
f f1 g n x = g (f1 (f1 (f1 ... (f1 x))))
where f1 is applied n times. Of course, there is a much simpler way to write an equivalent program.
Monomorphism restriction
If you have MonomorphismRestriction enabled, then sometimes you will need to add a type signature to get the most general type:
{-# LANGUAGE MonomorphismRestriction #-}
-- myPrint :: Show a => a -> IO ()
myPrint = print
main = do
myPrint ()
myPrint "hello"
This will fail because myPrint is monomorphic. You would need to uncomment the type signature to make it work, or disable MonomorphismRestriction.
Phantom constraints
When you put a polymorphic value with a constraint into a tuple, the tuple itself becomes polymorphic and has the same constraint:
myValue :: Read a => a
myValue = read "0"
myTuple :: Read a => (a, String)
myTuple = (myValue, "hello")
We know that the constraint affects the first part of the tuple but does not affect the second part. The type system doesn't know that, unfortunately, and will complain if you try to do this:
myString = snd myTuple
Even though intuitively one would expect myString to be just a String, the type checker needs to specialize the type variable a and figure out whether the constraint is actually satisfied. In order to make this expression work, one would need to annotate the type of either snd or myTuple:
myString = snd (myTuple :: ((), String))
In Haskell, as I'm sure you know, types are inferred. In other words, the compiler works out what type you want.
However, in Haskell, there are also polymorphic typeclasses, with functions that act in different ways depending on the return type. Here's an example of the Monad class, though I haven't defined everything:
class Monad m where
return :: a -> m a
(>>=) :: m a -> (a -> m b) -> m b
fail :: String -> m a
We're given a lot of functions with just type signatures. Our job is to make instance declarations for different types that can be treated as Monads, like Maybe t or [t].
Have a look at this code - it won't work in the way we might expect:
return 7
That's a function from the Monad class, but because there's more than one Monad, we have to specify what return value/type we want, or it automatically becomes an IO Monad. So:
return 7 :: Maybe Int
-- Will return...
Just 7
return 6 :: [Int]
-- Will return...
[6]
This is because [t] and Maybe have both been defined in the Monad type class.
Here's another example, this time from the random typeclass. This code throws an error:
random (mkStdGen 100)
Because random returns something in the Random class, we'll have to define what type we want to return, with a StdGen object tupelo with whatever value we want:
random (mkStdGen 100) :: (Int, StdGen)
-- Returns...
(-3650871090684229393,693699796 2103410263)
random (mkStdGen 100) :: (Bool, StdGen)
-- Returns...
(True,4041414 40692)
This can all be found at learn you a Haskell online, though you'll have to do some long reading. This, I'm pretty much 100% certain, it the only time when types are necessary.

Why constraints on data are a bad thing?

I know this question has been asked and answered lots of times but I still don't really understand why putting constraints on a data type is a bad thing.
For example, let's take Data.Map k a. All of the useful functions involving a Map need an Ord k constraint. So there is an implicit constraint on the definition of Data.Map. Why is it better to keep it implicit instead of letting the compiler and programmers know that Data.Map needs an orderable key.
Also, specifying a final type in a type declaration is something common, and one can see it as a way of "super" constraining a data type.
For example, I can write
data User = User { name :: String }
and that's acceptable. However is that not a constrained version of
data User' s = User' { name :: s }
After all 99% of the functions I'll write for the User type don't need a String and the few which will would probably only need s to be IsString and Show.
So, why is the lax version of User considered bad:
data (IsString s, Show s, ...) => User'' { name :: s }
while both User and User' are considered good?
I'm asking this, because lots of the time, I feel I'm unnecessarily narrowing my data (or even function) definitions, just to not have to propagate constraints.
Update
As far as I understand, data type constraints only apply to the constructor and don't propagate. So my question is then, why do data type constraints not work as expected (and propagate)? It's an extension anyway, so why not have a new extension doing data properly, if it was considered useful by the community?
TL;DR:
Use GADTs to provide implicit data contexts.
Don't use any kind of data constraint if you could do with Functor instances etc.
Map's too old to change to a GADT anyway.
Scroll to the bottom if you want to see the User implementation with GADTs
Let's use a case study of a Bag where all we care about is how many times something is in it. (Like an unordered sequence. We nearly always need an Eq constraint to do anything useful with it.
I'll use the inefficient list implementation so as not to muddy the waters over the Data.Map issue.
GADTs - the solution to the data constraint "problem"
The easy way to do what you're after is to use a GADT:
Notice below how the Eq constraint not only forces you to use types with an Eq instance when making GADTBags, it provides that instance implicitly wherever the GADTBag constructor appears. That's why count doesn't need an Eq context, whereas countV2 does - it doesn't use the constructor:
{-# LANGUAGE GADTs #-}
data GADTBag a where
GADTBag :: Eq a => [a] -> GADTBag a
unGADTBag (GADTBag xs) = xs
instance Show a => Show (GADTBag a) where
showsPrec i (GADTBag xs) = showParen (i>9) (("GADTBag " ++ show xs) ++)
count :: a -> GADTBag a -> Int -- no Eq here
count a (GADTBag xs) = length.filter (==a) $ xs -- but == here
countV2 a = length.filter (==a).unGADTBag
size :: GADTBag a -> Int
size (GADTBag xs) = length xs
ghci> count 'l' (GADTBag "Hello")
2
ghci> :t countV2
countV2 :: Eq a => a -> GADTBag a -> Int
Now we didn't need the Eq constraint when we found the total size of the bag, but it didn't clutter up our definition anyway. (We could have used size = length . unGADTBag just as well.)
Now lets make a functor:
instance Functor GADTBag where
fmap f (GADTBag xs) = GADTBag (map f xs)
oops!
DataConstraints_so.lhs:49:30:
Could not deduce (Eq b) arising from a use of `GADTBag'
from the context (Eq a)
That's unfixable (with the standard Functor class) because I can't restrict the type of fmap, but need to for the new list.
Data Constraint version
Can we do as you asked? Well, yes, except that you have to keep repeating the Eq constraint wherever you use the constructor:
{-# LANGUAGE DatatypeContexts #-}
data Eq a => EqBag a = EqBag {unEqBag :: [a]}
deriving Show
count' a (EqBag xs) = length.filter (==a) $ xs
size' (EqBag xs) = length xs -- Note: doesn't use (==) at all
Let's go to ghci to find out some less pretty things:
ghci> :so DataConstraints
DataConstraints_so.lhs:1:19: Warning:
-XDatatypeContexts is deprecated: It was widely considered a misfeature,
and has been removed from the Haskell language.
[1 of 1] Compiling Main ( DataConstraints_so.lhs, interpreted )
Ok, modules loaded: Main.
ghci> :t count
count :: a -> GADTBag a -> Int
ghci> :t count'
count' :: Eq a => a -> EqBag a -> Int
ghci> :t size
size :: GADTBag a -> Int
ghci> :t size'
size' :: Eq a => EqBag a -> Int
ghci>
So our EqBag count' function requires an Eq constraint, which I think is perfectly reasonable, but our size' function also requires one, which is less pretty. This is because the type of the EqBag constructor is EqBag :: Eq a => [a] -> EqBag a, and this constraint must be added every time.
We can't make a functor here either:
instance Functor EqBag where
fmap f (EqBag xs) = EqBag (map f xs)
for exactly the same reason as with the GADTBag
Constraintless bags
data ListBag a = ListBag {unListBag :: [a]}
deriving Show
count'' a = length . filter (==a) . unListBag
size'' = length . unListBag
instance Functor ListBag where
fmap f (ListBag xs) = ListBag (map f xs)
Now the types of count'' and show'' are exactly as we expect, and we can use standard constructor classes like Functor:
ghci> :t count''
count'' :: Eq a => a -> ListBag a -> Int
ghci> :t size''
size'' :: ListBag a -> Int
ghci> fmap (Data.Char.ord) (ListBag "hello")
ListBag {unListBag = [104,101,108,108,111]}
ghci>
Comparison and conclusions
The GADTs version automagically propogates the Eq constraint everywhere the constructor is used. The type checker can rely on there being an Eq instance, because you can't use the constructor for a non-Eq type.
The DatatypeContexts version forces the programmer to manually propogate the Eq constraint, which is fine by me if you want it, but is deprecated because it doesn't give you anything more than the GADT one does and was seen by many as pointless and annoying.
The unconstrained version is good because it doesn't prevent you from making Functor, Monad etc instances. The constraints are written exactly when they're needed, no more or less. Data.Map uses the unconstrained version partly because unconstrained is generally seen as most flexible, but also partly because it predates GADTs by some margin, and there needs to be a compelling reason to potentially break existing code.
What about your excellent User example?
I think that's a great example of a one-purpose data type that benefits from a constraint on the type, and I'd advise you to use a GADT to implement it.
(That said, sometimes I have a one-purpose data type and end up making it unconstrainedly polymorphic just because I love to use Functor (and Applicative), and would rather use fmap than mapBag because I feel it's clearer.)
{-# LANGUAGE GADTs #-}
import Data.String
data User s where
User :: (IsString s, Show s) => s -> User s
name :: User s -> s
name (User s) = s
instance Show (User s) where -- cool, no Show context
showsPrec i (User s) = showParen (i>9) (("User " ++ show s) ++)
instance (IsString s, Show s) => IsString (User s) where
fromString = User . fromString
Notice since fromString does construct a value of type User a, we need the context explicitly. After all, we composed with the constructor User :: (IsString s, Show s) => s -> User s. The User constructor removes the need for an explicit context when we pattern match (destruct), becuase it already enforced the constraint when we used it as a constructor.
We didn't need the Show context in the Show instance because we used (User s) on the left hand side in a pattern match.
Constraints
The problem is that constraints are not a property of the data type, but of the algorithm/function that operates on them. Different functions might need different and unique constraints.
A Box example
As an example, let's assume we want to create a container called Box which contains only 2 values.
data Box a = Box a a
We want it to:
be showable
allow the sorting of the two elements via sort
Does it make sense to apply the constraint of both Ord and Show on the data type? No, because the data type in itself could be only shown or only sorted and therefore the constraints are related to its use, not it's definition.
instance (Show a) => Show (Box a) where
show (Box a b) = concat ["'", show a, ", ", show b, "'"]
instance (Ord a) => Ord (Box a) where
compare (Box a b) (Box c d) =
let ca = compare a c
cb = compare b d
in if ca /= EQ then ca else cb
The Data.Map case
Data.Map's Ord constraints on the type is really needed only when we have > 1 elements in the container. Otherwise the container is usable even without an Ord key. For example, this algorithm:
transf :: Map NonOrd Int -> Map NonOrd Int
transf x =
if Map.null x
then Map.singleton NonOrdA 1
else x
Live demo
works just fine without the Ord constraint and always produce a non empty map.
Using DataTypeContexts reduces the number of programs you can write. If most of those illegal programs are nonsense, you might say it's worth the runtime cost associated with ghc passing in a type class dictionary that isn't used. For example, if we had
data Ord k => MapDTC k a
then #jefffrey's transf is rejected. But we should probably have transf _ = return (NonOrdA, 1) instead.
In some sense the context is documentation that says "every Map must have ordered keys". If you look at all of the functions in Data.Map you'll get a similar conclusion "every useful Map has ordered keys". While you can create maps with unordered keys using
mapKeysMonotonic :: (k1 -> k2) -> Map k1 a -> Map k2 a
singleton :: k2 a -> Map k2 a
But the moment you try to do anything useful with them, you'll wind up with No instance for Ord k2 somewhat later.

Resources